A Commotion in the Stars: The Legacy of Christian Doppler

Christian Andreas Doppler (1803 – 1853) was born in Salzburg, Austria, to a longstanding family of stonemasons.  As a second son, he was expected to help his older brother run the business, so his father had him tested in his 18th year for his suitability for a career in business.  The examiner, Simon Stampfer (1790 – 1864), an Austrian mathematician and inventor teaching at the Lyceum in Salzburg, discovered that Doppler had a gift for mathematics and was better suited for a scientific career.  Stampfer’s enthusiasm convinced Doppler’s father to enroll him in the Polytechnic Institute in Vienna (founded only a few years earlier in 1815) where he took classes in mathematics, mechanics and physics [1] from 1822 to 1825.  Doppler excelled in his courses, but was dissatisfied with the narrowness of the education, yearning for more breadth and depth in his studies and for more significance in his positions, feelings he would struggle with for his entire short life.  He left Vienna, returning to the Lyceum in Salzburg to round out his education with philosophy, languages and poetry.  Unfortunately, this four-year detour away from technical studies impeded his ability to gain a permanent technical position, so he began a temporary assistantship with a mathematics professor in Vienna.  As he approached his 30th birthday this term expired without prospects.  He was about to emigrate to America when he finally received an offer to teach at a secondary school in Prague.

To read about the attack by Joseph Petzval on Doppler’s effect and the effect it had on Doppler, see my feature article “The Fall and Rise of the Doppler Effect” in Physics Today, 73(3) 30, March (2020).

Salzburg, Austria

Doppler in Prague

Prague gave Doppler new life.  He was a professor with a position that allowed him to marry the daughter of a silver- and goldsmith from Salzburg.  He began to publish scholarly papers, and in 1837 was appointed supplementary professor of Higher Mathematics and Geometry at the Prague Technical Institute, promoted to full professor in 1841.  It was here that he met the unusual genius Bernard Bolzano (1781 – 1848), recently returned from political exile in the countryside.  Bolzano was a philosopher and mathematician who developed rigorous concepts of mathematical limits and is famous today for his part in the Bolzano-Weierstrass theorem in real analysis, but he had been too liberal and too outspoken for the conservative Austrian regime and had been dismissed from the University in Prague in 1819.  He was forbidden to publish his work in Austrian journals, which is one reason why much of Bolzano’s groundbreaking work in analysis remained unknown during his lifetime.  However, he participated in the Bohemian Society for Science from a distance, recognizing the inventive tendencies in the newcomer Doppler and supporting him for membership in the Bohemian Society.  When Bolzano was allowed to return in 1842 to the Polytechnic Institute in Prague, he and Doppler became close friends as kindred spirits.

Prague, Czech Republic

On May 25, 1842, Bolzano presided as chairman over a meeting of the Bohemian Society for Science on the day that Doppler read a landmark paper on the color of stars to a meagre assembly of only five regular members of the Society [2].  The turn-out was so small that the meeting may have been held in the robing room of the Society rather than in the meeting hall itself.  Leading up to this famous moment, Doppler’s interests were peripatetic, ranging widely over mathematical and physical topics, but he had lately become fascinated by astronomy and by the phenomenon of stellar aberration.  Stellar aberration was discovered by James Bradley in 1729 and explained as the result of the Earth’s yearly motion around the Sun, causing the apparent location of a distant star to change slightly depending on the direction of the Earth’s motion.  Bradley explained this in terms of the finite speed of light and was able to estimate it to within several percent [3].  As Doppler studied Bradley aberration, he wondered how the relative motion of the Earth would affect the color of the star.  By making a simple analogy of a ship traveling with, or against, a series of ocean waves, he concluded that the frequency of impact of the peaks and troughs of waves on the ship was no different than the arrival of peaks and troughs of the light waves impinging on the eye.  Because perceived color was related to the frequency of excitation in the eye, he concluded that the color of light would be slightly shifted to the blue if approaching, and to the red if receding from, the light source. 

Doppler wave fronts from a source emitting spherical waves moving with speeds β relative to the speed of the wave in the medium.

Doppler calculated the magnitude of the effect by taking a simple ratio of the speed of the observer relative to the speed of light.  What he found was that the speed of the Earth, though sufficient to cause the detectable aberration in the position of stars, was insufficient to produce a noticeable change in color.  However, his interest in astronomy had made him familiar with binary stars where the relative motion of the light source might be high enough to cause color shifts.  In fact, in the star catalogs there were examples of binary stars that had complementary red and blue colors.  Therefore, the title of his paper, published in the Proceedings of the Royal Bohemian Society of Sciences a few months after he read it to the society, was “On the Coloured Light of the Double Stars and Certain Other Stars of the Heavens: Attempt at a General Theory which Incorporates Bradley’s Theorem of Aberration as an Integral Part” [4].
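To get a feel for the sizes involved, here is a minimal sketch of the first-order ratio v/c applied to the Earth's orbital motion.  The orbital speed (about 29.8 km/s) and the 550 nm test wavelength are my own illustrative inputs, not numbers from Doppler's paper, and the function names are hypothetical.

```python
import math

# Assumed inputs (not taken from Doppler's 1842 paper).
C_KM_S = 299_792.458        # speed of light in km/s
EARTH_ORBITAL_KM_S = 29.8   # approximate mean orbital speed of the Earth


def first_order_shift(v_km_s: float) -> float:
    """Fractional frequency (or wavelength) shift to first order in v/c."""
    return v_km_s / C_KM_S


def aberration_arcsec(v_km_s: float) -> float:
    """Small-angle stellar aberration, v/c radians converted to arcseconds."""
    return math.degrees(v_km_s / C_KM_S) * 3600.0


earth_ratio = first_order_shift(EARTH_ORBITAL_KM_S)
wavelength_change_nm = 550.0 * earth_ratio       # shift of a 550 nm line
aberration = aberration_arcsec(EARTH_ORBITAL_KM_S)

print(f"v/c for the Earth's orbit : {earth_ratio:.2e}")
print(f"shift of a 550 nm line    : {wavelength_change_nm:.3f} nm")
print(f"aberration angle          : {aberration:.1f} arcsec")
```

The same ratio of about one part in ten thousand appears in both results: as an angle it is some twenty arcseconds, which a good telescope can resolve, while as a color change it is a few hundredths of a nanometer, far too small for the eye.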

Title page of Doppler’s 1842 paper introducing the Doppler Effect.

Doppler’s analogy was correct, but like all analogies not founded on physical law, it differed in detail from the true nature of the phenomenon.  By 1842 the transverse character of light waves had been thoroughly proven through the work of Fresnel and Arago several decades earlier, yet Doppler held onto the old-fashioned notion that light was composed of longitudinal waves.  Bolzano, fully versed in the transverse nature of light, kindly published a commentary shortly afterwards [5] showing how the transverse effect for light, and a longitudinal effect for sound, were both supported by Doppler’s idea.  Yet Doppler also did not know that speeds in visual binaries were too small to produce noticeable color effects to the unaided eye.  Finally, (and perhaps the greatest flaw in his argument on the color of stars) a continuous spectrum that extends from the visible into the infrared and ultraviolet would not change color because all the frequencies would shift together preserving the flat (white) spectrum.

The simple algebraic derivation of the Doppler Effect in the 1842 publication.

Doppler’s twelve years in Prague were intense.  He was consumed by his Society responsibilities and by an extremely heavy teaching load that included personal exams of hundreds of students.  The only time he could be creative was during the night while his wife and children slept.  Overworked and running on too little rest, his health already frail with the onset of tuberculosis, Doppler collapsed, and he was unable to continue at the Polytechnic.  In 1847 he transferred to the School of Mines and Forestry in Schemnitz (modern Banská Štiavnica in Slovakia) with more pay and less work.  Yet the revolutions of 1848 swept across Europe, with student uprisings, barricades in the streets, and Hungarian liberation armies occupying the cities and universities, giving him no peace.  Providentially, his former mentor Stampfer retired from the Polytechnic in Vienna, and Doppler was called to fill the vacancy.

Although Doppler was named the Director of Austria’s first Institute of Physics and was elected to the National Academy, he ran afoul of one of the other Academy members, Joseph Petzval (1807 – 1891), who persecuted Doppler and his effect.  To read a detailed description of the attack by Petzval on Doppler’s effect and the effect it had on Doppler, see my feature article “The Fall and Rise of the Doppler Effect” in Physics Today, March issue (2020).

Christian Doppler

Voigt’s Transformation

It is difficult today to appreciate just how deeply engrained the reality of the luminiferous ether was in the psyche of the 19th century physicist.  The last of the classical physicists were reluctant even to adopt Maxwell’s electromagnetic theory for the explanation of optical phenomena, and as physicists inevitably were compelled to do so, some of their colleagues looked on with dismay and disappointment.  This was the situation for Woldemar Voigt (1850 – 1919) at the University of Göttingen, who was appointed as one of the first professors of physics there in 1883, to be succeeded in later years by Peter Debye and Max Born.  Voigt received his doctorate at the University of Königsberg under Franz Neumann, exploring the elastic properties of rock salt, and at Göttingen he spent a quarter century pursuing experimental and theoretical research into crystalline properties.  Voigt’s research, with students like Paul Drude, laid the foundation for the modern field of solid state physics.  His textbook Lehrbuch der Kristallphysik published in 1910 remained influential well into the 20th century because it adopted mathematical symmetry as a guiding principle of physics.  It was in the context of his studies of crystal elasticity that he introduced the word “tensor” into the language of physics.

At the January 1887 meeting of the Royal Society of Science at Göttingen, three months before Michelson and Morley began their reality-altering experiments at the Case Western Reserve University in Cleveland, Ohio, Voigt submitted a paper deriving the longitudinal optical Doppler effect in an incompressible medium.  He was responding to results published in 1886 by Michelson and Morley on their measurements of the Fresnel drag coefficient, which was the precursor to their later results on the absolute motion of the Earth through the ether.

Fresnel drag is the effect of light propagating through a medium that is in motion.  The French physicist François Arago (1786 – 1853) in 1810 had attempted to observe the effects of corpuscles of light emitted from stars propagating with different speeds through the ether as the Earth spun on its axis and traveled around the sun.  He succeeded only in observing ordinary stellar aberration.  The absence of the effects of motion through the ether motivated Augustin-Jean Fresnel (1788 – 1827) to apply his newly-developed wave theory of light to explain the null results.  In 1818 Fresnel derived an expression for the dragging of light by a moving medium that explained the absence of effects in Arago’s observations.  For light propagating through a medium of refractive index n that is moving at a speed v, the resultant velocity of light is

$$u = \frac{c}{n} + v\left(1 - \frac{1}{n^2}\right)$$

where the last term in parentheses is the Fresnel drag coefficient.  The Fresnel drag effect supported the idea of the ether by explaining why its effects could not be observed (a kind of Catch-22), but it also applied to light moving through a moving dielectric medium.  In 1851, Fizeau used an interferometer to measure the Fresnel drag coefficient for light moving through moving water, arriving at conclusions that directly confirmed the Fresnel drag effect.  The positive experiments of Fizeau, as well as the phenomenon of stellar aberration, would be extremely influential on the thoughts of Einstein as he developed his approach to special relativity in 1905.  They were also extremely influential to Michelson, Morley and Voigt.
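As a rough numerical illustration, the sketch below evaluates the Fresnel drag coefficient 1 − 1/n² for water and compares the Fresnel speed c/n + v(1 − 1/n²) with the relativistic composition of c/n and v.  The refractive index and flow speed are illustrative assumptions rather than Fizeau's recorded values, and the function names are hypothetical.

```python
C = 299_792_458.0  # speed of light in vacuum, m/s


def fresnel_speed(n: float, v: float) -> float:
    """Speed of light in a medium of index n moving at speed v (Fresnel, 1818)."""
    return C / n + v * (1.0 - 1.0 / n**2)


def relativistic_speed(n: float, v: float) -> float:
    """Relativistic addition of the light speed in the medium, c/n, and v."""
    u = C / n
    return (u + v) / (1.0 + u * v / C**2)


# Assumed, illustrative values (not Fizeau's measured numbers).
n_water = 1.333
v_flow = 7.0  # m/s

drag_coefficient = 1.0 - 1.0 / n_water**2
difference = fresnel_speed(n_water, v_flow) - relativistic_speed(n_water, v_flow)

print(f"Fresnel drag coefficient for water: {drag_coefficient:.4f}")
print(f"Fresnel minus relativistic speed  : {difference:.2e} m/s")
```

At laboratory flow speeds the two expressions agree to far better than any interferometer could resolve, because they differ only at second order in v/c.  That is one way to see why Fizeau's result could confirm Fresnel's formula and still sit comfortably inside special relativity.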

In his paper on the absence of the Fresnel drag effect in the first Michelson-Morley experiment, Voigt pointed out that an equation of the form

$$\frac{\partial^2 \phi}{\partial x^2} + \frac{\partial^2 \phi}{\partial y^2} + \frac{\partial^2 \phi}{\partial z^2} - \frac{1}{c^2}\frac{\partial^2 \phi}{\partial t^2} = 0$$

is invariant under the transformation

$$x' = x - vt, \qquad y' = y\sqrt{1 - v^2/c^2}, \qquad z' = z\sqrt{1 - v^2/c^2}, \qquad t' = t - \frac{vx}{c^2}$$

From our modern vantage point, we immediately recognize (to within a scale factor) the Lorentz transformation of relativity theory.  The first equation is common Galilean relativity, but the last equation was something new, introducing a position-dependent time as an observer moved with speed β = v/c relative to the speed of light [6].  Using these equations, Voigt was the first to derive the longitudinal (conventional) Doppler effect from relativistic effects.

Voigt’s derivation of the longitudinal Doppler effect used a classical approach that is still used today in Modern Physics textbooks to derive the Doppler effect.  The argument proceeds by considering a moving source that emits a continuous wave in the direction of motion.  Because the wave propagates at a finite speed, the moving source chases the leading edge of the wave front, catching up by a small amount by the time a single cycle of the wave has been emitted.  The resulting compressed oscillation represents a blue shift of the emitted light.  By using his transformations, Voigt arrived at the first relativistic expression for the shift in light frequency.  At low speeds, Voigt’s derivation reverted to Doppler’s original expression.
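The low-speed agreement can be checked numerically.  The sketch below is not Voigt's own derivation: it assumes the textbook two-variable form of his transformation, x' = x − vt and t' = t − vx/c², in units where c = 1, and follows the phase of a plane wave traveling in the +x direction as seen from a frame receding from the source.  All function names are hypothetical.

```python
import math


def voigt_transform(x: float, t: float, beta: float) -> tuple[float, float]:
    """Assumed Voigt-type transformation along x, in units where c = 1."""
    return x - beta * t, t - beta * x


def voigt_doppler(beta: float) -> float:
    """Frequency ratio implied by t' - x' = (1 + beta) (t - x)."""
    return 1.0 / (1.0 + beta)


def lorentz_doppler(beta: float) -> float:
    """Standard relativistic longitudinal Doppler ratio for a receding frame."""
    return math.sqrt((1.0 - beta) / (1.0 + beta))


def first_order_doppler(beta: float) -> float:
    """Doppler's original first-order expression."""
    return 1.0 - beta


# Check the phase identity at an arbitrary event.
x, t, beta = 0.3, 1.7, 0.2
xp, tp = voigt_transform(x, t, beta)
phase_identity_holds = math.isclose(tp - xp, (1.0 + beta) * (t - x))

for b in (1e-4, 0.1, 0.5):
    print(b, first_order_doppler(b), voigt_doppler(b), lorentz_doppler(b))
```

Under these assumptions the Voigt-type ratio differs from the Lorentz result by exactly the factor 1/√(1 − β²), which is the scale factor separating the two transformations, and all three expressions coincide to first order in β.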

A few months after Voigt delivered his paper, Michelson and Morley announced the results of their interferometric measurements of the motion of the Earth through the ether, with their null results.  In retrospect, the Michelson-Morley experiment is viewed as one of the monumental assaults on the old classical physics, helping to launch the relativity revolution.  However, in its own day, it was little more than just another null result on the ether.  It did incite FitzGerald and Lorentz to suggest that the length of the arms of the interferometer contracted in the direction of motion, with the eventual emergence of the full Lorentz transformations by 1904, seventeen years after the Michelson results.

In 1904 Einstein, working in relative isolation at the Swiss patent office, was surprisingly unaware of the latest advances in the physics of the ether.  He did not know about Voigt’s derivation of the relativistic Doppler effect (1887), just as he had not heard of Lorentz’s final version of relativistic coordinate transformations (1904).  His thinking about relativistic effects focused much farther into the past, to Bradley’s stellar aberration (1725) and Fizeau’s experiment of light propagating through moving water (1851).  Einstein proceeded on simple principles, unencumbered by the mental baggage of the day, and delivered his beautifully minimalist theory of special relativity in his famous paper of 1905 “On the Electrodynamics of Moving Bodies”, independently deriving the Lorentz coordinate transformations [7].

One of Einstein’s talents in theoretical physics was to predict new phenomena as a way to provide direct confirmation of a new theory.  This was how he later famously predicted the deflection of light by the Sun and the gravitational frequency shift of light.  In 1905 he used his new theory of special relativity to predict observable consequences that included a general treatment of the relativistic Doppler effect.  This included the effects of time dilation in addition to the longitudinal effect of the source chasing the wave.  Time dilation produced a correction to Doppler’s original expression for the longitudinal effect that became significant at speeds approaching the speed of light.  More significantly, it predicted a transverse Doppler effect for a source moving along a line perpendicular to the line of sight to an observer.  This effect had not been predicted either by Doppler or by Voigt.  The equation for the general Doppler effect for any observation angle is

$$\nu_{\mathrm{obs}} = \nu_{\mathrm{src}}\,\frac{\sqrt{1 - v^2/c^2}}{1 - (v/c)\cos\theta}$$

where θ is the angle, measured in the frame of the observer, between the velocity of the source and the line of sight from the source to the observer.

Just as Doppler had been motivated by Bradley’s aberration of starlight when he conceived of his original principle for the longitudinal Doppler effect, Einstein combined the general Doppler effect with his results for the relativistic addition of velocities (also in his 1905 Annalen paper) as the conclusive treatment of stellar aberration nearly 200 years after Bradley first observed the effect.

Despite the generally positive reception of Einstein’s theory of special relativity, some of its consequences were anathema to many physicists at the time.  A key stumbling block was the question whether relativistic effects, like moving clocks running slowly, were only apparent, or were actually real, and Einstein had to fight to convince others of its reality.  When Johannes Stark (1874 – 1957) observed Doppler line shifts in ion beams called “canal rays” in 1906 (Stark received the 1919 Nobel prize in part for this discovery) [8], Einstein promptly published a paper suggesting how the canal rays could be used in a transverse geometry to directly detect time dilation through the transverse Doppler effect [9].  Thirty years passed before the experiment was performed with sufficient accuracy by Herbert Ives and G. R. Stilwell in 1938 to measure the transverse Doppler effect [10].  Ironically, even at this late date, Ives and Stilwell were convinced that their experiment had disproved Einstein’s time dilation by supporting Lorentz’ contraction theory of the electron.  The Ives-Stilwell experiment was the first direct test of time dilation, followed in 1940 by muon lifetime measurements [11].
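To see why the transverse measurement took so long, it helps to put numbers on it.  The sketch below uses one common form of the general Doppler ratio, ν_obs/ν_src = √(1 − β²)/(1 − β cos θ), with θ measured in the observer's frame between the source velocity and the line of sight toward the observer.  That angle convention and the beam speed β = 0.005 are my own illustrative assumptions, not values from Stark's or Ives and Stilwell's papers.

```python
import math


def doppler_ratio(beta: float, theta: float) -> float:
    """Observed/emitted frequency for a source moving at beta = v/c.

    theta is the angle (radians, observer frame) between the source velocity
    and the direction from the source to the observer (assumed convention).
    """
    return math.sqrt(1.0 - beta**2) / (1.0 - beta * math.cos(theta))


beta = 0.005  # assumed ion-beam speed as a fraction of c

approaching = doppler_ratio(beta, 0.0)
receding = doppler_ratio(beta, math.pi)
transverse = doppler_ratio(beta, math.pi / 2.0)

print(f"approaching: {approaching - 1.0:+.3e}")
print(f"receding   : {receding - 1.0:+.3e}")
print(f"transverse : {transverse - 1.0:+.3e}")
```

For this beam speed the longitudinal shifts are of order β, about half a percent, while the transverse shift is of order β²/2, roughly a part in a hundred thousand.  The second-order effect is the signature of time dilation, and it is several hundred times smaller than the ordinary shift it has to be separated from.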


Further Reading

D. D. Nolte, “The Fall and Rise of the Doppler Effect”, Phys. Today 73(3) 30, March 2020.


Notes

[1] pg. 15, Eden, A. (1992). The search for Christian Doppler. Wien, Springer-Verlag.

[2] pg. 30, Eden

[3] Bradley, J. (1729). “Account of a new discovered Motion of the Fix’d Stars”. Phil. Trans. 35: 637–660.

[4] C. A. Doppler, “Über das farbige Licht der Doppelsterne und einiger anderer Gestirne des Himmels (About the coloured light of the binary stars and some other stars of the heavens),” Proceedings of the Royal Bohemian Society of Sciences, vol. V, no. 2, pp. 465–482, (Reissued 1903) (1842).

[5] B. Bolzano, “Ein Paar Bemerkungen über die neue Theorie in Herrn Professor Ch. Doppler’s Schrift ‘Über das farbige Licht der Doppelsterne und einiger anderer Gestirne des Himmels’,” Pogg. Ann. der Physik und Chemie, vol. 60, p. 83, 1843; B. Bolzano, “Christian Doppler’s neuste Leistungen auf dem Gebiet der physikalischen Apparatenlehre, Akustik, Optik und optische Astronomie,” Pogg. Ann. der Physik und Chemie, vol. 72, pp. 530-555, 1847.

[6] W. Voigt, “Über das Doppler’sche Princip,” Göttinger Nachrichten, vol. 7, pp. 41–51, (1887). The common use of c to express the speed of light came later from Voigt’s student Paul Drude.

[7] A. Einstein, “On the electrodynamics of moving bodies,” Annalen Der Physik, vol. 17, pp. 891-921, 1905.

[8] J. Stark, W. Hermann, and S. Kinoshita, “The Doppler effect in the spectrum of mercury,” Annalen Der Physik, vol. 21, pp. 462-469, Nov 1906.

[9] A. Einstein, “Über die Möglichkeit einer neuen Prüfung des Relativitätsprinzips,” Annalen Der Physik, vol. 328, pp. 197–198, 1907.

[10] H. E. Ives and G. R. Stilwell, “An experimental study of the rate of a moving atomic clock,” Journal of the Optical Society of America, vol. 28, p. 215, 1938.

[11] B. Rossi and D. B. Hall, “Variation of the Rate of Decay of Mesotrons with Momentum,” Physical Review, vol. 59, pp. 223–228, 1941.

Bohr’s Orbits

The first time I ran across the Bohr-Sommerfeld quantization conditions I admit that I laughed! I was a TA for the Modern Physics course as a graduate student at Berkeley in 1982 and I read about Bohr-Sommerfeld in our Tipler textbook. I was familiar with Bohr orbits, which are already the wrong way of thinking about quantized systems. So the Bohr-Sommerfeld conditions, especially for so-called “elliptical” orbits, seemed like nonsense.

But it’s funny how a little distance gives you perspective. Forty years later I know a little more physics than I did then, and I have gained a deep respect for an obscure property of dynamical systems known as “adiabatic invariants”. It turns out that adiabatic invariants lie at the core of quantum systems, and in the case of hydrogen adiabatic invariants can be visualized as … elliptical orbits!

Quantum Physics in Copenhagen

Niels Bohr (1885 – 1962) was born in Copenhagen, Denmark, the middle child of a physiology professor at the University in Copenhagen.  Bohr grew up with his siblings as a faculty child, which meant an unconventional upbringing full of ideas, books and deep discussions.  Bohr was a late bloomer in secondary school but began to show talent in Math and Physics in his last two years.  When he entered the University in Copenhagen in 1903 to major in physics, the university had only one physics professor, Christian Christiansen, and had no physics laboratories.  So Bohr tinkered in his father’s physiology laboratory, performing a detailed experimental study of the hydrodynamics of water jets, writing and submitting a paper that was to be his only experimental work.  Bohr went on to receive a Master’s degree in 1909 and his PhD in 1911, writing his thesis on the theory of electrons in metals.  Although the thesis did not break much new ground, it uncovered striking disparities between observed properties and theoretical predictions based on the classical theory of the electron.  For his postdoc studies he applied for and was accepted to a position working with the discoverer of the electron, Sir J. J. Thomson, in Cambridge.  Perhaps fortunately for the future history of physics, he did not get along well with Thomson, and he shifted his postdoc position in early 1912 to work with Ernest Rutherford at the much less prestigious University of Manchester.

Niels Bohr (Wikipedia)

Ernest Rutherford had just completed a series of detailed experiments on the scattering of alpha particles on gold film and had demonstrated that the mass of the atom was concentrated in a very small volume that Rutherford called the nucleus, which also carried the positive charge compensating the negative electron charges.  The discovery of the nucleus created a radical new model of the atom in which electrons executed planetary-like orbits around the nucleus.  Bohr immediately went to work on a theory for the new model of the atom.  He worked closely with Rutherford and the other members of Rutherford’s laboratory, involved in daily discussions on the nature of atomic structure.  The open intellectual atmosphere of Rutherford’s group and the ready flow of ideas in group discussions became the model for Bohr, who would some years later set up his own research center that would attract the top young physicists of the time.  Already by mid 1912, Bohr was beginning to see a path forward, hinting in letters to his younger brother Harald (who would become a famous mathematician) that he had uncovered a new approach that might explain some of the observed properties of simple atoms. 

By the end of 1912 his postdoc travel stipend was over, and he returned to Copenhagen, where he completed his work on the hydrogen atom.  One of the key discrepancies in the classical theory of the electron in atoms was the requirement, by Maxwell’s equations, for orbiting electrons to continually radiate because of their centripetal acceleration.  Furthermore, from energy conservation, if they radiated continuously, the electron orbits must also eventually decay into the nuclear core with ever-decreasing orbital periods and hence ever higher emitted light frequencies.  Experimentally, on the other hand, it was known that light emitted from atoms had only distinct quantized frequencies.  To circumvent the problem of classical radiation, Bohr simply assumed what was observed, formulating the idea of stationary quantum states.  Light emission (or absorption) could take place only when the energy of an electron changed discontinuously as it jumped from one stationary state to another, and there was a lowest stationary state below which the electron could never fall.  He then took a critical and important step, combining this new idea of stationary states with Planck’s constant h.  He was able to show that the emission spectrum of hydrogen, and hence the energies of the stationary states, could be derived if the angular momentum of the electron in a hydrogen atom was quantized in integer multiples of Planck’s constant h divided by 2π.
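The consequence of this quantization for the spectrum can be sketched in a few lines.  The example below assumes the textbook Bohr result E_n = −Ry/n², with the Rydberg energy and hc entered as rounded constants of my own choosing, and computes the first few Balmer lines (jumps ending on n = 2).  It ignores the reduced-mass correction, so the wavelengths are approximate.

```python
# Rounded constants, entered here as assumptions for illustration.
RYDBERG_EV = 13.605693   # hydrogen binding energy for infinite nuclear mass, eV
HC_EV_NM = 1239.841984   # Planck's constant times the speed of light, eV nm


def energy_ev(n: int) -> float:
    """Energy of the n-th stationary state in the Bohr model."""
    return -RYDBERG_EV / n**2


def emission_wavelength_nm(n_upper: int, n_lower: int) -> float:
    """Vacuum wavelength of the photon emitted in the jump n_upper -> n_lower."""
    photon_energy = energy_ev(n_upper) - energy_ev(n_lower)
    return HC_EV_NM / photon_energy


balmer_lines = {n: emission_wavelength_nm(n, 2) for n in range(3, 7)}
for n, wavelength in balmer_lines.items():
    print(f"n = {n} -> 2 : {wavelength:.1f} nm")
```

The red line near 656 nm and the blue-green line near 486 nm are the familiar first two lines of the Balmer series, which is the kind of agreement that drew so much attention to the model.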

Bohr published his quantum theory of the hydrogen atom in 1913, which immediately focused the attention of a growing group of physicists (including Einstein, Rutherford, Hilbert, Born, and Sommerfeld) on the new possibilities opened up by Bohr’s quantum theory [1].  Emboldened by his growing reputation, Bohr petitioned the university in Copenhagen to create a new faculty position in theoretical physics, and to appoint him to it.  The University was not unreceptive, but university bureaucracies make decisions slowly, so Bohr returned to Rutherford’s group in Manchester while he awaited Copenhagen’s decision.  He waited over two years, but he enjoyed his time in the stimulating environment of Rutherford’s group in Manchester, growing steadily into the role as master of the new quantum theory.  In June of 1916, Bohr returned to Copenhagen and a year later was elected to the Royal Danish Academy of Sciences. 

Although Bohr’s theory had succeeded in describing some of the properties of the electron in atoms, two central features of his theory continued to cause difficulty.  The first was the limitation of the theory to single electrons in circular orbits, and the second was the cause of the discontinuous jumps.  In response to this challenge, Arnold Sommerfeld provided a deeper mechanical perspective on the origins of the discrete energy levels of the atom. 

Quantum Physics in Munich

Arnold Johannes Wilhelm Sommerfeld (1868 – 1951) was born in Königsberg, Prussia, and spent all the years of his education there to his doctorate that he received in 1891.  In Königsberg he was acquainted with Minkowski, Wien and Hilbert, and he was the doctoral student of Lindemann.  He also was associated with a social group at the University that spent too much time drinking and dueling, a distraction that led to his receiving a deep sabre cut on his forehead that became one of his distinguishing features along with his finely waxed moustache.  In outward appearance, he looked the part of a Prussian hussar, but he finally escaped this life of dissipation and landed in Göttingen where he became Felix Klein’s assistant in 1894.  He taught at local secondary schools, rising in reputation, until he secured a faculty position of theoretical physics at the University in Munich in 1906.  One of his first students was Peter Debye who received his doctorate under Sommerfeld in 1908.  Later famous students would include Peter Ewald (doctorate in 1912), Wolfgang Pauli (doctorate in 1921), Werner Heisenberg (doctorate in 1923), and Hans Bethe (doctorate in 1928).  These students had the rare treat, during their time studying under Sommerfeld, of spending weekends in the winter skiing and staying at a ski hut that he owned only two hours by train outside of Munich.  At the end of the day skiing, discussion would turn invariably to theoretical physics and the leading problems of the day.  It was in his early days at Munich that Sommerfeld played a key role aiding the general acceptance of Minkowski’s theory of four-dimensional space-time by publishing a review article in Annalen der Physik that translated Minkowski’s ideas into language that was more familiar to physicists.

Arnold Sommerfeld (Wikipedia)

Around 1911, Sommerfeld shifted his research interest to the new quantum theory, and his interest only intensified after the publication of Bohr’s model of hydrogen in 1913.  In 1915 Sommerfeld significantly extended the Bohr model by building on an idea put forward by Planck.  While further justifying the black body spectrum, Planck turned to descriptions of the trajectory of a quantized one-dimensional harmonic oscillator in phase space.  Planck had noted that the phase-space areas enclosed by the quantized trajectories were integral multiples of his constant.  Sommerfeld expanded on this idea, showing that it was not the area enclosed by the trajectories that was fundamental, but the integral of the momentum over the spatial coordinate [2].  This integral is none other than the original action integral of Maupertuis and Euler, used so famously in their Principle of Least Action almost 200 years earlier.  Where Planck, in his original paper of 1901, had recognized the units of his constant to be those of action, and hence called it the quantum of action, Sommerfeld made the explicit connection to the dynamical trajectories of the oscillators.  He then showed that the same action principle applied to Bohr’s circular orbits for the electron in the hydrogen atom, and that the orbits need not even be circular, but could be elliptical Keplerian orbits.

The quantum condition for this otherwise classical trajectory was the requirement for the action integral over the motion to be equal to integer units of the quantum of action.  Furthermore, Sommerfeld showed that there must be as many action integrals as degrees of freedom for the dynamical system.  In the case of Keplerian orbits, there are radial coordinates as well as angular coordinates, and each action integral was quantized for the discrete electron orbits.  Although Sommerfeld’s action integrals extended Bohr’s theory of quantized electron orbits, the new quantum conditions also created a problem because there were now many possible elliptical orbits that all had the same energy.  How was one to find the “correct” orbit for a given orbital energy?
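The degeneracy is easy to see by enumeration.  The sketch below assumes the standard textbook form of the Sommerfeld conditions for the Kepler problem, ∮p_r dr = n_r h and ∮p_φ dφ = n_φ h, together with the old-quantum-theory results that the energy depends only on n = n_r + n_φ and that the ratio of minor to major axis is n_φ/n.  The names and the rounded Rydberg energy are mine, and n_φ = 0 is excluded as it was in the old theory.

```python
RYDBERG_EV = 13.605693  # rounded hydrogen binding energy, eV (assumed constant)


def sommerfeld_orbits(n: int) -> list[dict]:
    """List the allowed ellipses for principal quantum number n = n_r + n_phi."""
    orbits = []
    for n_phi in range(1, n + 1):           # azimuthal action quantum number
        n_r = n - n_phi                     # radial action quantum number
        orbits.append(
            {
                "n_r": n_r,
                "n_phi": n_phi,
                "axis_ratio": n_phi / n,    # minor axis / major axis
                "energy_ev": -RYDBERG_EV / n**2,
            }
        )
    return orbits


orbits_n3 = sommerfeld_orbits(3)
for orbit in orbits_n3:
    print(orbit)
```

For n = 3 this gives three orbits, two ellipses and one circle, all with the same energy, which is exactly the ambiguity raised in the question above.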

Quantum Physics in Leiden

In 1906, the Austrian physicist Paul Ehrenfest (1880 – 1933), freshly out of his PhD under the supervision of Boltzmann, arrived at Göttingen only weeks before Boltzmann took his own life.  Felix Klein at Göttingen had been relying on Boltzmann to provide a comprehensive review of statistical mechanics for the Mathematical Encyclopedia, so he now entrusted this project to the young Ehrenfest.  It was a monumental task, which was to take him and his physicist wife Tatyana nearly five years to complete.  Part of the delay was the desire by Ehrenfest to close some open problems that remained in Boltzmann’s work.  One of these was a mechanical theorem of Boltzmann’s that identified properties of statistical mechanical systems that remained unaltered through a very slow change in system parameters.  These properties would later be called adiabatic invariants by Einstein.  Ehrenfest recognized that Wien’s displacement law, which had been a guiding light for Planck and his theory of black body radiation, had originally been derived by Wien using classical principles related to slow changes in the volume of a cavity.  Ehrenfest was struck by the fact that such slow changes would not induce changes in the quantum numbers of the quantized states, and hence that the quantum numbers must be adiabatic invariants of the black body system.  This not only explained why Wien’s displacement law continued to hold under quantum as well as classical considerations, but it also explained why Planck’s quantization of the energy of his simple oscillators was the only possible choice.  For a classical harmonic oscillator, the ratio of the energy of oscillation to the frequency of oscillation is an adiabatic invariant, which is immediately recognized as Planck’s quantum condition E = nhν.
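The invariance of the ratio of energy to frequency can be seen in a small numerical experiment.  The sketch below integrates a unit-mass oscillator whose angular frequency is ramped slowly from 1 to 2 over several hundred oscillation periods.  The ramp shape, time step and leapfrog integrator are my own choices for illustration and are not drawn from Ehrenfest's papers.

```python
def simulate_slow_ramp(omega_start=1.0, omega_end=2.0, ramp_time=2000.0, dt=0.01):
    """Integrate x'' = -omega(t)^2 x with a linear, slow ramp of omega.

    Returns (initial energy, final energy, initial E/omega, final E/omega).
    """
    steps = round(ramp_time / dt)

    def omega(t):
        return omega_start + (omega_end - omega_start) * t / ramp_time

    def energy(x, p, w):
        return 0.5 * p * p + 0.5 * w * w * x * x

    x, p = 1.0, 0.0
    initial_energy = energy(x, p, omega_start)

    # Leapfrog (kick-drift-kick) steps, using the instantaneous frequency.
    for i in range(steps):
        w = omega(i * dt)
        p -= 0.5 * dt * w * w * x
        x += dt * p
        w = omega((i + 1) * dt)
        p -= 0.5 * dt * w * w * x

    final_omega = omega(steps * dt)
    final_energy = energy(x, p, final_omega)
    return (
        initial_energy,
        final_energy,
        initial_energy / omega_start,
        final_energy / final_omega,
    )


e_initial, e_final, ratio_initial, ratio_final = simulate_slow_ramp()
print(f"energy   : {e_initial:.4f} -> {e_final:.4f}")
print(f"E / omega: {ratio_initial:.4f} -> {ratio_final:.4f}")
```

The energy roughly doubles as the frequency doubles, while E/ω stays nearly fixed.  Shortening `ramp_time` to a few oscillation periods breaks the invariance, which is the classical counterpart of a sudden change inducing quantum jumps.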

Paul Ehrenfest (Wikipedia)

Ehrenfest published his observations in 1913 [3], the same year that Bohr published his theory of the hydrogen atom, so Ehrenfest immediately applied the theory of adiabatic invariants to Bohr’s model and discovered that the quantum condition for the quantized energy levels was again the adiabatic invariants of the electron orbits, and not merely a consequence of integer multiples of angular momentum, which had seemed somewhat ad hoc.  Later, when Sommerfeld published his quantized elliptical orbits in 1916, the multiplicity of quantum conditions and orbits had caused concern, but Ehrenfest came to the rescue with his theory of adiabatic invariants, showing that each of Sommerfeld’s quantum conditions was precisely an adiabatic invariant of the classical electron dynamics [4].  The remaining question was which coordinates were the correct ones, because different choices led to different answers.  This was quickly solved by Johannes Burgers (one of Ehrenfest’s students) who showed that action integrals were adiabatic invariants, and then by Karl Schwarzschild and Paul Epstein who showed that action-angle coordinates were the only allowed choice of coordinates, because they enabled the separation of the Hamilton-Jacobi equations and hence provided the correct quantization conditions for the electron orbits.  Schwarzschild’s paper was published the same day that he died on the Eastern Front.  The work by Schwarzschild and Epstein was the first to show the power of the Hamiltonian formulation of dynamics for quantum systems, which foreshadowed the future importance of Hamiltonians for quantum theory.

Karl Schwarzschild (Wikipedia)

Bohr-Sommerfeld

Emboldened by Ehrenfest’s adiabatic principle, which demonstrated a close connection between classical dynamics and quantization conditions, Bohr formalized a technique that he had used implicitly in his 1913 model of hydrogen, and now elevated it to the status of a fundamental principle of quantum theory.  He called it the Correspondence Principle, and published the details in 1920.  The Correspondence Principle states that as the quantum number of an electron orbit increases to large values, the quantum behavior converges to classical behavior.  Specifically, if an electron in a state of high quantum number emits a photon while jumping to a neighboring orbit, then the wavelength of the emitted photon approaches the classical radiation wavelength of the electron subject to Maxwell’s equations. 
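The convergence described by the Correspondence Principle is easy to see in numbers for the Bohr model of hydrogen. The sketch below compares the frequency of the photon emitted in a jump from orbit n to orbit n − 1 with the classical orbital frequency of orbit n, using the standard Bohr-model expressions written in terms of the Rydberg frequency. The function names are illustrative choices, and frequency is used in place of wavelength since the two are equivalent.

```python
RYDBERG_FREQUENCY_HZ = 3.2898419603e15  # c times the Rydberg constant


def transition_frequency(n):
    """Photon frequency for the jump n -> n-1 in the Bohr model of hydrogen."""
    return RYDBERG_FREQUENCY_HZ * (1.0 / (n - 1) ** 2 - 1.0 / n ** 2)


def classical_orbital_frequency(n):
    """Orbital (and hence classical radiation) frequency of the n-th Bohr orbit."""
    return 2.0 * RYDBERG_FREQUENCY_HZ / n ** 3


def frequency_ratio(n):
    return transition_frequency(n) / classical_orbital_frequency(n)


for n in (2, 10, 100, 1000):
    print(f"n = {n:5d}   quantum / classical = {frequency_ratio(n):.5f}")
```

The ratio is 3 for the jump from n = 2 and falls to within about 0.2% of unity by n = 1000, which is the quantum-to-classical convergence that Bohr elevated to a principle.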

Bohr’s Correspondence Principle cemented the bridge between classical physics and quantum physics.  One of the biggest outstanding questions about the physics of electron orbits in atoms had been why they did not radiate continuously because of the centripetal acceleration they experienced in their orbits.  Bohr had now reconnected to Maxwell’s equations and classical physics in the limit.  Like the theory of adiabatic invariants, the Correspondence Principle became a new tool for distinguishing among different quantum theories.  It could be used as a filter to distinguish “correct” quantum models, which transitioned smoothly from quantum to classical behavior, from those that did not.  Bohr’s Correspondence Principle was to be a powerful tool in the hands of Werner Heisenberg as he reinvented quantum theory only a few years later.

Quantization conditions.

By the end of 1920, all the elements of the quantum theory of electron orbits were apparently falling into place.  Bohr’s originally ad hoc quantization condition was now on firm footing.  The quantization conditions were related to action integrals that were, in turn, adiabatic invariants of the classical dynamics.  This meant that slight variations in the parameters of the dynamical systems would not induce quantum transitions among the various quantum states.  This conclusion would have felt right to the early quantum practitioners.  Bohr’s quantum model of electron orbits was fundamentally a means of explaining quantum transitions between stationary states.  Now it appeared that the condition for the stationary states of the electron orbits was an insensitivity, or invariance, to variations in the dynamical properties.  This was analogous to the principle of stationary action where the action along a dynamical trajectory is invariant to slight variations in the trajectory.  Therefore, the theory of quantum orbits now rested on firm foundations that seemed as solid as the foundations of classical mechanics.

From the perspective of modern quantum theory, the concept of elliptical Keplerian orbits for the electron is grossly inaccurate.  Most physicists shudder when they see the symbol for atomic energy—the classic but mistaken icon of electron orbits around a nucleus.  Nonetheless, Bohr and Ehrenfest and Sommerfeld had hit on a deep thread that runs through all of physics—the concept of action—the same concept that Leibniz introduced, that Maupertuis minimized and that Euler canonized.  This concept of action is at work in the macroscopic domain of classical dynamics as well as the microscopic world of quantum phenomena.  Planck was acutely aware of this connection with action, which is why he so readily recognized his elementary constant as the quantum of action. 

However, the old quantum theory was running out of steam.  For instance, the action integrals and adiabatic invariants only worked for single electron orbits, leaving the vast bulk of many-electron atomic matter beyond the reach of quantum theory and prediction.  The literal electron orbits were a crutch or bias that prevented physicists from moving past them and seeing new possibilities for quantum theory.  Orbits were an anachronism, exerting a damping force on progress.  This limitation became painfully clear when Bohr and his assistants at Copenhagen–Kramers and Slater–attempted to use their electron orbits to explain the refractive index of gases.  The theory was cumbersome and exhausted.  It was time for a new quantum revolution by a new generation of quantum wizards–Heisenberg, Born, Schrödinger, Pauli, Jordan and Dirac.


References

[1] N. Bohr, “On the Constitution of Atoms and Molecules, Part II Systems Containing Only a Single Nucleus,” Philosophical Magazine, vol. 26, pp. 476–502, 1913.

[2] A. Sommerfeld, “The quantum theory of spectral lines,” Annalen Der Physik, vol. 51, pp. 1-94, Sep 1916.

[3] P. Ehrenfest, “Een mechanische theorema van Boltzmann en zijne betrekking tot de quanta theorie (A mechanical theorem of Boltzmann and its relation to the theory of energy quanta),” Verslag van de Gewone Vergaderingen der Wis- en Natuurkundige Afdeeling, vol. 22, pp. 586-593, 1913.

[4] P. Ehrenfest, “Adiabatic invariables and quantum theory,” Annalen Der Physik, vol. 51, pp. 327-352, Oct 1916.

Who Invented the Quantum? Einstein vs. Planck

Albert Einstein defies condensation: it is impossible to condense his approach, his insight and his motivation into a single word like “genius”.  He was complex, multifaceted, contradictory, revolutionary as well as conservative.  Some of his work was so simple that it is hard to understand why no one else did it first, even when they were right in the middle of it.  Lorentz and Poincaré spring to mind: they had been circling the ideas of spacetime for decades but never stepped back to see what the simplest explanation could be.  Einstein did, and his special relativity was simple and beautiful, and the math is just high-school algebra.  On the other hand, parts of his work, like gravitation, are so embroiled in mathematics and the religion of general covariance that they remain opaque to physics neophytes 100 years later and are usually reserved for graduate study.

            Yet there is a third thread in Einstein’s work that relies on pure intuition—neither simple nor complicated—but almost impossible to grasp how he made his leap.  This is the case when he proposed the real existence of the photon—the quantum particle of light.  For ten years after this proposal, it was considered by almost everyone to be his greatest blunder. It even came up when Planck was nominating Einstein for membership in the German Academy of Science. Planck said

That he may sometimes have missed the target of his speculations, as for example, in his hypothesis of light quanta, cannot really be held against him.

In this single statement, we have the father of the quantum being criticized by the father of the quantum discontinuity.

Max Planck’s Discontinuity

In histories of the development of quantum theory, the German physicist Max Planck (1858 – 1947) is characterized as an unlikely revolutionary.  He was an establishment man, in the stolid German tradition, who was already embedded in his career, in his forties, holding a coveted faculty position at the University of Berlin.  In his research, he was responding to a theoretical challenge issued by Kirchhoff many years earlier, in 1860, to find the function of temperature and wavelength that described and explained the observed spectrum of radiating bodies.  Planck was not looking for a revolution.  In fact, he was looking for the opposite.  One of his motivations in studying the thermodynamics of electromagnetic radiation was to rebut the statistical theories of Boltzmann.  Planck had never been convinced by the atomistic and discrete approach Boltzmann had used to explain entropy and the second law of thermodynamics.  With the continuum of light radiation he thought he had the perfect system that would show how entropy behaved in a continuous manner, without the need for discrete quantities.

Therefore, Planck’s original intentions were to use blackbody radiation to argue against Boltzmann—to set back the clock.  For this reason, not only was Planck an unlikely revolutionary, he was a counter-revolutionary.  But Planck was a revolutionary because that is what he did, whatever his original intentions were, and he accepted his role as a revolutionary when he had the courage to stand in front of his scientific peers and propose a quantum hypothesis that lay at the heart of physics.

            Blackbody radiation, at the end of the nineteenth century, was a topic of keen interest and had been measured with high precision.  This was in part because it was such a “clean” system, having fundamental thermodynamic properties independent of any of the material properties of the black body, unlike the so-called ideal gases, which always showed some dependence on the molecular properties of the gas. The high-precision measurements of blackbody radiation were made possible by new developments in spectrometers at the end of the century, as well as infrared detectors that allowed very precise and repeatable measurements to be made of the spectrum across broad ranges of wavelengths. 

In 1893 the German physicist Wilhelm Wien (1864 – 1928) had used adiabatic expansion arguments to derive what became known as Wien’s Displacement Law, which showed a simple inverse relationship between the temperature of the blackbody and the peak wavelength.  Later, in 1896, he showed that the high-frequency behavior could be described by an exponential function of temperature and wavelength that required no other properties of the blackbody.  This was approaching the solution of Kirchhoff’s challenge of 1860 seeking a universal function.  However, at lower frequencies Wien’s approximation failed to match the measured spectrum.  In mid-year 1900, Planck was able to define a single functional expression that described the experimentally observed spectrum.  Planck had succeeded in describing black-body radiation, but he had not satisfied Kirchhoff’s second condition, which was to explain it.
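The way Wien's exponential form tracks the full spectrum at high frequency but fails at low frequency can be illustrated with the modern expressions for the two laws. In the sketch below both are written in terms of the dimensionless variable x = hf/(k_B T) with common prefactors dropped. This uses today's textbook forms, not the notation of the 1890s, and the function names are illustrative.

```python
import math


def planck_shape(x):
    """Planck spectrum with prefactors dropped; x = h f / (k_B T)."""
    return x ** 3 / math.expm1(x)


def wien_shape(x):
    """Wien's exponential approximation in the same units."""
    return x ** 3 * math.exp(-x)


for x in (0.1, 1.0, 5.0, 10.0):
    ratio = wien_shape(x) / planck_shape(x)
    print(f"x = {x:5.1f}   Wien / Planck = {ratio:.4f}")
```

The ratio works out analytically to 1 − e^(−x), so the two forms agree closely for x much larger than 1 while Wien's form falls to less than a tenth of the Planck value at x = 0.1.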

Therefore, to explain the blackbody spectrum, Planck modeled the emitting body as a set of ideal oscillators.  As an expert in the Second Law, Planck derived the functional form for the radiation spectrum, from which he found the entropy of the oscillators that produced the spectrum.  However, once he had the form for the entropy, he needed to explain why it took that specific form.  In this sense, he was working backwards from a known solution rather than forwards from first principles.  Planck was at an impasse.  He struggled but failed to find any continuum theory that could work.

Then Planck turned to Boltzmann’s statistical theory of entropy, the same theory that he had previously avoided and had hoped to discredit.  He described this as “an act of despair … I was ready to sacrifice any of my previous convictions about physics.”  In Boltzmann’s expression for entropy, it was necessary to “count” possible configurations of states.  But counting can only be done if the states are discrete.  Therefore, he lumped the energies of the oscillators into discrete ranges, or bins, that he called “quanta”.  The size of the bins was proportional to the frequency of the oscillator, and the proportionality constant had the units of Maupertuis’ quantity of action, so Planck called it the “quantum of action”. Finally, based on this quantum hypothesis, Planck derived the functional form of black-body radiation.

Planck presented his findings at a meeting of the German Physical Society in Berlin on December 14, 1900, introducing the word quantum (plural quanta) into physics from the Latin word that means quantity [1].  It was a casual meeting, and while the attendees knew they were seeing an intriguing new physical theory, there was no sense of a revolution.  But Planck himself was aware that he had created something fundamentally new.  The radiation law of cavities depended on only two physical properties (the temperature and the wavelength) and on two constants: Boltzmann’s constant k_B and a new constant that later became known as Planck’s constant h = ΔE/f = 6.6 × 10⁻³⁴ J·s.  By combining these two constants with other fundamental constants, such as the speed of light, Planck was able to establish accurate values for long-sought constants of nature, like Avogadro’s number and the charge of the electron.

            Although Planck’s quantum hypothesis in 1900 explained the blackbody radiation spectrum, his specific hypothesis was that it was the interaction of the atoms and the light field that was somehow quantized.  He certainly was not thinking in terms of individual quanta of the light field.

Figure. Einstein and Planck at a dinner held by Max von Laue in Berlin on Nov. 11, 1931.

Einstein’s Quantum

When Einstein analyzed the properties of the blackbody radiation in 1905, using his deep insight into statistical mechanics, he was led to the inescapable conclusion that light itself must be quantized in amounts E = hf, where h is Planck’s constant and f is the frequency of the light field.  Although this equation is exactly the same as Planck’s from 1900, the meaning was completely different.  For Planck, this was the discreteness of the interaction of light with matter.  For Einstein, this was the quantum of light energy—whole and indivisible—just as if the light quantum were a particle with particle properties.  For this reason, we can answer the question posed in the title of this Blog—Einstein takes the honor of being the inventor of the quantum.

Einstein’s clarity of vision is a marvel to behold even to this day.  His special talent was to take simple principles, ones that are almost trivial and beyond reproach, and to derive something profound.  In Special Relativity, he simply assumed the constancy of the speed of light and derived Lorentz’s transformations that had originally been based on abstruse electromagnetic arguments about the electron.  In General Relativity, he assumed that free fall represented an inertial frame, and he concluded that gravity must bend light.  In quantum theory, he assumed that the low-density limit of Planck’s theory had to be consistent with light in thermal equilibrium with the black body container, and he concluded that light itself must be quantized into packets of indivisible energy quanta [2].  One immediate consequence of this conclusion was his simple explanation of the photoelectric effect, for which the energy of an electron ejected from a metal by ultraviolet irradiation is a linear function of the frequency of the radiation.  Einstein published his theory of the quanta of light [3] as one of his four famous 1905 articles in Annalen der Physik in his Annus Mirabilis.

Figure. In the photoelectric effect a photon is absorbed by an electron state in a metal promoting the electron to a free electron that moves with a maximum kinetic energy given by the difference between the photon energy and the work function W of the metal. The energy of the photon is absorbed as a whole quantum, proving that light is composed of quantized corpuscles that are today called photons.
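The linear relation between electron energy and light frequency can be written as a short calculation. The sketch below evaluates the maximum kinetic energy as the photon energy hf minus the work function W, and returns zero below the threshold frequency. The work function of 2.0 eV and the frequencies are illustrative values rather than data for any particular metal, and the function name is an assumption of this example.

```python
PLANCK_H = 6.62607015e-34          # Planck's constant in J s (exact SI value)
ELECTRON_CHARGE = 1.602176634e-19  # elementary charge in C, converts J to eV


def max_kinetic_energy_ev(frequency_hz, work_function_ev):
    """Maximum kinetic energy (eV) of an ejected electron: h f - W, or 0 below threshold."""
    photon_energy_ev = PLANCK_H * frequency_hz / ELECTRON_CHARGE
    return max(0.0, photon_energy_ev - work_function_ev)


WORK_FUNCTION_EV = 2.0  # illustrative value only

for frequency in (4.0e14, 6.0e14, 8.0e14, 1.0e15):
    energy = max_kinetic_energy_ev(frequency, WORK_FUNCTION_EV)
    print(f"f = {frequency:.1e} Hz   KE_max = {energy:.3f} eV")
```

Above threshold the slope of kinetic energy against frequency is h (expressed here in eV per Hz), independent of the metal, which is the feature that a precise measurement of the effect can test.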

Einstein’s theory of light quanta was controversial and was slow to be accepted.  It is ironic that in 1914 when Einstein was being considered for a position at the University in Berlin, Planck himself, as he championed Einstein’s case to the faculty, implored his colleagues to accept Einstein despite his ill-conceived theory of light quanta [4].  This comment by Planck goes far to show how Planck, father of the quantum revolution, did not fully grasp, even by 1914, the fundamental nature and consequences of his original quantum hypothesis.  The following year, the American physicist Robert Millikan (1868 – 1953) performed a precise experimental measurement of the photoelectric effect, with the ostensible intention of proving Einstein wrong, but he accomplished just the opposite, providing clean experimental evidence confirming Einstein’s theory of the photoelectric effect.

The Stimulated Emission of Light

About a year after Millikan proved that the quantum of energy associated with light absorption was absorbed as a whole quantum of energy that was not divisible, Einstein took a step further in his theory of the light quantum. In 1916 he published a paper in the proceedings of the German Physical Society that explored how light would be in a state of thermodynamic equilibrium when interacting with atoms that had discrete energy levels. Once again he used simple arguments, this time using the principle of detailed balance, to derive a new and unanticipated property of light—stimulated emission!

Figure. The stimulated emission of light. An excited state is stimulated to emit an identical photon when the electron transitions to its ground state.

The stimulated emission of light occurs when an electron is in an excited state of a quantum system, like an atom, and an incident photon stimulates the emission of a second photon that has the same energy and phase as the first photon. If there are many atoms in the excited state, then this process leads to a chain reaction as 1 photon produces 2, and 2 produce 4, and 4 produce 8, etc. This exponential gain in photons with the same energy and phase is the origin of laser radiation. At the time that Einstein proposed this mechanism, lasers were half a century in the future, but he was led to this conclusion by extremely simple arguments about transition rates.

Figure. Section of Einstein’s 1916 paper that describes the absorption and emission of light by atoms with discrete energy levels [5].

Detailed balance is a principle that states that in thermal equilibrium all fluxes are balanced. In the case of atoms with ground states and excited states, this principle requires that as many transitions occur from the ground state to the excited state as from the excited state to the ground state. The crucial new element that Einstein introduced was to distinguish spontaneous emission from stimulated emission. Just as the probability to absorb a photon must be proportional to the photon density, there must be an equivalent process that de-excites the atom that also must be proportional to the photon density. In addition, an electron must be able to spontaneously emit a photon with a rate that is independent of photon density. This leads to distinct coefficients in the transition rate equations that are today called the “Einstein A and B coefficients”. The B coefficients relate to the photon density, while the A coefficient relates to spontaneous emission.

Figure. Section of Einstein’s 1917 paper that derives the equilibrium properties of light interacting with matter. The “B”-coefficient for transition from state m to state n describes stimulated emission. [6]

Using the principle of detailed balance together with his A and B coefficients as well as Boltzmann factors describing the number of excited states relative to ground state atoms in equilibrium at a given temperature, Einstein was able to derive an early form of what is today called the Bose-Einstein occupancy function for photons.
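The balance of rates can also be checked numerically. The sketch below relaxes a two-level population under absorption, stimulated emission and spontaneous emission, and compares the resulting population ratio with the Boltzmann factor. It assumes thermal radiation for which the stimulated rate Bρ equals A times the Bose-Einstein photon number; the function name, the unit choice A = 1 and the simple Euler time stepping are choices made for this example, not part of Einstein's argument.

```python
import math


def steady_state_ratio(x, a_coeff=1.0, dt=1e-3, t_end=20.0):
    """Relax a two-level system to steady state and return N_upper / N_lower.

    x is the level spacing divided by k_B T.
    """
    photon_number = 1.0 / math.expm1(x)   # thermal occupancy (assumed Planck form)
    stimulated = a_coeff * photon_number  # B * rho expressed as A * photon_number
    n_lower, n_upper = 1.0, 0.0           # start with every atom in the ground state
    for _ in range(int(t_end / dt)):
        up = stimulated * n_lower                # absorption
        down = (a_coeff + stimulated) * n_upper  # spontaneous plus stimulated emission
        n_lower += dt * (down - up)
        n_upper += dt * (up - down)
    return n_upper / n_lower


for x in (0.5, 1.0, 3.0):
    print(f"x = {x:3.1f}   simulated = {steady_state_ratio(x):.6f}   "
          f"Boltzmann = {math.exp(-x):.6f}")
```

The populations settle to the Boltzmann ratio only because the downward rate contains both the A term and the Bρ term. Dropping the stimulated term from `down` gives a ratio that disagrees with the Boltzmann factor, which is the inconsistency that forces stimulated emission into the theory.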

Derivation of the Einstein A and B Coefficients

Detailed balance requires the rate from m to n to be the same as the rate from n to m

where the first term is the spontaneous emission rate from the excited state m to the ground state n, the second term is the stimulated emission rate, and the third term (on the right) is the absorption rate from n to m. The numbers in each state are Nm and Nn, and the density of photons is ρ. The number in the excited state relative to the ground state is given by the Boltzmann factor

Assuming that the stimulated transition coefficient from n to m is the same as that from m to n, and inserting the Boltzmann factor, yields

The Planck density of photons for ΔE = hf is

which yields the final relation between the spontaneous emission coefficient and the stimulated emission coefficient

The total emission rate is

where the p-bar is the average photon number in the cavity. One of the striking aspects of this derivation is that no assumptions are made about the physical mechanisms that determine the coefficient B. Only arguments of detailed balance are required to arrive at these results.
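Written out explicitly, the steps of this derivation can be sketched as follows. The labels A_{mn}, B_{mn} and B_{nm} for the coefficients are assumptions made here, as is the choice to treat ρ as the spectral energy density of the radiation; a convention based on photon number density would change the prefactor in the final ratio.

$$
\begin{aligned}
A_{mn} N_m + B_{mn}\,\rho\,N_m &= B_{nm}\,\rho\,N_n && \text{(detailed balance)} \\
\frac{N_m}{N_n} &= e^{-\Delta E / k_B T} && \text{(Boltzmann factor)} \\
\rho &= \frac{A_{mn}/B_{mn}}{e^{\Delta E / k_B T} - 1} && \text{(with } B_{nm} = B_{mn}\text{)} \\
\rho &= \frac{8\pi h f^3}{c^3}\,\frac{1}{e^{hf / k_B T} - 1} && \text{(Planck, } \Delta E = hf\text{)} \\
\frac{A_{mn}}{B_{mn}} &= \frac{8\pi h f^3}{c^3} \\
A_{mn} + B_{mn}\,\rho &= A_{mn}\left(1 + \bar{p}\right), \qquad \bar{p} = \frac{1}{e^{hf / k_B T} - 1} && \text{(total emission rate per excited atom)}
\end{aligned}
$$

The third line follows from the first two by dividing through by N_n and solving for ρ. Comparing it with the Planck form fixes the ratio of the coefficients, and substituting that ratio back gives the total emission rate in terms of the average photon number.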

Einstein’s Quantum Legacy

Einstein was awarded the Nobel Prize in 1921 for the photoelectric effect, not for the photon nor for any of his other theoretical accomplishments.  Even in 1921, the quantum nature of light remained controversial.  It was only in 1923, after the American physicist Arthur Compton (1892 – 1962) showed that energy and momentum were conserved in the scattering of photons from electrons, that the quantum nature of light began to be accepted.  A few years later, in 1926, the quantum of light was named the “photon” by the American chemical physicist Gilbert Lewis (1875 – 1946).

A blog article like this, which attributes the invention of the quantum to Einstein rather than Planck, must say something about the irony of this attribution.  If Einstein is the father of the quantum, he ultimately was led to disinherit his own brainchild.  His final and strongest argument against the quantum properties inherent in the Copenhagen Interpretation was his famous EPR paper which, against his expectations, launched the concept of entanglement that underlies the coming generation of quantum computers.


Einstein’s Quantum Timeline

1900 – Planck’s quantum discontinuity for the calculation of the entropy of blackbody radiation.

1905 – Einstein’s “Miracle Year”. Proposes the light quantum.

1911 – First Solvay Conference on the theory of radiation and quanta.

1913 – Bohr’s quantum theory of hydrogen.

1914 – Einstein becomes a member of the German Academy of Science.

1915 – Millikan measurement of the photoelectric effect.

1916 – Einstein proposes stimulated emission.

1921 – Einstein receives Nobel Prize for the photoelectric effect. Third Solvay Conference on atoms and electrons.

1927 – Heisenberg’s uncertainty relation. Fifth Solvay International Conference on Electrons and Photons in Brussels. “First” Bohr-Einstein debate on indeterminacy in quantum theory.

1930 – Sixth Solvay Conference on magnetism. “Second” Bohr-Einstein debate.

1935 – Einstein-Podolsky-Rosen (EPR) paper on the completeness of quantum mechanics.


Selected Einstein Quantum Papers

Einstein, A. (1905). “Generation and conversion of light with regard to a heuristic point of view.” Annalen Der Physik 17(6): 132-148.

Einstein, A. (1907). “Die Plancksche Theorie der Strahlung und die Theorie der spezifischen Wärme.” Annalen der Physik 22: 180–190.

Einstein, A. (1909). “On the current state of radiation problems.” Physikalische Zeitschrift 10: 185-193.

Einstein, A. and O. Stern (1913). “An argument for the acceptance of molecular agitation at absolute zero.” Annalen Der Physik 40(3): 551-560.

Einstein, A. (1916). “Strahlungs-Emission und -Absorption nach der Quantentheorie.” Verh. Deutsch. Phys. Ges. 18: 318.

Einstein, A. (1917). “Quantum theory of radiation.” Physikalische Zeitschrift 18: 121-128.

Einstein, A., B. Podolsky and N. Rosen (1935). “Can quantum-mechanical description of physical reality be considered complete?” Physical Review 47(10): 777-780.


Notes

[1] M. Planck, “Elementary quanta of matter and electricity,” Annalen Der Physik, vol. 4, pp. 564-566, Mar 1901.

[2] Klein, M. J. (1964). Einstein’s First Paper on Quanta. The natural philosopher. D. A. Greenberg and D. E. Gershenson. New York, Blaisdell. 3.

[3] A. Einstein, “Generation and conversion of light with regard to a heuristic point of view,” Annalen Der Physik, vol. 17, pp. 132-148, Jun 1905.

[4] Chap. 2 in “Mind at Light Speed”, by David Nolte (Free Press, 2001)

[5] Einstein, A. (1916). “Strahlungs-Emission und -Absorption nach der Quantentheorie.” Verh. Deutsch. Phys. Ges. 18: 318.

[6] Einstein, A. (1917). “Quantum theory of radiation.” Physikalische Zeitschrift 18: 121-128.

George Stokes’ Law of Drag

The triumvirate of Cambridge University in the mid-1800’s consisted of three towering figures of mathematics and physics:  George Stokes (1819 – 1903), William Thomson (1824 – 1907) (Lord Kelvin), and James Clerk Maxwell (1831 – 1879).  Their discoveries and methodology changed the nature of natural philosophy, turning it into the subject that today we call physics.  Stokes was the elder, establishing himself as the predominant expert in British mathematical physics, setting the tone for his close friend Thomson (close in age and temperament) as well as the younger Maxwell and many other key figures of 19th century British physics.

            George Gabriel Stokes was born in County Sligo in Ireland as the youngest son of the rector of Skreen parish of the Church of Ireland.  No miraculous stories of his intellectual acumen seem to come from his childhood, as they did for the likes of William Hamilton (1805 – 1865) or George Green (1793 – 1841).  Stokes was a good student, attending school in Skreen, then Dublin and Bristol before entering Pembroke College Cambridge in 1837.  It was towards the end of his time at Cambridge that he emerged as a top mathematics student and as a candidate for Senior Wrangler.

Church of Ireland church in Skreen, County Sligo, Ireland

The Cambridge Wrangler

Since 1748, the mathematics course at Cambridge University has held a yearly contest to identify the top graduating mathematics student.  The winner of the contest is called the Senior Wrangler, and in the 1800’s the Senior Wrangler received a level of public fame and admiration for intellectual achievement that is somewhat like the fame reserved today for star athletes.  In 1824 the mathematics course was reorganized into the Mathematical Tripos, and the contest became known as the Tripos Exam.  The depth and length of the exam were legendary.  For instance, in 1854 when Edward Routh (1831 – 1907) beat out Maxwell for Senior Wrangler, the Tripos consisted of 16 papers spread over 8 days, totaling over 40 hours and 211 questions.  The winner typically scored less than 50%.  Famous Senior Wranglers include George Airy, John Herschel, Arthur Cayley, Lord Rayleigh, Arthur Eddington, J. E. Littlewood, Peter Guthrie Tait and Joseph Larmor.

Pembroke College, Cambridge

            In his second year at Cambridge, Stokes had begun studying under William Hopkins (1793 – 1866).  It was common for mathematics students to have private tutors to prep for the Tripos exam, and Tripos tutors were sometimes as famous as the Senior Wranglers themselves, especially if a tutor (like Hopkins) was to have several students win the exam.  George Stokes became Senior Wrangler in 1841, and the same year he won the Smith’s Prize in mathematics.  The Tripos tested primarily on bookwork, while the Smith’s Prize tested on originality.  To achieve top scores on both designated the student as the most capable and creative mathematician of his class.  Stokes was immediately offered a fellowship at Pembroke College allowing him to teach and study whatever he willed.

Part I of the Tripos Exam 1890.

After Stokes graduated, Hopkins suggested that Stokes study hydrodynamics.  This may have been in part motivated by Hopkins’ own interest in hydraulic problems in geology, but it was also a prescient suggestion, because hydrodynamics was poised for a revolution.

The Early History of Hydrodynamics

Leonardo da Vinci (1452 – 1519) believed that an artist, to capture the essence of a subject, needed to understand its fundamental nature.  Therefore, when he was captivated by the idea of portraying the flow of water, he filled his notebooks with charcoal studies of the whorls and vortices of turbulent torrents and waterfalls.  He was a budding experimental physicist, recording data on the complex phenomenon of hydrodynamics.  Yet Leonardo was no mathematician, and although his understanding of turbulent flow was deep, he did not have the theoretical tools to explain what he saw.  Two centuries later, Daniel Bernoulli (1700 – 1782) provided the first mathematical description of water flowing smoothly in his Hydrodynamica (1738).  However, the modern language of calculus was only beginning to be used at that time, preventing Daniel from providing a rigorous derivation. 

As for nearly all nascent mathematical theories of the mid 1700’s, whether they be Newtonian dynamics or the calculus of variations or number and graph theory or population dynamics or almost anything, the person who placed the theory on firm mathematical foundations, using modern notions and notations, was Leonhard Euler (1707 – 1783).  In 1752 Euler published a treatise that described the mathematical theory of inviscid flow, meaning flow without viscosity.  Euler’s chief result is

where ρ is the density, v is the velocity, p is pressure, z is the height of the fluid and φ is a velocity potential, while f(t) is a stream function that depends only on time.  If the flow is in steady state, the time derivative vanishes, and the stream function is a constant.  The key to the inviscid approximation is the dominance of momentum in fast flow, as opposed to creeping flow in which viscosity dominates.  Euler’s equation, which expresses the well-known Bernoulli principle, works well under fast laminar conditions, but under slower flow conditions, internal friction ruins the inviscid approximation.
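For reference, a form of this result that is consistent with the symbols defined above is sketched below. The gravitational acceleration g and the convention that the velocity is the gradient of the potential, v = ∇φ, are assumptions added here rather than taken from the text.

$$
\rho\,\frac{\partial \varphi}{\partial t} + \frac{1}{2}\,\rho\,v^2 + p + \rho\,g\,z = f(t)
$$

In steady state the first term vanishes and f is a constant, which leaves the familiar Bernoulli relation

$$
\frac{1}{2}\,\rho\,v^2 + p + \rho\,g\,z = \text{constant}.
$$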

The violation of the inviscid flow approximation became one of the important outstanding problems in theoretical physics in the early 1800’s.  For instance, the flow of water around ships’ hulls was a key technological problem in the strategic need for speed under sail.  In addition, understanding the creation and propagation of water waves was critical for the safety of ships at sea.  For the growing empire of the British islands, built on the power of their navy, the physics of hydrodynamics was more than an academic pursuit, and their archenemy, the French, were leading the way.

The French Analysts

In 1713 when Newton won his priority dispute with Leibniz over the invention of calculus, it had the unintended consequence of setting back British mathematics and physics for over a hundred years.  Perhaps lulled into complacency by their perceived superiority, Cambridge and Oxford continued teaching classical mathematics, and natural philosophy became dogmatic as Newton’s Principia became canon.  Meanwhile Continental mathematical analysis went through a fundamental transformation.  Inspired by Newton’s Principia rather than revering it, mathematicians such as the Swiss-German Leonhard Euler, the Frenchwoman Émilie du Châtelet and the Italian Joseph Lagrange pushed mathematical physics far beyond Newton by developing Leibniz’ methods and notations for calculus.

The mathematicians Newton, Navier and Stokes

            By the early 1800’s, the leading mathematicians of Europe were in the French school led by Pierre-Simon Laplace along with Joseph Fourier, Siméon Denis Poisson and Augustin-Louis Cauchy.  In their hands, functional analysis was going through rapid development, both theoretically and applied, far surpassing British mathematics.  It was by reading the French analysts in the 1820’s that the Englishman George Green finally helped bring British mathematics back up to speed.

One member of the French school was the French engineer Claude-Louis Navier (1785 – 1836).  He was educated at the Ecole Polytechnique and the School for Roads and Bridges where he became one of the leading architects for bridges in France.  In addition to his engineering tasks, he also was involved in developing principles of work and kinetic energy that aided the later work of Coriolis, who was one of the first physicists to recognize the explicit interplay between kinetic energy and potential energy.  One of Navier’s specialties was hydraulic engineering, and he edited a new edition of a classic work on hydraulics.  In the process, he became aware of serious deficiencies in the theoretical treatment of creeping flow, especially with regard to dissipation.  By adopting a molecular approach championed by Poisson, including appropriate boundary conditions, he derived a correction to the Euler flow equations that included a new term with a new material property of viscosity.

Navier-Stokes Equation
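
For reference, the incompressible form of the equation can be written in modern vector notation as below.  This is the standard textbook form, not a transcription of Navier’s own notation, and the symbols are assumptions of this sketch: ρ is the fluid density, v the velocity field, p the pressure and μ the dynamic viscosity.  Dropping the last term on the right recovers the Euler flow equation, so the viscous term is the correction described above.

```latex
\rho\left(\frac{\partial \mathbf{v}}{\partial t} + (\mathbf{v}\cdot\nabla)\,\mathbf{v}\right)
  = -\nabla p + \mu\,\nabla^{2}\mathbf{v},
\qquad \nabla\cdot\mathbf{v} = 0
```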

Navier published his new flow equation in 1823, but the publication was followed by years of nasty in-fighting as his assumptions were assaulted by Poisson and others.  This acrimony is partly to blame for why Navier was not hailed alone as the discoverer of this equation, which today bears the name “Navier-Stokes Equation”.

Stokes’ Hydrodynamics

Despite the lead of the French mathematicians over the British in mathematical rigor, they were also bogged down by their insistence on mechanistic models that operated on the microscale action-reaction forces.  This was true for their theories of elasticity, hydrodynamics as well as the luminiferous ether.  George Green in England would change this.  While Green was inspired by French mathematics, he made an important shift in thinking in which the fields became the quantities of interest rather than molecular forces.  Differential equations describing macroscale phenomena could be “true” independently of any microscale mechanics.  His theories on elasticity and light propagation relied on no underlying structure of matter or ether.  Underlying models could change, but the differential equations remained true.  Maxwell’s equations, a pinnacle of 19th-century British mathematical physics, were field equations that required no microscopic models, although Maxwell and others later tried to devise models of the ether.

            George Stokes admired Green and adopted his mathematics and outlook on natural philosophy.  When he turned his attention to hydrodynamic flow, he adopted a continuum approach that initially did not rely on molecular interactions to explain viscosity and drag.  He replicated Navier’s results, but this time without relying on any underlying microscale physics.  Yet this only took him so far.  To explain some of the essential features of fluid pressures he had to revert to microscopic arguments of isotropy to explain why displacements were linear and why flow at a boundary ceased.  However, once these functional dependences were set, the remainder of the problem was pure continuum mechanics, establishing the Navier-Stokes equation for incompressible flow.  Stokes went on to apply no-slip boundary conditions for fluids flowing through pipes of different geometric cross sections to calculate flow rates as well as pressure drops along the pipe caused by viscous drag.

            Stokes then turned to experimental results to explain why a pendulum slowly oscillating in air lost amplitude due to dissipation.  He reasoned that when the flow of air around the pendulum bob and stiff rod was slow enough the inertial effects would be negligible, simplifying the Navier-Stokes equation.  He calculated the drag force on a spherical object moving slowly through a viscous liquid and obtained the now famous law known as Stokes’ Law of Drag

in which the drag force increases linearly with speed and is proportional to viscosity.  With dramatic flair, he used his new law to explain why water droplets in clouds float buoyantly until they become big enough to fall as rain.
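
The cloud-droplet argument can be sketched numerically.  Stokes’ law in its usual form is F = 6πμRv, and balancing it against weight minus buoyancy gives a terminal speed v = 2R²(ρ_drop − ρ_air)g/(9μ).  The function names and the values for air density and viscosity below are illustrative assumptions, not numbers taken from Stokes’ paper.

```python
import math

def stokes_drag(radius_m, speed_m_s, viscosity_pa_s):
    """Stokes drag force F = 6*pi*mu*R*v on a slowly moving sphere."""
    return 6.0 * math.pi * viscosity_pa_s * radius_m * speed_m_s

def terminal_velocity(radius_m, rho_drop=1000.0, rho_air=1.2,
                      viscosity_pa_s=1.8e-5, g=9.81):
    """Speed at which Stokes drag balances weight minus buoyancy.

    Default densities (kg/m^3) and viscosity (Pa s) are rough values
    for water droplets in air, assumed here for illustration.
    """
    return 2.0 * radius_m**2 * (rho_drop - rho_air) * g / (9.0 * viscosity_pa_s)

for radius_um in (1, 10, 50):
    r = radius_um * 1e-6
    v = terminal_velocity(r)
    print(f"radius {radius_um:3d} um -> {v * 100:.3g} cm/s")
```

With these assumed values a 10-micron droplet settles at roughly a centimeter per second, slow enough to be held aloft by weak updrafts, while the speed grows as the square of the radius.  For droplets much larger than a few tens of microns the slow-flow assumption behind Stokes’ law begins to fail, so the sketch should not be extrapolated to raindrops.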

The Lucasian Chair of Mathematics

There are rare individuals who become especially respected for the breadth and depth of their knowledge.  In our time, already somewhat past, Stephen Hawking embodied the ideal of the eminent (almost clairvoyant) scientist pushing his field to the extremes with the deepest understanding, while also being one of the most famous science popularizers of his day as well as an important chronicler of the history of physics.  In his own time, Stokes was held in virtually the same level of esteem.

            Just as Stephen Hawking and Isaac Newton held the Lucasian Chair of Mathematics at Cambridge, Stokes became the Lucasian Professor in 1849 and held the chair until his death in 1903.  He was offered the chair in part because of the prestige he held as Senior Wrangler and Smith’s Prize winner, but also because of his imposing grasp of the central fields of his time. The Lucasian Chair of Mathematics at Cambridge is one of the most famous academic chairs in the world.  It was established by Charles II in 1664, and the first Lucasian professor was Isaac Barrow followed by Isaac Newton who held the post for 33 years.  Other famous Lucasian professors were Airy, Babbage, Larmor, Dirac as well as Hawking.  During his tenure, Stokes made central contributions to hydrodynamics (as we have seen), but also the elasticity of solids, the behavior of waves in elastic solids, the diffraction of light, problems in light, gravity, sound, heat, meteorology, solar physics, and chemistry.  Perhaps his most famous contribution was his explanation of fluorescence, for which he won the Rumford Medal.  Certainly, if the Nobel Prize had existed in his time, he would have been a Nobel Laureate.

Derivation of Stokes’ Law

The flow field of an incompressible fluid around a smooth spherical object has zero divergence and satisfies Laplace’s equation.  This allows the stream velocities to take the form in spherical coordinates

where the velocity components are defined in terms of the stream function ψ.   The partial derivatives of pressure satisfy the equations

where the second-order operator is

The vanishing of the Laplacian of the stream function

allows the function to take the form

The no-slip boundary condition on the surface of the sphere, as well as the asymptotic velocity field far from the sphere taking the form v•cosθ  gives the solution

Using this expression in the first equations yields the velocities, pressure and shear

The force on the sphere is obtained by integrating the pressure and the shear stress over the surface of the sphere.  The two contributions are

Adding these together gives the final expression for Stokes’ Law

where two thirds of the force is caused by the shear stress and one third by the pressure.
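
The split between shear and pressure can be checked numerically.  The sketch below assumes the standard textbook surface values for slow flow of speed U past a fixed sphere of radius a, namely p − p∞ = −(3/2)μU cosθ/a for the pressure and σ_rθ = −(3/2)μU sinθ/a for the shear stress.  These may differ in notation and sign convention from the expressions used in this derivation, and the symbols U and a, the function name and the numerical values are all assumptions of the sketch.

```python
import math

def stokes_force_components(mu, a, U, n=20000):
    """Integrate surface pressure and shear stress over the sphere.

    Assumes the textbook creeping-flow surface values
        p - p_inf     = -1.5 * mu * U * cos(theta) / a
        sigma_r_theta = -1.5 * mu * U * sin(theta) / a
    and returns (pressure force, shear force) along the flow direction.
    """
    d_theta = math.pi / n
    f_pressure = 0.0
    f_shear = 0.0
    for i in range(n):
        theta = (i + 0.5) * d_theta  # midpoint rule in polar angle
        ring_area = 2.0 * math.pi * a**2 * math.sin(theta) * d_theta
        p = -1.5 * mu * U * math.cos(theta) / a
        tau = -1.5 * mu * U * math.sin(theta) / a
        f_pressure += -p * math.cos(theta) * ring_area  # normal stress projected on flow axis
        f_shear += -tau * math.sin(theta) * ring_area   # tangential stress projected on flow axis
    return f_pressure, f_shear

mu, a, U = 1.0e-3, 1.0e-3, 1.0e-2  # illustrative values: water, 1 mm sphere, 1 cm/s
fp, fs = stokes_force_components(mu, a, U)
total = fp + fs
print("pressure fraction:", fp / total)
print("shear fraction:   ", fs / total)
print("total / (6 pi mu a U):", total / (6.0 * math.pi * mu * a * U))
```

The two integrals come out as 2πμaU and 4πμaU, which reproduces the one-third and two-thirds shares and the total of 6πμaU.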

Stokes flow around a sphere. On the left is the pressure. On the right is the y-component of the flow velocity.

Stokes Timeline

  • 1819 – Born in the parish of Skreen, County Sligo
  • 1837 – Entered Pembroke College Cambridge
  • 1841 – Graduation, Senior Wrangler, Smith’s Prize, Fellow of Pembroke
  • 1845 – Viscosity
  • 1845 – Viscoelastic solid and the luminiferous ether
  • 1845 – Ether drag
  • 1846 – Review of hydrodynamics (including French references)
  • 1847 – Water waves
  • 1847 – Expansion in periodic series (Fourier)
  • 1848 – Jelly theory of the ether
  • 1849 – Lucasian Professorship Cambridge
  • 1849 – Geodesy and Clairaut’s theorem
  • 1849 – Dynamical theory of diffraction
  • 1850 – Damped pendulum, explanation of clouds (water droplets)
  • 1850 – Haidinger’s brushes
  • 1850 – Letter from Kelvin (Thomson) to Stokes on a theorem in vector calculus
  • 1852 – Stokes’ 4 polarization parameters
  • 1852 – Fluorescence and Rumford medal
  • 1854 – Stokes sets “Stokes theorem” for the Smith’s Prize Exam
  • 1857 – Marries
  • 1857 – Effect of wind on sound intensity
  • 1861 – Hankel publishes “Stokes theorem”
  • 1880 – Form of highest waves
  • 1885 – President of Royal Society
  • 1887 – Member of Parliament
  • 1889 – Created a baronet by Queen Victoria
  • 1893 – Copley Medal
  • 1903 – Dies
  • 1945 – Cartan establishes modern form of Stokes’ theorem using differential forms

Further Reading

Darrigol, O., Worlds of Flow: A History of Hydrodynamics from the Bernoullis to Prandtl (Oxford University Press, Oxford, 2005). This is an excellent technical history of hydrodynamics.

Science 1916: A Hundred-year Time Capsule

In one of my previous blog posts, as I was searching for Schwarzschild’s original papers on Einstein’s field equations and quantum theory, I obtained a copy of the January 1916 – June 1916 volume of the Proceedings of the Royal Prussian Academy of Sciences through interlibrary loan.  The extremely thick volume arrived at Purdue about a week after I ordered it online.  It arrived from Oberlin College in Ohio that had received it as a gift in 1928 from the library of Professor Friedrich Loofs of the University of Halle in Germany.  Loofs had been the Haskell Lecturer at Oberlin for the 1911-1912 semesters. 

As I browsed through the volume looking for Schwarzschild’s papers, I was amused to find a cornucopia of turn-of-the-century science topics recorded in its pages.  There were papers on the overbite and lips of marsupials.  There were papers on forgotten languages.  There were papers on ancient Greek texts.  On the origins of religion.  On the philosophy of abstraction.  Histories of Indian dramas.  Reflections on cancer.  But what I found most amazing was a snapshot of the field of physics and mathematics in 1916, with historic papers by historic scientists who changed how we view the world. Here is a snapshot in time and in space, a period of only six months from a single journal, with a list of authors that reads like a who’s who of physics.

In 1916 there were three major centers of science in the world with leading science publications: London with the Philosophical Magazine and Proceedings of the Royal Society; Paris with the Comptes Rendus of the Académie des Sciences; and Berlin with the Proceedings of the Royal Prussian Academy of Sciences and Annalen der Physik. In Russia, there were the scientific journals of St. Petersburg, but the Bolshevik Revolution was brewing that would overwhelm that country for decades.  And in 1916 the academic life of the United States was barely worth noticing except for a few points of light at Yale and Johns Hopkins.

Berlin in 1916 was embroiled in war, but science proceeded relatively unmolested.  The six-month volume of the Proceedings of the Royal Prussian Academy of Sciences contains a number of gems.  Schwarzschild was one of the most prolific contributors, publishing three papers in just this half-year volume, plus his obituary written by Einstein.  But joining Schwarzschild in this volume were Einstein, Planck, Born, Warburg, Frobenius, and Rubens among others, a pantheon of German scientists mostly cut off from the rest of the world at that time, but single-mindedly following their individual threads woven deep into the fabric of the physical world.

Karl Schwarzschild (1873 – 1916)

Schwarzschild had the unenviable yet effective motivation of his impending death to spur him to complete several projects that he must have known would make his name immortal.  In this six-month volume he published his three most important papers.  The first (pg. 189) was on the exact solution to Einstein’s field equations of general relativity.  The solution was for the restricted case of a point mass, yet the derivation yielded the Schwarzschild radius that later became known as the event horizon of a non-rotating black hole.  The second paper (pg. 424) expanded the general relativity solutions to a spherically symmetric incompressible liquid mass.
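
To give a sense of scale, the Schwarzschild radius is given by the well-known expression r_s = 2GM/c².  The short sketch below evaluates it for the Sun and the Earth; the function name is hypothetical and the constants are standard reference values rather than anything quoted in the 1916 paper.

```python
G = 6.67430e-11        # gravitational constant, m^3 kg^-1 s^-2
C = 299_792_458.0      # speed of light, m/s
M_SUN = 1.98847e30     # solar mass, kg
M_EARTH = 5.9722e24    # Earth mass, kg

def schwarzschild_radius(mass_kg):
    """Radius r_s = 2 G M / c^2 of the horizon of a non-rotating mass."""
    return 2.0 * G * mass_kg / C**2

print(f"Sun:   {schwarzschild_radius(M_SUN) / 1e3:.2f} km")
print(f"Earth: {schwarzschild_radius(M_EARTH) * 1e3:.2f} mm")
```

The Sun would have to be squeezed to a radius of about three kilometers, and the Earth to just under a centimeter, before an event horizon formed.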

Schwarzschild’s solution to Einstein’s field equations for a point mass.

          

Schwarzschild’s extension of the field equation solutions to a finite incompressible fluid.

The subject, content and success of these two papers were wholly unexpected from this observational astronomer stationed on the Russian Front during WWI calculating trajectories for German bombardments.  He would not have been considered a theoretical physicist but for the importance of his results and the sophistication of his methods.  Within only a year after Einstein published his general theory, based as it was on the complicated tensor calculus of Levi-Civita, Christoffel and Ricci-Curbastro that had taken him years to master, Schwarzschild found a solution that evaded even Einstein.

Schwarzschild’s third and final paper (pg. 548) was on an entirely different topic, still not in his official field of astronomy, that positioned all future theoretical work in quantum physics to be phrased in the language of Hamiltonian dynamics and phase space.  He proved that action-angle coordinates were the only acceptable canonical coordinates to be used when quantizing dynamical systems.  This paper answered a central question that had been nagging Bohr and Einstein and Ehrenfest for years—how to quantize dynamical coordinates.  Despite the simple way that Bohr’s quantized hydrogen atom is taught in modern physics, there was an ambiguity in the quantization conditions even for this simple single-electron atom.  The ambiguity arose from the numerous possible canonical coordinate transformations that were admissible, yet which led to different forms of quantized motion. 

Schwarzschild’s proposal of action-angle variables for quantization of dynamical systems.

Schwarzschild’s doctoral thesis had been a theoretical topic in astrophysics that applied the celestial mechanics theories of Henri Poincaré to binary star systems.  Within Poincaré’s theory were integral invariants that were conserved quantities of the motion.  When a dynamical system had as many constraints as degrees of freedom, then every coordinate had an integral invariant.  In this unexpected last paper from Schwarzschild, he showed how canonical transformation to action-angle coordinates produced a unique representation in terms of action variables (whose dimensions are the same as Planck’s constant).  These action coordinates, with their associated cyclical angle variables, are the only unambiguous representations that can be quantized.  The important points of this paper were amplified a few months later in a publication by the physicist Paul Epstein (1883 – 1966), solidifying this approach to quantum mechanics.  Paul Ehrenfest (1880 – 1933) continued this work later in 1916 by defining adiabatic invariants whose quantum numbers remain unchanged under slowly varying conditions, and the program started by Schwarzschild was definitively completed by Paul Dirac (1902 – 1984) at the dawn of quantum mechanics in Göttingen in 1925.
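
The role of the action variable can be illustrated with the simplest periodic system.  In the old quantum theory the action J = ∮ p dq is set equal to an integer multiple of Planck’s constant, and for a one-dimensional harmonic oscillator this gives J = E/ν, hence E = nhν.  The sketch below evaluates the closed integral numerically along one period.  The harmonic oscillator, the function name and the numerical values are assumptions chosen for illustration, not an example taken from Schwarzschild’s paper.

```python
import math

H = 6.62607015e-34  # Planck constant, J s

def action_integral(mass, omega, energy, n=100000):
    """Closed integral of p dq over one period of a 1D harmonic oscillator.

    Uses dq = (p / m) dt along the orbit q(t) = A cos(omega t),
    so that J = integral of p^2 / m over one period.
    """
    amplitude = math.sqrt(2.0 * energy / (mass * omega**2))
    period = 2.0 * math.pi / omega
    dt = period / n
    total = 0.0
    for i in range(n):
        t = (i + 0.5) * dt  # midpoint rule in time
        p = -mass * omega * amplitude * math.sin(omega * t)
        total += p * p / mass * dt
    return total

mass, omega = 1.0e-26, 1.0e14  # illustrative values only
nu = omega / (2.0 * math.pi)
for n_quantum in (1, 2, 3):
    energy = n_quantum * H * nu  # energy for which J should equal n * h
    print(n_quantum, action_integral(mass, omega, energy) / H)
```

The printed ratio J/h comes out equal to the integer n, which shows that the action has the dimensions of Planck’s constant and is the quantity that takes integer steps.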

Albert Einstein (1879 – 1955)

In 1916 Einstein was mopping up after publishing his definitive field equations of general relativity the year before.  His interests were still cast wide, not restricted only to this latest project.  In the 1916 Jan. to June volume of the Prussian Academy Einstein published two papers.  Each is remarkably short relative to the other papers in the volume, yet the importance of the papers may stand in inverse proportion to their length.

The first paper (pg. 184) is placed right before Schwarzschild’s first paper on February 3.  The subject of the paper is the expression of Maxwell’s equations in four-dimensional spacetime.  It is notable and ironic that Einstein mentions Hermann Minkowski (1864 – 1909) in the first sentence of the paper.  When Minkowski proposed his bold structure of spacetime in 1908, Einstein had been one of his harshest critics, writing letters to the editor about the absurdity of thinking of space and time as a single interchangeable coordinate system.  This is ironic, because Einstein today is perhaps best known for the special relativity properties of spacetime, yet he was slow to adopt the spacetime viewpoint. Einstein only came around to spacetime when he realized around 1910 that a general approach to relativity required the mathematical structure of tensor manifolds, and Minkowski had provided just such a manifold, the pseudo-Riemannian manifold of spacetime.  Einstein subsequently adopted spacetime with a passion and became its greatest champion, calling out Minkowski where possible to give him his due, although Minkowski had already died tragically of a burst appendix in 1909.

Relativistic energy density of electromagnetic fields.

The importance of Einstein’s paper hinges on his derivation of the electromagnetic field energy density using electromagnetic four vectors.  The energy density is part of the source term for his general relativity field equations.  Any form of energy density can warp spacetime, including electromagnetic field energy.  Furthermore, the Einstein field equations of general relativity are nonlinear as gravitational fields modify space and space modifies electromagnetic fields, producing a coupling between gravity and electromagnetism.  This coupling is implicit in the case of the bending of light by gravity, but Einstein’s paper from 1916 makes the connection explicit. 

Einstein’s second paper (pg. 688) is even shorter and hence one of the most daring publications of his career.  Because the field equations of general relativity are nonlinear, they are not easy to solve exactly, and Einstein was exploring approximate solutions under conditions of slow speeds and weak fields.  In this “non-relativistic” limit the metric tensor separates into a Minkowski metric as a background on which a small metric perturbation remains.  This small perturbation has the properties of a wave equation for a disturbance of the gravitational field that propagates at the speed of light.  Hence, in the June 22 issue of the Prussian Academy in 1916, Einstein predicts the existence and the properties of gravitational waves.  Exactly one hundred years later in 2016, the LIGO collaboration announced the detection of gravitational waves generated by the merger of two black holes.

Einstein’s weak-field low-velocity approximation solutions of his field equations.
Einstein’s prediction of gravitational waves.
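
In modern notation the weak-field argument described above is usually summarized as follows.  This is the standard textbook linearization, written with one common choice of sign convention and gauge; it is offered as a sketch and is not a transcription of the equations in Einstein’s 1916 paper.  The symbols η for the Minkowski metric, h for the small perturbation and T for the stress-energy source are the conventional ones.

```latex
\begin{align}
g_{\mu\nu} &= \eta_{\mu\nu} + h_{\mu\nu}, \qquad |h_{\mu\nu}| \ll 1 \\
\bar{h}_{\mu\nu} &= h_{\mu\nu} - \tfrac{1}{2}\,\eta_{\mu\nu}\,h, \qquad \partial^{\mu}\bar{h}_{\mu\nu} = 0 \\
\Box\,\bar{h}_{\mu\nu} &= -\frac{16\pi G}{c^{4}}\,T_{\mu\nu}
\end{align}
```

In vacuum the right-hand side vanishes and the last line reduces to an ordinary wave equation, whose solutions propagate at the speed of light.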

Max Planck (1858 – 1947)

Max Planck was active as the secretary of the Prussian Academy in 1916 yet was still fully active in his research.  Although he had launched the quantum revolution with his quantum hypothesis of 1900, he was not a major proponent of quantum theory even as late as 1916.  His primary interests lay in thermodynamics and the origins of entropy, following the theoretical approaches of Ludwig Boltzmann (1844 – 1906).  In 1916 he was interested in how to best partition phase space as a way to count states and calculate entropy from first principles.  His paper in the 1916 volume (pg. 653) calculated the absolute entropy of monatomic bodies.

Counting microstates by Planck.

Max Born (1882 – 1970)

Max Born was to be one of the leading champions of the quantum mechanical revolution based at the University of Göttingen in the 1920’s. But in 1916 he was on leave from the University of Berlin working on ranging for artillery.  Yet he still pursued his academic interests, like Schwarzschild.  On pg. 614 in the Proceedings of the Prussian Academy, Born published a paper on anisotropic liquids, such as liquid crystals, and the effect of electric fields on them.  It is astonishing to think that so many of the flat-panel displays we have today, whether on our watches or smart phones, are technological descendants of work by Born at the beginning of his career.

Born on liquid crystals.

Ferdinand Frobenius (1849 – 1917)

Like Schwarzschild, Frobenius was at the end of his career in 1916 and would pass away one year later, but unlike Schwarzschild, his career had been a long one, receiving his doctorate under Weierstrass and exploring elliptic functions, differential equations, number theory and group theory.  One of the papers that established him in group theory appears in the May 4th issue on page 542 where he explores the series expansion of a group.

Frobenius on groups.

Heinrich Rubens (1865 – 1922)

Max Planck owed his quantum breakthrough in part to the exquisitely accurate experimental measurements made by Heinrich Rubens on black body radiation.  It was only by the precise shape of what came to be called the Planck spectrum that Planck could say with such confidence that his theory of quantized radiation interactions fit Rubens’ spectrum so perfectly.  In 1916 Rubens was at the University of Berlin, having taken the position vacated by Paul Drude in 1906.  He was a specialist in infrared spectroscopy, and on page 167 of the Proceedings he describes the spectrum of steam and its consequences for the quantum theory.

Rubens and the infrared spectrum of steam.

Emil Warburg (1846 – 1931)

Emil Warburg’s fame is primarily as the father of Otto Warburg who won the 1931 Nobel Prize in Physiology or Medicine.  On page 314 Warburg reports on photochemical processes in HBr gas.  In an obscure and very indirect way, I am an academic descendant of Emil Warburg.  One of his students was Robert Pohl who was a famous early researcher in solid state physics, sometimes called the “father of solid state physics”.  Pohl was at the physics department in Göttingen in the 1920’s along with Born and Franck during the golden age of quantum mechanics.  Robert Pohl’s son, Robert Otto Pohl, was my professor when I was a sophomore at Cornell University in 1978 for the course on introductory electromagnetism using a textbook by the Nobel laureate Edward Purcell, a quirky volume of the Berkeley Series of physics textbooks.  This makes Emil Warburg my professor’s father’s professor.

Warburg on photochemistry.

Papers in the 1916 Vol. 1 of the Prussian Academy of Sciences

Schulze,  Alt- und Neuindisches

Orth,  Zur Frage nach den Beziehungen des Alkoholismus zur Tuberkulose

Schulze,  Die Erhebungen auf der Lippen- und Wangenschleimhaut der Säugetiere

von Wilamowitz-Moellendorff,  Die Samia des Menandros

Engler,  Bericht über das >>Pflanzenreich<<

von Harnack,  Bericht über die Ausgabe der griechischen Kirchenväter der drei ersten Jahrhunderte

Meinecke,  Germanischer und romanischer Geist im Wandel der deutschen Geschichtsauffassung

Rubens und Hettner,  Das langwellige Wasserdampfspektrum und seine Deutung durch die Quantentheorie

Einstein,  Eine neue formale Deutung der Maxwellschen Feldgleichungen der Elektrodynamik

Schwarzschild,  Über das Gravitationsfeld eines Massenpunktes nach der Einsteinschen Theorie

Helmreich,  Handschriftliche Verbesserungen zu dem Hippokratesglossar des Galen

Prager,  Über die Periode des veränderlichen Sterns RR Lyrae

Holl,  Die Zeitfolge des ersten origenistischen Streits

Lüders,  Zu den Upanisads. I. Die Samvargavidya

Warburg,  Über den Energieumsatz bei photochemischen Vorgängen in Gasen. VI.

Hellmann,  Über die ägyptischen Witterungsangaben im Kalender von Claudius Ptolemaeus

Meyer-Lübke,  Die Diphthonge im Provenzalischen

Diels,  Über die Schrift Antipocras des Nikolaus von Polen

Müller und Sieg,  Maitrisimit und >>Tocharisch<<

Meyer,  Ein altirischer Heilsegen

Schwarzschild,  Über das Gravitationsfeld einer Kugel aus inkompressibler Flüssigkeit nach der Einsteinschen Theorie

Brauer,  Die Verbreitung der Hyracoiden

Correns,  Untersuchungen über Geschlechtsbestimmung bei Distelarten

Brahn,  Weitere Untersuchungen über Fermente in der Leber von Krebskranken

Erdmann,  Methodologische Konsequenzen aus der Theorie der Abstraktion

Bang,  Studien zur vergleichenden Grammatik der Türksprachen. I.

Frobenius,  Über die  Kompositionsreihe einer Gruppe

Schwarzschild,  Zur Quantenhypothese

Fischer und Bergmann,  Über neue Galloylderivate des Traubenzuckers und ihren Vergleich mit der Chebulinsäure

Schuchhardt,  Der starke Wall und die breite, zuweilen erhöhte Berme bei frühgeschichtlichen Burgen in Norddeutschland

Born,  Über anisotrope Flüssigkeiten

Planck,  Über die absolute Entropie einatomiger Körper

Haberlandt,  Blattepidermis und Lichtperzeption

Einstein,  Näherungsweise Integration der Feldgleichungen der Gravitation

Lüders,  Die Saubhikas.  Ein Beitrag zur Geschichte des indischen Dramas

Karl Schwarzschild’s Radius: How Fame Eclipsed a Physicist’s own Legacy

In an ironic twist of the history of physics, Karl Schwarzschild’s fame has eclipsed his own legacy.  When asked who was Karl Schwarzschild (1873 – 1916), you would probably say he’s the guy who solved Einstein’s Field Equations of General Relativity and discovered the radius of black holes.  You may also know that he accomplished this Herculean feat while dying slowly behind the German lines on the Eastern Front in WWI.  But asked what else he did, and you would probably come up blank.  Yet Schwarzschild was one of the most wide-ranging physicists at the turn of the 20th century, which is saying something, because it places him into the same pantheon as Planck, Lorentz, Poincaré and Einstein.  Let’s take a look at the part of his career that hides in the shadow of his own radius.

A Radius of Interest

Karl Schwarzschild was born in Frankfurt, Germany, shortly after the Franco-Prussian war thrust Prussia onto the world stage as a major political force in Europe.  His family were Jewish merchants of longstanding reputation in the city, and Schwarzschild’s childhood was spent in the vibrant Jewish community.  One of his father’s friends was a professor at a university in Frankfurt, whose son, Paul Epstein (1871 – 1939), became a close friend of Karl’s at the Gymnasium.  Schwarzschild and Epstein would partially shadow each other’s careers despite the fact that Schwarzschild became an astronomer while Epstein became a famous mathematician and number theorist.  This was in part because Schwarzschild had a large radius of interests that spanned the breadth of current mathematics and science, practicing both experiments and theory.

Schwarzschild’s application of the Hamiltonian formalism for quantum systems set the stage for the later adoption of Hamiltonian methods in quantum mechanics. He came dangerously close to stating the uncertainty principle that catapulted Heisenberg to fame.

By the time Schwarzschild was sixteen, he had taught himself the mathematics of celestial mechanics to such depth that he published two papers on the orbits of binary stars.  He also became fascinated by astronomy and purchased lenses and other materials to construct his own telescope.  His interests were helped along by Epstein, who was two years older and whose father had his own private observatory.  When Epstein went to study at the University of Strasbourg (then part of the German Empire) Schwarzschild followed him.  But Schwarzschild’s main interest in astronomy diverged from Epstein’s main interest in mathematics, and Schwarzschild transferred to the University of Munich where he studied under Hugo von Seeliger (1849 – 1924), the premier German astronomer of his day.  Epstein remained at Strasbourg where he studied under Bruno Christoffel (1829 – 1900) and eventually became a professor, but he was forced to relinquish the post when Strasbourg was ceded to France after WWI.

The Birth of Stellar Interferometry

Until the Hubble space telescope was launched in 1990 no star had ever been resolved as a direct image.  In 1995, using its spectacular resolving power, the Hubble optics resolved, just barely, the red supergiant Betelgeuse.  No other star (other than the Sun) is close enough or big enough to image the stellar disk, even for the Hubble far above our atmosphere.  The reason is that the diameter of the optical lenses and mirrors of the Hubble, as big as they are at 2.4 meters, still produces a diffraction pattern that smears the image so that stars cannot be resolved.  Yet information on the size of a distant object is encoded as phase in the light waves that are emitted from the object, and this phase information is accessible to interferometry.
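
The diffraction limit can be estimated with the Rayleigh criterion, θ ≈ 1.22 λ/D, for a circular aperture of diameter D.  The sketch below evaluates it for a 2.4 meter aperture at two illustrative wavelengths; the function name and the choice of wavelengths are assumptions made for this example.

```python
import math

ARCSEC_PER_RAD = 180.0 * 3600.0 / math.pi

def rayleigh_limit_arcsec(wavelength_m, aperture_m):
    """Rayleigh diffraction limit 1.22 * lambda / D, in arcseconds."""
    return 1.22 * wavelength_m / aperture_m * ARCSEC_PER_RAD

for label, wavelength in (("visible, 550 nm", 550e-9), ("ultraviolet, 250 nm", 250e-9)):
    print(f"{label}: {rayleigh_limit_arcsec(wavelength, 2.4):.3f} arcsec")
```

In visible light the limit is a little under 0.06 arcseconds, which is comparable to the apparent diameter of Betelgeuse (roughly 0.05 arcseconds) and far larger than that of almost every other star, which is why the disk is only barely resolved.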

The first physicist who truly grasped the power of optical interferometry and who understood how to design the first interferometric metrology systems was the French physicist Armand Hippolyte Louis Fizeau (1819 – 1896).  Fizeau became interested in the properties of light when he collaborated with his friend Léon Foucault (1819–1868) on early uses of photography.  The two then embarked on a measurement of the speed of light but had a falling out before the experiment could be finished, and both continued the pursuit independently.  Fizeau achieved the first measurement using a toothed wheel rotating rapidly [1], while Foucault came in second using a more versatile system with a spinning mirror [2].  Yet Fizeau surpassed Foucault in optical design and became an expert in interference effects.  Interference apparatus had been developed earlier by Augustin Fresnel (the Fresnel bi-prism 1819), Humphrey Lloyd (Lloyd’s mirror 1834) and Jules Jamin (Jamin’s interferential refractor 1856).  They had found ways of redirecting light using refraction and reflection to cause interference fringes.  But Fizeau was one of the first to recognize that each emitting region of a light source was coherent with itself, and he used this insight and the use of lenses to design the first interferometer.

Fizeau’s interferometer used a lens with a tight focal spot masked off by an opaque screen with two open slits.  When the masked lens device was focused on an intense light source it produced two parallel pencils of light that were mutually coherent but spatially separated.  Fizeau used this apparatus to measure the speed of light in moving water in 1859 [3].

Fig. 1  Optical configuration of the source element of the Fizeau refractometer.

The working principle of the Fizeau refractometer is shown in Fig. 1.  The light source is at the bottom, and it is reflected by the partially-silvered beam splitter to pass through the lens and the mask containing two slits.  (Only the light paths that pass through the double-slit mask on the lens are shown in the figure.)  The slits produce two pencils of mutually coherent light that pass through a system (in the famous Fizeau ether drag experiment it was along two tubes of moving water) and are returned through the same slits, and they intersect at the view port where they produce interference fringes.  The fringe spacing is set by the separation of the two slits in the mask.  The Rayleigh region of the lens defines a region of spatial coherence even for a so-called “incoherent” source.  Therefore, this apparatus, by use of the lens, could convert an incoherent light source into a coherent probe to test the refractive index of test materials, which is why it was called a refractometer. 

Fizeau became adept at thinking of alternative optical designs of his refractometer and alternative applications.  In an address to the French Physical Society in 1868 he suggested that the double-slit mask could be used on a telescope to determine sizes of distant astronomical objects [4].  There were several subsequent attempts to use Fizeau’s configuration in astronomical observations, but none were conclusive and hence were not widely known.

An optical configuration and astronomical application that was very similar to Fizeau’s idea was proposed by Albert Michelson in 1890 [5].  He built the apparatus and used it to successfully measure the size of several moons of Jupiter [6].  The configuration of the Michelson stellar interferometer is shown in Fig. 2.  Light from a distant star passes through two slits in the mask in front of the collecting optics of a telescope.  When the two pencils of light intersect at the view port, they produce interference fringes.  Because of the finite size of the stellar source, the fringes are partially washed out.  By adjusting the slit separation, a certain separation can be found where the fringes completely wash out.  The size of the star is then related to the separation of the slits for which the fringe visibility vanishes.  This simple principle allows this type of stellar interferometry to measure the size of stars that are large and relatively close to Earth.  However, if stars are too far away even this approach cannot be used to measure their sizes because telescopes aren’t big enough.  This limitation is currently being bypassed by the use of long-baseline optical interferometers.

Fig. 2  Optical configuration of the Michelson stellar interferometer.  Fringes at the view port are partially washed out by the finite size of the star.  By adjusting the slit separation, the fringes can be made to vanish entirely, yielding an equation that can be solved for the size of the star.
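
The relation between slit separation and source size can be sketched numerically.  For a uniformly bright circular disk of angular diameter θ, the textbook fringe visibility for two slits separated by d is |2 J₁(x)/x| with x = π d θ/λ, and the fringes first vanish at d ≈ 1.22 λ/θ.  The uniform-disk model, the function names and the 1-arcsecond example below are assumptions for illustration; the Bessel function is computed from its integral representation so that only the standard library is needed.

```python
import math

def bessel_j1(x, n=2000):
    """J1(x) from Bessel's integral (1/pi) * int_0^pi cos(t - x sin t) dt."""
    h = math.pi / n
    return sum(math.cos((k + 0.5) * h - x * math.sin((k + 0.5) * h))
               for k in range(n)) * h / math.pi

def disk_visibility(slit_separation_m, angular_diameter_rad, wavelength_m):
    """Fringe visibility |2 J1(x) / x| for a uniformly bright disk."""
    x = math.pi * slit_separation_m * angular_diameter_rad / wavelength_m
    return 1.0 if x == 0 else abs(2.0 * bessel_j1(x) / x)

def first_null_separation(angular_diameter_rad, wavelength_m):
    """Slit separation at which the fringes first vanish (bisection on J1)."""
    lo, hi = 3.0, 4.5  # J1 changes sign exactly once inside this bracket
    for _ in range(60):
        mid = 0.5 * (lo + hi)
        if bessel_j1(lo) * bessel_j1(mid) <= 0:
            hi = mid
        else:
            lo = mid
    x0 = 0.5 * (lo + hi)
    return x0 * wavelength_m / (math.pi * angular_diameter_rad)

arcsec = math.pi / (180.0 * 3600.0)
theta, wavelength = 1.0 * arcsec, 550e-9  # illustrative source size and wavelength
d_null = first_null_separation(theta, wavelength)
print(f"null at d = {d_null * 100:.1f} cm, d * theta / lambda = {d_null * theta / wavelength:.3f}")
```

For an object about one arcsecond across, comparable to the moons of Jupiter, the fringes vanish at a slit separation of roughly 14 cm, which fits within the aperture of a modest telescope.  A star a hundred times smaller in angle would need a separation a hundred times larger, which is the limitation noted above.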

One of the open questions in the history of interferometry is whether Michelson was aware of Fizeau’s proposal for the stellar interferometer made in 1868.  Michelson was well aware of Fizeau’s published research and acknowledged him as a direct inspiration of his own work in interference effects.  But Michelson also was unaware of the undercurrents in the French school of optical interference.  When he visited Paris in 1881, he met with many of the leading figures in this school (including Lippmann and Cornu), but there is no mention or any evidence that he met with Fizeau.  By this time Fizeau’s wife had passed away, and Fizeau spent most of his time in seclusion at his home outside Paris.  Therefore, it is unlikely that he would have been present during Michelson’s visit.  Because Michelson viewed Fizeau with such awe and respect, if he had met him, he most certainly would have mentioned it.  Therefore, Michelson’s invention of the stellar interferometer can be considered with some confidence to be a case of independent discovery.  It is perhaps not surprising that he hit on the same idea that Fizeau had in 1868, because Michelson was one of the few physicists who understood coherence and interference at the same depth as Fizeau.

Schwarzschild’s Stellar Interferometer

The physics of the Michelson stellar interferometer is very similar to the physics of Young’s double slit experiment.  The two slits in the aperture mask of the telescope objective act to produce a simple sinusoidal interference pattern at the image plane of the optical system.  The size of the stellar diameter is determined by using the wash-out effect of the fringes caused by the finite stellar size.  However, it is well known to physicists who work with diffraction gratings that a multiple-slit interference pattern has a much greater resolving power than a simple double slit. 

This realization must have hit von Seeliger and Schwarzschild, working together at Munich, when they saw the publication of Michelson’s theoretical analysis of his stellar interferometer in 1890, followed by his use of the apparatus to measure the size of Jupiter’s moons.  Schwarzschild and von Seeliger realized that by replacing the double-slit mask with a multiple-slit mask, the widths of the interference maxima would be much narrower.  Such a diffraction mask on a telescope would cause a star to produce a multiple set of images on the image plane of the telescope associated with the multiple diffraction orders.  More interestingly, if the target were a binary star, the diffraction would produce two sets of diffraction maxima—a double image!  If the “finesse” of the grating is high enough, the binary star separation could be resolved as a doublet in the diffraction pattern at the image, and the separation could be measured, giving the angular separation of the two stars of the binary system.  Such an approach to the binary separation would be a direct measurement, which was a distinct and clever improvement over the indirect Michelson configuration that required finding the extinction of the fringe visibility. 
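The narrowing of the interference maxima with the number of slits can be checked with a short calculation. This is a sketch under idealized assumptions (infinitely narrow, equally spaced slits and monochromatic light); the function names are illustrative, and the slit counts are chosen only for comparison.

```python
import numpy as np

def n_slit_intensity(phi, n_slits):
    """Normalized interference pattern of n_slits equally spaced narrow slits.
    phi is the phase difference between adjacent slits."""
    phi = np.asarray(phi, dtype=float)
    num = np.sin(n_slits * phi / 2)
    den = n_slits * np.sin(phi / 2)
    # At a principal maximum den -> 0 and the ratio tends to +/-1
    small = np.abs(den) < 1e-12
    ratio = np.where(small, 1.0, num / np.where(small, 1.0, den))
    return ratio**2

def principal_max_fwhm(n_slits, samples=200001):
    """Full width at half maximum (in phase units) of the central maximum.
    Within one period only the central peak exceeds half maximum."""
    phi = np.linspace(-np.pi, np.pi, samples)
    above = phi[n_slit_intensity(phi, n_slits) >= 0.5]
    return above.max() - above.min()

for n in (2, 4, 8):
    print(f"{n} slits: width of central maximum = {principal_max_fwhm(n):.3f} rad")
```

The width of each maximum shrinks roughly in proportion to the number of slits, which is what allows two closely spaced stellar images to separate into a visible doublet.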

Schwarzschild enlisted the help of a fine German instrument maker to create a multiple slit system that had an adjustable slit separation.  The device is shown in Fig. 3 from Schwarzschild’s 1896 publication on the use of the stellar interferometer to measure the separation of binary stars [7].  The device is ingenious.  By rotating the chain around the gear on the right-hand side of the apparatus, the two metal plates with four slits could be raised or lowered, causing the projection onto the objective plane to have variable slit spacings.  In the operation of the telescope, the changing height of the slits does not matter, because they are near a conjugate optical plane (the entrance pupil) of the optical system.  Using this adjustable multiple slit system, Schwarzschild (and two colleagues he enlisted) made multiple observations of well-known binary star systems, and they calculated the star separations.  Several of their published results are shown in Fig. 4.

Fig. 3  Illustration from Schwarzschild’s 1896 paper describing an improvement of the Michelson interferometer for measuring the separation of binary star systems (Ref. [7]).
Fig. 4  Data page from Schwarzschild’s 1896 paper measuring the angular separation of two well-known binary star systems: gamma Leonis and xi Ursae Majoris (Ref. [7]).

Schwarzschild’s publication demonstrated one of the very first uses of stellar interferometry—well before Michelson himself used his own configuration to measure the diameter of Betelgeuse in 1920.  Schwarzschild’s major achievement was performed before he had received his doctorate, on a topic orthogonal to his dissertation topic.  Yet this fact is virtually unknown to the broader physics community outside of astronomy.  If he had not become so famous later for his solution of Einstein’s field equations, Schwarzschild nonetheless might have been famous for his early contributions to stellar interferometry.  But even this was not the end of his unique contributions to physics.

Adiabatic Physics

As Schwarzschild worked for his doctorate under von Seeliger, his dissertation topic was on new theories by Henri Poincaré (1854 – 1912) on celestial mechanics.  Poincaré had made a big splash on the international stage with the publication of his prize-winning memoire in 1890 on the three-body problem.  This is the publication where Poincaré first described what would later become known as chaos theory.  The memoire was followed by his volumes on “New Methods in Celestial Mechanics” published between 1892 and 1899.  Poincaré’s work on celestial mechanics was based on his earlier work on the theory of dynamical systems where he discovered important invariant theorems, such as Liouville’s theorem on the conservation of phase space volume.  Schwarzschild applied Poincaré’s theorems to problems in celestial orbits.  He took his doctorate in 1896 and received a post at an astronomical observatory outside Vienna.

While at Vienna, Schwarzschild performed his most important sustained contributions to the science of astronomy.  Astronomical observations had been dominated for centuries by the human eye, but photographic techniques had been making steady inroads since the time of Hermann Carl Vogel (1841 – 1907) in the 1880’s at the Potsdam observatory.  Photographic plates were used primarily to record star positions but were known to be unreliable for recording stellar intensities.  Schwarzschild developed an “out-of-focus” technique that blurred the star’s image, making it larger and making it easier to measure the density of the exposed and developed photographic emulsions.  In this way, Schwarzschild measured the magnitudes of 367 stars.  Two of these stars had variable magnitudes that he was able to record and track.  Schwarzschild correctly explained the intensity variation as caused by steady oscillations in heating and cooling of the stellar atmosphere.  This work established the properties of these Cepheid variables which would become some of the most important “standard candles” for the measurement of cosmological distances.  Based on the importance of this work, Schwarzschild returned to Munich as a teacher in 1899 and subsequently was appointed in 1901 as the director of the observatory at Göttingen established by Gauss eighty years earlier.

Schwarzschild’s years at Göttingen brought him into contact with some of the greatest mathematicians and physicists of that era.  The mathematicians included Felix Klein, David Hilbert and Hermann Minkowski.  The physicists included von Laue, a student of Woldemar Voigt.  This period was one of several “golden ages” of Göttingen.  The first golden age was the time of Gauss and Riemann in the mid-1800’s.  The second golden age, when Schwarzschild was present, began when Felix Klein arrived at Göttingen and attracted the top mathematicians of the time.  The third golden age of Göttingen was the time of Born and Jordan and Heisenberg at the birth of quantum mechanics in the mid 1920’s.

In 1906, the Austrian physicist Paul Ehrenfest, freshly out of his PhD under the supervision of Boltzmann, arrived at Göttingen only weeks before Boltzmann took his own life.  Felix Klein at Göttingen had been relying on Boltzmann to provide a comprehensive review of statistical mechanics for the Mathematical Encyclopedia, so he now entrusted this project to the young Ehrenfest.  It was a monumental task, which was to take him and his physicist wife Tatyana nearly five years to complete.  Part of the delay was the desire by the Ehrenfests to close some open problems that remained in Boltzmann’s work.  One of these was a mechanical theorem of Boltzmann’s that identified properties of statistical mechanical systems that remained unaltered through a very slow change in system parameters.  These properties would later be called adiabatic invariants by Einstein.

Ehrenfest recognized that Wien’s displacement law, which had been a guiding light for Planck and his theory of black body radiation, had originally been derived by Wien using classical principles related to slow changes in the volume of a cavity.  Ehrenfest was struck by the fact that such slow changes would not induce changes in the quantum numbers of the quantized states, and hence that the quantum numbers must be adiabatic invariants of the black body system.  This not only explained why Wien’s displacement law continued to hold under quantum as well as classical considerations, but it also explained why Planck’s quantization of the energy of his simple oscillators was the only possible choice.  For a classical harmonic oscillator, the ratio of the energy of oscillation to the frequency of oscillation is an adiabatic invariant, which is immediately recognized as Planck’s quantum condition $E/\nu = nh$.
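The adiabatic invariance of the energy-to-frequency ratio is easy to check numerically. The sketch below integrates a unit-mass harmonic oscillator whose angular frequency is slowly doubled; the ramp profile, the ramp duration and the function names are all assumptions chosen for illustration. The energy should change along with the frequency while their ratio stays nearly fixed.

```python
import numpy as np
from scipy.integrate import solve_ivp

def omega(t, omega0=1.0, omega1=2.0, ramp_time=2000.0):
    """Angular frequency ramped smoothly and slowly from omega0 to omega1 (assumed profile)."""
    s = np.clip(t / ramp_time, 0.0, 1.0)
    return omega0 + (omega1 - omega0) * (3 * s**2 - 2 * s**3)

def rhs(t, state):
    """Unit-mass oscillator: x' = v, v' = -omega(t)^2 x."""
    x, v = state
    return [v, -omega(t)**2 * x]

def energy(t, x, v):
    return 0.5 * v**2 + 0.5 * omega(t)**2 * x**2

t_end = 2000.0   # hundreds of oscillation periods, so the change is slow
sol = solve_ivp(rhs, (0.0, t_end), [1.0, 0.0],
                method="DOP853", rtol=1e-10, atol=1e-12)

x0, v0 = sol.y[:, 0]
x1, v1 = sol.y[:, -1]
E0, E1 = energy(0.0, x0, v0), energy(t_end, x1, v1)
J0, J1 = E0 / omega(0.0), E1 / omega(t_end)
print(f"energy changes by a factor   {E1 / E0:.4f}")
print(f"E/omega changes by a factor  {J1 / J0:.6f}")
```

Shortening `ramp_time` to a few oscillation periods breaks the invariance, which is the sense in which the change must be "very slow".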

Ehrenfest published his observations in 1913 [8], the same year that Bohr published his theory of the hydrogen atom.  Ehrenfest immediately applied the theory of adiabatic invariants to Bohr’s model and discovered that the quantum condition for the quantized energy levels was again an adiabatic invariant of the electron orbits, and not merely a consequence of integer multiples of angular momentum, which had seemed somewhat ad hoc.

After eight exciting years at Göttingen, Schwarzschild was offered the position at the Potsdam Observatory in 1909 upon the retirement from that post of the famous German astronomer Carl Vogel who had made the first confirmed measurements of the optical Doppler effect.  Schwarzschild accepted and moved to Potsdam with a new family.  His son Martin Schwarzschild would follow him into his profession, becoming a famous astronomer at Princeton University and a theorist on stellar structure.  At the outbreak of WWI, Schwarzschild joined the German army out of a sense of patriotism.  Because of his advanced education he was made an officer of artillery with the job to calculate artillery trajectories, and after a short time on the Western Front in Belgium was transferred to the Eastern Front in Russia.  Though he was not in the trenches, he was in the midst of the chaos to the rear of the front.  Despite this situation, he found time to pursue his science through the year 1915. 

Schwarzschild was intrigued by Ehrenfest’s paper on adiabatic invariants and their similarity to several of the invariant theorems of Poincaré that he had studied for his doctorate.  Up until this time, mechanics had been mostly pursued through the Lagrangian formalism which could easily handle generalized forces associated with dissipation.  But celestial mechanics are conservative systems for which the Hamiltonian formalism is a more natural approach.  In particular, the Hamilton-Jacobi canonical transformations made it particularly easy to find pairs of generalized coordinates that had simple periodic behavior.  In his published paper [9], Schwarzschild called these “Action-Angle” coordinates because one was the action integral that was well-known in the principle of “Least Action”, and the other was like an angle variable that changed steadily in time (see Fig. 5). Action-angle coordinates have come to form the foundation of many of the properties of Hamiltonian chaos, Hamiltonian maps, and Hamiltonian tapestries.

Fig. 5  Description of the canonical transformation to action-angle coordinates (Ref. [9] pg. 549). Schwarzschild names the new coordinates “Wirkungsvariable” and “Winkelvariable”.
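As a concrete illustration (a standard textbook case, not taken from Schwarzschild’s paper), the one-dimensional harmonic oscillator of mass m and angular frequency omega shows what the transformation accomplishes.  The symbols J for the action and theta for the angle are assumed notation here, and the normalization may differ by a factor of 2 pi from Schwarzschild’s own convention.

```latex
\begin{aligned}
H &= \frac{p^2}{2m} + \tfrac{1}{2} m \omega^2 x^2 = E \\
J &= \frac{1}{2\pi}\oint p\,dx = \frac{E}{\omega} \\
x &= \sqrt{\frac{2J}{m\omega}}\,\sin\theta, \qquad p = \sqrt{2 J m \omega}\,\cos\theta \\
H &= \omega J, \qquad \dot{\theta} = \frac{\partial H}{\partial J} = \omega, \qquad \dot{J} = -\frac{\partial H}{\partial \theta} = 0
\end{aligned}
```

The action stays constant while the angle advances uniformly in time, and a quantum condition of the form $\oint p\,dx = nh$ then reads $E = nh\nu$ with $\nu = \omega/2\pi$.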

During lulls in bombardments, Schwarzschild translated the Hamilton-Jacobi methods of celestial mechanics to apply them to the new quantum mechanics of the Bohr orbits.  The phrase “quantum mechanics” had not yet been coined (that would come ten years later in a paper by Max Born), but it was clear that the Bohr quantization conditions were a new type of mechanics.  The periodicities that were inherent in the quantum systems were natural properties that could be mapped onto the periodicities of the angle variables, while Ehrenfest’s adiabatic invariants could be mapped onto the slowly varying action integrals.  Schwarzschild showed that action-angle coordinates were the only allowed choice of coordinates, because they enabled the separation of the Hamilton-Jacobi equations and hence provided the correct quantization conditions for the Bohr electron orbits.  Later, when Sommerfeld published his quantized elliptical orbits in 1916, the multiplicity of quantum conditions and orbits caused concern, but Ehrenfest came to the rescue, showing that each of Sommerfeld’s quantum conditions was precisely one of Schwarzschild’s action-integral invariants of the classical electron dynamics [10].

The works by Schwarzschild, and a closely-related paper that amplified his ideas published by his friend Paul Epstein several months later [11], were the first to show the power of the Hamiltonian formulation of dynamics for quantum systems, foreshadowing the future importance of Hamiltonians for quantum theory.  An essential part of the Hamiltonian formalism is the concept of phase space.  In his paper, Schwarzschild showed that the phase space of quantum systems was divided into small but finite elementary regions whose areas were equal to Planck’s constant $\hbar$ (see Fig. 6).  The areas were products of a small change in momentum coordinate $\Delta p$ and a corresponding small change in position coordinate $\Delta x$.  Therefore, the product $\Delta x\,\Delta p = \hbar$.  This observation, made in 1915 by Schwarzschild, was only one step away from Heisenberg’s uncertainty relation, twelve years before Heisenberg discovered it.  However, in 1915 Born’s probabilistic interpretation of quantum mechanics had not yet been made, nor the idea of measurement uncertainty, so Schwarzschild did not have the appropriate context in which to make the leap to the uncertainty principle.  Nonetheless, by introducing the action-angle coordinates as well as the Hamiltonian formalism applied to quantum systems, with the natural structure of phase space, Schwarzschild laid the foundation for the future developments in quantum theory made by the next generation.

Fig. 6  Expression of the division of phase space into elemental areas of action equal to h-bar (Ref. [9] pg. 550).

All Quiet on the Eastern Front

Towards the end of his second stay in Munich in 1900, prior to joining the Göttingen faculty, Schwarzschild had presented a paper at a meeting of the German Astronomical Society held in Heidelberg in August.  The topic was unlike anything he had tackled before.  It considered the highly theoretical question of whether the universe was non-Euclidean, and more specifically if it had curvature.  He concluded from observation that if the universe were curved, the radius of curvature must be larger than a lower limit lying between 50 and 2000 light years, depending on whether the geometry was hyperbolic or elliptical.  Schwarzschild was working out ideas of differential geometry and applying them to the universe at large at a time when Einstein was just graduating from the ETH where he skipped his math classes and had his friend Marcel Grossmann take notes for him.

The topic of Schwarzschild’s talk tells an important story about the warping of historical perspective by the “great man” syndrome.  In this case the great man is Einstein who is today given all the credit for discovering the warping of space.  His development of General Relativity is often portrayed as the work of a lone genius in the wilderness performing a blazing act of creation out of the void.  In fact, non-Euclidean geometry had been around for some time by 1900, five years before Einstein’s Special Theory and ten years before his first publications on the General Theory.  Gauss had developed the idea of intrinsic curvature of a manifold fifty years earlier, amplified by Riemann.  By the turn of the century alternative geometries were all the rage, and Schwarzschild considered whether there were sufficient astronomical observations to set limits on the size of curvature of the universe.  But revisionist history is just as prevalent in physics as in any field, and when someone like Einstein becomes so big in the mind’s eye, his shadow makes it difficult to see all the people standing behind him.

This is not meant to take away from the feat that Einstein accomplished.  The General Theory of Relativity, published by Einstein in its full form in 1915, was spectacular [12].  Einstein had taken vague notions about curved spaces and had made them specific, mathematically rigorous and intimately connected with physics through the mass-energy source term in his field equations.  His mathematics had gone beyond even what his mathematician friend and former collaborator Grossmann could achieve.  Yet Einstein’s field equations were nonlinear tensor differential equations in which the warping of space depended on the strength of energy fields, but the configuration of those energy fields depended on the warping of space.  This type of nonlinear equation is difficult to solve in general terms, and Einstein was not immediately aware of how to find the solutions to his own equations.

Therefore, it was no small surprise to him when he received a letter from the Eastern Front from an astronomer he barely knew who had found a solution, and a simple one at that (see Fig. 7), to his field equations.  Einstein probably wondered how he could have missed it, but he was generous and forwarded the letter to the Proceedings of the Prussian Academy of Sciences where it was published in 1916 [13].

Fig. 7  Schwarzschild’s solution of the Einstein Field Equations (Ref. [13] pg. 194).

In the same paper, Schwarzschild used his exact solution to find the exact equation that described the precession of the perihelion of Mercury that Einstein had only calculated approximately. The dynamical equations for Mercury are shown in Fig. 8.

Fig. 8  Explanation for the precession of the perihelion of Mercury (Ref. [13] pg. 195).
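The size of the effect can be estimated with the standard leading-order formula for the relativistic perihelion advance per orbit, 6 pi G M / (c^2 a (1 - e^2)).  This is the familiar first-order result rather than Schwarzschild’s exact orbit equation, and the orbital elements below are rounded reference values assumed for illustration.

```python
import math

GM_SUN = 1.32712440018e20    # gravitational parameter of the Sun, m^3/s^2
C = 299_792_458.0            # speed of light, m/s

# Rounded orbital elements for Mercury (assumed reference values)
SEMI_MAJOR_AXIS = 5.7909e10  # m
ECCENTRICITY = 0.2056
PERIOD_DAYS = 87.969

def perihelion_advance_per_orbit(gm, a, e):
    """Leading-order relativistic perihelion advance per orbit, in radians."""
    return 6.0 * math.pi * gm / (C**2 * a * (1.0 - e**2))

per_orbit = perihelion_advance_per_orbit(GM_SUN, SEMI_MAJOR_AXIS, ECCENTRICITY)
orbits_per_century = 36525.0 / PERIOD_DAYS
arcsec_per_century = math.degrees(per_orbit * orbits_per_century) * 3600.0
print(f"perihelion advance: {arcsec_per_century:.1f} arcseconds per century")
```

The result lands close to the famous 43 arcseconds per century that Newtonian perturbation theory could not account for.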

Schwarzschild’s solution to Einstein’s Field Equation of General Relativity was not a general solution, even for a point mass. He had constants of integration that could have arbitrary values, such as the characteristic length scale that Schwarzschild called “alpha”. It was David Hilbert who later expanded upon Schwarzschild’s work, giving the general solution and naming the characteristic length scale (where the metric diverges) after Schwarzschild. This is where the phrase “Schwarzschild Radius” got its name, and it stuck. In fact it stuck so well that Schwarzschild’s radius has now eclipsed much of the rest of Schwarzschild’s considerable accomplishments.
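To get a feel for the characteristic length scale, here is a minimal sketch that evaluates it in the modern form r_s = 2GM/c^2.  The function name is illustrative, and the gravitational parameters are standard reference values rather than numbers from Schwarzschild’s paper.

```python
C = 299_792_458.0             # speed of light, m/s
GM_SUN = 1.32712440018e20     # gravitational parameter of the Sun, m^3/s^2
GM_EARTH = 3.986004418e14     # gravitational parameter of the Earth, m^3/s^2

def schwarzschild_radius(gm):
    """Schwarzschild radius in meters for a body with gravitational parameter gm = G*M."""
    return 2.0 * gm / C**2

r_sun = schwarzschild_radius(GM_SUN)
r_earth = schwarzschild_radius(GM_EARTH)
print(f"Sun:   {r_sun / 1000:.2f} km")
print(f"Earth: {r_earth * 1000:.2f} mm")
```

Both values are tiny compared with the actual sizes of the bodies, which is why the divergence of the metric plays no role for ordinary stars and planets.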

Unfortunately, Schwarzschild’s accomplishments were cut short when he contracted an autoimmune disease that may have been hereditary. It is ironic that in the carnage of the Eastern Front, it was a genetic disease that caused his death at the age of 42. He was already suffering from the effects of the disease as he worked on his last publications. He was sent home from the front to his family in Potsdam where he passed away several months later having shepherded his final two papers through the publication process. His last paper, on the action-angle variables in quantum systems, was published on the day that he died.

Schwarzschild’s Legacy

Schwarzschild’s legacy was assured when he solved Einstein’s field equations and Einstein communicated it to the world. But his hidden legacy is no less important.

Schwarzschild’s application of the Hamiltonian formalism of canonical transformations and phase space for quantum systems set the stage for the later adoption of Hamiltonian methods in quantum mechanics. He came dangerously close to stating the uncertainty principle that catapulted Heisenberg to later fame, although he could not express it in probabilistic terms because he came too early.

Schwarzschild is considered to be the greatest German astronomer of the last hundred years. This is in part based on his work at the birth of stellar interferometry and in part on his development of stellar photometry and the calibration of the Cepheid variable stars that went on to revolutionize our view of our place in the universe. Solving Einstein’s field equations was just a sideline for him, a hobby to occupy his active and curious mind.


[1] Fizeau, H. L. (1849). “Sur une expérience relative à la vitesse de propagation de la lumière.” Comptes rendus de l’Académie des sciences 29: 90–92, 132.

[2] Foucault, J. L. (1862). “Détermination expérimentale de la vitesse de la lumière: parallaxe du Soleil.” Comptes rendus de l’Académie des sciences 55: 501–503, 792–596.

[3] Fizeau, H. (1859). “Sur les hypothèses relatives à l’éther lumineux.” Ann. Chim. Phys.  Ser. 4 57: 385–404.

[4] Fizeau, H. (1868). “Prix Bordin: Rapport sur le concours de l’annee 1867.” C. R. Acad. Sci. 66: 932.

[5] Michelson, A. A. (1890). “I. On the application of interference methods to astronomical measurements.” The London, Edinburgh, and Dublin Philosophical Magazine and Journal of Science 30(182): 1-21.

[6] Michelson, A. A. (1891). “Measurement of Jupiter’s Satellites by Interference.” Nature 45(1155): 160-161.

[7] Schwarzschild, K. (1896). “Über messung von doppelsternen durch interferenzen.” Astron. Nachr. 3335: 139.

[8] Ehrenfest, P. (1913). “Een mechanische theorema van Boltzmann en zijne betrekking tot de quanta theorie (A mechanical theorem of Boltzmann and its relation to the theory of energy quanta).” Verslag van de Gewone Vergaderingen der Wis- en Natuurkundige Afdeeling 22: 586-593.

[9] Schwarzschild, K. (1916). “Quantum hypothesis.” Sitzungsberichte Der Koniglich Preussischen Akademie Der Wissenschaften: 548-568.

[10] Ehrenfest, P. (1916). “Adiabatic invariables and quantum theory.” Annalen Der Physik 51: 327-352.

[11] Epstein, P. S. (1916). “The quantum theory.” Annalen Der Physik 51(18): 168-188.

[12] Einstein, A. (1915). “On the general theory of relativity.” Sitzungsberichte Der Koniglich Preussischen Akademie Der Wissenschaften: 778-786.

[13] Schwarzschild, K. (1916). “Über das Gravitationsfeld eines Massenpunktes nach der Einstein’schen Theorie.” Sitzungsberichte der Königlich-Preussischen Akademie der Wissenschaften: 189.

Orbiting Photons around a Black Hole

The physics of a path of light passing a gravitating body is one of the hardest concepts to understand in General Relativity, but it is also one of the easiest.  It is hard because there can be no force of gravity on light even though the path of a photon bends as it passes a gravitating body.  It is easy, because the photon is following the simplest possible path—a geodesic equation for force-free motion.

This blog picks up where my last blog left off, where I defined the geodesic equation and presented the Schwarzschild metric.  With those two equations in hand, we could simply solve for the null geodesics (a null geodesic is the path of a light beam through a manifold).  But there turns out to be a simpler approach that Einstein came up with himself (he never did like doing things the hard way).  He just had to sacrifice the fundamental postulate that he used to explain everything about Special Relativity.

Throwing Special Relativity Under the Bus

The fundamental postulate of Special Relativity states that the speed of light is the same for all observers.  Einstein posed this postulate, then used it to derive some of the most astonishing consequences of Special Relativity, such as $E = mc^2$.  This postulate is at the rock core of his theory of relativity and can be viewed as one of the simplest “truths” of our reality, or at least of our spacetime.

            Yet as soon as Einstein began thinking how to extend SR to a more general situation, he realized almost immediately that he would have to throw this postulate out.   While the speed of light measured locally is always equal to c, the apparent speed of light observed by a distant observer (far from the gravitating body) is modified by gravitational time dilation and length contraction.  This means that the apparent speed of light, as observed at a distance, varies as a function of position.  From this simple conclusion Einstein derived a first estimate of the deflection of light by the Sun, though he initially was off by a factor of 2.  (The full story of Einstein’s derivation of the deflection of light by the Sun and the confirmation by Eddington is in Chapter 7 of Galileo Unbound (Oxford University Press, 2018).)
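The factor of 2 can be made concrete with a short calculation.  For a ray grazing the Sun, the full general-relativistic deflection is 4GM/(c^2 R), while the early estimate gave half of that.  The constants are standard reference values and the function name is illustrative.

```python
import math

GM_SUN = 1.32712440018e20   # gravitational parameter of the Sun, m^3/s^2
R_SUN = 6.957e8             # nominal solar radius, m
C = 299_792_458.0           # speed of light, m/s

def to_arcsec(radians):
    return math.degrees(radians) * 3600.0

early_estimate = 2.0 * GM_SUN / (C**2 * R_SUN)   # first estimate, too small by a factor of 2
full_value = 4.0 * GM_SUN / (C**2 * R_SUN)       # full general-relativistic result

print(f"early estimate: {to_arcsec(early_estimate):.2f} arcsec")
print(f"full value:     {to_arcsec(full_value):.2f} arcsec")
```

The larger value, about 1.75 arcseconds, is the one that the eclipse measurements were designed to test.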

The “Optics” of Gravity

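The invariant element for a light path moving radially in the Schwarzschild geometry is

$$ds^2 = g_{00}\,c^2 dt^2 + g_{rr}\,dr^2 = 0$$

The apparent speed of light is then

$$c(r) = \frac{dr}{dt} = c\sqrt{-\frac{g_{00}}{g_{rr}}}$$

where c(r) is always less than c, when observing it from flat space.  The “refractive index” of space is defined, as for any optical material, as the ratio of the constant speed divided by the observed speed

$$n(r) = \frac{c}{c(r)} = \sqrt{-\frac{g_{rr}}{g_{00}}}$$

Because the Schwarzschild metric has the property

$$g_{rr} = -\frac{1}{g_{00}}, \qquad g_{00} = -\left(1 - \frac{r_s}{r}\right)$$

the effective refractive index of warped space-time is

$$n(r) = \frac{1}{1 - r_s/r}$$

with a divergence at the Schwarzschild radius $r_s$.

The refractive index of warped space-time in the limit of weak gravity can be used in the ray equation (also known as the Eikonal equation described in an earlier blog)

$$\frac{d}{ds}\left(n\,\frac{d\vec{r}}{ds}\right) = \nabla n$$

where the gradient of the refractive index of space is

$$\nabla n = -n^2\,\frac{r_s}{r^3}\,\vec{r}$$

The ray equation is then a four-variable flow

$$\frac{dx}{ds} = \frac{p_x}{n}, \qquad \frac{dy}{ds} = \frac{p_y}{n}, \qquad \frac{dp_x}{ds} = \frac{\partial n}{\partial x}, \qquad \frac{dp_y}{ds} = \frac{\partial n}{\partial y}$$

where $\vec{p} = n\,d\vec{r}/ds$.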

These equations represent a 4-dimensional flow for a light ray confined to a plane.  The trajectory of any light path is found by using an ODE solver subject to the initial conditions for the direction of the light ray.  This is simple for us to do today with Python or Matlab, but it was also something that could be done long before the advent of computers by early theorists of relativity like Max von Laue (1879 – 1960).
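As a quick check of the approach, the four-variable flow can be integrated for a ray that passes far from the mass and compared with the weak-field deflection angle 2 r_s / b, where b is the impact parameter.  This is a minimal sketch with assumed names and units: it uses the plain index n = 1/(1 - r_s/r), leaves out the additional correction factor that appears in the full listing further down, and relies on `scipy.integrate.solve_ivp` as the solver.

```python
import numpy as np
from scipy.integrate import solve_ivp

R_S = 1.0   # Schwarzschild radius in arbitrary length units (assumed)

def refractive_index(x, y):
    """Effective index n = 1/(1 - R_S/r) and its gradient -n^2 R_S (x, y)/r^3."""
    r = np.hypot(x, y)
    n = 1.0 / (1.0 - R_S / r)
    nx = -n**2 * R_S * x / r**3
    ny = -n**2 * R_S * y / r**3
    return n, nx, ny

def ray_flow(s, state):
    """Four-variable flow: position (x, y) and (px, py) = n * d(x, y)/ds."""
    x, y, px, py = state
    n, nx, ny = refractive_index(x, y)
    return [px / n, py / n, nx, ny]

def deflection_angle(impact_parameter, half_length=1.0e5):
    """Launch a ray parallel to the x-axis and return its bending angle in radians."""
    x0, y0 = -half_length, impact_parameter
    n0, _, _ = refractive_index(x0, y0)
    sol = solve_ivp(ray_flow, (0.0, 2.0 * half_length), [x0, y0, n0, 0.0],
                    method="DOP853", rtol=1e-10, atol=1e-12)
    px, py = sol.y[2, -1], sol.y[3, -1]
    return -np.arctan2(py, px)   # positive when the ray bends toward the mass

b = 1000.0 * R_S
angle = deflection_angle(b)
print(f"numerical deflection : {angle:.6e} rad")
print(f"weak-field 2 R_S / b : {2 * R_S / b:.6e} rad")
```

Since R_S = 2GM/c^2, the weak-field value 2 R_S / b is the same as 4GM/(c^2 b), the full deflection rather than the early half-value.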

The Relativity of Max von Laue

In the Fall of 1905 in Berlin, a young German physicist by the name of Max Laue was sitting in the physics colloquium at the University listening to another Max, his doctoral supervisor Max Planck, deliver a seminar on Einstein’s new theory of relativity.  Laue was struck by the simplicity of the theory, which at first seemed almost “simplistic” and hence hard to believe, but the beauty of the theory stuck with him, and he began to think through the consequences for experiments like the Fizeau experiment on partial ether drag.

Armand Hippolyte Louis Fizeau (1819 – 1896) in 1851 built one of the world’s first optical interferometers and used it to measure the speed of light inside moving fluids.  At that time the speed of light was believed to be a property of the luminiferous ether, and there were several opposing theories on how light would travel inside moving matter.  One theory would have the ether fully stationary, unaffected by moving matter, and hence the speed of light would be unaffected by motion.  An opposite theory would have the ether fully entrained by matter and hence the speed of light in moving matter would be a simple sum of speeds.  A middle theory considered that only part of the ether was dragged along with the moving matter.  This was Fresnel’s partial ether drag hypothesis that he had arrived at to explain why his friend François Arago had not observed any contribution to stellar aberration from the motion of the Earth through the ether.  When Fizeau performed his experiment, the results agreed closely with Fresnel’s drag coefficient, which seemed to settle the matter.  Yet when Michelson and Morley performed their experiments of 1887, there was no evidence for partial drag.

Even after the exposition by Einstein on relativity in 1905, the disagreement of the Michelson-Morley results with Fizeau’s results was not fully reconciled until Laue showed in 1907 that the velocity addition theorem of relativity gave complete agreement with the Fizeau experiment.  The velocity observed in the lab frame is found using the velocity addition theorem of special relativity. For the Fizeau experiment, water with a refractive index of n is moving with a speed v and hence the speed in the lab frame is

$$u = \frac{c/n + v}{1 + \dfrac{v}{nc}} \approx \frac{c}{n} + v\left(1 - \frac{1}{n^2}\right)$$

The difference in the speed of light between the stationary and the moving water is the difference

$$\Delta u = u - \frac{c}{n} \approx v\left(1 - \frac{1}{n^2}\right)$$

where the factor in parentheses is precisely the Fresnel drag coefficient.  This was one of the first definitive “proofs” of the validity of Einstein’s theory of relativity, and it made Laue one of relativity’s staunchest proponents.  Spurred on by his success with the Fresnel drag coefficient explanation, Laue wrote the first monograph on relativity theory, publishing it in 1910.
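The agreement between relativistic velocity addition and the Fresnel drag coefficient can be verified numerically.  The sketch below adds the speed c/n of light in the rest frame of the water to the water speed v using the exact relativistic formula, then compares the resulting shift with v(1 - 1/n^2).  The refractive index and flow speed are illustrative values, not Fizeau’s recorded data.

```python
C = 299_792_458.0   # speed of light in vacuum, m/s

def lab_speed_exact(n, v):
    """Relativistic addition of c/n (light in the water frame) and v (water speed)."""
    u_rest = C / n
    return (u_rest + v) / (1.0 + u_rest * v / C**2)

def fresnel_drag(n, v):
    """Speed shift predicted by the Fresnel drag coefficient (1 - 1/n^2)."""
    return v * (1.0 - 1.0 / n**2)

n_water = 1.333   # approximate refractive index of water (assumed)
v_water = 7.0     # flow speed in m/s (assumed, for illustration)

shift_exact = lab_speed_exact(n_water, v_water) - C / n_water
shift_fresnel = fresnel_drag(n_water, v_water)
print(f"relativistic shift : {shift_exact:.6f} m/s")
print(f"Fresnel drag shift : {shift_fresnel:.6f} m/s")
```

At laboratory flow speeds the two agree far more closely than any interferometer of the period could resolve, so Fizeau’s data could not distinguish a partially dragged ether from relativistic kinematics.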

Fig. 1 Front page of von Laue’s textbook, first published in 1910, on Special Relativity (this is the 4th edition, published in 1921).

A Nobel Prize for Crystal X-ray Diffraction

In 1909 Laue became a Privatdozent under Arnold Sommerfeld (1868 – 1951) at the university in Munich.  In the Spring of 1912 he was walking in the Englischer Garten on the northern edge of the city talking with Paul Ewald (1888 – 1985) who was finishing his doctorate with Sommerfeld studying the structure of crystals.  Ewald was considering the interaction of optical wavelengths with the periodic lattice when it struck Laue that x-rays would have the kind of short wavelengths that would allow the crystal to act as a diffraction grating to produce multiple diffraction orders.  Within a few weeks of that discussion, two of Sommerfeld’s students (Friedrich and Knipping) used an x-ray source and photographic film to look for the predicted diffraction spots from a copper sulfate crystal.  When the film was developed, it showed a constellation of dark spots for each of the diffraction orders of the x-rays scattered from the multiple periodicities of the crystal lattice.  Two years later, in 1914, Laue was awarded the Nobel prize in physics for the discovery.  That same year his father was elevated to the hereditary nobility in the Prussian empire and Max Laue became Max von Laue.

Von Laue was not one to take risks, and he remained conservative in many of his interests.  He was immensely respected and played important roles in the administration of German science, but his scientific contributions after receiving the Nobel Prize were only modest.  Yet as the Nazis came to power in the early 1930’s, he was one of the few physicists to stand up and resist the Nazi take-over of German physics.  He was especially disturbed by the plight of the Jewish physicists.  In 1933 he was invited to give the keynote address at the conference of the German Physical Society in Würzburg where he spoke out against the Nazi rejection of relativity as they branded it “Jewish science”.  In his speech he likened Einstein, the target of much of the propaganda, to Galileo.  He said, “No matter how great the repression, the representative of science can stand erect in the triumphant certainty that is expressed in the simple phrase: And yet it moves.”  Von Laue believed that truth would hold out in the face of the proscription against relativity theory by the Nazi regime.  The quote “And yet it moves” is supposed to have been muttered by Galileo just after his abjuration before the Inquisition, referring to the Earth moving around the Sun.  Although the quote is famous, it is believed to be a myth.

In an odd side-note of history, von Laue sent his gold Nobel prize medal to Denmark for its safe keeping with Niels Bohr so that it would not be paraded about by the Nazi regime.  Yet when the Nazis invaded Denmark, to avoid having the medal fall into the hands of the Nazis, it was dissolved in aqua regia by a member of Bohr’s team, George de Hevesy.  The gold completely dissolved into an orange liquid that was stored in a beaker high on a shelf through the war.  When Denmark was finally freed, the dissolved gold was precipitated out and a new medal was struck by the Nobel committee and re-presented to von Laue in a ceremony in 1951.

The Orbits of Light Rays

Von Laue’s interests always stayed close to the properties of light and electromagnetic radiation ever since he was introduced to the field when he studied with Woldemar Voigt at Göttingen in 1899.  This interest included the theory of relativity, and only a few years after Einstein published his theory of General Relativity and Gravitation, von Laue added to his earlier textbook on relativity by writing a second volume on the general theory.  The new volume was published in 1920 and included the theory of the deflection of light by gravity.

One of the very few illustrations in his second volume is of light interacting with a supermassive gravitational field characterized by a Schwarzschild radius.  (No one at the time called it a “black hole”, nor even mentioned Schwarzschild.  That terminology came much later.)  He shows in the drawing how light, if incident at just the right impact parameter, would actually loop around the object.  This is the first time such a diagram appeared in print, showing the trajectory of light so strongly affected by gravity.

Fig. 2 A page from von Laue’s second volume on relativity (first published in 1920) showing the orbit of a photon around a compact mass with “gravitational cutoff” (later known as a “black hole”). The figure is drawn semi-quantitatively, but the phenomenon was clearly understood by von Laue.

Python Code

#!/usr/bin/env python3
# -*- coding: utf-8 -*-
"""
Created on Tue May 28 11:50:24 2019
@author: nolte
D. D. Nolte, Introduction to Modern Dynamics: Chaos, Networks, Space and Time, 2nd ed. (Oxford,2019)
"""

import numpy as np
from scipy import integrate
from matplotlib import pyplot as plt

plt.close('all')

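def create_circle():
    """Return a filled black disk of radius 10 that represents the black hole."""
    circle = plt.Circle((0,0), radius= 10, color = 'black')
    return circle

def show_shape(patch):
    """Add the patch to the current axes with equal scaling.

    The figure itself is displayed by the plt.show() call at the end of the script.
    """
    ax=plt.gca()
    ax.add_patch(patch)
    plt.axis('scaled')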
    
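def refindex(x,y):
    """Return the effective refractive index n and its corrected gradient (nx, ny)."""
    
    A = 10        # Schwarzschild radius in plot units
    eps = 1e-6    # small offset added to the radius to regularize the division
    
    rp0 = np.sqrt(x**2 + y**2)   # radial distance from the black hole
        
    n = 1/(1 - A/(rp0+eps))
    fac = np.abs((1-9*(A/rp0)**2/8))   # approx correction to Eikonal
    nx = -fac*n**2*A*x/(rp0+eps)**3
    ny = -fac*n**2*A*y/(rp0+eps)**3
     
    return [n,nx,ny]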

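def flow_deriv(state,t):
    """Ray (Eikonal) equations for state = (x, y, z, w), where (z, w) = n*(dx/ds, dy/ds)."""
    x, y, z, w = state
    
    [n,nx,ny] = refindex(x,y)
        
    yp = np.zeros(shape=(4,))
    yp[0] = z/n    # dx/ds
    yp[1] = w/n    # dy/ds
    yp[2] = nx     # d(n dx/ds)/ds
    yp[3] = ny     # d(n dy/ds)/ds
    
    return yp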
                
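# Launch a fan of parallel rays from the left, one per impact parameter
for loop in range(-5,30):
    
    xstart = -100
    ystart = -2.245 + 4*loop    # impact parameter of this ray
    print(ystart)
    
    [n,nx,ny] = refindex(xstart,ystart)

    # Initial state: position plus n times the unit direction (1, 0)
    y0 = [xstart, ystart, n, 0]

    tspan = np.linspace(1,400,2000)

    y = integrate.odeint(flow_deriv, y0, tspan)

    xx = y[1:2000,0]
    yy = y[1:2000,1]

    plt.figure(1)
    lines = plt.plot(xx,yy)
    plt.setp(lines, linewidth=1)

# Title the shared figure once, after all rays have been drawn
plt.title('Photon Orbits')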
    
c = create_circle()
show_shape(c)
axes = plt.gca()
axes.set_xlim([-100,100])
axes.set_ylim([-100,100])

# Now set up a circular photon orbit
xstart = 0
ystart = 15

[n,nx,ny] = refindex(xstart,ystart)

y0 = [xstart, ystart, n, 0]

tspan = np.linspace(1,94,1000)

y = integrate.odeint(flow_deriv, y0, tspan)

xx = y[1:1000,0]
yy = y[1:1000,1]

plt.figure(1)
lines = plt.plot(xx,yy)
plt.setp(lines, linewidth=2, color = 'black')
plt.show()

One of the most striking effects of gravity on photon trajectories is the possibility for a photon to orbit a black hole in a circular orbit. This is shown in Fig. 3 as the black circular ring for a photon at a radius equal to 1.5 times the Schwarzschild radius. This radius defines what is known as the photon sphere. However, the orbit is not stable. Slight deviations will send the photon spiraling outward or inward.
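The instability can be illustrated with the same effective refractive index used in the Python code above. The following is a minimal sketch and not part of the original program: it replaces `odeint` with a hand-written fourth-order Runge-Kutta stepper, and the helper names `deriv`, `rk4_step` and `final_radius` are hypothetical. Rays are launched tangentially at radii slightly inside, exactly at, and slightly outside 1.5 Schwarzschild radii.

```python
import math

A = 10.0  # Schwarzschild radius in plot units, as in the code above


def deriv(state):
    """Right-hand side of the ray equations for state (x, y, px, py)."""
    x, y, px, py = state
    r = math.hypot(x, y)
    n = 1.0 / (1.0 - A / r)                    # effective refractive index
    fac = abs(1.0 - 9.0 * (A / r) ** 2 / 8.0)  # same correction factor as above
    g = -fac * n ** 2 * A / r ** 3             # common factor of the corrected gradient
    return (px / n, py / n, g * x, g * y)


def rk4_step(state, h):
    """Advance the state by one classical Runge-Kutta step of size h."""
    k1 = deriv(state)
    k2 = deriv(tuple(s + 0.5 * h * k for s, k in zip(state, k1)))
    k3 = deriv(tuple(s + 0.5 * h * k for s, k in zip(state, k2)))
    k4 = deriv(tuple(s + h * k for s, k in zip(state, k3)))
    return tuple(s + h * (a + 2 * b + 2 * c + d) / 6.0
                 for s, a, b, c, d in zip(state, k1, k2, k3, k4))


def final_radius(r_start, path_length=50.0, h=0.05):
    """Launch a ray tangentially at radius r_start and return its final radius."""
    n0 = 1.0 / (1.0 - A / r_start)
    state = (0.0, r_start, n0, 0.0)   # same launch condition as the circular orbit
    for _ in range(int(round(path_length / h))):
        state = rk4_step(state, h)
    return math.hypot(state[0], state[1])


# Compare a ray on the photon sphere with two slightly displaced neighbors
for r_start in (14.99, 15.0, 15.01):
    print(r_start, final_radius(r_start))
```

In this model the ray started exactly at 1.5 Schwarzschild radii should stay on the circle over this path length, while the two neighbors drift away from it, one inward and one outward.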

The Eikonal approximation does not strictly hold under strong gravity, but the Eikonal equations with the effective refractive index of space still yield semi-quantitative behavior. In the Python code, a correction factor is used to match the theory to the circular photon orbits, while still agreeing with trajectories far from the black hole. The results of the calculation are shown in Fig. 3. For large impact parameters, the rays are deflected through a finite angle. At a critical impact parameter, near 3 times the Schwarzschild radius, the ray loops around the black hole. For smaller impact parameters, the rays are captured by the black hole.
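The role of the correction factor can be checked by hand. For a ray to follow a circle of radius r, its curvature must equal 1/r, and in the ray equations of the code above the curvature of a tangential ray is fac·n·A/r². The sketch below evaluates that condition with and without the factor, assuming the same index n = 1/(1 - A/r); the function names are illustrative and do not appear in the original program.

```python
A = 10.0  # Schwarzschild radius in plot units, as in the code above


def index(r):
    """Effective refractive index n(r) = 1 / (1 - A/r)."""
    return 1.0 / (1.0 - A / r)


def correction(r):
    """Correction factor applied to the index gradient in the code above."""
    return abs(1.0 - 9.0 * (A / r) ** 2 / 8.0)


def circular_orbit_residual(r, corrected=True):
    """Return r*curvature - 1, which vanishes on a circular ray of radius r."""
    fac = correction(r) if corrected else 1.0
    curvature = fac * index(r) * A / r ** 2   # |corrected gradient of n| / n
    return r * curvature - 1.0


print(circular_orbit_residual(1.5 * A))                    # corrected model
print(circular_orbit_residual(2.0 * A, corrected=False))   # uncorrected model
print(correction(100.0 * A))                               # far-field limit
```

Under these assumptions the corrected residual vanishes at r = 1.5A, the uncorrected one vanishes only at r = 2A, and the factor tends to 1 far from the black hole, which is why distant rays are essentially unaffected by it.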

Fig. 3 Photon orbits near a black hole calculated using the Eikonal equation and the effective refractive index of warped space. One ray, near the critical impact parameter, loops around the black hole as predicted by von Laue. The central black circle is the black hole with a Schwarzschild radius of 10 units. The black ring is the circular photon orbit at a radius 1.5 times the Schwarzschild radius.

Photons pile up around the black hole at the photon sphere. The first image ever of the photon sphere of a black hole was made earlier this year (announced April 10, 2019). The image shows the shadow of the supermassive black hole in the center of Messier 87 (M87), an elliptical galaxy 55 million light-years from Earth. This black hole is 6.5 billion times the mass of the Sun. Imaging the photon sphere required eight ground-based radio telescopes placed around the globe, operating together to form a single telescope with an effective aperture the size of our planet.  The resolution of such a large telescope would allow one to image a half-dollar coin on the surface of the Moon, although this telescope operates in the radio frequency range rather than the optical.
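The half-dollar comparison can be checked with a rough diffraction estimate, θ ≈ λ/D. The numbers below are assumptions that are not stated in the text: an observing wavelength of 1.3 mm, an aperture equal to Earth's diameter (about 12,742 km), a half-dollar diameter of 30.6 mm, and a mean Earth-Moon distance of 384,400 km. This is an order-of-magnitude sketch only.

```python
import math

WAVELENGTH_M = 1.3e-3        # assumed observing wavelength (1.3 mm)
APERTURE_M = 1.2742e7        # assumed aperture: Earth's diameter
COIN_DIAMETER_M = 30.6e-3    # assumed half-dollar diameter
MOON_DISTANCE_M = 3.844e8    # assumed mean Earth-Moon distance

# Conversion from radians to micro-arcseconds
RAD_TO_MICROARCSEC = math.degrees(1.0) * 3600.0 * 1e6


def diffraction_limit(wavelength, aperture):
    """Approximate angular resolution (radians) of an aperture: theta ~ lambda / D."""
    return wavelength / aperture


def angular_size(diameter, distance):
    """Small-angle angular size (radians) of an object seen from a distance."""
    return diameter / distance


resolution = diffraction_limit(WAVELENGTH_M, APERTURE_M)
coin = angular_size(COIN_DIAMETER_M, MOON_DISTANCE_M)

print(f"resolution:   {resolution * RAD_TO_MICROARCSEC:.1f} micro-arcseconds")
print(f"coin on Moon: {coin * RAD_TO_MICROARCSEC:.1f} micro-arcseconds")
```

With these inputs both angles come out near 20 micro-arcseconds, so the comparison holds to within a factor of order one.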

Fig. 4 Scientists have obtained the first image of a black hole, using Event Horizon Telescope observations of the center of the galaxy M87. The image shows a bright ring formed as light bends in the intense gravity around a black hole that is 6.5 billion times more massive than the Sun.

Further Reading

D. D. Nolte, Introduction to Modern Dynamics: Chaos, Networks, Space and Time, 2nd Ed. (Oxford University Press, 2019)

B. Lavenda, The Optical Properties of Gravity, J. Mod. Phys. 8, 803-838 (2017)