The Solvay Debates: Einstein versus Bohr

Einstein is the alpha of the quantum. Einstein is also the omega. Although he was the one who established the quantum of energy and matter (see my Blog Einstein vs Planck), Einstein pitted himself in a running debate against Niels Bohr’s emerging interpretation of quantum physics that had, in Einstein’s opinion, severe deficiencies. Between sessions during a series of conferences known as the Solvay Congresses over a period of eight years from 1927 to 1935, Einstein constructed challenges of increasing sophistication to confront Bohr and his quasi-voodoo attitudes about wave-function collapse. To meet the challenge, Bohr sharpened his arguments and bested Einstein, who ultimately withdrew from the field of battle. Einstein, as quantum physics’ harshest critic, played a pivotal role, almost against his will, in establishing the Copenhagen interpretation of quantum physics that rules to this day, and in inventing the principle of entanglement which lies at the core of almost all quantum information technology today.

Debate Timeline

  • Fifth Solvay Congress: 1927 October Brussels: Debate Round 1
    • Einstein and ensembles
  • Sixth Solvay Congress: 1930 Debate Round 2
    • Photon in a box
  • Seventh Solvay Congress: 1933
    • Einstein absent (visiting the US when Hitler takes power…decides not to return to Germany.)
  • Physical Review 1935: Debate Round 3
    • EPR paper and Bohr’s response
    • Schrödinger’s Cat
  • Notable Nobel Prizes
    • 1918 Planck
    • 1921 Einstein
    • 1922 Bohr
    • 1932 Heisenberg
    • 1933 Dirac and Schrödinger

The Solvay Conferences

The Solvay congresses were unparalleled scientific meetings of their day.  They were attended by invitation only, and invitations were offered only to the top physicists concerned with the selected topic of each meeting.  The Solvay congresses were held about every three years, always in Belgium, supported by the Belgian chemical industrialist Ernest Solvay.  The first meeting, held in 1911, was on the topic of radiation and quanta.

Fig. 1 First Solvay Congress (1911). Einstein (standing second from right) was one of the youngest attendees.

The fifth meeting, held in 1927, was on electrons and photons and focused on the recent rapid advances in quantum theory.  The old quantum guard was invited—Planck, Bohr and Einstein.  The new quantum guard was invited as well—Heisenberg, de Broglie, Schrödinger, Born, Pauli, and Dirac.  Heisenberg and Bohr joined forces to present a united front meant to solidify what later became known as the Copenhagen interpretation of quantum physics.  The basic principles of the interpretation include the wavefunction of Schrödinger, the probabilistic interpretation of Born, the uncertainty principle of Heisenberg, the complementarity principle of Bohr and the collapse of the wavefunction during measurement.  The chief conclusion that Heisenberg and Bohr sought to impress on the assembled attendees was that the theory of quantum processes was complete, meaning that unknown or uncertain  characteristics of measurements could not be attributed to lack of knowledge or understanding, but were fundamental and permanently inaccessible.

Fig. 2 Fifth Solvay Congress (1927). Einstein front and center. Bohr on the far right middle row.

Einstein was not convinced by that argument, and he rose to his feet to object after Bohr’s informal presentation of his complementarity principle.  Einstein insisted that uncertainties in measurement were not fundamental, but were caused by incomplete information, that, if known, would accurately account for the measurement results.  Bohr was not prepared for Einstein’s critique and brushed it off, but what ensued in the dining hall and the hallways of the Hotel Metropole in Brussels over the next several days has become one of the most famous scientific debates of the modern era, known as the Bohr-Einstein debate on the meaning of quantum theory.  The debate gently raged night and day through the fifth congress, and was renewed three years later at the 1930 congress.  It finished in a final flurry of published papers in 1935 that launched some of the central concepts of quantum theory, including the idea of quantum entanglement and, of course, Schrödinger’s cat.

Einstein’s strategy, to refute Bohr, was to construct careful thought experiments that envisioned perfect experiments, without errors, that measured properties of ideal quantum systems.  His aim was to paint Bohr into a corner from which he could not escape, caught by what Einstein assumed was the inconsistency of complementarity.  Einstein’s “thought experiments” used electrons passing through slits, diffracting as required by Schrödinger’s theory, but being detected by classical measurements.  Einstein would present a thought experiment to Bohr, who would then retreat to consider the way around Einstein’s arguments, returning the next hour or the next day with his answer, only to be confronted by yet another device of Einstein’s clever imagination that would force Bohr to retreat again.  The spirit of this back-and-forth encounter between Bohr and Einstein is caught dramatically in the words of Paul Ehrenfest, who witnessed the debate first hand, partially mediating between Bohr and Einstein, both of whom he respected deeply.

“Brussels-Solvay was fine!… BOHR towering over everybody.  At first not understood at all … , then  step by step defeating everybody.  Naturally, once again the awful Bohr incantation terminology.  Impossible for anyone else to summarise … (Every night at 1 a.m., Bohr came into my room just to say ONE SINGLE WORD to me, until three a.m.)  It was delightful for me to be present during the conversation between Bohr and Einstein.  Like a game of chess, Einstein all the time with new examples.  In a certain sense a sort of Perpetuum Mobile of the second kind to break the UNCERTAINTY RELATION.  Bohr from out of philosophical smoke clouds constantly searching for the tools to crush one example after the other.  Einstein like a jack-in-the-box; jumping out fresh every morning.  Oh, that was priceless.  But I am almost without reservation pro Bohr and contra Einstein.  His attitude to Bohr is now exactly like the attitude of the defenders of absolute simultaneity towards him …” [1]

The most difficult example that Einstein constructed during the fifth Solvay Congress involved an electron double-slit apparatus that could measure, in principle, the momentum imparted to the slit by the passing electron, as shown in Fig. 3.  The electron gun is a point source that emits the electrons in a range of angles that illuminates the two slits.  The slits are small relative to a de Broglie wavelength, so the electron wavefunctions diffract according to Schrödinger’s wave mechanics to illuminate the detection plate.  Because of the interference of the electron waves from the two slits, electrons are detected clustered in intense fringes separated by dark fringes.

So far, everyone was in agreement with these suggested results.  The key next step is the assumption that the electron gun emits only a single electron at a time, so that only one electron is present in the system at any given time.  Furthermore, the screen with the double slit is suspended on a spring, and the position of the screen is measured with complete accuracy by a displacement meter.  When the single electron passes through the entire system, it imparts a momentum kick to the screen, which is measured by the meter.  It is also detected at a specific location on the detection plate.  Knowing the position of the electron detection, and the momentum kick to the screen, provides information about which slit the electron passed through, and gives simultaneous position and momentum values to the electron that have no uncertainty, apparently rebutting the uncertainty principle.             

Fig. 3 Einstein’s single-electron thought experiment in which the recoil of the screen holding the slits can be measured to tell which way the electron went. Bohr showed that the more “which way” information is obtained, the more washed-out the interference pattern becomes.

This challenge by Einstein was the culmination of successively more sophisticated examples that he had to pose to combat Bohr, and Bohr was not going to let it pass unanswered.  With ingenious insight, Bohr recognized that the key element in the apparatus was the fact that the screen with the slits must have finite mass if the momentum kick by the electron were to produce a measurable displacement.  But if the screen has finite mass, and hence a finite momentum kick from the electron, then there must be an uncertainty in the position of the slits.  This uncertainty immediately translates into a washout of the interference fringes.  In fact the more information that is obtained about which slit the electron passed through, the more the interference is washed out.  It was a perfect example of Bohr’s own complementarity principle.  The more the apparatus measures particle properties, the less it measures wave properties, and vice versa, in a perfect balance between waves and particles. 
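Bohr’s washout argument can be mimicked numerically.  The sketch below is my own toy model, not Bohr’s actual calculation: each detected electron sees the slit screen displaced by a random amount, which shifts the phase of its two-slit interference term, and averaging over a Gaussian spread of displacements erases the fringes.

```python
# Toy model of fringe washout: a random slit-screen displacement shifts the
# interference phase electron by electron; averaging over many electrons
# suppresses the interference term, i.e. reduces the fringe visibility.
import math
import random

random.seed(0)

def fringe_visibility(sigma_phase, n_electrons=20000):
    """Average interference term <cos(delta_phi)> over random phase shifts."""
    return sum(math.cos(random.gauss(0.0, sigma_phase))
               for _ in range(n_electrons)) / n_electrons

v_fixed_screen = fringe_visibility(sigma_phase=0.0)   # rigid screen: sharp fringes
v_recoil_screen = fringe_visibility(sigma_phase=2.0)  # measurable recoil: washed out

print(v_fixed_screen, v_recoil_screen)
```

The more precisely the recoil (and hence the which-way information) is registered, the larger the phase spread and the lower the visibility, in the spirit of complementarity.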

Einstein grudgingly admitted defeat at the end of the first round, but he was not defeated.  Three years later he came back armed with more clever thought experiments, ready for the second round in the debate.

The Sixth Solvay Conference: 1930

At the Solvay Congress of 1930, Einstein was ready with even more difficult challenges.  His ultimate idea was to construct a box containing photons, just like the original black bodies that launched Planck’s quantum hypothesis thirty years before.  The box is attached to a weighing scale so that the weight of the box plus the photons inside can be measured with arbitrary accuracy.  A shutter over a hole in the box is opened for a time T, and a photon is emitted.  Because the photon has energy, it has an equivalent mass (Einstein’s own famous E = mc²), and the mass of the box changes by an amount equal to the photon energy divided by the speed of light squared: m = E/c².  If the scale has arbitrary accuracy, then the energy of the photon has no uncertainty.  In addition, because the shutter was open for only a time T, the time of emission similarly has no uncertainty.  Therefore, the product of the energy uncertainty and the time uncertainty is much smaller than Planck’s constant, apparently violating Heisenberg’s precious uncertainty principle.
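For a sense of scale (my numbers, not part of the original debate), the mass deficit that a single escaping visible photon leaves in the box via m = E/c² is minuscule, which is why Einstein had to grant the scale ideal, arbitrary accuracy:

```python
# Mass equivalent of a single visible photon, m = E/c^2.
E_photon = 2.0 * 1.602e-19   # a ~2 eV visible photon, in joules
c = 2.998e8                  # speed of light, m/s

delta_m = E_photon / c**2    # mass change of the box, ~3.6e-36 kg
print(delta_m)
```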

Bohr was stopped in his tracks with this challenge.  Although he sensed immediately that Einstein had missed something (because Bohr had complete confidence in the uncertainty principle), he could not put his finger immediately on what it was.  That evening he wandered from one attendee to another, very unhappy, trying to persuade them and saying that Einstein could not be right because it would be the end of physics.  At the end of the evening, Bohr was no closer to a solution, and Einstein was looking smug.  However, by the next morning Bohr reappeared tired but in high spirits, and he delivered a master stroke.  Where Einstein had used special relativity against Bohr, Bohr now used Einstein’s own general relativity against him.

The key insight was that the weight of the box must be measured, and the process of measurement was just as important as the quantum process being measured—this was one of the cornerstones of the Copenhagen interpretation.  So Bohr envisioned a measuring apparatus composed of a spring and a scale with the box suspended in gravity from the spring.  As the photon leaves the box, the weight of the box changes, and so does the deflection of the spring, changing the height of the box.  This change in height, in a gravitational potential, causes the timing of the shutter to change according to the law of gravitational time dilation in general relativity.  Calculating the general relativistic uncertainty in the time, coupled with the uncertainty in the weight of the box, produced a product that was at least as big as Planck’s constant—Heisenberg’s uncertainty principle was saved!
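Bohr’s chain of estimates can be checked with numbers.  The sketch below paraphrases his semiclassical reasoning with assumed parameter values: weighing the box to accuracy Δm over a time T limits the pointer momentum resolution, Heisenberg then smears the box height, and gravitational time dilation smears the shutter timing.  The product ΔE·ΔT comes out to ħ regardless of the parameters chosen.

```python
# Bohr's rebuttal as arithmetic: the parameter choices cancel and the
# energy-time uncertainty product always equals hbar.
hbar = 1.055e-34   # reduced Planck constant, J*s
c = 2.998e8        # speed of light, m/s
g = 9.81           # gravitational acceleration, m/s^2

def uncertainty_product(dm, T):
    dp = g * T * dm          # impulse resolution needed to sense dm over time T
    dq = hbar / dp           # Heisenberg: resulting position spread of the box
    dT = g * dq * T / c**2   # gravitational time-dilation smearing of the timing
    dE = dm * c**2           # energy uncertainty of the escaped photon
    return dE * dT

print(uncertainty_product(dm=1e-30, T=1.0))   # equals hbar
print(uncertainty_product(dm=1e-20, T=1e-3))  # equals hbar again
```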

Fig. 4 Einstein’s thought experiment that uses special relativity to refute quantum mechanics. Bohr then invoked Einstein’s own general relativity to refute him.

Entanglement and Schrödinger’s Cat

Einstein ceded the point to Bohr but was not convinced. He still believed that quantum mechanics was not a “complete” theory of quantum physics and he continued to search for the perfect thought experiment that Bohr could not escape. Even today when we have become so familiar with quantum phenomena, the Copenhagen interpretation of quantum mechanics has weird consequences that seem to defy common sense, so it is understandable that Einstein had his reservations.

After the sixth Solvay congress Einstein and Schrödinger exchanged many letters complaining to each other about Bohr’s increasing strangle-hold on the interpretation of quantum mechanics. Egging each other on, they both constructed their own final assault on Bohr. The irony is that the concepts they devised to throw down quantum mechanics have today become cornerstones of the theory. For Einstein, his final salvo was “Entanglement”. For Schrödinger, his final salvo was his “cat”. Today, Entanglement and Schrödinger’s Cat have become enshrined on the altar of quantum interpretation even though their original function was to thwart that interpretation.

The final round of the debate was carried out, not at a Solvay congress, but in the Physical Review journal by Einstein [2] and Bohr [3], and in Naturwissenschaften by Schrödinger [4].

In 1969, Heisenberg looked back on these years and said,

To those of us who participated in the development of atomic theory, the five years following the Solvay Conference in Brussels in 1927 looked so wonderful that we often spoke of them as the golden age of atomic physics. The great obstacles that had occupied all our efforts in the preceding years had been cleared out of the way, the gate to an entirely new field, the quantum mechanics of the atomic shells stood wide open, and fresh fruits seemed ready for the picking. [5]


[1] A. Whitaker, Einstein, Bohr, and the quantum dilemma : from quantum theory to quantum information, 2nd ed. Cambridge University Press, 2006. (pg. 210)

[2] A. Einstein, B. Podolsky, and N. Rosen, “Can quantum-mechanical description of physical reality be considered complete?,” Physical Review, vol. 47, no. 10, pp. 777–780, May (1935)

[3] N. Bohr, “Can quantum-mechanical description of physical reality be considered complete?,” Physical Review, vol. 48, no. 8, pp. 696-702, Oct (1935)

[4] E. Schrodinger, “The current situation in quantum mechanics,” Naturwissenschaften, vol. 23, pp. 807-812, (1935)

[5] W Heisenberg, Physics and beyond : Encounters and conversations (Harper, New York, 1971)

A Commotion in the Stars: The Legacy of Christian Doppler

Christian Andreas Doppler (1803 – 1853) was born in Salzburg, Austria, to a longstanding family of stonemasons.  As a second son, he was expected to help his older brother run the business, so his father had him tested in his 18th year for his suitability for a career in business.  The examiner Simon Stampfer (1790 – 1864), an Austrian mathematician and inventor teaching at the Lyceum in Salzburg, discovered that Doppler had a gift for mathematics and was better suited for a scientific career.  Stampfer’s enthusiasm convinced Doppler’s father to enroll him in the Polytechnic Institute in Vienna (founded only a few years earlier in 1815) where he took classes in mathematics, mechanics and physics [1] from 1822 to 1825.  Doppler excelled in his courses, but was dissatisfied with the narrowness of the education, yearning for more breadth and depth in his studies and for more significance in his positions, feelings he would struggle with for his entire short life.  He left Vienna, returning to the Lyceum in Salzburg to round out his education with philosophy, languages and poetry.  Unfortunately, this four-year detour away from technical studies impeded his ability to gain a permanent technical position, so he began a temporary assistantship with a mathematics professor at Vienna.  As he approached his 30th birthday this term expired without prospects.  He was about to emigrate to America when he finally received an offer to teach at a secondary school in Prague.


Salzburg Austria

Doppler in Prague

Prague gave Doppler new life.  He was a professor with a position that allowed him to marry the daughter of a silver and goldsmith from Salzburg.  He began to publish scholarly papers, and in 1837 was appointed supplementary professor of Higher Mathematics and Geometry at the Prague Technical Institute, promoted to full professor in 1841.  It was here that he met the unusual genius Bernard Bolzano (1781 – 1848), recently returned from political exile in the countryside.  Bolzano was a philosopher and mathematician who developed rigorous concepts of mathematical limits and is famous today for his part in the Bolzano-Weierstrass theorem in functional analysis, but he had been too liberal and too outspoken for the conservative Austrian regime and had been dismissed from the University in Prague in 1819.  He was forbidden to publish his work in Austrian journals, which is one reason why much of Bolzano’s groundbreaking work in functional analysis remained unknown during his lifetime.  However, he participated in the Bohemian Society for Science from a distance, recognizing the inventive tendencies in the newcomer Doppler and supporting him for membership in the Bohemian Society.  When Bolzano was allowed to return in 1842 to the Polytechnic Institute in Prague, he and Doppler became close friends as kindred spirits.

Prague, Czech Republic

On May 25, 1842, Bolzano presided as chairman over a meeting of the Bohemian Society for Science on the day that Doppler read a landmark paper on the color of stars to a meagre assembly of only five regular members of the Society [2].  The turn-out was so small that the meeting may have been held in the robing room of the Society rather than in the meeting hall itself.  Leading up to this famous moment, Doppler’s interests were peripatetic, ranging widely over mathematical and physical topics, but he had lately become fascinated by astronomy and by the phenomenon of stellar aberration.  Stellar aberration was discovered by James Bradley in 1729 and explained as the result of the Earth’s yearly motion around the Sun, causing the apparent location of a distant star to change slightly depending on the direction of the Earth’s motion.  Bradley explained this in terms of the finite speed of light and was able to estimate it to within several percent [3].  As Doppler studied Bradley aberration, he wondered how the relative motion of the Earth would affect the color of the star.  By making a simple analogy of a ship traveling with, or against, a series of ocean waves, he concluded that the frequency of impact of the peaks and troughs of waves on the ship was no different than the arrival of peaks and troughs of the light waves impinging on the eye.  Because perceived color was related to the frequency of excitation in the eye, he concluded that the color of light would be slightly shifted to the blue if approaching, and to the red if receding from, the light source. 

Doppler wave fronts from a source emitting spherical waves moving with speeds β relative to the speed of the wave in the medium.

Doppler calculated the magnitude of the effect by taking a simple ratio of the speed of the observer relative to the speed of light.  What he found was that the speed of the Earth, though sufficient to cause the detectable aberration in the position of stars, was insufficient to produce a noticeable change in color.  However, his interest in astronomy had made him familiar with binary stars where the relative motion of the light source might be high enough to cause color shifts.  In fact, in the star catalogs there were examples of binary stars that had complementary red and blue colors.  Therefore, the title of his paper, published in the Proceedings of the Royal Bohemian Society of Sciences a few months after he read it to the society, was “On the Coloured Light of the Double Stars and Certain Other Stars of the Heavens: Attempt at a General Theory which Incorporates Bradley’s Theorem of Aberration as an Integral Part” [4].
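Doppler’s order-of-magnitude reasoning is easy to reproduce.  In the sketch below, the fractional frequency shift is simply v/c; the binary-star speed is an assumed illustrative value, not a figure from Doppler’s paper:

```python
# Fractional Doppler shift ~ v/c for the speeds Doppler considered.
c = 3.0e5          # speed of light, km/s
v_earth = 30.0     # Earth's orbital speed, km/s
v_binary = 100.0   # assumed orbital speed of a fast binary star, km/s

shift_earth = v_earth / c    # ~1e-4: far too small to change a perceived color
shift_binary = v_binary / c  # a few times larger, which raised Doppler's hopes
print(shift_earth, shift_binary)
```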

Title page of Doppler’s 1842 paper introducing the Doppler Effect.

Doppler’s analogy was correct, but like all analogies not founded on physical law, it differed in detail from the true nature of the phenomenon.  By 1842 the transverse character of light waves had been thoroughly proven through the work of Fresnel and Arago several decades earlier, yet Doppler held onto the old-fashioned notion that light was composed of longitudinal waves.  Bolzano, fully versed in the transverse nature of light, kindly published a commentary shortly afterwards [5] showing how the transverse effect for light, and a longitudinal effect for sound, were both supported by Doppler’s idea.  Yet Doppler also did not know that speeds in visual binaries were too small to produce noticeable color effects to the unaided eye.  Finally, (and perhaps the greatest flaw in his argument on the color of stars) a continuous spectrum that extends from the visible into the infrared and ultraviolet would not change color because all the frequencies would shift together preserving the flat (white) spectrum.

The simple algebraic derivation of the Doppler Effect in the 1842 publication.

Doppler’s twelve years in Prague were intense.  He was consumed by his Society responsibilities and by an extremely heavy teaching load that included personal exams of hundreds of students.  The only time he could be creative was during the night while his wife and children slept.  Overworked and running on too little rest, his health already frail with the onset of tuberculosis, Doppler collapsed, and he was unable to continue at the Polytechnic.  In 1847 he transferred to the School of Mines and Forestry in Schemnitz (modern Banská Štiavnica in Slovakia) with more pay and less work.  Yet the revolutions of 1848 swept across Europe, with student uprisings, barricades in the streets, and Hungarian liberation armies occupying the cities and universities, giving him no peace.  Providentially, his former mentor Stampfer retired from the Polytechnic in Vienna, and Doppler was called to fill the vacancy.

Although Doppler was named the Director of Austria’s first Institute of Physics and was elected to the National Academy, he ran afoul of one of the other Academy members, Joseph Petzval (1807 – 1891), who persecuted Doppler and his effect.  To read a detailed description of the attack by Petzval on Doppler’s effect and the effect it had on Doppler, see my feature article “The Fall and Rise of the Doppler Effect” in Physics Today, March issue (2020).

Christian Doppler

Voigt’s Transformation

It is difficult today to appreciate just how deeply engrained the reality of the luminiferous ether was in the psyche of the 19th century physicist.  The last of the classical physicists were reluctant even to adopt Maxwell’s electromagnetic theory for the explanation of optical phenomena, and as physicists inevitably were compelled to do so, some of their colleagues looked on with dismay and disappointment.  This was the situation for Woldemar Voigt (1850 – 1919) at the University of Göttingen, who was appointed as one of the first professors of physics there in 1883, to be succeeded in later years by Peter Debye and Max Born.  Voigt received his doctorate at the University of Königsberg under Franz Neumann, exploring the elastic properties of rock salt, and at Göttingen he spent a quarter century pursuing experimental and theoretical research into crystalline properties.  Voigt’s research, with students like Paul Drude, laid the foundation for the modern field of solid state physics.  His textbook Lehrbuch der Kristallphysik published in 1910 remained influential well into the 20th century because it adopted mathematical symmetry as a guiding principle of physics.  It was in the context of his studies of crystal elasticity that he introduced the word “tensor” into the language of physics.

At the January 1887 meeting of the Royal Society of Science at Göttingen, three months before Michelson and Morley began their reality-altering experiments at the Case Western Reserve University in Cleveland Ohio, Voigt submitted a paper deriving the longitudinal optical Doppler effect in an incompressible medium.  He was responding to results published in 1886 by Michelson and Morley on their measurements of the Fresnel drag coefficient, which was the precursor to their later results on the absolute motion of the Earth through the ether.

Fresnel drag is the effect of light propagating through a medium that is in motion.  The French physicist Francois Arago (1786 – 1853) in 1810 had attempted to observe the effects of corpuscles of light emitted from stars propagating with different speeds through the ether as the Earth spun on its axis and traveled around the sun.  He succeeded only in observing ordinary stellar aberration.  The absence of the effects of motion through the ether motivated Augustin-Jean Fresnel (1788 – 1827) to apply his newly-developed wave theory of light to explain the null results.  In 1818 Fresnel derived an expression for the dragging of light by a moving medium that explained the absence of effects in Arago’s observations.  For light propagating through a medium of refractive index n that is moving at a speed v, the resultant velocity of light is

u = c/n + v(1 − 1/n²)
where the last term in parenthesis is the Fresnel drag coefficient.  The Fresnel drag effect supported the idea of the ether by explaining why its effects could not be observed—a kind of Catch-22—but it also applied to light moving through a moving dielectric medium.  In 1851, Fizeau used an interferometer to measure the Fresnel drag coefficient for light moving through moving water, arriving at conclusions that directly confirmed the Fresnel drag effect.  The positive experiments of Fizeau, as well as the phenomenon of stellar aberration, would be extremely influential on the thoughts of Einstein as he developed his approach to special relativity in 1905.  They were also extremely influential to Michelson, Morley and Voigt.
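A quick check of the drag coefficient for Fizeau’s water experiment (the flow speed below is an assumed round number of roughly the right order, not Fizeau’s exact value):

```python
# Fresnel drag coefficient (1 - 1/n^2) and the resulting velocity difference
# between light traveling with and against the water flow.
n_water = 1.333          # refractive index of water
v_water = 7.0            # assumed flow speed, m/s
c = 2.998e8              # speed of light, m/s

drag = 1.0 - 1.0 / n_water**2                 # ~0.44: light is only partially dragged
u_with_flow = c / n_water + v_water * drag
u_against_flow = c / n_water - v_water * drag
print(drag, u_with_flow - u_against_flow)     # a few m/s, resolvable interferometrically
```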

 In his paper on the absence of the Fresnel drag effect in the first Michelson-Morley experiment, Voigt pointed out that an equation of the form

∂²φ/∂x² + ∂²φ/∂y² + ∂²φ/∂z² = (1/c²) ∂²φ/∂t²

is invariant under the transformation

x′ = x − vt
y′ = y√(1 − v²/c²)
z′ = z√(1 − v²/c²)
t′ = t − vx/c²

From our modern vantage point, we immediately recognize (to within a scale factor) the Lorentz transformation of relativity theory.  The first equation is common Galilean relativity, but the last equation was something new, introducing a position-dependent time as an observer moved with speed β relative to the speed of light [6].  Using these equations, Voigt was the first to derive the longitudinal (conventional) Doppler effect from relativistic effects.
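Voigt’s invariance claim can be spot-checked numerically: a plane wave written in the primed coordinates x′ = x − vt, t′ = t − vx/c² is an exact solution of the unprimed wave equation.  The finite-difference sketch below is my own check, in units where c = 1:

```python
# Verify that a right-moving plane wave in Voigt's primed coordinates
# satisfies the ordinary (unprimed) wave equation, via central differences.
import cmath

c, v, k = 1.0, 0.3, 2.0

def phi(x, t):
    xp = x - v * t            # Voigt's x'
    tp = t - v * x / c**2     # Voigt's position-dependent t'
    return cmath.exp(1j * k * (xp - c * tp))  # plane wave in primed frame

h = 1e-4                      # finite-difference step
x0, t0 = 0.7, 1.3             # an arbitrary sample point
d2x = (phi(x0 + h, t0) - 2 * phi(x0, t0) + phi(x0 - h, t0)) / h**2
d2t = (phi(x0, t0 + h) - 2 * phi(x0, t0) + phi(x0, t0 - h)) / h**2

residual = d2x - d2t / c**2   # wave operator applied to phi; should vanish
print(abs(residual), abs(d2x))
```

The residual is at round-off level even though each second derivative is of order one, so the cancellation is genuine and not trivial.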

Voigt’s derivation of the longitudinal Doppler effect used a classical approach that is still used today in Modern Physics textbooks to derive the Doppler effect.  The argument proceeds by considering a moving source that emits a continuous wave in the direction of motion.  Because the wave propagates at a finite speed, the moving source chases the leading edge of the wave front, catching up by a small amount by the time a single cycle of the wave has been emitted.  The resulting compressed oscillation represents a blue shift of the emitted light.  By using his transformations, Voigt arrived at the first relativistic expression for the shift in light frequency.  At low speeds, Voigt’s derivation reverted to Doppler’s original expression.
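The low-speed agreement can be seen directly by comparing the classical source-chasing-the-wave factor with the standard relativistic one (a sketch using the textbook forms of both expressions):

```python
# Classical vs relativistic longitudinal Doppler factors for an approaching
# source: they agree to first order in beta = v/c and diverge at high speed.
import math

def doppler_classical(beta):
    """Moving source compresses the wave: f_obs/f_src = 1 / (1 - beta)."""
    return 1.0 / (1.0 - beta)

def doppler_relativistic(beta):
    """Adds time dilation: f_obs/f_src = sqrt(1 - beta^2) / (1 - beta)."""
    return math.sqrt(1.0 - beta**2) / (1.0 - beta)

for beta in (0.001, 0.1, 0.5):
    print(beta, doppler_classical(beta), doppler_relativistic(beta))
```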

A few months after Voigt delivered his paper, Michelson and Morley announced the results of their interferometric measurements of the motion of the Earth through the ether—with their null results.  In retrospect, the Michelson-Morley experiment is viewed as one of the monumental assaults on the old classical physics, helping to launch the relativity revolution.  However, in its own day, it was little more than just another null result on the ether.  It did incite FitzGerald and Lorentz to suggest that the length of the arms of the interferometer contracted in the direction of motion, with the eventual emergence of the full Lorentz transformations by 1904—seventeen years after the Michelson results.

In 1904 Einstein, working in relative isolation at the Swiss patent office, was surprisingly unaware of the latest advances in the physics of the ether.  He did not know about Voigt’s derivation of the relativistic Doppler effect (1887), just as he had not heard of Lorentz’s final version of the relativistic coordinate transformations (1904).  His thinking about relativistic effects focused much farther into the past, to Bradley’s stellar aberration (1725) and Fizeau’s experiment of light propagating through moving water (1851).  Einstein proceeded on simple principles, unencumbered by the mental baggage of the day, and delivered his beautifully minimalist theory of special relativity in his famous paper of 1905 “On the Electrodynamics of Moving Bodies”, independently deriving the Lorentz coordinate transformations [7].

One of Einstein’s talents in theoretical physics was to predict new phenomena as a way to provide direct confirmation of a new theory.  This was how he later famously predicted the deflection of light by the Sun and the gravitational frequency shift of light.  In 1905 he used his new theory of special relativity to predict observable consequences that included a general treatment of the relativistic Doppler effect.  This included the effects of time dilation in addition to the longitudinal effect of the source chasing the wave.  Time dilation produced a correction to Doppler’s original expression for the longitudinal effect that became significant at speeds approaching the speed of light.  More significantly, it predicted a transverse Doppler effect for a source moving along a line perpendicular to the line of sight to an observer.  This effect had not been predicted either by Doppler or by Voigt.  The equation for the general Doppler effect for any observation angle is

f_obs = f_src √(1 − β²) / (1 − β cos θ)

where β = v/c and θ is the angle between the source velocity and the line of sight in the observer’s frame.
Just as Doppler had been motivated by Bradley’s aberration of starlight when he conceived of his original principle for the longitudinal Doppler effect, Einstein combined the general Doppler effect with his results for the relativistic addition of velocities (also in his 1905 Annalen paper) as the conclusive treatment of stellar aberration nearly 200 years after Bradley first observed the effect.

Despite the generally positive reception of Einstein’s theory of special relativity, some of its consequences were anathema to many physicists at the time.  A key stumbling block was the question whether relativistic effects, like moving clocks running slowly, were only apparent, or were actually real, and Einstein had to fight to convince others of its reality.  When Johannes Stark (1874 – 1957) observed Doppler line shifts in ion beams called “canal rays” in 1906 (Stark received the 1919 Nobel prize in part for this discovery) [8], Einstein promptly published a paper suggesting how the canal rays could be used in a transverse geometry to directly detect time dilation through the transverse Doppler effect [9].  Thirty years passed before the experiment was performed with sufficient accuracy by Herbert Ives and G. R. Stilwell in 1938 to measure the transverse Doppler effect [10].  Ironically, even at this late date, Ives and Stilwell were convinced that their experiment had disproved Einstein’s time dilation by supporting Lorentz’ contraction theory of the electron.  The Ives-Stilwell experiment was the first direct test of time dilation, followed in 1940 by muon lifetime measurements [11].
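The size of the effect Ives and Stilwell were chasing is worth a number.  In the sketch below, the ion speed is an assumed illustrative value, not their actual beam speed:

```python
# Transverse Doppler (pure time-dilation) shift: f_obs = f_src * sqrt(1 - beta^2),
# a second-order effect in beta, hence so hard to resolve.
import math

beta = 0.005                                   # assumed ion speed, v/c ~ 0.5%
transverse_factor = math.sqrt(1.0 - beta**2)   # time-dilation factor
fractional_shift = 1.0 - transverse_factor     # ~beta^2 / 2 ~ 1.25e-5
print(fractional_shift)
```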

Further Reading

D. D. Nolte, “The Fall and Rise of the Doppler Effect,” Phys. Today 73(3), 30, March 2020.


[1] pg. 15, Eden, A. (1992). The search for Christian Doppler. Wien, Springer-Verlag.

[2] pg. 30, Eden

[3] Bradley, J (1729). “Account of a new discovered Motion of the Fix’d Stars”. Phil Trans. 35: 637–660.

[4] C. A. Doppler, “Über das farbige Licht der Doppelsterne und einiger anderer Gestirne des Himmels (About the coloured light of the binary stars and some other stars of the heavens),” Proceedings of the Royal Bohemian Society of Sciences, vol. V, no. 2, pp. 465–482, 1842 (reissued 1903).

[5] B. Bolzano, “Ein Paar Bemerkungen über die neue Theorie in Herrn Professor Ch. Doppler’s Schrift “Über das farbige Licht der Doppelsterne und einiger anderer Gestirne des Himmels”,” Pogg. Ann. der Physik und Chemie, vol. 60, p. 83, 1843; B. Bolzano, “Christian Doppler’s neueste Leistungen auf dem Gebiete der physikalischen Apparatenlehre, Akustik, Optik und optischen Astronomie,” Pogg. Ann. der Physik und Chemie, vol. 72, pp. 530-555, 1847.

[6] W. Voigt, “Über das Doppler’sche Princip,” Göttinger Nachrichten, vol. 7, pp. 41–51, 1887. The common use of c to express the speed of light came later from Voigt’s student Paul Drude.

[7] A. Einstein, “On the electrodynamics of moving bodies,” Annalen Der Physik, vol. 17, pp. 891-921, 1905.

[8] J. Stark, W. Hermann, and S. Kinoshita, “The Doppler effect in the spectrum of mercury,” Annalen Der Physik, vol. 21, pp. 462-469, Nov 1906.

[9] A. Einstein, “Über die Möglichkeit einer neuen Prüfung des Relativitätsprinzips,” Annalen der Physik, vol. 328, pp. 197–198, 1907.

[10] H. E. Ives and G. R. Stilwell, “An experimental study of the rate of a moving atomic clock,” Journal of the Optical Society of America, vol. 28, p. 215, 1938.

[11] B. Rossi and D. B. Hall, “Variation of the Rate of Decay of Mesotrons with Momentum,” Physical Review, vol. 59, pp. 223–228, 1941.

Bohr’s Orbits

The first time I ran across the Bohr-Sommerfeld quantization conditions I admit that I laughed! I was a TA for the Modern Physics course as a graduate student at Berkeley in 1982 and I read about Bohr-Sommerfeld in our Tipler textbook. I was familiar with Bohr orbits, which are already the wrong way of thinking about quantized systems. So the Bohr-Sommerfeld conditions, especially for so-called “elliptical” orbits, seemed like nonsense.

But it’s funny how a little distance gives you perspective. Forty years later I know a little more physics than I did then, and I have gained a deep respect for an obscure property of dynamical systems known as “adiabatic invariants”. It turns out that adiabatic invariants lie at the core of quantum systems, and in the case of hydrogen adiabatic invariants can be visualized as … elliptical orbits!

Quantum Physics in Copenhagen

Niels Bohr (1885 – 1962) was born in Copenhagen, Denmark, the middle child of a physiology professor at the University in Copenhagen.  Bohr grew up with his siblings as a faculty child, which meant an unconventional upbringing full of ideas, books and deep discussions.  Bohr was a late bloomer in secondary school but began to show talent in math and physics in his last two years.  When he entered the University in Copenhagen in 1903 to major in physics, the university had only one physics professor, Christian Christiansen, and had no physics laboratories.  So Bohr tinkered in his father’s physiology laboratory, performing a detailed experimental study of the hydrodynamics of water jets, writing and submitting a paper that was to be his only experimental work.  Bohr went on to receive a Master’s degree in 1909 and his PhD in 1911, writing his thesis on the theory of electrons in metals.  Although the thesis did not break much new ground, it uncovered striking disparities between observed properties and theoretical predictions based on the classical theory of the electron.  For his postdoc studies he applied for and was accepted to a position working with the discoverer of the electron, Sir J. J. Thomson, in Cambridge.  Perhaps fortunately for the future history of physics, he did not get along well with Thomson, and he shifted his postdoc position in early 1912 to work with Ernest Rutherford at the much less prestigious University of Manchester.

Niels Bohr (Wikipedia)

Ernest Rutherford had just completed a series of detailed experiments on the scattering of alpha particles on gold film and had demonstrated that the mass of the atom was concentrated in a very small volume that Rutherford called the nucleus, which also carried the positive charge compensating the negative electron charges.  The discovery of the nucleus created a radical new model of the atom in which electrons executed planetary-like orbits around the nucleus.  Bohr immediately went to work on a theory for the new model of the atom.  He worked closely with Rutherford and the other members of Rutherford’s laboratory, involved in daily discussions on the nature of atomic structure.  The open intellectual atmosphere of Rutherford’s group and the ready flow of ideas in group discussions became the model for Bohr, who would some years later set up his own research center that would attract the top young physicists of the time.  Already by mid 1912, Bohr was beginning to see a path forward, hinting in letters to his younger brother Harald (who would become a famous mathematician) that he had uncovered a new approach that might explain some of the observed properties of simple atoms. 

By the end of 1912 his postdoc travel stipend was over, and he returned to Copenhagen, where he completed his work on the hydrogen atom.  One of the key discrepancies in the classical theory of the electron in atoms was the requirement, by Maxwell’s Laws, for orbiting electrons to continually radiate because of their centripetal acceleration.  Furthermore, from energy conservation, if they radiated continuously, the electron orbits must also eventually decay into the nuclear core with ever-decreasing orbital periods and hence ever higher emitted light frequencies.  Experimentally, on the other hand, it was known that light emitted from atoms had only distinct quantized frequencies.  To circumvent the problem of classical radiation, Bohr simply assumed what was observed, formulating the idea of stationary quantum states.  Light emission (or absorption) could take place only when the energy of an electron changed discontinuously as it jumped from one stationary state to another, and there was a lowest stationary state below which the electron could never fall.  He then took a critical and important step, combining this new idea of stationary states with Planck’s constant h.  He was able to show that the emission spectrum of hydrogen, and hence the energies of the stationary states, could be derived if the angular momentum of the electron in a hydrogen atom was quantized in integer multiples of the reduced Planck constant h/2π. 
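In modern notation Bohr’s condition reads L = nh/2π, which fixes the stationary-state energies at En = −13.6 eV/n² and the emitted wavelengths at hc/ΔE.  A minimal numerical sketch (the constants and the Balmer-series example below are standard modern values, not numbers taken from Bohr’s paper):

```python
# Bohr model of hydrogen: stationary-state energies and emission wavelengths.
RYDBERG_EV = 13.605693   # hydrogen ground-state binding energy, in eV
HC_EV_NM = 1239.841984   # h*c in eV*nm, converts photon energy to wavelength

def energy_level(n: int) -> float:
    """Energy of the n-th stationary state in eV (negative = bound)."""
    return -RYDBERG_EV / n**2

def emission_wavelength_nm(n_upper: int, n_lower: int) -> float:
    """Wavelength of the photon emitted in the jump n_upper -> n_lower."""
    photon_energy = energy_level(n_upper) - energy_level(n_lower)
    return HC_EV_NM / photon_energy

# The Balmer series (jumps down to n = 2) lands in the visible range:
for n in range(3, 7):
    print(f"{n} -> 2: {emission_wavelength_nm(n, 2):.1f} nm")
```

The 3 → 2 transition comes out near 656 nm, the red H-alpha line of the Balmer series, which is the agreement with observed spectra that made Bohr’s model so striking.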

Bohr published his quantum theory of the hydrogen atom in 1913, which immediately focused the attention of a growing group of physicists (including Einstein, Rutherford, Hilbert, Born, and Sommerfeld) on the new possibilities opened up by Bohr’s quantum theory [1].  Emboldened by his growing reputation, Bohr petitioned the university in Copenhagen to create a new faculty position in theoretical physics, and to appoint him to it.  The University was not unreceptive, but university bureaucracies make decisions slowly, so Bohr returned to Rutherford’s group in Manchester while he awaited Copenhagen’s decision.  He waited over two years, but he enjoyed his time in the stimulating environment of Rutherford’s group in Manchester, growing steadily into the role as master of the new quantum theory.  In June of 1916, Bohr returned to Copenhagen and a year later was elected to the Royal Danish Academy of Sciences. 

Although Bohr’s theory had succeeded in describing some of the properties of the electron in atoms, two central features of his theory continued to cause difficulty.  The first was the limitation of the theory to single electrons in circular orbits, and the second was the cause of the discontinuous jumps.  In response to this challenge, Arnold Sommerfeld provided a deeper mechanical perspective on the origins of the discrete energy levels of the atom. 

Quantum Physics in Munich

Arnold Johannes Wilhelm Sommerfeld (1868—1951) was born in Königsberg, Prussia, and spent all the years of his education there, through to the doctorate he received in 1891.  In Königsberg he was acquainted with Minkowski, Wien and Hilbert, and he was the doctoral student of Lindemann.  He also was associated with a social group at the University that spent too much time drinking and dueling, a distraction that led to his receiving a deep sabre cut on his forehead that became one of his distinguishing features along with his finely waxed moustache.  In outward appearance, he looked the part of a Prussian hussar, but he finally escaped this life of dissipation and landed in Göttingen where he became Felix Klein’s assistant in 1894.  He taught at local secondary schools, rising in reputation, until he secured a faculty position in theoretical physics at the University in Munich in 1906.  One of his first students was Peter Debye who received his doctorate under Sommerfeld in 1908.  Later famous students would include Paul Ewald (doctorate in 1912), Wolfgang Pauli (doctorate in 1921), Werner Heisenberg (doctorate in 1923), and Hans Bethe (doctorate in 1928).  These students had the rare treat, during their time studying under Sommerfeld, of spending weekends in the winter skiing and staying at a ski hut that he owned only two hours by train outside of Munich.  At the end of the day skiing, discussion would turn invariably to theoretical physics and the leading problems of the day.  It was in his early days at Munich that Sommerfeld played a key role aiding the general acceptance of Minkowski’s theory of four-dimensional space-time by publishing a review article in Annalen der Physik that translated Minkowski’s ideas into language that was more familiar to physicists.

Arnold Sommerfeld (Wikipedia)

Around 1911, Sommerfeld shifted his research to the new quantum theory, and his interest only intensified after the publication of Bohr’s model of hydrogen in 1913.  In 1915 Sommerfeld significantly extended the Bohr model by building on an idea put forward by Planck.  While further justifying the black body spectrum, Planck turned to descriptions of the trajectory of a quantized one-dimensional harmonic oscillator in phase space.  Planck had noted that the phase-space areas enclosed by the quantized trajectories were integral multiples of his constant.  Sommerfeld expanded on this idea, showing that it was not the area enclosed by the trajectories that was fundamental, but the integral of the momentum over the spatial coordinate [2].  This integral is none other than the original action integral of Maupertuis and Euler, used so famously in their Principle of Least Action almost 200 years earlier.  Where Planck, in his original paper of 1901, had recognized the units of his constant to be those of action, and hence called it the quantum of action, Sommerfeld made the explicit connection to the dynamical trajectories of the oscillators.  He then showed that the same action principle applied to Bohr’s circular orbits for the electron in the hydrogen atom, and that the orbits need not even be circular, but could be elliptical Keplerian orbits. 

The quantum condition for this otherwise classical trajectory was the requirement for the action integral over the motion to be equal to integer units of the quantum of action.  Furthermore, Sommerfeld showed that there must be as many action integrals as degrees of freedom for the dynamical system.  In the case of Keplerian orbits, there are radial coordinates as well as angular coordinates, and each action integral was quantized for the discrete electron orbits.  Although Sommerfeld’s action integrals extended Bohr’s theory of quantized electron orbits, the new quantum conditions also created a problem because there were now many possible elliptical orbits that all had the same energy.  How was one to find the “correct” orbit for a given orbital energy?
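In modern notation, Sommerfeld’s two quantization conditions for the Kepler problem, together with the energy they imply (R here is the Rydberg energy), read:

```latex
\oint p_r \, dr = n_r h, \qquad \oint p_\varphi \, d\varphi = n_\varphi h,
\qquad E_n = -\frac{R}{(n_r + n_\varphi)^2}
```

The energy depends only on the sum n = nr + nφ, so different splittings of n between the radial and angular integers describe ellipses of different eccentricity with exactly the same energy; that degeneracy is the ambiguity just described.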

Quantum Physics in Leiden

In 1906, the Austrian Physicist Paul Ehrenfest (1880 – 1933), freshly out of his PhD under the supervision of Boltzmann, arrived at Göttingen only weeks before Boltzmann took his own life.  Felix Klein at Göttingen had been relying on Boltzmann to provide a comprehensive review of statistical mechanics for the Mathematical Encyclopedia, so he now entrusted this project to the young Ehrenfest.  It was a monumental task, which was to take him and his physicist wife Tatyana nearly five years to complete.  Part of the delay was the desire by Ehrenfest to close some open problems that remained in Boltzmann’s work.  One of these was a mechanical theorem of Boltzmann’s that identified properties of statistical mechanical systems that remained unaltered through a very slow change in system parameters.  These properties would later be called adiabatic invariants by Einstein.  Ehrenfest recognized that Wien’s displacement law, which had been a guiding light for Planck and his theory of black body radiation, had originally been derived by Wien using classical principles related to slow changes in the volume of a cavity.  Ehrenfest was struck by the fact that such slow changes would not induce changes in the quantum numbers of the quantized states, and hence that the quantum numbers must be adiabatic invariants of the black body system.  This not only explained why Wien’s displacement law continued to hold under quantum as well as classical considerations, but it also explained why Planck’s quantization of the energy of his simple oscillators was the only possible choice.  For a classical harmonic oscillator, the ratio of the energy of oscillation to the frequency of oscillation is an adiabatic invariant, which is immediately recognized as Planck’s quantum condition, E/ν = nh.  
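Ehrenfest’s observation can be checked numerically: integrate a harmonic oscillator whose frequency drifts slowly, and the energy changes while the ratio E/ω stays nearly constant.  A sketch (the ramp rate and the velocity-Verlet integrator are choices of this illustration, not anything from Ehrenfest):

```python
# Check the adiabatic invariant numerically: for a harmonic oscillator with a
# slowly drifting frequency, the energy changes but E/omega (the classical
# action, up to a factor of 2*pi) stays nearly constant.

T_TOTAL = 500.0   # ramp duration, many oscillation periods (adiabatic limit)
DT = 0.002        # integration time step

def omega(t: float) -> float:
    """Oscillator frequency, slowly ramped from 1.0 up to 1.5."""
    return 1.0 + 0.5 * min(t, T_TOTAL) / T_TOTAL

def invariant_drift() -> float:
    """Integrate x'' = -omega(t)^2 x; return the relative change in E/omega."""
    x, v = 1.0, 0.0

    def e_over_w(t: float) -> float:
        energy = 0.5 * v * v + 0.5 * omega(t) ** 2 * x * x
        return energy / omega(t)

    initial = e_over_w(0.0)
    t = 0.0
    while t < T_TOTAL:
        # velocity-Verlet step (good long-time energy behavior)
        a = -omega(t) ** 2 * x
        x += v * DT + 0.5 * a * DT * DT
        a_new = -omega(t + DT) ** 2 * x
        v += 0.5 * (a + a_new) * DT
        t += DT
    return abs(e_over_w(t) - initial) / initial

print(f"relative drift in E/omega: {invariant_drift():.1e}")
```

The frequency, and with it the energy, grows by 50 percent over the run, yet the drift in E/ω stays at the sub-percent level, which is exactly the insensitivity to slow parameter changes that Ehrenfest identified with the quantum numbers.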

Paul Ehrenfest (Wikipedia)

Ehrenfest published his observations in 1913 [3], the same year that Bohr published his theory of the hydrogen atom, so Ehrenfest immediately applied the theory of adiabatic invariants to Bohr’s model and discovered that the quantum conditions for the quantized energy levels were again the adiabatic invariants of the electron orbits, and not merely a consequence of integer multiples of angular momentum, which had seemed somewhat ad hoc.  Later, when Sommerfeld published his quantized elliptical orbits in 1916, the multiplicity of quantum conditions and orbits had caused concern, but Ehrenfest came to the rescue with his theory of adiabatic invariants, showing that each of Sommerfeld’s quantum conditions was precisely an adiabatic invariant of the classical electron dynamics [4]. The remaining question was which coordinates were the correct ones, because different choices led to different answers.  This was quickly solved by Johannes Burgers (one of Ehrenfest’s students) who showed that action integrals were adiabatic invariants, and then by Karl Schwarzschild and Paul Epstein who showed that action-angle coordinates were the only allowed choice of coordinates, because they enabled the separation of the Hamilton-Jacobi equations and hence provided the correct quantization conditions for the electron orbits.  Schwarzschild’s paper was published the same day that he died on the Eastern Front.  The work by Schwarzschild and Epstein was the first to show the power of the Hamiltonian formulation of dynamics for quantum systems, which foreshadowed the future importance of Hamiltonians for quantum theory.

Karl Schwarzschild (Wikipedia)


Emboldened by Ehrenfest’s adiabatic principle, which demonstrated a close connection between classical dynamics and quantization conditions, Bohr formalized a technique that he had used implicitly in his 1913 model of hydrogen, and now elevated it to the status of a fundamental principle of quantum theory.  He called it the Correspondence Principle, and published the details in 1920.  The Correspondence Principle states that as the quantum number of an electron orbit increases to large values, the quantum behavior converges to classical behavior.  Specifically, if an electron in a state of high quantum number emits a photon while jumping to a neighboring orbit, then the wavelength of the emitted photon approaches the classical radiation wavelength of the electron subject to Maxwell’s equations. 

Bohr’s Correspondence Principle cemented the bridge between classical physics and quantum physics.  One of the most persistent earlier questions about the physics of electron orbits in atoms was why they did not radiate continuously because of the centripetal acceleration they experienced in their orbits.  Bohr had now reconnected to Maxwell’s equations and classical physics in the limit.  Like the theory of adiabatic invariants, the Correspondence Principle became a new tool for distinguishing among different quantum theories.  It could be used as a filter to distinguish “correct” quantum models that transitioned smoothly from quantum to classical behavior from those that did not.  Bohr’s Correspondence Principle was to be a powerful tool in the hands of Werner Heisenberg as he reinvented quantum theory only a few years later.

Quantization Conditions

By the end of 1920, all the elements of the quantum theory of electron orbits were apparently falling into place.  Bohr’s originally ad hoc quantization condition was now on firm footing.  The quantization conditions were related to action integrals that were, in turn, adiabatic invariants of the classical dynamics.  This meant that slight variations in the parameters of the dynamical systems would not induce quantum transitions among the various quantum states.  This conclusion would have felt right to the early quantum practitioners.  Bohr’s quantum model of electron orbits was fundamentally a means of explaining quantum transitions between stationary states.  Now it appeared that the condition for the stationary states of the electron orbits was an insensitivity, or invariance, to variations in the dynamical properties.  This was analogous to the principle of stationary action where the action along a dynamical trajectory is invariant to slight variations in the trajectory.  Therefore, the theory of quantum orbits now rested on firm foundations that seemed as solid as the foundations of classical mechanics.

From the perspective of modern quantum theory, the concept of elliptical Keplerian orbits for the electron is grossly inaccurate.  Most physicists shudder when they see the symbol for atomic energy—the classic but mistaken icon of electron orbits around a nucleus.  Nonetheless, Bohr and Ehrenfest and Sommerfeld had hit on a deep thread that runs through all of physics—the concept of action—the same concept that Leibniz introduced, that Maupertuis minimized and that Euler canonized.  This concept of action is at work in the macroscopic domain of classical dynamics as well as the microscopic world of quantum phenomena.  Planck was acutely aware of this connection with action, which is why he so readily recognized his elementary constant as the quantum of action. 

However, the old quantum theory was running out of steam.  For instance, the action integrals and adiabatic invariants only worked for single electron orbits, leaving the vast bulk of many-electron atomic matter beyond the reach of quantum theory and prediction.  The literal electron orbits were a crutch or bias that prevented physicists from moving past them and seeing new possibilities for quantum theory.  Orbits were an anachronism, exerting a damping force on progress.  This limitation became painfully clear when Bohr and his assistants at Copenhagen–Kramers and Slater–attempted to use their electron orbits to explain the refractive index of gases.  The theory was cumbersome, and its possibilities were exhausted.  It was time for a new quantum revolution by a new generation of quantum wizards–Heisenberg, Born, Schrödinger, Pauli, Jordan and Dirac.


[1] N. Bohr, “On the Constitution of Atoms and Molecules, Part II Systems Containing Only a Single Nucleus,” Philosophical Magazine, vol. 26, pp. 476–502, 1913.

[2] A. Sommerfeld, “The quantum theory of spectral lines,” Annalen Der Physik, vol. 51, pp. 1-94, Sep 1916.

[3] P. Ehrenfest, “Een mechanisch theorema van Boltzmann en zijne betrekking tot de quanta theorie (A mechanical theorem of Boltzmann and its relation to the theory of energy quanta),” Verslag van de Gewone Vergaderingen der Wis- en Natuurkundige Afdeeling, vol. 22, pp. 586-593, 1913.

[4] P. Ehrenfest, “Adiabatic invariants and quantum theory,” Annalen Der Physik, vol. 51, pp. 327-352, Oct 1916.

Who Invented the Quantum? Einstein vs. Planck

Albert Einstein defies condensation—it is impossible to condense his approach, his insight, his motivation—into a single word like “genius”.  He was complex, multifaceted, contradictory, revolutionary as well as conservative.  Some of his work was so simple that it is hard to understand why no-one else did it first, even when they were right in the middle of it.  Lorentz and Poincaré spring to mind—they had been circling the ideas of spacetime for decades—but never stepped back to see what the simplest explanation could be.  Einstein did, and his special relativity was simple and beautiful, and the math is just high-school algebra.  On the other hand, parts of his work—like gravitation—are so embroiled in mathematics and the religion of general covariance that it remains opaque to physics neophytes 100 years later and is usually reserved for graduate study. 

            Yet there is a third thread in Einstein’s work that relies on pure intuition—neither simple nor complicated—but almost impossible to grasp how he made his leap.  This is the case when he proposed the real existence of the photon—the quantum particle of light.  For ten years after this proposal, it was considered by almost everyone to be his greatest blunder. It even came up when Planck was nominating Einstein for membership in the German Academy of Science. Planck said

That he may sometimes have missed the target of his speculations, as for example, in his hypothesis of light quanta, cannot really be held against him.

In this single statement, we have the father of the quantum being criticized by the father of the quantum discontinuity.

Max Planck’s Discontinuity

In histories of the development of quantum theory, the German physicist Max Planck (1858—1947) is characterized as an unlikely revolutionary.  He was an establishment man, in the stolid German tradition, who was already embedded in his career, in his forties, holding a coveted faculty position at the University of Berlin.  In his research, he was responding to a theoretical challenge issued by Kirchhoff decades earlier, in 1860, to find the function of temperature and wavelength that described and explained the observed spectrum of radiating bodies.  Planck was not looking for a revolution.  In fact, he was looking for the opposite.  One of his motivations in studying the thermodynamics of electromagnetic radiation was to rebut the statistical theories of Boltzmann.  Planck had never been convinced by the atomistic and discrete approach Boltzmann had used to explain entropy and the second law of thermodynamics.  With the continuum of light radiation he thought he had the perfect system that would show how entropy behaved in a continuous manner, without the need for discrete quantities. 

Therefore, Planck’s original intentions were to use blackbody radiation to argue against Boltzmann—to set back the clock.  For this reason, not only was Planck an unlikely revolutionary, he was a counter-revolutionary.  But Planck was a revolutionary because that is what he did, whatever his original intentions were, and he accepted his role as a revolutionary when he had the courage to stand in front of his scientific peers and propose a quantum hypothesis that lay at the heart of physics.

            Blackbody radiation, at the end of the nineteenth century, was a topic of keen interest and had been measured with high precision.  This was in part because it was such a “clean” system, having fundamental thermodynamic properties independent of any of the material properties of the black body, unlike the so-called ideal gases, which always showed some dependence on the molecular properties of the gas. The high-precision measurements of blackbody radiation were made possible by new developments in spectrometers at the end of the century, as well as infrared detectors that allowed very precise and repeatable measurements to be made of the spectrum across broad ranges of wavelengths. 

In 1893 the German physicist Wilhelm Wien (1864—1928) had used adiabatic expansion arguments to derive what became known as Wien’s Displacement Law, which showed a simple inverse relationship between the temperature of the blackbody and the peak wavelength (their product is a universal constant).  Later, in 1896, he showed that the high-frequency behavior could be described by an exponential function of temperature and wavelength that required no other properties of the blackbody.  This was approaching the solution of Kirchhoff’s challenge of 1860 seeking a universal function.  However, at lower frequencies Wien’s approximation failed to match the measured spectrum.  In mid-year 1900, Planck was able to define a single functional expression that described the experimentally observed spectrum.  Planck had succeeded in describing black-body radiation, but he had not satisfied Kirchhoff’s second condition—to explain it. 

Therefore, to explain the blackbody spectrum, Planck modeled the emitting body as a set of ideal oscillators.  As an expert in the Second Law, Planck derived the functional form for the radiation spectrum, from which he found the entropy of the oscillators that produced the spectrum.  However, once he had the form for the entropy, he needed to explain why it took that specific form.  In this sense, he was working backwards from a known solution rather than forwards from first principles.  Planck was at an impasse.  He struggled but failed to find any continuum theory that could work. 

Then Planck turned to Boltzmann’s statistical theory of entropy, the same theory that he had previously avoided and had hoped to discredit.  He described this as “an act of despair … I was ready to sacrifice any of my previous convictions about physics.”  In Boltzmann’s expression for entropy, it was necessary to “count” possible configurations of states.  But counting can only be done if the states are discrete.  Therefore, he lumped the energies of the oscillators into discrete ranges, or bins, that he called “quanta”.  The size of the bins was proportional to the frequency of the oscillator, and the proportionality constant had the units of Maupertuis’ quantity of action, so Planck called it the “quantum of action”. Finally, based on this quantum hypothesis, Planck derived the functional form of black-body radiation.

Planck presented his findings at a meeting of the German Physical Society in Berlin on November 15, 1900, introducing the word quantum (plural quanta) into physics from the Latin word that means quantity [1].  It was a casual meeting, and while the attendees knew they were seeing an intriguing new physical theory, there was no sense of a revolution.  But Planck himself was aware that he had created something fundamentally new.  The radiation law of cavities depended on only two physical properties—the temperature and the wavelength—and on two constants—Boltzmann’s constant kB and a new constant that later became known as Planck’s constant h = ΔE/f = 6.6 × 10⁻³⁴ J·s.  By combining these two constants with other fundamental constants, such as the speed of light, Planck was able to establish accurate values for long-sought constants of nature, like Avogadro’s number and the charge of the electron.
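Planck’s law with those two constants also contains Wien’s displacement law.  A short numerical sketch (using today’s SI values for h, kB and c; the brute-force grid scan is an illustrative choice, not how anyone would do this in production) finds the spectral peak directly:

```python
import math

# Fundamental constants in SI units (2019 exact values)
H = 6.62607015e-34    # Planck's constant, J*s
KB = 1.380649e-23     # Boltzmann's constant, J/K
C = 2.99792458e8      # speed of light, m/s

def planck_spectral_radiance(wavelength_m: float, temp_k: float) -> float:
    """Planck's law: spectral radiance per unit wavelength."""
    x = H * C / (wavelength_m * KB * temp_k)
    return (2.0 * H * C ** 2 / wavelength_m ** 5) / math.expm1(x)

def peak_wavelength_m(temp_k: float) -> float:
    """Locate the spectral peak by scanning a 1 nm grid from 100 nm to 20 um."""
    grid = (1e-9 * i for i in range(100, 20000))
    return max(grid, key=lambda lam: planck_spectral_radiance(lam, temp_k))

# Wien's displacement law emerges: lambda_max * T ~ 2.898e-3 m*K
print(peak_wavelength_m(5800.0) * 5800.0)
```

For a 5800 K body (roughly the surface of the Sun) the peak lands near 500 nm, and the product λmax·T reproduces Wien’s displacement constant of about 2.9 × 10⁻³ m·K.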

            Although Planck’s quantum hypothesis in 1900 explained the blackbody radiation spectrum, his specific hypothesis was that it was the interaction of the atoms and the light field that was somehow quantized.  He certainly was not thinking in terms of individual quanta of the light field.

Figure. Einstein and Planck at a dinner held by Max von Laue in Berlin on Nov. 11, 1931.

Einstein’s Quantum

When Einstein analyzed the properties of the blackbody radiation in 1905, using his deep insight into statistical mechanics, he was led to the inescapable conclusion that light itself must be quantized in amounts E = hf, where h is Planck’s constant and f is the frequency of the light field.  Although this equation is exactly the same as Planck’s from 1900, the meaning was completely different.  For Planck, this was the discreteness of the interaction of light with matter.  For Einstein, this was the quantum of light energy—whole and indivisible—just as if the light quantum were a particle with particle properties.  For this reason, we can answer the question posed in the title of this Blog—Einstein takes the honor of being the inventor of the quantum.

Einstein’s clarity of vision is a marvel to behold even to this day.  His special talent was to take simple principles, ones that are almost trivial and beyond reproach, and to derive something profound.  In Special Relativity, he simply assumed the constancy of the speed of light and derived Lorentz’s transformations that had originally been based on obtuse electromagnetic arguments about the electron.  In General Relativity, he assumed that free fall represented an inertial frame, and he concluded that gravity must bend light.  In quantum theory, he assumed that the low-density limit of Planck’s theory had to be consistent with light in thermal equilibrium with the black body container, and he concluded that light itself must be quantized into packets of indivisible energy quanta [2].  One immediate consequence of this conclusion was his simple explanation of the photoelectric effect for which the energy of an electron ejected from a metal by ultraviolet irradiation is a linear function of the frequency of the radiation.  Einstein published his theory of the quanta of light [3] as one of his four famous 1905 articles in Annalen der Physik in his Annus Mirabilis. 

Figure. In the photoelectric effect a photon is absorbed by an electron state in a metal promoting the electron to a free electron that moves with a maximum kinetic energy given by the difference between the photon energy and the work function W of the metal. The energy of the photon is absorbed as a whole quantum, proving that light is composed of quantized corpuscles that are today called photons.
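The linear relation in the caption, Emax = hf − W, is two lines of arithmetic.  A minimal sketch (the 254 nm mercury line and the 4.3 eV work function below are illustrative choices of this example, not values from Einstein’s paper):

```python
# Einstein's photoelectric law: E_max = h*f - W
H_EV = 4.135667696e-15  # Planck's constant in eV*s
C = 2.99792458e8        # speed of light, m/s

def max_kinetic_energy_ev(wavelength_m: float, work_function_ev: float) -> float:
    """Maximum kinetic energy of an ejected electron, in eV (<= 0: no emission)."""
    return H_EV * C / wavelength_m - work_function_ev

# Illustrative numbers: 254 nm ultraviolet light on a metal with W = 4.3 eV
print(max_kinetic_energy_ev(254e-9, 4.3))
```

A 254 nm photon carries about 4.88 eV, so against a 4.3 eV work function the electron leaves with roughly 0.58 eV, while any wavelength long enough to make the result negative ejects no electrons at all: the whole-quantum absorption that Millikan’s measurements later confirmed.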

            Einstein’s theory of light quanta was controversial and was slow to be accepted.  It is ironic that in 1914 when Einstein was being considered for a position at the University in Berlin, Planck himself, as he championed Einstein’s case to the faculty, implored his colleagues to accept Einstein despite his ill-conceived theory of light quanta [4].  This comment by Planck goes far to show how Planck, father of the quantum revolution, did not fully grasp, even by 1914, the fundamental nature and consequences of his original quantum hypothesis.  That same year, the American physicist Robert Millikan (1868—1953) performed a precise experimental measurement of the photoelectric effect, with the ostensible intention of proving Einstein wrong, but he accomplished just the opposite—providing clean experimental evidence confirming Einstein’s theory of the photoelectric effect. 

The Stimulated Emission of Light

About a year after Millikan confirmed that light is absorbed in whole, indivisible quanta of energy, Einstein took a further step in his theory of the light quantum. In 1916 he published a paper in the proceedings of the German Physical Society that explored how light would be in a state of thermodynamic equilibrium when interacting with atoms that had discrete energy levels. Once again he used simple arguments, this time using the principle of detailed balance, to derive a new and unanticipated property of light—stimulated emission!

Figure. The stimulated emission of light. An excited state is stimulated to emit an identical photon when the electron transitions to its ground state.

The stimulated emission of light occurs when an electron is in an excited state of a quantum system, like an atom, and an incident photon stimulates the emission of a second photon that has the same energy and phase as the first photon. If there are many atoms in the excited state, then this process leads to a chain reaction as 1 photon produces 2, and 2 produce 4, and 4 produce 8, etc. This exponential gain in photons with the same energy and phase is the origin of laser radiation. At the time that Einstein proposed this mechanism, lasers were half a century in the future, but he was led to this conclusion by extremely simple arguments about transition rates.

Figure. Section of Einstein’s 1916 paper that describes the absorption and emission of light by atoms with discrete energy levels [5].

Detailed balance is a principle that states that in thermal equilibrium all fluxes are balanced. In the case of atoms with ground states and excited states, this principle requires that as many transitions occur from the ground state to the excited state as from the excited state to the ground state. The crucial new element that Einstein introduced was to distinguish spontaneous emission from stimulated emission. Just as the probability to absorb a photon must be proportional to the photon density, there must be an equivalent process that de-excites the atom that is also proportional to the photon density. In addition, an electron must be able to spontaneously emit a photon at a rate that is independent of photon density. This leads to distinct coefficients in the transition rate equations that are today called the “Einstein A and B coefficients”. The B coefficients relate to the photon density, while the A coefficient relates to spontaneous emission.

Figure. Section of Einstein’s 1917 paper that derives the equilibrium properties of light interacting with matter. The “B”-coefficient for transition from state m to state n describes stimulated emission. [6]

Using the principle of detailed balance together with his A and B coefficients as well as Boltzmann factors describing the number of excited states relative to ground state atoms in equilibrium at a given temperature, Einstein was able to derive an early form of what is today called the Bose-Einstein occupancy function for photons.

Derivation of the Einstein A and B Coefficients

Detailed balance requires the rate from m to n to be the same as the rate from n to m
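Written in the standard notation for the Einstein coefficients, this balance condition is

```latex
N_m A_{mn} + N_m B_{mn}\,\rho = N_n B_{nm}\,\rho
```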

where the first term is the spontaneous emission rate from the excited state m to the ground state n, the second term is the stimulated emission rate, and the third term (on the right) is the absorption rate from n to m. The numbers in each state are Nm and Nn, and the density of photons is ρ. The number in the excited state relative to the number in the ground state is given by the Boltzmann factor
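which, for an energy difference $\Delta E$ between the two levels, is

```latex
\frac{N_m}{N_n} = e^{-\Delta E / k_B T}
```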

Assuming that the stimulated transition coefficient from n to m is the same as that from m to n, and inserting the Boltzmann factor, yields
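Solving the balance condition for the photon density under this assumption ($B_{mn} = B_{nm}$) gives

```latex
\rho = \frac{A_{mn}/B_{mn}}{e^{\Delta E / k_B T} - 1}
```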

The Planck density of photons for ΔE = hf is
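in its standard form

```latex
\rho(f) = \frac{8\pi h f^3}{c^3}\,\frac{1}{e^{h f / k_B T} - 1}
```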

which yields the final relation between the spontaneous emission coefficient and the stimulated emission coefficient
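Equating the two expressions for the photon density gives

```latex
A_{mn} = \frac{8\pi h f^3}{c^3}\, B_{mn}
```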

The total emission rate is
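the sum of the spontaneous and stimulated contributions, which, using $B_{mn}\rho = A_{mn}\bar{p}$, can be written

```latex
R_{\text{emission}} = N_m \left( A_{mn} + B_{mn}\,\rho \right) = N_m A_{mn} \left( 1 + \bar{p} \right)
```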

where the p-bar is the average photon number in the cavity. One of the striking aspects of this derivation is that no assumptions are made about the physical mechanisms that determine the coefficient B. Only arguments of detailed balance are required to arrive at these results.

Einstein’s Quantum Legacy

Einstein was awarded the Nobel Prize in 1921 for the photoelectric effect, not for the photon nor for any of Einstein’s other theoretical accomplishments.  Even in 1921, the quantum nature of light remained controversial.  It was only in 1923, after the American physicist Arthur Compton (1892—1962) showed that energy and momentum were conserved in the scattering of photons from electrons, that the quantum nature of light began to be accepted.  A few years later, in 1926, the quantum of light was named the “photon” by the American chemical physicist Gilbert Lewis (1875—1946). 

            A blog article like this, which attributes the invention of the quantum to Einstein rather than Planck, must say something about the irony of this attribution.  If Einstein is the father of the quantum, he ultimately was led to disinherit his own brainchild.  His final and strongest argument against the quantum properties inherent in the Copenhagen Interpretation was his famous EPR paper which, against his expectations, launched the concept of entanglement that underlies the coming generation of quantum computers.

Einstein’s Quantum Timeline

1900 – Planck’s quantum discontinuity for the calculation of the entropy of blackbody radiation.

1905 – Einstein’s “Miracle Year”. Proposes the light quantum.

1911 – First Solvay Conference on the theory of radiation and quanta.

1913 – Bohr’s quantum theory of hydrogen.

1914 – Einstein becomes a member of the Prussian Academy of Sciences.

1915 – Millikan measurement of the photoelectric effect.

1916 – Einstein proposes stimulated emission.

1921 – Einstein receives the Nobel Prize for the photoelectric effect. Third Solvay Conference on atoms and electrons.

1927 – Heisenberg’s uncertainty relation. Fifth Solvay International Conference on Electrons and Photons in Brussels. “First” Bohr-Einstein debate on indeterminacy in quantum theory.

1930 – Sixth Solvay Conference on magnetism. “Second” Bohr-Einstein debate.

1935 – Einstein-Podolsky-Rosen (EPR) paper on the completeness of quantum mechanics.

Selected Einstein Quantum Papers

Einstein, A. (1905). “Generation and conversion of light with regard to a heuristic point of view.” Annalen der Physik 17(6): 132-148.

Einstein, A. (1907). “Die Plancksche Theorie der Strahlung und die Theorie der spezifischen Wärme.” Annalen der Physik 22: 180–190.

Einstein, A. (1909). “On the current state of radiation problems.” Physikalische Zeitschrift 10: 185-193.

Einstein, A. and O. Stern (1913). “An argument for the acceptance of molecular agitation at absolute zero.” Annalen der Physik 40(3): 551-560.

Einstein, A. (1916). “Strahlungs-Emission und -Absorption nach der Quantentheorie.” Verh. Deutsch. Phys. Ges. 18: 318.

Einstein, A. (1917). “Quantum theory of radiation.” Physikalische Zeitschrift 18: 121-128.

Einstein, A., B. Podolsky and N. Rosen (1935). “Can quantum-mechanical description of physical reality be considered complete?” Physical Review 47(10): 777-780.


[1] M. Planck, “Elementary quanta of matter and electricity,” Annalen der Physik, vol. 4, pp. 564-566, Mar 1901.

[2] Klein, M. J. (1964). Einstein’s First Paper on Quanta. The natural philosopher. D. A. Greenberg and D. E. Gershenson. New York, Blaidsdell. 3.

[3] A. Einstein, “Generation and conversion of light with regard to a heuristic point of view,” Annalen der Physik, vol. 17, pp. 132-148, Jun 1905.

[4] Chap. 2 in “Mind at Light Speed”, by David Nolte (Free Press, 2001)

[5] Einstein, A. (1916). “Strahlungs-Emission und -Absorption nach der Quantentheorie.” Verh. Deutsch. Phys. Ges. 18: 318.

[6] Einstein, A. (1917). “Quantum theory of radiation.” Physikalische Zeitschrift 18: 121-128.

Science 1916: A Hundred-year Time Capsule

In one of my previous blog posts, as I was searching for Schwarzschild’s original papers on Einstein’s field equations and quantum theory, I obtained a copy of the January 1916 – June 1916 volume of the Proceedings of the Royal Prussian Academy of Sciences through interlibrary loan.  The extremely thick volume arrived at Purdue about a week after I ordered it online.  It arrived from Oberlin College in Ohio, which had received it as a gift in 1928 from the library of Professor Friedrich Loofs of the University of Halle in Germany.  Loofs had been the Haskell Lecturer at Oberlin for the 1911-1912 semesters. 

As I browsed through the volume looking for Schwarzschild’s papers, I was amused to find a cornucopia of turn-of-the-century science topics recorded in its pages.  There were papers on the overbite and lips of marsupials.  There were papers on forgotten languages.  There were papers on ancient Greek texts.  On the origins of religion.  On the philosophy of abstraction.  Histories of Indian dramas.  Reflections on cancer.  But what I found most amazing was a snapshot of the field of physics and mathematics in 1916, with historic papers by historic scientists who changed how we view the world. Here is a snapshot in time and in space, a period of only six months from a single journal, containing papers from a roster of authors that reads like a who’s who of physics.

In 1916 there were three major centers of science in the world with leading science publications: London with the Philosophical Magazine and Proceedings of the Royal Society; Paris with the Comptes Rendus of the Académie des Sciences; and Berlin with the Proceedings of the Royal Prussian Academy of Sciences and Annalen der Physik. In Russia, there were the scientific Journals of St. Petersburg, but the Bolshevik Revolution was brewing that would overwhelm that country for decades.  And in 1916 the academic life of the United States was barely worth noticing except for a few points of light at Yale and Johns Hopkins. 

Berlin in 1916 was embroiled in war, but science proceeded relatively unmolested.  The six-month volume of the Proceedings of the Royal Prussian Academy of Sciences contains a number of gems.  Schwarzschild was one of the most prolific contributors, publishing three papers in just this half-year volume, plus his obituary written by Einstein.  But joining Schwarzschild in this volume were Einstein, Planck, Born, Warburg, Frobenius, and Rubens among others—a pantheon of German scientists mostly cut off from the rest of the world at that time, but single-mindedly following their individual threads woven deep into the fabric of the physical world.

Karl Schwarzschild (1873 – 1916)

Schwarzschild had the unenviable yet effective motivation of his impending death to spur him to complete several projects that he must have known would make his name immortal.  In this six-month volume he published his three most important papers.  The first (pg. 189) was on the exact solution of Einstein’s field equations of general relativity.  The solution was for the restricted case of a point mass, yet the derivation yielded the Schwarzschild radius that later became known as the event horizon of a non-rotating black hole.  The second paper (pg. 424) extended the general relativity solution to a spherically symmetric incompressible liquid mass. 
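In modern notation, the radius that emerged from the point-mass solution is

```latex
r_s = \frac{2 G M}{c^2}
```

the distance from a mass $M$ at which the escape velocity reaches the speed of light.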

Schwarzschild’s solution to Einstein’s field equations for a point mass.


Schwarzschild’s extension of the field equation solutions to a finite incompressible fluid.

The subject, content and success of these two papers were wholly unexpected from this observational astronomer, stationed on the Russian Front during WWI calculating trajectories for German bombardments.  He would not have been considered a theoretical physicist but for the importance of his results and the sophistication of his methods.  Within only a year after Einstein published his general theory, based as it was on the complicated tensor calculus of Levi-Civita, Christoffel and Ricci-Curbastro that had taken him years to master, Schwarzschild found a solution that evaded even Einstein.

Schwarzschild’s third and final paper (pg. 548) was on an entirely different topic, still not in his official field of astronomy, that positioned all future theoretical work in quantum physics to be phrased in the language of Hamiltonian dynamics and phase space.  He proved that action-angle coordinates were the only acceptable canonical coordinates to be used when quantizing dynamical systems.  This paper answered a central question that had been nagging Bohr and Einstein and Ehrenfest for years—how to quantize dynamical coordinates.  Despite the simple way that Bohr’s quantized hydrogen atom is taught in modern physics, there was an ambiguity in the quantization conditions even for this simple single-electron atom.  The ambiguity arose from the numerous possible canonical coordinate transformations that were admissible, yet which led to different forms of quantized motion. 

Schwarzschild’s proposal of action-angle variables for quantization of dynamical systems.

 Schwarzschild’s doctoral thesis had been a theoretical topic in astrophysics that applied the celestial mechanics theories of Henri Poincaré to binary star systems.  Within Poincaré’s theory were integral invariants that were conserved quantities of the motion.  When a dynamical system had as many constraints as degrees of freedom, then every coordinate had an integral invariant.  In this unexpected last paper from Schwarzschild, he showed how canonical transformation to action-angle coordinates produced a unique representation in terms of action variables (whose dimensions are the same as Planck’s constant).  These action coordinates, with their associated cyclical angle variables, are the only unambiguous representations that can be quantized.  The important points of this paper were amplified a few months later in a publication by Schwarzschild’s friend Paul Epstein (1871 – 1939), solidifying this approach to quantum mechanics.  Paul Ehrenfest (1880 – 1933) continued this work later in 1916 by defining adiabatic invariants whose quantum numbers remain unchanged under slowly varying conditions, and the program started by Schwarzschild was definitively completed by Paul Dirac (1902 – 1984) at the dawn of quantum mechanics in 1925.
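The quantization rule that this program established is the familiar action integral over each action-angle pair (the Bohr–Sommerfeld condition):

```latex
J_k = \oint p_k \, dq_k = n_k h
```

Because the action variables are the unique canonical choice, the integers $n_k$ are unambiguous quantum numbers.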

Albert Einstein (1879 – 1955)

In 1916 Einstein was mopping up after publishing his definitive field equations of general relativity the year before.  His interests were still cast wide, not restricted only to this latest project.  In the 1916 Jan. to June volume of the Prussian Academy Einstein published two papers.  Each is remarkably short relative to the other papers in the volume, yet the importance of the papers may stand in inverse proportion to their length.

The first paper (pg. 184) is placed right before Schwarzschild’s first paper on February 3.  The subject of the paper is the expression of Maxwell’s equations in four-dimensional spacetime.  It is notable and ironic that Einstein mentions Hermann Minkowski (1864 – 1909) in the first sentence of the paper.  When Minkowski proposed his bold structure of spacetime in 1908, Einstein had been one of his harshest critics, writing letters to the editor about the absurdity of thinking of space and time as a single interchangeable coordinate system.  This is ironic, because Einstein today is perhaps best known for the special relativity properties of spacetime, yet he was slow to adopt the spacetime viewpoint. Einstein only came around to spacetime when he realized around 1910 that a general approach to relativity required the mathematical structure of tensor manifolds, and Minkowski had provided just such a manifold—the pseudo-Riemannian manifold of spacetime.  Einstein subsequently adopted spacetime with a passion and became its greatest champion, calling out Minkowski where possible to give him his due, although Minkowski had already died tragically of a burst appendix in 1909.

Relativistic energy density of electromagnetic fields.

The importance of Einstein’s paper hinges on his derivation of the electromagnetic field energy density using electromagnetic four vectors.  The energy density is part of the source term for his general relativity field equations.  Any form of energy density can warp spacetime, including electromagnetic field energy.  Furthermore, the Einstein field equations of general relativity are nonlinear as gravitational fields modify space and space modifies electromagnetic fields, producing a coupling between gravity and electromagnetism.  This coupling is implicit in the case of the bending of light by gravity, but Einstein’s paper from 1916 makes the connection explicit. 
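In modern four-vector notation (not the symbols of Einstein’s 1916 paper), the electromagnetic stress-energy tensor that serves as this source term is

```latex
T^{\mu\nu} = \frac{1}{\mu_0} \left( F^{\mu\alpha} F^{\nu}{}_{\alpha} - \frac{1}{4}\,\eta^{\mu\nu} F_{\alpha\beta} F^{\alpha\beta} \right)
```

whose $T^{00}$ component is the familiar field energy density.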

Einstein’s second paper (pg. 688) is even shorter and hence one of the most daring publications of his career.  Because the field equations of general relativity are nonlinear, they are not easy to solve exactly, and Einstein was exploring approximate solutions under conditions of slow speeds and weak fields.  In this “non-relativistic” limit the metric tensor separates into a Minkowski metric as a background on which a small metric perturbation remains.  This small perturbation has the properties of a wave equation for a disturbance of the gravitational field that propagates at the speed of light.  Hence, in the June 22 issue of the Prussian Academy in 1916, Einstein predicts the existence and the properties of gravitational waves.  Exactly one hundred years later in 2016, the LIGO collaboration announced the detection of gravitational waves generated by the merger of two black holes.
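In modern notation, writing the metric as a small perturbation on the Minkowski background, $g_{\mu\nu} = \eta_{\mu\nu} + h_{\mu\nu}$, the field equations in this limit reduce (in a suitable gauge, for the trace-reversed perturbation $\bar{h}_{\mu\nu}$) to a wave equation sourced by the stress-energy tensor:

```latex
\Box \bar{h}_{\mu\nu} = -\frac{16\pi G}{c^4}\, T_{\mu\nu}
```

with solutions that propagate at the speed of light.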

Einstein’s weak-field low-velocity approximation solutions of his field equations.
Einstein’s prediction of gravitational waves.

Max Planck (1858 – 1947)

Max Planck served as the secretary of the Prussian Academy in 1916 yet was still fully active in his research.  Although he had launched the quantum revolution with his quantum hypothesis of 1900, he was not a major proponent of quantum theory even as late as 1916.  His primary interests lay in thermodynamics and the origins of entropy, following the theoretical approaches of Ludwig Boltzmann (1844 – 1906).  In 1916 he was interested in how best to partition phase space as a way to count states and calculate entropy from first principles.  His paper in the 1916 volume (pg. 653) calculated the entropy for single-atom solids.

Counting microstates by Planck.

Max Born (1882 – 1970)

Max Born was to be one of the leading champions of the quantum mechanical revolution based at the University of Göttingen in the 1920’s. But in 1916 he was on leave from the University of Berlin working on ranging for artillery.  Yet he still pursued his academic interests, like Schwarzschild.  On pg. 614 in the Proceedings of the Prussian Academy, Born published a paper on anisotropic liquids, such as liquid crystals and the effect of electric fields on them.  It is astonishing to think that so many of the flat-panel displays we have today, whether on our watches or smart phones, are technological descendants of work by Born at the beginning of his career.

Born on liquid crystals.

Ferdinand Frobenius (1849 – 1917)

Like Schwarzschild, Frobenius was at the end of his career in 1916 and would pass away one year later, but unlike Schwarzschild, his career had been a long one: he received his doctorate under Weierstrass and explored elliptic functions, differential equations, number theory and group theory.  One of the papers that established him in group theory appears in the May 4th issue on page 542, where he explores the composition series of a group.

Frobenius on groups.

Heinrich Rubens (1865 – 1922)

Max Planck owed his quantum breakthrough in part to the exquisitely accurate experimental measurements made by Heinrich Rubens on black body radiation.  It was only by the precise shape of what came to be called the Planck spectrum that Planck could say with such confidence that his theory of quantized radiation interactions fit Rubens’ spectrum so perfectly.  In 1916 Rubens was at the University of Berlin, having taken the position vacated by Paul Drude in 1906.  He was a specialist in infrared spectroscopy, and on page 167 of the Proceedings he describes the spectrum of steam and its consequences for the quantum theory.

Rubens and the infrared spectrum of steam.

Emil Warburg (1846 – 1931)

Emil Warburg’s fame is primarily as the father of Otto Warburg, who won the 1931 Nobel Prize in Physiology or Medicine.  On page 314 Warburg reports on photochemical processes in HBr gases.  In an obscure and very indirect way, I am an academic descendant of Emil Warburg.  One of his students was Robert Pohl, a famous early researcher in solid state physics, sometimes called the “father of solid state physics”.  Pohl was in the physics department in Göttingen in the 1920’s along with Born and Franck during the golden age of quantum mechanics.  Robert Pohl’s son, Robert Otto Pohl, was my professor when I was a sophomore at Cornell University in 1978 for the course on introductory electromagnetism, which used a textbook by the Nobel laureate Edward Purcell, a quirky volume of the Berkeley Series of physics textbooks.  This makes Emil Warburg my professor’s father’s professor.

Warburg on photochemistry.

Papers in the 1916 Vol. 1 of the Prussian Academy of Sciences

Schulze,  Alt– und Neuindisches

Orth,  Zur Frage nach den Beziehungen des Alkoholismus zur Tuberkulose

Schulze,  Die Erhabenheiten auf der Lippen- und Wangenschleimhaut der Säugetiere

von Wilamowitz-Moellendorff,  Die Samia des Menandros

Engler,  Bericht über das >>Pflanzenreich<<

von Harnack,  Bericht über die Ausgabe der griechischen Kirchenväter der drei ersten Jahrhunderte

Meinecke,  Germanischer und romanischer Geist im Wandel der deutschen Geschichtsauffassung

Rubens und Hettner,  Das langwellige Wasserdampfspektrum und seine Deutung durch die Quantentheorie

Einstein,  Eine neue formale Deutung der Maxwellschen Feldgleichungen der Elektrodynamik

Schwarzschild,  Über das Gravitationsfeld eines Massenpunktes nach der Einsteinschen Theorie

Helmreich,  Handschriftliche Verbesserungen zu dem Hippokratesglossar des Galen

Prager,  Über die Periode des veränderlichen Sterns RR Lyrae

Holl,  Die Zeitfolge des ersten origenistischen Streits

Lüders,  Zu den Upanisads. I. Die Samvargavidya

Warburg,  Über den Energieumsatz bei photochemischen Vorgängen in Gasen. VI.

Hellman,  Über die ägyptischen Witterungsangaben im Kalender von Claudius Ptolemaeus

Meyer-Lübke,  Die Diphthonge im Provenzalischen

Diels,  Über die Schrift Antipocras des Nikolaus von Polen

Müller und Sieg,  Maitrisimit und >>Tocharisch<<

Meyer,  Ein altirischer Heilsegen

Schwarzschild,  Über das Gravitationsfeld einer Kugel aus inkompressibler Flüssigkeit nach der Einsteinschen Theorie

Brauer,  Die Verbreitung der Hyracoiden

Correns,  Untersuchungen über Geschlechtsbestimmung bei Distelarten

Brahn,  Weitere Untersuchungen über Fermente in der Leber von Krebskranken

Erdmann,  Methodologische Konsequenzen aus der Theorie der Abstraktion

Bang,  Studien zur vergleichenden Grammatik der Türksprachen. I.

Frobenius,  Über die Kompositionsreihe einer Gruppe

Schwarzschild,  Zur Quantenhypothese

Fischer und Bergmann,  Über neue Galloylderivate des Traubenzuckers und ihren Vergleich mit der Chebulinsäure

Schuchhardt,  Der starke Wall und die breite, zuweilen erhöhte Berme bei frühgeschichtlichen Burgen in Norddeutschland

Born,  Über anisotrope Flüssigkeiten

Planck,  Über die absolute Entropie einatomiger Körper

Haberlandt,  Blattepidermis und Lichtperzeption

Einstein,  Näherungsweise Integration der Feldgleichungen der Gravitation

Lüders,  Die Saubhikas.  Ein Beitrag zur Geschichte des indischen Dramas

Karl Schwarzschild’s Radius: How Fame Eclipsed a Physicist’s own Legacy

In an ironic twist of the history of physics, Karl Schwarzschild’s fame has eclipsed his own legacy.  If asked who Karl Schwarzschild (1873 – 1916) was, you would probably say he’s the guy who solved Einstein’s Field Equations of General Relativity and discovered the radius of black holes.  You may also know that he accomplished this Herculean feat while dying slowly behind the German lines on the Eastern Front in WWI.  But ask what else he did, and you would probably come up blank.  Yet Schwarzschild was one of the most wide-ranging physicists at the turn of the 20th century, which is saying something, because it places him in the same pantheon as Planck, Lorentz, Poincaré and Einstein.  Let’s take a look at the part of his career that hides in the shadow of his own radius.

A Radius of Interest

Karl Schwarzschild was born in Frankfurt, Germany, shortly after the Franco-Prussian war thrust Prussia onto the world stage as a major political force in Europe.  His family were Jewish merchants of longstanding reputation in the city, and Schwarzschild’s childhood was spent in the vibrant Jewish community.  One of his father’s friends was a professor at a university in Frankfurt, whose son, Paul Epstein (1871 – 1939), became a close friend of Karl’s at the Gymnasium.  Schwarzschild and Epstein would partially shadow each other’s careers, despite the fact that Schwarzschild became an astronomer while Epstein became a famous mathematician and number theorist.  This was in part because Schwarzschild had a large radius of interests that spanned the breadth of current mathematics and science, practicing both experiment and theory. 

Schwarzschild’s application of the Hamiltonian formalism for quantum systems set the stage for the later adoption of Hamiltonian methods in quantum mechanics. He came dangerously close to stating the uncertainty principle that catapulted Heisenberg to fame.

By the time Schwarzschild was sixteen, he had taught himself the mathematics of celestial mechanics to such depth that he published two papers on the orbits of binary stars.  He also became fascinated by astronomy and purchased lenses and other materials to construct his own telescope.  His interests were helped along by Epstein, who was two years older and whose father had his own private observatory.  When Epstein went to study at the University of Strasbourg (then part of the German Empire), Schwarzschild followed him.  But Schwarzschild’s main interest in astronomy diverged from Epstein’s main interest in mathematics, and Schwarzschild transferred to the University of Munich where he studied under Hugo von Seeliger (1849 – 1924), the premier German astronomer of his day.  Epstein remained at Strasbourg, where he studied under Bruno Christoffel (1829 – 1900) and eventually became a professor, but he was forced to relinquish the post when Strasbourg was ceded to France after WWI. 

The Birth of Stellar Interferometry

Until the Hubble Space Telescope was launched in 1990, no star had ever been resolved as a direct image.  A few years after its launch, using its spectacular resolving power, the Hubble optics resolved—just barely—the red supergiant Betelgeuse.  No other star (other than the Sun) is close enough or big enough to image the stellar disk, even for the Hubble far above our atmosphere.  The reason is that the diameter of the optical lenses and mirrors of the Hubble—as big as they are at 2.4 meters—still produces a diffraction pattern that smears the image so that stars cannot be resolved.  Yet information on the size of a distant object is encoded as phase in the light waves that are emitted from the object, and this phase information is accessible to interferometry.

The first physicist who truly grasped the power of optical interferometry and who understood how to design the first interferometric metrology systems was the French physicist Armand Hippolyte Louis Fizeau (1819 – 1896).  Fizeau became interested in the properties of light when he collaborated with his friend Léon Foucault (1819–1868) on early uses of photography.  The two then embarked on a measurement of the speed of light but had a falling out before the experiment could be finished, and both continued the pursuit independently.  Fizeau achieved the first measurement using a rapidly rotating toothed wheel [1], while Foucault came in second using a more versatile system with a spinning mirror [2].  Yet Fizeau surpassed Foucault in optical design and became an expert in interference effects.  Interference apparatus had been developed earlier by Augustin Fresnel (the Fresnel bi-prism 1819), Humphrey Lloyd (Lloyd’s mirror 1834) and Jules Jamin (Jamin’s interferential refractor 1856).  They had found ways of redirecting light using refraction and reflection to cause interference fringes.  But Fizeau was one of the first to recognize that each emitting region of a light source is coherent with itself, and he used this insight, together with lenses, to design the first interferometer.

Fizeau’s interferometer used a lens with a tight focal spot masked off by an opaque screen with two open slits.  When the masked lens device was focused on an intense light source it produced two parallel pencils of light that were mutually coherent but spatially separated.  Fizeau used this apparatus to measure the speed of light in moving water in 1859 [3].

Fig. 1  Optical configuration of the source element of the Fizeau refractometer.

The working principle of the Fizeau refractometer is shown in Fig. 1.  The light source is at the bottom, and it is reflected by the partially-silvered beam splitter to pass through the lens and the mask containing two slits.  (Only the light paths that pass through the double-slit mask on the lens are shown in the figure.)  The slits produce two pencils of mutually coherent light that pass through a system (in the famous Fizeau ether drag experiment it was along two tubes of moving water) and are returned through the same slits, and they intersect at the view port where they produce interference fringes.  The fringe spacing is set by the separation of the two slits in the mask.  The Rayleigh region of the lens defines a region of spatial coherence even for a so-called “incoherent” source.  Therefore, this apparatus, by use of the lens, could convert an incoherent light source into a coherent probe to test the refractive index of test materials, which is why it was called a refractometer. 

Fizeau became adept at thinking of alternative optical designs of his refractometer and alternative applications.  In an address to the French Physical Society in 1868 he suggested that the double-slit mask could be used on a telescope to determine sizes of distant astronomical objects [4].  There were several subsequent attempts to use Fizeau’s configuration in astronomical observations, but none were conclusive and hence were not widely known.

An optical configuration and astronomical application that was very similar to Fizeau’s idea was proposed by Albert Michelson in 1890 [5].  He built the apparatus and used it to successfully measure the size of several moons of Jupiter [6].  The configuration of the Michelson stellar interferometer is shown in Fig. 2.  Light from a distant star passes through two slits in the mask in front of the collecting optics of a telescope.  When the two pencils of light intersect at the view port, they produce interference fringes.  Because of the finite size of the stellar source, the fringes are partially washed out.  By adjusting the slit separation, a certain separation can be found where the fringes completely wash out.  The size of the star is then related to the separation of the slits for which the fringe visibility vanishes.  This simple principle allows this type of stellar interferometry to measure the size of stars that are large and relatively close to Earth.  However, if stars are too far away even this approach cannot be used to measure their sizes because telescopes aren’t big enough.  This limitation is currently being bypassed by the use of long-baseline optical interferometers.

Fig. 2  Optical configuration of the Michelson stellar interferometer.  Fringes at the view port are partially washed out by the finite size of the star.  By adjusting the slit separation, the fringes can be made to vanish entirely, yielding an equation that can be solved for the size of the star.
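The vanishing-fringe condition can be made quantitative.  For a uniform stellar disk of angular diameter θ observed through slits separated by a baseline B, the fringe visibility is the standard result V = |2J₁(x)/x| with x = πBθ/λ, and the fringes first vanish at the first zero of the Bessel function J₁, giving θ ≈ 1.22 λ/B.  The sketch below is a modern numerical illustration of this relation (not Michelson’s own calculation), using only the Python standard library:

```python
import math

def bessel_j1(x):
    """J1(x) from its integral representation, via a simple midpoint rule."""
    n = 2000
    s = 0.0
    for k in range(n):
        tau = (k + 0.5) * math.pi / n
        s += math.cos(tau - x * math.sin(tau))
    return s / n

def first_j1_zero(lo=3.0, hi=4.5, iters=60):
    """Bisection for the first positive zero of J1 (x1 = 3.8317...)."""
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if bessel_j1(lo) * bessel_j1(mid) <= 0.0:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)

def visibility(theta, baseline, wavelength):
    """Fringe visibility of a uniform disk of angular diameter theta (radians)."""
    x = math.pi * baseline * theta / wavelength
    return 1.0 if x == 0 else abs(2.0 * bessel_j1(x) / x)

def diameter_from_null(baseline_null, wavelength):
    """Angular diameter from the slit separation at which the fringes first
    vanish: theta = x1 * wavelength / (pi * B), i.e. about 1.22 lambda / B."""
    return first_j1_zero() * wavelength / (math.pi * baseline_null)
```

For example, fringes that vanish at a slit separation of 3 m in 600 nm light imply θ ≈ 2.4 × 10⁻⁷ radians, roughly 0.05 arcseconds.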

One of the open questions in the history of interferometry is whether Michelson was aware of Fizeau’s proposal for the stellar interferometer made in 1868.  Michelson was well aware of Fizeau’s published research and acknowledged him as a direct inspiration of his own work in interference effects.  But Michelson was unaware of the undercurrents in the French school of optical interference.  When he visited Paris in 1881, he met with many of the leading figures in this school (including Lippmann and Cornu), but there is no mention or any evidence that he met with Fizeau.  By this time Fizeau’s wife had passed away, and Fizeau spent most of his time in seclusion at his home outside Paris.  Therefore, it is unlikely that he would have been present during Michelson’s visit.  Because Michelson viewed Fizeau with such awe and respect, if he had met him, he most certainly would have mentioned it.  Therefore, Michelson’s invention of the stellar interferometer can be considered with some confidence to be a case of independent discovery.  It is perhaps not surprising that he hit on the same idea that Fizeau had in 1868, because Michelson was one of the few physicists who understood coherence and interference at the same depth as Fizeau.

Schwarzschild’s Stellar Interferometer

The physics of the Michelson stellar interferometer is very similar to the physics of Young’s double slit experiment.  The two slits in the aperture mask of the telescope objective act to produce a simple sinusoidal interference pattern at the image plane of the optical system.  The size of the stellar diameter is determined by using the wash-out effect of the fringes caused by the finite stellar size.  However, it is well known to physicists who work with diffraction gratings that a multiple-slit interference pattern has a much greater resolving power than a simple double slit. 

This realization must have hit von Seeliger and Schwarzschild, working together at Munich, when they saw the publication of Michelson’s theoretical analysis of his stellar interferometer in 1890, followed by his use of the apparatus to measure the size of Jupiter’s moons.  Schwarzschild and von Seeliger realized that by replacing the double-slit mask with a multiple-slit mask, the widths of the interference maxima would be much narrower.  Such a diffraction mask on a telescope would cause a star to produce a multiple set of images on the image plane of the telescope associated with the multiple diffraction orders.  More interestingly, if the target were a binary star, the diffraction would produce two sets of diffraction maxima—a double image!  If the “finesse” of the grating is high enough, the binary star separation could be resolved as a doublet in the diffraction pattern at the image, and the separation could be measured, giving the angular separation of the two stars of the binary system.  Such an approach to the binary separation would be a direct measurement, which was a distinct and clever improvement over the indirect Michelson configuration that required finding the extinction of the fringe visibility. 
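The resolving-power argument can be made concrete in a few lines of Python. This sketch (my own illustration with hypothetical function names, not Schwarzschild’s analysis) compares the width of the principal maxima for two slits versus eight:

```python
import math

def n_slit_intensity(delta, n):
    """Normalized N-slit interference intensity as a function of the
    phase difference delta between adjacent slits."""
    if abs(math.sin(delta / 2)) < 1e-12:
        return 1.0  # principal maximum
    return (math.sin(n * delta / 2) / (n * math.sin(delta / 2))) ** 2

def half_width(n, step=1e-4):
    """Phase offset at which the principal maximum falls to half height."""
    d = 0.0
    while n_slit_intensity(d, n) > 0.5:
        d += step
    return d
```

For N = 2 the half-width of the principal maximum is π/2 in phase, while for N = 8 it shrinks to about 0.35: the maxima narrow roughly as 1/N, which is what lets a grating mask resolve the doublet of a binary star directly.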

Schwarzschild enlisted the help of a fine German instrument maker to create a multiple-slit system with an adjustable slit separation.  The device is shown in Fig. 3 from Schwarzschild’s 1896 publication on the use of the stellar interferometer to measure the separation of binary stars [7].  The device is ingenious.  By rotating the chain around the gear on the right-hand side of the apparatus, the two metal plates with four slits could be raised or lowered, causing the projection onto the objective plane to have variable slit spacings.  In the operation of the telescope, the changing height of the slits does not matter, because they are near a conjugate optical plane (the entrance pupil) of the optical system.  Using this adjustable multiple-slit system, Schwarzschild (and two colleagues he enlisted) made multiple observations of well-known binary star systems, and they calculated the star separations.  Several of their published results are shown in Fig. 4.

Fig. 3  Illustration from Schwarzschild’s 1896 paper describing an improvement of the Michelson interferometer for measuring the separation of binary star systems Ref. [7].
Fig. 4  Data page from Schwarzschild’s 1896 paper measuring the angular separation of two well-known binary star systems: gamma Leonis and xi Ursae Majoris. Ref. [7]

Schwarzschild’s publication demonstrated one of the very first uses of stellar interferometry—well before Michelson himself used his own configuration to measure the diameter of Betelgeuse in 1920.  Schwarzschild’s achievement came before he had even received his doctorate, on a topic orthogonal to his dissertation.  Yet this fact is virtually unknown to the broader physics community outside of astronomy.  If he had not become so famous later for his solution of Einstein’s field equations, Schwarzschild might nonetheless have been famous for his early contributions to stellar interferometry.  But even this was not the end of his unique contributions to physics.

Adiabatic Physics

As Schwarzschild worked toward his doctorate under von Seeliger, his dissertation topic was the new theories of Henri Poincaré (1854 – 1912) on celestial mechanics.  Poincaré had made a big splash on the international stage with the publication of his prize-winning memoire in 1890 on the three-body problem.  This is the publication where Poincaré first described what would later become known as chaos theory.  The memoire was followed by his volumes on “New Methods in Celestial Mechanics” published between 1892 and 1899.  Poincaré’s work on celestial mechanics was based on his earlier work on the theory of dynamical systems, where he developed important invariant theorems related to Liouville’s theorem on the conservation of phase-space volume.  Schwarzschild applied Poincaré’s theorems to problems in celestial orbits.  He took his doctorate in 1896 and received a post at an astronomical observatory outside Vienna.

While at Vienna, Schwarzschild made his most important sustained contributions to the science of astronomy.  Astronomical observations had been dominated for centuries by the human eye, but photographic techniques had been making steady inroads since the time of Hermann Carl Vogel (1841 – 1907) in the 1880’s at the Potsdam observatory.  Photographic plates were used primarily to record star positions but were known to be unreliable for recording stellar intensities.  Schwarzschild developed an “out-of-focus” technique that blurred a star’s image, making it larger and easier to measure the density of the exposed and developed photographic emulsion.  In this way, Schwarzschild measured the magnitudes of 367 stars.  Two of these stars had variable magnitudes that he was able to record and track.  Schwarzschild correctly explained the intensity variation as caused by steady oscillations in the heating and cooling of the stellar atmosphere.  This work established the properties of the Cepheid variables, which would become some of the most important “standard candles” for the measurement of cosmological distances.  Based on the importance of this work, Schwarzschild returned to Munich as a teacher in 1899 and subsequently was appointed in 1901 as the director of the observatory at Göttingen established by Gauss eighty years earlier.

Schwarzschild’s years at Göttingen brought him into contact with some of the greatest mathematicians and physicists of that era.  The mathematicians included Felix Klein, David Hilbert and Hermann Minkowski.  The physicists included von Laue, a student of Woldemar Voigt.  This period was one of several “golden ages” of Göttingen.  The first golden age was the time of Gauss and Riemann in the mid-1800’s.  The second golden age, when Schwarzschild was present, began when Felix Klein arrived at Göttingen and attracted the top mathematicians of the time.  The third golden age of Göttingen was the time of Born and Jordan and Heisenberg at the birth of quantum mechanics in the mid 1920’s.

In 1906, the Austrian physicist Paul Ehrenfest, freshly out of his PhD under the supervision of Boltzmann, arrived at Göttingen only weeks before Boltzmann took his own life.  Felix Klein at Göttingen had been relying on Boltzmann to provide a comprehensive review of statistical mechanics for the Mathematical Encyclopedia, so he now entrusted this project to the young Ehrenfest.  It was a monumental task, which was to take him and his physicist wife Tatyana nearly five years to complete.  Part of the delay was the desire by the Ehrenfests to close some open problems that remained in Boltzmann’s work.  One of these was a mechanical theorem of Boltzmann’s that identified properties of statistical mechanical systems that remained unaltered through a very slow change in system parameters.  These properties would later be called adiabatic invariants by Einstein.

Ehrenfest recognized that Wien’s displacement law, which had been a guiding light for Planck and his theory of black body radiation, had originally been derived by Wien using classical principles related to slow changes in the volume of a cavity.  Ehrenfest was struck by the fact that such slow changes would not induce changes in the quantum numbers of the quantized states, and hence that the quantum numbers must be adiabatic invariants of the black body system.  This not only explained why Wien’s displacement law continued to hold under quantum as well as classical considerations, but it also explained why Planck’s quantization of the energy of his simple oscillators was the only possible choice.  For a classical harmonic oscillator, the ratio of the energy of oscillation to the frequency of oscillation is an adiabatic invariant, E/ν = const., which is immediately recognized in Planck’s quantum condition E = nhν.
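This adiabatic invariance is easy to check numerically.  The sketch below (my own illustration, not from Ehrenfest’s paper) integrates a harmonic oscillator whose frequency drifts slowly from ω = 1.0 to ω = 1.5 and verifies that E/ω stays nearly constant even though E itself grows:

```python
import math

def final_action_ratio(T=200.0, dt=0.005):
    """Integrate x'' = -w(t)^2 x with a slowly drifting frequency w(t)
    and return the final ratio E/w, the adiabatic invariant."""
    def w(t):
        return 1.0 + 0.5 * t / T   # frequency drifts from 1.0 to 1.5

    def accel(t, x):
        return -w(t) ** 2 * x

    x, v, t = 1.0, 0.0, 0.0        # initial energy E0 = 0.5, so E0/w0 = 0.5
    for _ in range(int(T / dt)):
        # one RK4 step for the pair (x, v)
        k1x, k1v = v, accel(t, x)
        k2x, k2v = v + 0.5 * dt * k1v, accel(t + 0.5 * dt, x + 0.5 * dt * k1x)
        k3x, k3v = v + 0.5 * dt * k2v, accel(t + 0.5 * dt, x + 0.5 * dt * k2x)
        k4x, k4v = v + dt * k3v, accel(t + dt, x + dt * k3x)
        x += dt * (k1x + 2 * k2x + 2 * k3x + k4x) / 6
        v += dt * (k1v + 2 * k2v + 2 * k3v + k4v) / 6
        t += dt
    E = 0.5 * (v * v + w(t) ** 2 * x * x)
    return E / w(t)
```

The returned ratio stays within about one percent of its initial value of 0.5, even though the energy itself has grown by fifty percent along with the frequency.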

Ehrenfest published his observations in 1913 [8], the same year that Bohr published his theory of the hydrogen atom.  Ehrenfest immediately applied the theory of adiabatic invariants to Bohr’s model and discovered that the quantum condition for the quantized energy levels was again an adiabatic invariant of the electron orbits, and not merely a consequence of integer multiples of angular momentum, which had seemed somewhat ad hoc.

After eight exciting years at Göttingen, Schwarzschild was offered the position at the Potsdam Observatory in 1909 after the death of the famous German astronomer Carl Vogel, who had made the first confirmed measurements of the optical Doppler effect.  Schwarzschild accepted and moved to Potsdam with his young family.  His son Martin Schwarzschild would follow him into the profession, becoming a famous astronomer at Princeton University and a theorist of stellar structure.  At the outbreak of WWI, Schwarzschild joined the German army out of a sense of patriotism.  Because of his advanced education he was made an officer of artillery, tasked with calculating artillery trajectories, and after a short time on the Western Front in Belgium he was transferred to the Eastern Front in Russia.  Though he was not in the trenches, he was in the midst of the chaos to the rear of the front.  Despite this situation, he found time to pursue his science through the year 1915.

Schwarzschild was intrigued by Ehrenfest’s paper on adiabatic invariants and their similarity to several of the invariant theorems of Poincaré that he had studied for his doctorate.  Up until this time, mechanics had mostly been pursued through the Lagrangian formalism, which can easily handle generalized forces associated with dissipation.  But celestial mechanics deals with conservative systems, for which the Hamiltonian formalism is a more natural approach.  In particular, the Hamilton-Jacobi canonical transformations made it particularly easy to find pairs of generalized coordinates that had simple periodic behavior.  In his published paper [9], Schwarzschild called these “Action-Angle” coordinates, because one was the action integral well known from the principle of “Least Action”, and the other was like an angle variable that changed steadily in time (see Fig. 5).  Action-angle coordinates have come to form the foundation of many of the properties of Hamiltonian chaos, Hamiltonian maps, and Hamiltonian tapestries.

Fig. 5  Description of the canonical transformation to action-angle coordinates (Ref. [9] pg. 549). Schwarzschild names the new coordinates “Wirkungsvariable” and “Winkelvariable”.
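In modern notation (a sketch using today’s standard symbols rather than Schwarzschild’s Wirkungs- and Winkelvariable), the action-angle pair for a one-dimensional periodic system is

```latex
J = \oint p \, dq , \qquad \dot{\theta} = \frac{\partial H(J)}{\partial J} = \omega(J)
```

where the action J is constant along the motion (and an adiabatic invariant under slow parameter changes), while the angle θ advances steadily in time.  The quantization condition of the old quantum theory then takes the compact form J = nh.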

During lulls in bombardments, Schwarzschild translated the Hamilton-Jacobi methods of celestial mechanics to apply them to the new quantum mechanics of the Bohr orbits.  The phrase “quantum mechanics” had not yet been coined (that would come ten years later in a paper by Max Born), but it was clear that the Bohr quantization conditions were a new type of mechanics.  The periodicities inherent in the quantum systems could be mapped naturally onto the periodicities of the angle variables, while Ehrenfest’s adiabatic invariants could be mapped onto the slowly varying action integrals.  Schwarzschild showed that action-angle coordinates were the only allowed choice, because they enabled the separation of the Hamilton-Jacobi equation and hence provided the correct quantization conditions for the Bohr electron orbits.  Later, when Sommerfeld published his quantized elliptical orbits in 1916, the multiplicity of quantum conditions and orbits caused concern, but Ehrenfest came to the rescue, showing that each of Sommerfeld’s quantum conditions was precisely one of Schwarzschild’s action-integral invariants of the classical electron dynamics [10].

The works by Schwarzschild, and a closely related paper that amplified his ideas published by his friend Paul Epstein several months later [11], were the first to show the power of the Hamiltonian formulation of dynamics for quantum systems, foreshadowing the future importance of Hamiltonians for quantum theory.  An essential part of the Hamiltonian formalism is the concept of phase space.  In his paper, Schwarzschild showed that the phase space of quantum systems was divided into small but finite elementary regions whose areas were equal to Planck’s constant h (see Fig. 6).  The areas were products of a small change in a momentum coordinate Δp and a corresponding small change in a position coordinate Δx, so that Δx·Δp = h.  This observation, made in 1915 by Schwarzschild, was only one step away from Heisenberg’s uncertainty relation, twelve years before Heisenberg discovered it.  However, in 1915 Born’s probabilistic interpretation of quantum mechanics had not yet been made, nor had the idea of measurement uncertainty, so Schwarzschild did not have the appropriate context in which to make the leap to the uncertainty principle.  Nonetheless, by introducing action-angle coordinates and the Hamiltonian formalism applied to quantum systems, with the natural structure of phase space, Schwarzschild laid the foundation for the developments in quantum theory made by the next generation.

Fig. 6  Expression of the division of phase space into elemental areas of action equal to h (Ref. [9] pg. 550).

All Quiet on the Eastern Front

Towards the end of his second stay in Munich in 1900, prior to joining the Göttingen faculty, Schwarzschild presented a paper at a meeting of the German Astronomical Society held in Heidelberg in August.  The topic was unlike anything he had tackled before.  It considered the highly theoretical question of whether the universe is non-Euclidean, and more specifically whether it has curvature.  He concluded from observation that if the universe were curved, the radius of curvature must be larger than 50 light years if the geometry is hyperbolic, or larger than 2000 light years if it is elliptical.  Schwarzschild was working out ideas of differential geometry and applying them to the universe at large at a time when Einstein was just graduating from the ETH, where he skipped his math classes and had his friend Marcel Grossmann take notes for him.

The topic of Schwarzschild’s talk tells an important story about the warping of historical perspective by the “great man” syndrome.  In this case the great man is Einstein, who today is given all the credit for discovering the warping of space.  His development of General Relativity is often portrayed as the work of a lone genius in the wilderness performing a blazing act of creation out of the void.  In fact, non-Euclidean geometry had been around for some time by 1900—five years before Einstein’s Special Theory and ten years before his first publications on the General Theory.  Gauss had developed the idea of the intrinsic curvature of a manifold fifty years earlier, an idea later amplified by Riemann.  By the turn of the century alternative geometries were all the rage, and Schwarzschild considered whether there were sufficient astronomical observations to set limits on the curvature of the universe.  But revisionist history is just as prevalent in physics as in any field, and when someone like Einstein becomes so big in the mind’s eye, his shadow makes it difficult to see all the people standing behind him.

This is not meant to take away from the feat that Einstein accomplished.  The General Theory of Relativity, published by Einstein in its full form in 1915, was spectacular [12].  Einstein had taken vague notions about curved spaces and made them specific, mathematically rigorous, and intimately connected with physics through the mass-energy source term in his field equations.  His mathematics had gone beyond even what his mathematician friend and former collaborator Grossmann could achieve.  Yet Einstein’s field equations were nonlinear tensor differential equations in which the warping of space depended on the strength of the energy fields, while the configuration of those energy fields depended in turn on the warping of space.  This type of nonlinear equation is difficult to solve in general terms, and Einstein was not immediately aware of how to find solutions to his own equations.

Therefore, it was no small surprise to him when he received a letter from the Eastern Front, from an astronomer he barely knew, who had found a solution—a simple solution (see Fig. 7)—to his field equations.  Einstein probably wondered how he could have missed it, but he was generous and forwarded the paper to the proceedings of the Prussian Academy of Sciences, where it was published in 1916 [13].

Fig. 7  Schwarzschild’s solution of the Einstein Field Equations (Ref. [13] pg. 194).

In the same paper, Schwarzschild used his solution to find the exact equation describing the precession of the perihelion of Mercury, which Einstein had only calculated approximately.  The dynamical equations for Mercury are shown in Fig. 8.

Fig. 8  Explanation for the precession of the perihelion of Mercury ( Ref. [13]  pg. 195)

Schwarzschild’s solution to Einstein’s field equations of General Relativity was not a general solution, even for a point mass.  It had constants of integration that could take arbitrary values, such as the characteristic length scale that Schwarzschild called “alpha”.  It was David Hilbert who later expanded upon Schwarzschild’s work, giving the general solution and naming the characteristic length scale (where the metric diverges) after Schwarzschild.  This is how the “Schwarzschild radius” got its name, and it stuck.  In fact, it stuck so well that the Schwarzschild radius has now eclipsed much of the rest of Schwarzschild’s considerable accomplishments.
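In modern coordinates and units (not the notation of the 1916 paper, where the length scale appears as alpha), the solution takes the familiar form

```latex
ds^2 = \left(1 - \frac{r_s}{r}\right) c^2 \, dt^2
     - \frac{dr^2}{1 - r_s/r}
     - r^2 \left( d\theta^2 + \sin^2\theta \, d\varphi^2 \right),
\qquad r_s = \frac{2GM}{c^2}
```

with the metric coefficient diverging at r = r_s, the length scale that Hilbert named the Schwarzschild radius.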

Unfortunately, Schwarzschild’s accomplishments were cut short when he contracted an autoimmune disease that may have been hereditary.  It is ironic that amid the carnage of the Eastern Front, it was a genetic disease that caused his death at the age of 42.  He was already suffering from the effects of the disease as he worked on his last publications.  He was sent home from the front to his family in Potsdam, where he passed away several months later, having shepherded his final two papers through the publication process.  His last paper, on the action-angle variables in quantum systems [9], was published on the day that he died.

Schwarzschild’s Legacy

Schwarzschild’s legacy was assured when he solved Einstein’s field equations and Einstein communicated it to the world. But his hidden legacy is no less important.

Schwarzschild’s application of the Hamiltonian formalism of canonical transformations and phase space for quantum systems set the stage for the later adoption of Hamiltonian methods in quantum mechanics. He came dangerously close to stating the uncertainty principle that catapulted Heisenberg to later fame, although he could not express it in probabilistic terms because he came too early.

Schwarzschild is considered to be the greatest German astronomer of the last hundred years.  This is based in part on his work at the birth of stellar interferometry and in part on his development of stellar photometry and the calibration of the Cepheid variable stars that went on to revolutionize our view of our place in the universe.  Solving Einstein’s field equations was just a sideline for him, a hobby to occupy his active and curious mind.

[1] Fizeau, H. L. (1849). “Sur une expérience relative à la vitesse de propagation de la lumière.” Comptes rendus de l’Académie des sciences 29: 90–92, 132.

[2] Foucault, J. L. (1862). “Détermination expérimentale de la vitesse de la lumière: parallaxe du Soleil.” Comptes rendus de l’Académie des sciences 55: 501–503, 792–796.

[3] Fizeau, H. (1859). “Sur les hypothèses relatives à l’éther lumineux.” Ann. Chim. Phys.  Ser. 4 57: 385–404.

[4] Fizeau, H. (1868). “Prix Bordin: Rapport sur le concours de l’annee 1867.” C. R. Acad. Sci. 66: 932.

[5] Michelson, A. A. (1890). “I. On the application of interference methods to astronomical measurements.” The London, Edinburgh, and Dublin Philosophical Magazine and Journal of Science 30(182): 1-21.

[6] Michelson, A. A. (1891). “Measurement of Jupiter’s Satellites by Interference.” Nature 45(1155): 160-161.

[7] Schwarzschild, K. (1896). “Über Messung von Doppelsternen durch Interferenzen.” Astron. Nachr. 3335: 139.

[8] P. Ehrenfest, “Een mechanische theorema van Boltzmann en zijne betrekking tot de quanta theorie (A mechanical theorem of Boltzmann and its relation to the theory of energy quanta),” Verslag van de Gewone Vergaderingen der Wis- en Natuurkundige Afdeeling, vol. 22, pp. 586-593, 1913.

[9] Schwarzschild, K. (1916). “Quantum hypothesis.” Sitzungsberichte Der Koniglich Preussischen Akademie Der Wissenschaften: 548-568.

[10] P. Ehrenfest, “Adiabatic invariants and quantum theory,” Annalen Der Physik, vol. 51, pp. 327-352, Oct 1916.

[11] Epstein, P. S. (1916). “The quantum theory.” Annalen Der Physik 51(18): 168-188.

[12] Einstein, A. (1915). “On the general theory of relativity.” Sitzungsberichte Der Koniglich Preussischen Akademie Der Wissenschaften: 778-786.

[13] Schwarzschild, K. (1916). “Über das Gravitationsfeld eines Massenpunktes nach der Einstein’schen Theorie.” Sitzungsberichte der Königlich-Preussischen Akademie der Wissenschaften: 189.

How to Teach General Relativity to Undergraduate Physics Majors

As a graduate student in physics at Berkeley in the 1980’s, I took General Relativity (aka GR) from Bruno Zumino, a world-famous physicist known as one of the originators of super-symmetry in quantum gravity (not to be confused with the super-asymmetry of Cooper-Fowler Big Bang Theory fame).  The class textbook was Gravitation and cosmology: principles and applications of the general theory of relativity, by Steven Weinberg, another world-famous physicist, in this case known for the unification of the weak force with electromagnetism.  With so much expertise at hand, how could I fail to absorb the simple essence of general relativity? 

The answer is that I failed miserably.  Somehow, I managed to pass the course, but I walked away with nothing!  And it bugged me for years.  What was so hard about GR?  It took me almost a decade teaching undergraduate physics classes at Purdue in the 90’s before I realized that my biggest obstacle had been language:  I kept mistaking the words and terms of GR as if they were English.  Words like “general covariance” and “contravariant” and “contraction” and “covariant derivative”.  They sounded like English, with lots of “co” prefixes that were hard to keep straight, but they actually are part of a very different language that I call Physics-ese. 

Physics-ese is a language with lots of words that sound like English, so you think you know what the words mean, but sometimes they mean the opposite of what you would guess.  And the meanings of Physics-ese are precisely defined, not something that can be left to interpretation.  I learned this while teaching the intro courses to non-majors, because so many times when the students were confused, it turned out to be because they had mistaken a textbook jargon term for English.  If you told them that the word wasn’t English, but just a token standing for a well-defined object or process, it would unshackle them from their misconceptions.

Then, in the early 00’s when I started to explore the physics of generalized trajectories related to some of my own research interests, I realized that the primary obstacle to my learning anything in the Gravitation course was Physics-ese.   So this raised the question in my mind: what would it take to teach GR to undergraduate physics majors in a relatively painless manner?  This is my answer. 

More on this topic can be found in Chapter 11 of the textbook IMD2: Introduction to Modern Dynamics, 2nd Edition, Oxford University Press, 2019

Trajectories as Flows

One of the culprits for my mind block learning GR was Newton himself.  His ubiquitous second law, taught as F = ma, is surprisingly misleading if one wants to have a more general understanding of what a trajectory is.  This is particularly the case for light paths, which can be bent by gravity, yet clearly cannot have any forces acting on them. 

The way to fix this is subtle yet simple.  First, express Newton’s second law as

$$ \frac{d\mathbf{x}}{dt} = \frac{\mathbf{p}}{m} \qquad\qquad \frac{d\mathbf{p}}{dt} = \mathbf{F} $$

which is actually closer to the way that Newton expressed the law in his Principia.  In three dimensions for a single particle, these equations represent a 6-dimensional dynamical space called phase space: three coordinate dimensions and three momentum dimensions.  Then generalize the vector quantities, like the position vector, to be expressed as x^a for the six dynamical variables: x, y, z, px, py, and pz. 

Now, as part of Physics-ese, putting the index as a superscript instead of as a subscript turns out to be a useful notation when working in higher-dimensional spaces.  This superscript is called a “contravariant index”, which sounds like English but is uninterpretable without a Physics-ese-to-English dictionary.  All “contravariant index” means is “column vector component”.  In other words, x^a is just the position vector expressed as a column vector

$$ x^a = \begin{pmatrix} x & y & z & p_x & p_y & p_z \end{pmatrix}^T $$

But seriously, dude, just forget that “contravariant” word from Physics-ese and think “index”.  You already know it’s a column vector.

Then Newton’s second law becomes

$$ \frac{dx^a}{dt} = F^a(x) $$

where the index a runs from 1 to 6, and the function F^a is a vector function of the dynamical variables.  To spell it out, this is

$$ \frac{dx}{dt} = \frac{p_x}{m} \qquad \frac{dy}{dt} = \frac{p_y}{m} \qquad \frac{dz}{dt} = \frac{p_z}{m} \qquad \frac{dp_x}{dt} = F_x \qquad \frac{dp_y}{dt} = F_y \qquad \frac{dp_z}{dt} = F_z $$

so it’s a lot easier to write it in the one-line form with the index notation. 

The simple index notation equation is in the standard form for what is called, in Physics-ese, a “mathematical flow”.  It is an ODE that can be solved for any set of initial conditions for a given trajectory.  Or a whole field of solutions can be considered in a phase-space portrait that looks like the flow lines of hydrodynamics.  The phase-space portrait captures the essential physics of the system, whether it is a rock thrown off a cliff, or a photon orbiting a black hole.  But to get to that second problem, it is necessary to look deeper into the way that space is described by any set of coordinates, especially if those coordinates are changing from location to location.
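As a minimal illustration (my own sketch; the function names are hypothetical), here is the rock-thrown-off-a-cliff case written as a six-dimensional flow dx^a/dt = F^a and integrated with a simple Euler stepper:

```python
# State x^a = (x, y, z, px, py, pz); the flow F^a for a projectile
# of mass m in uniform gravity along -z.
def flow(state, m=1.0, g=9.8):
    x, y, z, px, py, pz = state
    return (px / m, py / m, pz / m, 0.0, 0.0, -m * g)

def euler_step(state, dt):
    """Advance the phase-space point one step along the flow."""
    f = flow(state)
    return tuple(s + dt * fi for s, fi in zip(state, f))

# Drop a rock from rest and integrate for one second.
state = (0.0, 0.0, 0.0, 0.0, 0.0, 0.0)
for _ in range(10000):
    state = euler_step(state, 1e-4)
```

After one second the z-coordinate sits near -g/2 ≈ -4.9 m, as expected, and the same flow structure carries over unchanged when the right-hand side becomes the geodesic flow of GR.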

What’s so Fictitious about Fictitious Forces?

Freshmen physics students are routinely admonished for talking about “centrifugal” forces (rather than centripetal) when describing circular motion, usually with the statement that centrifugal forces are fictitious—only appearing to be forces when the observer is in the rotating frame.  The same is said for the Coriolis force.  Yet for being such a “fictitious” force, the Coriolis effect is what drives hurricanes and the colossal devastation they cause.  Try telling a hurricane victim that they were wiped out by a fictitious force!  Looking closer at the Coriolis force is a good way of understanding how taking derivatives of vectors leads to effects often called “fictitious”, yet it opens the door on some of the simpler techniques in the topic of differential geometry.

To start, consider a vector in a uniformly rotating frame.  Such a frame is called “non-inertial” because of the acceleration associated with the rotation.  For an observer in the rotating frame, vectors are attached to the frame, like pinning them down to the coordinate axes, but the axes themselves are changing in time (when viewed by an external observer in a fixed frame).  If the primed frame is the external fixed frame, then a position in the rotating frame is

$$ \mathbf{r}' = \mathbf{R} + x^a \hat{\mathbf{e}}_a $$

where R is the position vector of the origin of the rotating frame and r is the position in the rotating frame relative to the origin.  The funny notation on the last term is called in Physics-ese a “contraction”, but it is just a simple inner product, or dot product, between the components of the position vector and the basis vectors.  A basis vector is like the old-fashioned i, j, k of vector calculus indicating unit basis vectors pointing along the x, y and z axes.  The format with one index up and one down in the product means to do a summation.  This is known as the Einstein summation convention, so it’s just

$$ x^a \hat{\mathbf{e}}_a = x^1 \hat{\mathbf{e}}_1 + x^2 \hat{\mathbf{e}}_2 + x^3 \hat{\mathbf{e}}_3 $$

Taking the time derivative of the position vector gives

$$ \frac{d\mathbf{r}'}{dt} = \frac{d\mathbf{R}}{dt} + \frac{d}{dt}\left(x^a \hat{\mathbf{e}}_a\right) $$

and by the product rule this must be

$$ \frac{d\mathbf{r}'}{dt} = \frac{d\mathbf{R}}{dt} + \frac{dx^a}{dt}\,\hat{\mathbf{e}}_a + x^a \frac{d\hat{\mathbf{e}}_a}{dt} $$

where the last term has a time derivative of a basis vector.  This is non-zero because in the rotating frame the basis vectors change orientation in time.  This term is non-inertial and can be shown fairly easily (see IMD2 Chapter 1) to be

$$ x^a \frac{d\hat{\mathbf{e}}_a}{dt} = \vec{\omega} \times \mathbf{r} $$

which, after a second time derivative, is where the centrifugal force comes from.  This shows how a so-called fictitious force arises from a derivative of a basis vector.  The fascinating point is that in GR the force of gravity arises in almost the same way, making it tempting to call gravity a fictitious force, despite the fact that it can kill you if you fall out a window.  The question is, how does gravity arise from simple derivatives of basis vectors?
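The basis-vector derivative can be checked directly.  This sketch (illustrative, with my own names) compares a numerical time derivative of a rotating basis vector against ω × e:

```python
import math

def basis(t, w):
    """First two basis vectors of a frame rotating at rate w about z."""
    e1 = (math.cos(w * t), math.sin(w * t), 0.0)
    e2 = (-math.sin(w * t), math.cos(w * t), 0.0)
    return e1, e2

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

w, t, h = 2.0, 0.7, 1e-6
e1_plus, _ = basis(t + h, w)
e1_minus, _ = basis(t - h, w)
numeric = tuple((p - q) / (2 * h) for p, q in zip(e1_plus, e1_minus))
analytic = cross((0.0, 0.0, w), basis(t, w)[0])   # omega x e1
```

The two agree to numerical precision, which is the content of ė = ω × e: the “fictitious” forces are just this term fed back through Newton’s second law.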

The Geodesic Equation

To teach GR to undergraduates, you cannot expect them to have taken a course in differential geometry, because most of them just don’t have the time in their schedule to take such an advanced mathematics course.  In addition, there is far more taught in differential geometry than is needed to make progress in GR.  So the simple approach is to teach what they need to understand GR with as little differential geometry as possible, expressed with clear English-to-Physics-ese translations. 

For example, consider a vector expressed in index notation as

$$ \mathbf{V} = V^\alpha \hat{\mathbf{e}}_\alpha $$

Taking the partial derivative, using the always-necessary product rule, gives

$$ \frac{\partial \mathbf{V}}{\partial x^\beta} = \frac{\partial V^\alpha}{\partial x^\beta}\,\hat{\mathbf{e}}_\alpha + V^\alpha \frac{\partial \hat{\mathbf{e}}_\alpha}{\partial x^\beta} $$

where the second term is just like the extra time-derivative term that showed up in the derivation of the Coriolis force.  The basis vector of a general coordinate system may change size and orientation as a function of position, so this derivative is not in general zero.  Because the derivative of a basis vector is so central to the ideas of GR, it is given its own symbol:

$$ \frac{\partial \hat{\mathbf{e}}_\mu}{\partial x^\beta} = \Gamma^\alpha_{\beta\mu}\,\hat{\mathbf{e}}_\alpha $$

where the new “Gamma” symbol is called a Christoffel symbol.  It has lots of indexes, both up and down, which looks daunting, but it can be interpreted as the beta-th derivative of the alpha-th component of the mu-th basis vector.  The partial derivative is now

$$ \frac{\partial \mathbf{V}}{\partial x^\beta} = \left(\frac{\partial V^\alpha}{\partial x^\beta} + \Gamma^\alpha_{\beta\mu} V^\mu\right)\hat{\mathbf{e}}_\alpha $$

For those of you who noticed that some of the indexes flipped from alpha to mu and vice versa, you’re right!  Swapping repeated indexes in these “contractions” is allowed and helps make derivations a lot easier, which is probably why Einstein invented this notation in the first place.

The last step in taking a partial derivative of a vector is to isolate a single vector component V^α as

$$ \nabla_\beta V^\alpha = \frac{\partial V^\alpha}{\partial x^\beta} + \Gamma^\alpha_{\beta\mu} V^\mu $$

where a new symbol, the del-operator, has been introduced.  This del-operator is known as the “covariant derivative” of the vector component.  Again, forget the “covariant” part and just think “gradient”.  Namely, taking the gradient of a vector in general includes changes in the vector component as well as changes in the basis vector.

Now that you know how to take the partial derivative of a vector using Christoffel symbols, you are ready to generate the central equation of General Relativity:  The geodesic equation. 

Everyone knows that a geodesic is the shortest path between two points, like a great circle route on the globe.  But it also turns out to be the straightest path, which can be derived using an idea known as “parallel transport”.  To start, consider transporting a vector along a curve in a flat metric.  The equation describing this process is

$$\frac{DV^\alpha}{ds} = \frac{dx^\beta}{ds}\,\nabla_\beta V^\alpha$$
Because the Christoffel symbols are zero in a flat space, the covariant derivative and the partial derivative are equal, giving

$$\frac{DV^\alpha}{ds} = \frac{dx^\beta}{ds}\,\partial_\beta V^\alpha = \frac{dV^\alpha}{ds}$$
If the vector is transported parallel to itself, then there is no change in V along the curve.  Written in a general coordinate system, where the Christoffel symbols reappear, this condition is

$$\frac{dV^\alpha}{ds} + \Gamma^\alpha_{\beta\mu}\frac{dx^\beta}{ds}V^\mu = 0$$
Finally, recognizing that the tangent to the curve is

$$V^\alpha = \frac{dx^\alpha}{ds}$$
and substituting this in gives

$$\frac{d^2x^\alpha}{ds^2} + \Gamma^\alpha_{\beta\mu}\frac{dx^\beta}{ds}\frac{dx^\mu}{ds} = 0$$
This is the geodesic equation! 

Fig. 1 The geodesic equation of motion is for force-free motion through a metric space. The curvature of the trajectory is analogous to acceleration, and the generalized gradient is analogous to a force. The geodesic equation is the “F = ma” of GR.

Putting this in the standard form of a flow gives the geodesic flow equations

$$\frac{dx^\alpha}{ds} = V^\alpha, \qquad \frac{dV^\alpha}{ds} = -\Gamma^\alpha_{\beta\mu}V^\beta V^\mu$$
The flow is a set of ordinary differential equations whose solution is a curve that carries its own tangent vector onto itself.  The curve is parameterized by a parameter s that can be identified with path length.  It is the central equation of GR, because it describes how an object follows a force-free trajectory, like free fall, in any general coordinate system.  It can be applied to simple problems like the Coriolis effect, or it can be applied to seemingly difficult problems, like the trajectory of a light path past a black hole.
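The geodesic flow lends itself directly to numerical ODE solvers.  As a minimal sketch of my own (not from the text), here is the flow integrated on the unit sphere, whose only nonzero Christoffel symbols are the standard ones, Γ^θ_{φφ} = −sin θ cos θ and Γ^φ_{θφ} = Γ^φ_{φθ} = cos θ / sin θ:

```python
import numpy as np
from scipy.integrate import solve_ivp

# Geodesic flow on the unit sphere with metric ds^2 = dtheta^2 + sin^2(theta) dphi^2.
# State vector: y = [theta, phi, V_theta, V_phi].
def geodesic_flow(s, y):
    theta, phi, vtheta, vphi = y
    # dV^alpha/ds = -Gamma^alpha_{beta mu} V^beta V^mu
    dvtheta = np.sin(theta) * np.cos(theta) * vphi**2
    dvphi = -2.0 * (np.cos(theta) / np.sin(theta)) * vtheta * vphi
    return [vtheta, vphi, dvtheta, dvphi]

# Launch from the equator with a tilted initial velocity.
y0 = [np.pi / 2, 0.0, 0.3, 1.0]
sol = solve_ivp(geodesic_flow, [0, 10], y0, rtol=1e-10, atol=1e-12)

# Along a geodesic the speed g_ab V^a V^b is conserved -- a good sanity
# check on the integration.
theta, phi, vtheta, vphi = sol.y
speed2 = vtheta**2 + np.sin(theta)**2 * vphi**2
print(speed2.max() - speed2.min())   # should be ~0 to integration tolerance
```

The resulting curve is a great circle inclined to the equator, exactly the straightest possible path on the sphere.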

The Metric Connection

Arriving at the geodesic equation is a major accomplishment, and you have done it in just a few pages of this blog.  But there is still an important missing piece before we are doing General Relativity of gravitation.  We need to connect the Christoffel symbol in the geodesic equation to the warping of space-time around a gravitating object. 

The warping of space-time by matter and energy is another central piece of GR and is often the central focus of a graduate-level course on the subject.  This part of GR does have its challenges leading up to Einstein’s Field Equations that explain how matter makes space bend.  But at an undergraduate level, it is sufficient to just describe the bent coordinates as a starting point, then use the geodesic equation to solve for so many of the cool effects of black holes.

So, stating the way that matter bends space-time is as simple as writing down the length element for the Schwarzschild metric of a spherical gravitating mass as

$$ds^2 = -\left(1-\frac{R_S}{r}\right)c^2\,dt^2 + \frac{dr^2}{1-R_S/r} + r^2\,d\theta^2 + r^2\sin^2\theta\,d\varphi^2$$
where R_S = 2GM/c^2 is the Schwarzschild radius.  (The connection between the metric tensor g_{ab} and the Christoffel symbol can be found in Chapter 11 of IMD2.)  It takes only a little work to find that

$$\Gamma^\alpha_{\beta\mu} = \frac{1}{2}g^{\alpha\nu}\left(\partial_\beta g_{\nu\mu} + \partial_\mu g_{\nu\beta} - \partial_\nu g_{\beta\mu}\right)$$
This means that if we have the Schwarzschild metric, all we have to do is take first partial derivatives and we will arrive at the Christoffel symbols that go into the geodesic equation.  Solving for any type of force-free trajectory is then just a matter of solving ODEs with initial conditions (performed routinely with numerical ODE solvers in Python, Matlab, Mathematica, etc.).
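Those first partial derivatives can even be done symbolically.  Here is a short sketch of my own using sympy (conventions: index order t, r, θ, φ, units with c = 1, and Rs the Schwarzschild radius) that computes Christoffel symbols directly from the metric:

```python
import sympy as sp

# Schwarzschild metric in coordinates (t, r, theta, phi), with c = 1.
t, r, th, ph, Rs = sp.symbols('t r theta phi R_s', positive=True)
x = [t, r, th, ph]

f = 1 - Rs / r
g = sp.diag(-f, 1 / f, r**2, r**2 * sp.sin(th)**2)   # metric tensor g_{ab}
ginv = g.inv()                                        # inverse metric g^{ab}

def christoffel(a, b, m):
    """Gamma^a_{b m} = (1/2) g^{a n} (d_b g_{n m} + d_m g_{n b} - d_n g_{b m})."""
    return sp.simplify(sum(
        sp.Rational(1, 2) * ginv[a, n] *
        (sp.diff(g[n, m], x[b]) + sp.diff(g[n, b], x[m]) - sp.diff(g[b, m], x[n]))
        for n in range(4)))

# Two of the symbols that drive orbits around a gravitating mass:
print(christoffel(1, 0, 0))   # Gamma^r_{tt} = Rs (r - Rs) / (2 r^3)
print(christoffel(0, 0, 1))   # Gamma^t_{tr} = Rs / (2 r (r - Rs))
```

Feeding these symbols into the geodesic flow and handing the result to a numerical ODE solver is all it takes to trace trajectories near a black hole.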

The first problem we will tackle using the geodesic equation is the deflection of light by gravity.  This is the quintessential problem of GR because there cannot be any gravitational force on a photon, yet the path of the photon surely must bend in the presence of gravity.  This is possible through the geodesic motion of the photon through warped space time.  I’ll take up this problem in my next Blog.

Is the Future of Quantum Computing Bright?

There is a very real possibility that quantum computing is, and always will be, a technology of the future.  Yet if it is ever to be the technology of the now, then it needs two things: practical high-performance implementation and a killer app.  Both of these will require technological breakthroughs.  Whether this will be enough to make quantum computing real (commercializable) was the topic of a special symposium at the Conference on Lasers and Electro-Optics (CLEO) held in San Jose the week of May 6, 2019. 

Quantum computing is stuck in a sort of limbo between hype and hope, pitched with incredible (unbelievable?) claims, yet supported by tantalizing laboratory demonstrations. 

            The symposium had panelists from many top groups working in quantum information science, including Jerry Chow (IBM), Mikhail Lukin (Harvard), Jelena Vuckovic (Stanford), Birgitta Whaley (Berkeley) and Jungsang Kim (IonQ).  The moderator Ben Eggleton (U Sydney) posed the question to the panel: “Will Quantum Computing Actually Work?”.  My Blog for this week is a report, in part, of what they said, and also what was happening in the hallways and the scientific sessions at CLEO.  My personal view after listening and watching this past week is that the future of quantum computers is optics.

Einstein’s Photons

 It is either ironic or obvious that the central figure behind quantum computing is Albert Einstein.  It is obvious because Einstein provided the fundamental tools of quantum computing by creating both quanta and entanglement (the two key elements to any quantum computer).  It is ironic, because Einstein turned his back on quantum mechanics, and he “invented” entanglement to actually argue that it was an “incomplete science”. 

            The actual quantum revolution did not begin with Max Planck in 1900, as so many Modern Physics textbooks attest, but with Einstein in 1905.  This was his “miracle year” when he published 5 seminal papers, each of which solved one of the greatest outstanding problems in the physics of the time.  In one of those papers he used simple arguments based on statistics, combined with the properties of light emission, to propose — actually to prove — that light is composed of quanta of energy (later to be named “photons” by Gilbert Lewis in 1926).  Although Planck’s theory of blackbody radiation contained quanta implicitly through the discrete actions of his oscillators in the walls of the cavity, Planck vigorously rejected the idea that light itself came in quanta.  He even apologized for Einstein, as he was proposing Einstein for membership in the Berlin Academy, saying that he should be admitted despite his grave error of believing in light quanta.  When Millikan set out in 1914 to prove experimentally that Einstein was wrong about photons by performing exquisite experiments on the photoelectric effect, he actually ended up proving that Einstein was right after all, which brought Einstein the Nobel Prize in 1921.

            In the early 1930’s, after a series of intense and public debates with Bohr over the meaning of quantum mechanics, Einstein had had enough of the “Copenhagen Interpretation” of quantum mechanics.  In league with Schrödinger, who deeply disliked Heisenberg’s version of quantum mechanics, the two proposed two of the most iconic problems of quantum mechanics.  Schrödinger launched, as a laughable parody, his eponymously-named “Schrödinger’s Cat”, and Einstein launched what has become known as entanglement.  Each was intended to show the absurdity of quantum mechanics and drive a nail into its coffin, but each has been embraced so thoroughly by physicists that Schrödinger and Einstein are given the praise and glory for inventing these touchstones of quantum science.  Schrödinger’s cat and entanglement both lie at the heart of the problems and the promise of quantum computers.

Between Hype and Hope

Quantum computing is stuck in a sort of limbo between hype and hope, pitched with incredible (unbelievable?) claims, yet supported by tantalizing laboratory demonstrations.  In the midst of the current revival in quantum computing interest (the first wave of interest in quantum computing was in the 1990’s, see “Mind at Light Speed“), the US Congress has passed a House resolution to fund quantum computing efforts in the United States with a commitment of $1B.  This comes on the heels of commercial efforts in quantum computing by big players like IBM, Microsoft and Google, and also is partially in response to China’s substantial financial commitment to quantum information science.  These acts, and the infusion of cash, will supercharge efforts on quantum computing.  But this comes with real danger of creating a bubble.  If there is too much hype, and if the expensive efforts under-deliver, then the bubble will burst, putting quantum computing back by decades.  This has happened before, as in the telecom and fiber optics bubble of Y2K that burst in 2001.  The optics industry is still recovering from that crash nearly 20 years later.  The quantum computing community will need to be very careful in managing expectations, while also making real strides on some very difficult and long-range problems.

            This was part of what the discussion at the CLEO symposium centered on.  Despite the charge by Eggleton to “be real” and avoid the hype, there was plenty of hype going around on the panel and plenty of optimism, tempered by caution.  I admit that there is reason for cautious optimism.  Jerry Chow showed IBM’s very real quantum computer (with a very small number of qubits) that can be accessed through the cloud by anyone.  They even built a user interface to allow users to code their own quantum codes.  Jungsang Kim of IonQ was equally optimistic, showing off their trapped-atom quantum computer with dozens of trapped ions acting as individual qubits.  Admittedly Chow and Kim have vested interests in their own products, but the technology is certainly impressive.  One of the sharpest critics, Mikhail Lukin of Harvard, was surprisingly also one of the most optimistic.  He made clear that talk of scalable quantum computers in the near future is nonsense.  Yet he is part of a Harvard-MIT collaboration that has constructed a 51-qubit array of trapped atoms that sets a world record.  Although it cannot be used for quantum computing, it was used to simulate a complex many-body physics problem, and it found an answer that could not be calculated or predicted using conventional computers.

            The panel did come to a general consensus about quantum computing that highlights the specific challenges that the field will face as it is called upon to deliver on its hyperbole.  They each echoed an idea known as the “supremacy plot” which is a two-axis graph of number of qubits and number of operations (also called circuit depth).  The graph has one region that is not interesting, one region that is downright laughable (at the moment), and one final area of great hope.  The region of no interest lies in the range of large numbers of qubits but low numbers of operations, or large numbers of operations on a small number of qubits.  Each of these extremes can easily be calculated on conventional computers and hence is of no practical interest.  The region that is laughable is the area of large numbers of qubits and large numbers of operations.  No one suggested that this area can be accessed in even the next 10 years.  The region that everyone is eager to reach is the region of “quantum supremacy”.  This consists of quantum computers that have enough qubits and enough operations that they cannot be simulated by classical computers.  When asked where this region is, the panel consensus was that it would require more than 50 qubits and more than hundreds or thousands of operations.  What makes this so exciting is that there are real technologies that are now approaching this region–and they are based on light.

The Quantum Supremacy Chart: Plot of the number of qubits and the circuit depth (number of operations or gates) in a quantum computer. The red region (“Zzzzzzz”) is where classical computers can do as well. The purple region (“Ha Ha Ha”) is a dream. The middle region (“Wow”) is the region of hope, which may soon be reached by trapped atoms and optics.
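Why the frontier sits near 50 qubits is easy to see from a back-of-the-envelope count (my own sketch, not from the panel): a brute-force classical simulation must store 2^n complex amplitudes for n qubits, so memory grows exponentially.

```python
# Memory needed to store the full state vector of an n-qubit quantum
# computer on a classical machine: 2**n complex amplitudes at 16 bytes
# each (double-precision complex).
for n in (30, 40, 50, 60):
    bytes_needed = 16 * 2**n
    print(f"{n} qubits: {bytes_needed / 1e9:,.0f} GB")
```

At 30 qubits the state vector fits in a laptop's RAM; at 50 qubits it needs roughly 18 petabytes, beyond any classical machine's memory, which is why that neighborhood marks the edge of quantum supremacy.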

Chris Monroe’s Perfect Qubits

The second plenary session at CLEO featured the recent Nobel prize winners Art Ashkin, Donna Strickland and Gerard Mourou, who won the 2018 Nobel prize in physics for laser applications.  (Donna Strickland is only the third woman to win the Nobel prize in physics.)  The warm-up band for these headliners was Chris Monroe, founder of the start-up company IonQ out of the University of Maryland.  Monroe outlined the general layout of their quantum computer, which is based on trapped atoms, which he called “perfect qubits”.  Each trapped atom is literally an atomic clock with the kind of exact precision that atomic clocks come with.  The quantum properties of these atoms are as perfect as is needed for any quantum computation, and the limits on the performance of the current IonQ system are entirely caused by the classical controls that trap and manipulate the atoms.  This is where the efforts of their rapidly growing R&D team are focused.

            If trapped atoms are the perfect qubit, then the perfect quantum communication channel is the photon.  The photon in vacuum is the quintessential messenger, propagating forever and interacting with nothing.  This is why experimental cosmologists can see the photons originating from the Big Bang more than 13 billion years ago (actually from about 380,000 years after the Big Bang, when the Universe became transparent).  In a quantum computer based on trapped atoms as the gates, photons become the perfect wires.

            On the quantum supremacy chart, Monroe plotted the two main quantum computing technologies: solid state (based mainly on superconductors but also some semiconductor technology) and trapped atoms.  The challenges to solid state quantum computers come with the scale-up to the range of 50 qubits or more that will be needed to cross the frontier into quantum supremacy.  The inhomogeneous nature of solid state fabrication, as perfected as it is for the transistor, is a central problem for a solid state solution to quantum computing.  Furthermore, by scaling up the number of solid state qubits, it is extremely difficult to simultaneously increase the circuit depth.  In fact, circuit depth is likely to decrease (initially) as the number of qubits rises because of the two-dimensional interconnect problem that is well known to circuit designers.  Trapped atoms, on the other hand, have the advantages of the perfection of atomic clocks that can be globally interconnected through perfect photon channels, and scaling up the number of qubits can go together with increased circuit depth–at least in the view of Monroe, who admittedly has a vested interest.  But he was speaking before an audience of several thousand highly-trained and highly-critical optics specialists, and no scientist in front of such an audience will make a claim that cannot be supported (although the reality is always in the caveats).

The Future of Quantum Computing is Optics

The state of the art of photonic control of light now equals the sophistication of electronic control of electrons in circuits.  Each is driven by big-world applications: electronics by the consumer electronics and computer market, and photonics by the telecom industry.  Having a technology attached to a major world-wide market is a guarantee that progress is made relatively quickly with the advantages of economy of scale.  The commercial driver is profits, and the driver for funding agencies (who support quantum computing) is their mandate to foster competitive national economies that create jobs and improve standards of living.

            The yearly CLEO conference is one of the top conferences in laser science in the world, drawing in thousands of laser scientists who are working on photonic control.  Integrated optics is one of the current hot topics.  It brings many of the resources of the electronics industry to bear on photonics.  Solid state optics is mostly concerned with quantum properties of matter and its interaction with photons, and this year’s CLEO conference hosted many focused sessions on quantum sensors, quantum control, quantum information and quantum communication.  The level of external control of quantum systems is increasing at a spectacular rate.  Sitting in the audience at CLEO you get the sense that you are looking at the embryonic stages of vast new technologies that will be enlisted in the near future for quantum computing.  The challenge is, there are so many variants that it is hard to know which of these nascent technologies will win and change the world.  But the key to technological progress is diversity (as it is for society), because it is the interplay and cross-fertilization among the diverse technologies that drives each forward, and even technologies that recede away still contribute to the advances of the winning technology. 

            The expert panel at CLEO on the future of quantum computing punctuated their moments of hype with moments of realism as they called for new technologies to solve some of the current barriers to quantum computers.  Walking out of the panel discussion that night, and walking into one of the CLEO technical sessions the next day, you could almost connect the dots.  The enabling technologies being requested by the panel are literally being built by the audience.

            In the end, the panel had a surprisingly prosaic argument in favor of the current push to build a working quantum computer.  It is an echo of the movie Field of Dreams, with the famous quote “If you build it they will come”.  That was the plea made by Lukin, who argued that by putting quantum computers into the hands of users, then the killer app that will drive the future economics of quantum computers likely will emerge.  You don’t really know what to do with a quantum computer until you have one.

            Given the “perfect qubits” of trapped atoms, and the “perfect photons” of the communication channels, combined with the dizzying assortment of quantum control technologies being invented and highlighted at CLEO, it is easy to believe that the first large-scale quantum computers will be based on light.

Chandrasekhar’s Limit

Arthur Eddington was the complete package—an observationalist with the mathematical and theoretical skills to understand Einstein’s general theory, and the ability to construct the theory of the internal structure of stars.  He was Zeus in Olympus among astrophysicists.  He always had the last word, and he stood with Einstein firmly opposed to the Schwarzschild singularity.  In 1924 he published a theoretical paper in which he derived a new coordinate frame (now known as Eddington-Finkelstein coordinates) in which the singularity at the Schwarzschild radius is removed.  At the time, he took this to mean that the singularity did not exist and that gravitational cut off was not possible [1].  It would seem that the possibility of dark stars (black holes) had been put to rest.  Both Eddington and Einstein said so!  But just as they were writing the obituary of black holes, a strange new form of matter was emerging from astronomical observations that would challenge the views of these giants.

Something wonderful, but also a little scary, happened when Chandrasekhar included the relativistic effects in his calculation.

White Dwarf

Binary star systems have always held a certain fascination for astronomers.  If your field of study is the (mostly) immutable stars, then the stars that do move provide some excitement.  The attraction of binaries is the same thing that makes them important astrophysically—they are dynamic.  While many double stars are observed in the night sky (a few had been noted by Galileo), some of these are just coincidental alignments of near and far stars.  However, William Herschel began cataloging binary stars in 1779 and became convinced in 1802 that at least some of them must be gravitationally bound to each other.  He carefully measured the positions of binary stars over many years and confirmed that these stars showed relative changes in position, proving that they were gravitationally bound binary star systems [2].  The first orbit of a binary star was computed in 1827 by Félix Savary for the orbit of Xi Ursae Majoris.  Finding the orbit of a binary star system provides a treasure trove of useful information about the pair of stars.  Not only can the masses of the stars be determined, but their radii and densities also can be estimated.  Furthermore, by combining this information with the distance to the binaries, it was possible to develop a relationship between mass and luminosity for all stars, even single stars.  Therefore, binaries became a form of measuring stick for crucial stellar properties.

Comparison of Earth to a white dwarf star with a mass equal to the Sun. They have comparable radii but radically different densities.

One of the binary star systems that Herschel discovered was the pair known as 40 Eridani B/C, which he observed on January 31 in 1783.  Of this pair, 40 Eridani B was very dim compared to its companion.  More than a century later, in 1910 when spectrographs were first being used routinely on large telescopes, the spectrum of 40 Eridani B was found to be of an unusual white spectral class.  In the same year, the low luminosity companion of Sirius, known as Sirius B, which shared the same unusual white spectral class, was evaluated in terms of its size and mass and was found to be exceptionally small and dense [3].  In fact, it was too small and too dense to be believed at first, because the densities were beyond any known or even conceivable matter.  The mass of Sirius B is around the mass of the Sun, but its radius is comparable to the radius of the Earth, making the density of the white star about ten thousand times greater than that of the core of the Sun.  Eddington at first felt the same way about white dwarfs that he felt about black holes, but he was eventually swayed by the astrophysical evidence.  By 1922 many of these small white stars had been discovered, called white dwarfs, and their incredibly large densities had been firmly established.  In his famous book on stellar structure [4], he noted the strange paradox:  As a star cools, its pressure must decrease, as all gases must do as they cool, and the star would shrink, yet the pressure required to balance the force of gravity to stabilize the star against continued shrinkage must increase as the star gets smaller.  How can pressure decrease and yet increase at the same time?  In 1926, on the eve of the birth of quantum mechanics, Eddington could conceive of no mechanism that could resolve this paradox.  So he noted it as an open problem in his book and sent it to press.

Subrahmanyan Chandrasekhar

Three years after the publication of Eddington’s book, an eager and excited nineteen-year-old graduate of the University of Madras, India, boarded a steamer bound for England.  Subrahmanyan Chandrasekhar (1910–1995) had been accepted for graduate studies at Cambridge University.  The voyage in 1930 took eighteen days via the Suez Canal, and he needed something to do to pass the time.  He had with him Eddington’s book, which he carried like a bible, and he also had a copy of a breakthrough article written by R. H. Fowler that applied the new theory of quantum mechanics to the problem of dense matter composed of ions and electrons [5].  Fowler showed how the Pauli exclusion principle for electrons, that obeyed Fermi-Dirac statistics, created an energetic sea of electrons in their lowest energy state, called electron degeneracy.  This degeneracy was a fundamental quantum property of matter, and carried with it an intrinsic pressure unrelated to thermal properties.  Chandrasekhar realized that this was a pressure mechanism that could balance the force of gravity in a cooling star and might resolve Eddington’s paradox of the white dwarfs.  As the steamer moved ever closer to England, Chandrasekhar derived the new balance between gravitational pressure and electron degeneracy pressure and found the radius of the white dwarf as a function of its mass.  The critical step in Chandrasekhar’s theory, conceived alone on the steamer at sea with access to just a handful of books and papers, was the inclusion of special relativity with the quantum physics.  This was necessary, because the densities were so high and the electrons were so energetic, that they attained speeds approaching the speed of light. 

Something wonderful, but also a little scary, happened when Chandrasekhar included the relativistic effects in his calculation.  He discovered that electron degeneracy pressure could balance the force of gravity if the mass of the white dwarf were smaller than about 1.4 times the mass of the Sun.  But if the dwarf was more massive than this, then even the electron degeneracy pressure would be insufficient to fight gravity, and the star would continue to collapse.  To what?  Schwarzschild’s singularity was one possibility.  Chandrasekhar wrote up two papers on his calculations, and when he arrived in England, he showed them to Fowler, who was to be his advisor at Cambridge.  Fowler was genuinely enthusiastic about  the first paper, on the derivation of the relativistic electron degeneracy pressure, and it was submitted for publication.  The second paper, on the maximum sustainable mass for a white dwarf, which reared the ugly head of Schwarzschild’s singularity, made Fowler uncomfortable, and he sat on the paper, unwilling to give his approval for publication in the leading British astrophysical journal.  Chandrasekhar grew annoyed, and in frustration sent it, without Fowler’s approval, to an American journal, where “The Maximum Mass of Ideal White Dwarfs” was published in 1931 [6].  This paper, written in eighteen days on a steamer at sea, established what became known as the Chandrasekhar limit, for which Chandrasekhar would win the 1983 Nobel Prize in Physics, but not before he was forced to fight major battles for its acceptance.

The Chandrasekhar limit expressed in terms of the Planck Mass and the mass of a proton. The limit is approximately 1.4 times the mass of the Sun. White dwarfs with masses larger than the limit cannot balance gravitational collapse by relativistic electron degeneracy.

Chandrasekhar versus Eddington

Initially there was almost no response to Chandrasekhar’s paper.  Frankly, few astronomers had the theoretical training needed to understand the physics.  Eddington was one exception, which was why he held such stature in the community.  The big question therefore was:  Was Chandrasekhar’s theory correct?  During the three years to obtain his PhD, Chandrasekhar met frequently with Eddington, who was also at Cambridge, and with colleagues outside the university, and they all encouraged Chandrasekhar to tackle the more difficult problem to combine internal stellar structure with his theory.  This could not be done with pen and paper, but required numerical calculation.  Eddington was in possession of an early mechanical calculator, and he loaned it to Chandrasekhar to do the calculations.  After many months of tedious work, Chandrasekhar was finally ready to confirm his theory at the January 1935 meeting of the Royal Astronomical Society. 

The young Chandrasekhar stood up and gave his results in an impeccable presentation before an auditorium crowded with his peers.  But as he left the stage, he was shocked when Eddington himself rose to give the next presentation.  Eddington proceeded to criticize and reject Chandrasekhar’s careful work, proposing instead a garbled mash-up of quantum theory and relativity that would eliminate Chandrasekhar’s limit and hence prevent collapse to the Schwarzschild singularity.  Chandrasekhar sat mortified in the audience.  After the session, many of his friends and colleagues came up to him to give their condolences—if Eddington, the leader of the field and one of the few astronomers who understood Einstein’s theories, said that Chandrasekhar was wrong, then that was that.  Badly wounded, Chandrasekhar was faced with a dire choice.  Should he fight against the reputation of Eddington, fight for the truth of his theory?  But he was at the beginning of his career and could ill afford to pit himself against the giant.  So he turned his back on the problem of stellar death, and applied his talents to the problem of stellar evolution. 

Chandrasekhar went on to have an illustrious career, spent mostly at the University of Chicago (far from Cambridge), and he did eventually return to his limit as it became clear that Eddington was wrong.  In fact, many at the time already suspected Eddington was wrong and were seeking for the answer to the next question: If white dwarfs cannot support themselves under gravity and must collapse, what do they collapse to?  In Pasadena at the California Institute of Technology, an astrophysicist named Fritz Zwicky thought he knew the answer.

Fritz Zwicky’s Neutron Star

Fritz Zwicky (1898–1974) was an irritating and badly flawed genius.  What made him so irritating was that he knew he was a genius and never let anyone forget it.  What made him badly flawed was that he never cared much for weight of evidence.  It was the ideas that mattered—let lesser minds do the tedious work of filling in the cracks.  And what made him a genius was that he was often right!  Zwicky pushed the envelope—he loved extremes.  The more extreme a theory was, the more likely he was to favor it—like his proposal for dark matter.  Most of his colleagues considered him to be a buffoon and borderline crackpot.  He was tolerated by no one—no one except his steadfast collaborator of many years Walter Baade (until they nearly came to blows on the eve of World War II).  Baade was a German physicist trained at Göttingen and recently arrived at Caltech.  He was exceptionally well informed on the latest advances in a broad range of fields.  Where Zwicky made intuitive leaps, often unsupported by evidence, Baade would provide the context.  Baade was a walking Wikipedia for Zwicky, and together they changed the face of astrophysics.

Zwicky and Baade submitted an abstract to the American Physical Society Meeting in 1933, which Kip Thorne has called “…one of the most prescient documents in the history of physics and astronomy” [7].  In the abstract, Zwicky and Baade introduced, for the first time, the existence of supernovae as a separate class of nova and estimated the total energy output of these cataclysmic events, including the possibility that they are the source of some cosmic rays.  They introduced the idea of a neutron star, a star composed purely of neutrons, only a year after Chadwick discovered the neutron’s existence, and they strongly suggested that a supernova is produced by the transformation of a star into a neutron star.  A neutron star would have a mass similar to that of the Sun, but would have a radius of only tens of kilometers.  If the mass density of white dwarfs was hard to swallow, the density of a neutron star was a billion times greater!  It would take nearly thirty years before each of the assertions made in this short abstract was proven true, but Zwicky certainly had a clear view, tempered by Baade, of where the field of astrophysics was headed.  But no one listened to Zwicky.  He was too aggressive and backed up his wild assertions with too little substance.  Therefore, neutron stars simmered on the back burner until more substantial physicists could address their properties more seriously.

Two substantial physicists who had the talent and skills that Zwicky lacked were Lev Landau in Moscow and Robert Oppenheimer at Berkeley.  Landau derived the properties of a neutron star in 1937 and published the results to great fanfare.  He was not aware of Zwicky’s work, and he called them neutron cores, because he hypothesized that they might reside at the core of ordinary stars like the Sun.  Oppenheimer, working with a Canadian graduate student George Volkoff at Berkeley, showed that Landau’s idea about stellar cores was not correct, but that the general idea of a neutron core, or rather neutron star, was correct [8].  Once Oppenheimer was interested in neutron stars, he kept going and asked the same question about neutron stars that Chandrasekhar had asked about white dwarfs:  Is there a maximum size for neutron stars beyond which they must collapse?  The answer to this question used the same quantum mechanical degeneracy pressure (now provided by neutrons rather than electrons) and gravitational compaction as the problem of white dwarfs, but it required detailed understanding of nuclear forces, which in 1938 were only beginning to be understood.  However, Oppenheimer knew enough to make a good estimate of the nuclear binding contribution to the total internal pressure and came to a similar conclusion for neutron stars as Chandrasekhar had made for white dwarfs.  There was indeed a maximum mass of a neutron star, a Chandrasekhar-type limit of about three solar masses.  Beyond this mass, even the degeneracy pressure of neutrons could not support gravitational pressure, and the neutron star must collapse.  In Oppenheimer’s mind it was clear what it must collapse to—a black hole (known as gravitational cut-off at that time). This was to lead Oppenheimer and John Wheeler to their famous confrontation over the existence of black holes, which Oppenheimer won, but Wheeler took possession of the battle field [9].

Derivation of the Relativistic Chandrasekhar Limit

White dwarfs are created from the balance between gravitational compression and the degeneracy pressure of electrons caused by the Pauli exclusion principle. When a star collapses gravitationally, the matter becomes so dense that the electrons begin to fill up quantum states until all the lowest-energy states are filled and no more electrons can be added. This results in a balance that stabilizes the gravitational collapse, and the result is a white dwarf with a mass density a million times larger than that of the Sun.

If the electrons remained non-relativistic, there would be no upper limit to the size of a star that could form a white dwarf. However, electrons become relativistic at sufficiently high compaction. If the initial star is too massive, the electron degeneracy pressure is capped relativistically and can no longer resist further compaction, and even the white dwarf will collapse (to a neutron star or a black hole). The largest mass that can be supported as a white dwarf is known as the Chandrasekhar limit.

A simplified derivation of the Chandrasekhar limit begins by defining the total energy as the kinetic energy of the degenerate Fermi electron gas plus the gravitational potential energy

The kinetic energy of the degenerate Fermi gas has the relativistic expression

where the Fermi k-vector can be expressed as a function of the radius of the white dwarf and the total number of electrons in the star, as

If the star is composed of pure hydrogen, then the mass of the star is expressed in terms of the total number of electrons and the mass of the proton

The total energy of the white dwarf is minimized by taking its derivative with respect to the radius of the star

When the derivative is set to zero, the term in brackets becomes

This is solved for the radius for which the electron degeneracy pressure stabilizes the gravitational pressure

This is the relativistic radius-mass expression for the size of the stabilized white dwarf as a function of the mass (or total number of electrons). One of the astonishing results of this calculation is the merging of astronomically large numbers (the mass of stars) with both relativity and quantum physics. The radius of the white dwarf is actually expressed as a multiple of the Compton wavelength of the electron!

The expression in the square root becomes smaller as the mass of the star increases, and there is an upper bound to the mass of the star beyond which the argument in the square root goes negative. This upper bound is the Chandrasekhar limit, defined when the argument equals zero

This gives the final expression for the Chandrasekhar limit (expressed in terms of the Planck mass)

This expression is only approximate, but it contains the essential physics and the correct order of magnitude. The limit is on the order of a solar mass. A more realistic numerical calculation yields a limiting mass of about 1.4 times the mass of the Sun. For white dwarfs more massive than this limit, the electron degeneracy pressure is insufficient to resist the gravitational compression, and the star will collapse to a neutron star or a black hole.
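The displayed equations in this derivation did not survive the move to this page. What follows is a schematic reconstruction of the steps described above, approximating each electron’s energy by its relativistic Fermi energy and writing the gravitational self-energy with an order-unity factor α (α = 3/5 for a uniform sphere); the numerical prefactors are therefore indicative only:

```latex
% Total energy: relativistic degenerate electrons plus gravity
E(R) = N\sqrt{(\hbar c\,k_F)^2 + (m_e c^2)^2} \;-\; \alpha\,\frac{G M^2}{R},
\qquad \alpha \simeq \tfrac{3}{5}

% Fermi k-vector for N electrons in a sphere of radius R; pure hydrogen
k_F = \left(3\pi^2\,\frac{N}{\tfrac{4}{3}\pi R^3}\right)^{1/3}
    = \left(\frac{9\pi N}{4}\right)^{1/3}\frac{1}{R},
\qquad M = N m_p

% Setting dE/dR = 0 and solving for the stabilized radius
R = \frac{\hbar}{m_e c}\left(\frac{9\pi N}{4}\right)^{1/3}
    \sqrt{\left(\frac{N_C}{N}\right)^{4/3} - 1}

% The square root vanishes at the limiting electron number and mass
N_C = \left(\frac{9\pi}{4}\right)^{1/2}
      \left(\frac{\hbar c}{\alpha\,G\,m_p^2}\right)^{3/2},
\qquad
M_C = N_C\,m_p \sim \frac{m_{\mathrm{Pl}}^3}{m_p^2},
\qquad m_{\mathrm{Pl}} = \sqrt{\frac{\hbar c}{G}}
```

The prefactor ħ/(mₑc) is the (reduced) Compton wavelength of the electron, as remarked above, and the combination m_Pl³/m_p² evaluates to roughly 1.9 solar masses, within an order of magnitude of the realistic 1.4 solar-mass limit.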

[1] The fact that Eddington coordinates removed the singularity at the Schwarzschild radius was first pointed out by Lemaître in 1933.  A local observer passing through the Schwarzschild radius would experience no divergence in local properties, even though a distant observer would see that in-falling observer becoming length contracted and time dilated. This point of view of an in-falling observer was explained in 1958 by Finkelstein, who also pointed out that the Schwarzschild radius is an event horizon.

[2] William Herschel (1803), Account of the Changes That Have Happened, during the Last Twenty-Five Years, in the Relative Situation of Double-Stars; With an Investigation of the Cause to Which They Are Owing, Philosophical Transactions of the Royal Society of London 93, pp. 339–382 (Motion of binary stars)

[3] Boss, L. (1910). Preliminary General Catalogue of 6188 stars for the epoch 1900. Carnegie Institution of Washington. (Mass and radius of Sirius B)

[4] Eddington, A. S. (1927). Stars and Atoms. Clarendon Press. LCCN 27015694.

[5] Fowler, R. H. (1926). “On dense matter”. Monthly Notices of the Royal Astronomical Society 87: 114. Bibcode:1926MNRAS..87..114F. (Quantum mechanics of degenerate matter).

[6] Chandrasekhar, S. (1931). “The Maximum Mass of Ideal White Dwarfs”. The Astrophysical Journal 74: 81. Bibcode:1931ApJ....74...81C. doi:10.1086/143324. (Mass limit of white dwarfs).

[7] Kip Thorne (1994) Black Holes & Time Warps: Einstein’s Outrageous Legacy (Norton). pg. 174

[8] Oppenheimer was aware of Zwicky’s proposal because he had a joint appointment between Berkeley and Cal Tech.

[9] See Chapter 7, “The Lens of Gravity” in Galileo Unbound: A Path Across Life, the Universe and Everything (Oxford University Press, 2018).

A Wealth of Motions: Six Generations in the History of the Physics of Motion


Since Galileo launched his trajectory, there have been six broad generations that have traced the continuing development of concepts of motion. These are: 1) Universal Motion; 2) Phase Space; 3) Space-Time; 4) Geometric Dynamics; 5) Quantum Coherence; and 6) Complex Systems. These six generations were not all sequential, many evolving in parallel over the centuries, borrowing from each other, and there surely are other ways one could divide up the story of dynamics. But these six generations capture the grand concepts and the crucial paradigm shifts that are Galileo’s legacy, taking us from Galileo’s trajectory to the broad expanses across which physicists practice physics today.

Universal Motion emerged as a new concept when Isaac Newton proposed his theory of universal gravitation by which the force that causes apples to drop from trees is the same force that keeps the Moon in motion around the Earth, and the Earth in motion around the Sun. This was a bold step because even in Newton’s day, some still believed that celestial objects obeyed different laws. For instance, it was only through the work of Edmond Halley, a contemporary and friend of Newton’s, that comets were understood to travel in elliptical orbits obeying the same laws as the planets. Universal Motion included ideas of momentum from the start, while concepts of energy and potential, which fill out this first generation, took nearly a century to develop in the hands of many others, like Leibniz and Euler and the Bernoullis. This first generation was concluded by the masterwork of the Italian-French mathematician Joseph-Louis Lagrange, who also planted the seed of the second generation.

The second generation, culminating in the powerful and useful Phase Space, also took more than a century to mature. It began when Lagrange divorced dynamics from geometry, establishing generalized coordinates as surrogates to directions in space. Ironically, by discarding geometry, Lagrange laid the foundation for generalized spaces, because generalized coordinates could be anything, coming in any units and in any number, each coordinate having its companion velocity, doubling the dimension for every degree of freedom. The Austrian physicist Ludwig Boltzmann expanded the number of dimensions to the scale of Avogadro’s number of particles, and he discovered the conservation of phase space volume, a quantity that stays the same even as 10²³ atoms (Avogadro’s number) in ideal gases follow their random trajectories. The idea of phase space set the stage for statistical mechanics and for a new probabilistic viewpoint of mechanics that would extend into chaotic motions.
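Boltzmann’s invariance, the conservation of phase-space volume (Liouville’s theorem), can be seen in miniature with a single harmonic oscillator, whose Hamiltonian flow is an exact rotation of the (q, p) plane. The sketch below is my own toy illustration, with arbitrarily chosen values, not anything from the historical record: it evolves a small triangular patch of initial conditions and confirms that its area is unchanged.

```python
# Liouville's theorem in miniature: Hamiltonian flow preserves phase-space
# volume. For H = (p**2 + q**2)/2 the flow is a rotation of the (q, p)
# plane, so any patch of initial conditions keeps its area exactly.
import math

def evolve(q, p, t):
    """Exact harmonic-oscillator flow (unit mass and unit frequency)."""
    c, s = math.cos(t), math.sin(t)
    return c * q + s * p, -s * q + c * p

def triangle_area(pts):
    """Area of a triangle from three (q, p) corner points."""
    (x1, y1), (x2, y2), (x3, y3) = pts
    return abs((x2 - x1) * (y3 - y1) - (x3 - x1) * (y2 - y1)) / 2.0

initial = [(1.0, 0.0), (1.2, 0.0), (1.0, 0.3)]     # small patch of states
final = [evolve(q, p, t=7.7) for q, p in initial]  # evolve for a while

if __name__ == "__main__":
    print(triangle_area(initial), triangle_area(final))  # areas agree
```

The same bookkeeping, scaled up to Avogadro’s number of coordinates, is what Boltzmann’s theorem guarantees for an ideal gas.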

The French mathematician Henri Poincaré got a glimpse of chaotic motion in 1890 as he rushed to correct an embarrassing mistake in his manuscript that had just won a major international prize. The mistake was mathematical, but the consequences were profoundly physical, beginning the long road to a theory of chaos that simmered, without boiling, for nearly seventy years until computers became common lab equipment. Edward Lorenz of MIT, working on models of the atmosphere in the early 1960s, used one of the earliest scientific computers to expose the beauty and the complexity of chaotic systems. He discovered that the computer simulations were exponentially sensitive to the initial conditions, and the joke became that a butterfly flapping its wings in China could cause hurricanes in the Atlantic. In his computer simulations, Lorenz discovered what today is known as the Lorenz butterfly, an example of something called a “strange attractor”. But the term chaos is a bit of a misnomer, because chaos theory is primarily about finding what is shared in common, or invariant, among seemingly random-acting systems.
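Lorenz’s exponential sensitivity is easy to reproduce numerically. The sketch below is my own illustration, not Lorenz’s original code: it integrates two copies of the Lorenz system, using his classic parameter values and a simple fixed-step Runge–Kutta integrator, starting one part in a million apart, and records how far apart they drift.

```python
# Sensitive dependence on initial conditions in the Lorenz system.
# Two trajectories that start 1e-6 apart end up on completely different
# parts of the attractor. Classic parameters; fixed-step RK4 integrator.

def lorenz(state, sigma=10.0, rho=28.0, beta=8.0 / 3.0):
    x, y, z = state
    return (sigma * (y - x), x * (rho - z) - y, x * y - beta * z)

def rk4_step(f, s, dt):
    """One fourth-order Runge-Kutta step for a tuple-valued ODE."""
    k1 = f(s)
    k2 = f(tuple(si + 0.5 * dt * ki for si, ki in zip(s, k1)))
    k3 = f(tuple(si + 0.5 * dt * ki for si, ki in zip(s, k2)))
    k4 = f(tuple(si + dt * ki for si, ki in zip(s, k3)))
    return tuple(si + dt / 6.0 * (a + 2 * b + 2 * c + d)
                 for si, a, b, c, d in zip(s, k1, k2, k3, k4))

def max_separation(t_end=40.0, dt=0.001, delta=1e-6):
    """Largest distance reached between two nearly identical trajectories."""
    a = (1.0, 1.0, 1.0)
    b = (1.0 + delta, 1.0, 1.0)   # perturbed by one part in a million
    biggest = delta
    for _ in range(int(t_end / dt)):
        a = rk4_step(lorenz, a, dt)
        b = rk4_step(lorenz, b, dt)
        d = sum((ai - bi) ** 2 for ai, bi in zip(a, b)) ** 0.5
        biggest = max(biggest, d)
    return biggest

if __name__ == "__main__":
    # The butterfly effect: a 1e-6 perturbation grows until the two
    # trajectories are as far apart as the attractor allows.
    print(max_separation())
```

Shrinking `delta` only delays the divergence; it never prevents it, which is why long-range weather forecasting has a hard horizon.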

The third generation in concepts of motion, Space-Time, is indelibly linked with Einstein’s special theory of relativity, but Einstein was not its originator. Space-time was the brainchild of the gifted but short-lived Prussian mathematician Hermann Minkowski, who had been attracted from Königsberg to the mathematical powerhouse at the University of Göttingen, Germany around the turn of the 20th Century by David Hilbert. Minkowski was an expert in invariant theory, and when Einstein published his special theory of relativity in 1905 to explain the Lorentz transformations, Minkowski recognized a subtle structure buried inside the theory. This structure was related to Riemann’s metric theory of geometry, but it had the radical feature that time appeared as one of the geometric dimensions. This was a drastic departure from all former theories of motion that had always separated space and time: trajectories had been points in space that traced out a continuous curve as a function of time. But in Minkowski’s mind, trajectories were invariant curves, and although their mathematical representation changed with changing point of view (relative motion of observers), the trajectories existed in a separate unchanging reality, not mere functions of time, but eternal. He called these trajectories world lines. They were static structures in a geometry that is today called Minkowski space. Einstein at first was highly antagonistic to this new view, but he relented, and later he so completely adopted space-time in his general theory that today Minkowski is almost forgotten, his echo heard softly in expressions of the Minkowski metric that is the background to Einstein’s warped geometry that bends light and captures errant spacecraft.

The fourth generation in the development of concepts of motion, Geometric Dynamics, began when an ambitious French physicist with delusions of grandeur, the historically ambiguous Pierre Louis Maupertuis, returned from a scientific boondoggle to Lapland where he measured the flattening of the Earth in defense of Newtonian physics over Cartesian. Skyrocketed to fame by the success of the expedition, he began his second act by proposing the Principle of Least Action, a principle by which all motion seeks to be most efficient by taking a geometric path that minimizes a physical quantity called action. In this principle, Maupertuis saw both a universal law that could explain all of physical motion and a path for himself to gain eternal fame in the company of Galileo and Newton. Unfortunately, his high hopes were dashed through personal conceit and nasty intrigue, and most physicists today don’t even recognize his name. But the idea of least action struck a deep chord that reverberates throughout physics. It is the first and fundamental example of a minimum principle, of which there are many. For instance, minimum potential energy identifies points of system equilibrium, and paths of minimum distances are geodesic paths. In dynamics, minimization of the difference between kinetic and potential energies identifies the dynamical paths of trajectories, and minimization of distance through space-time warped by mass and energy density identifies the paths of falling objects.
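The minimization that picks out dynamical paths can be checked numerically. In the sketch below, an illustration of the principle with my own variable names and values, the action of a ball thrown straight up, the time integral of kinetic minus potential energy, is computed along the true parabolic path and along wiggled paths with the same endpoints; every wiggle raises the action.

```python
# Principle of Least Action, checked numerically for a ball in uniform
# gravity: among all paths x(t) with fixed endpoints, the true parabola
# minimizes the action S = integral of (kinetic - potential) dt.
import math

g, m, T, N = 9.8, 1.0, 2.0, 2000   # gravity, mass, flight time, time steps
dt = T / N
ts = [i * dt for i in range(N + 1)]

def action(path):
    """Discretized action: sum of (0.5*m*v**2 - m*g*x) * dt along the path."""
    S = 0.0
    for i in range(N):
        v = (path[i + 1] - path[i]) / dt    # velocity on the segment
        x = 0.5 * (path[i + 1] + path[i])   # midpoint height
        S += (0.5 * m * v * v - m * g * x) * dt
    return S

# True path: launched at t=0, lands at t=T  ->  x(t) = 0.5*g*t*(T - t)
true_path = [0.5 * g * t * (T - t) for t in ts]

def wiggled(eps):
    """Same endpoints, plus a sine-shaped detour of amplitude eps."""
    return [x + eps * math.sin(math.pi * t / T) for x, t in zip(true_path, ts)]

S0 = action(true_path)
for eps in (0.5, -0.5, 2.0):
    assert action(wiggled(eps)) > S0   # every detour costs extra action

if __name__ == "__main__":
    print(S0)  # close to the exact value -m*g**2*T**3/24
```

For this potential any endpoint-fixed detour strictly increases the action, which is why the parabola, and no other curve, is the observed trajectory.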

Maupertuis’ fundamentally important idea was picked up by Euler and Lagrange and later expanded through the language of differential geometry. This was the language of Bernhard Riemann, a gifted and shy German mathematician whose mathematical language was adopted by physicists to describe motion as a geodesic, the shortest path, like a great-circle route on the Earth, in an abstract dynamical space defined by kinetic energy and potentials. In this view, it is the geometry of the abstract dynamical space that imposes Galileo’s simple parabolic form on freely falling objects. Einstein took this viewpoint farther than any before him, showing how mass and energy warped space and how free objects near gravitating bodies move along geodesic curves defined by the shape of space. This brought trajectories to a new level of abstraction, as space itself became the cause of motion. Prior to general relativity, motion occurred in space. Afterwards, motion was caused by space. In this sense, gravity is not a force, but is like a path down which everything falls.

The fifth generation of concepts of motion, Quantum Coherence, increased abstraction yet again in the comprehension of trajectories, ushering in difficult concepts like wave-particle duality and quantum interference. Quantum interference underlies many of the counter-intuitive properties of quantum systems, including the possibility for quantum systems to be in two or more states at the same time, and for quantum computers to crack unbreakable codes. But this new perspective came with a cost, introducing fundamental uncertainties that are locked in a battle of trade-offs as one measurement becomes more certain and others become more uncertain.

Einstein distrusted Heisenberg’s uncertainty principle, not that he disagreed with its veracity, but he felt it was more a statement of ignorance than a statement of fundamental unknowability. In support of Einstein, Schrödinger devised a thought experiment that was meant to be a reduction to absurdity in which a cat is placed in a box with a vial of poison that would be broken if a quantum particle decays. The cruel fate of Schrödinger’s cat, who might or might not be poisoned, hinges on whether or not someone opens the lid and looks inside. Once the box is opened, there is one world in which the cat is alive and another world in which the cat is dead. These two worlds spring into existence when the box is opened—a bizarre state of affairs from the point of view of a pragmatist. This is where Richard Feynman jumped into the fray and redefined the idea of a trajectory in a radically new way by showing that a quantum trajectory is not a single path, like Galileo’s parabola, but the combined effect of the quantum particle taking all possible paths simultaneously. Feynman established this new view of quantum trajectories in his doctoral dissertation under the direction of John Archibald Wheeler at Princeton. By adapting Maupertuis’ Principle of Least Action to quantum mechanics, Feynman showed how every particle takes every possible path—simultaneously—every path interfering in such a way that only the path with the most constructive interference is observed. In the quantum view, the deterministic trajectory of the cannon ball evaporates into a cloud of probable trajectories.

In our current complex times, the sixth generation in the evolution of concepts of motion explores Complex Systems. Lorenz’s Butterfly has more to it than butterflies, because Life is the greatest complex system of our experience and our existence. We are the end result of a cascade of self-organizing events that began half a billion years after Earth coalesced out of the nebula, leading to the emergence of consciousness only about 100,000 years ago—a fact that lets us sit here now and wonder about it all. That we are conscious is perhaps no accident. Ever since the first amino acids coagulated in a muddy pool, we have been marching steadily uphill, up a high mountain peak in a fitness landscape. Every advantage a species gained over its environment and over its competitors exerted a type of pressure on all the other species in the ecosystem that caused them to gain their own advantage.

The modern field of evolutionary dynamics spans a wide range of scales across a wide range of abstractions. It treats genes and mutations on DNA in much the same way it treats the slow drift of languages and the emergence of new dialects. It treats games and social interactions the same way it does the evolution of cancer. Evolutionary dynamics is the direct descendant of chaos theory that turned butterflies into hurricanes, but the topics it treats are special to us as evolved species, and as potential victims of disease. The theory has evolved its own visualizations, such as the branches in the tree of life and the high mountain tops in fitness landscapes separated by deep valleys. Evolutionary dynamics draws, in a fundamental way, on dynamic processes in high dimensions, without which it would be impossible to explain how something as complex as human beings could have arisen from random mutations.

These six generations in the development of dynamics are not likely to stop, and future generations may arise as physicists pursue the eternal quest for the truth behind the structure of reality.