One of the hardest aspects to grasp about relativity theory is the question of whether an event “looks as if” it is doing something, or whether it “actually is” doing something.
Take, for instance, the classic twin paradox of relativity theory in which there are twins who wear identical high-precision wrist watches. One of them rockets off to Alpha Centauri at relativistic speeds and returns while the other twin stays on Earth. Each twin sees the other twin’s clock running slowly because of relativistic time dilation. Yet when they get back together and, standing side-by-side, they compare their watches—the twin who went to Alpha Centauri is actually younger than the other, despite the paradox. The relativistic effect of time dilation is “real”, not just apparent, regardless of whether they come back together to do the comparison.
Yet this understanding of relativistic effects took many years, even decades, to gain acceptance after Einstein proposed them. He was aware himself that key experiments were required to prove that relativistic effects are real and not just apparent.
Einstein and the Transverse Doppler Effect
In 1905 Einstein used his new theory of special relativity to predict observable consequences that included a general treatment of the relativistic Doppler effect. This included the effects of time dilation in addition to the longitudinal effect of the source chasing the wave. Time dilation produced a correction to Doppler’s original expression for the longitudinal effect that became significant at speeds approaching the speed of light. More significantly, it predicted a transverse Doppler effect for a source moving along a line perpendicular to the line of sight to an observer. This effect had not been predicted either by Christian Doppler (1803 – 1853) or by Woldemar Voigt (1850 – 1919).
Despite the generally positive reception of Einstein’s theory of special relativity, some of its consequences were anathema to many physicists at the time. A key stumbling block was the question of whether relativistic effects, like moving clocks running slowly, were only apparent or were actually real, and Einstein had to fight to convince others of their reality. When Johannes Stark (1874 – 1957) observed Doppler line shifts in ion beams called “canal rays” in 1906 (Stark received the 1919 Nobel prize in part for this discovery), Einstein promptly published a paper suggesting how the canal rays could be used in a transverse geometry to directly detect time dilation through the transverse Doppler effect. Thirty years passed before the experiment was performed with sufficient accuracy by Herbert Ives and G. R. Stilwell in 1938 to measure the transverse Doppler effect. Ironically, even at this late date, Ives and Stilwell were convinced that their experiment had disproved Einstein’s time dilation by supporting Lorentz’ contraction theory of the electron. The Ives-Stilwell experiment was the first direct test of time dilation, followed in 1940 by muon lifetime measurements.
A) Transverse Doppler Shift Relative to Emission Angle
The Doppler effect varies between blue shifts in the forward direction to red shifts in the backward direction, with a smooth variation in Doppler shift as a function of the emission angle. Consider the configuration shown in Fig. 1 for light emitted from a source moving at speed v and emitting at an angle θ0 in the receiver frame. The source moves a distance vT in the time of a single emission cycle (assume a harmonic wave). In that time T (which is the period of oscillation of the light source — or the period of a clock if we think of it putting out light pulses) the light travels a distance cT before another cycle begins (or another pulse is emitted).
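The observed wavelength in the receiver frame is thus given by

$$\lambda = cT - vT\cos\theta_0$$

where T is the emission period of the moving source. Importantly, the emission period is time dilated relative to the proper emission time $T_0$ of the source

$$T = \gamma T_0 = \frac{T_0}{\sqrt{1-\beta^2}}, \qquad \beta = \frac{v}{c}$$

so that, with the proper wavelength $\lambda_0 = cT_0$,

$$\lambda = \gamma\lambda_0\left(1 - \beta\cos\theta_0\right)$$

This expression can be evaluated for several special cases:
a) θ0 = 0 for forward emission

$$\lambda = \gamma\lambda_0(1-\beta) = \lambda_0\sqrt{\frac{1-\beta}{1+\beta}}$$

which is the relativistic blue shift for longitudinal motion in the direction of the receiver.
b) θ0 = π for backward emission

$$\lambda = \gamma\lambda_0(1+\beta) = \lambda_0\sqrt{\frac{1+\beta}{1-\beta}}$$

which is the relativistic red shift for longitudinal motion away from the receiver.
c) θ0 = π/2 for transverse emission

$$\lambda = \gamma\lambda_0 = \frac{\lambda_0}{\sqrt{1-\beta^2}}$$

This transverse Doppler effect for emission at right angles is a red shift, caused only by the time dilation of the moving light source. This is the effect proposed by Einstein and eventually measured by Ives and Stilwell, showing that moving clocks tick slowly. But it is not the only way to view the transverse Doppler effect.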
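The three special cases can be checked numerically. The sketch below assumes the standard relativistic Doppler expression λ/λ0 = γ(1 − β cos θ0), with β = v/c and the angle measured in the receiver frame; the function name `wavelength_ratio` and the sample speed β = 0.6 are illustrative choices, not part of the original discussion.

```python
import math

def wavelength_ratio(beta, theta0):
    """Observed/proper wavelength for a source moving at speed beta*c,
    emitting at angle theta0 (radians, receiver frame) from its velocity.
    Assumes lambda/lambda0 = gamma * (1 - beta*cos(theta0))."""
    gamma = 1.0 / math.sqrt(1.0 - beta**2)  # time-dilation factor
    return gamma * (1.0 - beta * math.cos(theta0))

beta = 0.6  # illustrative speed, gamma = 1.25
forward = wavelength_ratio(beta, 0.0)             # blue shift (ratio < 1)
backward = wavelength_ratio(beta, math.pi)        # red shift (ratio > 1)
transverse = wavelength_ratio(beta, math.pi / 2)  # pure time dilation (ratio = gamma)

print(f"forward:    {forward:.3f}")
print(f"backward:   {backward:.3f}")
print(f"transverse: {transverse:.3f}")
```

At right angles the cosine term vanishes, so the only surviving factor is γ, which is why this geometry isolates time dilation.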
B) Transverse Doppler Shift Relative to Angle at Reception
A different option for viewing the transverse Doppler effect is the angle to the moving source at the moment that the light is detected. The geometry of this configuration relative to the previous is illustrated in Fig. 2.
The transverse distance to the detection point is
The length of the line connecting the detection point P with the location of the light source at the moment of detection is (using the law of cosines)
Combining with the first equation gives
An equivalent expression is obtained as
Note that this result, relating θ1 to θ0, is independent of the distance to the observation point.
When θ1 = π/2, then
for which the Doppler effect is
which is a blue shift. This creates the unexpected result that θ0 = π/2 produces a red shift, while θ1 = π/2 produces a blue shift. The question could be asked: which one represents time dilation? In fact, it is θ0 = π/2 that isolates time dilation, because in that configuration there is no foreshortening effect on the wavelength, only the dilation of the emission time.
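The equations for this configuration are not reproduced in the text, so the following is a reconstruction under stated assumptions rather than the original notation. Assume the light travels a distance R from the emission point to the detection point P, during which the source advances a distance βR along its path (β = v/c). The symbols R, y and r1 are labels introduced here for the light path, the transverse distance, and the line from the source at the moment of detection to P.

```latex
\begin{align}
y &= R\sin\theta_0 \\
r_1^2 &= R^2 + (\beta R)^2 - 2\beta R^2\cos\theta_0
  && \text{(law of cosines)} \\
\sin\theta_1 &= \frac{y}{r_1}
  = \frac{\sin\theta_0}{\sqrt{1+\beta^2-2\beta\cos\theta_0}} \\
\cos\theta_1 &= \frac{\cos\theta_0-\beta}{\sqrt{1+\beta^2-2\beta\cos\theta_0}} \\
\theta_1 = \tfrac{\pi}{2} &\;\Longrightarrow\; \cos\theta_0 = \beta \\
\lambda &= \gamma\lambda_0\left(1-\beta\cos\theta_0\right)
  = \gamma\lambda_0\left(1-\beta^2\right)
  = \lambda_0\sqrt{1-\beta^2}
\end{align}
```

The distance R cancels in the ratio y/r1, which is why the relation between θ1 and θ0 does not depend on how far away the observation point is. The final line is shorter than λ0, consistent with a blue shift.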
C) Compromise: The Null Transverse Doppler Shift
The previous two configurations each could be used as a definition for the transverse Doppler effect. But one gives a red shift and one gives a blue shift, which seems contradictory. Therefore, one might try to strike a compromise between these two cases so that sin θ1 = sin θ0, and the configuration is shown in Fig. 3.
This is the case when θ0 + θ1 = π. The sines of the two angles are equal, yielding
which is solved for
Inserting this into the Doppler equation gives
where the Taylor’s expansion of the denominator (at low speed) cancels the numerator to give zero net Doppler shift. This compromise configuration represents the condition of null Doppler frequency shift. However, for speeds approaching the speed of light, the net effect is a lengthening of the wavelength, dominated by time dilation, causing a red shift.
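As a hedged reconstruction of the missing steps, assume the geometric relation sin θ1 = sin θ0 / √(1 + β² − 2β cos θ0), which follows from the law of cosines when the source advances βR while the light travels R (β = v/c), together with the Doppler expression λ = γλ0(1 − β cos θ0). Under those assumptions the null condition works out as:

```latex
\begin{align}
\sin\theta_1 = \sin\theta_0
  &\;\Longrightarrow\; 1+\beta^2-2\beta\cos\theta_0 = 1 \\
\cos\theta_0 &= \frac{\beta}{2} \\
\lambda &= \gamma\lambda_0\left(1-\beta\cos\theta_0\right)
  = \lambda_0\,\frac{1-\beta^2/2}{\sqrt{1-\beta^2}} \\
\sqrt{1-\beta^2} &\approx 1-\frac{\beta^2}{2}
  \quad (\beta \ll 1)
  \;\Longrightarrow\; \lambda \approx \lambda_0 \\
\frac{\lambda}{\lambda_0} &= 1 + \frac{\beta^4}{8} + O(\beta^6)
\end{align}
```

The last line shows that the cancellation is exact only through second order in β. The residual shift is positive (a red shift) and grows without bound as β approaches 1.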
D) Source in Circular Motion Around Receiver
An interesting twist can be added to the problem of the transverse Doppler effect: put the source or receiver into circular motion, one about the other. In the case of a source in circular motion around the receiver, it is easy to see that this looks just like case A) above for θ0 = π/2, which is the red shift caused by the time dilation of the moving source

$$\lambda = \frac{\lambda_0}{\sqrt{1 - v^2/c^2}}$$

where λ0 is the wavelength in the rest frame of the source.
However, there is the possible complication that the source is no longer in an inertial frame (it experiences centripetal acceleration) and therefore it is in the realm of general relativity instead of special relativity. In fact, it was Einstein’s solution to this problem that led him to propose the Equivalence Principle and make his first calculations on the deflection of light by gravity. His solution was to think of an infinite number of inertial frames, each of which was instantaneously co-moving with the same linear velocity as the source. These co-moving frames are inertial and can be analyzed using the principles of special relativity. The general relativistic effects come from slipping from one inertial co-moving frame to the next. But in the case of the circular transverse Doppler effect, each instantaneously co-moving frame has exactly the same configuration as case A) above, and so the wavelength is red shifted exactly by the time dilation.
E) Receiver in Circular Motion Around Source
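With the notion of co-moving inertial frames now in hand, this configuration is exactly the same as case B) above, and the wavelength is blue shifted

$$\lambda = \lambda_0\sqrt{1 - v^2/c^2}$$

where λ0 is the wavelength in the rest frame of the source.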
A. Einstein, “On the electrodynamics of moving bodies,” Annalen der Physik, vol. 17, no. 10, pp. 891-921 (1905).
“Society is founded on hero worship”, wrote Thomas Carlyle (1795 – 1881) in his 1840 lecture on “Hero as Divinity”—and the society of physicists is no different. Among physicists, the hero is the genius—the monomyth who journeys into the supernatural realm of high mathematics, engages in single combat against chaos and confusion, gains enlightenment in the mysteries of the universe, and returns home to share the new understanding. If the hero is endowed with unusual talent and achieves greatness, then mythologies are woven, creating shadows that can grow and eclipse the truth and the work of others, bestowing upon the hero recognitions that are not entirely deserved.
“Gentlemen! The views of space and time which I wish to lay before you … They are radical. Henceforth space by itself, and time by itself, are doomed to fade away into mere shadows, and only a kind of union of the two will preserve an independent reality.”
Hermann Minkowski (1908)
The greatest hero of physics of the twentieth century, without question, is Albert Einstein. He is the person most responsible for the development of “Modern Physics” that encompasses:
Relativity theory (both special and general),
Quantum theory (he invented the quantum in 1905—see my blog),
Astrophysics (his field equations of general relativity were solved by Schwarzschild in 1916 to predict event horizons of black holes, and he solved his own equations to predict gravitational waves that were discovered in 2015),
Cosmology (his cosmological constant is now recognized as the mysterious dark energy that was discovered in 1998), and
Solid state physics (his explanation of the specific heat of crystals inaugurated the field of quantum matter).
Einstein made so many seminal contributions to so many sub-fields of physics that it defies comprehension, and hence he is mythologized as a genius, able to see into the depths of reality with unique insight. He deserves his reputation as the greatest physicist of the twentieth century (he has my vote), and he was chosen by Time magazine in 1999 as the Person of the Century. But as his shadow has grown, it has eclipsed and even assimilated the work of others, work that he initially criticized and dismissed, yet later embraced so whole-heartedly that he is mistakenly given credit for its discovery.
For instance, when we think of Einstein, the first thing that pops into our minds is probably “spacetime”. He himself wrote several popular accounts of relativity that incorporated the view that spacetime is the natural geometry within which so many of the non-intuitive properties of relativity can be understood. When we think of time being mixed with space, making it seem that position coordinates and time coordinates share an equal place in the description of relativistic physics, it is common to attribute this understanding to Einstein. Yet Einstein initially resisted this viewpoint and even disparaged it when he first heard it!
Spacetime was the brain-child of Hermann Minkowski.
Minkowski in Königsberg
Hermann Minkowski was born in 1864 in Russia to German parents who moved to the city of Königsberg (King’s Mountain) in East Prussia when he was eight years old. He entered the university in Königsberg in 1880 when he was sixteen. Within a year, when he was only seventeen years old, and while he was still a student at the University, Minkowski responded to an announcement of the Mathematics Prize of the French Academy of Sciences in 1881. When he submitted his prize-winning memoir, he could have had no idea that it was starting him down a path that would lead him years later to revolutionary views.
The specific Prize challenge of 1881 was to find the number of representations of an integer as a sum of five squares of integers. For instance, every integer n > 33 can be expressed as the sum of five nonzero squares. As an example, $42 = 2^2 + 2^2 + 3^2 + 3^2 + 4^2$, which is the only representation for that number. However, there are five representations for n = 53.
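These counts can be checked by brute force. The sketch below enumerates unordered representations by five nonzero squares, which is the sense used in the examples here; it is not the counting convention of the Prize problem itself, and the function name `five_square_reps` is just an illustrative label.

```python
from itertools import combinations_with_replacement
from math import isqrt

def five_square_reps(n):
    """Return all unordered ways to write n as a sum of five nonzero squares.

    Each representation is a non-decreasing tuple of positive roots.
    """
    max_root = isqrt(n)
    return [
        roots
        for roots in combinations_with_replacement(range(1, max_root + 1), 5)
        if sum(r * r for r in roots) == n
    ]

for n in (33, 42, 53):
    reps = five_square_reps(n)
    print(n, len(reps), reps)
```

The search is exhaustive over non-decreasing 5-tuples, so it is only practical for small n, but that is enough to confirm a single representation for 42, five for 53, and none for 33.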
The task of enumerating these representations draws from the theory of quadratic forms. A quadratic form is a function of products of numbers with integer coefficients, such as $ax^2 + bxy + cy^2$ and $ax^2 + by^2 + cz^2 + dxy + exz + fyz$. In number theory, one seeks to find integer solutions for which the quadratic form equals an integer. For instance, the Pythagorean theorem $x^2 + y^2 = n^2$ for integers is a quadratic form for which there are many integer solutions (x,y,n), known as Pythagorean triples, such as (3, 4, 5) and (5, 12, 13).
The topic of quadratic forms gained special significance after the work of Bernhard Riemann who established the properties of metric spaces based on the metric expression

$$ds^2 = \sum_{a,b=1}^{D} g_{ab}\,dx^a\,dx^b$$

for infinitesimal distance in a D-dimensional metric space. This is a generalization of Euclidean distance to more general non-Euclidean spaces that may have curvature. Minkowski would later use this expression to great advantage, developing a “Geometry of Numbers” as he delved ever deeper into quadratic forms and their uses in number theory.
Minkowski in Göttingen
After graduating with a doctoral degree in 1885 from Königsberg, Minkowski did his habilitation at the University of Bonn and began teaching, moving back to Königsberg in 1892 and then to Zurich in 1894 (where one of his students was a somewhat lazy and unimpressive Albert Einstein). A few years later he was given an offer that he could not refuse.
At the turn of the 20th century, the place to be in mathematics was at the University of Göttingen. It had a long tradition of mathematical giants that included Carl Friedrich Gauss, Bernhard Riemann, Peter Dirichlet, and Felix Klein. Under the guidance of Felix Klein, Göttingen mathematics had undergone a renaissance. For instance, Klein had attracted Hilbert from the University of Königsberg in 1895. David Hilbert had known Minkowski when they were both students in Königsberg, and Hilbert extended an invitation to Minkowski to join him in Göttingen, which Minkowski accepted in 1902.
A few years after Minkowski arrived at Göttingen, the relativity revolution broke, and both Minkowski and Hilbert began working on mathematical aspects of the new physics. They organized a colloquium dedicated to relativity and related topics, and on Nov. 5, 1907 Minkowski gave his first tentative address on the geometry of relativity.
Because Minkowski’s specialty was quadratic forms, and given his understanding of Riemann’s work, he was perfectly situated to apply his theory of quadratic forms and invariants to the Lorentz transformations derived by Poincaré and Einstein. Although Poincaré had published a paper in 1906 that showed that the Lorentz transformation was a generalized rotation in four-dimensional space, Poincaré continued to discuss space and time as separate phenomena, as did Einstein. For them, simultaneity was no longer an invariant, but events in time were still events in time and not somehow mixed with space-like properties. Minkowski recognized that Poincaré had missed an opportunity to define a four-dimensional vector space filled by four-vectors that captured all possible events in a single coordinate description without the need to separate out time and space.
Minkowski’s first attempt, presented in his 1907 colloquium, at constructing velocity four-vectors was flawed because (like so many of my mechanics students when they first take a time derivative of the four-position) he had not yet understood the correct use of proper time. But the research program he outlined paved the way for the great work that was to follow.
On Feb. 21, 1908, only 3 months after his first halting steps, Minkowski delivered a thick manuscript to the printers for an article to appear in the Göttinger Nachrichten. The title “Die Grundgleichungen für die elektromagnetischen Vorgänge in bewegten Körpern” (The Basic Equations for Electromagnetic Processes in Moving Bodies) belies the impact and importance of this very dense article. In its 60 pages (with no figures), Minkowski presents the correct form for four-velocity by taking derivatives relative to proper time, and he formalizes his four-dimensional approach to relativity that became the standard afterwards. He introduces the terms spacelike vector, timelike vector, light cone and world line. He also presents the complete four-tensor form for the electromagnetic fields. The foundational work of Levi-Civita and Ricci-Curbastro on tensors was not yet well known, so Minkowski invents his own terminology of Traktor to describe it. Most importantly, he invents the terms spacetime (Raum-Zeit) and events (Ereignisse).
Minkowski’s four-dimensional formalism of relativistic electromagnetics was more than a mathematical trick—it uncovered the presence of a multitude of invariants that were obscured by the conventional mathematics of Einstein and Lorentz and Poincaré. In Minkowski’s approach, whenever a proper four-vector is contracted with itself (its inner product), an invariant emerges. Because there are many fundamental four-vectors, there are many invariants. These invariants provide the anchors from which to understand the complex relative properties amongst relatively moving frames.
Minkowski’s master work appeared in the Nachrichten on April 5, 1908. If he had thought that physicists would embrace his visionary perspective, he was about to be woefully disabused of that notion.
Despite his impressive ability to see into the foundational depths of the physical world, Einstein did not view mathematics as the root of reality. Mathematics for him was a tool to reduce physical intuition into quantitative form. In 1908 his fame was rising as the acknowledged leader in relativistic physics, and he was not impressed or pleased with the abstract mathematical form that Minkowski was trying to stuff the physics into. Einstein called it “superfluous erudition”, and complained “since the mathematicians pounced on the relativity theory, I no longer understand it myself!”
With his collaborator Jakob Laub (also a former student of Minkowski’s), Einstein objected to more than the hard-to-follow mathematics: they believed that Minkowski’s form of the ponderomotive force was incorrect. They then proceeded to re-translate Minkowski’s elegant four-vector derivations back into ordinary vector analysis, publishing two papers in Annalen der Physik in the summer of 1908 that were politely critical of Minkowski’s approach [7-8]. Yet another of Minkowski’s students from Zurich, Gunnar Nordström, showed how to derive Minkowski’s field equations without any of the four-vector formalism.
One can only wonder why so many of his former students so easily dismissed Minkowski’s revolutionary work. Einstein had actually avoided Minkowski’s mathematics classes as a student at ETH, which may say something about Minkowski’s reputation among the students, although Einstein did appreciate the class on mechanics that he took from Minkowski. Nonetheless, Einstein missed the point! Rather than realizing the power and universality of the four-dimensional spacetime formulation, he dismissed it as obscure and irrelevant, perhaps prejudiced by his earlier dim view of his former teacher.
Raum und Zeit
It is clear that Minkowski was stung by the poor reception of his spacetime theory. It is also clear that he truly believed that he had uncovered an essential new approach to physical reality. While mathematicians were generally receptive to his work, he knew that if physicists were to adopt his new viewpoint, he needed to win them over with elegant results.
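In 1908, Minkowski presented a now-famous paper Raum und Zeit at the 80th Assembly of German Natural Scientists and Physicians (21 September 1908). He opened his address with the declaration quoted above, that space by itself and time by itself are doomed to fade away into mere shadows.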
To illustrate his arguments Minkowski constructed the most recognizable visual icon of relativity theory: the space-time diagram in which the trajectories of particles appear as “world lines”, as in Fig. 1. On this diagram, one spatial dimension is plotted along the horizontal axis, and the value ct (speed of light times time) is plotted along the vertical axis. In these units, a photon travels along a line oriented at 45 degrees, and the world lines (the name Minkowski gave to trajectories) of all massive particles must have slopes steeper than this. For instance, a stationary particle, which appears to have no trajectory at all, executes a vertical trajectory on the space-time diagram as it travels forward through time. Within this new formulation by Minkowski, space and time were mixed together in a single manifold, spacetime, and were no longer separate entities.
In addition to the spacetime construct, Minkowski’s great discovery was the plethora of invariants that followed from his geometry. For instance, the spacetime hyperbola

$$(ct)^2 - x^2 = s^2$$

is invariant to Lorentz transformation in coordinates. This is just a simple statement that a vector is an entity of reality that is independent of how it is described. The length of a vector in our normal three-space does not change if we flip the coordinates around or rotate them, and the same is true for four-vectors in Minkowski space subject to Lorentz transformations.
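A short numerical sketch makes the invariance concrete. It assumes one spatial dimension, the sign convention s² = (ct)² − x², and a standard Lorentz boost along x; the function names and the sample event are illustrative choices.

```python
import math

def boost(ct, x, beta):
    """Lorentz-boost the event (ct, x) into a frame moving at speed beta*c along x."""
    gamma = 1.0 / math.sqrt(1.0 - beta**2)
    return gamma * (ct - beta * x), gamma * (x - beta * ct)

def interval(ct, x):
    """Squared interval s^2 = (ct)^2 - x^2 (assumed sign convention)."""
    return ct**2 - x**2

event = (5.0, 3.0)  # sample timelike event, in units where ct and x share a length unit
for beta in (0.0, 0.3, 0.6, 0.9):
    ct_p, x_p = boost(*event, beta)
    print(f"beta={beta}: ct'={ct_p:.4f}, x'={x_p:.4f}, s^2={interval(ct_p, x_p):.6f}")
```

The coordinates ct' and x' change with every boost, but the printed s² stays at the same value: the event slides along its hyperbola while the hyperbola itself does not move.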
In relativity theory, this property of invariance becomes especially useful because part of the mental challenge of relativity is that everything looks different when viewed from different frames. How do you get a good grip on a phenomenon if it is always changing, always relative to one frame or another? The invariants become the anchors that we can hold on to as reference frames shift and morph about us.
As an example of a fundamental invariant, the mass of a particle in its rest frame becomes an invariant mass, always with the same value. In earlier relativity theory, even in Einstein’s papers, the mass of an object was a function of its speed. How is the mass of an electron a fundamental property of physics if it is a function of how fast it is traveling? The construction of invariant mass removes this problem, and the mass of the electron becomes an immutable property of physics, independent of the frame. Invariant mass is just one of many invariants that emerge from Minkowski’s space-time description. The study of relativity, where all things seem relative, became a study of invariants, where many things never change. In this sense, the theory of relativity is a misnomer. Ironically, relativity theory became the motivation of post-modern relativism that denies the existence of absolutes, even as relativity theory, as practiced by physicists, is all about absolutes.
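The same idea can be sketched for invariant mass. The example below assumes motion along one axis and works with energies in MeV, so that the contraction of the energy-momentum four-vector reads E² − (pc)² = (mc²)². The electron rest energy of about 0.511 MeV is a standard value; the function names are illustrative.

```python
import math

ELECTRON_REST_ENERGY_MEV = 0.511  # approximate electron rest energy, m c^2

def boost_energy_momentum(energy, pc, beta):
    """Transform (E, pc) into a frame moving at speed beta*c along the momentum axis."""
    gamma = 1.0 / math.sqrt(1.0 - beta**2)
    return gamma * (energy - beta * pc), gamma * (pc - beta * energy)

def invariant_mass_energy(energy, pc):
    """Rest energy m c^2 recovered from the contraction E^2 - (pc)^2."""
    return math.sqrt(energy**2 - pc**2)

# Start from an electron at rest and view it from faster and faster frames.
for beta in (0.0, 0.5, 0.8, 0.99):
    energy, pc = boost_energy_momentum(ELECTRON_REST_ENERGY_MEV, 0.0, beta)
    mass_energy = invariant_mass_energy(energy, pc)
    print(f"beta={beta}: E={energy:.4f} MeV, m c^2={mass_energy:.4f} MeV")
```

The total energy E grows with the boost, which is what the older speed-dependent “mass” was tracking, while the contracted quantity stays at the rest energy in every frame.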
Despite his audacious gambit to win over the physicists, Minkowski would not live to see the fruits of his effort. He died suddenly of a ruptured appendix on Jan. 12, 1909 at the age of 44.
Arnold Sommerfeld (who went on to play a central role in the development of quantum theory) took up Minkowski’s four-vectors and systematized them in a way that was palatable to physicists. Then Max von Laue extended the formalism while he was working with Sommerfeld in Munich, publishing the first physics textbook on relativity theory in 1911, establishing the space-time formalism for future generations of German physicists. Further support for Minkowski’s work came from his distinguished colleagues at Göttingen (Hilbert, Klein, Wiechert, Schwarzschild) as well as his former students (Born, Laue, Kaluza, Frank, Noether). With such champions, Minkowski’s work was immortalized in the methodology (and mythology) of physics, representing one of the crowning achievements of the Göttingen mathematical community.
Already in 1907 Einstein was beginning to grapple with the role of gravity in the context of relativity theory, and he knew that the special theory was just a beginning. Yet between 1908 and 1910 Einstein’s focus was on the quantum of light as he defended and extended his unique view of the photon and prepared for the first Solvay Congress of 1911. As he returned his attention to the problem of gravitation after 1910, he began to realize that Minkowski’s formalism provided a framework from which to understand the role of accelerating frames. In 1912 Einstein wrote to Sommerfeld to say:
I occupy myself now exclusively with the problem of gravitation. One thing is certain that I have never before had to toil anywhere near as much, and that I have been infused with great respect for mathematics, which I had up until now in my naivety looked upon as a pure luxury in its more subtle parts. Compared to this problem, the original theory of relativity is child’s play.
By the time Einstein had finished his general theory of relativity and gravitation in 1915, he fully acknowledged his indebtedness to Minkowski’s spacetime formalism, without which his general theory may never have appeared.
Einstein is the alpha of the quantum. Einstein is also the omega. Although he was the one who established the quantum of energy and matter (see my Blog Einstein vs Planck), Einstein pitted himself in a running debate against Niels Bohr’s emerging interpretation of quantum physics that had, in Einstein’s opinion, severe deficiencies. Between sessions during a series of conferences known as the Solvay Congresses over a period of eight years from 1927 to 1935, Einstein constructed challenges of increasing sophistication to confront Bohr and his quasi-voodoo attitudes about wave-function collapse. To meet the challenge, Bohr sharpened his arguments and bested Einstein, who ultimately withdrew from the field of battle. Einstein, as quantum physics’ harshest critic, played a pivotal role, almost against his will, in establishing the Copenhagen interpretation of quantum physics that rules to this day, and also in inventing the principle of entanglement which lies at the core of almost all quantum information technology today.
Fifth Solvay Congress: 1927 October Brussels: Debate Round 1
Einstein and ensembles
Sixth Solvay Congress: 1930 Debate Round 2
Photon in a box
Seventh Solvay Congress: 1933
Einstein absent (visiting the US when Hitler takes power…decides not to return to Germany.)
Physical Review 1935: Debate Round 3
EPR paper and Bohr’s response
Notable Nobel Prizes
1933 Dirac and Schrödinger
The Solvay Conferences
The Solvay congresses were unparalleled scientific meetings of their day. They were attended by invitation only, and invitations were offered only to the top physicists concerned with the selected topic of each meeting. The Solvay congresses were held about every three years, always in Belgium, supported by the Belgian chemical industrialist Ernest Solvay. The first meeting, held in 1911, was on the topic of radiation and quanta.
The fifth meeting, held in 1927, was on electrons and photons and focused on the recent rapid advances in quantum theory. The old quantum guard was invited—Planck, Bohr and Einstein. The new quantum guard was invited as well—Heisenberg, de Broglie, Schrödinger, Born, Pauli, and Dirac. Heisenberg and Bohr joined forces to present a united front meant to solidify what later became known as the Copenhagen interpretation of quantum physics. The basic principles of the interpretation include the wavefunction of Schrödinger, the probabilistic interpretation of Born, the uncertainty principle of Heisenberg, the complementarity principle of Bohr and the collapse of the wavefunction during measurement. The chief conclusion that Heisenberg and Bohr sought to impress on the assembled attendees was that the theory of quantum processes was complete, meaning that unknown or uncertain characteristics of measurements could not be attributed to lack of knowledge or understanding, but were fundamental and permanently inaccessible.
Einstein was not convinced by that argument, and he rose to his feet to object after Bohr’s informal presentation of his complementarity principle. Einstein insisted that uncertainties in measurement were not fundamental, but were caused by incomplete information that, if known, would accurately account for the measurement results. Bohr was not prepared for Einstein’s critique and brushed it off, but what ensued in the dining hall and the hallways of the Hotel Metropole in Brussels over the next several days has become one of the most famous scientific debates of the modern era, known as the Bohr-Einstein debate on the meaning of quantum theory. The debate gently raged night and day through the fifth congress, and was renewed three years later at the 1930 congress. It finished in a final flurry of published papers in 1935 that launched some of the central concepts of quantum theory, including the idea of quantum entanglement and, of course, Schrödinger’s cat.
Einstein’s strategy to refute Bohr was to construct careful thought experiments that envisioned perfect experiments, without errors, that measured properties of ideal quantum systems. His aim was to paint Bohr into a corner from which he could not escape, caught by what Einstein assumed was the inconsistency of complementarity. Einstein’s “thought experiments” used electrons passing through slits, diffracting as required by Schrödinger’s theory, but being detected by classical measurements. Einstein would present a thought experiment to Bohr, who would then retreat to consider the way around Einstein’s arguments, returning the next hour or the next day with his answer, only to be confronted by yet another clever device of Einstein’s imagination that would force Bohr to retreat again. The spirit of this back and forth encounter between Bohr and Einstein is caught dramatically in the words of Paul Ehrenfest who witnessed the debate first hand, partially mediating between Bohr and Einstein, both of whom he respected deeply.
“Brussels-Solvay was fine!… BOHR towering over everybody. At first not understood at all … , then step by step defeating everybody. Naturally, once again the awful Bohr incantation terminology. Impossible for anyone else to summarise … (Every night at 1 a.m., Bohr came into my room just to say ONE SINGLE WORD to me, until three a.m.) It was delightful for me to be present during the conversation between Bohr and Einstein. Like a game of chess, Einstein all the time with new examples. In a certain sense a sort of Perpetuum Mobile of the second kind to break the UNCERTAINTY RELATION. Bohr from out of philosophical smoke clouds constantly searching for the tools to crush one example after the other. Einstein like a jack-in-the-box; jumping out fresh every morning. Oh, that was priceless. But I am almost without reservation pro Bohr and contra Einstein. His attitude to Bohr is now exactly like the attitude of the defenders of absolute simultaneity towards him …”
The most difficult example that Einstein constructed during the fifth Solvay Congress involved an electron double-slit apparatus that could measure, in principle, the momentum imparted to the slit by the passing electron, as shown in Fig. 3. The electron gun is a point source that emits the electrons in a range of angles that illuminates the two slits. The slits are small relative to a de Broglie wavelength, so the electron wavefunctions diffract according to Schrödinger’s wave mechanics to illuminate the detection plate. Because of the interference of the electron waves from the two slits, electrons are detected clustered in intense fringes separated by dark fringes.
So far, everyone was in agreement with these suggested results. The key next step is the assumption that the electron gun emits only a single electron at a time, so that only one electron is present in the system at any given time. Furthermore, the screen with the double slit is suspended on a spring, and the position of the screen is measured with complete accuracy by a displacement meter. When the single electron passes through the entire system, it imparts a momentum kick to the screen, which is measured by the meter. It is also detected at a specific location on the detection plate. Knowing the position of the electron detection, and the momentum kick to the screen, provides information about which slit the electron passed through, and gives simultaneous position and momentum values to the electron that have no uncertainty, apparently rebutting the uncertainty principle.
This challenge by Einstein was the culmination of successively more sophisticated examples that he had to pose to combat Bohr, and Bohr was not going to let it pass unanswered. With ingenious insight, Bohr recognized that the key element in the apparatus was the fact that the screen with the slits must have finite mass if the momentum kick by the electron were to produce a measurable displacement. But if the screen has finite mass, and hence a finite momentum kick from the electron, then there must be an uncertainty in the position of the slits. This uncertainty immediately translates into a washout of the interference fringes. In fact the more information that is obtained about which slit the electron passed through, the more the interference is washed out. It was a perfect example of Bohr’s own complementarity principle. The more the apparatus measures particle properties, the less it measures wave properties, and vice versa, in a perfect balance between waves and particles.
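The washout of the fringes can be illustrated with a toy calculation. The sketch below is not Bohr's own argument: it assumes an ideal fringe pattern of the form 1 + cos(2πx/period), assumes that a sideways displacement of the slits shifts the whole pattern rigidly, and averages over a Gaussian uncertainty sigma in that displacement. The function name `fringe_visibility` and all parameter names are hypothetical.

```python
import math

def fringe_visibility(sigma, fringe_period=1.0, samples=4001):
    """Visibility of an ideal two-slit pattern after averaging over a
    Gaussian uncertainty (standard deviation sigma) in the slit position."""
    if sigma == 0:
        return 1.0  # perfectly localized slits give full-contrast fringes
    half_width = 6.0 * sigma  # integrate over +/- 6 standard deviations
    step = 2.0 * half_width / (samples - 1)
    weighted_cos = 0.0
    total_weight = 0.0
    for i in range(samples):
        shift = -half_width + i * step
        weight = math.exp(-0.5 * (shift / sigma) ** 2)
        # Toy assumption: the pattern shifts rigidly with the slits
        weighted_cos += weight * math.cos(2.0 * math.pi * shift / fringe_period)
        total_weight += weight
    # The averaged pattern is 1 + V*cos(2*pi*x/period); V is the visibility
    return weighted_cos / total_weight

for sigma in (0.0, 0.1, 0.25, 0.5):
    print(f"sigma/period = {sigma:4.2f}  visibility = {fringe_visibility(sigma):.4f}")
```

In this toy model the visibility falls off as exp(-2π²σ²/period²), so once the position uncertainty of the slits approaches the fringe spacing, the interference pattern is effectively gone.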
Einstein grudgingly admitted defeat at the end of the first round, but he was not defeated. Three years later he came back armed with more clever thought experiments, ready for the second round in the debate.
The Sixth Solvay Conference: 1930
At the Solvay Congress of 1930, Einstein was ready with even more difficult challenges. His ultimate idea was to construct a box containing photons, just like the original black bodies that launched Planck’s quantum hypothesis thirty years before. The box is attached to a weighing scale so that the weight of the box plus the photons inside can be measured with arbitrary accuracy. A shutter over a hole in the box is opened for a time T, and a photon is emitted. Because the photon has energy, it has an equivalent weight (Einstein’s own famous E = mc²), and the mass of the box changes by an amount equal to the photon energy divided by the speed of light squared: m = E/c². If the scale has arbitrary accuracy, then the energy of the photon has no uncertainty. In addition, because the shutter was open for only a time T, the time of emission similarly has no uncertainty. Therefore, the product of the energy uncertainty and the time uncertainty is much smaller than Planck’s constant, apparently violating Heisenberg’s precious uncertainty principle.
Bohr was stopped in his tracks with this challenge. Although he sensed immediately that Einstein had missed something (because Bohr had complete confidence in the uncertainty principle), he could not put his finger immediately on what it was. That evening he wandered from one attendee to another, very unhappy, trying to persuade them and saying that Einstein could not be right because it would be the end of physics. At the end of the evening, Bohr was no closer to a solution, and Einstein was looking smug. However, by the next morning Bohr reappeared tired but in high spirits, and he delivered a master stroke. Where Einstein had used special relativity against Bohr, Bohr now used Einstein’s own general relativity against him.
The key insight was that the weight of the box must be measured, and the process of measurement was just as important as the quantum process being measured; this was one of the cornerstones of the Copenhagen interpretation. So Bohr envisioned a measuring apparatus composed of a spring and a scale with the box suspended in gravity from the spring. As the photon leaves the box, the weight of the box changes, and so does the deflection of the spring, changing the height of the box. This change in height, in a gravitational potential, causes the timing of the shutter to change according to the law of gravitational time dilation in general relativity. Combining the general relativistic uncertainty in the time with the special relativistic uncertainty in the weight of the box produced a product that was at least as big as Planck’s constant. Heisenberg’s uncertainty principle was saved!
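Bohr's reply can be condensed into a few lines. What follows is a sketch of the standard textbook reconstruction, not a quotation from Bohr. The symbols are introduced here for illustration: Δq is the uncertainty in the position of the box on the scale, Δp the accompanying momentum uncertainty, T the duration of the weighing, g the gravitational acceleration, and Δm = ΔE/c² the accuracy of the mass determination.

```latex
\begin{align}
\Delta p &\gtrsim \frac{h}{\Delta q}
  && \text{(position-momentum uncertainty of the box)} \\
\Delta p &< T\, g\, \Delta m
  && \text{(the impulse of gravity on $\Delta m$ during $T$ must exceed $\Delta p$)} \\
\frac{\Delta T}{T} &= \frac{g\, \Delta q}{c^{2}}
  && \text{(gravitational time dilation over a height $\Delta q$)} \\
\Delta T &= \frac{g\, \Delta q}{c^{2}}\, T
  > \frac{h}{c^{2}\, \Delta m} = \frac{h}{\Delta E}
  && \Longrightarrow \quad \Delta E\, \Delta T > h
\end{align}
```

The first two lines together require T > h/(g Δm Δq), and substituting this bound into the time-dilation relation gives the last line.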
Entanglement and Schrödinger’s Cat
Einstein ceded the point to Bohr but was not convinced. He still believed that quantum mechanics was not a “complete” theory of quantum physics and he continued to search for the perfect thought experiment that Bohr could not escape. Even today when we have become so familiar with quantum phenomena, the Copenhagen interpretation of quantum mechanics has weird consequences that seem to defy common sense, so it is understandable that Einstein had his reservations.
After the sixth Solvay congress Einstein and Schrödinger exchanged many letters complaining to each other about Bohr’s increasing stranglehold on the interpretation of quantum mechanics. Egging each other on, they both constructed their own final assault on Bohr. The irony is that the concepts they devised to throw down quantum mechanics have today become cornerstones of the theory. For Einstein, his final salvo was “Entanglement”. For Schrödinger, his final salvo was his “cat”. Today, Entanglement and Schrödinger’s Cat have become enshrined on the altar of quantum interpretation even though their original function was to thwart that interpretation.
The final round of the debate was carried out, not at a Solvay congress, but in the journal Physical Review by Einstein and Bohr, and in Naturwissenschaften by Schrödinger.
In 1969, Heisenberg looked back on these years and said,
To those of us who participated in the development of atomic theory, the five years following the Solvay Conference in Brussels in 1927 looked so wonderful that we often spoke of them as the golden age of atomic physics. The great obstacles that had occupied all our efforts in the preceding years had been cleared out of the way, the gate to an entirely new field, the quantum mechanics of the atomic shells stood wide open, and fresh fruits seemed ready for the picking. 
 A. Whitaker, Einstein, Bohr, and the quantum dilemma : from quantum theory to quantum information, 2nd ed. Cambridge University Press, 2006. (pg. 210)
 A. Einstein, B. Podolsky, and N. Rosen, “Can quantum-mechanical description of physical reality be considered complete?,” Physical Review, vol. 47, no. 10, pp. 777-780, May 1935.
Christian Andreas Doppler (1803 – 1853) was born in Salzburg, Austria, to a longstanding family of stonemasons. As a second son, he was expected to help his older brother run the business, so his father had him tested in his 18th year for his suitability for a career in business. The examiner Simon Stampfer (1790 – 1864), an Austrian mathematician and inventor teaching at the Lyceum in Salzburg, discovered that Doppler had a gift for mathematics and was better suited for a scientific career. Stampfer’s enthusiasm convinced Doppler’s father to enroll him in the Polytechnic Institute in Vienna (founded only a few years earlier in 1815) where he took classes in mathematics, mechanics and physics from 1822 to 1825. Doppler excelled in his courses, but was dissatisfied with the narrowness of the education, yearning for more breadth and depth in his studies and for more significance in his positions, feelings he would struggle with for his entire short life. He left Vienna, returning to the Lyceum in Salzburg to round out his education with philosophy, languages and poetry. Unfortunately, this four-year detour away from technical studies impeded his ability to gain a permanent technical position, so he began a temporary assistantship with a mathematics professor at Vienna. As he approached his 30th birthday this term expired without prospects. He was about to emigrate to America when he finally received an offer to teach at a secondary school in Prague.
To read about the attack by Joseph Petzval on Doppler’s effect and the effect it had on Doppler, see my feature article “The Fall and Rise of the Doppler Effect” in Physics Today, 73(3) 30, March (2020).
Doppler in Prague
Prague gave Doppler new life. He was a professor with a position that allowed him to marry the daughter of a silver- and goldsmith from Salzburg. He began to publish scholarly papers, and in 1837 was appointed supplementary professor of Higher Mathematics and Geometry at the Prague Technical Institute, promoted to full professor in 1841. It was here that he met the unusual genius Bernard Bolzano (1781 – 1848), recently returned from political exile in the countryside. Bolzano was a philosopher and mathematician who developed rigorous concepts of mathematical limits and is famous today for his part in the Bolzano-Weierstrass theorem in real analysis, but he had been too liberal and too outspoken for the conservative Austrian regime and had been dismissed from the University in Prague in 1819. He was forbidden to publish his work in Austrian journals, which is one reason why much of Bolzano’s groundbreaking work in analysis remained unknown during his lifetime. However, he participated in the Bohemian Society for Science from a distance, recognizing the inventive tendencies in the newcomer Doppler and supporting him for membership in the Bohemian Society. When Bolzano was allowed to return in 1842 to the Polytechnic Institute in Prague, he and Doppler became close friends as kindred spirits.
Prague, Czech Republic
On May 25, 1842, Bolzano presided as chairman over a meeting of the Bohemian Society for Science on the day that Doppler read a landmark paper on the color of stars to a meagre assembly of only five regular members of the Society . The turn-out was so small that the meeting may have been held in the robing room of the Society rather than in the meeting hall itself. Leading up to this famous moment, Doppler’s interests were peripatetic, ranging widely over mathematical and physical topics, but he had lately become fascinated by astronomy and by the phenomenon of stellar aberration. Stellar aberration was discovered by James Bradley in 1729 and explained as the result of the Earth’s yearly motion around the Sun, causing the apparent location of a distant star to change slightly depending on the direction of the Earth’s motion. Bradley explained this in terms of the finite speed of light and was able to estimate it to within several percent . As Doppler studied Bradley aberration, he wondered how the relative motion of the Earth would affect the color of the star. By making a simple analogy of a ship traveling with, or against, a series of ocean waves, he concluded that the frequency of impact of the peaks and troughs of waves on the ship was no different than the arrival of peaks and troughs of the light waves impinging on the eye. Because perceived color was related to the frequency of excitation in the eye, he concluded that the color of light would be slightly shifted to the blue if approaching, and to the red if receding from, the light source.
Doppler wave fronts from a source emitting spherical waves moving with speeds β relative to the speed of the wave in the medium.
Doppler calculated the magnitude of the effect by taking a simple ratio of the speed of the observer relative to the speed of light. What he found was that the speed of the Earth, though sufficient to cause the detectable aberration in the position of stars, was insufficient to produce a noticeable change in color. However, his interest in astronomy had made him familiar with binary stars where the relative motion of the light source might be high enough to cause color shifts. In fact, in the star catalogs there were examples of binary stars that had complementary red and blue colors. Therefore, the title of his paper, published in the Proceedings of the Royal Bohemian Society of Sciences a few months after he read it to the society, was “On the Coloured Light of the Double Stars and Certain Other Stars of the Heavens: Attempt at a General Theory which Incorporates Bradley’s Theorem of Aberration as an Integral Part” .
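To get a feel for the size of the effect Doppler was estimating, the first-order ratio v/c can be evaluated for the Earth's orbital motion. The numbers below are modern approximate values chosen for illustration, not figures taken from Doppler's paper, and the function name is hypothetical.

```python
C_KM_S = 299_792.458       # speed of light in km/s
EARTH_ORBITAL_KM_S = 29.8  # approximate mean orbital speed of the Earth (assumed)

def fractional_shift(speed_km_s):
    """First-order Doppler estimate: fractional frequency shift = v / c."""
    return speed_km_s / C_KM_S

beta = fractional_shift(EARTH_ORBITAL_KM_S)
green_nm = 550.0           # a representative mid-visible wavelength (assumed)
print(f"v/c = {beta:.2e}")
print(f"shift of a {green_nm:.0f} nm line: {green_nm * beta:.3f} nm")
```

A shift of roughly one part in ten thousand is far too small for the eye to register as a change in color, which is consistent with Doppler's conclusion about the Earth's motion.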
Title page of Doppler’s 1842 paper introducing the Doppler Effect.
Doppler’s analogy was correct, but like all analogies not founded on physical law, it differed in detail from the true nature of the phenomenon. By 1842 the transverse character of light waves had been thoroughly proven through the work of Fresnel and Arago several decades earlier, yet Doppler held onto the old-fashioned notion that light was composed of longitudinal waves. Bolzano, fully versed in the transverse nature of light, kindly published a commentary shortly afterwards  showing how the transverse effect for light, and a longitudinal effect for sound, were both supported by Doppler’s idea. Yet Doppler also did not know that speeds in visual binaries were too small to produce noticeable color effects to the unaided eye. Finally, (and perhaps the greatest flaw in his argument on the color of stars) a continuous spectrum that extends from the visible into the infrared and ultraviolet would not change color because all the frequencies would shift together preserving the flat (white) spectrum.
The simple algebraic derivation of the Doppler Effect in the 1842 publication.
Doppler’s twelve years in Prague were intense. He was consumed by his Society responsibilities and by an extremely heavy teaching load that included personal exams of hundreds of students. The only time he could be creative was during the night while his wife and children slept. Overworked and running on too little rest, his health already frail with the onset of tuberculosis, Doppler collapsed, and he was unable to continue at the Polytechnic. In 1847 he transferred to the School of Mines and Forestry in Schemnitz (modern Banská Štiavnica in Slovakia) with more pay and less work. Yet the revolutions of 1848 swept across Europe, with student uprisings, barricades in the streets, and Hungarian liberation armies occupying the cities and universities, giving him no peace. Providentially, his former mentor Stampfer retired from the Polytechnic in Vienna, and Doppler was called to fill the vacancy.
Although Doppler was named the Director of Austria’s first Institute of Physics and was elected to the National Academy, he ran afoul of one of the other Academy members, Joseph Petzval (1807 – 1891), who persecuted Doppler and his effect. To read a detailed description of the attack by Petzval on Doppler’s effect and the effect it had on Doppler, see my feature article “The Fall and Rise of the Doppler Effect” in Physics Today, March issue (2020).
It is difficult today to appreciate just how deeply engrained the reality of the luminiferous ether was in the psyche of the 19th century physicist. The last of the classical physicists were reluctant even to adopt Maxwell’s electromagnetic theory for the explanation of optical phenomena, and as physicists inevitably were compelled to do so, some of their colleagues looked on with dismay and disappointment. This was the situation for Woldemar Voigt (1850 – 1919) at the University of Göttingen, who was appointed as one of the first professors of physics there in 1883, to be succeeded in later years by Peter Debye and Max Born. Voigt received his doctorate at the University of Königsberg under Franz Neumann, exploring the elastic properties of rock salt, and at Göttingen he spent a quarter century pursuing experimental and theoretical research into crystalline properties. Voigt’s research, with students like Paul Drude, laid the foundation for the modern field of solid state physics. His textbook Lehrbuch der Kristallphysik published in 1910 remained influential well into the 20th century because it adopted mathematical symmetry as a guiding principle of physics. It was in the context of his studies of crystal elasticity that he introduced the word “tensor” into the language of physics.
At the January 1887 meeting of the Royal Society of Science at Göttingen, three months before Michelson and Morley began their reality-altering experiments at the Case Western Reserve University in Cleveland, Ohio, Voigt submitted a paper deriving the longitudinal optical Doppler effect in an incompressible medium. He was responding to results published in 1886 by Michelson and Morley on their measurements of the Fresnel drag coefficient, which was the precursor to their later results on the absolute motion of the Earth through the ether.
Fresnel drag is the effect of light propagating through a medium that is in motion. The French physicist François Arago (1786 – 1853) in 1810 had attempted to observe the effects of corpuscles of light emitted from stars propagating with different speeds through the ether as the Earth spun on its axis and traveled around the sun. He succeeded only in observing ordinary stellar aberration. The absence of the effects of motion through the ether motivated Augustin-Jean Fresnel (1788 – 1827) to apply his newly-developed wave theory of light to explain the null results. In 1818 Fresnel derived an expression for the dragging of light by a moving medium that explained the absence of effects in Arago’s observations. For light propagating through a medium of refractive index n that is moving at a speed v, the resultant velocity of light is

$$u = \frac{c}{n} + v\left(1 - \frac{1}{n^2}\right)$$

where the last term in parentheses is the Fresnel drag coefficient. The Fresnel drag effect supported the idea of the ether by explaining why its effects could not be observed (a kind of Catch-22), but it also applied to light moving through a moving dielectric medium. In 1851, Fizeau used an interferometer to measure the Fresnel drag coefficient for light moving through moving water, arriving at conclusions that directly confirmed the Fresnel drag effect. The positive experiments of Fizeau, as well as the phenomenon of stellar aberration, would be extremely influential on the thoughts of Einstein as he developed his approach to special relativity in 1905. They were also extremely influential to Michelson, Morley and Voigt.
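The size of the drag coefficient for water, and its agreement with the relativistic addition of velocities, can be checked with a short calculation. This is a minimal sketch under stated assumptions: the refractive index and flow speed are illustrative values rather than Fizeau's reported numbers, and the function names are hypothetical.

```python
C = 299_792_458.0  # speed of light in vacuum, m/s

def fresnel_drag_coefficient(n):
    """Fresnel's drag coefficient 1 - 1/n^2 for refractive index n."""
    return 1.0 - 1.0 / n**2

def fresnel_speed(n, v):
    """Fresnel's result for light in a medium moving at speed v:
    c/n + v * (1 - 1/n^2)."""
    return C / n + v * fresnel_drag_coefficient(n)

def relativistic_speed(n, v):
    """The same quantity from relativistic addition of the speeds c/n and v."""
    return (C / n + v) / (1.0 + v / (n * C))

n_water = 1.333  # approximate refractive index of water (assumed)
v_water = 7.0    # illustrative flow speed in m/s (assumed)
print("drag coefficient:", fresnel_drag_coefficient(n_water))
print("Fresnel minus relativistic (m/s):",
      fresnel_speed(n_water, v_water) - relativistic_speed(n_water, v_water))
```

The two expressions agree to first order in v/c, which is one way of seeing why Fizeau's result fits so naturally into special relativity.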
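In his paper on the absence of the Fresnel drag effect in the first Michelson-Morley experiment, Voigt pointed out that an equation of the form

$$\frac{\partial^2 \psi}{\partial x^2} + \frac{\partial^2 \psi}{\partial y^2} + \frac{\partial^2 \psi}{\partial z^2} - \frac{1}{c^2}\frac{\partial^2 \psi}{\partial t^2} = 0$$

is invariant under the transformation

$$x' = x - vt, \qquad y' = y\sqrt{1 - \frac{v^2}{c^2}}, \qquad z' = z\sqrt{1 - \frac{v^2}{c^2}}, \qquad t' = t - \frac{vx}{c^2}$$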
From our modern vantage point, we immediately recognize (to within a scale factor) the Lorentz transformation of relativity theory. The first equation is common Galilean relativity, but the last equation was something new, introducing a time that depends on position and on the observer's speed relative to the speed of light. Using these equations, Voigt was the first to derive the longitudinal (conventional) Doppler effect from relativistic effects.
Voigt’s derivation of the longitudinal Doppler effect used a classical approach that is still used today in Modern Physics textbooks to derive the Doppler effect. The argument proceeds by considering a moving source that emits a continuous wave in the direction of motion. Because the wave propagates at a finite speed, the moving source chases the leading edge of the wave front, catching up by a small amount by the time a single cycle of the wave has been emitted. The resulting compressed oscillation represents a blue shift of the emitted light. By using his transformations, Voigt arrived at the first relativistic expression for the shift in light frequency. At low speeds, Voigt’s derivation reverted to Doppler’s original expression.
A few months after Voigt delivered his paper, Michelson and Morley announced the results of their interferometric measurements of the motion of the Earth through the ether, with their null results. In retrospect, the Michelson-Morley experiment is viewed as one of the monumental assaults on the old classical physics, helping to launch the relativity revolution. However, in its own day, it was little more than just another null result on the ether. It did incite Fitzgerald and Lorentz to suggest that the length of the arms of the interferometer contracted in the direction of motion, with the eventual emergence of the full Lorentz transformations by 1904, seventeen years after the Michelson results.
In 1904 Einstein, working in relative isolation at the Swiss patent office, was surprisingly unaware of the latest advances in the physics of the ether. He did not know about Voigt’s derivation of the relativistic Doppler effect (1887), nor had he heard of Lorentz’s final version of relativistic coordinate transformations (1904). His thinking about relativistic effects focused much farther into the past, to Bradley’s stellar aberration (1725) and Fizeau’s experiment of light propagating through moving water (1851). Einstein proceeded on simple principles, unencumbered by the mental baggage of the day, and delivered his beautifully minimalist theory of special relativity in his famous paper of 1905 “On the Electrodynamics of Moving Bodies”, independently deriving the Lorentz coordinate transformations.
One of Einstein’s talents in theoretical physics was to predict new phenomena as a way to provide direct confirmation of a new theory. This was how he later famously predicted the deflection of light by the Sun and the gravitational frequency shift of light. In 1905 he used his new theory of special relativity to predict observable consequences that included a general treatment of the relativistic Doppler effect. This included the effects of time dilation in addition to the longitudinal effect of the source chasing the wave. Time dilation produced a correction to Doppler’s original expression for the longitudinal effect that became significant at speeds approaching the speed of light. More significantly, it predicted a transverse Doppler effect for a source moving along a line perpendicular to the line of sight to an observer. This effect had not been predicted either by Doppler or by Voigt. The equation for the general Doppler effect for any observation angle is
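In modern notation, one common way to write this relation is f_obs / f_src = sqrt(1 - β²) / (1 - β cos θ), where β = v/c and θ is the angle, measured in the observer's frame, between the source velocity and the direction from the source to the observer. These symbols and this angle convention are assumptions made here for illustration; other texts use different conventions. A minimal Python sketch of the three limiting cases:

```python
import math

def doppler_ratio(beta, theta):
    """Observed/emitted frequency ratio sqrt(1 - beta^2) / (1 - beta*cos(theta)).

    theta is the angle (observer frame) between the source velocity and the
    direction from the source to the observer; theta = 0 means approaching.
    """
    if not 0.0 <= beta < 1.0:
        raise ValueError("beta must satisfy 0 <= beta < 1")
    return math.sqrt(1.0 - beta**2) / (1.0 - beta * math.cos(theta))

beta = 0.6  # illustrative speed as a fraction of c
print("approaching:", doppler_ratio(beta, 0.0))
print("receding:   ", doppler_ratio(beta, math.pi))
print("transverse: ", doppler_ratio(beta, math.pi / 2))
```

In the transverse case the longitudinal chasing term vanishes and only the time-dilation factor sqrt(1 - β²) remains, so the light is red-shifted even though the source is neither approaching nor receding.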
Just as Doppler had been motivated by Bradley’s aberration of starlight when he conceived of his original principle for the longitudinal Doppler effect, Einstein combined the general Doppler effect with his results for the relativistic addition of velocities (also in his 1905 Annalen paper) as the conclusive treatment of stellar aberration nearly 200 years after Bradley first observed the effect.
Despite the generally positive reception of Einstein’s theory of special relativity, some of its consequences were anathema to many physicists at the time. A key stumbling block was the question whether relativistic effects, like moving clocks running slowly, were only apparent, or were actually real, and Einstein had to fight to convince others of their reality. When Johannes Stark (1874 – 1957) observed Doppler line shifts in ion beams called “canal rays” in 1906 (Stark received the 1919 Nobel prize in part for this discovery), Einstein promptly published a paper suggesting how the canal rays could be used in a transverse geometry to directly detect time dilation through the transverse Doppler effect. Thirty years passed before the experiment was performed with sufficient accuracy by Herbert Ives and G. R. Stilwell in 1938 to measure the transverse Doppler effect. Ironically, even at this late date, Ives and Stilwell were convinced that their experiment had disproved Einstein’s time dilation by supporting Lorentz’ contraction theory of the electron. The Ives-Stilwell experiment was the first direct test of time dilation, followed in 1940 by muon lifetime measurements.
 Bradley, J (1729). “Account of a new discovered Motion of the Fix’d Stars”. Phil Trans. 35: 637–660.
 C. A. DOPPLER, “Über das farbige Licht der Doppelsterne und einiger anderer Gestirne des Himmels (About the coloured light of the binary stars and some other stars of the heavens),” Proceedings of the Royal Bohemian Society of Sciences, vol. V, no. 2, pp. 465–482, (Reissued 1903) (1842).
 B. Bolzano, “Ein Paar Bemerkungen über die neue Theorie in Herrn Professor Ch. Doppler’s Schrift “Über das farbige Licht der Doppelsterne und einiger anderer Gestirne des Himmels”,” Pogg. Annalen der Physik und Chemie, vol. 60, p. 83, 1843; B. Bolzano, “Christian Doppler’s neuste Leistungen auf dem Gebiet der physikalischen Apparatenlehre, Akustik, Optik und optische Astronomie,” Pogg. Annalen der Physik und Chemie, vol. 72, pp. 530-555, 1847.
 W. Voigt, “Über das Doppler’sche Princip,” Göttinger Nachrichten, vol. 7, pp. 41–51, (1887). The common use of c to express the speed of light came later from Voigt’s student Paul Drude.
 A. Einstein, “On the electrodynamics of moving bodies,” Annalen Der Physik, vol. 17, pp. 891-921, 1905.
 J. Stark, W. Hermann, and S. Kinoshita, “The Doppler effect in the spectrum of mercury,” Annalen Der Physik, vol. 21, pp. 462-469, Nov 1906.
 A. Einstein, “Über die Möglichkeit einer neuen Prüfung des Relativitätsprinzips,” Annalen der Physik, vol. 328, pp. 197–198, 1907.
 H. E. Ives and G. R. Stilwell, “An experimental study of the rate of a moving atomic clock,” Journal of the Optical Society of America, vol. 28, p. 215, 1938.
 B. Rossi and D. B. Hall, “Variation of the Rate of Decay of Mesotrons with Momentum,” Physical Review, vol. 59, pp. 223–228, 1941.
The first time I ran across the Bohr-Sommerfeld quantization conditions I admit that I laughed! I was a TA for the Modern Physics course as a graduate student at Berkeley in 1982 and I read about Bohr-Sommerfeld in our Tipler textbook. I was familiar with Bohr orbits, which are already the wrong way of thinking about quantized systems. So the Bohr-Sommerfeld conditions, especially for so-called “elliptical” orbits, seemed like nonsense.
But it’s funny how a little distance gives you perspective. Forty years later I know a little more physics than I did then, and I have gained a deep respect for an obscure property of dynamical systems known as “adiabatic invariants”. It turns out that adiabatic invariants lie at the core of quantum systems, and in the case of hydrogen adiabatic invariants can be visualized as … elliptical orbits!
Quantum Physics in Copenhagen
Niels Bohr (1885 – 1962) was born in Copenhagen, Denmark, the middle child of a physiology professor at the University in Copenhagen. Bohr grew up with his siblings as a faculty child, which meant an unconventional upbringing full of ideas, books and deep discussions. Bohr was a late bloomer in secondary school but began to show talent in math and physics in his last two years. When he entered the University in Copenhagen in 1903 to major in physics, the university had only one physics professor, Christian Christiansen, and had no physics laboratories. So Bohr tinkered in his father’s physiology laboratory, performing a detailed experimental study of the hydrodynamics of water jets, writing and submitting a paper that was to be his only experimental work. Bohr went on to receive a Master’s degree in 1909 and his PhD in 1911, writing his thesis on the theory of electrons in metals. Although the thesis did not break much new ground, it uncovered striking disparities between observed properties and theoretical predictions based on the classical theory of the electron. For his postdoc studies he applied for and was accepted to a position working with the discoverer of the electron, Sir J. J. Thomson, in Cambridge. Perhaps fortunately for the future history of physics, he did not get along well with Thomson, and he shifted his postdoc position in early 1912 to work with Ernest Rutherford at the much less prestigious University of Manchester.
Ernest Rutherford had just completed a series of detailed experiments on the scattering of alpha particles on gold film and had demonstrated that the mass of the atom was concentrated in a very small volume that Rutherford called the nucleus, which also carried the positive charge compensating the negative electron charges. The discovery of the nucleus created a radical new model of the atom in which electrons executed planetary-like orbits around the nucleus. Bohr immediately went to work on a theory for the new model of the atom. He worked closely with Rutherford and the other members of Rutherford’s laboratory, involved in daily discussions on the nature of atomic structure. The open intellectual atmosphere of Rutherford’s group and the ready flow of ideas in group discussions became the model for Bohr, who would some years later set up his own research center that would attract the top young physicists of the time. Already by mid 1912, Bohr was beginning to see a path forward, hinting in letters to his younger brother Harald (who would become a famous mathematician) that he had uncovered a new approach that might explain some of the observed properties of simple atoms.
By the end of 1912 his postdoc travel stipend was over, and he returned to Copenhagen, where he completed his work on the hydrogen atom. One of the key discrepancies in the classical theory of the electron in atoms was the requirement, by Maxwell’s Laws, for orbiting electrons to continually radiate because of their centripetal acceleration. Furthermore, from energy conservation, if they radiated continuously, the electron orbits must also eventually decay into the nuclear core with ever-decreasing orbital periods and hence ever higher emitted light frequencies. Experimentally, on the other hand, it was known that light emitted from atoms had only distinct quantized frequencies. To circumvent the problem of classical radiation, Bohr simply assumed what was observed, formulating the idea of stationary quantum states. Light emission (or absorption) could take place only when the energy of an electron changed discontinuously as it jumped from one stationary state to another, and there was a lowest stationary state below which the electron could never fall. He then took a critical and important step, combining this new idea of stationary states with Planck’s constant h. He was able to show that the emission spectrum of hydrogen, and hence the energies of the stationary states, could be derived if the angular momentum of the electron in a hydrogen atom was quantized in integer multiples of Planck’s constant h divided by 2π.
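The link between stationary states and spectral lines can be made concrete with the standard textbook form of the Bohr energy levels, E_n = -Ry/n². The sketch below is only an illustration: the constants are rounded modern values, the infinite-nuclear-mass approximation is assumed, and the function names are hypothetical.

```python
RYDBERG_EV = 13.605693  # hydrogen binding energy in eV (infinite nuclear mass)
HC_EV_NM = 1239.841984  # Planck's constant times the speed of light, in eV*nm

def bohr_energy_ev(n):
    """Energy of the n-th stationary state of hydrogen in the Bohr model."""
    if n < 1:
        raise ValueError("n must be a positive integer")
    return -RYDBERG_EV / n**2

def transition_wavelength_nm(n_upper, n_lower):
    """Vacuum wavelength of the photon emitted in a jump n_upper -> n_lower."""
    photon_ev = bohr_energy_ev(n_upper) - bohr_energy_ev(n_lower)
    return HC_EV_NM / photon_ev

# Balmer series: jumps that end on the n = 2 stationary state
for n_upper in range(3, 7):
    print(f"{n_upper} -> 2: {transition_wavelength_nm(n_upper, 2):.1f} nm")
```

The first Balmer line comes out near 656 nm, in the red, and the jump from n = 2 to n = 1 falls in the ultraviolet near 121.5 nm.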
Bohr published his quantum theory of the hydrogen atom in 1913, which immediately focused the attention of a growing group of physicists (including Einstein, Rutherford, Hilbert, Born, and Sommerfeld) on the new possibilities opened up by Bohr’s quantum theory . Emboldened by his growing reputation, Bohr petitioned the university in Copenhagen to create a new faculty position in theoretical physics, and to appoint him to it. The University was not unreceptive, but university bureaucracies make decisions slowly, so Bohr returned to Rutherford’s group in Manchester while he awaited Copenhagen’s decision. He waited over two years, but he enjoyed his time in the stimulating environment of Rutherford’s group in Manchester, growing steadily into the role as master of the new quantum theory. In June of 1916, Bohr returned to Copenhagen and a year later was elected to the Royal Danish Academy of Sciences.
Although Bohr’s theory had succeeded in describing some of the properties of the electron in atoms, two central features of his theory continued to cause difficulty. The first was the limitation of the theory to single electrons in circular orbits, and the second was the cause of the discontinuous jumps. In response to this challenge, Arnold Sommerfeld provided a deeper mechanical perspective on the origins of the discrete energy levels of the atom.
Quantum Physics in Munich
Arnold Johannes Wilhelm Sommerfeld (1868 – 1951) was born in Königsberg, Prussia, and spent all the years of his education there to his doctorate that he received in 1891. In Königsberg he was acquainted with Minkowski, Wien and Hilbert, and he was the doctoral student of Lindemann. He also was associated with a social group at the University that spent too much time drinking and dueling, a distraction that led to his receiving a deep sabre cut on his forehead that became one of his distinguishing features along with his finely waxed moustache. In outward appearance, he looked the part of a Prussian hussar, but he finally escaped this life of dissipation and landed in Göttingen where he became Felix Klein’s assistant in 1894. He taught at local secondary schools, rising in reputation, until he secured a faculty position of theoretical physics at the University in Munich in 1906. One of his first students was Peter Debye who received his doctorate under Sommerfeld in 1908. Later famous students would include Peter Ewald (doctorate in 1912), Wolfgang Pauli (doctorate in 1921), Werner Heisenberg (doctorate in 1923), and Hans Bethe (doctorate in 1928). These students had the rare treat, during their time studying under Sommerfeld, of spending weekends in the winter skiing and staying at a ski hut that he owned only two hours by train outside of Munich. At the end of the day skiing, discussion would turn invariably to theoretical physics and the leading problems of the day. It was in his early days at Munich that Sommerfeld played a key role aiding the general acceptance of Minkowski’s theory of four-dimensional space-time by publishing a review article in Annalen der Physik that translated Minkowski’s ideas into language that was more familiar to physicists.
Around 1911, Sommerfeld shifted his research interest to the new quantum theory, and his interest only intensified after the publication of Bohr’s model of hydrogen in 1913. In 1915 Sommerfeld significantly extended the Bohr model by building on an idea put forward by Planck. While further justifying the black body spectrum, Planck turned to descriptions of the trajectory of a quantized one-dimensional harmonic oscillator in phase space. Planck had noted that the phase-space areas enclosed by the quantized trajectories were integral multiples of his constant. Sommerfeld expanded on this idea, showing that it was not the area enclosed by the trajectories that was fundamental, but the integral of the momentum over the spatial coordinate . This integral is none other than the original action integral of Maupertuis and Euler, used so famously in their Principle of Least Action almost 200 years earlier. Where Planck, in his original paper of 1901, had recognized the units of his constant to be those of action, and hence called it the quantum of action, Sommerfeld made the explicit connection to the dynamical trajectories of the oscillators. He then showed that the same action principle applied to Bohr’s circular orbits for the electron on the hydrogen atom, and that the orbits need not even be circular, but could be elliptical Keplerian orbits.
The quantum condition for this otherwise classical trajectory was the requirement for the action integral over the motion to be equal to integer units of the quantum of action. Furthermore, Sommerfeld showed that there must be as many action integrals as degrees of freedom for the dynamical system. In the case of Keplerian orbits, there are radial coordinates as well as angular coordinates, and each action integral was quantized for the discrete electron orbits. Although Sommerfeld’s action integrals extended Bohr’s theory of quantized electron orbits, the new quantum conditions also created a problem because there were now many possible elliptical orbits that all had the same energy. How was one to find the “correct” orbit for a given orbital energy?
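The degeneracy problem can be made concrete with a short sketch. It assumes the standard textbook form of Sommerfeld’s two conditions for hydrogen, in which the radial action equals n_r h and the angular action equals n_phi h, so that the energy depends only on the sum n = n_r + n_phi while the shape of the ellipse depends on the ratio n_phi/n. The function name and the numerical value of the Rydberg energy are choices made for this illustration and are not taken from Sommerfeld’s paper.

```python
RYDBERG_EV = 13.605693  # hydrogen binding energy in eV (assumed reference value)


def sommerfeld_orbits(n):
    """List the Sommerfeld ellipses that share the principal quantum number n.

    Each orbit is labelled by (n_r, n_phi) with n = n_r + n_phi and n_phi >= 1.
    Returns tuples of (n_r, n_phi, energy in eV, eccentricity).
    """
    orbits = []
    for n_phi in range(1, n + 1):
        n_r = n - n_phi
        energy_ev = -RYDBERG_EV / n**2        # depends only on n_r + n_phi
        axis_ratio = n_phi / n                # semi-minor axis / semi-major axis
        eccentricity = (1.0 - axis_ratio**2) ** 0.5
        orbits.append((n_r, n_phi, energy_ev, eccentricity))
    return orbits


if __name__ == "__main__":
    for n_r, n_phi, energy_ev, ecc in sommerfeld_orbits(3):
        print(f"n_r={n_r} n_phi={n_phi} E={energy_ev:.3f} eV e={ecc:.3f}")
```

For a given n the loop returns n distinct ellipses, from the most elongated (n_phi = 1) to the circular Bohr orbit (n_phi = n), all with the same energy. This is the multiplicity that prompted the question of which orbit is the “correct” one.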
Quantum Physics in Leiden
In 1906, the Austrian physicist Paul Ehrenfest (1880 – 1933), freshly out of his PhD under the supervision of Boltzmann, arrived at Göttingen only weeks before Boltzmann took his own life. Felix Klein at Göttingen had been relying on Boltzmann to provide a comprehensive review of statistical mechanics for the Mathematical Encyclopedia, so he now entrusted this project to the young Ehrenfest. It was a monumental task, which was to take him and his physicist wife Tatyana nearly five years to complete. Part of the delay was the desire by Ehrenfest to close some open problems that remained in Boltzmann’s work. One of these was a mechanical theorem of Boltzmann’s that identified properties of statistical mechanical systems that remained unaltered through a very slow change in system parameters. These properties would later be called adiabatic invariants by Einstein. Ehrenfest recognized that Wien’s displacement law, which had been a guiding light for Planck and his theory of black body radiation, had originally been derived by Wien using classical principles related to slow changes in the volume of a cavity. Ehrenfest was struck by the fact that such slow changes would not induce changes in the quantum numbers of the quantized states, and hence that the quantum numbers must be adiabatic invariants of the black body system. This not only explained why Wien’s displacement law continued to hold under quantum as well as classical considerations, but it also explained why Planck’s quantization of the energy of his simple oscillators was the only possible choice. For a classical harmonic oscillator, the ratio of the energy of oscillation to the frequency of oscillation is an adiabatic invariant, which is immediately recognized as Planck’s quantum condition.
Ehrenfest published his observations in 1913, the same year that Bohr published his theory of the hydrogen atom, so Ehrenfest immediately applied the theory of adiabatic invariants to Bohr’s model and discovered that the quantum condition for the quantized energy levels was again the adiabatic invariants of the electron orbits, and not merely a consequence of integer multiples of angular momentum, which had seemed somewhat ad hoc. Later, when Sommerfeld published his quantized elliptical orbits in 1916, the multiplicity of quantum conditions and orbits had caused concern, but Ehrenfest came to the rescue with his theory of adiabatic invariants, showing that each of Sommerfeld’s quantum conditions was precisely an adiabatic invariant of the classical electron dynamics. The remaining question was which coordinates were the correct ones, because different choices led to different answers. This was quickly solved by Johannes Burgers (one of Ehrenfest’s students) who showed that action integrals were adiabatic invariants, and then by Karl Schwarzschild and Paul Epstein who showed that action-angle coordinates were the only allowed choice of coordinates, because they enabled the separation of the Hamilton-Jacobi equations and hence provided the correct quantization conditions for the electron orbits. Schwarzschild’s paper was published the same day that he died on the Eastern Front. The work by Schwarzschild and Epstein was the first to show the power of the Hamiltonian formulation of dynamics for quantum systems, which foreshadowed the future importance of Hamiltonians for quantum theory.
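The adiabatic invariance of the ratio of energy to frequency can be checked numerically. The sketch below is a minimal illustration and not a reconstruction of Ehrenfest’s argument: it assumes a unit-mass oscillator whose angular frequency is ramped linearly from 1 to 2 over many periods, and integrates the motion with a velocity-Verlet scheme. The function name, ramp shape, duration and step size are all hypothetical choices.

```python
def adiabatic_ramp(omega_start=1.0, omega_end=2.0, duration=2000.0, dt=0.01):
    """Integrate x'' = -omega(t)^2 x while omega ramps slowly and linearly.

    Returns (E/omega at the start, E/omega at the end, E_end / E_start).
    """
    def omega(t):
        return omega_start + (omega_end - omega_start) * t / duration

    def energy(x, v, w):
        return 0.5 * v * v + 0.5 * w * w * x * x   # unit mass

    steps = int(round(duration / dt))
    x, v = 1.0, 0.0                                 # start at rest, unit amplitude
    e_start = energy(x, v, omega_start)
    accel = -omega(0.0) ** 2 * x
    for i in range(steps):
        # velocity-Verlet update with the time-dependent restoring force
        v_half = v + 0.5 * dt * accel
        x += dt * v_half
        accel = -omega((i + 1) * dt) ** 2 * x
        v = v_half + 0.5 * dt * accel
    e_end = energy(x, v, omega_end)
    return e_start / omega_start, e_end / omega_end, e_end / e_start


if __name__ == "__main__":
    j_start, j_end, energy_gain = adiabatic_ramp()
    print(f"E/omega before: {j_start:.5f}  after: {j_end:.5f}")
    print(f"energy changed by a factor of {energy_gain:.4f}")
```

The energy itself roughly doubles as the frequency doubles, while the ratio E/omega stays nearly fixed. Shortening `duration` so that the ramp is no longer slow compared with the oscillation period makes the ratio drift, which is the sense in which the invariance is adiabatic.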
Emboldened by Ehrenfest’s adiabatic principle, which demonstrated a close connection between classical dynamics and quantization conditions, Bohr formalized a technique that he had used implicitly in his 1913 model of hydrogen, and now elevated it to the status of a fundamental principle of quantum theory. He called it the Correspondence Principle, and published the details in 1920. The Correspondence Principle states that as the quantum number of an electron orbit increases to large values, the quantum behavior converges to classical behavior. Specifically, if an electron in a state of high quantum number emits a photon while jumping to a neighboring orbit, then the wavelength of the emitted photon approaches the classical radiation wavelength of the electron subject to Maxwell’s equations.
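A short numerical check shows how this convergence works for hydrogen. The sketch assumes the Bohr-model expressions for the photon frequency of the jump from orbit n to orbit n - 1 and for the classical orbital frequency of orbit n, both written in terms of the Rydberg frequency. The constant value and function name are illustrative assumptions.

```python
RYDBERG_HZ = 3.2898419603e15  # Rydberg frequency R*c in Hz (assumed reference value)


def frequency_ratio(n):
    """Ratio of the n -> n-1 photon frequency to the classical orbital
    frequency of the n-th Bohr orbit (requires n >= 2)."""
    if n < 2:
        raise ValueError("n must be at least 2")
    photon_hz = RYDBERG_HZ * (1.0 / (n - 1) ** 2 - 1.0 / n**2)
    orbital_hz = 2.0 * RYDBERG_HZ / n**3
    return photon_hz / orbital_hz


if __name__ == "__main__":
    for n in (2, 10, 100, 1000):
        print(n, frequency_ratio(n))
```

Algebraically the ratio is n(2n - 1) / (2(n - 1)^2), which equals 3 for n = 2 and approaches 1 as n grows, so the quantum jump frequency merges with the classical radiation frequency at large quantum numbers.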
Bohr’s Correspondence Principle cemented the bridge between classical physics and quantum physics. One of the biggest former questions about the physics of electron orbits in atoms was why they did not radiate continuously because of the angular acceleration they experienced in their orbits. Bohr had now reconnected to Maxwell’s equations and classical physics in the limit. Like the theory of adiabatic invariants, the Correspondence Principle became a new tool for distinguishing among different quantum theories. It could be used as a filter to distinguish “correct” quantum models, that transitioned smoothly from quantum to classical behavior, from those that did not. Bohr’s Correspondence Principle was to be a powerful tool in the hands of Werner Heisenberg as he reinvented quantum theory only a few years later.
By the end of 1920, all the elements of the quantum theory of electron orbits were apparently falling into place. Bohr’s originally ad hoc quantization condition was now on firm footing. The quantization conditions were related to action integrals that were, in turn, adiabatic invariants of the classical dynamics. This meant that slight variations in the parameters of the dynamical systems would not induce quantum transitions among the various quantum states. This conclusion would have felt right to the early quantum practitioners. Bohr’s quantum model of electron orbits was fundamentally a means of explaining quantum transitions between stationary states. Now it appeared that the condition for the stationary states of the electron orbits was an insensitivity, or invariance, to variations in the dynamical properties. This was analogous to the principle of stationary action where the action along a dynamical trajectory is invariant to slight variations in the trajectory. Therefore, the theory of quantum orbits now rested on firm foundations that seemed as solid as the foundations of classical mechanics.
From the perspective of modern quantum theory, the concept of elliptical Keplerian orbits for the electron is grossly inaccurate. Most physicists shudder when they see the symbol for atomic energy—the classic but mistaken icon of electron orbits around a nucleus. Nonetheless, Bohr and Ehrenfest and Sommerfeld had hit on a deep thread that runs through all of physics—the concept of action—the same concept that Leibniz introduced, that Maupertuis minimized and that Euler canonized. This concept of action is at work in the macroscopic domain of classical dynamics as well as the microscopic world of quantum phenomena. Planck was acutely aware of this connection with action, which is why he so readily recognized his elementary constant as the quantum of action.
However, the old quantum theory was running out of steam. For instance, the action integrals and adiabatic invariants only worked for single electron orbits, leaving the vast bulk of many-electron atomic matter beyond the reach of quantum theory and prediction. The literal electron orbits were a crutch or bias that prevented physicists from moving past them and seeing new possibilities for quantum theory. Orbits were an anachronism, exerting a damping force on progress. This limitation became painfully clear when Bohr and his assistants at Copenhagen–Kramers and Slater–attempted to use their electron orbits to explain the refractive index of gases. The theory was cumbersome and exhausted. It was time for a new quantum revolution by a new generation of quantum wizards–Heisenberg, Born, Schrödinger, Pauli, Jordan and Dirac.
 N. Bohr, “On the Constitution of Atoms and Molecules, Part II Systems Containing Only a Single Nucleus,” Philosophical Magazine, vol. 26, pp. 476–502, 1913.
 A. Sommerfeld, “The quantum theory of spectral lines,” Annalen Der Physik, vol. 51, pp. 1-94, Sep 1916.
 P. Ehrenfest, “Een mechanische theorema van Boltzmann en zijne betrekking tot de quanta theorie (A mechanical theorem of Boltzmann and its relation to the theory of energy quanta),” Verslag van de Gewone Vergaderingen der Wis- en Natuurkundige Afdeeling, vol. 22, pp. 586-593, 1913.
 P. Ehrenfest, “Adiabatic invariables and quantum theory,” Annalen Der Physik, vol. 51, pp. 327-352, Oct 1916.
Albert Einstein defies condensation—it is impossible to condense his approach, his insight, his motivation—into a single word like “genius”. He was complex, multifaceted, contradictory, revolutionary as well as conservative. Some of his work was so simple that it is hard to understand why no-one else did it first, even when they were right in the middle of it. Lorentz and Poincaré spring to mind—they had been circling the ideas of spacetime for decades—but never stepped back to see what the simplest explanation could be. Einstein did, and his special relativity was simple and beautiful, and the math is just high-school algebra. On the other hand, parts of his work—like gravitation—are so embroiled in mathematics and the religion of general covariance that it remains opaque to physics neophytes 100 years later and is usually reserved for graduate study.
Yet there is a third thread in Einstein’s work that relies on pure intuition, neither simple nor complicated, where it is almost impossible to grasp how he made his leap. This is the case when he proposed the real existence of the photon, the quantum particle of light. For ten years after this proposal, it was considered by almost everyone to be his greatest blunder. It even came up when Planck was nominating Einstein for membership in the German Academy of Science. Planck said:
That he may sometimes have missed the target of his speculations, as for example, in his hypothesis of light quanta, cannot really be held against him.
In this single statement, we have the father of the quantum being criticized by the father of the quantum discontinuity.
Max Planck’s Discontinuity
In histories of the development of quantum theory, the German physicist Max Planck (1858 – 1947) is characterized as an unlikely revolutionary. He was an establishment man, in the stolid German tradition, who was already embedded in his career, in his forties, holding a coveted faculty position at the University of Berlin. In his research, he was responding to a theoretical challenge issued by Kirchhoff decades earlier, in 1860, to find the function of temperature and wavelength that described and explained the observed spectrum of radiating bodies. Planck was not looking for a revolution. In fact, he was looking for the opposite. One of his motivations in studying the thermodynamics of electromagnetic radiation was to rebut the statistical theories of Boltzmann. Planck had never been convinced by the atomistic and discrete approach Boltzmann had used to explain entropy and the second law of thermodynamics. With the continuum of light radiation he thought he had the perfect system that would show how entropy behaved in a continuous manner, without the need for discrete quantities.
Therefore, Planck’s original intentions were to use blackbody radiation to argue against Boltzmann—to set back the clock. For this reason, not only was Planck an unlikely revolutionary, he was a counter-revolutionary. But Planck was a revolutionary because that is what he did, whatever his original intentions were, and he accepted his role as a revolutionary when he had the courage to stand in front of his scientific peers and propose a quantum hypothesis that lay at the heart of physics.
Blackbody radiation, at the end of the nineteenth century, was a topic of keen interest and had been measured with high precision. This was in part because it was such a “clean” system, having fundamental thermodynamic properties independent of any of the material properties of the black body, unlike the so-called ideal gases, which always showed some dependence on the molecular properties of the gas. The high-precision measurements of blackbody radiation were made possible by new developments in spectrometers at the end of the century, as well as infrared detectors that allowed very precise and repeatable measurements to be made of the spectrum across broad ranges of wavelengths.
In 1893 the German physicist Wilhelm Wien (1864 – 1928) had used adiabatic expansion arguments to derive what became known as Wien’s Displacement Law, which showed a simple inverse relationship between the temperature of the blackbody and the peak wavelength: the product of the two is a constant. Later, in 1896, he showed that the high-frequency behavior could be described by an exponential function of temperature and wavelength that required no other properties of the blackbody. This was approaching the solution of Kirchhoff’s challenge of 1860 seeking a universal function. However, at lower frequencies Wien’s approximation failed to match the measured spectrum. In mid-year 1900, Planck was able to define a single functional expression that described the experimentally observed spectrum. Planck had succeeded in describing black-body radiation, but he had not satisfied Kirchhoff’s second condition, which was to explain it.
To explain the blackbody spectrum, Planck modeled the emitting body as a set of ideal oscillators. As an expert in the Second Law, Planck started from the functional form for the radiation spectrum and from it found the entropy of the oscillators that produced the spectrum. However, once he had the form for the entropy, he needed to explain why it took that specific form. In this sense, he was working backwards from a known solution rather than forwards from first principles. Planck was at an impasse. He struggled but failed to find any continuum theory that could work.
Then Planck turned to Boltzmann’s statistical theory of entropy, the same theory that he had previously avoided and had hoped to discredit. He described this as “an act of despair … I was ready to sacrifice any of my previous convictions about physics.” In Boltzmann’s expression for entropy, it was necessary to “count” possible configurations of states. But counting can only be done if the states are discrete. Therefore, he lumped the energies of the oscillators into discrete ranges, or bins, that he called “quanta”. The size of the bins was proportional to the frequency of the oscillator, and the proportionality constant had the units of Maupertuis’ quantity of action, so Planck called it the “quantum of action”. Finally, based on this quantum hypothesis, Planck derived the functional form of black-body radiation.
Planck presented his findings at a meeting of the German Physical Society in Berlin on December 14, 1900, introducing the word quantum (plural quanta) into physics from the Latin word that means quantity. It was a casual meeting, and while the attendees knew they were seeing an intriguing new physical theory, there was no sense of a revolution. But Planck himself was aware that he had created something fundamentally new. The radiation law of cavities depended on only two physical properties, the temperature and the wavelength, and on two constants: Boltzmann’s constant k_B and a new constant that later became known as Planck’s constant, h = ΔE/f = 6.6×10⁻³⁴ J·s. By combining these two constants with other fundamental constants, such as the speed of light, Planck was able to establish accurate values for long-sought constants of nature, like Avogadro’s number and the charge of the electron.
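To see how Planck’s single expression contains Wien’s earlier result, the sketch below compares the two. It assumes the modern form of Planck’s law for the spectral energy density of cavity radiation, (8πhf³/c³)/(exp(hf/k_B T) − 1), and Wien’s approximation with the same prefactor and a pure exponential. The function names and the example temperature are hypothetical, and the constants are the present-day SI values rather than Planck’s 1900 estimates.

```python
import math

H = 6.62607015e-34      # Planck's constant, J*s
K_B = 1.380649e-23      # Boltzmann's constant, J/K
C = 299792458.0         # speed of light, m/s


def planck_energy_density(freq_hz, temp_k):
    """Spectral energy density of cavity radiation, J per m^3 per Hz."""
    x = H * freq_hz / (K_B * temp_k)
    return (8.0 * math.pi * H * freq_hz**3 / C**3) / math.expm1(x)


def wien_energy_density(freq_hz, temp_k):
    """Wien's approximation: same prefactor, simple exponential fall-off."""
    x = H * freq_hz / (K_B * temp_k)
    return (8.0 * math.pi * H * freq_hz**3 / C**3) * math.exp(-x)


def planck_over_wien(freq_hz, temp_k):
    """How far Wien's law departs from Planck's law at this frequency."""
    return planck_energy_density(freq_hz, temp_k) / wien_energy_density(freq_hz, temp_k)


if __name__ == "__main__":
    temp_k = 5000.0
    for x in (10.0, 1.0, 0.1):                 # x = h f / (k_B T)
        freq_hz = x * K_B * temp_k / H
        print(f"hf/kT = {x:5.1f}  Planck/Wien = {planck_over_wien(freq_hz, temp_k):.4f}")
```

The ratio equals 1/(1 − exp(−hf/k_B T)), so the two laws agree when hf is much larger than k_B T and part ways at low frequency, which is where Wien’s formula failed against the measurements.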
Although Planck’s quantum hypothesis in 1900 explained the blackbody radiation spectrum, his specific hypothesis was that it was the interaction of the atoms and the light field that was somehow quantized. He certainly was not thinking in terms of individual quanta of the light field.
When Einstein analyzed the properties of the blackbody radiation in 1905, using his deep insight into statistical mechanics, he was led to the inescapable conclusion that light itself must be quantized in amounts E = hf, where h is Planck’s constant and f is the frequency of the light field. Although this equation is exactly the same as Planck’s from 1900, the meaning was completely different. For Planck, this was the discreteness of the interaction of light with matter. For Einstein, this was the quantum of light energy—whole and indivisible—just as if the light quantum were a particle with particle properties. For this reason, we can answer the question posed in the title of this Blog—Einstein takes the honor of being the inventor of the quantum.
Einstein’s clarity of vision is a marvel to behold even to this day. His special talent was to take simple principles, ones that are almost trivial and beyond reproach, and to derive something profound. In Special Relativity, he simply assumed the constancy of the speed of light and derived Lorentz’s transformations that had originally been based on abstruse electromagnetic arguments about the electron. In General Relativity, he assumed that free fall represented an inertial frame, and he concluded that gravity must bend light. In quantum theory, he assumed that the low-density limit of Planck’s theory had to be consistent with light in thermal equilibrium with the black body container, and he concluded that light itself must be quantized into packets of indivisible energy quanta. One immediate consequence of this conclusion was his simple explanation of the photoelectric effect, for which the energy of an electron ejected from a metal by ultraviolet irradiation is a linear function of the frequency of the radiation. Einstein published his theory of the quanta of light as one of his four famous 1905 articles in Annalen der Physik in his Annus Mirabilis.
Einstein’s theory of light quanta was controversial and was slow to be accepted. It is ironic that in 1914 when Einstein was being considered for a position at the University in Berlin, Planck himself, as he championed Einstein’s case to the faculty, implored his colleagues to accept Einstein despite his ill-conceived theory of light quanta. This comment by Planck goes far to show how Planck, father of the quantum revolution, did not fully grasp, even by 1914, the fundamental nature and consequences of his original quantum hypothesis. The following year, in 1915, the American physicist Robert Millikan (1868 – 1953) performed a precise experimental measurement of the photoelectric effect, with the ostensible intention of proving Einstein wrong, but he accomplished just the opposite, providing clean experimental evidence confirming Einstein’s theory of the photoelectric effect.
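The linear relation for the photoelectric effect is simple enough to write down directly. The sketch below assumes the usual form of Einstein’s relation, in which the maximum kinetic energy of the ejected electron is hf minus a material-dependent work function. The function name is hypothetical and the work function of 2.3 eV is an illustrative value, not one taken from the text.

```python
H = 6.62607015e-34          # Planck's constant, J*s
E_CHARGE = 1.602176634e-19  # elementary charge, C (converts joules to eV)


def max_kinetic_energy_ev(freq_hz, work_function_ev):
    """Maximum kinetic energy (eV) of a photoelectron for light of the given
    frequency; zero when the photon energy is below the work function."""
    photon_ev = H * freq_hz / E_CHARGE
    return max(0.0, photon_ev - work_function_ev)


if __name__ == "__main__":
    work_function_ev = 2.3   # illustrative assumption for an alkali-like metal
    for freq_hz in (1.0e14, 6.0e14, 1.0e15, 1.5e15):
        energy_ev = max_kinetic_energy_ev(freq_hz, work_function_ev)
        print(f"f = {freq_hz:.1e} Hz  K_max = {energy_ev:.3f} eV")
```

Above threshold, the slope of kinetic energy against frequency is h/e regardless of the metal, which is why a precise measurement of that slope served as a test of the light-quantum hypothesis.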
The Stimulated Emission of Light
About a year after Millikan proved that the quantum of energy associated with light absorption was absorbed as a whole quantum of energy that was not divisible, Einstein took a step further in his theory of the light quantum. In 1916 he published a paper in the proceedings of the German Physical Society that explored how light would be in a state of thermodynamic equilibrium when interacting with atoms that had discrete energy levels. Once again he used simple arguments, this time using the principle of detailed balance, to derive a new and unanticipated property of light—stimulated emission!
The stimulated emission of light occurs when an electron is in an excited state of a quantum system, like an atom, and an incident photon stimulates the emission of a second photon that has the same energy and phase as the first photon. If there are many atoms in the excited state, then this process leads to a chain reaction as 1 photon produces 2, and 2 produce 4, and 4 produce 8, etc. This exponential gain in photons with the same energy and phase is the origin of laser radiation. At the time that Einstein proposed this mechanism, lasers were half a century in the future, but he was led to this conclusion by extremely simple arguments about transition rates.
Detailed balance is a principle that states that in thermal equilibrium all fluxes are balanced. In the case of atoms with ground states and excited states, this principle requires that as many transitions occur from the ground state to the excited state as from the excited state to the ground state. The crucial new element that Einstein introduced was to distinguish spontaneous emission from stimulated emission. Just as the probability to absorb a photon must be proportional to the photon density, there must be an equivalent process that de-excites the atom that also must be proportional to the photon density. In addition, an electron must be able to spontaneously emit a photon with a rate that is independent of photon density. This leads to distinct coefficients in the transition rate equations that are today called the “Einstein A and B coefficients”. The B coefficients relate to the photon density, while the A coefficient relates to spontaneous emission.
Using the principle of detailed balance together with his A and B coefficients as well as Boltzmann factors describing the number of excited states relative to ground state atoms in equilibrium at a given temperature, Einstein was able to derive an early form of what is today called the Bose-Einstein occupancy function for photons.
Derivation of the Einstein A and B Coefficients
Detailed balance requires the rate from m to n to be the same as the rate from n to m

$$A_{mn} N_m + B_{mn}\,\rho\,N_m = B_{nm}\,\rho\,N_n$$

where the first term is the spontaneous emission rate from the excited state m to the ground state n, the second term is the stimulated emission rate, and the third term (on the right) is the absorption rate from n to m. The numbers in each state are $N_m$ and $N_n$, and the density of photons is $\rho$. The number in the excited state relative to the ground state is given by the Boltzmann factor

$$\frac{N_m}{N_n} = e^{-\Delta E / k_B T}$$

Assuming that the stimulated transition coefficient from n to m is the same as from m to n, $B_{nm} = B_{mn}$, and inserting the Boltzmann factor yields

$$\rho = \frac{A_{mn}/B_{mn}}{e^{\Delta E / k_B T} - 1}$$

The Planck density of photons for ΔE = hf is

$$\rho(f) = \frac{8\pi h f^3}{c^3}\,\frac{1}{e^{hf/k_B T} - 1}$$

which yields the final relation between the spontaneous emission coefficient and the stimulated emission coefficient

$$\frac{A_{mn}}{B_{mn}} = \frac{8\pi h f^3}{c^3}$$

The total emission rate per excited atom is

$$W_{mn} = A_{mn} + B_{mn}\,\rho = A_{mn}\left(1 + \bar{p}\right), \qquad \bar{p} = \frac{1}{e^{hf/k_B T} - 1}$$

where $\bar{p}$ is the average photon number in the cavity. One of the striking aspects of this derivation is that no assumptions are made about the physical mechanisms that determine the coefficient B. Only arguments of detailed balance are required to arrive at these results.
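The consistency of this argument can be verified numerically. The sketch below assumes the standard textbook form of the balance, A N_m + B ρ N_m = B ρ N_n, with the populations set by the Boltzmann factor exp(−hf/k_B T), equal stimulated coefficients in both directions, and the ratio A/B = 8πhf³/c³. With those assumptions the Planck density should balance emission against absorption exactly. The function name, the arbitrary value of B, and the example frequency and temperature are hypothetical.

```python
import math

H = 6.62607015e-34      # Planck's constant, J*s
K_B = 1.380649e-23      # Boltzmann's constant, J/K
C = 299792458.0         # speed of light, m/s


def detailed_balance(freq_hz, temp_k, b_coeff=1.0):
    """Return (emission rate, absorption rate, mean photon number) for a
    two-level atom in equilibrium with Planck radiation.

    b_coeff is an arbitrary stimulated coefficient; the balance holds for any
    positive value because only the ratio A/B is fixed by the argument.
    """
    x = H * freq_hz / (K_B * temp_k)
    a_over_b = 8.0 * math.pi * H * freq_hz**3 / C**3
    a_coeff = a_over_b * b_coeff
    rho = a_over_b / math.expm1(x)            # Planck spectral energy density
    n_ground = 1.0
    n_excited = math.exp(-x)                  # Boltzmann factor relative to ground
    emission = a_coeff * n_excited + b_coeff * rho * n_excited
    absorption = b_coeff * rho * n_ground
    mean_photons = 1.0 / math.expm1(x)
    return emission, absorption, mean_photons


if __name__ == "__main__":
    emission, absorption, mean_photons = detailed_balance(5.0e14, 3000.0)
    print(f"emission / absorption = {emission / absorption:.12f}")
    print(f"mean photon number    = {mean_photons:.6f}")
```

Because stimulated emission is the spontaneous rate multiplied by the mean photon number, spontaneous emission dominates whenever that number is well below one, as it is for visible light at ordinary temperatures.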
Einstein’s Quantum Legacy
Einstein was awarded the Nobel Prize in 1921 for the photoelectric effect, not for the photon nor for any of his other theoretical accomplishments. Even in 1921, the quantum nature of light remained controversial. It was only in 1923, after the American physicist Arthur Compton (1892 – 1962) showed that energy and momentum were conserved in the scattering of photons from electrons, that the quantum nature of light began to be accepted. A few years later, in 1926, the quantum of light was named the “photon” by the American chemical physicist Gilbert Lewis (1875 – 1946).
A blog article like this, that attributes the invention of the quantum to Einstein rather than Planck, must say something about the irony of this attribution. If Einstein is the father of the quantum, he ultimately was led to disinherit his own brain child. His final and strongest argument against the quantum properties inherent in the Copenhagen Interpretation was his famous EPR paper which, against his expectations, launched the concept of entanglement that underlies the coming generation of quantum computers.
Einstein’s Quantum Timeline
1900 – Planck’s quantum discontinuity for the calculation of the entropy of blackbody radiation.
1905 – Einstein’s “Miracle Year”. Proposes the light quantum.
1911 – First Solvay Conference on the theory of radiation and quanta.
1913 – Bohr’s quantum theory of hydrogen.
1914 – Einstein becomes a member of the German Academy of Science.
1915 – Millikan measurement of the photoelectric effect.
1916 – Einstein proposes stimulated emission.
1921 – Einstein receives Nobel Prize for the photoelectric effect. Third Solvay Conference on atoms and electrons.
1927 – Heisenberg’s uncertainty relation. Fifth Solvay International Conference on Electrons and Photons in Brussels. “First” Bohr-Einstein debate on indeterminacy in quantum theory.
1930 – Sixth Solvay Conference on magnetism. “Second” Bohr-Einstein debate.
1935 – Einstein-Podolsky-Rosen (EPR) paper on the completeness of quantum mechanics.
Selected Einstein Quantum Papers
Einstein, A. (1905). “Generation and conversion of light with regard to a heuristic point of view.” Annalen Der Physik 17(6): 132-148.
Einstein, A. (1907). “Die Plancksche Theorie der Strahlung und die Theorie der spezifischen Wärme.” Annalen der Physik 22: 180–190.
Einstein, A. (1909). “On the current state of radiation problems.” Physikalische Zeitschrift 10: 185-193.
Einstein, A. and O. Stern (1913). “An argument for the acceptance of molecular agitation at absolute zero.” Annalen Der Physik 40(3): 551-560.
Einstein, A. (1916). “Strahlungs-Emission und -Absorption nach der Quantentheorie.” Verh. Deutsch. Phys. Ges. 18: 318.
Einstein, A. (1917). “Quantum theory of radiation.” Physikalische Zeitschrift 18: 121-128.
Einstein, A., B. Podolsky and N. Rosen (1935). “Can quantum-mechanical description of physical reality be considered complete?” Physical Review 47(10): 777-780.
 M. Planck, “Elementary quanta of matter and electricity,” Annalen Der Physik, vol. 4, pp. 564-566, Mar 1901.
 M. J. Klein, “Einstein’s first paper on quanta,” in The Natural Philosopher, vol. 3, D. A. Greenberg and D. E. Gershenson, Eds. New York: Blaisdell, 1964.
 A. Einstein, “Generation and conversion of light with regard to a heuristic point of view,” Annalen Der Physik, vol. 17, pp. 132-148, Jun 1905.
In one of my previous blog posts, as I was searching for Schwarzschild’s original papers on Einstein’s field equations and quantum theory, I obtained a copy of the January 1916 – June 1916 volume of the Proceedings of the Royal Prussian Academy of Sciences through interlibrary loan. The extremely thick volume arrived at Purdue about a week after I ordered it online. It came from Oberlin College in Ohio, which had received it as a gift in 1928 from the library of Professor Friedrich Loofs of the University of Halle in Germany. Loofs had been the Haskell Lecturer at Oberlin for the 1911-1912 semesters.
As I browsed through the volume looking for Schwarzschild’s papers, I was amused to find a cornucopia of turn-of-the-century science topics recorded in its pages. There were papers on the overbite and lips of marsupials. There were papers on forgotten languages. There were papers on ancient Greek texts. On the origins of religion. On the philosophy of abstraction. Histories of Indian dramas. Reflections on cancer. But what I found most amazing was a snapshot of the field of physics and mathematics in 1916, with historic papers by historic scientists who changed how we view the world. Here is a snapshot in time and in space, a period of only six months from a single journal, with a list of authors that reads like a who’s who of physics.
In 1916 there were three major centers of science in the world with leading science publications: London with the Philosophical Magazine and Proceedings of the Royal Society; Paris with the Comptes Rendus of the Académie des Sciences; and Berlin with the Proceedings of the Royal Prussian Academy of Sciences and Annalen der Physik. In Russia, there were the scientific Journals of St. Petersburg, but the Bolshevik Revolution was brewing that would overwhelm that country for decades. And in 1916 the academic life of the United States was barely worth noticing except for a few points of light at Yale and Johns Hopkins.
Berlin in 1916 was embroiled in war, but science proceeded relatively unmolested. The six-month volume of the Proceedings of the Royal Prussian Academy of Sciences contains a number of gems. Schwarzschild was one of the most prolific contributors, publishing three papers in just this half-year volume, plus his obituary written by Einstein. But joining Schwarzschild in this volume were Einstein, Planck, Born, Warburg, Frobenius, and Rubens among others, a pantheon of German scientists mostly cut off from the rest of the world at that time, but single-mindedly following their individual threads woven deep into the fabric of the physical world.
Karl Schwarzschild (1873 – 1916)
Schwarzschild had the unenviable yet effective motivation of his impending death to spur him to complete several projects that he must have known would make his name immortal. In this six-month volume he published his three most important papers. The first (pg. 189) was on the exact solution to Einstein’s field equations of general relativity. The solution was for the restricted case of a point mass, yet the derivation yielded the Schwarzschild radius that later became known as the event horizon of a non-rotating black hole. The second paper (pg. 424) expanded the general relativity solutions to a spherically symmetric incompressible liquid mass.
The subject, content and success of these two papers were wholly unexpected from this observational astronomer stationed on the Russian Front during WWI calculating trajectories for German bombardments. He would not have been considered a theoretical physicist but for the importance of his results and the sophistication of his methods. Within only a year after Einstein published his general theory, based as it was on the complicated tensor calculus of Levi-Civita, Christoffel and Ricci-Curbastro that had taken him years to master, Schwarzschild found a solution that had eluded even Einstein.
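For a sense of scale, the Schwarzschild radius of a point mass M is r_s = 2GM/c². The sketch below evaluates it for two familiar masses. The function name is hypothetical, and the values used for the gravitational constant and for the solar and terrestrial masses are approximate reference values assumed for illustration.

```python
G = 6.67430e-11         # gravitational constant, m^3 kg^-1 s^-2
C = 299792458.0         # speed of light, m/s


def schwarzschild_radius_m(mass_kg):
    """Schwarzschild radius r_s = 2 G M / c^2, in meters."""
    return 2.0 * G * mass_kg / C**2


if __name__ == "__main__":
    solar_mass_kg = 1.98847e30   # approximate mass of the Sun
    earth_mass_kg = 5.972e24     # approximate mass of the Earth
    print(f"Sun:   {schwarzschild_radius_m(solar_mass_kg) / 1000.0:.2f} km")
    print(f"Earth: {schwarzschild_radius_m(earth_mass_kg) * 1000.0:.2f} mm")
```

The radius scales linearly with mass: about three kilometers for the Sun and a little under a centimeter for the Earth, both far inside the actual bodies, which is why neither has an event horizon.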
Schwarzschild’s third and final paper (pg. 548) was on an entirely different topic, still not in his official field of astronomy, that positioned all future theoretical work in quantum physics to be phrased in the language of Hamiltonian dynamics and phase space. He proved that action-angle coordinates were the only acceptable canonical coordinates to be used when quantizing dynamical systems. This paper answered a central question that had been nagging Bohr and Einstein and Ehrenfest for years—how to quantize dynamical coordinates. Despite the simple way that Bohr’s quantized hydrogen atom is taught in modern physics, there was an ambiguity in the quantization conditions even for this simple single-electron atom. The ambiguity arose from the numerous possible canonical coordinate transformations that were admissible, yet which led to different forms of quantized motion.
Schwarzschild’s doctoral thesis had been a theoretical topic in astrophysics that applied the celestial mechanics theories of Henri Poincaré to binary star systems. Within Poincaré’s theory were integral invariants that were conserved quantities of the motion. When a dynamical system had as many constraints as degrees of freedom, then every coordinate had an integral invariant. In this unexpected last paper from Schwarzschild, he showed how canonical transformation to action-angle coordinates produced a unique representation in terms of action variables (whose dimensions are the same as Planck’s constant). These action coordinates, with their associated cyclical angle variables, are the only unambiguous representations that can be quantized. The important points of this paper were amplified a few months later in a publication by Schwarzschild’s friend Paul Epstein (1871 – 1939), solidifying this approach to quantum mechanics. Paul Ehrenfest (1880 – 1933) continued this work later in 1916 by defining adiabatic invariants whose quantum numbers remain unchanged under slowly varying conditions, and the program started by Schwarzschild was definitively completed by Paul Dirac (1902 – 1984) at the dawn of quantum mechanics in Göttingen in 1925.
Albert Einstein (1879 – 1955)
In 1916 Einstein was mopping up after publishing his definitive field equations of general relativity the year before. His interests were still cast wide, not restricted only to this latest project. In the 1916 Jan. to June volume of the Prussian Academy Einstein published two papers. Each is remarkably short relative to the other papers in the volume, yet the importance of the papers may stand in inverse proportion to their length.
The first paper (pg. 184) is placed right before Schwarzschild’s first paper on February 3. The subject of the paper is the expression of Maxwell’s equations in four-dimensional spacetime. It is notable that Einstein mentions Hermann Minkowski (1864 – 1909) in the first sentence of the paper. When Minkowski proposed his bold structure of spacetime in 1908, Einstein had been one of his harshest critics, writing letters to the editor about the absurdity of thinking of space and time as a single interchangeable coordinate system. This is ironic, because Einstein today is perhaps best known for the special relativity properties of spacetime, yet he was slow to adopt the spacetime viewpoint. Einstein only came around to spacetime when he realized around 1910 that a general approach to relativity required the mathematical structure of tensor manifolds, and Minkowski had provided just such a manifold: the pseudo-Riemannian manifold of spacetime. Einstein subsequently adopted spacetime with a passion and became its greatest champion, calling out Minkowski where possible to give him his due, although Minkowski had already died tragically of a burst appendix in 1909.
The importance of Einstein’s paper hinges on his derivation of the electromagnetic field energy density using electromagnetic four vectors. The energy density is part of the source term for his general relativity field equations. Any form of energy density can warp spacetime, including electromagnetic field energy. Furthermore, the Einstein field equations of general relativity are nonlinear as gravitational fields modify space and space modifies electromagnetic fields, producing a coupling between gravity and electromagnetism. This coupling is implicit in the case of the bending of light by gravity, but Einstein’s paper from 1916 makes the connection explicit.
Einstein’s second paper (pg. 688) is even shorter, yet it is one of the most daring publications of his career. Because the field equations of general relativity are nonlinear, they are not easy to solve exactly, and Einstein was exploring approximate solutions under conditions of slow speeds and weak fields. In this “non-relativistic” limit the metric tensor separates into a Minkowski metric as a background on which a small metric perturbation remains. This small perturbation has the properties of a wave equation for a disturbance of the gravitational field that propagates at the speed of light. Hence, in the June 22 issue of the Prussian Academy in 1916, Einstein predicts the existence and the properties of gravitational waves. Exactly one hundred years later in 2016, the LIGO collaboration announced the detection of gravitational waves generated by the merger of two black holes.
Max Planck (1858 – 1947)
Max Planck served as the secretary of the Prussian Academy in 1916 yet remained fully active in his research. Although he had launched the quantum revolution with his quantum hypothesis of 1900, he was not a major proponent of quantum theory even as late as 1916. His primary interests lay in thermodynamics and the origins of entropy, following the theoretical approaches of Ludwig Boltzmann (1844 – 1906). In 1916 he was interested in how to best partition phase space as a way to count states and calculate entropy from first principles. His paper in the 1916 volume (pg. 653) calculated the entropy for single-atom solids.
Max Born (1882 – 1970)
Max Born was to be one of the leading champions of the quantum mechanical revolution based at the University of Göttingen in the 1920’s. But in 1916 he was on leave from the University of Berlin working on ranging for artillery. Yet he still pursued his academic interests, like Schwarzschild. On pg. 614 in the Proceedings of the Prussian Academy, Born published a paper on anisotropic liquids, such as liquid crystals and the effect of electric fields on them. It is astonishing to think that so many of the flat-panel displays we have today, whether on our watches or smart phones, are technological descendants of work by Born at the beginning of his career.
Ferdinand Frobenius (1849 – 1917)
Like Schwarzschild, Frobenius was at the end of his career in 1916 and would pass away one year later, but unlike Schwarzschild, his career had been a long one: he received his doctorate under Weierstrass and went on to explore elliptic functions, differential equations, number theory and group theory. One of the papers that established him in group theory appears in the May 4th issue on page 542 where he explores the series expansion of a group.
Heinrich Rubens (1865 – 1922)
Max Planck owed his quantum breakthrough in part to the exquisitely accurate experimental measurements made by Heinrich Rubens on black body radiation. It was only by the precise shape of what came to be called the Planck spectrum that Planck could say with such confidence that his theory of quantized radiation interactions fit Rubens’ spectrum so perfectly. In 1916 Rubens was at the University of Berlin, having taken the position vacated by Paul Drude in 1906. He was a specialist in infrared spectroscopy, and on page 167 of the Proceedings he describes the spectrum of steam and its consequences for the quantum theory.
Emil Warburg (1846 – 1931)
Emil Warburg’s fame is primarily as the father of Otto Warburg who won the 1931 Nobel prize in physiology or medicine. On page 314 Warburg reports on photochemical processes in HBr gases. In an obscure and very indirect way, I am an academic descendant of Emil Warburg. One of his students was Robert Pohl who was a famous early researcher in solid state physics, sometimes called the “father of solid state physics”. Pohl was at the physics department in Göttingen in the 1920’s along with Born and Franck during the golden age of quantum mechanics. Robert Pohl’s son, Robert Otto Pohl, was my professor when I was a sophomore at Cornell University in 1978 for the course on introductory electromagnetism using a textbook by the Nobel laureate Edward Purcell, a quirky volume of the Berkeley Series of physics textbooks. This makes Emil Warburg my professor’s father’s professor.
Papers in the 1916 Vol. 1 of the Prussian Academy of Sciences
Schulze, Alt- und Neuindisches
Orth, Zur Frage nach den Beziehungen des Alkoholismus zur Tuberkulose
Schulze, Die Erhebungen auf der Lippen- und Wangenschleimhaut der Säugetiere
von Wilamowitz-Moellendorff, Die Samia des Menandros
Engler, Bericht über das »Pflanzenreich«
von Harnack, Bericht über die Ausgabe der griechischen Kirchenväter der drei ersten Jahrhunderte
Meinecke, Germanischer und romanischer Geist im Wandel der deutschen Geschichtsauffassung
Rubens und Hettner, Das langwellige Wasserdampfspektrum und seine Deutung durch die Quantentheorie
Einstein, Eine neue formale Deutung der Maxwellschen Feldgleichungen der Elektrodynamik
Schwarzschild, Über das Gravitationsfeld eines Massenpunktes nach der Einsteinschen Theorie
Helmreich, Handschriftliche Verbesserungen zu dem Hippokratesglossar des Galen
Prager, Über die Periode des veränderlichen Sterns RR Lyrae
Holl, Die Zeitfolge des ersten origenistischen Streits
Lüders, Zu den Upanisads. I. Die Samvargavidya
Warburg, Über den Energieumsatz bei photochemischen Vorgängen in Gasen. VI.
Hellmann, Über die ägyptischen Witterungsangaben im Kalender von Claudius Ptolemaeus
Meyer-Lübke, Die Diphthonge im Provenzalischen
Diels, Über die Schrift Antipocras des Nikolaus von Polen
Müller und Sieg, Maitrisimit und »Tocharisch«
Meyer, Ein altirischer Heilsegen
Schwarzschild, Über das Gravitationsfeld einer Kugel aus inkompressibler Flüssigkeit nach der Einsteinschen Theorie
Brauer, Die Verbreitung der Hyracoiden
Correns, Untersuchungen über Geschlechtsbestimmung bei Distelarten
Brahn, Weitere Untersuchungen über Fermente in der Leber von Krebskranken
Erdmann, Methodologische Konsequenzen aus der Theorie der Abstraktion
Bang, Studien zur vergleichenden Grammatik der Türksprachen. I.
Frobenius, Über die Kompositionsreihe einer Gruppe
Schwarzschild, Zur Quantenhypothese
Fischer und Bergmann, Über neue Galloylderivate des Traubenzuckers und ihren Vergleich mit der Chebulinsäure
Schuchhardt, Der starke Wall und die breite, zuweilen erhöhte Berme bei frühgeschichtlichen Burgen in Norddeutschland
Born, Über anisotrope Flüssigkeiten
Planck, Über die absolute Entropie einatomiger Körper
Haberlandt, Blattepidermis und Lichtperzeption
Einstein, Näherungsweise Integration der Feldgleichungen der Gravitation
Lüders, Die Saubhikas. Ein Beitrag zur Geschichte des indischen Dramas
In an ironic twist of the history of physics, Karl Schwarzschild’s fame has eclipsed his own legacy. If asked who Karl Schwarzschild (1873 – 1916) was, you would probably say he’s the guy who solved Einstein’s Field Equations of General Relativity and discovered the radius of black holes. You may also know that he accomplished this Herculean feat while dying slowly behind the German lines on the Eastern Front in WWI. But if asked what else he did, you would probably come up blank. Yet Schwarzschild was one of the most wide-ranging physicists at the turn of the 20th century, which is saying something, because it places him into the same pantheon as Planck, Lorentz, Poincaré and Einstein. Let’s take a look at the part of his career that hides in the shadow of his own radius.
Radius of Interest
Karl Schwarzschild was born in Frankfurt, Germany, shortly after the Franco-Prussian war thrust Prussia onto the world stage as a major political force in Europe. His family were Jewish merchants of longstanding reputation in the city, and Schwarzschild’s childhood was spent in the vibrant Jewish community. One of his father’s friends was a professor at a university in Frankfurt, whose son, Paul Epstein (1871 – 1939), became a close friend of Karl’s at the Gymnasium. Schwarzschild and Epstein would partially shadow each other’s careers despite the fact that Schwarzschild became an astronomer while Epstein became a famous mathematician and number theorist. This was in part because Schwarzschild had a large radius of interests that spanned the breadth of current mathematics and science, practicing both experiments and theory.
Schwarzschild’s application of the Hamiltonian formalism for quantum systems set the stage for the later adoption of Hamiltonian methods in quantum mechanics. He came dangerously close to stating the uncertainty principle that catapulted Heisenberg to fame.
By the time Schwarzschild was sixteen, he had taught himself the mathematics of celestial mechanics to such depth that he published two papers on the orbits of binary stars. He also became fascinated by astronomy and purchased lenses and other materials to construct his own telescope. His interests were helped along by Epstein, who was two years older and whose father had his own private observatory. When Epstein went to study at the University of Strasbourg (then part of the German Empire) Schwarzschild followed him. But Schwarzschild’s main interest in astronomy diverged from Epstein’s main interest in mathematics, and Schwarzschild transferred to the University of Munich where he studied under Hugo von Seeliger (1849 – 1924), the premier German astronomer of his day. Epstein remained at Strasbourg where he studied under Bruno Christoffel (1829 – 1900) and eventually became a professor, but he was forced to relinquish the post when Strasbourg was ceded to France after WWI.
Birth of Stellar Interferometry
Before the Hubble space telescope was launched in 1990, no star had ever been resolved as a direct image. Within a few years of its launch, using its spectacular resolving power, the Hubble optics resolved, just barely, the red supergiant Betelgeuse. No other star (other than the Sun) is close enough or big enough to image the stellar disk, even for the Hubble far above our atmosphere. The reason is that the optical lenses and mirrors of the Hubble, as big as they are at 2.4 meters in diameter, still produce a diffraction pattern that smears the image so that stars cannot be resolved. Yet information on the size of a distant object is encoded as phase in the light waves that are emitted from the object, and this phase information is accessible to interferometry.
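To put a rough number on this diffraction limit, the Rayleigh criterion (angular resolution of about 1.22 times the wavelength divided by the aperture diameter) can be evaluated for a 2.4 meter mirror. The sketch below is only illustrative: the 500 nm wavelength and the function name are assumptions, not values taken from the text.

```python
import math

ARCSEC_PER_RAD = 180.0 * 3600.0 / math.pi  # arcseconds in one radian


def rayleigh_limit_arcsec(wavelength_m, aperture_m):
    """Rayleigh angular resolution of a circular aperture, in arcseconds."""
    return 1.22 * wavelength_m / aperture_m * ARCSEC_PER_RAD


# Assumed wavelength: 500 nm (green light). The 2.4 m aperture is the
# Hubble mirror diameter quoted in the text.
hubble_limit = rayleigh_limit_arcsec(500e-9, 2.4)
print(f"Diffraction limit of a 2.4 m aperture: {hubble_limit:.3f} arcsec")
```

The result is a few hundredths of an arcsecond. Only the very largest nearby stars subtend an angle of that order, so nearly every star remains an unresolved point for a single mirror of this size.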
The first physicist who truly grasped the power of optical interferometry and who understood how to design the first interferometric metrology systems was the French physicist Armand Hippolyte Louis Fizeau (1819 – 1896). Fizeau became interested in the properties of light when he collaborated with his friend Léon Foucault (1819 – 1868) on early uses of photography. The two then embarked on a measurement of the speed of light but had a falling out before the experiment could be finished, and both continued the pursuit independently. Fizeau achieved the first measurement using a toothed wheel rotating rapidly, while Foucault came in second using a more versatile system with a spinning mirror. Yet Fizeau surpassed Foucault in optical design and became an expert in interference effects. Interference apparatus had been developed earlier by Augustin Fresnel (the Fresnel bi-prism 1819), Humphrey Lloyd (Lloyd’s mirror 1834) and Jules Jamin (Jamin’s interferential refractor 1856). They had found ways of redirecting light using refraction and reflection to cause interference fringes. But Fizeau was one of the first to recognize that each emitting region of a light source was coherent with itself, and he combined this insight with the use of lenses to design the first interferometer.
Fizeau’s interferometer used a lens with a tight focal spot masked off by an opaque screen with two open slits. When the masked lens device was focused on an intense light source it produced two parallel pencils of light that were mutually coherent but spatially separated. Fizeau used this apparatus to measure the speed of light in moving water in 1859.
The working principle of the Fizeau refractometer is shown in Fig. 1. The light source is at the bottom, and it is reflected by the partially-silvered beam splitter to pass through the lens and the mask containing two slits. (Only the light paths that pass through the double-slit mask on the lens are shown in the figure.) The slits produce two pencils of mutually coherent light that pass through a system (in the famous Fizeau ether drag experiment it was along two tubes of moving water) and are returned through the same slits, and they intersect at the view port where they produce interference fringes. The fringe spacing is set by the separation of the two slits in the mask. The Rayleigh region of the lens defines a region of spatial coherence even for a so-called “incoherent” source. Therefore, this apparatus, by use of the lens, could convert an incoherent light source into a coherent probe to test the refractive index of test materials, which is why it was called a refractometer.
Fizeau became adept at thinking of alternative optical designs of his refractometer and alternative applications. In an address to the French Physical Society in 1868 he suggested that the double-slit mask could be used on a telescope to determine sizes of distant astronomical objects. There were several subsequent attempts to use Fizeau’s configuration in astronomical observations, but none were conclusive and hence were not widely known.
An optical configuration and astronomical application that was very similar to Fizeau’s idea was proposed by Albert Michelson in 1890. He built the apparatus and used it to successfully measure the size of several moons of Jupiter. The configuration of the Michelson stellar interferometer is shown in Fig. 2. Light from a distant star passes through two slits in the mask in front of the collecting optics of a telescope. When the two pencils of light intersect at the view port, they produce interference fringes. Because of the finite size of the stellar source, the fringes are partially washed out. By adjusting the slit separation, a certain separation can be found where the fringes completely wash out. The size of the star is then related to the separation of the slits for which the fringe visibility vanishes. This simple principle allows this type of stellar interferometry to measure the size of stars that are large and relatively close to Earth. However, if stars are too far away even this approach cannot be used to measure their sizes because telescopes aren’t big enough. This limitation is currently being bypassed by the use of long-baseline optical interferometers.
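The wash-out of the fringes can be made quantitative with a simple model. The sketch below assumes the star is a uniformly bright disk, for which the standard textbook result is a fringe visibility of |2 J1(x)/x| with x = pi * theta * d / lambda, where theta is the angular diameter, d is the slit separation and lambda is the wavelength. The first null falls at d of about 1.22 lambda / theta. The uniform-disk model, the function names and the numerical values (0.05 arcsec, 550 nm) are illustrative assumptions, not details taken from Michelson’s papers.

```python
import math


def bessel_j1(x, steps=2000):
    """J1(x) from Bessel's integral (1/pi) * int_0^pi cos(t - x sin t) dt,
    evaluated with the midpoint rule."""
    h = math.pi / steps
    total = 0.0
    for k in range(steps):
        t = (k + 0.5) * h
        total += math.cos(t - x * math.sin(t))
    return total * h / math.pi


def disk_visibility(slit_sep_m, ang_diam_rad, wavelength_m):
    """Fringe visibility |2 J1(x)/x| for a uniformly bright disk."""
    x = math.pi * ang_diam_rad * slit_sep_m / wavelength_m
    if x == 0.0:
        return 1.0
    return abs(2.0 * bessel_j1(x) / x)


def null_separation(ang_diam_rad, wavelength_m):
    """Slit separation at which the fringes first vanish (approximate)."""
    return 1.22 * wavelength_m / ang_diam_rad


# Hypothetical star: 0.05 arcsec angular diameter observed at 550 nm.
theta = 0.05 * math.pi / (180.0 * 3600.0)
lam = 550e-9
d_null = null_separation(theta, lam)
print(f"Fringes vanish at a slit separation of about {d_null:.2f} m")
print(f"Visibility at one tenth of that: {disk_visibility(0.1 * d_null, theta, lam):.3f}")
```

In practice the relation is used in reverse: the observer widens the slits until the fringes disappear, and the angular diameter then follows from the measured separation.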
One of the open questions in the history of interferometry is whether Michelson was aware of Fizeau’s proposal for the stellar interferometer made in 1868. Michelson was well aware of Fizeau’s published research and acknowledged him as a direct inspiration of his own work in interference effects. But Michelson also was unaware of the undercurrents in the French school of optical interference. When he visited Paris in 1881, he met with many of the leading figures in this school (including Lippmann and Cornu), but there is no mention or any evidence that he met with Fizeau. By this time Fizeau’s wife had passed away, and Fizeau spent most of his time in seclusion at his home outside Paris. Therefore, it is unlikely that he would have been present during Michelson’s visit. Because Michelson viewed Fizeau with such awe and respect, if he had met him, he most certainly would have mentioned it. Therefore, Michelson’s invention of the stellar interferometer can be considered with some confidence to be a case of independent discovery. It is perhaps not surprising that he hit on the same idea that Fizeau had in 1868, because Michelson was one of the few physicists who understood coherence and interference at the same depth as Fizeau.
The physics of the Michelson stellar interferometer is very similar to the physics of Young’s double slit experiment. The two slits in the aperture mask of the telescope objective act to produce a simple sinusoidal interference pattern at the image plane of the optical system. The size of the stellar diameter is determined by using the wash-out effect of the fringes caused by the finite stellar size. However, it is well known to physicists who work with diffraction gratings that a multiple-slit interference pattern has a much greater resolving power than a simple double slit.
This realization must have hit von Seeliger and Schwarzschild, working together at Munich, when they saw the publication of Michelson’s theoretical analysis of his stellar interferometer in 1890, followed by his use of the apparatus to measure the size of Jupiter’s moons. Schwarzschild and von Seeliger realized that by replacing the double-slit mask with a multiple-slit mask, the widths of the interference maxima would be much narrower. Such a diffraction mask on a telescope would cause a star to produce a multiple set of images on the image plane of the telescope associated with the multiple diffraction orders. More interestingly, if the target were a binary star, the diffraction would produce two sets of diffraction maxima—a double image! If the “finesse” of the grating is high enough, the binary star separation could be resolved as a doublet in the diffraction pattern at the image, and the separation could be measured, giving the angular separation of the two stars of the binary system. Such an approach to the binary separation would be a direct measurement, which was a distinct and clever improvement over the indirect Michelson configuration that required finding the extinction of the fringe visibility.
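The narrowing of the interference maxima with more slits can be checked numerically. The sketch below uses the idealized N-slit interference factor (sin(N phi/2) / (N sin(phi/2)))^2, where phi is the phase difference between neighboring slits. It ignores the single-slit envelope and the telescope optics, and the choice of four slits is only an illustrative comparison against the double slit.

```python
import math


def n_slit_intensity(phase, n_slits):
    """Normalized N-slit interference intensity for neighbor phase difference `phase`."""
    s = math.sin(phase / 2.0)
    if abs(s) < 1e-12:
        return 1.0  # principal maximum (limit of the ratio)
    return (math.sin(n_slits * phase / 2.0) / (n_slits * s)) ** 2


def half_width(n_slits, samples=20000):
    """Phase offset from a principal maximum where the intensity first falls to 1/2."""
    for k in range(1, samples):
        phase = math.pi * k / samples
        if n_slit_intensity(phase, n_slits) <= 0.5:
            return phase
    return math.pi


width_2 = half_width(2)
width_4 = half_width(4)
print(f"Half-width, 2 slits: {width_2:.3f} rad")
print(f"Half-width, 4 slits: {width_4:.3f} rad")
```

Going from two to four slits more than halves the width of each maximum in this idealized model. Two closely spaced sets of maxima, one from each member of a binary star, are therefore easier to separate.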
Schwarzschild enlisted the help of a fine German instrument maker to create a multiple slit system that had an adjustable slit separation. The device is shown in Fig. 3 from Schwarzschild’s 1896 publication on the use of the stellar interferometer to measure the separation of binary stars. The device is ingenious. By rotating the chain around the gear on the right-hand side of the apparatus, the two metal plates with four slits could be raised or lowered, causing the projection onto the objective plane to have variable slit spacings. In the operation of the telescope, the changing height of the slits does not matter, because they are near a conjugate optical plane (the entrance pupil) of the optical system. Using this adjustable multiple slit system, Schwarzschild (and two colleagues he enlisted) made multiple observations of well-known binary star systems, and they calculated the star separations. Several of their published results are shown in Fig. 4.
Schwarzschild’s publication demonstrated one of the very first uses of stellar interferometry—well before Michelson himself used his own configuration to measure the diameter of Betelgeuse in 1920. Schwarzschild’s major achievement was performed before he had received his doctorate, on a topic orthogonal to his dissertation topic. Yet this fact is virtually unknown to the broader physics community outside of astronomy. If he had not become so famous later for his solution of Einstein’s field equations, Schwarzschild nonetheless might have been famous for his early contributions to stellar interferometry. But even this was not the end of his unique contributions to physics.
As Schwarzschild worked for his doctorate under von Seeliger, his dissertation topic was on new theories by Henri Poincaré (1854 – 1912) on celestial mechanics. Poincaré had made a big splash on the international stage with the publication of his prize-winning memoir in 1890 on the three-body problem. This is the publication where Poincaré first described what would later become known as chaos theory. The memoir was followed by his volumes on “New Methods in Celestial Mechanics” published between 1892 and 1899. Poincaré’s work on celestial mechanics was based on his earlier work on the theory of dynamical systems where he developed important invariant theorems that generalize results such as Liouville’s theorem on the conservation of phase space volume. Schwarzschild applied Poincaré’s theorems to problems in celestial orbits. He took his doctorate in 1896 and received a post at an astronomical observatory outside Vienna.
While at Vienna, Schwarzschild performed his most important sustained contributions to the science of astronomy. Astronomical observations had been dominated for centuries by the human eye, but photographic techniques had been making steady inroads since the time of Hermann Carl Vogel (1841 – 1907) in the 1880’s at the Potsdam observatory. Photographic plates were used primarily to record star positions but were known to be unreliable for recording stellar intensities. Schwarzschild developed an “out-of-focus” technique that blurred the star’s image, while making it larger and easier to measure the density of the exposed and developed photographic emulsions. In this way, Schwarzschild measured the magnitudes of 367 stars. Two of these stars had variable magnitudes that he was able to record and track. Schwarzschild correctly explained the intensity variation as caused by steady oscillations in heating and cooling of the stellar atmosphere. This work established the properties of these Cepheid variables which would become some of the most important “standard candles” for the measurement of cosmological distances. Based on the importance of this work, Schwarzschild returned to Munich as a teacher in 1899 and subsequently was appointed in 1901 as the director of the observatory at Göttingen established by Gauss eighty years earlier.
Schwarzschild’s years at Göttingen brought him into contact with some of the greatest mathematicians and physicists of that era. The mathematicians included Felix Klein, David Hilbert and Hermann Minkowski. The physicists included von Laue, a student of Woldemar Voigt. This period was one of several “golden ages” of Göttingen. The first golden age was the time of Gauss and Riemann in the mid-1800’s. The second golden age, when Schwarzschild was present, began when Felix Klein arrived at Göttingen and attracted the top mathematicians of the time. The third golden age of Göttingen was the time of Born and Jordan and Heisenberg at the birth of quantum mechanics in the mid 1920’s.
In 1906, the Austrian physicist Paul Ehrenfest, freshly out of his PhD under the supervision of Boltzmann, arrived at Göttingen only weeks before Boltzmann took his own life. Felix Klein at Göttingen had been relying on Boltzmann to provide a comprehensive review of statistical mechanics for the Mathematical Encyclopedia, so he now entrusted this project to the young Ehrenfest. It was a monumental task, which was to take him and his physicist wife Tatyana nearly five years to complete. Part of the delay was the desire by the Ehrenfests to close some open problems that remained in Boltzmann’s work. One of these was a mechanical theorem of Boltzmann’s that identified properties of statistical mechanical systems that remained unaltered through a very slow change in system parameters. These properties would later be called adiabatic invariants by Einstein.
Ehrenfest recognized that Wien’s displacement law, which had been a guiding light for Planck and his theory of black body radiation, had originally been derived by Wien using classical principles related to slow changes in the volume of a cavity. Ehrenfest was struck by the fact that such slow changes would not induce changes in the quantum numbers of the quantized states, and hence that the quantum numbers must be adiabatic invariants of the black body system. This not only explained why Wien’s displacement law continued to hold under quantum as well as classical considerations, but it also explained why Planck’s quantization of the energy of his simple oscillators was the only possible choice. For a classical harmonic oscillator, the ratio of the energy of oscillation to the frequency of oscillation is an adiabatic invariant, which is immediately recognized as Planck’s quantum condition.
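The adiabatic invariance of the ratio of energy to frequency for a classical harmonic oscillator can be illustrated with a short numerical experiment. The sketch below integrates a unit-mass oscillator whose angular frequency is ramped slowly and linearly, then compares E/omega before and after the ramp. The leapfrog integrator, the linear ramp and all numerical values are illustrative assumptions rather than anything taken from Ehrenfest’s paper.

```python
def energy_over_frequency(omega_start, omega_end, ramp_time, dt=0.01):
    """Integrate x'' = -omega(t)^2 x through a slow linear frequency ramp.

    Returns (E/omega before the ramp, E/omega after the ramp) for unit mass.
    """
    def omega(t):
        return omega_start + (omega_end - omega_start) * t / ramp_time

    def ratio(x, p, w):
        return (0.5 * p * p + 0.5 * w * w * x * x) / w

    x, p, t = 1.0, 0.0, 0.0
    initial = ratio(x, p, omega_start)
    for _ in range(int(ramp_time / dt)):
        # Leapfrog step: half kick, full drift, half kick.
        p -= 0.5 * dt * omega(t) ** 2 * x
        x += dt * p
        t += dt
        p -= 0.5 * dt * omega(t) ** 2 * x
    return initial, ratio(x, p, omega(t))


# The frequency doubles over a ramp lasting many oscillation periods.
initial, final = energy_over_frequency(1.0, 2.0, 500.0)
print(f"E/omega before: {initial:.4f}, after: {final:.4f}")
```

The energy itself roughly doubles as the frequency doubles, while the ratio E/omega stays very nearly fixed. A quantum condition of the form E = n h nu therefore keeps the same n through a slow change.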
Ehrenfest published his observations in 1913, the same year that Bohr published his theory of the hydrogen atom. Ehrenfest immediately applied the theory of adiabatic invariants to Bohr’s model and discovered that the quantum conditions for the quantized energy levels were again the adiabatic invariants of the electron orbits, and not merely a consequence of integer multiples of angular momentum, which had seemed somewhat ad hoc.
After eight exciting years at Göttingen, Schwarzschild was offered the position at the Potsdam Observatory in 1909 upon the retirement from that post of the famous German astronomer Carl Vogel who had made the first confirmed measurements of the optical Doppler effect. Schwarzschild accepted and moved to Potsdam with a new family. His son Martin Schwarzschild would follow him into his profession, becoming a famous astronomer at Princeton University and a theorist on stellar structure. At the outbreak of WWI, Schwarzschild joined the German army out of a sense of patriotism. Because of his advanced education he was made an officer of artillery with the job to calculate artillery trajectories, and after a short time on the Western Front in Belgium was transferred to the Eastern Front in Russia. Though he was not in the trenches, he was in the midst of the chaos to the rear of the front. Despite this situation, he found time to pursue his science through the year 1915.
Schwarzschild was intrigued by Ehrenfest’s paper on adiabatic invariants and their similarity to several of the invariant theorems of Poincaré that he had studied for his doctorate. Up until this time, mechanics had been mostly pursued through the Lagrangian formalism which could easily handle generalized forces associated with dissipation. But celestial mechanics deals with conservative systems, for which the Hamiltonian formalism is a more natural approach. In particular, the Hamilton-Jacobi canonical transformations made it particularly easy to find pairs of generalized coordinates that had simple periodic behavior. In his published paper, Schwarzschild called these “Action-Angle” coordinates because one was the action integral that was well-known in the principle of “Least Action”, and the other was like an angle variable that changed steadily in time (see Fig. 5). Action-angle coordinates have come to form the foundation of many of the properties of Hamiltonian chaos, Hamiltonian maps, and Hamiltonian tapestries.
During lulls in bombardments, Schwarzschild translated the Hamilton-Jacobi methods of celestial mechanics to apply them to the new quantum mechanics of the Bohr orbits. The phrase “quantum mechanics” had not yet been coined (that would come ten years later in a paper by Max Born), but it was clear that the Bohr quantization conditions were a new type of mechanics. The periodicities that were inherent in the quantum systems were natural properties that could be mapped onto the periodicities of the angle variables, while Ehrenfest’s adiabatic invariants could be mapped onto the slowly varying action integrals. Schwarzschild showed that action-angle coordinates were the only allowed choice of coordinates, because they enabled the separation of the Hamilton-Jacobi equations and hence provided the correct quantization conditions for the Bohr electron orbits. Later, when Sommerfeld published his quantized elliptical orbits in 1916, the multiplicity of quantum conditions and orbits caused concern, but Ehrenfest came to the rescue, showing that each of Sommerfeld’s quantum conditions was precisely one of Schwarzschild’s action-integral invariants of the classical electron dynamics.
The works by Schwarzschild, and a closely-related paper that amplified his ideas published by his friend Paul Epstein several months later, were the first to show the power of the Hamiltonian formulation of dynamics for quantum systems, foreshadowing the future importance of Hamiltonians for quantum theory. An essential part of the Hamiltonian formalism is the concept of phase space. In his paper, Schwarzschild showed that the phase space of quantum systems was divided into small but finite elementary regions whose areas were equal to Planck’s constant h (see Fig. 6). The areas were products of a small change in momentum coordinate Delta-p and a corresponding small change in position coordinate Delta-x. Therefore, the product Delta-x Delta-p = h. This observation, made in 1915 by Schwarzschild, was only one step away from Heisenberg’s uncertainty relation, twelve years before Heisenberg discovered it. However, in 1915 Born’s probabilistic interpretation of quantum mechanics had not yet been made, nor the idea of measurement uncertainty, so Schwarzschild did not have the appropriate context in which to have made the leap to the uncertainty principle. However, by introducing the action-angle coordinates as well as the Hamiltonian formalism applied to quantum systems, with the natural structure of phase space, Schwarzschild laid the foundation for the future developments in quantum theory made by the next generation.
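The division of phase space into elementary cells can be illustrated with the simplest case, a harmonic oscillator. Its orbit in the position-momentum plane is an ellipse whose enclosed area is the action integral. Under the old-quantum-theory assumption E_n = n h nu, successive orbits enclose areas n h, so the ring between neighboring orbits has area h (that is, 2 pi times h-bar). The sketch below checks this arithmetic. The mass and frequency are hypothetical values chosen only for illustration, and the function name is likewise an assumption.

```python
import math

PLANCK_H = 6.62607015e-34  # Planck's constant in J*s (exact SI value)


def orbit_area(energy, mass, nu):
    """Phase-space area enclosed by a harmonic-oscillator orbit of given energy."""
    omega = 2.0 * math.pi * nu
    x_max = math.sqrt(2.0 * energy / (mass * omega ** 2))  # turning point
    p_max = math.sqrt(2.0 * mass * energy)                  # peak momentum
    return math.pi * x_max * p_max                          # area of the ellipse


# Hypothetical oscillator: mass 1e-26 kg, frequency 1e13 Hz.
mass, nu = 1.0e-26, 1.0e13
areas = [orbit_area(n * PLANCK_H * nu, mass, nu) for n in range(1, 5)]
cells = [outer - inner for inner, outer in zip(areas, areas[1:])]
for n, cell in enumerate(cells, start=1):
    print(f"Area between orbits {n} and {n + 1}: {cell / PLANCK_H:.6f} h")
```

The cell area is independent of the mass and frequency chosen, which is why the action variable serves as a natural quantity to quantize in this picture.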
Quiet on the Eastern Front
Toward the end of his second stay in Munich in 1900, prior to joining the Göttingen faculty, Schwarzschild presented a paper at a meeting of the German Astronomical Society held in Heidelberg in August. The topic was unlike anything he had tackled before. It considered the highly theoretical question of whether the universe was non-Euclidean, and more specifically whether it had curvature. He concluded from observation that if the universe were curved, the radius of curvature must exceed a lower bound of somewhere between 50 and 2000 light years, depending on whether the geometry was hyperbolic or elliptical. Schwarzschild was working out ideas of differential geometry and applying them to the universe at large at a time when Einstein was just graduating from the ETH, where he skipped his math classes and had his friend Marcel Grossmann take notes for him.
The topic of Schwarzschild’s talk tells an important story about the warping of historical perspective by the “great man” syndrome. In this case the great man is Einstein, who is today given all the credit for discovering the warping of space. His development of General Relativity is often portrayed as the work of a lone genius in the wilderness performing a blazing act of creation out of the void. In fact, non-Euclidean geometry had been around for some time by 1900, five years before Einstein’s Special Theory and ten years before his first publications on the General Theory. Gauss had developed the idea of the intrinsic curvature of a surface in the 1820s, and Riemann generalized it to manifolds in 1854. By the turn of the century alternative geometries were all the rage, and Schwarzschild considered whether there were sufficient astronomical observations to set limits on the size of curvature of the universe. But revisionist history is just as prevalent in physics as in any field, and when someone like Einstein becomes so big in the mind’s eye, his shadow makes it difficult to see all the people standing behind him.
This is not meant to take away from the feat that Einstein accomplished. The General Theory of Relativity, published by Einstein in its full form in 1915, was spectacular. Einstein had taken vague notions about curved spaces and had made them specific, mathematically rigorous and intimately connected with physics through the mass-energy source term in his field equations. His mathematics had gone beyond even what his mathematician friend and former collaborator Grossmann could achieve. Yet Einstein’s field equations were nonlinear tensor differential equations in which the warping of space depended on the strength of energy fields, but the configuration of those energy fields depended on the warping of space. This type of nonlinear equation is difficult to solve in general terms, and Einstein was not immediately aware of how to find the solutions to his own equations.
Therefore, it was no small surprise to him when he received a letter from the Eastern Front from an astronomer he barely knew who had found a solution, and a simple one at that (see Fig. 7), to his field equations. Einstein probably wondered how he could have missed it, but he was generous and forwarded the letter to the Reports of the Prussian Academy of Sciences, where it was published in 1916.
In the same paper, Schwarzschild used his exact solution to find the exact equation that described the precession of the perihelion of Mercury that Einstein had only calculated approximately. The dynamical equations for Mercury are shown in Fig. 8.
Schwarzschild’s solution to Einstein’s Field Equation of General Relativity was not a general solution, even for a point mass. He had constants of integration that could have arbitrary values, such as the characteristic length scale that Schwarzschild called “alpha”. It was David Hilbert who later expanded upon Schwarzschild’s work, giving the general solution and naming the characteristic length scale (where the metric diverges) after Schwarzschild. This is where the phrase “Schwarzschild Radius” got its name, and it stuck. In fact it stuck so well that Schwarzschild’s radius has now eclipsed much of the rest of Schwarzschild’s considerable accomplishments.
Unfortunately, Schwarzschild’s accomplishments were cut short when he contracted an autoimmune disease that may have been hereditary. It is ironic that in the carnage of the Eastern Front, it was a genetic disease that caused his death at the age of 42. He was already suffering from the effects of the disease as he worked on his last publications. He was sent home from the front to his family in Potsdam where he passed away several months later having shepherded his final two papers through the publication process. His last paper, on the action-angle variables in quantum systems , was published on the day that he died.
Schwarzschild’s legacy was assured when he solved Einstein’s field equations and Einstein communicated it to the world. But his hidden legacy is no less important.
Schwarzschild’s application of the Hamiltonian formalism of canonical transformations and phase space for quantum systems set the stage for the later adoption of Hamiltonian methods in quantum mechanics. He came dangerously close to stating the uncertainty principle that catapulted Heisenberg to later fame, although he could not express it in probabilistic terms because he came too early.
Schwarzschild is considered to be the greatest German astronomer of the last hundred years. This is in part based on his work at the birth of stellar interferometry and in part on his development of stellar photometry and the calibration of the Cepheid variable stars that went on to revolutionize our view of our place in the universe. Solving Einstein’s field equations was just a sideline for him, a hobby to occupy his active and curious mind.
Fizeau, H. L. (1849). “Sur une expérience relative à la vitesse de propagation de la lumière.” Comptes rendus de l’Académie des sciences 29: 90–92, 132.
Foucault, J. L. (1862). “Détermination expérimentale de la vitesse de la lumière: parallaxe du Soleil.” Comptes rendus de l’Académie des sciences 55: 501–503, 792–596.
Fizeau, H. (1859). “Sur les hypothèses relatives à l’éther lumineux.” Ann. Chim. Phys. Ser. 4 57: 385–404.
Fizeau, H. (1868). “Prix Bordin: Rapport sur le concours de l’annee 1867.” C. R. Acad. Sci. 66:
Michelson, A. A. (1890). “I. On the application of interference methods to astronomical measurements.” The London, Edinburgh, and Dublin Philosophical Magazine and Journal of Science 30(182):
Michelson, A. A. (1891). “Measurement of Jupiter’s Satellites by Interference.” Nature 45(1155): 160-161.
Schwarzschild, K. (1896). “Über messung von doppelsternen durch interferenzen.” Astron. Nachr. 3335: 139.
Ehrenfest, P. “Een mechanische theorema van Boltzmann en zijne betrekking tot de quanta theorie (A mechanical theorem of Boltzmann and its relation to the theory of energy quanta),” Verslag van de Gewoge Vergaderingen der Wis-en Natuurkungige Afdeeling, vol. 22, pp. 586-593,
Schwarzschild, K. (1916). “Quantum hypothesis.” Sitzungsberichte Der Königlich Preussischen Akademie Der Wissenschaften: 548-568.
Ehrenfest, P. “Adiabatic invariables and quantum theory,” Annalen Der Physik, vol. 51, pp. 327-352, Oct 1916.
Epstein, P. S. (1916). “The quantum theory.” Annalen Der Physik 51(18): 168-188.
Einstein, A. (1915). “On the general theory of relativity.” Sitzungsberichte Der Königlich Preussischen Akademie Der
Schwarzschild, K. (1916). “Über das Gravitationsfeld eines Massenpunktes nach der Einstein’schen Theorie.” Sitzungsberichte der Königlich-Preussischen Akademie der Wissenschaften: 189.
As a graduate student in physics at Berkeley in the 1980s, I took General Relativity (aka GR) from Bruno Zumino, who was a world-famous physicist known as one of the originators of super-symmetry in quantum gravity (not to be confused with super-asymmetry of Cooper-Fowler Big Bang Theory fame). The class textbook was Gravitation and cosmology: principles and applications of the general theory of relativity, by Steven Weinberg, another world-famous physicist, in this case known for the unification of the weak force with electromagnetism into the electroweak force. With so much expertise at hand, how could I fail to absorb the simple essence of general relativity?
The answer is that I failed miserably. Somehow, I managed to pass the course, but I walked away with nothing! And it bugged me for years. What was so hard about GR? It took me almost a decade teaching undergraduate physics classes at Purdue in the 90s before I realized that my biggest obstacle had been language: I kept mistaking the words and terms of GR for English. Words like “general covariance” and “contravariant” and “contraction” and “covariant derivative”. They sounded like English, with lots of “co” prefixes that were hard to keep straight, but they actually are part of a very different language that I call Physics-ese.
Physics-ese is a language that has lots of words that sound like English, and so you think you know what the words mean, but the words sometimes have meanings opposite to what you would guess. And the meanings of Physics-ese are precisely defined, and not something that can be left to interpretation. I learned this while teaching the intro courses to non-majors, because so many times when the students were confused, it turned out that they had mistaken a textbook jargon term for English. If you told them that the word wasn’t English, but just a token standing for a well-defined object or process, it would unshackle them from their misconceptions.
Then, in the early 00’s when I started to explore the physics of generalized trajectories related to some of my own research interests, I realized that the primary obstacle to my learning anything in the Gravitation course was Physics-ese. So this raised the question in my mind: what would it take to teach GR to undergraduate physics majors in a relatively painless manner? This is my answer.
One of the culprits for my mind block learning GR was Newton himself. His ubiquitous second law, taught as F = ma, is surprisingly misleading if one wants to have a more general understanding of what a trajectory is. This is particularly the case for light paths, which can be bent by gravity, yet clearly cannot have any forces acting on them.
The way to fix this is subtle yet simple. First, express Newton’s second law as

$$\frac{d\vec{x}}{dt} = \frac{\vec{p}}{m}, \qquad \frac{d\vec{p}}{dt} = \vec{F}$$

which is actually closer to the way that Newton expressed the law in his Principia. In three dimensions for a single particle, these equations represent a 6-dimensional dynamical space called phase space: three coordinate dimensions and three momentum dimensions. Then generalize the vector quantities, like the position vector, to be expressed as $x^a$ for the six dynamical variables: $x$, $y$, $z$, $p_x$, $p_y$, and $p_z$.

Now, as part of Physics-ese, putting the index as a superscript instead of as a subscript turns out to be a useful notation when working in higher-dimensional spaces. This superscript is called a “contravariant index”, which sounds like English but is uninterpretable without a Physics-ese-to-English dictionary. All “contravariant index” means is “column vector component”. In other words, $x^a$ is just the position vector expressed as a column vector

$$x^a = \begin{pmatrix} x \\ y \\ z \\ p_x \\ p_y \\ p_z \end{pmatrix}$$

This superscripted index is called a “contravariant” index, but seriously dude, just forget that “contravariant” word from Physics-ese and just think “index”. You already know it’s a column vector.

Then Newton’s second law becomes

$$\frac{dx^a}{dt} = F^a\left(x^1, \ldots, x^6\right)$$

where the index a runs from 1 to 6, and the function $F^a$ is a vector function of the dynamic variables. To spell it out, this is

$$\begin{aligned}
\frac{dx}{dt} &= F^1(x, y, z, p_x, p_y, p_z) \\
\frac{dy}{dt} &= F^2(x, y, z, p_x, p_y, p_z) \\
\frac{dz}{dt} &= F^3(x, y, z, p_x, p_y, p_z) \\
\frac{dp_x}{dt} &= F^4(x, y, z, p_x, p_y, p_z) \\
\frac{dp_y}{dt} &= F^5(x, y, z, p_x, p_y, p_z) \\
\frac{dp_z}{dt} &= F^6(x, y, z, p_x, p_y, p_z)
\end{aligned}$$

so it’s a lot easier to write it in the one-line form with the index notation.
The simple index notation equation is in the standard form for what is called, in Physics-ese, a “mathematical flow”. It is an ODE that can be solved for any set of initial conditions for a given trajectory. Or a whole field of solutions can be considered in a phase-space portrait that looks like the flow lines of hydrodynamics. The phase-space portrait captures the essential physics of the system, whether it is a rock thrown off a cliff, or a photon orbiting a black hole. But to get to that second problem, it is necessary to look deeper into the way that space is described by any set of coordinates, especially if those coordinates are changing from location to location.
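To make the idea of a flow concrete, here is a minimal Python sketch for the rock thrown off a cliff. The six-component state, the function names `flow`, `rk4_step` and `integrate`, and the numbers (a 1 kg rock, a 20 m cliff, a 5 m/s horizontal throw) are all illustrative assumptions, and the hand-written Runge-Kutta stepper simply stands in for whatever ODE solver you prefer.

```python
def flow(state, m=1.0, g=9.81):
    """Right-hand side F^a of dx^a/dt = F^a for a projectile in uniform gravity.

    state = (x, y, z, px, py, pz); gravity points along -z.
    """
    x, y, z, px, py, pz = state
    return (px / m, py / m, pz / m, 0.0, 0.0, -m * g)


def rk4_step(f, state, dt):
    """Advance the state by one fourth-order Runge-Kutta step."""
    k1 = f(state)
    k2 = f(tuple(s + 0.5 * dt * k for s, k in zip(state, k1)))
    k3 = f(tuple(s + 0.5 * dt * k for s, k in zip(state, k2)))
    k4 = f(tuple(s + dt * k for s, k in zip(state, k3)))
    return tuple(s + dt / 6.0 * (a + 2 * b + 2 * c + d)
                 for s, a, b, c, d in zip(state, k1, k2, k3, k4))


def integrate(f, state, dt, n_steps):
    """Follow one flow line through phase space from the given initial state."""
    for _ in range(n_steps):
        state = rk4_step(f, state, dt)
    return state


# Rock thrown horizontally at 5 m/s from a 20 m cliff (m = 1 kg, so p = v).
initial = (0.0, 0.0, 20.0, 5.0, 0.0, 0.0)
final = integrate(flow, initial, dt=0.01, n_steps=100)  # state after 1 second
print(final)
```

Changing the initial state traces out a different flow line of the same phase-space portrait; the function `flow` itself never changes.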
What’s so Fictitious about Fictitious Forces?
Freshmen physics students are routinely admonished for talking about “centrifugal” forces (rather than centripetal) when describing circular motion, usually with the statement that centrifugal forces are fictitious—only appearing to be forces when the observer is in the rotating frame. The same is said for the Coriolis force. Yet for being such a “fictitious” force, the Coriolis effect is what drives hurricanes and the colossal devastation they cause. Try telling a hurricane victim that they were wiped out by a fictitious force! Looking closer at the Coriolis force is a good way of understanding how taking derivatives of vectors leads to effects often called “fictitious”, yet it opens the door on some of the simpler techniques in the topic of differential geometry.
To start, consider a vector in a uniformly rotating frame. Such a frame is called “non-inertial” because every point fixed in it is accelerating, even when the rotation is uniform. For an observer in the rotating frame, vectors are attached to the frame, like pinning them down to the coordinate axes, but the axes themselves are changing in time (when viewed by an external observer in a fixed frame). If the primed frame is the external fixed frame, then a position in the rotating frame is

$$\vec{r}\,' = \vec{R} + \vec{r} = \vec{R} + x^a \hat{e}_a$$

where R is the position vector of the origin of the rotating frame and r is the position in the rotating frame relative to the origin. The funny notation on the last term is called in Physics-ese a “contraction”, but it is just a simple inner product, or dot product, between the components of the position vector and the basis vectors. A basis vector is like the old-fashioned i, j, k of vector calculus indicating unit basis vectors pointing along the x, y and z axes. The format with one index up and one down in the product means to do a summation. This is known as the Einstein summation convention, so it’s just

$$x^a \hat{e}_a = x^1 \hat{e}_1 + x^2 \hat{e}_2 + x^3 \hat{e}_3$$

Taking the time derivative of the position vector gives

$$\frac{d\vec{r}\,'}{dt} = \frac{d\vec{R}}{dt} + \frac{d}{dt}\left(x^a \hat{e}_a\right)$$

and by the product rule this must be

$$\frac{d\vec{r}\,'}{dt} = \frac{d\vec{R}}{dt} + \frac{dx^a}{dt}\hat{e}_a + x^a \frac{d\hat{e}_a}{dt}$$

where the last term has a time derivative of a basis vector. This is non-zero because in the rotating frame the basis vector is changing orientation in time. This term is non-inertial and can be shown fairly easily (see IMD2 Chapter 1) to be

$$x^a \frac{d\hat{e}_a}{dt} = \vec{\omega} \times \vec{r}$$

where $\vec{\omega}$ is the angular velocity of the rotating frame. This is where the centrifugal force comes from, and it shows how a so-called fictitious force arises from a derivative of a basis vector. The fascinating point of this is that in GR, the force of gravity arises in almost the same way, making it tempting to call gravity a fictitious force, despite the fact that it can kill you if you fall out a window. The question is, how does gravity arise from simple derivatives of basis vectors?
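The claim that a rotating basis vector has a non-zero time derivative, equal to the angular velocity crossed with the basis vector, is easy to check numerically. The sketch below assumes a frame rotating about the z axis with an arbitrary angular speed `OMEGA`; the helper names are made up for the illustration.

```python
import math

OMEGA = 0.7  # angular speed of the rotating frame in rad/s (illustrative value)


def e1(t):
    """Basis vector e_1 of the rotating frame, written in fixed-frame components."""
    return (math.cos(OMEGA * t), math.sin(OMEGA * t), 0.0)


def cross(a, b):
    """Ordinary three-dimensional cross product."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def time_derivative(vec_fn, t, h=1e-6):
    """Central finite-difference estimate of d(vec)/dt."""
    plus, minus = vec_fn(t + h), vec_fn(t - h)
    return tuple((p - m) / (2 * h) for p, m in zip(plus, minus))


t = 1.3
numeric = time_derivative(e1, t)             # d e_1 / dt by brute force
analytic = cross((0.0, 0.0, OMEGA), e1(t))   # omega cross e_1
print(numeric)
print(analytic)
```

The two printed vectors agree to within the finite-difference error, which is the whole content of the "fictitious" term: the components did not change, the basis did.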
The Geodesic Equation
To teach GR to undergraduates, you cannot expect them to have taken a course in differential geometry, because most of them just don’t have the time in their schedule to take such an advanced mathematics course. In addition, there is far more taught in differential geometry than is needed to make progress in GR. So the simple approach is to teach what they need to understand GR with as little differential geometry as possible, expressed with clear English-to-Physics-ese translations.
For example, consider the partial derivative of a vector expressed in index notation as

$$\vec{V} = V^\alpha \vec{e}_\alpha$$

Taking the partial derivative, using the always-necessary product rule, is

$$\frac{\partial \vec{V}}{\partial x^\beta} = \frac{\partial V^\alpha}{\partial x^\beta}\vec{e}_\alpha + V^\alpha \frac{\partial \vec{e}_\alpha}{\partial x^\beta}$$

where the second term is just like the extra time-derivative term that showed up in the derivation of the Coriolis force. The basis vector of a general coordinate system may change size and orientation as a function of position, so this derivative is not in general zero. Because the derivative of a basis vector is so central to the ideas of GR, it is given its own symbol. It is

$$\frac{\partial \vec{e}_\mu}{\partial x^\beta} = \Gamma^\alpha_{\mu\beta}\,\vec{e}_\alpha$$

where the new “Gamma” symbol is called a Christoffel symbol. It has lots of indexes, both up and down, which looks daunting, but it can be interpreted as the beta-th derivative of the alpha-th component of the mu-th basis vector. The partial derivative is now

$$\frac{\partial \vec{V}}{\partial x^\beta} = \frac{\partial V^\alpha}{\partial x^\beta}\vec{e}_\alpha + V^\mu \Gamma^\alpha_{\mu\beta}\,\vec{e}_\alpha$$

For those of you who noticed that some of the indexes flipped from alpha to mu and vice versa, you’re right! Swapping repeated indexes in these “contractions” is allowed and helps make derivations a lot easier, which is probably why Einstein invented this notation in the first place.

The last step in taking a partial derivative of a vector is to isolate a single vector component $V^\alpha$ as

$$\nabla_\beta V^\alpha = \frac{\partial V^\alpha}{\partial x^\beta} + \Gamma^\alpha_{\mu\beta} V^\mu$$

where a new symbol, the del-operator, has been introduced. This del-operator is known as the “covariant derivative” of the vector component. Again, forget the “covariant” part and just think “gradient”. Namely, taking the gradient of a vector in general includes changes in the vector component as well as changes in the basis vector.
Now that you know how to take the partial derivative of a vector using Christoffel symbols, you are ready to generate the central equation of General Relativity: The geodesic equation.
Everyone knows that a geodesic is the shortest path between two points, like a great circle route on the globe. But it also turns out to be the straightest path, which can be derived using an idea known as “parallel transport”. To start, consider transporting a vector along a curve in a flat metric. The equation describing this process is

$$\frac{dV^\alpha}{ds} = \frac{\partial V^\alpha}{\partial x^\beta}\frac{dx^\beta}{ds}$$

Because the Christoffel symbols are zero in a flat space, the covariant derivative and the partial derivative are equal, giving

$$\frac{dV^\alpha}{ds} = \left(\nabla_\beta V^\alpha\right)\frac{dx^\beta}{ds}$$

If the vector is transported parallel to itself, then there is no change in V along the curve, so that

$$\left(\frac{\partial V^\alpha}{\partial x^\beta} + \Gamma^\alpha_{\mu\beta} V^\mu\right)\frac{dx^\beta}{ds} = 0$$

For a curve that transports its own tangent vector, $V^\alpha = dx^\alpha/ds$, and substituting this in gives

$$\frac{d^2 x^\alpha}{ds^2} + \Gamma^\alpha_{\mu\beta}\frac{dx^\mu}{ds}\frac{dx^\beta}{ds} = 0$$

This is the geodesic equation!

Putting this in the standard form of a flow gives the geodesic flow equations

$$\frac{dx^\alpha}{ds} = V^\alpha, \qquad \frac{dV^\alpha}{ds} = -\Gamma^\alpha_{\mu\beta} V^\mu V^\beta$$

The flow is a set of ordinary differential equations that defines a curve that carries its own tangent vector onto itself. The curve is parameterized by a parameter s that can be identified with path length. It is the central equation of GR, because it describes how an object follows a force-free trajectory, like free fall, in any general coordinate system. It can be applied to simple problems like the Coriolis effect, or it can be applied to seemingly difficult problems, like the trajectory of a light path past a black hole.
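As a sanity check on the geodesic flow, here is a minimal Python sketch for the great-circle example on a unit sphere. It assumes coordinates (θ, φ) with line element ds² = dθ² + sin²θ dφ², for which the only non-zero Christoffel symbols are Γ^θ_φφ = −sin θ cos θ and Γ^φ_θφ = Γ^φ_φθ = cot θ. The starting point, the initial direction and the hand-written Runge-Kutta stepper are illustrative choices.

```python
import math


def geodesic_flow(state):
    """Geodesic flow on the unit sphere; state = (theta, phi, V_theta, V_phi)."""
    theta, phi, v_theta, v_phi = state
    cot = math.cos(theta) / math.sin(theta)
    return (v_theta,
            v_phi,
            math.sin(theta) * math.cos(theta) * v_phi ** 2,  # -Gamma^theta_{phi phi} V^phi V^phi
            -2.0 * cot * v_theta * v_phi)                    # -2 Gamma^phi_{theta phi} V^theta V^phi


def rk4_step(f, state, ds):
    """Advance the state by one fourth-order Runge-Kutta step in path length."""
    k1 = f(state)
    k2 = f(tuple(s + 0.5 * ds * k for s, k in zip(state, k1)))
    k3 = f(tuple(s + 0.5 * ds * k for s, k in zip(state, k2)))
    k4 = f(tuple(s + ds * k for s, k in zip(state, k3)))
    return tuple(s + ds / 6.0 * (a + 2 * b + 2 * c + d)
                 for s, a, b, c, d in zip(state, k1, k2, k3, k4))


def integrate(f, state, ds, n_steps):
    for _ in range(n_steps):
        state = rk4_step(f, state, ds)
    return state


def speed_squared(state):
    """Squared length of the tangent vector, measured with the sphere's metric."""
    theta, _, v_theta, v_phi = state
    return v_theta ** 2 + math.sin(theta) ** 2 * v_phi ** 2


# Start on the equator with a unit tangent vector tilted away from it.
start = (math.pi / 2, 0.0, 0.6, 0.8)
n_steps = 2000
end = integrate(geodesic_flow, start, 2 * math.pi / n_steps, n_steps)
print(end, speed_squared(end))
```

After a path length of 2π the curve has gone once around a tilted great circle and come back to where it started, with the length of its tangent vector unchanged along the way.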
The Metric Connection
Arriving at the geodesic equation is a major accomplishment, and you have done it in just a few pages of this blog. But there is still an important missing piece before we are doing General Relativity of gravitation. We need to connect the Christoffel symbol in the geodesic equation to the warping of space-time around a gravitating body.
The warping of space-time by matter and energy is another central piece of GR and is often the central focus of a graduate-level course on the subject. This part of GR does have its challenges leading up to Einstein’s Field Equations that explain how matter makes space bend. But at an undergraduate level, it is sufficient to just describe the bent coordinates as a starting point, then use the geodesic equation to solve for so many of the cool effects of black holes.
So, stating the way that matter bends space-time is as simple as writing down the length element for the Schwarzschild metric of a spherical gravitating mass as

$$ds^2 = -\left(1 - \frac{R_S}{r}\right)c^2\,dt^2 + \frac{dr^2}{1 - R_S/r} + r^2\left(d\theta^2 + \sin^2\theta\,d\phi^2\right)$$

where $R_S = 2GM/c^2$ is the Schwarzschild radius. (The connection between the metric tensor $g_{ab}$ and the Christoffel symbol can be found in Chapter 11 of IMD2.) It takes only a little work to find that

$$\Gamma^\alpha_{\mu\beta} = \frac{1}{2} g^{\alpha\nu}\left(\frac{\partial g_{\nu\mu}}{\partial x^\beta} + \frac{\partial g_{\nu\beta}}{\partial x^\mu} - \frac{\partial g_{\mu\beta}}{\partial x^\nu}\right)$$

This means that if we have the Schwarzschild metric, all we have to do is take first partial derivatives and we will arrive at the Christoffel symbols that go into the geodesic equation. Solving for any type of force-free trajectory is then just a matter of solving ODEs with initial conditions (performed routinely with numerical ODE solvers in Python, Matlab, Mathematica, etc.).
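The step from metric to Christoffel symbols really is just first derivatives, as the following Python sketch suggests. It assumes the Schwarzschild metric in coordinates (t, r, θ, φ) with the sign convention (−, +, +, +), units where c = 1 and R_S = 1, and it exploits the fact that this metric is diagonal so that the inverse metric is simply 1/g. The derivatives are taken by finite differences, and the function names are invented for the illustration.

```python
import math

R_S = 1.0  # Schwarzschild radius in units where c = 1 (illustrative choice)


def metric(x):
    """Diagonal entries (g_tt, g_rr, g_thth, g_phph) at x = (t, r, theta, phi)."""
    _, r, theta, _ = x
    f = 1.0 - R_S / r
    return (-f, 1.0 / f, r ** 2, (r * math.sin(theta)) ** 2)


def d_metric(x, k, h=1e-5):
    """Central-difference derivative of the diagonal entries along coordinate k."""
    xp, xm = list(x), list(x)
    xp[k] += h
    xm[k] -= h
    return tuple((p - m) / (2 * h) for p, m in zip(metric(xp), metric(xm)))


def christoffel(x, a, b, c):
    """Gamma^a_{bc} for a diagonal metric:
    (1/2) g^{aa} (d_c g_ab + d_b g_ac - d_a g_bc).
    """
    g = metric(x)

    def dg(i, j, k):
        # partial_k of g_ij; off-diagonal entries vanish identically
        return d_metric(x, k)[i] if i == j else 0.0

    return 0.5 / g[a] * (dg(a, b, c) + dg(a, c, b) - dg(b, c, a))


# Indices: 0 = t, 1 = r, 2 = theta, 3 = phi. Evaluate at r = 3 R_S on the equator.
x = (0.0, 3.0, math.pi / 2, 0.0)
gamma_r_tt = christoffel(x, 1, 0, 0)   # the term that acts like Newtonian gravity
gamma_t_tr = christoffel(x, 0, 0, 1)
print(gamma_r_tt, gamma_t_tr)
```

A non-diagonal metric would need a proper matrix inverse for the raised index, which this sketch deliberately avoids.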
The first problem we will tackle using the geodesic equation is the deflection of light by gravity. This is the quintessential problem of GR because there cannot be any gravitational force on a photon, yet the path of the photon surely must bend in the presence of gravity. This is possible through the geodesic motion of the photon through warped space time. I’ll take up this problem in my next Blog.
Arthur Eddington was the complete package—an observationalist with the mathematical and theoretical skills to understand Einstein’s general theory, and the ability to construct the theory of the internal structure of stars. He was Zeus in Olympus among astrophysicists. He always had the last word, and he stood with Einstein firmly opposed to the Schwarzschild singularity. In 1924 he published a theoretical paper in which he derived a new coordinate frame (now known as Eddington-Finkelstein coordinates) in which the singularity at the Schwarzschild radius is removed. At the time, he took this to mean that the singularity did not exist and that gravitational cut off was not possible . It would seem that the possibility of dark stars (black holes) had been put to rest. Both Eddington and Einstein said so! But just as they were writing the obituary of black holes, a strange new form of matter was emerging from astronomical observations that would challenge the views of these giants.
Binary star systems have always held a certain fascination for astronomers. If your field of study is the (mostly) immutable stars, then the stars that do move provide some excitement. The attraction of binaries is the same thing that makes them important astrophysically: they are dynamic. While many double stars are observed in the night sky (a few had been noted by Galileo), some of these are just coincidental alignments of near and far stars. However, William Herschel began cataloging binary stars in 1779 and became convinced in 1802 that at least some of them must be gravitationally bound to each other. He carefully measured the positions of binary stars over many years and confirmed that these stars showed relative changes in position, proving that they were gravitationally bound binary star systems. The first orbit of a binary star was computed in 1827 by Félix Savary for the orbit of Xi Ursae Majoris. Finding the orbit of a binary star system provides a treasure trove of useful information about the pair of stars. Not only can the masses of the stars be determined, but their radii and densities also can be estimated. Furthermore, by combining this information with the distance to the binaries, it was possible to develop a relationship between mass and luminosity for all stars, even single stars. Therefore, binaries became a form of measuring stick for crucial stellar properties.
One of the binary star systems that Herschel discovered was the pair known as 40 Eridani B/C, which he observed on January 31 in 1783. Of this pair, 40 Eridani B was very dim compared to its companion. More than a century later, in 1910 when spectrographs were first being used routinely on large telescopes, the spectrum of 40 Eridani B was found to be of an unusual white spectral class. In the same year, the low luminosity companion of Sirius, known as Sirius B, which shared the same unusual white spectral class, was evaluated in terms of its size and mass and was found to be exceptionally small and dense. In fact, it was too small and too dense to be believed at first, because the densities were beyond any known or even conceivable matter. The mass of Sirius B is around the mass of the Sun, but its radius is comparable to the radius of the Earth, making the white star about ten thousand times denser than the core of the Sun. Eddington at first felt the same way about white dwarfs that he felt about black holes, but he was eventually swayed by the astrophysical evidence. By 1922 many of these small white stars had been discovered, called white dwarfs, and their incredibly large densities had been firmly established. In his famous book on stellar structure, Eddington noted the strange paradox: As a star cools, its pressure must decrease, as all gases must do as they cool, and the star would shrink, yet the pressure required to balance the force of gravity to stabilize the star against continued shrinkage must increase as the star gets smaller. How can pressure decrease and yet increase at the same time? In 1926, on the eve of the birth of quantum mechanics, Eddington could conceive of no mechanism that could resolve this paradox. So he noted it as an open problem in his book and sent it to press.
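The density quoted for Sirius B is easy to check with a few lines of Python. The sketch below takes one solar mass packed into a sphere of one Earth radius, as the text describes; the numerical constants are approximate reference values, and the solar core density in particular is a rough assumed figure.

```python
import math

M_SUN = 1.989e30       # kg, approximate mass of the Sun
R_EARTH = 6.371e6      # m, approximate mean radius of the Earth
RHO_SUN_CORE = 1.5e5   # kg/m^3, rough central density of the Sun (assumed value)


def mean_density(mass, radius):
    """Mean density of a uniform sphere."""
    return mass / (4.0 / 3.0 * math.pi * radius ** 3)


rho_white_dwarf = mean_density(M_SUN, R_EARTH)
ratio = rho_white_dwarf / RHO_SUN_CORE
print(f"white dwarf density ~ {rho_white_dwarf:.2e} kg/m^3")
print(f"ratio to solar core ~ {ratio:.1e}")
```

The result comes out near two billion kilograms per cubic meter, roughly ten thousand times the assumed density at the center of the Sun.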
Four years after the publication of Eddington’s book, an eager and excited nineteen-year-old graduate of the university in Madras, India, boarded a steamer bound for England. Subrahmanyan Chandrasekhar (1910–1995) had been accepted for graduate studies at Cambridge University. The voyage in 1930 took eighteen days via the Suez Canal, and he needed something to do to pass the time. He had with him Eddington’s book, which he carried like a bible, and he also had a copy of a breakthrough article written by R. H. Fowler that applied the new theory of quantum mechanics to the problem of dense matter composed of ions and electrons. Fowler showed how the Pauli exclusion principle for electrons, which obey Fermi-Dirac statistics, created an energetic sea of electrons in their lowest energy state, called electron degeneracy. This degeneracy was a fundamental quantum property of matter, and carried with it an intrinsic pressure unrelated to thermal properties. Chandrasekhar realized that this was a pressure mechanism that could balance the force of gravity in a cooling star and might resolve Eddington’s paradox of the white dwarfs. As the steamer moved ever closer to England, Chandrasekhar derived the new balance between gravitational pressure and electron degeneracy pressure and found the radius of the white dwarf as a function of its mass. The critical step in Chandrasekhar’s theory, conceived alone on the steamer at sea with access to just a handful of books and papers, was the inclusion of special relativity with the quantum physics. This was necessary because the densities were so high and the electrons were so energetic that they attained speeds approaching the speed of light.
Something wonderful, but also a little scary, happened when Chandrasekhar included the relativistic effects in his calculation. He discovered that electron degeneracy pressure could balance the force of gravity if the mass of the white dwarf were smaller than about 1.4 times the mass of the Sun. But if the dwarf was more massive than this, then even the electron degeneracy pressure would be insufficient to fight gravity, and the star would continue to collapse. To what? Schwarzschild’s singularity was one possibility. Chandrasekhar wrote up two papers on his calculations, and when he arrived in England, he showed them to Fowler, who was to be his advisor at Cambridge. Fowler was genuinely enthusiastic about the first paper, on the derivation of the relativistic electron degeneracy pressure, and it was submitted for publication. The second paper, on the maximum sustainable mass for a white dwarf, which reared the ugly head of Schwarzschild’s singularity, made Fowler uncomfortable, and he sat on the paper, unwilling to give his approval for publication in the leading British astrophysical journal. Chandrasekhar grew annoyed, and in frustration sent it, without Fowler’s approval, to an American journal, where “The Maximum Mass of Ideal White Dwarfs” was published in 1931 . This paper, written in eighteen days on a steamer at sea, established what became known as the Chandrasekhar limit, for which Chandrasekhar would win the 1983 Nobel Prize in Physics, but not before he was forced to fight major battles for its acceptance.
Chandrasekhar versus Eddington
Initially there was almost no response to Chandrasekhar’s paper. Frankly, few astronomers had the theoretical training needed to understand the physics. Eddington was one exception, which was why he held such stature in the community. The big question therefore was: Was Chandrasekhar’s theory correct? During the three years to obtain his PhD, Chandrasekhar met frequently with Eddington, who was also at Cambridge, and with colleagues outside the university, and they all encouraged Chandrasekhar to tackle the more difficult problem of combining internal stellar structure with his theory. This could not be done with pen and paper, but required numerical calculation. Eddington was in possession of an early electromagnetic calculator, and he loaned it to Chandrasekhar to do the calculations. After many months of tedious work, Chandrasekhar was finally ready to confirm his theory at the January 1935 meeting of the Royal Astronomical Society.
The young Chandrasekhar stood up and gave his results in an impeccable presentation before an auditorium crowded with his peers. But as he left the stage, he was shocked when Eddington himself rose to give the next presentation. Eddington proceeded to criticize and reject Chandrasekhar’s careful work, proposing instead a garbled mash-up of quantum theory and relativity that would eliminate Chandrasekhar’s limit and hence prevent collapse to the Schwarzschild singularity. Chandrasekhar sat mortified in the audience. After the session, many of his friends and colleagues came up to him to give their condolences—if Eddington, the leader of the field and one of the few astronomers who understood Einstein’s theories, said that Chandrasekhar was wrong, then that was that. Badly wounded, Chandrasekhar was faced with a dire choice. Should he fight against the reputation of Eddington, fight for the truth of his theory? But he was at the beginning of his career and could ill afford to pit himself against the giant. So he turned his back on the problem of stellar death, and applied his talents to the problem of stellar evolution.
Chandrasekhar went on to have an illustrious career, spent mostly at the University of Chicago (far from Cambridge), and he did eventually return to his limit as it became clear that Eddington was wrong. In fact, many at the time already suspected Eddington was wrong and were seeking the answer to the next question: If white dwarfs cannot support themselves under gravity and must collapse, what do they collapse to? In Pasadena at the California Institute of Technology, an astrophysicist named Fritz Zwicky thought he knew the answer.
Fritz Zwicky’s Neutron Star
Fritz Zwicky (1898–1974) was an irritating and badly flawed genius. What made him so irritating was that he knew he was a genius and never let anyone forget it. What made him badly flawed was that he never cared much for weight of evidence. It was the ideas that mattered; let lesser minds do the tedious work of filling in the cracks. And what made him a genius was that he was often right! Zwicky pushed the envelope, and he loved extremes. The more extreme a theory was, the more likely he was to favor it, like his proposal for dark matter. Most of his colleagues considered him to be a buffoon and borderline crackpot. He was tolerated by no one, no one except his steadfast collaborator of many years Walter Baade (until they nearly came to blows on the eve of World War II). Baade was a German physicist trained at Göttingen and recently arrived at Cal Tech. He was exceptionally well informed on the latest advances in a broad range of fields. Where Zwicky made intuitive leaps, often unsupported by evidence, Baade would provide the context. Baade was a walking Wikipedia for Zwicky, and together they changed the face of astrophysics.
Zwicky and Baade submitted an abstract to the American Physical Society Meeting in 1933, which Kip Thorne has called “…one of the most prescient documents in the history of physics and astronomy”. In the abstract, Zwicky and Baade introduced, for the first time, the existence of supernovae as a separate class of nova and estimated the total energy output of these cataclysmic events, including the possibility that they are the source of some cosmic rays. They introduced the idea of a neutron star, a star composed purely of neutrons, only a year after Chadwick discovered the neutron’s existence, and they strongly suggested that a supernova is produced by the transformation of a star into a neutron star. A neutron star would have a mass similar to that of the Sun, but would have a radius of only tens of kilometers. If the mass density of white dwarfs was hard to swallow, the density of a neutron star was a billion times greater! It would take nearly thirty years before each of the assertions made in this short abstract was proven true, but Zwicky certainly had a clear view, tempered by Baade, of where the field of astrophysics was headed. But no one listened to Zwicky. He was too aggressive and backed up his wild assertions with too little substance. Therefore, neutron stars simmered on the back burner until more substantial physicists could address their properties more seriously.
Two substantial physicists who had the talent and skills that Zwicky lacked were Lev Landau in Moscow and Robert Oppenheimer at Berkeley. Landau derived the properties of a neutron star in 1937 and published the results to great fanfare. He was not aware of Zwicky’s work, and he called them neutron cores, because he hypothesized that they might reside at the core of ordinary stars like the Sun. Oppenheimer, working with a Canadian graduate student, George Volkoff, at Berkeley, showed that Landau’s idea about stellar cores was not correct, but that the general idea of a neutron core, or rather neutron star, was correct. Once Oppenheimer was interested in neutron stars, he kept going and asked the same question about neutron stars that Chandrasekhar had asked about white dwarfs: Is there a maximum mass for neutron stars beyond which they must collapse? The answer to this question used the same quantum mechanical degeneracy pressure (now provided by neutrons rather than electrons) and gravitational compaction as the problem of white dwarfs, but it required a detailed understanding of nuclear forces, which in 1938 were only beginning to be understood. However, Oppenheimer knew enough to make a good estimate of the nuclear binding contribution to the total internal pressure and came to a similar conclusion for neutron stars as Chandrasekhar had reached for white dwarfs. There was indeed a maximum mass of a neutron star, a Chandrasekhar-type limit of about three solar masses. Beyond this mass, even the degeneracy pressure of neutrons could not withstand the gravitational pressure, and the neutron star must collapse. In Oppenheimer’s mind it was clear what it must collapse to: a black hole (known as gravitational cut-off at that time). This was to lead Oppenheimer and John Wheeler to their famous confrontation over the existence of black holes, which Oppenheimer won, but Wheeler took possession of the battlefield.
Derivation of the Relativistic Chandrasekhar Limit
White dwarfs are created from the balance between gravitational compression and the degeneracy pressure of electrons caused by the Pauli exclusion principle. When a star collapses gravitationally, the matter becomes so dense that the electrons begin to fill up quantum states until all the lowest-energy states are filled and no more electrons can be added. This results in a balance that stabilizes the gravitational collapse, and the result is a white dwarf with a mass density a million times larger than that of the Sun.
If the electrons remained non-relativistic, then there would be no upper limit for the mass of a star that would form a white dwarf. However, because electrons become relativistic at high enough compaction, if the initial star is too massive, the electron degeneracy pressure becomes limited relativistically and cannot keep the matter from compacting more, and even the white dwarf will collapse (to a neutron star or a black hole). The largest mass that can be supported by a white dwarf is known as the Chandrasekhar limit.
A simplified derivation of the Chandrasekhar limit begins by defining the total energy as the kinetic energy of the degenerate Fermi electron gas plus the gravitational potential energy
The kinetic energy of the degenerate Fermi gas has the relativistic expression
where the Fermi k-vector can be expressed as a function of the radius of the white dwarf and the total number of electrons in the star, as
If the star is composed of pure hydrogen, then the mass of the star is expressed in terms of the total number of electrons and the mass of the proton
The total energy of the white dwarf is minimized by taking its derivative with respect to the radius of the star
When the derivative is set to zero, the term in brackets becomes
This is solved for the radius for which the electron degeneracy pressure stabilizes the gravitational pressure
This is the relativistic radius-mass expression for the size of the stabilized white dwarf as a function of the mass (or total number of electrons). One of the astonishing results of this calculation is the merging of astronomically large numbers (the mass of stars) with both relativity and quantum physics. The radius of the white dwarf is actually expressed as a multiple of the Compton wavelength of the electron!
The expression in the square root becomes smaller as the size of the star increases, and there is an upper bound to the mass of the star beyond which the argument in the square root goes negative. This upper bound is the Chandrasekhar limit defined when the argument equals zero
This gives the final expression for the Chandrasekhar limit (expressed in terms of the Planck mass)
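One way to write out the steps above explicitly is sketched below. It is a sketch under stated assumptions, not necessarily the exact expressions intended here: the star is treated as a uniform-density sphere with gravitational self-energy $-\tfrac{3}{5}GM^2/R$, every electron is taken to carry roughly the Fermi energy, and the symbols $a$ and $\beta$ are shorthand introduced only for this sketch. The numerical prefactors depend on these assumptions.

```latex
\begin{align}
E(R) &= K + U, \qquad U = -\frac{3}{5}\frac{G M^2}{R} \\
K &\approx N \sqrt{\hbar^2 c^2 k_F^2 + m_e^2 c^4} \\
k_F &= \left(\frac{9\pi}{4}\right)^{1/3} \frac{N^{1/3}}{R} \equiv a\,\frac{N^{1/3}}{R} \\
M &= N m_p \\
\frac{dE}{dR} &= -\frac{N}{R}\left[ \frac{\hbar^2 c^2 k_F^2}{\sqrt{\hbar^2 c^2 k_F^2 + m_e^2 c^4}} - \frac{3}{5}\frac{G N m_p^2}{R} \right] \\
\frac{\hbar^2 c^2 k_F^2}{\sqrt{\hbar^2 c^2 k_F^2 + m_e^2 c^4}} &= \frac{3}{5}\frac{G N m_p^2}{R} \\
R &= a N^{1/3}\,\frac{\hbar}{m_e c}\,\frac{\sqrt{1-\beta^2}}{\beta},
\qquad \beta \equiv \frac{3}{5a}\,\frac{G m_p^2}{\hbar c}\,N^{2/3} \\
1 - \beta^2 = 0 \;&\Rightarrow\; N_{\max}^{2/3} = \frac{5a}{3}\,\frac{\hbar c}{G m_p^2} \\
M_{\mathrm{Ch}} = N_{\max} m_p &= \left(\frac{5a}{3}\right)^{3/2} \frac{m_{\mathrm{Pl}}^3}{m_p^2},
\qquad m_{\mathrm{Pl}} = \sqrt{\frac{\hbar c}{G}}
\end{align}
```

In this form the radius is a multiple of the reduced Compton wavelength $\hbar/(m_e c)$, and the argument of the square root reaches zero at $\beta = 1$, which fixes the limiting mass.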
This expression is only approximate, but it does contain the essential physics and magnitude. This limit is on the order of a solar mass. A more realistic numerical calculation yields a limiting mass of about 1.4 times the mass of the Sun. For white dwarfs more massive than this value, the electron degeneracy pressure is insufficient to support the gravitational pressure, and the star will collapse to a neutron star or a black hole.
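As a rough numerical check of the scale involved, the following Python sketch evaluates the combination of Planck mass and proton mass, m_Pl^3 / m_p^2, that sets the size of the limit. It also evaluates a prefactor and a radius-mass relation under assumptions that are not taken from the text: a uniform-density, pure-hydrogen sphere with gravitational energy -(3/5)GM^2/R in which every electron carries the Fermi energy. The function names and the prefactor are illustrative only.

```python
import math

# Physical constants in SI units (rounded CODATA values).
HBAR = 1.054571817e-34      # reduced Planck constant, J s
C = 2.99792458e8            # speed of light, m/s
G = 6.67430e-11             # Newton's constant, m^3 kg^-1 s^-2
M_PROTON = 1.67262192e-27   # proton mass, kg
M_ELECTRON = 9.1093837e-31  # electron mass, kg
M_SUN = 1.98847e30          # nominal solar mass, kg

# Assumed geometric factor in k_F = A * N**(1/3) / R for a uniform sphere.
A = (9.0 * math.pi / 4.0) ** (1.0 / 3.0)


def planck_mass():
    """Planck mass sqrt(hbar * c / G) in kg."""
    return math.sqrt(HBAR * C / G)


def mass_scale():
    """The combination m_Pl^3 / m_p^2 in kg; independent of the model prefactor."""
    return planck_mass() ** 3 / M_PROTON ** 2


def limiting_mass():
    """Limiting mass of the simplified model in kg (assumed prefactor)."""
    return (5.0 * A / 3.0) ** 1.5 * mass_scale()


def beta(mass_kg):
    """Dimensionless ratio that reaches 1 at the model's limiting mass."""
    n_electrons = mass_kg / M_PROTON  # pure hydrogen: one electron per proton
    return 3.0 * G * M_PROTON ** 2 * n_electrons ** (2.0 / 3.0) / (5.0 * A * HBAR * C)


def equilibrium_radius(mass_kg):
    """Equilibrium radius of the simplified model in meters."""
    b = beta(mass_kg)
    if b >= 1.0:
        raise ValueError("mass exceeds the limiting mass of this model")
    compton = HBAR / (M_ELECTRON * C)  # reduced Compton wavelength of the electron
    n_electrons = mass_kg / M_PROTON
    return compton * A * n_electrons ** (1.0 / 3.0) * math.sqrt(1.0 / b ** 2 - 1.0)


print(f"m_Pl^3 / m_p^2      = {mass_scale() / M_SUN:.2f} solar masses")
print(f"model limiting mass = {limiting_mass() / M_SUN:.2f} solar masses")
for m in (0.5, 1.0, 5.0):
    print(f"R({m} M_sun) = {equilibrium_radius(m * M_SUN) / 1e3:.0f} km")
```

Under these assumptions the scale m_Pl^3 / m_p^2 comes out at roughly two solar masses, and the model radius shrinks toward zero as the mass approaches the model's limit. The prefactor is sensitive to the crude assumptions (uniform density, pure hydrogen), so only the order of magnitude is meaningful.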
The fact that Eddington coordinates removed the singularity at the Schwarzschild radius was first pointed out by Lemaître in 1933. A local observer passing through the Schwarzschild radius would experience no divergence in local properties, even though a distant observer would see that in-falling observer becoming length contracted and time dilated. This point of view of an in-falling observer was explained in 1958 by Finkelstein, who also pointed out that the Schwarzschild radius is an event horizon.
Herschel, W. (1803). “Account of the Changes That Have Happened, during the Last Twenty-Five Years, in the Relative Situation of Double-Stars; With an Investigation of the Cause to Which They Are Owing”. Philosophical Transactions of the Royal Society of London 93: 339–382. (Motion of binary stars).
Boss, L. (1910). Preliminary General Catalogue of 6188 stars for the epoch 1900. Carnegie Institution of Washington. (Mass and radius of Sirius B).
Eddington, A. S. (1927). Stars and Atoms. Clarendon Press. LCCN 27015694.
Fowler, R. H. (1926). “On dense matter”. Monthly Notices of the Royal Astronomical Society 87: 114. Bibcode:1926MNRAS..87..114F. (Quantum mechanics of degenerate matter).
Chandrasekhar, S. (1931). “The Maximum Mass of Ideal White Dwarfs”. The Astrophysical Journal 74: 81. Bibcode:1931ApJ....74...81C. doi:10.1086/143324. (Mass limit of white dwarfs).
Thorne, K. (1994). Black Holes & Time Warps: Einstein’s Outrageous Legacy. Norton. p. 174.
 Oppenheimer was aware of Zwicky’s proposal because he had a joint appointment between Berkeley and Cal Tech.