Hermann Minkowski’s Spacetime: The Theory that Einstein Overlooked

“Society is founded on hero worship”, wrote Thomas Carlyle (1795 – 1881) in his 1840 lecture on “Hero as Divinity”—and the society of physicists is no different.  Among physicists, the hero is the genius—the monomyth who journeys into the supernatural realm of high mathematics, engages in single combat against chaos and confusion, gains enlightenment in the mysteries of the universe, and returns home to share the new understanding.  If the hero is endowed with unusual talent and achieves greatness, then mythologies are woven, creating shadows that can grow and eclipse the truth and the work of others, bestowing upon the hero recognitions that are not entirely deserved.

      “Gentlemen! The views of space and time which I wish to lay before you … They are radical. Henceforth space by itself, and time by itself, are doomed to fade away into mere shadows, and only a kind of union of the two will preserve an independent reality.”

Hermann Minkowski (1908)

The greatest hero of physics of the twentieth century, without question, is Albert Einstein.  He is the person most responsible for the development of “Modern Physics” that encompasses:

  • Relativity theory (both special and general),
  • Quantum theory (he invented the quantum in 1905—see my blog),
  • Astrophysics (his field equations of general relativity were solved by Schwarzschild in 1916 to predict event horizons of black holes, and he solved his own equations to predict gravitational waves that were discovered in 2015),
  • Cosmology (his cosmological constant is now recognized as the mysterious dark energy that was discovered in 1998), and
  • Solid state physics (his explanation of the specific heat of crystals inaugurated the field of quantum matter). 

Einstein made so many seminal contributions to so many sub-fields of physics that it defies comprehension—hence he is mythologized as genius, able to see into the depths of reality with unique insight. He deserves his reputation as the greatest physicist of the twentieth century—he has my vote, and he was chosen by Time magazine in 2000 as the Man of the Century.  But as his shadow has grown, it has eclipsed and even assimilated the work of others—work that he initially criticized and dismissed, yet later embraced so whole-heartedly that he is mistakenly given credit for its discovery.

For instance, when we think of Einstein, the first thing that pops into our minds is probably “spacetime”.  He himself wrote several popular accounts of relativity that incorporated the view that spacetime is the natural geometry within which so many of the non-intuitive properties of relativity can be understood.  When we think of time being mixed with space, making it seem that position coordinates and time coordinates share an equal place in the description of relativistic physics, it is common to attribute this understanding to Einstein.  Yet Einstein initially resisted this viewpoint and even disparaged it when he first heard it! 

Spacetime was the brain-child of Hermann Minkowski.

Minkowski in Königsberg

Hermann Minkowski was born in 1864 in Russia to German parents who moved to the city of Königsberg (King’s Mountain) in East Prussia when he was eight years old. He entered the university in Königsberg in 1880 when he was sixteen. Within a year, when he was only seventeen years old, and while he was still a student at the University, Minkowski responded to an announcement of the Mathematics Prize of the French Academy of Sciences in 1881. When he submitted his prize-winning memoir, he could have had no idea that it was starting him down a path that would lead him years later to revolutionary views.

A view of Königsberg in 1581. Six of the seven bridges of Königsberg—which Euler famously described in the first essay on topology—are seen in this picture. The University is in the center distance behind the castle. The city was destroyed by the Russians in WWII followed by a forced evacuation of the local population.

The specific Prize challenge of 1881 was to find the number of representations of an integer as a sum of five squares of integers. For instance, every integer n > 33 can be expressed as the sum of five nonzero squares. As an example, $42 = 2^2 + 2^2 + 3^2 + 3^2 + 4^2$, which is the only such representation for that number. However, there are five representations for n = 53:

$$53 = 1^2 + 1^2 + 1^2 + 1^2 + 7^2 = 1^2 + 1^2 + 1^2 + 5^2 + 5^2 = 1^2 + 3^2 + 3^2 + 3^2 + 5^2 = 1^2 + 4^2 + 4^2 + 4^2 + 2^2 = 2^2 + 2^2 + 2^2 + 4^2 + 5^2$$
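The counting in these examples can be checked by brute force. The sketch below is only an illustration: the function name `five_square_reps` is hypothetical, and it counts unordered representations by nonzero squares, which is the convention of the examples here and not necessarily the counting convention of the Prize problem itself.

```python
from itertools import combinations_with_replacement
from math import isqrt

def five_square_reps(n):
    """Return all unordered ways to write n as a sum of five nonzero squares."""
    roots = range(1, isqrt(n) + 1)
    # Each combo is a nondecreasing tuple of five positive integers.
    return [combo for combo in combinations_with_replacement(roots, 5)
            if sum(r * r for r in combo) == n]

for n in (33, 42, 53):
    reps = five_square_reps(n)
    print(n, len(reps), reps)
```

A search like this is hopeless for large n, which is why the Prize problem called for a formula from the theory of quadratic forms.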

The task of enumerating these representations draws from the theory of quadratic forms. A quadratic form is a homogeneous polynomial of degree two, here with integer coefficients, such as $ax^2 + bxy + cy^2$ and $ax^2 + by^2 + cz^2 + dxy + exz + fyz$. In number theory, one seeks to find integer solutions for which the quadratic form equals an integer. For instance, the Pythagorean theorem $x^2 + y^2 = n^2$ for integers involves a quadratic form for which there are many integer solutions (x,y,n), known as Pythagorean triplets, such as (3, 4, 5) and (5, 12, 13).

The topic of quadratic forms gained special significance after the work of Bernhard Riemann who established the properties of metric spaces based on the metric expression

$$ds^2 = \sum_{a,b=1}^{D} g_{ab} \, dx^a \, dx^b$$

for infinitesimal distance in a D-dimensional metric space. This is a generalization of Euclidean distance to more general non-Euclidean spaces that may have curvature. Minkowski would later use this expression to great advantage, developing a “Geometry of Numbers” [1] as he delved ever deeper into quadratic forms and their uses in number theory.

Minkowski in Göttingen

After graduating with a doctoral degree in 1885 from Königsberg, Minkowski did his habilitation at the University of Bonn and began teaching, moving back to Königsberg in 1892 and then to Zurich in 1894 (where one of his students was a somewhat lazy and unimpressive Albert Einstein). A few years later he was given an offer that he could not refuse.

At the turn of the 20th century, the place to be in mathematics was at the University of Göttingen.  It had a long tradition of mathematical giants that included Carl Friedrich Gauss, Bernhard Riemann, Peter Dirichlet, and Felix Klein.  Under the guidance of Felix Klein, Göttingen mathematics had undergone a renaissance. For instance, Klein had attracted Hilbert from the University of Königsberg in 1895.  David Hilbert had known Minkowski when they were both students in Königsberg, and Hilbert extended an invitation to Minkowski to join him in Göttingen, which Minkowski accepted in 1902.

The University of Göttingen

A few years after Minkowski arrived at Göttingen, the relativity revolution broke, and both Minkowski and Hilbert began working on mathematical aspects of the new physics. They organized a colloquium dedicated to relativity and related topics, and on Nov. 5, 1907 Minkowski gave his first tentative address on the geometry of relativity.

Because Minkowski’s specialty was quadratic forms, and given his understanding of Riemann’s work, he was perfectly situated to apply his theory of quadratic forms and invariants to the Lorentz transformations derived by Poincaré and Einstein.  Although Poincaré had published a paper in 1906 that showed that the Lorentz transformation was a generalized rotation in four-dimensional space [2], Poincaré continued to discuss space and time as separate phenomena, as did Einstein.  For them, simultaneity was no longer an invariant, but events in time were still events in time and not somehow mixed with space-like properties. Minkowski recognized that Poincaré had missed an opportunity to define a four-dimensional vector space filled by four-vectors that captured all possible events in a single coordinate description without the need to separate out time and space. 

Minkowski’s first attempt, presented in his 1907 colloquium, at constructing velocity four-vectors was flawed because (like so many of my mechanics students when they first take a time derivative of the four-position) he had not yet understood the correct use of proper time. But the research program he outlined paved the way for the great work that was to follow.

On Feb. 21, 1908, only three months after his first halting steps, Minkowski delivered a thick manuscript to the printers for an article to appear in the Göttinger Nachrichten. The title “Die Grundgleichungen für die elektromagnetischen Vorgänge in bewegten Körpern” (The Basic Equations for Electromagnetic Processes in Moving Bodies) belies the impact and importance of this very dense article [3]. In its 60 pages (with no figures), Minkowski presents the correct form for four-velocity by taking derivatives relative to proper time, and he formalizes his four-dimensional approach to relativity that became the standard afterwards. He introduces the terms spacelike vector, timelike vector, light cone and world line. He also presents the complete four-tensor form for the electromagnetic fields. The foundational work of Levi-Civita and Ricci-Curbastro on tensors was not yet well known, so Minkowski invents his own terminology of Traktor to describe it. Most importantly, he invents the terms spacetime (Raum-Zeit) and events (Ereignisse) [4].

Minkowski’s four-dimensional formalism of relativistic electromagnetics was more than a mathematical trick—it uncovered the presence of a multitude of invariants that were obscured by the conventional mathematics of Einstein and Lorentz and Poincaré. In Minkowski’s approach, whenever a proper four-vector is contracted with itself (its inner product), an invariant emerges. Because there are many fundamental four-vectors, there are many invariants. These invariants provide the anchors from which to understand the complex relative properties amongst relatively moving frames.
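The simplest such invariant comes from the four-velocity. The sketch below is an illustration under stated assumptions, not Minkowski’s own notation: it takes the four-velocity as the proper-time derivative of (ct, x, y, z), uses the metric signature (+, −, −, −), and the helper names `four_velocity` and `minkowski_dot` are hypothetical.

```python
import math

C = 299_792_458.0  # speed of light in m/s

def four_velocity(vx, vy, vz):
    """Four-velocity gamma * (c, v), the proper-time derivative of (ct, x, y, z)."""
    v2 = vx * vx + vy * vy + vz * vz
    gamma = 1.0 / math.sqrt(1.0 - v2 / C**2)
    return (gamma * C, gamma * vx, gamma * vy, gamma * vz)

def minkowski_dot(a, b):
    """Inner product of two four-vectors with assumed signature (+, -, -, -)."""
    return a[0] * b[0] - a[1] * b[1] - a[2] * b[2] - a[3] * b[3]

for beta in (0.0, 0.5, 0.9, 0.99):
    u = four_velocity(beta * C, 0.0, 0.0)
    # The contraction divided by c^2 stays at 1 (up to rounding) for every speed.
    print(beta, minkowski_dot(u, u) / C**2)
```

The components of the four-velocity grow without bound as the speed approaches c, yet the contraction of the vector with itself is the same for every particle and every frame.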

Minkowski’s master work appeared in the Nachrichten on April 5, 1908. If he had thought that physicists would embrace his visionary perspective, he was about to be woefully disabused of that notion.

Einstein’s Reaction

Despite his impressive ability to see into the foundational depths of the physical world, Einstein did not view mathematics as the root of reality. Mathematics for him was a tool to reduce physical intuition into quantitative form. In 1908 his fame was rising as the acknowledged leader in relativistic physics, and he was not impressed or pleased with the abstract mathematical form that Minkowski was trying to stuff the physics into. Einstein called it “superfluous erudition” [5], and complained “since the mathematicians pounced on the relativity theory, I no longer understand it myself!” [6].

With his collaborator Jakob Laub (also a former student of Minkowski’s), Einstein objected to more than the hard-to-follow mathematics: they believed that Minkowski’s form of the ponderomotive force was incorrect. They then proceeded to re-translate Minkowski’s elegant four-vector derivations back into ordinary vector analysis, publishing two papers in Annalen der Physik in the summer of 1908 that were politely critical of Minkowski’s approach [7-8]. Yet another of Minkowski’s students from Zurich, Gunnar Nordström, showed how to derive Minkowski’s field equations without any of the four-vector formalism.

One can only wonder why so many of his former students so easily dismissed Minkowski’s revolutionary work. Einstein had actually avoided Minkowski’s mathematics classes as a student at ETH [5], which may say something about Minkowski’s reputation among the students, although Einstein did appreciate the class on mechanics that he took from Minkowski. Nonetheless, Einstein missed the point! Rather than realizing the power and universality of the four-dimensional spacetime formulation, he dismissed it as obscure and irrelevant—perhaps prejudiced by his earlier dim view of his former teacher.

Raum und Zeit

It is clear that Minkowski was stung by the poor reception of his spacetime theory. It is also clear that he truly believed that he had uncovered an essential new approach to physical reality. While mathematicians were generally receptive to his work, he knew that if physicists were to adopt his new viewpoint, he needed to win them over with elegant results.

In 1908, Minkowski presented a now-famous paper Raum und Zeit at the 80th Assembly of German Natural Scientists and Physicians (21 September 1908).  In his opening address, he stated [9]:

“Gentlemen!  The views of space and time which I wish to lay before you have sprung from the soil of experimental physics, and therein lies their strength. They are radical. Henceforth space by itself, and time by itself, are doomed to fade away into mere shadows, and only a kind of union of the two will preserve an independent reality.”

To illustrate his arguments Minkowski constructed the most recognizable visual icon of relativity theory: the space-time diagram in which the trajectories of particles appear as “world lines”, as in Fig. 1. On this diagram, one spatial dimension is plotted along the horizontal axis, and the value ct (speed of light times time) is plotted along the vertical axis. In these units, a photon travels along a line oriented at 45 degrees, and the world lines (the name Minkowski gave to trajectories) of all massive particles must have slopes steeper than this. For instance, a stationary particle, which appears to have no trajectory at all, executes a vertical trajectory on the space-time diagram as it travels forward through time. Within this new formulation by Minkowski, space and time were mixed together in a single manifold, spacetime, and were no longer separate entities.

Fig. 1 The First “Minkowski diagram” of spacetime.

In addition to the spacetime construct, Minkowski’s great discovery was the plethora of invariants that followed from his geometry. For instance, the spacetime hyperbola

$$c^2 t^2 - x^2 = \text{const}$$

is invariant under Lorentz transformation of the coordinates. This is just a simple statement that a vector is an entity of reality that is independent of how it is described. The length of a vector in our normal three-space does not change if we flip the coordinates around or rotate them, and the same is true for four-vectors in Minkowski space subject to Lorentz transformations.
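A quick numerical check makes the invariance concrete. The sketch below assumes one spatial dimension, a boost along x with β = v/c, and the squared interval written as (ct)² − x²; the sign convention and the function names `boost` and `interval` are choices made here for illustration.

```python
import math

def boost(ct, x, beta):
    """Lorentz boost along x with velocity beta = v/c, acting on the event (ct, x)."""
    gamma = 1.0 / math.sqrt(1.0 - beta * beta)
    return gamma * (ct - beta * x), gamma * (x - beta * ct)

def interval(ct, x):
    """Squared interval (ct)^2 - x^2, the quantity that labels each hyperbola."""
    return ct * ct - x * x

events = [(0.0, 3.0), (0.0, 5.0)]  # two events that are simultaneous in the first frame
for ct, x in events:
    ct_new, x_new = boost(ct, x, 0.6)
    print((ct, x), "->", (ct_new, x_new), interval(ct, x), interval(ct_new, x_new))
```

The two events start with the same time coordinate and end with different ones, so simultaneity is lost, but the interval of each event is unchanged by the boost: each one slides along its own hyperbola.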

In relativity theory, this property of invariance becomes especially useful because part of the mental challenge of relativity is that everything looks different when viewed from different frames.  How do you get a good grip on a phenomenon if it is always changing, always relative to one frame or another?  The invariants become the anchors that we can hold on to as reference frames shift and morph about us. 

Fig. 2 Any event on an invariant hyperbola is transformed by the Lorentz transformation onto another point on the same hyperbola. Events that are simultaneous in one frame are each on a separate hyperbola. After transformation, simultaneity is lost, but each event stays on its own invariant hyperbola (Figure reprinted from [10]).

As an example of a fundamental invariant, the mass of a particle in its rest frame becomes an invariant mass, always with the same value.  In earlier relativity theory, even in Einstein’s papers, the mass of an object was a function of its speed.  How is the mass of an electron a fundamental property of physics if it is a function of how fast it is traveling?  The construction of invariant mass removes this problem, and the mass of the electron becomes an immutable property of physics, independent of the frame.  Invariant mass is just one of many invariants that emerge from Minkowski’s space-time description.  The study of relativity, where all things seem relative, became a study of invariants, where many things never change.  In this sense, the theory of relativity is a misnomer.  Ironically, relativity theory became the motivation of post-modern relativism that denies the existence of absolutes, even as relativity theory, as practiced by physicists, is all about absolutes.
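In four-vector language, the invariant mass is the contraction of the energy-momentum four-vector with itself. A short worked version, assuming the metric signature (+, −, −, −) and the conventional symbols E for energy and p for momentum (neither of which is introduced in the text above):

```latex
p^{\mu} = \left( \frac{E}{c},\; \mathbf{p} \right), \qquad
p^{\mu} p_{\mu} = \frac{E^{2}}{c^{2}} - |\mathbf{p}|^{2} = m^{2} c^{2}
```

Observers in different frames disagree about E and about p separately, but they all agree on this combination, and hence on m.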

Despite his audacious gambit to win over the physicists, Minkowski would not live to see the fruits of his effort. He died suddenly of a ruptured appendix on Jan. 12, 1909 at the age of 44.

Arnold Sommerfeld (who went on to play a central role in the development of quantum theory) took up Minkowski’s four-vectors and systematized them in a way that was palatable to physicists. Then Max von Laue extended the approach while he was working with Sommerfeld in Munich, publishing the first physics textbook on relativity theory in 1911, establishing the space-time formalism for future generations of German physicists. Further support for Minkowski’s work came from his distinguished colleagues at Göttingen (Hilbert, Klein, Wiechert, Schwarzschild) as well as his former students (Born, Laue, Kaluza, Frank, Noether). With such champions, Minkowski’s work was immortalized in the methodology (and mythology) of physics, representing one of the crowning achievements of the Göttingen mathematical community.

Einstein Relents

Already in 1907 Einstein was beginning to grapple with the role of gravity in the context of relativity theory, and he knew that the special theory was just a beginning. Yet between 1908 and 1910 Einstein’s focus was on the quantum of light as he defended and extended his unique view of the photon and prepared for the first Solvay Congress of 1911. As he returned his attention to the problem of gravitation after 1910, he began to realize that Minkowski’s formalism provided a framework from which to understand the role of accelerating frames. In 1912 Einstein wrote to Sommerfeld to say [5]

I occupy myself now exclusively with the problem of gravitation. One thing is certain that I have never before had to toil anywhere near as much, and that I have been infused with great respect for mathematics, which I had up until now in my naivety looked upon as a pure luxury in its more subtle parts. Compared to this problem, the original theory of relativity is child’s play.

By the time Einstein had finished his general theory of relativity and gravitation in 1915, he fully acknowledged his indebtedness to Minkowski’s spacetime formalism, without which his general theory might never have appeared.

By David D. Nolte, April 24, 2021


[1] H. Minkowski, Geometrie der Zahlen. Leipzig and Berlin: B. G. Teubner, 1910.

[2] Poincaré, H. (1906). “Sur la dynamique de l’électron.” Rendiconti del circolo matematico di Palermo 21: 129–176.

[3] H. Minkowski, “Die Grundgleichungen für die elektromagnetischen Vorgänge in bewegten Körpern,” Nachrichten von der Königlichen Gesellschaft der Wissenschaften zu Göttingen, pp. 53–111, (1908)

[4] S. Walter, “Minkowski’s Modern World,” in Minkowski Spacetime: A Hundred Years Later, Petkov Ed.: Springer, 2010, ch. 2, pp. 43-61.

[5] L. Corry, “The influence of David Hilbert and Hermann Minkowski on Einstein’s views over the interrelation between physics and mathematics,” Endeavour, vol. 22, no. 3, pp. 95-97, (1998)

[6] A. Pais, Subtle is the Lord: The Science and the Life of Albert Einstein. Oxford, 2005.

[7] A. Einstein and J. Laub, “Electromagnetic basic equations for moving bodies,” Annalen Der Physik, vol. 26, no. 8, pp. 532-540, Jul (1908)

[8] A. Einstein and J. Laub, “Electromagnetic fields on quiet bodies with ponderomotive energy,” Annalen Der Physik, vol. 26, no. 8, pp. 541-550, Jul (1908)

[9] Minkowski, H. (1909). “Raum und Zeit.” Jahresbericht der Deutschen Mathematiker-Vereinigung: 75-88.

[10] D. D. Nolte, Introduction to Modern Dynamics : Chaos, Networks, Space and Time, 2nd ed. Oxford: Oxford University Press, 2019.




Bohr’s Orbits

The first time I ran across the Bohr-Sommerfeld quantization conditions I admit that I laughed! I was a TA for the Modern Physics course as a graduate student at Berkeley in 1982 and I read about Bohr-Sommerfeld in our Tipler textbook. I was familiar with Bohr orbits, which are already the wrong way of thinking about quantized systems. So the Bohr-Sommerfeld conditions, especially for so-called “elliptical” orbits, seemed like nonsense.

But it’s funny how a little distance gives you perspective. Forty years later I know a little more physics than I did then, and I have gained a deep respect for an obscure property of dynamical systems known as “adiabatic invariants”. It turns out that adiabatic invariants lie at the core of quantum systems, and in the case of hydrogen adiabatic invariants can be visualized as … elliptical orbits!

Quantum Physics in Copenhagen

Niels Bohr (1885 – 1962) was born in Copenhagen, Denmark, the middle child of a physiology professor at the University in Copenhagen. Bohr grew up with his siblings as a faculty child, which meant an unconventional upbringing full of ideas, books and deep discussions. Bohr was a late bloomer in secondary school but began to show talent in math and physics in his last two years. When he entered the University in Copenhagen in 1903 to major in physics, the university had only one physics professor, Christian Christiansen, and had no physics laboratories. So Bohr tinkered in his father’s physiology laboratory, performing a detailed experimental study of the hydrodynamics of water jets, writing and submitting a paper that was to be his only experimental work. Bohr went on to receive a Master’s degree in 1909 and his PhD in 1911, writing his thesis on the theory of electrons in metals. Although the thesis did not break much new ground, it uncovered striking disparities between observed properties and theoretical predictions based on the classical theory of the electron. For his postdoc studies he applied for and was accepted to a position working with the discoverer of the electron, Sir J. J. Thomson, in Cambridge. Perhaps fortunately for the future history of physics, he did not get along well with Thomson, and he shifted his postdoc position in early 1912 to work with Ernest Rutherford at the much less prestigious University of Manchester.

Niels Bohr (Wikipedia)

Ernest Rutherford had just completed a series of detailed experiments on the scattering of alpha particles on gold film and had demonstrated that the mass of the atom was concentrated in a very small volume that Rutherford called the nucleus, which also carried the positive charge compensating the negative electron charges.  The discovery of the nucleus created a radical new model of the atom in which electrons executed planetary-like orbits around the nucleus.  Bohr immediately went to work on a theory for the new model of the atom.  He worked closely with Rutherford and the other members of Rutherford’s laboratory, involved in daily discussions on the nature of atomic structure.  The open intellectual atmosphere of Rutherford’s group and the ready flow of ideas in group discussions became the model for Bohr, who would some years later set up his own research center that would attract the top young physicists of the time.  Already by mid 1912, Bohr was beginning to see a path forward, hinting in letters to his younger brother Harald (who would become a famous mathematician) that he had uncovered a new approach that might explain some of the observed properties of simple atoms. 

By the end of 1912 his postdoc travel stipend was over, and he returned to Copenhagen, where he completed his work on the hydrogen atom. One of the key discrepancies in the classical theory of the electron in atoms was the requirement, by Maxwell’s equations, for orbiting electrons to continually radiate because of their centripetal acceleration. Furthermore, from energy conservation, if they radiated continuously, the electron orbits must also eventually decay into the nuclear core with ever-decreasing orbital periods and hence ever higher emitted light frequencies. Experimentally, on the other hand, it was known that light emitted from atoms had only distinct quantized frequencies. To circumvent the problem of classical radiation, Bohr simply assumed what was observed, formulating the idea of stationary quantum states. Light emission (or absorption) could take place only when the energy of an electron changed discontinuously as it jumped from one stationary state to another, and there was a lowest stationary state below which the electron could never fall. He then took a critical and important step, combining this new idea of stationary states with Planck’s constant h. He was able to show that the emission spectrum of hydrogen, and hence the energies of the stationary states, could be derived if the angular momentum L of the electron in a hydrogen atom was quantized in integer multiples of Planck’s constant h divided by 2π:

$$L = n \frac{h}{2\pi}, \qquad n = 1, 2, 3, \ldots$$
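The payoff of Bohr’s assumption can be checked against the visible hydrogen lines. The sketch below assumes the standard textbook consequence of the quantization, namely stationary-state energies E_n = −R_y/n², which is not written out in the text above; the numerical constants and the function names are likewise supplied here for illustration.

```python
RYDBERG_EV = 13.605693   # hydrogen binding energy in eV (assumed value, infinite nuclear mass)
HC_EV_NM = 1239.841984   # h*c in eV*nm (assumed value)

def energy_level(n):
    """Energy in eV of the n-th stationary state of hydrogen in the Bohr model."""
    return -RYDBERG_EV / n**2

def emission_wavelength_nm(n_upper, n_lower):
    """Wavelength in nm of the photon emitted in the jump n_upper -> n_lower."""
    photon_energy = energy_level(n_upper) - energy_level(n_lower)
    return HC_EV_NM / photon_energy

# Jumps that end on n = 2 give the visible (Balmer) lines of hydrogen.
for n in (3, 4, 5):
    print(n, "-> 2:", round(emission_wavelength_nm(n, 2), 1), "nm")
```

The first two lines land close to the familiar red and blue-green lines of hydrogen near 656 nm and 486 nm.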

Bohr published his quantum theory of the hydrogen atom in 1913, which immediately focused the attention of a growing group of physicists (including Einstein, Rutherford, Hilbert, Born, and Sommerfeld) on the new possibilities opened up by Bohr’s quantum theory [1].  Emboldened by his growing reputation, Bohr petitioned the university in Copenhagen to create a new faculty position in theoretical physics, and to appoint him to it.  The University was not unreceptive, but university bureaucracies make decisions slowly, so Bohr returned to Rutherford’s group in Manchester while he awaited Copenhagen’s decision.  He waited over two years, but he enjoyed his time in the stimulating environment of Rutherford’s group in Manchester, growing steadily into the role as master of the new quantum theory.  In June of 1916, Bohr returned to Copenhagen and a year later was elected to the Royal Danish Academy of Sciences. 

Although Bohr’s theory had succeeded in describing some of the properties of the electron in atoms, two central features of his theory continued to cause difficulty.  The first was the limitation of the theory to single electrons in circular orbits, and the second was the cause of the discontinuous jumps.  In response to this challenge, Arnold Sommerfeld provided a deeper mechanical perspective on the origins of the discrete energy levels of the atom. 

Quantum Physics in Munich

Arnold Johannes Wilhelm Sommerfeld (1868 – 1951) was born in Königsberg, Prussia, and spent all the years of his education there to his doctorate that he received in 1891. In Königsberg he was acquainted with Minkowski, Wien and Hilbert, and he was the doctoral student of Lindemann. He also was associated with a social group at the University that spent too much time drinking and dueling, a distraction that led to his receiving a deep sabre cut on his forehead that became one of his distinguishing features along with his finely waxed moustache. In outward appearance, he looked the part of a Prussian hussar, but he finally escaped this life of dissipation and landed in Göttingen where he became Felix Klein’s assistant in 1894. He taught at local secondary schools, rising in reputation, until he secured a faculty position of theoretical physics at the University in Munich in 1906. One of his first students was Peter Debye who received his doctorate under Sommerfeld in 1908. Later famous students would include Peter Ewald (doctorate in 1912), Wolfgang Pauli (doctorate in 1921), Werner Heisenberg (doctorate in 1923), and Hans Bethe (doctorate in 1928). These students had the rare treat, during their time studying under Sommerfeld, of spending weekends in the winter skiing and staying at a ski hut that he owned only two hours by train outside of Munich. At the end of the day skiing, discussion would turn invariably to theoretical physics and the leading problems of the day. It was in his early days at Munich that Sommerfeld played a key role aiding the general acceptance of Minkowski’s theory of four-dimensional space-time by publishing a review article in Annalen der Physik that translated Minkowski’s ideas into language that was more familiar to physicists.

Arnold Sommerfeld (Wikipedia)

Around 1911, Sommerfeld shifted his research interest to the new quantum theory, and his interest only intensified after the publication of Bohr’s model of hydrogen in 1913. In 1915 Sommerfeld significantly extended the Bohr model by building on an idea put forward by Planck. While further justifying the black body spectrum, Planck turned to descriptions of the trajectory of a quantized one-dimensional harmonic oscillator in phase space. Planck had noted that the phase-space areas enclosed by the quantized trajectories were integral multiples of his constant. Sommerfeld expanded on this idea, showing that it was not the area enclosed by the trajectories that was fundamental, but the integral of the momentum over the spatial coordinate [2]. This integral is none other than the original action integral of Maupertuis and Euler, used so famously in their Principle of Least Action almost 200 years earlier. Where Planck, in his original paper of 1901, had recognized the units of his constant to be those of action, and hence called it the quantum of action, Sommerfeld made the explicit connection to the dynamical trajectories of the oscillators. He then showed that the same action principle applied to Bohr’s circular orbits for the electron in the hydrogen atom, and that the orbits need not even be circular, but could be elliptical Keplerian orbits.
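For the one-dimensional oscillator the action integral can be evaluated directly. The sketch below is a numerical illustration with arbitrary parameter values and a hypothetical function name `action_integral`: it integrates the momentum over position around one closed orbit and compares the result with the energy divided by the frequency.

```python
import math

def action_integral(mass, omega, energy, steps=100_000):
    """Numerically evaluate the closed-orbit integral of p dx for a harmonic oscillator."""
    amplitude = math.sqrt(2.0 * energy / (mass * omega**2))
    dtheta = 2.0 * math.pi / steps
    total = 0.0
    for k in range(steps):
        theta = (k + 0.5) * dtheta                   # midpoint rule in the phase angle
        p = -mass * omega * amplitude * math.sin(theta)
        dx = -amplitude * math.sin(theta) * dtheta   # dx = (dx/dtheta) * dtheta
        total += p * dx
    return total

mass, omega, energy = 1.0, 2.0, 3.0   # arbitrary illustrative values
nu = omega / (2.0 * math.pi)          # oscillation frequency in cycles per unit time
print(action_integral(mass, omega, energy), energy / nu)
```

The two printed numbers agree, so setting the action integral equal to nh is the same statement as Planck’s E = nhν for the oscillator.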

The quantum condition for this otherwise classical trajectory was the requirement for the action integral over the motion to be equal to integer units of the quantum of action.  Furthermore, Sommerfeld showed that there must be as many action integrals as degrees of freedom for the dynamical system.  In the case of Keplerian orbits, there are radial coordinates as well as angular coordinates, and each action integral was quantized for the discrete electron orbits.  Although Sommerfeld’s action integrals extended Bohr’s theory of quantized electron orbits, the new quantum conditions also created a problem because there were now many possible elliptical orbits that all had the same energy.  How was one to find the “correct” orbit for a given orbital energy?
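In modern textbook notation the two conditions for the Keplerian orbits, and the reason for the degeneracy, can be summarized as follows. The symbols n_r, n_φ and R_y are labels chosen here (radial quantum number, angular quantum number and Rydberg energy), and the energy formula is the standard nonrelativistic result rather than something derived in the text above:

```latex
\oint p_r \, dr = n_r h, \qquad
\oint p_\phi \, d\phi = n_\phi h, \qquad
E = -\frac{R_y}{\left( n_r + n_\phi \right)^{2}}
```

Because the energy depends only on the sum of the two quantum numbers, every way of splitting that sum between radial and angular motion gives a different ellipse with the same energy.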

Quantum Physics in Leiden

In 1906, the Austrian physicist Paul Ehrenfest (1880 – 1933), freshly out of his PhD under the supervision of Boltzmann, arrived at Göttingen only weeks before Boltzmann took his own life. Felix Klein at Göttingen had been relying on Boltzmann to provide a comprehensive review of statistical mechanics for the Mathematical Encyclopedia, so he now entrusted this project to the young Ehrenfest. It was a monumental task, which was to take him and his physicist wife Tatyana nearly five years to complete. Part of the delay was the desire by Ehrenfest to close some open problems that remained in Boltzmann’s work. One of these was a mechanical theorem of Boltzmann’s that identified properties of statistical mechanical systems that remained unaltered through a very slow change in system parameters. These properties would later be called adiabatic invariants by Einstein. Ehrenfest recognized that Wien’s displacement law, which had been a guiding light for Planck and his theory of black body radiation, had originally been derived by Wien using classical principles related to slow changes in the volume of a cavity. Ehrenfest was struck by the fact that such slow changes would not induce changes in the quantum numbers of the quantized states, and hence that the quantum numbers must be adiabatic invariants of the black body system. This not only explained why Wien’s displacement law continued to hold under quantum as well as classical considerations, but it also explained why Planck’s quantization of the energy of his simple oscillators was the only possible choice. For a classical harmonic oscillator, the ratio of the energy of oscillation to the frequency of oscillation is an adiabatic invariant, which is immediately recognized as Planck’s quantum condition $E/\nu = nh$.
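The adiabatic invariance of the ratio of energy to frequency is easy to see numerically. The sketch below is an illustration only, with unit mass, an arbitrary slow linear ramp that doubles the frequency, and a simple velocity-Verlet integrator; the function name `ramp_oscillator` and all parameter values are choices made here.

```python
def ramp_oscillator(omega_start=1.0, omega_end=2.0, ramp_time=2000.0, dt=0.01):
    """Integrate x'' = -omega(t)^2 x while omega ramps slowly and linearly.

    Returns (energy, omega) at the start and at the end of the ramp.
    Unit mass; velocity-Verlet (kick-drift-kick) time stepping.
    """
    def omega(t):
        return omega_start + (omega_end - omega_start) * t / ramp_time

    def energy(x, p, w):
        return 0.5 * p * p + 0.5 * w * w * x * x

    x, p, t = 1.0, 0.0, 0.0
    start = (energy(x, p, omega(t)), omega(t))
    steps = int(round(ramp_time / dt))
    for _ in range(steps):
        p -= 0.5 * dt * omega(t) ** 2 * x   # half kick with the current frequency
        x += dt * p                         # drift
        t += dt
        p -= 0.5 * dt * omega(t) ** 2 * x   # half kick with the updated frequency
    end = (energy(x, p, omega(t)), omega(t))
    return start, end

(e0, w0), (e1, w1) = ramp_oscillator()
print("energy ratio  :", e1 / e0)
print("E/omega ratio :", (e1 / w1) / (e0 / w0))
```

The energy roughly doubles as the frequency doubles, while the ratio E/ω stays fixed to within a small fraction of a percent. Shortening `ramp_time` so that the change is no longer slow compared with the oscillation period degrades the invariance.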

Paul Ehrenfest (Wikipedia)

Ehrenfest published his observations in 1913 [3], the same year that Bohr published his theory of the hydrogen atom, so Ehrenfest immediately applied the theory of adiabatic invariants to Bohr’s model and discovered that the quantum condition for the quantized energy levels was again the adiabatic invariants of the electron orbits, and not merely a consequence of integer multiples of angular momentum, which had seemed somewhat ad hoc. Later, when Sommerfeld published his quantized elliptical orbits in 1916, the multiplicity of quantum conditions and orbits had caused concern, but Ehrenfest came to the rescue with his theory of adiabatic invariants, showing that each of Sommerfeld’s quantum conditions was precisely an adiabatic invariant of the classical electron dynamics [4]. The remaining question was which coordinates were the correct ones, because different choices led to different answers. This was quickly solved by Johannes Burgers (one of Ehrenfest’s students) who showed that action integrals were adiabatic invariants, and then by Karl Schwarzschild and Paul Epstein who showed that action-angle coordinates were the only allowed choice of coordinates, because they enabled the separation of the Hamilton-Jacobi equations and hence provided the correct quantization conditions for the electron orbits. Schwarzschild’s paper was published the same day that he died on the Eastern Front. The work by Schwarzschild and Epstein was the first to show the power of the Hamiltonian formulation of dynamics for quantum systems, which foreshadowed the future importance of Hamiltonians for quantum theory.

Karl Schwarzschild (Wikipedia)

Bohr-Sommerfeld

Emboldened by Ehrenfest’s adiabatic principle, which demonstrated a close connection between classical dynamics and quantization conditions, Bohr formalized a technique that he had used implicitly in his 1913 model of hydrogen, and now elevated it to the status of a fundamental principle of quantum theory.  He called it the Correspondence Principle, and published the details in 1920.  The Correspondence Principle states that as the quantum number of an electron orbit increases to large values, the quantum behavior converges to classical behavior.  Specifically, if an electron in a state of high quantum number emits a photon while jumping to a neighboring orbit, then the wavelength of the emitted photon approaches the classical radiation wavelength of the electron subject to Maxwell’s equations. 
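The convergence described here can be illustrated with a few lines of arithmetic. This is a sketch under the assumption of the simple Bohr model of hydrogen, with frequencies measured in units of the Rydberg frequency; in those units the photon emitted in the jump from orbit n to orbit n − 1 has frequency 1/(n − 1)² − 1/n², and the classical orbital frequency of the n-th orbit is 2/n³. The function name is a hypothetical one chosen for this example.

```python
def transition_to_orbital_frequency_ratio(n):
    """Ratio of the n -> n-1 emission frequency to the classical orbital
    frequency of the n-th Bohr orbit (both in units of the Rydberg frequency)."""
    if n < 2:
        raise ValueError("n must be at least 2")
    quantum = 1.0 / (n - 1) ** 2 - 1.0 / n ** 2  # Rydberg formula for the jump
    classical = 2.0 / n ** 3                      # orbital frequency of orbit n
    return quantum / classical


for n in (2, 10, 100, 1000):
    print(n, transition_to_orbital_frequency_ratio(n))
```

For the lowest jump the two frequencies differ by a factor of three, but the ratio approaches one as the quantum number grows, which is the sense in which the quantum radiation merges with the classical radiation.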

Bohr’s Correspondence Principle cemented the bridge between classical physics and quantum physics.  One of the biggest early questions about the physics of electron orbits in atoms was why they did not radiate continuously because of the centripetal acceleration they experienced in their orbits.  Bohr had now reconnected to Maxwell’s equations and classical physics in the limit.  Like the theory of adiabatic invariants, the Correspondence Principle became a new tool for distinguishing among different quantum theories.  It could be used as a filter to distinguish “correct” quantum models, which transitioned smoothly from quantum to classical behavior, from those that did not.  Bohr’s Correspondence Principle was to be a powerful tool in the hands of Werner Heisenberg as he reinvented quantum theory only a few years later.

Quantization conditions.

By the end of 1920, all the elements of the quantum theory of electron orbits were apparently falling into place.  Bohr’s originally ad hoc quantization condition was now on firm footing.  The quantization conditions were related to action integrals that were, in turn, adiabatic invariants of the classical dynamics.  This meant that slight variations in the parameters of the dynamical systems would not induce quantum transitions among the various quantum states.  This conclusion would have felt right to the early quantum practitioners.  Bohr’s quantum model of electron orbits was fundamentally a means of explaining quantum transitions between stationary states.  Now it appeared that the condition for the stationary states of the electron orbits was an insensitivity, or invariance, to variations in the dynamical properties.  This was analogous to the principle of stationary action where the action along a dynamical trajectory is invariant to slight variations in the trajectory.  Therefore, the theory of quantum orbits now rested on firm foundations that seemed as solid as the foundations of classical mechanics.

From the perspective of modern quantum theory, the concept of elliptical Keplerian orbits for the electron is grossly inaccurate.  Most physicists shudder when they see the symbol for atomic energy—the classic but mistaken icon of electron orbits around a nucleus.  Nonetheless, Bohr and Ehrenfest and Sommerfeld had hit on a deep thread that runs through all of physics—the concept of action—the same concept that Leibniz introduced, that Maupertuis minimized and that Euler canonized.  This concept of action is at work in the macroscopic domain of classical dynamics as well as the microscopic world of quantum phenomena.  Planck was acutely aware of this connection with action, which is why he so readily recognized his elementary constant as the quantum of action. 

However, the old quantum theory was running out of steam.  For instance, the action integrals and adiabatic invariants only worked for single electron orbits, leaving the vast bulk of many-electron atomic matter beyond the reach of quantum theory and prediction.  The literal electron orbits were a crutch or bias that prevented physicists from moving past them and seeing new possibilities for quantum theory.  Orbits were an anachronism, exerting a damping force on progress.  This limitation became painfully clear when Bohr and his assistants at Copenhagen, Kramers and Slater, attempted to use their electron orbits to explain the refractive index of gases.  The theory was cumbersome and exhausted.  It was time for a new quantum revolution by a new generation of quantum wizards: Heisenberg, Born, Schrödinger, Pauli, Jordan and Dirac.


References

[1] N. Bohr, “On the Constitution of Atoms and Molecules, Part II Systems Containing Only a Single Nucleus,” Philosophical Magazine, vol. 26, pp. 476–502, 1913.

[2] A. Sommerfeld, “The quantum theory of spectral lines,” Annalen Der Physik, vol. 51, pp. 1-94, Sep 1916.

[3] P. Ehrenfest, “Een mechanische theorema van Boltzmann en zijne betrekking tot de quanta theorie (A mechanical theorem of Boltzmann and its relation to the theory of energy quanta),” Verslag van de Gewone Vergaderingen der Wis- en Natuurkundige Afdeeling, vol. 22, pp. 586-593, 1913.

[4] P. Ehrenfest, “Adiabatic invariables and quantum theory,” Annalen Der Physik, vol. 51, pp. 327-352, Oct 1916.

Vladimir Arnold’s Cat Map

The 1960’s are known as a time of cultural revolution, but perhaps less known was the revolution that occurred in the science of dynamics.  Three towering figures of that revolution were Stephen Smale (1930 – ) at Berkeley, Andrey Kolmogorov (1903 – 1987) in Moscow and his student Vladimir Arnold (1937 – 2010).  Arnold was only 20 years old in 1957 when he solved Hilbert’s thirteenth problem (that any continuous function of several variables can be constructed with a finite number of two-variable functions).  Only a few years later his work on the problem of small denominators in dynamical systems provided the finishing touches on the long elusive explanation of the stability of the solar system (the problem for which Poincaré won the King Oscar Prize in mathematics in 1889 when he discovered chaotic dynamics).  This theory is known as KAM-theory, using the first initials of the names of Kolmogorov, Arnold and Moser [1].  Building on his breakthrough in celestial mechanics, Arnold’s work through the 1960’s remade the theory of Hamiltonian systems, creating a shift in perspective that has permanently altered how physicists look at dynamical systems.

Hamiltonian Physics on a Torus

Traditionally, Hamiltonian physics is associated with systems of inertial objects that conserve the sum of kinetic and potential energy, in other words, conservative non-dissipative systems.  But a modern view (after Arnold) of Hamiltonian systems sees them as hyperdimensional mathematical mappings that conserve volume.  The space that these mappings inhabit is phase space, and the conservation of phase-space volume is known as Liouville’s Theorem [2].  The geometry of phase space is called symplectic geometry, and the universal position that symplectic geometry now holds in the physics of Hamiltonian mechanics is largely due to Arnold’s textbook Mathematical Methods of Classical Mechanics (1974, English translation 1978) [3]. Arnold’s famous quote from that text is “Hamiltonian mechanics is geometry in phase space”. 

One of the striking aspects of this textbook is the reduction of phase-space geometry to the geometry of a hyperdimensional torus for a large number of Hamiltonian systems.  If there are as many conserved quantities as there are degrees of freedom in a Hamiltonian system, then the system is called “integrable” (because you can integrate the equations of motion to find a constant of the motion). Then it is possible to map the physics onto a hyperdimensional torus through the transformation of dynamical coordinates into what are known as “action-angle” coordinates [4].  Each independent angle has an associated action that is conserved during the motion of the system.  The periodicity of the dynamical angle coordinate makes it possible to identify it with the angular coordinate of a multi-dimensional torus.  Therefore, every integrable Hamiltonian system can be mapped to motion on a multi-dimensional torus (one dimension for each degree of freedom of the system).

Actually, integrable Hamiltonian systems are among the most boring dynamical systems you can imagine. They literally just go in circles (around the torus). But as soon as you add a small perturbation that cannot be integrated they produce some of the most complex and beautiful patterns of all dynamical systems. It was Arnold’s focus on motions on a torus, and perturbations that shift the dynamics off the torus, that led him to propose a simple mapping that captured the essence of Hamiltonian chaos.

The Arnold Cat Map

Motion on a two-dimensional torus is defined by two angles, and trajectories on a two-dimensional torus are simple helixes. If the periodicities of the motion in the two angles have a rational ratio, the helix repeats itself. However, if the ratio of periods (also known as the winding number) is irrational, then the helix never repeats and passes arbitrarily closely to any point on the surface of the torus. This last case leads to an “ergodic” system, which is a term introduced by Boltzmann to describe a physical system whose trajectory fills phase space. The behavior of a helix for rational or irrational winding number is not terribly interesting. It’s just an orbit going in circles like an integrable Hamiltonian system. The helix can never even cross itself.
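The difference between rational and irrational winding numbers can be seen in a small numerical experiment. The sketch below assumes a simplified picture in which the helix is sampled once per circuit of the first angle, so that the second angle (measured in fractions of a full turn) simply advances by the winding number each time; the function names and the choice of the golden mean as the irrational example are illustrative assumptions.

```python
from fractions import Fraction
import math


def return_time(winding, max_circuits=1000):
    """Circuits of the first angle needed before the second angle returns
    exactly to its starting value, or None if it does not within the limit.
    Intended for exact rational winding numbers given as Fraction."""
    theta2 = Fraction(0)
    for k in range(1, max_circuits + 1):
        theta2 = (theta2 + winding) % 1
        if theta2 == 0:
            return k
    return None


def largest_gap(winding, circuits):
    """Largest empty arc (as a fraction of a turn) left by the sampled
    second angle after the given number of circuits."""
    points = sorted((k * winding) % 1.0 for k in range(circuits))
    gaps = [b - a for a, b in zip(points, points[1:])]
    gaps.append(1.0 - points[-1] + points[0])  # wrap-around arc
    return max(gaps)


golden = (1 + math.sqrt(5)) / 2
print("rational 3/5 closes after", return_time(Fraction(3, 5)), "circuits")
print("largest gap, winding 0.6:", largest_gap(0.6, 1000))
print("largest gap, golden mean:", largest_gap(golden, 1000))
```

With a rational winding number the sampled points land on a handful of fixed positions and large arcs stay empty forever, while for the golden mean the largest empty arc keeps shrinking as the helix winds on.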

However, if you could add a new dimension to the torus (or add a new degree of freedom to the dynamical system), then the helix could pass over or under itself by moving into the new dimension. By weaving around itself, a trajectory can become chaotic, and the set of many trajectories can become as mixed up as a bowl of spaghetti. This can be a little hard to visualize, especially in higher dimensions, but Arnold thought of a very simple mathematical mapping that captures the essential motion on a torus, preserving volume as required for a Hamiltonian system, but with the ability for regions to become all mixed up, just like trajectories in a nonintegrable Hamiltonian system.

A unit square is isomorphic to a two-dimensional torus. This means that there is a one-to-one mapping of each point on the unit square to each point on the surface of a torus. Imagine taking a sheet of paper and forming a tube out of it. One of the dimensions of the sheet of paper is now an angle coordinate that is cyclic, going around the circumference of the tube. Now if the sheet of paper is flexible (like it is made of thin rubber) you can bend the tube around and connect the top of the tube with the bottom, like a bicycle inner tube. The other dimension of the sheet of paper is now also an angle coordinate that is cyclic. In this way a flat sheet is converted (with some bending) into a torus.

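Arnold’s key idea was to create a transformation that takes the torus into itself, preserving volume, yet including the ability for regions to pass around each other. Arnold accomplished this with the simple map

$$x_{n+1} = x_n + y_n \pmod 1, \qquad y_{n+1} = x_n + 2 y_n \pmod 1$$

where the modulus 1 takes the unit square into itself. This transformation can also be expressed as a matrix

$$\begin{pmatrix} x_{n+1} \\ y_{n+1} \end{pmatrix} = \begin{pmatrix} 1 & 1 \\ 1 & 2 \end{pmatrix} \begin{pmatrix} x_n \\ y_n \end{pmatrix}$$
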
followed by taking modulus 1. The transformation matrix is called a Floquet matrix, and the determinant of the matrix is equal to unity, which ensures that volume is conserved.

Arnold decided to illustrate this mapping by using a crude image of the face of a cat (See Fig. 1). Successive applications of the transformation stretch and shear the cat, which is then folded back into the unit square. The stretching and folding preserve the volume, but the image becomes all mixed up, just like mixing in a chaotic Hamiltonian system, or like an immiscible dye in water that is stirred.

Fig. 1 Arnold’s illustration of his cat map from pg. 6 of V. I. Arnold and A. Avez, Ergodic Problems of Classical Mechanics (Benjamin, 1968) [5]
Fig. 2 Arnold Cat Map operation is an iterated succession of stretching with shear of a unit square, and translation back to the unit square. The mapping preserves and mixes areas, and is invertible.

Recurrence

When the transformation matrix is applied to continuous values, it produces a continuous range of transformed values that become thinner and thinner until the unit square is uniformly mixed. However, if the unit square is discrete, made up of pixels, then something very different happens (see Fig. 3). The image of the cat in this case is composed of a 50×50 array of pixels. For early iterations, the image becomes stretched and mixed, but at iteration 50 there are 4 low-resolution upside-down versions of the cat, and at iteration 75 the cat fully reforms, but is upside-down. Continuing on, the cat eventually reappears fully reformed and upright at iteration 150. Therefore, the discrete case displays a recurrence and the mapping is periodic. Calculating the period of the cat map on lattices can lead to interesting patterns, especially if the lattice dimension is a prime number [6].

Fig. 3 A discrete cat map has a recurrence period. This example with a 50×50 lattice has a period of 150.
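
The recurrence period can be computed directly from the transformation matrix. The sketch below assumes the Arnold-Avez form of the map, (x, y) → (x + y, x + 2y), acting on integer pixel coordinates modulo the lattice size; the function names are hypothetical and chosen only for this example. The period is the smallest power of the matrix that equals the identity modulo the lattice size.

```python
def cat_map_power(n, k):
    """Entries (a, b, c, d) of the k-th power of [[1, 1], [1, 2]] modulo n."""
    a, b, c, d = 1 % n, 0, 0, 1 % n  # identity matrix
    for _ in range(k):
        # Right-multiply the current power by [[1, 1], [1, 2]], modulo n.
        a, b, c, d = (a + b) % n, (a + 2 * b) % n, (c + d) % n, (c + 2 * d) % n
    return (a, b, c, d)


def cat_map_period(n):
    """Smallest k >= 1 for which the k-th power of the cat map matrix is
    the identity modulo n, for an n-by-n pixel lattice with n >= 2."""
    if n < 2:
        raise ValueError("lattice size must be at least 2")
    a, b, c, d = 1 % n, 1 % n, 1 % n, 2 % n  # first power of the matrix
    k = 1
    while (a, b, c, d) != (1, 0, 0, 1):
        a, b, c, d = (a + b) % n, (a + 2 * b) % n, (c + d) % n, (c + 2 * d) % n
        k += 1
    return k


print("period on a 50x50 lattice:", cat_map_period(50))
print("75th power modulo 50:", cat_map_power(50, 75))
```

For the 50×50 lattice this reproduces the period of 150, and the 75th power comes out as minus the identity modulo 50, which sends every pixel to its point reflection and is consistent with the upside-down cat at iteration 75.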

The Cat Map and the Golden Mean

The golden mean, or the golden ratio, $\varphi = 1.618033988749895$ is never far away when working with Hamiltonian systems. Because the golden mean is the “most irrational” of all irrational numbers, it plays an essential role in KAM theory on the stability of the solar system. In the case of Arnold’s cat map, it pops up its head in several ways. For instance, the transformation matrix has eigenvalues

$$\lambda_\pm = \frac{3 \pm \sqrt{5}}{2}, \qquad \lambda_+ = 1 + \varphi, \quad \lambda_- = 2 - \varphi$$

with the remarkable property that

$$\lambda_+ \lambda_- = 1$$

which guarantees conservation of area.
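
These relations are easy to verify numerically. The following is a small sketch, assuming the Arnold-Avez form of the cat map matrix with rows (1, 1) and (1, 2), whose trace is 3 and whose determinant is 1; the variable names are illustrative.

```python
import math

phi = (1 + math.sqrt(5)) / 2  # golden mean

# Eigenvalues of a 2x2 matrix from its trace and determinant.
trace, det = 3, 1  # for the assumed matrix [[1, 1], [1, 2]]
disc = math.sqrt(trace ** 2 - 4 * det)
lam_plus = (trace + disc) / 2
lam_minus = (trace - disc) / 2

print("lambda+ =", lam_plus, " 1 + phi =", 1 + phi, " phi^2 =", phi ** 2)
print("lambda- =", lam_minus, " 2 - phi =", 2 - phi)
print("product =", lam_plus * lam_minus)
```

The larger eigenvalue is the square of the golden mean and sets the rate of stretching per iteration, while the smaller one sets the matching rate of compression, so that areas are left unchanged.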


Selected V. I. Arnold Publications

Arnold, V. I. “FUNCTIONS OF 3 VARIABLES.” Doklady Akademii Nauk Sssr 114(4): 679-681. (1957)

Arnold, V. I. “GENERATION OF QUASI-PERIODIC MOTION FROM A FAMILY OF PERIODIC MOTIONS.” Doklady Akademii Nauk Sssr 138(1): 13-&. (1961)

Arnold, V. I. “STABILITY OF EQUILIBRIUM POSITION OF A HAMILTONIAN SYSTEM OF ORDINARY DIFFERENTIAL EQUATIONS IN GENERAL ELLIPTIC CASE.” Doklady Akademii Nauk Sssr 137(2): 255-&. (1961)

Arnold, V. I. “BEHAVIOUR OF AN ADIABATIC INVARIANT WHEN HAMILTONS FUNCTION IS UNDERGOING A SLOW PERIODIC VARIATION.” Doklady Akademii Nauk Sssr 142(4): 758-&. (1962)

Arnold, V. I. “CLASSICAL THEORY OF PERTURBATIONS AND PROBLEM OF STABILITY OF PLANETARY SYSTEMS.” Doklady Akademii Nauk Sssr 145(3): 487-&. (1962)

Arnold, V. I. and Y. G. Sinai. “SMALL PERTURBATIONS OF AUTHOMORPHISMS OF A TORE.” Doklady Akademii Nauk Sssr 144(4): 695-&. (1962)

Arnold, V. I. “Small denominators and problems of the stability of motion in classical and celestial mechanics (in Russian).” Usp. Mat. Nauk. 18: 91-192. (1963)

Arnold, V. I. and A. L. Krylov. “UNIFORM DISTRIBUTION OF POINTS ON A SPHERE AND SOME ERGODIC PROPERTIES OF SOLUTIONS TO LINEAR ORDINARY DIFFERENTIAL EQUATIONS IN COMPLEX REGION.” Doklady Akademii Nauk Sssr 148(1): 9-&. (1963)

Arnold, V. I. “INSTABILITY OF DYNAMICAL SYSTEMS WITH MANY DEGREES OF FREEDOM.” Doklady Akademii Nauk Sssr 156(1): 9-&. (1964)

Arnold, V. “SUR UNE PROPRIETE TOPOLOGIQUE DES APPLICATIONS GLOBALEMENT CANONIQUES DE LA MECANIQUE CLASSIQUE.” Comptes Rendus Hebdomadaires Des Seances De L Academie Des Sciences 261(19): 3719-&. (1965)

Arnold, V. I. “APPLICABILITY CONDITIONS AND ERROR ESTIMATION BY AVERAGING FOR SYSTEMS WHICH GO THROUGH RESONANCES IN COURSE OF EVOLUTION.” Doklady Akademii Nauk Sssr 161(1): 9-&. (1965)


Bibliography

[1] Dumas, H. S. The KAM Story: A friendly introduction to the content, history and significance of Classical Kolmogorov-Arnold-Moser Theory, World Scientific. (2014)

[2] See Chapter 6, “The Tangled Tale of Phase Space” in Galileo Unbound (D. D. Nolte, Oxford University Press, 2018)

[3] V. I. Arnold, Mathematical Methods of Classical Mechanics (Nauka 1974, English translation Springer 1978)

[4] See Chapter 3, “Hamiltonian Dynamics and Phase Space” in Introduction to Modern Dynamics, 2nd ed. (D. D. Nolte, Oxford University Press, 2019)

[5] V. I. Arnold and A. Avez, Ergodic Problems of Classical Mechanics (Benjamin, 1968)

[6] Gaspari, G. “THE ARNOLD CAT MAP ON PRIME LATTICES.” Physica D-Nonlinear Phenomena 73(4): 352-372. (1994)

Galileo Unbound

Book Outline Topics

  • Chapter 1: Flight of the Swallows
    • Introduction to motion and trajectories
  • Chapter 2: A New Scientist
    • Galileo’s Biography
  • Chapter 3: Galileo’s Trajectory
    • His study of the science of motion
    • Publication of Two New Sciences
  • Chapter 4: On the Shoulders of Giants
    • Newton’s Principia
    • The Principle of Least Action: Maupertuis, Euler, and Voltaire
    • Lagrange and his new dynamics
  • Chapter 5: Geometry on my Mind
    • Differential geometry of Gauss and Riemann
    • Vector spaces from Grassmann to Hilbert
    • Fractals: Cantor, Weierstrass, Hausdorff
  • Chapter 6: The Tangled Tale of Phase Space
    • Liouville and Jacobi
    • Entropy and Chaos: Clausius, Boltzmann and Poincaré
    • Phase Space: Gibbs and Ehrenfest
  • Chapter 7: The Lens of Gravity
    • Einstein and the warping of light
    • Black Holes: Schwarzschild’s radius
    • Oppenheimer versus Wheeler
    • The Golden Age of General Relativity
  • Chapter 8: On the Quantum Footpath
    • Heisenberg’s matrix mechanics
    • Schrödinger’s wave mechanics
    • Bohr’s complementarity
    • Einstein and entanglement
    • Feynman and the path-integral formulation of quantum mechanics
  • Chapter 9: From Butterflies to Hurricanes
    • KAM theory of stability of the solar system
    • Stephen Smale’s horseshoe
    • Lorenz’ butterfly: strange attractor
    • Feigenbaum and chaos
  • Chapter 10: Darwin in the Clockworks
    • Charles Darwin and the origin of species
    • Fibonacci’s bees
    • Economic dynamics
    • Mendel and the landscapes of life
    • Evolutionary dynamics
    • Linus Pauling’s molecular clock and Dawkins’ meme
  • Chapter 11: The Measure of Life
    • Huygens, von Helmholtz and Rayleigh oscillators
    • Neurodynamics
    • Euler and the Seven Bridges of Königsberg
    • Network theory: Strogatz and Barabasi

In June of 1633 Galileo was found guilty of heresy and sentenced to house arrest for what remained of his life. He was a renaissance Prometheus, bound for giving knowledge to humanity. With little to do, and allowed few visitors, he at last had the uninterrupted time to finish his life’s labor. When Two New Sciences was published in 1638, it contained the seeds of the science of motion that would mature into a grand and abstract vision that permeates all science today. In this way, Galileo was unbound, not by Hercules, but by his own hand as he penned the introduction to his work:

. . . what I consider more important, there have been opened up to this vast and most excellent science, of which my work is merely the beginning, ways and means by which other minds more acute than mine will explore its remote corners.

            Galileo Galilei (1638) Two New Sciences

Galileo Unbound (Oxford University Press, 2018) explores the continuous thread from Galileo’s discovery of the parabolic trajectory to modern dynamics and complex systems. It is a history of expanding dimension and increasing abstraction, until today we speak of entangled quantum particles moving among many worlds, and we envision our lives as trajectories through spaces of thousands of dimensions. Remarkably, common themes persist that predict the evolution of species as readily as the orbits of planets. Galileo laid the foundation upon which Newton built a theory of dynamics that could capture the trajectory of the moon through space using the same physics that controlled the flight of a cannon ball. Late in the nineteenth century, concepts of motion expanded into multiple dimensions, and in the twentieth century geometry became the cause of motion rather than the result when Einstein envisioned the fabric of space-time warped by mass and energy, causing light rays to bend past the Sun. Possibly more radical was Feynman’s dilemma of quantum particles taking all paths at once, setting the stage for the modern fields of quantum field theory and quantum computing. Yet as concepts of motion have evolved, one thing has remained constant: the need to track ever more complex changes and to capture their essence, to find patterns in the chaos as we try to predict and control our world. Today’s ideas of motion go far beyond the parabolic trajectory, but even Galileo might recognize the common thread that winds through all these motions, drawing them together into a unified view that gives us the power to see, at least a little, through the mists shrouding the future.