Posts by David D. Nolte

E. M. Purcell Distinguished Professor of Physics and Astronomy at Purdue University

Bohr's Orbits

The first time I ran across the Bohr-Sommerfeld quantization conditions I admit that I laughed! I was a TA for the Modern Physics course as a graduate student at Berkeley in 1982 and I read about Bohr-Sommerfeld in our Tipler textbook. I was familiar with Bohr orbits, which are already the wrong way of thinking about quantized systems. So the Bohr-Sommerfeld conditions, especially for so-called “elliptical” orbits, seemed like nonsense.

But it’s funny how a little distance gives you perspective. Forty years later I know a little more physics than I did then, and I have gained a deep respect for an obscure property of dynamical systems known as “adiabatic invariants”. It turns out that adiabatic invariants lie at the core of quantum systems, and in the case of hydrogen adiabatic invariants can be visualized as … elliptical orbits!

Quantum Physics in Copenhagen

Niels Bohr (1885 – 1962) was born in Copenhagen, Denmark, the middle child of a physiology professor at the University in Copenhagen.  Bohr grew up with his siblings as a faculty child, which meant an unconventional upbringing full of ideas, books and deep discussions.  Bohr was a late bloomer in secondary school but began to show talent in math and physics in his last two years.  When he entered the University in Copenhagen in 1903 to major in physics, the university had only one physics professor, Christian Christiansen, and had no physics laboratories.  So Bohr tinkered in his father’s physiology laboratory, performing a detailed experimental study of the hydrodynamics of water jets, writing and submitting a paper that was to be his only experimental work.  Bohr went on to receive a Master’s degree in 1909 and his PhD in 1911, writing his thesis on the theory of electrons in metals.  Although the thesis did not break much new ground, it uncovered striking disparities between observed properties and theoretical predictions based on the classical theory of the electron.  For his postdoc studies he applied for and was accepted to a position working with the discoverer of the electron, Sir J. J. Thomson, in Cambridge.  Perhaps fortunately for the future history of physics, he did not get along well with Thomson, and he shifted his postdoc position in early 1912 to work with Ernest Rutherford at the much less prestigious University of Manchester.

Niels Bohr (Wikipedia)

Ernest Rutherford had just completed a series of detailed experiments on the scattering of alpha particles from gold foil and had demonstrated that the mass of the atom was concentrated in a very small volume that Rutherford called the nucleus, which also carried the positive charge compensating the negative electron charges.  The discovery of the nucleus created a radical new model of the atom in which electrons executed planetary-like orbits around the nucleus.  Bohr immediately went to work on a theory for the new model of the atom.  He worked closely with Rutherford and the other members of Rutherford’s laboratory, involved in daily discussions on the nature of atomic structure.  The open intellectual atmosphere of Rutherford’s group and the ready flow of ideas in group discussions became the model for Bohr, who would some years later set up his own research center that would attract the top young physicists of the time.  Already by mid-1912, Bohr was beginning to see a path forward, hinting in letters to his younger brother Harald (who would become a famous mathematician) that he had uncovered a new approach that might explain some of the observed properties of simple atoms.

By the end of 1912 his postdoc travel stipend was over, and he returned to Copenhagen, where he completed his work on the hydrogen atom.  One of the key discrepancies in the classical theory of the electron in atoms was the requirement, by Maxwell’s Laws, for orbiting electrons to continually radiate because of their angular acceleration.  Furthermore, from energy conservation, if they radiated continuously, the electron orbits must also eventually decay into the nuclear core with ever-decreasing orbital periods and hence ever higher emitted light frequencies.  Experimentally, on the other hand, it was known that light emitted from atoms had only distinct quantized frequencies.  To circumvent the problem of classical radiation, Bohr simply assumed what was observed, formulating the idea of stationary quantum states.  Light emission (or absorption) could take place only when the energy of an electron changed discontinuously as it jumped from one stationary state to another, and there was a lowest stationary state below which the electron could never fall.  He then took a critical and important step, combining this new idea of stationary states with Planck’s constant h.  He was able to show that the emission spectrum of hydrogen, and hence the energies of the stationary states, could be derived if the angular momentum of the electron in a hydrogen atom was quantized in integer units of Planck’s constant h divided by 2π.
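As a compact reminder of the result (a standard modern reconstruction, not Bohr’s 1913 notation): quantizing the angular momentum as L = nħ gives the stationary-state energies and the emitted frequencies

$$E_n = -\frac{m_e e^4}{8\epsilon_0^2 h^2}\,\frac{1}{n^2} \approx -\frac{13.6\ \text{eV}}{n^2}, \qquad h\nu = E_{n_2} - E_{n_1},$$

which reproduces the Balmer series of hydrogen.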

Bohr published his quantum theory of the hydrogen atom in 1913, which immediately focused the attention of a growing group of physicists (including Einstein, Rutherford, Hilbert, Born, and Sommerfeld) on the new possibilities opened up by Bohr’s quantum theory [1].  Emboldened by his growing reputation, Bohr petitioned the university in Copenhagen to create a new faculty position in theoretical physics, and to appoint him to it.  The University was not unreceptive, but university bureaucracies make decisions slowly, so Bohr returned to Rutherford’s group in Manchester while he awaited Copenhagen’s decision.  He waited over two years, but he enjoyed his time in the stimulating environment of Rutherford’s group in Manchester, growing steadily into the role of master of the new quantum theory.  In June of 1916, Bohr returned to Copenhagen and a year later was elected to the Royal Danish Academy of Sciences.

Although Bohr’s theory had succeeded in describing some of the properties of the electron in atoms, two central features of his theory continued to cause difficulty.  The first was the limitation of the theory to single electrons in circular orbits, and the second was the cause of the discontinuous jumps.  In response to this challenge, Arnold Sommerfeld provided a deeper mechanical perspective on the origins of the discrete energy levels of the atom. 

Quantum Physics in Munich

Arnold Johannes Wilhelm Sommerfeld (1868—1951) was born in Königsberg, Prussia, and spent all the years of his education there up to the doctorate that he received in 1891.  In Königsberg he was acquainted with Minkowski, Wien and Hilbert, and he was the doctoral student of Lindemann.  He also was associated with a social group at the University that spent too much time drinking and dueling, a distraction that led to his receiving a deep sabre cut on his forehead that became one of his distinguishing features along with his finely waxed moustache.  In outward appearance, he looked the part of a Prussian hussar, but he finally escaped this life of dissipation and landed in Göttingen where he became Felix Klein’s assistant in 1894.  He taught at local secondary schools, rising in reputation, until he secured a faculty position in theoretical physics at the University in Munich in 1906.  One of his first students was Peter Debye who received his doctorate under Sommerfeld in 1908.  Later famous students would include Paul Peter Ewald (doctorate in 1912), Wolfgang Pauli (doctorate in 1921), Werner Heisenberg (doctorate in 1923), and Hans Bethe (doctorate in 1928).  These students had the rare treat, during their time studying under Sommerfeld, of spending weekends in the winter skiing and staying at a ski hut that he owned only two hours by train outside of Munich.  At the end of the day skiing, discussion would turn invariably to theoretical physics and the leading problems of the day.  It was in his early days at Munich that Sommerfeld played a key role aiding the general acceptance of Minkowski’s theory of four-dimensional space-time by publishing a review article in Annalen der Physik that translated Minkowski’s ideas into language that was more familiar to physicists.

Arnold Sommerfeld (Wikipedia)

Around 1911, Sommerfeld shifted his research interest to the new quantum theory, and his interest only intensified after the publication of Bohr’s model of hydrogen in 1913.  In 1915 Sommerfeld significantly extended the Bohr model by building on an idea put forward by Planck.  While further justifying the black body spectrum, Planck turned to descriptions of the trajectory of a quantized one-dimensional harmonic oscillator in phase space.  Planck had noted that the phase-space areas enclosed by the quantized trajectories were integral multiples of his constant.  Sommerfeld expanded on this idea, showing that it was not the area enclosed by the trajectories that was fundamental, but the integral of the momentum over the spatial coordinate [2].  This integral is none other than the original action integral of Maupertuis and Euler, used so famously in their Principle of Least Action almost 200 years earlier.  Where Planck, in his original paper of 1901, had recognized the units of his constant to be those of action, and hence called it the quantum of action, Sommerfeld made the explicit connection to the dynamical trajectories of the oscillators.  He then showed that the same action principle applied to Bohr’s circular orbits for the electron in the hydrogen atom, and that the orbits need not even be circular, but could be elliptical Keplerian orbits.

The quantum condition for this otherwise classical trajectory was the requirement for the action integral over the motion to be equal to integer units of the quantum of action.  Furthermore, Sommerfeld showed that there must be as many action integrals as degrees of freedom for the dynamical system.  In the case of Keplerian orbits, there are radial coordinates as well as angular coordinates, and each action integral was quantized for the discrete electron orbits.  Although Sommerfeld’s action integrals extended Bohr’s theory of quantized electron orbits, the new quantum conditions also created a problem because there were now many possible elliptical orbits that all had the same energy.  How was one to find the “correct” orbit for a given orbital energy?
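In modern notation (a standard restatement rather than Sommerfeld’s original symbols), the quantum conditions take the form of one action integral per separable coordinate,

$$\oint p_r\,dr = n_r h, \qquad \oint p_\phi\,d\phi = n_\phi h,$$

and for the Kepler problem the energy turns out to depend only on the sum n = n_r + n_φ, which is exactly why many distinct elliptical orbits share the same energy.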

Quantum Physics in Leiden

In 1906, the Austrian physicist Paul Ehrenfest (1880 – 1933), freshly out of his PhD under the supervision of Boltzmann, arrived at Göttingen only weeks before Boltzmann took his own life.  Felix Klein at Göttingen had been relying on Boltzmann to provide a comprehensive review of statistical mechanics for the Mathematical Encyclopedia, so he now entrusted this project to the young Ehrenfest.  It was a monumental task, which was to take him and his physicist wife Tatyana nearly five years to complete.  Part of the delay was the desire by Ehrenfest to close some open problems that remained in Boltzmann’s work.  One of these was a mechanical theorem of Boltzmann’s that identified properties of statistical mechanical systems that remained unaltered through a very slow change in system parameters.  These properties would later be called adiabatic invariants by Einstein.  Ehrenfest recognized that Wien’s displacement law, which had been a guiding light for Planck and his theory of black body radiation, had originally been derived by Wien using classical principles related to slow changes in the volume of a cavity.  Ehrenfest was struck by the fact that such slow changes would not induce changes in the quantum numbers of the quantized states, and hence that the quantum numbers must be adiabatic invariants of the black body system.  This not only explained why Wien’s displacement law continued to hold under quantum as well as classical considerations, but it also explained why Planck’s quantization of the energy of his simple oscillators was the only possible choice.  For a classical harmonic oscillator, the ratio of the energy of oscillation to the frequency of oscillation is an adiabatic invariant, which is immediately recognized in Planck’s quantum condition E = nhf.
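A minimal sketch of the oscillator argument, in standard notation rather than Ehrenfest’s original symbols: for a harmonic oscillator whose parameters are varied slowly compared with its period, the action

$$J = \oint p\,dq = \frac{E}{\nu}$$

remains constant, so quantizing the adiabatic invariant as J = nh reproduces Planck’s condition E = nhν.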

Paul Ehrenfest (Wikipedia)

Ehrenfest published his observations in 1913 [3], the same year that Bohr published his theory of the hydrogen atom, so Ehrenfest immediately applied the theory of adiabatic invariants to Bohr’s model and discovered that the quantum condition for the quantized energy levels was again an adiabatic invariant of the electron orbits, and not merely a consequence of integer multiples of angular momentum, which had seemed somewhat ad hoc.  Later, when Sommerfeld published his quantized elliptical orbits in 1916, the multiplicity of quantum conditions and orbits had caused concern, but Ehrenfest came to the rescue with his theory of adiabatic invariants, showing that each of Sommerfeld’s quantum conditions was precisely an adiabatic invariant of the classical electron dynamics [4]. The remaining question was which coordinates were the correct ones, because different choices led to different answers.  This was quickly solved by Johannes Burgers (one of Ehrenfest’s students) who showed that action integrals were adiabatic invariants, and then by Karl Schwarzschild and Paul Epstein who showed that action-angle coordinates were the only allowed choice of coordinates, because they enabled the separation of the Hamilton-Jacobi equations and hence provided the correct quantization conditions for the electron orbits.  Schwarzschild’s paper was published the same day that he died on the Eastern Front.  The work by Schwarzschild and Epstein was the first to show the power of the Hamiltonian formulation of dynamics for quantum systems, which foreshadowed the future importance of Hamiltonians for quantum theory.

Karl Schwarzschild (Wikipedia)

Bohr-Sommerfeld

Emboldened by Ehrenfest’s adiabatic principle, which demonstrated a close connection between classical dynamics and quantization conditions, Bohr formalized a technique that he had used implicitly in his 1913 model of hydrogen, and now elevated it to the status of a fundamental principle of quantum theory.  He called it the Correspondence Principle, and published the details in 1920.  The Correspondence Principle states that as the quantum number of an electron orbit increases to large values, the quantum behavior converges to classical behavior.  Specifically, if an electron in a state of high quantum number emits a photon while jumping to a neighboring orbit, then the wavelength of the emitted photon approaches the classical radiation wavelength of the electron subject to Maxwell’s equations. 
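As a quick check of what the Correspondence Principle asserts (a standard calculation using the Bohr energy levels, not anything specific to Bohr’s 1920 paper): for a jump from level n to n−1 at large n, the emitted frequency is

$$\nu = \frac{E_n - E_{n-1}}{h} = R_\infty c\left(\frac{1}{(n-1)^2} - \frac{1}{n^2}\right) \;\longrightarrow\; \frac{2R_\infty c}{n^3} \quad (n \gg 1),$$

which is precisely the orbital frequency of the classical electron in the n-th Bohr orbit, so the quantum emission frequency merges with the classical radiation frequency.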

Bohr’s Correspondence Principle cemented the bridge between classical physics and quantum physics.  One of the biggest open questions about electron orbits in atoms had been why they did not radiate continuously because of the angular acceleration they experience in their orbits.  Bohr had now reconnected the quantum orbits to Maxwell’s equations and classical physics in the limit of large quantum numbers.  Like the theory of adiabatic invariants, the Correspondence Principle became a new tool for distinguishing among different quantum theories.  It could be used as a filter to distinguish “correct” quantum models, which transitioned smoothly from quantum to classical behavior, from those that did not.  Bohr’s Correspondence Principle was to be a powerful tool in the hands of Werner Heisenberg as he reinvented quantum theory only a few years later.

Quantization conditions.

By the end of 1920, all the elements of the quantum theory of electron orbits were apparently falling into place.  Bohr’s originally ad hoc quantization condition was now on firm footing.  The quantization conditions were related to action integrals that were, in turn, adiabatic invariants of the classical dynamics.  This meant that slight variations in the parameters of the dynamical systems would not induce quantum transitions among the various quantum states.  This conclusion would have felt right to the early quantum practitioners.  Bohr’s quantum model of electron orbits was fundamentally a means of explaining quantum transitions between stationary states.  Now it appeared that the condition for the stationary states of the electron orbits was an insensitivity, or invariance, to variations in the dynamical properties.  This was analogous to the principle of stationary action where the action along a dynamical trajectory is invariant to slight variations in the trajectory.  Therefore, the theory of quantum orbits now rested on firm foundations that seemed as solid as the foundations of classical mechanics.

From the perspective of modern quantum theory, the concept of elliptical Keplerian orbits for the electron is grossly inaccurate.  Most physicists shudder when they see the symbol for atomic energy—the classic but mistaken icon of electron orbits around a nucleus.  Nonetheless, Bohr and Ehrenfest and Sommerfeld had hit on a deep thread that runs through all of physics—the concept of action—the same concept that Leibniz introduced, that Maupertuis minimized and that Euler canonized.  This concept of action is at work in the macroscopic domain of classical dynamics as well as the microscopic world of quantum phenomena.  Planck was acutely aware of this connection with action, which is why he so readily recognized his elementary constant as the quantum of action. 

However, the old quantum theory was running out of steam.  For instance, the action integrals and adiabatic invariants only worked for single electron orbits, leaving the vast bulk of many-electron atomic matter beyond the reach of quantum theory and prediction.  The literal electron orbits had become a crutch that prevented physicists from seeing new possibilities for quantum theory.  Orbits were an anachronism, exerting a damping force on progress.  This limitation became painfully clear when Bohr and his assistants at Copenhagen (Kramers and Slater) attempted to use their electron orbits to explain the refractive index of gases.  The theory was cumbersome, and its explanatory power was exhausted.  It was time for a new quantum revolution by a new generation of quantum wizards: Heisenberg, Born, Schrödinger, Pauli, Jordan and Dirac.


References

[1] N. Bohr, “On the Constitution of Atoms and Molecules, Part II Systems Containing Only a Single Nucleus,” Philosophical Magazine, vol. 26, pp. 476–502, 1913.

[2] A. Sommerfeld, “The quantum theory of spectral lines,” Annalen Der Physik, vol. 51, pp. 1-94, Sep 1916.

[3] P. Ehrenfest, “Een mechanische theorema van Boltzmann en zijne betrekking tot de quanta theorie (A mechanical theorem of Boltzmann and its relation to the theory of energy quanta),” Verslag van de Gewone Vergaderingen der Wis- en Natuurkundige Afdeeling, vol. 22, pp. 586-593, 1913.

[4] P. Ehrenfest, “Adiabatic invariants and quantum theory,” Annalen Der Physik, vol. 51, pp. 327-352, Oct 1916.

Snell's Law: The Five-Fold Way

The bending of light rays as they enter a transparent medium—what today is called Snell’s Law—has had a long history of independent discoveries and radically different approaches.  The general problem of refraction was known to the Greeks in the first century AD, and it was later discussed by the Arabic scholar Alhazen.  Ibn Sahl in Baghdad in 984 AD was the first to put an accurate equation to the phenomenon.  Thomas Harriot in England discussed the problem with Johannes Kepler in 1602, unaware of the work by Ibn Sahl.  Willebrord Snellius (1580–1626) in the Netherlands derived the equation for refraction in 1621, but did not publish it, though it was known to Christiaan Huygens (1629 – 1695).  René Descartes (1596 – 1650), unaware of Snellius’ work, derived the law in his Dioptrics, using his newly-invented coordinate geometry.  Christiaan Huygens, in his Traité de la Lumière in 1678, derived the law yet again, this time using his principle of secondary waves, though he acknowledged the prior work of Snellius, permanently cementing the shortened name “Snell” to the law of refraction.

Through this history and beyond, there have been many approaches to deriving Snell’s Law.  Some used ideas of momentum, while others used principles of waves.  Today, there are roughly five different ways to derive Snell’s law.  These are:

            1) Huygens’ Principle,

            2) Fermat’s Principle,

            3) Wavefront Continuity

            4) Plane-wave Boundary Conditions, and

            5) Photon Momentum Conservation.

The approaches differ in detail, but they fall into two rough categories:  the first two fall under minimization or extremum principles, and the last three fall under continuity or conservation principles.

Snell’s Law: Huygens’ Principle

Huygens’ principle, presented in 1678 and published in his Traité de la Lumière in 1690, states that every point on a wavefront serves as the source of a spherical secondary wave.  This was one of the first wave principles ever proposed for light (Robert Hooke had suggested that light had wavelike character based on his observations of colors in thin films) yet remains amazingly powerful even today.  It can be used not only to derive Snell’s law but also properties of light scattering and diffraction.  Huygens’ principle acts as a form of minimization principle: the refracted wavefront is the common tangent to the spherical secondary waves expanding from the points where the rays strike the surface, and finding that tangent is a minimization problem that yields Snell’s Law.

Fig. 1 Huygens’ principle.

            The use of Huygens’ principle for the derivation of Snell’s Law is shown in Fig. 1.  Two parallel incoming rays strike a surface a distance d apart.  The first point emits a secondary spherical wave into the second medium.  The wavefront propagates at a speed of v2 relative to the speed in the first medium of v1.  In the diagram, the propagation distance over the distance d is equal to the sine of the angle

Solving for d and equating the two equations gives

The speed depends on the refractive index as

which leads to Snell’s Law:
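The equations referenced in the steps above can be reconstructed in standard notation (θ1 and θ2 are measured from the surface normal, and t is the propagation time for the wavefront to cross, a label of mine since the original figure is not reproduced here):

$$\sin\theta_1 = \frac{v_1 t}{d}, \qquad \sin\theta_2 = \frac{v_2 t}{d}$$

$$d = \frac{v_1 t}{\sin\theta_1} = \frac{v_2 t}{\sin\theta_2}, \qquad v_1 = \frac{c}{n_1}, \quad v_2 = \frac{c}{n_2}$$

$$n_1 \sin\theta_1 = n_2 \sin\theta_2$$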

Snell’s Law: Fermat’s Principle

Fermat’s principle of least time is a direct minimization problem that finds the least time it takes light to propagate from one point to another.  One of the central questions about Fermat’s principle is: why does it work?  Why is the path of least time the path light needs to take?  I’ll answer that question after we do the derivation.  The configuration of the problem is shown in Fig. 2.

Fig. 2 Fermat’s principle of least time and Feynman’s principle of stationary action leading to maximum constructive interference.

Consider a source point A and a destination point B.  Light travels in a straight line in each medium, deflecting at the point x on the figure.  The speed in medium 1 is c/n1, and the speed in medium 2 is c/n2.  What position x provides the minimum time?

The distances from A to x, and from x to B are, respectively:

The total time is

Minimize this expression by taking the derivative of the time relative to the position x and setting the result to zero

Converting the cosines to sines yields Snell’s Law
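A minimal reconstruction of the steps above, using labels of my own for the geometry (the source A sits a height a above the interface, the destination B a depth b below it, with horizontal separation L), since the original figure is not reproduced:

$$d_1 = \sqrt{x^2 + a^2}, \qquad d_2 = \sqrt{(L-x)^2 + b^2}$$

$$t = \frac{n_1 d_1}{c} + \frac{n_2 d_2}{c}$$

$$\frac{dt}{dx} = \frac{n_1}{c}\frac{x}{d_1} - \frac{n_2}{c}\frac{L-x}{d_2} = 0 \quad\Rightarrow\quad n_1\sin\theta_1 = n_2\sin\theta_2$$

With the angles measured from the normal, x/d1 and (L−x)/d2 are already the sines; the “cosines” mentioned in the text refer to angles measured from the interface in the original figure.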

Fermat’s principle of least time can be explained in terms of wave interference.  If we think of all paths being taken by propagating waves, then those waves that take paths that differ only a little from the optimum path still interfere constructively.  This is the principle of stationarity.  Near the minimum, the time deviates from its optimal value only in second order in the path variation (shown in the right part of Fig. 2).  Therefore, all “nearby” paths interfere constructively, while paths that are farther away begin to interfere destructively.  Therefore, the path of least time is also the path of stationary time and hence stationary optical path length and hence the path of maximum constructive interference.  This is the actual path taken by the wave—and the light.

Snell’s Law: Wavefront Continuity

When a wave passes across an interface between two transparent media the phase of the wave remains continuous.  This continuity of phase provides a way to derive Snell’s Law.  Consider Fig. 3.  A plane wave with wavelength λ1 is incident from medium 1 on an interface with medium 2 in which the wavelength is λ2.  The wavefronts remain continuous, but they are “kinked” at the interface.

Fig. 3 Wavefront continuity.

The waves in medium 1 and medium 2 share the part of the interface between wavefronts.  This distance is

The wavelengths in the two media are related to the refractive index through

where λ0 is the free-space wavelength.  Plugging these into the first expression yields

which relates the denominators through Snell’s Law
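Reconstructing the equations above in standard notation (with the angles again measured from the normal):

$$d = \frac{\lambda_1}{\sin\theta_1} = \frac{\lambda_2}{\sin\theta_2}$$

$$\lambda_1 = \frac{\lambda_0}{n_1}, \qquad \lambda_2 = \frac{\lambda_0}{n_2}$$

$$\frac{\lambda_0}{n_1\sin\theta_1} = \frac{\lambda_0}{n_2\sin\theta_2} \quad\Rightarrow\quad n_1\sin\theta_1 = n_2\sin\theta_2$$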

Snell’s Law: Plane-Wave Boundary Condition

Maxwell’s four equations in integral form can each be applied to the planar interface between two refractive media.

Fig. 4 Electromagnetic boundary conditions leading to phase-matching at the planar interface.

All four boundary conditions can be written as

The only way this condition can be true for all possible values of the fields is if the phases of the wave terms are all the same (phase-matching), namely

which in turn guarantees that the transverse projection of the k-vector is continuous across the interface

and the transverse components (projections) are

where the last line states both Snell’s law of refraction and the law of reflection. Therefore, the general wave boundary condition leads immediately to Snell’s Law.
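A sketch of the expressions referenced above, in standard notation (assuming plane waves of the form exp[i(k·r − ωt)] with incident, reflected and transmitted wavevectors ki, kr, kt; the specific field amplitudes of the original figure are not reproduced): each boundary condition has the generic structure

$$a_i\,e^{i(\mathbf{k}_i\cdot\mathbf{r} - \omega t)} + a_r\,e^{i(\mathbf{k}_r\cdot\mathbf{r} - \omega t)} = a_t\,e^{i(\mathbf{k}_t\cdot\mathbf{r} - \omega t)} \qquad \text{on the interface},$$

which can hold for arbitrary amplitudes only if the phases agree at every point of the interface,

$$\mathbf{k}_i\cdot\mathbf{r} = \mathbf{k}_r\cdot\mathbf{r} = \mathbf{k}_t\cdot\mathbf{r},$$

so the transverse (in-plane) components are equal,

$$k_{ix} = k_{rx} = k_{tx} \quad\Rightarrow\quad n_1\sin\theta_i = n_1\sin\theta_r = n_2\sin\theta_t,$$

giving the law of reflection (θr = θi) and Snell’s law (n1 sin θi = n2 sin θt).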

Snell’s Law: Momentum Conservation

Going from Maxwell’s equations for classical fields to photons keeps the same mathematical form for the transverse components for the k-vectors, but now interprets them in a different manner.  Where before there was a requirement for phase-matching the classical waves at the interface, in the photon picture the transverse k-vector becomes the transverse momentum through de Broglie’s equation

Therefore, continuity of the transverse k-vector is interpreted as conservation of transverse momentum of the photon across the interface.  In the figure the second medium is denser with a larger refractive index n2 > n1.  Hence, the momentum of the photon in the second medium is larger while keeping the transverse momentum projection the same.  This simple interpretation gives the same mathematical form as the previous derivation using classical boundary conditions, namely

which is again Snell’s law and the law of reflection.
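In equations (a standard reconstruction; de Broglie’s relation is the only physical input beyond the boundary condition):

$$\mathbf{p} = \hbar\mathbf{k}, \qquad |\mathbf{k}| = \frac{n\omega}{c},$$

so conservation of the transverse photon momentum px = ħkx across the interface gives

$$n_1\,\frac{\omega}{c}\sin\theta_1 = n_2\,\frac{\omega}{c}\sin\theta_2 \quad\Rightarrow\quad n_1\sin\theta_1 = n_2\sin\theta_2,$$

while the reflected photon, which stays in medium 1 with the same |k|, satisfies θr = θi.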

Fig. 5 Conservation of transverse photon momentum.

Recap

Snell’s Law has an eerie habit of springing from almost any statement that can be made about a dielectric interface. It yields the path of least time, tracks the path of maximum constructive interference, constructs the refracted wavefront as the common tangent of secondary wavefronts, connects continuous wavefronts across the interface, conserves transverse momentum, and guarantees phase matching. These all sound very different, yet all lead to the same simple law of Snellius and Ibn Sahl.

This is deep physics!

Who Invented the Quantum? Einstein vs. Planck

Albert Einstein defies condensation—it is impossible to condense his approach, his insight, his motivation—into a single word like “genius”.  He was complex, multifaceted, contradictory, revolutionary as well as conservative.  Some of his work was so simple that it is hard to understand why no one else did it first, even when they were right in the middle of it.  Lorentz and Poincaré spring to mind—they had been circling the ideas of spacetime for decades—but never stepped back to see what the simplest explanation could be.  Einstein did, and his special relativity was simple and beautiful, and the math is just high-school algebra.  On the other hand, parts of his work—like gravitation—are so embroiled in mathematics and the religion of general covariance that they remain opaque to physics neophytes 100 years later and are usually reserved for graduate study.

            Yet there is a third thread in Einstein’s work that relies on pure intuition—neither simple nor complicated—but almost impossible to grasp how he made his leap.  This is the case when he proposed the real existence of the photon—the quantum particle of light.  For ten years after this proposal, it was considered by almost everyone to be his greatest blunder. It even came up when Planck was nominating Einstein for membership in the German Academy of Science. Planck said

That he may sometimes have missed the target of his speculations, as for example, in his hypothesis of light quanta, cannot really be held against him.

In this single statement, we have the father of the quantum being criticized by the father of the quantum discontinuity.

Max Planck’s Discontinuity

In histories of the development of quantum theory, the German physicist Max Planck (1858—1947) is characterized as an unlikely revolutionary.  He was an establishment man, in the stolid German tradition, who was already embedded in his career, in his forties, holding a coveted faculty position at the University of Berlin.  In his research, he was responding to a theoretical challenge issued by Kirchhoff many years earlier, in 1860, to find the function of temperature and wavelength that described and explained the observed spectrum of radiating bodies.  Planck was not looking for a revolution.  In fact, he was looking for the opposite.  One of his motivations in studying the thermodynamics of electromagnetic radiation was to rebut the statistical theories of Boltzmann.  Planck had never been convinced by the atomistic and discrete approach Boltzmann had used to explain entropy and the second law of thermodynamics.  With the continuum of light radiation he thought he had the perfect system that would show how entropy behaved in a continuous manner, without the need for discrete quantities.

Therefore, Planck’s original intentions were to use blackbody radiation to argue against Boltzmann—to set back the clock.  For this reason, not only was Planck an unlikely revolutionary, he was a counter-revolutionary.  But Planck was a revolutionary because that is what he did, whatever his original intentions were, and he accepted his role as a revolutionary when he had the courage to stand in front of his scientific peers and propose a quantum hypothesis that lay at the heart of physics.

            Blackbody radiation, at the end of the nineteenth century, was a topic of keen interest and had been measured with high precision.  This was in part because it was such a “clean” system, having fundamental thermodynamic properties independent of any of the material properties of the black body, unlike the so-called ideal gases, which always showed some dependence on the molecular properties of the gas. The high-precision measurements of blackbody radiation were made possible by new developments in spectrometers at the end of the century, as well as infrared detectors that allowed very precise and repeatable measurements to be made of the spectrum across broad ranges of wavelengths. 

In 1893 the German physicist Wilhelm Wien (1864—1928) had used adiabatic expansion arguments to derive what became known as Wien’s Displacement Law, which showed a simple inverse relationship between the temperature of the blackbody and the peak wavelength.  Later, in 1896, he showed that the high-frequency behavior could be described by an exponential function of temperature and wavelength that required no other properties of the blackbody.  This was approaching the solution of Kirchhoff’s challenge of 1860 seeking a universal function.  However, at lower frequencies Wien’s approximation failed to match the measured spectrum.  In mid-1900, Planck was able to define a single functional expression that described the experimentally observed spectrum.  Planck had succeeded in describing black-body radiation, but he had not satisfied Kirchhoff’s second condition—to explain it.
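For reference, in their standard modern form rather than Wien’s original notation, the displacement law and Wien’s high-frequency approximation read

$$\lambda_{\max} T = b \approx 2.898\times 10^{-3}\ \text{m·K}, \qquad u(\lambda,T) \approx \frac{a}{\lambda^5}\,e^{-b'/\lambda T},$$

where a and b′ are constants independent of the material of the blackbody.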

            To explain the blackbody spectrum, Planck modeled the emitting body as a set of ideal oscillators.  As an expert in the Second Law, Planck derived the functional form for the radiation spectrum, from which he found the entropy of the oscillators that produced the spectrum.  However, once he had the form for the entropy, he needed to explain why it took that specific form.  In this sense, he was working backwards from a known solution rather than forwards from first principles.  Planck was at an impasse.  He struggled but failed to find any continuum theory that could work.

Then Planck turned to Boltzmann’s statistical theory of entropy, the same theory that he had previously avoided and had hoped to discredit.  He described this as “an act of despair … I was ready to sacrifice any of my previous convictions about physics.”  In Boltzmann’s expression for entropy, it was necessary to “count” possible configurations of states.  But counting can only be done if the states are discrete.  Therefore, he lumped the energies of the oscillators into discrete ranges, or bins, that he called “quanta”.  The size of the bins was proportional to the frequency of the oscillator, and the proportionality constant had the units of Maupertuis’ quantity of action, so Planck called it the “quantum of action”. Finally, based on this quantum hypothesis, Planck derived the functional form of black-body radiation.
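In formulas (a standard reconstruction of Planck’s counting argument, not his original notation): the entropy is obtained by counting the number of ways W of distributing P indivisible energy elements ε = hf among N oscillators,

$$W = \frac{(N+P-1)!}{P!\,(N-1)!}, \qquad S = k_B \ln W,$$

a count that is only possible because the energy elements are discrete.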

            Planck presented his findings at a meeting of the German Physical Society in Berlin on December 14, 1900, introducing the word quantum (plural quanta) into physics from the Latin word that means quantity [1].  It was a casual meeting, and while the attendees knew they were seeing an intriguing new physical theory, there was no sense of a revolution.  But Planck himself was aware that he had created something fundamentally new.  The radiation law of cavities depended on only two physical properties—the temperature and the wavelength—and on two constants—Boltzmann’s constant kB and a new constant that later became known as Planck’s constant h = ΔE/f = 6.6×10⁻³⁴ J·s.  By combining these two constants with other fundamental constants, such as the speed of light, Planck was able to establish accurate values for long-sought constants of nature, like Avogadro’s number and the charge of the electron.
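The resulting law, written in its standard modern form as a spectral energy density per unit frequency, is

$$u(f,T) = \frac{8\pi h f^3}{c^3}\,\frac{1}{e^{hf/k_B T} - 1},$$

which reduces to Wien’s approximation at high frequency and to the classical Rayleigh–Jeans form at low frequency.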

            Although Planck’s quantum hypothesis in 1900 explained the blackbody radiation spectrum, his specific hypothesis was that it was the interaction of the atoms and the light field that was somehow quantized.  He certainly was not thinking in terms of individual quanta of the light field.

Figure. Einstein and Planck at a dinner held by Max von Laue in Berlin on Nov. 11, 1931.

Einstein’s Quantum

When Einstein analyzed the properties of the blackbody radiation in 1905, using his deep insight into statistical mechanics, he was led to the inescapable conclusion that light itself must be quantized in amounts E = hf, where h is Planck’s constant and f is the frequency of the light field.  Although this equation is exactly the same as Planck’s from 1900, the meaning was completely different.  For Planck, this was the discreteness of the interaction of light with matter.  For Einstein, this was the quantum of light energy—whole and indivisible—just as if the light quantum were a particle with particle properties.  For this reason, we can answer the question posed in the title of this Blog—Einstein takes the honor of being the inventor of the quantum.

            Einstein’s clarity of vision is a marvel to behold even to this day.  His special talent was to take simple principles, ones that are almost trivial and beyond reproach, and to derive something profound.  In Special Relativity, he simply assumed the constancy of the speed of light and derived Lorentz’s transformations that had originally been based on obtuse electromagnetic arguments about the electron.  In General Relativity, he assumed that free fall represented an inertial frame, and he concluded that gravity must bend light.  In quantum theory, he assumed that the low-density limit of Planck’s theory had to be consistent with light in thermal equilibrium with the black body container, and he concluded that light itself must be quantized into packets of indivisible energy quanta [2].  One immediate consequence of this conclusion was his simple explanation of the photoelectric effect for which the energy of an electron ejected from a metal by ultraviolet irradiation is a linear function of the frequency of the radiation.  Einstein published his theory of the quanta of light [3] as one of his four famous 1905 articles in Annalen der Physik in his Annus Mirabilis.

Figure. In the photoelectric effect a photon is absorbed by an electron state in a metal promoting the electron to a free electron that moves with a maximum kinetic energy given by the difference between the photon energy and the work function W of the metal. The energy of the photon is absorbed as a whole quantum, proving that light is composed of quantized corpuscles that are today called photons.
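In equation form (the standard photoelectric relation, with W the work function as in the caption above):

$$E_{\max} = h f - W,$$

so the maximum kinetic energy of the ejected electron is a linear function of the light frequency with slope h, independent of the light intensity.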

            Einstein’s theory of light quanta was controversial and was slow to be accepted.  It is ironic that in 1914 when Einstein was being considered for a position at the University in Berlin, Planck himself, as he championed Einstein’s case to the faculty, implored his colleagues to accept Einstein despite his ill-conceived theory of light quanta [4].  This comment by Planck goes far to show how Planck, father of the quantum revolution, did not fully grasp, even by 1914, the fundamental nature and consequences of his original quantum hypothesis.  That same year, the American physicist Robert Millikan (1868—1953) performed a precise experimental measurement of the photoelectric effect, with the ostensible intention of proving Einstein wrong, but he accomplished just the opposite—providing clean experimental evidence confirming Einstein’s theory of the photoelectric effect. 

The Stimulated Emission of Light

About a year after Millikan proved that the quantum of energy associated with light absorption was absorbed as a whole quantum of energy that was not divisible, Einstein took a step further in his theory of the light quantum. In 1916 he published a paper in the proceedings of the German Physical Society that explored how light would be in a state of thermodynamic equilibrium when interacting with atoms that had discrete energy levels. Once again he used simple arguments, this time using the principle of detailed balance, to derive a new and unanticipated property of light—stimulated emission!

Figure. The stimulated emission of light. An excited state is stimulated to emit an identical photon when the electron transitions to its ground state.

The stimulated emission of light occurs when an electron is in an excited state of a quantum system, like an atom, and an incident photon stimulates the emission of a second photon that has the same energy and phase as the first photon. If there are many atoms in the excited state, then this process leads to a chain reaction as 1 photon produces 2, and 2 produce 4, and 4 produce 8, etc. This exponential gain in photons with the same energy and phase is the origin of laser radiation. At the time that Einstein proposed this mechanism, lasers were half a century in the future, but he was led to this conclusion by extremely simple arguments about transition rates.

Figure. Section of Einstein’s 1916 paper that describes the absorption and emission of light by atoms with discrete energy levels [5].

Detailed balance is a principle that states that in thermal equilibrium all fluxes are balanced. In the case of atoms with ground states and excited states, this principle requires that as many transitions occur from the ground state to the excited state as from the excited state to the ground state. The crucial new element that Einstein introduced was to distinguish spontaneous emission from stimulated emission. Just as the probability to absorb a photon must be proportional to the photon density, there must be an equivalent process that de-excites the atom that also must be proportional to the photon density. In addition, an electron must be able to spontaneously emit a photon with a rate that is independent of photon density. This leads to distinct coefficients in the transition rate equations that are today called the “Einstein A and B coefficients”. The B coefficients relate to the photon density, while the A coefficient relates to spontaneous emission.

Figure. Section of Einstein’s 1917 paper that derives the equilibrium properties of light interacting with matter. The “B”-coefficient for transition from state m to state n describes stimulated emission. [6]

Using the principle of detailed balance together with his A and B coefficients as well as Boltzmann factors describing the number of excited states relative to ground state atoms in equilibrium at a given temperature, Einstein was able to derive an early form of what is today called the Bose-Einstein occupancy function for photons.

Derivation of the Einstein A and B Coefficients

Detailed balance requires the rate from m to n to be the same as the rate from n to m

where the first term is the spontaneous emission rate from the excited state m to the ground state n, the second term is the stimulated emission rate, and the third term (on the right) is the absorption rate from n to m. The numbers in each state are Nm and Nn, and the density of photons is ρ. The number in the excited state relative to the number in the ground state is given by the Boltzmann factor

Assuming that the stimulated transition coefficient from n to m is the same as from m to n, and inserting the Boltzmann factor, yields

The Planck density of photons for ΔE = hf is

which yields the final relation between the spontaneous emission coefficient and the stimulated emission coefficient

The total emission rate is

where the p-bar is the average photon number in the cavity. One of the striking aspects of this derivation is that no assumptions are made about the physical mechanisms that determine the coefficient B. Only arguments of detailed balance are required to arrive at these results.
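The equations referenced in the steps above can be reconstructed in standard notation (degeneracy factors are ignored, as in the text; ρ is the spectral energy density of the radiation, and the bar denotes the average photon number per mode, written p-bar in the original):

$$A_{mn} N_m + B_{mn}\,\rho\, N_m = B_{nm}\,\rho\, N_n, \qquad \frac{N_m}{N_n} = e^{-hf/k_B T}$$

$$\rho = \frac{A_{mn}/B_{mn}}{e^{hf/k_B T} - 1}$$

Comparing with Planck’s law,

$$\rho(f,T) = \frac{8\pi h f^3}{c^3}\,\frac{1}{e^{hf/k_B T} - 1} \quad\Rightarrow\quad \frac{A_{mn}}{B_{mn}} = \frac{8\pi h f^3}{c^3},$$

and the total emission rate becomes

$$R_{m\to n} = A_{mn}\,(1 + \bar{p})\,N_m .$$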

Einstein’s Quantum Legacy

Einstein was awarded the Nobel Prize in 1921 for the photoelectric effect, not for the photon nor for any of Einstein’s other theoretical accomplishments.  Even in 1921, the quantum nature of light remained controversial.  It was only in 1923, after the American physicist Arthur Compton (1892—1962) showed that energy and momentum were conserved in the scattering of photons from electrons, that the quantum nature of light began to be accepted.  A few years later, in 1926, the quantum of light was named the “photon” by the American chemical physicist Gilbert Lewis (1875—1946).

            A blog article like this, that attributes the invention of the quantum to Einstein rather than Planck, must say something about the irony of this attribution.  If Einstein is the father of the quantum, he ultimately was led to disinherit his own brainchild.  His final and strongest argument against the quantum properties inherent in the Copenhagen Interpretation was his famous EPR paper which, against his expectations, launched the concept of entanglement that underlies the coming generation of quantum computers.


Einstein’s Quantum Timeline

1900 – Planck’s quantum discontinuity for the calculation of the entropy of blackbody radiation.

1905 – Einstein’s “Miracle Year”. Proposes the light quantum.

1911 – First Solvay Conference on the theory of radiation and quanta.

1913 – Bohr’s quantum theory of hydrogen.

1914 – Einstein becomes a member of the German Academy of Science.

1915 – Millikan measurement of the photoelectric effect.

1916 – Einstein proposes stimulated emission.

1921 – Einstein receives Nobel Prize for photoelectric effect and the light quantum. Third Solvay Conference on atoms and electrons.

1927 – Heisenberg’s uncertainty relation. Fifth Solvay International Conference on Electrons and Photons in Brussels. “First” Bohr-Einstein debate on indeterminacy in quantum theory.

1930 – Sixth Solvay Conference on magnetism. “Second” Bohr-Einstein debate.

1935 – Einstein-Podolsky-Rosen (EPR) paper on the completeness of quantum mechanics.


Selected Einstein Quantum Papers

Einstein, A. (1905). “Generation and conversion of light with regard to a heuristic point of view.” Annalen Der Physik 17(6): 132-148.

Einstein, A. (1907). “Die Plancksche Theorie der Strahlung und die Theorie der spezifischen Wärme.” Annalen der Physik 22: 180–190.

Einstein, A. (1909). “On the current state of radiation problems.” Physikalische Zeitschrift 10: 185-193.

Einstein, A. and O. Stern (1913). “An argument for the acceptance of molecular agitation at absolute zero.” Annalen Der Physik 40(3): 551-560.

Einstein, A. (1916). “Strahlungs-Emission und -Absorption nach der Quantentheorie.” Verh. Deutsch. Phys. Ges. 18: 318.

Einstein, A. (1917). “Quantum theory of radiation.” Physikalische Zeitschrift 18: 121-128.

Einstein, A., B. Podolsky and N. Rosen (1935). “Can quantum-mechanical description of physical reality be considered complete?” Physical Review 47(10): 0777-0780.


Notes

[1] M. Planck, “Elementary quanta of matter and electricity,” Annalen Der Physik, vol. 4, pp. 564-566, Mar 1901.

[2] Klein, M. J. (1964). Einstein’s First Paper on Quanta. The natural philosopher. D. A. Greenberg and D. E. Gershenson. New York, Blaisdell. 3.

[3] A. Einstein, “Generation and conversion of light with regard to a heuristic point of view,” Annalen Der Physik, vol. 17, pp. 132-148, Jun 1905.

[4] Chap. 2 in “Mind at Light Speed”, by David Nolte (Free Press, 2001)

[5] Einstein, A. (1916). “Strahlungs-Emission und -Absorption nach der Quantentheorie.” Verh. Deutsch. Phys. Ges. 18: 318.

[6] Einstein, A. (1917). “Quantum theory of radiation.” Physikalische Zeitschrift 18: 121-128.

Looking Under the Hood of the Generalized Stokes Theorem

Everyone who has taken classes in physics or engineering knows that the most magical of all vector identities (and there are so many vector identities) are Green’s theorem in 2D, and Stokes’ and Gauss’ theorem in 3D.  These theorems have the magical ability to take an integral over some domain and replace it with a simpler integral over the boundary of the domain.  For instance, the vector form of Stokes’ theorem in 3D is

for the curl of a vector field, where S is the surface domain, and C is the closed loop surrounding the domain.
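In standard vector notation (a reconstruction of the displayed equation, since the original image is not reproduced), the theorem reads

$$\oint_C \mathbf{F}\cdot d\boldsymbol{\ell} = \iint_S \left(\nabla\times\mathbf{F}\right)\cdot d\mathbf{A},$$

with C = ∂S the oriented boundary of the surface S.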

Maybe the most famous application of these theorems is to convert Maxwell’s equations of electromagnetism from their differential form to their integral form.  For instance, we can start with the differential version for the curl of the B-field and integrate over a surface

then applying Stokes’ theorem in 3D (or Green’s theorem in 2D), which converts the two-dimensional surface integral into a one-dimensional integral around the closed loop bounding the surface, yields the integral form of Ampère’s law
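Written out (SI form with the displacement-current term included; a standard reconstruction of the equations referenced above):

$$\iint_S \left(\nabla\times\mathbf{B}\right)\cdot d\mathbf{A} = \mu_0 \iint_S \left(\mathbf{J} + \epsilon_0\frac{\partial\mathbf{E}}{\partial t}\right)\cdot d\mathbf{A}$$

$$\oint_C \mathbf{B}\cdot d\boldsymbol{\ell} = \mu_0 I_{\text{enc}} + \mu_0\epsilon_0\,\frac{d}{dt}\iint_S \mathbf{E}\cdot d\mathbf{A}$$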

Stokes’ theorem has the important property that it converts a high-dimensional integral into a lower-dimensional integral over the closed boundary of the original domain. Stokes’ theorem in component form is
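One standard way to write this in component form (an assumption about the exact expression intended here, since the original equation is not shown; repeated indices are summed and ε is the Levi-Civita symbol) is

$$\oint_C F_i\, dx_i = \iint_S \varepsilon_{ijk}\,\partial_j F_k \; dA_i .$$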

In the case of Green’s theorem in 2D, the principle is easy to explain by the oriented vector character of the integrals and the notion of dividing a domain into small elements with oriented edges. In the case of nonzero circulation, all internal edges of adjacent small regions cancel pairwise until the outer boundary is reached, where a macroscopic circulation persists along the outer edges. Similarly, in Gauss’ theorem in 3D, the flux of a vector through the face of one element is equal and opposite to the flux through the adjacent element, canceling pairwise until the outer boundary is reached, where the net flux summed over the outer faces remains finite. This pairwise cancelation on adjacent subdomains, persisting until the outer boundary is reached, is the essential content of Stokes’ theorem, and it extends to spaces of any dimension and to general manifolds that need not be Euclidean.

Figure. Principle of Stokes’ theorem. The circulation from all internal edges cancels out. But on the boundary, all edges add together for a macroscopic circulation.

George Stokes and the Cambridge Tripos

Since 1824, the mathematics course at Cambridge University has held a yearly exam called the Tripos to identify the top graduating mathematics student.  The winner of the contest is called the Senior Wrangler, and in the 1800’s the Senior Wrangler received a level of public fame and admiration for intellectual achievement that is somewhat like the fame reserved today for star athletes.  Famous Senior Wranglers include George Airy, John Herschel, Arthur Cayley, Lord Rayleigh, Arthur Eddington, J. E. Littlewood, Peter Guthrie Tait and Joseph Larmor.

Figure. Sir George Gabriel Stokes, 1st Baronet

            In his second year at Cambridge, Stokes had begun studying under William Hopkins (1793 – 1866), and in 1841 George Stokes became Senior Wrangler the same year he won the Smith’s Prize in mathematics.  The Tripos tested primarily on bookwork, while the Smith’s Prize tested on originality.  To achieve top scores on both designated the student as the most capable and creative mathematician of his class.  Stokes was immediately offered a fellowship at Pembroke College allowing him to teach and study whatever he willed. Within eight years he was chosen for the Lucasian Chair of Mathematics. The Lucasian Chair of Mathematics at Cambridge is one of the most famous academic chairs in the world.  The first Lucasian professor was Isaac Barrow in 1664 followed by Isaac Newton who held the post for 33 years.  Other famous Lucasian professors were George Airy, Charles Babbage, Joseph Larmor, Paul Dirac as well as Stephen Hawking. Among the many fields that Stokes made important contributions was hydrodynamics where he derived Stokes’ Law of Drag.

In 1854 Stokes was one of the Cambridge professors setting exam questions for the Tripos. In a letter that William Thomson (later Lord Kelvin) wrote Stokes, he suggested putting on the exam the task of extending Green’s Theorem to three dimensions and proving the theorem, and Stokes obliged. That year the Tripos consisted of 16 papers spread over 8 days, totaling over 40 hours of effort on 211 questions. One of the candidates for Senior Wrangler that year was James Clerk Maxwell, but he was narrowly beaten out by Edward Routh (1831 – 1907). Routh became famous, but not as famous as Maxwell, who later applied Stokes’ Theorem to derive the equations of electrodynamics.

The Fundamental Theorem of Calculus

One of the first and simplest theorems that any student of intro calculus is taught is the Fundamental Theorem of Calculus
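In standard notation (reconstructing the displayed equation that belongs here), the theorem states

$$\int_a^b f(x)\,dx = F(b) - F(a),$$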

where F is called the “antiderivative” of the function f . The interpretation of the Fundamental Theorem is extremely simple:  The integral of a function over a domain is equal to its antiderivative evaluated at the boundary of the domain.  Generalizing this theorem a bit, it says that evaluating an integral over a domain is the same thing as evaluating a lower-dimensional quantity over the boundary of the domain.  The Fundamental Theorem of Calculus sounds a lot like Green’s Theorem or Stokes’ Theorem!  And in fact, they are all part of the same principle.  To understand this principle, we have to look into differential forms and the use of Grassmann’s wedge product and exterior algebra (the subject of my previous blog post).

Differential Forms

Just as in the case of the exterior algebra, the fundamental identities defined for differential forms are given by

A differential 1-form α and a differential 2-form β can be expressed as
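Reconstructing the expressions referenced in the two sentences above (standard notation; the coefficient names ai and bi are my labels): the fundamental identities are

$$dx\wedge dy = -\,dy\wedge dx, \qquad dx\wedge dx = 0,$$

and the forms can be written

$$\alpha = a_1\,dx + a_2\,dy + a_3\,dz, \qquad \beta = b_1\,dy\wedge dz + b_2\,dz\wedge dx + b_3\,dx\wedge dy.$$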

The key to understanding why the wedge product shows up in this definition is to recognize that the operation of producing a product of differentials is only defined for the wedge product.  Within the language of differential forms, the symbol dxdy has no meaning, despite the fact that this symbol shows up routinely when integrating.  In fact, integrals that use the expression dxdy are ambiguous, because the oriented surface must be inferred from the context of the integral rather than given.  This is why integration over multiple variables should actually be performed using differential forms, though it is rarely (or never) stated in lower-level calculus classes.

Integration of Differential Forms

Line integrals, as in the Fundamental Theorem of Calculus, are obvious and unique.  However, as soon as we move to integrals over areas, the wedge product is needed.  This is because a general area is oriented.  If you think of a plane defined by z = 0, the surface element dxdy can be oriented along either the positive z-axis or the negative z-axis.  Which one should you take?  The answer is: don’t make the choice.  Work with differential forms, and the integral may be over dx^dy or dy^dx, depending on the exterior analysis that produced the integral in the first place.  One is the negative of the other.  You take the element as it arises from the algebra, and you cannot go wrong!

As an example, we can use differential forms to express a surface integral correctly as

If you make the substitutions: x = (p-q)/2 and y = (p+q)/2, then dp = dx + dy and dq = dy – dx and

which yields

In this case, you will recognize that the factor of -2 is just the Jacobian of the transformation.  Working this way with differential forms makes transformation simple, like a book-keeping trick, and safe, so you just follow the algebra through without needing to make choices.
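Carrying out the wedge-product algebra with the stated substitutions (a sketch; the integrand and the ordering of the differentials in the original integral are not shown, so only the factor itself can be checked here):

$$dp\wedge dq = (dx+dy)\wedge(dy-dx) = dx\wedge dy - dy\wedge dx = 2\,dx\wedge dy,$$

so dx∧dy = (1/2) dp∧dq, and reversing the order of either pair of differentials flips the sign. The magnitude 2 is the Jacobian ∂(p,q)/∂(x,y), with the sign carried automatically by the orientation of the wedge product.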

Exterior Differentiation

The exterior derivative of the 1-form α (defined above) is defined as

where the exterior derivative turns a differential r-form into a differential (r+1)-form.  For instance, in 3D

This should look very familiar to you.  If we expressly make the equivalence

where the integral on the left is a surface integral over a domain, and the integral on the right is a line integral over a line bounding the domain, then

This is just the curl theorem (Stokes’ theorem).
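Reconstructing the expressions referenced above (standard notation, with α = a1 dx + a2 dy + a3 dz as before): the exterior derivative of a 1-form is

$$d\alpha = \sum_{i<j}\left(\frac{\partial a_j}{\partial x_i} - \frac{\partial a_i}{\partial x_j}\right) dx_i\wedge dx_j,$$

which in 3D expands to

$$d\alpha = \left(\frac{\partial a_3}{\partial y}-\frac{\partial a_2}{\partial z}\right) dy\wedge dz + \left(\frac{\partial a_1}{\partial z}-\frac{\partial a_3}{\partial x}\right) dz\wedge dx + \left(\frac{\partial a_2}{\partial x}-\frac{\partial a_1}{\partial y}\right) dx\wedge dy,$$

whose coefficients are the components of the curl.  The equivalence

$$\int_S d\alpha = \oint_{\partial S} \alpha$$

then reads, in vector language,

$$\iint_S (\nabla\times\mathbf{a})\cdot d\mathbf{A} = \oint_C \mathbf{a}\cdot d\boldsymbol{\ell}.$$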

Figure. Stokes Theorem in 3D vector form and general form.

Taking the dimension up one notch, consider the differential 2-form β where

This again looks very familiar, and if we write down the equivalence

then we immediately have the divergence theorem.
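In the same notation (with β = b1 dy∧dz + b2 dz∧dx + b3 dx∧dy as above):

$$d\beta = \left(\frac{\partial b_1}{\partial x}+\frac{\partial b_2}{\partial y}+\frac{\partial b_3}{\partial z}\right) dx\wedge dy\wedge dz,$$

and the equivalence

$$\int_V d\beta = \int_{\partial V} \beta \quad\Leftrightarrow\quad \iiint_V (\nabla\cdot\mathbf{b})\,dV = \iint_S \mathbf{b}\cdot d\mathbf{A}.$$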

We can even find other vector identities using these differential forms.  For instance, if we start with a 2-form expressed as

then we have proven the vector identity

stating that the divergence of a curl must vanish.  This is like playing games with simple algebra to prove profound theorems in vector calculus!
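A sketch of the step being referenced: take the 2-form to be the exterior derivative of a 1-form, β = dα. Since the exterior derivative applied twice vanishes identically,

$$d\beta = d(d\alpha) = 0 \quad\Leftrightarrow\quad \nabla\cdot(\nabla\times\mathbf{a}) = 0.$$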

Figure. Exterior differentiation of a differential 1-form to yield a differential 2-form.

Stokes’ Theorem in Higher Dimensions

The power of differential forms is their ability to generalize automatically to higher dimensions. The differential 1-form can have any number of indices for multiple dimensions, and exterior differentiation yields the familiar curl theorem in any number of dimensions

But exterior differentiation of the differential 2-form in 4D gives a mixed expression that is neither a pure curl nor a pure divergence

The differential 3-form in 4D under exterior differentiation yields the 4D divergence

although the orientations of the 3D boundary elements must be chosen appropriately.
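All of these cases are instances of the generalized Stokes theorem, which can be stated compactly for any r-form ω on an oriented (r+1)-dimensional domain Ω with boundary ∂Ω:

$$\int_\Omega d\omega = \int_{\partial\Omega} \omega .$$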

Differential Forms in 4D Electromagnetics

As long as we are working with differential forms and Stokes’ Theorem, let’s finish up by looking at Maxwell’s electromagnetic equations as four-dimensional equations in spacetime.  First, construct the 2-form using the displacement field D and the magnetic intensity H.

The differential of this two-form creates a lot of terms, such as

This can be simplified by collecting like terms to

Renaming each coefficient so that

yields two of Maxwell’s equations

To find the other two Maxwell equations, start with the 1-form

and try the derivation yourself!

Differentiating yields a differential two-form. Then identify the curl of the vector potential as the B-field, etc., to derive the other two Maxwell equations
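For orientation, here is the end result in standard vector form (SI units; the sign and unit conventions of the original 2-form are not shown, so this is the target of the derivation rather than a line-by-line reconstruction). The D–H two-form yields the two source equations, and the potential one-form yields the two homogeneous equations:

$$\nabla\cdot\mathbf{D} = \rho, \qquad \nabla\times\mathbf{H} - \frac{\partial\mathbf{D}}{\partial t} = \mathbf{J},$$

$$\nabla\cdot\mathbf{B} = 0, \qquad \nabla\times\mathbf{E} + \frac{\partial\mathbf{B}}{\partial t} = 0.$$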

Bibliography

Vargas, J. G., Differential Geometry for Physicists and Mathematicians: Moving Frames and Differential Forms: From Euclid Past Riemann. 2014; p 1-293.

Hermann Grassmann's Nimble Wedge Product


Hyperspace is neither a fiction nor an abstraction. Every interaction we have with our every-day world occurs in high-dimensional spaces of objects and coordinates and momenta. This dynamical hyperspace—also known as phase space—is as real as mathematics, and physics in phase space can be calculated and used to predict complex behavior. Although phase space can extend to thousands of dimensions, our minds are incapable of thinking even in four dimensions—we have no ability to visualize such things. 

Grassmann was convinced that he had discovered a fundamentally new type of mathematics—he actually had.

Part of the trick of doing physics in high dimensions is having the right tools and symbols with which to work.  For high-dimensional math and physics, one such indispensable tool is Hermann Grassmann's wedge product. When I first saw the wedge product, probably in some graduate-level dynamics textbook, it struck me as a little cryptic.  It is sort of like a vector product, but not quite, and it operates on things with an intimidating name—"forms". I kept trying to "understand" forms as if they were types of vectors.  After all, under special circumstances, forms and wedges did produce some vector identities.  It was only after I stepped back and asked myself how they were constructed that I realized that forms and wedge products belong to a simple kind of algebra, called exterior algebra.  Exterior algebra has simple rules and is especially useful: it goes far beyond vectors while harking back to a time before vectors even existed.

Hermann Grassmann: A Backwater Genius

We are so accustomed to working with oriented objects, like vectors that have a tip and tail, that it is hard to think of a time when that wouldn’t have been natural.  Yet in the mid 1800’s, almost no one was thinking of orientations as a part of geometry, and it took real genius to conceive of oriented elements, how to manipulate them, and how to represent them graphically and mathematically.  At a time when some of the greatest mathematicians lived—Weierstrass, Möbius, Cauchy, Gauss, Hamilton—it turned out to be a high school teacher from a backwater in Prussia who developed the theory for the first time.

Hermann Grassmann

            Hermann Grassmann was the son of a high school teacher at the Gymnasium in Stettin, Prussia, (now Szczecin, Poland) and he inherited his father’s position, but at a lower level.  Despite his lack of background and training, he had serious delusions of grandeur, aspiring to teach mathematics at the university in Berlin, even when he was only allowed to teach the younger high school students basic subjects.  Nonetheless, Grassmann embarked on a program to educate himself, attending classes at Berlin in mathematics.  As part of the requirements to be allowed to teach mathematics to the senior high-school students, he had to submit a thesis on an appropriate topic. 

Modern Szczecin.

For years, he had been working on an idea that had originally come from his father about a mathematical theory that could manipulate abstract objects or concepts.  He had taken this vague thought and had slowly developed it into a rigorous mathematical form with symbols and manipulations.  His mind was one of those that could permute endlessly, and he discovered dozens of different ways that objects could be defined and combined, and he wrote them all down in a tome of excessive size and complexity.  When it was time to submit the thesis to the examiners, he had created a broad new system of algebra—at a time when no one recognized what a new algebra even meant, especially not his examiners, who could understand none of it.  Fortunately, Grassmann had been corresponding with the famous German mathematician August Möbius over his ideas, and Möbius was encouraging and supportive, and the examiners accepted his thesis and allowed him to teach the upper classmen at his high school.

The Gymnasium in Stettin

            Encouraged by his success, Grassmann hoped that Möbius would help him climb even higher to teach in Berlin.  Convinced that he had discovered a fundamentally new type of mathematics (he actually had), he decided to publish his thesis as a book under the title Die Lineale Ausdehnungslehre, ein neuer Zweig der Mathematik (The Theory of Linear Extension, a New Branch of Mathematics).  He published it out of his own pocket.  It is some measure of his delusion that he had thousands printed, but almost none sold, and piles of the books were stored away to be used later as scrap paper. Möbius likewise distanced himself from Grassmann and his obsessive theories. Discouraged, Grassmann turned his back on mathematics, though he later achieved fame in the field of linguistics.  (For more on Grassmann’s ideas and struggle for recognition, see Chapter 4 of Galileo Unbound).

Excerpt from Grassmann’s Ausdehnungslehre (Google Books).

The Odd Identity of Nicolas Bourbaki

If you look up the publication history of the famous French mathematician, Nicolas Bourbaki, you will be amazed to see publications that span from 1935 to 2018 — more than 80 years of output!  But if you look in the obituaries, you will see that he died in 1968.  It's pretty impressive to still be publishing 50 years after your death.  J. R. R. Tolkien has been doing that regularly, but few others spring to mind.

Actually, you have been duped!  Nicolas is a fiction, constructed as a hoax by a group of French mathematicians who were simultaneously deadly serious about the need for a rigorous foundation on which to educate the new wave of mathematicians in the mid-20th century.  The group was formed during a mathematics meeting in 1934, organized by André Weil and joined by Henri Cartan (son of Élie Cartan), Claude Chevalley, Jean Coulomb, Jean Delsarte, Jean Dieudonné, Charles Ehresmann, René de Possel, and Szolem Mandelbrojt (uncle of Benoit Mandelbrot).  They picked the last name of a French general, and Weil's wife named him Nicolas.  The group began publishing books under this pseudonym in 1935 and has continued until the present time.  While their publications were entirely serious, the group from time to time had fun with mild hoaxes, such as posting his obituary on one occasion and a wedding announcement of his daughter on another.

The wedge product symbol took several years to mature.  Élie Cartan's book on differential forms published in 1945 used brackets to denote the product instead of the wedge. In Chevalley's book of 1946 he does not use the wedge but a small square, and the book Chevalley wrote in 1951, "Introduction to the Theory of Algebraic Functions of One Variable", still uses a small square.  But in 1954, Chevalley uses the wedge symbol in his book on Spinors.  He refers to his own book of 1951 (which did not use the wedge) and also to the 1943 version of Bourbaki. The few existing copies of the 1943 Algebra by Bourbaki lie in obscure European libraries. The 1973 edition of the book does indeed use the wedge, although I have yet to get my hands on the original 1943 version. Therefore, the wedge symbol seems to have originated with Chevalley sometime between 1951 and 1954 and gained widespread use after that.

Exterior Algebra

Exterior algebra begins with the definition of an operation on elements.  The elements, for example (u, v, w, x, y, z, etc.), are drawn from a vector space in its most abstract form as "tuples", such that x = [x1, x2, x3, …, xn] in an n-dimensional space.  On these elements there is an operation called the "wedge product", the "exterior product", or the "Grassmann product".  It is denoted, for example between two elements x and y, as x^y.  It captures the sense of orientation through anti-commutativity, such that

x^y = -y^x
As simple as this definition is, it sets up virtually all later manipulations of vectors and their combinations.  For instance, we can immediately prove (try it yourself) that the wedge product of a vector element with itself equals zero

x^x = -x^x = 0
Once the elements of the vector space have been defined, it is possible to define “forms” on the vector space.  For instance, a 1-form, also known as a vector, is any function

where a, b, c are scalar coefficients.  The wedge product of two 1-forms

yields a 2-form, also known as a bivector.  This specific example makes a direct connection to the cross product in 3-space as

where the unit vectors are mapped onto the 2-forms

Indeed, many of the vector identities of 3-space can be expressed in terms of exterior products, but these are just special cases, and the wedge product is more general.  For instance, while the triple vector cross product is not associative, the wedge product is associative

which can give it an advantage when performing algebra on r-forms.  Expressing the wedge product in terms of vector components

yields the immediate generalization to any number of dimensions (using the Einstein summation convention)

In this way, the wedge product expresses relationships in any number of dimensions.
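For example, with u = u^1 e_1 + u^2 e_2 + u^3 e_3 and v = v^1 e_1 + v^2 e_2 + v^3 e_3, expanding and using anti-commutativity gives

u\wedge v = (u^2 v^3 - u^3 v^2)\, e_2\wedge e_3 + (u^3 v^1 - u^1 v^3)\, e_3\wedge e_1 + (u^1 v^2 - u^2 v^1)\, e_1\wedge e_2

whose three coefficients are exactly the components of the cross product u × v under the mapping e_2∧e_3 ↔ e_1, e_3∧e_1 ↔ e_2, e_1∧e_2 ↔ e_3.  In index form this is u∧v = u^i v^j e_i∧e_j (summed over repeated indices), with independent antisymmetric components u^i v^j - u^j v^i, and the expression holds in any number of dimensions.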

            A 3-form is constructed as the wedge product of 3 vectors

where the Levi-Civita permutation symbol has been introduced such that

Note that in 3-space there can be no 4-form, because one of the basis elements would be repeated, rendering the product zero.  Therefore, the most general multilinear form for 3-space is

with 2³ = 8 elements: one scalar, three 1-forms, three 2-forms and one 3-form.  In 4-space there are 2⁴ = 16 elements: one scalar, four 1-forms, six 2-forms, four 3-forms and one 4-form.  So, the number of elements rises exponentially with the dimension of the space.
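A quick numerical check of these counts (a throw-away script): the number of independent r-forms in n dimensions is the binomial coefficient "n choose r", and the counts sum to the total number of basis elements.

from math import comb

for n in (3, 4):
    counts = [comb(n, r) for r in range(n + 1)]   # independent r-forms in n dimensions
    print(n, counts, sum(counts))                 # 3 [1, 3, 3, 1] 8   and   4 [1, 4, 6, 4, 1] 16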

At this point, we have developed a rich multilinear structure, all based on the simple anti-commutativity of elements x^y = -y^x.  This structure is known as an exterior (or Grassmann) algebra, and it is a special case of a Clifford algebra, named after William Kingdon Clifford (1845-1879), second wrangler at Cambridge and close friend of Arthur Cayley.  But the wedge product is not just algebra—there is also a straightforward geometric interpretation of wedge products that makes them useful when extending theories of surfaces and volumes into higher dimensions.

Geometric Interpretation

In Euclidean space, a cross product is related to areas and volumes of parallelepipeds. Wedge products are more general than cross products and they generalize the idea of areas and volumes to higher dimensions. As an illustration, an area 2-form is shown in Fig. 1 and a volume 3-form in Fig. 2.

Fig. 1 Area 2-form showing how the area of a parallelogram is related to the wedge product. The 2-form is an oriented area perpendicular to the unit vector.
Fig. 2 A volume 3-form in Euclidean space. The volume of the parallelepiped is equal to the magnitude of the wedge product of the three vectors u, v, and w.

The wedge product is not limited to 3 dimensions nor to Euclidean spaces. This is the power and the beauty of Grassmann’s invention. It also generalizes naturally to differential geometry of manifolds producing what are called differential forms. When integrating in higher dimensions or on non-Euclidean manifolds, the most appropriate approach is to use wedge products and differential forms, which will be the topic of my next blog on the generalized Stokes’ theorem.
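A quick numerical illustration of this geometric picture (a toy sketch using NumPy): the single coefficient of the top-degree wedge product of n vectors in n dimensions is the determinant of the matrix of their components, i.e. the signed area of the parallelogram in 2D or the signed volume of the parallelepiped in 3D.

import numpy as np

u = np.array([2.0, 1.0])
v = np.array([0.5, 3.0])
area = np.linalg.det(np.column_stack((u, v)))        # coefficient of u^v: signed area = 5.5

a = np.array([1.0, 0.0, 0.0])
b = np.array([1.0, 2.0, 0.0])
c = np.array([0.0, 1.0, 3.0])
volume = np.linalg.det(np.column_stack((a, b, c)))   # coefficient of a^b^c: signed volume = 6.0

print(area, volume)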

Further Reading

1.         Dieudonné, J., The Tragedy of Grassmann. Séminaire de Philosophie et Mathématiques 1979, fascicule 2, 1-14.

2.         Fearnley-Sander, D., Hermann Grassmann and the Creation of Linear Algebra. American Mathematical Monthly 1979, 86 (10), 809-817.

3.         Nolte, D. D., Galileo Unbound: A Path Across Life, the Universe and Everything. Oxford University Press: 2018.

4.         Vargas, J. G., Differential Geometry for Physicists and Mathematicians: Moving Frames and Differential Forms: From Euclid Past Riemann. 2014; p 1-293.

Introduction to Modern Dynamics: Chaos, Networks, Space and Time

The second edition of Introduction to Modern Dynamics: Chaos, Networks, Space and Time publishes this week (November 18, 2019), available from Oxford University Press and Amazon.

Most physics majors will use modern dynamics in their careers: nonlinearity, chaos, network theory, econophysics, game theory, neural nets, geodesic geometry, among many others.

The first edition of Introduction to Modern Dynamics (IMD) was an upper-division junior-level mechanics textbook at the level of Thornton and Marion (Classical Dynamics of Particles and Systems) and Taylor (Classical Mechanics).  IMD helped lead an emerging trend in physics education to update the undergraduate physics curriculum.  Conventional junior-level mechanics courses emphasized Lagrangian and Hamiltonian physics, but notably missing from the classic subjects are modern dynamics topics that most physics majors will use in their careers: nonlinearity, chaos, network theory, econophysics, game theory, neural nets, geodesic geometry, among many others.  These are the topics at the forefront of physics that drive high-tech businesses and start-ups, which is where more than half of all physicists work. IMD introduced these modern topics to junior-level physics majors in an accessible form that allowed them to master the fundamentals to prepare them for the modern world.

The second edition (IMD2) continues that trend by expanding the chapters to include additional material and topics.  It rearranges several of the introductory chapters for improved logical flow and expands them to include key conventional topics that were missing in the first edition (e.g., Lagrange undetermined multipliers and expanded examples of Lagrangian applications).  It is also an opportunity to correct several typographical errors and other errata that students have identified over the past several years.  The second edition also has expanded homework problems.

The goal of IMD2 is to strengthen the sections on conventional topics (that students need to master to take their GREs) to make IMD2 attractive as a mainstream physics textbook for broader adoption at the junior level, while continuing the program of updating the topics and approaches that are relevant for the roles that physicists play in the 21st century.

(New Chapters and Sections highlighted in red.)

New Features in Second Edition:

Second Edition Chapters and Sections

Part 1 Geometric Mechanics

• Expanded development of Lagrangian dynamics

• Lagrange multipliers

• More examples of applications

• Connection to statistical mechanics through the virial theorem

• Greater emphasis on action-angle variables

• The key role of adiabatic invariants

Part 1 Geometric Mechanics

Chapter 1 Physics and Geometry

1.1 State space and dynamical flows

1.2 Coordinate representations

1.3 Coordinate transformation

1.4 Uniformly rotating frames

1.5 Rigid-body motion

Chapter 2 Lagrangian Mechanics

2.1 Calculus of variations

2.2 Lagrangian applications

2.3 Lagrange’s undetermined multipliers

2.4 Conservation laws

2.5 Central force motion

2.6 Virial Theorem

Chapter 3 Hamiltonian Dynamics and Phase Space

3.1 The Hamiltonian function

3.2 Phase space

3.3 Integrable systems and action–angle variables

3.4 Adiabatic invariants

Part 2 Nonlinear Dynamics

• New section on non-autonomous dynamics

• Entire new chapter devoted to Hamiltonian mechanics

• Added importance to Chirikov standard map

• The important KAM theory of “constrained chaos” and solar system stability

• Degeneracy in Hamiltonian chaos

• A short overview of quantum chaos

• Rational resonances and the relation to KAM theory

• Synchronized chaos

Part 2 Nonlinear Dynamics

Chapter 4 Nonlinear Dynamics and Chaos

4.1 One-variable dynamical systems

4.2 Two-variable dynamical systems

4.3 Limit cycles

4.4 Discrete iterative maps

4.5 Three-dimensional state space and chaos

4.6 Non-autonomous (driven) flows

4.7 Fractals and strange attractors

Chapter 5 Hamiltonian Chaos

5.1 Perturbed Hamiltonian systems

5.2 Nonintegrable Hamiltonian systems

5.3 The Chirikov Standard Map

5.4 KAM Theory

5.5 Degeneracy and the web map

5.6 Quantum chaos

Chapter 6 Coupled Oscillators and Synchronization

6.1 Coupled linear oscillators

6.2 Simple models of synchronization

6.3 Rational resonances

6.4 External synchronization

6.5 Synchronization of Chaos

Part 3 Complex Systems

• New emphasis on diffusion on networks

• Epidemic growth on networks

• A new section of game theory in the context of evolutionary dynamics

• A new section on general equilibrium theory in economics

Part 3 Complex Systems

Chapter 7 Network Dynamics

7.1 Network structures

7.2 Random network topologies

7.3 Synchronization on networks

7.4 Diffusion on networks

7.5 Epidemics on networks

Chapter 8 Evolutionary Dynamics

8.1 Population dynamics

8.2 Virus infection and immune deficiency

8.3 Replicator Dynamics

8.4 Quasi-species

8.5 Game theory and evolutionary stable solutions

Chapter 9 Neurodynamics and Neural Networks

9.1 Neuron structure and function

9.2 Neuron dynamics

9.3 Network nodes: artificial neurons

9.4 Neural network architectures

9.5 Hopfield neural network

9.6 Content-addressable (associative) memory

Chapter 10 Economic Dynamics

10.1 Microeconomics and equilibrium

10.2 Macroeconomics

10.3 Business cycles

10.4 Random walks and stock prices (optional)

Part 4 Relativity and Space–Time

• Relativistic trajectories

• Gravitational waves

Part 4 Relativity and Space–Time

Chapter 11 Metric Spaces and Geodesic Motion

11.1 Manifolds and metric tensors

11.2 Derivative of a tensor

11.3 Geodesic curves in configuration space

11.4 Geodesic motion

Chapter 12 Relativistic Dynamics

12.1 The special theory

12.2 Lorentz transformations

12.3 Metric structure of Minkowski space

12.4 Relativistic trajectories

12.5 Relativistic dynamics

12.6 Linearly accelerating frames (relativistic)

Chapter 13 The General Theory of Relativity and Gravitation

13.1 Riemann curvature tensor

13.2 The Newtonian correspondence

13.3 Einstein’s field equations

13.4 Schwarzschild space–time

13.5 Kinematic consequences of gravity

13.6 The deflection of light by gravity

13.7 The precession of Mercury’s perihelion

13.8 Orbits near a black hole

13.9 Gravitational waves

Synopsis of 2nd Ed. Chapters

Chapter 1. Physics and Geometry (Sample Chapter)

This chapter has been rearranged relative to the 1st edition to provide a more logical flow of the overarching concepts of geometric mechanics that guide the subsequent chapters.  The central role of coordinate transformations is strengthened, as is the material on rigid-body motion with expanded examples.

Chapter 2. Lagrangian Mechanics (Sample Chapter)

Much of the structure and material is retained from the 1st edition while adding two important sections.  The section on applications of Lagrangian mechanics adds many direct examples of the use of Lagrange’s equations of motion.  An additional new section covers the important topic of Lagrange’s undetermined multipliers

Chapter 3. Hamiltonian Dynamics and Phase Space (Sample Chapter)

The importance of Hamiltonian systems and dynamics merits a stand-alone chapter.  The topics from the 1st edition are expanded in this new chapter, including a new section on adiabatic invariants that plays an important role in the development of quantum theory.  Some topics are de-emphasized from the 1st edition, such as general canonical transformations and the symplectic structure of phase space, although the specific transformation to action-angle coordinates is retained and amplified.

Chapter 4. Nonlinear Dynamics and Chaos

The first part of this chapter is retained from the 1st edition with numerous minor corrections and updates of figures.  The second part of the IMD 1st edition, treating Hamiltonian chaos, will be expanded into the new Chapter 5.

Chapter 5. Hamiltonian Chaos

This new stand-alone chapter expands on the last half of Chapter 3 of the IMD 1st edition.  The physical character of Hamiltonian chaos is so substantially distinct from dissipative chaos that it deserves its own chapter.  It is also a central topic of interest for complex systems that are either conservative or that have integral invariants, such as our N-body solar system that played such an important role in the history of chaos theory beginning with Poincaré.  The new chapter highlights Poincaré's homoclinic tangle, illustrated by the Chirikov Standard Map.  The Standard Map is an excellent introduction to KAM theory, which is one of the crowning achievements of the theory of dynamical systems by Kolmogorov, Arnold and Moser, connecting to deeper aspects of synchronization and rational resonances that drive the structure of systems as diverse as the rotation of the Moon and the rings of Saturn.  This is also a perfect lead-in to the next chapter on synchronization.  An optional section at the end of this chapter briefly discusses quantum chaos to show how Hamiltonian chaos can be extended into the quantum regime.

Chapter 6. Synchronization

This is an updated version of the IMD 1st ed. chapter.  It has a reduced initial section on coupled linear oscillators, retaining the key ideas about linear eigenmodes but removing some irrelevant details in the 1st edition.  A new section is added that defines and emphasizes the importance of quasi-periodicity.  A new section on the synchronization of chaotic oscillators is added.

Chapter 7. Network Dynamics

This chapter rearranges the structure of the chapter from the 1st edition, moving synchronization on networks earlier to connect from the previous chapter.  The section on diffusion and epidemics is moved to the back of the chapter and expanded in the 2nd edition into two separate sections on these topics, adding new material on discrete matrix approaches to continuous dynamics.

Chapter 8. Neurodynamics and Neural Networks

This chapter is retained from the 1st edition with numerous minor corrections and updates of figures.

Chapter 9. Evolutionary Dynamics

Two new sections are added to this chapter.  A section on game theory and evolutionary stable solutions introduces core concepts of evolutionary dynamics that merge well with the other topics of the chapter such as the pay-off matrix and replicator dynamics.  A new section on nearly neutral networks introduces new types of behavior that occur in high-dimensional spaces which are counterintuitive but important for understanding evolutionary drift.

Chapter 10.  Economic Dynamics

This chapter will be significantly updated relative to the 1st edition.  Most of the sections will be rewritten with improved examples and figures.  Three new sections will be added.  The 1st edition section on consumer market competition will be split into two new sections describing the Cournot duopoly and Pareto optimality in one section, and Walras’ Law and general equilibrium theory in another section.  The concept of the Pareto frontier in economics is becoming an important part of biophysical approaches to population dynamics.  In addition, new trends in economics are drawing from general equilibrium theory, first introduced by Walras in the nineteenth century, but now merging with modern ideas of fixed points and stable and unstable manifolds.  A third new section is added on econophysics, highlighting the distinctions that contrast economic dynamics (phase space dynamical approaches to economics) from the emerging field of econophysics (statistical mechanics approaches to economics).

Chapter 11. Metric Spaces and Geodesic Motion

 This chapter is retained from the 1st edition with several minor corrections and updates of figures.

Chapter 12. Relativistic Dynamics

This chapter is retained from the 1st edition with minor corrections and updates of figures.  More examples will be added, such as invariant mass reconstruction.  The connection between relativistic acceleration and Einstein’s equivalence principle will be strengthened.

Chapter 13. The General Theory of Relativity and Gravitation

This chapter is retained from the 1st edition with minor corrections and updates of figures.  A new section will derive the properties of gravitational waves, given the spectacular success of LIGO and the new field of gravitational astronomy.

Homework Problems:

All chapters will have expanded and updated homework problems.  Many of the homework problems from the 1st edition will remain, but the number of problems at the end of each chapter will be nearly doubled, while removing some of the less interesting or problematic problems.

Bibliography

D. D. Nolte, Introduction to Modern Dynamics: Chaos, Networks, Space and Time, 2nd Ed. (Oxford University Press, 2019)

The Physics of Life, the Universe and Everything (In One Easy Equation)

Everyone knows that the answer to life, the universe and everything is "42".  But if it's the question that you want, then you can either grab a towel and a copy of The Hitchhiker's Guide to the Galaxy, or you can go into physics and begin the search for yourself.

What you may find is that the question boils down to an extremely simple formula

dx/dt = f(x)
This innocuous-looking equation carries such riddles, such surprises, such unintuitive behavior that it can become the object of study for life.  This equation is called a vector flow equation, and it can be used to capture the essential physics of economies, neurons, ecosystems, networks, and even orbits of photons around black holes.  This equation is to modern dynamics what F = ma was to classical mechanics.  It is the starting point for understanding complex systems.

The Phase Space of Everything

The apparent simplicity of the "flow equation" masks the complexity it contains.  It is a vector equation because each "dimension" is a variable of a complex system.  Many systems of interest may have only a few variables, but ecosystems and economies and social networks may have hundreds or thousands of variables.  Expressed in component format, the flow equation is

dx^a/dt = f^a(x^1, x^2, …, x^n)
where the superscript spans the number of variables.  But even this masks all that can happen with such an equation. Each of the functions f^a can be entirely different from each other, and can be any type of function, whether polynomial, rational, algebraic, transcendental or composite, although they must be single-valued.  They are generally nonlinear, and the limitless ways in which functions can be nonlinear are the source of the richness of the flow equation.

The vector flow equation is an ordinary differential equation (ODE) that can be solved for specific trajectories as initial value problems.  A single set of initial conditions defines a unique trajectory.  For instance, the trajectory for a 4-dimensional example is described as the column vector

x(t) = [x^1(t), x^2(t), x^3(t), x^4(t)]^T
which is the single-parameter position vector to a point in phase space, also called state space.  The point sweeps through successive configurations as a function of its single parameter—time.  This trajectory is also called an orbit.  In classical mechanics, the focus has tended to be on the behavior of specific orbits that arise from a specific set of initial conditions.  This is the classic “rock thrown from a cliff” problem of introductory physics courses.  However, in modern dynamics, the focus shifts away from individual trajectories to encompass the set of all possible trajectories.
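As a toy illustration (my own minimal sketch, using the same scipy odeint solver as the full programs below), here is a single initial condition turned into a single orbit for a two-variable flow:

import numpy as np
from scipy import integrate
from matplotlib import pyplot as plt

# A two-variable flow: xdot = v, vdot = -x - 0.1*v (a weakly damped oscillator)
def flow_deriv(x_v, t):
    x, v = x_v
    return [v, -x - 0.1*v]

t = np.linspace(0, 50, 2000)
x_t = integrate.odeint(flow_deriv, [1.0, 0.0], t)   # one initial condition -> one trajectory

plt.plot(x_t[:, 0], x_t[:, 1])                      # the orbit traced out in phase space
plt.xlabel('x')
plt.ylabel('v')
plt.show()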

Why is Modern Dynamics part of Physics?

If finding the solutions to the "x-dot equals f" vector flow equation is all there is to do, then this would just be a math problem—the solution of ODE's.  There are plenty of gems for mathematicians to look for, and there is an entire field of study in mathematics called "dynamical systems", but this would not be "physics".  Physics as a profession is separate and distinct from mathematics, although the two are sometimes confused.  Physics uses mathematics as its language and as its toolbox, but physics is not mathematics.  Physics is done best when it is done qualitatively—this means with scribbles done on napkins in restaurants or on the back of envelopes while waiting in line. Physics is about recognizing relationships and patterns. Physics is about identifying the limits to scaling properties where the physics changes when scales change. Physics is about the mapping of the simplest possible mathematics onto behavior in the physical world, and recognizing when the simplest possible mathematics is a universal that applies broadly to diverse systems that seem different, but that share the same underlying principles.

So, granted solving ODE’s is not physics, there is still a tremendous amount of good physics that can be done by solving ODE’s. ODE solvers become the modern physicist’s experimental workbench, providing data output from numerical experiments that can test the dependence on parameters in ways that real-world experiments might not be able to access. Physical intuition can be built based on such simulations as the engaged physicist begins to “understand” how the system behaves, able to explain what will happen as the values of parameters are changed.

In the following sections, three examples of modern dynamics are introduced with a preliminary study, including Python code. These examples are galactic dynamics, synchronized networks, and ecosystems. Despite their very different natures, their descriptions in terms of dynamical flows share common features and illustrate the beauty and depth of behavior that can be explored with simple equations.

Galactic Dynamics

One example of the power and beauty of the vector flow equation and its set of all solutions in phase space is called the Henon-Heiles model of the motion of a star within a galaxy.  Of course, this is a terribly complicated problem that involves tens of billions of stars, but if you average over the gravitational potential of all the other stars, and throw in a couple of conservation laws, the resulting potential can look surprisingly simple.  The motion in the plane of this galactic potential takes two configuration coordinates (x, y) with two associated momenta (px, py) for a total of four dimensions.  The flow equations in four-dimensional phase space are simply

Fig. 1 The 4-dimensional phase space flow equations of a star in a galaxy. The terms in light blue are a simple two-dimensional harmonic oscillator. The terms in magenta are the nonlinear contributions from the stars in the galaxy.

where the terms in the light blue box describe a two-dimensional simple harmonic oscillator (SHO), which is a linear oscillator, modified by the terms in the magenta box that represent the nonlinear galactic potential.  The orbits of this Hamiltonian system are chaotic, and because there is no dissipation in the model, a single orbit will continue forever within certain ranges of phase space governed by energy conservation, but never quite repeating.

Fig. 2 Two-dimensional Poincaré section of sets of trajectories in four-dimensional phase space for the Henon-Heiles galactic dynamics model. The perturbation parameter is ε = 0.3411 and the energy is E = 1.

Hamilton4D.py

#!/usr/bin/env python3
# -*- coding: utf-8 -*-
"""
Hamilton4D.py
Created on Wed Apr 18 06:03:32 2018

@author: nolte

Derived from:
D. D. Nolte, Introduction to Modern Dynamics: Chaos, Networks, Space and Time, 2nd ed. (Oxford,2019)
"""

import numpy as np
import matplotlib as mpl
from mpl_toolkits.mplot3d import Axes3D
from scipy import integrate
from matplotlib import pyplot as plt
from matplotlib import cm
import time
import os

plt.close('all')

# model_case 1 = Heiles
# model_case 2 = Crescent
print(' ')
print('Hamilton4D.py')
print('Case: 1 = Heiles')
print('Case: 2 = Crescent')
model_case = int(input('Enter the Model Case (1-2)'))

if model_case == 1:
    E = 1       # Heiles: 1, 0.3411   Crescent: 0.05, 1
    epsE = 0.3411   # 3411
    def flow_deriv(x_y_z_w,tspan):
        x, y, z, w = x_y_z_w
        a = z
        b = w
        c = -x - epsE*(2*x*y)
        d = -y - epsE*(x**2 - y**2)
        return[a,b,c,d]
else:
    E = .1       #   Crescent: 0.1, 1
    epsE = 1   
    def flow_deriv(x_y_z_w,tspan):
        x, y, z, w = x_y_z_w
        a = z
        b = w
        c = -(epsE*(y-2*x**2)*(-4*x) + x)
        d = -(y-epsE*2*x**2)
        return[a,b,c,d]
    
prms = np.sqrt(E)
pmax = np.sqrt(2*E)    
            
# Potential Function
if model_case == 1:
    V = np.zeros(shape=(100,100))
    for xloop in range(100):
        x = -2 + 4*xloop/100
        for yloop in range(100):
            y = -2 + 4*yloop/100
            V[yloop,xloop] = 0.5*x**2 + 0.5*y**2 + epsE*(x**2*y - 0.33333*y**3) 
else:
    V = np.zeros(shape=(100,100))
    for xloop in range(100):
        x = -2 + 4*xloop/100
        for yloop in range(100):
            y = -2 + 4*yloop/100
            V[yloop,xloop] = 0.5*x**2 + 0.5*y**2 + epsE*(2*x**4 - 2*x**2*y) 

fig = plt.figure(1)
contr = plt.contourf(V,100, cmap=cm.coolwarm, vmin = 0, vmax = 10)
fig.colorbar(contr, shrink=0.5, aspect=5)    
fig = plt.show()

repnum = 250
mulnum = 64/repnum

np.random.seed(1)
for reploop  in range(repnum):
    px1 = 2*(np.random.random()-0.499)*pmax
    py1 = np.sign(np.random.random()-0.499)*np.real(np.sqrt(2*(E-px1**2/2)))
    xp1 = 0
    yp1 = 0
    
    x_y_z_w0 = [xp1, yp1, px1, py1]
    
    tspan = np.linspace(1,1000,10000)
    x_t = integrate.odeint(flow_deriv, x_y_z_w0, tspan)
    siztmp = np.shape(x_t)
    siz = siztmp[0]

    if reploop % 50 == 0:
        plt.figure(2)
        lines = plt.plot(x_t[:,0],x_t[:,1])
        plt.setp(lines, linewidth=0.5)
        plt.show()
        time.sleep(0.1)
        #os.system("pause")

    y1 = x_t[:,0]
    y2 = x_t[:,1]
    y3 = x_t[:,2]
    y4 = x_t[:,3]
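    # Build the Poincaré first-return section: record (y, py) each time the
    # trajectory crosses the x = 0 plane in the positive direction, using
    # linear interpolation to locate the crossing.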
    
    py = np.zeros(shape=(2*repnum,))
    yvar = np.zeros(shape=(2*repnum,))
    cnt = -1
    last = y1[1]
    for loop in range(2,siz):
        if (last < 0)and(y1[loop] > 0):
            cnt = cnt+1
            del1 = -y1[loop-1]/(y1[loop] - y1[loop-1])
            py[cnt] = y4[loop-1] + del1*(y4[loop]-y4[loop-1])
            yvar[cnt] = y2[loop-1] + del1*(y2[loop]-y2[loop-1])
            last = y1[loop]
        else:
            last = y1[loop]
 
    plt.figure(3)
    lines = plt.plot(yvar,py,'o',ms=1)
    plt.show()
    
if model_case == 1:
    plt.savefig('Heiles')
else:
    plt.savefig('Crescent')
    

Networks, Synchronization and Emergence

A central paradigm of nonlinear science is the emergence of patterns and organized behavior from seemingly random interactions among underlying constituents.  Emergent phenomena are among the most awe inspiring topics in science.  Crystals are emergent, forming slowly from solutions of reagents.  Life is emergent, arising out of the chaotic soup of organic molecules on Earth (or on some distant planet).  Intelligence is emergent, and so is consciousness, arising from the interactions among billions of neurons.  Ecosystems are emergent, based on competition and symbiosis among species.  Economies are emergent, based on the transfer of goods and money spanning scales from the local bodega to the global economy.

One of the common underlying properties of emergence is the existence of networks of interactions.  Networks and network science are topics of great current interest driven by the rise of the World Wide Web and social networks.  But networks are ubiquitous and have long been the topic of research into complex and nonlinear systems.  Networks provide a scaffold for understanding many of the emergent systems.  It allows one to think of isolated elements, like molecules or neurons, that interact with many others, like the neighbors in a crystal or distant synaptic connections.

From the point of view of modern dynamics, the state of a node can be a variable or a “dimension” and the interactions among links define the functions of the vector flow equation.  Emergence is then something that “emerges” from the dynamical flow as many elements interact through complex networks to produce simple or emergent patterns.

Synchronization is a form of emergence that happens when lots of independent oscillators, each vibrating at their own personal frequency, are coupled together to push and pull on each other, entraining all the individual frequencies into one common global oscillation of the entire system.  Synchronization plays an important role in the solar system, explaining why the Moon always shows one face to the Earth, why Saturn’s rings have gaps, and why asteroids are mainly kept away from colliding with the Earth.  Synchronization plays an even more important function in biology where it coordinates the beating of the heart and the functioning of the brain.

One of the most dramatic examples of synchronization is the Kuramoto synchronization phase transition. This occurs when a large set of individual oscillators with differing natural frequencies interact with each other through a weak nonlinear coupling.  For small coupling, all the individual nodes oscillate at their own frequency.  But as the coupling increases, there is a sudden coalescence of all the frequencies into a single common frequency.  This mechanical phase transition, called the Kuramoto transition, has many of the properties of a thermodynamic phase transition, including a solution that utilizes mean field theory.

Fig. 3 The Kuramoto model for the nonlinear coupling of N simple phase oscillators. The term in light blue is the simple phase oscillator. The term in magenta is the global nonlinear coupling that connects each oscillator to every other.
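In its standard globally coupled form the Kuramoto model reads

\dot{\theta}_i = \omega_i + \frac{g}{N}\sum_{j=1}^{N}\sin(\theta_j - \theta_i), \qquad i = 1, \dots, N

where the ω_i are the natural frequencies of the individual oscillators and g is the coupling strength (in the program below the coupling is distributed over the N-1 links of a complete graph, which amounts to the same thing).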

The simulation of 20 Poincaré phase oscillators with global coupling is shown in Fig. 4 as a function of increasing coupling coefficient g. The original individual frequencies are spread randomly. The oscillators with similar frequencies are the first to synchronize, forming small clumps that then synchronize with other clumps of oscillators, until all oscillators are entrained to a single compromise frequency. The Kuramoto phase transition is not sharp in this case because the value of N = 20 is too small. If the simulation is run for 200 oscillators, there is a sudden transition from unsynchronized to synchronized oscillation at a threshold value of g.

Fig. 4 The Kuramoto model for 20 Poincare oscillators showing the frequencies as a function of the coupling coefficient.

The Kuramoto phase transition is one of the most important fundamental examples of modern dynamics because it illustrates many facets of nonlinear dynamics in a very simple way. It highlights the importance of nonlinearity, the simplification of phase oscillators, the use of mean field theory, the underlying structure of the network, and the example of a mechanical analog to a thermodynamic phase transition. It also has analytical solutions because of its simplicity, while still capturing the intrinsic complexity of nonlinear systems.

Kuramoto.py

#!/usr/bin/env python3
# -*- coding: utf-8 -*-
"""
Created on Sat May 11 08:56:41 2019

@author: nolte

Derived from:
D. D. Nolte, Introduction to Modern Dynamics: Chaos, Networks, Space and Time, 2nd ed. (Oxford,2019)
"""

# https://www.python-course.eu/networkx.php
# https://networkx.github.io/documentation/stable/tutorial.html
# https://networkx.github.io/documentation/stable/reference/functions.html

import numpy as np
from scipy import integrate
from matplotlib import pyplot as plt
import networkx as nx
from UserFunction import linfit
import time

tstart = time.time()

plt.close('all')

Nfac = 20   # 25
N = 20      # 50
width = 0.2

# function: omegout, yout = coupleN(G)
def coupleN(G):

    # function: yd = flow_deriv(x_y)
    def flow_deriv(y,t0):
                
        yp = np.zeros(shape=(N,))
        for omloop  in range(N):
            temp = omega[omloop]
            linksz = G.nodes[omloop]['numlink']
            for cloop in range(linksz):
                cindex = G.nodes[omloop]['link'][cloop]
                g = G.nodes[omloop]['coupling'][cloop]

                temp = temp + g*np.sin(y[cindex]-y[omloop])
            
            yp[omloop] = temp
        
        yd = np.zeros(shape=(N,))
        for omloop in range(N):
            yd[omloop] = yp[omloop]
        
        return yd
    # end of function flow_deriv(x_y)

    mnomega = 1.0
    
    for nodeloop in range(N):
        omega[nodeloop] = G.nodes[nodeloop]['element']
    
    x_y_z = omega    
    
    # Settle-down Solve for the trajectories
    tsettle = 100
    t = np.linspace(0, tsettle, tsettle)
    x_t = integrate.odeint(flow_deriv, x_y_z, t)
    x0 = x_t[tsettle-1,0:N]
    
    t = np.linspace(1,1000,1000)
    y = integrate.odeint(flow_deriv, x0, t)
    siztmp = np.shape(y)
    sy = siztmp[0]
        
    # Fit the frequency
    m = np.zeros(shape = (N,))
    w = np.zeros(shape = (N,))
    mtmp = np.zeros(shape=(4,))
    btmp = np.zeros(shape=(4,))
    for omloop in range(N):
        
        if np.remainder(sy,4) == 0:
            mtmp[0],btmp[0] = linfit(t[0:sy//2],y[0:sy//2,omloop]);
            mtmp[1],btmp[1] = linfit(t[sy//2+1:sy],y[sy//2+1:sy,omloop]);
            mtmp[2],btmp[2] = linfit(t[sy//4+1:3*sy//4],y[sy//4+1:3*sy//4,omloop]);
            mtmp[3],btmp[3] = linfit(t,y[:,omloop]);
        else:
            sytmp = int(4*np.floor(sy/4))   # truncate to a multiple of 4 (integer index)
            mtmp[0],btmp[0] = linfit(t[0:sytmp//2],y[0:sytmp//2,omloop]);
            mtmp[1],btmp[1] = linfit(t[sytmp//2+1:sytmp],y[sytmp//2+1:sytmp,omloop]);
            mtmp[2],btmp[2] = linfit(t[sytmp//4+1:3*sytmp//4],y[sytmp//4+1:3*sytmp//4,omloop]);
            mtmp[3],btmp[3] = linfit(t[0:sytmp],y[0:sytmp,omloop]);

        
        #m[omloop] = np.median(mtmp)
        m[omloop] = np.mean(mtmp)
        
        w[omloop] = mnomega + m[omloop]
     
    omegout = m
    yout = y
    
    return omegout, yout
    # end of function: omegout, yout = coupleN(G)



Nlink = N*(N-1)//2      
omega = np.zeros(shape=(N,))
omegatemp = width*(np.random.rand(N)-1)
meanomega = np.mean(omegatemp)
omega = omegatemp - meanomega
sto = np.std(omega)

nodecouple = nx.complete_graph(N)

lnk = np.zeros(shape = (N,), dtype=int)
for loop in range(N):
    nodecouple.nodes[loop]['element'] = omega[loop]
    nodecouple.nodes[loop]['link'] = list(nx.neighbors(nodecouple,loop))
    nodecouple.nodes[loop]['numlink'] = np.size(list(nx.neighbors(nodecouple,loop)))
    lnk[loop] = np.size(list(nx.neighbors(nodecouple,loop)))

avgdegree = np.mean(lnk)
mnomega = 1

facval = np.zeros(shape = (Nfac,))
yy = np.zeros(shape=(Nfac,N))
xx = np.zeros(shape=(Nfac,))
for facloop in range(Nfac):
    print(facloop)
    facoef = 0.2

    fac = facoef*(16*facloop/(Nfac))*(1/(N-1))*sto/mnomega
    for nodeloop in range(N):
        nodecouple.nodes[nodeloop]['coupling'] = np.zeros(shape=(lnk[nodeloop],))
        for linkloop in range(lnk[nodeloop]):
            nodecouple.nodes[nodeloop]['coupling'][linkloop] = fac

    facval[facloop] = fac*avgdegree
    
    omegout, yout = coupleN(nodecouple)                           # Here is the subfunction call for the flow

    for omloop in range(N):
        yy[facloop,omloop] = omegout[omloop]

    xx[facloop] = facval[facloop]

plt.figure(1)
lines = plt.plot(xx,yy)
plt.setp(lines, linewidth=0.5)
plt.show()

elapsed_time = time.time() - tstart
print('elapsed time = ',format(elapsed_time,'.2f'),'secs')

The Web of Life

Ecosystems are among the most complex systems on Earth.  The complex interactions among hundreds or thousands of species may lead to steady homeostasis in some cases, to growth and collapse in other cases, and to oscillations or chaos in yet others.  But the definition of species can be broad and abstract, referring to businesses and markets in economic ecosystems, or to cliques and acquaintances in social ecosystems, among many other examples.  These systems are governed by the laws of evolutionary dynamics that include fitness and survival as well as adaptation.

The dimensionality of the dynamical spaces for these systems extends to hundreds or thousands of dimensions—far too complex to visualize when thinking in four dimensions is already challenging.  Yet there are shared principles and common behaviors that emerge even here.  Many of these can be illustrated in a simple three-dimensional system that is represented by a triangular simplex that can be easily visualized, and then generalized back to ultra-high dimensions once they are understood.

A simplex is a closed (N-1)-dimensional geometric figure that describes a zero-sum game (game theory is an integral part of evolutionary dynamics) among N competing species.  For instance, a two-simplex is a triangle that captures the dynamics among three species.  Each vertex of the triangle represents the situation when the entire ecosystem is composed of a single species.  Anywhere inside the triangle represents the situation when all three species are present and interacting.

A classic model of interacting species is the replicator equation. It allows for a fitness-based proliferation and for trade-offs among the individual species. The replicator dynamics equations are shown in Fig. 5.

Fig. 5 Replicator dynamics has a surprisingly simple form, but with surprisingly complicated behavior. The key elements are the fitness and the payoff matrix. The fitness relates to how likely the species will survive. The payoff matrix describes how one species gains at the loss of another (although symbiotic relationships also occur).
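Written out, the replicator equations integrated by Trirep.py below take the standard form

\dot{x}_i = x_i\left(f_i - \phi\right), \qquad f_i = \sum_j A_{ij}\,x_j, \qquad \phi = \sum_j x_j f_j

where x_i is the population fraction of species i, A is the payoff matrix, f_i is the fitness of species i, and φ is the average fitness. Subtracting φ keeps the populations normalized on the simplex (Σ_i x_i = 1); the program adds a small constant offset phi0 to φ to damp the oscillations.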

The population dynamics on the 2D simplex are shown in Fig. 6 for several different pay-off matrices. The matrix values are shown in color and help interpret the trajectories. For instance, the simplex on the upper-right shows a fixed-point center. This reflects the antisymmetric character of the pay-off matrix around the diagonal. The stable spiral on the lower-left has a nearly antisymmetric pay-off matrix, but with unequal off-diagonal magnitudes. The other two cases show central saddle points with stable fixed points on the boundary. A very large variety of behaviors is possible for this very simple system. The Python program is shown in Trirep.py.

Fig. 6 Payoff matrix and population simplex for four random cases: Upper left is an unstable saddle. Upper right is a center. Lower left is a stable spiral. Lower right is a marginal case.

Trirep.py

#!/usr/bin/env python3
# -*- coding: utf-8 -*-
"""
trirep.py
Created on Thu May  9 16:23:30 2019

@author: nolte

Derived from:
D. D. Nolte, Introduction to Modern Dynamics: Chaos, Networks, Space and Time, 2nd ed. (Oxford,2019)
"""

import numpy as np
from scipy import integrate
from matplotlib import pyplot as plt

plt.close('all')

def tripartite(x,y,z):

    sm = x + y + z
    xp = x/sm
    yp = y/sm
    
    f = np.sqrt(3)/2
    
    y0 = f*xp
    x0 = -0.5*xp - yp + 1;
    
    plt.figure(2)
    lines = plt.plot(x0,y0)
    plt.setp(lines, linewidth=0.5)
    plt.plot([0, 1],[0, 0],'k',linewidth=1)
    plt.plot([0, 0.5],[0, f],'k',linewidth=1)
    plt.plot([1, 0.5],[0, f],'k',linewidth=1)
    plt.show()
    

def solve_flow(y,tspan):
    def flow_deriv(y, t0):
    #"""Compute the time-derivative ."""
    
        f = np.zeros(shape=(N,))
        for iloop in range(N):
            ftemp = 0
            for jloop in range(N):
                ftemp = ftemp + A[iloop,jloop]*y[jloop]
            f[iloop] = ftemp
        
        phitemp = phi0          # Can adjust this from 0 to 1 to stabilize (but Nth population is no longer independent)
        for loop in range(N):
            phitemp = phitemp + f[loop]*y[loop]
        phi = phitemp
        
        yd = np.zeros(shape=(N,))
        for loop in range(N-1):
            yd[loop] = y[loop]*(f[loop] - phi);
        
        if np.abs(phi0) < 0.01:             # average fitness maintained at zero
            yd[N-1] = y[N-1]*(f[N-1]-phi);
        else:                                     # non-zero average fitness
            ydtemp = 0
            for loop in range(N-1):
                ydtemp = ydtemp - yd[loop]
            yd[N-1] = ydtemp
       
        return yd

    # Solve for the trajectories
    t = np.linspace(0, tspan, 701)
    x_t = integrate.odeint(flow_deriv,y,t)
    return t, x_t

# model_case 1 = zero diagonal
# model_case 2 = zero trace
# model_case 3 = asymmetric (zero trace)
print(' ')
print('trirep.py')
print('Case: 1 = antisymm zero diagonal')
print('Case: 2 = antisymm zero trace')
print('Case: 3 = random')
model_case = int(input('Enter the Model Case (1-3)'))

N = 3
asymm = 3      # 1 = zero diag (replicator eqn)   2 = zero trace (autocatylitic model)  3 = random (but zero trace)
phi0 = 0.001            # average fitness (positive number) damps oscillations
T = 100;


if model_case == 1:
    Atemp = np.zeros(shape=(N,N))
    for yloop in range(N):
        for xloop in range(yloop+1,N):
            Atemp[yloop,xloop] = 2*(0.5 - np.random.random(1))
            Atemp[xloop,yloop] = -Atemp[yloop,xloop]
    A = Atemp

if model_case == 2:
    Atemp = np.zeros(shape=(N,N))
    for yloop in range(N):
        for xloop in range(yloop+1,N):
            Atemp[yloop,xloop] = 2*(0.5 - np.random.random(1))
            Atemp[xloop,yloop] = -Atemp[yloop,xloop]
        Atemp[yloop,yloop] = 2*(0.5 - np.random.random(1))
    tr = np.trace(Atemp)
    A = Atemp
    for yloop in range(N):
        A[yloop,yloop] = Atemp[yloop,yloop] - tr/N
        
else:
    Atemp = np.zeros(shape=(N,N))
    for yloop in range(N):
        for xloop in range(N):
            Atemp[yloop,xloop] = 2*(0.5 - np.random.random(1))
        
    tr = np.trace(Atemp)
    A = Atemp
    for yloop in range(N):
        A[yloop,yloop] = Atemp[yloop,yloop] - tr/N

plt.figure(3)
im = plt.matshow(A,3,cmap=plt.cm.get_cmap('seismic'))  # hsv, seismic, bwr
cbar = im.figure.colorbar(im)

M = 20
delt = 1/M
ep = 0.01;

tempx = np.zeros(shape = (3,))
for xloop in range(M):
    tempx[0] = delt*(xloop)+ep;
    for yloop in range(M-xloop):
        tempx[1] = delt*yloop+ep
        tempx[2] = 1 - tempx[0] - tempx[1]
        
        x0 = tempx/np.sum(tempx);          # initial populations
        
        tspan = 70
        t, x_t = solve_flow(x0,tspan)
        
        y1 = x_t[:,0]
        y2 = x_t[:,1]
        y3 = x_t[:,2]
        
        plt.figure(1)
        lines = plt.plot(t,y1,t,y2,t,y3)
        plt.setp(lines, linewidth=0.5)
        plt.show()
        plt.ylabel('X Position')
        plt.xlabel('Time')

        tripartite(y1,y2,y3)

Topics in Modern Dynamics

These three examples are just the tip of the iceberg. The topics in modern dynamics are almost numberless. Any system that changes in time is a potential object of study in modern dynamics. Here is a list of a few topics that spring to mind.

Bibliography

D. D. Nolte, Introduction to Modern Dynamics: Chaos, Networks, Space and Time, 2nd Ed. (Oxford University Press, 2019) (The physics and the derivations of the equations for the examples in this blog can be found here.)

Publication Date for the Second Edition: November 18, 2019

D. D. Nolte, Galileo Unbound: A Path Across Life, the Universe and Everything (Oxford University Press, 2018) (The historical origins of the examples in this blog can be found here.)