Blogs inspired by the book Galileo Unbound (Oxford, 2018)
Author: David D. Nolte
Posts by David D. Nolte
David D. Nolte is the Edward M. Purcell Distinguished Professor of Physics and Astronomy at Purdue University. His research has pioneered the physics and applications of dynamic holography, the BioCD, and dynamic contrast optical coherence tomography (DC-OCT). He received his baccalaureate from Cornell University, his PhD from the University of California at Berkeley, and had a post-doctoral appointment at AT&T Bell Labs before joining the faculty at Purdue. He is a Fellow of the Optical Society of America, a Fellow of the American Physical Society and a Fellow of the AAAS. He received the Herbert Newby McCoy Award of Purdue University and is the technical founder of two biotech startup companies in diagnostic screening and analysis. He has written several popular science books, including Interference (Oxford, 2023), Galileo Unbound (Oxford, 2018), and Mind at Light Speed (Free Press, 2001).
When Newton developed his theory of universal gravitation, the first problem he tackled was Kepler’s elliptical orbits of the planets around the sun, and he succeeded beyond compare. The second problem he tackled was of more practical importance than the tracks of distant planets, namely the path of the Earth’s own moon, and he was never satisfied.
Newton’s Principia and the Problem of Longitude
Measuring the precise location of the moon at exact times against the backdrop of the celestial sphere was a method for ships at sea to find their longitude. Yet the moon’s orbit around the Earth is irregular, and Newton recognized that because gravity was universal, every body exerted a force on every other, and the moon was being tugged upon by the sun as well as by the Earth.
Newton’s attempt with the Moon was his last significant scientific endeavor
In Propositions 65 and 66 of Book 1 of the Principia, Newton applied his new theory to attempt to pin down the moon’s trajectory, but was thwarted by the complexity of the three bodies of the Earth-Moon-Sun system. For instance, the force of the sun on the moon is greater than the force of the Earth on the moon, which raised the question of why the moon continued to circle the Earth rather than being pulled away to the sun. Newton correctly recognized that it was the Earth-moon system that was in orbit around the sun, and hence the sun caused only a perturbation on the Moon’s orbit around the Earth. However, because the Moon’s orbit is approximately elliptical, the Sun’s pull on the Moon is not constant as it swings around in its orbit, and Newton only succeeded in making estimates of the perturbation.
Unsatisfied with his results in the Principia, Newton tried again, beginning in the summer of 1694, but the problem was too great even for him. In 1702 he published his research, as far as he was able to take it, on the orbital trajectory of the Moon. He could pin down the motion to within 10 arc minutes, but this was not accurate enough for reliable navigation, representing an uncertainty of over 10 kilometers at sea—error enough to run aground at night on unseen shoals. Newton’s attempt with the Moon was his last significant scientific endeavor, and afterwards this great scientist withdrew into administrative activities and other occult interests that consumed his remaining time.
Race for the Moon
The importance of the Moon for navigation was too pressing to ignore, and in the 1740’s a heated competition to be the first to pin down the Moon’s motion developed among three of the leading mathematicians of the day—Leonhard Euler, Jean Le Rond D’Alembert and Alexis Clairaut—who began attacking the lunar problem and each other [1]. Euler in 1736 had published the first textbook on dynamics that used the calculus, and Clairaut had recently returned from Lapland with Maupertuis. D’Alembert, for his part, had placed dynamics on a firm physical foundation with his 1743 textbook. Euler was first to publish with a lunar table in 1746, but there remained problems in his theory that frustrated his attempt at attaining the required level of accuracy.
At nearly the same time Clairaut and D’Alembert revisited Newton’s foiled lunar theory and found additional terms in the perturbation expansion that Newton had neglected. They rushed to beat each other into print, but Clairaut was distracted by a prize competition for the most accurate lunar theory, announced by the Russian Academy of Sciences and refereed by Euler, while D’Alembert ignored the competition, certain that Euler would rule in favor of Clairaut. Clairaut won the prize, but D’Alembert beat him into print.
The rivalry over the moon did not end there. Clairaut continued to improve lunar tables by combining theory and observation, while D’Alembert remained more purely theoretical. A growing animosity between Clairaut and D’Alembert spilled out into the public eye and became a daily topic of conversation in the Paris salons. The difference in their approaches matched the difference in their personalities, with the more flamboyant and pragmatic Clairaut disdaining the purist approach and philosophy of D’Alembert. Clairaut succeeded in publishing improved lunar theory and tables in 1752, followed by Euler in 1753, while D’Alembert’s interests were drawn away towards his activities for Diderot’s Encyclopedia.
The battle over the Moon in the late 1740’s was carried out on the battlefield of perturbation theory. To lowest order, the orbit of the Moon around the Earth is a Keplerian ellipse, and the effect of the Sun, though creating problems for the use of the Moon for navigation, produces only a small modification—a perturbation—of its overall motion. Within a decade or two, the accuracy of perturbation calculations, combined with empirical observations, had improved to the point that lunar tables allowed ships to locate their longitude to within a kilometer at sea. The most accurate tables were made by Tobias Mayer, who was awarded posthumously a prize of 3000 pounds by the British Parliament in 1763 for the determination of longitude at sea. Euler received 300 pounds for helping Mayer with his calculations. This was the same prize that was coveted by the famous clockmaker John Harrison and depicted so brilliantly in Dava Sobel’s Longitude (1995).
Lagrange Points
Several years later, in 1772, Lagrange discovered an interesting special solution to the planar three-body problem with three massive points each executing an elliptic orbit around the center of mass of the system, but configured such that their positions always coincided with the vertices of an equilateral triangle [2]. He found a more important special solution in the restricted three-body problem, in which a third body of negligible mass turns out to have two stable equilibrium points in the combined gravitational potentials of two massive bodies. These two stable equilibrium points are known as the L4 and L5 Lagrange points. Small objects can orbit these points, and in the Sun-Jupiter system these points are occupied by the Trojan asteroids. Similarly, stable Lagrange points exist in the Earth-Moon system where space stations or satellites could be parked.
For the special case of circular orbits of constant angular frequency ω, the motion of the third mass is described by the Lagrangian
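A standard form of this Lagrangian for the third mass m3 (written here with r1(t) and r2(t) as the distances to the two larger masses; the exact notation is an assumption rather than the author’s) is

$$ L = \frac{1}{2} m_3\left(\dot{x}^2 + \dot{y}^2\right) + \frac{G m_1 m_3}{r_1(t)} + \frac{G m_2 m_3}{r_2(t)} $$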
where the potential is time dependent because of the motion of the two larger masses. Lagrange approached the problem by adopting a rotating reference frame in which the two larger masses m1 and m2 move along the stationary line defined by their centers. The Lagrangian in the rotating frame is
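In the rotating frame the same quantity can be written (again in assumed, standard notation) as

$$ L = \frac{1}{2} m_3\left[\left(\dot{x} - \omega y\right)^2 + \left(\dot{y} + \omega x\right)^2\right] + \frac{G m_1 m_3}{r_1} + \frac{G m_2 m_3}{r_2} = \frac{1}{2} m_3\left(\dot{x}^2 + \dot{y}^2\right) - V_{\rm eff} $$

with

$$ V_{\rm eff} = -m_3\,\omega\left(x\dot{y} - y\dot{x}\right) - \frac{1}{2} m_3\,\omega^2\left(x^2 + y^2\right) - \frac{G m_1 m_3}{r_1} - \frac{G m_2 m_3}{r_2} $$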
where the effective potential is now time independent. The first term in the effective potential is the Coriolis effect and the second is the centrifugal term.
Fig. Effective potential for the planar three-body problem and the five Lagrange points where the gradient of the effective potential equals zero. The Lagrange points are displayed on a horizontal cross section of the potential energy shown with equipotential lines. The large circle in the center is the Sun. The smaller circle on the right is a Jupiter-like planet. The points L1, L2 and L3 are each saddle-point equilibria positions and hence unstable. The points L4 and L5 are stable points that can collect small masses that orbit these Lagrange points.
The effective potential is shown in the figure for m1 = 10 m2. There are five locations where the gradient of the effective potential equals zero. The point L1 is the equilibrium position between the two larger masses. The points L2 and L3 are at positions where the centrifugal force balances the gravitational attraction to the two larger masses. These are also the points that separate local orbits around a single mass from global orbits that orbit the two-body system. The last two Lagrange points, L4 and L5, each sit at the third vertex of an equilateral triangle whose other two vertices are at the positions of the larger masses. The first three Lagrange points are saddle points. The last two are at maxima of the effective potential.
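The five Lagrange points can be located numerically from the rotating-frame effective potential. The following minimal Python sketch is illustrative only (it is not the code behind the figure): it assumes normalized units G = 1, m1 + m2 = 1, unit separation and unit angular frequency, and the 10:1 mass ratio described above.

import numpy as np
from scipy.optimize import brentq

mu = 1.0/11.0                 # mass fraction of the smaller body for m1 = 10*m2 (assumed ratio)
x1, x2 = -mu, 1.0 - mu        # positions of the two massive bodies on the rotating x-axis

def dOmega_dx(x):
    # x-derivative of the rotating-frame effective potential along the y = 0 axis
    r1, r2 = abs(x - x1), abs(x - x2)
    return x - (1 - mu)*(x - x1)/r1**3 - mu*(x - x2)/r2**3

# L1, L2, L3 are the roots of dOmega/dx between and beyond the two masses
L1 = brentq(dOmega_dx, x1 + 1e-3, x2 - 1e-3)
L2 = brentq(dOmega_dx, x2 + 1e-3, 2.0)
L3 = brentq(dOmega_dx, -2.0, x1 - 1e-3)

# L4 and L5 complete equilateral triangles with the two masses
L4 = (0.5 - mu, np.sqrt(3)/2)
L5 = (0.5 - mu, -np.sqrt(3)/2)

print("Collinear points L1, L2, L3:", L1, L2, L3)
print("Triangular points L4, L5:", L4, L5)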
L1 lies between Earth and the sun at about 1 million miles from Earth. L1 gets an uninterrupted view of the sun, and is currently occupied by the Solar and Heliospheric Observatory (SOHO) and the Deep Space Climate Observatory. L2 also lies about a million miles from Earth, but in the direction opposite the sun. At this point, with the Earth, moon and sun behind it, a spacecraft can get a clear view of deep space. NASA’s Wilkinson Microwave Anisotropy Probe (WMAP) measured the cosmic background radiation left over from the Big Bang from this spot. The James Webb Space Telescope will move into this region in 2021.
[1] Gutzwiller, M. C. (1998). “Moon-Earth-Sun: The oldest three-body problem.” Reviews of Modern Physics 70(2): 589-639.
[2] J. L. Lagrange, Essai sur le problème des trois corps, 1772, Oeuvres, tome 6.
The 1960’s are known as a time of cultural revolution, but perhaps less known was the revolution that occurred in the science of dynamics. Three towering figures of that revolution were Stephen Smale (1930 – ) at Berkeley, Andrey Kolmogorov (1903 – 1987) in Moscow and his student Vladimir Arnold (1937 – 2010). Arnold was only 20 years old in 1957 when he solved Hilbert’s thirteenth problem (showing that any continuous function of several variables can be constructed with a finite number of two-variable functions). Only a few years later, his work on the problem of small denominators in dynamical systems provided the finishing touches on the long-elusive explanation of the stability of the solar system (the problem for which Poincaré won the King Oscar Prize in mathematics in 1889 when he discovered chaotic dynamics). This theory is known as KAM theory, after the initials of Kolmogorov, Arnold and Moser [1]. Building on his breakthrough in celestial mechanics, Arnold’s work through the 1960’s remade the theory of Hamiltonian systems, creating a shift in perspective that has permanently altered how physicists look at dynamical systems.
Hamiltonian Physics on a Torus
Traditionally, Hamiltonian physics is associated with systems of inertial objects that conserve the sum of kinetic and potential energy, in other words, conservative non-dissipative systems. But a modern view (after Arnold) of Hamiltonian systems sees them as hyperdimensional mathematical mappings that conserve volume. The space that these mappings inhabit is phase space, and the conservation of phase-space volume is known as Liouville’s Theorem [2]. The geometry of phase space is called symplectic geometry, and the universal position that symplectic geometry now holds in the physics of Hamiltonian mechanics is largely due to Arnold’s textbook Mathematical Methods of Classical Mechanics (1974, English translation 1978) [3]. Arnold’s famous quote from that text is “Hamiltonian mechanics is geometry in phase space”.
One of the striking aspects of this textbook is the reduction of phase-space geometry to the geometry of a hyperdimensional torus for a large number of Hamiltonian systems. If there are as many conserved quantities as there are degrees of freedom in a Hamiltonian system, then the system is called “integrable” (because you can integrate the equations of motion to find a constant of the motion). Then it is possible to map the physics onto a hyperdimensional torus through the transformation of dynamical coordinates into what are known as “action-angle” coordinates [4]. Each independent angle has an associated action that is conserved during the motion of the system. The periodicity of the dynamical angle coordinate makes it possible to identify it with the angular coordinate of a multi-dimensional torus. Therefore, every integrable Hamiltonian system can be mapped to motion on a multi-dimensional torus (one dimension for each degree of freedom of the system).
Actually, integrable Hamiltonian systems are among the most boring dynamical systems you can imagine. They literally just go in circles (around the torus). But as soon as you add a small perturbation that cannot be integrated, they produce some of the most complex and beautiful patterns of all dynamical systems. It was Arnold’s focus on motions on a torus, and on perturbations that shift the dynamics off the torus, that led him to propose a simple mapping that captured the essence of Hamiltonian chaos.
The Arnold Cat Map
Motion on a two-dimensional torus is defined by two angles, and trajectories on a two-dimensional torus are simple helixes. If the periodicities of the motion in the two angles have an integer ratio, the helix repeats itself. However, if the ratio of periods (also known as the winding number) is irrational, then the helix never repeats and passes arbitrarily closely to any point on the surface of the torus. This last case leads to an “ergodic” system, which is a term introduced by Boltzmann to describe a physical system whose trajectory fills phase space. The behavior of a helix for rational or irrational winding number is not terribly interesting. It’s just an orbit going in circles like an integrable Hamiltonian system. The helix can never even cross itself.
However, if you could add a new dimension to the torus (or add a new degree of freedom to the dynamical system), then the helix could pass over or under itself by moving into the new dimension. By weaving around itself, a trajectory can become chaotic, and the set of many trajectories can become as mixed up as a bowl of spaghetti. This can be a little hard to visualize, especially in higher dimensions, but Arnold thought of a very simple mathematical mapping that captures the essential motion on a torus, preserving volume as required for a Hamiltonian system, but with the ability for regions to become all mixed up, just like trajectories in a nonintegrable Hamiltonian system.
A unit square is isomorphic to a two-dimensional torus. This means that there is a one-to-one mapping of each point on the unit square to each point on the surface of a torus. Imagine taking a sheet of paper and forming a tube out of it. One of the dimensions of the sheet of paper is now an angle coordinate that is cyclic, going around the circumference of the tube. Now if the sheet of paper is flexible (like it is made of thin rubber) you can bend the tube around and connect the top of the tube with the bottom, like a bicycle inner tube. The other dimension of the sheet of paper is now also an angle coordinate that is cyclic. In this way a flat sheet is converted (with some bending) into a torus.
Arnold’s key idea was to create a transformation that takes the torus into itself, preserving volume, yet including the ability for regions to pass around each other. Arnold accomplished this with the simple map
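In a common convention (assumed here; Arnold’s own choice of unimodular matrix may differ, and the behavior is the same), the map sends a point (x, y) of the unit square to

$$ x' = x + y \ (\mathrm{mod}\ 1), \qquad y' = x + 2y \ (\mathrm{mod}\ 1) $$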
where the modulus 1 takes the unit square into itself. This transformation can also be expressed as a matrix
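With the same assumed convention, the matrix form is

$$ \begin{pmatrix} x' \\ y' \end{pmatrix} = \begin{pmatrix} 1 & 1 \\ 1 & 2 \end{pmatrix} \begin{pmatrix} x \\ y \end{pmatrix} $$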
followed by taking modulus 1. The transformation matrix is called a Floquet matrix, and the determinant of the matrix is equal to unity, which ensures that volume is conserved.
Arnold decided to illustrate this mapping by using a crude image of the face of a cat (See Fig. 1). Successive applications of the transformation stretch and shear the cat, which is then folded back into the unit square. The stretching and folding preserve the volume, but the image becomes all mixed up, just like mixing in a chaotic Hamiltonian system, or like an immiscible dye in water that is stirred.
Fig. 1 Arnold’s illustration of his cat map from pg. 6 of V. I. Arnold and A. Avez, Ergodic Problems of Classical Mechanics (Benjamin, 1968) [5]
Fig. 2 Arnold Cat Map operation is an iterated succession of stretching with shear of a unit square, and translation back to the unit square. The mapping preserves and mixes areas, and is invertible.
Recurrence
When the transformation matrix is applied to continuous values, it produces a continuous range of transformed values that become thinner and thinner until the unit square is uniformly mixed. However, if the unit square is discrete, made up of pixels, then something very different happens (see Fig. 3). The image of the cat in this case is composed of a 50×50 array of pixels. For early iterations, the image becomes stretched and mixed, but at iteration 50 there are 4 low-resolution upside-down versions of the cat, and at iteration 75 the cat fully reforms, but is upside-down. Continuing on, the cat eventually reappears fully reformed and upright at iteration 150. Therefore, the discrete case displays a recurrence and the mapping is periodic. Calculating the period of the cat map on lattices can lead to interesting patterns, especially if the lattice is composed of prime numbers [6].
Fig. 3 A discrete cat map has a recurrence period. This example with a 50×50 lattice has a period of 150.
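The recurrence is easy to reproduce numerically. The short Python sketch below (an illustration using the map convention given above, not the code that generated the figures) applies the discrete cat map to an N×N lattice of labeled pixels and counts the iterations until the starting image returns.

import numpy as np

def cat_map(image):
    # One iteration of the discrete cat map (x, y) -> (x + y, x + 2y) mod N on an N x N array
    N = image.shape[0]
    x, y = np.meshgrid(np.arange(N), np.arange(N), indexing='ij')
    xn = (x + y) % N
    yn = (x + 2*y) % N
    new = np.empty_like(image)
    new[xn, yn] = image[x, y]     # the map is a bijection on the lattice, so this is a permutation of pixels
    return new

def recurrence_period(N):
    # Count iterations until every pixel returns to its starting position
    start = np.arange(N*N).reshape(N, N)
    image, period = cat_map(start), 1
    while not np.array_equal(image, start):
        image = cat_map(image)
        period += 1
    return period

print(recurrence_period(50))      # the 50x50 example in Fig. 3 recurs after 150 iterations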
The Cat Map and the Golden Mean
The golden mean, or the golden ratio, 1.618033988749895, is never far away when working with Hamiltonian systems. Because the golden mean is the “most irrational” of all irrational numbers, it plays an essential role in KAM theory on the stability of the solar system. In the case of Arnold’s cat map, it pops up in several ways. For instance, the eigenvalues of the transformation matrix are powers of the golden mean, and their product is unity, which guarantees conservation of area.
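Explicitly, for the matrix convention assumed above, the eigenvalues are

$$ \lambda_{\pm} = \frac{3 \pm \sqrt{5}}{2} = \varphi^{2},\ \varphi^{-2}, \qquad \varphi = \frac{1 + \sqrt{5}}{2} $$

and their product is

$$ \lambda_{+}\lambda_{-} = \det M = 1 $$

which is the area-preserving property.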
Selected V. I. Arnold Publications
Arnold,
V. I. “FUNCTIONS OF 3 VARIABLES.” Doklady Akademii Nauk Sssr 114(4):
679-681. (1957)
Arnold,
V. I. “GENERATION OF QUASI-PERIODIC MOTION FROM A FAMILY OF PERIODIC
MOTIONS.” Doklady Akademii Nauk Sssr 138(1): 13-&.
(1961)
Arnold,
V. I. “STABILITY OF EQUILIBRIUM POSITION OF A HAMILTONIAN SYSTEM OF
ORDINARY DIFFERENTIAL EQUATIONS IN GENERAL ELLIPTIC CASE.” Doklady
Akademii Nauk Sssr 137(2): 255-&. (1961)
Arnold,
V. I. “BEHAVIOUR OF AN ADIABATIC INVARIANT WHEN HAMILTONS FUNCTION IS
UNDERGOING A SLOW PERIODIC VARIATION.” Doklady Akademii Nauk Sssr 142(4):
758-&. (1962)
Arnold,
V. I. “CLASSICAL THEORY OF PERTURBATIONS AND PROBLEM OF STABILITY OF
PLANETARY SYSTEMS.” Doklady Akademii Nauk Sssr 145(3):
487-&. (1962)
Arnold, V. I. and Y. G. Sinai. “Small perturbations of the automorphisms of a torus.” Doklady Akademii Nauk SSSR 144(4): 695. (1962)
Arnold, V. I. “Small denominators and problems of the stability of motion in classical and celestial mechanics (in Russian).” Usp. Mat. Nauk 18: 91-192. (1963)
Arnold, V. I. and A. L. Krylov. “Uniform distribution of points on a sphere and some ergodic properties of solutions to linear ordinary differential equations in complex region.” Doklady Akademii Nauk SSSR 148(1): 9. (1963)
Arnold, V. I. “Instability of dynamical systems with many degrees of freedom.” Doklady Akademii Nauk SSSR 156(1): 9. (1964)
Arnold, V. “Sur une propriété topologique des applications globalement canoniques de la mécanique classique.” Comptes Rendus Hebdomadaires des Séances de l’Académie des Sciences 261(19): 3719. (1965)
Arnold, V. I. “Applicability conditions and error estimation by averaging for systems which go through resonances in course of evolution.” Doklady Akademii Nauk SSSR 161(1): 9. (1965)
Bibliography
[1] Dumas, H. S. The KAM Story: A friendly introduction to the content, history and significance of Classical Kolmogorov-Arnold-Moser Theory, World Scientific. (2014)
[2] See Chapter 6, “The Tangled Tale of Phase Space” in Galileo Unbound (D. D. Nolte, Oxford University Press, 2018)
[3] V. I. Arnold, Mathematical Methods of Classical Mechanics (Nauk 1974, English translation Springer 1978)
[4] See Chapter 3, “Hamiltonian Dynamics and Phase Space” in Introduction to Modern Dynamics, 2nd ed. (D. D. Nolte, Oxford University Press, 2019)
[5] V. I. Arnold and A. Avez, Ergodic Problems of Classical Mechanics (Benjamin, 1968)
[6] Gaspari, G. “The Arnold cat map on prime lattices.” Physica D: Nonlinear Phenomena 73(4): 352-372. (1994)
This Blog Post is a Companion to the undergraduate physics textbook Modern Dynamics: Chaos, Networks, Space and Time, 2nd ed. (Oxford, 2019) introducing Lagrangians and Hamiltonians, chaos theory, complex systems, synchronization, neural networks, econophysics and Special and General Relativity.
Nature loves the path of steepest descent. Place a ball on a smooth curved surface and release it, and it will instantaneously accelerate in the direction of steepest descent. Shoot a laser beam from an oblique angle onto a piece of glass to hit a target inside, and the path taken by the beam is the one that reaches the target in the least time. Diffract a stream of electrons from the surface of a crystal, and quantum detection events are greatest at the positions where the troughs and peaks of the de Broglie waves converge the most. The first example is Newton’s second law. The second example is Fermat’s principle and Snell’s Law. The third example is Feynman’s path-integral formulation of quantum mechanics. They all share a common minimization principle—the principle of least action—which holds that the path of a dynamical system is the one that minimizes a property known as “action”.
The Eikonal Equation is the “F = ma” of ray optics. Its solutions describe the paths of light rays through complicated media.
The principle of least action, first proposed by the French physicist Maupertuis through mechanical analogy, became a principle of Lagrangian mechanics in the hands of Lagrange, but was still restricted to mechanical systems of particles. The principle was generalized forty years later by Hamilton, who began by considering the propagation of light waves, and ended by transforming mechanics into a study of pure geometry divorced from forces and inertia. Optics played a key role in the development of mechanics, and mechanics returned the favor by giving optics the Eikonal Equation. The Eikonal Equation is the “F = ma” of ray optics. Its solutions describe the paths of light rays through complicated media.
Malus’ Theorem
Anyone who has taken a course in optics knows that Étienne-Louis Malus (1775-1812) discovered the polarization of light, but little else is taught about this French mathematician who was one of the savants Napoleon had taken along with him when he invaded Egypt in 1798. After experiencing numerous horrors of war and plague, Malus returned to France damaged but wiser. He discovered the polarization of light in the Fall of 1808 as he was playing with crystals of Icelandic spar at sunset and happened to view the last rays of the sun reflected from the windows of the Luxembourg Palace. Icelandic spar produces double images in natural light because it is birefringent. Malus discovered that he could extinguish one of the double images of the Luxembourg windows by rotating the crystal a certain way, demonstrating that light is polarized by reflection. The degree to which light is extinguished as a function of the angle of the polarizing crystal is known as Malus’ Law.
Frontispiece to the Description de l’Égypte, the first volume published by Joseph Fourier in 1808 based on the report of the savants of l’Institut de l’Égypte that included Monge, Fourier and Malus, among many other French scientists and engineers.
Malus had picked up an interest in the general properties of light and imaging during lulls in his ordeal in Egypt. (To read about Malus’ misadventures during Napoleon’s campaign in Egypt, see Chapter 1 of Interference.) He was an emissionist following his compatriot Laplace, rather than an undulationist following Thomas Young. It is ironic that the French scientists were staunchly supporting Newton on the nature of light, while the British scientist Thomas Young was trying to upend Newtonian optics. Almost all physicists at that time were emissionists, only a few years after Young’s double-slit experiment of 1804, and few serious scientists accepted Young’s theory of the wave nature of light until Fresnel and Arago supplied the rigorous theory and experimental proofs much later in 1819.
Malus’ Theorem states that rays perpendicular to an initial surface are perpendicular to a later surface after reflection in an optical system. This theorem is the starting point for the Eikonal ray equation, as well as for modern applications in adaptive optics. This figure shows a propagating aberrated wavefront that is “compensated” by a deformable mirror to produce a tight focus.
As a prelude to his later discovery of polarization, Malus had earlier proven a theorem about the trajectories that particles of light take through an optical system. One of the key questions about the particles of light in an optical system was how they formed images. The physics of light particles moving through lenses was too complex to treat at that time, but reflection was relatively easy, based on the simple reflection law. Malus proved mathematically that a set of rays perpendicular to an initial nonplanar surface remains perpendicular to a later surface after reflection from a curved mirror (this property is closely related to the conservation of optical etendue). This is known as Malus’ Theorem, and he thought it only held true after a single reflection, but later mathematicians proved that it remains true even after an arbitrary number of reflections, even in cases when the rays intersect to form an optical effect known as a caustic. The mathematics of caustics would catch the interest of an Irish mathematician and physicist who helped launch a new field of mathematical physics.
Etienne-Louis Malus
Hamilton’s Characteristic Function
William Rowan Hamilton (1805 – 1865) was a child prodigy who taught himself thirteen languages by the time he was thirteen years old (with the help of his linguist uncle), but mathematics became his primary focus at Trinity College Dublin. His mathematical prowess was so great that he was made the Astronomer Royal of Ireland while still an undergraduate student. He also became fascinated by the theory of envelopes of curves and in particular by the mathematics of caustic curves in optics.
In 1823, at the age of 18, he wrote a paper titled Caustics that was read to the Royal Irish Academy. In this paper, Hamilton gave an exceedingly simple proof of Malus’ Theorem, but that was perhaps the simplest part of the paper. Other aspects were mathematically obscure, and reviewers requested further additions and refinements before publication. Over the next four years, as Hamilton expanded this work on optics, he developed a new theory of optics, the first part of which was published as Theory of Systems of Rays in 1827, with two following supplements completed by 1833 but never published.
Hamilton’s most important contribution to optical theory (and eventually to mechanics) he called his characteristic function. By applying Fermat’s principle of least time, which he called his principle of stationary action, he sought to find a single unique function that characterized every path through an optical system. By first proving Malus’ Theorem and then applying the theorem to any system of rays using the principle of stationary action, he was able to construct two partial differential equations whose solution, if it could be found, defined every ray through the optical system. This result was completely general and could be extended to include curved rays passing through inhomogeneous media. Because it mapped input rays to output rays, it was the most general characterization of any defined optical system. The characteristic function defined surfaces of constant action whose normal vectors were the rays of the optical system. Today these surfaces of constant action are called the Eikonal function (but how it got its name is the next chapter of this story). Using his characteristic function, Hamilton predicted a phenomenon known as conical refraction in 1832, which was subsequently observed, launching him to a level of fame unusual for an academic.
Once Hamilton had established his principle of stationary action of curved light rays, it was an easy step to extend it to apply to mechanical systems of particles with curved trajectories. This step produced his most famous work On a General Method in Dynamics published in two parts in 1834 and 1835 [1] in which he developed what became known as Hamiltonian dynamics. As his mechanical work was extended by others including Jacobi, Darboux and Poincaré, Hamilton’s work on optics was overshadowed, overlooked and eventually lost. It was rediscovered when Schrödinger, in his famous paper of 1926, invoked Hamilton’s optical work as a direct example of the wave-particle duality of quantum mechanics [2]. Yet in the interim, a German mathematician tackled the same optical problems that Hamilton had seventy years earlier, and gave the Eikonal Equation its name.
Bruns’ Eikonal
The German mathematician Heinrich Bruns (1848-1919) was engaged chiefly with the measurement of the Earth, or geodesy. He was a professor of mathematics in Berlin and later Leipzig. One claim to fame was that one of his graduate students was Felix Hausdorff [3], who would go on to much greater fame in the fields of set theory and measure theory (the Hausdorff dimension was a precursor to the fractal dimension). Possibly motivated by his studies done with Hausdorff on refraction of light by the atmosphere, Bruns became interested in Malus’ Theorem for the same reasons and with the same goals as Hamilton, yet was unaware of Hamilton’s work in optics.
The mathematical process of creating “images”, in the sense of a mathematical mapping, made Bruns think of the Greek word εικων, which literally means “icon” or “image”, and he published a small book in 1895 with the title Das Eikonal in which he derived a general equation for the path of rays through an optical system. His approach was heavily geometrical and is not easily recognized as an equation arising from variational principles. It rediscovered most of the results of Hamilton’s paper on the Theory of Systems of Rays and was thus not groundbreaking in the sense of new discovery. But it did reintroduce the world to the problem of systems of rays, and his name of Eikonal for the equations of the ray paths stuck, and was used with increasing frequency in subsequent years. Arnold Sommerfeld (1868 – 1951) was one of the early proponents of the Eikonal equation and recognized its connection with action principles in mechanics. He discussed the Eikonal equation in a 1911 optics paper with Runge [4] and in 1916 used action principles to extend Bohr’s model of the hydrogen atom [5]. While the Eikonal approach was not used often, it became popular in the 1960’s when computational optics made numerical solutions possible.
Lagrangian Dynamics of Light Rays
In physical optics, one of the most important properties of a ray passing through an optical system is known as the optical path length (OPL). The OPL is the central quantity that is used in problems of interferometry, and it is the central property that appears in Fermat’s principle that leads to Snell’s Law. The OPL played an important role in the history of the calculus when Johann Bernoulli in 1697 used the analogy with the path taken by a light ray to derive the brachistochrone curve – the curve of least time taken by a particle between two points.
The OPL between two points in a refractive medium is the sum of the piecewise product of the refractive index n with infinitesimal elements of the path length ds. In integral form, this is expressed as
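In a standard form (the index notation is an assumption consistent with the description that follows, with s a parameter along the path),

$$ \mathrm{OPL} = \int n\left(x^{a}\right)\sqrt{\dot{x}_{a}\dot{x}^{a}}\, ds $$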
where the “dot” is a derivative with respect to s. The optical Lagrangian is recognized as
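namely (same assumed notation)

$$ L\left(x^{a}, \dot{x}^{a}\right) = n\left(x^{a}\right)\sqrt{\dot{x}_{a}\dot{x}^{a}} $$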
The Lagrangian is inserted into the Euler equations to yield (after some algebra, see Introduction to Modern Dynamics pg. 336)
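A standard form of the resulting ray equation (reconstructed here in the same assumed notation) is

$$ \frac{d}{ds}\!\left( n\,\frac{dx^{a}}{ds} \right) = \frac{\partial n}{\partial x^{a}} $$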
This is a second-order ordinary differential equation in the variables xa that define the ray path through the system. It is literally a “trajectory” of the ray, and the Eikonal equation becomes the F = ma of ray optics.
Hamiltonian Optics
In a paraxial system (in which the rays never make large angles relative to the optic axis) it is common to select the position z as a single parameter to define the curve of the ray path, so that the trajectory is parameterized by the transverse coordinates x(z) and y(z), the derivatives are taken with respect to z, and an effective Lagrangian can be recognized.
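A standard form of this effective Lagrangian (the exact notation is an assumption) is

$$ L\left(x, y, \dot{x}, \dot{y}; z\right) = n(x, y, z)\sqrt{1 + \dot{x}^{2} + \dot{y}^{2}}, \qquad \dot{x} \equiv \frac{dx}{dz},\quad \dot{y} \equiv \frac{dy}{dz} $$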
The Hamiltonian formulation is derived from the Lagrangian by defining an optical Hamiltonian as the Legendre transform of the Lagrangian. To start, the Lagrangian is expressed in terms of the generalized coordinates and momenta. The generalized optical momenta are defined as
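in the usual way (again a reconstruction rather than the author’s exact symbols):

$$ p_{x} = \frac{\partial L}{\partial \dot{x}} = \frac{n\,\dot{x}}{\sqrt{1 + \dot{x}^{2} + \dot{y}^{2}}}, \qquad p_{y} = \frac{\partial L}{\partial \dot{y}} = \frac{n\,\dot{y}}{\sqrt{1 + \dot{x}^{2} + \dot{y}^{2}}} $$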
This relationship leads to an alternative expression for the Eikonal equation (also known as the scalar Eikonal equation), in which S(x,y,z) = const. defines the eikonal function. The momentum vectors are perpendicular to the surfaces of constant S, which are recognized as the wavefronts of a propagating wave.
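A standard form of the scalar Eikonal equation, consistent with these definitions (and extending the momentum to all three components), is

$$ \left|\nabla S\right|^{2} = n^{2}(x, y, z), \qquad \vec{p} = \nabla S $$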
The Lagrangian can be restated as a function of the generalized momenta, and the Legendre transform then takes the Lagrangian into the optical Hamiltonian. The trajectory of the rays is the solution to Hamilton’s equations of motion applied to this Hamiltonian.
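Carrying out the Legendre transform with the paraxial Lagrangian above gives a standard optical Hamiltonian (a reconstruction; sign conventions vary between texts):

$$ H\left(x, y, p_{x}, p_{y}; z\right) = p_{x}\dot{x} + p_{y}\dot{y} - L = -\sqrt{n^{2}(x, y, z) - p_{x}^{2} - p_{y}^{2}} $$

with Hamilton’s equations dx/dz = ∂H/∂px, dpx/dz = −∂H/∂x, and similarly for y.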
Light Orbits
If the optical rays are restricted to the x-y plane, then Hamilton’s equations of motion can be expressed relative to the path length ds, and the momenta are pa = n dxa/ds. The ray equations (simply expressing the two second-order Eikonal equations as four first-order equations) are given below, where the dot is a derivative with respect to the element ds.
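Consistent with the Python code that follows (an assumption about the exact form intended), these are

$$ \frac{dx}{ds} = \frac{p_{x}}{n}, \qquad \frac{dy}{ds} = \frac{p_{y}}{n}, \qquad \frac{dp_{x}}{ds} = \frac{\partial n}{\partial x}, \qquad \frac{dp_{y}}{ds} = \frac{\partial n}{\partial y} $$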
As an example, consider a radial refractive index profile in the x-y plane
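The Gaussian profile used in the code below (selection = 1) can be written as

$$ n(r) = 1 + e^{-r^{2}/2\sigma^{2}} $$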
where r is the radius on the x-y plane. Putting this refractive index profile into the Eikonal equations creates a two-dimensional orbit in the x-y plane. The Eikonal Equation is the “F = ma” of ray optics. Its solutions describe the paths of light rays through complicated media, including the phenomenon of gravitational lensing (see my blog post) and the orbits of photons around black holes (see my other blog post).
By David D. Nolte, May 30, 2019
Gaussian refractive index profile in the x-y plane. From raysimple.py.
Ray orbits around the center of the Gaussian refractive index profile. From raysimple.py
Python Code: raysimple.py
The following Python code solves for individual trajectories. (Python code on GitHub.)
#!/usr/bin/env python3
# -*- coding: utf-8 -*-
"""
raysimple.py
Created on Tue May 28 11:50:24 2019
@author: nolte
D. D. Nolte, Introduction to Modern Dynamics: Chaos, Networks, Space and Time, 2nd ed. (Oxford,2019)
"""
import numpy as np
from scipy import integrate
from matplotlib import pyplot as plt
from matplotlib import cm
plt.close('all')
# selection 1 = Gaussian
# selection 2 = Donut
selection = 1
print(' ')
print('raysimple.py')
def refindex(x,y):
    # Returns the refractive index n(x,y) and its gradient components (nx, ny)
    if selection == 1:        # Gaussian index profile
        sig = 10
        n = 1 + np.exp(-(x**2 + y**2)/2/sig**2)
        nx = (-2*x/2/sig**2)*np.exp(-(x**2 + y**2)/2/sig**2)
        ny = (-2*y/2/sig**2)*np.exp(-(x**2 + y**2)/2/sig**2)
    elif selection == 2:      # Donut (ring-shaped) index profile
        sig = 10
        r2 = (x**2 + y**2)
        r1 = np.sqrt(r2)
        expon = np.exp(-r2/2/sig**2)
        n = 1 + 0.3*r1*expon
        nx = 0.3*r1*(-2*x/2/sig**2)*expon + 0.3*expon*2*x/r1
        ny = 0.3*r1*(-2*y/2/sig**2)*expon + 0.3*expon*2*y/r1
    return [n,nx,ny]
def flow_deriv(x_y_z,t):
    # The four first-order ray equations: dx/ds = px/n, dy/ds = py/n, dpx/ds = dn/dx, dpy/ds = dn/dy
    x, y, z, w = x_y_z        # here z and w hold the optical momenta px and py
    n, nx, ny = refindex(x,y)
    yp = np.zeros(shape=(4,))
    yp[0] = z/n
    yp[1] = w/n
    yp[2] = nx
    yp[3] = ny
    return yp
V = np.zeros(shape=(100,100))
for xloop in range(100):
    xx = -20 + 40*xloop/100
    for yloop in range(100):
        yy = -20 + 40*yloop/100
        n,nx,ny = refindex(xx,yy)
        V[yloop,xloop] = n
fig = plt.figure(1)
contr = plt.contourf(V,100, cmap=cm.coolwarm, vmin = 1, vmax = 3)
fig.colorbar(contr, shrink=0.5, aspect=5)
fig = plt.show()
v1 = 0.707 # Change this initial condition
v2 = np.sqrt(1-v1**2)
y0 = [12, 0, v1, v2] # Change these initial conditions
tspan = np.linspace(1,1700,1700)
y = integrate.odeint(flow_deriv, y0, tspan)
plt.figure(2)
lines = plt.plot(y[1:1550,0],y[1:1550,1])
plt.setp(lines, linewidth=0.5)
plt.show()
New from Oxford University Press: Interference and the History of Light and Optics (2023)
Read the stories of the scientists and engineers who tamed light and used it to probe the universe.
An excellent textbook on geometric optics from Hamilton’s point of view is K. B. Wolf, Geometric Optics in Phase Space (Springer, 2004). Another is H. A. Buchdahl, An Introduction to Hamiltonian Optics (Dover, 1992).
A rather older textbook on geometrical optics is by J. L. Synge, Geometrical Optics: An Introduction to Hamilton’s Method (Cambridge University Press, 1962) showing the derivation of the ray equations in the final chapter using variational methods. Synge takes a dim view of Bruns’ term “Eikonal” since Hamilton got there first and Bruns was unaware of it.
A book that makes an especially strong case for the Optical-Mechanical analogy of Fermat’s principle, connecting the trajectories of mechanics to the paths of optical rays is Daryl Holm, Geometric Mechanics: Part I Dynamics and Symmetry (Imperial College Press 2008).
[1] Hamilton, W. R. “On a general method in dynamics I.” Mathematical Papers, I ,103-161: 247-308. (1834); Hamilton, W. R. “On a general method in dynamics II.” Mathematical Papers, I ,103-161: 95-144. (1835)
[2] Schrodinger, E. “Quantification of the eigen-value problem.” Annalen Der Physik 79(6): 489-527. (1926)
There is a very real possibility that quantum computing is, and always will be, a technology of the future. Yet if it is ever to be the technology of the now, then it needs two things: practical high-performance implementation and a killer app. Both of these will require technological breakthroughs. Whether this will be enough to make quantum computing real (commercializable) was the topic of a special symposium at the Conference on Lasers and Electro-Optics (CLEO) held in San Jose the week of May 6, 2019.
Quantum computing is stuck in a sort of limbo between hype and hope, pitched with incredible (unbelievable?) claims, yet supported by tantalizing laboratory demonstrations.
The symposium had panelists from many top groups working in quantum information science, including Jerry Chow (IBM), Mikhail Lukin (Harvard), Jelena Vuckovic (Stanford), Birgitta Whaley (Berkeley) and Jungsang Kim (IonQ). The moderator Ben Eggleton (U Sydney) posed the question to the panel: “Will Quantum Computing Actually Work?”. My Blog for this week is a report, in part, of what they said, and also what was happening in the hallways and the scientific sessions at CLEO. My personal view after listening and watching this past week is that the future of quantum computers is optics.
Einstein’s Photons
It is either ironic or obvious that the central figure behind quantum computing is Albert Einstein. It is obvious because Einstein provided the fundamental tools of quantum computing by creating both quanta and entanglement (the two key elements of any quantum computer). It is ironic because Einstein turned his back on quantum mechanics, and he “invented” entanglement to argue that quantum mechanics was an “incomplete science”.
The actual quantum revolution did not begin with Max Planck in 1900, as so many Modern Physics textbooks attest, but with Einstein in 1905. This was his “miracle year” when he published 5 seminal papers, each of which solved one of the greatest outstanding problems in the physics of the time. In one of those papers he used simple arguments based on statistics, combined with the properties of light emission, to propose — actually to prove — that light is composed of quanta of energy (later to be named “photons” by Gilbert Lewis in 1926). Although Planck’s theory of blackbody radiation contained quanta implicitly through the discrete actions of his oscillators in the walls of the cavity, Planck vigorously rejected the idea that light itself came in quanta. He even apologized for Einstein, as he was proposing Einstein for membership in the Berlin Academy, saying that he should be admitted despite his grave error of believing in light quanta. When Millikan set out in 1914 to prove experimentally that Einstein was wrong about photons by performing exquisite experiments on the photoelectric effect, he actually ended up proving that Einstein was right after all, which brought Einstein the Nobel Prize in 1921.
In the early 1930’s, after a series of intense and public debates with Bohr over the meaning of quantum mechanics, Einstein had had enough of the “Copenhagen Interpretation” of quantum mechanics. In league with Schrödinger, who deeply disliked Heisenberg’s version of quantum mechanics, the two proposed two of the most iconic problems of quantum mechanics. Schrödinger launched, as a laughable parody, his eponymously-named “Schrödinger’s Cat”, and Einstein launched what has become known as “entanglement”. Each was intended to show the absurdity of quantum mechanics and drive a nail into its coffin, but each has been embraced so thoroughly by physicists that Schrödinger and Einstein are given the praise and glory for inventing these touchstones of quantum science. Schrödinger’s cat and entanglement both lie at the heart of the problems and the promise of quantum computers.
Between Hype and Hope
Quantum computing is stuck in a sort of limbo between hype and hope, pitched with incredible (unbelievable?) claims, yet supported by tantalizing laboratory demonstrations. In the midst of the current revival in quantum computing interest (the first wave of interest in quantum computing was in the 1990’s, see “Mind at Light Speed“), the US Congress has passed a house resolution to fund quantum computing efforts in the United States with a commitment of $1B. This comes on the heels of commercial efforts in quantum computing by big players like IBM, Microsoft and Google, and also is partially in response to China’s substantial financial commitment to quantum information science. These acts, and the infusion of cash, will supercharge efforts on quantum computing. But this comes with real danger of creating a bubble. If there is too much hype, and if the expensive efforts under-deliver, then the bubble will burst, putting quantum computing back by decades. This has happened before, as in the telecom and fiber optics bubble of Y2K that burst in 2001. The optics industry is still recovering from that crash nearly 20 years later. The quantum computing community will need to be very careful in managing expectations, while also making real strides on some very difficult and long-range problems.
This was part of what the discussion at the CLEO symposium centered around. Despite the charge by Eggleton to “be real” and avoid the hype, there was plenty of hype going around on the panel and plenty of optimism, tempered by caution. I admit that there is reason for cautious optimism. Jerry Chow showed IBM’s very real quantum computer (with a very small number of qubits) that can be accessed through the cloud by anyone. They even built a user interface to allow users to code their own quantum codes. Jungsang Kim of IonQ was equally optimistic, showing off their trapped-atom quantum computer with dozens of trapped ions acting as individual qubits. Admittedly Chow and Kim have vested interests in their own products, but the technology is certainly impressive. One of the sharpest critics, Mikhail Lukin of Harvard, was surprisingly also one of the most optimistic. He made clear that talk of scalable quantum computers in the near future is nonsense. Yet he is part of a Harvard-MIT collaboration that has constructed a 51-qubit array of trapped atoms that sets a world record. Although it cannot be used for quantum computing, it was used to simulate a complex many-body physics problem, and it found an answer that could not be calculated or predicted using conventional computers.
The panel did come to a general consensus about quantum computing that highlights the specific challenges that the field will face as it is called upon to deliver on its hyperbole. They each echoed an idea known as the “supremacy plot” which is a two-axis graph of number of qubits and number of operations (also called circuit depth). The graph has one region that is not interesting, one region that is downright laughable (at the moment), and one final area of great hope. The region of no interest lies in the range of large numbers of qubits but low numbers of operations, or large numbers of operations on a small number of qubits. Each of these extremes can easily be calculated on conventional computers and hence is of no practical interest. The region that is laughable is the area of large numbers of qubits and large numbers of operations. No one suggested that this area can be accessed in even the next 10 years. The region that everyone is eager to reach is the region of “quantum supremacy”. This consists of quantum computers that have enough qubits and enough operations that they cannot be simulated by classical computers. When asked where this region is, the panel consensus was that it would require more than 50 qubits and more than hundreds or thousands of operations. What makes this so exciting is that there are real technologies that are now approaching this region–and they are based on light.
The Quantum Supremacy Chart: Plot of the number of Qbits and the circuit depth (number of operations or gates) in a quantum computer. The red region (“Zzzzzzz”) is where classical computers can do as well. The purple region (“Ha Ha Ha”) is a dream. The middle region (“Wow”) is the region of hope, which may soon be reached by trapped atoms and optics.
Chris Monroe’s Perfect Qubits
The second plenary session at CLEO featured the recent Nobel prize winners Art Ashkin, Donna Strickland and Gerard Mourou who won the 2018 Nobel prize in physics for laser applications. (Donna Strickland is only the third woman to win the Nobel prize in physics.) The warm-up band for these headliners was Chris Monroe, founder of the start-up company IonQ out of the University of Maryland. Monroe outlined the general layout of their quantum computer, which is based on trapped atoms that he called “perfect qubits”. Each trapped atom is literally an atomic clock with the kind of exact precision that atomic clocks come with. The quantum properties of these atoms are as perfect as is needed for any quantum computation, and the limits on the performance of the current IonQ system are entirely caused by the classical controls that trap and manipulate the atoms. This is where the efforts of their rapidly growing R&D team are focused.
If trapped atoms are the perfect qubit, then the perfect quantum communication channel is the photon. The photon in vacuum is the quintessential messenger, propagating forever and interacting with nothing. This is why experimental cosmologists can see the photons originating from the Big Bang 13 billion years ago (actually from a few hundred thousand years after the Big Bang when the Universe became transparent). In a quantum computer based on trapped atoms as the gates, photons become the perfect wires.
On the quantum supremacy chart, Monroe plotted the two main quantum computing technologies: solid state (based mainly on superconductors but also some semiconductor technology) and trapped atoms. The challenges to solid state quantum computers come with the scale-up to the range of 50 qubits or more that will be needed to cross the frontier into quantum supremacy. The inhomogeneous nature of solid state fabrication, as perfected as it is for the transistor, is a central problem for a solid state solution to quantum computing. Furthermore, by scaling up the number of solid state qubits, it is extremely difficult to simultaneously increase the circuit depth. In fact, circuit depth is likely to decrease (initially) as the number of qubits rises because of the two-dimensional interconnect problem that is well known to circuit designers. Trapped atoms, on the other hand, have the advantages of the perfection of atomic clocks that can be globally interconnected through perfect photon channels, and scaling up the number of qubits can go together with increased circuit depth–at least in the view of Monroe, who admittedly has a vested interest. But he was speaking before an audience of several thousand highly-trained and highly-critical optics specialists, and no scientist in front of such an audience will make a claim that cannot be supported (although the reality is always in the caveats).
The Future of Quantum Computing is Optics
The state of the art of the photonic control of light equals the levels of sophistication of electronic control of the electron in circuits. Each is driven by big-world applications: electronics by the consumer electronics and computer market, and photonics by the telecom industry. Having a technology attached to a major world-wide market is a guarantee that progress is made relatively quickly with the advantages of economy of scale. The commercial driver is profits, and the driver for funding agencies (who support quantum computing) is their mandate to foster competitive national economies that create jobs and improve standards of living.
The yearly CLEO conference is one of the top conferences in laser science in the world, drawing in thousands of laser scientists who are working on photonic control. Integrated optics is one of the current hot topics. It brings many of the resources of the electronics industry to bear on photonics. Solid state optics is mostly concerned with quantum properties of matter and its interaction with photons, and this year’s CLEO conference hosted many focused sessions on quantum sensors, quantum control, quantum information and quantum communication. The level of external control of quantum systems is increasing at a spectacular rate. Sitting in the audience at CLEO you get the sense that you are looking at the embryonic stages of vast new technologies that will be enlisted in the near future for quantum computing. The challenge is, there are so many variants that it is hard to know which of these nascent technologies will win and change the world. But the key to technological progress is diversity (as it is for society), because it is the interplay and cross-fertilization among the diverse technologies that drives each forward, and even technologies that recede away still contribute to the advances of the winning technology.
The expert panel at CLEO on the future of quantum computing punctuated their moments of hype with moments of realism as they called for new technologies to solve some of the current barriers to quantum computers. Walking out of the panel discussion that night, and walking into one of the CLEO technical sessions the next day, you could almost connect the dots. The enabling technologies being requested by the panel are literally being built by the audience.
In the end, the panel had a surprisingly prosaic argument in favor of the current push to build a working quantum computer. It is an echo of the movie Field of Dreams, with the famous quote “If you build it they will come”. That was the plea made by Lukin, who argued that by putting quantum computers into the hands of users, then the killer app that will drive the future economics of quantum computers likely will emerge. You don’t really know what to do with a quantum computer until you have one.
Given the “perfect qubits” of trapped atoms, and the “perfect photons” of the communication channels, combined with the dizzying assortment of quantum control technologies being invented and highlighted at CLEO, it is easy to believe that the first large-scale quantum computers will be based on light.
Bistability, bifurcation and hysteresis are ubiquitous phenomena that arise from nonlinear dynamics and have considerable importance for technology applications. For instance, the hysteresis associated with the flipping of magnetic domains under magnetic fields is the central mechanism for magnetic memory, and bistability is a key feature of switching technology.
… one of the most commonly encountered bifurcations is called a saddle-node bifurcation, which is the bifurcation that occurs in the biased double-well potential.
One of the simplest models for bistability and hysteresis is the one-dimensional double-well potential biased by a changing linear potential. An example of a double-well potential with a bias is
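A representative choice, consistent with the force terms in the Python code at the end of this post (the exact coefficients are an assumption), is

$$ U(x) = \frac{1}{4}x^{4} - x^{2} - c\,x $$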
where the parameter c is a control parameter (bias) that can be adjusted or that changes slowly in time c(t). This dynamical system is also known as the Duffing oscillator. The net double-well potentials for several values of the control parameter c are shown in Fig. 1. With no bias, there are two degenerate energy minima. As c is made negative, the left well has the lowest energy, and as c is made positive the right well has the lowest energy.
The dynamics of this potential energy profile can be understood by imagining a small ball that responds to the local forces exerted by the potential. For large negative values of c the ball will have its minimum energy in the left well. As c is increased, the energy of the left well increases, and rises above the energy of the right well. If the ball began in the left well, even when the left well has a higher energy than the right, there is a potential barrier that the ball cannot overcome and it remains on the left. This local minimum is a stable equilibrium, but it is called “metastable” because it is not a global minimum of the system. Metastability is the origin of hysteresis.
Fig. 1 A biased double-well potential in one dimension. The thresholds to destroy the local metastable minima are c = +/-1.05. For values beyond threshold, only a single minimum exists with no barrier. Hysteresis is caused by the mass being stuck in the metastable (upper) minimum because it has insufficient energy to overcome the potential barrier, until the barrier disappears at threshold and the ball rolls all the way down to the bottom to the new location. When the bias is slowly reversed, the new location becomes metastable, until the ball can overcome the barrier and roll down to its original minimum, etc.
Once sufficient bias is applied that the local minimum disappears, the ball will roll downhill to the new minimum on the right, and in the presence of dissipation, it will come to rest in the new minimum. The bias can then be slowly lowered, reversing this process. Because of the potential barrier, the bias must change sign and be strong enough to remove the stability of the now metastable fixed point with the ball on the right, allowing the ball to roll back down to its original location on the left. This “overshoot” defines the extent of the hysteresis. The fact that there are two minima, and that one is metastable with a barrier between the two, produces “bistability”, meaning that there are two stable fixed points for the same control parameter.
For illustration, assume a mass obeys a flow equation that includes a damping term, where the force is the negative gradient of the potential energy. The bias parameter c can be time dependent, beginning beyond the negative threshold and slowly increasing until it exceeds the positive threshold, and then reversing and decreasing again. The position of the mass is locally a damped oscillator until a threshold is passed, and then the mass falls into the global minimum, as shown in Fig. 2. As the bias is reversed, it remains in the metastable minimum on the right until the control parameter passes threshold, and then the mass drops into the left minimum that is now a global minimum.
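With the representative potential above, a minimal form of this flow (the damping coefficient γ = 0.5 is taken from the code, which also adds a small extra driving term) is

$$ \ddot{x} = -\gamma\,\dot{x} - \frac{\partial U}{\partial x} = -\gamma\,\dot{x} - x^{3} + 2x + c(t) $$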
Fig. 2 Hysteresis diagram. The mass begins in the left well. As the parameter c increases, the mass remains in the well, even though it is no longer the global minimum when c becomes positive. When c passes the positive threshold (around 1.05 for this example), the mass falls into the right well, with damped oscillation. Then the control parameter c is decreased slowly until the negative threshold is passed, and the mass switches to the left well with damped oscillations. The difference between the “switch up” and “switch down” values of the control parameter represents the “hysteresis” of this system.
The sudden switching of the biased double-well potential represents what is known as a “bifurcation”. A bifurcation is a sudden change in the qualitative behavior of a system caused by a small change in a control variable. Usually, a bifurcation occurs when the number of attractors of a system changes. There is a fairly large menagerie of different types of bifurcations, but one of the most commonly encountered bifurcations is called a saddle-node bifurcation, which is the bifurcation that occurs in the biased double-well potential. In fact, there are two saddle-node bifurcations.
Bifurcations are easily portrayed by creating a joint space that combines the phase space with the one (or more) control parameters that induce the bifurcation. The phase space of the double well is two dimensional (position, velocity) with three fixed points, but the change in the number of fixed points can be captured by taking a projection of the phase space onto a lower-dimensional manifold. In this case, the projection is simply along the x-axis. Therefore a “co-dimension phase space” can be constructed with the x-axis as one dimension and the control parameter as the other. This is illustrated in Fig. 3. The cubic curve traces out the solutions to the fixed-point equation.
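For the potential assumed above, the fixed points satisfy y = 0 together with the cubic

-x^{3} + 2x + c = 0,

which is the same cubic whose roots the code below extracts with np.roots([-1, 0, 2, c]).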
For a given value of the control parameter c there are either three solutions or one solution. The value of c at which the number of solutions changes discontinuously is the bifurcation point c*. Two examples of the potential function are shown on the right for c = +1 and c = -0.5, showing the locations of the three fixed points.
Fig. 3 The co-dimension phase space combines the one-dimensional dynamics along the position x with the control parameter. For a given value of c, there are either three fixed-point solutions or only one. When there are three solutions, two are stable (the double minima) and one is unstable (the saddle). As the magnitude of the bias increases, one stable node annihilates with the unstable node (a minimum and the saddle merge) and the dynamics “switch” to the other minimum.
The threshold value in this example is c* = 1.05. When |c| < c* the two stable fixed points are the two minima of the double-well potential, and the unstable fixed point is the saddle between them. When |c| > c* then the single stable fixed point is the single minimum of the potential function. The saddle-node bifurcation takes its name from the fact (illustrated here) that the unstable fixed point is a saddle, and at the bifurcation the saddle point annihilates with one of the stable fixed points.
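A compact way to see where the thresholds come from: writing f(x) = -x^3 + 2x + c for the force along x (as assumed above), a saddle-node bifurcation occurs where a stable and an unstable fixed point merge, i.e., where

f(x^{*}) = 0 \quad \text{and} \quad f'(x^{*}) = -3x^{*2} + 2 = 0

hold simultaneously, so that the cubic develops a double root. This double-root condition fixes the two threshold values ±c*.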
The following Python code illustrates the behavior of a biased double-well potential, with damping, in which the control parameter changes slowly with a sinusoidal time dependence.
Python Code: DWH.py
#!/usr/bin/env python3
# -*- coding: utf-8 -*-
"""
DWH.py
Created on Wed Apr 17 15:53:42 2019
@author: nolte
D. D. Nolte, Introduction to Modern Dynamics: Chaos, Networks, Space and Time, 2nd ed. (Oxford,2019)
"""
import numpy as np
from scipy import integrate
from scipy import signal
from matplotlib import pyplot as plt
plt.close('all')
T = 400
Amp = 3.5
def solve_flow(y0,c0,lim = [-3,3,-3,3]):

    def flow_deriv(x_y, t, c0):
        """Time-derivative of the damped double-well system with a slowly swept bias c(t)."""
        # c0 is unused here; the swept bias enters through the explicit time dependence below
        x, y = x_y
        return [y,-0.5*y - x**3 + 2*x + x*(2*np.pi/T)*Amp*np.cos(2*np.pi*t/T) + Amp*np.sin(2*np.pi*t/T)]

    # let transients settle over one sweep period before recording the trajectory
    tsettle = np.linspace(0,T,101)
    yinit = y0
    x_tsettle = integrate.odeint(flow_deriv,yinit,tsettle,args=(T,))
    y0 = x_tsettle[100,:]

    t = np.linspace(0, 1.5*T, 2001)
    x_t = integrate.odeint(flow_deriv, y0, t, args=(T,))
    c = Amp*np.sin(2*np.pi*t/T)     # the slowly swept control parameter
    return t, x_t, c

# sweep the bias and track the three roots of the fixed-point cubic -x^3 + 2x + c = 0
eps = 0.0001
xc = np.zeros(100)
X = np.zeros(100)
Y = np.zeros(100)
Z = np.zeros(100)
for loop in range(0,100):
    c = -1.2 + 2.4*loop/100 + eps
    xc[loop] = c
    coeff = [-1, 0, 2, c]
    y = np.roots(coeff)
    xtmp = np.real(y[0])
    ytmp = np.real(y[1])
    X[loop] = np.min([xtmp,ytmp])
    Y[loop] = np.max([xtmp,ytmp])
    Z[loop] = np.real(y[2])
plt.figure(1)
lines = plt.plot(xc,X,xc,Y,xc,Z)
plt.setp(lines, linewidth=0.5)
plt.show()
plt.title('Roots')
y0 = [1.9, 0]
c0 = -2.
t, x_t, c = solve_flow(y0,c0)
y1 = x_t[:,0]
y2 = x_t[:,1]
plt.figure(2)
lines = plt.plot(t,y1)
plt.setp(lines, linewidth=0.5)
plt.show()
plt.ylabel('X Position')
plt.xlabel('Time')
plt.figure(3)
lines = plt.plot(c,y1)
plt.setp(lines, linewidth=0.5)
plt.show()
plt.ylabel('X Position')
plt.xlabel('Control Parameter')
plt.title('Hysteresis Figure')
This blog post is a companion to the undergraduate physics textbook Introduction to Modern Dynamics: Chaos, Networks, Space and Time, 2nd ed. (Oxford, 2019), introducing Lagrangians and Hamiltonians, chaos theory, complex systems, synchronization, neural networks, econophysics and Special and General Relativity.
In the fall semester of 1947, a brilliant young British mathematician arrived at Cornell University to begin a yearlong fellowship paid by the British Commonwealth. Freeman Dyson (1923 –) had received an undergraduate degree in mathematics from Cambridge University and was considered to be one of their brightest graduates. With strong recommendations, he arrived to work with Hans Bethe on quantum electrodynamics. He made rapid progress on a relativistic model of the Lamb shift, inadvertently intimidating many of his fellow graduate students with his mathematical prowess. On the other hand, someone who intimidated him was Richard Feynman.
Initially, Dyson considered Feynman to be a bit of a buffoon and slacker, but he started to notice that Feynman could calculate QED problems in a few lines that took him pages.
Freeman Dyson at Princeton in 1972.
I think, like most science/geek types, my first introduction to the unfettered mind of Freeman Dyson was through the science fiction novel Ringworld by Larry Niven. The Dyson ring, or Dyson sphere, was conceived by Dyson when he was thinking about the ultimate fate of civilizations and their increasing need for energy. The greatest source of energy on a stellar scale is of course a star, and Dyson envisioned an advanced civilization capturing all that emitted stellar energy by building a solar collector with a radius the size of a planetary orbit. He published the paper “Search for Artificial Stellar Sources of Infra-Red Radiation” in the prestigious magazine Science in 1960. The practicality of such a scheme has to be seriously questioned, but it is a classic example of how easily he thought outside the box, taking simple principles and extrapolating them to extreme consequences until the box looks like a speck of dust. I got a first-hand chance to see his way of thinking when he gave a physics colloquium at Cornell University in 1980 when I was an undergraduate there. Hans Bethe still had his office at that time in the Newman Laboratory. I remember walking by and looking into his office and catching a glance of him editing a paper at his desk. The topic of Dyson’s talk was the fate of life in the long-term evolution of the universe. His arguments were so simple they could not be refuted, yet the consequences for the way life would need to evolve in extreme time were unimaginable … it was a bizarre and mind-blowing experience for me as an undergrad … and an example of the strange worlds that can be imagined through simple physics principles.
Initially, as Dyson settled into his life at Cornell under Bethe, he considered Feynman to be a bit of a buffoon and slacker, but he started to notice that Feynman could calculate QED problems in a few lines that took him pages. Dyson paid closer attention to Feynman, eventually spending more of his time with him than with Bethe, and realized that Feynman had invented an entirely new way of calculating quantum effects that used cartoons as a form of bookkeeping to reduce the complexity of many calculations. Dyson still did not fully understand how Feynman was doing it, but knew that Feynman’s approach was giving all the right answers. Around that time, he also began to read about Schwinger’s field-theory approach to QED, following Schwinger’s approach as far as he could, but always coming away with the feeling that it was too complicated and required too much math—even for him!
Road Trip Across America
That summer, Dyson had time to explore America for the first time because Bethe had gone on an extended trip to Europe. It turned out that Feynman was driving his car to New Mexico to patch things up with an old flame from his Los Alamos days, so Dyson was happy to tag along. For days, as they drove across the US, they talked about life and physics and QED. Dyson had Feynman all to himself and began to see daylight in Feynman’s approach, and to understand that it might be consistent with Schwinger’s and Tomonaga’s field theory approach. After leaving Feynman in New Mexico, he travelled to the University of Michigan where Schwinger gave a short course on QED, and he was able to dig deeper, talking with him frequently between lectures.
At the end of the summer, it had been arranged that he would spend the second year of his fellowship at the Institute for Advanced Study in Princeton where Oppenheimer was the new head. As a final lark before beginning that new phase of his studies he spent a week at Berkeley. The visit there was uneventful, and he did not find the same kind of open camaraderie that he had found with Bethe in the Newman Laboratory at Cornell, but it left him time to think. And the more he thought about Schwinger and Feynman, the more convinced he became that the two were equivalent. On the long bus ride back east from Berkeley, as he half dozed and half looked out the window, he had an epiphany. He saw all at once how to draw the map from one to the other. What was more, he realized that many of Feynman’s techniques were much simpler than Schwinger’s, which would significantly simplify lengthy calculations. By the time he arrived in Chicago, he was ready to write it all down, and by the time he arrived in Princeton, he was ready to publish. It took him only a few weeks to do it, working with an intensity that he had never experienced before. When he was done, he sent the paper off to the Physical Review[1].
Dyson knew that he had achieved something significant even though he was essentially just a second-year graduate student, at least from the point of view of the American post-graduate system. Cambridge was a little different, and Dyson’s degree there was more than the standard bachelor’s degree here. Nonetheless, he was now under the auspices of the Institute for Advanced Study, where Einstein had his office, and he had sent off an unsupervised manuscript for publication without any imprimatur from the powers that be. The specific power that mattered most was Oppenheimer, who arrived a few days after Dyson had submitted his manuscript. When he greeted Oppenheimer, he was excited and pleased to hand him a copy. Oppenheimer, on the other hand, was neither excited nor pleased to receive it. Oppenheimer had formed a particularly bad opinion of Feynman’s form of QED at the conference held in the Poconos (to read about Feynman’s disaster at the Poconos conference, see my blog) half-a-year earlier and did not think that this brash young grad student could save it. Dyson, on his part, was taken aback. No one who has ever met Dyson would ever call him brash, but in this case he fought for a higher cause, writing a bold memo to Oppenheimer—that terrifying giant of a personality—outlining the importance of the Feynman theory.
Battle for the Heart of Quantum Field Theory
Oppenheimer decided to give Dyson a chance, and arranged for a series of seminars where Dyson could present the story to the assembled theory group at the Institute, but Dyson could make little headway. Every time he began to make progress, Oppenheimer would bring it crashing to a halt with scathing questions and criticisms. This went on for weeks, until Bethe visited from Cornell. Bethe by then was working with the Feynman formalism himself. As Bethe lectured in front of Oppenheimer, he seeded his talk with statements such as “surely they had all seen this from Dyson”, and Dyson took the opportunity to pipe up that he had not been allowed to get that far. After Bethe left, Oppenheimer relented, arranging for Dyson to give three seminars in one week. The seminars each went on for hours, but finally Dyson got to the end of it. The audience shuffled out of the seminar room with no energy left for discussions or arguments. Later that day, Dyson found a note in his box from Oppenheimer saying “Nolo Contendere”—Dyson had won!
With that victory under his belt, Dyson was in a position to communicate the new methods to a small army of postdocs at the Institute, supervising their progress on many outstanding problems in quantum electrodynamics that had resisted calculations using the complicated Schwinger-Tomonaga theory. Feynman, by this time, had finally published two substantial papers on his approach[2], which added to the foundation that Dyson was building at Princeton. Although Feynman continued to work for a year or two on QED problems, the center of gravity for these problems shifted solidly to the Institute for Advanced Study and to Dyson. The army of postdocs that Dyson supervised helped establish the use of Feynman diagrams in QED, calculating ever higher-order corrections to electromagnetic interactions. These same postdocs were among the first batch of wartime-trained theorists to move into faculty positions across the US, bringing the method of Feynman diagrams with them, adding to the rapid dissemination of Feynman diagrams into many aspects of theoretical physics that extend far beyond QED [3].
As a graduate student at Berkeley in the 1980’s I ran across a very simple-looking equation called “the Dyson equation” in our graduate textbook on relativistic quantum mechanics by Sakurai. The Dyson equation is the extraordinarily simple expression of an infinite series of Feynman diagrams that describes how an electron interacts with itself through the emission of virtual photons that link to virtual electron-positron pairs. This process leads to the propagator Green’s function for the electron and is the starting point for including the simple electron in more complex particle interactions.
The Dyson equation for the single-electron Green’s function represented as an infinite series of Feynman diagrams.
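Written compactly (standard modern notation, not a reproduction of the figure), the diagrammatic series sums to

G = G_{0} + G_{0}\,\Sigma\,G \qquad \Longleftrightarrow \qquad G = \left(G_{0}^{-1} - \Sigma\right)^{-1},

where G_0 is the free-electron propagator and Σ is the self-energy built from the Feynman diagrams.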
I had no feel for the use of the Dyson equation, barely limping through relativistic quantum mechanics, until a few years later when I was working at Lawrence Berkeley Lab with Mirek Hamera, a visiting scientist from Warsaw, Poland, who introduced me to the Haldane-Anderson model that applied to a project I was working on for my PhD. Using the theory, with Dyson’s equation at its heart, we were able to show that tightly bound electrons on transition-metal impurities in semiconductors acted as internal reference levels that allowed us to measure internal properties of semiconductors that had never been accessible before. A few years later, I used Dyson’s equation again when I was working on small precipitates of arsenic in the semiconductor GaAs, using the theory to describe an accordion-like ladder of electron states that can occur within the semiconductor bandgap when a nano-sphere takes on multiple charges [4].
The Coulomb ladder of deep energy states of a nano-sphere in GaAs calculated using self-energy principles first studied by Dyson.
I last saw Dyson when he gave the Hubert James Memorial Lecture at Purdue University in 1996. The title of his talk was “How the Dinosaurs Might Have Been Saved: Detection and Deflection of Earth-Impacting Bodies”. As always, his talk was wild and wide ranging, using the simplest possible physics to derive the most dire consequences of our continued existence on this planet.
[1] Dyson, F. J. (1949). “The radiation theories of Tomonaga, Schwinger, and Feynman.” Physical Review 75(3): 486-502.
[2] Feynman, R. P. (1949). “The theory of positrons.” Physical Review 76(6): 749-759.; Feynman, R. P. (1949). “Space-time approach to quantum electrodynamics.” Physical Review 76(6): 769-789.
[3] Kaiser, D., K. Ito and K. Hall (2004). “Spreading the tools of theory: Feynman diagrams in the USA, Japan, and the Soviet Union.” Social Studies of Science 34(6): 879-922.
Although coal and steam launched the industrial revolution, gasoline and controlled explosions have sustained it for over a century. After early precursors, the internal combustion engine that we recognize today came to life in 1876 from the German engineers Otto and Daimler with later variations by Benz and Diesel. In the early 20th century, the gasoline engine was replacing coal and oil in virtually all mobile conveyances and had become a major industry attracting the top mechanical engineering talent. One of those talents was the German engineer Georg Duffing (1861 – 1944) whose unlikely side interest in the quantum mechanics revolution brought him to Berlin to hear lectures by Max Planck, where he launched his own revolution in nonlinear oscillators.
The publication of this highly academic book by a nonacademic would establish Duffing as the originator of one of the most iconic oscillators in modern dynamics.
An Academic Non-Academic
Georg Duffing was born in 1861 in the German town of Waldshut on the border with Switzerland north of Zurich. Within a year the family moved to Mannheim near Heidelberg where Georg received a good education in mathematics as well as music. His mathematical interests attracted him to engineering, and he built a reputation that led to an invitation to work at Westinghouse in the United States in 1910. When he returned to Germany he set himself up as a consultant and inventor with the freedom to move where he wished. In early 1913 he wished to move to Berlin where Max Planck was lecturing on the new quantum mechanics at the University. He was always searching for new knowledge, and sitting in on Planck’s lectures must have made him feel like he was witnessing the beginnings of a new era.
At that time Duffing was interested in problems related to brakes, gears and engines. In particular, he had become fascinated by vibrations that often were the limiting factors in engine performance. He stripped the problem of engine vibration down to its simplest form, and he began a careful and systematic study of nonlinear oscillations. While in Berlin, he had become acquainted with Prof. Meyer at the University who had a mechanical engineering laboratory. Meyer let Duffing perform his experiments in the lab on the weekends, sometimes accompanied by his eldest daughter. By 1917 he had compiled a systematic investigation of various nonlinear effects in oscillators and had written a manuscript that collected all of this theoretical and experimental work. He extended this into a small book that he published with Vieweg & Sohn in 1918 to be purchased for a price of 5 Deutsch Marks [1]. The publication of this highly academic book by a nonacademic would establish Duffing as the originator of one of the most iconic oscillators in modern dynamics.
Fig. 1 Cover of Duffing’s 1918 publication on nonlinear oscillators.
Duffing’s Nonlinear Oscillator
The mathematical and technical focus of Duffing’s book was low-order nonlinear corrections to the linear harmonic oscillator. In one case, he considered a spring that either became stiffer or softer as it stretched. This happens when a cubic term is added to the usual linear Hooke’s law. In another case, he considered a spring that was stiffer in one direction than another, making the stiffness asymmetric. This happens when a quadratic term is added. These terms are shown in Fig. 2 from Duffing’s book. The top equation is a free oscillation, and the bottom equation has a harmonic forcing function. These were the central equations that Duffing explored, plus the addition of damping that he considered in a later chapter as shown in Fig. 3. The book lays out systematically, chapter by chapter, approximate and series solutions to the nonlinear equations, and in special cases describes analytically exact solutions (such as for the nonlinear pendulum).
Fig. 2 Duffing’s equations without damping for free oscillation and driven oscillation with quadratic (producing an asymmetric potential) and cubic (producing stiffening or softening) corrections to the spring force.
Fig. 3 Inclusion of damping in the case with cubic corrections to the spring force.
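In modern notation (a sketch in symbols of my own choosing, not Duffing’s), the equations of Figs. 2 and 3 take a form like

\ddot{x} + 2\varepsilon\,\dot{x} + \omega_{0}^{2}\,x + a\,x^{2} + b\,x^{3} = F\cos\Omega t,

with the quadratic coefficient a producing the asymmetric restoring force, the cubic coefficient b the stiffening or softening, ε the damping added in Fig. 3, and F = 0 for the free oscillation.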
Duffing was a practical engineer as well as a mathematical one, and he built experimental systems to test his solutions. An engineering drawing of his experimental test apparatus is shown in Fig. 4. The small test pendulum is at S in the figure. The large pendulum at B is the drive pendulum, chosen to be much heavier than the test pendulum so that it can deliver a steady harmonic force through spring F1 to the test system. The cubic nonlinearity of the test system was controlled through the choice of the length of the test pendulum, and the quadratic nonlinearity (the asymmetry) was controlled by allowing the equilibrium angle to be shifted from vertical. The relative strength of the quadratic and cubic terms was adjusted by changing the position of the mass at G. Duffing derived expressions for all the coefficients of the equations in Fig. 2 in terms of experimentally-controlled variables. Using this apparatus, Duffing verified to good accuracy his solutions for various special cases.
Fig. 4 Duffing’s experimental system he used to explore and verify his equations and solutions.
Duffing’s book is a masterpiece of careful systematic investigation, beginning in general terms, and then breaking the problem down into its special cases, finding solutions for each one with accurate experimental verifications. These attributes established the importance of this little booklet in the history of science and technology, but because it was written in German, most of the early citations were by German scientists. The first use of Duffing’s name in association with the nonlinear oscillator problem occurred in 1928 [2], and the first reference to him in an English-language work appeared in a book by Timoshenko [3]. The first use of the phrase “Duffing Equation” specifically to describe an oscillator with a linear and cubic restoring force was in 1942 in a series of lectures presented at Brown University [4], and this nomenclature had become established by the end of that decade [5]. Although Duffing had devoted considerable attention in his book to the quadratic term for an asymmetric oscillator, the term “Duffing Equation” now refers to the stiffening and softening problem rather than to the asymmetric problem.
Fig. 5 The Duffing equation is generally expressed as a harmonic oscillator (first three terms plus the harmonic drive) modified by a cubic nonlinearity and driven harmonically.
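A standard rendering of this canonical form (symbol conventions vary from author to author) is

\ddot{x} + \delta\,\dot{x} + \alpha\,x + \beta\,x^{3} = \gamma\cos(\omega t),

which corresponds to motion in the potential V(x) = ½αx² + ¼βx⁴ plus damping and drive. Note that the Python code below uses the same alpha and beta, but swaps the remaining symbols, with delta as the drive amplitude and gam as the damping.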
Duffing Rediscovered
Nonlinear oscillations remained mainly in the realm of engineering for nearly half a century, until a broad spectrum of physical scientists began to discover deep secrets hiding behind the simple equations. In 1963 Edward Lorenz (1917 – 2008) of MIT published a paper that showed how simple nonlinearities in three equations describing the atmosphere could produce a deterministic behavior that appeared to be completely chaotic. News of this paper spread as researchers in many seemingly unrelated fields began to see similar signatures in chemical reactions, turbulence, electric circuits and mechanical oscillators. By 1972, when Lorenz was invited to give a talk on the “Butterfly Effect”, the science of chaos was emerging as a new frontier in physics, and in 1975 it was given its name “chaos theory” by James Yorke (1941 – ). By 1976 it had become one of the hottest new areas of science.
Through the period of the emergence of chaos theory, the Duffing oscillator was known to be one of the archetypical nonlinear oscillators. A particularly attractive aspect of the general Duffing equations is the possibility of studying a “double-well” potential. This happens when the “alpha” in the equation in Fig. 5 is negative and the “beta” is positive. The double-well potential has a long history in physics, both classical and modern, because it represents a “two-state” system that exhibits bistability, bifurcations, and hysteresis. For a fixed “beta” the potential energy as a function of “alpha” is shown in Fig. 6. The bifurcation cascades of the double-well Duffing equation were investigated by Philip Holmes (1945 – ) in 1976 [6], and the properties of the strange attractor were demonstrated in 1978 [7] by Yoshisuke Ueda (1936 – ). Holmes, and others, continued to do detailed work on the chaotic properties of the Duffing oscillator, helping to make it one of the most iconic systems of chaos theory.
Fig. 6 Potential energy of the Duffing Oscillator. The position variable is x, and changing alpha is along the other axis. For positive beta and alpha the potential is a quartic. For positive beta and negative alpha the potential is a double well.
Python Code for the Duffing Oscillator: Duffing.py
This Python code uses the simple ODE solver on the driven-damped Duffing double-well oscillator to display the configuration-space trajectories and the Poincaré map of the strange attractor. (Python code on GitHub.)
#!/usr/bin/env python3
# -*- coding: utf-8 -*-
"""
Duffing.py
Created on Wed May 21 06:03:32 2018
@author: nolte
D. D. Nolte, Introduction to Modern Dynamics: Chaos, Networks, Space and Time, 2nd ed. (Oxford,2019)
"""
import numpy as np
import matplotlib as mpl
from mpl_toolkits.mplot3d import Axes3D
from scipy import integrate
from matplotlib import pyplot as plt
from matplotlib import cm
import time
import os
plt.close('all')
# model_case 1 = Pendulum
# model_case 2 = Double Well
print(' ')
print('Duffing.py')
alpha = -1 # -1
beta = 1 # 1
delta = 0.3 # 0.3
gam = 0.15 # 0.15
w = 1
def flow_deriv(x_y_z,tspan):
    # driven, damped double-well Duffing flow; the third variable z advances with the drive phase
    x, y, z = x_y_z
    a = y
    b = delta*np.cos(w*tspan) - alpha*x - beta*x**3 - gam*y
    c = w
    return [a,b,c]
T = 2*np.pi/w
# random initial condition (plain scalars so that odeint receives a flat state vector)
px1 = np.random.rand()
xp1 = np.random.rand()
w1 = 0
x_y_z = [xp1, px1, w1]
# Settle-down Solve for the trajectories
t = np.linspace(0, 2000, 40000)
x_t = integrate.odeint(flow_deriv, x_y_z, t)
x0 = x_t[39999,0:3]
tspan = np.linspace(1,20000,400000)
x_t = integrate.odeint(flow_deriv, x0, tspan)
siztmp = np.shape(x_t)
siz = siztmp[0]
y1 = x_t[:,0]
y2 = x_t[:,1]
y3 = x_t[:,2]
plt.figure(2)
lines = plt.plot(y1[1:2000],y2[1:2000],'ko',ms=1)
plt.setp(lines, linewidth=0.5)
plt.show()
for cloop in range(0,3):
    #phase = np.random.rand(1)*np.pi
    phase = np.pi*cloop/3

    repnum = 5000
    px = np.zeros(shape=(2*repnum,))
    xvar = np.zeros(shape=(2*repnum,))
    cnt = -1
    testwt = np.mod(tspan-phase,T) - 0.5*T
    last = testwt[1]

    # stroboscopic (Poincare) section: record (x, p) each time the drive phase
    # crosses the chosen value, interpolating linearly between time steps
    for loop in range(2,siz):
        if (last < 0) and (testwt[loop] > 0):
            cnt = cnt+1
            del1 = -testwt[loop-1]/(testwt[loop] - testwt[loop-1])
            px[cnt] = (y2[loop]-y2[loop-1])*del1 + y2[loop-1]
            xvar[cnt] = (y1[loop]-y1[loop-1])*del1 + y1[loop-1]
            last = testwt[loop]
        else:
            last = testwt[loop]

    # plot only the points that were actually recorded
    plt.figure(3)
    if cloop == 0:
        lines = plt.plot(xvar[:cnt+1],px[:cnt+1],'bo',ms=1)
    elif cloop == 1:
        lines = plt.plot(xvar[:cnt+1],px[:cnt+1],'go',ms=1)
    else:
        lines = plt.plot(xvar[:cnt+1],px[:cnt+1],'ro',ms=1)
plt.show()
plt.savefig('Duffing')
Fig. 7 Strange attractor of the double-well Duffing equation for three selected phases.
[3] S. Timoshenko, Vibration Problems in Engineering, D. Van Nostrand Company, Inc., New York, 1928.
[4] K.O. Friedrichs, P. Le Corbeiller, N. Levinson, J.J. Stoker, Lectures on Non-Linear Mechanics delivered at Brown University, New York, 1942.
[5] Kovacic, I. and M. J. Brennan, Eds. The Duffing Equation: Nonlinear Oscillators and their Behavior. Chichester, United Kingdom, Wiley. (2011)
[6] Holmes, P. J. and D. A. Rand. “Bifurcations of Duffing’s Equation – Application of Catastrophe Theory.” Journal of Sound and Vibration 44(2): 237-253. (1976)
[7] Ueda, Y. “Randomly Transitional Phenomena in the System Governed by Duffing’s Equation.” Journal of Statistical Physics 20(2): 181-196. (1979)
In the years immediately following the Japanese surrender at the end of WWII, before the horror and paranoia of global nuclear war had time to sink into the psyche of the nation, atomic scientists were the rock stars of their times. Not only had they helped end the war with a decisive stroke, they were also the geniuses who were going to lead the US and the World into a bright new future of possibilities. To help kick off the new era, the powers in Washington proposed to hold a US meeting modeled on the European Solvay Congresses. The invitees would be a select group of the leading atomic physicists: invitation only! The conference was held at the Rams Head Inn on Shelter Island, at the far end of Long Island, New York in June of 1947. The two dozen scientists arrived in a motorcade with police escort and national press coverage. Richard Feynman was one of the select invitees, although he had done little fundamental work beyond his doctoral thesis with Wheeler. This would be his first real chance to expound on his path integral formulation of quantum mechanics. It was also his first conference where he was with all the big guns. Oppenheimer and Bethe were there as well as Wheeler and Kramers, von Neumann and Pauling. It was an august crowd and auspicious occasion.
Shelter Island and the Foundations of Quantum Mechanics
The topic that had been selected for the conference was Foundations of Quantum Mechanics, which at that time meant quantum electrodynamics, known as QED, a theory that was at the forefront of theoretical physics, but mired in theoretical difficulties. Specifically, it was waist deep in infinities that cropped up in calculations that went beyond the lowest order. The theorists could do back-of-the-envelope calculations with ease and arrive quickly at rough numbers that closely matched experiment, but as soon as they tried to be more accurate, results diverged, mainly because of the self-energy of the electron, which was the problem that Wheeler and Feynman had started on at the beginning of his doctoral studies [1]. As long as experiments had only limited resolution, the calculations were often good enough. But at the Shelter Island conference, Willis Lamb, a theorist-turned-experimentalist from Columbia University, announced the highest resolution atomic spectroscopy of atomic hydrogen ever attained, and there was a deep surprise in the experimental results.
An obvious photo-op at Shelter Island with, left to right: W. Lamb, Abraham Pais, John Wheeler (holding paper), Richard P. Feynman (holding pen), Herman Feshbach and Julian Schwinger.
Hydrogen, of course, is the simplest of all atoms. This was the atom that launched Bohr’s model, inspired Heisenberg’s matrix mechanics and proved Schrödinger’s wave mechanics. Deviations from the classical Bohr levels, measured experimentally, were the testing grounds for Dirac’s relativistic quantum theory that had enjoyed unparalleled success until Lamb’s presentation at Shelter Island. Lamb showed there was an exceedingly small energy splitting of about 200 parts in a billion that amounted to a wavelength of 28 cm in the microwave region of the electromagnetic spectrum. This splitting was not predicted, nor could it be described, by the formerly successful relativistic Dirac theory of the electron.
The audience was abuzz with excitement. Here was a very accurate measurement that stood ready for the theorists to test their theories on. In the discussions, Oppenheimer guessed that the splitting was likely caused by electromagnetic interactions related to the self-energy of the electron. Victor Weisskopf of MIT with Julian Schwinger of Harvard suggested that, although the total energy calculations of each level might be infinite, the difference in energy ΔE should be finite. After all, in spectroscopy it is only the energy difference that is measured experimentally. Absolute energies are not accessible directly to experiment. The trick was how to subtract one infinity from another in a consistent way to get a finite answer. Many of the discussions in the hallways, as well as many of the presentations, revolved around this question. For instance, Kramers suggested that there should be two masses in the electron theory—one is the observed electron mass seen in experiments, and the second is a type of internal or bare mass of the electron to be used in perturbation calculations.
On the train ride upstate after the Shelter Island Conference, Hans Bethe took out his pen and a sheaf of paper and started scribbling down ideas about how to use mass renormalization, subtracting infinity from infinity in a precise and consistent way to get finite answers in the QED calculations. He made surprising progress, and by the time the train pulled into the station at Schenectady he had achieved a finite calculation in reasonable agreement with the Lamb shift. Oppenheimer had been right that the Lamb shift was electromagnetic in origin, and the suggestion by Weisskopf and Schwinger that the energy difference would be finite was indeed the correct approach. Bethe was thrilled with his own progress and quickly wrote up a paper draft and sent a copy in letters to Oppenheimer and Weisskopf [2]. Oppenheimer’s reply was gracious, but Weisskopf initially bristled because he also had tried the calculations after the conference, but had failed where Bethe had succeeded. On the other hand, both pointed out to Bethe that his calculation was non-relativistic, and that a relativistic calculation was still needed.
When Bethe returned to Cornell, he told Feynman about the success of his calculations but that a relativistic version was still missing. Feynman told him on the spot that he knew how to do it and that he would have it the next day. Feynman’s optimism was based on the new approach to relativistic quantum electrodynamics that he had been developing with the aid of his newly-invented “Feynman Diagrams”. Despite his optimism, he hit a snag that evening as he tried to calculate the self-energy of the electron. When he met with Bethe the next day, they both tried to reconcile the calculations with Feynman’s new approach, but they failed to find a path through the calculations that made sense. Somewhat miffed, because he knew that his approach should work, Feynman got down to work in a way that he had usually avoided (he had always liked finding the “easy” path through tough problems). Over several intense months, he began to see how it all would work out.
At the same time that Feynman was making progress on his work, word arrived at Cornell of progress being made by Julian Schwinger at Harvard. Schwinger was a mathematical prodigy like Feynman, and also like Feynman had grown up in New York City, but they came from very different neighborhoods and had very different styles. Schwinger was a formalist who pursued everything with precision and mathematical rigor. He lectured calmly without notes in flawless presentations. Feynman, on the other hand, did his physics by feel. He made intuitive guesses and checked afterwards if they were right, testing ideas through trial and error. His lectures ranged widely, with great energy, without structure, following wherever the ideas might lead. This difference in approach and style between Schwinger and Feynman would have embarrassing consequences at the upcoming sequel to the Shelter Island conference that was to be held in late March 1948 at a resort in the Pocono Mountains in Pennsylvania.
The Conference in the Poconos
The Pocono conference was poised to be for the theorists Schwinger and Feynman what the Shelter Island had been for the experimentalists Rabi and Lamb—a chance to drop bombshells. There was a palpable buzz leading up to the conference with advance word coming from Schwinger about his successful calculation of the g-factor of the electron and the Lamb shift. In addition to the attendees who had been at Shelter Island, the Pocono conference was attended by Bohr and Dirac—two of the giants who had invented quantum mechanics. Schwinger began his presentation first. He had developed a rigorous mathematical method to remove the infinities from QED, enabling him to make detailed calculations of the QED corrections—a significant achievement—but the method was terribly complicated and tedious. His presentation went on for many hours in his carefully crafted style, without notes, delivered like a speech. Even so, the audience grew restless, and whenever Schwinger tried to justify his work on physical grounds, Bohr would speak up, and arguments among the attendees would ensue, after which Schwinger would say that all would become clear at the end. Finally, he came to the end, where only Fermi and Bethe had followed him. The rest of the audience was in a daze.
Feynman was nervous. It had seemed to him that Schwinger’s talk had gone badly, despite Schwinger’s careful preparation. Furthermore, the audience was spent and not in a mood to hear anything challenging. Bethe suggested that if Feynman stuck to the math instead of the physics, then the audience might not interrupt so much. So Feynman restructured his talk in the short break before he was to begin.
Unfortunately, Feynman’s strength was in physical intuition, and although he was no slouch at math, he was guided by visualization and by trial and error. Many of the steps in his method worked (he knew this because they gave the correct answers and because he could “feel” they were correct), but he did not have all the mathematical justifications. What he did have was a completely new way of thinking about quantum electromagnetic interactions and a new way of making calculations that were far simpler and faster than Schwinger’s. The challenge was that he relied on space-time graphs in which “unphysical” things were allowed to occur, and in fact were required to occur, as part of the sum over many histories of his path integrals. For instance, a key element in the approach was allowing electrons to travel backwards in time as positrons. In addition, a process in which the electron and positron annihilate into a single photon, and then the photon decays into an electron-positron pair, is not allowed by mass and energy conservation, but this is a possible history that must add to the sum. As long as the time between the photon emission and decay is short enough to satisfy Heisenberg’s uncertainty principle, there is no violation of physics.
Feynman’s first published “Feynman Diagram” in the Physical Review (1948) [3]. (Photograph reprinted from “Galileo Unbound”, D. Nolte, Oxford University Press, 2018.)
None of this was familiar to the audience, and the talk quickly derailed. Dirac pestered him with questions that he tried to deflect, but Dirac persisted like a raven pecking at dead meat. A question was raised about the Pauli exclusion principle, about whether an orbital could have three electrons instead of the required two, and Feynman said that it could (all histories were possible and had to be summed over), an answer that dismayed the audience. Finally, as Feynman was drawing another of his space-time graphs showing electrons as lines, Bohr rose to his feet and asked whether Feynman had forgotten Heisenberg’s uncertainty principle that made it impossible to even talk about an electron trajectory. It was hopeless. Bohr had not understood that the diagrams were a shorthand notation not to be taken literally. The audience gave up and so did Feynman. The talk just fizzled out. It was a disaster.
At the close of the Pocono conference, Schwinger was the hero, and his version of QED appeared to be the right approach [4]. Oppenheimer, the reigning king of physics, former head of the successful Manhattan Project and newly selected to head the prestigious Institute for Advanced Study at Princeton, had been thoroughly impressed by Schwinger and thoroughly disappointed by Feynman. When Oppenheimer returned to Princeton, a letter was waiting for him in the mail from a colleague he knew in Japan by the name of Sin-Itiro Tomonaga [5]. In the letter, Tomonaga described work he had completed, unbeknownst to anyone in the US or Europe, on a renormalized QED. His results and approach were similar to Schwinger’s but had been accomplished independently in a virtual vacuum that surrounded Japan after the end of the war. His results cemented the Schwinger-Tomonaga approach to QED, further elevating them above the odd-ball Feynman scratchings. Oppenheimer immediately circulated the news of Tomonaga’s success to all the attendees of the Pocono conference. It appeared that Feynman was destined to be a footnote, but the prevailing winds were about to change as Feynman retreated to Cornell. In defeat, Feynman found the motivation to establish his simplified yet powerful version of quantum electrodynamics. He published his approach in 1948, a method that surpassed Schwinger and Tomonaga in conceptual clarity and ease of calculation. This work was to catapult Feynman to the pinnacles of fame, making him, next to Einstein, the physicist whose name was most recognizable to the man in the street in the latter half of the twentieth century (helped by a series of books that mythologized his exploits [6]).
For more on the history of Feynman and quantum mechanics, read Galileo Unbound from Oxford Press:
[1] See Chapter 8 “On the Quantum Footpath”, Galileo Unbound (Oxford, 2018)
[2] Schweber, S. S. QED and the Men Who Made It: Dyson, Feynman, Schwinger, and Tomonaga. Princeton, N.J.: Princeton University Press. (1994)
[3] Feynman, R. P. “Space-time Approach to Quantum Electrodynamics.” Physical Review 76(6): 769-789. (1949)
[4] Schwinger, J. “On Quantum-Electrodynamics and the Magnetic Moment of the Electron.” Physical Review 73(4): 416-417. (1948)
[5] Tomonaga, S. “On Infinite Field Reactions in Quantum Field Theory.” Physical Review 74(2): 224-225. (1948)
[6] Surely You’re Joking, Mr. Feynman!: Adventures of a Curious Character, Richard Feynman, Ralph Leighton (contributor), Edward Hutchings (editor), W. W. Norton, 1985.
Paul Adrien Maurice Dirac (1902 – 1984) was given the moniker of “the strangest man” by Niels Bohr while he was reminiscing about the many great scientists with whom he had worked over the years [1]. It is a moniker that resonates with the innumerable “Dirac stories” that abound in the mythology of the hallways of physics departments around the world. Dirac was awkward, shy, a loner, rarely said anything, was completely literal, had not the slightest comprehension of art or poetry, nor any clear understanding of human interpersonal interaction. Dirac was also brilliant, providing the theoretical foundation for the central paradigm of modern physics—quantum field theory. The discovery of the Higgs boson in 2012, a human achievement that capped nearly a century of scientific endeavor, rests solidly on the theory of quantum fields that permeate space. The Higgs particle, when it pops into existence at the Large Hadron Collider in Geneva, is a singular quantum excitation of the Higgs field, a field that usually resides in a vacuum state, frothing with quantum fluctuations that imbue all particles—and you and me—with mass. The Higgs field is Dirac’s legacy.
… all of a sudden he had a new equation with four-dimensional space-time symmetry.
Copenhagen and Bohr
Although Dirac as a young scientist was initially enthralled with relativity theory, he was working under Ralph Fowler (1889 – 1944) in the physics department at Cambridge in 1925 when he had the chance to read advanced proofs of Heisenberg’s matrix mechanics paper. This chance event launched him on his own trajectory in quantum theory. After Dirac was awarded his doctorate from Cambridge in 1926, he received a stipend that sent him to work with Niels Bohr (1885 – 1962) in Copenhagen—ground zero of the new physics. During his time there, Dirac became famous for taking long walks across Copenhagen as he played about with things in his mind, performing mental juggling of abstract symbols, envisioning how they would permute and act. His attention was focused on the electromagnetic field and how it interacted with the quantized states of atoms. Although the electromagnetic field was the classical field of light, it was also the quantum field of Einstein’s photon, and he wondered how the quantized harmonic oscillators of the electromagnetic field could be generated by quantum wavefunctions acting as operators. But acting on what? He decided that, to generate a photon, the wavefunction must operate on a state that had no photons—the ground state of the electromagnetic field known as the vacuum state.
In late 1926, nearing the end of his stay in Copenhagen with Bohr, Dirac put these thoughts into their appropriate mathematical form and began work on two successive manuscripts. The first manuscript contained the theoretical details of the non-commuting electromagnetic field operators. He called the process of generating photons out of the vacuum “second quantization”. This phrase is a bit of a misnomer, because there is no specific “first quantization” per se, although he was probably thinking of the quantized energy levels of Schrödinger and Heisenberg. In second quantization, the classical field of electromagnetism is converted to an operator that generates quanta of the associated quantum field out of the vacuum (and also annihilates photons back into the vacuum). The creation operators can be applied again and again to build up an N-photon state containing N photons that obey Bose-Einstein statistics, as they must, as required by their integer spin, agreeing with Planck’s blackbody radiation.
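In the notation that later became standard (a sketch, not Dirac’s original symbols), each field mode carries an annihilation operator a and a creation operator a† obeying

[a, a^{\dagger}] = 1, \qquad |N\rangle = \frac{(a^{\dagger})^{N}}{\sqrt{N!}}\,|0\rangle,

so that repeated application of a† on the vacuum |0⟩ builds the N-photon state with the required Bose-Einstein symmetry.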
Dirac then went further to show how an interaction of the quantized electromagnetic field with quantized energy levels involved the annihilation and creation of photons as they promoted electrons to higher atomic energy levels, or demoted them through stimulated emission. Very significantly, Dirac’s new theory explained the spontaneous emission of light from an excited electron level as a direct physical process that creates a photon carrying away the energy as the electron falls to a lower energy level. Spontaneous emission had been explained first by Einstein more than ten years earlier when he derived the famous A and B coefficients, but Einstein’s arguments were based on the principle of detailed balance, which is a thermodynamic argument. It is impressive that Einstein’s deep understanding of thermodynamics and statistical mechanics could allow him to derive the necessity of both spontaneous and stimulated emission, but the physical mechanism for these processes was inferred rather than derived. Dirac, in late 1926, had produced the first direct theory of photon exchange with matter. This was the birth of quantum electrodynamics, known as QED, and the birth of quantum field theory[2].
Fig. 1 Paul Dirac in his early days.
Göttingen and Born
Dirac’s next stop on his postdoctoral fellowship was in Göttingen to work with Max Born (1882 – 1970) and the large group of theoreticians and mathematicians who were like electrons in a cloud orbiting around the nucleus represented by the new quantum theory. Göttingen was second only to Copenhagen as the Mecca for quantum theorists. Hilbert was there and von Neumann too, as well as the brash American J. Robert Oppenheimer (1904 – 1967) who was finishing his PhD with Born. Dirac and Oppenheimer struck up an awkward friendship. Oppenheimer was considered arrogant by many others in the group, but he was in awe of Dirac who arrived with his manuscript on quantum electrodynamics ready for submission. Oppenheimer struggled at first to understand Dirac’s new approach to quantizing fields, but he quickly grasped the importance, as did Pascual Jordan (1902 – 1980), who was also in Göttingen.
Jordan had already worked on ideas very close to Dirac’s on the quantization of fields. He and Dirac seemed to be going down the same path, independently arriving at very similar conclusions around the same time. In fact, Jordan was often a step ahead of Dirac, tending to publish just before Dirac, as with non-commuting matrices, transformation theory and the relationship of canonical transformations to second quantization. However, Dirac’s paper on quantum electrodynamics was a masterpiece in clarity and comprehensiveness, launching a new field in a way that Jordan had not yet achieved with his own work. But because of the closeness of Jordan’s thinking to Dirac’s, he was able to see immediately how to extend Dirac’s approach. Within the year, he published a series of papers that established the formalism of quantum electrodynamics as well as quantum field theory. With Pauli, he systematized the operators for creation and annihilation of photons [3]. With Wigner, he developed second quantization for de Broglie matter waves, defining creation and annihilation operators that obeyed the Pauli exclusion principle of electrons[4]. Jordan was on a roll, forging ahead of Dirac on extensions of quantum electrodynamics and field theory, but Dirac was about to eclipse Jordan once and for all.
St. John’s at Cambridge
At the end of the Spring semester in 1927, Dirac was offered a position as a fellow of St. John’s College at Cambridge, which he accepted, returning to England to begin his life as a college professor. During the summer and into the Fall, Dirac returned to his first passion in physics, relativity, which had yet to be successfully incorporated into quantum physics. Oskar Klein and Walter Gordon had made initial attempts at formulating relativistic quantum theory, but they could not correctly incorporate the spin properties of the electron, and their wave equation had the bad habit of producing negative probabilities. Probabilities went negative because the Klein-Gordon equation had two time derivatives instead of one. The reason it had two (while the non-relativistic Schrödinger equation has only one) is because space-time symmetry required the double space derivative of the Schrödinger equation to be paired with a double time derivative. Dirac, with creative insight, realized that the problem could be flipped by requiring the single time derivative to be paired with a single space derivative. The problem was that a single space derivative did not seem to make any sense [5].
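For reference, the Klein-Gordon equation in modern form (natural units, ħ = c = 1) reads

\left(\frac{\partial^{2}}{\partial t^{2}} - \nabla^{2} + m^{2}\right)\phi = 0,

second order in both time and space, which is the symmetric pairing of derivatives described above and the source of the negative probabilities.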
St. John’s College at Cambridge
As Dirac puzzled how to get an equation with only single derivatives, he was playing around with Pauli spin matrices and hit on a simple identity that related the spin matrices to the electron momentum. At first he could not get the identity to apply to four-dimensional relativistic momenta using the usual 2×2 spin matrices. Then he realized that four-dimensional space-time could be captured if he expanded Pauli’s 2×2 spin matrices to 4×4 spin matrices, and all of a sudden he had a new equation with four-dimensional space-time symmetry with single derivatives on space and time. As a test of his new equation, he calculated fine details of the experimentally-measured hydrogen spectrum, known as the fine structure, which had resisted theoretical explanation, and he derived answers in close agreement with experiment. He also showed that the electron had spin-1/2, and he calculated its magnetic moment. He finished his manuscript at the end of the Fall semester in 1927, and the paper was published in early 1928 [6]. His relativistic quantum wave equation was an instant sensation, becoming known for all time as “the Dirac Equation”. He had succeeded at finding a correct and long-sought relativistic quantum theory where many others had failed, such as Oskar Klein and Walter Gordon. It was a crowning achievement, placing Dirac firmly in the firmament of the quantum theorists.
Fig. 2 The relativistic Dirac equation. The wavefunction is a four-component spinor. The gamma-del product is a 4×4 matrix operator. The time and space derivatives are both first-order operators.
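In compact modern form (natural units), the equation in the figure is

\left(i\gamma^{\mu}\partial_{\mu} - m\right)\psi = 0,

with ψ a four-component spinor and the γ^μ the 4×4 matrices Dirac constructed, so that the time and space derivatives both enter at first order.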
Antimatter
In the process of ridding the Klein-Gordon equation of negative probability, which Dirac found abhorrent, his new equation created an infinite number of negative energy states, which he did not find abhorrent. It is perhaps a matter of taste what one theorist is willing to accept over another, and for Dirac, negative energies were better than negative probabilities. Even so, one needed to deal with an infinite number of negative energy states in quantum theory, because they are available to quantum transitions. In 1929 and 1930, as Dirac was writing his famous textbook on quantum theory, he became intrigued by the similarity between the positive and negative electron states of the vacuum and the energy levels of valence electrons on atoms. An electron in a state outside a filled electron shell behaves very much like a single-electron atom, like sodium and lithium with their single valence electrons. Conversely, an atomic shell that has one electron less than a full complement can be described as having a “hole” that behaves “as if” it were a positive particle. It is like a bubble in water. As water sinks, the bubble rises to the top of the water level. For electrons, if all the electrons go one way in an electric field, then the hole goes the opposite direction, like a positive charge.
Dirac took this analogy of nearly-filled atomic shells and applied it to the vacuum states of the electron, viewing the filled negative energy states like the filled electron shells of atoms. If there is a missing electron, a hole in this infinite sea, then it would behave as if it had positive charge. Initially, Dirac speculated that the “hole” was the proton, and he even wrote a paper on that possibility. But Oppenheimer pointed out that the idea was inconsistent with observations, especially the fact that electrons and protons are not observed to annihilate, and that the ground state of the infinite electron sea must be completely filled. Hermann Weyl further pointed out that the electron-proton theory did not have the correct symmetry, and Dirac had to rethink. In early 1931 he hit on an audacious solution to the puzzle. What if the hole in the infinite negative energy sea did not just behave like a positive particle, but actually was a positive particle, a new particle that Dirac dubbed the “anti-electron”? The anti-electron would have the same mass as the electron, but would have positive charge. He suggested that such particles might be generated in high-energy collisions in vacuum, and he finished his paper with the suggestion that there also could be an anti-proton with the mass of the proton but with negative charge. In this singular paper, titled “Quantised Singularities in the Electromagnetic Field” published in 1931, Dirac predicted the existence of antimatter. A year later the positron was discovered by Carl David Anderson at Cal Tech. Anderson had originally called the particle the positive electron, but a journal editor of the Physical Review changed it to positron, and the new name stuck.
Fig. 3 An electron-positron pair is created by the absorption of a photon (gamma ray). The positron can be viewed as a hole in the filled sea of negative-energy electron states. (Momentum conservation is satisfied if a nearby heavy particle takes up the recoil momentum.)
The prediction and subsequent experimental validation of antimatter stands out in the history of physics in the 20th Century. In previous centuries, theory was performed mainly in the service of experiment, explaining interesting new observed phenomena either as consequences of known physics, or creating new physics to explain the observations. Quantum theory, revolutionary as a way of understanding nature, was developed to explain spectroscopic observations of atoms and molecules and gases. Similarly, the precession of the perihelion of Mercury was a well-known phenomenon when Einstein used his newly developed general relativity to explain it. As a counterexample, Einstein’s prediction of the deflection of light by the Sun was something new that emerged from theory. This is one reason why Einstein became so famous after Eddington’s expedition to observe the deflection of apparent star locations during the total eclipse. Einstein had predicted something that had never been seen before. Dirac’s prediction of the existence of antimatter similarly is a triumph of rational thought, following the mathematical representation of reality to an inevitable conclusion that cannot be ignored, no matter how wild and initially unimaginable it is. Dirac went on to receive the Nobel Prize in Physics in 1933, sharing the prize that year with Schrödinger (Heisenberg won it the previous year in 1932).
Read the stories behind the history of quantum field theory, in Galileo Unbound from Oxford University Press
[1] Framelo, “The Strangest Man: The Hidden Life of Paul Dirac” (Basic Books, 2011)
[2] Dirac, P. A. M. (1927). “The quantum theory of the emission and absorption of radiation.” Proceedings of the Royal Society of London Series A 114(767): 243-265; Dirac, P. A. M. (1927). “The quantum theory of dispersion.” Proceedings of the Royal Society of London Series A 114(769): 710-728.
[3] Jordan, P. and W. Pauli, Jr. (1928). “On the quantum electrodynamics of charge-free fields.” Zeitschrift für Physik 47(3-4): 151-173.
[4] Jordan, P. and E. Wigner (1928). “On the Pauli exclusion principle.” Zeitschrift für Physik 47(9-10): 631-651.
[5] This is because two space derivatives measure the curvature of the wavefunction, which is related to the kinetic energy of the electron (see the brief expression following these references).
[6] Dirac, P. A. M. (1928). “The quantum theory of the electron.” Proceedings of the Royal Society of London Series A 117(778): 610-624; Dirac, P. A. M. (1928). “The quantum theory of the electron – Part II.” Proceedings of the Royal Society of London Series A 118(779): 351-361.
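To make footnote [5] concrete, here is the standard textbook correspondence (nothing specific to Dirac’s papers): in the non-relativistic Schrödinger equation the kinetic energy enters through two spatial derivatives, while the energy enters through a single time derivative,

\[ -\frac{\hbar^2}{2m}\,\frac{\partial^2 \psi}{\partial x^2} \;\longleftrightarrow\; \frac{p^2}{2m}\,\psi, \qquad i\hbar\,\frac{\partial \psi}{\partial t} \;\longleftrightarrow\; E\,\psi, \]

and it was this mismatch between first-order time and second-order space derivatives that Dirac’s equation was designed to remove.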
When I arrived at Berkeley in 1981 to start graduate school in physics, the single action I took that secured my future as a physicist, more than spending scores of sleepless nights studying quantum mechanics by Schiff or electromagnetism by Jackson, was buying a motorcycle! Why motorcycle maintenance should be the Tao of Physics was beyond me at the time, but Zen is transcendent.
The Quantum Sadistics
In my first semester of grad school I made two close friends, Keith Swenson and Kent Owen, as we stayed up all night working on impossible problem sets and hand-grading a thousand midterms for the introductory physics class we were TAs for. The camaraderie was made tighter when Keith and Kent bought motorcycles and I quickly followed suit, buying my first wheels: a 1972 Suzuki GT550. It was an old bike, but in good shape and ready to ride, so the three of us began touring around the San Francisco Bay Area together on weekend rides. We went out to Mt. Tam, or up to Vallejo, or around the North and South Bay. Kent thought this was a very cool way for physics grads to spend their time, and he came up with a name for our gang: the “Quantum Sadistics”! He even made a logo for our “colors”: an eye shedding a teardrop shaped like the dagger of a quantum raising operator.
At the end of the first year, Keith left the program, not sure he was the right material for a physics degree, and moved to San Diego to head up the software arm of a start-up company that he had founder’s shares in. Kent and I continued at Berkeley, but soon got too busy to keep up the weekend rides. My Suzuki was my only set of wheels, so I tooled around with it, keeping it running when it really didn’t want to go any further. I had to pull its head and dive deep into it to adjust the rockers. It stayed together enough for a trip all the way down Highway 1 to San Diego to visit Keith and back, and a trip all the way up Highway 1 to Seattle to visit my grandparents and back, having ridden the full length of the Pacific Coast from Tijuana to Vancouver. Motorcycle maintenance was always part of the process.
Andrew Lange
After a few semesters as a TA for the large lecture courses in physics, it was time to try something real, and I noticed a job opening posted on a bulletin board. It was for a temporary research position in Prof. Paul Richards’ group. I had TA-ed for him once, but knew nothing of his research, and the interview wasn’t even with him, but with a graduate student named Andrew Lange. I met with Andrew in a ground-floor lab on the south side of Birge Hall. He was soft-spoken and congenial, with round architect glasses, fine sandy hair, and a hint of something exotic about him. He was encouraging in his reactions to my answers. Then he asked if I had a motorcycle. I wasn’t sure if he already knew, or whether it was a test of some kind, so I said that I did. “Do you work on it?” he asked. I remember my response. “Not really,” I said. In my mind I was no mechanic. Adjusting the overhead rockers was nothing too difficult. It wasn’t like I had pulled the pistons.
“It’s important to work on your motorcycle.”
For some reason, he didn’t seem to like my answer. He probed further. “Do you change the tires or the oil?” I admitted that I did, and on further questioning, he slowly dragged out my story of pulling the head and adjusting the cams. He seemed to relax, like he had gotten to the bottom of something. He then gave me some advice, focusing on me with a strange intensity and stressing very carefully, “It’s important to work on your motorcycle.”
I got the job and joined Paul Richards’ research group. It was a heady time. Andrew was designing a rocket-borne far-infrared spectrometer that would launch on a sounding rocket from Nagoya, Japan. The spectrometer was to make the most detailed measurements ever of the cosmic microwave background (CMB) radiation during a five-minute free fall at the edge of space, before plunging into the Pacific Ocean. But the spectrometer was missing a set of key optical elements known as far-infrared dichroic beam splitters. Without these beam splitters, the spectrometer was just a small chunk of machined aluminum. It became my job to create these beam splitters. The problem was that no one knew how to do it. So with Andrew’s help, I scanned the literature, and we settled on a design related to results from the Ulrich group in Germany.
Our spectral range was different from previous cases, so I created a new methodology using small mylar sheets, patterned with photolithography, evaporating thin films of aluminum on both sides of the mylar. My first photomasks were made using an amazingly archaic technology known as rubylith that had been used in the 70’s to fabricate low-level integrated circuits. Andrew showed me how to cut the fine strips of red plastic tape at a large scale that was then photo-reduced for contact printing. I modeled the beam splitters with equivalent circuits to predict the bandpass spectra, and learned about Kramers-Kronig transforms to explain an additional phase shift that appeared in the interferometric tests of the devices. These were among the first metamaterials ever created (although this was before that word existed), with an engineered magnetic response for millimeter waves. I fabricated the devices in the silicon fab on the top floor of the electrical engineering building on the Berkeley campus. It was one of the first university-based VLSI fabs in the country, with high-class clean rooms and us in bunny suits. But I was doing everything but silicon, modifying all their carefully controlled processes in the photolithography bay. I made and characterized a full set of 5 of these high-tech beam splitters, right before I was ejected from the lab and banned. My processes were incompatible with the VLSI activities of the rest of the students. Fortunately, I had completed the devices, with a little extra material to spare.
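For readers curious about the Kramers-Kronig step, the general textbook form of the relations (not the specific equivalent-circuit model used for the beam splitters) connects the real and imaginary parts of any causal response function \( \chi(\omega) = \chi'(\omega) + i\chi''(\omega) \):

\[ \chi'(\omega) \;=\; \frac{1}{\pi}\,\mathcal{P}\!\int_{-\infty}^{\infty} \frac{\chi''(\omega')}{\omega' - \omega}\, d\omega', \qquad \chi''(\omega) \;=\; -\frac{1}{\pi}\,\mathcal{P}\!\int_{-\infty}^{\infty} \frac{\chi'(\omega')}{\omega' - \omega}\, d\omega'. \]

A measured transmission amplitude therefore comes tied to a dispersive phase, which is the kind of extra phase shift that shows up in interferometric tests of a resonant structure.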
I rode my motorcycle with Andrew and his friends around the Bay Area and up to Napa and the wine country. One memorable weekend Paul had all his grad students come up to his property in Mendocino County to log trees. Of course, we rode up on our bikes. Paul’s land was high on a coastal mountain next to the small winery owned by Charles Kittel (the famous Kittel of “Solid State Physics”). The weekend was rustic. The long-abandoned hippie shack on the property was uninhabitable, so we roughed it. After two days of hauling and stacking logs, I took the long way home, riding along dark roads under tall redwoods.
Andrew moved his operation to the University of Nagoya, Japan, six months before the launch date. The spectrometer checked out perfectly. As launch day approached, it was mounted into the nose cone of the sounding rocket, continuing to pass all calibration tests. On the day of launch, we held our breath back in Berkeley. There was a 12-hour time difference, and then we received the report. The launch was textbook perfect, but at the critical moment when the explosive nose-cone bolts were supposed to blow, they failed. The cone stayed firmly in place, and the spectrometer telemetered back perfect measurements of the inside of the rocket all the way down until it crashed into the Pacific, and the last nine months of my life sank into the depths of the Mariana Trench. I read the writing on the thin aluminum wall, and the following week I was interviewing for a new job up at Lawrence Berkeley Laboratory, the DOE national lab high on the hill overlooking the Berkeley campus.
Eugene Haller
The instrument I used in Paul Richards’ lab to characterize my state-of-the-art dichroic beam splitters was a far-infrared Fourier-transform spectrometer that Paul had built using a section of 1-foot-diameter glass sewer pipe. Bob McMurray, a graduate student working with Prof. Eugene Haller on the hill, was a routine user of this makeshift spectrometer, and I had been looking over Bob’s shoulder at the interesting data he was taking on shallow defect centers in semiconductors. The work sounded fascinating, and as Andrew’s Japanese sounding rocket settled deeper into the ocean floor, I arranged to meet with Eugene Haller in his office at LBL.
I was always clueless about interviews. I never thought about them ahead of time, and never knew what I needed to say. On the other hand, I always had a clear idea of what I wanted to accomplish. I think this gave me a certain solid confidence that may have come through. So I had no idea what Eugene was getting at as we began the discussion. He asked me some questions about my project with Paul, which I am sure I answered with lots of details about Kramers-Kronig and the like. Then came the question, strangely reminiscent of when I first met Andrew Lange: Did I work on my car? Actually, I didn’t have a car; I had a motorcycle, and I said so. Well then, did I work on my motorcycle? He had that same strange intensity that Andrew had when he asked me roughly the same question. He looked like a prosecuting attorney waiting for the suspect to incriminate himself. Once again, I described pulling the head and adjusting the rockers and cams.
Eugene leaned back in his chair and relaxed. He began talking in the future tense about the project I would be working on. It was a new project for the new Center for Advanced Materials at LBL, for which he was the new director. The science revolved around semiconductors and especially a promising new material known as GaAs. He never actually said I had the job … all of a sudden it just seemed to be assumed. When the interview was over, he simply asked me to give him an answer in a few days on whether I would come up and join his group.
I didn’t know it at the time, but Eugene had a beautiful vintage Talbot roadster that was his baby. One of his loves was working on his car. He was a real motorhead and knew everything about the mechanics. He was also an avid short-wave radio enthusiast and knew as much about vacuum tubes as he did about transistors. Working on cars (or motorcycles) was a guaranteed ticket into his group. At a recent gathering of his former students and colleagues for his memorial, similar stories circulated about that question: Did you work on your car? The answer to this one question mattered more than any answer you gave about physics.
I joined Eugene Haller’s research group at LBL in March of 1984 and received my PhD on topics of semiconductor physics in 1988. My association with his group opened the door to a post-doc position at AT&T Bell Labs and then to a faculty position at Purdue University where I currently work on the physics of oncology in medicine and have launched two biotech companies—all triggered by the simple purchase of a motorcycle.
BOOMERanG in Antarctica (1997)
Andrew Lange’s career was particularly stellar. He joined the faculty of Cal Tech, and I was amazed to read in Science magazine in 2004 or 2005, in a section called “Nobel Watch”, that he was a candidate for the Nobel Prize for his work on BOOMERanG, which had launched and monitored a high-altitude balloon as it circled the South Pole, taking unprecedented data on the CMB that constrained the amount of dark matter in the universe. Around that same time I invited Paul Richards to Purdue to give our weekly physics colloquium and talk about his own work on MAXIMA. There was definitely a buzz going around that the BOOMERanG and MAXIMA collaborations were being talked about in Nobel circles. The next year, the Nobel Prize of 2006 was indeed awarded for work on the cosmic microwave background, but to Mather and Smoot for their earlier work on the COBE satellite.
Then, in January 2010, I was shocked to read in the New York Times that Andrew, that vibrant, sharp-eyed, brilliant physicist, had been found lifeless in a hotel room, dead from asphyxiation. The police ruled it a suicide. Apparently few had known of his life-long struggle with depression, and it had finally overwhelmed him. Perhaps he had sold his motorcycle by then. But I wonder whether, if he had pulled out his wrenches and gotten to work on its engine, he might have been enveloped by the zen of motorcycle maintenance and the crisis would have passed him by. As Andrew had told me so many years ago, and as I wish I could have reminded him, “It’s important to work on your motorcycle.”