Orbiting Photons around a Black Hole

The physics of the path of light passing a gravitating body is one of the hardest concepts to understand in General Relativity, but it is also one of the easiest.  It is hard because there can be no force of gravity on light, even though the path of a photon bends as it passes a gravitating body.  It is easy because the photon is following the simplest possible path—a geodesic, the solution of the equation for force-free motion.

This blog picks up where my last blog left off, which defined the geodesic equation and presented the Schwarzschild metric.  With those two equations in hand, we could simply solve for the null geodesics (a null geodesic is the path of a light beam through a manifold).  But there turns out to be a simpler approach that Einstein came up with himself (he never did like doing things the hard way).  He just had to sacrifice the fundamental postulate that he used to explain everything about Special Relativity.

Throwing Special Relativity Under the Bus

The fundamental postulate of Special Relativity states that the speed of light is the same for all observers.  Einstein posed this postulate, then used it to derive some of the most astonishing consequences of Special Relativity—like E = mc².  This postulate is at the rock core of his theory of relativity and can be viewed as one of the simplest “truths” of our reality—or at least of our spacetime.

Yet as soon as Einstein began thinking how to extend SR to a more general situation, he realized almost immediately that he would have to throw this postulate out.  While the speed of light measured locally is always equal to c, the apparent speed of light observed by a distant observer (far from the gravitating body) is modified by gravitational time dilation and length contraction.  This means that the apparent speed of light, as observed at a distance, varies as a function of position.  From this simple conclusion Einstein derived a first estimate of the deflection of light by the Sun, though he initially was off by a factor of 2.  (The full story of Einstein’s derivation of the deflection of light by the Sun and the confirmation by Eddington is in Chapter 7 of Galileo Unbound (Oxford University Press, 2018).)
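
The scale of the effect is easy to check numerically.  The short sketch below (my own check, not Einstein’s calculation) compares the 1911 half-deflection 2GM/(c²R) with the full general-relativistic result 4GM/(c²R) for a ray grazing the Sun:

import numpy as np

G = 6.674e-11       # gravitational constant [m^3 kg^-1 s^-2]
M_sun = 1.989e30    # mass of the Sun [kg]
R_sun = 6.963e8     # radius of the Sun [m]
c = 2.998e8         # speed of light [m/s]

rad2arcsec = 180/np.pi*3600

half_deflection = 2*G*M_sun/(c**2*R_sun)   # Einstein's 1911 estimate
full_deflection = 4*G*M_sun/(c**2*R_sun)   # full 1915 result

print(half_deflection*rad2arcsec)   # about 0.87 arcsec
print(full_deflection*rad2arcsec)   # about 1.75 arcsec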

The “Optics” of Gravity

The invariant element for a light path moving radially in the Schwarzschild geometry is

$$ds^2 = 0 = c^2\left(1 - \frac{R_S}{r}\right)dt^2 - \frac{dr^2}{1 - \dfrac{R_S}{r}}$$

The apparent speed of light is then

$$c(r) = \frac{dr}{dt} = c\left(1 - \frac{R_S}{r}\right)$$

where c(r) is always less than c when observed from flat space.  The “refractive index” of space is defined, as for any optical material, as the ratio of the constant vacuum speed to the observed speed

$$n(r) = \frac{c}{c(r)}$$

Because the Schwarzschild metric has the property (in units where c = 1)

$$g_{tt}\,g_{rr} = -1$$

the effective refractive index of warped space-time is

$$n(r) = \frac{1}{1 - \dfrac{R_S}{r}}$$

with a divergence at the Schwarzschild radius.

The refractive index of warped space-time in the limit of weak gravity can be used in the ray equation (also known as the Eikonal equation described in an earlier blog)

$$\frac{d}{ds}\left(n\frac{d\vec{r}}{ds}\right) = \vec{\nabla}n$$

where the gradient of the refractive index of space is

$$\vec{\nabla}n = -\frac{n^2 R_S}{r^2}\,\hat{r}$$

The ray equation is then a four-variable flow

$$\dot{x} = \frac{p_x}{n} \qquad \dot{y} = \frac{p_y}{n} \qquad \dot{p}_x = \frac{\partial n}{\partial x} \qquad \dot{p}_y = \frac{\partial n}{\partial y}$$

where the dot denotes a derivative with respect to the path length s.  These equations represent a 4-dimensional flow for a light ray confined to a plane.  The trajectory of any light path is found by using an ODE solver subject to the initial conditions for the direction of the light ray.  This is simple for us to do today with Python or Matlab, but it is also something that could be done long before the advent of computers by early theorists of relativity like Max von Laue (1879 – 1960).

The Relativity of Max von Laue

In the Fall of 1905 in Berlin, a young German physicist by the name of Max Laue was sitting in the physics colloquium at the University, listening to another Max, his doctoral supervisor Max Planck, deliver a seminar on Einstein’s new theory of relativity.  Laue was struck by the simplicity of the theory, so simple as to seem almost simplistic and hence hard to believe, but the beauty of the theory stuck with him, and he began to think through its consequences for experiments like the Fizeau experiment on partial ether drag.

Armand Hippolyte Louis Fizeau (1819 – 1896) built one of the world’s first optical interferometers in 1851 and used it to measure the speed of light inside moving fluids.  At that time the speed of light was believed to be a property of the luminiferous ether, and there were several opposing theories on how light would travel inside moving matter.  One theory held the ether fully stationary, unaffected by moving matter, so that the speed of light would be unaffected by motion.  An opposite theory held the ether fully entrained by matter, so that the speed of light in moving matter would be a simple sum of speeds.  A middle theory considered that only part of the ether was dragged along with the moving matter.  This was Fresnel’s partial ether drag hypothesis, which he had arrived at to explain why his friend François Arago had not observed any contribution to stellar aberration from the motion of the Earth through the ether.  When Fizeau performed his experiment, the results agreed closely with Fresnel’s drag coefficient, which seemed to settle the matter.  Yet when Michelson and Morley performed their experiments of 1887, there was no evidence for partial drag.

Even after the exposition by Einstein on relativity in 1905, the disagreement between the Michelson-Morley results and Fizeau’s results was not fully reconciled until Laue showed in 1907 that the velocity addition theorem of relativity gives complete agreement with the Fizeau experiment.  For the Fizeau experiment, light travels through water of refractive index n moving with a speed v, and the velocity addition theorem of special relativity gives the speed observed in the lab frame

$$u = \frac{\dfrac{c}{n} + v}{1 + \dfrac{v}{nc}} \approx \frac{c}{n} + v\left(1 - \frac{1}{n^2}\right)$$

The difference in the speed of light between the stationary and the moving water is then

$$\Delta u = u - \frac{c}{n} \approx v\left(1 - \frac{1}{n^2}\right)$$

where the factor in parentheses is precisely the Fresnel drag coefficient.  This was one of the first definitive “proofs” of the validity of Einstein’s theory of relativity, and it made Laue one of relativity’s staunchest proponents.  Spurred on by his success in explaining the Fresnel drag coefficient, Laue wrote the first monograph on relativity theory, publishing it in 1910.
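
A few lines of Python (my own quick check, using a flow speed typical of such experiments) show how closely the Fresnel drag coefficient approximates the exact relativistic velocity addition:

c = 2.998e8     # speed of light [m/s]
n = 1.333       # refractive index of water
v = 7.0         # speed of the moving water [m/s]

u_exact = (c/n + v)/(1 + v/(n*c))     # relativistic velocity addition
u_fresnel = c/n + v*(1 - 1/n**2)      # Fresnel partial-drag approximation

print(u_exact - c/n)      # observed drag on the light
print(u_fresnel - c/n)    # Fresnel prediction: v*(1 - 1/n^2), about 0.44*v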

Fig. 1 Front page of von Laue’s textbook on Special Relativity, first published in 1910 (this is the fourth edition, published in 1921).

A Nobel Prize for Crystal X-ray Diffraction

In 1909 Laue became a Privatdozent under Arnold Sommerfeld (1868 – 1951) at the university in Munich.  In the Spring of 1912 he was walking in the Englischer Garten on the northern edge of the city, talking with Paul Ewald (1888 – 1985), who was finishing his doctorate with Sommerfeld studying the structure of crystals.  Ewald was considering the interaction of optical wavelengths with the periodic lattice when it struck Laue that x-rays would have the kind of short wavelengths that would allow the crystal to act as a diffraction grating to produce multiple diffraction orders.  Within a few weeks of that discussion, two of Sommerfeld’s students (Friedrich and Knipping) used an x-ray source and photographic film to look for the predicted diffraction spots from a copper sulfate crystal.  When the film was developed, it showed a constellation of dark spots for each of the diffraction orders of the x-rays scattered from the multiple periodicities of the crystal lattice.  Two years later, in 1914, Laue was awarded the Nobel Prize in physics for the discovery.  His father had been elevated to the hereditary nobility in Prussia the year before, and Max Laue became Max von Laue.

Von Laue was not one to take risks, and he remained conservative in many of his interests.  He was immensely respected and played important roles in the administration of German science, but his scientific contributions after receiving the Nobel Prize were only modest.  Yet as the Nazis came to power in the early 1930s, he was one of the few physicists to stand up and resist the Nazi takeover of German physics.  He was especially disturbed by the plight of the Jewish physicists.  In 1933 he was invited to give the keynote address at the conference of the German Physical Society in Würzburg, where he spoke out against the Nazi rejection of relativity as they branded it “Jewish science”.  In his speech he likened Einstein, the target of much of the propaganda, to Galileo.  He said, “No matter how great the repression, the representative of science can stand erect in the triumphant certainty that is expressed in the simple phrase: And yet it moves.”  Von Laue believed that truth would hold out in the face of the proscription against relativity theory by the Nazi regime.  The quote “And yet it moves” is supposed to have been muttered by Galileo just after his abjuration before the Inquisition, referring to the Earth moving around the Sun.  Although the quote is famous, it is believed to be a myth.

In an odd side-note of history, von Laue sent his gold Nobel Prize medal to Denmark for safekeeping with Niels Bohr so that it would not be paraded about by the Nazi regime.  Yet when the Nazis invaded Denmark, George de Hevesy, a member of Bohr’s team, dissolved the medal in aqua regia to keep it out of their hands.  The gold completely dissolved into an orange liquid that was stored in a beaker high on a shelf through the war.  When Denmark was finally freed, the dissolved gold was precipitated out, and the Nobel committee struck a new medal that was re-presented to von Laue in a ceremony in 1951.

The Orbits of Light Rays

Von Laue’s interests stayed close to the properties of light and electromagnetic radiation from the time he was introduced to the field by Woldemar Voigt at Göttingen in 1899.  This interest included the theory of relativity, and only a few years after Einstein published his theory of General Relativity and Gravitation, von Laue added to his earlier textbook on relativity by writing a second volume on the general theory.  The new volume was published in 1920 and included the theory of the deflection of light by gravity.

One of the very few illustrations in his second volume is of light interacting with a supermassive gravitating body characterized by a Schwarzschild radius.  (No one at the time called it a “black hole”, nor even mentioned Schwarzschild.  That terminology came much later.)  He shows in the drawing how light, if incident at just the right impact parameter, would actually loop around the object.  This is the first time such a diagram appeared in print, showing the trajectory of light so strongly affected by gravity.

Fig. 2 A page from von Laue’s second volume on relativity (first published in 1920) showing the orbit of a photon around a compact mass with “gravitational cutoff” (later known as a “black hole”). The figure is drawn semi-quantitatively, but the phenomenon was clearly understood by von Laue.

Python Code

#!/usr/bin/env python3
# -*- coding: utf-8 -*-
"""
Created on Tue May 28 11:50:24 2019

@author: nolte
"""

import numpy as np
from scipy import integrate
from matplotlib import pyplot as plt

plt.close('all')

def create_circle():
    # The black hole: a disk with radius equal to the Schwarzschild radius
    circle = plt.Circle((0,0), radius=10, color='black')
    return circle

def show_shape(patch):
    ax = plt.gca()
    ax.add_patch(patch)
    plt.axis('scaled')
    plt.show()

def refindex(x, y):
    # Effective refractive index n = 1/(1 - Rs/r) of warped space-time
    # and its gradient, with a correction factor chosen to reproduce
    # the circular photon orbit at r = 1.5 Rs
    A = 10          # Schwarzschild radius in plot units
    eps = 1e-6      # guard against division by zero

    rp0 = np.sqrt(x**2 + y**2)

    n = 1/(1 - A/(rp0+eps))
    fac = np.abs((1 - 9*(A/rp0)**2/8))   # approx correction to Eikonal
    nx = -fac*n**2*A*x/(rp0+eps)**3
    ny = -fac*n**2*A*y/(rp0+eps)**3

    return [n, nx, ny]

def flow_deriv(x_y_z, tspan):
    # Four-variable Eikonal flow: positions (x,y) and ray momenta (z,w)
    x, y, z, w = x_y_z

    [n, nx, ny] = refindex(x, y)

    yp = np.zeros(shape=(4,))
    yp[0] = z/n
    yp[1] = w/n
    yp[2] = nx
    yp[3] = ny

    return yp

# Launch a fan of rays from the far left over a range of impact parameters
for loop in range(-5, 30):

    xstart = -100
    ystart = -2.245 + 4*loop
    print(ystart)

    [n, nx, ny] = refindex(xstart, ystart)

    y0 = [xstart, ystart, n, 0]    # ray initially directed along +x

    tspan = np.linspace(1, 400, 2000)

    y = integrate.odeint(flow_deriv, y0, tspan)

    xx = y[1:2000, 0]
    yy = y[1:2000, 1]

    plt.figure(1)
    lines = plt.plot(xx, yy)
    plt.setp(lines, linewidth=1)
    plt.show()
    plt.title('Photon Orbits')

c = create_circle()
show_shape(c)
axes = plt.gca()
axes.set_xlim([-100, 100])
axes.set_ylim([-100, 100])

# Now set up a circular photon orbit at 1.5 times the Schwarzschild radius
xstart = 0
ystart = 15

[n, nx, ny] = refindex(xstart, ystart)

y0 = [xstart, ystart, n, 0]

tspan = np.linspace(1, 94, 1000)

y = integrate.odeint(flow_deriv, y0, tspan)

xx = y[1:1000, 0]
yy = y[1:1000, 1]

plt.figure(1)
lines = plt.plot(xx, yy)
plt.setp(lines, linewidth=2, color='black')
plt.show()

One of the most striking effects of gravity on photon trajectories is the possibility for a photon to orbit a black hole in a circular orbit. This is shown in Fig. 3 as the black circular ring for a photon at a radius equal to 1.5 times the Schwarzschild radius. This radius defines what is known as the photon sphere. However, the orbit is not stable. Slight deviations will send the photon spiraling outward or inward.

The Eikonal approximation does not strictly hold under strong gravity, but the Eikonal equations with the effective refractive index of space still yield semi-quantitative behavior. In the Python code, a correction factor is used to match the theory to the circular photon orbits, while still agreeing with trajectories far from the black hole. The results of the calculation are shown in Fig. 3. For large impact parameters, the rays are deflected through a finite angle. At the critical impact parameter, equal to 3√3/2 ≈ 2.6 times the Schwarzschild radius, the ray loops around the black hole. For smaller impact parameters, the rays are captured by the black hole.

Fig. 3 Photon orbits near a black hole calculated using the Eikonal equation and the effective refractive index of warped space. One ray, near the critical impact parameter, loops around the black hole as predicted by von Laue. The central black circle is the black hole with a Schwarzschild radius of 10 units. The black ring is the circular photon orbit at a radius 1.5 times the Schwarzschild radius.
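
The special radii quoted here follow from the Schwarzschild null geodesics and are easy to tabulate (a short check of my own, in the units of the code where the Schwarzschild radius is 10):

import numpy as np

R_s = 10.0                       # Schwarzschild radius in plot units
r_photon = 1.5*R_s               # photon-sphere radius (the black ring)
b_crit = (3*np.sqrt(3)/2)*R_s    # critical impact parameter for capture

print(r_photon, b_crit)          # 15.0 and about 26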

Photons pile up around the black hole at the photon sphere. The first image ever of the photon sphere of a black hole was made earlier this year (announced April 10, 2019). The image shows the shadow of the supermassive black hole at the center of Messier 87 (M87), an elliptical galaxy 55 million light-years from Earth. This black hole is 6.5 billion times the mass of the Sun. Imaging the photon sphere required eight ground-based radio telescopes placed around the globe, operating together to form a single telescope with an effective aperture the size of our planet.  The resolution of such a large telescope would allow one to image a half-dollar coin on the surface of the Moon, although this telescope operates in the radio frequency range rather than the optical.
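
The resolution claim is simple to verify with a diffraction-limit estimate. The sketch below uses approximate values that I am assuming for illustration (1.3 mm observing wavelength, an Earth-diameter baseline, and a 30 mm coin):

import numpy as np

wavelength = 1.3e-3     # EHT observing wavelength [m]
baseline = 1.27e7       # Earth-diameter baseline [m]
moon_dist = 3.84e8      # Earth-Moon distance [m]
coin = 30e-3            # diameter of a half-dollar coin [m]

rad2uas = 180/np.pi*3600e6    # radians to micro-arcseconds

print(wavelength/baseline*rad2uas)   # diffraction limit: about 21 micro-arcsec
print(coin/moon_dist*rad2uas)        # coin on the Moon: about 16 micro-arcsec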

Fig. 4 Scientists have obtained the first image of a black hole, using Event Horizon Telescope observations of the center of the galaxy M87. The image shows a bright ring formed as light bends in the intense gravity around a black hole that is 6.5 billion times more massive than the Sun.

Further Reading

Introduction to Modern Dynamics: Chaos, Networks, Space and Time, 2nd Ed. (Oxford University Press, 2019)

B. Lavenda, The Optical Properties of Gravity, J. Mod. Phys. 8, 803-838 (2017)

Getting Armstrong, Aldrin and Collins Home from the Moon: Apollo 11 and the Three-Body Problem

Fifty years ago, on the 20th of July at nearly 11 o’clock at night, my brothers and I were peering through the screen door of a very small 1960s Shasta compact camper trailer, watching the TV set on the picnic table outside the trailer door.  Our family was at a campground in southern Michigan and the mosquitos were fierce (which is why we were inside the trailer looking out through the screen).  Neil Armstrong was about to be the first human to set foot on the Moon.  The image on the TV was a fuzzy black and white, with barely recognizable shapes clouded even more by the dirt and dead bugs on the screen, but it is a memory etched in my mind.  I was 10 years old, and I was convinced that when I grew up I would visit the Moon myself, because by then Moon travel would be like flying to Europe.  It didn’t turn out that way, and fifty years later it’s a struggle to even get back there.

Some of those dangers became life-threatening for the crew of Apollo 13. If they had miscalculated their trajectory home and bounced off the Earth’s atmosphere, they would have become a tragic demonstration of the chaos of three-body orbits.

So maybe I won’t get to the Moon, but maybe my grandchildren will.  And if they do, I hope they know something about the three-body problem in physics, because getting to and from the Moon isn’t as easy as it sounds.  Apollo 11 faced real danger at several critical points on its flight plan, but all went perfectly (except overshooting their landing site and that last boulder field right before Armstrong landed). Some of those dangers became life-threatening for the crew of Apollo 13, and if they had miscalculated their trajectory home and had bounced off the Earth’s atmosphere, they would have become a tragic demonstration of the chaos of three-body orbits.  In fact, their lifeless spaceship might have returned to the Moon and back to Earth over and over again, caught in an infinite chaotic web.

The complexities of trajectories in the three-body problem arise because there are too few constants of motion and too many degrees of freedom.  To get an intuitive picture of how the trajectory behaves, it is best to start with a problem known as the restricted three-body problem.

The Saturn V Booster, perhaps the pinnacle of “muscle and grit” space exploration.

The Restricted Three-Body Problem

The restricted three-body problem was first considered by Leonhard Euler in 1762 (for a further discussion of the history of the three-body problem, see my Blog from July 5).  For the special case of circular orbits of constant angular frequency ω, the motion of the third (small) mass is described by the Lagrangian

$$L = \frac{1}{2}m\left(\dot{r}^2 + r^2\dot{\theta}^2\right) - V(r,\theta,t)$$

where the potential is time dependent because of the motion of the two larger masses.  Lagrange approached the problem by adopting a rotating reference frame in which the two larger masses m1 and m2 move along the stationary line defined by their centers.  The new angle variable is θ′ = θ − ωt.  The Lagrangian in the rotating frame is

$$L' = \frac{1}{2}m\left(\dot{r}^2 + r^2\dot{\theta}'^2\right) + m\omega r^2\dot{\theta}' + \frac{1}{2}m\omega^2 r^2 - V(r,\theta')$$

where the effective potential is now time independent.  The first term in the effective potential is the Coriolis term and the second is the centrifugal term.  The dynamical flow in the plane is four dimensional, and in vector form in the rotating frame the four-dimensional flow is

$$\dot{\vec{r}} = \vec{v} \qquad \dot{\vec{v}} = -\frac{GM_1}{r_1^3}\vec{r}_1 - \frac{GM_2}{r_2^3}\vec{r}_2 - 2\vec{\omega}\times\vec{v} - \vec{\omega}\times\left(\vec{\omega}\times\vec{r}\right)$$

where the position vectors in the center-of-mass frame are

$$\vec{r}_1 = \vec{r} - x_1\hat{x} \qquad \vec{r}_2 = \vec{r} - x_2\hat{x}$$

relative to the positions of the Earth and Moon (x1 and x2) in the rotating frame, in which they are at rest along the x-axis.

A single trajectory solved for this flow is shown in Fig. 1 for a tiny object passing back and forth chaotically between the Earth and the Moon. The object is considered to be massless, or at least so small it does not perturb the Earth-Moon system. The energy of the object was selected to allow it to pass over the potential barrier of the Lagrange-Point L1 between the Earth and the Moon. The object spends most of its time around the Earth, but now and then will get into a transfer orbit that brings it around the Moon. This would have been the fate of Apollo 11 if their last thruster burn had failed.
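
The L1 bottleneck can be located with a quick root-find. The sketch below (my own, using the same exaggerated Moon mass as the simulation that follows) finds the point on the Earth-Moon axis where gravity and the centrifugal force balance in the rotating frame:

import numpy as np
from scipy.optimize import brentq

M1, M2, R = 1.0, 1.0/10, 1.0     # same mass ratio as the simulation
x1 = -M2*R/(M1 + M2)             # Earth position in the rotating frame
x2 = x1 + R                      # Moon position
w2 = (M1 + M2)/R**3              # omega^2 from Kepler's third law (G = 1)

def net_force(x):
    # Gravity of both masses plus the centrifugal term, along the x-axis
    return -M1*(x - x1)/abs(x - x1)**3 - M2*(x - x2)/abs(x - x2)**3 + w2*x

xL1 = brentq(net_force, x1 + 0.01, x2 - 0.01)
print(xL1)    # L1 sits at about x = 0.6, between the Earth and the Moon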

Fig. 1 The trajectory of a tiny object in the planar three-body problem interacting with a large mass (Earth on the left) and a small mass (Moon on the right). The energy of the trajectory allows it to pass back and forth chaotically between proximity to the Earth and proximity to the Moon. The time-duration of the simulation is approximately one decade. The envelope of the trajectories is called the “Hill region”, named after one of the first US astrophysicists, George William Hill (1838 – 1914), who studied the three-body problem of the Moon.

Contrast the orbit of Fig. 1 with the simple flight plan of Apollo 11 on the banner figure. The chaotic character of the three-body problem emerges for a “random” initial condition. You can play with different initial conditions in the following Python code to explore the properties of this dynamical problem. Note that in this simulation, the mass of the Moon was chosen about 8 times larger than in nature to exaggerate the effect of the Moon.

Python Code

#!/usr/bin/env python3
# -*- coding: utf-8 -*-
"""
Created on Tue May 28 11:50:24 2019

@author: nolte
"""

import numpy as np
from scipy import integrate
from matplotlib import pyplot as plt

plt.close('all')

womega = 1      # angular frequency of the rotating frame
R = 1           # Earth-Moon separation
eps = 1e-6      # guard against division by zero

M1 = 1          # Mass of the Earth
M2 = 1/10       # Mass of the Moon (exaggerated ~8x over nature)
chsi = M2/M1

x1 = -M2*R/(M1+M2)    # Earth location in rotating frame
x2 = x1 + R           # Moon location

def poten(y, c):
    # Effective potential relative to the energy E
    # (not called below; useful for plotting the Hill region)
    rp0 = np.sqrt(y**2 + c**2)
    thetap0 = np.arctan(y/c)

    rp1 = np.sqrt(x1**2 + rp0**2 - 2*np.abs(rp0*x1)*np.cos(np.pi-thetap0))
    rp2 = np.sqrt(x2**2 + rp0**2 - 2*np.abs(rp0*x2)*np.cos(thetap0))
    V = -M1/rp1 - M2/rp2 - E

    return [V]

def flow_deriv(x_y_z, tspan):
    # Four-dimensional flow in polar coordinates:
    # x = radius, y = radial speed, z = angle, w = angular speed
    x, y, z, w = x_y_z

    r1 = np.sqrt(x1**2 + x**2 - 2*np.abs(x*x1)*np.cos(np.pi-z))
    r2 = np.sqrt(x2**2 + x**2 - 2*np.abs(x*x2)*np.cos(z))

    yp = np.zeros(shape=(4,))
    yp[0] = y
    yp[1] = (-womega**2*R**3*(np.abs(x) - np.abs(x1)*np.cos(np.pi-z))/(r1**3+eps)
             - womega**2*R**3*chsi*(np.abs(x) - np.abs(x2)*np.cos(z))/(r2**3+eps)
             + x*(w-womega)**2)
    yp[2] = w
    yp[3] = (2*y*(womega-w)/x
             - womega**2*R**3*chsi*np.abs(x2)*np.sin(z)/(x*(r2**3+eps))
             + womega**2*R**3*np.abs(x1)*np.sin(np.pi-z)/(x*(r1**3+eps)))

    return yp

r0 = 0.64      # initial radius
v0 = 0.3       # initial radial speed
theta0 = 0     # initial angle
vrfrac = 1     # fraction of speed in radial versus angular directions

rp1 = np.sqrt(x1**2 + r0**2 - 2*np.abs(r0*x1)*np.cos(np.pi-theta0))
rp2 = np.sqrt(x2**2 + r0**2 - 2*np.abs(r0*x2)*np.cos(theta0))
V = -M1/rp1 - M2/rp2
T = 0.5*v0**2
E = T + V      # total energy sets the accessible Hill region

vr = vrfrac*v0
W = (2*T - v0**2)/r0

y0 = [r0, vr, theta0, W]   # This is where you set the initial conditions

tspan = np.linspace(1, 2000, 20000)

y = integrate.odeint(flow_deriv, y0, tspan)

# Convert from polar to Cartesian coordinates for plotting
xx = y[1:20000, 0]*np.cos(y[1:20000, 2])
yy = y[1:20000, 0]*np.sin(y[1:20000, 2])

plt.figure(1)
lines = plt.plot(xx, yy)
plt.setp(lines, linewidth=0.5)
plt.show()

In the code, set the position and speed of the Apollo command module through the parameters r0, v0, theta0 and vrfrac, and put in the initial conditions through y0. The mass of the Moon in nature is 1/81 of the mass of the Earth, which shrinks the L1 “bottleneck” to a much smaller region that you can explore to see what the fate of the Apollo missions could have been.
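
For example, to come closer to nature, the settings near the top of the script can be changed (illustrative values; only the mass ratio is fixed by nature):

M2 = 1/81      # true Moon/Earth mass ratio; narrows the L1 bottleneck
r0 = 0.9       # starting radius nearer the Moon (illustrative)
v0 = 0.2       # smaller initial speed (illustrative)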

Further Reading

The Three-body Problem, Longitude at Sea, and Lagrange’s Points

Introduction to Modern Dynamics: Chaos, Networks, Space and Time, 2nd Ed. (Oxford University Press, 2019)

The Iconic Eikonal and the Optical Path

Nature loves the path of steepest descent.  Place a ball on a smooth curved surface and release it, and it will instantaneously accelerate in the direction of steepest descent.  Shoot a laser beam from an oblique angle onto a piece of glass to hit a target inside, and the path taken by the beam is the one that reaches the target in the least time.  Diffract a stream of electrons from the surface of a crystal, and quantum detection events are greatest at the positions where the troughs and peaks of the de Broglie waves converge the most.  The first example is Newton’s second law.  The second example is Fermat’s principle.  The third example is Feynman’s path-integral formulation of quantum mechanics.  They all share a common minimization principle—the principle of least action—that the path of a dynamical system is the one that minimizes a property known as “action”.

The Eikonal Equation is the “F = ma” of ray optics.  Its solutions describe the paths of light rays through complicated media.

The principle of least action, first proposed by the French physicist Maupertuis through a mechanical analogy, became a principle of Lagrangian mechanics in the hands of Lagrange, but it was still restricted to mechanical systems of particles.  The principle was generalized forty years later by Hamilton, who began by considering the propagation of light waves and ended by transforming mechanics into a study of pure geometry divorced from forces and inertia.  Optics played a key role in the development of mechanics, and mechanics returned the favor by giving optics the Eikonal Equation.  The Eikonal Equation is the “F = ma” of ray optics.  Its solutions describe the paths of light rays through complicated media.

Malus’ Theorem

Anyone who has taken a course in optics knows that Étienne-Louis Malus (1775 – 1812) discovered the polarization of light, but little else is taught about this French mathematician, who was one of the savants Napoleon took with him when he invaded Egypt in 1798.  After experiencing numerous horrors of war and plague, Malus returned to France damaged but wiser.  He discovered the polarization of light in the Fall of 1808 as he was playing with crystals of Iceland spar at sunset and happened to view the last rays of the sun reflected from the windows of the Luxembourg palace.  Iceland spar produces double images in natural light because it is birefringent.  Malus discovered that he could extinguish one of the double images of the Luxembourg windows by rotating the crystal a certain way, demonstrating that light is polarized by reflection.  The degree to which light is extinguished as a function of the angle of the polarizing crystal is known as Malus’ Law

$$I = I_0\cos^2\theta$$
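
A quick tabulation of the law (my own illustration):

import numpy as np

theta = np.linspace(0, 90, 7)          # polarizer angle [degrees]
I = np.cos(np.radians(theta))**2       # transmitted fraction, I/I0 = cos^2(theta)

for ang, frac in zip(theta, I):
    print(f"{ang:5.1f} deg  ->  {frac:.3f}")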

Frontispiece of the Description de l’Égypte, the first volume of which was published by Joseph Fourier in 1808, based on the report of the savants of l’Institut d’Égypte that included Monge, Fourier and Malus, among many other French scientists and engineers.

Malus had picked up an interest in the general properties of light and imaging during lulls in his ordeal in Egypt.  He was an emissionist, following his compatriot Laplace, rather than an undulationist following Thomas Young.  It is ironic that the French scientists were staunchly supporting Newton on the nature of light, while the British scientist Thomas Young was trying to upend Newtonian optics.  Almost all physicists at that time were emissionists, only a few years after Young’s double-slit experiment of 1804, and few serious scientists accepted Young’s theory of the wave nature of light until Fresnel and Arago supplied the rigorous theory and experimental proofs much later, in 1819.

Malus’ Theorem states that rays perpendicular to an initial surface are perpendicular to a later surface after reflection in an optical system. This theorem is the starting point for the Eikonal ray equation, as well as for modern applications in adaptive optics. This figure shows a propagating aberrated wavefront that is “compensated” by a deformable mirror to produce a tight focus.

As a prelude to his later discovery of polarization, Malus had earlier proven a theorem about the trajectories that particles of light take through an optical system.  One of the key questions about the particles of light in an optical system was how they formed images.  The physics of light particles moving through lenses was too complex to treat at that time, but reflection was relatively easy, based on the simple reflection law.  Malus proved mathematically that after reflection from a curved mirror, a set of rays perpendicular to an initial nonplanar surface would remain perpendicular to a later surface (a property closely related to the conservation of optical étendue).  This is known as Malus’ Theorem, and he thought it only held true after a single reflection, but later mathematicians proved that it remains true even after an arbitrary number of reflections, even in cases when the rays intersect to form an optical effect known as a caustic.  The mathematics of caustics would catch the interest of an Irish mathematician and physicist who helped launch a new field of mathematical physics.

Étienne-Louis Malus

Hamilton’s Characteristic Function

William Rowan Hamilton (1805 – 1865) was a child prodigy who taught himself thirteen languages by the time he was thirteen years old (with the help of his linguist uncle), but mathematics became his primary focus at Trinity College in Dublin.  His mathematical prowess was so great that he was made the Astronomer Royal of Ireland while still an undergraduate student.  He also became fascinated by the theory of envelopes of curves, and in particular by the mathematics of caustic curves in optics.

In 1823, at the age of 18, he wrote a paper titled Caustics that was read to the Royal Irish Academy.  In this paper, Hamilton gave an exceedingly simple proof of Malus’ Theorem, but that was perhaps the simplest part of the paper.  Other aspects were mathematically obscure, and the reviewers requested further additions and refinements before publication.  Over the next four years, as Hamilton expanded this work on optics, he developed a new theory of optics, the first part of which was published as Theory of Systems of Rays in 1827, with two further supplements completed by 1833 but never published.

Hamilton’s most important contribution to optical theory (and eventually to mechanics) was what he called his characteristic function.  By applying Fermat’s principle of least time, which he generalized into his principle of stationary action, he sought to find a single unique function that characterized every path through an optical system.  By first proving Malus’ Theorem and then applying it to any system of rays using the principle of stationary action, he was able to construct two partial differential equations whose solution, if it could be found, defined every ray through the optical system.  This result was completely general and could be extended to include curved rays passing through inhomogeneous media.  Because it mapped input rays to output rays, it was the most general characterization of any defined optical system.  The characteristic function defined surfaces of constant action whose normal vectors were the rays of the optical system.  Today these surfaces of constant action are called the Eikonal function (but how it got its name is the next chapter of this story).  Using his characteristic function, Hamilton predicted a phenomenon known as conical refraction in 1832, which was subsequently observed, launching him to a level of fame unusual for an academic.

Once Hamilton had established his principle of stationary action for curved light rays, it was an easy step to extend it to mechanical systems of particles with curved trajectories.  This step produced his most famous work, On a General Method in Dynamics, published in two parts in 1834 and 1835 [1], in which he developed what became known as Hamiltonian dynamics.  As his mechanical work was extended by others, including Jacobi, Darboux and Poincaré, Hamilton’s work on optics was overshadowed, overlooked and eventually lost.  It was rediscovered when Schrödinger, in his famous paper of 1926, invoked Hamilton’s optical work as a direct example of the wave-particle duality of quantum mechanics [2]. Yet in the interim, a German mathematician tackled the same optical problems that Hamilton had seventy years earlier, and gave the Eikonal Equation its name.

Bruns’ Eikonal

The German mathematician Heinrich Bruns (1848 – 1919) was engaged chiefly with the measurement of the Earth, or geodesy.  He was a professor of mathematics in Berlin and later in Leipzig.  One claim to fame is that one of his graduate students was Felix Hausdorff [3], who would go on to much greater fame in set theory and measure theory (the Hausdorff dimension was a precursor to the fractal dimension).  Possibly motivated by his studies with Hausdorff on the refraction of light by the atmosphere, Bruns became interested in Malus’ Theorem for the same reasons and with the same goals as Hamilton, yet he was unaware of Hamilton’s work in optics.

The mathematical process of creating “images”, in the sense of a mathematical mapping, made Bruns think of the Greek word εἰκών (eikōn), which literally means “icon” or “image”, and he published a small book in 1895 with the title Das Eikonal, in which he derived a general equation for the path of rays through an optical system.  His approach was heavily geometrical and is not easily recognized as an equation arising from variational principles.  It rediscovered most of the results of Hamilton’s Theory of Systems of Rays and was thus not groundbreaking in the sense of new discovery.  But it did reintroduce the world to the problem of systems of rays, and his name Eikonal for the equations of the ray paths stuck and was used with increasing frequency in subsequent years.  Arnold Sommerfeld (1868 – 1951) was one of the early proponents of the Eikonal equation and recognized its connection with action principles in mechanics. He discussed the Eikonal equation in a 1911 optics paper with Runge [4], and in 1916 he used action principles to extend Bohr’s model of the hydrogen atom [5]. While the Eikonal approach was not used often at first, it became popular in the 1960s when computational optics made numerical solutions possible.

Lagrangian Dynamics of Light Rays

In physical optics, one of the most important properties of a ray passing through an optical system is known as the optical path length (OPL).  The OPL is the central quantity used in problems of interferometry, and it is the central property that appears in Fermat’s principle, which leads to Snell’s Law.  The OPL played an important role in the history of the calculus when Johann Bernoulli in 1697 used it to derive the path taken by a light ray as an analogy of a brachistochrone curve – the curve of least time taken by a particle between two points.
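
Fermat’s principle is easy to demonstrate numerically. The sketch below (my construction, with made-up indices and endpoints) minimizes the optical path length of a ray crossing a flat interface and recovers Snell’s Law:

import numpy as np
from scipy.optimize import minimize_scalar

n1, n2 = 1.0, 1.5     # refractive indices above and below the interface at y = 0
A = (0.0, 1.0)        # source point above the interface
B = (1.0, -1.0)       # target point below the interface

def opl(x):
    # Optical path length through a crossing point (x, 0) on the interface
    return n1*np.hypot(x - A[0], A[1]) + n2*np.hypot(B[0] - x, B[1])

x_c = minimize_scalar(opl, bounds=(0, 1), method='bounded').x

sin1 = (x_c - A[0])/np.hypot(x_c - A[0], A[1])
sin2 = (B[0] - x_c)/np.hypot(B[0] - x_c, B[1])
print(n1*sin1, n2*sin2)    # equal, verifying n1 sin(theta1) = n2 sin(theta2)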

The OPL between two points in a refractive medium is the sum of the piecewise product of the refractive index n with infinitesimal elements of the path length ds.  In integral form, this is expressed as

$$\mathrm{OPL} = \int_A^B n(\vec{x})\sqrt{\dot{x}_a\dot{x}^a}\,ds$$

where the “dot” is a derivative with respect to s.  The optical Lagrangian is recognized as

$$L = n(\vec{x})\sqrt{\dot{x}_a\dot{x}^a}$$

The Lagrangian is inserted into the Euler equations to yield (after some algebra, see Introduction to Modern Dynamics pg. 336)

$$\frac{d}{ds}\left(n\frac{dx_a}{ds}\right) = \frac{\partial n}{\partial x_a}$$

This is a second-order ordinary differential equation in the variables xa that define the ray path through the system.  It is literally a “trajectory” of the ray, and the Eikonal equation becomes the F = ma of ray optics.

Hamiltonian Optics

In a paraxial system (in which the rays never make large angles relative to the optic axis) it is common to select the position z as the single parameter that defines the curve of the ray path, so that the trajectory is parameterized as

$$\vec{x}(z) = \left(x(z),\,y(z),\,z\right)$$

and the effective Lagrangian, with the derivatives now taken with respect to z, is recognized as

$$L = n(x,y,z)\sqrt{1 + \dot{x}^2 + \dot{y}^2}$$

The Hamiltonian formulation is derived from the Lagrangian by defining an optical Hamiltonian as the Legendre transform of the Lagrangian.  To start, the Lagrangian is expressed in terms of the generalized coordinates and momenta.  The generalized optical momenta are defined as

$$p_a = \frac{\partial L}{\partial \dot{x}_a} = \frac{n\,\dot{x}_a}{\sqrt{1 + \dot{x}^2 + \dot{y}^2}}$$

This relationship leads to an alternative expression for the Eikonal equation (also known as the scalar Eikonal equation) expressed as

$$\left(\frac{\partial S}{\partial x}\right)^2 + \left(\frac{\partial S}{\partial y}\right)^2 + \left(\frac{\partial S}{\partial z}\right)^2 = n^2(x,y,z)$$

where S(x,y,z) = const. is the eikonal function.  The momentum vectors $\vec{p} = \vec{\nabla}S$ are perpendicular to the surfaces of constant S, which are recognized as the wavefronts of a propagating wave.

The Lagrangian can be restated as a function of the generalized momenta as

$$L = \frac{n^2}{\sqrt{n^2 - p_x^2 - p_y^2}}$$

and the Legendre transform that takes the Lagrangian into the Hamiltonian is

$$H = p_x\dot{x} + p_y\dot{y} - L = -\sqrt{n^2 - p_x^2 - p_y^2}$$

The trajectory of the rays is the solution to Hamilton’s equations of motion applied to this Hamiltonian

$$\frac{dx_a}{dz} = \frac{\partial H}{\partial p_a} \qquad \frac{dp_a}{dz} = -\frac{\partial H}{\partial x_a}$$

Light Orbits

If the optical rays are restricted to the x-y plane, then Hamilton’s equations of motion can be expressed relative to the path length ds, and the momenta are p_a = n dx_a/ds.  The ray equations are (simply expressing the two second-order Eikonal equations as four first-order equations)

$$\dot{x} = \frac{p_x}{n} \qquad \dot{y} = \frac{p_y}{n} \qquad \dot{p}_x = \frac{\partial n}{\partial x} \qquad \dot{p}_y = \frac{\partial n}{\partial y}$$

where the dot is a derivative with respect to the element ds.

As an example, consider a radial refractive index profile in the x-y plane

$$n(r) = 1 + e^{-r^2/2\sigma^2}$$

where r is the radius on the x-y plane. Putting this refractive index profile into the Eikonal equations creates a two-dimensional orbit in the x-y plane. The following Python code solves for individual trajectories.

Python Code: raysimple.py

#!/usr/bin/env python3
# -*- coding: utf-8 -*-
"""
Created on Tue May 28 11:50:24 2019

@author: nolte
"""

import numpy as np
from scipy import integrate
from matplotlib import pyplot as plt
from matplotlib import cm

plt.close('all')

# selection 1 = Gaussian
# selection 2 = Donut
selection = 1

print(' ')
print('raysimple.py')

def refindex(x, y):
    # Refractive index profile n(x,y) and its gradient (nx, ny)
    if selection == 1:

        sig = 10

        n = 1 + np.exp(-(x**2 + y**2)/2/sig**2)
        nx = (-2*x/2/sig**2)*np.exp(-(x**2 + y**2)/2/sig**2)
        ny = (-2*y/2/sig**2)*np.exp(-(x**2 + y**2)/2/sig**2)

    elif selection == 2:

        sig = 10
        r2 = (x**2 + y**2)
        r1 = np.sqrt(r2)
        expon = np.exp(-r2/2/sig**2)

        n = 1 + 0.3*r1*expon
        nx = 0.3*r1*(-2*x/2/sig**2)*expon + 0.3*expon*x/r1   # d(r)/dx = x/r
        ny = 0.3*r1*(-2*y/2/sig**2)*expon + 0.3*expon*y/r1

    return [n, nx, ny]


def flow_deriv(x_y_z, tspan):
    # Eikonal ray equations as a four-variable flow:
    # positions (x,y) and ray momenta (z,w)
    x, y, z, w = x_y_z

    n, nx, ny = refindex(x, y)

    yp = np.zeros(shape=(4,))
    yp[0] = z/n
    yp[1] = w/n
    yp[2] = nx
    yp[3] = ny

    return yp

# Map the refractive index profile
V = np.zeros(shape=(100, 100))
for xloop in range(100):
    xx = -20 + 40*xloop/100
    for yloop in range(100):
        yy = -20 + 40*yloop/100
        n, nx, ny = refindex(xx, yy)
        V[yloop, xloop] = n

fig = plt.figure(1)
contr = plt.contourf(V, 100, cmap=cm.coolwarm, vmin=1, vmax=3)
fig.colorbar(contr, shrink=0.5, aspect=5)
plt.show()


v1 = 0.707      # Change this initial condition
v2 = np.sqrt(1 - v1**2)
y0 = [12, v1, 0, v2]     # Change these initial conditions

tspan = np.linspace(1, 1700, 1700)

y = integrate.odeint(flow_deriv, y0, tspan)

plt.figure(2)
lines = plt.plot(y[1:1550, 0], y[1:1550, 1])
plt.setp(lines, linewidth=0.5)
plt.show()

Gaussian refractive index profile in the x-y plane. From raysimple.py.
Ray orbits around the center of the Gaussian refractive index profile. From raysimple.py.

Bibliography

An excellent textbook on geometric optics from Hamilton’s point of view is K. B. Wolf, Geometric Optics in Phase Space (Springer, 2004). Another is H. A. Buchdahl, An Introduction to Hamiltonian Optics (Dover, 1992).

A rather older textbook on geometrical optics is by J. L. Synge, Geometrical Optics: An Introduction to Hamilton’s Method (Cambridge University Press, 1962) showing the derivation of the ray equations in the final chapter using variational methods. Synge takes a dim view of Bruns’ term “Eikonal” since Hamilton got there first and Bruns was unaware of it.

A book that makes an especially strong case for the optical-mechanical analogy of Fermat’s principle, connecting the trajectories of mechanics to the paths of optical rays, is Darryl Holm, Geometric Mechanics: Part I Dynamics and Symmetry (Imperial College Press, 2008).

The Eikonal ray equation is derived from the geodesic equation (or rather as a geodesic equation) in D. D. Nolte, Introduction to Modern Dynamics (Oxford, 2015).


References

[1] Hamilton, W. R. “On a general method in dynamics I.” Mathematical Papers I, 103-161: 247-308 (1834); Hamilton, W. R. “On a general method in dynamics II.” Mathematical Papers I, 103-161: 95-144 (1835)

[2] Schrödinger, E. “Quantification of the eigen-value problem.” Annalen der Physik 79(6): 489-527 (1926)

[3] For the fateful story of Felix Hausdorff (aka Paul Mongré) see Chapter 9 of Galileo Unbound (Oxford, 2018).

[4] Sommerfeld, A. and J. Runge. “The application of vector calculations on the basis of geometric optics.” Annalen der Physik 35(7): 277-298 (1911)

[5] Sommerfeld, A. “The quantum theory of spectral lines.” Annalen der Physik 51(17): 1-94 (1916)


Biased Double-well Potential: Bistability, Bifurcation and Hysteresis

Bistability, bifurcation and hysteresis are ubiquitous phenomena that arise from nonlinear dynamics and have considerable importance for technology applications.  For instance, the hysteresis associated with the flipping of magnetic domains under magnetic fields is the central mechanism for magnetic memory, and bistability is a key feature of switching technology.

… one of the most commonly encountered bifurcations is called a saddle-node bifurcation, which is the bifurcation that occurs in the biased double-well potential.

One of the simplest models for bistability and hysteresis is the one-dimensional double-well potential biased by a changing linear potential.  An example of a double-well potential with a bias is

$$V(x) = \frac{x^4}{4} - x^2 - cx$$

where the parameter c is a control parameter (bias) that can be adjusted or that changes slowly in time as c(t).  This dynamical system is also known as the Duffing oscillator. The net double-well potentials for several values of the control parameter c are shown in Fig. 1.  With no bias, there are two degenerate energy minima.  As c is made negative, the left well has the lowest energy, and as c is made positive the right well has the lowest energy.
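
A few lines of Python (my own sketch of what Fig. 1 displays, assuming the quartic form above) plot the tilted double well for several biases:

import numpy as np
from matplotlib import pyplot as plt

x = np.linspace(-2.2, 2.2, 400)
for c in [-1.5, -0.5, 0, 0.5, 1.5]:
    V = x**4/4 - x**2 - c*x      # biased double-well potential
    plt.plot(x, V, label=f'c = {c}')
plt.xlabel('x')
plt.ylabel('V(x)')
plt.legend()
plt.show()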

The dynamics of this potential energy profile can be understood by imagining a small ball that responds to the local forces exerted by the potential.  For large negative values of c the ball will have its minimum energy in the left well.  As c is increased, the energy of the left well increases, and rises above the energy of the right well.  If the ball began in the left well, even when the left well has a higher energy than the right, there is a potential barrier that the ball cannot overcome and it remains on the left.  This local minimum is a stable equilibrium, but it is called “metastable” because it is not a global minimum of the system.  Metastability is the origin of hysteresis.

Fig. 1 A biased double-well potential in one dimension. The thresholds to destroy the local metastable minima are approximately c = ±1.09. For values beyond threshold, only a single minimum exists with no barrier. Hysteresis is caused by the mass being stuck in the metastable (upper) minimum because it has insufficient energy to overcome the potential barrier, until the barrier disappears at threshold and the ball rolls all the way down to the bottom to its new location. When the bias is slowly reversed, the new location becomes metastable, until the ball can overcome the barrier and roll down to its original minimum, and so on.

           Once sufficient bias is applied that the local minimum disappears, the ball will roll downhill to the new minimum on the right, and in the presence of dissipation, it will come to rest in the new minimum.  The bias can then be slowly lowered, reversing this process. Because of the potential barrier, the bias must change sign and be strong enough to remove the stability of the now metastable fixed point with the ball on the right, allowing the ball to roll back down to its original location on the left.  This “overshoot” defines the extent of the hysteresis. The fact that there are two minima, and that one is metastable with a barrier between the two, produces “bistability”, meaning that there are two stable fixed points for the same control parameter.

For illustration, assume a mass obeys the flow equation

$$\ddot{x} + \gamma\dot{x} = -\frac{dV}{dx} = -x^3 + 2x + c(t)$$

including a damping term, where the force is the negative gradient of the potential energy.  The bias parameter c can be time dependent, beginning beyond the negative threshold and slowly increasing until it exceeds the positive threshold, and then reversing and decreasing again.  The position of the mass is locally a damped oscillator until a threshold is passed, and then the mass falls into the global minimum, as shown in Fig. 2. As the bias is reversed, it remains in the metastable minimum on the right until the control parameter passes threshold, and then the mass drops into the left minimum that is now a global minimum.

Fig. 2 Hysteresis diagram. The mass begins in the left well. As the parameter c increases, the mass remains in the well, even though it is no longer the global minimum when c becomes positive. When c passes the positive threshold (around 1.09 for this example), the mass falls into the right well, with damped oscillation. Then the control parameter c is decreased slowly until the negative threshold is passed, and the mass switches to the left well with damped oscillations. The difference between the “switch up” and “switch down” values of the control parameter represents the “hysteresis” of this system.

The sudden switching of the biased double-well potential represents what is known as a “bifurcation”. A bifurcation is a sudden change in the qualitative behavior of a system caused by a small change in a control variable. Usually, a bifurcation occurs when the number of attractors of a system changes. There is a fairly large menagerie of different types of bifurcations, but one of the most commonly encountered bifurcations is called a saddle-node bifurcation, which is the bifurcation that occurs in the biased double-well potential. In fact, there are two saddle-node bifurcations.

Bifurcations are easily portrayed by creating a joint space between phase space and the one (or more) control parameters that induce the bifurcation. The phase space of the double well is two dimensional (position, velocity) with three fixed points, but the change in the number of fixed points can be captured by taking a projection of the phase space onto a lower-dimensional manifold. In this case, the projection is simply along the x-axis. Therefore a “co-dimensional phase space” can be constructed with the x-axis as one dimension and the control parameter as the other. This is illustrated in Fig. 3. The cubic curve traces out the solutions to the fixed-point equation

$$x^3 - 2x = c$$

For a given value of the control parameter c there are either three solutions or one solution. The value of c where the number of solutions changes discontinuously is the bifurcation point c*. Two examples of the potential function are shown on the right, for c = +1 and c = -0.5, showing the locations of the three fixed points.

Fig. 3 The co-dimension phase space combines the one-dimensional dynamics along the position x with the control parameter. For a given value of c, there are three or one solution for the fixed point. When there are three solutions, two are stable (the double minima) and one is unstable (the saddle). As the magnitude of the bias increases, one stable node annihilates with the unstable node (a minimum and the saddle merge) and the dynamics “switch” to the other minimum.

The threshold value in this example is c* = 4√6/9 ≈ 1.09. When |c| < c* the two stable fixed points are the two minima of the double-well potential, and the unstable fixed point is the saddle between them. When |c| > c* the single stable fixed point is the single minimum of the potential function. The saddle-node bifurcation takes its name from the fact (illustrated here) that the unstable fixed point is a saddle, and at the bifurcation the saddle point annihilates with one of the stable fixed points.
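
The threshold follows in two lines from the fixed-point equation: the saddle-node sits where c(x) = x³ - 2x has its local extrema (a quick check of my own):

import numpy as np

x_sn = np.sqrt(2/3)              # where dc/dx = 3x^2 - 2 = 0
c_star = abs(x_sn**3 - 2*x_sn)   # threshold magnitude = 4*sqrt(6)/9
print(c_star)                    # about 1.09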

The following Python code illustrates the behavior of a biased double-well potential, with damping, in which the control parameter changes slowly with a sinusoidal time dependence.

Python Code

#!/usr/bin/env python3
# -*- coding: utf-8 -*-
"""
Created on Wed Apr 17 15:53:42 2019

@author: nolte
"""

import numpy as np
from scipy import integrate
from matplotlib import pyplot as plt

plt.close('all')

T = 400        # period of the slow sinusoidal bias sweep
Amp = 3.5      # amplitude of the bias sweep

def solve_flow(y0, c0):

    def flow_deriv(x_y, t, c0):
        # Damped double-well flow with slowly swept bias c(t) = Amp*sin(2*pi*t/T)
        x, y = x_y
        return [y, -0.5*y - x**3 + 2*x + x*(2*np.pi/T)*Amp*np.cos(2*np.pi*t/T) + Amp*np.sin(2*np.pi*t/T)]

    # Let transients settle for one sweep period
    tsettle = np.linspace(0, T, 101)
    yinit = y0
    x_tsettle = integrate.odeint(flow_deriv, yinit, tsettle, args=(c0,))
    y0 = x_tsettle[100, :]

    t = np.linspace(0, 1.5*T, 2001)
    x_t = integrate.odeint(flow_deriv, y0, t, args=(c0,))
    c = Amp*np.sin(2*np.pi*t/T)

    return t, x_t, c

eps = 0.0001

# Trace the fixed points (roots of -x^3 + 2x + c = 0) as the bias c varies
xc = np.zeros(100)
X = np.zeros(100)
Y = np.zeros(100)
Z = np.zeros(100)
for loop in range(0, 100):
    c = -1.2 + 2.4*loop/100 + eps
    xc[loop] = c

    coeff = [-1, 0, 2, c]
    y = np.roots(coeff)

    xtmp = np.real(y[0])
    ytmp = np.real(y[1])

    X[loop] = np.min([xtmp, ytmp])
    Y[loop] = np.max([xtmp, ytmp])
    Z[loop] = np.real(y[2])

plt.figure(1)
lines = plt.plot(xc, X, xc, Y, xc, Z)
plt.setp(lines, linewidth=0.5)
plt.show()
plt.title('Roots')

y0 = [1.9, 0]    # initial position and velocity of the mass
c0 = -2.

t, x_t, c = solve_flow(y0, c0)
y1 = x_t[:, 0]
y2 = x_t[:, 1]

plt.figure(2)
lines = plt.plot(t, y1)
plt.setp(lines, linewidth=0.5)
plt.show()
plt.ylabel('X Position')
plt.xlabel('Time')

plt.figure(3)
lines = plt.plot(c, y1)
plt.setp(lines, linewidth=0.5)
plt.show()
plt.ylabel('X Position')
plt.xlabel('Control Parameter')
plt.title('Hysteresis Figure')

Further Reading:

D. D. Nolte, Introduction to Modern Dynamics: Chaos, Networks, Space and Time (2nd Edition) (Oxford University Press, 2019)

The Pendulum Lab

Georg Duffing’s Equation

Although coal and steam launched the industrial revolution, gasoline and controlled explosions have sustained it for over a century.  After early precursors, the internal combustion engine that we recognize today came to life in 1876 from the German engineers Otto and Daimler, with later variations by Benz and Diesel.  In the early 20th century, the gasoline engine was replacing coal and oil in virtually all mobile conveyances and had become a major industry attracting the top mechanical engineering talent.  One of those talents was the German engineer Georg Duffing (1861 – 1944), whose unlikely side interest in the quantum revolution brought him to Berlin to hear lectures by Max Planck, where he launched his own revolution in nonlinear oscillators.

The publication of this highly academic book by a nonacademic would establish Duffing as the originator of one of the most iconic oscillators in modern dynamics.

An Academic Non-Academic

Georg Duffing was born in 1861 in the German town of Waldshut, on the border with Switzerland north of Zurich.  Within a year the family moved to Mannheim, near Heidelberg, where Georg received a good education in mathematics as well as music.  His mathematical interests attracted him to engineering, and he built a reputation that led to an invitation to work at Westinghouse in the United States in 1910.  When he returned to Germany he set himself up as a consultant and inventor with the freedom to move where he wished.  In early 1913 he wished to move to Berlin, where Max Planck was lecturing on the new quantum theory at the University.  He was always searching for new knowledge, and sitting in on Planck’s lectures must have made him feel like he was witnessing the beginnings of a new era.

At that time Duffing was interested in problems related to brakes, gears and engines.  In particular, he had become fascinated by vibrations that often were the limiting factors in engine performance.  He stripped the problem of engine vibration down to its simplest form and began a careful and systematic study of nonlinear oscillations.  While in Berlin, he became acquainted with Prof. Meyer at the University, who had a mechanical engineering laboratory.  Meyer let Duffing perform his experiments in the lab on the weekends, sometimes accompanied by his eldest daughter.  By 1917 he had compiled a systematic investigation of various nonlinear effects in oscillators and had written a manuscript that collected all of this theoretical and experimental work.  He extended this into a small book that he published with Vieweg & Sohn in 1918, to be purchased for a price of 5 Marks [1].  The publication of this highly academic book by a nonacademic would establish Duffing as the originator of one of the most iconic oscillators in modern dynamics.

Fig. 1 Cover of Duffing’s 1918 publication on nonlinear oscillators.

Duffing’s Nonlinear Oscillator

The mathematical and technical focus of Duffing’s book was low-order nonlinear corrections to the linear harmonic oscillator.  In one case, he considered a spring that either became stiffer or softer as it stretched; this happens when a cubic term is added to the usual linear Hooke’s law.  In another case, he considered a spring that was stiffer in one direction than the other, making the stiffness asymmetric; this happens when a quadratic term is added.  These terms are shown in Fig. 2 from Duffing’s book.  The top equation is a free oscillation, and the bottom equation has a harmonic forcing function.  These were the central equations that Duffing explored, plus the addition of damping, which he considered in a later chapter, as shown in Fig. 3. The book lays out systematically, chapter by chapter, approximate and series solutions to the nonlinear equations, and in special cases presents analytically exact solutions (such as for the nonlinear pendulum).

Fig. 2 Duffing’s equations without damping for free oscillation and driven oscillation with quadratic (producing an asymmetric potential) and cubic (producing stiffening or softening) corrections to the spring force.
Fig. 3 Inclusion of damping in the case with cubic corrections to the spring force.
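
In modern notation, the free undamped equation of Fig. 2 reads $\ddot{x} + k_1 x + k_2 x^2 + k_3 x^3 = 0$. A minimal sketch (my own, with arbitrary coefficients for the symmetric stiffening case, k2 = 0) shows the amplitude-dependent period that Duffing computed by series expansion:

import numpy as np
from scipy import integrate
from matplotlib import pyplot as plt

k1, k3 = 1.0, 0.5      # linear and cubic (stiffening) spring constants

def free_duffing(xv, t):
    x, v = xv
    return [v, -k1*x - k3*x**3]   # free oscillation: no damping, no drive

t = np.linspace(0, 40, 2000)
for x0 in [0.5, 1.0, 2.0]:        # larger amplitude gives a shorter period
    xt = integrate.odeint(free_duffing, [x0, 0], t)
    plt.plot(t, xt[:, 0], label=f'x0 = {x0}')
plt.xlabel('time')
plt.legend()
plt.show()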

Duffing was a practical engineer as well as a mathematical one, and he built experimental systems to test his solutions.  An engineering drawing of his experimental test apparatus is shown in Fig. 4. The small test pendulum is at S in the figure. The large pendulum at B is the drive pendulum, chosen to be much heavier than the test pendulum so that it can deliver a steady harmonic force through spring F1 to the test system. The cubic nonlinearity of the test system was controlled through the choice of the length of the test pendulum, and the quadratic nonlinearity (the asymmetry) was controlled by allowing the equilibrium angle to be shifted from vertical. The relative strength of the quadratic and cubic terms was adjusted by changing the position of the mass at G. Duffing derived expressions for all the coefficients of the equations in Fig. 2 in terms of experimentally controlled variables. Using this apparatus, Duffing verified his solutions for various special cases to good accuracy.

Fig. 4 Duffing’s experimental system he used to explore and verify his equations and solutions.

Duffing’s book is a masterpiece of careful, systematic investigation, beginning in general terms and then breaking the problem down into its special cases, finding solutions for each one with accurate experimental verification. These attributes established the importance of this little book in the history of science and technology, but because it was written in German, most of the early citations were by German scientists.  The first use of Duffing’s name in association with the nonlinear oscillator problem occurred in 1928 [2], as did the first reference to him in English, in a book by Timoshenko [3].  The first use of the phrase “Duffing Equation” specifically to describe an oscillator with a linear and cubic restoring force came in 1942 in a series of lectures presented at Brown University [4], and this nomenclature had become established by the end of that decade [5].  Although Duffing had devoted considerable attention in his book to the quadratic term for an asymmetric oscillator, the term “Duffing Equation” now refers to the stiffening and softening problem rather than to the asymmetric problem.

Fig. 5 The Duffing equation is generally expressed as a harmonic oscillator (first three terms plus the harmonic drive) modified by a cubic nonlinearity and driven harmonically.

Duffing Rediscovered

Nonlinear oscillations remained mainly in the realm of engineering for nearly half a century, until a broad spectrum of physical scientists began to discover deep secrets hiding behind the simple equations.  In 1963 Edward Lorenz (1917 – 2008) of MIT published a paper that showed how simple nonlinearities in three equations describing the atmosphere could produce deterministic behavior that appeared to be completely chaotic.  News of this paper spread as researchers in many seemingly unrelated fields began to see similar signatures in chemical reactions, turbulence, electric circuits and mechanical oscillators.  By 1972, when Lorenz was invited to give a talk on the “Butterfly Effect”, the science of chaos was emerging as a new frontier in physics, and in 1975 it was given its name, “chaos theory”, by James Yorke (1941 – ).  By 1976 it had become one of the hottest new areas of science.

        Through the period of the emergence of chaos theory, the Duffing oscillator was known as one of the archetypal nonlinear oscillators.  A particularly attractive aspect of the general Duffing equations is the possibility of studying a “double-well” potential.  This happens when the “alpha” in the equation in Fig. 5 is negative and the “beta” is positive.  The double-well potential has a long history in physics, both classical and modern, because it represents a “two-state” system that exhibits bistability, bifurcations, and hysteresis.  For a fixed “beta” the potential energy as a function of “alpha” is shown in Fig. 6.  The bifurcation cascades of the double-well Duffing equation were investigated by Philip Holmes (1945 – ) in 1976 [6], and the properties of the strange attractor were demonstrated in 1979 [7] by Yoshisuke Ueda (1936 – ).  Holmes and others continued to do detailed work on the chaotic properties of the Duffing oscillator, helping to make it one of the most iconic systems of chaos theory.
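
Explicitly, the restoring-force terms derive from the quartic potential

V(x) = ½ αx² + ¼ βx⁴

which for α < 0 and β > 0 has a local maximum at x = 0 flanked by two minima at x = ±√(−α/β), the two “wells” of the two-state system.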

Fig. 6 Potential energy of the Duffing Oscillator. The position variable is x, and changing alpha is along the other axis. For positive beta and alpha the potential is a quartic. For positive beta and negative alpha the potential is a double well.

Python Code for the Duffing Oscillator

This Python code uses scipy’s odeint solver on the driven, damped, double-well Duffing oscillator to display the configuration-space trajectory and the Poincaré map of the strange attractor.

#!/usr/bin/env python3
# -*- coding: utf-8 -*-
"""
Created on Wed May 21 06:03:32 2018
@author: nolte
"""
import numpy as np
from scipy import integrate
from matplotlib import pyplot as plt

plt.close('all')

# Driven, damped, double-well Duffing oscillator
print(' ')
print('Duffing.py')

alpha = -1        # linear stiffness (negative gives the double well)
beta = 1          # cubic stiffness
delta = 0.3       # drive amplitude (note: not the damping, despite the name)
gam = 0.15        # damping coefficient
w = 1             # drive frequency

def flow_deriv(x_y_z, t):
    x, y, z = x_y_z
    a = y
    b = delta*np.cos(w*t) - alpha*x - beta*x**3 - gam*y
    c = w             # z tracks the drive phase (not used in the plots)
    return [a, b, c]
                
T = 2*np.pi/w     # drive period

# random initial conditions (scalars)
px1 = np.random.rand()
xp1 = np.random.rand()
w1 = 0

x_y_z = [xp1, px1, w1]

# Transient: let the trajectory settle onto the attractor
t = np.linspace(0, 2000, 40000)
x_t = integrate.odeint(flow_deriv, x_y_z, t)
x0 = x_t[-1, :]

# Long run, sampled finely, used for the Poincaré sections
tspan = np.linspace(1, 20000, 400000)
x_t = integrate.odeint(flow_deriv, x0, tspan)
siz = np.shape(x_t)[0]

y1 = x_t[:, 0]    # position
y2 = x_t[:, 1]    # velocity
y3 = x_t[:, 2]    # drive phase

# configuration-space trajectory
plt.figure(2)
lines = plt.plot(y1[1:2000], y2[1:2000], 'ko', ms=1)
plt.setp(lines, linewidth=0.5)
plt.show()

for cloop in range(0,3):

    # strobe phase of this Poincaré section
    # (alternative: phase = np.random.rand()*np.pi)
    phase = np.pi*cloop/3

    repnum = 5000
    px = np.zeros(shape=(2*repnum,))
    xvar = np.zeros(shape=(2*repnum,))
    cnt = -1
    # testwt changes sign once per drive period at the chosen strobe phase
    testwt = np.mod(tspan-phase, T) - 0.5*T
    last = testwt[1]
    for loop in range(2, siz):
        if (last < 0) and (testwt[loop] > 0):
            # linear interpolation onto the strobe time
            cnt = cnt+1
            del1 = -testwt[loop-1]/(testwt[loop] - testwt[loop-1])
            px[cnt] = (y2[loop]-y2[loop-1])*del1 + y2[loop-1]
            xvar[cnt] = (y1[loop]-y1[loop-1])*del1 + y1[loop-1]
        last = testwt[loop]
 
    # plot only the filled Poincaré points for this phase
    plt.figure(3)
    if cloop == 0:
        lines = plt.plot(xvar[0:cnt+1], px[0:cnt+1], 'bo', ms=1)
    elif cloop == 1:
        lines = plt.plot(xvar[0:cnt+1], px[0:cnt+1], 'go', ms=1)
    else:
        lines = plt.plot(xvar[0:cnt+1], px[0:cnt+1], 'ro', ms=1)

    plt.show()

plt.savefig('Duffing')

Fig. 7 Strange attractor of the double-well Duffing equation for three selected phases.

[1] G. Duffing, Erzwungene Schwingungen bei veränderlicher Eigenfrequenz und ihre technische Bedeutung (Forced Oscillations with Variable Natural Frequency and their Technical Significance), Vieweg & Sohn, Braunschweig, 1918.

[2] K. Lachmann, “Duffing’s vibration problem,” Mathematische Annalen 99, 479-492 (1928).

[3] S. Timoshenko, Vibration Problems in Engineering, D. Van Nostrand Company, Inc., New York, 1928.

[4] K. O. Friedrichs, P. Le Corbeiller, N. Levinson, J. J. Stoker, Lectures on Non-Linear Mechanics delivered at Brown University, New York, 1942.

[5] I. Kovacic and M. J. Brennan, Eds., The Duffing Equation: Nonlinear Oscillators and their Behavior, Wiley, Chichester, United Kingdom, 2011.

[6] P. J. Holmes and D. A. Rand, “Bifurcations of Duffing’s Equation: An Application of Catastrophe Theory,” Journal of Sound and Vibration 44(2), 237-253 (1976).

[7] Y. Ueda, “Randomly Transitional Phenomena in the System Governed by Duffing’s Equation,” Journal of Statistical Physics 20(2), 181-196 (1979).

The Wonderful World of Hamiltonian Maps

Hamiltonian systems are freaks of nature.  Unlike the everyday world we experience, full of dissipation and inefficiency, Hamiltonian systems live in a world free of loss.  Despite how rare this situation is for us, this unnatural state happens commonly in two extremes: orbital mechanics and quantum mechanics.  In the case of orbital mechanics, dissipation does exist, most commonly in tidal effects, but the effects of dissipation in the orbits of moons and planets take eons to accumulate, making these systems effectively free of dissipation on shorter time scales.  Quantum mechanics is strictly free of dissipation, but there is a strong caveat: ALL quantum states need to be included in the quantum description.  This includes the coupling of discrete quantum states to their environment.  Although it is possible to isolate quantum systems to a large degree, it is never possible to isolate them completely, and they do interact with the quantum states of their environment, even if only through the black-body radiation of their container, and even if that container is cooled to millikelvin temperatures.  Such interactions involve so many degrees of freedom that the net effect behaves like dissipation.  The origin of quantum decoherence, which poses such a challenge for practical quantum computers, is the entanglement of quantum systems with their environment.


Liouville’s Theorem and Phase Space

A middle ground of practically ideal Hamiltonian mechanics can be found in the dynamics of ideal gases. This is the arena where Maxwell and Boltzmann first developed their theories of statistical mechanics using Hamiltonian physics to describe the large numbers of particles.  Boltzmann applied a result he learned from Jacobi’s Principle of the Last Multiplier to show that a volume of phase space is conserved despite the large number of degrees of freedom and the large number of collisions that take place.  This was the first derivation of what is today known as Liouville’s theorem.
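
In modern notation the result follows in a single line.  Hamilton’s equations, q̇ = ∂H/∂p and ṗ = −∂H/∂q, make the phase-space velocity field divergence-free,

∇·v = ∂/∂q (∂H/∂p) + ∂/∂p (−∂H/∂q) = 0

so any co-moving volume of initial conditions is carried along by the flow without change.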

Close-up of the Lozi Map with B = -1 and C = 0.5.

In 1838 Joseph Liouville, a pure mathematician, was interested in classes of solutions of differential equations.  In a short paper, he showed that for one class of differential equation one could define a property that remained invariant under the time evolution of the system.  This purely mathematical result was expanded upon by Jacobi, a major commentator on Hamilton’s new theory of dynamics who contributed much of the mathematical structure that we associate today with Hamiltonian mechanics.  Jacobi recognized that Hamilton’s equations were of the same class as the ones studied by Liouville, and that the conserved property was a product of differentials.  In the mid-1800s the language of multidimensional spaces had yet to be invented, so Jacobi recognized neither the conserved quantity as a volume element nor the arena within which the dynamics occurred as a geometric space.  Boltzmann recognized both, and he was the first to establish the principle of conservation of phase-space volume. He named this principle after Liouville, even though it was actually Boltzmann himself who found its natural place within the physics of Hamiltonian systems [1].

Liouville’s theorem plays a central role in the explanation of the entropy of ideal gases, as well as in Hamiltonian chaos.  In a system with numerous degrees of freedom, a small volume of initial conditions is stretched and folded by the dynamical equations as the system evolves.  The stretching and folding is like what happens to dough in a baker’s hands.  The volume of the dough never changes, but after a long time a small spot of food coloring will eventually come as close to any part of the dough as you wish.  This analogy is part of the motivation for ergodic systems, and this kind of mixing is characteristic of Hamiltonian systems, in which trajectories can diffuse throughout the phase-space volume … usually.

Interestingly, when the number of degrees of freedom is not so large, there is a middle ground of Hamiltonian systems for which some initial conditions lead to chaotic trajectories, while other initial conditions produce completely regular behavior.  For the right kind of system, the regular behavior can hem in the irregular behavior, restricting it to finite regions.  This was a major finding of KAM theory [2], named after Kolmogorov, Arnold and Moser, which helped explain the regions of regular motion separating regions of chaotic motion as illustrated in Chirikov’s Standard Map.

Discrete Maps

Hamilton’s equations are ordinary continuous differential equations that define a Hamiltonian flow in phase space.  These equations can be solved using standard techniques, such as Runge-Kutta.  However, a much simpler approach for exploring Hamiltonian chaos uses discrete maps that represent the Poincaré first-return map, also known as the Poincaré section.  Testing that a discrete map satisfies Liouville’s theorem is as simple as checking that the determinant of the Floquet matrix is equal to unity.  When the dynamics are represented in a Poincaré plane, these maps are called area-preserving maps.
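
As a concrete illustration, here is a minimal sketch (assuming sympy is available) of that check for the Chirikov Standard Map, taken in its commonly quoted form p' = p + K sin x, x' = x + p':

# Symbolic check that the Chirikov Standard Map is area-preserving:
# the determinant of its Jacobian (Floquet) matrix must equal unity.
import sympy as sp

x, p, K = sp.symbols('x p K')

# one iteration of the map (common form; conventions vary by reference)
p_new = p + K*sp.sin(x)
x_new = x + p_new

# Jacobian of (x_new, p_new) with respect to (x, p)
J = sp.Matrix([x_new, p_new]).jacobian([x, p])

print(sp.simplify(J.det()))    # prints 1, independent of K and x

The determinant is identically unity, so the map preserves area for every value of the kick strength K.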

There are many famous examples of area-preserving maps in the plane.  The Chirikov Standard Map is one of the best known and is often used to illustrate KAM theory.  It is a discrete representation of a kicked rotator, while a kicked harmonic oscillator leads to the Web Map.  The Henon Map was developed to explain the orbits of stars in galaxies.  The Lozi Map is a version of the Henon map that is more accessible analytically.  The Cat Map was devised by Vladimir Arnold to illustrate what is today called Arnold Diffusion.  All of these maps display classic signatures of (low-dimensional) Hamiltonian chaos, with periodic orbits hemming in regions of chaotic orbits.

Map                       Physical origin
Chirikov Standard Map     Kicked rotator
Web Map                   Kicked harmonic oscillator
Henon Map                 Stellar trajectories in galaxies
Lozi Map                  Simplified Henon map
Cat Map                   Arnold Diffusion

Table:  Common examples of area-preserving maps.

Lozi Map

My favorite area-preserving discrete map is the Lozi Map.  I first stumbled on this map at the very back of Steven Strogatz’ wonderful book on nonlinear dynamics [3].  It’s one of the last exercises of the last chapter.  The map is particularly simple, but it leads to rich dynamics, both regular and chaotic.  The map equations are

x_{n+1} = 1 + y_n − C |x_n|
y_{n+1} = B x_n

which is area-preserving when |B| = 1.  The constant C can be varied, but the choice C = 0.5 works nicely, and B = -1 produces a beautiful nested structure, as shown in the figure.
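
A quick check of the area-preserving condition: away from the kink at x = 0, the Jacobian matrix of the map is

J = [ −C·sgn(x_n)   1 ]
    [      B        0 ]

with det J = −B, so each iteration scales areas by |B|, which equals unity when |B| = 1.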

Iterated Lozi map for B = -1 and C = 0.5.  Each color is a distinct trajectory.  Many regular trajectories exist that corral regions of chaotic trajectories.  Trajectories become more chaotic farther away from the center.

Python Code for the Lozi Map

"""
Created on Wed May  2 16:17:27 2018
@author: nolte
"""
import numpy as np
from scipy import integrate
from matplotlib import pyplot as plt

B = -1
C = 0.5

np.random.seed(2)
plt.figure(1)

for eloop in range(0,100):

    xlast = np.random.normal(0,1,1)
    ylast = np.random.normal(0,1,1)

    xnew = np.zeros(shape=(500,))
    ynew = np.zeros(shape=(500,))
    for loop in range(0,500):
        xnew[loop] = 1 + ylast - C*abs(xlast)
        ynew[loop] = B*xlast
        xlast = xnew[loop]
        ylast = ynew[loop]
        
    plt.plot(np.real(xnew),np.real(ynew),'o',ms=1)
    plt.xlim(xmin=-1.25,xmax=2)
    plt.ylim(ymin=-2,ymax=1.25)
        
plt.savefig('Lozi')

References:

[1] D. D. Nolte, “The Tangled Tale of Phase Space”, Chapter 6 in Galileo Unbound: A Path Across Life, the Universe and Everything (Oxford University Press, 2018)

[2] H. S. Dumas, The KAM Story: A Friendly Introduction to the Content, History, and Significance of Classical Kolmogorov-Arnold-Moser Theory (World Scientific, 2014)

[3] S. H. Strogatz, Nonlinear Dynamics and Chaos (Westview Press, 1994)

How to Weave a Tapestry from Hamiltonian Chaos

While virtually everyone recognizes the famous Lorenz “Butterfly”, the strange attractor that is one of the central icons of chaos theory, in my opinion Hamiltonian chaos generates far more interesting patterns. This is because Hamiltonian flows conserve phase-space volume, stretching and folding small volumes of initial conditions as they evolve in time, until they span large sections of phase space. Hamiltonian chaos is usually displayed as multi-color Poincaré sections (also known as first-return maps) that are created when a set of single trajectories, each represented by a single color, pierce the Poincaré plane over and over again.



A Hamiltonian tapestry generated from the Web Map for K = 0.616 and q = 4.

Periodically-Kicked Hamiltonian

The classic Hamiltonian system, perhaps the archetype of all Hamiltonian systems, is the harmonic oscillator. The physics of the harmonic oscillator is taught in the most elementary courses, because every stable system in the world is approximated, to lowest order, as a harmonic oscillator. One might think the simplest dynamical system would hold no surprises. But surprisingly, it can create the most beautiful tapestries of color when pulsed periodically and mapped onto the Poincaré plane.
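
Schematically (a sketch only; normalizations and signs vary between references), the kicked oscillator can be written as

H = p²/2 + ω₀²x²/2 − K cos(x) Σₙ δ(t − nT)

so that the oscillator rotates freely in phase space between kicks and receives an impulsive momentum change proportional to K sin(x) once per period T. Integrating across one period produces the discrete map below, with rotation angle α = ω₀T.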

The Hamiltonian of the periodically kicked harmonic oscillator is thereby converted into the Web Map, an iterative mapping:

x_{n+1} = (x_n + K sin y_n) cos α + y_n sin α
y_{n+1} = −(x_n + K sin y_n) sin α + y_n cos α

There can be resonance between the sequence of kicks and the natural oscillator frequency such that α = 2π/q. At these resonances, intricate web patterns emerge. The Web Map produces a web of stochastic layers when plotted on an extended phase plane. The symmetry of the web is controlled by the integer q, and the stochastic layer width is controlled by the perturbation strength K.
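
Two properties are easy to verify directly from the map equations. With K = 0 the map reduces to a rigid rotation of the (x, y) plane,

(x_{n+1}, y_{n+1}) = (x_n cos α + y_n sin α, −x_n sin α + y_n cos α)

so unkicked orbits lie on circles, and the resonance α = 2π/q closes an orbit after q steps. And for any K the Jacobian determinant of the map works out to cos²α + sin²α = 1, so the map is area-preserving, as required of a Hamiltonian system.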


A tapestry for q = 6.

Web Map Python Program

Iterated maps are easy to implement in code.  Here is a simple Python code to generate maps of different types.  You can play with the coupling constant K and the periodicity q.  For small K the tapestries are mostly regular, but as the coupling K increases, stochastic layers emerge.  When q is a small even number, tapestries of regular symmetry are generated.  However, when q is a small odd integer, the tapestries turn into quasi-crystals.

#!/usr/bin/env python3
# -*- coding: utf-8 -*-
"""
@author: nolte
"""
import numpy as np
from matplotlib import pyplot as plt

plt.close('all')

phi = (1+np.sqrt(5))/2     # golden ratio
K = 1-phi     # about -0.618; try (K, q) = (0.618, 4), (0.618, 5), (0.618, 7), (1.2, 4)
q = 4         # symmetry of the web: 4, 5, 6, 7
alpha = 2*np.pi/q

np.random.seed(2)
plt.figure(1)
for eloop in range(0,1000):

    # random initial condition for each trajectory
    xlast = 50*np.random.random()
    ylast = 50*np.random.random()

    xnew = np.zeros(shape=(300,))
    ynew = np.zeros(shape=(300,))

    for loop in range(0,300):
        # kick by K*sin(y), then rotate the phase plane by alpha
        xnew[loop] = (xlast + K*np.sin(ylast))*np.cos(alpha) + ylast*np.sin(alpha)
        ynew[loop] = -(xlast + K*np.sin(ylast))*np.sin(alpha) + ylast*np.cos(alpha)

        xlast = xnew[loop]
        ylast = ynew[loop]

    plt.plot(xnew, ynew, 'o', ms=1)
    plt.xlim(-60, 60)
    plt.ylim(-60, 60)

plt.title('WebMap')
plt.savefig('WebMap')

References and Further Reading

D. D. Nolte, Introduction to Modern Dynamics: Chaos, Networks, Space and Time (Oxford, 2015)

G. M. Zaslavsky, Hamiltonian Chaos and Fractional Dynamics (Oxford, 2005)