Imagine if you just discovered how to text through time, i.e. time-texting, when a close friend meets a shocking death. Wouldn’t you text yourself in the past to try to prevent it? But what if, every time you change the time-line and alter the future in untold ways, the friend continues to die, and you seemingly can never stop it? This is the premise of Stein’s Gate, a Japanese sci-fi animé bringing in the paradoxes of time travel, casting CERN as an evil clandestine spy agency, and introducing do-it-yourself inventors, hackers, and wacky characters, while it centers on a terrible death of a lovable character that can never be avoided.
It is also a good computational physics project that explores the dynamics of bifurcations, bistability and chaos. I teach a course in modern dynamics in the Physics Department at Purdue University. The topics of the course range broadly from classical mechanics to chaos theory, social networks, synchronization, nonlinear dynamics, economic dynamics, population dynamics, evolutionary dynamics, neural networks, and special and general relativity, among others, all covered using a textbook that takes a modern view of dynamics.
For the final project of the second semester the students (Junior physics majors) are asked to combine two or three of the topics into a single project. Students have come up with a lot of creative combinations: population dynamics of zombies, nonlinear dynamics of negative gravitational mass, percolation of misinformation in presidential elections, evolutionary dynamics of neural architecture, and many more. In that spirit, and for a little fun, in this blog I explore the so-called physics of Stein’s Gate.
Stein’s Gate and the Divergence Meter
Stein’s Gate is a Japanese TV animé series that had a world-wide distribution in 2011. The central premise of the plot is that certain events always occur even if you are on different timelines—like trying to avoid someone’s death in an accident.
This is the problem confronting Rintaro Okabe who tries to stop an accident that kills his friend Mayuri Shiina. But every time he tries to change time, she dies in some other way. It turns out that all the nearby timelines involve her death. According to a device known as The Divergence Meter, Rintaro must get farther than 4% away from the original timeline to have a chance to avoid the otherwise unavoidable event.
This is new. Usually, time-travel Sci-Fi is based on the Butterfly Effect. Chaos theory is characterized by something called sensitivity to initial conditions (SIC), meaning that slightly different starting points produce trajectories that diverge exponentially from nearby trajectories. It is called the Butterfly Effect because of the whimsical notion that a butterfly flapping its wings in China can cause a hurricane in Florida. In the context of the butterfly effect, if you go back in time and change anything at all, the effect cascades through time until the present time is unrecognizable. As an example, in one episode of the TV cartoon The Simpsons, Homer goes back in time to the age of the dinosaurs and kills a single mosquito. When he gets back to our time, everything has changed in bizarre and funny ways.
Stein’s Gate introduces a creative counter example to the Butterfly Effect. Instead of scrambling the future when you fiddle with the past, you find that you always get the same event, even when you change a lot of the conditions—Mayuri still dies. This sounds eerily familiar to a physicist who knows something about chaos theory. It means that the unavoidable event is acting like a stable fixed point in the time dynamics—an attractor! Even if you change the initial conditions, the dynamics draw you back to the fixed point—in this case Mayuri’s accident. What would this look like in a dynamical system?
The Local Basin of Attraction
Dynamical systems can be described as trajectories in a high-dimensional state space. Within state space there are special points where the dynamics are static, known as fixed points. For a stable fixed point, a slight perturbation away will relax back to the fixed point. For an unstable fixed point, on the other hand, a slight perturbation grows and the system dynamics evolve away. However, there can be regions in state space where every initial condition leads to trajectories that stay within that region. This is known as a basin of attraction, and the boundaries of these basins are called separatrices.
A high-dimensional state space can have many basins of attraction. All the physics that starts within a basin stays within that basin—almost like its own self-consistent universe, bordered by countless other universes. There are well-known physical systems that have many basins of attraction. String theory is suspected to generate many adjacent universes where the physical laws are a little different in each basin of attraction. Spin glasses, which are amorphous solid-state magnets, have this property, as do recurrent neural networks like the Hopfield network. Basins of attraction occur naturally within the physics of these systems.
It is possible to embed basins of attraction within an existing dynamical system. As an example, let’s start with one of the simplest types of dynamics, a hyperbolic fixed point
that has a single saddle fixed point at the origin. We want to add a basin of attraction at the origin with a domain range given by a radius r0. At the same time, we want to create a separatrix that keeps the outer hyperbolic dynamics separate from the internal basin dynamics. To keep all outer trajectories in the outer domain, we can build a dynamical barrier to prevent the trajectories from crossing the separatrix. This can be accomplished by adding a radial repulsive term
In x-y coordinates this is
We also want to keep the internal dynamics of our basin separate from the external dynamics. To do this, we can multiply by a sigmoid function, like a Heaviside function H(r-r0), to zero-out the external dynamics inside our basin. The final external dynamics is then
Now we have to add the internal dynamics for the basin of attraction. To make it a little more interesting, let’s make the internal dynamics an autonomous oscillator
Putting this all together, gives
This looks a little complex, for such a simple model, but it illustrates the principle. The sigmoid is best if it is differentiable, so instead of a Heaviside function it can be a Fermi function
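The equation images for this construction are missing from this copy. As a hedged reconstruction, read off from the SteinsGate2D.py listing further down rather than from the original figures, the assembled flow with a Fermi-function switch F(r) is approximately the following. The specific numbers (basin radius 2, switch width 0.1, barrier offset 1.99, small constant 0.01, ω = 1) are the values hard-coded in that listing, not general choices, and the symbols A and x0 used later in the text may parameterize the barrier differently.

```latex
\begin{aligned}
r &= \sqrt{x^2 + y^2}, \qquad F(r) = \frac{1}{1 + e^{(r - 2)/0.1}} \\
\dot{x} &= \bigl[1 - F(r)\bigr]\left[ x\left(\frac{1}{(r - 1.99)^2} + 0.01\right) - x \right]
          + F(r)\,\omega\bigl[\, y + x(1 - r) \bigr] \\
\dot{y} &= \bigl[1 - F(r)\bigr]\left[ y\left(\frac{1}{(r - 1.99)^2} + 0.01\right) + y \right]
          + F(r)\,\omega\bigl[ -x + y(1 - r) \bigr]
\end{aligned}
```

In this form the bare -x and +y terms are the hyperbolic saddle, the 1/(r - 1.99)^2 term is the radial repulsion, and the bracket multiplied by F(r) is the internal oscillator, whose radial equation reduces to dr/dt = ω r (1 - r).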
The phase-space portrait of the final dynamics looks like
Adding the internal dynamics does not change the far-field external dynamics, which are still hyperbolic. The repulsive term does split the central saddle point into two saddle points, one on each side left-and-right, so the repulsive term actually splits the dynamics. But the internal dynamics are self-contained and separate from the external dynamics. The origin is an unstable spiral that evolves to a limit cycle. The basin boundary has marginal stability and is known as a “wall”.
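The claim that the origin is an unstable spiral that relaxes onto a limit cycle can be checked with a minimal sketch. It isolates only the internal oscillator terms from the SteinsGate2D.py listing (no barrier and no sigmoid switch, which is an assumption made to keep the check simple) and integrates from one starting point near the origin and one outside the cycle. The function names are hypothetical helpers, not part of the original program.

```python
import numpy as np
from scipy.integrate import solve_ivp

def internal_flow(t, state, w=1.0):
    """Internal oscillator only; in polar form this is dr/dt = w*r*(1 - r)."""
    x, y = state
    r = np.hypot(x, y)
    return [w*y + w*x*(1 - r), -w*x + w*y*(1 - r)]

def final_radius(x0, y0, t_end=30.0):
    """Integrate from (x0, y0) and return the radius at t_end."""
    sol = solve_ivp(internal_flow, (0.0, t_end), [x0, y0],
                    rtol=1e-9, atol=1e-12)
    return float(np.hypot(sol.y[0, -1], sol.y[1, -1]))

r_from_inside = final_radius(0.05, 0.05)   # starts near the unstable origin
r_from_outside = final_radius(1.7, 0.0)    # starts outside the cycle
print(r_from_inside, r_from_outside)
```

Both radii should settle close to 1, which is the limit cycle implied by the (1 - R) factor in the listing.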
To verify the stability of the external fixed point, find the fixed point coordinates
and evaluate the Jacobian matrix (for A = 1 and x0 = 2)
which is clearly a saddle point because the determinant is negative.
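Since the fixed-point coordinates and the Jacobian matrix are not reproduced in this copy, the saddle character can be checked numerically instead. The sketch below copies the flow from the SteinsGate2D.py listing, finds the right-hand external fixed point with a root finder, and evaluates a finite-difference Jacobian there. The starting guess (3, 0) and the helper names are assumptions for illustration; the constants are those hard-coded in the listing, which may not correspond exactly to A = 1 and x0 = 2.

```python
import numpy as np
from scipy.optimize import root

def flow(state, w=1.0):
    """Right-hand side of the 2D model, copied from the SteinsGate2D.py listing."""
    x, y = state
    R = np.sqrt(x**2 + y**2)
    env1 = 1/(1 + np.exp((R - 2)/0.1))   # close to 1 inside the basin, 0 outside
    env2 = 1 - env1
    f = env2*(x*(1/(R - 1.99)**2 + 1e-2) - x) + env1*(w*y + w*x*(1 - R))
    g = env2*(y*(1/(R - 1.99)**2 + 1e-2) + y) + env1*(-w*x + w*y*(1 - R))
    return np.array([f, g])

def jacobian(state, h=1e-6):
    """Central-difference Jacobian; column j holds derivatives w.r.t. state[j]."""
    state = np.asarray(state, dtype=float)
    J = np.zeros((2, 2))
    for j in range(2):
        step = np.zeros(2)
        step[j] = h
        J[:, j] = (flow(state + step) - flow(state - step))/(2*h)
    return J

# Starting guess near the right-hand saddle (assumed from the listing's constants)
sol = root(flow, [3.0, 0.0])
x_star, y_star = sol.x
det_J = np.linalg.det(jacobian(sol.x))
print(x_star, y_star, det_J)
```

A negative determinant means the two eigenvalues have opposite signs, which is the saddle condition used in the text.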
In the context of Stein’s Gate, the basin boundary is equivalent to the 4% divergence which is necessary to escape the internal basin of attraction where Mayuri meets her fate.
Python Program: SteinsGate2D.py
# -*- coding: utf-8 -*-
"""
Created on Sat March 6, 2021
@author: David Nolte
Introduction to Modern Dynamics, 2nd edition (Oxford University Press, 2019)
2D simulation of Stein's Gate Divergence Meter
"""
import numpy as np
from scipy import integrate
from matplotlib import pyplot as plt

def solve_flow(param, lim=(-6, 6, -6, 6), max_time=20.0):

    def flow_deriv(x_y, t0, alpha, beta, gamma):
        """Compute the time-derivative (alpha, beta, gamma are not used)."""
        x, y = x_y
        w = 1
        R2 = x**2 + y**2
        R = np.sqrt(R2)
        # Fermi-function switch: env1 ~ 1 inside the basin (R < 2), env2 ~ 1 outside
        arg = (R-2)/0.1
        env1 = 1/(1+np.exp(arg))
        env2 = 1 - env1
        # external saddle plus radial barrier, blended with the internal oscillator
        f = env2*(x*(1/(R-1.99)**2 + 1e-2) - x) + env1*(w*y + w*x*(1 - R))
        g = env2*(y*(1/(R-1.99)**2 + 1e-2) + y) + env1*(-w*x + w*y*(1 - R))
        return [f, g]

    model_title = 'Steins Gate'
    xmin = lim[0]
    xmax = lim[1]
    ymin = lim[2]
    ymax = lim[3]
    plt.axis([xmin, xmax, ymin, ymax])

    # Initial conditions: 24 points on each edge of the window plus 47 extras
    N = 24*4 + 47
    x0 = np.zeros(shape=(N,2))
    ind = -1
    for i in range(0,24):
        ind = ind + 1
        x0[ind,0] = xmin + (xmax-xmin)*i/23
        x0[ind,1] = ymin
        ind = ind + 1
        x0[ind,0] = xmin + (xmax-xmin)*i/23
        x0[ind,1] = ymax
        ind = ind + 1
        x0[ind,0] = xmin
        x0[ind,1] = ymin + (ymax-ymin)*i/23
        ind = ind + 1
        x0[ind,0] = xmax
        x0[ind,1] = ymin + (ymax-ymin)*i/23

    # One point near the origin
    ind = ind + 1
    x0[ind,0] = 0.05
    x0[ind,1] = 0.05

    # Rings of starting points inside the basin and on its boundary
    for thetloop in range(0,10):
        ind = ind + 1
        theta = 2*np.pi*(thetloop)/10
        ys = 0.125*np.sin(theta)
        xs = 0.125*np.cos(theta)
        x0[ind,0] = xs
        x0[ind,1] = ys
    for thetloop in range(0,10):
        ind = ind + 1
        theta = 2*np.pi*(thetloop)/10
        ys = 1.7*np.sin(theta)
        xs = 1.7*np.cos(theta)
        x0[ind,0] = xs
        x0[ind,1] = ys
    for thetloop in range(0,20):
        ind = ind + 1
        theta = 2*np.pi*(thetloop)/20
        ys = 2*np.sin(theta)
        xs = 2*np.cos(theta)
        x0[ind,0] = xs
        x0[ind,1] = ys

    # Points close to the two external saddles and on the x-axis
    ind = ind + 1
    x0[ind,0] = -3
    x0[ind,1] = 0.05
    ind = ind + 1
    x0[ind,0] = -3
    x0[ind,1] = -0.05
    ind = ind + 1
    x0[ind,0] = 3
    x0[ind,1] = 0.05
    ind = ind + 1
    x0[ind,0] = 3
    x0[ind,1] = -0.05
    ind = ind + 1
    x0[ind,0] = -6
    x0[ind,1] = 0.00
    ind = ind + 1
    x0[ind,0] = 6
    x0[ind,1] = 0.00

    colors = plt.cm.prism(np.linspace(0, 1, N))

    # Solve for the trajectories
    t = np.linspace(0, max_time, int(250*max_time))
    x_t = np.asarray([integrate.odeint(flow_deriv, x0i, t, param)
                      for x0i in x0])

    for i in range(N):
        x, y = x_t[i,:,:].T
        lines = plt.plot(x, y, '-', c=colors[i])
    plt.title(model_title)
    plt.show()

    return t, x_t

param = (0.02,0.5,0.2)   # Steins Gate
lim = (-6,6,-6,6)
t, x_t = solve_flow(param,lim)
The Lorenz Butterfly
Two-dimensional phase space cannot support chaos, and we would like to reconnect the central theme of Stein’s Gate, the Divergence Meter, with the Butterfly Effect. Therefore, let’s actually incorporate our basin of attraction inside the classic Lorenz Butterfly. The goal is to put an attracting domain into the midst of the three-dimensional state space of the Lorenz butterfly in a way that repels the butterfly, without destroying it, but attracts local trajectories. The question is whether the butterfly can survive if part of its state space is made unavailable to it.
The classic Lorenz dynamical system is
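The equation image is missing here. For reference, the standard form of the Lorenz system, written with the parameter names p, r and b that this post uses for the Butterfly parameters, is:

```latex
\begin{aligned}
\dot{x} &= p\,(y - x) \\
\dot{y} &= r\,x - y - x z \\
\dot{z} &= x y - b\,z
\end{aligned}
```

The modified flow with the repelling barrier and the sigmoid switch is not reconstructed here, because its exact form cannot be recovered from the surrounding text.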
As in the 2D case, we will put in a repelling barrier that prevents external trajectories from moving into the local basin, and we will isolate the external dynamics by using the sigmoid function. The final flow equations look like
where the radius is relative to the center of the attracting basin
and r0 is the radius of the basin. The center of the basin is at [x0, y0, z0] and we are assuming that x0 = 0 and y0 = 0 and z0 = 25 for the standard Butterfly parameters p = 10, r = 25 and b = 8/3. This puts our basin of attraction a little on the high side of the center of the Butterfly. If we embed it too far inside the Butterfly it does actually destroy the Butterfly dynamics.
When r0 = 0, the dynamics of the Lorenz Butterfly are essentially unchanged. However, when r0 = 1.5, there is a repulsive effect on trajectories that pass close to the basin. This can be seen in Figure 2, where part of the trajectory skips around the outside of the basin.
Trajectories can begin very close to the basin, but still on the outside of the separatrix, as in the top row of Figure 3 where the basin of attraction with r0 = 1.5 lies a bit above the center of the Butterfly. The Butterfly still exists for the external dynamics. However, any trajectory that starts within the basin of attraction remains there and executes a stable limit cycle. This is the world where Mayuri dies inside the 4% divergence. But if the initial condition can exceed 4%, then the Butterfly effect takes over. The bottom row of Figure 2 shows that the Butterfly itself is fragile. When the external dynamics are perturbed more strongly by more closely centering the local basin, the hyperbolic dynamics of the Butterfly are impeded and the external dynamics are converted to a stable limit cycle. It is interesting that the Butterfly, so often used as an illustration of sensitivity to initial conditions (SIC), is itself sensitive to perturbations that can convert it away from chaos and back to regular motion.
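The sensitivity to initial conditions that the Butterfly is known for can be illustrated with a minimal sketch of the plain Lorenz system, without the embedded basin (whose flow equations are not reproduced in this copy). As a loudly stated assumption, it uses the common textbook value r = 28 rather than the r = 25 quoted above, because the chaotic divergence is well established there; the starting point, the perturbation size and the integration time are also illustrative choices.

```python
import numpy as np
from scipy.integrate import solve_ivp

def lorenz(t, s, p=10.0, r=28.0, b=8.0/3.0):
    """Classic Lorenz flow; r = 28 is an assumed textbook value."""
    x, y, z = s
    return [p*(y - x), r*x - y - x*z, x*y - b*z]

t_eval = np.linspace(0.0, 40.0, 4001)
start_a = np.array([1.0, 1.0, 1.0])
start_b = start_a + np.array([1e-8, 0.0, 0.0])   # tiny change in the "past"

sol_a = solve_ivp(lorenz, (0.0, 40.0), start_a, t_eval=t_eval,
                  rtol=1e-10, atol=1e-12)
sol_b = solve_ivp(lorenz, (0.0, 40.0), start_b, t_eval=t_eval,
                  rtol=1e-10, atol=1e-12)

# Distance between the two trajectories at each sample time
separation = np.linalg.norm(sol_a.y - sol_b.y, axis=0)
print(separation[0], separation.max())
```

The separation starts at the size of the perturbation and grows until it is comparable to the size of the attractor, which is the Butterfly Effect that the embedded basin is meant to counteract locally.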
Discussion and Extensions
In the examples shown here, the local basin of attraction was put in “by hand” as an isolated region inside the dynamics. It would be interesting to consider more natural systems, like a spin glass or a Hopfield network, where the basins of attraction occur naturally from the physical principles of the system. Then we could use the “Divergence Meter” to explore these physical systems to see how far the dynamics can diverge before crossing a separatrix. These systems are impossible to visualize because they are intrinsically very high dimensional systems, but Monte Carlo approaches could be used to probe the “sizes” of the basins.
Another interesting extension would be to embed these complex dynamics into spacetime. Since this all started with the idea of texting through time, it would be interesting (and challenging) to see how we could describe this process in a high dimensional Minkowski space that had many space dimensions (but still only one time dimension). Certainly it would violate the speed of light criterion, but we could then take the approach of David Deutsch and view the time axis as if it had multiple branches, like the branches of the arctangent function, creating time-consistent sheets within a sheaf of flat Minkowski spaces.
Snorkeling above a shallow reef on a clear sunny day transports you to an otherworldly galaxy of spectacular deep colors and light reverberating off of the rippled surface. Playing across the underwater floor of the reef is a fabulous light show of bright filaments entwining and fluttering, creating random mesh networks of light and dark. These same patterns appear on the bottom of swimming pools in summer and in deep fountains in parks.
Something similar happens when a bare overhead light reflects from the sides of a circular glass of water. The pattern no longer moves, but a dazzling filament splays across the bottom of the glass with a sharp bright cusp at the center. These bright filaments of light have an age old name — Caustics — meaning burning as in burning with light. The study of caustics goes back to Archimedes of Syracuse and his apocryphal burning mirrors that are supposed to have torched the invading triremes of the Roman navy in 212 BC.
Caustics in optics are concentrations of light rays that form bright filaments, often with cusp singularities. Mathematically, they are envelope curves that are tangent to a set of lines. Cata-caustics are caustics caused by light reflecting from curved surfaces. Dia-caustics are caustics caused by light refracting from transparent curved materials.
From Leonardo to Huygens
Even after Archimedes, burning mirrors remained an interest for a broad range of scientists, artists and engineers. Leonardo Da Vinci took an interest around 1503 – 1506 when he drew reflected caustics from a circular mirror in his many notebooks.
In the decades after Newton and Leibniz invented the calculus, a small cadre of mathematicians strove to apply the new method to understand aspects of the physical world. At a time when Newton had left the calculus behind to follow more arcane pursuits, Leibniz, Jakob and Johann Bernoulli, Guillaume de l’Hôpital, Émilie du Châtelet and Walter von Tschirnhaus were pushing notation reform (mainly following Leibniz) to make the calculus easier to learn and use, as well as finding new applications, of which there were many.
Ehrenfried Walter von Tschirnhaus (1651 – 1708) was a German mathematician and physician and a lifelong friend of Leibniz, whom he met in Paris in 1675. He was one of only five mathematicians to provide a solution to Johann Bernoulli’s brachistochrone problem. One of the recurring interests of von Tschirnhaus, which he revisited throughout his career, was in burning glasses and mirrors. A burning glass is a high-quality magnifying lens that brings the focus of the sun to a fine point to burn or anneal various items. Burning glasses were used to heat small items for manufacture or for experimentation. For instance, Priestley and Lavoisier routinely used burning glasses in their chemistry experiments. Low optical aberrations were required for the lenses to bring the light to the finest possible focus, so the study of optical focusing was an important topic both academically and practically. Tschirnhaus had his own laboratory to build and test burning mirrors, and he became aware of the cata-caustic patterns of light reflected from a circular mirror or glass surface. Given his parallel interest in the developing calculus methods, he published a paper in Acta Eruditorum in 1682 that constructed the envelope function created by the cata-caustics of a circle. However, Tschirnhaus did not produce the analytic function; that was provided by Johann Bernoulli ten years later in 1692.
Johann Bernoulli had a stormy career and a problematic personality, but he was brilliant even among the bountiful Bernoulli clan. Using methods of tangents, he found the analytic solution of the caustic of the circle. He did this by stating the general equation for all reflected rays and then finding when their y values are independent of changing angle … in other words using the principle of stationarity which would later become a potent tool in the hands of Lagrange as he developed Lagrangian physics.
The equation for the reflected ray, expressing y as a function of x for a given angle α in Fig. 5, is
The condition of the caustic envelope requires the change in y with respect to the angle α to vanish while treating x as a constant. This is a partial derivative, and Johann Bernoulli is giving an early use of this method in 1692 to ensure the stationarity of y with respect to the changing angle. The partial derivative is
This is solved to give
Plugging this into the ray equation above yields
These last two expressions for x and y in terms of the angle α are a parametric representation of the caustic. Combining them gives the solution to the caustic of the circle
The square root provides the characteristic cusp at the center of the caustic.
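Because the equation images are missing from this copy, the derivation can be checked with a short numerical sketch under stated assumptions: a unit circular mirror, light travelling straight down, and α measured from the bottom of the circle, so that a ray reflects at (sin α, -cos α). The angle convention in the original figure may differ. Under these assumptions the stationarity condition gives x = sin³α and y = -(1/2) cos α (1 + 2 sin²α), and eliminating α gives y(x) = -(1/2) sqrt(1 - x^(2/3)) (1 + 2 x^(2/3)). The function names are hypothetical.

```python
import numpy as np

def ray_y(x, alpha):
    """Height of the reflected ray at horizontal position x.

    The ray leaves (sin(alpha), -cos(alpha)) with slope -cot(2*alpha).
    """
    return -np.cos(alpha) - (x - np.sin(alpha))/np.tan(2*alpha)

def caustic_point(alpha):
    """Envelope point obtained from d(ray_y)/d(alpha) = 0 at fixed x."""
    x = np.sin(alpha)**3
    y = -0.5*np.cos(alpha)*(1 + 2*np.sin(alpha)**2)
    return x, y

def caustic_y_of_x(x):
    """Explicit caustic after eliminating alpha (valid for 0 <= x <= 1)."""
    s2 = np.cbrt(x)**2                 # equals sin(alpha)**2
    return -0.5*np.sqrt(1 - s2)*(1 + 2*s2)

alphas = np.linspace(0.2, 1.2, 6)      # stay away from alpha = 0 and pi/2
h = 1e-5
on_ray = []          # does each ray pass through its caustic point?
stationary = []      # is d(ray_y)/d(alpha) zero there at fixed x?
explicit = []        # does the explicit y(x) agree with the parametric form?
for a in alphas:
    xc, yc = caustic_point(a)
    on_ray.append(abs(ray_y(xc, a) - yc))
    stationary.append(abs((ray_y(xc, a + h) - ray_y(xc, a - h))/(2*h)))
    explicit.append(abs(caustic_y_of_x(xc) - yc))

cusp_height = caustic_y_of_x(0.0)      # cusp halfway between center and mirror
print(max(on_ray), max(stationary), max(explicit), cusp_height)
```

The three residuals should be close to zero, and the cusp sits at y = -1/2, the paraxial focal point of a unit circular mirror.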
Python Code: raycaustic.py
There are lots of options here. Try them all … then add your own!
# -*- coding: utf-8 -*-
"""
Created on Tue Feb 16 16:44:42 2021
D. D. Nolte, Optical Interferometry for Biology and Medicine (Springer,2011)
"""
import numpy as np
from matplotlib import pyplot as plt

# model_case 1 = cosine
# model_case 2 = circle
# model_case 3 = square root
# model_case 4 = inverse power law
# model_case 5 = ellipse
# model_case 6 = secant
# model_case 7 = parabola
# model_case 8 = Cauchy
model_case = int(input('Input Model Case (1-8)'))

# Plot window for each mirror profile
if model_case == 1:
    model_title = 'cosine'
    xleft = -np.pi
    xright = np.pi
    ybottom = -1
    ytop = 1.2
elif model_case == 2:
    model_title = 'circle'
    xleft = -1
    xright = 1
    ybottom = -1
    ytop = .2
elif model_case == 3:
    model_title = 'square-root'
    xleft = 0
    xright = 4
    ybottom = -2
    ytop = 2
elif model_case == 4:
    model_title = 'Inverse Power Law'
    xleft = 1e-6
    xright = 4
    ybottom = 0
    ytop = 4
elif model_case == 5:
    model_title = 'ellipse'
    a = 0.5
    b = 2
    xleft = -b
    xright = b
    ybottom = -a
    ytop = 0.5*b**2/a
elif model_case == 6:
    model_title = 'secant'
    xleft = -np.pi/2
    xright = np.pi/2
    ybottom = 0.5
    ytop = 4
elif model_case == 7:
    model_title = 'Parabola'
    xleft = -2
    xright = 2
    ybottom = 0
    ytop = 4
elif model_case == 8:
    model_title = 'Cauchy'
    xleft = 0
    xright = 4
    ybottom = 0
    ytop = 4

def feval(x):
    """Height of the reflecting surface at x for the chosen model case."""
    if model_case == 1:
        y = -np.cos(x)
    elif model_case == 2:
        y = -np.sqrt(1-x**2)
    elif model_case == 3:
        y = -np.sqrt(x)
    elif model_case == 4:
        y = x**(-0.75)
    elif model_case == 5:
        y = -a*np.sqrt(1-x**2/b**2)
    elif model_case == 6:
        y = 1.0/np.cos(x)
    elif model_case == 7:
        y = 0.5*x**2
    elif model_case == 8:
        y = 1/(1 + x**2)
    return y

# Draw the mirror profile
xx = np.arange(xleft,xright,0.01)
yy = feval(xx)
lines = plt.plot(xx,yy)
plt.axis([xleft, xright, ybottom, ytop])
plt.title(model_title)

# Trace vertical incoming rays and their reflections
delx = 0.001
N = 75
for i in range(N+1):
    x = xleft + (xright-xleft)*(i-1)/N
    val = feval(x)
    valp = feval(x+delx/2)
    valm = feval(x-delx/2)
    deriv = (valp-valm)/delx      # local slope of the surface
    phi = np.arctan(deriv)
    slope = np.tan(np.pi/2 + 2*phi)   # slope of the reflected ray
    if np.abs(deriv) < 1:
        # reflected ray heads upward: extend it to the top of the window
        xf = (ytop-val+slope*x)/slope
        yf = ytop
    else:
        # reflected ray heads downward: extend it to the bottom of the window
        xf = (ybottom-val+slope*x)/slope
        yf = ybottom
    plt.plot([x, x],[ytop, val],linewidth = 0.5)
    plt.plot([x, xf],[val, yf],linewidth = 0.5)

plt.show()
The Dia-caustics of Swimming Pools
A caustic is understood mathematically as the envelope function of multiple rays that converge in the Fourier domain (angular deflection measured at far distances). These are points of mathematical stationarity, in which the ray density is invariant to first order in deviations in the refracting surface. The rays themselves are the trajectories of the Eikonal Equation as rays of light thread their way through complicated optical systems.
The basic geometry is shown in Fig 7 for a ray incident on a nonplanar surface emerging into a less-dense medium. From Snell’s law we have the relation for light entering a dense medium like light into water
where n is the relative index (ratio), and the small-angle approximation has been made. The incident angle θ1 is simply related to the slope of the interface dh/dx as
where the small-angle approximation is used again. The angular deflection relative to the optic axis is then
which is equal to the optical path difference through the sample.
In two dimensions, the optical path difference can be replaced with a general potential
and the two orthogonal angular deflections (measured in the far field on a Fourier plane) are
These angles describe the deflection of the rays across the sample surface. They are also the right-hand side of the Eikonal Equation, the equation governing ray trajectories through optical systems.
Caustics are lines of stationarity, meaning that the density of rays is independent of first-order changes in the refracting sample. The condition of stationarity is defined by the Jacobian of the transformation from (x,y) to (θx, θy) with
where the second expression is the Hessian determinant of the refractive power of the uneven surface. When this condition is satisfied, the envelope function bounding groups of collected rays is stationary to perturbations in the inhomogeneous sample.
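The Hessian-determinant condition can be tried out on a surface whose derivatives are known in closed form. The sketch below is an illustration under assumptions, not the calculation used for the figures: the test potential sin(x)cos(y), the grid size and the variable names are all hypothetical. It also makes explicit that np.gradient returns the derivative along axis 0 (rows, here y) first and along axis 1 (columns, here x) second.

```python
import numpy as np

n = 201
x = np.linspace(-np.pi, np.pi, n)
y = np.linspace(-np.pi, np.pi, n)
X, Y = np.meshgrid(x, y)           # rows index y, columns index x
phi = np.sin(X)*np.cos(Y)          # hypothetical smooth test potential

dx = x[1] - x[0]
dy = y[1] - y[0]

# np.gradient returns d/d(axis 0) first, i.e. d/dy, then d/dx
phi_y, phi_x = np.gradient(phi, dy, dx)
phi_xy, phi_xx = np.gradient(phi_x, dy, dx)
phi_yy, phi_yx = np.gradient(phi_y, dy, dx)

# Jacobian of the map (x, y) -> (theta_x, theta_y) = Hessian determinant
J = phi_xx*phi_yy - phi_xy*phi_yx

# Closed form for this potential: sin(x+y)*sin(x-y)
J_exact = np.sin(X + Y)*np.sin(X - Y)

# Compare away from the edges, where one-sided differences are less accurate
err = np.max(np.abs(J - J_exact)[2:-2, 2:-2])
print(err)
```

For this test surface the caustic condition J = 0 is met along the lines x = ±y (plus multiples of π), which is where the rays would pile up in the far field.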
An example of diacaustic formation from a random surface is shown in Fig. 8 generated by the Python program caustic.py. The Jacobian density (center) outlines regions in which the ray density is independent of small changes in the surface. They are positions of the zeros of the Hessian determinant, the regions of zero curvature of the surface or potential function. These high-intensity regions spread out and are intercepted at some distance by a surface, like the bottom of a swimming pool, where the concentrated rays create bright filaments. As the wavelets on the surface of the swimming pool move, the caustic filaments on the bottom of the swimming pool dance about.
Optical caustics also occur in the gravitational lensing of distant quasars by galaxy clusters in the formation of Einstein rings and arcs seen by deep field telescopes, as described in my following blog post.
Python Code: caustic.py
This Python code was used to generate the caustic patterns in Fig. 8. You can change the surface roughness by changing the divisors on the last two arguments of the speckle2 call. The distance to the bottom of the swimming pool can be changed through the parameter d inside the ray-tracing loop.
# -*- coding: utf-8 -*-
"""
Created on Tue Feb 16 19:50:54 2021
D. D. Nolte, Optical Interferometry for Biology and Medicine (Springer,2011)
"""
import numpy as np
from matplotlib import pyplot as plt
from numpy import random as rnd
from scipy import signal as signal

N = 256

def gauss2(sy,sx,wy,wx):
    """2D Gaussian of size (sy, sx) with widths wy and wx."""
    x = np.arange(-sx/2,sx/2,1)
    y = np.arange(-sy/2,sy/2,1)
    y = y[..., None]
    ex = np.ones(shape=(sy,1))
    x2 = np.kron(ex,x**2/(2*wx**2))
    ey = np.ones(shape=(1,sx))
    y2 = np.kron(y**2/(2*wy**2),ey)
    rad2 = (x2+y2)
    A = np.exp(-rad2)
    return A

def speckle2(sy,sx,wy,wx):
    """Random smooth surface: random phases convolved with a Gaussian."""
    Btemp = 2*np.pi*rnd.rand(sy,sx)
    B = np.exp(complex(0,1)*Btemp)
    C = gauss2(sy,sx,wy,wx)
    Atemp = signal.convolve2d(B,C,'same')   # direct convolution, can be slow
    Intens = np.mean(np.mean(np.abs(Atemp)**2))
    D = np.real(Atemp/np.sqrt(Intens))
    Dphs = np.arctan2(np.imag(D),np.real(D))
    return D, Dphs

Sp, Sphs = speckle2(N,N,N/16,N/16)
plt.matshow(Sp,fignum=2,cmap='seismic')     # hsv, seismic, bwr

# np.gradient returns the derivative along axis 0 (rows, y) first, then axis 1 (x)
fy, fx = np.gradient(Sp)
fxy,fxx = np.gradient(fx)
fyy,fyx = np.gradient(fy)
J = fxx*fyy - fxy*fyx       # Hessian determinant of the surface
D = np.abs(1/J)
plt.matshow(D,fignum=3,cmap='gray')         # hsv, seismic, bwr

# Trace one ray per surface pixel and accumulate hits on the pool bottom
E = np.zeros(shape=(N,N))
for yloop in range(0,N-1):
    for xloop in range(0,N-1):
        d = N/2     # distance to the bottom of the pool
        indx = int(N/2 + (d*(fx[yloop,xloop])+(xloop-N/2)/2))
        indy = int(N/2 + (d*(fy[yloop,xloop])+(yloop-N/2)/2))
        if ((indx > 0) and (indx < N)) and ((indy > 0) and (indy < N)):
            E[indy,indx] = E[indy,indx] + 1

plt.matshow(E,fignum=4,cmap='gray')
plt.show()
The idea of parallel dimensions in physics has a long history dating back to Bernhard Riemann’s famous 1854 lecture on the foundations of geometry that he gave as a requirement to attain a teaching position at the University of Göttingen. Riemann laid out a program of study that included physics problems solved in multiple dimensions, but it was Rudolf Lipschitz twenty years later who first composed a rigorous view of physics as trajectories in many dimensions. Nonetheless, the three spatial dimensions we enjoy in our daily lives remained the only true physical space until Hermann Minkowski re-expressed Einstein’s theory of relativity in 4-dimensional space time. Even so, Minkowski’s time dimension was not on an equal footing with the three spatial dimensions: the four dimensions were entwined, but time had a different characteristic, described by what is known as a pseudo-Riemannian metric. It is this pseudo-metric that allows squared space-time distances to be negative as easily as positive.
In 1919 Theodor Kaluza of the University of Königsberg in Prussia extended Einstein’s theory of gravitation to a fifth spatial dimension, and physics had its first true parallel dimension. It was more than just an exercise in mathematics: adding a fifth dimension to relativistic dynamics adds new degrees of freedom that allow the dynamical 5-dimensional theory to include more than merely relativistic massive particles and the electric field they generate. In addition to electro-magnetism, something akin to Einstein’s field equation of gravitation emerges. Here was a five-dimensional theory that seemed to unify E&M with gravity, a first unified theory of physics. Einstein, to whom Kaluza communicated his theory, was intrigued but hesitant to forward Kaluza’s paper for publication. It seemed too good to be true. But Einstein finally sent it to be published in the proceedings of the Prussian Academy of Sciences [Kaluza, 1921]. He later launched his own effort to explore such unified field theories more deeply.
Yet Kaluza’s theory was fully classical—if a fifth dimension can be called that—because it made no connection to the rapidly developing field of quantum mechanics. The person who took the step to make five-dimensional space-time into a quantum field theory was Oskar Klein.
Oskar Klein (1894 – 1977)
Oskar Klein was a Swedish physicist who was in the “second wave” of quantum physicists just a few years behind the titans Heisenberg and Schrödinger and Pauli. He began as a student in physical chemistry working in Stockholm under the famous Arrhenius. It was arranged for him to work in France and Germany in 1914, but he was caught in Paris at the onset of World War I. Returning to Sweden, he enlisted in military service from 1915 to 1916 and then joined Arrhenius’ group at the Nobel Institute where he met Hendrik Kramers, Bohr’s direct assistant at Copenhagen at that time. At Kramers’ invitation, Klein traveled to Copenhagen and worked for a year with Kramers and Bohr before returning to defend his doctoral thesis in 1921 in the field of physical chemistry. Klein’s work with Bohr had opened his eyes to the possibilities of quantum theory, and he shifted his research interest away from physical chemistry. Unfortunately, there were no positions at that time in such a new field, so Klein accepted a position as assistant professor at the University of Michigan in Ann Arbor where he stayed from 1923 to 1925.
The Fifth Dimension
In an odd twist of fate, this isolation of Klein from the mainstream quantum theory being pursued in Europe freed him of the bandwagon effect and allowed him to range freely on topics of his own devising and certainly in directions all his own. Unaware of Kaluza’s previous work, Klein expanded Minkowski’s space-time from four to five dimensions, just as Kaluza had done, but now with a quantum interpretation. This was not just an incremental step but had far-ranging consequences in the history of physics.
Klein found a way to keep the fifth dimension Euclidean in its metric properties while rolling itself up compactly into a cylinder with the radius of the Planck length, something inconceivably small. This compact fifth dimension made the manifold into something akin to an infinitesimal string. He published a short note in Nature magazine in 1926 on the possibility of identifying the electric charge within the 5-dimensional theory [Klein, 1926a]. He then returned to Sweden to take up a position at the University of Lund. This odd string-like feature of 5-dimensional space-time was picked up by Einstein and others in their search for unified field theories of physics, but the topic soon drifted from the limelight and lay dormant for nearly fifty years until the first forays were made into string theory. String theory resurrected the Kaluza-Klein theory, which has burgeoned into the vast topic of String Theory today, including Superstrings that occur in 10+1 dimensions at the frontiers of physics.
Dirac Electrons without the Spin: Klein-Gordon Equation
Once back in Europe, Klein reengaged with the mainstream trends in the rapidly developing quantum theory and in 1926 developed a relativistic quantum theory of the electron [Klein, 1926b]. Around the same time Walter Gordon also proposed this equation, which is now called the “Klein-Gordon Equation”. The equation was a classic wave equation that was second order in both space and time. This was the most natural form for a wave equation for quantum particles and Schrödinger himself had started with this form. But Schrödinger had quickly realized that the second-order time term in the equation did not capture the correct structure of the hydrogen atom, which led him to express the time-dependent term in first order and non-relativistically—which is today’s “Schrödinger Equation”. The problem was in the spin of the electron. The electron is a spin-1/2 particle, a Fermion, which has special transformation properties. It was Dirac a few years later who discovered how to express the relativistic wave equation for the electron—not by promoting the time-dependent term to second order, but by demoting the space-dependent term to first order. The first-order expression for both the space and time derivatives goes hand in hand with the Pauli spin matrices for the electron, and the Dirac Equation is the appropriate relativistically-correct wave equation for the electron.
Klein’s relativistic quantum wave equation does turn out to be the relevant form for a spin-less particle like the pion, but the pion is governed by the strong nuclear force, so the Klein-Gordon equation is not a practical description. However, the Higgs boson is also a spin-zero particle, and the Klein-Gordon expression does have relevance for this fundamental exchange particle.
In those early days of the late 1920’s, the nature of the nucleus was still a mystery, especially the problem of nuclear radioactivity where a neutron could convert to a proton with the emission of an electron. Some suggested that the neutron was somehow a proton that had captured an electron in a potential barrier. Klein showed that this was impossible, that the electrons would be highly relativistic—something known as a Dirac electron—and they would tunnel with perfect probability through any potential barrier [Klein, 1929]. Therefore, Klein concluded, no nucleon or nucleus could bind an electron.
This phenomenon of unity transmission through a barrier became known as Klein tunneling. The relativistic electron transmits perfectly through an arbitrary potential barrier—independent of its width or height. This is unlike light that transmits through a dielectric slab in resonances that depend on the thickness of the slab—also known as a Fabry-Perot interferometer. The Dirac electron can have any energy, and the potential barrier can have any width, yet the electron will tunnel with 100% probability. How can this happen?
The answer has to do with the dispersion (velocity versus momentum) of the Dirac electron. As the momentum changes in a potential, the speed of the Dirac electron stays constant. In the potential barrier, the momentum flips sign, but the speed remains unchanged. This is equivalent to the effects of negative refractive index in optics. If a photon travels through a material with negative refractive index, its momentum is flipped, but its speed remains unchanged. From Fermat’s principle, it is speed that determines how a particle like a photon refracts, so if there is no speed change, then there is no reflection.
For the case of Dirac electrons in a potential with field $F$, speed $v$ and transverse momentum $p_y$, the transmission coefficient is given by

$$ T = \exp\left(-\frac{\pi v\, p_y^2}{\hbar F}\right) $$
If the transverse momentum is zero, then the transmission is perfect. A visual schematic of the role of dispersion and potentials for Dirac electrons undergoing Klein tunneling is shown in the next figure.
In this case, even if the transverse momentum is not strictly zero, there can still be perfect transmission. It is simply a matter of matching speeds.
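To get a feel for how sharply the transmission depends on transverse momentum, here is a minimal numerical sketch. It assumes the exponential law commonly quoted for a smooth barrier, T = exp(−π v p_y² / ħF), with F taken as the force on the electron. The graphene-like numbers (speed, field strength, Fermi wavevector) are hypothetical values chosen only for illustration.

```python
import math

HBAR = 1.054571817e-34  # reduced Planck constant, J*s

def klein_transmission(p_y, force, speed):
    """Transmission T = exp(-pi * v * p_y**2 / (hbar * F)) for a smooth barrier."""
    return math.exp(-math.pi * speed * p_y**2 / (HBAR * force))

# Assumed, illustrative graphene-like numbers (not taken from any experiment).
SPEED = 1.0e6                     # Dirac-electron speed, m/s
FORCE = 1.602176634e-19 * 1.0e6   # force e*E for an assumed field of 1 V per micron, N
K_FERMI = 1.0e8                   # assumed Fermi wavevector, 1/m

for angle_deg in (0, 10, 20, 45):
    # Transverse momentum for an electron incident at this angle to the barrier normal
    p_y = HBAR * K_FERMI * math.sin(math.radians(angle_deg))
    print(f"{angle_deg:2d} deg  T = {klein_transmission(p_y, FORCE, SPEED):.3f}")
```

At normal incidence the transverse momentum vanishes and the function returns exactly one, whatever the barrier parameters.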
Graphene became famous over the past decade because its electron dispersion relation is just like a relativistic Dirac electron with a Dirac point between conduction and valence bands. Evidence for Klein tunneling in graphene systems has been growing, but clean demonstrations have remained difficult to observe.
Now, published in the Dec. 2020 issue of Science magazine—almost a century after Klein first proposed it—an experimental group at the University of California at Berkeley reports a beautiful experimental demonstration of Klein tunneling—not from a nucleus, but in an acoustic honeycomb sounding board the size of a small table—making an experimental analogy between acoustics and Dirac electrons that bears out Klein’s theory.
In this special sounding board, it is not electrons but phonons—acoustic vibrations—that have a Dirac point. Furthermore, by changing the honeycomb pattern, the bands can be shifted, just like in a p-n-p junction, to produce a potential barrier. The Berkeley group, led by Xiang Zhang (now president of Hong Kong University), fabricated the sounding board that is about a half-meter in length, and demonstrated dramatic Klein tunneling.
It is amazing how long it can take between the time a theory is first proposed and the time a clean experimental demonstration is first performed. About 90 years have elapsed since Klein first derived the phenomenon. Performing the experiment with actual relativistic electrons was prohibitive, but bringing the Dirac electron analog into the solid state has allowed the effect to be demonstrated easily.
[1921] Kaluza, Theodor (1921). “Zum Unitätsproblem in der Physik”. Sitzungsber. Preuss. Akad. Wiss. Berlin (Math. Phys.): 966–972.
[1926a] Klein, O. (1926). “The Atomicity of Electricity as a Quantum Theory Law”. Nature 118: 516.
[1926b] Klein, O. (1926). “Quantentheorie und fünfdimensionale Relativitätstheorie”. Zeitschrift für Physik 37 (12): 895.
[1929] Klein, O. (1929). “Die Reflexion von Elektronen an einem Potentialsprung nach der relativistischen Dynamik von Dirac”. Zeitschrift für Physik 53 (3–4): 157.
The quantum of light, the photon, is a little over 100 years old. It was born in 1905 when Einstein merged Planck’s blackbody quantum hypothesis with statistical mechanics and concluded that light itself must be quantized. No one believed him! Fast forward to today, and the photon is a workhorse of modern quantum technology. Quantum encryption and communication are performed almost exclusively with photons, and many prototype quantum computers are optics based. Quantum optics also underpins atomic, molecular and optical (AMO) physics, which is one of the hottest and most rapidly advancing frontiers of physics today.
This blog tells the story of the early days of the photon and of quantum optics. It begins with Einstein in 1905 and ends with the demonstration of photon anti-bunching that was the first fundamentally quantum optical phenomenon observed seventy years later in 1977. Across that stretch of time, the photon went from a nascent idea in Einstein’s fertile brain to the most thoroughly investigated quantum particle in the realm of physics.
The Photon: Albert Einstein (1905)
When Planck presented his quantum hypothesis in 1900 to the German Physical Society, his model of black body radiation retained all its classical properties but one: the quantized interaction of light with matter. He did not yet think in terms of quanta, only in terms of steps in a continuous interaction.
The quantum break came from Einstein when he published his 1905 paper proposing the existence of the photon, an actual quantum of light that carried with it energy and momentum. His reasoning was simple and iron-clad, resting on Planck’s own blackbody relation that Einstein combined with simple reasoning from statistical mechanics. He was led inexorably to the existence of the photon. Unfortunately, almost no one believed him (see my blog on Einstein and Planck).
This was before wave-particle duality in quantum thinking, so the notion that light, so clearly a wave phenomenon, could be a particle was unthinkable. It had taken half of the 19th century to rid physics of Newton’s corpuscles and emissionist theories of light, so to bring them back at the beginning of the 20th century seemed like a great blunder. However, Einstein persisted.
In 1909 he published a paper on the fluctuation properties of light in which he proposed that the fluctuations observed in light intensity had two contributions: one from the discreteness of the photons (what we call “shot noise” today) and one from the fluctuations in the wave properties. Einstein was proposing that both particle and wave properties contributed to intensity fluctuations, exhibiting simultaneous particle-like and wave-like properties. This was one of the first expressions of wave-particle duality in modern physics.
In 1916 and 1917 Einstein took another bold step and proposed the existence of stimulated emission. Once again, his arguments were based on simple physics (this time the principle of detailed balance), and he was led to the audacious conclusion that one photon can stimulate the emission of another. This would become the basis of the laser more than forty years later.
While Einstein was confident in the reality of the photon, others sincerely doubted its existence. Robert Millikan (1868 – 1953) decided to put Einstein’s theory of photoelectron emission to the most stringent test ever performed. In 1915 he painstakingly acquired the definitive dataset with the goal of refuting Einstein’s hypothesis, only to confirm it in spectacular fashion. Partly based on Millikan’s confirmation of Einstein’s theory of the photon, Einstein was awarded the Nobel Prize in Physics in 1921.
From that point onward, the physical existence of the photon was accepted and was incorporated routinely into other physical theories. Compton used the energy and the momentum of the photon in 1922 to predict and measure Compton scattering of x-rays off of electrons. The photon was given its modern name by Gilbert Lewis in 1926.
Single-Photon Interference: Geoffrey Taylor (1909)
If a light beam is made up of a group of individual light quanta, then in the limit of very dim light, there should just be one photon passing through an optical system at a time. Therefore, to do optical experiments on single photons, one just needs to reach the ultimate dim limit. As simple and clear as this argument sounds, it has problems that only were sorted out after the Hanbury Brown and Twiss experiments in the 1950’s and the controversy they launched (see below). However, in 1909, this thinking seemed like a clear approach for looking for deviations in optical processes in the single-photon limit.
In 1909, Geoffrey Ingram Taylor (1886 – 1975) was an undergraduate student at Cambridge University and performed a low-intensity Young’s double-slit experiment (encouraged by J. J. Thomson). At that time the idea of Einstein’s photon was only 4 years old, and Bohr’s theory of the hydrogen atom was still four years away. But Thomson believed that if photons were real, then their existence could possibly show up as deviations in experiments involving single photons. Young’s double-slit experiment is the classic demonstration of the classical wave nature of light, so performing it under conditions when (on average) only a single photon was in transit between a light source and a photographic plate seemed like the best place to look.
The experiment was performed by finding an optimum exposure of photographic plates in a double-slit experiment, then reducing the flux while increasing the exposure time, until the single-photon limit was achieved while retaining the same net exposure of the photographic plate. Under the lowest intensity, when only a single photon was in transit at a time (on average), Taylor performed the exposure for three months. To his disappointment, when he developed the film, there was no significant difference between high-intensity and low-intensity interference fringes. If photons existed, then their quantized nature was not showing up in the low-intensity interference experiment.
The reason that there is no single-photon-limit deviation in the behavior of the Young double-slit experiment is that Young’s experiment only measures first-order coherence properties. The average over many single-photon detection events is described equally well either by classical waves or by quantum mechanics. Quantized effects in the Young experiment could only appear in fluctuations in the arrivals of photons, but in Taylor’s day there was no way to detect the arrival of single photons.
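The “one photon in transit at a time” condition can be turned into a rough power estimate: the photon arrival rate multiplied by the transit time should be about one. The sketch below assumes green light at 550 nm and a one-meter path between source and plate. Both numbers are illustrative assumptions, not the parameters of Taylor’s apparatus.

```python
# Physical constants (SI units)
PLANCK = 6.62607015e-34      # J*s
LIGHT_SPEED = 2.99792458e8   # m/s

def photons_in_transit(power_watts, wavelength_m, path_length_m):
    """Average number of photons in flight between source and detector."""
    photon_energy = PLANCK * LIGHT_SPEED / wavelength_m   # joules per photon
    photon_rate = power_watts / photon_energy             # photons per second
    transit_time = path_length_m / LIGHT_SPEED            # seconds in flight
    return photon_rate * transit_time

def single_photon_power(wavelength_m, path_length_m):
    """Optical power at which, on average, exactly one photon is in flight."""
    return PLANCK * LIGHT_SPEED**2 / (wavelength_m * path_length_m)

# Assumed, illustrative values: green light over a one-meter apparatus.
WAVELENGTH = 550e-9   # m
PATH = 1.0            # m

p_single = single_photon_power(WAVELENGTH, PATH)
print(f"power for one photon in transit: {p_single:.2e} W")
print(f"check: {photons_in_transit(p_single, WAVELENGTH, PATH):.3f} photons in flight")
```

Under these assumptions the threshold sits near a tenth of a nanowatt, which gives a sense of why the exposure had to run for months.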
Quantum Theory of Radiation: Paul Dirac (1927)
After Paul Dirac (1902 – 1984) was awarded his doctorate from Cambridge in 1926, he received a stipend that sent him to work with Niels Bohr (1885 – 1962) in Copenhagen. His attention focused on the electromagnetic field and how it interacted with the quantized states of atoms. Although the electromagnetic field was the classical field of light, it was also the quantum field of Einstein’s photon, and he wondered how the quantized harmonic oscillators of the electromagnetic field could be generated by quantum wavefunctions acting as operators. He decided that, to generate a photon, the wavefunction must operate on a state that had no photons—the ground state of the electromagnetic field known as the vacuum state.
Dirac put these thoughts into their appropriate mathematical form and began work on two manuscripts. The first manuscript contained the theoretical details of the non-commuting electromagnetic field operators. He called the process of generating photons out of the vacuum “second quantization”. In second quantization, the classical field of electromagnetism is converted to an operator that generates quanta of the associated quantum field out of the vacuum (and also annihilates photons back into the vacuum). The creation operators can be applied again and again to build up an N-photon state containing N photons that obey Bose-Einstein statistics, as they must, as required by their integer spin, and agreeing with Planck’s blackbody radiation.
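In the notation that later became standard (the symbols here are the modern textbook ones, not necessarily Dirac’s own), the creation operator $\hat{a}^\dagger$ and annihilation operator $\hat{a}$ for a single mode of the field act on photon-number states as

$$ \hat{a}^\dagger\,|n\rangle = \sqrt{n+1}\,|n+1\rangle, \qquad \hat{a}\,|n\rangle = \sqrt{n}\,|n-1\rangle, \qquad [\hat{a}, \hat{a}^\dagger] = 1 $$

so that an N-photon state is built from the vacuum $|0\rangle$ by repeated application of the creation operator:

$$ |N\rangle = \frac{(\hat{a}^\dagger)^N}{\sqrt{N!}}\,|0\rangle $$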
Dirac then showed how an interaction of the quantized electromagnetic field with quantized energy levels involved the annihilation and creation of photons as they promoted electrons to higher atomic energy levels, or demoted them through stimulated emission. Very significantly, Dirac’s new theory explained the spontaneous emission of light from an excited electron level as a direct physical process that creates a photon carrying away the energy as the electron falls to a lower energy level. Spontaneous emission had been explained first by Einstein more than ten years earlier when he derived the famous A and B coefficients, but the physical mechanism for these processes was inferred rather than derived. Dirac, in late 1926, had produced the first direct theory of photon exchange with matter.
Einstein-Podolsky-Rosen (EPR) and Bohr (1935)
The famous dialog between Einstein and Bohr at the Solvay Conferences culminated in the now famous “EPR” paradox of 1935 when Einstein published (together with B. Podolsky and N. Rosen) a paper that contained a particularly simple and cunning thought experiment. In this paper, not only was quantum mechanics under attack, but so was the concept of reality itself, as reflected in the paper’s title “Can Quantum-Mechanical Description of Physical Reality Be Considered Complete?”.
Einstein considered an experiment on two quantum particles that had become “entangled” (meaning they interacted) at some time in the past, and then had flown off in opposite directions. By the time their properties are measured, the two particles are widely separated. Two observers each make measurements of certain properties of the particles. For instance, the first observer could choose to measure either the position or the momentum of one particle. The other observer likewise can choose to make either measurement on the second particle. Each measurement is made with perfect accuracy. The two observers then travel back to meet and compare their measurements. When the two experimentalists compare their data, they find perfect agreement in their values every time that they had chosen (unbeknownst to each other) to make the same measurement. This agreement occurred either when they both chose to measure position or both chose to measure momentum.
It would seem that the state of the second particle prior to its measurement was completely defined by the results of the first measurement. In other words, the second particle is set into a definite state (using quantum-mechanical jargon, the state is said to “collapse”) the instant that the first measurement is made. This implies that there is instantaneous action at a distance, violating everything that Einstein believed about reality (and violating the law that nothing can travel faster than the speed of light). He therefore had no choice but to consider this conclusion of instantaneous action to be false. Therefore quantum mechanics could not be a complete theory of physical reality: some deeper theory, yet undiscovered, was needed to resolve the paradox.
Bohr, on the other hand, did not hold “reality” so sacred. In his rebuttal to the EPR paper, which he published six months later under the identical title, he rejected Einstein’s criterion for reality. He had no problem with the two observers making the same measurements and finding identical answers. Although one measurement may affect the conditions of the second despite their great distance, no information could be transmitted by this dual measurement process, and hence there was no violation of causality. Bohr’s mind-boggling viewpoint was that reality was nonlocal, meaning that in the quantum world the measurement at one location does influence what is measured somewhere else, even at great distance. Einstein, on the other hand, could not accept a nonlocal reality.
The Intensity Interferometer: Hanbury Brown and Twiss (1956)
Optical physics was surprisingly dormant from the 1930’s through the 1940’s. Most of the research during this time was either on physical optics, like lenses and imaging systems, or on spectroscopy, which was more interested in the physical properties of the materials than in light itself. This hiatus from the photon was about to change dramatically, not driven by physicists, but driven by astronomers.
The development of radar technology during World War II enabled the new field of radio astronomy both with high-tech receivers and with a large cohort of scientists and engineers trained in radio technology. In the late 1940’s and early 1950’s radio astronomy was starting to work with long baselines to better resolve radio sources in the sky using interferometry. The first attempts used coherent references between two separated receivers to provide a common mixing signal to perform field-based detection. However, the stability of the reference was limiting, especially for longer baselines.
In 1950, a doctoral student in the radio astronomy department of the University of Manchester, R. Hanbury Brown, was given the task of designing baselines that could work at longer distances to resolve smaller radio sources. After struggling with the technical difficulties of providing a coherent “local” oscillator for distant receivers, Hanbury Brown had a sudden epiphany one evening. Instead of trying to reference the field of one receiver to the field of another, what if, instead, one were to reference the intensity of one receiver to the intensity of the other, specifically correlating the noise on the intensity? To measure intensity requires no local oscillator or reference field. The size of an astronomical source would then show up in how well the intensity fluctuations correlated with each other as the distance between the receivers was changed. He did a back-of-the-envelope calculation that gave him hope that his idea might work, but he needed more rigorous proof if he was to ask for money to try out his idea. He tracked down Richard Twiss at a defense research lab, and the two worked out the theory of intensity correlations for long-baseline radio interferometry. Using facilities at the famous Jodrell Bank Radio Observatory at Manchester, they demonstrated the principle of their intensity interferometer and measured the angular size of Cygnus A and Cassiopeia A, two of the strongest radio sources in the Northern sky.
One of the surprising side benefits of the intensity interferometer over field-based interferometry was insensitivity to environmental phase fluctuations. For radio astronomy the biggest source of phase fluctuations was the ionosphere, and the new intensity interferometer was immune to its fluctuations. Phase fluctuations had also been the limiting factor for the Michelson stellar interferometer which had limited its use to only about half a dozen stars, so Hanbury Brown and Twiss decided to revisit visible stellar interferometry using their new concept of intensity interferometry.
To illustrate the principle for visible wavelengths, Hanbury Brown and Twiss performed a laboratory experiment to correlate intensity fluctuations in two receivers illuminated by a common source through a beam splitter. The intensity correlations were detected and measured as a function of path length change, illustrating an excess correlation in noise for short path lengths that decayed as the path length increased. They published their results in Nature in 1956, which immediately ignited a firestorm of protest from physicists.
In the 1950’s, many physicists had embraced the discrete properties of the photon and had developed a misleading mental picture of photons as individual and indivisible particles that could only go one way or another from a beam splitter, but not both. Therefore, the argument went, if the photon in an attenuated beam was detected in one detector at the output of a beam splitter, then it cannot be detected at the other. This would produce an anticorrelation in coincidence counts at the two detectors. However, the Hanbury Brown Twiss (HBT) data showed a correlation from the two detectors. This launched an intense controversy in which some of those who accepted the results called for a radical new theory of the photon, while most others dismissed the HBT results as due to systematics in the light source. The heart of this controversy was quickly understood by the Nobel laureate E. M. Purcell. He correctly pointed out that photons are bosons and are indistinguishable discrete particles and hence are likely to “bunch” together, according to quantum statistics, even under low light conditions. Therefore, attenuated “chaotic” light would indeed show photodetector correlations: even if the average photon number was less than a single photon at a time, the photons would still bunch.
The bunching of photons in light is a second-order effect that moves beyond the first-order interference effects of Young’s double slit, but even here the quantum nature of light is not required. A semiclassical theory of light emission from a spectral line with a natural bandwidth also predicts intensity correlations, and the correlations are precisely what would be observed for photon bunching. Therefore, even the second-order HBT results, when performed with natural light sources, do not distinguish between classical and quantum effects in the experimental results. But this reliance on natural light sources was about to change fundamentally with the invention of the laser.
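The semiclassical claim is easy to check numerically. The sketch below is a toy model, not the HBT analysis itself: it assumes chaotic light can be modeled as a complex Gaussian random field with an exponentially decaying memory (the `coherence_steps` parameter is a hypothetical stand-in for the coherence time set by the linewidth), and it compares the normalized intensity correlation of that field with the correlation of a perfectly steady classical wave.

```python
import numpy as np

def chaotic_field(n_samples, coherence_steps, rng):
    """Complex Gaussian field whose first-order coherence decays exponentially."""
    r = np.exp(-1.0 / coherence_steps)   # field memory per time step
    kick = (rng.standard_normal(n_samples)
            + 1j * rng.standard_normal(n_samples)) / np.sqrt(2.0)
    field = np.empty(n_samples, dtype=complex)
    field[0] = kick[0]
    for k in range(1, n_samples):
        # Keep a fraction r of the old field and add fresh noise (unit mean intensity)
        field[k] = r * field[k - 1] + np.sqrt(1.0 - r**2) * kick[k]
    return field

def g2(intensity, delay):
    """Normalized intensity correlation <I(t) I(t+delay)> / (<I(t)> <I(t+delay)>)."""
    if delay == 0:
        first, second = intensity, intensity
    else:
        first, second = intensity[:-delay], intensity[delay:]
    return np.mean(first * second) / (np.mean(first) * np.mean(second))

rng = np.random.default_rng(0)
chaotic = np.abs(chaotic_field(200_000, coherence_steps=20, rng=rng)) ** 2
steady = np.ones_like(chaotic)   # idealized constant-intensity classical wave

for delay in (0, 10, 20, 200):
    print(f"delay {delay:3d}: chaotic g2 = {g2(chaotic, delay):.2f}, "
          f"steady g2 = {g2(steady, delay):.2f}")
```

For this model the expected behavior is an excess correlation near zero delay that approaches a factor of two and decays to one for delays much longer than the coherence time, while the steady wave shows no excess at any delay. No photons appear anywhere in the calculation.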
Invention of the Laser: Ted Maiman (1960)
One of the great scientific breakthroughs of the 20th century was the nearly simultaneous yet independent realization by several researchers around 1951 (by Charles H. Townes of Columbia University, by Joseph Weber of the University of Maryland, and by Alexander M. Prokhorov and Nikolai G. Basov at the Lebedev Institute in Moscow) that clever techniques and novel apparatus could be used to produce collections of atoms that had more electrons in excited states than in ground states. Such a situation is called a population inversion. If this situation could be attained, then according to Einstein’s 1917 theory of photon emission, a single photon would stimulate the emission of a second photon, and the pair would in turn stimulate two additional electrons to emit two identical photons, giving a total of four photons, and so on. Clearly this process turns a single photon into a host of photons, all with identical energy and phase.
Charles Townes and his research group were the first to succeed in 1953 in producing a device based on ammonia molecules that could work as an intense source of coherent photons. The initial device did not amplify visible light, but amplified microwave photons that had wavelengths of a little over 1 centimeter. They called the process microwave amplification by stimulated emission of radiation, hence the acronym “MASER”. Despite the significant breakthrough that this invention represented, the devices were very expensive and difficult to operate. The maser did not revolutionize technology, and some even quipped that the acronym stood for “Means of Acquiring Support for Expensive Research”. The maser did, however, launch a new field of study, called quantum electronics, that was the direct descendant of Einstein’s 1917 paper. Most importantly, the existence and development of the maser became the starting point for a device that could do the same thing for light.
The race to develop an optical maser (later to be called the laser, for light amplification by stimulated emission of radiation) was intense. Many groups actively pursued this holy grail of quantum electronics. Most believed that it was possible, which made its invention merely a matter of time and effort. This race was won by Theodore H. Maiman at Hughes Research Laboratory in Malibu, California, in 1960. He used a ruby crystal that was excited into a population inversion by an intense flash tube (like a flash bulb) that had originally been invented for flash photography. His approach was amazingly simple (blast the ruby with a high-intensity pulse of light and see what comes out), which explains why he was the first. Most other groups had been pursuing much more difficult routes because they believed that laser action would be difficult to achieve.
Perhaps the most important aspect of Maiman’s discovery was that it demonstrated that laser action was actually much simpler than people anticipated, and that laser action is a fairly common phenomenon. His discovery was quickly repeated by other groups, and then additional laser media were discovered such as helium-neon gas mixtures, argon gas, carbon dioxide gas, garnet lasers and others. Within several years, over a dozen different material and gas systems were made to lase, opening up wide new areas of research and development that continue unabated to this day. It also called for new theories of optical coherence to explain how coherent laser light interacted with matter.
Coherent States: Glauber (1963)
The HBT experiment had been performed with attenuated chaotic light that had residual coherence caused by the finite linewidth of the filtered light source. The theory of intensity correlations for this type of light was developed in the 1950’s by Emil Wolf and Leonard Mandel using a semiclassical theory in which the statistical properties of the light were based on electromagnetics without a direct need for quantized photons. The HBT results were fully consistent with this semiclassical theory. However, after the invention of the laser, new “coherent” light sources became available that required a fundamentally quantum depiction.
Roy Glauber was a theoretical physicist who received his PhD working with Julian Schwinger at Harvard. He spent several years as a post-doc at Princeton’s Institute for Advanced Study starting in 1949 at the time when quantum field theory was being developed by Schwinger, Feynman and Dyson. While Feynman was off in Brazil for a year learning to play the bongo drums, Glauber filled in for his lectures at Caltech. He returned to Harvard in 1952 in the position of an assistant professor. He was already thinking about the quantum aspects of photons in 1956 when news of the photon correlations in the HBT experiment was published, and when the laser was invented four years later, he began developing a theory of photon correlations in laser light that he suspected would be fundamentally different from those in natural chaotic light.
Because of his background in quantum field theory, and especially quantum electrodynamics, he found it a fairly easy task to couch the quantum optical properties of coherent light in terms of Dirac’s creation and annihilation operators of the electromagnetic field. Building on the minimum-uncertainty wave functions derived initially by Schrödinger in the late 1920’s, Glauber developed a “coherent state” that was a minimum-uncertainty state of the quantized electromagnetic field. This coherent state represents a laser operating well above the lasing threshold, and it predicted that the HBT correlations would vanish. Glauber was awarded the Nobel Prize in Physics in 2005 for his work on such “Glauber” states in quantum optics.
Single-Photon Optics: Kimble and Mandel (1977)
Beyond introducing coherent states, Glauber’s new theoretical approach, and parallel work by George Sudarshan around the same time, provided a new formalism for exploring quantum optical properties in which fundamentally quantum processes could be explored that could not be predicted using only semiclassical theory. For instance, one could envision producing photon states in which the photon arrivals at a detector could display the kind of anti-bunching that had originally been assumed (in error) by the critics of the HBT experiment. A truly one-photon state, also known as a Fock state or a number state, would be the extreme limit in which the quantum field possessed a single quantum that could be directed at a beam splitter and would emerge either from one side or the other with complete anti-correlation. However, generating such a state in the laboratory remained a challenge.
In 1975 Carmichael and Walls predicted that resonance fluorescence could produce quantized fields that had lower correlations than coherent states. In 1977 H. J. Kimble, M. Dagenais and L. Mandel demonstrated, for the first time, photon antibunching between two photodetectors at the two ports of a beam splitter. They used a beam of sodium atoms pumped by a dye laser.
This first demonstration of photon antibunching represents a major milestone in the history of quantum optics. Taylor’s first-order experiments in 1909 showed no difference between classical electromagnetic waves and a flux of photons. Similarly, the observed photon correlations in the second-order HBT experiment of 1956 using chaotic light could be explained equally well using classical or quantum approaches. Even laser light (when the laser is operated far above threshold) produced “classical” wave effects, with only the shot noise demonstrating the discreteness of photon arrivals. Only after the availability of “quantum” light sources, beginning with the work of Kimble and Mandel, could photon numbers be manipulated at will, launching the modern era of quantum optics. Later experiments by them and others have continually improved the control of photon states.
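The three kinds of light in this history can be compared through a single number, the zero-delay second-order correlation g2(0) = ⟨n(n−1)⟩/⟨n⟩², computed from the photon-number distribution of one field mode. The sketch below uses the standard textbook distributions (Bose-Einstein for chaotic light, Poisson for a coherent state, a single spike for a Fock state). The mean photon number of 2 and the truncation at 200 photons are arbitrary choices made for illustration.

```python
import math

def g2_zero(probabilities):
    """g2(0) = <n(n-1)> / <n>**2 for a photon-number distribution p[n]."""
    mean_n = sum(n * p for n, p in enumerate(probabilities))
    mean_pairs = sum(n * (n - 1) * p for n, p in enumerate(probabilities))
    return mean_pairs / mean_n**2

def thermal(mean_n, n_max=200):
    """Bose-Einstein distribution of single-mode chaotic light (truncated)."""
    return [mean_n**n / (1.0 + mean_n) ** (n + 1) for n in range(n_max)]

def coherent(mean_n, n_max=200):
    """Poisson distribution of an ideal coherent (Glauber) state (truncated)."""
    return [math.exp(-mean_n + n * math.log(mean_n) - math.lgamma(n + 1))
            for n in range(n_max)]

def fock(n_photons):
    """Number state with exactly n_photons photons."""
    return [0.0] * n_photons + [1.0]

print("chaotic light :", round(g2_zero(thermal(2.0)), 6))
print("coherent state:", round(g2_zero(coherent(2.0)), 6))
print("one photon    :", round(g2_zero(fock(1)), 6))
```

Chaotic light gives bunching (a value of 2), the coherent state gives no excess correlation (a value of 1), and the one-photon state gives complete antibunching (a value of 0), which no classical intensity distribution can reproduce.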
1900 – Planck, M. (1901). “Law of energy distribution in normal spectra.” Annalen der Physik 4(3): 553-563.
1905 – Einstein, A. (1905). “Generation and conversion of light with regard to a heuristic point of view.” Annalen der Physik 17(6): 132-148.
1909 – Einstein, A. (1909). “On the current state of radiation problems.” Physikalische Zeitschrift 10: 185-193.
1909 – Taylor, G. I. (1909). Proc. Cam. Phil. Soc. Math. Phys. Sci. 15: 114. Single-photon double-slit experiment.
1915 – Millikan, R. A. (1916). “A direct photoelectric determination of Planck’s ‘h’.” Physical Review 7(3): 355-388. Photoelectric effect.
1916 – Einstein, A. (1916). “Strahlungs-Emission und -Absorption nach der Quantentheorie.” Verh. Deutsch. Phys. Ges. 18: 318; Einstein, A. (1917). “Quantum theory of radiation.” Physikalische Zeitschrift 18: 121-128. Einstein predicts stimulated emission.
1923 – Compton, A. H. (1923). “A Quantum Theory of the Scattering of X-Rays by Light Elements.” Physical Review 21(5): 483-502.
1926 – Lewis, G. N. (1926). “The conservation of photons.” Nature 118: 874-875. Gilbert Lewis names the “photon”.
1927 – Dirac, P. A. M. (1927). “The quantum theory of the emission and absorption of radiation.” Proceedings of the Royal Society of London, Series A 114(767): 243-265.
1932 – Wigner, E. P. (1932). Phys. Rev. 40: 749.
1935 – Einstein, A., B. Podolsky and N. Rosen (1935). Phys. Rev. 47: 777. EPR paradox.
1935 – Bohr, N. (1935). Phys. Rev. 48: 696. Bohr’s response to the EPR paradox.
1956 – Brown, R. H. and R. Q. Twiss (1956). “Correlation Between Photons in 2 Coherent Beams of Light.” Nature 177(4497): 27-29; Brown, R. H. and R. Q. Twiss (1956). “Test of a new type of stellar interferometer on Sirius.” Nature 178(4541): 1046-1048.
1963 – Glauber, R. J. (1963). “Photon Correlations.” Physical Review Letters 10(3): 84.
1963 – Sudarshan, E. C. G. (1963). “Equivalence of semiclassical and quantum mechanical descriptions of statistical light beams.” Physical Review Letters 10(7): 277; Mehta, C. L. and E. C. G. Sudarshan (1965). “Relation between quantum and semiclassical description of optical coherence.” Physical Review 138(1B): B274.
It is second nature to think of integer dimensions: A line is one dimensional. A plane is two dimensional. A volume is three dimensional. A point has no dimensions.
It is harder to think in four dimensions and higher, but even here it is a simple extrapolation of lower dimensions. Consider the basis vectors spanning a three-dimensional space consisting of the triples of numbers

$$ (1,0,0), \quad (0,1,0), \quad (0,0,1) $$

Then a four-dimensional hyperspace is just created by adding a new “tuple” to the list

$$ (1,0,0,0), \quad (0,1,0,0), \quad (0,0,1,0), \quad (0,0,0,1) $$

and so on to 5 and 6 dimensions and on. Child’s play!
But how do you think of fractional dimensions? What is a fractional dimension? For that matter, what is a dimension? Even the integer dimensions began to unravel when Georg Cantor showed in 1877 that the line and the plane, which clearly had different “dimensionalities”, both had the same cardinality and could be put into a one-to-one correspondence. From then onward the concept of dimension had to be rebuilt from the ground up, leading ultimately to fractals.
Here is a short history of fractal dimension, partially excerpted from my history of dynamics in Galileo Unbound (Oxford University Press, 2018) pg. 110 ff. This blog page presents the history through a set of publications that successively altered how mathematicians thought about curves in spaces, beginning with Karl Weierstrass in 1872.
Karl Weierstrass (1872)
Karl Weierstrass (1815 – 1897) was studying convergence properties of infinite power series in 1872 when he began with a problem that Bernhard Riemann had given to his students some years earlier. Riemann had asked whether the function
$$f(x) = \sum_{n=1}^{\infty} \frac{\sin(n^2 x)}{n^2}$$
was continuous everywhere but not differentiable. This simple question about a simple series was surprisingly hard to answer (it resisted proof until Hardy’s analysis in 1916). Therefore, Weierstrass conceived of a simpler infinite sum that was continuous everywhere and for which he could calculate left and right limits of derivatives at any point. This function is
$$W(x) = \sum_{n=0}^{\infty} a^n \cos(b^n \pi x)$$
where b is a large odd integer and a is positive and less than one. Weierstrass showed that the left and right derivatives failed to converge to the same value, no matter where he took his point. In short, he had discovered a function that was continuous everywhere, but had a derivative nowhere. This pathological function, called a “Monster” by Charles Hermite, is now called the Weierstrass function.
Beyond the strange properties that Weierstrass sought, the Weierstrass function would turn out to be a fractal curve (recognized much later by Besicovitch and Ursell in 1937) with a fractal (Hausdorff) dimension given by
$$D = 2 + \frac{\ln a}{\ln b}$$
although this was not proven until very recently. An example of the function is shown in Fig. 1 for a = 0.5 and b = 5. This specific curve has a fractal dimension D = 1.5693. Notably, this is a number that is greater than 1 dimension (the topological dimension of the curve) but smaller than 2 dimensions (the embedding dimension of the curve). The curve tends to fill more of the two-dimensional plane than a straight line, so its intermediate fractal dimension has an intuitive feel about it. The more “monstrous” the curve looks, the closer its fractal dimension approaches 2.
Fig. 1 Weierstrass’ “Monster” (1872) with a = 0.5, b = 5. This continuous function is nowhere differentiable. It is a fractal with fractal dimension D = 2 + ln(0.5)/ln(5) = 1.5693.
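The numbers in the caption can be checked with a few lines of Python. This is only a sketch: the function names, the truncation of the infinite sum at 20 terms, and the sample point x0 = 0.3 are illustrative choices, not part of Weierstrass' construction. It evaluates the dimension formula for a = 0.5 and b = 5, then prints difference quotients for shrinking steps to show that they do not settle toward a single slope.

```python
import math

def weierstrass(x, a=0.5, b=5, n_terms=20):
    """Partial sum of W(x) = sum_n a^n cos(b^n pi x), truncated at n_terms (assumed cutoff)."""
    return sum(a**n * math.cos(b**n * math.pi * x) for n in range(n_terms))

def weierstrass_dimension(a, b):
    """Fractal dimension D = 2 + ln(a)/ln(b) quoted for the Weierstrass curve."""
    return 2 + math.log(a) / math.log(b)

D = weierstrass_dimension(0.5, 5)
print(f"D = {D:.4f}")   # D = 1.5693

# Difference quotients at an arbitrary point x0 for shrinking steps h.
# For a differentiable function these would converge to one slope.
x0 = 0.3
for k in range(1, 8):
    h = 5.0 ** (-k)
    slope = (weierstrass(x0 + h) - weierstrass(x0)) / h
    print(f"h = 5^-{k}: slope = {slope:+.3f}")
```

Because the sum is truncated, the quotients would eventually converge for steps much smaller than 5^-20; the sketch only illustrates the trend over the scales that the partial sum resolves.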
Georg Cantor (1883)
Partially inspired by Weierstrass’ discovery, Georg Cantor (1845 – 1918) published an example of an unusual ternary set in 1883 in “Grundlagen einer allgemeinen Mannigfaltigkeitslehre” (“Foundations of a General Theory of Aggregates”). The set generates a function (the Cantor staircase) that has a derivative equal to zero almost everywhere, yet which rises continuously from zero to one across the unit interval. It is a striking example of a function that is not equal to the integral of its derivative! Cantor demonstrated that the size of his set is that of the continuum, 2^aleph0, which is the cardinality of the real numbers (aleph0 itself is the cardinality of the integers). But whereas the real numbers are uniformly distributed, Cantor’s set is “clumped”. This clumpiness is an essential feature that distinguishes it from the one-dimensional number line, and it raised important questions about dimensionality. The fractal dimension of the ternary Cantor set is DH = ln(2)/ln(3) = 0.6309.
Fig. 2 The 1883 Cantor set (below) and the Cantor staircase (above, as the indefinite integral over the set).
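The staircase can be evaluated directly from the ternary digits of x. The following is a minimal sketch under my own naming (`cantor_staircase`, `removed`) and an assumed digit depth of 40: digits 0 and 2 are mapped to binary digits 0 and 1, and the first digit equal to 1 means x sits inside a removed middle third, where the function is flat. The last lines add up the lengths of all removed intervals, which is why the derivative vanishes almost everywhere.

```python
import math

def cantor_staircase(x, depth=40):
    """Cantor function on [0, 1], read off from the ternary digits of x."""
    if x <= 0.0:
        return 0.0
    if x >= 1.0:
        return 1.0
    value, weight = 0.0, 0.5
    for _ in range(depth):
        x *= 3
        digit = int(x)                    # next ternary digit: 0, 1 or 2
        x -= digit
        if digit == 1:                    # inside a removed middle third: flat step
            return value + weight
        value += weight * (digit // 2)    # ternary digit 2 becomes binary digit 1
        weight /= 2
    return value

for x in (0.0, 0.25, 0.4, 0.5, 0.75, 1.0):
    print(x, cantor_staircase(x))

# Total length of the removed middle thirds: 1/3 + 2/9 + 4/27 + ...
removed = sum(2 ** (n - 1) / 3 ** n for n in range(1, 60))
print("removed length:", removed)
print("D_H =", math.log(2) / math.log(3))
```

The function climbs from 0 to 1 even though all of its rise happens on a set of total length zero.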
Giuseppe Peano (1890)
In 1877, in a letter to his friend Richard Dedekind, Cantor showed that there was a one-to-one correspondence between the real numbers and the points in any n-dimensional space. He was so surprised by his own result that he wrote to Dedekind “I see it, but I don’t believe it.” The solid concepts of dimension and dimensionality were dissolving before his eyes. What does it mean to trace the path of a trajectory in an n-dimensional space, if all the points in n dimensions were just numbers on a line? What could such a trajectory look like? A graphic example of a plane-filling path was constructed in 1890 by Peano, who was a peripatetic mathematician with interests that wandered broadly across the landscape of the mathematical problems of his day, usually ahead of his time. Only two years after he had axiomatized linear vector spaces, Peano constructed a continuous curve that filled space.
The construction of Peano’s curve proceeds by taking a square and dividing it into 9 equal sub-squares. Lines connect the centers of each of the sub-squares. Then each sub-square is divided again into 9 sub-squares whose centers are all connected by lines. At this stage, the original pattern, repeated 9 times, is connected together by 8 links, forming a single curve. This process is repeated infinitely many times, resulting in a curve that passes through every point of the original plane square. In this way, a line is made to fill a plane. Where Cantor had proven abstractly that the cardinality of the real numbers was the same as that of the points in n-dimensional space, Peano created a specific example. This was followed quickly by another construction, invented by David Hilbert in 1891, that divided the square into four instead of nine, simplifying the construction, but also showing that such constructions were easily generated.
Fig. 3 Peano’s (1890) and Hilbert’s (1891) plane-filling curves. When the iterations are taken to infinity, the curves approach every point of two-dimensional space arbitrarily closely, giving them a dimension DH = DE = 2, although their topological dimensions are DT = 1.
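Hilbert's version is the easier of the two to put on a computer. The sketch below uses a standard bit-manipulation mapping from position along the curve to grid cell; the function name `hilbert_d2xy` and the choice of orders 1 to 5 are mine, and this is not Hilbert's original geometric description. It checks that each generation visits every cell of a 2^k by 2^k grid exactly once, and prints how the length of the connecting line grows.

```python
def hilbert_d2xy(order, d):
    """Map position d along the Hilbert curve to cell (x, y) on a 2**order grid."""
    x = y = 0
    s = 1
    while s < 2 ** order:
        rx = 1 & (d // 2)
        ry = 1 & (d ^ rx)
        if ry == 0:                       # rotate or reflect the current sub-square
            if rx == 1:
                x, y = s - 1 - x, s - 1 - y
            x, y = y, x
        x += s * rx
        y += s * ry
        d //= 4
        s *= 2
    return x, y

for order in range(1, 6):
    n = 4 ** order
    cells = [hilbert_d2xy(order, d) for d in range(n)]
    assert len(set(cells)) == n           # every cell is visited exactly once
    length = (n - 1) / 2 ** order         # polyline length in units of the square's side
    print(order, n, length)
```

The length roughly doubles with each generation while the curve stays inside the unit square, which is the numerical face of a dimension-one line acquiring dimension two in the limit.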
Helge von Koch (1904)
The space-filling curves of Peano and Hilbert have the extreme property that a one-dimensional curve approaches every point in a two-dimensional space. This ability of a one-dimensional trajectory to fill space mirrored the ergodic hypothesis that Boltzmann relied upon as he developed statistical mechanics. These examples by Peano, Hilbert and Boltzmann inspired searches for continuous curves whose dimensionality similarly exceeded one dimension, yet without filling space. Weierstrass’ Monster was already one such curve, existing in some dimension greater than one but not filling the plane. The construction of the Monster required infinite series of harmonic functions, and the resulting curve was single valued on its domain of real numbers.
An alternative approach was proposed by Helge von Koch (1870 – 1924), a Swedish mathematician with an interest in number theory. He suggested in 1904 that a set of straight line segments could be joined together, and then shrunk by a scale factor to act as new segments of the original pattern. The construction of the Koch curve is shown in Fig. 4. When the process is taken to its limit, it produces a curve, differentiable nowhere, which snakes through two dimensions. When three such curves are joined end to end around an equilateral triangle, the closed curve, with its six-fold symmetry, resembles a snowflake, and the construction is known as “Koch’s Snowflake”.
The Koch curve begins in generation 1 with N0 = 4 elements. These are shrunk by a factor of b = 1/3 to become the four elements of the next generation, and so on. The number of elements varies with the observation scale according to the equation
$$N(b) = b^{-D}$$
where D is called the fractal dimension. In the example of the Koch curve, the fractal dimension is
$$D = \frac{\ln N_0}{\ln(1/b)} = \frac{\ln 4}{\ln 3} = 1.26$$
which is a number less than its embedding dimension DE = 2. The fractal is embedded in 2D but has a fractional dimension that is greater than its topological dimension DT = 1.
Fig. 4 Generation of a Koch curve (1904). The fractal dimension is D = ln(4)/ln(3) = 1.26. At each stage, four elements are reduced in size by a factor of 3. The “length” of the curve approaches infinity as the features get smaller and smaller. But the scaling of the length with size is determined uniquely by the fractal dimension.
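The construction in Fig. 4 is short enough to sketch in Python with complex numbers standing in for points in the plane. The names `koch_points` and `koch_dimension` are illustrative, and the bump is assumed to point to the left of each segment's direction. The code counts segments, tracks the total length, and recovers the dimension from the ratio of logarithms.

```python
import cmath
import math

def koch_points(generations):
    """Vertices of the Koch curve on the unit interval after the given number of refinements."""
    pts = [0j, 1 + 0j]
    turn = cmath.exp(1j * math.pi / 3)        # 60-degree rotation for the apex of each bump
    for _ in range(generations):
        refined = [pts[0]]
        for p, q in zip(pts, pts[1:]):
            d = (q - p) / 3                   # each segment shrinks by the factor b = 1/3
            refined += [p + d, p + d + d * turn, p + 2 * d, q]
        pts = refined
    return pts

def koch_dimension(generations):
    """Estimate D = ln(N) / ln(1/b) from the segment count N and the segment length b."""
    pts = koch_points(generations)
    n_segments = len(pts) - 1
    b = abs(pts[1] - pts[0])
    return math.log(n_segments) / math.log(1 / b)

for g in range(1, 6):
    n = len(koch_points(g)) - 1
    print(g, n, round((4 / 3) ** g, 4), round(koch_dimension(g), 4))
```

Each row shows four times as many segments as the previous one and a total length that grows by 4/3, while the estimated dimension stays at ln(4)/ln(3).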
Waclaw Sierpinski (1915)
Waclaw Sierpinski (1882 – 1969) was a Polish mathematician studying at the Jagiellonian University in Krakow for his doctorate when he came across a theorem that every point in the plane can be defined by a single coordinate. Intrigued by such an unintuitive result, he dived deep into Cantor’s set theory after he was appointed as a faculty member at the university in Lvov. He began to construct curves that had more specific properties than the Peano or Hilbert curves, such as a curve that passes through every interior point of a unit square but that encloses an area that is only equal to 5/12 = 0.4167. Sierpinski became interested in the topological properties of such sets.
Sierpinski considered how to define a curve that was embedded in DE = 2 but that was NOT constructed as a topological dimension DT = 1 curve as the curves of Peano, Hilbert, Koch (and even his own) had been. To demonstrate this point, he described a construction that began with a topological dimension DT = 2 object, a planar triangle, from which the open set of its central inverted triangle is removed, leaving its boundary points. The process is continued iteratively to all scales. The resulting point set is shown in Fig. 5 and is called the Sierpinski gasket. What is left after all the internal triangles are removed is a point set that can be made disconnected by cutting it at a finite set of points. This is shown in Fig. 5 by the red circles. Each circle, no matter the size, cuts the set at three points, making the resulting set disconnected. Ten years later, Karl Menger would show that this property of disconnecting cuts determined the topological dimension of the Sierpinski gasket to be DT = 1. The embedding dimension is of course DE = 2, and the fractal dimension of the Sierpinski gasket is
$$D_H = \frac{\ln 3}{\ln 2} = 1.5850$$
Fig. 5 The Sierpinski gasket. The central triangle is removed (leaving its boundary) at each scale. The pattern is self-similar with a fractal dimension DH = 1.5850. Unintuitively, it has a topological dimension DT = 1.
Felix Hausdorff (1918)
The work by Cantor, Peano, von Koch and Sierpinski had created a crisis in geometry as mathematicians struggled to rescue concepts of dimensionality. An important byproduct of that struggle was a much deeper understanding of concepts of space, especially in the hands of Felix Hausdorff.
Felix Hausdorff (1868 – 1942) was born in Breslau, Prussia, and educated in Leipzig. In his early years as a doctoral student, and as an assistant professor at Leipzig, he was a practicing mathematician by day and a philosopher and playwright by night, publishing under the pseudonym Paul Mongré. He was at the University of Greifswald working on set theory when the Greek mathematician Constantin Carathéodory published a paper in 1914 that showed how to construct a p-dimensional set in a q-dimensional space. Hausdorff realized that he could apply similar ideas to the Cantor set. He showed that the outer measure of the Cantor set would go discontinuously from infinity to zero as the fractional dimension increased smoothly. The critical value where the measure changed its character became known as the Hausdorff dimension.
For the Cantor ternary set, the Hausdorff dimension is exactly DH = ln(2)/ln(3) = 0.6309. This value for the dimension is less than the embedding dimension DE = 1 of the support (the real numbers on the interval [0, 1]), but it is also greater than DT = 0 which would hold for a countable number of points on the interval. The work by Hausdorff became well known in the mathematics community who applied the idea to a broad range of point sets like Weierstrass’s monster and the Koch curve.
It is important to keep in perspective what was understood in each period, before and after Hausdorff’s work. For instance, although the curves of Weierstrass, von Koch and Sierpinski were understood to present a challenge to concepts of dimension, it was only after Hausdorff that mathematicians began to think in terms of fractional dimensions and to calculate the fractional dimensions of these earlier point sets. Despite the fact that Sierpinski created one of the most iconic fractals that we use as an example every day, he was unaware at the time that he was doing so. His interest was topological: creating a curve for which any cut at any point would create disconnected subsets, starting with objects (triangles) with topological dimension DT = 2. In this way, talking about the early fractal objects tends to be anachronistic, using language to describe them that had not yet been invented at that time.
This perspective is also true for the ideas of topological dimension. For instance, even Sierpinski was not fully tuned into the problems of defining topological dimension. It turns out that what he created was a curve of topological dimension DT = 1, but that would only become clear later with the work of the Austrian mathematician Karl Menger.
Karl Menger (1926)
The day that Karl Menger (1902 – 1985) was born, his father, Carl Menger (1840 – 1921) lost his job. Carl Menger was one of the founders of the famous Viennese school that established the marginalist view of economics. However, Carl was not married to Karl’s mother, which was frowned upon by polite Vienna society, so he had to relinquish his professorship. Despite his father’s reduction in status, Karl received an excellent education at a Viennese gymnasium (high school). Among his classmates were Wolfgang Pauli (Nobel Prize for Physics in 1945) and Richard Kuhn (Nobel Prize for Chemistry in 1938). When Karl began attending the University of Vienna he studied physics, but the mathematics professor Hans Hahn opened his eyes to the fascinating work on analysis that was transforming mathematics at that time, so Karl shifted his studies to mathematical analysis, specifically concerning conceptions of “curves”.
Menger made important contributions to the history of fractal dimension as well as the history of topological dimension. In his approach to defining the intrinsic topological dimension of a point set, he described the construction of a point set embedded in three dimensions that had zero volume, an infinite surface area, and a fractal dimension between 2 and 3. The object is shown in Fig. 6 and is called a Menger “sponge”. The Menger sponge is a fractal with a fractal dimension DH = ln(20)/ln(3) = 2.7268. The face of the sponge is also known as the Sierpinski carpet. The fractal dimension of the Sierpinski carpet is DH = ln(8)/ln(3) = 1.8928.
Fig. 6 Menger Sponge. Embedding dimension DE = 3. Fractal dimension DH = ln(20)/ln(3) = 2.7268. Topological dimension DT = 1: all one-dimensional metric spaces can be contained within the Menger sponge point set. Each face is a Sierpinski carpet with fractal dimension DH = ln(8)/ln(3) = 1.8928.
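The claims of zero volume and infinite surface area can be checked by brute force for the first few generations. In this sketch (helper names `menger_cells` and `exposed_faces` are my own, and only levels 0 to 3 are built) the sponge is stored as a set of integer cell coordinates, the 7 cells at the face centres and body centre of every 3 x 3 x 3 block are dropped, and the exposed faces are counted directly.

```python
import math
from itertools import product

def menger_cells(level):
    """Integer coordinates of the small cubes kept after `level` subdivisions of the unit cube."""
    cells = {(0, 0, 0)}
    for _ in range(level):
        cells = {
            (3 * x + i, 3 * y + j, 3 * z + k)
            for (x, y, z) in cells
            for i, j, k in product(range(3), repeat=3)
            if (i == 1) + (j == 1) + (k == 1) < 2   # drop the 6 face centres and the body centre
        }
    return cells

def exposed_faces(cells):
    """Count the faces of the small cubes that are not shared with a neighbouring cube."""
    steps = [(1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, -1, 0), (0, 0, 1), (0, 0, -1)]
    return sum((x + dx, y + dy, z + dz) not in cells
               for (x, y, z) in cells for dx, dy, dz in steps)

for level in range(4):
    cells = menger_cells(level)
    side = 3.0 ** (-level)
    volume = len(cells) * side ** 3
    area = exposed_faces(cells) * side ** 2
    print(level, len(cells), round(volume, 4), round(area, 4))

print("D_H =", round(math.log(20) / math.log(3), 4))
```

The cube count grows as 20^n, the volume falls as (20/27)^n, and the exposed area rises with each level, consistent with the limiting values quoted in the text.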
The striking feature of the Menger sponge is its topological dimension. Menger created a new definition of topological dimension that partially solved the crisis created by Cantor when he showed that every point on the unit square can be defined by a single coordinate. This had put a one-dimensional curve in one-to-one correspondence with a two-dimensional plane. Yet the topology of a 2-dimensional object is clearly different than the topology of a line. Menger found a simple definition that showed why 1D is different, topologically, than 2D, despite Cantor’s conundrum. The answer came from the idea of making cuts on a point set and seeing if the cut created disconnected subsets.
As a simple example, take a 1D line. The removal of a single point creates two disconnected sub-lines. The intersection of the cut with the line is 0-dimensional, and Menger showed that this defined the line as 1-dimensional. Similarly, a line cuts the unit square into two parts. The intersection of the cut with the plane is 1-dimensional, signifying that the dimension of the plane is 2-dimensional. In other words, an (n-1)-dimensional intersection of the boundary of a small neighborhood with the point set indicates that the point set has a dimension of n. Generalizing this idea, looking at the Sierpinski gasket in Fig. 5, the boundary of a small circular region, if placed appropriately (as in the figure), intersects the Sierpinski gasket at three points of dimension zero. Hence, the topological dimension of the Sierpinski gasket is one-dimensional. Menger was likewise able to show that his sponge also had a topology that was one-dimensional, DT = 1, despite the embedding dimension of DE = 3. In fact, all 1-dimensional metric spaces can be fit inside a Menger Sponge.
Benoit Mandelbrot (1967)
Benoit Mandelbrot (1924 – 2010) was born in Warsaw and his family emigrated to Paris in 1935. He attended the Ecole Polytechnique where he studied under Gaston Julia (1893 – 1978) and Paul Levy (1886 – 1971). Both Julia and Levy made significant contributions to the field of self-similar point sets and made a lasting impression on Mandelbrot. He went to Cal Tech for a master’s degree in aeronautics and then a PhD in mathematical sciences from the University of Paris. In 1958 Mandelbrot joined the research staff of the IBM Thomas J. Watson Research Center in Yorktown Heights, New York where he worked for over 35 years on topics of information theory and economics, always with a view of properties of self-similar sets and time series.
In 1967 Mandelbrot published one of his early important papers on the self-similar properties of the coastline of Britain. He proposed that many natural features had statistical self similarity, which he applied to coastlines. He published the work as “How Long Is the Coast of Britain? Statistical Self-Similarity and Fractional Dimension” in Science magazine, where he showed that the length of the coastline diverged with a Hausdorff dimension equal to D = 1.25. Working at IBM, a world leader in computers, he had ready access to their power as well as their visualization capabilities. Therefore, he was one of the first to begin exploring the graphical character of self-similar maps and point sets.
During one of his sabbaticals at Harvard University he began exploring the properties of Julia sets (named after his former teacher at the Ecole Polytechnique). The Julia set is a self-similar point set that is easily visualized in the complex plane (two dimensions). As Mandelbrot studied the convergence or divergence of infinite series defined by the Julia mapping, he discovered an infinitely nested pattern that was both beautiful and complex. This has since become known as the Mandelbrot set.
Later, in 1975, Mandelbrot coined the term fractal to describe these self-similar point sets, and he began to realize that these types of sets were ubiquitous in nature, ranging from the structure of trees and drainage basins, to the patterns of clouds and mountain landscapes. He published his highly successful and influential book The Fractal Geometry of Nature in 1982, introducing fractals to the wider public and launching a generation of hobbyists interested in computer-generated fractals. The rise of fractal geometry coincided with the rise of chaos theory that was aided by the same computing power. For instance, important geometric structures of chaos theory, known as strange attractors, have fractal geometry.
Appendix: Box Counting
When confronted by a fractal of unknown structure, one of the simplest methods to find the fractal dimension is through box counting. This method is shown in Fig. 8. The fractal set is covered by a set of boxes of size b, and the number of boxes N(b) that contain at least one point of the fractal set is counted. As the boxes are reduced in size, the number of covering boxes increases as
$$N(b) \propto b^{-D}$$
To be numerically accurate, this method must be iterated over several orders of magnitude. The number of boxes covering a fractal has this characteristic power law dependence, as shown in Fig. 8, and the fractal dimension is obtained as the slope of log N(b) plotted against log(1/b).
Fig. 8 Calculation of the fractal dimension using box counting. At each generation, the size of the grid is reduced by a factor of 3. The number of boxes that contain some part of the fractal curve increases as $b^{-D}$, where b is the scale.
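Box counting is easy to try on a set whose answer is already known. The sketch below is not the calculation behind Fig. 8: as an assumption for illustration it uses a right-angled lattice version of the Sierpinski gasket (the lattice points whose binary coordinates never share a 1), halves the boxes instead of dividing by 3, and fits the slope by ordinary least squares. All function names are hypothetical.

```python
import math

def sierpinski_points(m):
    """Lattice points (x, y) with 0 <= x, y < 2**m whose binary digits never overlap."""
    n = 2 ** m
    return [(x, y) for x in range(n) for y in range(n) if x & y == 0]

def box_count(points, box):
    """Number of boxes of edge `box` (in lattice units) containing at least one point."""
    return len({(x // box, y // box) for x, y in points})

def fit_slope(xs, ys):
    """Least-squares slope of ys against xs."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sum((x - mx) ** 2 for x in xs)

m = 7
pts = sierpinski_points(m)
boxes = [2 ** s for s in range(m)]                     # box edges 1, 2, 4, ..., 64
log_inv_b = [math.log(2 ** m / box) for box in boxes]  # b is the box edge relative to the full set
log_N = [math.log(box_count(pts, box)) for box in boxes]

D = fit_slope(log_inv_b, log_N)
print("box-counting D =", round(D, 4))
print("ln(3)/ln(2)    =", round(math.log(3) / math.log(2), 4))
```

For this exactly self-similar lattice set the counts fall on a perfect power law, so the fitted slope reproduces ln(3)/ln(2). For a measured or irregular fractal the points scatter, and the fit should be restricted to the range of scales where the log-log plot is straight.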
 Hardy, G. (1916). “Weierstrass’s non-differentiable function.” Transactions of the American Mathematical Society 17: 301-325.
 Weierstrass, K. (1872). “Über continuirliche Functionen eines reellen Arguments, die für keinen Werth des letzteren einen bestimmten Differentialquotienten besitzen.” Communication à l’Académie Royale des Sciences II: 71-74.
 Besicovitch, A. S. and H. D. Ursell (1937). “Sets of fractional dimensions: On dimensional numbers of some continuous curves.” J. London Math. Soc. 1(1): 18-25.
 Shen, W. (2018). “Hausdorff dimension of the graphs of the classical Weierstrass functions.” Mathematische Zeitschrift. 289(1–2): 223–266.
 Cantor, G. (1883). Grundlagen einer allgemeinen Mannigfaltigkeitslehre. Leipzig, B. G. Teubner.
 Peano, G. (1890). “Sur une courbe qui remplit toute une aire plane.” Mathematische Annalen 36: 157-160.
 Peano, G. (1888). Calcolo geometrico secondo l’Ausdehnungslehre di H. Grassmann e preceduto dalle operazioni della logica deduttiva. Turin, Fratelli Bocca Editori.
 Von Koch, H. (1904). “Sur une courbe continue sans tangente obtenue par une construction géométrique élémentaire.” Arkiv för Matematik, Astronomi och Fysik 1: 681-704.
 Sierpinski, W. (1915). “Sur une courbe dont tout point est un point de ramification.” Comptes Rendus Hebdomadaires des Seances de l’Academie des Sciences de Paris 160: 302-305.
 Carathéodory, C. (1914). “Über das lineare Mass von Punktmengen – eine Verallgemeinerung des Längenbegriffs.” Gött. Nachr. IV: 404–406.
 Hausdorff, F. (1919). “Dimension und äußeres Maß.” Mathematische Annalen 79: 157-179.
 Menger, Karl (1926), “Allgemeine Räume und Cartesische Räume. I.”, Communications to the Amsterdam Academy of Sciences. English translation reprinted in Edgar, Gerald A., ed. (2004), Classics on fractals, Studies in Nonlinearity, Westview Press. Advanced Book Program, Boulder, CO
 B Mandelbrot, How Long Is the Coast of Britain? Statistical Self-Similarity and Fractional Dimension. Science, 156 3775 (May 5, 1967): 636-638.
Quantum sensors have amazing powers. They can detect the presence of an obstacle without ever interacting with it. For instance, consider a bomb that is coated with a light sensitive layer that sets off the bomb if it absorbs just a single photon. Then put this bomb inside a quantum sensor system and shoot photons at it. Remarkably, using the weirdness of quantum mechanics, it is possible to design the system in such a way that you can detect the presence of the bomb using photons without ever setting it off. How can photons see the bomb without illuminating it? The answer is a bizarre side effect of quantum physics in which quantum wavefunctions are recognized as the root of reality as opposed to the pesky wavefunction collapse at the moment of measurement.
The ability for a quantum system to see an object with light, without exposing it, is uniquely a quantum phenomenon that has no classical analog.
All Paths Lead to Feynman
When Richard Feynman was working on his PhD under John Archibald Wheeler at Princeton in the early 1940’s he came across an obscure paper written by Paul Dirac in 1933 that connected quantum physics with classical Lagrangian physics. Dirac had recognized that the phase of a quantum wavefunction was analogous to the classical quantity called the “Action” that arises from Lagrangian physics. Building on this concept, Feynman constructed a new interpretation of quantum physics, known as the “many histories” interpretation, that occupies the middle ground between Schrödinger’s wave mechanics and Heisenberg’s matrix mechanics. One of the striking consequences of the many histories approach is the emergence of the principle of least action—a classical concept—into interpretations of quantum phenomena. In this approach, Feynman considered ALL possible histories for the propagation of a quantum particle from one point to another, he tabulated the quantum action in the phase factor, and he then summed all of these histories.
One of the simplest consequences of the sum over histories is a quantum interpretation of Snell’s law of refraction in optics. When summing over all possible trajectories of a photon from a point above to a point below an interface, there is a subset of paths for which the action integral varies very little from one path in the subset to another. The consequence of this is that the phases of all these paths add constructively, producing a large amplitude to the quantum wavefunction along the centroid of these trajectories. Conversely, for paths far away from this subset, the action integral takes on many values and the phases tend to interfere destructively, canceling the wavefunction along these other paths. Therefore, the most likely path of the photon between the two points is the path of maximum constructive interference and hence the path of stationary action. It is simple to show that this path is none other than the classical path determined by Snell’s Law and equivalently by Fermat’s principle of least time. With the many histories approach, we can add the principle of least (or stationary) action to the list of explanations of Snell’s Law. This argument holds as well for an electron (with mass and a de Broglie wavelength) as it does for a photon, so this is not just a coincidence specific to optics but is a fundamental part of quantum physics.
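The classical end of this argument can be verified numerically. The sketch below does not perform the sum over histories; it only finds the path of stationary (here minimum) optical path length among straight two-segment paths and checks that it obeys Snell's law. The refractive indices, the end points and the ternary search are assumptions chosen for illustration.

```python
import math

N1, N2 = 1.0, 1.5        # refractive indices above and below the interface (assumed values)
A = (0.0, 1.0)           # source, one unit above the interface y = 0
B = (1.0, -1.0)          # destination, one unit below the interface

def optical_path(x):
    """Optical path length, proportional to travel time, when crossing the interface at x."""
    return N1 * math.hypot(x - A[0], A[1]) + N2 * math.hypot(B[0] - x, B[1])

# Ternary search for the minimum; the optical path is a convex function of x.
lo, hi = A[0], B[0]
for _ in range(100):
    m1 = lo + (hi - lo) / 3
    m2 = hi - (hi - lo) / 3
    if optical_path(m1) < optical_path(m2):
        hi = m2
    else:
        lo = m1
x_star = 0.5 * (lo + hi)

sin1 = (x_star - A[0]) / math.hypot(x_star - A[0], A[1])   # sine of the angle of incidence
sin2 = (B[0] - x_star) / math.hypot(B[0] - x_star, B[1])   # sine of the angle of refraction
print("crossing point:", round(x_star, 6))
print("n1 sin(theta1) =", round(N1 * sin1, 6))
print("n2 sin(theta2) =", round(N2 * sin2, 6))
```

The two printed products agree, which is Snell's law. Paths crossing near x_star change the optical path length only to second order, and that is the bundle of histories whose phases add constructively.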
A more subtle consequence of the sum over histories view of quantum phenomena is Young’s double slit experiment for electrons, shown at the top of Fig 1. The experiment consists of a source that emits only a single electron at a time that passes through a double-slit mask to impinge on an electron detection screen. The wavefunction for a single electron extends continuously throughout the full spatial extent of the apparatus, passing through both slits. When the two paths intersect at the screen, the difference in the quantum phases of the two paths causes the combined wavefunction to have regions of total constructive interference and other regions of total destructive interference. The probability of detecting an electron is proportional to the squared amplitude of the wavefunction, producing a pattern of bright stripes separated by darkness. At positions of destructive interference, no electrons are detected when both slits are open. However, if an opaque plate blocks the upper slit, then the interference pattern disappears, and electrons can be detected at those previously dark locations. Therefore, the presence of the object can be deduced by the detection of electrons at locations that should be dark.
Fig. 1 Demonstration of the sum over histories in a double-slit experiment for electrons. In the upper frame, the electron interference pattern on the phosphorescent screen produces bright and dark stripes. No electrons hit the screen in a dark stripe. When the upper slit is blocked (bottom frame), the interference pattern disappears, and an electron can arrive at the location that had previously been dark.
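The logic of Fig. 1 reduces to adding two complex amplitudes. The following is a deliberately idealized sketch, assuming point-like slits of equal amplitude and ignoring the single-slit diffraction envelope; the function name and the normalization of 1/sqrt(2) per slit are my own choices. It compares the detection probability at a bright fringe and at a dark fringe, with the upper slit open and blocked.

```python
import cmath
import math

def screen_probability(delta, upper_open=True):
    """Relative detection probability for a phase difference delta between the two paths."""
    lower = 1 / math.sqrt(2)                                       # amplitude via the lower slit
    upper = cmath.exp(1j * delta) / math.sqrt(2) if upper_open else 0
    return abs(lower + upper) ** 2

bright, dark = 0.0, math.pi    # phase differences at a bright and a dark fringe
for label, delta in (("bright fringe", bright), ("dark fringe", dark)):
    both = screen_probability(delta)
    blocked = screen_probability(delta, upper_open=False)
    print(f"{label}: both slits open {both:.3f}, upper slit blocked {blocked:.3f}")
```

With both slits open the probability at the dark fringe vanishes; with the upper slit blocked it rises to the same value as everywhere else on the screen, so a single electron landing there signals the obstruction.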
Consider now when the opaque plate is an electron-sensitive detector. In this case, a single electron emitted by the source can be detected at the screen or at the plate. If it is detected at the screen, it can appear at the location of a dark fringe, heralding the presence of the opaque plate. Yet the quantum conundrum is that when the electron arrives at a dark fringe, it must be detected there as a whole, it cannot be detected at the electron-sensitive plate too. So how does the electron sense the presence of the detector without exposing it, without setting it off?
In Feynman’s view, the electron does set off the detector as one possible history. And that history interferes with the other possible history when the electron arrives at the screen. While that interpretation may seem weird, mathematically it is a simple statement that the plate blocks the wavefunction from passing through the upper slit, so the wavefunction in front of the screen, resulting from all possible paths, has no interference fringes (other than possible diffraction from the lower slit). From this point of view, the wavefunction samples all of space, including the opaque plate, and the eventual absorption of the electron one place or another has no effect on the wavefunction. In this sense, it is the wavefunction, prior to any detection event, that samples reality. If the single electron happens to show up at a dark fringe at the screen, the plate, through its effects on the total wavefunction, has been detected without interacting with the electron.
This phenomenon is known as an interaction-free measurement, but there are definitely some semantics issues here. Just because the plate doesn’t absorb the electron, it doesn’t mean that the plate plays no role. The plate certainly blocks the wavefunction from passing through the upper slit. This might be called an “interaction”, but that phrase is better reserved for when the electron is actually absorbed, while the role of the plate in shaping the wavefunction is better described as one of the possible histories.
Quantum Seeing in the Dark
Although Feynman was thinking hard (and clearly) about these issues as he presented his famous lectures in physics at Caltech during 1961 to 1963, the specific possibility of interaction-free measurement dates more recently to 1993 when Avshalom C. Elitzur and Lev Vaidman at Tel Aviv University suggested a simple Michelson interferometer configuration that could detect an object half of the time without interacting with it. They are the ones who first pressed this point home by thinking of a light-sensitive bomb. There is no mistaking when a bomb goes off, so it tends to give an exaggerated demonstration of the interaction-free measurement.
The Michelson interferometer for interaction-free measurement is shown in Fig. 2. This configuration uses a half-silvered beamsplitter to split the possible photon paths. When photons hit the beamsplitter, they either continue traveling to the right, or are deflected upwards. After reflecting off the mirrors, the photons again encounter the beamsplitter, where, in each case, they continue undeflected or are reflected. The result is that two paths combine at the beamsplitter to travel to the detector, while two other paths combine to travel back along the direction of the incident beam.
Fig. 2 A quantum-seeing in the dark (QSD) detector with a photo-sensitive bomb. A single photon is sent into the interferometer at a time. If the bomb is NOT present, destructive interference at the detector guarantees that the photon is not detected. However, if the bomb IS present, it destroys the destructive interference and the photon can arrive at the detector. That photon heralds the presence of the bomb without setting it off. (Reprinted from Mind @ Light Speed)
The paths of the light beams can be adjusted so that the beams that combine to travel to the detector experience perfect destructive interference. In this situation, the detector never detects light, and all the light returns along the direction of the incident beam. Quantum mechanically, when only a single photon is present in the interferometer at a time, we would say that the quantum wavefunction of the photon interferes destructively along the path to the detector, and constructively along the path opposite to the incident beam, so the detector detects no photons. As long as both paths are unobstructed, the detector makes no detections.
Now place the light-sensitive bomb in the upper path. Because this path is no longer available to the photon wavefunction, the destructive interference of the wavefunction along the detector path is removed. Now when a single photon is sent into the interferometer, three possible things can happen. One, the photon is reflected by the beamsplitter and detonates the bomb. Two, the photon is transmitted by the beamsplitter, reflects off the right mirror, and is transmitted again by the beamsplitter to travel back down the incident path without being detected by the detector. Three, the photon is transmitted by the beamsplitter, reflects off the right mirror, and is reflected off the beamsplitter to be detected by the detector.
In this third case, the photon is detected AND the bomb does NOT go off, which succeeds at quantum seeing in the dark. The odds are much better than for Young’s experiment. If the bomb is present, it will detonate a maximum of 50% of the time. The other 50%, you will either detect a photon (signifying the presence of the bomb), or else you will not detect a photon (giving an ambiguous answer and requiring you to perform the experiment again). When you perform the experiment again, you again have a 50% chance of detonating the bomb, and a 25% chance of detecting it without it detonating, but again a 25% chance of not detecting it, and so forth. All in all, every time you send in a photon, you have one chance in four of seeing the bomb without detonating it. These are much better odds than for the Young’s apparatus where only exact detection of the photon at a forbidden location would signify the presence of the bomb.
It is possible to increase your odds above one chance in four by decreasing the reflectivity of the beamsplitter. In practice, this is easy to do simply by depositing less and less aluminum on the surface of the glass plate. When the reflectivity gets very low, let us say at the level of 1%, then most of the time the photon just travels back along the direction it came and you have an ambiguous result. On the other hand, when the photon does not return, detonation and detection are nearly equally probable. This means that, though you may send in many photons, your odds for eventually seeing the bomb without detonating it are nearly 50%, which is a factor of two better than the one-in-four single-photon odds for the half-silvered beamsplitter. A version of this experiment was performed by Paul Kwiat in 1995 as a postdoc at Innsbruck with Anton Zeilinger. It was Kwiat who coined the phrase “quantum seeing in the dark” as a catchier version of “interaction-free measurement”.
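The odds quoted above can be tallied with a few lines of Python. This is only a bookkeeping sketch, assuming a lossless beamsplitter with reflectivity R, a perfect detector, and the bomb sitting in the reflected arm; the function name `bomb_odds` is made up for this illustration.

```python
def bomb_odds(reflectivity):
    """Outcome probabilities for one photon, with the bomb in the reflected arm.

    Assumes a lossless beamsplitter (T = 1 - R) and a perfect detector.
    """
    R = reflectivity
    T = 1.0 - R
    p_boom = R           # reflected at the first pass: hits the bomb
    p_detect = T * R     # transmitted, then reflected into the detector
    p_return = T * T     # transmitted twice: heads back to the source (ambiguous)
    # Keep sending photons after every ambiguous result: the chance that the
    # run ends with a detection rather than a detonation.
    p_eventual = p_detect / (p_detect + p_boom)
    return p_boom, p_detect, p_return, p_eventual


for R in (0.5, 0.1, 0.01):
    boom, detect, ret, eventual = bomb_odds(R)
    print(f"R = {R:4.2f}: boom {boom:.4f}, detect {detect:.4f}, "
          f"return {ret:.4f}, eventual detection {eventual:.4f}")
```

For the half-silvered case this gives the 50% / 25% / 25% split for a single photon. If you keep retrying after ambiguous results, the eventual odds are one in three, and they climb toward one in two as the reflectivity is lowered.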
A 50% chance of detecting the bomb without setting it off sounds amazing, until you think that there is a 50% chance that it will go off and kill you. Then those odds don’t look so good. But optical phenomena never fail to surprise, and they never let you down. A crucial set of missing elements in the simple Michelson experiment was polarization-control using polarizing beamsplitters and polarization rotators. These are common elements in many optical systems, and when they are added to the Michelson quantum sensor, they can give almost a 100% chance of detecting the bomb without setting it off using the quantum Zeno effect.
The Quantum Zeno Effect
Photons carry polarization as their prime quantum number, with two possible orientations. These can be defined in different ways, but the two possible polarizations are orthogonal to each other. For instance, these polarization pairs can be vertical (V) and horizontal (H), or they can be right circular and left circular. One of the principles of quantum state evolution is that a quantum wavefunction can be maintained in a specific state, even if it has a tendency naturally to drift out of that state, by repeatedly making a quantum measurement that seeks to measure deviations from that state. In practice, the polarization of a photon can be maintained by repeatedly passing it through a polarizing beamsplitter with the polarization direction parallel to the original polarization of the photon. If there is a deviation in the photon polarization direction by a small angle, then a detector on the side port of the polarizing beamsplitter will fire with a probability equal to the square of the sine of the deviation. If the deviation angle is very small, say Δθ, then the probability of measuring the deviation is proportional to (Δθ)², which is an even smaller number. Furthermore, the probability that the photon will transmit through the polarizing beamsplitter is equal to 1 − (Δθ)², which is nearly 100%.
This is what happens in Fig. 3 when the photo-sensitive bomb IS present. A single H-polarized photon is injected through a switchable mirror into the interferometer on the right. In the path of the photon is a polarization rotator that rotates the polarization by a small angle Δθ. There is nearly a 100% chance that the photon will transmit through the polarizing beamsplitter with perfect H-polarization, reflect from the mirror, and return through the polarizing beamsplitter, again with perfect H-polarization. It then passes back through the polarization rotator to the switchable mirror, where it reflects, gains another small increment to its polarization angle, and transmits through the beamsplitter again, and so on. At each pass, the photon polarization is repeatedly “measured” to be horizontal. After a number of passes N = π/(2Δθ), the photon is switched out of the interferometer and is transmitted through the external polarizing beamsplitter where it is detected at the H-photon detector.
Now consider what happens when the bomb IS NOT present. This time, even though there is a high amplitude for the transmitted photon, there is that Δθ amplitude for reflection out the V port. This small V-amplitude, when it reflects from the mirror, recombines with the H-amplitude at the polarizing beamsplitter to produce a polarization that has the same tilted polarization that it started with, sending it back in the direction from which it came. (In this situation, the detector on the “dark” port of the internal beamsplitter never sees the photon because of destructive interference along this path.) The photon then passes through the polarization rotator once more, its polarization is rotated again, and so on. Now, after a number of passes N = π/(2Δθ), the photon has acquired a V polarization and is switched out of the interferometer. At the external polarizing beamsplitter it is reflected out of the V-port where it is detected at the V-photon detector.
Fig. 3 Quantum Zeno effect for interaction-free measurement. If the bomb is present, the H-photon detector detects the output photon without setting the bomb off. The switchable mirror ejects the photon after it makes π/(2Δθ) round trips in the polarizing interferometer.
The two end results of this thought experiment are absolutely distinct, giving a clear answer to the question whether the bomb is present or not. If the bomb IS present, the H-detector fires. If the bomb IS NOT present, then the V-detector fires. Through all of this, the chance to set off the bomb is almost zero. Therefore, this quantum Zeno interaction-free measurement detects the bomb with nearly 100% efficiency with almost no chance of setting it off. This is the amazing consequence of quantum physics. The wavefunction is affected by the presence of the bomb, altering the interference effects that allow the polarization to rotate. But the likelihood of a photon being detected by the bomb is very low.
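To put numbers on "nearly 100%", here is a small sketch of the two cases. It assumes an idealized model in which each pass rotates the polarization by Δθ, the bomb acts as a perfect projective measurement on the V arm, and there are no optical losses; the function name `zeno_outcomes` is invented for this example.

```python
import math


def zeno_outcomes(delta_theta):
    """Idealized quantum Zeno bookkeeping for a rotation of delta_theta per pass."""
    n_passes = round(math.pi / (2.0 * delta_theta))

    # Bomb present: every pass projects the photon back onto H with
    # probability cos^2(delta_theta); otherwise the bomb absorbs it.
    p_h_detector = math.cos(delta_theta) ** (2 * n_passes)
    p_detonation = 1.0 - p_h_detector

    # Bomb absent: the small rotations add up coherently to n * delta_theta.
    p_v_detector = math.sin(n_passes * delta_theta) ** 2

    return n_passes, p_h_detector, p_detonation, p_v_detector


for steps in (10, 100, 1000):
    n, p_h, p_boom, p_v = zeno_outcomes(math.pi / (2 * steps))
    print(f"N = {n:5d}: bomb present -> H fires {p_h:.4f}, detonation {p_boom:.4f}; "
          f"bomb absent -> V fires {p_v:.4f}")
```

In this model the detonation probability falls roughly as π²/(4N), so more passes with a smaller rotation per pass push the odds of a safe detection toward certainty.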
On a side note: Although ultrafast switchable mirrors do exist, the experiment was much easier to perform by creating a helix in the optical path through the system so that there is only a finite number of bounces of the photon inside the cavity. See Ref.  for details.
In conclusion, the ability of a quantum system to see an object with light, without exposing it, is uniquely a quantum phenomenon that has no classical analog. No E&M wave description can explain this effect.
I first wrote about quantum seeing in the dark in my 2001 book on the future of optical physics and technology: Nolte, D. D. (2001). Mind at Light Speed: A New Kind of Intelligence. (New York, Free Press)
More on the story of Feynman and Wheeler and what they were trying to accomplish is told in Chapter 8 of Galileo Unbound on the physics and history of dynamics: Nolte, D. D. (2018). Galileo Unbound: A Path Across Life, the Universe and Everything (Oxford University Press).
Paul Kwiat introduced the world to interaction-free measurements in 1995, and described them in this illuminating Scientific American article: Kwiat, P., H. Weinfurter and A. Zeilinger (1996). “Quantum seeing in the dark – Quantum optics demonstrates the existence of interaction-free measurements: the detection of objects without light-or anything else-ever hitting them.” Scientific American 275(5): 72-78.
Elitzur, A. C. and L. Vaidman (1993). “Quantum-mechanical interaction-free measurements.” Foundations of Physics 23(7): 987-997.
Kwiat, P., H. Weinfurter, T. Herzog, A. Zeilinger and M. A. Kasevich (1995). “Interaction-free measurement.” Physical Review Letters 74(24): 4763-4766.
The butterfly effect is one of the most widely known principles of chaos theory. It has become a meme, propagating through popular culture in movies, books, TV shows and even casual conversation.
Can a butterfly flapping its wings in Florida send a hurricane to New York?
The origin of the butterfly effect is — not surprisingly — the image of a butterfly-like set of trajectories that was generated, in one of the first computer simulations of chaos theory, by Edward Lorenz.
When Edward Lorenz (1917 – 2008) was a child, he memorized all perfect squares up to ten thousand. This obvious interest in mathematics led him to a master’s degree in the subject at Harvard in 1940 under the supervision of George Birkhoff. Lorenz’s master’s thesis was on an aspect of Riemannian geometry, but his foray into nonlinear dynamics was triggered by the intervention of World War II. Only a few months before he was to receive his doctorate in mathematics from Harvard, the Japanese bombed Pearl Harbor.
Lorenz left the PhD program at Harvard to join the United States Army Air Force to train as a weather forecaster in early 1942, and he took courses on forecasting and meteorology at MIT. After receiving a second master’s degree, this time in meteorology, Lorenz was posted to Hawaii, then to Saipan and finally to Guam. His area of expertise was in high-level winds, which were important for high-altitude bombing missions during the final months of the war in the Pacific. After the Japanese surrender, Lorenz returned to MIT, where he continued his studies in meteorology, receiving his doctorate degree in 1948 with a thesis on the application of fluid dynamical equations to predict the motion of storms.
One of Lorenz’ colleagues at MIT was Norbert Wiener (1894 – 1964), with whom he sometimes played chess during lunch at the faculty club. Wiener had published his landmark book Cybernetics: Or Control and Communication in the Animal and the Machine in 1948, which arose out of the apparently mundane problem of gunnery control during the Second World War. As an abstract mathematician, Wiener attempted to apply his cybernetic theory to the complexities of weather, and he developed a theorem concerning nonlinear fluid dynamics which appeared to show that linear interpolation, of sufficient resolution, would suffice for weather forecasting, possibly even long-range forecasting. Many on the meteorology faculty embraced this theorem because it fell in line with common practices of the day in which tomorrow’s weather was predicted using linear regression on measurements taken today. However, Lorenz was skeptical, having acquired a detailed understanding of atmospheric energy cascades as larger vortices induced smaller vortices all the way down to the molecular level, dissipating as heat, and then all the way back up again as heat drove large-scale convection. This was clearly not a system that would yield to linearization. Therefore, Lorenz determined to solve nonlinear fluid dynamics models to test this conjecture.
Even with a computer in hand, the atmospheric equations needed to be simplified to make the calculations tractable. Lorenz was more a scientist than an engineer, and more of a meteorologist than a forecaster. He did not hesitate to make simplifying assumptions if they retained the correct phenomenological behavior, even if they no longer allowed for accurate weather predictions.
He had simplified the number of atmospheric equations down to twelve. Progress was good, and by 1961, he had completed a large initial numerical study. He focused on nonperiodic solutions, which he suspected would deviate significantly from the predictions made by linear regression, and this hunch was vindicated by his numerical output. One day, as he was testing his results, he decided to save time by starting the computations midway by using mid-point results from a previous run as initial conditions. He typed in the three-digit numbers from a paper printout and went down the hall for a cup of coffee. When he returned, he looked at the printout of the twelve variables and was disappointed to find that they were not related to the previous full-time run. He immediately suspected a faulty vacuum tube, as often happened. But as he looked closer at the numbers, he realized that, at first, they tracked very well with the original run, but then began to diverge more and more rapidly until they lost all connection with the first-run numbers. His initial conditions were correct to a part in a thousand, but this small error was magnified exponentially as the solution progressed.
At this point, Lorenz recalled that he “became rather excited”. He was looking at a complete breakdown of predictability in atmospheric science. If radically different behavior arose from the smallest errors, then no measurements would ever be accurate enough to be useful for long-range forecasting. At a more fundamental level, this was a break with a long-standing tradition in science and engineering that clung to the belief that small differences produced small effects. What Lorenz had discovered, instead, was that the deterministic solution to his 12 equations was exponentially sensitive to initial conditions (known today as SIC).
The Lorenz Equations
Over the following months, he was able to show that SIC was a result of the nonperiodic solutions. The more Lorenz became familiar with the behavior of his equations, the more he felt that the 12-dimensional trajectories had a repeatable shape. He tried to visualize this shape, to get a sense of its character, but it is difficult to visualize things in twelve dimensions, and progress was slow. Then Lorenz found that when the solution was nonperiodic (the necessary condition for SIC), four of the variables settled down to zero, leaving all the dynamics to the remaining three variables.
Lorenz narrowed the equations of atmospheric instability down to three variables: the stream function, the change in temperature and the deviation in linear temperature. The only parameter in the stream function is something known as the Prandtl Number. This is a dimensionless number which is the ratio of the kinematic viscosity of the fluid to its thermal diffusion coefficient and is a physical property of the fluid. The only parameter in the change in temperature is the Rayleigh Number which is a dimensionless parameter proportional to the difference in temperature between the top and the bottom of the fluid layer. The final parameter, in the equation for the deviation in linear temperature, is the ratio of the height of the fluid layer to the width of the convection rolls. The final simplified model is given by the flow equations

$$\dot{x} = \sigma (y - x), \qquad \dot{y} = x (r - z) - y, \qquad \dot{z} = x y - b z$$

where x is the stream function, y is the change in temperature, z is the deviation in linear temperature, σ is the Prandtl number, r is the Rayleigh number and b is the geometric ratio.
Lorenz finally had a 3-variable dynamical system that displayed chaos. Moreover, it had a three-dimensional state space that could be visualized directly. He ran his simulations, exploring the shape of the trajectories in three-dimensional state space for a wide range of initial conditions, and the trajectories did indeed always settle down to restricted regions of state space. They relaxed in all cases to a sort of surface that was elegantly warped, with wing-like patterns like a butterfly, as the state point of the system followed its dynamics through time. The attractor of the Lorenz equations was strange. Later, in 1971, David Ruelle (1935 – ), a Belgian-French mathematical physicist named this a “strange attractor”, and this name has become a standard part of the language of the theory of chaos.
The first graphical representation of the butterfly attractor is shown in Fig. 1 drawn by Lorenz for his 1963 publication.
Using our modern plotting ability, the 3D character of the butterfly is shown in Fig. 2.
A projection onto the x-y plane is shown in Fig. 3. In the full 3D state space the trajectories never overlap, but in the projection onto a 2D plane the trajectories are moving above and below each other.
It is called a strange attractor because all initial conditions relax onto it, yet every trajectory on the attractor separates exponentially from neighboring trajectories, displaying the classic SIC property of chaos. So here is an elegant collection of trajectories that are certainly not just random noise, yet detailed prediction is still impossible. Deterministic chaos has significant structure, and generates beautiful patterns, without actual “randomness”.
# -*- coding: utf-8 -*-
"""
Created on Mon Apr 16 07:38:57 2018

Introduction to Modern Dynamics, 2nd edition (Oxford University Press, 2019)

Lorenz model of atmospheric turbulence
"""
import numpy as np
from scipy import integrate
from matplotlib import pyplot as plt
from mpl_toolkits.mplot3d import Axes3D  # noqa: F401  (registers the '3d' projection on older matplotlib)


def solve_lorenz(N=12, angle=0.0, max_time=8.0, sigma=10.0, beta=8./3, rho=28.0):
    """Integrate N randomly started Lorenz trajectories and draw them in 3D."""
    fig = plt.figure()
    ax = fig.add_axes([0, 0, 1, 1], projection='3d')
    # the axes limits are left to matplotlib's autoscaling

    def lorenz_deriv(x_y_z, t0, sigma=sigma, beta=beta, rho=rho):
        """Compute the time-derivative of a Lorenz system."""
        x, y, z = x_y_z
        return [sigma * (y - x), x * (rho - z) - y, x * y - beta * z]

    # Choose random starting points, uniformly distributed from -10 to 10
    x0 = -10 + 20 * np.random.random((N, 3))

    # Solve for the trajectories
    t = np.linspace(0, max_time, int(500*max_time))
    x_t = np.asarray([integrate.odeint(lorenz_deriv, x0i, t)
                      for x0i in x0])

    # choose a different color for each trajectory
    # colors = plt.cm.viridis(np.linspace(0, 1, N))
    # colors = plt.cm.rainbow(np.linspace(0, 1, N))
    colors = plt.cm.prism(np.linspace(0, 1, N))

    for i in range(N):
        x, y, z = x_t[i, :, :].T
        ax.plot(x, y, z, '-', c=colors[i])

    # use the angle argument as the azimuth of the 3D view
    ax.view_init(30, angle)

    return t, x_t


t, x_t = solve_lorenz(angle=0, N=12)

# time series of x, y and z for three of the trajectories
plt.figure()
lines = plt.plot(t, x_t[1, :, 0], t, x_t[1, :, 1], t, x_t[1, :, 2])
lines = plt.plot(t, x_t[2, :, 0], t, x_t[2, :, 1], t, x_t[2, :, 2])
lines = plt.plot(t, x_t[10, :, 0], t, x_t[10, :, 1], t, x_t[10, :, 2])
plt.show()
To explore the parameter space of the Lorenz attractor, the key parameters to change are sigma (the Prandtl number), rho (the Rayleigh number, r) and beta (the geometric ratio, b), which are keyword arguments of solve_lorenz in the Python code.
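The sensitivity to initial conditions that caught Lorenz by surprise is easy to reproduce with the three-variable model. The sketch below is a separate illustration, not part of the program above: it starts two trajectories that differ by one part in a thousand in x and tracks the distance between them. The starting point, the integration time and the tolerances are arbitrary choices made for this example.

```python
import numpy as np
from scipy.integrate import solve_ivp


def lorenz(t, state, sigma=10.0, rho=28.0, beta=8.0 / 3.0):
    """Right-hand side of the three-variable Lorenz model."""
    x, y, z = state
    return [sigma * (y - x), x * (rho - z) - y, x * y - beta * z]


def separation(delta0=1e-3, t_max=25.0, n_points=2501):
    """Distance between two trajectories whose starting points differ by delta0 in x."""
    t_eval = np.linspace(0.0, t_max, n_points)
    start_a = np.array([1.0, 1.0, 20.0])           # arbitrary starting point
    start_b = start_a + np.array([delta0, 0.0, 0.0])
    options = dict(t_eval=t_eval, rtol=1e-9, atol=1e-12)
    traj_a = solve_ivp(lorenz, (0.0, t_max), start_a, **options).y
    traj_b = solve_ivp(lorenz, (0.0, t_max), start_b, **options).y
    return t_eval, np.linalg.norm(traj_a - traj_b, axis=0)


t_sep, dist = separation()
for t_mark in (0.0, 5.0, 10.0, 15.0, 20.0, 25.0):
    idx = np.argmin(np.abs(t_sep - t_mark))
    print(f"t = {t_mark:5.1f}   separation = {dist[idx]:.3e}")
```

The separation stays small at first, then grows until it is comparable to the size of the attractor itself. After that the two runs are unrelated, which is just what Lorenz saw on his printout.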
E. N. Lorenz, The Essence of Chaos (The Jessie and John Danz Lectures). Seattle: University of Washington Press, 1993.
E. N. Lorenz, “Deterministic Nonperiodic Flow,” Journal of the Atmospheric Sciences, vol. 20, no. 2, pp. 130-141, 1963.
Well here is another squeaker! The 2020 U. S. presidential election was a dead heat. What is most striking is that half of the past six US presidential elections have been won by less than 1% of the votes cast in certain key battleground states. For instance, in 2000 the election was won in Florida by less than 1/100th of a percent of the total votes cast.
How can so many elections be so close? This question is especially intriguing when one considers the 2020 election, which should have been strongly asymmetric, because one of the two candidates had such serious character flaws. It is also surprising because the country is NOT split 50/50 between urban and rural populations (it’s more like 60/40). And the split of Democrat/Republican is about 33/29 — close, but not as close as the election. So how can the vote be so close so often? Is this a coincidence? Or something fundamental about our political system? The answer lies (partially) in nonlinear dynamics coupled with the libertarian tendencies of American voters.
Rabbits and Sheep
Elections are complex dynamical systems consisting of approximately 140 million degrees of freedom (the voters). Yet US elections are also surprisingly simple. They are dynamical systems with only 2 large political parties, and typically a very small third party.
Voters in a political party are not too different from species in an ecosystem. There are many population dynamics models of things like rabbits and sheep that seek to understand the steady-state solutions when two species vie for the same feedstock (or two parties vie for the same votes). Depending on reproduction rates and competition payoff, one species can often drive the other species to extinction. Yet with fairly small modifications of the model parameters, it is often possible to find a steady-state solution in which both species live in harmony. This is a symbiotic solution to the population dynamics, perhaps because the rabbits help fertilize the grass for the sheep to eat, and the sheep keep away predators for the rabbits.
There are two interesting features to such a symbiotic population-dynamics model. First, because there is a stable steady-state solution, if there is a perturbation of the populations, for instance if the rabbits are culled by the farmer, then the two populations will slowly relax back to the original steady-state solution. For this reason, this solution is called a “stable fixed point”. Deviations away from the steady-state values experience an effective “restoring force” that moves the population values back to the fixed point. The second feature of these models is that the steady state values depend on the parameters of the model. Small changes in the model parameters then cause small changes in the steady-state values. In this sense, this stable fixed point is not fundamental–it depends on the parameters of the model.
But there are dynamical models which do have a stability that maintains steady values even as the model parameters shift. These models have negative feedback, like many dynamical systems, but if the negative feedback is connected to winner-take-all outcomes of game theory, then a robustly stable fixed point can emerge at precisely the threshold where such a winner would take all.
The Replicator Equation
The replicator equation provides a simple model for competing populations. Despite its simplicity, it can model surprisingly complex behavior. The central equation is a simple growth model

$$\dot{x}_a = x_a \left( f_a - \phi \right)$$

where the growth rate depends on the fitness $f_a$ of the a-th species relative to the average fitness $\phi$ of all the species. The fitness is given by

$$f_a = p_{ab} x_b$$

where $p_{ab}$ is the payoff matrix among the different species (implicit Einstein summation applies). The fitness is frequency dependent through the dependence on $x_b$. The average fitness is

$$\phi = f_a x_a$$
This model has a zero-sum rule that keeps the total population constant. Therefore, a three-species dynamics can be represented on a two-dimensional “simplex” where the three vertices are the pure populations for each of the species. The replicator equation can be applied easily to a three-party system: one simply defines a payoff matrix that sets the fitness of each party relative to the others.
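A minimal numerical sketch of the replicator equation for three populations is given below. The payoff matrix is an arbitrary cyclic example chosen for illustration, not a model of any real parties, and the names `replicator_rhs` and `PAYOFF` are invented here. The point is to show the zero-sum rule at work: the shares start on the simplex and stay there.

```python
import numpy as np
from scipy.integrate import solve_ivp

# Hypothetical payoff matrix p_ab (row a gains p_ab from meeting column b).
PAYOFF = np.array([[0.0, 1.0, -0.5],
                   [-0.5, 0.0, 1.0],
                   [1.0, -0.5, 0.0]])


def replicator_rhs(t, x, payoff=PAYOFF):
    """dx_a/dt = x_a (f_a - phi), with f_a = p_ab x_b and phi = f_a x_a."""
    fitness = payoff @ x
    phi = fitness @ x
    return x * (fitness - phi)


x_start = np.array([0.6, 0.3, 0.1])     # initial shares, summing to one
solution = solve_ivp(replicator_rhs, (0.0, 50.0), x_start,
                     t_eval=np.linspace(0.0, 50.0, 501),
                     rtol=1e-9, atol=1e-12)
shares = solution.y                      # shape (3, number of time points)

print("final shares:", np.round(shares[:, -1], 4))
print("largest drift of the total from 1:", np.abs(shares.sum(axis=0) - 1.0).max())
```

Because the total is conserved, only two of the three shares are independent, which is why the dynamics can be drawn on a two-dimensional simplex.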
The Nonlinear Dynamics of Presidential Elections
Here we will consider the replicator equation with three political parties (Democratic, Republican and Libertarian). Even though the third party is never a serious contender, the extra degree of freedom provided by the third party helps to stabilize the dynamics between the Democrats and the Republicans.
It is already clear that an essentially symbiotic relationship is at play between Democrats and Republicans, because the elections are roughly 50/50. If this were not the case, then a winner-take-all dynamic would drive virtually everyone to one party or the other. Therefore, having 100% Democrats is actually unstable, as is 100% Republicans. When the populations get too far out of balance, they get too monolithic and too inflexible, then defections of members will occur to the other parties to rebalance the system. But this is just a general trend, not something that can explain the nearly perfect 50/50 vote of the 2020 election.
To create the ultra-stable fixed point at 50/50 requires an additional contribution to the replicator equation. This contribution must create a type of toggle switch that depends on the winner-take-all outcome of the election. If a Democrat wins 51% of the vote, they get 100% of the Oval Office. This extreme outcome then causes a back action on the electorate, which grows wary whenever one party gets too much power.
Therefore, there must be a shift in the payoff matrix when too many votes are going one way or the other. Because the winner-take-all threshold is at exactly 50% of the vote, this becomes an equilibrium point imposed by the payoff matrix. Deviations in the numbers of voters away from 50% cause a negative feedback that drives the steady-state populations back to 50/50. This means that the payoff matrix becomes a function of the number of voters of one party or the other. In the parlance of nonlinear dynamics, the payoff matrix becomes frequency dependent. This goes one step beyond the original replicator equation where it was the population fitness that was frequency dependent, but not the payoff matrix. Now the payoff matrix also becomes frequency dependent.
The frequency-dependent payoff matrix (in an extremely simple model of the election dynamics) takes on negative feedback between two of the species (here the Democrats and the Republicans). If these are the first and third species, then the payoff matrix becomes
where the feedback coefficient is
and where the population dependences on the off-diagonal terms guarantee that, as soon as one party gains an advantage, there is defection of voters to the other party. This establishes a 50/50 balance that is maintained even when the underlying parameters would predict a strongly asymmetric election.
For instance, look at the dynamics in Fig. 2. For this choice of parameters, the replicator model predicts a 75/25 win for the Democrats. However, when the feedback is active, it forces the 50/50 outcome, despite the underlying advantage for the original parameters.
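The effect of a frequency-dependent payoff matrix can be tried out in a toy calculation. Everything specific in the sketch below is an assumption made for illustration: the base fitness values, the linear feedback on the two off-diagonal terms, and the feedback strength `k` are hypothetical and are not the payoff matrix or feedback coefficient used for the figure. Species 0 and 2 stand for the two large parties and species 1 for the small third party.

```python
import numpy as np
from scipy.integrate import solve_ivp

# Hypothetical base fitness: party 0 has a built-in edge over party 2,
# and party 1 is a weak third party.
BASE_FITNESS = np.array([1.5, 0.5, 1.0])


def payoff(x, k):
    """Payoff matrix whose (0,2) and (2,0) entries depend on the current lead."""
    p = np.tile(BASE_FITNESS[:, None], (1, 3))   # without feedback, f_a = BASE_FITNESS[a]
    lead = x[0] - x[2]                           # advantage of party 0 over party 2
    p[0, 2] -= k * lead                          # the leader is penalized ...
    p[2, 0] += k * lead                          # ... and the trailing party gains
    return p


def rhs(t, x, k):
    fitness = payoff(x, k) @ x
    phi = fitness @ x
    return x * (fitness - phi)


def final_shares(k, x_start=(0.3, 0.1, 0.6), t_max=60.0):
    sol = solve_ivp(lambda t, x: rhs(t, x, k), (0.0, t_max), x_start,
                    rtol=1e-8, atol=1e-10)
    return sol.y[:, -1]


no_feedback = final_shares(k=0.0)
with_feedback = final_shares(k=20.0)
print("without feedback:", np.round(no_feedback, 3))
print("with feedback:   ", np.round(with_feedback, 3))
```

In this toy version the fitter party takes essentially all the votes when the feedback is off. With the feedback on, the two large parties settle within a few percent of an even split even though the underlying advantage is unchanged. The residual offset shrinks as `k` is made larger.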
There are several interesting features in this model. It may seem that the Libertarians are irrelevant because they never have many voters. But their presence plays a surprisingly important role. The Libertarians tend to stabilize the dynamics so that neither the Democrats nor the Republicans would get all the votes. Also, there is a saddle point not too far from the pure Libertarian vertex. That Libertarian vertex is an attractor in this model, so under some extreme conditions, this could become a one-party system…maybe not Libertarian in that case, but possibly something more nefarious, of which history can provide many sad examples. It’s a word of caution.
Disclaimers and Caveats
No attempt has been made to actually model the US electorate. The parameters in the modified replicator equations are chosen purely for illustration purposes. This model illustrates a concept — that feedback in the payoff matrix can create an ultra-stable fixed point that is insensitive to changes in the underlying parameters of the model. This can possibly explain why so many of the US presidential elections are so tight.
Someone interested in doing actual modeling of US elections would need to modify the parameters to match known behavior of the voting registrations and voting records. The model presented here assumes a balanced negative feedback that ensures a 50/50 fixed point. This model is based on the aversion of voters to too much power in one party–an echo of the libertarian tradition in the country. A more sophisticated model would yield the fixed point as a consequence of the dynamics, rather than being a feature assumed in the model. In addition, nonlinearity could be added that would drive the vote off of the 50/50 point when the underlying parameters shift strongly enough. For instance, the 2008 election was not a close one, in part because the strong positive character of one of the candidates galvanized a large fraction of the electorate, driving the dynamics away from the 50/50 balance.
D. D. Nolte, Introduction to Modern Dynamics: Chaos, Networks, Space and Time, 2nd edition (Oxford University Press, 2019).
M. A. Nowak, Evolutionary Dynamics: Exploring the Equations of Life (Cambridge, Mass.: Harvard University Press, 2006).
A chief principle of chaos theory states that even simple systems can display complex dynamics. All that is needed for chaos, roughly, is for a system to have at least three dynamical variables plus some nonlinearity.
A classic example of chaos is the driven damped pendulum. This is a mass at the end of a massless rod driven by a sinusoidal perturbation. The three variables are the angle, the angular velocity and the phase of the sinusoidal drive. The nonlinearity is provided by the cosine function in the potential energy which is anharmonic for large angles. However, the driven damped pendulum is not an autonomous system, because the drive is an external time-dependent function. To find an autonomous system—one that persists in complex motion without any external driving function—one needs only to add one more mass to a simple pendulum to create what is known as a compound pendulum, or a double pendulum.
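The "three variables plus a nonlinearity" recipe can be made concrete by writing the driven damped pendulum as a three-variable flow, with the drive phase treated as a variable in its own right. This is a sketch in dimensionless units; the damping, drive amplitude and drive frequency are illustrative values that are not taken from the text, and whether the motion is chaotic depends on them.

```python
import numpy as np
from scipy.integrate import solve_ivp


def driven_pendulum(t, state, damping=0.5, drive=1.5, drive_freq=2.0 / 3.0):
    """Three first-order equations: angle, angular velocity, drive phase."""
    theta, omega, phase = state
    return [omega,
            -damping * omega - np.sin(theta) + drive * np.cos(phase),
            drive_freq]


t_max = 100.0
sol = solve_ivp(driven_pendulum, (0.0, t_max), [0.1, 0.0, 0.0],
                t_eval=np.linspace(0.0, t_max, 2001), rtol=1e-8, atol=1e-10)
theta, omega, phase = sol.y

print("final angle, wrapped to (-pi, pi]:", np.angle(np.exp(1j * theta[-1])))
print("final drive phase:", phase[-1])
```

The sine of the angle supplies the nonlinearity. The explicit time dependence of the drive has been traded for a third variable that simply advances at the drive frequency, which is the bookkeeping behind counting three variables for this system.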
Daniel Bernoulli and the Discovery of Normal Modes
After the invention of the calculus by Newton and Leibniz, the first wave of calculus practitioners (Leibniz, Jakob and Johann Bernoulli and von Tschirnhaus) focused on static problems, like the functional form of the catenary (the shape of a hanging chain), or on constrained problems, like the brachistochrone (the path of least time for a mass under gravity to move between two points) and the tautochrone (the path of equal time).
The next generation of calculus practitioners (Euler, Johann and Daniel Bernoulli, and D’Alembert) focused on finding the equations of motion of dynamical systems. One of the simplest of these, that yielded the earliest equations of motion as well as the first identification of coupled modes, was the double pendulum. The double pendulum, in its simplest form, is a mass on a rigid massless rod attached to another mass on a massless rod. For small-angle motion, this is a simple coupled oscillator.
Daniel Bernoulli, the son of Johann I Bernoulli, was the first to study the double pendulum, publishing a paper on the topic in 1733 in the proceedings of the Academy in St. Petersburg just as he returned from Russia to take up a post permanently in his home town of Basel, Switzerland. Because he was a physicist first and mathematician second, he performed experiments with masses on strings to attempt to understand the qualitative as well as quantitative behavior of the two-mass system. He discovered that for small motions there was a symmetric behavior that had a low frequency of oscillation and an antisymmetric motion that had a higher frequency of oscillation. Furthermore, he recognized that any general motion of the double pendulum was a combination of the fundamental symmetric and antisymmetric motions. This work by Daniel Bernoulli represents the discovery of normal modes of coupled oscillators. It is also the first statement of the combination of motions that he would use later (1753) to express for the first time the principle of superposition.
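Bernoulli's two modes can be recovered numerically from the small-angle equations of the double pendulum. The sketch below uses the standard linearized mass and stiffness matrices for point masses on massless rods. The default values (equal masses, equal lengths, g = 9.81 m/s²) and the function name `double_pendulum_modes` are choices made for this example.

```python
import numpy as np
from scipy.linalg import eigh


def double_pendulum_modes(m1=1.0, m2=1.0, l1=1.0, l2=1.0, g=9.81):
    """Small-angle normal modes of a double pendulum.

    Solves K v = omega^2 M v for the linearized equations M theta'' + K theta = 0.
    """
    mass = np.array([[(m1 + m2) * l1**2, m2 * l1 * l2],
                     [m2 * l1 * l2, m2 * l2**2]])
    stiffness = np.array([[(m1 + m2) * g * l1, 0.0],
                          [0.0, m2 * g * l2]])
    omega_squared, shapes = eigh(stiffness, mass)   # eigenvalues in ascending order
    return np.sqrt(omega_squared), shapes


freqs, shapes = double_pendulum_modes()
for k in range(2):
    ratio = shapes[1, k] / shapes[0, k]             # lower angle relative to upper angle
    print(f"mode {k}: omega = {freqs[k]:.4f} rad/s, theta2/theta1 = {ratio:+.4f}")
```

For equal masses and lengths the slow mode has both angles swinging in the same direction (θ₂/θ₁ = +√2) and the fast mode has them swinging in opposite directions (θ₂/θ₁ = −√2), with ω² = (g/L)(2 ∓ √2). Any small-angle motion is a combination of these two.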
Superposition is one of the guiding principles of linear physical systems. It provides a means for the solution of differential equations. It explains the existence of eigenmodes and their eigenfrequencies. It is the basis of all interference phenomena, whether classical like Young’s double-slit experiment or quantum like Schrödinger’s cat. Today, superposition has taken center stage in quantum information sciences and helps define the spooky (and useful) properties of quantum entanglement. Therefore, normal modes, composition of motion, and superposition of harmonics on a musical string all date back to Daniel Bernoulli in the twenty years between 1733 and 1753. (Daniel Bernoulli is also the originator of the Bernoulli principle that explains why birds and airplanes fly.)
Johann Bernoulli and the Equations of Motion
Daniel Bernoulli’s father was Johann I Bernoulli. Daniel had been tutored by Johann, along with his friend Leonhard Euler, when Daniel was young. But as Daniel matured as a mathematician, he and his father began to compete against each other in international mathematics competitions (which were very common in the early eighteenth century). When Daniel beat his father in a competition sponsored by the French Academy, Johann threw Daniel out of his house and their relationship remained strained for the remainder of their lives.
Johann had a history of taking ideas from Daniel and never citing the source. For instance, when Johann published his work on equations of motion for masses on strings in 1742, he built on the work of his son Daniel from 1733 but never once mentioned it. Daniel, of course, was not happy.
In a letter dated 20 October 1742 that Daniel wrote to Euler, he said, “The collected works of my father are being printed, and I have just learned that he has inserted, without any mention of me, the dynamical problems I first discovered and solved (such as e.g. the descent of a sphere on a moving triangle; the linked pendulum, the center of spontaneous rotation, etc.).” And on 4 September 1743, when Daniel had finally seen his father’s works in print, he said, “The new mechanical problems are mostly mine, and my father saw my solutions before he solved the problems in his way …”.
Daniel clearly has the priority for the discovery of the normal modes of the linked (i.e. double or compound) pendulum, but Johann often would “improve” on Daniel’s work despite giving no credit for the initial work. As a mathematician, Johann had a more rigorous approach and could delve a little deeper into the math. For this reason, it was Johann in 1742 who came closest to writing down differential equations of motion for multi-mass systems, though he fell just short. It was D’Alembert, only one year later, who first wrote down the differential equations of motion for systems of masses and extended them to the loaded string, for which he was the first to derive the wave equation. The D’Alembertian operator is today named after him.
Double Pendulum Dynamics
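The general dynamics of the double pendulum are best obtained from Lagrange’s equations of motion. However, setting up the Lagrangian takes careful thought, because the kinetic energy of the second mass depends on its absolute speed, which depends on the motion of the first mass from which it is suspended. The velocity of the second mass is obtained through vector addition of velocities. With the angles θ1 and θ2 measured from the vertical, masses M1 and M2, and rod lengths R1 and R2, the kinetic energy is
$$T = \frac{1}{2}(M_1+M_2)R_1^2\dot{\theta}_1^2 + \frac{1}{2}M_2R_2^2\dot{\theta}_2^2 + M_2R_1R_2\,\dot{\theta}_1\dot{\theta}_2\cos(\theta_1-\theta_2)$$
The potential energy of the system is
$$U = -(M_1+M_2)\,gR_1\cos\theta_1 - M_2\,gR_2\cos\theta_2$$
so that the Lagrangian is
$$L = \frac{1}{2}(M_1+M_2)R_1^2\dot{\theta}_1^2 + \frac{1}{2}M_2R_2^2\dot{\theta}_2^2 + M_2R_1R_2\,\dot{\theta}_1\dot{\theta}_2\cos(\theta_1-\theta_2) + (M_1+M_2)\,gR_1\cos\theta_1 + M_2\,gR_2\cos\theta_2$$
The partial derivatives are
$$\frac{\partial L}{\partial\theta_1} = -M_2R_1R_2\,\dot{\theta}_1\dot{\theta}_2\sin(\theta_1-\theta_2) - (M_1+M_2)\,gR_1\sin\theta_1$$
$$\frac{\partial L}{\partial\theta_2} = M_2R_1R_2\,\dot{\theta}_1\dot{\theta}_2\sin(\theta_1-\theta_2) - M_2\,gR_2\sin\theta_2$$
$$\frac{\partial L}{\partial\dot{\theta}_1} = (M_1+M_2)R_1^2\dot{\theta}_1 + M_2R_1R_2\,\dot{\theta}_2\cos(\theta_1-\theta_2)$$
$$\frac{\partial L}{\partial\dot{\theta}_2} = M_2R_2^2\dot{\theta}_2 + M_2R_1R_2\,\dot{\theta}_1\cos(\theta_1-\theta_2)$$
and the time derivatives of the last two expressions are
$$\frac{d}{dt}\frac{\partial L}{\partial\dot{\theta}_1} = (M_1+M_2)R_1^2\ddot{\theta}_1 + M_2R_1R_2\,\ddot{\theta}_2\cos(\theta_1-\theta_2) - M_2R_1R_2\,\dot{\theta}_2(\dot{\theta}_1-\dot{\theta}_2)\sin(\theta_1-\theta_2)$$
$$\frac{d}{dt}\frac{\partial L}{\partial\dot{\theta}_2} = M_2R_2^2\ddot{\theta}_2 + M_2R_1R_2\,\ddot{\theta}_1\cos(\theta_1-\theta_2) - M_2R_1R_2\,\dot{\theta}_1(\dot{\theta}_1-\dot{\theta}_2)\sin(\theta_1-\theta_2)$$
Therefore, the equations of motion are
$$(M_1+M_2)R_1\ddot{\theta}_1 + M_2R_2\,\ddot{\theta}_2\cos(\theta_1-\theta_2) + M_2R_2\,\dot{\theta}_2^2\sin(\theta_1-\theta_2) + (M_1+M_2)\,g\sin\theta_1 = 0$$
$$R_2\ddot{\theta}_2 + R_1\ddot{\theta}_1\cos(\theta_1-\theta_2) - R_1\dot{\theta}_1^2\sin(\theta_1-\theta_2) + g\sin\theta_2 = 0$$
To get a sense of how this system behaves, we can make a small-angle approximation to linearize the equations to find the lowest-order normal modes. In the small-angle approximation, the equations of motion become
$$(M_1+M_2)R_1\ddot{\theta}_1 + M_2R_2\ddot{\theta}_2 + (M_1+M_2)\,g\,\theta_1 = 0$$
$$R_1\ddot{\theta}_1 + R_2\ddot{\theta}_2 + g\,\theta_2 = 0$$
For harmonic motion at angular frequency ω, nontrivial solutions exist only when the determinant of the coefficients vanishes, where the determinant is
$$\begin{vmatrix} (M_1+M_2)(g - R_1\omega^2) & -M_2R_2\omega^2 \\ -R_1\omega^2 & g - R_2\omega^2 \end{vmatrix} = M_1R_1R_2\,\omega^4 - (M_1+M_2)(R_1+R_2)\,g\,\omega^2 + (M_1+M_2)\,g^2 = 0$$
This quartic equation is quadratic in ω² and the quadratic solution is
$$\omega^2 = \frac{g}{2M_1R_1R_2}\left[(M_1+M_2)(R_1+R_2) \pm \sqrt{(M_1+M_2)^2(R_1+R_2)^2 - 4M_1(M_1+M_2)R_1R_2}\,\right]$$
This solution is still a little opaque, so taking the special case R = R1 = R2 and M = M1 = M2, it becomes
$$\omega^2 = \left(2 \pm \sqrt{2}\right)\frac{g}{R}$$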
There are two normal modes. The low-frequency mode is symmetric as both masses swing (mostly) together, while the higher frequency mode is antisymmetric with the two masses oscillating against each other. These are the motions that Daniel Bernoulli discovered in 1733.
It is interesting to note that if the string were rigid, so that the two angles were the same, then the lowest frequency would be given by ω² = (3/5) g/R = 0.6 g/R, which is within about 2% of the low-frequency normal-mode value ω² = (2 − √2) g/R ≈ 0.586 g/R but is certainly not equal. This tells us that there is a slightly different angular deflection for the second mass relative to the first.
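For readers who want to check the normal modes numerically, here is a minimal sketch that treats the small-angle problem as a generalized eigenvalue problem. The function name normal_modes and the default of unit masses, unit lengths and g = 1 are my own illustrative choices, not part of the original script.

```python
import numpy as np
from scipy.linalg import eigh

def normal_modes(M1=1.0, M2=1.0, R1=1.0, R2=1.0, g=1.0):
    """Small-angle normal modes of the double pendulum (illustrative helper).

    Solves K v = omega^2 M v, where M and K are the inertia and stiffness
    matrices of the linearized Lagrangian.
    """
    inertia = np.array([[(M1 + M2) * R1**2, M2 * R1 * R2],
                        [M2 * R1 * R2,      M2 * R2**2]])
    stiffness = np.array([[(M1 + M2) * g * R1, 0.0],
                          [0.0,                M2 * g * R2]])
    omega_sq, modes = eigh(stiffness, inertia)   # eigenvalues in ascending order
    return np.sqrt(omega_sq), modes

omega, modes = normal_modes()
print("omega^2 =", omega**2)                    # compare with 2 -/+ sqrt(2)
print("low mode  (theta1, theta2):", modes[:, 0])
print("high mode (theta1, theta2):", modes[:, 1])
print("rigid-rod omega^2 =", 3.0 / 5.0)         # the comparison value in the text
```

In the low-frequency mode the two components have the same sign (the masses swing together), while in the high-frequency mode they have opposite signs.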
Chaos in the Double Pendulum
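The full expression for the nonlinear coupled dynamics is expressed in terms of four variables (θ1, θ2, ω1, ω2). For equal masses and equal lengths, with time measured in units where g/R = 1 (the case used in the Python code), the dynamical equations are
$$2\ddot{\theta}_1 + \ddot{\theta}_2\cos(\theta_1-\theta_2) + \dot{\theta}_2^2\sin(\theta_1-\theta_2) + 2\sin\theta_1 = 0$$
$$\ddot{\theta}_2 + \ddot{\theta}_1\cos(\theta_1-\theta_2) - \dot{\theta}_1^2\sin(\theta_1-\theta_2) + \sin\theta_2 = 0$$
These can be put into the normal form for a four-dimensional flow, writing Δ = θ2 − θ1, as
$$\dot{\theta}_1 = \omega_1$$
$$\dot{\theta}_2 = \omega_2$$
$$\dot{\omega}_1 = \frac{\omega_2^2\sin\Delta - 2\sin\theta_1 + \omega_1^2\sin\Delta\cos\Delta + \sin\theta_2\cos\Delta}{2 - \cos^2\Delta}$$
$$\dot{\omega}_2 = -\frac{\omega_2^2\sin\Delta\cos\Delta - 2\sin\theta_1\cos\Delta + 2\omega_1^2\sin\Delta + 2\sin\theta_2}{2 - \cos^2\Delta}$$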
The numerical solution of these equations produces a complex interplay between the angle of the first mass and the angle of the second mass. Examples of trajectory projections in configuration space are shown in Fig. 3 for E = 1. The horizontal axis is the angle of the first mass, and the vertical axis is the angle of the second mass.
The dynamics in state space are four-dimensional, which makes them difficult to visualize directly. Using the technique of the Poincaré first-return map, the four-dimensional trajectories can be viewed as a two-dimensional plot of the points where the trajectories pierce the Poincaré plane. Poincaré sections are shown in Fig. 4.
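One sanity check on any numerical solution of this flow is that the total energy should stay constant along a trajectory. The sketch below assumes equal unit masses and lengths with g/R = 1, and it uses scipy's solve_ivp rather than the odeint call in DoublePendulum.py. The names flow and total_energy and the starting velocity 0.3 are illustrative choices of mine.

```python
import numpy as np
from scipy.integrate import solve_ivp

def flow(t, s):
    """Four-dimensional flow, assuming equal masses and lengths and g/R = 1."""
    th1, th2, w1, w2 = s
    d = th2 - th1
    den = 2.0 - np.cos(d)**2
    dw1 = (w2**2 * np.sin(d) - 2 * np.sin(th1)
           + w1**2 * np.sin(d) * np.cos(d) + np.sin(th2) * np.cos(d)) / den
    dw2 = -(w2**2 * np.sin(d) * np.cos(d) - 2 * np.sin(th1) * np.cos(d)
            + 2 * w1**2 * np.sin(d) + 2 * np.sin(th2)) / den
    return [w1, w2, dw1, dw2]

def total_energy(s):
    """Kinetic plus potential energy in units of M*g*R."""
    th1, th2, w1, w2 = s
    kinetic = w1**2 + 0.5 * w2**2 + w1 * w2 * np.cos(th1 - th2)
    potential = -2 * np.cos(th1) - np.cos(th2)
    return kinetic + potential

E = 1.0                                    # initial kinetic energy
w1_0 = 0.3                                 # arbitrary illustrative choice
w2_0 = -w1_0 + np.sqrt(2 * E - w1_0**2)    # same rule as the script
s0 = [0.0, 0.0, w1_0, w2_0]

sol = solve_ivp(flow, (0.0, 50.0), s0, rtol=1e-10, atol=1e-12)
energy = total_energy(sol.y)               # sol.y has shape (4, n_steps)
drift = np.max(np.abs(energy - energy[0]))
print("initial total energy:", energy[0])
print("maximum energy drift:", drift)
```

If the drift grows noticeably when you loosen the tolerances, the fine structure in the trajectories should not be trusted.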
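The piercing points can be located by linear interpolation between the two samples that straddle the plane. Here is a minimal sketch of that step on a test signal whose crossings are known in advance. The helper name upward_crossings and the phase offset 0.123 are hypothetical choices for illustration only.

```python
import numpy as np

def upward_crossings(section_var, *others):
    """Interpolate each array in `others` at the upward zero crossings of section_var."""
    s = np.asarray(section_var)
    idx = np.where((s[:-1] < 0) & (s[1:] > 0))[0]   # sample just before each crossing
    frac = -s[idx] / (s[idx + 1] - s[idx])          # fractional distance to the crossing
    out = []
    for o in others:
        o = np.asarray(o)
        out.append(o[idx] + frac * (o[idx + 1] - o[idx]))
    return out

# Test signal: sin(t - 0.123) crosses zero going upward at t = 0.123 + 2*pi*k
t = np.linspace(0.0, 20.0, 2001)
(t_cross,) = upward_crossings(np.sin(t - 0.123), t)
print(t_cross)
```

For the double pendulum, section_var would be the wrapped first angle, and the arrays to interpolate would be the second angle and its angular velocity.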
Python Code: DoublePendulum.py
# -*- coding: utf-8 -*-
"""
DoublePendulum.py
Created on Oct 16 06:03:32 2020
"Introduction to Modern Dynamics" 2nd Edition (Oxford, 2019)
"""
import numpy as np
from scipy import integrate
from matplotlib import pyplot as plt

E = 1.       # Initial kinetic energy. Try 0.8 to 1.5

def flow_deriv(x_y_z_w, tspan):
    # x, y = angles of the first and second mass; z, w = their angular velocities
    x, y, z, w = x_y_z_w

    A = w**2*np.sin(y-x)
    B = -2*np.sin(x)
    C = z**2*np.sin(y-x)*np.cos(y-x)
    D = np.sin(y)*np.cos(y-x)
    EE = 2 - (np.cos(y-x))**2

    FF = w**2*np.sin(y-x)*np.cos(y-x)
    G = -2*np.sin(x)*np.cos(y-x)
    H = 2*z**2*np.sin(y-x)
    I = 2*np.sin(y)
    JJ = (np.cos(y-x))**2 - 2

    a = z
    b = w
    c = (A+B+C+D)/EE
    d = (FF+G+H+I)/JJ
    return [a, b, c, d]

repnum = 75
for reploop in range(repnum):

    # Random initial angular velocities (scalars) with kinetic energy E
    px1 = 2*(np.random.random() - 0.499)*np.sqrt(E)
    py1 = -px1 + np.sqrt(2*E - px1**2)

    xp1 = 0   # Try 0.1
    yp1 = 0   # Try -0.2

    x_y_z_w0 = [xp1, yp1, px1, py1]

    tspan = np.linspace(1, 1000, 10000)
    x_t = integrate.odeint(flow_deriv, x_y_z_w0, tspan)
    siz = np.shape(x_t)[0]     # number of time samples

    # Configuration-space trajectory for a few of the runs
    if reploop % 50 == 0:
        plt.figure(1)
        lines = plt.plot(x_t[:,0], x_t[:,1])

    # Wrap each column into the interval [-pi, pi)
    y1 = np.mod(x_t[:,0]+np.pi, 2*np.pi) - np.pi
    y2 = np.mod(x_t[:,1]+np.pi, 2*np.pi) - np.pi
    y3 = np.mod(x_t[:,2]+np.pi, 2*np.pi) - np.pi   # not used below
    y4 = np.mod(x_t[:,3]+np.pi, 2*np.pi) - np.pi

    py = np.zeros(shape=(10*repnum,))
    yvar = np.zeros(shape=(10*repnum,))
    cnt = -1
    last = y1[1]
    for loop in range(2, siz):
        # Upward crossing of the Poincare plane (first angle = 0)
        if (last < 0) and (y1[loop] > 0) and (cnt + 1 < py.size):
            cnt = cnt+1
            # Linear interpolation to the crossing point
            del1 = -y1[loop-1]/(y1[loop] - y1[loop-1])
            py[cnt] = y4[loop-1] + del1*(y4[loop]-y4[loop-1])
            yvar[cnt] = y2[loop-1] + del1*(y2[loop]-y2[loop-1])
        last = y1[loop]

    # Poincare section: plot only the entries that were filled
    plt.figure(2)
    lines = plt.plot(yvar[:cnt+1], py[:cnt+1], 'o', ms=1)

plt.show()
You can change the energy E near the top of the script and also the initial conditions xp1 and yp1 inside the main loop. The energy E is the initial kinetic energy imparted to the two masses. For a given initial condition, what happens to the periodic orbits as the energy E increases?
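As a quick check that E really is the initial kinetic energy, here is a short worked calculation. It assumes equal masses M and equal lengths R (the case the script appears to implement), starting angles xp1 = yp1 = 0, and energy measured in units of MgR. The symbols ω1 and ω2 stand for the script's px1 and py1.

$$T = \omega_1^2 + \tfrac{1}{2}\omega_2^2 + \omega_1\omega_2 = \tfrac{1}{2}\omega_1^2 + \tfrac{1}{2}(\omega_1+\omega_2)^2$$

$$\omega_2 = -\omega_1 + \sqrt{2E - \omega_1^2} \;\;\Rightarrow\;\; (\omega_1+\omega_2)^2 = 2E - \omega_1^2 \;\;\Rightarrow\;\; T = \tfrac{1}{2}\omega_1^2 + \tfrac{1}{2}\left(2E - \omega_1^2\right) = E$$

So every random draw of px1 lands on the same energy surface, which is what lets all 75 runs share one Poincaré section. If you start from nonzero angles, the cross term picks up a factor cos(θ1 − θ2) and the potential energy is no longer at its minimum, so the total energy shifts accordingly.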
Daniel Bernoulli, “Theoremata de oscillationibus corporum filo flexili connexorum et catenae verticaliter suspensae,” Academiae Scientiarum Imperialis Petropolitanae, 6, 1732/1733.
Truesdell, C. The Rational Mechanics of Flexible or Elastic Bodies, 1638-1788. (Turici: O. Fussli, 1960). (This rare and artistically produced volume, which is almost impossible to find today in any library, is one of the greatest books written about the early history of dynamics.)
Imagine if you could use the physics of coherent light to record a 3D hologram of a cancer tumor and use it to select the best therapy for the cancer patient.
This week in Scientific Reports, a Nature Research publication, we demonstrate the first step towards that goal.
In a collaboration between Purdue University and the Northwestern University School of Medicine, we performed Doppler spectroscopy of intracellular dynamics of human epithelial ovarian cancer tumor biopsies and observed how they responded to selected anti-cancer drugs. Distinctly different Doppler spectra were observed for patients who went into remission versus those who failed to achieve cancer remission. This is the first clinical pilot trial of the technology, known as Biodynamic Imaging (BDI), published in human cancer research.
BDI may, in the future, make it possible to select the most effective therapies for individual cancer patients, realizing the long-sought dream of personalized cancer care.
The Purdue University Office of Technology Transfer has licensed the BDI patent portfolio to Animated Dynamics, Inc., located in Indianapolis, IN, which is working to commercialize the technology and translate it to the cancer clinic. Currently fewer than 40% of all cancer patients respond favorably to their chemotherapy. Using BDI technology, our hope is to improve rates of remission in select cancer settings.