Up-side-down Physics: Dynamic Equilibrium and the Inverted Pendulum

In the study of mechanics, the physics student moves through several stages in their education.  The first stage is the Newtonian physics of trajectories and energy and momentum conservation—there are no surprises there.  The second stage takes them to Lagrangians and Hamiltonians—here there are some surprises, especially for rigid body rotations.  Yet even at this stage, most problems have analytical solutions, and most of those solutions are exact.

Any street busker can tell you that an equally good (and more interesting) equilibrium point of a simple pendulum is when the bob is at the top.

It is only at the third stage that physics starts to get really interesting, and when surprising results with important ramifications emerge.  This stage is nonlinear physics.  Most nonlinear problems have no exact analytical solutions, but there are regimes where analytical approximations not only are possible but provide intuitive insights.  One of the best examples of this third stage is the dynamic equilibrium of Kapitsa’s up-side-down pendulum.

Piotr Kapitsa

Piotr Kapitsa (1894 – 1984) was a Russian physicist who received the Nobel Prize in physics in 1978 for his discovery in 1937 of superfluidity in liquid helium.  (He shared the 1978 prize with Penzias and Wilson who had discovered the cosmic microwave background.)  Superfluidity is a low-temperature hydrodynamic property of superfluids that shares some aspects in common with superconductivity.  Kapitsa published his results in Nature in 1938 in the same issue as a paper by John Allen and Don Misener of Cambridge, but Kapitsa had submitted his paper 19 days before Allen and Misener and so got priority (and the Nobel).

During his career Kapitsa was a leading force in Russian physics, surviving Stalin’s great purge through force of character, and helping to establish the now-famous Moscow Institute of Physics and Technology.  However, surviving Stalin did not always mean surviving with freedom, and around 1950 Kapitsa was under effective house arrest because of his unwillingness to toe the party line.

In his enforced free time, to while away the hours, Kapitsa developed an ingenious analytical approach to the problem of dynamic equilibrium.  His toy example was the driven inverted pendulum. It is surprising how many great works have emerged from the time freed up by house arrest: Galileo finally had time to write his “Two New Sciences”  after his run-in with the Inquisition, and Fresnel was free to develop his theory of diffraction after he ill-advisedly joined a militia to support the Bourbon king during Napoleon’s return. (In our own time, with so many physicists in lock-down and working from home, it will be interesting to see what great theory emerges from the pandemic.)

Stability in the Inverted Driven Pendulum

The only stable static equilibrium of the simple pendulum is when the pendulum bob is at its lowest point.  However, any street busker can tell you that an equally good (and more interesting) equilibrium point is when the bob is at the top.  The caveat is that this “inverted” equilibrium of the pendulum requires active stabilization. 

If the inverted pendulum is a simple physical pendulum, like a meter stick that you balance on the tip of your finger, you know that you need to nudge the stick gently and continuously this way and that with your finger, in response to the tipping stick, to keep it upright.  It’s an easy trick, and almost everyone masters it as a child.  With the human as the drive force, this is an example of a closed-loop control system.  The tipping stick is observed visually by the human, and the finger position is adjusted to compensate for the tip.  On the other hand, one might be interested to find an “open-loop” system that does not require active feedback or even an operator.  In 1908, Andrew Stephenson suggested that induced stability could be achieved by the inverted pendulum if a drive force of sufficiently high frequency were applied [1].  But the proof of the stability remained elusive until Kapitsa followed Stephenson’s suggestion by solving the problem through a separation of time scales [2].

The Method of Separation of Time Scales

The driven inverted pendulum has the dynamical equation

where w0 is the natural angular frequency of small-amplitude oscillations, a is a drive amplitude (with units of frequency) and w is the drive angular frequency that is assumed to be much larger than the natural frequency.  The essential assumption that allows the problem to be separate according to widely separated timescales is that the angular displacement has a slow contribution that changes on the time scale of the natural frequency, and a fast contribution that changes on the time scale of the much higher drive frequency.  The assumed solution then looks like

This is inserted into the dynamical equation to yield

where we have used the approximation

So far this is simple.  The next step is the key step.  It assumes that the dynamical equation should also separate into fast and slow contributions.  But the last term of the sin q expansion has a product of fast and slow components.  The key insight is that a time average can be used to average over the fast contribution.  The separation of the dynamical equation is then

where the time average of the fast variables is only needed on the first line.  The second line is a simple driven harmonic oscillator with a natural frequency that depends on cos qslow and a driving amplitude that depends on sin qslow.  The classic solution to the second line for qfast is

This solution can then be inserted into the first line to yield

This describes a pendulum under an effective potential (for high drive frequency and no damping)

The first term is unstable at the inverted position, but the second term is actually a restoring force.  If the second term is stronger than the first, then a dynamic equilibrium can be achieved. This occurs when the driving amplitude is larger than a threshold value

The effective potential for increasing drive amplitude looks like

Fig. 1 Effective potential as a function of angle and drive amplitude a (in units of w0)

When the drive amplitude is larger than sqrt(2), a slight dip forms in the unstable potential. The dip increases with increasing drive amplitude, as does the oscillation frequency of the effective potential.

Python Program

#!/usr/bin/env python3
# -*- coding: utf-8 -*-
"""
Created on Friday Sept 11 06:03:32 2020
@author: nolte
D. D. Nolte, Introduction to Modern Dynamics: Chaos, Networks, Space and Time, 2nd ed. (Oxford,2019)
"""

import numpy as np
from scipy import integrate
from matplotlib import pyplot as plt

plt.close('all')

print(' ')
print('PenInverted.py')

F = 133.5          # 30 to 140  (133.5)
delt = 0.000       # 0.000 to 0.01
w = 20          # 20
def flow_deriv(x_y_z,tspan):
    x, y, z = x_y_z
    a = y
    b = -(1 + F*np.cos(z))*np.sin(x) - delt*y
    c = w
    return[a,b,c]
                
T = 2*np.pi/w

x0 = np.pi+0.3
v0 = 0.00
z0 = 0

x_y_z = [x0, v0, z0]

# Solve for the trajectories
t = np.linspace(0, 2000, 200000)
x_t = integrate.odeint(flow_deriv, x_y_z, t)
siztmp = np.shape(x_t)
siz = siztmp[0]

#y1 = np.mod(x_t[:,0]-np.pi,2*np.pi)-np.pi
y1 = x_t[:,0]
y2 = x_t[:,1]
y3 = x_t[:,2]    

plt.figure(1)
lines = plt.plot(t[0:2000],x_t[0:2000,0]/np.pi)
plt.setp(lines, linewidth=0.5)
plt.show()
plt.title('Angular Position')

plt.figure(2)
lines = plt.plot(t[0:1000],y2[0:1000])
plt.setp(lines, linewidth=0.5)
plt.show()
plt.title('Speed')

repnum = 5000
px = np.zeros(shape=(2*repnum,))
xvar = np.zeros(shape=(2*repnum,))
cnt = -1
testwt = np.mod(t,T)-0.5*T;
last = testwt[1]
for loop in range(2,siz-1):
    if (last < 0)and(testwt[loop] > 0):
        cnt = cnt+1
        del1 = -testwt[loop-1]/(testwt[loop] - testwt[loop-1])
        px[cnt] = (y2[loop]-y2[loop-1])*del1 + y2[loop-1]
        xvar[cnt] = (y1[loop]-y1[loop-1])*del1 + y1[loop-1]
        last = testwt[loop]
    else:
        last = testwt[loop]
 
plt.figure(3)
lines = plt.plot(xvar[0:5000],px[0:5000],'ko',ms=1)
plt.show()
plt.title('First Return Map')

plt.figure(4)
lines = plt.plot(x_t[0:1000,0]/np.pi,y2[0:1000])
plt.setp(lines, linewidth=0.5)
plt.show()
plt.title('Phase Space')

You can play with the parameters of this program to explore the physics of dynamic equilibrium. For instance, if the control parameter is slightly above the threshold (F = 32) at which a dip appears in the effective potential, the slow oscillation has a vary low frequency, as shown in Fig. 2. The high-frequency drive can still be seen superposed on the slow oscillation of the pendulum that is oscillating just like an ordinary pendulum but up-side-down!

Fig. 2 Just above the pitchfork bifurcation the slow oscillation has a low frequency. F = 32, w = 20, w0 = 1

The oscillation frequency is a function of the drive amplitude. This is a classic signature of a nonlinear system: amplitude-frequency coupling. Well above the threshold (F = 100), the frequency of oscillation in the effective potential becomes much larger, as in Fig. 3.

Fig. 3 High above the transition. F = 100, w = 20, w0 = 1

When the drive amplitude is more than four times larger than the threshold value (F > 140), the equilibrium is destroyed, so there is an upper bound to the dynamic stabilization. This happens when the “slow” frequency becomes comparable to the drive frequency and the separation-of-time-scales approach is no longer valid.

You can also play with the damping (delt) to see what effect it has on thresholds and long-term behavior starting at delt = 0.001 and increasing it.

Other Examples of Dynamic Equilibrium

Every physics student learns that there is no stable electrostatic equilibrium. However, if charges are put into motion, then a time-averaged potential can be created that can confine a charged particle. This is the principle of the Paul Ion Trap, named after Wolfgang Paul who was awarded the Nobel Prize in Physics in 1989 for this invention.

One of the most famous examples of dynamic equilibrium are the L4 and L5 Lagrange points. In the Earth-Jupiter system, these are the locations of the Trojan asteroids. These special Lagrange points are maxima (unstable equilibria) in the effective potential of a rotation coordinate system, but the Coriolis force creates a local minimum that traps the asteroids in a dynamically stable equilibrium.

In economics, general equilibrium theory describes how oscillating prices among multiple markets can stabilize economic performance in macroeconomics.

A recent paper in Science magazine used the principle of dynamic equilibrium to levitate a layer of liquid on which toy boats can ride right-side-up and up-side-down. For an interesting video see Upside-down boat (link).

References

[1] Stephenson Andrew (1908). “XX.On induced stability”. Philosophical Magazine. 6. 15: 233–236.

[2] Kapitza P. L. (1951). “Dynamic stability of a pendulum when its point of suspension vibrates”. Soviet Phys. JETP. 21: 588–597.

Links

https://en.wikipedia.org/wiki/Kapitza%27s_pendulum

A detailed derivation of Kapitsa’s approach:https://elmer.unibas.ch/pendulum/upside.htm

The bifurcation threshold for the inverted pendulum is a pitchfork bifurcation https://elmer.unibas.ch/pendulum/bif.htm#pfbif

Henri Poincaré and his Homoclinic Tangle

Will the next extinction-scale asteroid strike the Earth in our lifetime? 

This existential question—the question of our continued existence on this planet—is rhetorical, because there are far too many bodies in our solar system to accurately calculate all trajectories of all asteroids. 

The solar system is what is known as an N-body problem.  And even the N is not well determined.  The asteroid belt alone has over a million extinction-sized asteroids, and there are tens of millions of smaller ones that could still do major damage to life on Earth if they hit.  To have a hope of calculating even one asteroid trajectory do we ignore planetary masses that are too small?  What is too small?  What if we only consider the Sun, the Earth and Jupiter?  This is what Euler did in 1760, and he still had to make more assumptions.

Stability of the Solar System

Once Newton published his Principia, there was a pressing need to calculate the orbit of the Moon (see my blog post on the three-body problem).  This was important for navigation, because if the daily position of the moon could be known with sufficient accuracy, then ships would have a means to determine their longitude at sea.  However, the Moon, Earth and Sun are already a three-body problem, which still ignores the effects of Mars and Jupiter on the Moon’s orbit, not to mention the problem that the Earth is not a perfect sphere.  Therefore, to have any hope of success, toy systems that were stripped of all their obfuscating detail were needed.

Euler investigated simplified versions of the three-body problem around 1760, treating a body attracted to two fixed centers of gravity moving in the plane, and he solved it using elliptic integrals. When the two fixed centers are viewed in a coordinate frame that is rotating with the Sun-Earth system, it can come close to capturing many of the important details of the system. In 1762 Euler tried another approach, called the restricted three-body problem, where he considered a massless Moon attracted to a massive Earth orbiting a massive Sun, again all in the plane. Euler could not find general solutions to this problem, but he did stumble on an interesting special case when the three bodies remain collinear throughout their motions in a rotating reference frame.

It was not the danger of asteroids that was the main topic of interest in those days, but the question whether the Earth itself is in a stable orbit and is safe from being ejected from the Solar system.  Despite steadily improving methods for calculating astronomical trajectories through the nineteenth century, this question of stability remained open.

Poincaré and the King Oscar Prize of 1889

Some years ago I wrote an article for Physics Today called “The Tangled Tale of Phase Space” that tracks the historical development of phase space. One of the chief players in that story was Henri Poincaré (1854 – 1912). Henri Poincare was the Einstein before Einstein. He was a minor celebrity and was considered to be the greatest genius of his era. The event in his early career that helped launch him to stardom was a mathematics prize announced in 1887 to honor the birthday of King Oscar II of Sweden. The challenge problem was as simple as it was profound: Prove rigorously whether the solar system is stable.

This was the old N-body problem that had so far resisted solution, but there was a sense at that time that recent mathematical advances might make the proof possible. There was even a rumor that Dirichlet had outlined such a proof, but no trace of the outline could be found in his papers after his death in 1859.

The prize competition was announced in Acta Mathematica, written by the Swedish mathematician Gösta Mittag-Leffler. It stated:

Given a system of arbitrarily many mass points that attract each according to Newton’s law, under the assumption that no two points ever collide, try to find a representation of the coordinates of each point as a series in a variable that is some known function of time and for all of whose values the series converges uniformly.

The timing of the prize was perfect for Poincaré who was in his early thirties and just beginning to make his mark on mathematics. He was working on the theory of dynamical systems and was developing a new viewpoint that went beyond integrating single trajectories by focusing more broadly on whole classes of solutions. The question of the stability of the solar system seemed like a good problem to use to sharpen his mathematical tools. The general problem was still too difficult, so he began with Euler’s restricted three-body problem. He made steady progress, and along the way he invented an array of new techniques for studying the general properties of dynamical systems. One of these was the Poincaré section. Another was his set of integral invariants, one of which is recognized as the conservation of volume in phase space, also known as Liouville’s theorem, although it was Ludwig Boltzmann who first derived this result (see my Physics Today article). Eventually, he believed he had proven that the restricted three-body problem was stable.

By the time Poincaré had finished is prize submission, he had invented a new field of mathematical analysis, and the judges of the prize submission recognized it. Poincaré was named the winner, and his submission was prepared for publication in the Acta. However, Mittag-Leffler was a little concerned by a technical objection that had been raised, so he forwarded the comment to Poincaré for him to look at. At first, Poincaré thought the objection could easily be overcome, but as he worked on it and delved deeper, he had a sudden attack of panic. Trajectories near a saddle point did not converge. His proof of stability was wrong!

He alerted Mittag-Leffler to stop the presses, but it was too late. The first printing had been completed and review copies had already been sent to the judges. Mittag-Leffler immediately wrote to them asking for their return while Poincaré worked nonstop to produce a corrected copy. When he had completed his reanalysis, he had discovered a divergent feature of the solution to the dynamical problem near saddle points that his recognized today as the discovery of chaos. Poincaré paid for the reprinting of his paper out of his own pocket and (almost) all of the original printing was destroyed. This embarrassing moment in the life of a great mathematician was virtually forgotten until it was brought to light by the historian Barrow-Green in 1994 [1].

Poincaré is still a popular icon in France. Here is the Poincaré cafe in Paris.
A crater on the Moon is named after Poincaré.

Chaos in the Poincaré Return Map

Despite the fact that his conclusions on the stability of the 3-body problem flipped, Poincaré’s new tools for analyzing dynamical systems earned him the prize. He did not stop at his modified prize submission but continued working on systematizing his methods, publishing New Methods in Celestial Mechanics in several volumes through the 1890’s. It was here that he fully explored what happens when a trajectory approaches a saddle point of dynamical equilibrium.

The third volume of a three-book series that grew from Poincaré’s award-winning paper

To visualize a periodic trajectory, Poincaré invented a mathematical tool called a “first-return map”, also known as a Poincaré section. It was a way of taking a higher dimensional continuous trajectory and turning it into a simple iterated discrete map. Therefore, one did not need to solve continuous differential equations, it was enough to just iterate the map. In this way, complicated periodic, or nearly periodic, behavior could be explored numerically. However, even armed with this weapon, Poincaré found that iterated maps became unstable as a trajectory that originated from a saddle point approached another equivalent saddle point. Because the dynamics are periodic, the outgoing and incoming trajectories are opposite ends of the same trajectory, repeated with 2-pi periodicity. Therefore, the saddle point is also called a homoclinic point, meaning that trajectories in the discrete map intersect with themselves. (If two different trajectories in the map intersect, that is called a heteroclinic point.) When Poincaré calculated the iterations around the homoclinic point, he discovered a wild and complicated pattern in which a trajectory intersected itself many times. Poincaré wrote:

[I]f one seeks to visualize the pattern formed by these two curves and their infinite number of intersections … these intersections form a kind of lattice work, a weave, a chain-link network of infinitely fine mesh; each of the two curves can never cross itself, but it must fold back on itself in a very complicated way so as to recross all the chain-links an infinite number of times .… One will be struck by the complexity of this figure, which I am not even attempting to draw. Nothing can give us a better idea of the intricacy of the three-body problem, and of all the problems of dynamics in general…

Poincaré’s first view of chaos.

This was the discovery of chaos! Today we call this “lattice work” the “homoclinic tangle”. He could not draw it with the tools of his day … but we can!

Chirikov’s Standard Map

The restricted 3-body problem is a bit more complicated than is needed to illustrate Poincaré’s homoclinic tangle. A much simpler model is a discrete map called Chirikov’s Map or the Standard Map. It describes the Poincaré section of a periodically kicked oscillator that rotates or oscillates in the angular direction with an angular momentm J. The map has the simple form

in which the angular momentum in updated first, and then the angle variable is updated with the new angular momentum. When plotted on the (θ,J) plane, the standard map produces a beautiful kaleidograph of intertwined trajectories piercing the Poincaré plane, as shown in the figure below. The small points or dots are successive intersections of the higher-dimensional trajectory intersecting a plane. It is possible to trace successive points by starting very close to a saddle point (on the left) and connecting successive iterates with lines. These lines merge into the black trace in the figure that emerges along the unstable manifold of the saddle point on the left and approaches the saddle point on the right generally along the stable manifold.

Fig. Standard map for K = 0.97 at the transition to full chaos. The dark line is the trajectory of the unstable manifold emerging from the saddle point at (p,0). Note the wild oscillations as it approaches the saddle point at (3pi,0).

However, as the successive iterates approach the new saddle (which is really just the old saddle point because of periodicity) it crosses the stable manifold again and again, in ever wilder swings that diverge as it approaches the saddle point. This is just one trace. By calculating traces along all four stable and unstable manifolds and carrying them through to the saddle, a lattice work, or homoclinic tangle emerges.

Two of those traces originate from the stable manifolds, so to calculate their contributions to the homoclinic tangle, one must run these traces backwards in time using the inverse Chirikov map. This is

The four traces all intertwine at the saddle point in the figure below with a zoom in on the tangle in the next figure. This is the lattice work that Poincaré glimpsed in 1889 as he worked feverishly to correct the manuscript that won him the prize that established him as one of the preeminent mathematicians of Europe.

Fig. The homoclinic tangle caused by the folding of phase space trajectories as stable and unstable manifolds criss-cross in the Poincare map at the saddle point. This was the figure that Poincaré could not attempt to draw because of its complexity.
Fig. A zoom-in of the homoclinic tangle at the saddle point as the stable and unstable manifolds create a lattice of intersections. This is the fundamental origin of chaos and the sensitivity to initial conditions (SIC) that make forecasting almost impossible in chaotic systems.

Python Code

#!/usr/bin/env python3
# -*- coding: utf-8 -*-
"""
Created on Sun Aug  2  2020

"Introduction to Modern Dynamics" 2nd Edition (Oxford, 2019)

@author: nolte
"""

import numpy as np
from matplotlib import pyplot as plt
from numpy import linalg as LA

plt.close('all')

eps = 0.97

np.random.seed(2)

plt.figure(1)

for eloop in range(0,100):

    rlast = 2*np.pi*(0.5-np.random.random())
    thlast = 4*np.pi*np.random.random()
    

    rplot = np.zeros(shape=(200,))
    thetaplot = np.zeros(shape=(200,))
    for loop in range(0,200):
        rnew = rlast + eps*np.sin(thlast)
        thnew = np.mod(thlast+rnew,4*np.pi)
        
        thetaplot[loop] = np.mod(thnew-np.pi,4*np.pi)
                    
        rtemp = np.mod(rnew + np.pi,2*np.pi)
        
        rplot[loop] = rtemp - np.pi
  
        
        rlast = rnew
        thlast = thnew
        
    plt.plot(np.real(thetaplot),np.real(rplot),'o',ms=0.2)
    plt.xlim(xmin=np.pi,xmax=4*np.pi)
    plt.ylim(ymin=-2.5,ymax=2.5)
        

plt.savefig('StandMap')

K = eps
eps0 = 5e-7

J = [[1,1+K],[1,1]]
w, v = LA.eig(J)

My = w[0]
Vu = v[:,0]     # unstable manifold
Vs = v[:,1]     # stable manifold

# Plot the unstable manifold
Hr = np.zeros(shape=(100,150))
Ht = np.zeros(shape=(100,150))
for eloop in range(0,100):
    
    eps = eps0*eloop

    roldu1 = eps*Vu[0]
    thetoldu1 = eps*Vu[1]
    
    Nloop = np.ceil(-6*np.log(eps0)/np.log(eloop+2))
    flag = 1
    cnt = 0
    
    while flag==1 and cnt < Nloop:
        
        
        ru1 = roldu1 + K*np.sin(thetoldu1)
        thetau1 = thetoldu1 + ru1
        
        roldu1 = ru1
        thetoldu1 = thetau1
        
        if thetau1 > 4*np.pi:
            flag = 0
            
        Hr[eloop,cnt] = roldu1
        Ht[eloop,cnt] = thetoldu1 + 3*np.pi
        cnt = cnt+1
    

x = Ht[0:99,12] - 2*np.pi
x2 = 6*np.pi - x
y = Hr[0:99,12]
y2 = -y
plt.plot(x,y,linewidth =0.75)
plt.plot(x2,y2,linewidth =0.75)

del x,y
x = Ht[5:39,15] - 2*np.pi
x2 = 6*np.pi - x
y = Hr[5:39,15]
y2 = -y
plt.plot(x,y,linewidth =0.75)
plt.plot(x2,y2,linewidth =0.75)

del x,y
x = Ht[12:69,16] - 2*np.pi
x2 = 6*np.pi - x
y = Hr[12:69,16]
y2 = -y
plt.plot(x,y,linewidth =0.75)
plt.plot(x2,y2,linewidth =0.75)

del x,y
x = Ht[15:89,17] - 2*np.pi
x2 = 6*np.pi - x
y = Hr[15:89,17]
y2 = -y
plt.plot(x,y,linewidth =0.75)
plt.plot(x2,y2,linewidth =0.75)

del x,y
x = Ht[30:99,18] - 2*np.pi
x2 = 6*np.pi - x
y = Hr[30:99,18]
y2 = -y
plt.plot(x,y,linewidth =0.75)
plt.plot(x2,y2,linewidth =0.75)

# Plot the stable manifold
del Hr, Ht
Hr = np.zeros(shape=(100,150))
Ht = np.zeros(shape=(100,150))
#eps0 = 0.03
for eloop in range(0,100):
    
    eps = eps0*eloop

    roldu1 = eps*Vs[0]
    thetoldu1 = eps*Vs[1]
    
    Nloop = np.ceil(-6*np.log(eps0)/np.log(eloop+2))
    flag = 1
    cnt = 0
    
    while flag==1 and cnt < Nloop:
        
        
        thetau1 = thetoldu1 - roldu1
        ru1 = roldu1 - K*np.sin(thetau1)

        
        roldu1 = ru1
        thetoldu1 = thetau1
        
        if thetau1 > 4*np.pi:
            flag = 0
            
        Hr[eloop,cnt] = roldu1
        Ht[eloop,cnt] = thetoldu1
        cnt = cnt+1
    
x = Ht[0:79,12] + np.pi
x2 = 6*np.pi - x
y = Hr[0:79,12]
y2 = -y
plt.plot(x,y,linewidth =0.75)
plt.plot(x2,y2,linewidth =0.75)

del x,y
x = Ht[4:39,15] + np.pi
x2 = 6*np.pi - x
y = Hr[4:39,15]
y2 = -y
plt.plot(x,y,linewidth =0.75)
plt.plot(x2,y2,linewidth =0.75)

del x,y
x = Ht[12:69,16] + np.pi
x2 =  6*np.pi - x
y = Hr[12:69,16]
y2 = -y
plt.plot(x,y,linewidth =0.75)
plt.plot(x2,y2,linewidth =0.75)

del x,y
x = Ht[15:89,17] + np.pi
x2 =  6*np.pi - x
y = Hr[15:89,17]
y2 = -y
plt.plot(x,y,linewidth =0.75)
plt.plot(x2,y2,linewidth =0.75)

del x,y
x = Ht[30:99,18] + np.pi
x2 =  6*np.pi - x
y = Hr[30:99,18]
y2 = -y
plt.plot(x,y,linewidth =0.75)
plt.plot(x2,y2,linewidth =0.75)

References

[1] D. D. Nolte, “The tangled tale of phase space,” Physics Today, vol. 63, no. 4, pp. 33-38, Apr (2010)

[2] M. C. Gutzwiller, “Moon-Earth-Sun: The oldest three-body problem,” Reviews of Modern Physics, vol. 70, no. 2, pp. 589-639, Apr (1998)

[3] Barrow-Green J. Oscar II’s Prize Competition and the Error in Poindare’s Memoir on the Three Body Problem. Arch Hist Exact Sci 48: 107-131, 1994.

[4] Barrow-Green J. Poincaré and the three body problem. London Mathematical Society, 1997.

[5] https://the-moon.us/wiki/Poincar%C3%A9

[6] Poincaré H and Goroff DL. New methods of celestial mechanics … Edited and introduced by Daniel L. Goroff. New York: American Institute of Physics, 1993.

Brook Taylor’s Infinite Series

When Leibniz claimed in 1704, in a published article in Acta Eruditorum, to have invented the differential calculus in 1684 prior to anyone else, the British mathematicians rushed to Newton’s defense. They knew Newton had developed his fluxions as early as 1666 and certainly no later than 1676. Thus ensued one of the most bitter and partisan priority disputes in the history of math and science that pitted the continental Leibnizians against the insular Newtonians. Although a (partisan) committee of the Royal Society investigated the case and found in favor of Newton, the affair had the effect of insulating British mathematics from Continental mathematics, creating an intellectual desert as the forefront of mathematical analysis shifted to France. Only when George Green filled his empty hours with the latest advances in French analysis, as he tended his father’s grist mill, did British mathematics wake up. Green self-published his epic work in 1828 that introduced what is today called Green’s Theorem.

Yet the period from 1700 to 1828 was not a complete void for British mathematics. A few points of light shone out in the darkness, Thomas Simpson, Collin Maclaurin, Abraham de Moivre, and Brook Taylor (1685 – 1731) who came from an English family that had been elevated to minor nobility by an act of Cromwell during the English Civil War.

Growing up in Bifrons House

 

View of Bifrons House from sometime in the late-1600’s showing the Jacobean mansion and the extensive south gardens.

When Brook Taylor was ten years old, his father bought Bifrons House [1], one of the great English country houses, located in the county of Kent just a mile south of Canterbury.  English country houses were major cultural centers and sources of employment for 300 years from the seventeenth century through the early 20th century. While usually being the country homes of nobility of all levels, from Barons to Dukes, sometimes they were owned by wealthy families or by representatives in Parliament, which was the case for the Taylors. Bifrons House had been built around 1610 in the Jacobean architectural style that was popular during the reign of James I.  The house had a stately front façade, with cupola-topped square towers, gable ends to the roof, porches of a renaissance form, and extensive manicured gardens on the south side.  Bifrons House remained the seat of the Taylor family until 1824 when they moved to a larger house nearby and let Bifrons first to a Marquess and then in 1828 to Lady Byron (ex-wife of Lord Byron) and her daughter Ada Lovelace (the mathematician famous for her contributions to early computer science). The Taylor’s sold the house in 1830 to the first Marquess Conyngham.

Taylor’s life growing up in the rarified environment of Bifrons House must have been like scenes out of the popular BBC TV drama Downton Abbey.  The house had a large staff of servants and large grounds at the edge of a large park near the town of Patrixbourne. Life as the heir to the estate would have been filled with social events and fine arts that included music and painting. Taylor developed a life-long love of music during his childhood, later collaborating with Isaac Newton on a scientific investigation of music (it was never published). He was also an amateur artist, and one of the first books he published after being elected to the Royal Society was on the mathematics of linear perspective, which contained some of the early results of projective geometry.

There is a beautiful family portrait in the National Portrait Gallery in London painted by John Closterman around 1696. The portrait is of the children of John Taylor about a year after he purchased Bifrons House. The painting is notable because Brook, the heir to the family fortunes, is being crowned with a wreath by his two older sisters (who would not inherit). Brook was only about 11 years old at the time and was already famous within his family for his ability with music and numbers.

Portrait of the children of John Taylor around 1696. Brook Taylor is the boy being crowned by his sisters on the left. (National Portrait Gallery)

Taylor never had to go to school, being completely tutored at home until he entered St. John’s College, Cambridge, in 1701.  He took mathematics classes from Machin and Keill and graduated in 1709.  The allowance from his father was sufficient to allow him to lead the life of a gentleman scholar, and he was elected a member of the Royal Society in 1712 and elected secretary of the Society just two years later.  During the following years he was active as a rising mathematician until 1721 when he married a woman of a good family but of no wealth.  The support of a house like Bifrons always took money, and the new wife’s lack of it was enough for Taylor’s father to throw the new couple out.  Unfortunately, his wife died in childbirth along with the child, so Taylor returned home in 1723.  These family troubles ended his main years of productivity as a mathematician.

Portrait of Brook Taylor

Methodus incrementorum directa et inversa

Under the eye of the Newtonian mathematician Keill at Cambridge, Taylor became a staunch supporter and user of Newton’s fluxions. Just after he was elected as a member of the Royal Society in 1712, he participated in an investigation of the priority for the invention of the calculus that pitted the British Newtonians against the Continental Leibnizians. The Royal Society found in favor of Newton (obviously) and raised the possibility that Leibniz learned of Newton’s ideas during a visit to England just a few years before Leibniz developed his own version of the differential calculus.

A re-evaluation of the priority dispute from today’s perspective attributes the calculus to both men. Newton clearly developed it first, but did not publish until much later. Leibniz published first and generated the excitement for the new method that dispersed its use widely. He also took an alternative route to the differential calculus that is demonstrably different than Newton’s. Did Leibniz benefit from possibly knowing Newton’s results (but not his methods)? Probably. But that is how science is supposed to work … building on the results of others while bringing new perspectives. Leibniz’ methods and his notations were superior to Newton’s, and the calculus we use today is closer to Leibniz’ version than to Newton’s.

Once Taylor was introduced to Newton’s fluxions, he latched on and helped push its development. The same year (1715) that he published a book on linear perspective for art, he also published a ground-breaking book on the use of the calculus to solve practical problems. This book, Methodus incrementorum directa et inversa, introduced several new ideas, including finite difference methods (which are used routinely today in numerical simulations of differential equations). It also considered possible solutions to the equation for a vibrating string for the first time.

The vibrating string is one of the simplest problem in “continuum mechanics”, but it posed a severe challenge to Newtonian physics of point particles. It was only much later that D’Alembert used Newton’s first law of action-reaction to eliminate internal forces to derive D’Alembert’s principle on the net force on an extended body. Yet Taylor used finite differences to treat the line mass of the string in a way that yielded a possible solution of a sine function. Taylor was the first to propose that a sine function was the form of the string displacement during vibration. This idea would be taken up later by D’Alembert (who first derived the wave equation), and by Euler (who vehemently disagreed with D’Alembert’s solutions) and Daniel Bernoulli (who was the first to suggest that it is not just a single sine function, but a sum of sine functions, that described the string’s motion — the principle of superposition).

Of course, the most influential idea in Taylor’s 1715 book was his general use of an infinite series to describe a curve.

Taylor’s Series

Infinite series became a major new tool in the toolbox of analysis with the publication of John WallisArithmetica Infinitorum published in 1656. Shortly afterwards many series were published such as Nikolaus Mercator‘s series (1668)

and James Gregory‘s series (1668)

And of course Isaac Newton’s generalized binomial theorem that he worked out famously during the plague years of 1665-1666

But these consisted mainly of special cases that had been worked out one by one. What was missing was a general method that could yield a series expression for any curve.

Taylor used concepts of finite differences as well as infinitesimals to derive his formula for expanding a function as a power series around any point. His derivation in Methodus incrementorum directa et inversa is not easily recognized today. Using difference tables, and ideas from Newton’s fluxions that viewed functions as curves traced out as a function of time, he arrived at the somewhat cryptic expression

where the “dots” are time derivatives, x stands for the ordinate (the function), v is a finite difference, and z is the abcissa moving with constant speed. If the abcissa moves with unit speed, then this becomes Taylor’s Series (in modern notation)

The term “Taylor’s series” was probably first used by L’Huillier in 1786, although Condorcet attributed the equation to both Taylor and d’Alembert in 1784. It was Lagrange in 1797 who immortalized Taylor by claiming that Taylor’s theorem was the foundation of analysis.

Example: sin(x)

Expand sin(x) around x = π

This is related to the expansion around x = 0 (also known as a Maclaurin series)

Example: arctan(x)

To get an feel for how to apply Taylor’s theorem to a function like arctan, begin with

and take the derivative of both sides

Rewrite this as

and substitute the expression for y

and integrate term by term to arrive at

This is James Gregory’s famous series. Although the math here is modern and only takes a few lines, it parallel’s Gregory’s approach. But Gregory had to invent aspects of calculus as he went along — his derivation covering many dense pages. In the priority dispute between Leibniz and Newton, Gregory is usually overlooked as an independent inventor of many aspects of the calculus. This is partly because Gregory acknowledged that Newton had invented it first, and he delayed publishing to give Newton priority.

Two-Dimensional Taylor’s Series

The ideas behind the Taylor’s series generalizes to any number of dimensions. For a scalar function of two variables it takes the form (out to second order)

where J is the Jacobian matrix (vector) and H is the Hessian matrix defined for the scalar function as

and

As a concrete example, consider the two-dimensional Gaussian function

The Jacobean and Hessian matrices are

which are the first- and second-order coefficients of the Taylor series.

References

[1] “A History of Bifrons House”, B. M. Thomas, Kent Archeological Society (2017)

[2] L. Feigenbaum, “TAYLOR,BROOK AND THE METHOD OF INCREMENTS,” Archive for History of Exact Sciences, vol. 34, no. 1-2, pp. 1-140, (1985)

[3] A. Malet, “GREGORIE, JAMES ON TANGENTS AND THE TAYLOR RULE FOR SERIES EXPANSIONS,” Archive for History of Exact Sciences, vol. 46, no. 2, pp. 97-137, (1993)

[4] E. Harier and G. Wanner, Analysis by its History (Springer, 1996)

Painting of Bifrons Park by Jan Wyck c. 1700

Physics in the Age of Contagion: Part 4. Fifty Shades of Immunity to COVID-19

This is the fourth installment in a series of blogs on the population dynamics of COVID-19. In my first blog I looked at a bifurcation physics model that held the possibility (and hope) that with sufficient preventive action the pandemic could have died out and spared millions. That hope was in vain.

What will it be like to live with COVID-19 as a constant factor of modern life for years to come?

In my second blog I looked at a two-component population dynamics model that showed the importance of locking down and not emerging too soon. It predicted that waiting only a few extra weeks before opening could have saved tens of thousands of lives. Unfortunately, because states like Texas and Florida opened too soon and refused to mandate the wearing of masks, thousands of lives were lost.

In my third blog I looked at a network physics model that showed the importance of rapid testing and contact tracing to remove infected individuals to push the infection rate low — not only to flatten the curve, but to drive it down. While most of the developed world is succeeding in achieving this, the United States is not.

In this fourth blog, I am looking at a simple mean-field model that shows what it will be like to live with COVID-19 as a constant factor of modern life for years to come. This is what will happen if the period of immunity to the disease is short and people who recover from the disease can get it again. Then the disease will never go away and the world will need to learn to deal with it. How different that world will look from the one we had just a year ago will depend on the degree of immunity that is acquired after infection, how long a vaccine will provide protection before booster shots are needed, and how many people will get the vaccine or will refus.

SIRS for SARS

COVID-19 is a SARS corona virus known as SARS-CoV-2. SARS stands for Severe Acute Respiratory Syndrome. There is a simple and well-established mean-field model for an infectious disease like SARS that infects individuals, from which they recover, but after some lag period they become susceptible again. This is called the SIRS model, standing for Susceptible-Infected-Recovered-Susceptible. This model is similar to the SIS model of my first blog, but it now includes a mean lifetime for the acquired immunity, after an individual recovers from the infection and then becomes susceptible again. The bifurcation threshold is the same for the SIRS model as the SIS model, but with SIRS there is a constant susceptible population. The mathematical flow equations for this model are

where i is the infected fraction, r is the recovered fraction, and 1 – i – r = s is the susceptible fraction. The infection rate for an individual who has k contacts is βk. The recovery rate is μ and the mean lifetime of acquired immunity after recovery is τlife = 1/ν. This model assumes that all individuals are equivalent (no predispositions) and there is no vaccine–only natural immunity that fades in time after recovery.

The population trajectories for this model are shown in Fig. 1. The figure on the left is a 3-simplex where every point in the triangle stands for a 3-tuple (i, r, s). Our own trajectory starts at the right bottom vertex and generates the green trajectory that spirals into the fixed point. The parameters are chosen to be roughly equivalent to what is known about the virus (but with big uncertainties in the model parameters). One of the key results is that the infection will oscillate over several years, settling into a steady state after about 4 years. Thereafter, there is a steady 3% infected population with 67% of the population susceptible and 30% recovered. The decay time for the immunity is assumed to be one year in this model. Note that the first peak in the infected numbers will be about 1 year, or around March 2021. There is a second smaller peak (the graph on the right is on a vertical log scale) at about 4 years, or sometime in 2024.

Fig. 1 SIRS model for COVID-19 in which immunity acquired after recovery fades in time so an individual can be infected again. If immunity fades and there is never a vaccine, a person will have an 80% chance of getting the virus at least twice in their lifetime, and COVID will become the third highest cause of death in the US after heart disease and cancer.

Although the recovered fraction is around 30% for these parameters, it is important to understand that this is a dynamic equilibrium. If there is no vaccine, then any individual who was once infected can be infected again after about a year. So if they don’t get the disease in the first year, they still have about a 4% chance to get it every following year. In 50 years, a 20-year-old today would have almost a 90% chance of having been infected at least once and an 80% chance of having gotten it at least twice. In other words, if there is never a vaccine, and if immunity fades after each recovery, then almost everyone will eventually get the disease several times in their lifetime. Furthermore, COVID will become the third most likely cause of death in the US after heart disease (first) and cancer (second). The sad part of this story is that it all could have been avoided if the government leaders of several key nations, along with their populations, had behaved responsibly.

The Asymmetry of Personal Cost under COVID

The nightly news in the US during the summer of 2020 shows endless videos of large parties, dense with people, mostly young, wearing no masks. This is actually understandable even though regrettable. It is because of the asymmetry of personal cost. Here is what that means …

On any given day, an individual who goes out and about in the US has only about a 0.01 percent chance of contracting the virus. In the entire year, there is only about a 3% chance that that individual will get the disease. And even if they get the virus, they only have a 2% chance of dying. So the actual danger per day per person is so minuscule that it is hard to understand why it is so necessary to wear a mask and socially distance. Therefore, if you go out and don’t wear a mask, almost nothing bad will happen to YOU. So why not? Why not screw the masks and just go out!

And this is why that’s such a bad idea: because if no-one wears a mask, then tens or hundreds of thousands of OTHERS will die.

This is the asymmetry of personal cost. By ignoring distancing, nothing is likely to happen to YOU, but thousands of OTHERS will die. How much of your own comfort are you willing to give up to save others? That is the existential question.

This year is the 75th anniversary of the end of WW II. During the war everyone rationed and recycled, not because they needed it for themselves, but because it was needed for the war effort. Almost no one hesitated back then. It was the right thing to do even though it cost personal comfort. There was a sense of community spirit and doing what was good for the country. Where is that spirit today? The COVID-19 pandemic is a war just as deadly as any war since WW II. There is a community need to battle it. All anyone has to do is wear a mask and behave responsibly. Is this such a high personal cost?

The Vaccine

All of this can change if a reliable vaccine can be developed. There is no guarantee that this can be done. For instance, there has never been a reliable vaccine for the common cold. A more sobering thought is to realize is that there has never been a vaccine for the closely related virus SARS-CoV-1 that broke out in 2003 in China but was less infectious. But the need is greater now, so there is reason for optimism that a vaccine can be developed that elicits the production of antibodies with a mean lifetime at least as long as for naturally-acquired immunity.

The SIRS model has the same bifurcation threshold as the SIS model that was discussed in a previous blog. If the infection rate can be made slower than the recovery rate, then the pandemic can be eliminated entirely. The threshold is

The parameter μ, the recovery rate, is intrinsic and cannot be changed. The parameter β, the infection rate per contact, can be reduced by personal hygiene and wearing masks. The parameter <k>, the average number of contacts to a susceptible person, can be significantly reduced by vaccinating a large fraction of the population.

To simulate the effect of vaccination, the average <k> per person can be reduced at the time of vaccination. This lowers the average infection rate. The results are shown in Fig. 2 for the original dynamics, a vaccination of 20% of the populace, and a vaccination of 40% of the populace. For 20% vaccination, the epidemic is still above threshold, although the long-time infection is lower. For 40% of the population vaccinated, the disease falls below threshold and would decay away and vanish.

Fig. 2 Vaccination at 52 weeks can lower the infection cases (20% vaccinated) or eliminate them entirely (40% vaccinated). The vaccinations would need booster shots every year (if the decay time of immunity is one year).

In this model, the vaccination is assumed to decay at the same rate as naturally acquired immunity (one year), so booster shots would be needed every year. Getting 40% of the population to get vaccinated may be achieved. Roughly that fraction get yearly flu shots in the US, so the COVID vaccine could be added to the list. But at 40% it would still be necessary for everyone to wear face masks and socially distance until the pandemic fades away. Interestingly, if the 40% got vaccinated all on the same date (across the world), then the pandemic would be gone in a few months. Unfortunately, that’s unrealistic, so with a world-wide push to get 40% of the World’s population vaccinated within five years, it would take that long to eliminate the disease, taking us to 2025 before we could go back to the way things were in November of 2019. But that would take a world-wide vaccination blitz the likes of which the world has never seen.

Python Code

#!/usr/bin/env python3
# -*- coding: utf-8 -*-
"""
Created on Fri July 17 2020

D. D. Nolte, "Introduction to Modern Dynamics: 
    Chaos, Networks, Space and Time, 2nd Edition (Oxford University Press, 2019)

@author: nolte
"""

import numpy as np
from scipy import integrate
from matplotlib import pyplot as plt

plt.close('all')


def tripartite(x,y,z):

    sm = x + y + z
    xp = x/sm
    yp = y/sm
    
    f = np.sqrt(3)/2
    
    y0 = f*xp
    x0 = -0.5*xp - yp + 1;
    
    lines = plt.plot(x0,y0)
    plt.setp(lines, linewidth=0.5)
    plt.plot([0, 1],[0, 0],'k',linewidth=1)
    plt.plot([0, 0.5],[0, f],'k',linewidth=1)
    plt.plot([1, 0.5],[0, f],'k',linewidth=1)
    plt.show()
    
print(' ')
print('SIRS.py')

def solve_flow(param,max_time=1000.0):

    def flow_deriv(x_y,tspan,mu,betap,nu):
        x, y = x_y
        
        return [-mu*x + betap*x*(1-x-y),mu*x-nu*y]
    
    x0 = [del1, del2]
    
    # Solve for the trajectories
    t = np.linspace(0, int(tlim), int(250*tlim))
    x_t = integrate.odeint(flow_deriv, x0, t, param)

    return t, x_t

 # rates per week
betap = 0.3;   # infection rate
mu = 0.2;      # recovery rate
nu = 0.02      # immunity decay rate

print('beta = ',betap)
print('mu = ',mu)
print('nu =',nu)
print('betap/mu = ',betap/mu)
          
del1 = 0.005         # initial infected
del2 = 0.005         # recovered

tlim = 600          # weeks (about 12 years)

param = (mu, betap, nu)    # flow parameters

t, y = solve_flow(param)
I = y[:,0]
R = y[:,1]
S = 1 - I - R

plt.figure(1)
lines = plt.semilogy(t,I,t,S,t,R)
plt.ylim([0.001,1])
plt.xlim([0,tlim])
plt.legend(('Infected','Susceptible','Recovered'))
plt.setp(lines, linewidth=0.5)
plt.xlabel('Days')
plt.ylabel('Fraction of Population')
plt.title('Population Dynamics for COVID-19')
plt.show()

plt.figure(2)
plt.hold(True)
for xloop in range(0,10):
    del1 = xloop/10.1 + 0.001
    del2 = 0.01

    tlim = 300
    param = (mu, betap, nu)    # flow parameters
    t, y = solve_flow(param)       
    I = y[:,0]
    R = y[:,1]
    S = 1 - I - R
    
    tripartite(I,R,S);

for yloop in range(1,6):
    del1 = 0.001;
    del2 = yloop/10.1
    t, y = solve_flow(param)
    I = y[:,0]
    R = y[:,1]
    S = 1 - I - R
    
    tripartite(I,R,S);
    
for loop in range(2,10):
    del1 = loop/10.1
    del2 = 1 - del1 - 0.01
    t, y = solve_flow(param)
    I = y[:,0]
    R = y[:,1]
    S = 1 - I - R
        
    tripartite(I,R,S);
    
plt.hold(False)
plt.title('Simplex Plot of COVID-19 Pop Dynamics')
 
vac = [1, 0.8, 0.6]
for loop in vac:
               
    # Run the epidemic to the first peak
    del1 = 0.005
    del2 = 0.005
    tlim = 52
    param = (mu, betap, nu)
    t1, y1 = solve_flow(param)
    
    # Now vaccinate a fraction of the population
    st = np.size(t1)
    del1 = y1[st-1,0]
    del2 = y1[st-1,1]
    tlim = 400
    
    param = (mu, loop*betap, nu)
    t2, y2 = solve_flow(param)
    
    t2 = t2 + t1[st-1]
    
    tc = np.concatenate((t1,t2))
    yc = np.concatenate((y1,y2))
    
    I = yc[:,0]
    R = yc[:,1]
    S = 1 - I - R
    
    plt.figure(3)
    plt.hold(True)
    lines = plt.semilogy(tc,I,tc,S,tc,R)
    plt.ylim([0.001,1])
    plt.xlim([0,tlim])
    plt.legend(('Infected','Susceptible','Recovered'))
    plt.setp(lines, linewidth=0.5)
    plt.xlabel('Weeks')
    plt.ylabel('Fraction of Population')
    plt.title('Vaccination at 1 Year')
    plt.show()
    
plt.hold(False)

Caveats and Disclaimers

No effort was made to match parameters to the actual properties of the COVID-19 pandemic. The SIRS model is extremely simplistic and can only show general trends because it homogenizes away all the important spatial heterogeneity of the disease across the cities and states of the country. If you live in a hot spot, this model says little about what you will experience locally. The decay of immunity is also a completely open question and the decay rate is completely unknown. It is easy to modify the Python program to explore the effects of differing decay rates and vaccination fractions. The model also can be viewed as a “compartment” to model local variations in parameters.

Johann Bernoulli’s Brachistochrone

Johann Bernoulli was an acknowledged genius–and he acknowledged it of himself.  Some flavor of his character can be seen in his opening lines of one of the most famous challenges in the history of mathematics—the statement of the Brachistrochrone Challenge.

“I, Johann Bernoulli, address the most brilliant mathematicians in the world. Nothing is more attractive to intelligent people than an honest, challenging problem, whose possible solution will bestow fame and remain as a lasting monument.”

Of course, he meant his own fame, because he thought he already had a solution to the problem he posed to the mathematical community of the day.

The Problem of Fastest Descent

The problem posed by Johann Bernoulli was the brachistochrone (Gk: brachis + chronos) or the path of fastest descent. 

Galileo had attempted to tackle this problem in his Two New Sciences and had concluded, based on geometric arguments, that the solution was a circular path.  Yet he hedged—he confessed that he had reservations about this conclusion and suggested that a “higher mathematics” would possibly find a better solution. In fact he was right.




Fig. 1  Galileo considered a mass falling along different chords of a circle starting at A.  He proved that the path along ABG was quicker than along AG, and ABCG was quicker than ABG, and ABCDG was quicker than ABCG, etc.  In this way he showed that the path along the circular arc was quicker than any set of chords.  From this he inferred that the circle was the path of quickest descent—but he held out reservations, and rightly so.

In 1659, when Christiaan Huygens was immersed in the physics of pendula and time keeping, he was possibly the first mathematician to recognize that a perfect harmonic oscillator, one whose restoring force was linear in the displacement of the oscillator, would produce the perfect time piece.  Unfortunately, the pendulum, proposed by Galileo, was the simplest oscillator to construct, but Huygens already knew that it was not a perfect harmonic oscillator.  The period of oscillation became smaller when the amplitude of the oscillation became larger.  In order to “fix” the pendulum, he searched for a curve of equal time, called the tautochrone, that would allow all amplitudes of the pendulum to have the same period.  He found the solution and recognized it to be a cycloid arc. 

On a cycloid whose axis is erected on the perpendicular and whose vertex is located at the bottom, the times of descent, in which a body arrives at the lowest point at the vertex after having departed from any point on the cycloid, are equal to each other…

His derivation filled 16 pages with geometric arguments, which was not a very efficient way to derive the thing.

Almost thirty years later, during the infancy of the infinitesimal calculus, the tautochrone was held up as a master example of an “optimal” solution whose derivation should yield to the much more powerful and elegant methods of the calculus.  Jakob Bernoulli, Johann’s brother, succeeded in deriving the tautochrone in 1690 using the calculus, using the term “integral” for the first time in print, but it was not at first clear what other problems could yield in a similar way.

Then, in 1696, Johann Bernoulli posed the brachistrochrone problem in the pages of Acta Eruditorum.

Fig. 2 The shortest-time route from A to B, relying only on gravity, is the cycloid, compared to the parabola, circle and linear paths. Johann and Jakob Bernoulli, brothers, competed to find the best solution.

Acta Eruditorum

The Acta Eruditorum was the German answer to the Proceedings of the Royal Society of London.  It began publishing in Leipzig in 1682 under the editor Otto Mencke.  Although Mencke was the originator, launching and supporting the journal became the obsession of Gottfried Lebiniz, who felt he was a hostage in the backwaters of Hanover Germany but who yearned for a place on the world stage (i.e. Paris or London).  By launching the continental publication, the Continental scientists had a freer voice without needing to please the gate keepers at the Royal Society.  And by launching a German journal, it gave German scientists like Leibniz (and the Bernoullis and Euler, and von Tschirnhaus among others) a freer voice without censor by the Journal des Savants of Paris.

Fig. 3 Acta Eruditorum of 1684 containing one of Leibniz’ early papers on the calculus.

The Acta Eruditorum was almost a vanity press for Leibniz.  He published 13 papers in the journal in its first 4 years of activity starting in 1682.  In return, when Leibniz became embroiled in the priority dispute with Newton over the invention of the calculus, the Acta provided loyal support for Leibniz’ side just as the Proceedings of the Royal Society gave loyal support to Newton.  In fact, the trigger that launched the nasty battle with Newton was a review that Leibniz wrote for the Acta in 17?? [Ref] in which he presented himself as the primary inventor of the calculus.  When he failed to give due credit, not only to Newton, but also to lesser contributors, they fought back by claiming that Leibniz had stolen the idea from Newton.  Although a kangaroo court by the Royal Society found in favor of Newton, posterity gives most of the credit for the development and dissemination of the calculus to Leibniz.  Where Newton guarded his advances jealously and would not explain his approach, Leibniz freely published his methods for all to see and to learn and to try out for themselves.  In this open process, the Acta was the primary medium of communication and gets the credit for being the conduit by which the calculus was presented to the world.

Although the Acta Eruditorum only operated for 100 years, it stands out as the most important publication for the development of the calculus.  Leibnitz published in the Acta a progressive set of papers that outlined his method for the calculus.  More importantly, his papers elicited responses from other mathematicians, most notably Johann Bernoulli and von Tschirnhaus and L’Hopital, who helped to refine the methods and advance the art.  The Acta became a collaborative space for this team of mathematicians as they fine-tuned the methods as well as the notations for the calculus, most of which stand to this day.  In contrast, Newton’s notations have all but faded, save the simple “dot” notation over variables to denote them as time derivatives (his fluxions).  Therefore, for most of continental Europe, the Acta Eruditorum was the place to publish, and it was here that Johann Bernoulli published his famous challenge of the brachistochrone.

The Competition

Johann suggested the problem in the June 1696 Acta Eruditorum

Following the example set by Pascal, Fermat, etc., I hope to gain the gratitude of the whole scientific community by placing before the finest mathematicians of our time a problem which will test their methods and the strength of their intellect. If someone communicates to me the solution of the proposed problem, I shall publicly declare him worthy of praise

Given two points A and B in a vertical plane, what is the curve traced out by a point acted on only by gravity, which starts at A and reaches B in the shortest time

The competition was originally proposed for 6 months, but it was then extended to a year and a half.  Johann published his results about a year later, but not without controversy.  Johann had known that his brother Jakob was also working on the problem, but he incorrectly thought that Jakob was convinced that Galileo had been right, so Johann described his approach to Jakob thinking he had little to fear in the competition.  Johann didn’t know that Jakob had already taken an approach similar to Johann’s, and even more importantly, Jakob had done the math correctly.  When Jakob showed Johann his mistake, he also ill-advisedly showed him the correct derivation.  Johann sent off a manuscript to Acta with the correct derivation that he had learned from Jakob.

Within the year and a half there were 4 additional solutions—all correct—using different approaches.  One of the most famous responses was by Newton (who as usual did not give up his method) but who is reported to have solved the problem in a day.  Others who contributed solutions were Gottfried Leibniz, Ehrenfried Walther von Tschirnhaus, and Guillaume de l’Hôpital’s.  Of course, Jakob sent in his own solution, although it overlapped with the one Johann had already published.

The Solution of Jakob and Johann Bernoulli

The stroke of genius of Jakob and Johann Bernoulli, accomplished in 1697 only about 20 years after the invention of the calculus, was to recognize an amazing analogy between mechanics and light.  Their insight foreshadowed Lagrange by a hundred years and William Rowan Hamilton by a hundred and fifty.  They did this by recognizing that the path of a light beam, just like the trajectory of a particle, conserves certain properties.  In the case of Fermat’s principle, a light ray refracts to take the path of least time between two points.  The insight of the Bernoulli’s is that a mechanical particle would behave in exactly the same way.  Therefore, the brachistrochrone can be obtained by considering the path that a light beam would take if the light ray were propagating through a medium with non-uniform refractive index to that the speed of light varies with height y as

Fermat’s principle of least time, which is consistent with Snell’s Law at interfaces, imposes the constraint on the path

This equation for a light ray propagating through a non-uniform medium would later become known as the Eikonal Equation.  The conserved quantity along this path is the value 1/vm.  Rewriting the Eikonal equation as

it can be solved for the differential equation

which those in the know (as certainly the Bernoullis were) would know is the equation of a cycloid.  If the sliding bead is on a wire shaped like a cycloid, there must be a lowest point for which the speed is a maximum.  For the cycloid curve of diameter D, this is

Therefore, the equation for the brachistochrone is

which is the differential equation for an inverted cycloid of diameter D.





Fig. 4 A light ray enters vertically on a medium whose refractive index varies as the square-root of depth.  The path of least time for the light ray to travel through the material is a cycloid—the same as for a massive particle traveling from point A to point B.

Calculus of Variations

Variational calculus had not quite been invented in time to solve the Brachistochrone, although the brachistochrone challenge helped motivate its eventual development by Euler and Lagrange later in the eighteenth century. Nonetheless, it is helpful to see the variational solution, which is the way we would solve this problem if it were a Lagrangian problem in advanced classical mechanics.

First, the total time taken by the sliding bead is defined as

Then we take energy conservation to solve for v(y)

The path element is

which leads to the expression for total time

It is the argument of the integral which is the quantity to be varied (the Lagrangian)

which can be inserted into the Lagrange equation

This has a simple first integral

This is explicitly solved

Once again, it helps to recognize the equation of a cycloid, because the last line can be solved as the parametric curves

which is the cycloid curve.

References

C. B. Boyer, The History of the Calculus and its Conceptual Development. New York: Dover, 1959.

J. Coopersmith, The lazy universe : an introduction to the principle of least action. Oxford University Press, 2017.

D. S. Lemons, Perfect Form: Variational Principles, Methods, and Applications in Elementary Physics. Princeton University Press, 1997.

Wikipedia: The Brachistrochrone Curve

W. Yourgrau, Variational principles in dynamics and quantum theory, 2d ed.. ed. New York: New York, Pitman Pub. Corp., 1960.

The Solvay Debates: Einstein versus Bohr

Einstein is the alpha of the quantum. Einstein is also the omega. Although he was the one who established the quantum of energy and matter (see my Blog Einstein vs Planck), Einstein pitted himself in a running debate against Niels Bohr’s emerging interpretation of quantum physics that had, in Einstein’s opinion, severe deficiencies. Between sessions during a series of conferences known as the Solvay Congresses over a period of eight years from 1927 to 1935, Einstein constructed a challenges of increasing sophistication to confront Bohr and his quasi-voodoo attitudes about wave-function collapse. To meet the challenge, Bohr sharpened his arguments and bested Einstein, who ultimately withdrew from the field of battle. Einstein, as quantum physics’ harshest critic, played a pivotal role, almost against his will, establishing the Copenhagen interpretation of quantum physics that rules to this day, and also inventing the principle of entanglement which lies at the core of almost all quantum information technology today.

Debate Timeline

  • Fifth Solvay Congress: 1927 October Brussels: Debate Round 1
    • Einstein and ensembles
  • Sixth Solvay Congress: 1930 Debate Round 2
    • Photon in a box
  • Seventh Solvay Congress: 1933
    • Einstein absent (visiting the US when Hitler takes power…decides not to return to Germany.)
  • Physical Review 1935: Debate Round 3
    • EPR paper and Bohr’s response
    • Schrödinger’s Cat
  • Notable Nobel Prizes
    • 1918 Planck
    • 1921 Einstein
    • 1922 Bohr
    • 1932 Heisenberg
    • 1933 Dirac and Schrödinger

The Solvay Conferences

The Solvay congresses were unparalleled scientific meetings of their day.  They were attended by invitation only, and invitations were offered only to the top physicists concerned with the selected topic of each meeting.  The Solvay congresses were held about every three years always in Belgium, supported by the Belgian chemical industrialist Ernest Solvay.  The first meeting, held in 1911, was on the topic of radiation and quanta. 

Fig. 1 First Solvay Congress (1911). Einstein (standing second from right) was one of the youngest attendees.

The fifth meeting, held in 1927, was on electrons and photons and focused on the recent rapid advances in quantum theory.  The old quantum guard was invited—Planck, Bohr and Einstein.  The new quantum guard was invited as well—Heisenberg, de Broglie, Schrödinger, Born, Pauli, and Dirac.  Heisenberg and Bohr joined forces to present a united front meant to solidify what later became known as the Copenhagen interpretation of quantum physics.  The basic principles of the interpretation include the wavefunction of Schrödinger, the probabilistic interpretation of Born, the uncertainty principle of Heisenberg, the complementarity principle of Bohr and the collapse of the wavefunction during measurement.  The chief conclusion that Heisenberg and Bohr sought to impress on the assembled attendees was that the theory of quantum processes was complete, meaning that unknown or uncertain  characteristics of measurements could not be attributed to lack of knowledge or understanding, but were fundamental and permanently inaccessible.

Fig. 2 Fifth Solvay Congress (1927). Einstein front and center. Bohr on the far right middle row.

Einstein was not convinced with that argument, and he rose to his feet to object after Bohr’s informal presentation of his complementarity principle.  Einstein insisted that uncertainties in measurement were not fundamental, but were caused by incomplete information, that , if known, would accurately account for the measurement results.  Bohr was not prepared for Einstein’s critique and brushed it off, but what ensued in the dining hall and the hallways of the Hotel Metropole in Brussels over the next several days has become one of the most famous scientific debates of the modern era, known as the Bohr-Einstein debate on the meaning of quantum theory.  The debate gently raged night and day through the fifth congress, and was renewed three years later at the 1930 congress.  It finished, in a final flurry of published papers in 1935 that launched some of the central concepts of quantum theory, including the idea of quantum entanglement and, of course, Schrödinger’s cat.

Einstein’s strategy, to refute Bohr, was to construct careful thought experiments that envisioned perfect experiments, without errors, that measured properties of ideal quantum systems.  His aim was to paint Bohr into a corner from which he could not escape, caught by what Einstein assumed was the inconsistency of complementarity.  Einstein’s “thought experiments” used electrons passing through slits, diffracting as required by Schrödinger’s theory, but being detected by classical measurements.  Einstein would present a thought experiment to Bohr, who would then retreat to consider the way around Einstein’s arguments, returning the next hour or the next day with his answer, only to be confronted by yet another clever device of Einstein’s clever imagination that would force Bohr to retreat again.  The spirit of this back and forth encounter between Bohr and Einstein is caught dramatically in the words of Paul Ehrenfest who witnessed the debate first hand, partially mediating between Bohr and Einstein, both of whom he respected deeply.

“Brussels-Solvay was fine!… BOHR towering over everybody.  At first not understood at all … , then  step by step defeating everybody.  Naturally, once again the awful Bohr incantation terminology.  Impossible for anyone else to summarise … (Every night at 1 a.m., Bohr came into my room just to say ONE SINGLE WORD to me, until three a.m.)  It was delightful for me to be present during the conversation between Bohr and Einstein.  Like a game of chess, Einstein all the time with new examples.  In a certain sense a sort of Perpetuum Mobile of the second kind to break the UNCERTAINTY RELATION.  Bohr from out of philosophical smoke clouds constantly searching for the tools to crush one example after the other.  Einstein like a jack-in-the-box; jumping out fresh every morning.  Oh, that was priceless.  But I am almost without reservation pro Bohr and contra Einstein.  His attitude to Bohr is now exacly like the attitude of the defenders of absolute simultaneity towards him …” [1]

The most difficult example that Einstein constructed during the fifth Solvary Congress involved an electron double-slit apparatus that could measure, in principle, the momentum imparted to the slit by the passing electron, as shown in Fig.3.  The electron gun is a point source that emits the electrons in a range of angles that illuminates the two slits.  The slits are small relative to a de Broglie wavelength, so the electron wavefunctions diffract according to Schrödinger’s wave mechanics to illuminate the detection plate.  Because of the interference of the electron waves from the two slits, electrons are detected clustered in intense fringes separated by dark fringes. 

So far, everyone was in agreement with these suggested results.  The key next step is the assumption that the electron gun emits only a single electron at a time, so that only one electron is present in the system at any given time.  Furthermore, the screen with the double slit is suspended on a spring, and the position of the screen is measured with complete accuracy by a displacement meter.  When the single electron passes through the entire system, it imparts a momentum kick to the screen, which is measured by the meter.  It is also detected at a specific location on the detection plate.  Knowing the position of the electron detection, and the momentum kick to the screen, provides information about which slit the electron passed through, and gives simultaneous position and momentum values to the electron that have no uncertainty, apparently rebutting the uncertainty principle.             

Fig. 3 Einstein’s single-electron thought experiment in which the recoil of the screen holding the slits can be measured to tell which way the electron went. Bohr showed that the more “which way” information is obtained, the more washed-out the interference pattern becomes.

This challenge by Einstein was the culmination of successively more sophisticated examples that he had to pose to combat Bohr, and Bohr was not going to let it pass unanswered.  With ingenious insight, Bohr recognized that the key element in the apparatus was the fact that the screen with the slits must have finite mass if the momentum kick by the electron were to produce a measurable displacement.  But if the screen has finite mass, and hence a finite momentum kick from the electron, then there must be an uncertainty in the position of the slits.  This uncertainty immediately translates into a washout of the interference fringes.  In fact the more information that is obtained about which slit the electron passed through, the more the interference is washed out.  It was a perfect example of Bohr’s own complementarity principle.  The more the apparatus measures particle properties, the less it measures wave properties, and vice versa, in a perfect balance between waves and particles. 

Einstein grudgingly admitted defeat at the end of the first round, but he was not defeated.  Three years later he came back armed with more clever thought experiments, ready for the second round in the debate.

The Sixth Solvay Conference: 1930

At the Solvay Congress of 1930, Einstein was ready with even more difficult challenges.  His ultimate idea was to construct a box containing photons, just like the original black bodies that launched Planck’s quantum hypothesis thirty years before.  The box is attached to a weighing scale so that the weight of the box plus the photons inside can be measured with arbitrarily accuracy. A shutter over a hole in the box is opened for a time T, and a photon is emitted.  Because the photon has energy, it has an equivalent weight (Einstein’s own famous E = mc2), and the mass of the box changes by an amount equal to the photon energy divided by the speed of light squared: m = E/c2.  If the scale has arbitrary accuracy, then the energy of the photon has no uncertainty.  In addition, because the shutter was open for only a time T, the time of emission similarly has no uncertainty.  Therefore, the product of the energy uncertainty and the time uncertainty is much smaller than Planck’s constant, apparently violating Heisenberg’s precious uncertainty principle.

Bohr was stopped in his tracks with this challenge.  Although he sensed immediately that Einstein had missed something (because Bohr had complete confidence in the uncertainty principle), he could not put his finger immediately on what it was.  That evening he wandered from one attendee to another, very unhappy, trying to persuade them and saying that Einstein could not be right because it would be the end of physics.  At the end of the evening, Bohr was no closer to a solution, and Einstein was looking smug.  However, by the next morning Bohr reappeared tired but in high spirits, and he delivered a master stroke.  Where Einstein had used special relaitivity against Bohr, Bohr now used Einstein’s own general relativity against him. 

The key insight was that the weight of the box must be measured, and the process of measurement was just as important as the quantum process being measured—this was one of the cornerstones of the Copenhagen interpretation.  So Bohr envisioned a measuring apparatus composed of a spring and a scale with the box suspended in gravity from the spring.  As the photon leaves the box, the weight of the box changes, and so does the deflection of the spring, changing the height of the box.  This change in height, in a gravitational potential, causes the timing of the shutter to change according to the law of gravitational time dilation in general relativity.  By calculating the the general relativistic uncertainty in the time, coupled with the special relativistic uncertainty in the weight of the box, produced a product that was at least as big as Planck’s constant—Heisenberg’s uncertainty principle was saved!

Fig. 4 Einstein’s thought experiment that uses special relativity to refute quantum mechanics. Bohr then invoked Einstein’s own general relativity to refute him.

Entanglement and Schrödinger’s Cat

Einstein ceded the point to Bohr but was not convinced. He still believed that quantum mechanics was not a “complete” theory of quantum physics and he continued to search for the perfect thought experiment that Bohr could not escape. Even today when we have become so familiar with quantum phenomena, the Copenhagen interpretation of quantum mechanics has weird consequences that seem to defy common sense, so it is understandable that Einstein had his reservations.

After the sixth Solvay congress Einstein and Schrödinger exchanged many letters complaining to each other about Bohr’s increasing strangle-hold on the interpretation of quantum mechanics. Egging each other on, they both constructed their own final assault on Bohr. The irony is that the concepts they devised to throw down quantum mechanics have today become cornerstones of the theory. For Einstein, his final salvo was “Entanglement”. For Schrödinger, his final salvo was his “cat”. Today, Entanglement and Schrödinger’s Cat have become enshrined on the alter of quantum interpretation even though their original function was to thwart that interpretation.

The final round of the debate was carried out, not at a Solvay congress, but in the Physical review journal by Einstein [2] and Bohr [3], and in the Naturwissenshaften by Schrödinger [4].

In 1969, Heisenberg looked back on these years and said,

To those of us who participated in the development of atomic theory, the five years following the Solvay Conference in Brussels in 1927 looked so wonderful that we often spoke of them as the golden age of atomic physics. The great obstacles that had occupied all our efforts in the preceding years had been cleared out of the way, the gate to an entirely new field, the quantum mechanics of the atomic shells stood wide open, and fresh fruits seemed ready for the picking. [5]

References

[1] A. Whitaker, Einstein, Bohr, and the quantum dilemma : from quantum theory to quantum information, 2nd ed. Cambridge University Press, 2006. (pg. 210)

[2] A. Einstein, B. Podolsky, and N. Rosen, “Can quantum-mechanical description of physical reality be considered complete?,” Physical Review, vol. 47, no. 10, pp. 0777-0780, May (1935)

[3] N. Bohr, “Can quantum-mechanical description of physical reality be considered complete?,” Physical Review, vol. 48, no. 8, pp. 696-702, Oct (1935)

[4] E. Schrodinger, “The current situation in quantum mechanics,” Naturwissenschaften, vol. 23, pp. 807-812, (1935)

[5] W Heisenberg, Physics and beyond : Encounters and conversations (Harper, New York, 1971)

Timelines in the History and Physics of Dynamics (with links to primary texts)

These timelines in the History of Dynamics are organized along the Chapters in Galileo Unbound (Oxford, 2018). The book is about the physics and history of dynamics including classical and quantum mechanics as well as general relativity and nonlinear dynamics (with a detour down evolutionary dynamics and game theory along the way). The first few chapters focus on Galileo, while the following chapters follow his legacy, as theories of motion became more abstract, eventually to encompass the evolution of species within the same theoretical framework as the orbit of photons around black holes.

Galileo: A New Scientist

Galileo Galilei was the first modern scientist, launching a new scientific method that superseded, after one and a half millennia, Aristotle’s physics.  Galileo’s career began with his studies of motion at the University of Pisa that were interrupted by his move to the University of Padua and his telescopic discoveries of mountains on the moon and the moons of Jupiter.  Galileo became the first rock star of science, and he used his fame to promote the ideas of Copernicus and the Sun-centered model of the solar system.  But he pushed too far when he lampooned the Pope.  Ironically, Galileo’s conviction for heresy and his sentence to house arrest for the remainder of his life gave him the free time to finally finish his work on the physics of motion, which he published in Two New Sciences in 1638.

1543 Copernicus dies, publishes posthumously De Revolutionibus

1564    Galileo born

1581    Enters University of Pisa

1585    Leaves Pisa without a degree

1586    Invents hydrostatic balance

1588    Receives lecturship in mathematics at Pisa

1592    Chair of mathematics at Univeristy of Padua

1595    Theory of the tides

1595    Invents military and geometric compass

1596    Le Meccaniche and the principle of horizontal inertia

1600    Bruno Giordano burned at the stake

1601    Death of Tycho Brahe

1609    Galileo constructs his first telescope, makes observations of the moon

1610    Galileo discovers 4 moons of Jupiter, Starry Messenger (Sidereus Nuncius), appointed chief philosopher and mathematician of the Duke of Tuscany, moves to Florence, observes Saturn, Venus goes through phases like the moon

1611    Galileo travels to Rome, inducted into the Lyncean Academy, name “telescope” is first used

1611    Scheiner discovers sunspots

1611    Galileo meets Barberini, a cardinal

1611 Johannes Kepler, Dioptrice

1613    Letters on sunspots published by Lincean Academy in Rome

1614    Galileo denounced from the pulpit

1615    (April) Bellarmine writes an essay against Coperinicus

1615    Galileo investigated by the Inquisition

1615    Writes Letter to Christina, but does not publish it

1615    (December) travels to Rome and stays at Tuscan embassy

1616    (January) Francesco Ingoli publishes essay against Copernicus

1616    (March) Decree against copernicanism

1616    Galileo publishes theory of tides, Galileo meets with Pope Paul V, Copernicus’ book is banned, Galileo warned not to support the Coperinican system, Galileo decides not to reply to Ingoli, Galileo proposes eclipses of Jupter’s moons to determine longitude at sea

1618    Three comets appear, Grassi gives a lecture not hostile to Galileo

1618    Galileo, through Mario Guiducci, publishes scathing attack on Grassi

1619    Jesuit Grassi (Sarsi) publishes attack on Galileo concerning 3 comets

1619    Marina Gamba dies, Galileo legitimizes his son Vinczenzio

1619 Kepler’s Laws, Epitome astronomiae Copernicanae.

1623    Barberini becomes Urban VIII, The Assayer published (response to Grassi)

1624    Galileo visits Rome and Urban VIII

1629    Birth of his grandson Galileo

1630    Death of Johanes Kepler

1632    Publication of the Dialogue Concerning the Two Chief World Systems, Galileo is indicted by the Inquisition (68 years old)

1633    (February) Travels to Rome

1633    Convicted, abjurs, house arrest in Rome, then Siena, then home to Arcetri

1638    Blind, publication of Two New Sciences

1642    Galileo dies (77 years old)

Galileo’s Trajectory

Galileo’s discovery of the law of fall and the parabolic trajectory began with early work on the physics of motion by predecessors like the Oxford Scholars, Tartaglia and the polymath Simon Stevin who dropped lead weights from the leaning tower of Delft three years before Galileo (may have) dropped lead weights from the leaning tower of Pisa.  The story of how Galileo developed his ideas of motion is described in the context of his studies of balls rolling on inclined plane and the surprising accuracy he achieved without access to modern timekeeping.

1583    Galileo Notices isochronism of the pendulum

1588    Receives lecturship in mathematics at Pisa

1589 – 1592  Work on projectile motion in Pisa

1592    Chair of mathematics at Univeristy of Padua

1596    Le Meccaniche and the principle of horizontal inertia

1600    Guidobaldo shares technique of colored ball

1602    Proves isochronism of the pendulum (experimentally)

1604    First experiments on uniformly accelerated motion

1604    Wrote to Scarpi about the law of fall (s ≈ t2)

1607-1608  Identified trajectory as parabolic

1609    Velocity proportional to time

1632    Publication of the Dialogue Concerning the Two Chief World Systems, Galileo is indicted by the Inquisition (68 years old)

1636    Letter to Christina published in Augsburg in Latin and Italian

1638    Blind, publication of Two New Sciences

1641    Invented pendulum clock (in theory)

1642    Dies (77 years old)

On the Shoulders of Giants

Galileo’s parabolic trajectory launched a new approach to physics that was taken up by a new generation of scientists like Isaac Newton, Robert Hooke and Edmund Halley.  The English Newtonian tradition was adopted by ambitious French iconoclasts who championed Newton over their own Descartes.  Chief among these was Pierre Maupertuis, whose principle of least action was developed by Leonhard Euler and Joseph Lagrange into a rigorous new science of dynamics.  Along the way, Maupertuis became embroiled in a famous dispute that entangled the King of Prussia as well as the volatile Voltaire who was mourning the death of his mistress Emilie du Chatelet, the lone female French physicist of the eighteenth century.

1644    Descartes’ vortex theory of gravitation

1662    Fermat’s principle

1669 – 1690    Huygens expands on Descartes’ vortex theory

1687 Newton’s Principia

1698    Maupertuis born

1729    Maupertuis entered University in Basel.  Studied under Johann Bernoulli

1736    Euler publishes Mechanica sive motus scientia analytice exposita

1737   Maupertuis report on expedition to Lapland.  Earth is oblate.  Attacks Cassini.

1744    Maupertuis Principle of Least Action.  Euler Principle of Least Action.

1745    Maupertuis becomes president of Berlin Academy.  Paris Academy cancels his membership after a campaign against him by Cassini.

1746    Maupertuis principle of Least Action for mass

1751    Samuel König disputes Maupertuis’ priority

1756    Cassini dies.  Maupertuis reinstated in the French Academy

1759    Maupertuis dies

1759    du Chatelet’s French translation of Newton’s Principia published posthumously

1760    Euler 3-body problem (two fixed centers and coplanar third body)

1760-1761 Lagrange, Variational calculus (J. L. Lagrange, “Essai d’une nouvelle méthod pour dEeterminer les maxima et lest minima des formules intégrales indéfinies,” Miscellanea Teurinensia, (1760-1761))

1762    Beginning of the reign of Catherine the Great of Russia

1763    Euler colinear 3-body problem

1765    Euler publishes Theoria motus corporum solidorum on rotational mechanics

1766    Euler returns to St. Petersburg

1766    Lagrange arrives in Berlin

1772    Lagrange equilateral 3-body problem, Essai sur le problème des trois corps, 1772, Oeuvres tome 6

1775    Beginning of the American War of Independence

1776    Adam Smith Wealth of Nations

1781    William Herschel discovers Uranus

1783    Euler dies in St. Petersburg

1787    United States Constitution written

1787    Lagrange moves from Berlin to Paris

1788    Lagrange, Méchanique analytique

1789    Beginning of the French Revolution

1799    Pierre-Simon Laplace Mécanique Céleste (1799-1825)

Geometry on My Mind

This history of modern geometry focuses on the topics that provided the foundation for the new visualization of physics.  It begins with Carl Gauss and Bernhard Riemann, who redefined geometry and identified the importance of curvature for physics.  Vector spaces, developed by Hermann Grassmann, Giuseppe Peano and David Hilbert, are examples of the kinds of abstract new spaces that are so important for modern physics, such as Hilbert space for quantum mechanics.  Fractal geometry developed by Felix Hausdorff later provided the geometric language needed to solve problems in chaos theory.

1629    Fermat described higher-dim loci

1637    Descarte’s Geometry

1649    van Schooten’s commentary on Descartes Geometry

1694    Leibniz uses word “coordinate” in its modern usage

1697    Johann Bernoulli shortest distance between two points on convex surface

1732    Euler geodesic equations for implicit surfaces

1748    Euler defines modern usage of function

1801    Gauss calculates orbit of Ceres

1807    Fourier analysis (published in 1822(

1807    Gauss arrives in Göttingen

1827    Karl Gauss establishes differential geometry of curved surfaces, Disquisitiones generales circa superficies curvas

1830    Bolyai and Lobachevsky publish on hyperbolic geometry

1834    Jacobi n-fold integrals and volumes of n-dim spheres

1836    Liouville-Sturm theorem

1838    Liouville’s theorem

1841    Jacobi determinants

1843    Arthur Cayley systems of n-variables

1843    Hamilton discovers quaternions

1844    Hermann Grassman n-dim vector spaces, Die Lineale Ausdehnungslehr

1846    Julius Plücker System der Geometrie des Raumes in neuer analytischer Behandlungsweise

1848 Jacobi Vorlesungen über Dynamik

1848    “Vector” coined by Hamilton

1854    Riemann’s habilitation lecture

1861    Riemann n-dim solution of heat conduction

1868    Publication of Riemann’s Habilitation

1869    Christoffel and Lipschitz work on multiple dimensional analysis

1871    Betti refers to the n-ply of numbers as a “space”.

1871    Klein publishes on non-euclidean geometry

1872 Boltzmann distribution

1872    Jordan Essay on the geometry of n-dimensions

1872    Felix Klein’s “Erlangen Programme”

1872    Weierstrass’ Monster

1872    Dedekind cut

1872    Cantor paper on irrational numbers

1872    Cantor meets Dedekind

1872 Lipschitz derives mechanical motion as a geodesic on a manifold

1874    Cantor beginning of set theory

1877    Cantor one-to-one correspondence between the line and n-dimensional space

1881    Gibbs codifies vector analysis

1883    Cantor set and staircase Grundlagen einer allgemeinen Mannigfaltigkeitslehre

1884    Abbott publishes Flatland

1887    Peano vector methods in differential geometry

1890    Peano space filling curve

1891    Hilbert space filling curve

1887    Darboux vol. 2 treats dynamics as a point in d-dimensional space.  Applies concepts of geodesics for trajectories.

1898    Ricci-Curbastro Lesons on the Theory of Surfaces

1902    Lebesgue integral

1904    Hilbert studies integral equations

1904    von Koch snowflake

1906    Frechet thesis on square summable sequences as infinite dimensional space

1908    Schmidt Geometry in a Function Space

1910    Brouwer proof of dimensional invariance

1913    Hilbert space named by Riesz

1914    Hilbert space used by Hausdorff

1915    Sierpinski fractal triangle

1918    Hausdorff non-integer dimensions

1918    Weyl’s book Space, Time, Matter

1918    Fatou and Julia fractals

1920    Banach space

1927    von Neumann axiomatic form of Hilbert Space

1935    Frechet full form of Hilbert Space

1967    Mandelbrot coast of Britain

1982    Mandelbrot’s book The Fractal Geometry of Nature

The Tangled Tale of Phase Space

Phase space is the central visualization tool used today to study complex systems.  The chapter describes the origins of phase space with the work of Joseph Liouville and Carl Jacobi that was later refined by Ludwig Boltzmann and Rudolf Clausius in their attempts to define and explain the subtle concept of entropy.  The turning point in the history of phase space was when Henri Poincaré used phase space to solve the three-body problem, uncovering chaotic behavior in his quest to answer questions on the stability of the solar system.  Phase space was established as the central paradigm of statistical mechanics by JW Gibbs and Paul Ehrenfest.

1804    Jacobi born (1904 – 1851) in Potsdam

1804    Napoleon I Emperor of France

1806    William Rowan Hamilton born (1805 – 1865)

1807    Thomas Young describes “Energy” in his Course on Natural Philosophy (Vol. 1 and Vol. 2)

1808    Bethoven performs his Fifth Symphony

1809    Joseph Liouville born (1809 – 1882)

1821    Hermann Ludwig Ferdinand von Helmholtz born (1821 – 1894)

1824    Carnot published Reflections on the Motive Power of Fire

1834    Jacobi n-fold integrals and volumes of n-dim spheres

1834-1835       Hamilton publishes his principle (1834, 1835).

1836    Liouville-Sturm theorem

1837    Queen Victoria begins her reign as Queen of England

1838    Liouville develops his theorem on products of n differentials satisfying certain first-order differential equations.  This becomes the classic reference to Liouville’s Theorem.

1847    Helmholtz  Conservation of Energy (force)

1849    Thomson makes first use of “Energy” (From reading Thomas Young’s lecture notes)

1850    Clausius establishes First law of Thermodynamics: Internal energy. Second law:  Heat cannot flow unaided from cold to hot.  Not explicitly stated as first and second laws

1851    Thomson names Clausius’ First and Second laws of Thermodynamics

1852    Thomson describes general dissipation of the universe (“energy” used in title)

1854    Thomson defined absolute temperature.  First mathematical statement of 2nd law.  Restricted to reversible processes

1854    Clausius stated Second Law of Thermodynamics as inequality

1857    Clausius constructs kinetic theory, Mean molecular speeds

1858    Clausius defines mean free path, Molecules have finite size. Clausius assumed that all molecules had the same speed

1860    Maxwell publishes first paper on kinetic theory. Distribution of speeds. Derivation of gas transport properties

1865    Loschmidt size of molecules

1865    Clausius names entropy

1868    Boltzmann adds (Boltzmann) factor to Maxwell distribution

1872    Boltzmann transport equation and H-theorem

1876    Loschmidt reversibility paradox

1877    Boltzmann  S = k logW

1890    Poincare: Recurrence Theorem. Recurrence paradox with Second Law (1893)

1896    Zermelo criticizes Boltzmann

1896    Boltzmann posits direction of time to save his H-theorem

1898    Boltzmann Vorlesungen über Gas Theorie

1905    Boltzmann kinetic theory of matter in Encyklopädie der mathematischen Wissenschaften

1906    Boltzmann dies

1910    Paul Hertz uses “Phase Space” (Phasenraum)

1911    Ehrenfest’s article in Encyklopädie der mathematischen Wissenschaften

1913    A. Rosenthal writes the first paper using the phrase “phasenraum”, combining the work of Boltzmann and Poincaré. “Beweis der Unmöglichkeit ergodischer Gassysteme” (Ann. D. Physik, 42, 796 (1913)

1913    Plancheral, “Beweis der Unmöglichkeit ergodischer mechanischer Systeme” (Ann. D. Physik, 42, 1061 (1913).  Also uses “Phasenraum”.

The Lens of Gravity

Gravity provided the backdrop for one of the most important paradigm shifts in the history of physics.  Prior to Albert Einstein’s general theory of relativity, trajectories were paths described by geometry.  After the theory of general relativity, trajectories are paths caused by geometry.  This chapter explains how Einstein arrived at his theory of gravity, relying on the space-time geometry of Hermann Minkowski, whose work he had originally harshly criticized.  The confirmation of Einstein’s theory was one of the dramatic high points in 20th century history of physics when Arthur Eddington journeyed to an island off the coast of Africa to observe stellar deflections during a solar eclipse.  If Galileo was the first rock star of physics, then Einstein was the first worldwide rock star of science.

1697    Johann Bernoulli was first to find solution to shortest path between two points on a curved surface (1697).

1728    Euler found the geodesic equation.

1783    The pair 40 Eridani B/C was discovered by William Herschel on 31 January

1783    John Michell explains infalling object would travel faster than speed of light

1796    Laplace describes “dark stars” in Exposition du system du Monde

1827    The first orbit of a binary star computed by Félix Savary for the orbit of Xi Ursae Majoris.

1827    Gauss curvature Theoriem Egregum

1844    Bessel notices periodic displacement of Sirius with period of half a century

1844    The name “geodesic line” is attributed to Liouville.

1845    Buys Ballot used musicians with absolute pitch for the first experimental verification of the Doppler effect

1854    Riemann’s habilitationsschrift

1862    Discovery of Sirius B (a white dwarf)

1868    Darboux suggested motions in n-dimensions

1872    Lipshitz first to apply Riemannian geometry to the principle of least action.

1895    Hilbert arrives in Göttingen

1902    Minkowski arrives in Göttingen

1905    Einstein’s miracle year

1906    Poincaré describes Lorentz transformations as rotations in 4D

1907    Einstein has “happiest thought” in November

1907    Einstein’s relativity review in Jahrbuch

1908    Minkowski’s Space and Time lecture

1908    Einstein appointed to unpaid position at University of Bern

1909    Minkowski dies

1909    Einstein appointed associate professor of theoretical physics at U of Zürich

1910    40 Eridani B was discobered to be of spectral type A (white dwarf)

1910    Size and mass of Sirius B determined (heavy and small)

1911    Laue publishes first textbook on relativity theory

1911    Einstein accepts position at Prague

1911    Einstein goes to the limits of special relativity applied to gravitational fields

1912    Einstein’s two papers establish a scalar field theory of gravitation

1912    Einstein moves from Prague to ETH in Zürich in fall.  Begins collaboration with Grossmann.

1913    Einstein EG paper

1914    Adams publishes spectrum of 40 Eridani B

1915    Sirius B determined to be also a low-luminosity type A white dwarf

1915    Einstein Completes paper

1916    Density of 40 Eridani B by Ernst Öpik

1916    Schwarzschild paper

1916 Einstein’s publishes theory of gravitational waves

1919    Eddington expedition to Principe

1920    Eddington paper on deflection of light by the sun

1922    Willem Luyten coins phrase “white dwarf”

1924    Eddington found a set of coordinates that eliminated the singularity at the Schwarzschild radius

1926    R. H. Fowler publishes paper on degenerate matter and composition of white dwarfs

1931    Chandrasekhar calculated the limit for collapse to white dwarf stars at 1.4MS

1933    Georges Lemaitre states the coordinate singularity was an artefact

1934    Walter Baade and Fritz Zwicky proposed the existence of the neutron star only a year after the discovery of the neutron by Sir James Chadwick.

1939    Oppenheimer and Snyder showed ultimate collapse of a 3MS  “frozen star”

1958    David Finkelstein paper

1965    Antony Hewish and Samuel Okoye discovered “an unusual source of high radio brightness temperature in the Crab Nebula”. This source turned out to be the Crab Nebula neutron star that resulted from the great supernova of 1054.

1967    Jocelyn Bell and Antony Hewish discovered regular radio pulses from CP 1919. This pulsar was later interpreted as an isolated, rotating neutron star.

1967    Wheeler’s “black hole” talk

1974    Joseph Taylor and Russell Hulse discovered the first binary pulsar, PSR B1913+16, which consists of two neutron stars (one seen as a pulsar) orbiting around their center of mass.

2015    LIGO detects gravitational waves on Sept. 14 from the merger of two black holes

2017    LIGO detects the merger of two neutron stars

On the Quantum Footpath

The concept of the trajectory of a quantum particle almost vanished in the battle between Werner Heisenberg’s matrix mechanics and Erwin Schrödinger’s wave mechanics.  It took Niels Bohr and his complementarity principle of wave-particle duality to cede back some reality to quantum trajectories.  However, Schrödinger and Einstein were not convinced and conceived of quantum entanglement to refute the growing acceptance of the Copenhagen Interpretation of quantum physics.  Schrödinger’s cat was meant to be an absurdity, but ironically it has become a central paradigm of practical quantum computers.  Quantum trajectories took on new meaning when Richard Feynman constructed quantum theory based on the principle of least action, inventing his famous Feynman Diagrams to help explain quantum electrodynamics.

1885    Balmer Theory: 

1897    J. J. Thomson discovered the electron

1904    Thomson plum pudding model of the atom

1911    Bohr PhD thesis filed. Studies on the electron theory of metals.  Visited England.

1911    Rutherford nuclear model

1911    First Solvay conference

1911    “ultraviolet catastrophe” coined by Ehrenfest

1913    Bohr combined Rutherford’s nuclear atom with Planck’s quantum hypothesis: 1913 Bohr model

1913    Ehrenfest adiabatic hypothesis

1914-1916       Bohr at Manchester with Rutherford

1916    Bohr appointed Chair of Theoretical Physics at University of Copenhagen: a position that was made just for him

1916    Schwarzschild and Epstein introduce action-angle coordinates into quantum theory

1920    Heisenberg enters University of Munich to obtain his doctorate

1920    Bohr’s Correspondence principle: Classical physics for large quantum numbers

1921    Bohr Founded Institute of Theoretical Physics (Copenhagen)

1922-1923       Heisenberg studies with Born, Franck and Hilbert at Göttingen while Sommerfeld is in the US on sabbatical.

1923    Heisenberg Doctorate.  The exam does not go well.  Unable to derive the resolving power of a microscope in response to question by Wien.  Becomes Born’s assistant at Göttingen.

1924    Heisenberg visits Niels Bohr in Copenhagen (and met Einstein?)

1924    Heisenberg Habilitation at Göttingen on anomalous Zeeman

1924 – 1925    Heisenberg worked with Bohr in Copenhagen, returned summer of 1925 to Göttiingen

1924    Pauli exclusion principle and state occupancy

1924    de Broglie hypothesis extended wave-particle duality to matter

1924    Bohr Predicted Halfnium (72)

1924    Kronig’s proposal for electron self spin

1924    Bose (Einstein)

1925    Heisenberg paper on quantum mechanics

1925    Dirac, reading proof from Heisenberg, recognized the analogy of noncommutativity with Poisson brackets and the correspondence with Hamiltonian mechanics.

1925    Uhlenbeck and Goudschmidt: spin

1926    Born, Heisenberg, Kramers: virtual oscillators at transition frequencies: Matrix mechanics (alternative to Bohr-Kramers-Slater 1924 model of orbits).  Heisenberg was Born’s student at Göttingen.

1926    Schrödinger wave mechanics

1927    de Broglie hypotehsis confirmed by Davisson and Germer

1927    Complementarity by Bohr: wave-particle duality “Evidence obtained under different experimental conditions cannot be comprehended within a single picture, but must be regarded as complementary in the sense that only the totality of the phenomena exhausts the possible information about the objects.

1927    Heisenberg uncertainty principle (Heisenberg was in Copenhagen 1926 – 1927)

1927    Solvay Conference in Brussels

1928    Heisenberg to University of Leipzig

1928    Dirac relativistic QM equation

1929    de Broglie Nobel Prize

1930    Solvay Conference

1932    Heisenberg Nobel Prize

1932    von Neumann operator algebra

1933    Dirac Lagrangian form of QM (basis of Feynman path integral)

1933    Schrödinger and Dirac Nobel Prize

1935    Einstein, Poldolsky and Rosen EPR paper

1935 Bohr’s response to Einsteins “EPR” paradox

1935    Schrodinger’s cat

1939    Feynman graduates from MIT

1941    Heisenberg (head of German atomic project) visits Bohr in Copenhagen

1942    Feynman PhD at Princeton, “The Principle of Least Action in Quantum Mechanics

1942 – 1945    Manhattan Project, Bethe-Feynman equation for fission yield

1943    Bohr escapes to Sweden in a fishing boat.  Went on to England secretly.

1945    Pauli Nobel Prize

1945    Death of Feynman’s wife Arline (married 4 years)

1945    Fall, Feynman arrives at Cornell ahead of Hans Bethe

1947    Shelter Island conference: Lamb Shift, did Kramer’s give a talk suggesting that infinities could be subtracted?

1947    Fall, Dyson arrives at Cornell

1948    Pocono Manor, Pennsylvania, troubled unveiling of path integral formulation and Feynman diagrams, Schwinger’s master presentation

1948    Feynman and Dirac. Summer drive across the US with Dyson

1949    Dyson joins IAS as a postdoc, trains a cohort of theorists in Feynman’s technique

1949    Karplus and Kroll first g-factor calculation

1950    Feynman moves to Cal Tech

1965    Schwinger, Tomonaga and Feynman Nobel Prize

1967    Hans Bethe Nobel Prize

From Butterflies to Hurricanes

Half a century after Poincaré first glimpsed chaos in the three-body problem, the great Russian mathematician Andrey Kolmogorov presented a sketch of a theorem that could prove that orbits are stable.  In the hands of Vladimir Arnold and Jürgen Moser, this became the KAM theory of Hamiltonian chaos.  This chapter shows how KAM theory fed into topology in the hands of Stephen Smale and helped launch the new field of chaos theory.  Edward Lorenz discovered chaos in numerical models of atmospheric weather and discovered the eponymous strange attractor.  Mathematical aspects of chaos were further developed by Mitchell Feigenbaum studying bifurcations in the logistic map that describes population dynamics.

1760    Euler 3-body problem (two fixed centers and coplanar third body)

1763    Euler colinear 3-body problem

1772    Lagrange equilateral 3-body problem

1881-1886       Poincare memoires “Sur les courbes de ́finies par une equation differentielle”

1890    Poincare “Sur le probleme des trois corps et les equations de la dynamique”. First-return map, Poincare recurrence theorem, stable and unstable manifolds

1892 – 1899    Poincare New Methods in Celestial Mechanics

1892    Lyapunov The General Problem of the Stability of Motion

1899    Poincare homoclinic trajectory

1913    Birkhoff proves Poincaré’s last geometric theorem, a special case of the three-body problem.

1927    van der Pol and van der Mark

1937    Coarse systems, Andronov and Pontryagin

1938    Morse theory

1942    Hopf bifurcation

1945    Cartwright and Littlewood study the van der Pol equation (Radar during WWII)

1954    Kolmogorov A. N., On conservation of conditionally periodic motions for a small change in Hamilton’s function.

1960    Lorenz: 12 equations

1962    Moser On Invariant Curves of Area-Preserving Mappings of an Annulus.

1963    Arnold Small denominators and problems of the stability of motion in classical and celestial mechanics

1963    Lorenz: 3 equations

1964    Arnold diffusion

1965    Smale’s horseshoe

1969    Chirikov standard map

1971    Ruelle-Takens (Ruelle coins phrase “strange attractor”)

1972    “Butterfly Effect” given for Lorenz’ talk (by Philip Merilees)

1975    Gollub-Swinney observe route to turbulence along lines of Ruelle

1975    Yorke coins “chaos theory”

1976    Robert May writes review article of the logistic map

1977    New York conference on bifurcation theory

1987    James Gleick Chaos: Making a New Science

Darwin in the Clockworks

The preceding timelines related to the central role played by families of trajectories phase space to explain the time evolution of complex systems.  These ideas are extended to explore the history and development of the theory of natural evolution by Charles Darwin.  Darwin had many influences, including ideas from Thomas Malthus in the context of economic dynamics.  After Darwin, the ideas of evolution matured to encompass broad topics in evolutionary dynamics and the emergence of the idea of fitness landscapes and game theory driving the origin of new species.  The rise of genetics with Gregor Mendel supplied a firm foundation for molecular evolution, leading to the moleculer clock of Linus Pauling and the replicator dynamics of Richard Dawkins.

1202    Fibonacci

1766    Thomas Robert Malthus born

1776    Adam Smith The Wealth of Nations

1798    Malthus “An Essay on the Principle of Population

1817    Ricardo Principles of Political Economy and Taxation

1838    Cournot early equilibrium theory in duopoly

1848    John Stuart Mill

1848    Karl Marx Communist Manifesto

1859    Darwin Origin of Species

1867    Karl Marx Das Kapital

1871    Darwin Descent of Man, and Selection in Relation to Sex

1871    Jevons Theory of Political Economy

1871    Menger Principles of Economics

1874    Walrus Éléments d’économie politique pure, or Elements of Pure Economics (1954)

1890    Marshall Principles of Economics

1908    Hardy constant genetic variance

1910    Brouwer fixed point theorem

1910    Alfred J. Lotka autocatylitic chemical reactions

1913    Zermelo determinancy in chess

1922    Fisher dominance ratio

1922    Fisher mutations

1925    Lotka predator-prey in biomathematics

1926    Vita Volterra published same equations independently

1927    JBS Haldane (1892—1964) mutations

1928    von Neumann proves the minimax theorem

1930    Fisher ratio of sexes

1932    Wright Adaptive Landscape

1932    Haldane The Causes of Evolution

1933    Kolmogorov Foundations of the Theory of Probability

1934    Rudolph Carnap The Logical Syntax of Language

1936    John Maynard Keynes, The General Theory of Employment, Interest and Money

1936    Kolmogorov generalized predator-prey systems

1938    Borel symmetric payoff matrix

1942    Sewall Wright    Statistical Genetics and Evolution

1943    McCulloch and Pitts A Logical Calculus of Ideas Immanent in Nervous Activity

1944    von Neumann and Morgenstern Theory of Games and Economic Behavior

1950    Prisoner’s Dilemma simulated at Rand Corportation

1950    John Nash Equilibrium points in n-person games and The Bargaining Problem

1951    John Nash Non-cooperative Games

1952    McKinsey Introduction to the Theory of Games (first textbook)

1953    John Nash Two-Person Cooperative Games

1953    Watson and Crick DNA

1955    Braithwaite’s Theory of Games as a Tool for the Moral Philosopher

1961    Lewontin Evolution and the Theory of Games

1962    Patrick Moran The Statistical Processes of Evolutionary Theory

1962    Linus Pauling molecular clock

1968    Motoo Kimura  neutral theory of molecular evolution

1972    Maynard Smith introduces the evolutionary stable solution (ESS)

1972    Gould and Eldridge Punctuated equilibrium

1973    Maynard Smith and Price The Logic of Animal Conflict

1973    Black Scholes

1977    Eigen and Schuster The Hypercycle

1978    Replicator equation (Taylor and Jonker)

1982    Hopfield network

1982    John Maynard Smith Evolution and the Theory of Games

1984    R. Axelrod The Evolution of Cooperation

The Measure of Life

This final topic extends the ideas of dynamics into abstract spaces of high dimension to encompass the idea of a trajectory of life.  Health and disease become dynamical systems defined by all the proteins and nucleic acids that comprise the physical self.  Concepts from network theory, autonomous oscillators and synchronization contribute to this viewpoint.  Healthy trajectories are like stable limit cycles in phase space, but disease can knock the system trajectory into dangerous regions of health space, as doctors turn to new developments in personalized medicine try to return the individual to a healthy path.  This is the ultimate generalization of Galileo’s simple parabolic trajectory.

1642    Galileo dies

1656    Huygens invents pendulum clock

1665    Huygens observes “odd kind of sympathy” in synchronized clocks

1673    Huygens publishes Horologium Oscillatorium sive de motu pendulorum

1736    Euler Seven Bridges of Königsberg

1845    Kirchhoff’s circuit laws

1852    Guthrie four color problem

1857    Cayley trees

1858    Hamiltonian cycles

1887    Cajal neural staining microscopy

1913    Michaelis Menten dynamics of enzymes

1924    Berger, Hans: neural oscillations (Berger invented the EEG)

1926    van der Pol dimensioness form of equation

1927    van der Pol periodic forcing

1943    McCulloch and Pits mathematical model of neural nets

1948    Wiener cybernetics

1952    Hodgkin and Huxley action potential model

1952    Turing instability model

1956    Sutherland cyclic AMP

1957    Broadbent and Hammersley bond percolation

1958    Rosenblatt perceptron

1959    Erdös and Renyi random graphs

1962    Cohen EGF discovered

1965    Sebeok coined zoosemiotics

1966    Mesarovich systems biology

1967    Winfree biological rythms and coupled oscillators

1969    Glass Moire patterns in perception

1970    Rodbell G-protein

1971    phrase “strange attractor” coined (Ruelle)

1972    phrase “signal transduction” coined (Rensing)

1975    phrase “chaos theory” coined (Yorke)

1975    Werbos backpropagation

1975    Kuramoto transition

1976    Robert May logistic map

1977    Mackey-Glass equation and dynamical disease

1982    Hopfield network

1990    Strogatz and Murillo pulse-coupled oscillators

1997    Tomita systems biology of a cell

1998    Strogatz and Watts Small World network

1999    Barabasi Scale Free networks

2000    Sequencing of the human genome

Physics in the Age of Contagion. Part 3: Testing and Tracing COVID-19

In the midst of this COVID crisis (and the often botched governmental responses to it), there have been several success stories: Taiwan, South Korea, Australia and New Zealand stand out. What are the secrets to their success? First, is the willingness of the population to accept the seriousness of the pandemic and to act accordingly. Second, is a rapid and coherent (and competent) governmental response. Third, is biotechnology and the physics of ultra-sensitive biomolecule detection.

Antibody Testing

A virus consists a protein package called a capsid that surrounds polymers of coding RNA. Protein molecules on the capsid are specific to the virus and are the key to testing whether a person has been exposed to the virus. These specific molecules are called antigens, and the body produces antibodies — large biomolecules — that are rapidly evolved by the immune system and released into the blood system to recognize and bind to the antigen. The recognition and binding is highly specific (though not perfect) to the capsid proteins of the virus, so that other types of antibodies (produced to fend off other infections) tend not to bind to it. This specificity enables antibody testing.

In principle, all one needs to do is isolate the COVID-19 antigen, bind it to a surface, and run a sample of a patient’s serum (the part of the blood without the blood cells) over the same surface. If the patient has produced antibodies against the COVID-19, these antibodies will attach to the antigens stuck to the surface. After washing away the rest of the serum, what remains are anti-COVID antibodies attached to the antigens bound to the surface. The next step is to determine whether these antibodies have been bound to the surface or not.

Fig. 1 Schematic of an antibody macromolecule. The total height of the molecule is about 3 nanometers. The antigen binding sites are at the ends of the upper arms.

At this stage, there are many possible alternative technologies to detecting the bound antibodies (see section below on the physics of the BioCD for one approach). A conventional detection approach is known as ELISA (Enzyme-linked immunosorbant assay). To detect the bound antibody, a secondary antibody that binds to human antibodies is added to the test well. This secondary antibody contains either a fluorescent molecular tag or an enzyme that causes the color of the well to change (kind of like how a pregnancy test causes a piece of paper to change color). If the COVID antigen binds antibodies from the patient serum, then this second antibody will bind to the first and can be detected by fluorescence or by simple color change.

The technical challenges associated with antibody assays relate to false positives and false negatives. A false positive happens when the serum is too sticky and some antibodies NOT against COVID tend to stick to the surface of the test well. This is called non-specific binding. The secondary antibodies bind to these non-specifically-bound antibodies and a color change reports a detection, when in fact no COVID-specific antibodies were there. This is a false positive — the patient had not been exposed, but the test says they were.

On the other hand, a false negative occurs when the patient serum is possibly too dilute and even though anti-COVID antibodies are present, they don’t bind sufficiently to the test well to be detectable. This is a false negative — the patient had been exposed, but the test says they weren’t. Despite how mature antibody assay technology is, false positives and false negatives are very hard to eliminate. It is fairly common for false rates to be in the range of 5% to 10% even for high-quality immunoassays. The finite accuracy of the tests must be considered when doing community planning for testing and tracking. But the bottom line is that even 90% accuracy on the test can do a lot to stop the spread of the infection. This is because of the geometry of social networks and how important it is to find and isolate the super spreaders.

Social Networks

The web of any complex set of communities and their interconnections aren’t just random. Whether in interpersonal networks, or networks of cities and states and nations, it’s like the world-wide-web where the most popular webpages get the most links. This is the same phenomenon that makes the rich richer and the poor poorer. It produces a network with a few hubs that have a large fraction of the links. A network model that captures this network topology is known as the Barabasi-Albert model for scale-free networks [1]. A scale-free network tends to have one node that has the most links, then a couple of nodes that have a little fewer links, then several more with even fewer, and so on, until there are a vary large number of nodes with just a single link each.

When it comes to pandemics, this type of network topology is both a curse and a blessing. It is a curse, because if the popular node becomes infected it tends to infect a large fraction of the rest of the network because it is so linked in. But it is a blessing, because if that node can be identified and isolated from the rest of the network, then the chance of the pandemic sweeping across the whole network can be significantly reduced. This is where testing and contact tracing becomes so important. You have to know who is infected and who they are connected with. Only then can you isolate the central nodes of the network and make a dent in the pandemic spread.

An example of a Barabasi-Albert network is shown in Fig. 2 fhavingor 128 nodes. Some nodes have many links out (and in) the number of links connecting a node is called the node degree. There are several nodes of very high degree (a degree around 25 in this case) but also very many nodes that have only a single link. It’s the high-degree nodes that matter in a pandemic. If they get infected, then they infect almost the entire network. This scale-free network structure emphasizes the formation of central high-degree nodes. It tends to hold for many social networks, but also can stand for cities across a nation. A city like New York has links all over the country (by flights), while my little town of Lafayette IN might be modeled by a single link to Indianapolis. That same scaling structure is seen across many scales from interactions among nations to interactions among citizens in towns.

Fig. 2 A scale-free network with 128 nodes. A few nodes have high degree, but most nodes have a degree of one.

Isolating the Super Spreaders

In the network of nodes in Fig. 2, each node can be considered as a “compartment” in a multi-compartment SIR model (see my previous blog for the two-compartment SIR model of COVID-19). The infection of each node depends on the SIR dynamics of that node, plus the infections coming in from links other infected nodes. The equations of the dynamics for each node are

where Aab is the adjacency matrix where self-connection is allowed (infection dynamics within a node) and the sums go over all the nodes of the network. In this model, the population of each node is set equal to the degree ka of the node. The spread of the pandemic across the network depends on the links and where the infection begins, but the overall infection is similar to the simple SIR model for a given average network degree

However, if the pandemic starts, but then the highest-degree node (the super spreader) is isolated (by testing and contact tracing), then the saturation of the disease across the network can be decreased in a much greater proportion than simply given by the population of the isolated node. For instance, in the simulation in Fig. 3, a node of degree 20 is removed at 50 days. The fraction of the population that is isolated is only 10%, yet the saturation of the disease across the whole network is decreased by more than a factor of 2.

Fig. 3 Scale-free network of 128 nodes. Solid curve is infection dynamics of the full network. Dashed curve is the infection when the highest-degree node was isolated at 50 days.

In a more realistic model with many more nodes, and full testing to allow the infected nodes and their connections to be isolated, the disease can be virtually halted. This is what was achieved in Taiwan and South Korea. The question is why the United States, with its technologically powerful companies and all their capabilities, was so unprepared or unwilling to achieve the same thing.

Python Code

#!/usr/bin/env python3
# -*- coding: utf-8 -*-
"""
Created on Sat May 11 08:56:41 2019

@author: nolte

D. D. Nolte, Introduction to Modern Dynamics: Chaos, Networks, Space and Time, 2nd ed. (Oxford,2019)
"""

# https://www.python-course.eu/networkx.php
# https://networkx.github.io/documentation/stable/tutorial.html
# https://networkx.github.io/documentation/stable/reference/functions.html

import numpy as np
from scipy import integrate
from matplotlib import pyplot as plt
import networkx as nx
import time
from random import random

tstart = time.time()

plt.close('all')

betap = 0.014;
mu = 0.13;

print('beta = ',betap)
print('betap/mu = ',betap/mu)


N = 128      # 50


facoef = 2
k = 1
nodecouple = nx.barabasi_albert_graph(N, k, seed=None)

indhi = 0
deg = 0
for omloop in nodecouple.node:
    degtmp = nodecouple.degree(omloop)
    if degtmp > deg:
        deg = degtmp
        indhi = omloop
print('highest degree node = ',indhi)
print('highest degree = ',deg)

plt.figure(1)
colors = [(random(), random(), random()) for _i in range(10)]
nx.draw_circular(nodecouple,node_size=75, node_color=colors)
print(nx.info(nodecouple))
        
# function: omegout, yout = coupleN(G)
def coupleN(G,tlim):

    # function: yd = flow_deriv(x_y)
    def flow_deriv(x_y,t0):
        
        N = np.int(x_y.size/2)
        yd = np.zeros(shape=(2*N,))
        ind = -1
        for omloop in G.node:
            ind = ind + 1
            temp1 = -mu*x_y[ind] + betap*x_y[ind]*x_y[N+ind]
            temp2 =  -betap*x_y[ind]*x_y[N+ind]
            linksz = G.node[omloop]['numlink']
            for cloop in range(linksz):
                cindex = G.node[omloop]['link'][cloop]
                indx = G.node[cindex]['index']
                g = G.node[omloop]['coupling'][cloop]
                
                temp1 = temp1 + g*betap*x_y[indx]*x_y[N+ind]
                temp2 = temp2 - g*betap*x_y[indx]*x_y[N+ind]
            
            yd[ind] = temp1
            yd[N+ind] = temp2
                
        return yd
    # end of function flow_deriv(x_y)
    x0 = x_y
    t = np.linspace(0,tlim,tlim)      # 600  300
    y = integrate.odeint(flow_deriv, x0, t)        
    
    return t,y
    # end of function: omegout, yout = coupleN(G)

lnk = np.zeros(shape = (N,), dtype=int)
ind = -1
for loop in nodecouple.node:
    ind = ind + 1
    nodecouple.node[loop]['index'] = ind
    nodecouple.node[loop]['link'] = list(nx.neighbors(nodecouple,loop))
    nodecouple.node[loop]['numlink'] = len(list(nx.neighbors(nodecouple,loop)))
    lnk[ind] = len(list(nx.neighbors(nodecouple,loop)))

gfac = 0.1

ind = -1
for nodeloop in nodecouple.node:
    ind = ind + 1
    nodecouple.node[nodeloop]['coupling'] = np.zeros(shape=(lnk[ind],))
    for linkloop in range (lnk[ind]):
        nodecouple.node[nodeloop]['coupling'][linkloop] = gfac*facoef
            
x_y = np.zeros(shape=(2*N,))   
for loop in nodecouple.node:
    x_y[loop]=0
    x_y[N+loop]=nodecouple.degree(loop)
    #x_y[N+loop]=1
x_y[N-1 ]= 0.01
x_y[2*N-1] = x_y[2*N-1] - 0.01
N0 = np.sum(x_y[N:2*N]) - x_y[indhi] - x_y[N+indhi]
print('N0 = ',N0)
     
tlim0 = 600
t0,yout0 = coupleN(nodecouple,tlim0)                           # Here is the subfunction call for the flow


plt.figure(2)
plt.yscale('log')
plt.gca().set_ylim(1e-3, 1)
for loop in range(N):
    lines1 = plt.plot(t0,yout0[:,loop])
    lines2 = plt.plot(t0,yout0[:,N+loop])
    lines3 = plt.plot(t0,N0-yout0[:,loop]-yout0[:,N+loop])

    plt.setp(lines1, linewidth=0.5)
    plt.setp(lines2, linewidth=0.5)
    plt.setp(lines3, linewidth=0.5)
    

Itot = np.sum(yout0[:,0:127],axis = 1) - yout0[:,indhi]
Stot = np.sum(yout0[:,128:255],axis = 1) - yout0[:,N+indhi]
Rtot = N0 - Itot - Stot
plt.figure(3)
#plt.plot(t0,Itot,'r',t0,Stot,'g',t0,Rtot,'b')
plt.plot(t0,Itot/N0,'r',t0,Rtot/N0,'b')
#plt.legend(('Infected','Susceptible','Removed'))
plt.legend(('Infected','Removed'))
plt.hold

# Repeat but innoculate highest-degree node
x_y = np.zeros(shape=(2*N,))   
for loop in nodecouple.node:
    x_y[loop]=0
    x_y[N+loop]=nodecouple.degree(loop)
    #x_y[N+loop]=1
x_y[N-1] = 0.01
x_y[2*N-1] = x_y[2*N-1] - 0.01
N0 = np.sum(x_y[N:2*N]) - x_y[indhi] - x_y[N+indhi]
     
tlim0 = 50
t0,yout0 = coupleN(nodecouple,tlim0)


# remove all edges from highest-degree node
ee = list(nodecouple.edges(indhi))
nodecouple.remove_edges_from(ee)
print(nx.info(nodecouple))

#nodecouple.remove_node(indhi)        
lnk = np.zeros(shape = (N,), dtype=int)
ind = -1
for loop in nodecouple.node:
    ind = ind + 1
    nodecouple.node[loop]['index'] = ind
    nodecouple.node[loop]['link'] = list(nx.neighbors(nodecouple,loop))
    nodecouple.node[loop]['numlink'] = len(list(nx.neighbors(nodecouple,loop)))
    lnk[ind] = len(list(nx.neighbors(nodecouple,loop)))

ind = -1
x_y = np.zeros(shape=(2*N,)) 
for nodeloop in nodecouple.node:
    ind = ind + 1
    nodecouple.node[nodeloop]['coupling'] = np.zeros(shape=(lnk[ind],))
    x_y[ind] = yout0[tlim0-1,nodeloop]
    x_y[N+ind] = yout0[tlim0-1,N+nodeloop]
    for linkloop in range (lnk[ind]):
        nodecouple.node[nodeloop]['coupling'][linkloop] = gfac*facoef

    
tlim1 = 500
t1,yout1 = coupleN(nodecouple,tlim1)

t = np.zeros(shape=(tlim0+tlim1,))
yout = np.zeros(shape=(tlim0+tlim1,2*N))
t[0:tlim0] = t0
t[tlim0:tlim1+tlim0] = tlim0+t1
yout[0:tlim0,:] = yout0
yout[tlim0:tlim1+tlim0,:] = yout1


plt.figure(4)
plt.yscale('log')
plt.gca().set_ylim(1e-3, 1)
for loop in range(N):
    lines1 = plt.plot(t,yout[:,loop])
    lines2 = plt.plot(t,yout[:,N+loop])
    lines3 = plt.plot(t,N0-yout[:,loop]-yout[:,N+loop])

    plt.setp(lines1, linewidth=0.5)
    plt.setp(lines2, linewidth=0.5)
    plt.setp(lines3, linewidth=0.5)
    

Itot = np.sum(yout[:,0:127],axis = 1) - yout[:,indhi]
Stot = np.sum(yout[:,128:255],axis = 1) - yout[:,N+indhi]
Rtot = N0 - Itot - Stot
plt.figure(3)
#plt.plot(t,Itot,'r',t,Stot,'g',t,Rtot,'b',linestyle='dashed')
plt.plot(t,Itot/N0,'r',t,Rtot/N0,'b',linestyle='dashed')
#plt.legend(('Infected','Susceptible','Removed'))
plt.legend(('Infected','Removed'))
plt.xlabel('Days')
plt.ylabel('Fraction of Sub-Population')
plt.title('Network Dynamics for COVID-19')
plt.show()
plt.hold()

elapsed_time = time.time() - tstart
print('elapsed time = ',format(elapsed_time,'.2f'),'secs')

Caveats and Disclaimers

No effort in the network model was made to fit actual disease statistics. In addition, the network in Figs. 2 and 3 only has 128 nodes, and each node was a “compartment” that had its own SIR dynamics. This is a coarse-graining approach that would need to be significantly improved to try to model an actual network of connections across communities and states. In addition, isolating the super spreader in this model would be like isolating a city rather than an individual, which is not realistic. The value of a heuristic model is to gain a physical intuition about scales and behaviors without being distracted by details of the model.

Postscript: Physics of the BioCD

Because antibody testing has become such a point of public discussion, it brings to mind a chapter of my own life that was closely related to this topic. About 20 years ago my research group invented and developed an antibody assay called the BioCD [2]. The “CD” stood for “compact disc”, and it was a spinning-disk format that used laser interferometry to perform fast and sensitive measurements of antibodies in blood. We launched a start-up company called QuadraSpec in 2004 to commercialize the technology for large-scale human disease screening.

A conventional compact disc consists of about a billion individual nulling interferometers impressed as pits into plastic. When the read-out laser beam straddles one of the billion pits, it experiences a condition of perfect destructive interferences — a zero. But when it was not shining on a pit it experiences high reflection — a one. So as the laser scans across the surface of the disc as it spins, a series of high and low reflections read off bits of information. Because the disc spins very fast, the data rate is very high, and a billion bits can be read in a matter of minutes.

The idea struck me in late 1999 just before getting on a plane to spend a weekend in New York City: What if each pit were like a test tube, so that instead of reading bits of ones and zeros it could read tiny amounts of protein? Then instead of a billion ones and zeros the disc could read a billion protein concentrations. But nulling interferometers are the least sensitive way to measure something sensitively because it operates at a local minimum in the response curve. The most sensitive way to do interferometry is in the condition of phase quadrature when the signal and reference waves are ninety-degrees out of phase and where the response curve is steepest, as in Fig. 4 . Therefore, the only thing you need to turn a compact disc from reading ones and zeros to proteins is to reduce the height of the pit by half. In practice we used raised ridges of gold instead of pits, but it worked in the same way and was extremely sensitive to the attachment of small amounts of protein.

Fig. 4 Principle of the BioCD antibody assay. Reprinted from Ref. [3]

This first generation BioCD was literally a work of art. It was composed of a radial array of gold strips deposited on a silicon wafer. We were approached in 2004 by an art installation called “Massive Change” that was curated by the Vancouver Art Museum. The art installation travelled to Toronto and then to the Museum of Contemporary Art in Chicago, where we went to see it. Our gold-and-silicon BioCD was on display in a section on art in technology.

The next-gen BioCDs were much simpler, consisting simply of oxide layers on silicon wafers, but they were much more versatile and more sensitive. An optical scan of a printed antibody spot on a BioCD is shown in Fig. 5 The protein height is only about 1 nanometer (the diameter of the spot is 100 microns). Interferometry can measure a change in the height of the spot (caused by binding antibodies from patient serum) by only about 10 picometers averaged over the size of the spot. This exquisite sensitivity enabled us to detect tiny fractions of blood-born antigens and antibodies at the level of only a nanogram per milliliter.

Fig. 5 Interferometric measurement of a printed antibody spot on a BioCD. The spot height is about 1 nanomater and the diameter is about 100 microns. Interferometry can measure a change of height by about 10 picometers averaged over the spot.

The real estate on a 100 mm diameter disc was sufficient to do 100,000 antibody assays, which would be 256 protein targets across 512 patients on a single BioCD that would take only a few hours to finish reading!

Fig. 6 A single BioCD has the potential to measure hundreds of proteins or antibodies per patient with hundreds of patients per disc.

The potential of the BioCD for massively multiplexed protein measurements made it possible to imagine testing a single patient for hundreds of diseases in a matter of hours using only a few drops of blood. Furthermore, by being simple and cheap, the test would allow people to track their health over time to look for emerging health trends.

If this sounds familiar to you, you’re right. That’s exactly what the notorious company Theranos was promising investors 10 years after we first proposed this idea. But here’s the difference: We learned that the tech did not scale. It cost us $10M to develop a BioCD that could test for just 4 diseases. And it would cost more than an additional $10M to get it to 8 diseases, because the antibody chemistry is not linear. Each new disease that you try to test creates a combinatorics problem of non-specific binding with all the other antibodies and antigens. To scale the test up to 100 diseases on the single platform using only a few drops of blood would have cost us more than $1B of R&D expenses — if it was possible at all. So we stopped development at our 4-plex product and sold the technology to a veterinary testing company that uses it today to test for diseases like heart worm and Lymes disease in blood samples from pet animals.

Five years after we walked away from massively multiplexed antibody tests, Theranos proposed the same thing and took in more than $700M in US investment, but ultimately produced nothing that worked. The saga of Theranos and its charismatic CEO Elizabeth Holmes has been the topic of books and documentaries and movies like “The Inventor: Out for Blood in Silicon Valley” and a rumored big screen movie starring Jennifer Lawrence as Holmes.

The bottom line is that antibody testing is a difficult business, and ramping up rapidly to meet the demands of testing and tracing COVID-19 is going to be challenging. The key is not to demand too much accuracy per test. False positives are bad for the individual, because it lets them go about without immunity and they might get sick, and false negatives are bad, because it locks them in when they could be going about. But if an inexpensive test of only 90% accuracy (a level of accuracy that has already been called “unreliable” in some news reports) can be brought out in massive scale so that virtually everyone can be tested, and tested repeatedly, then the benefit to society would be great. In the scaling networks that tend to characterize human interactions, all it takes is a few high-degree nodes to be isolated to make infection rates plummet.

References

[1] A. L. Barabasi and R. Albert, “Emergence of scaling in random networks,” Science, vol. 286, no. 5439, pp. 509-512, Oct 15 (1999)

[2] D. D. Nolte, “Review of centrifugal microfluidic and bio-optical disks,” Review Of Scientific Instruments, vol. 80, no. 10, p. 101101, Oct (2009)

[3] D. D. Nolte and F. E. Regnier, “Spinning-Disk Interferometry: The BioCD,” Optics and Photonics News, no. October 2004, pp. 48-53, (2004)

Physics in the Age of Contagion. Part 2: The Second Wave of COVID-19

Since my last Blog on the bifurcation physics of COVID-19, most of the US has approached the crest of “the wave”, with the crest arriving sooner in hot spots like New York City and a few weeks later in rural areas like Lafayette, Indiana where I live. As of the posting of this Blog, most of the US is in lock-down with only a few hold-out states. Fortunately, this was sufficient to avoid the worst case scenarios of my last Blog, but we are still facing severe challenges.

There is good news! The second wave can be managed and minimized if we don’t come out of lock-down too soon.

One fall-out of the (absolutely necessary) lock-down is the serious damage done to the economy that is now in its greatest retraction since the Great Depression. The longer the lock-down goes, the deeper the damage and the longer to recover. The single-most important question at this point in time, as we approach the crest, is when we can emerge from lock down? This is a critical question. If we emerge too early, then the pandemic will re-kindle into a second wave that could exceed the first. But if we emerge later than necessary, then the economy may take a decade to fully recover. We need a Goldilocks solution: not too early and not too late. How do we assess that?

The Value of Qualitative Physics

In my previous Blog I laid out a very simple model called the Susceptible-Infected-Removed (SIR) model and provided a Python program whose parameters can be tweaked to explore the qualitatitive behavior of the model, answering questions like: What is the effect of longer or shorter quarantine periods? What role does social distancing play in saving lives? What happens if only a small fraction of the population pays attention and practice social distancing?

It is necessary to wait from releasing the lock-down at least several weeks after the crest has passed to avoid the second wave.

It is important to note that none of the parameters in that SIR model are reliable and no attempt was made to fit the parameters to the actual data. To expert epidemiological modelers, this simplistic model is less than useless and potentially dangerous if wrong conclusions are arrived at and disseminated on the internet.

But here is the point: The actual numbers are less important than the functional dependences. What matters is how the solution changes as a parameter is changed. The Python programs allow non-experts to gain an intuitive understanding of the qualitative physics of the pandemic. For instance, it is valuable to gain a feeling of how sensitive the pandemic is to small changes in parameters. This is especially important because of the bifurcation physics of COVID-19 where very small changes can cause very different trajectories of the population dynamics.

In the spirit of the value of qualitative physics, I am extending here that simple SIR model to a slightly more sophisticated model that can help us understand the issues and parametric dependences of this question of when to emerge from lock-down. Again, no effort is made to fit actual data of this pandemic, but there are still important qualitative conclusions to be made.

The Two-Compartment SIR Model of COVID-19

To approach a qualitative understanding of what happens by varying the length of time of the country-wide shelter-in-place, it helps to think of two cohorts of the public: those who are compliant and conscientious valuing the lives of others, and those who don’t care and are non-compliant.

Fig. 1 Two-compartment SIR model for compliant and non-compliant cohorts.

These two cohorts can each be modeled separately by their own homogeneous SIR models, but with a coupling between them because even those who shelter in place must go out for food and medicines. The equations of this two-compartment model are

where n and q refer to the non-compliant and the compliant cohorts, respectively. I and S are the susceptible populations. The coupling parameters are knn for the coupling between non-compliants individuals, knq for the effect of the compliant individuals on the non-compliant, kqn for the effect of the non-compliant individuals on the compliant, and kqq for the effect of the compliant cohort on themselves.

There are two time frames for the model. The first time frame is the time of lock-down when the compliant cohort is sheltering in place and practicing good hygiene, but they still need to go out for food and medicines. (This model does not include the first responders. They are an important cohort, but do not make up a large fraction of the national population). The second time frame is after the lock-down is removed. Even then, good practices by the compliant group are expected to continue with the purpose to lower infections among themselves and among others.

This two-compartment model has roughly 8 adjustable parameters, all of which can be varied to study their effects on the predictions. None of them are well known, but general trends still can be explored.

Python Code

#!/usr/bin/env python3
# -*- coding: utf-8 -*-
"""
Created on Sat March 21 2020

@author: nolte

D. D. Nolte, Introduction to Modern Dynamics: Chaos, Networks, Space and Time, 2nd ed. (Oxford,2019)

"""

import numpy as np
from scipy import integrate
from matplotlib import pyplot as plt

plt.close('all')

print(' ')
print('SIR.py')

def solve_flow(param,max_time=1000.0):

    def flow_deriv(x_y_z_w,tspan):
        In, Sn, Iq, Sq = x_y_z_w
        
        Inp = -mu*In + beta*knn*In*Sn + beta*knq*Iq*Sn
        Snp = -beta*knn*In*Sn - beta*knq*Iq*Sn
        
        Iqp = -mu*Iq + beta*kqn*In*Sq + beta*kqq*Iq*Sq
        Sqp = -beta*kqn*In*Sq - beta*kqq*Iq*Sq
        
        return [Inp, Snp, Iqp, Sqp]
    
    x0 = [In0, Sn0, Iq0, Sq0]
    
    # Solve for the trajectories
    t = np.linspace(tlo, thi, thi-tlo)
    x_t = integrate.odeint(flow_deriv, x0, t)

   
    return t, x_t

beta = 0.02   # infection rate
dill = 5      # mean days infectious
mu = 1/dill   # decay rate
fnq = 0.3     # fraction not quarantining
fq = 1-fnq    # fraction quarantining
P = 330       # Population of the US in millions
mr = 0.002    # Mortality rate
dq = 90       # Days of lock-down (this is the key parameter)

# During quarantine
knn = 50      # Average connections per day for non-compliant group among themselves
kqq = 0       # Connections among compliant group
knq = 0       # Effect of compliaht group on non-compliant
kqn = 5       # Effect of non-clmpliant group on compliant

initfrac = 0.0001          # Initial conditions:
In0 = initfrac*fnq         # infected non-compliant
Sn0 = (1-initfrac)*fnq     # susceptible non-compliant
Iq0 = initfrac*fq          # infected compliant
Sq0 = (1-initfrac)*fq      # susceptivle compliant

tlo = 0
thi = dq

param = (mu, beta, knn, knq, kqn, kqq)    # flow parameters

t1, y1 = solve_flow(param)

In1 = y1[:,0]
Sn1 = y1[:,1]
Rn1 = fnq - In1 - Sn1
Iq1 = y1[:,2]
Sq1 = y1[:,3]
Rq1 = fq - Iq1 - Sq1

# Lift the quarantine: Compliant group continues social distancing
knn = 50      # Adjusted coupling parameters
kqq = 5
knq = 20
kqn = 15

fin1 = len(t1)
In0 = In1[fin1-1]
Sn0 = Sn1[fin1-1]
Iq0 = Iq1[fin1-1]
Sq0 = Sq1[fin1-1]

tlo = fin1
thi = fin1 + 365-dq

param = (mu, beta, knn, knq, kqn, kqq)

t2, y2 = solve_flow(param)

In2 = y2[:,0]
Sn2 = y2[:,1]
Rn2 = fnq - In2 - Sn2
Iq2 = y2[:,2]
Sq2 = y2[:,3]
Rq2 = fq - Iq2 - Sq2

fin2 = len(t2)
t = np.zeros(shape=(fin1+fin2,))
In = np.zeros(shape=(fin1+fin2,))
Sn = np.zeros(shape=(fin1+fin2,))
Rn = np.zeros(shape=(fin1+fin2,))
Iq = np.zeros(shape=(fin1+fin2,))
Sq = np.zeros(shape=(fin1+fin2,))
Rq = np.zeros(shape=(fin1+fin2,))

t[0:fin1] = t1
In[0:fin1] = In1
Sn[0:fin1] = Sn1
Rn[0:fin1] = Rn1
Iq[0:fin1] = Iq1
Sq[0:fin1] = Sq1
Rq[0:fin1] = Rq1


t[fin1:fin1+fin2] = t2
In[fin1:fin1+fin2] = In2
Sn[fin1:fin1+fin2] = Sn2
Rn[fin1:fin1+fin2] = Rn2
Iq[fin1:fin1+fin2] = Iq2
Sq[fin1:fin1+fin2] = Sq2
Rq[fin1:fin1+fin2] = Rq2

plt.figure(1)
lines = plt.semilogy(t,In,t,Iq,t,(In+Iq))
plt.ylim([0.0001,.1])
plt.xlim([0,thi])
plt.legend(('Non-compliant','Compliant','Total'))
plt.setp(lines, linewidth=0.5)
plt.xlabel('Days')
plt.ylabel('Infected')
plt.title('Infection Dynamics for COVID-19 in US')
plt.show()

plt.figure(2)
lines = plt.semilogy(t,Rn*P*mr,t,Rq*P*mr)
plt.ylim([0.001,1])
plt.xlim([0,thi])
plt.legend(('Non-compliant','Compliant'))
plt.setp(lines, linewidth=0.5)
plt.xlabel('Days')
plt.ylabel('Deaths')
plt.title('Total Deaths for COVID-19 in US')
plt.show()

D = P*mr*(Rn[fin1+fin2-1] + Rq[fin1+fin2-1])
print('Deaths = ',D)

plt.figure(3)
lines = plt.semilogy(t,In/fnq,t,Iq/fq)
plt.ylim([0.0001,.1])
plt.xlim([0,thi])
plt.legend(('Non-compliant','Compliant'))
plt.setp(lines, linewidth=0.5)
plt.xlabel('Days')
plt.ylabel('Fraction of Sub-Population')
plt.title('Population Dynamics for COVID-19 in US')
plt.show()

Trends

The obvious trend to explore is the effect of changing the quarantine period. Fig. 2 shows the results of a an early release from shelter-in-place compared to pushing the release date one month longer. The trends are:

  • If the lock-down is released early, the second wave can be larger than the first wave
  • If the lock-down is released early, the compliant cohort will be mostly susceptible and will have the majority of new cases
  • There are 40% more deaths when the lock-down is released early

If the lock-down is ended just after the crest, this is too early. It is necessary to wait at least several weeks after the crest has passed to avoid the second wave. There are almost 40% more deaths for the 90-day period than the 120-day period. In addition, for the case when the quarantine is stopped too early, the compliant cohort, since they are the larger fraction and are mostly susceptible, will suffer a worse number of new infections than the non-compliant group who put them at risk in the first place. In addition, the second wave for the compliant group would be worse than the first wave. This would be a travesty! But by pushing the quarantine out by just 1 additional month, the compliant group will suffer fewer total deaths than the non-compliant group. Most importantly, the second wave would be substantially smaller than the first wave for both cohorts.

Fig. 2 Comparison of 90-day quarantine versus 120-day quarantine for the compliant and non-compliant cohort of individuals . When the ban is lifted too soon, the second wave can be bigger than the first. This model assumes that 30% of the population are non-compliant and that the compliant group continues to practice social distancing.

The lesson from this simple model is simple: push the quarantine date out as far as the economy can allow! There is good news! The second wave can be managed and minimized if we don’t come out of lock-down too soon.

Caveats and Disclaimers

This model is purely qualitative and only has value for studying trends that depend on changing parameters. Absolute numbers are not meant to be taken too seriously. For instance, the total number of deaths in this model are about 2x larger than what we are hearing from Dr. Fauci of NIAID at this time, so this simple model overestimates fatalities. Also, it doesn’t matter whether the number of quarantine days should be 60, 90 or 120 … what matters is that an additional month makes a large difference in total number of deaths. If someone does want to model the best possible number of quarantine days — the Goldilocks solution — then they need to get their hands on a professional epidemiological model (or an actual epidemiologist). The model presented here is not appropriate for that purpose.

Note added in postscript on April 8: Since posting the original blog on April 6, Dr, Fauci announced that as many as 90% of individuals are practicing some form of social distancing. In addition, many infections are not being reported because of lack of testing, which means that the mortality rate is lower than thought. Therefore, I have changed the mortality rate and figures with numbers that better reflect the current situation (that is changing daily), but still without any attempt to fit the numerous other parameters.

Physics in the Age of Contagion: The Bifurcation of COVID-19

We are at War! That may sound like a cliche, but more people in the United States may die over the next year from COVID-19 than US soldiers have died in all the wars ever fought in US history. It is a war against an invasion by an alien species that has no remorse and gives no quarter. In this war, one of our gravest enemies, beyond the virus, is misinformation. The Internet floods our attention with half-baked half-truths. There may even be foreign powers that see this time of crisis as an opportunity to sow fear through disinformation to divide the country.

Because of the bifurcation physics of the SIR model of COVID-19, small changes in personal behavior (if everyone participates) can literally save Millions of lives!

At such times, physicists may be tapped to help the war effort. This is because physicists have unique skill sets that help us see through the distractions of details to get to the essence of the problem. Our solutions are often back-of-the-envelope, but that is their strength. We can see zeroth-order results stripped bare of all the obfuscating minutia.

One way physicists can help in this war is to shed light on how infections percolate through a population and to provide estimates on the numbers involved. Perhaps most importantly, we can highlight what actions ordinary citizens can take that best guard against the worst-case scenarios of the pandemic. The zeroth-oder solutions may not say anything new that the experts don’t already know, but it may help spread the word of why such simple actions as shelter-in-place may save millions of lives.

The SIR Model of Infection

One of the simplest models for infection is the so-called SIR model that stands for Susceptible-Infected-Removed. This model is an averaged model (or a mean-field model) that disregards the fundamental network structure of human interactions and considers only averages. The dynamical flow equations are very simple

where I is the infected fraction of the population, and S is the susceptible fraction of the population. The coefficient μ is the rate at which patients recover or die, <k> is the average number of “links” to others, and β is the infection probability per link per day. The total population fraction is give by the constraint

where R is the removed population, most of whom will be recovered, but some fraction will have passed away. The number of deaths is

where m is the mortality rate, and Rinf is the longterm removed fraction of the population after the infection has run its course.

The nullclines, the curves along which the time derivatives vanish, are

Where the first nullcline intersects the third nullcline is the only fixed point of this simple model

The phase space of the SIR flow is shown in Fig. 1 plotted as the infected fraction as a function of the susceptible fraction. The diagonal is the set of initial conditions where R = 0. Each initial condition on the diagonal produces a dynamical trajectory. The dashed trajectory that starts at (1,0) is the trajectory for a new disease infecting a fully susceptible population. The trajectories terminate on the I = 0 axis at long times when the infection dies out. In this model, there is always a fraction of the population who never get the disease, not through unusual immunity, but through sheer luck.

Fig. 1 Phase space of the SIR model. The single fixed point has “marginal” stability, but leads to a finite number of of the population who never are infected. The dashed trajectory is the trajectory of the infection starting with a single case. (Adapted from “Introduction to Modern Dynamics” (Oxford University Press, 2019))

The key to understanding the scale of the pandemic is the susceptible fraction at the fixed point S*. For the parameters chosen to plot Fig. 1, the value of S* is 1/4, or β<k> = 4μ. It is the high value of the infection rate β<k> relative to the decay rate of the infection μ that allows a large fraction of the population to become infected. As the infection rate gets smaller, the fixed point S* moves towards unity on the horizontal axis, and less of the population is infected.

As soon as S* exceeds unity, for the condition

then the infection cannot grow exponentially and will decay away without infecting an appreciable fraction of the population. This condition represents a bifurcation in the infection dynamics. It means that if the infection rate can be reduced below the recovery rate, then the pandemic fades away. (It is important to point out that the R0 of a network model (the number of people each infected person infects) is analogous to the inverse of S*. When R0 > 1 then the infection spreads, just as when S* < 1, and vice versa.)

This bifurcation condition makes the strategy for fighting the pandemic clear. The parameter μ is fixed by the virus and cannot be altered. But the infection probability per day per social link, β, can be reduced by clean hygiene:

  • Don’t shake hands
  • Wash your hands often and thoroughly
  • Don’t touch your face
  • Cover your cough or sneeze in your elbow
  • Wear disposable gloves
  • Wipe down touched surfaces with disinfectants

And the number of contacts per person, <k>, can be reduced by social distancing:

  • No large gatherings
  • Stand away from others
  • Shelter-in-place
  • Self quarantine

The big question is: can the infection rate be reduced below the recovery rate through the actions of clean hygiene and social distancing? If there is a chance that it can, then literally millions of lives can be saved. So let’s take a look at COVID-19.

The COVID-19 Pandemic

To get a handle on modeling the COVID-19 pandemic using the (very simplistic) SIR model, one key parameter is the average number of people you are connected to, represented by <k>. These are not necessarily the people in your social network, but also includes people who may touch a surface you touched earlier, or who touched a surface you later touch yourself. It also includes anyone in your proximity who has coughed or sneezed in the past few minutes. The number of people in your network is a topic of keen current interest, but is surprisingly hard to pin down. For the sake of this model, I will take the number <k> = 50 as a nominal number. This is probably too small, but it is compensated by the probability of infection given by a factor r and by the number of days that an individual is infectious.

The spread is helped when infectious people go about their normal lives infecting others. But if a fraction of the population self quarantines, especially after they “may” have been exposed, then the effective number of infectious dinf days per person can be decreased. A rough equation that captures this is

where fnq is the fraction of the population that does NOT self quarantine, dill is the mean number of days a person is ill (and infectious), and dq is the number of days quarantined. This number of infectious days goes into the parameter β.

where r = 0.0002 infections per link per day2 , which is a very rough estimate of the coefficient for COVID-19.

It is clear why shelter-in-place can be so effective, especially if the number of days quarantined is equal to the number of days a person is ill. The infection could literally die out if enough people self quarantine by pushing the critical value S* above the bifurcation threshold. However, it is much more likely that large fractions of people will continue to move about. A simulation of the “wave” that passes through the US is shown in Fig. 2 (see the Python code in the section below for parameters). In this example, 60% of the population does NOT self quarantine. The wave peaks approximately 150 days after the establishment of community spread.

Fig. 2 Population dynamics for the US spread of COVID-19. The fraction that is infected represents a “wave” that passes through a community. In this simulation fnq = 60%. The total US dead after the wave has passed is roughly 2 Million in this simulation.

In addition to shelter-in-place, social distancing can have a strong effect on the disease spread. Fig. 3 shows the number of US deaths as a function of the fraction of the population who do NOT self-quarantine for a series of average connections <k>. The bifurcation effect is clear in this graph. For instance, if <k> = 50 is a nominal value, then if 85% of the population would shelter-in-place for 14 days, then the disease would fall below threshold and only a small number of deaths would occur. But if that connection number can be dropped even to <k> = 40, then only 60% would need to shelter-in-place to avoid the pandemic. By contrast, if 80% of the people don’t self-quarantine, and if <k> = 40, then there could be 2 Million deaths in the US by the time the disease has run its course.

Because of the bifurcation physics of this SIR model of COVID-19, small changes in personal behavior (if everyone participates) can literally save Millions of lives!

Fig. 3 Bifurcation plot of the number of US deaths as a function of the fraction of the population who do NOT shelter-in-place for different average links per person. At 20 links per person, the contagion could be contained. However, at 60 links per person, nearly 90% of the population would need to quarantine for at least 14 days to stop the spread.

There has been a lot said about “flattening the curve”, which is shown in Fig. 4. There are two ways that flattening the curve saves overall lives: 1) it keeps the numbers below the threshold capacity of hospitals; and 2) it decreases the total number infected and hence decreases the total dead. When the number of critical patients exceeds hospital capacity, the mortality rate increases. This is being seen in Italy where the hospitals have been overwhelmed and the mortality rate has risen from a baseline of 1% or 2% to as large as 8%. Flattening the curve is achieved by sheltering in place, personal hygiene and other forms of social distancing. The figure shows a family of curves for different fractions of the total population who shelter in place for 14 days. If more than 70% of the population shelters in place for 14 days, then the curve not only flattens … it disappears!

Fig. 4 Flattening the curve for a range of fractions of the population that shelters in place for 14 days. (See Python code for parameters.)

SIR Python Code

#!/usr/bin/env python3
# -*- coding: utf-8 -*-
"""
Created on Sat March 21 2020
@author: nolte
D. D. Nolte, Introduction to Modern Dynamics: Chaos, Networks, Space and Time, 2nd ed. (Oxford,2019)
"""

import numpy as np
from scipy import integrate
from matplotlib import pyplot as plt

plt.close('all')

print(' ')
print('SIR.py')

def solve_flow(param,max_time=1000.0):

    def flow_deriv(x_y,tspan,mu,betap):
        x, y = x_y
        
        return [-mu*x + betap*x*y,-betap*x*y]
    
    x0 = [del1, del2]
    
    # Solve for the trajectories
    t = np.linspace(0, int(tlim), int(250*tlim))
    x_t = integrate.odeint(flow_deriv, x0, t, param)

   
    return t, x_t


r = 0.0002    # 0.0002
k = 50        # connections  50
dill = 14     # days ill 14
dpq = 14      # days shelter in place 14
fnq = 0.6     # fraction NOT sheltering in place
mr0 = 0.01    # mortality rate
mr1 = 0.03     # extra mortality rate if exceeding hospital capacity
P = 330       # population of US in Millions
HC = 0.003    # hospital capacity

dinf = fnq*dill + (1-fnq)*np.exp(-dpq/dill)*dill;

betap = r*k*dinf;
mu = 1/dill;

print('beta = ',betap)
print('dinf = ',dinf)
print('beta/mu = ',betap/mu)
          
del1 = .001         # infected
del2 = 1-del1       # susceptible

tlim = np.log(P*1e6/del1)/betap + 50/betap

param = (mu, betap)    # flow parameters

t, y = solve_flow(param)
I = y[:,0]
S = y[:,1]
R = 1 - I - S

plt.figure(1)
lines = plt.semilogy(t,I,t,S,t,R)
plt.ylim([0.001,1])
plt.xlim([0,tlim])
plt.legend(('Infected','Susceptible','Removed'))
plt.setp(lines, linewidth=0.5)
plt.xlabel('Days')
plt.ylabel('Fraction of Population')
plt.title('Population Dynamics for COVID-19 in US')
plt.show()

mr = mr0 + mr1*(0.2*np.max(I)-HC)*np.heaviside(0.2*np.max(I),HC)
Dead = mr*P*R[R.size-1]
print('US Dead = ',Dead)

D = np.zeros(shape=(100,))
x = np.zeros(shape=(100,))
for kloop in range(0,5):
    for floop in range(0,100):
        
        fnq = floop/100
        
        dinf = fnq*dill + (1-fnq)*np.exp(-dpq/dill)*dill;
        
        k = 20 + kloop*10
        betap = r*k*dinf
        
        tlim = np.log(P*1e6/del1)/betap + 50/betap

        param = (mu, betap)    # flow parameters

        t, y = solve_flow(param)       
        I = y[:,0]
        S = y[:,1]
        R = 1 - I - S
        
        mr = mr0 + mr1*(0.2*np.max(I)-HC)*np.heaviside(0.2*np.max(I),HC)

        D[floop] = mr*P*R[R.size-1]
        x[floop] = fnq
        
    plt.figure(2)
    lines2 = plt.plot(x,D)
    plt.setp(lines2, linewidth=0.5)

plt.ylabel('US Million Deaths')
plt.xlabel('Fraction NOT Quarantining')
plt.title('Quarantine and Distancing')        
plt.legend(('20','30','40','50','60','70'))
plt.show()    


label = np.zeros(shape=(9,))
for floop in range(0,8):
    
    fq = floop/10.0
    
    dinf = (1-fq)*dill + fq*np.exp(-dpq/dill)*dill;
    
    k = 50
    betap = r*k*dinf
    
    tlim = np.log(P*1e6/del1)/betap + 50/betap

    param = (mu, betap)    # flow parameters

    t, y = solve_flow(param)       
    I = y[:,0]
    S = y[:,1]
    R = 1 - I - S
    
    plt.figure(3)
    lines2 = plt.plot(t,I*P)
    plt.setp(lines2, linewidth=0.5)
    label[floop]=fq

plt.legend(label)
plt.ylabel('US Millions Infected')
plt.xlabel('Days')
plt.title('Flattening the Curve')       

You can run this Python code yourself and explore the effects of changing the parameters. For instance, the mortality rate is modeled to increase when the number of hospital beds is exceeded by the number of critical patients. This coefficient is not well known and hence can be explored numerically. Also, the infection rate r is not known well, nor the average number of connections per person. The effect of longer quarantines can also be tested relative to the fraction who do not quarantine at all. Because of the bifurcation physics of the disease model, large changes in dynamics can occur for small changes in parameters when the dynamics are near the bifurcation threshold.

Caveats and Disclaimers

This SIR model of COVID-19 is an extremely rough tool that should not be taken too literally. It can be used to explore ideas about the general effect of days quarantined, or changes in the number of social contacts, but should not be confused with the professional models used by epidemiologists. In particular, this mean-field SIR model completely ignores the discrete network character of person-to-person spread. It also homogenizes the entire country, where is it blatantly obvious that the dynamics inside New York City are very different than the dynamics in rural Indiana. And the elimination of the epidemic, so that it would not come back, would require strict compliance for people to be tested (assuming there are enough test kits) and infected individuals to be isolated after the wave has passed.