Posts by David D. Nolte

E. M. Purcell Distinguished Professor of Physics and Astronomy at Purdue University

Edward Lorenz’ Chaotic Butterfly

The butterfly effect is one of the most widely known principles of chaos theory. It has become a meme, propagating through popular culture in movies, books, TV shows and even casual conversation.

Can a butterfly flapping its wings in Florida send a hurricane to New York?

The origin of the butterfly effect is — not surprisingly — the image of a butterfly-like set of trajectories that was generated, in one of the first computer simulations of chaos theory, by Edward Lorenz.

Lorenz’ Butterfly

Excerpted from Galileo Unbound (Oxford, 2018) pg. 215

When Edward Lorenz (1917 – 2008) was a child, he memorized all perfect squares up to ten thousand.  This obvious interest in mathematics led him to a master’s degree in the subject at Harvard in 1940 under the supervision of Georg Birkhoff.  Lorenz’s master’s thesis was on an aspect of Riemannian geometry, but his foray into nonlinear dynamics was triggered by the intervention of World War II.  Only a few months before receiving his doctorate in mathematics from Harvard, the Japanese bombed Pearl Harbor. 

Lorenz left the PhD program at Harvard to join the United States Army Air Force to train as a weather forecaster in early 1942, and he took courses on forecasting and meteorology at MIT.  After receiving a second master’s degree, this time in meteorology, Lorenz was posted to Hawaii, then to Saipan and finally to Guam.  His area of expertise was in high-level winds, which were important for high-altitude bombing missions during the final months of the war in the Pacific.  After the Japanese surrender, Lorenz returned to MIT, where he continued his studies in meteorology, receiving his doctorate degree in 1948 with a thesis on the application of fluid dynamical equations to predict the motion of storms. 

One of Lorenz’ colleagues at MIT was Norbert Wiener (1894 – 1964), with whom he sometimes played chess during lunch at the faculty club.  Wiener had published his landmark book Cybernetics: Control and Communication in the Animal and Machine in 1949 which arose out of the apparently mundane problem of gunnery control during the Second World War.  As an abstract mathematician, Wiener attempted to apply his cybernetic theory to the complexities of weather, but he developed a theorem concerning nonlinear fluid dynamics which appeared to show that linear interpolation, of sufficient resolution, would suffice for weather forecasting, possibly even long-range forecasting.  Many on the meteorology faculty embraced this theorem because it fell in line with common practices of the day in which tomorrow’s weather was predicted using linear regression on measurements taken today.  However, Lorenz was skeptical, having acquired a detailed understanding of atmospheric energy cascades as larger vortices induced smaller vortices all the way down to the molecular level, dissipating as heat, and then all the way back up again as heat drove large-scale convection.  This was clearly not a system that would yield to linearization.  Therefore, Lorenz determined to solve nonlinear fluid dynamics models to test this conjecture.

Even with a computer in hand, the atmospheric equations needed to be simplified to make the calculations tractable.  Lorenz was more a scientist than an engineer, and more of a meteorologist than a forecaster.  He did not hesitate to make simplifying assumptions if they retained the correct phenomenological behavior, even if they no longer allowed for accurate weather predictions. 

He had simplified the number of atmospheric equations down to twelve.  Progress was good, and by 1961, he had completed a large initial numerical study.  He focused on nonperiodic solutions, which he suspected would deviate significantly from the predictions made by linear regression, and this hunch was vindicated by his numerical output.  One day, as he was testing his results, he decided to save time by starting the computations midway by using mid-point results from a previous run as initial conditions.  He typed in the three-digit numbers from a paper printout and went down the hall for a cup of coffee.  When he returned, he looked at the printout of the twelve variables and was disappointed to find that they were not related to the previous full-time run.  He immediately suspected a faulty vacuum tube, as often happened.  But as he looked closer at the numbers, he realized that, at first, they tracked very well with the original run, but then began to diverge more and more rapidly until they lost all connection with the first-run numbers.  His initial conditions were correct to a part in a thousand, but this small error was magnified exponentially as the solution progressed.

At this point, Lorenz recalled that he “became rather excited”.  He was looking at a complete breakdown of predictability in atmospheric science.  If radically different behavior arose from the smallest errors, then no measurements would ever be accurate enough to be useful for long-range forecasting.  At a more fundamental level, this was a break with a long-standing tradition in science and engineering that clung to the belief that small differences produced small effects.  What Lorenz had discovered, instead, was that the deterministic solution to his 12 equations was exponentially sensitive to initial conditions (known today as SIC). 

The Lorenz Equations

Over the following months, he was able to show that SIC was a result of the nonperiodic solutions.  The more Lorenz became familiar with the behavior of his equations, the more he felt that the 12-dimensional trajectories had a repeatable shape.  He tried to visualize this shape, to get a sense of its character, but it is difficult to visualize things in twelve dimensions, and progress was slow.  Then Lorenz found that when the solution was nonperiodic (the necessary condition for SIC), four of the variables settled down to zero, leaving all the dynamics to the remaining three variables. 

Lorenz narrowed the equations of atmospheric instability down to three variables: the stream function, the change in temperature and the deviation in linear temperature. The only parameter in the stream function is something known as the Prandtl Number. This is a dimensionless number which is the ratio of the kinetic viscosity of the fluid to its thermal diffusion coefficient and is a physical property of the fluid. The only parameter in the change in temperature is the Rayleigh Number which is a dimensionless parameter proportional to the difference in temperature between the top and the bottom of the fluid layer. The final parameter, in the equation for the deviation in linear temperature, is the ratio of the height of the fluid layer to the width of the convection rolls. The final simplified model is given by the flow equations

The Butterfly

Lorenz finally had a 3-variable dynamical system that displayed chaos.  Moreover, it had a three-dimensional state space that could be visualized directly.  He ran his simulations, exploring the shape of the trajectories in three-dimensional state space for a wide range of initial conditions, and the trajectories did indeed always settle down to restricted regions of state space.  They relaxed in all cases to a sort of surface that was elegantly warped, with wing-like patterns like a butterfly, as the state point of the system followed its dynamics through time.  The attractor of the Lorenz equations was strange.  Later, in 1971, David Ruelle (1935 – ), a Belgian-French mathematical physicist named this a “strange attractor”, and this name has become a standard part of the language of the theory of chaos.

The first graphical representation of the butterfly attractor is shown in Fig. 1 drawn by Lorenz for his 1963 publication.

Fig. 1 Excerpts of the title, abstract and sections of Lorenz’ 1963 paper. His three-dimensional flow equations produce trajectories that relax onto a three-dimensional “strange attractor“.

Using our modern plotting ability, the 3D character of the butterfly is shown in Fig. 2

Fig. 2 Edward Lorenz’ chaotic butterfly

A projection onto the x-y plane is shown in Fig. 3. In the full 3D state space the trajectories never overlap, but in the projection onto a 2D plane the trajectories are moving above and below each other.

Fig. 3 Projection of the butterfly onto the x-y plane centered on the origin.

The reason it is called a strange attractor is because all initial conditions relax onto the strange attractor, yet every trajectory on the strange attractor separates exponentially from neighboring trajectories, displaying the classic SIC property of chaos. So here is an elegant collection of trajectories that are certainly not just random noise, yet detailed prediction is still impossible. Deterministic chaos has significant structure, and generates beautiful patterns, without actual “randomness”.

Python Program

#!/usr/bin/env python3
# -*- coding: utf-8 -*-
Created on Mon Apr 16 07:38:57 2018

@author: nolte
Introduction to Modern Dynamics, 2nd edition (Oxford University Press, 2019)

Lorenz model of atmospheric turbulence
import numpy as np
import matplotlib as mpl

import matplotlib.colors as colors
import as cmx

from scipy import integrate
from matplotlib import cm
from matplotlib import pyplot as plt
from mpl_toolkits.mplot3d import Axes3D
from matplotlib.colors import cnames
from matplotlib import animation


jet = cm = plt.get_cmap('jet') 
values = range(10)
cNorm  = colors.Normalize(vmin=0, vmax=values[-1])
scalarMap = cmx.ScalarMappable(norm=cNorm, cmap=jet)

def solve_lorenz(N=12, angle=0.0, max_time=8.0, sigma=10.0, beta=8./3, rho=28.0):

    fig = plt.figure()
    ax = fig.add_axes([0, 0, 1, 1], projection='3d')

    # prepare the axes limits
    ax.set_xlim((-25, 25))
    ax.set_ylim((-35, 35))
    ax.set_zlim((5, 55))

    def lorenz_deriv(x_y_z, t0, sigma=sigma, beta=beta, rho=rho):
        """Compute the time-derivative of a Lorenz system."""
        x, y, z = x_y_z
        return [sigma * (y - x), x * (rho - z) - y, x * y - beta * z]

    # Choose random starting points, uniformly distributed from -15 to 15
    x0 = -10 + 20 * np.random.random((N, 3))

    # Solve for the trajectories
    t = np.linspace(0, max_time, int(500*max_time))
    x_t = np.asarray([integrate.odeint(lorenz_deriv, x0i, t)
                      for x0i in x0])

    # choose a different color for each trajectory
    # colors =, 1, N))
    # colors =, 1, N))
    # colors =, 1, N))
    colors =, 1, N))

    for i in range(N):
        x, y, z = x_t[i,:,:].T
        lines = ax.plot(x, y, z, '-', c=colors[i])
        plt.setp(lines, linewidth=1)

    ax.view_init(30, angle)

    return t, x_t

t, x_t = solve_lorenz(angle=0, N=12)

lines = plt.plot(t,x_t[1,:,0],t,x_t[1,:,1],t,x_t[1,:,2])
plt.setp(lines, linewidth=1)
lines = plt.plot(t,x_t[2,:,0],t,x_t[2,:,1],t,x_t[2,:,2])
plt.setp(lines, linewidth=1)
lines = plt.plot(t,x_t[10,:,0],t,x_t[10,:,1],t,x_t[10,:,2])
plt.setp(lines, linewidth=1)

To explore the parameter space of the Lorenz attractor, the key parameters to change are sigma (the Prandtl number), r (the Rayleigh number) and b on line 31 of the Python code.


[1] E. N. Lorenz, The essence of chaos (The Jessie and John Danz lectures; Jessie and John Danz lectures.). Seattle :: University of Washington Press (in English), 1993.

[2] E. N. Lorenz, “Deterministic Nonperiodic Flow,” Journal of the Atmospheric Sciences, vol. 20, no. 2, pp. 130-141, 1963 (1963)

The Physics of U. S. Presidential Elections (why are so many elections so close?)

Well here is another squeaker! The 2020 U. S. presidential election was a dead heat. What is most striking is that half of the past six US presidential elections have been won by less than 1% of the votes cast in certain key battleground states. For instance, in 2000 the election was won in Florida by less than 1/100th of a percent of the total votes cast.

How can so many elections be so close? This question is especially intriguing when one considers the 2020 election, which should have been strongly asymmetric, because one of the two candidates had such serious character flaws. It is also surprising because the country is NOT split 50/50 between urban and rural populations (it’s more like 60/40). And the split of Democrat/Republican is about 33/29 — close, but not as close as the election. So how can the vote be so close so often? Is this a coincidence? Or something fundamental about our political system? The answer lies (partially) in nonlinear dynamics coupled with the libertarian tendencies of American voters.

Rabbits and Sheep

Elections are complex dynamical systems consisting of approximately 140 million degrees of freedom (the voters). Yet US elections are also surprisingly simple. They are dynamical systems with only 2 large political parties, and typically a very small third party.

Voters in a political party are not too different from species in an ecosystem. There are many population dynamics models of things like rabbit and sheep that seek to understand the steady-state solutions when two species vie for the same feedstock (or two parties vie for the same votes). Depending on reproduction rates and competition payoff, one species can often drive the other species to extinction. Yet with fairly small modifications of the model parameters, it is often possible to find a steady-state solution in which both species live in harmony. This is a symbiotic solution to the population dynamics, perhaps because the rabbits help fertilize the grass for the sheep to eat, and the sheep keep away predators for the rabbits.

There are two interesting features to such a symbiotic population-dynamics model. First, because there is a stable steady-state solution, if there is a perturbation of the populations, for instance if the rabbits are culled by the farmer, then the two populations will slowly relax back to the original steady-state solution. For this reason, this solution is called a “stable fixed point”. Deviations away from the steady-state values experience an effective “restoring force” that moves the population values back to the fixed point. The second feature of these models is that the steady state values depend on the parameters of the model. Small changes in the model parameters then cause small changes in the steady-state values. In this sense, this stable fixed point is not fundamental–it depends on the parameters of the model.

Fig. 1 Dynamics of rabbits and sheep competing for the same resource (grass). For these parameters, one species dies off while the other thrives. A slight shift in parameters can turn the central saddle point into a stable fixed point where sheep and rabbits coexist in stead state. ([1] Reprinted from Introduction to Modern Dynamics (Oxford University Press, 2019) pg. 119)

But there are dynamical models which do have a stability that maintains steady values even as the model parameters shift. These models have negative feedback, like many dynamical systems, but if the negative feedback is connected to winner-take-all outcomes of game theory, then a robustly stable fixed point can emerge at precisely the threshold where such a winner would take all.

The Replicator Equation

The replicator equation provides a simple model for competing populations [2]. Despite its simplicity, it can model surprisingly complex behavior. The central equation is a simple growth model

where the growth rate depends on the fitness fa of the a-th species relative to the average fitness φ of all the species. The fitness is given by

where pab is the payoff matrix among the different species (implicit Einstein summation applies). The fitness is frequency dependent through the dependence on xb. The average fitness is

This model has a zero-sum rule that keeps the total population constant. Therefore, a three-species dynamics can be represented on a two-dimensional “simplex” where the three vertices are the pure populations for each of the species. The replicator equation can be applied easily to a three-party system, one simply defines a payoff matrix that is used to define the fitness of a party relative to the others.

The Nonlinear Dynamics of Presidential Elections

Here we will consider the replicator equation with three political parties (Democratic, Republican and Libertarian). Even though the third party is never a serious contender, the extra degree of freedom provided by the third party helps to stabilize the dynamics between the Democrats and the Republicans.

It is already clear that an essentially symbiotic relationship is at play between Democrats and Republicans, because the elections are roughly 50/50. If this were not the case, then a winner-take-all dynamic would drive virtually everyone to one party or the other. Therefore, having 100% Democrats is actually unstable, as is 100% Republicans. When the populations get too far out of balance, they get too monolithic and too inflexible, then defections of members will occur to the other parties to rebalance the system. But this is just a general trend, not something that can explain the nearly perfect 50/50 vote of the 2020 election.

To create the ultra-stable fixed point at 50/50 requires an additional contribution to the replicator equation. This contribution must create a type of toggle switch that depends on the winner-take-all outcome of the election. If a Democrat wins 51% of the vote, they get 100% of the Oval Office. This extreme outcome then causes a back action on the electorate who is always afraid when one party gets too much power.

Therefore, there must be a shift in the payoff matrix when too many votes are going one way or the other. Because the winner-take-all threshold is at exactly 50% of the vote, this becomes an equilibrium point imposed by the payoff matrix. Deviations in the numbers of voters away from 50% causes a negative feedback that drives the steady-state populations back to 50/50. This means that the payoff matrix becomes a function of the number of voters of one party or the other. In the parlance of nonlinear dynamics, the payoff matrix becomes frequency dependent. This goes one step beyond the original replicator equation where it was the population fitness that was frequency dependent, but not the payoff matrix. Now the payoff matrix also becomes frequency dependent.

The frequency-dependent payoff matrix (in an extremely simple model of the election dynamics) takes on negative feedback between two of the species (here the Democrats and the Republicans). If these are the first and third species, then the payoff matrix becomes

where the feedback coefficient is

and where the population dependences on the off-diagonal terms guarantee that, as soon as one party gains an advantage, there is defection of voters to the other party. This establishes a 50/50 balance that is maintained even when the underlying parameters would predict a strongly asymmetric election.

For instance, look at the dynamics in Fig. 2. For this choice of parameters, the replicator model predicts a 75/25 win for the democrats. However, when the feedback is active, it forces the 50/50 outcome, despite the underlying advantage for the original parameters.

Fig. 2 Comparison of the stabilized election with 50/50 outcome compared to the replicator dynamics without the feedback. For the parameters chosen here, there would be a 75/25 victory of the Democrats over the Republications. However, when the feedback is in play, the votes balance out at 50/50.

There are several interesting features in this model. It may seem that the Libertarians are irrelevant because they never have many voters. But their presence plays a surprisingly important role. The Libertarians tend to stabilize the dynamics so that neither the democrats nor the republicans would get all the votes. Also, there is a saddle point not too far from the pure Libertarian vertex. That Libertarian vertex is an attractor in this model, so under some extreme conditions, this could become a one-party system…maybe not Libertarian in that case, but possibly something more nefarious, of which history can provide many sad examples. It’s a word of caution.

Disclaimers and Caveats

No attempt has been made to actually mode the US electorate. The parameters in the modified replicator equations are chosen purely for illustration purposes. This model illustrates a concept — that feedback in the payoff matrix can create an ultra-stable fixed point that is insensitive to changes in the underlying parameters of the model. This can possibly explain why so many of the US presidential elections are so tight.

Someone interested in doing actual modeling of US elections would need to modify the parameters to match known behavior of the voting registrations and voting records. The model presented here assumes a balanced negative feedback that ensures a 50/50 fixed point. This model is based on the aversion of voters to too much power in one party–an echo of the libertarian tradition in the country. A more sophisticated model would yield the fixed point as a consequence of the dynamics, rather than being a feature assumed in the model. In addition, nonlinearity could be added that would drive the vote off of the 50/50 point when the underlying parameters shift strongly enough. For instance, the 2008 election was not a close one, in part because the strong positive character of one of the candidates galvanized a large fraction of the electorate, driving the dynamics away from the 50/50 balance.


[1] D. D. Nolte, Introduction to Modern Dynamics: Chaos, Networks, Space and Time (Oxford University Press, 2019) 2nd Edition.

[2] Nowak, M. A. (2006). Evolutionary Dynamics: Exploring the Equations of Life. Cambridge, Mass., Harvard University Press.

The Ups and Downs of the Compound Double Pendulum

A chief principle of chaos theory states that even simple systems can display complex dynamics.  All that is needed for chaos, roughly, is for a system to have at least three dynamical variables plus some nonlinearity. 

A classic example of chaos is the driven damped pendulum.  This is a mass at the end of a massless rod driven by a sinusoidal perturbation.  The three variables are the angle, the angular velocity and the phase of the sinusoidal drive.  The nonlinearity is provided by the cosine function in the potential energy which is anharmonic for large angles.  However, the driven damped pendulum is not an autonomous system, because the drive is an external time-dependent function.  To find an autonomous system—one that persists in complex motion without any external driving function—one needs only to add one more mass to a simple pendulum to create what is known as a compound pendulum, or a double pendulum.

Daniel Bernoulli and the Discovery of Normal Modes

After the invention of the calculus by Newton and Leibniz, the first wave of calculus practitioners (Leibniz, Jakob and Johann Bernoulli and von Tschirnhaus) focused on static problems, like the functional form of the catenary (the shape of a hanging chain), or on constrained problems, like the brachistochrone (the path of least time for a mass under gravity to move between two points) and the tautochrone (the path of equal time).

The next generation of calculus practitioners (Euler, Johann and Daniel Bernoulli, and  D’Alembert) focused on finding the equations of motion of dynamical systems.  One of the simplest of these, that yielded the earliest equations of motion as well as the first identification of coupled modes, was the double pendulum.  The double pendulum, in its simplest form, is a mass on a rigid massless rod attached to another mass on a massless rod.  For small-angle motion, this is a simple coupled oscillator.

Fig. 1 The double pendulum as seen by Daniel Bernoulli, Johann Bernoulli and D’Alembert. This two-mass system played a central role in the earliest historical development of dynamical equations of motion.

Daniel Bernoulli, the son of Johann I Bernoulli, was the first to study the double pendulum, publishing a paper on the topic in 1733 in the proceedings of the Academy in St. Petersburg just as he returned from Russia to take up a post permanently in his home town of Basel, Switzerland.  Because he was a physicist first and mathematician second, he performed experiments with masses on strings to attempt to understand the qualitative as well as quantitative behavior of the two-mass system.  He discovered that for small motions there was a symmetric behavior that had a low frequency of oscillation and an antisymmetric motion that had a higher frequency of oscillation.  Furthermore, he recognized that any general motion of the double pendulum was a combination of the fundamental symmetric and antisymmetric motions.  This work by Daniel Bernoulli represents the discovery of normal modes of coupled oscillators.  It is also the first statement of the combination of motions that he would use later (1753) to express for the first time the principle of superposition. 

Superposition is one of the guiding principles of linear physical systems.  It provides a means for the solution of differential equations.  It explains the existence of eigenmodes and their eigenfrequencies.  It is the basis of all interference phenomenon, whether classical like the Young’s double-slit experiment or quantum like Schrödinger’s cat.  Today, superposition has taken center stage in quantum information sciences and helps define the spooky (and useful) properties of quantum entanglement.  Therefore, normal modes, composition of motion, superposition of harmonics on a musical string—these all date back to Daniel Bernoulli in the twenty years between 1733 and 1753.  (Daniel Bernoulli is also the originator of the Bernoulli principle that explains why birds and airplanes fly.)

Johann Bernoulli and the Equations of Motion

Daniel Bernoulli’s father was Johann I Bernoulli.  Daniel had been tutored by Johann, along with his friend Leonhard Euler, when Daniel was young.  But as Daniel matured as a mathematician, he and his father began to compete against each other in international mathematics competitions (which were very common in the early eighteenth century).  When Daniel beat his father in a competition sponsored by the French Academy, Johann threw Daniel out of his house and their relationship remained strained for the remainder of their lives.

Johann had a history of taking ideas from Daniel and never citing the source. For instance, when Johann published his work on equations of motion for masses on strings in 1742, he built on the work of his son Daniel from 1733 but never once mentioned it. Daniel, of course, was not happy.

In a letter dated 20 October 1742 that Daniel wrote to Euler, he said, “The collected works of my father are being printed, and I have Just learned that he has inserted, without any mention of me, the dynamical problems I first discovered and solved (such as e. g. the descent of a sphere on a moving triangle; the linked pendulum, the center of spontaneous rotation, etc.).” And on 4 September 1743, when Daniel had finally seen his father’s works in print, he said, “The new mechanical problems are mostly mine, and my father saw my solutions before he solved the problems in his way …”. [2]

Daniel clearly has the priority for the discovery of the normal modes of the linked (i.e. double or compound) pendulum, but Johann often would “improve” on Daniel’s work despite giving no credit for the initial work. As a mathematician, Johann had a more rigorous approach and could delve a little deeper into the math. For this reason, it was Johann in 1742 who came closest to writing down differential equations of motion for multi-mass systems, but falling just short. It was D’Alembert only one year later who first wrote down the differential equations of motion for systems of masses and extended it to the loaded string for which he was the first to derive the wave equation. The D’Alembertian operator is today named after him.

Double Pendulum Dynamics

The general dynamics of the double pendulum are best obtained from Lagrange’s equations of motion. However, setting up the Lagrangian takes careful thought, because the kinetic energy of the second mass depends on its absolute speed which is dependent on the motion of the first mass from which it is suspended. The velocity of the second mass is obtained through vector addition of velocities.

Fig. 2. The dynamics of the double pendulum.

The potential energy of the system is

so that the Lagrangian is

The partial derivatives are

and the time derivatives of the last two expressions are

Therefore, the equations of motion are

To get a sense of how this system behaves, we can make a small-angle approximation to linearize the equations to find the lowest-order normal modes.  In the small-angle approximation, the equations of motion become

where the determinant is

This quartic equation is quadratic in w2 and the quadratic solution is

This solution is still a little opaque, so taking the special case: R = R1 = R2 and M = M1 = M2 it becomes

There are two normal modes.  The low-frequency mode is symmetric as both masses swing (mostly) together, while the higher frequency mode is antisymmetric with the two masses oscillating against each other.  These are the motions that Daniel Bernoulli discovered in 1733.

It is interesting to note that if the string were rigid, so that the two angles were the same, then the lowest frequency would be 3/5 which is within 2% of the above answer but is certainly not equal.  This tells us that there is a slightly different angular deflection for the second mass relative to the first.

Chaos in the Double Pendulum

The full expression for the nonlinear coupled dynamics is expressed in terms of four variables (q1, q2, w1, w2).  The dynamical equations are

These can be put into the normal form for a four-dimensional flow as

The numerical solution of these equations produce a complex interplay between the angle of the first mass and the angle of the second mass. Examples of trajectory projections in configuration space are shown in Fig. 3 for E = 1. The horizontal is the first angle, and the vertical is the angle of the second mass.

Fig. 3 Trajectory projections onto configuration space. The horizontal axis is the first mass angle, and the vertical axis is the second mass angle. All of these are periodic or nearly periodic orbits except for the one on the lower left. E = 1.

The dynamics in state space are four dimensional which are difficult to visualize directly. Using the technique of the Poincaré first-return map, the four-dimensional trajectories can be viewed as a two-dimensional plot where the trajectories pierce the Poincaré plane. Poincare sections are shown in Fig. 4.

Fig. Poincare sections of the double pendulum in state space for increasing kinetic energy. Initial conditions are vertical in all. The horizontal axis is the angle of the second mass, and the vertical axis is the angular velocity of the second mass.

Python Code:

#!/usr/bin/env python3
# -*- coding: utf-8 -*-
Created on Oct 16 06:03:32 2020
"Introduction to Modern Dynamics" 2nd Edition (Oxford, 2019)
@author: nolte

import numpy as np
from scipy import integrate
from matplotlib import pyplot as plt
import time


E = 1.       # Try 0.8 to 1.5

def flow_deriv(x_y_z_w,tspan):
    x, y, z, w = x_y_z_w

    A = w**2*np.sin(y-x);
    B = -2*np.sin(x);
    C = z**2*np.sin(y-x)*np.cos(y-x);
    D = np.sin(y)*np.cos(y-x);
    EE = 2 - (np.cos(y-x))**2;
    FF = w**2*np.sin(y-x)*np.cos(y-x);
    G = -2*np.sin(x)*np.cos(y-x);
    H = 2*z**2*np.sin(y-x);
    I = 2*np.sin(y);
    JJ = (np.cos(y-x))**2 - 2;

    a = z
    b = w
    c = (A+B+C+D)/EE
    d = (FF+G+H+I)/JJ

repnum = 75

for reploop  in range(repnum):
    px1 = 2*(np.random.random((1))-0.499)*np.sqrt(E);
    py1 = -px1 + np.sqrt(2*E - px1**2);            

    xp1 = 0   # Try 0.1
    yp1 = 0   # Try -0.2
    x_y_z_w0 = [xp1, yp1, px1, py1]
    tspan = np.linspace(1,1000,10000)
    x_t = integrate.odeint(flow_deriv, x_y_z_w0, tspan)
    siztmp = np.shape(x_t)
    siz = siztmp[0]

    if reploop % 50 == 0:
        lines = plt.plot(x_t[:,0],x_t[:,1])
        plt.setp(lines, linewidth=0.5)

    y1 = np.mod(x_t[:,0]+np.pi,2*np.pi) - np.pi
    y2 = np.mod(x_t[:,1]+np.pi,2*np.pi) - np.pi
    y3 = np.mod(x_t[:,2]+np.pi,2*np.pi) - np.pi
    y4 = np.mod(x_t[:,3]+np.pi,2*np.pi) - np.pi
    py = np.zeros(shape=(10*repnum,))
    yvar = np.zeros(shape=(10*repnum,))
    cnt = -1
    last = y1[1]
    for loop in range(2,siz):
        if (last < 0)and(y1[loop] > 0):
            cnt = cnt+1
            del1 = -y1[loop-1]/(y1[loop] - y1[loop-1])
            py[cnt] = y4[loop-1] + del1*(y4[loop]-y4[loop-1])
            yvar[cnt] = y2[loop-1] + del1*(y2[loop]-y2[loop-1])
            last = y1[loop]
            last = y1[loop]
    lines = plt.plot(yvar,py,'o',ms=1)

You can change the energy E on line 16 and also the initial conditions xp1 and yp1 on lines 48 and 49. The energy E is the initial kinetic energy imparted to the two masses. For a given initial condition, what happens to the periodic orbits as the energy E increases?


[1] Daniel Bernoulli, Theoremata de oscillationibus corporum filo flexili connexorum et catenae verticaliter suspensae,” Academiae Scientiarum Imperialis Petropolitanae, 6, 1732/1733

[2] Truesdell B. The rational mechanics of flexible or elastic bodies, 1638-1788. (Turici: O. Fussli, 1960). (This rare and artistically produced volume, that is almost impossible to find today in any library, is one of the greatest books written about the early history of dynamics.)

Cancer Holography for Personalized Medicine

Imagine if you could use the physics of coherent light to record a 3D hologram of a cancer tumor and use it to select the best therapy for the cancer patient.

This week in Scientific Reports, a Nature Research publication, we demonstrate the first step towards that goal.

In a collaboration between Purdue University and the Northwestern University School of Medicine, we performed Doppler spectroscopy of intracellular dynamics of human epithelial ovarian cancer tumor biopsies and observed how they responded to selected anti-cancer drugs. Distinctly different Doppler spectra were observed for patients who went into remission versus those who failed to achieve cancer remission. This is the first clinical pilot trial of the technology, known as Biodynamic Imaging (BDI), published in human cancer research.

BDI may, in the future, make it possible to select the most effective therapies for individual cancer patients, realizing the long-sought dream of personalized cancer care.

Read it here: This latest research on personalized medicine has just been published with @SpringerNature in @ScientificReports.

The Purdue University Office of Technology Transfer has licensed the BDI patent portfolio to Animated Dynamics, Inc., located in Indianapolis, IN, that is working to commercialize the technology to translate it to the cancer clinic. Currently less than 40% of all cancer patients respond favorably to their chemotherapy. Using BDI technology our hope is to improve rates of remission in select cancer settings.

This work was supported by the NIH under the The Office of Physical Sciences – Oncology (OPSO) and by NSF CBET.

The Bountiful Bernoullis of Basel

The task of figuring out who’s who in the Bernoulli family is a hard nut to crack.  The Bernoulli name populates a dozen different theorems or physical principles in the history of science and mathematics, but each one was contributed by any of four or five different Bernoullis of different generations—brothers, uncles, nephews and cousins.  What makes the task even more difficult is that any given Bernoulli might be called by several different aliases, while many of them shared the same name across generations.  To make things worse, they often worked and published on each other’s problems.

To attribute a theorem to a Bernoulli is not too different from attributing something to the famous mathematical consortium called Nicholas Bourbaki.  It’s more like a team rather than an individual.  But in the case of Bourbaki, the goal was selfless anonymity, while in the case of the Bernoullis it was sometimes the opposite—bald-faced competition and one-up-manship coupled with jealousy and resentment. Fortunately, the competition tended to breed more output than less, and the world benefited from the family feud.

The Bernoulli Family Tree

The Bernoullis are intimately linked with the beautiful city of Basel, Switzerland, situated on the Rhine River where it leaves Switzerland and forms the border between France and Germany . The family moved there from the Netherlands in the 1600’s to escape the Spanish occupation.

Basel Switzerland

The first Bernoulli born in Basel was Nikolaus Bernoulli (1623 – 1708), and he had four sons: Jakob I, Nikolaus, Johann I and Hieronymous I. The “I”s in this list refer to the fact, or the problem, that many of the immediate descendants took their father’s or uncle’s name. The long-lived family heritage in the roles of mathematician and scientist began with these four brothers. Jakob Bernoulli (1654 – 1705) was the eldest, followed by Nikolaus Bernoulli (1662 – 1717), Johann Bernoulli (1667 – 1748) and then Hieronymous (1669 – 1760). In this first generation of Bernoullis, the great mathematicians were Jakob and Johann. More mathematical equations today are named after Jakob, but Johann stands out because of the longevity of his contributions, the volume and impact of his correspondence, the fame of his students, and the number of offspring who also took up mathematics. Johann was also the worst when it came to jealousy and spitefulness—against his brother Jakob, whom he envied, and specifically against his son Daniel, whom he feared would eclipse him.

Jakob Bernoulli (aka James or Jacques or Jacob)

Jakob Bernoulli (1654 – 1705) was the eldest of the first generation of brothers and also the first to establish himself as a university professor. He held the chair of mathematics at the university in Basel. While his interests ranged broadly, he is known for his correspondences with Leibniz as he and his brother Johann were among the first mathematicians to apply Lebiniz’ calculus to solving specific problems. The Bernoulli differential equation is named after him. It was one of the first general differential equations to be solved after the invention of the calculus. The Bernoulli inequality is one of the earliest attempts to find the Taylor expansion of exponentiation, which is also related to Bernoulli numbers, Bernoulli polynomials and the Bernoulli triangle. A special type of curve that looks like an ellipse with a twist in the middle is the lemniscate of Bernoulli.

Perhaps Jakob’s most famous work was his Ars Conjectandi (1713) on probability theory. Many mathematical theorems of probability named after a Bernoulli refer to this work, such as Bernoulli distribution, Bernoulli’s golden theorem (the law of large numbers), Bernoulli process and Bernoulli trial.

Fig. Bernoulli numbers in Jakob’s Ars Conjectandi (1713)

Johann Bernoulli (aka Jean or John)

Jakob was 13 years older than his brother Johann Bernoulli (1667 – 1748), and Jakob tutored Johann in mathematics who showed great promise. Unfortunately, Johann had that awkward combination of high self esteem with low self confidence, and he increasingly sought to show that he was better than his older brother. As both brothers began corresponding with Leibniz on the new calculus, they also began to compete with one another. Driven by his insecurity, Johann also began to steal ideas from his older brother and claim them for himself.

A classic example of this is the famous brachistrochrone problem that was posed by Johann in the Acta Eruditorum in 1696. Johann at this time was a professor of mathematics at Gronigen in the Netherlands. He challenged the mathematical world to find the path of least time for a mass to travel under gravity between two points. He had already found one solution himself and thought that no-one else would succeed. Yet when he heard his brother Jakob was responding to the challenge, he spied out his result and then claimed it as his own. Within the year and a half there were 4 additional solutions—all correct—using different approaches.  One of the most famous responses was by Newton (who as usual did not give up his method) but who is reported to have solved the problem in a day.  Others who contributed solutions were Gottfried Leibniz, Ehrenfried Walther von Tschirnhaus, and Guillaume de l’Hôpital in addition to Jakob.

The participation of de l’Hôpital in the challenge was a particular thorn in Johann’s side, because de l’Hôpital had years earlier paid Johann to tutor him in Leibniz’ new calculus at a time when l’Hôpital knew nothing of the topic. What is today known as l’Hôpital’s theorem on ratios of limits in fact was taught to l’Hôpital by Johann. Johann never forgave l’Hôpital for publicizing the result—but l’Hôpital had the discipline to write a textbook while Johann did not. To be fair, l’Hôpital did give Johann credit in the opening of his book, but that was not enough for Johann who continued to carry his resentment.

When Jakob died of tuberculosis in 1705, Johann campaigned to replace him in his position as professor of mathematics and succeeded. In that chair, Johann had many famous students (Euler foremost among them, but also Maupertuis and Clairaut). Part of Johann’s enduring fame stems from his many associations and extensive correspondences with many of the top mathematicians of the day. For instance, he had a regular correspondence with the mathematician Varignon, and it was in one of these letters that Johann proposed the principle of virtual velocities which became a key axiom for Joseph Lagrange’s later epic work on the foundations of mechanics (see Chapter 4 in Galileo Unbound).

Johann remained in his chair of mathematics at Basel for almost 40 years. This longevity, and the fame of his name, guaranteed that he taught some of the most talented mathematicians of the age, including his most famous student Leonhard Euler, who is held by some as one of the four greatest mathematicians of all time (the others were Archimedes, Newton and Gauss) [1].

Nikolaus I Bernoulli

Nikolaus I Bernoulli (1687 – 1759, son of Nikolaus) was the cousin of Daniel and nephew to both Jacob and Johann. He was a well-known mathematician in his time (he briefly held Galileo’s chair in Padua), though few specific discoveries are attributed to him directly. He is perhaps most famous today for posing the “St. Petersburg Paradox” of economic game theory. Ironically, he posed this paradox while his cousin Nikolaus II Bernoulli (brother of Daniel Bernoulli) was actually in St. Petersburg with Daniel.

The St. Petersburg paradox is a simple game of chance played with a fair coin where a player must buy in at a certain price in order to place $2 in a pot that doubles each time the coin lands heads, and pays out the pot at the first tail. The average pay-out of this game has infinite expectation, so it seems that anyone should want to buy in at any cost. But most people would be unlikely to buy in even for a modest $25. Why? And is this perception correct? The answer was only partially provided by Nikolaus. The definitive answer was given by his cousin Daniel Bernoulli.

Daniel Bernoulli

Daniel Bernoulli (1700 – 1782, son of Johann I) is my favorite Bernoulli. While most of the other Bernoullis were more mathematicians than scientists, Daniel Bernoulli was more physicist than mathematician. When we speak of “Bernoulli’s principle” today, the fundamental force that allows birds and airplanes to fly, we are referring to his work on hydrodynamics. He was one of the earliest originators of economic dynamics through his invention of the utility function and diminishing returns, and he was the first to clearly state the principle of superposition, which lies at the heart today of the physics of waves and quantum technology.

Daniel Bernoulli

While in St. Petersburg, Daniel conceived of the solution to the St. Petersburg paradox (he is the one who actually named it). To explain why few people would pay high stakes to play the game, he devised a “utility function” that had “diminishing marginal utility” in which the willingness to play depended on ones wealth. Obviously a wealthy person would be willing to pay more than a poor person. Daniel stated

The determination of the value of an item must not be based on the price, but rather on the utility it yields…. There is no doubt that a gain of one thousand ducats is more significant to the pauper than to a rich man though both gain the same amount.

He created a log utility function that allowed one to calculate the highest stakes a person should be willing to take based on their wealth. Indeed, a millionaire may only wish to pay $20 per game to play, in part because the average payout over a few thousand games is only about $5 per game. It is only in the limit of an infinite number of games (and an infinite bank account by the casino) that the average payout diverges.

Daniel Bernoulli Hydrodynamica (1638)

Johann II Bernoulli

Daniel’s brother Johann II (1710 – 1790) published in 1736 one of the most important texts on the theory of light during the time between Newton and Euler. Although the work looks woefully anachronistic today, it provided one of the first serious attempts at understanding the forces acting on light rays and describing them mathematically [5]. Euler based his new theory of light, published in 1746, on much of the work laid down by Johann II. Euler came very close to proposing a wave-like theory of light, complete with a connection between frequency of wave pulses and colors, that would have preempted Thomas Young by more than 50 years. Euler, Daniel and Johann II as well as Nicholas II were all contemporaries as students of Johann I in Basel.

More Relations

Over the years, there were many more Bernoullis who followed in the family tradition. Some of these include:

Johann II Bernoulli (1710–1790; also known as Jean), son of Johann, mathematician and physicist

Johann III Bernoulli (1744–1807; also known as Jean), son of Johann II, astronomer, geographer and mathematician

Jacob II Bernoulli (1759–1789; also known as Jacques), son of Johann II, physicist and mathematician

Johann Jakob Bernoulli (1831–1913), art historian and archaeologist; noted for his Römische Ikonographie (1882 onwards) on Roman Imperial portraits

Ludwig Bernoully (1873 – 1928), German architect in Frankfurt

Hans Bernoulli (1876–1959), architect and designer of the Bernoullihäuser in Zurich and Grenchen SO

Elisabeth Bernoulli (1873-1935), suffragette and campaigner against alcoholism.

Notable marriages to the Bernoulli family include the Curies (Pierre Curie was a direct descendant to Johann I) as well as the German author Hermann Hesse (married to a direct descendant of Johann I).


[1] Calinger, Ronald S.. Leonhard Euler : Mathematical Genius in the Enlightenment, Princeton University Press (2015).

[2] Euler L and Truesdell C. Leonhardi Euleri Opera Omnia. Series secunda: Opera mechanica et astronomica XI/2. The rational mechanics of flexible or elastic bodies 1638-1788. (Zürich: Orell Füssli, 1960).

[3] D Speiser, Daniel Bernoulli (1700-1782), Helvetica Physica Acta 55 (1982), 504-523.

[4] Leibniz GW. Briefwechsel zwischen Leibniz, Jacob Bernoulli, Johann Bernoulli und Nicolaus Bernoulli. (Hildesheim: Olms, 1971).

[5] Hakfoort C. Optics in the age of Euler : conceptions of the nature of light, 1700-1795. (Cambridge: Cambridge University Press, 1995).

Up-side-down Physics: Dynamic Equilibrium and the Inverted Pendulum

In the study of mechanics, the physics student moves through several stages in their education.  The first stage is the Newtonian physics of trajectories and energy and momentum conservation—there are no surprises there.  The second stage takes them to Lagrangians and Hamiltonians—here there are some surprises, especially for rigid body rotations.  Yet even at this stage, most problems have analytical solutions, and most of those solutions are exact.

Any street busker can tell you that an equally good (and more interesting) equilibrium point of a simple pendulum is when the bob is at the top.

It is only at the third stage that physics starts to get really interesting, and when surprising results with important ramifications emerge.  This stage is nonlinear physics.  Most nonlinear problems have no exact analytical solutions, but there are regimes where analytical approximations not only are possible but provide intuitive insights.  One of the best examples of this third stage is the dynamic equilibrium of Kapitsa’s up-side-down pendulum.

Piotr Kapitsa

Piotr Kapitsa (1894 – 1984) was a Russian physicist who received the Nobel Prize in physics in 1978 for his discovery in 1937 of superfluidity in liquid helium.  (He shared the 1978 prize with Penzias and Wilson who had discovered the cosmic microwave background.)  Superfluidity is a low-temperature hydrodynamic property of superfluids that shares some aspects in common with superconductivity.  Kapitsa published his results in Nature in 1938 in the same issue as a paper by John Allen and Don Misener of Cambridge, but Kapitsa had submitted his paper 19 days before Allen and Misener and so got priority (and the Nobel).

During his career Kapitsa was a leading force in Russian physics, surviving Stalin’s great purge through force of character, and helping to establish the now-famous Moscow Institute of Physics and Technology.  However, surviving Stalin did not always mean surviving with freedom, and around 1950 Kapitsa was under effective house arrest because of his unwillingness to toe the party line.

In his enforced free time, to while away the hours, Kapitsa developed an ingenious analytical approach to the problem of dynamic equilibrium.  His toy example was the driven inverted pendulum. It is surprising how many great works have emerged from the time freed up by house arrest: Galileo finally had time to write his “Two New Sciences”  after his run-in with the Inquisition, and Fresnel was free to develop his theory of diffraction after he ill-advisedly joined a militia to support the Bourbon king during Napoleon’s return. (In our own time, with so many physicists in lock-down and working from home, it will be interesting to see what great theory emerges from the pandemic.)

Stability in the Inverted Driven Pendulum

The only stable static equilibrium of the simple pendulum is when the pendulum bob is at its lowest point.  However, any street busker can tell you that an equally good (and more interesting) equilibrium point is when the bob is at the top.  The caveat is that this “inverted” equilibrium of the pendulum requires active stabilization. 

If the inverted pendulum is a simple physical pendulum, like a meter stick that you balance on the tip of your finger, you know that you need to nudge the stick gently and continuously this way and that with your finger, in response to the tipping stick, to keep it upright.  It’s an easy trick, and almost everyone masters it as a child.  With the human as the drive force, this is an example of a closed-loop control system.  The tipping stick is observed visually by the human, and the finger position is adjusted to compensate for the tip.  On the other hand, one might be interested to find an “open-loop” system that does not require active feedback or even an operator.  In 1908, Andrew Stephenson suggested that induced stability could be achieved by the inverted pendulum if a drive force of sufficiently high frequency were applied [1].  But the proof of the stability remained elusive until Kapitsa followed Stephenson’s suggestion by solving the problem through a separation of time scales [2].

The Method of Separation of Time Scales

The driven inverted pendulum has the dynamical equation

where w0 is the natural angular frequency of small-amplitude oscillations, a is a drive amplitude (with units of frequency) and w is the drive angular frequency that is assumed to be much larger than the natural frequency.  The essential assumption that allows the problem to be separate according to widely separated timescales is that the angular displacement has a slow contribution that changes on the time scale of the natural frequency, and a fast contribution that changes on the time scale of the much higher drive frequency.  The assumed solution then looks like

This is inserted into the dynamical equation to yield

where we have used the approximation

So far this is simple.  The next step is the key step.  It assumes that the dynamical equation should also separate into fast and slow contributions.  But the last term of the sin q expansion has a product of fast and slow components.  The key insight is that a time average can be used to average over the fast contribution.  The separation of the dynamical equation is then

where the time average of the fast variables is only needed on the first line.  The second line is a simple driven harmonic oscillator with a natural frequency that depends on cos qslow and a driving amplitude that depends on sin qslow.  The classic solution to the second line for qfast is

This solution can then be inserted into the first line to yield

This describes a pendulum under an effective potential (for high drive frequency and no damping)

The first term is unstable at the inverted position, but the second term is actually a restoring force.  If the second term is stronger than the first, then a dynamic equilibrium can be achieved. This occurs when the driving amplitude is larger than a threshold value

The effective potential for increasing drive amplitude looks like

Fig. 1 Effective potential as a function of angle and drive amplitude a (in units of w0)

When the drive amplitude is larger than sqrt(2), a slight dip forms in the unstable potential. The dip increases with increasing drive amplitude, as does the oscillation frequency of the effective potential.

Python Program:

#!/usr/bin/env python3
# -*- coding: utf-8 -*-
Created on Friday Sept 11 06:03:32 2020
@author: nolte
D. D. Nolte, Introduction to Modern Dynamics: Chaos, Networks, Space and Time, 2nd ed. (Oxford,2019)

import numpy as np
from scipy import integrate
from matplotlib import pyplot as plt


print(' ')

F = 133.5          # 30 to 140  (133.5)
delt = 0.000       # 0.000 to 0.01
w = 20          # 20
def flow_deriv(x_y_z,tspan):
    x, y, z = x_y_z
    a = y
    b = -(1 + F*np.cos(z))*np.sin(x) - delt*y
    c = w
T = 2*np.pi/w

x0 = np.pi+0.3
v0 = 0.00
z0 = 0

x_y_z = [x0, v0, z0]

# Solve for the trajectories
t = np.linspace(0, 2000, 200000)
x_t = integrate.odeint(flow_deriv, x_y_z, t)
siztmp = np.shape(x_t)
siz = siztmp[0]

#y1 = np.mod(x_t[:,0]-np.pi,2*np.pi)-np.pi
y1 = x_t[:,0]
y2 = x_t[:,1]
y3 = x_t[:,2]    

lines = plt.plot(t[0:2000],x_t[0:2000,0]/np.pi)
plt.setp(lines, linewidth=0.5)
plt.title('Angular Position')

lines = plt.plot(t[0:1000],y2[0:1000])
plt.setp(lines, linewidth=0.5)

repnum = 5000
px = np.zeros(shape=(2*repnum,))
xvar = np.zeros(shape=(2*repnum,))
cnt = -1
testwt = np.mod(t,T)-0.5*T;
last = testwt[1]
for loop in range(2,siz-1):
    if (last < 0)and(testwt[loop] > 0):
        cnt = cnt+1
        del1 = -testwt[loop-1]/(testwt[loop] - testwt[loop-1])
        px[cnt] = (y2[loop]-y2[loop-1])*del1 + y2[loop-1]
        xvar[cnt] = (y1[loop]-y1[loop-1])*del1 + y1[loop-1]
        last = testwt[loop]
        last = testwt[loop]
lines = plt.plot(xvar[0:5000],px[0:5000],'ko',ms=1)
plt.title('First Return Map')

lines = plt.plot(x_t[0:1000,0]/np.pi,y2[0:1000])
plt.setp(lines, linewidth=0.5)
plt.title('Phase Space')

You can play with the parameters of this program to explore the physics of dynamic equilibrium. For instance, if the control parameter is slightly above the threshold (F = 32) at which a dip appears in the effective potential, the slow oscillation has a vary low frequency, as shown in Fig. 2. The high-frequency drive can still be seen superposed on the slow oscillation of the pendulum that is oscillating just like an ordinary pendulum but up-side-down!

Fig. 2 Just above the pitchfork bifurcation the slow oscillation has a low frequency. F = 32, w = 20, w0 = 1

The oscillation frequency is a function of the drive amplitude. This is a classic signature of a nonlinear system: amplitude-frequency coupling. Well above the threshold (F = 100), the frequency of oscillation in the effective potential becomes much larger, as in Fig. 3.

Fig. 3 High above the transition. F = 100, w = 20, w0 = 1

When the drive amplitude is more than four times larger than the threshold value (F > 140), the equilibrium is destroyed, so there is an upper bound to the dynamic stabilization. This happens when the “slow” frequency becomes comparable to the drive frequency and the separation-of-time-scales approach is no longer valid.

You can also play with the damping (delt) to see what effect it has on thresholds and long-term behavior starting at delt = 0.001 and increasing it.

Other Examples of Dynamic Equilibrium

Every physics student learns that there is no stable electrostatic equilibrium. However, if charges are put into motion, then a time-averaged potential can be created that can confine a charged particle. This is the principle of the Paul Ion Trap, named after Wolfgang Paul who was awarded the Nobel Prize in Physics in 1989 for this invention.

One of the most famous examples of dynamic equilibrium are the L4 and L5 Lagrange points. In the Earth-Jupiter system, these are the locations of the Trojan asteroids. These special Lagrange points are maxima (unstable equilibria) in the effective potential of a rotation coordinate system, but the Coriolis force creates a local minimum that traps the asteroids in a dynamically stable equilibrium.

In economics, general equilibrium theory describes how oscillating prices among multiple markets can stabilize economic performance in macroeconomics.

A recent paper in Science magazine used the principle of dynamic equilibrium to levitate a layer of liquid on which toy boats can ride right-side-up and up-side-down. For an interesting video see Upside-down boat (link).


[1] Stephenson Andrew (1908). “XX.On induced stability”. Philosophical Magazine. 6. 15: 233–236.

[2] Kapitza P. L. (1951). “Dynamic stability of a pendulum when its point of suspension vibrates”. Soviet Phys. JETP. 21: 588–597.


A detailed derivation of Kapitsa’s approach:

The bifurcation threshold for the inverted pendulum is a pitchfork bifurcation

Henri Poincaré and his Homoclinic Tangle

Will the next extinction-scale asteroid strike the Earth in our lifetime? 

This existential question—the question of our continued existence on this planet—is rhetorical, because there are far too many bodies in our solar system to accurately calculate all trajectories of all asteroids. 

The solar system is what is known as an N-body problem.  And even the N is not well determined.  The asteroid belt alone has over a million extinction-sized asteroids, and there are tens of millions of smaller ones that could still do major damage to life on Earth if they hit.  To have a hope of calculating even one asteroid trajectory do we ignore planetary masses that are too small?  What is too small?  What if we only consider the Sun, the Earth and Jupiter?  This is what Euler did in 1760, and he still had to make more assumptions.

Stability of the Solar System

Once Newton published his Principia, there was a pressing need to calculate the orbit of the Moon (see my blog post on the three-body problem).  This was important for navigation, because if the daily position of the moon could be known with sufficient accuracy, then ships would have a means to determine their longitude at sea.  However, the Moon, Earth and Sun are already a three-body problem, which still ignores the effects of Mars and Jupiter on the Moon’s orbit, not to mention the problem that the Earth is not a perfect sphere.  Therefore, to have any hope of success, toy systems that were stripped of all their obfuscating detail were needed.

Euler investigated simplified versions of the three-body problem around 1760, treating a body attracted to two fixed centers of gravity moving in the plane, and he solved it using elliptic integrals. When the two fixed centers are viewed in a coordinate frame that is rotating with the Sun-Earth system, it can come close to capturing many of the important details of the system. In 1762 Euler tried another approach, called the restricted three-body problem, where he considered a massless Moon attracted to a massive Earth orbiting a massive Sun, again all in the plane. Euler could not find general solutions to this problem, but he did stumble on an interesting special case when the three bodies remain collinear throughout their motions in a rotating reference frame.

It was not the danger of asteroids that was the main topic of interest in those days, but the question whether the Earth itself is in a stable orbit and is safe from being ejected from the Solar system.  Despite steadily improving methods for calculating astronomical trajectories through the nineteenth century, this question of stability remained open.

Poincaré and the King Oscar Prize of 1889

Some years ago I wrote an article for Physics Today called “The Tangled Tale of Phase Space” that tracks the historical development of phase space. One of the chief players in that story was Henri Poincaré (1854 – 1912). Henri Poincare was the Einstein before Einstein. He was a minor celebrity and was considered to be the greatest genius of his era. The event in his early career that helped launch him to stardom was a mathematics prize announced in 1887 to honor the birthday of King Oscar II of Sweden. The challenge problem was as simple as it was profound: Prove rigorously whether the solar system is stable.

This was the old N-body problem that had so far resisted solution, but there was a sense at that time that recent mathematical advances might make the proof possible. There was even a rumor that Dirichlet had outlined such a proof, but no trace of the outline could be found in his papers after his death in 1859.

The prize competition was announced in Acta Mathematica, written by the Swedish mathematician Gösta Mittag-Leffler. It stated:

Given a system of arbitrarily many mass points that attract each according to Newton’s law, under the assumption that no two points ever collide, try to find a representation of the coordinates of each point as a series in a variable that is some known function of time and for all of whose values the series converges uniformly.

The timing of the prize was perfect for Poincaré who was in his early thirties and just beginning to make his mark on mathematics. He was working on the theory of dynamical systems and was developing a new viewpoint that went beyond integrating single trajectories by focusing more broadly on whole classes of solutions. The question of the stability of the solar system seemed like a good problem to use to sharpen his mathematical tools. The general problem was still too difficult, so he began with Euler’s restricted three-body problem. He made steady progress, and along the way he invented an array of new techniques for studying the general properties of dynamical systems. One of these was the Poincaré section. Another was his set of integral invariants, one of which is recognized as the conservation of volume in phase space, also known as Liouville’s theorem, although it was Ludwig Boltzmann who first derived this result (see my Physics Today article). Eventually, he believed he had proven that the restricted three-body problem was stable.

By the time Poincaré had finished is prize submission, he had invented a new field of mathematical analysis, and the judges of the prize submission recognized it. Poincaré was named the winner, and his submission was prepared for publication in the Acta. However, Mittag-Leffler was a little concerned by a technical objection that had been raised, so he forwarded the comment to Poincaré for him to look at. At first, Poincaré thought the objection could easily be overcome, but as he worked on it and delved deeper, he had a sudden attack of panic. Trajectories near a saddle point did not converge. His proof of stability was wrong!

He alerted Mittag-Leffler to stop the presses, but it was too late. The first printing had been completed and review copies had already been sent to the judges. Mittag-Leffler immediately wrote to them asking for their return while Poincaré worked nonstop to produce a corrected copy. When he had completed his reanalysis, he had discovered a divergent feature of the solution to the dynamical problem near saddle points that his recognized today as the discovery of chaos. Poincaré paid for the reprinting of his paper out of his own pocket and (almost) all of the original printing was destroyed. This embarrassing moment in the life of a great mathematician was virtually forgotten until it was brought to light by the historian Barrow-Green in 1994 [1].

Poincaré is still a popular icon in France. Here is the Poincaré cafe in Paris.
A crater on the Moon is named after Poincaré.

Chaos in the Poincaré Return Map

Despite the fact that his conclusions on the stability of the 3-body problem flipped, Poincaré’s new tools for analyzing dynamical systems earned him the prize. He did not stop at his modified prize submission but continued working on systematizing his methods, publishing New Methods in Celestial Mechanics in several volumes through the 1890’s. It was here that he fully explored what happens when a trajectory approaches a saddle point of dynamical equilibrium.

The third volume of a three-book series that grew from Poincaré’s award-winning paper

To visualize a periodic trajectory, Poincaré invented a mathematical tool called a “first-return map”, also known as a Poincaré section. It was a way of taking a higher dimensional continuous trajectory and turning it into a simple iterated discrete map. Therefore, one did not need to solve continuous differential equations, it was enough to just iterate the map. In this way, complicated periodic, or nearly periodic, behavior could be explored numerically. However, even armed with this weapon, Poincaré found that iterated maps became unstable as a trajectory that originated from a saddle point approached another equivalent saddle point. Because the dynamics are periodic, the outgoing and incoming trajectories are opposite ends of the same trajectory, repeated with 2-pi periodicity. Therefore, the saddle point is also called a homoclinic point, meaning that trajectories in the discrete map intersect with themselves. (If two different trajectories in the map intersect, that is called a heteroclinic point.) When Poincaré calculated the iterations around the homoclinic point, he discovered a wild and complicated pattern in which a trajectory intersected itself many times. Poincaré wrote:

[I]f one seeks to visualize the pattern formed by these two curves and their infinite number of intersections … these intersections form a kind of lattice work, a weave, a chain-link network of infinitely fine mesh; each of the two curves can never cross itself, but it must fold back on itself in a very complicated way so as to recross all the chain-links an infinite number of times .… One will be struck by the complexity of this figure, which I am not even attempting to draw. Nothing can give us a better idea of the intricacy of the three-body problem, and of all the problems of dynamics in general…

Poincaré’s first view of chaos.

This was the discovery of chaos! Today we call this “lattice work” the “homoclinic tangle”. He could not draw it with the tools of his day … but we can!

Chirikov’s Standard Map

The restricted 3-body problem is a bit more complicated than is needed to illustrate Poincaré’s homoclinic tangle. A much simpler model is a discrete map called Chirikov’s Map or the Standard Map. It describes the Poincaré section of a periodically kicked oscillator that rotates or oscillates in the angular direction with an angular momentm J. The map has the simple form

in which the angular momentum in updated first, and then the angle variable is updated with the new angular momentum. When plotted on the (θ,J) plane, the standard map produces a beautiful kaleidograph of intertwined trajectories piercing the Poincaré plane, as shown in the figure below. The small points or dots are successive intersections of the higher-dimensional trajectory intersecting a plane. It is possible to trace successive points by starting very close to a saddle point (on the left) and connecting successive iterates with lines. These lines merge into the black trace in the figure that emerges along the unstable manifold of the saddle point on the left and approaches the saddle point on the right generally along the stable manifold.

Fig. Standard map for K = 0.97 at the transition to full chaos. The dark line is the trajectory of the unstable manifold emerging from the saddle point at (p,0). Note the wild oscillations as it approaches the saddle point at (3pi,0).

However, as the successive iterates approach the new saddle (which is really just the old saddle point because of periodicity) it crosses the stable manifold again and again, in ever wilder swings that diverge as it approaches the saddle point. This is just one trace. By calculating traces along all four stable and unstable manifolds and carrying them through to the saddle, a lattice work, or homoclinic tangle emerges.

Two of those traces originate from the stable manifolds, so to calculate their contributions to the homoclinic tangle, one must run these traces backwards in time using the inverse Chirikov map. This is

The four traces all intertwine at the saddle point in the figure below with a zoom in on the tangle in the next figure. This is the lattice work that Poincaré glimpsed in 1889 as he worked feverishly to correct the manuscript that won him the prize that established him as one of the preeminent mathematicians of Europe.

Fig. The homoclinic tangle caused by the folding of phase space trajectories as stable and unstable manifolds criss-cross in the Poincare map at the saddle point. This was the figure that Poincaré could not attempt to draw because of its complexity.
Fig. A zoom-in of the homoclinic tangle at the saddle point as the stable and unstable manifolds create a lattice of intersections. This is the fundamental origin of chaos and the sensitivity to initial conditions (SIC) that make forecasting almost impossible in chaotic systems.

Python Code:

#!/usr/bin/env python3
# -*- coding: utf-8 -*-
Created on Sun Aug  2  2020
"Introduction to Modern Dynamics" 2nd Edition (Oxford, 2019)
@author: nolte

import numpy as np
from matplotlib import pyplot as plt
from numpy import linalg as LA


eps = 0.97



for eloop in range(0,100):

    rlast = 2*np.pi*(0.5-np.random.random())
    thlast = 4*np.pi*np.random.random()
    rplot = np.zeros(shape=(200,))
    thetaplot = np.zeros(shape=(200,))
    for loop in range(0,200):
        rnew = rlast + eps*np.sin(thlast)
        thnew = np.mod(thlast+rnew,4*np.pi)
        thetaplot[loop] = np.mod(thnew-np.pi,4*np.pi)     
        rtemp = np.mod(rnew + np.pi,2*np.pi)
        rplot[loop] = rtemp - np.pi
        rlast = rnew
        thlast = thnew

K = eps
eps0 = 5e-7

J = [[1,1+K],[1,1]]
w, v = LA.eig(J)

My = w[0]
Vu = v[:,0]     # unstable manifold
Vs = v[:,1]     # stable manifold

# Plot the unstable manifold
Hr = np.zeros(shape=(100,150))
Ht = np.zeros(shape=(100,150))
for eloop in range(0,100):
    eps = eps0*eloop

    roldu1 = eps*Vu[0]
    thetoldu1 = eps*Vu[1]
    Nloop = np.ceil(-6*np.log(eps0)/np.log(eloop+2))
    flag = 1
    cnt = 0
    while flag==1 and cnt < Nloop:
        ru1 = roldu1 + K*np.sin(thetoldu1)
        thetau1 = thetoldu1 + ru1
        roldu1 = ru1
        thetoldu1 = thetau1
        if thetau1 > 4*np.pi:
            flag = 0
        Hr[eloop,cnt] = roldu1
        Ht[eloop,cnt] = thetoldu1 + 3*np.pi
        cnt = cnt+1
x = Ht[0:99,12] - 2*np.pi
x2 = 6*np.pi - x
y = Hr[0:99,12]
y2 = -y
plt.plot(x,y,linewidth =0.75)
plt.plot(x2,y2,linewidth =0.75)

del x,y
x = Ht[5:39,15] - 2*np.pi
x2 = 6*np.pi - x
y = Hr[5:39,15]
y2 = -y
plt.plot(x,y,linewidth =0.75)
plt.plot(x2,y2,linewidth =0.75)

del x,y
x = Ht[12:69,16] - 2*np.pi
x2 = 6*np.pi - x
y = Hr[12:69,16]
y2 = -y
plt.plot(x,y,linewidth =0.75)
plt.plot(x2,y2,linewidth =0.75)

del x,y
x = Ht[15:89,17] - 2*np.pi
x2 = 6*np.pi - x
y = Hr[15:89,17]
y2 = -y
plt.plot(x,y,linewidth =0.75)
plt.plot(x2,y2,linewidth =0.75)

del x,y
x = Ht[30:99,18] - 2*np.pi
x2 = 6*np.pi - x
y = Hr[30:99,18]
y2 = -y
plt.plot(x,y,linewidth =0.75)
plt.plot(x2,y2,linewidth =0.75)

# Plot the stable manifold
del Hr, Ht
Hr = np.zeros(shape=(100,150))
Ht = np.zeros(shape=(100,150))
#eps0 = 0.03
for eloop in range(0,100):
    eps = eps0*eloop

    roldu1 = eps*Vs[0]
    thetoldu1 = eps*Vs[1]
    Nloop = np.ceil(-6*np.log(eps0)/np.log(eloop+2))
    flag = 1
    cnt = 0
    while flag==1 and cnt < Nloop:
        thetau1 = thetoldu1 - roldu1
        ru1 = roldu1 - K*np.sin(thetau1)

        roldu1 = ru1
        thetoldu1 = thetau1
        if thetau1 > 4*np.pi:
            flag = 0
        Hr[eloop,cnt] = roldu1
        Ht[eloop,cnt] = thetoldu1
        cnt = cnt+1
x = Ht[0:79,12] + np.pi
x2 = 6*np.pi - x
y = Hr[0:79,12]
y2 = -y
plt.plot(x,y,linewidth =0.75)
plt.plot(x2,y2,linewidth =0.75)

del x,y
x = Ht[4:39,15] + np.pi
x2 = 6*np.pi - x
y = Hr[4:39,15]
y2 = -y
plt.plot(x,y,linewidth =0.75)
plt.plot(x2,y2,linewidth =0.75)

del x,y
x = Ht[12:69,16] + np.pi
x2 =  6*np.pi - x
y = Hr[12:69,16]
y2 = -y
plt.plot(x,y,linewidth =0.75)
plt.plot(x2,y2,linewidth =0.75)

del x,y
x = Ht[15:89,17] + np.pi
x2 =  6*np.pi - x
y = Hr[15:89,17]
y2 = -y
plt.plot(x,y,linewidth =0.75)
plt.plot(x2,y2,linewidth =0.75)

del x,y
x = Ht[30:99,18] + np.pi
x2 =  6*np.pi - x
y = Hr[30:99,18]
y2 = -y
plt.plot(x,y,linewidth =0.75)
plt.plot(x2,y2,linewidth =0.75)


[1] D. D. Nolte, “The tangled tale of phase space,” Physics Today, vol. 63, no. 4, pp. 33-38, Apr (2010)

[2] M. C. Gutzwiller, “Moon-Earth-Sun: The oldest three-body problem,” Reviews of Modern Physics, vol. 70, no. 2, pp. 589-639, Apr (1998)

[3] Barrow-Green J. Oscar II’s Prize Competition and the Error in Poindare’s Memoir on the Three Body Problem. Arch Hist Exact Sci 48: 107-131, 1994.

[4] Barrow-Green J. Poincaré and the three body problem. London Mathematical Society, 1997.


[6] Poincaré H and Goroff DL. New methods of celestial mechanics … Edited and introduced by Daniel L. Goroff. New York: American Institute of Physics, 1993.

Brook Taylor’s Infinite Series

When Leibniz claimed in 1704, in a published article in Acta Eruditorum, to have invented the differential calculus in 1684 prior to anyone else, the British mathematicians rushed to Newton’s defense. They knew Newton had developed his fluxions as early as 1666 and certainly no later than 1676. Thus ensued one of the most bitter and partisan priority disputes in the history of math and science that pitted the continental Leibnizians against the insular Newtonians. Although a (partisan) committee of the Royal Society investigated the case and found in favor of Newton, the affair had the effect of insulating British mathematics from Continental mathematics, creating an intellectual desert as the forefront of mathematical analysis shifted to France. Only when George Green filled his empty hours with the latest advances in French analysis, as he tended his father’s grist mill, did British mathematics wake up. Green self-published his epic work in 1828 that introduced what is today called Green’s Theorem.

Yet the period from 1700 to 1828 was not a complete void for British mathematics. A few points of light shone out in the darkness, Thomas Simpson, Collin Maclaurin, Abraham de Moivre, and Brook Taylor (1685 – 1731) who came from an English family that had been elevated to minor nobility by an act of Cromwell during the English Civil War.

Growing up in Bifrons House


View of Bifrons House from sometime in the late-1600’s showing the Jacobean mansion and the extensive south gardens.

When Brook Taylor was ten years old, his father bought Bifrons House [1], one of the great English country houses, located in the county of Kent just a mile south of Canterbury.  English country houses were major cultural centers and sources of employment for 300 years from the seventeenth century through the early 20th century. While usually being the country homes of nobility of all levels, from Barons to Dukes, sometimes they were owned by wealthy families or by representatives in Parliament, which was the case for the Taylors. Bifrons House had been built around 1610 in the Jacobean architectural style that was popular during the reign of James I.  The house had a stately front façade, with cupola-topped square towers, gable ends to the roof, porches of a renaissance form, and extensive manicured gardens on the south side.  Bifrons House remained the seat of the Taylor family until 1824 when they moved to a larger house nearby and let Bifrons first to a Marquess and then in 1828 to Lady Byron (ex-wife of Lord Byron) and her daughter Ada Lovelace (the mathematician famous for her contributions to early computer science). The Taylor’s sold the house in 1830 to the first Marquess Conyngham.

Taylor’s life growing up in the rarified environment of Bifrons House must have been like scenes out of the popular BBC TV drama Downton Abbey.  The house had a large staff of servants and large grounds at the edge of a large park near the town of Patrixbourne. Life as the heir to the estate would have been filled with social events and fine arts that included music and painting. Taylor developed a life-long love of music during his childhood, later collaborating with Isaac Newton on a scientific investigation of music (it was never published). He was also an amateur artist, and one of the first books he published after being elected to the Royal Society was on the mathematics of linear perspective, which contained some of the early results of projective geometry.

There is a beautiful family portrait in the National Portrait Gallery in London painted by John Closterman around 1696. The portrait is of the children of John Taylor about a year after he purchased Bifrons House. The painting is notable because Brook, the heir to the family fortunes, is being crowned with a wreath by his two older sisters (who would not inherit). Brook was only about 11 years old at the time and was already famous within his family for his ability with music and numbers.

Portrait of the children of John Taylor around 1696. Brook Taylor is the boy being crowned by his sisters on the left. (National Portrait Gallery)

Taylor never had to go to school, being completely tutored at home until he entered St. John’s College, Cambridge, in 1701.  He took mathematics classes from Machin and Keill and graduated in 1709.  The allowance from his father was sufficient to allow him to lead the life of a gentleman scholar, and he was elected a member of the Royal Society in 1712 and elected secretary of the Society just two years later.  During the following years he was active as a rising mathematician until 1721 when he married a woman of a good family but of no wealth.  The support of a house like Bifrons always took money, and the new wife’s lack of it was enough for Taylor’s father to throw the new couple out.  Unfortunately, his wife died in childbirth along with the child, so Taylor returned home in 1723.  These family troubles ended his main years of productivity as a mathematician.

Portrait of Brook Taylor

Methodus incrementorum directa et inversa

Under the eye of the Newtonian mathematician Keill at Cambridge, Taylor became a staunch supporter and user of Newton’s fluxions. Just after he was elected as a member of the Royal Society in 1712, he participated in an investigation of the priority for the invention of the calculus that pitted the British Newtonians against the Continental Leibnizians. The Royal Society found in favor of Newton (obviously) and raised the possibility that Leibniz learned of Newton’s ideas during a visit to England just a few years before Leibniz developed his own version of the differential calculus.

A re-evaluation of the priority dispute from today’s perspective attributes the calculus to both men. Newton clearly developed it first, but did not publish until much later. Leibniz published first and generated the excitement for the new method that dispersed its use widely. He also took an alternative route to the differential calculus that is demonstrably different than Newton’s. Did Leibniz benefit from possibly knowing Newton’s results (but not his methods)? Probably. But that is how science is supposed to work … building on the results of others while bringing new perspectives. Leibniz’ methods and his notations were superior to Newton’s, and the calculus we use today is closer to Leibniz’ version than to Newton’s.

Once Taylor was introduced to Newton’s fluxions, he latched on and helped push its development. The same year (1715) that he published a book on linear perspective for art, he also published a ground-breaking book on the use of the calculus to solve practical problems. This book, Methodus incrementorum directa et inversa, introduced several new ideas, including finite difference methods (which are used routinely today in numerical simulations of differential equations). It also considered possible solutions to the equation for a vibrating string for the first time.

The vibrating string is one of the simplest problem in “continuum mechanics”, but it posed a severe challenge to Newtonian physics of point particles. It was only much later that D’Alembert used Newton’s first law of action-reaction to eliminate internal forces to derive D’Alembert’s principle on the net force on an extended body. Yet Taylor used finite differences to treat the line mass of the string in a way that yielded a possible solution of a sine function. Taylor was the first to propose that a sine function was the form of the string displacement during vibration. This idea would be taken up later by D’Alembert (who first derived the wave equation), and by Euler (who vehemently disagreed with D’Alembert’s solutions) and Daniel Bernoulli (who was the first to suggest that it is not just a single sine function, but a sum of sine functions, that described the string’s motion — the principle of superposition).

Of course, the most influential idea in Taylor’s 1715 book was his general use of an infinite series to describe a curve.

Taylor’s Series

Infinite series became a major new tool in the toolbox of analysis with the publication of John WallisArithmetica Infinitorum published in 1656. Shortly afterwards many series were published such as Nikolaus Mercator‘s series (1668)

and James Gregory‘s series (1668)

And of course Isaac Newton’s generalized binomial theorem that he worked out famously during the plague years of 1665-1666

But these consisted mainly of special cases that had been worked out one by one. What was missing was a general method that could yield a series expression for any curve.

Taylor used concepts of finite differences as well as infinitesimals to derive his formula for expanding a function as a power series around any point. His derivation in Methodus incrementorum directa et inversa is not easily recognized today. Using difference tables, and ideas from Newton’s fluxions that viewed functions as curves traced out as a function of time, he arrived at the somewhat cryptic expression

where the “dots” are time derivatives, x stands for the ordinate (the function), v is a finite difference, and z is the abcissa moving with constant speed. If the abcissa moves with unit speed, then this becomes Taylor’s Series (in modern notation)

The term “Taylor’s series” was probably first used by L’Huillier in 1786, although Condorcet attributed the equation to both Taylor and d’Alembert in 1784. It was Lagrange in 1797 who immortalized Taylor by claiming that Taylor’s theorem was the foundation of analysis.

Example: sin(x)

Expand sin(x) around x = π

This is related to the expansion around x = 0 (also known as a Maclaurin series)

Example: arctan(x)

To get an feel for how to apply Taylor’s theorem to a function like arctan, begin with

and take the derivative of both sides

Rewrite this as

and substitute the expression for y

and integrate term by term to arrive at

This is James Gregory’s famous series. Although the math here is modern and only takes a few lines, it parallel’s Gregory’s approach. But Gregory had to invent aspects of calculus as he went along — his derivation covering many dense pages. In the priority dispute between Leibniz and Newton, Gregory is usually overlooked as an independent inventor of many aspects of the calculus. This is partly because Gregory acknowledged that Newton had invented it first, and he delayed publishing to give Newton priority.

Two-Dimensional Taylor’s Series

The ideas behind the Taylor’s series generalizes to any number of dimensions. For a scalar function of two variables it takes the form (out to second order)

where J is the Jacobian matrix (vector) and H is the Hessian matrix defined for the scalar function as


As a concrete example, consider the two-dimensional Gaussian function

The Jacobean and Hessian matrices are

which are the first- and second-order coefficients of the Taylor series.


[1] “A History of Bifrons House”, B. M. Thomas, Kent Archeological Society (2017)

[2] L. Feigenbaum, “TAYLOR,BROOK AND THE METHOD OF INCREMENTS,” Archive for History of Exact Sciences, vol. 34, no. 1-2, pp. 1-140, (1985)

[3] A. Malet, “GREGORIE, JAMES ON TANGENTS AND THE TAYLOR RULE FOR SERIES EXPANSIONS,” Archive for History of Exact Sciences, vol. 46, no. 2, pp. 97-137, (1993)

[4] E. Harier and G. Wanner, Analysis by its History (Springer, 1996)

Painting of Bifrons Park by Jan Wyck c. 1700

Physics in the Age of Contagion: Part 4. Fifty Shades of Immunity to COVID-19

This is the fourth installment in a series of blogs on the population dynamics of COVID-19. In my first blog I looked at a bifurcation physics model that held the possibility (and hope) that with sufficient preventive action the pandemic could have died out and spared millions. That hope was in vain.

What will it be like to live with COVID-19 as a constant factor of modern life for years to come?

In my second blog I looked at a two-component population dynamics model that showed the importance of locking down and not emerging too soon. It predicted that waiting only a few extra weeks before opening could have saved tens of thousands of lives. Unfortunately, because states like Texas and Florida opened too soon and refused to mandate the wearing of masks, thousands of lives were lost.

In my third blog I looked at a network physics model that showed the importance of rapid testing and contact tracing to remove infected individuals to push the infection rate low — not only to flatten the curve, but to drive it down. While most of the developed world is succeeding in achieving this, the United States is not.

In this fourth blog, I am looking at a simple mean-field model that shows what it will be like to live with COVID-19 as a constant factor of modern life for years to come. This is what will happen if the period of immunity to the disease is short and people who recover from the disease can get it again. Then the disease will never go away and the world will need to learn to deal with it. How different that world will look from the one we had just a year ago will depend on the degree of immunity that is acquired after infection, how long a vaccine will provide protection before booster shots are needed, and how many people will get the vaccine or will refus.


COVID-19 is a SARS corona virus known as SARS-CoV-2. SARS stands for Severe Acute Respiratory Syndrome. There is a simple and well-established mean-field model for an infectious disease like SARS that infects individuals, from which they recover, but after some lag period they become susceptible again. This is called the SIRS model, standing for Susceptible-Infected-Recovered-Susceptible. This model is similar to the SIS model of my first blog, but it now includes a mean lifetime for the acquired immunity, after an individual recovers from the infection and then becomes susceptible again. The bifurcation threshold is the same for the SIRS model as the SIS model, but with SIRS there is a constant susceptible population. The mathematical flow equations for this model are

where i is the infected fraction, r is the recovered fraction, and 1 – i – r = s is the susceptible fraction. The infection rate for an individual who has k contacts is βk. The recovery rate is μ and the mean lifetime of acquired immunity after recovery is τlife = 1/ν. This model assumes that all individuals are equivalent (no predispositions) and there is no vaccine–only natural immunity that fades in time after recovery.

The population trajectories for this model are shown in Fig. 1. The figure on the left is a 3-simplex where every point in the triangle stands for a 3-tuple (i, r, s). Our own trajectory starts at the right bottom vertex and generates the green trajectory that spirals into the fixed point. The parameters are chosen to be roughly equivalent to what is known about the virus (but with big uncertainties in the model parameters). One of the key results is that the infection will oscillate over several years, settling into a steady state after about 4 years. Thereafter, there is a steady 3% infected population with 67% of the population susceptible and 30% recovered. The decay time for the immunity is assumed to be one year in this model. Note that the first peak in the infected numbers will be about 1 year, or around March 2021. There is a second smaller peak (the graph on the right is on a vertical log scale) at about 4 years, or sometime in 2024.

Fig. 1 SIRS model for COVID-19 in which immunity acquired after recovery fades in time so an individual can be infected again. If immunity fades and there is never a vaccine, a person will have an 80% chance of getting the virus at least twice in their lifetime, and COVID will become the third highest cause of death in the US after heart disease and cancer.

Although the recovered fraction is around 30% for these parameters, it is important to understand that this is a dynamic equilibrium. If there is no vaccine, then any individual who was once infected can be infected again after about a year. So if they don’t get the disease in the first year, they still have about a 4% chance to get it every following year. In 50 years, a 20-year-old today would have almost a 90% chance of having been infected at least once and an 80% chance of having gotten it at least twice. In other words, if there is never a vaccine, and if immunity fades after each recovery, then almost everyone will eventually get the disease several times in their lifetime. Furthermore, COVID will become the third most likely cause of death in the US after heart disease (first) and cancer (second). The sad part of this story is that it all could have been avoided if the government leaders of several key nations, along with their populations, had behaved responsibly.

The Asymmetry of Personal Cost under COVID

The nightly news in the US during the summer of 2020 shows endless videos of large parties, dense with people, mostly young, wearing no masks. This is actually understandable even though regrettable. It is because of the asymmetry of personal cost. Here is what that means …

On any given day, an individual who goes out and about in the US has only about a 0.01 percent chance of contracting the virus. In the entire year, there is only about a 3% chance that that individual will get the disease. And even if they get the virus, they only have a 2% chance of dying. So the actual danger per day per person is so minuscule that it is hard to understand why it is so necessary to wear a mask and socially distance. Therefore, if you go out and don’t wear a mask, almost nothing bad will happen to YOU. So why not? Why not screw the masks and just go out!

And this is why that’s such a bad idea: because if no-one wears a mask, then tens or hundreds of thousands of OTHERS will die.

This is the asymmetry of personal cost. By ignoring distancing, nothing is likely to happen to YOU, but thousands of OTHERS will die. How much of your own comfort are you willing to give up to save others? That is the existential question.

This year is the 75th anniversary of the end of WW II. During the war everyone rationed and recycled, not because they needed it for themselves, but because it was needed for the war effort. Almost no one hesitated back then. It was the right thing to do even though it cost personal comfort. There was a sense of community spirit and doing what was good for the country. Where is that spirit today? The COVID-19 pandemic is a war just as deadly as any war since WW II. There is a community need to battle it. All anyone has to do is wear a mask and behave responsibly. Is this such a high personal cost?

The Vaccine

All of this can change if a reliable vaccine can be developed. There is no guarantee that this can be done. For instance, there has never been a reliable vaccine for the common cold. A more sobering thought is to realize is that there has never been a vaccine for the closely related virus SARS-CoV-1 that broke out in 2003 in China but was less infectious. But the need is greater now, so there is reason for optimism that a vaccine can be developed that elicits the production of antibodies with a mean lifetime at least as long as for naturally-acquired immunity.

The SIRS model has the same bifurcation threshold as the SIS model that was discussed in a previous blog. If the infection rate can be made slower than the recovery rate, then the pandemic can be eliminated entirely. The threshold is

The parameter μ, the recovery rate, is intrinsic and cannot be changed. The parameter β, the infection rate per contact, can be reduced by personal hygiene and wearing masks. The parameter <k>, the average number of contacts to a susceptible person, can be significantly reduced by vaccinating a large fraction of the population.

To simulate the effect of vaccination, the average <k> per person can be reduced at the time of vaccination. This lowers the average infection rate. The results are shown in Fig. 2 for the original dynamics, a vaccination of 20% of the populace, and a vaccination of 40% of the populace. For 20% vaccination, the epidemic is still above threshold, although the long-time infection is lower. For 40% of the population vaccinated, the disease falls below threshold and would decay away and vanish.

Fig. 2 Vaccination at 52 weeks can lower the infection cases (20% vaccinated) or eliminate them entirely (40% vaccinated). The vaccinations would need booster shots every year (if the decay time of immunity is one year).

In this model, the vaccination is assumed to decay at the same rate as naturally acquired immunity (one year), so booster shots would be needed every year. Getting 40% of the population to get vaccinated may be achieved. Roughly that fraction get yearly flu shots in the US, so the COVID vaccine could be added to the list. But at 40% it would still be necessary for everyone to wear face masks and socially distance until the pandemic fades away. Interestingly, if the 40% got vaccinated all on the same date (across the world), then the pandemic would be gone in a few months. Unfortunately, that’s unrealistic, so with a world-wide push to get 40% of the World’s population vaccinated within five years, it would take that long to eliminate the disease, taking us to 2025 before we could go back to the way things were in November of 2019. But that would take a world-wide vaccination blitz the likes of which the world has never seen.

Python Code:

#!/usr/bin/env python3
# -*- coding: utf-8 -*-
Created on Fri July 17 2020
D. D. Nolte, "Introduction to Modern Dynamics: 
    Chaos, Networks, Space and Time, 2nd Edition (Oxford University Press, 2019)
@author: nolte

import numpy as np
from scipy import integrate
from matplotlib import pyplot as plt


def tripartite(x,y,z):

    sm = x + y + z
    xp = x/sm
    yp = y/sm
    f = np.sqrt(3)/2
    y0 = f*xp
    x0 = -0.5*xp - yp + 1;
    lines = plt.plot(x0,y0)
    plt.setp(lines, linewidth=0.5)
    plt.plot([0, 1],[0, 0],'k',linewidth=1)
    plt.plot([0, 0.5],[0, f],'k',linewidth=1)
    plt.plot([1, 0.5],[0, f],'k',linewidth=1)
print(' ')

def solve_flow(param,max_time=1000.0):

    def flow_deriv(x_y,tspan,mu,betap,nu):
        x, y = x_y
        return [-mu*x + betap*x*(1-x-y),mu*x-nu*y]
    x0 = [del1, del2]
    # Solve for the trajectories
    t = np.linspace(0, int(tlim), int(250*tlim))
    x_t = integrate.odeint(flow_deriv, x0, t, param)

    return t, x_t

 # rates per week
betap = 0.3;   # infection rate
mu = 0.2;      # recovery rate
nu = 0.02      # immunity decay rate

print('beta = ',betap)
print('mu = ',mu)
print('nu =',nu)
print('betap/mu = ',betap/mu)
del1 = 0.005         # initial infected
del2 = 0.005         # recovered

tlim = 600          # weeks (about 12 years)

param = (mu, betap, nu)    # flow parameters

t, y = solve_flow(param)
I = y[:,0]
R = y[:,1]
S = 1 - I - R

lines = plt.semilogy(t,I,t,S,t,R)
plt.setp(lines, linewidth=0.5)
plt.ylabel('Fraction of Population')
plt.title('Population Dynamics for COVID-19')

for xloop in range(0,10):
    del1 = xloop/10.1 + 0.001
    del2 = 0.01

    tlim = 300
    param = (mu, betap, nu)    # flow parameters
    t, y = solve_flow(param)       
    I = y[:,0]
    R = y[:,1]
    S = 1 - I - R

for yloop in range(1,6):
    del1 = 0.001;
    del2 = yloop/10.1
    t, y = solve_flow(param)
    I = y[:,0]
    R = y[:,1]
    S = 1 - I - R
for loop in range(2,10):
    del1 = loop/10.1
    del2 = 1 - del1 - 0.01
    t, y = solve_flow(param)
    I = y[:,0]
    R = y[:,1]
    S = 1 - I - R
plt.title('Simplex Plot of COVID-19 Pop Dynamics')
vac = [1, 0.8, 0.6]
for loop in vac:
    # Run the epidemic to the first peak
    del1 = 0.005
    del2 = 0.005
    tlim = 52
    param = (mu, betap, nu)
    t1, y1 = solve_flow(param)
    # Now vaccinate a fraction of the population
    st = np.size(t1)
    del1 = y1[st-1,0]
    del2 = y1[st-1,1]
    tlim = 400
    param = (mu, loop*betap, nu)
    t2, y2 = solve_flow(param)
    t2 = t2 + t1[st-1]
    tc = np.concatenate((t1,t2))
    yc = np.concatenate((y1,y2))
    I = yc[:,0]
    R = yc[:,1]
    S = 1 - I - R
    lines = plt.semilogy(tc,I,tc,S,tc,R)
    plt.setp(lines, linewidth=0.5)
    plt.ylabel('Fraction of Population')
    plt.title('Vaccination at 1 Year')

Caveats and Disclaimers

No effort was made to match parameters to the actual properties of the COVID-19 pandemic. The SIRS model is extremely simplistic and can only show general trends because it homogenizes away all the important spatial heterogeneity of the disease across the cities and states of the country. If you live in a hot spot, this model says little about what you will experience locally. The decay of immunity is also a completely open question and the decay rate is completely unknown. It is easy to modify the Python program to explore the effects of differing decay rates and vaccination fractions. The model also can be viewed as a “compartment” to model local variations in parameters.

Johann Bernoulli’s Brachistochrone

Johann Bernoulli was an acknowledged genius–and he acknowledged it of himself.  Some flavor of his character can be seen in his opening lines of one of the most famous challenges in the history of mathematics—the statement of the Brachistrochrone Challenge.

“I, Johann Bernoulli, address the most brilliant mathematicians in the world. Nothing is more attractive to intelligent people than an honest, challenging problem, whose possible solution will bestow fame and remain as a lasting monument.”

Of course, he meant his own fame, because he thought he already had a solution to the problem he posed to the mathematical community of the day.

The Problem of Fastest Descent

The problem posed by Johann Bernoulli was the brachistochrone (Gk: brachis + chronos) or the path of fastest descent. 

Galileo had attempted to tackle this problem in his Two New Sciences and had concluded, based on geometric arguments, that the solution was a circular path.  Yet he hedged—he confessed that he had reservations about this conclusion and suggested that a “higher mathematics” would possibly find a better solution. In fact he was right.

Fig. 1  Galileo considered a mass falling along different chords of a circle starting at A.  He proved that the path along ABG was quicker than along AG, and ABCG was quicker than ABG, and ABCDG was quicker than ABCG, etc.  In this way he showed that the path along the circular arc was quicker than any set of chords.  From this he inferred that the circle was the path of quickest descent—but he held out reservations, and rightly so.

In 1659, when Christiaan Huygens was immersed in the physics of pendula and time keeping, he was possibly the first mathematician to recognize that a perfect harmonic oscillator, one whose restoring force was linear in the displacement of the oscillator, would produce the perfect time piece.  Unfortunately, the pendulum, proposed by Galileo, was the simplest oscillator to construct, but Huygens already knew that it was not a perfect harmonic oscillator.  The period of oscillation became smaller when the amplitude of the oscillation became larger.  In order to “fix” the pendulum, he searched for a curve of equal time, called the tautochrone, that would allow all amplitudes of the pendulum to have the same period.  He found the solution and recognized it to be a cycloid arc. 

On a cycloid whose axis is erected on the perpendicular and whose vertex is located at the bottom, the times of descent, in which a body arrives at the lowest point at the vertex after having departed from any point on the cycloid, are equal to each other…

His derivation filled 16 pages with geometric arguments, which was not a very efficient way to derive the thing.

Almost thirty years later, during the infancy of the infinitesimal calculus, the tautochrone was held up as a master example of an “optimal” solution whose derivation should yield to the much more powerful and elegant methods of the calculus.  Jakob Bernoulli, Johann’s brother, succeeded in deriving the tautochrone in 1690 using the calculus, using the term “integral” for the first time in print, but it was not at first clear what other problems could yield in a similar way.

Then, in 1696, Johann Bernoulli posed the brachistrochrone problem in the pages of Acta Eruditorum.

Fig. 2 The shortest-time route from A to B, relying only on gravity, is the cycloid, compared to the parabola, circle and linear paths. Johann and Jakob Bernoulli, brothers, competed to find the best solution.

Acta Eruditorum

The Acta Eruditorum was the German answer to the Proceedings of the Royal Society of London.  It began publishing in Leipzig in 1682 under the editor Otto Mencke.  Although Mencke was the originator, launching and supporting the journal became the obsession of Gottfried Lebiniz, who felt he was a hostage in the backwaters of Hanover Germany but who yearned for a place on the world stage (i.e. Paris or London).  By launching the continental publication, the Continental scientists had a freer voice without needing to please the gate keepers at the Royal Society.  And by launching a German journal, it gave German scientists like Leibniz (and the Bernoullis and Euler, and von Tschirnhaus among others) a freer voice without censor by the Journal des Savants of Paris.

Fig. 3 Acta Eruditorum of 1684 containing one of Leibniz’ early papers on the calculus.

The Acta Eruditorum was almost a vanity press for Leibniz.  He published 13 papers in the journal in its first 4 years of activity starting in 1682.  In return, when Leibniz became embroiled in the priority dispute with Newton over the invention of the calculus, the Acta provided loyal support for Leibniz’ side just as the Proceedings of the Royal Society gave loyal support to Newton.  In fact, the trigger that launched the nasty battle with Newton was a review that Leibniz wrote for the Acta in 17?? [Ref] in which he presented himself as the primary inventor of the calculus.  When he failed to give due credit, not only to Newton, but also to lesser contributors, they fought back by claiming that Leibniz had stolen the idea from Newton.  Although a kangaroo court by the Royal Society found in favor of Newton, posterity gives most of the credit for the development and dissemination of the calculus to Leibniz.  Where Newton guarded his advances jealously and would not explain his approach, Leibniz freely published his methods for all to see and to learn and to try out for themselves.  In this open process, the Acta was the primary medium of communication and gets the credit for being the conduit by which the calculus was presented to the world.

Although the Acta Eruditorum only operated for 100 years, it stands out as the most important publication for the development of the calculus.  Leibnitz published in the Acta a progressive set of papers that outlined his method for the calculus.  More importantly, his papers elicited responses from other mathematicians, most notably Johann Bernoulli and von Tschirnhaus and L’Hopital, who helped to refine the methods and advance the art.  The Acta became a collaborative space for this team of mathematicians as they fine-tuned the methods as well as the notations for the calculus, most of which stand to this day.  In contrast, Newton’s notations have all but faded, save the simple “dot” notation over variables to denote them as time derivatives (his fluxions).  Therefore, for most of continental Europe, the Acta Eruditorum was the place to publish, and it was here that Johann Bernoulli published his famous challenge of the brachistochrone.

The Competition

Johann suggested the problem in the June 1696 Acta Eruditorum

Following the example set by Pascal, Fermat, etc., I hope to gain the gratitude of the whole scientific community by placing before the finest mathematicians of our time a problem which will test their methods and the strength of their intellect. If someone communicates to me the solution of the proposed problem, I shall publicly declare him worthy of praise

Given two points A and B in a vertical plane, what is the curve traced out by a point acted on only by gravity, which starts at A and reaches B in the shortest time

The competition was originally proposed for 6 months, but it was then extended to a year and a half.  Johann published his results about a year later, but not without controversy.  Johann had known that his brother Jakob was also working on the problem, but he incorrectly thought that Jakob was convinced that Galileo had been right, so Johann described his approach to Jakob thinking he had little to fear in the competition.  Johann didn’t know that Jakob had already taken an approach similar to Johann’s, and even more importantly, Jakob had done the math correctly.  When Jakob showed Johann his mistake, he also ill-advisedly showed him the correct derivation.  Johann sent off a manuscript to Acta with the correct derivation that he had learned from Jakob.

Within the year and a half there were 4 additional solutions—all correct—using different approaches.  One of the most famous responses was by Newton (who as usual did not give up his method) but who is reported to have solved the problem in a day.  Others who contributed solutions were Gottfried Leibniz, Ehrenfried Walther von Tschirnhaus, and Guillaume de l’Hôpital’s.  Of course, Jakob sent in his own solution, although it overlapped with the one Johann had already published.

The Solution of Jakob and Johann Bernoulli

The stroke of genius of Jakob and Johann Bernoulli, accomplished in 1697 only about 20 years after the invention of the calculus, was to recognize an amazing analogy between mechanics and light.  Their insight foreshadowed Lagrange by a hundred years and William Rowan Hamilton by a hundred and fifty.  They did this by recognizing that the path of a light beam, just like the trajectory of a particle, conserves certain properties.  In the case of Fermat’s principle, a light ray refracts to take the path of least time between two points.  The insight of the Bernoulli’s is that a mechanical particle would behave in exactly the same way.  Therefore, the brachistrochrone can be obtained by considering the path that a light beam would take if the light ray were propagating through a medium with non-uniform refractive index to that the speed of light varies with height y as

Fermat’s principle of least time, which is consistent with Snell’s Law at interfaces, imposes the constraint on the path

This equation for a light ray propagating through a non-uniform medium would later become known as the Eikonal Equation.  The conserved quantity along this path is the value 1/vm.  Rewriting the Eikonal equation as

it can be solved for the differential equation

which those in the know (as certainly the Bernoullis were) would know is the equation of a cycloid.  If the sliding bead is on a wire shaped like a cycloid, there must be a lowest point for which the speed is a maximum.  For the cycloid curve of diameter D, this is

Therefore, the equation for the brachistochrone is

which is the differential equation for an inverted cycloid of diameter D.

Fig. 4 A light ray enters vertically on a medium whose refractive index varies as the square-root of depth.  The path of least time for the light ray to travel through the material is a cycloid—the same as for a massive particle traveling from point A to point B.

Calculus of Variations

Variational calculus had not quite been invented in time to solve the Brachistochrone, although the brachistochrone challenge helped motivate its eventual development by Euler and Lagrange later in the eighteenth century. Nonetheless, it is helpful to see the variational solution, which is the way we would solve this problem if it were a Lagrangian problem in advanced classical mechanics.

First, the total time taken by the sliding bead is defined as

Then we take energy conservation to solve for v(y)

The path element is

which leads to the expression for total time

It is the argument of the integral which is the quantity to be varied (the Lagrangian)

which can be inserted into the Lagrange equation

This has a simple first integral

This is explicitly solved

Once again, it helps to recognize the equation of a cycloid, because the last line can be solved as the parametric curves

which is the cycloid curve.


C. B. Boyer, The History of the Calculus and its Conceptual Development. New York: Dover, 1959.

J. Coopersmith, The lazy universe : an introduction to the principle of least action. Oxford University Press, 2017.

D. S. Lemons, Perfect Form: Variational Principles, Methods, and Applications in Elementary Physics. Princeton University Press, 1997.

Wikipedia: The Brachistrochrone Curve

W. Yourgrau, Variational principles in dynamics and quantum theory, 2d ed.. ed. New York: New York, Pitman Pub. Corp., 1960.