Climate Change Physics 101

When our son was ten years old, he came home from a town fair in Battle Ground, Indiana, with an unwanted pet—a goldfish in a plastic bag.  The family rushed out to buy a fish bowl and food and plopped the golden-red animal into the bowl.  In three days, it was dead!

It turns out that you can’t just put a goldfish in a fish bowl.  As the fish metabolizes its food and expels its waste, ammonia builds up to toxic levels unless you add filters or plants or treat the water with chemicals.  In the end, the goldfish died because it was asphyxiated by its own pee.

It’s a basic rule—don’t pee in your own fish bowl.

The same can be said for humans living on the surface of our planet.  Polluting the atmosphere with our wastes cannot be a good idea.  In the end it will kill us.  The atmosphere may look vast—the fish bowl was a big one—but it is shocking how thin it is.

Turn on your Apple TV, click on the screen saver, and you are skimming over our planet on the dark side of the Earth. Then you see a thin blue line extending over the limb of the dark disc.  Hold!  That thin blue line!  That is our atmosphere! Is it really so thin?

When you look upwards on a clear sunny day, the atmosphere seems like it goes on forever.  It doesn’t.  It is a thin veneer on the surface of the Earth barely one percent of the Earth’s radius.  The Earth’s atmosphere is frighteningly thin. 

Fig. 1  A thin veneer of atmosphere paints the surface of the Earth.  The radius of the Earth is 6371 km, and the thickness of the atmosphere is about 100 km, which is only about 1.5 percent of the radius.

Consider Mars.  It’s half the size of Earth, yet it cannot hold on to an atmosphere even 1/100th the thickness of ours.  When Mars first formed, it had an atmosphere not unlike our own, but through the eons its atmosphere has wafted away irretrievably into space.

An atmosphere is a precious fragile thing for a planet.  It gives life and it gives protection.  It separates us from the deathly cold of space, holding heat like a blanket.  That heat has served us well over the eons, allowing water to stay liquid and allowing life to arise on Earth.  But too much of a good thing is not a good thing.

Common Sense

If the fluid you are bathed in gives you life, then don’t mess with it.  Don’t run your car in the garage while you are working in it.  Don’t use a charcoal stove in an enclosed space.  Don’t dump carbon dioxide into the atmosphere because it also is an enclosed space.

At the end of winter, as the warm spring days get warmer, you take the winter blanket off your bed because blankets hold in heat.  The thicker the blanket, the more heat it holds in.  Common sense tells you to reduce the thickness of the blanket if you don’t want to get too warm.  Carbon dioxide in the atmosphere acts like a blanket.  If we don’t want the Earth to get too warm, then we need to limit the thickness of the blanket.

Without getting into the details of any climate change model, common sense already tells us what we should do.  Keep the atmosphere clean and stable (don’t pee in our fish bowl) and limit the amount of carbon dioxide we put into it (don’t let the blanket get too thick).

Some Atmospheric Facts

Here are some facts about the atmosphere, about the effect humans have on it, and about the climate:

Fact 1.  Humans have increased the amount of carbon dioxide in the atmosphere by 45% since 1850 (the beginning of the industrial age) and by 30% since just 1960.

Fact 2.  Carbon dioxide in the atmosphere prevents some of the heat absorbed from the Sun from re-radiating out to space.  More carbon dioxide stores more heat.

Fact 3.  Heat added to the Earth’s atmosphere increases its temperature.  This is a law of physics.

Fact 4.  The Earth’s average temperature has risen by 1.2 degrees Celsius since 1850 and 0.8 degrees of that has been just since 1960, so the effect is accelerating.

These facts are indisputable.  They hold true regardless of whether there is a Republican or a Democrat in the White House or in control of Congress.
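The percentages in Fact 1 are easy to check with a few lines of Python. The concentrations used here are rough numbers assumed for illustration (ice cores for 1850, the Keeling curve for 1960 and after), not official figures:

```python
# Quick check of Fact 1 against approximate CO2 concentrations in ppm.
# These values are assumptions for illustration: ~285 ppm in 1850 (ice cores),
# ~317 ppm in 1960 (early Keeling curve), ~415 ppm around 2020.
co2 = {1850: 285.0, 1960: 317.0, 2020: 415.0}

rise_since_1850 = 100*(co2[2020] - co2[1850])/co2[1850]
rise_since_1960 = 100*(co2[2020] - co2[1960])/co2[1960]

print(f"increase since 1850: {rise_since_1850:.0f}%")   # about 46%
print(f"increase since 1960: {rise_since_1960:.0f}%")   # about 31%
```

Both numbers land close to the 45% and 30% figures quoted in Fact 1.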

There is another interesting observation which is not so direct, but may be a harbinger of the distant future: The last time the Earth was 3 degrees Celsius warmer than it is today was during the Pliocene, when the sea level was tens of meters higher.  If that sea level were to occur today, all of Delaware, most of Florida, half of Louisiana and much of the east coast of the US would be under water, along with coastal cities such as Houston, Miami, New Orleans, Philadelphia and New York City.  There are many reasons why this may not be an immediate worry.  The distribution of water and ice now is different from what it was in the Pliocene, and the effect of warming on the ice sheets and water levels could take centuries.  Within this century, the sea level rise is likely to be only about 1 meter, but accelerating after that.

Fig. 2  The east coast of the USA for a sea level 30 meters higher than today.  All of Delaware, half of Louisiana, and most of Florida are under water. Reasonable projections show only a 1 meter sea level rise by 2100, but accelerating after that. From https://www.youtube.com/watch?v=G2x1bonLJFA

Balance and Feedback

It is relatively easy to create a “rule-of-thumb” model for the Earth’s climate (see Ref. [2]).  This model is not accurate, but it qualitatively captures the basic effects of climate change and is a good way to get an intuitive feeling for how the Earth responds to changes, such as changes in CO2 or in the amount of ice cover.  It can also provide semi-quantitative results, so that the relative importance of various processes or perturbations can be understood.

The model is a simple energy balance statement:  In equilibrium, as much energy flows into the Earth system as out.

This statement is both simple and immediately understandable.  But then the work starts as we need to pin down how much energy is flowing in and how much is flowing out.  The energy flowing in comes from the sun, and the energy flowing out comes from thermal radiation into space. 
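Before adding any of that structure, the zeroth-order version of the balance can be sketched in a few lines of Python. This is the textbook airless-Earth estimate (an assumed planetary albedo of 0.3 and no atmosphere at all), not the two-component model developed below:

```python
# Zeroth-order energy balance with no atmosphere: absorbed sunlight
# S*(1 - albedo)/4 equals thermal emission sigma*T^4. The albedo of 0.3 is
# the standard rough value, assumed here for illustration.
S = 1.36e3          # solar constant [W/m^2]
sigma = 5.67e-8     # Stefan-Boltzmann constant [W/m^2/K^4]
albedo = 0.3        # assumed planetary albedo

T = (S*(1 - albedo)/(4*sigma))**0.25
print(f"T = {T:.1f} K")   # about 255 K, well below the 288 K surface average
```

The gap of roughly 33 K between this estimate and the actual average surface temperature of about 288 K is the blanket effect that the two-component model below makes explicit.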

We also need to separate the Earth system into two components: the surface and the atmosphere.  These are two very different things that have two different average temperatures.  In addition, the atmosphere transmits sunlight to the surface, unless clouds reflect it back into space.  And the Earth radiates thermally into space, unless clouds or carbon dioxide layers reflect it back to the surface.

The energy fluxes are shown in the diagram in Fig. 3 for the 4-component system: Sun, Surface, Atmosphere, and Space. The light from the sun, mostly in the visible range of the spectrum, is partially absorbed by the atmosphere and partially transmitted and reflected. The transmitted portion is partially absorbed and partially reflected by the surface. The heat of the Earth is radiated at long wavelengths to the atmosphere, where it is partially transmitted out into space, but also partially reflected by the fraction a’a which is the blanket effect. In addition, the atmosphere itself radiates in equal parts to the surface and into outer space. On top of all of these radiative processes, there is also non-radiative convective interaction between the atmosphere and the surface.

Fig. 3 Energy flux model for a simple climate model with four interacting systems: the Sun, the Atmosphere, the Earth and Outer Space.

These processes are captured by two energy flux equations, one for the atmosphere and one for the surface, in Fig. 4. The individual contributions from Fig. 3 are annotated in each case. In equilibrium, each flux equals zero, which can then be used to solve for the two unknowns: Ts0 and Ta0: the surface and atmosphere temperatures.

Fig. 4 Energy-balance model of the Earth’s atmosphere for a simple climate approximation.

After the equilibrium temperatures Ts0 and Ta0 are found, they go into a set of dynamic response equations that governs how deviations in the temperatures relax back to the equilibrium values. These relaxation equations are

dTs/dt = −ks (Ts − Ts0)

dTa/dt = −ka (Ta − Ta0)

where ks and ka are the relaxation rates for the surface and atmosphere. These can be quite slow, on the order of a century. For illustration, we can take ks = 1/(75 years) and ka = 1/(25 years). The equilibrium temperatures for the surface and atmosphere differ by about 40 degrees Celsius, with Ts0 = 289 K and Ta0 = 248 K. These are rough averages over the entire planet. The solar constant is S = 1.36×10³ W/m², the Stefan-Boltzmann constant is σ = 5.67×10⁻⁸ W/m²/K⁴, and the convective interaction constant is c = 2.5 W/m²/K. Other parameters are given in Table I.

Table I.  Radiation parameters for the simple climate model.

             Short Wavelength      Long Wavelength
as = 0.11
ts = 0.53    t’a = 0.06
aa = 0.30    a’a = 0.31
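As a cross-check on these numbers, the equilibrium solve can be sketched in Python. The flux equations below are transcribed from the root2d function in the Matlab code at the end of this post (with the greenhouse perturbation Delta and the feedback term switched off), and a hand-rolled 2×2 Newton iteration stands in for Matlab’s fsolve:

```python
# Equilibrium temperatures of the two-component climate model, solved by a
# Newton iteration with a finite-difference Jacobian. Parameters from Table I.
SIG, SOLAR = 5.67e-8, 1.36e3      # Stefan-Boltzmann constant, solar constant
TS_TRANS, TPA = 0.53, 0.06        # transmissivities (short, long wavelength)
AS, AA, APA = 0.11, 0.30, 0.31    # albedos: surface, atmosphere (short, long)
C = 2.5                           # convective constant [W/m^2/K]

def fluxes(Ts, Ta):
    """Net energy fluxes for surface (F1) and atmosphere (F2); zero in equilibrium."""
    F1 = C*(Ts - Ta) + SIG*(1 - APA)*Ts**4 - SIG*Ta**4 - TS_TRANS*(1 - AS)*SOLAR/4
    F2 = (C*(Ts - Ta) + SIG*(1 - TPA - APA)*Ts**4 - 2*SIG*Ta**4
          + (1 - AA - TS_TRANS + AS*TS_TRANS)*SOLAR/4)
    return F1, F2

def solve(Ts=288.0, Ta=288.0):
    """Newton iteration with a finite-difference Jacobian."""
    for _ in range(50):
        F1, F2 = fluxes(Ts, Ta)
        h = 1e-4
        J11 = (fluxes(Ts + h, Ta)[0] - F1)/h
        J12 = (fluxes(Ts, Ta + h)[0] - F1)/h
        J21 = (fluxes(Ts + h, Ta)[1] - F2)/h
        J22 = (fluxes(Ts, Ta + h)[1] - F2)/h
        det = J11*J22 - J12*J21
        Ts -= (F1*J22 - F2*J12)/det
        Ta -= (J11*F2 - J21*F1)/det
        if abs(F1) + abs(F2) < 1e-9:
            break
    return Ts, Ta

Ts0, Ta0 = solve()
print(f"Ts0 = {Ts0:.0f} K, Ta0 = {Ta0:.0f} K")   # close to 289 K and 248 K
```

The solution reproduces the equilibrium values quoted above, Ts0 ≈ 289 K and Ta0 ≈ 248 K.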

The relaxation equations are in the standard form of a mathematical “flow” (see Ref. [1]), and the solutions are plotted in Fig. 5 as a video of the phase-space flow as the parameters in Table I shift because of the addition of greenhouse gases to the atmosphere. The video runs from the year 1850 (the dawn of the industrial age) through to the year 2060, about 40 years from now.

Fig. 5 Video of the phase space flow of the Surface-Atmosphere system for increasing year. The flow vectors and flow lines are the relaxation to equilibrium for temperature deviations. The change in equilibrium over the years is from increasing blanket effects in the atmosphere caused by greenhouse gases.

The scariest part of the video is how fast it accelerates. From 1850 to 1950 there is almost no change, but then it accelerates, faster and faster, reflecting the time-lag in temperature rise in response to increased greenhouse gases.
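The time lag itself can be seen in a one-equation relaxation model. The sketch below drifts the equilibrium temperature upward at an assumed, purely illustrative rate of 0.02 K per year and lets the surface relax toward it with the 75-year time constant from the text:

```python
# A minimal sketch of the time lag: the equilibrium temperature drifts upward
# at 0.02 K/yr (an illustrative number, not a model output) while the surface
# relaxes toward it with the 75-year time constant quoted above.
ks = 1/75.0            # surface relaxation rate [1/yr]
drift = 0.02           # assumed equilibrium drift [K/yr]
Ts = Ts0 = 289.0       # start in equilibrium
dt = 0.1               # Euler time step [yr]

for _ in range(int(300/dt)):   # integrate 300 years forward
    Ts0 += drift*dt            # the equilibrium creeps upward
    Ts += -ks*(Ts - Ts0)*dt    # the surface follows, but slowly

lag = Ts0 - Ts
print(f"lag = {lag:.2f} K")    # approaches the steady-state value drift/ks = 1.5 K
```

The surface always trails the shifting equilibrium by roughly drift/ks, which is why the consequences of greenhouse gases added today continue to unfold for decades.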

What if the Models are Wrong?  Russian Roulette

Now come the caveats.

This model is just for teaching purposes, not for any realistic modeling of climate change. It captures the basic physics, and it provides a semi-quantitative set of parameters that leads to roughly accurate current temperatures. But of course, the biggest elephant in the room is that it averages over the entire planet, which is a very crude approximation.

It does get the basic facts correct, though, showing an alarming rise in the average temperature of about 3 degrees by 2060.

The professionals in this business have computer models that are orders of magnitude more accurate than this one. To understand the details of the real climate models, one needs to go to appropriate resources, like this NOAA link, this NASA link, this national climate assessment link, and this government portal link, among many others.

One of the most frequent questions asked is: What if these models are wrong? What if global warming isn’t as bad as these models say? The answer is simple: If they are wrong, then in the worst case life goes on. If they are right, then in the worst case life on this planet may end.

It’s like playing Russian Roulette. If just one of the cylinders on the revolver has a live bullet, do you want to pull the trigger?

Matlab Code

function flowatmos   % flowatmos.m

mov_flag = 1;
if mov_flag == 1
    moviename = 'atmostmp';
    aviobj = VideoWriter(moviename,'MPEG-4');
    aviobj.FrameRate = 12;
    open(aviobj);
end

Solar = 1.36e3;		% Solar constant outside atmosphere [J/sec/m2]
sig = 5.67e-8;		% Stefan-Boltzman constant [W/m2/K4]

% 1st-order model of Earth + Atmosphere

ta = 0.53;			% (0.53)transmissivity of air
tpa0 = 0.06;			% (0.06)primes are for thermal radiation
as0 = 0.11;			% (0.11)
aa0 = 0.30;			% (0.30)
apa0 = 0.31;        % (0.31)
c = 2.5;               % W/m2/K

xrange = [287 293];
yrange = [247 251];

rngx = xrange(2) - xrange(1);
rngy = yrange(2) - yrange(1);

[X,Y] = meshgrid(xrange(1):0.05:xrange(2), yrange(1):0.05:yrange(2));

smallarrow = 1;
Delta0 = 0.0000009;
for tloop =1:80
    
    Delta = Delta0*(exp((tloop-1)/8)-1);   % This Delta is exponential, but should become more linear over time
    date = floor(1850 + (tloop-1)*(2060-1850)/79);
    
    [x,y] = f5(X,Y);
    
    clf
    hold off
    eps = 0.002;
    for xloop = 1:11
        xs = xrange(1) +(xloop-1)*rngx/10 + eps;
        for yloop = 1:11
            ys = yrange(1) +(yloop-1)*rngy/10 + eps;
            
            streamline(X,Y,x,y,xs,ys)
            
        end
    end
    hold on
    [XQ,YQ] = meshgrid(xrange(1):1:xrange(2),yrange(1):1:yrange(2));
    smallarrow = 1;
    [xq,yq] = f5(XQ,YQ);
    quiver(XQ,YQ,xq,yq,.2,'r','filled')
    hold off
    
    axis([xrange(1) xrange(2) yrange(1) yrange(2)])
    set(gcf,'Color','White')
    
    fun = @root2d;
    x0 = [0 -40];
    x = fsolve(fun,x0);
    
    Ts = x(1) + 288
    Ta = x(2) + 288
    
    hold on
    rectangle('Position',[Ts-0.05 Ta-0.05 0.1 0.1],'Curvature',[1 1],'FaceColor',[1 0 0],'EdgeColor','k','LineWidth',2)
    
    posTs(tloop) = Ts;
    posTa(tloop) = Ta;
    
    plot(posTs,posTa,'k','LineWidth',2);
    hold off
    
    text(287.5,250.5,strcat('Date = ',num2str(date)),'FontSize',24)
    box on
    xlabel('Surface Temperature (K)','FontSize',24)
    ylabel('Atmosphere Temperature (K)','FontSize',24)
    
    hh = figure(1);
    pause(0.01)
    if mov_flag == 1
        frame = getframe(hh);
        writeVideo(aviobj,frame);
    end
    
end     % end tloop

if mov_flag == 1
    close(aviobj);
end

    function F = root2d(xp)   % Energy fluxes 
        
        x = xp + 288;
        feedfac = 0.001;      % feedback parameter 
        
        apa = apa0 + feedfac*(x(2)-248) + Delta;  % Changes in the atmospheric blanket
        tpa = tpa0 - feedfac*(x(2)-248) - Delta;
        as = as0 - feedfac*(x(1)-289);
        
        F(1) = c*(x(1)-x(2)) + sig*(1-apa)*x(1).^4 - sig*x(2).^4 - ta*(1-as)*Solar/4;
        F(2) = c*(x(1)-x(2)) + sig*(1-tpa - apa)*x(1).^4 - 2*sig*x(2).^4 + (1-aa0-ta+as*ta)*Solar/4;
        
    end

    function [x,y] = f5(X,Y)   % Dynamical flow equations
        
        k1 = 1/75;   % 75 year time constant for the Earth
        k2 = 1/25;   % 25 year time constant for the Atmosphere
        
        fun = @root2d;
        x0 = [0 0];
        x = fsolve(fun,x0);   % Solve for the temperatures that set the energy fluxes to zero
        
        Ts0 = x(1) + 288;   % Surface temperature in Kelvin
        Ta0 = x(2) + 288;   % Atmosphere temperature in Kelvin
        
        xtmp = -k1*(X - Ts0);   % Dynamical equations
        ytmp = -k2*(Y - Ta0);
        
        nrm = sqrt(xtmp.^2 + ytmp.^2);
        
        if smallarrow == 1
            x = xtmp./nrm;
            y = ytmp./nrm;
        else
            x = xtmp;
            y = ytmp;
        end
        
    end     % end f5

end       % end flowatmos


This model has a lot of parameters that can be tweaked. In addition to the parameters in the Table, the time dependence of the blanket properties of the atmosphere is governed by Delta0 and by feedfac for the feedback of temperature on the atmosphere, such as increasing cloud cover and decreasing ice cover. As an exercise, and using only small changes in the given parameters, find the following cases: 1) An increasing surface temperature is moderated by a falling atmosphere temperature; 2) The Earth goes into thermal run-away and ends like Venus; 3) The Earth initially warms then plummets into an ice age.

References

[1] D. D. Nolte, Introduction to Modern Dynamics: Chaos, Networks, Space and Time, 2nd Ed. (Oxford University Press, 2019)

[2] E. Boeker and R. van Grondelle, Environmental Physics (Wiley, 1995)

[3] Recent lecture at the National Academy of Engineering by John Holdren.

Is There a Quantum Trajectory? The Phase-Space Perspective

At the dawn of quantum theory, Heisenberg, Schrödinger, Bohr and Pauli were embroiled in a dispute over whether trajectories of particles, defined by their positions over time, could exist. The argument against trajectories was based on an apparent paradox: To draw a “line” depicting a trajectory of a particle along a path implies that there is a momentum vector that carries the particle along that path. But a line is a one-dimensional curve through space, and since at any point in time the particle’s position is perfectly localized, then by Heisenberg’s uncertainty principle, it can have no definable momentum to carry it along.

My previous blog shows the way out of this paradox, by assembling wavepackets that are spread in both space and momentum, explicitly obeying the uncertainty principle. This is nothing new to anyone who has taken a quantum course. But the surprising thing is that in some potentials, like a harmonic potential, the wavepacket travels without broadening, just like classical particles on a trajectory. A dramatic demonstration of this can be seen in this YouTube video. But other potentials “break up” the wavepacket, especially potentials that display classical chaos. Because phase space is one of the best tools for studying classical chaos, especially Hamiltonian chaos, it can be enlisted to dig deeper into the question of the quantum trajectory—not just about the existence of a quantum trajectory, but why quantum systems retain a shadow of their classical counterparts.

Phase Space

Phase space is the state space of Hamiltonian systems. Concepts of phase space were first developed by Boltzmann as he worked on the problem of statistical mechanics. Phase space was later codified by Gibbs for statistical mechanics and by Poincaré for orbital mechanics, and it was finally given its name by Paul and Tatiana Ehrenfest (a husband-wife team) in correspondence with the German physicist Paul Hertz (see Chapter 6, “The Tangled Tale of Phase Space,” in Galileo Unbound by D. D. Nolte (Oxford, 2018)).

The stretched-out phase-space functions … are very similar to the stochastic layer that forms in separatrix chaos in classical systems.

The idea of phase space is very simple for classical systems: it is just a plot of the momentum of a particle as a function of its position. For a given initial condition, the trajectory of a particle through its natural configuration space (for instance our 3D world) is traced out as a path through phase space. Because there is one momentum variable per degree of freedom, then the dimensionality of phase space for a particle in 3D is 6D, which is difficult to visualize. But for a one-dimensional dynamical system, like a simple harmonic oscillator (SHO) oscillating in a line, the phase space is just two-dimensional, which is easy to see. The phase-space trajectories of an SHO are simply ellipses, and if the momentum axis is scaled appropriately, the trajectories are circles. The particle trajectory in phase space can be animated just like a trajectory through configuration space as the position and momentum change in time p(x(t)). For the SHO, the point follows the path of a circle going clockwise.

Fig. 1 Phase space of the simple harmonic oscillator. The “orbits” have constant energy.
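The circular orbit is easy to verify numerically. In scaled units (assuming m = ω = 1, so the momentum axis is already scaled appropriately), the SHO solution is x(t) = A cos t and p(t) = −A sin t:

```python
import math

# Sketch: SHO orbit in scaled phase-space coordinates (assuming m = omega = 1),
# where x(t) = A*cos(t), p(t) = -A*sin(t) traces a clockwise circle of radius A.
A = 1.0
times = [0.0, 0.5, 1.0, 2.0, 4.0]
orbit = [(A*math.cos(t), -A*math.sin(t)) for t in times]
radii = [math.hypot(x, p) for x, p in orbit]

print(radii)   # every point sits at radius A: the orbit has constant energy
```

The constant radius is just the statement that the orbits in Fig. 1 are curves of constant energy.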

A more interesting phase space is for the simple pendulum, shown in Fig. 2. There are two types of orbits: open and closed. The closed orbits near the origin are like those of a SHO. The open orbits are when the pendulum is spinning around. The dividing line between the open and closed orbits is called a separatrix. Where the separatrix intersects itself is a saddle point. This saddle point is the most important part of the phase space portrait: it is where chaos emerges when perturbations are added.

Fig. 2 Phase space for a simple pendulum. For small amplitudes the orbits are closed like those of a SHO. For large amplitudes the orbits become open as the pendulum spins about its axis. (Reproduced from Introduction to Modern Dynamics, 2nd Ed., pg. )
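The open/closed distinction can be captured in a few lines. In dimensionless units (a sketch assuming m = l = g = 1), the pendulum energy is H = p²/2 + (1 − cos θ), and the separatrix sits at H = 2, the energy of the saddle:

```python
import math

# Sketch of the pendulum phase-space structure in dimensionless units
# (assuming m = l = g = 1): energy H = p^2/2 + (1 - cos(theta)).
# Closed (librating) orbits have H < 2, open (rotating) orbits have H > 2,
# and H = 2 is the separatrix through the saddle at theta = pi, p = 0.

def energy(theta, p):
    return 0.5*p*p + (1.0 - math.cos(theta))

def orbit_type(theta, p):
    E = energy(theta, p)
    if abs(E - 2.0) < 1e-9:
        return "separatrix"
    return "closed" if E < 2.0 else "open"

print(orbit_type(0.2, 0.0))      # small swing: "closed"
print(orbit_type(0.0, 3.0))      # fast rotation: "open"
print(orbit_type(math.pi, 0.0))  # balanced at the saddle: "separatrix"
```

Every orbit in Fig. 2 is a level curve of this energy function, with the separatrix as the dividing level.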

One route to classical chaos is through what is known as “separatrix chaos”. It is easy to see why saddle points (also known as hyperbolic points) are the source of chaos: as the system trajectory approaches the saddle, it has two options of which directions to go. Any additional degree of freedom in the system (like a harmonic drive) can make the system go one way on one approach, and the other way on another approach, mixing up the trajectories. An example of the stochastic layer of separatrix chaos is shown in Fig. 3 for a damped driven pendulum. The chaotic behavior that originates at the saddle point extends out along the entire separatrix.

Fig. 3 The stochastic layer of separatrix chaos for a damped driven pendulum. (Reproduced from Introduction to Modern Dynamics, 2nd Ed., pg. )

The main question about whether or not there is a quantum trajectory depends on how quantum packets behave as they approach a saddle point in phase space. Since packets are spread out, it would be reasonable to assume that parts of the packet will go one way, and parts of the packet will go another. But first, one has to ask: Is a phase-space description of quantum systems even possible?

Quantum Phase Space: The Wigner Distribution Function

Phase-space portraits are arguably the most powerful tool in the toolbox of classical dynamics, and one would like to retain its uses for quantum systems. However, there is that pesky paradox about quantum trajectories that cannot admit the existence of one-dimensional curves through such a phase space. Furthermore, there is no direct way of taking a wavefunction and simply “finding” its position or momentum to plot points on such a quantum phase space.

The answer was found in 1932 by Eugene Wigner (1902 – 1995), a Hungarian physicist working at Princeton. He realized that it is impossible to construct a quantum probability distribution in phase space that has positive values everywhere. This is a problem, because negative probabilities have no direct interpretation. But Wigner showed that if one relaxed the requirements a bit, so that expectation values computed over some distribution function (one that takes both positive and negative values) gave correct answers that matched experiments, then this distribution function could “stand in” for an actual probability distribution.

The distribution function that Wigner found is called the Wigner distribution function. Given a wavefunction ψ(x), the Wigner distribution is defined as

W(x, p) = (1/πħ) ∫ ψ*(x + y) ψ(x − y) e^{2ipy/ħ} dy

Fig. 4 Wigner distribution function in (x, p) phase space.

The Wigner distribution function is the Fourier transform of the autocorrelation of the wavefunction. The pure position dependence of the wavefunction is converted into a spread-out position-momentum function in phase space. For a Gaussian wavefunction ψ(x) with a finite width in space, the W-function in phase space is a two-dimensional Gaussian with finite widths in both space and momentum. In fact, the Δx-Δp product of the W-function is precisely the uncertainty product of the Heisenberg uncertainty relation.
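For a concrete case, take a minimum-uncertainty Gaussian of spatial width σ, ψ(x) ∝ exp(−x²/4σ²). Carrying out the Wigner integral gives

```latex
W(x,p) = \frac{1}{\pi\hbar}\,
         \exp\!\left(-\frac{x^{2}}{2\sigma^{2}}
                     -\frac{2\sigma^{2}p^{2}}{\hbar^{2}}\right)
```

a two-dimensional Gaussian with widths Δx = σ and Δp = ħ/2σ, so that ΔxΔp = ħ/2, the minimum allowed by the uncertainty relation.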

The question of the quantum trajectory from the phase-space perspective becomes whether a Wigner function behaves like a localized “packet” that evolves in phase space in a way analogous to a classical particle, and whether classical chaos is reflected in the behavior of quantum systems.

The Harmonic Oscillator

The quantum harmonic oscillator is a rare and special case among quantum potentials, because the energy spacings between all successive states are all the same. This makes it possible for a Gaussian wavefunction, which is a superposition of the eigenstates of the harmonic oscillator, to propagate through the potential without broadening. To see an example of this, watch the first example in this YouTube video for a Schrödinger cat state in a two-dimensional harmonic potential. For this very special potential, the Wigner distribution behaves just like a (broadened) particle on an orbit in phase space, executing nice circular orbits.

A comparison of the classical phase-space portrait versus the quantum phase-space portrait is shown in Fig. 5. Where the classical particle is a point on an orbit, the quantum particle is spread out, obeying the Δx-Δp Heisenberg product, but following the same orbit as the classical particle.

Fig. 5 Classical versus quantum phase-space portraits for a harmonic oscillator. For a classical particle, the trajectory is a point executing an orbit. For a quantum particle, the trajectory is a Wigner distribution that follows the same orbit as the classical particle.

However, a significant new feature appears in the Wigner representation in phase space when there is a coherent superposition of two states, known as a “cat” state, after Schrödinger’s cat. This new feature has no classical analog. It is the coherent interference pattern that appears at the zero-point of the harmonic oscillator for the Schrödinger cat state. There is no such thing as “classical” coherence, so this feature is absent in classical phase space portraits.

Two examples of Wigner distributions are shown in Fig. 6 for a statistical (incoherent) mixture of packets and a coherent superposition of packets. The quantum coherence signature is present in the coherent case but not in the statistical mixture. The coherence in the Wigner distribution represents “off-diagonal” terms in the density matrix that lead to interference effects in quantum systems. Quantum computing algorithms depend critically on such coherences, which tend to decay rapidly in real-world physical systems (a process known as decoherence), and it is possible to make statements about decoherence by watching the zero-point interference.

Fig. 6 Quantum phase-space portraits of double wave packets. On the left, the wave packets have no coherence, being a statistical mixture. On the right is the case for a coherent superposition, or “cat state” for two wave packets in a one-dimensional harmonic oscillator.

Whereas Gaussian wave packets in the quantum harmonic potential behave nearly like classical systems, and their phase-space portraits are almost identical to the classical phase-space view (except for the quantum coherence), most quantum potentials cause wave packets to disperse. And when saddle points are present in the classical case, then we are back to the question about how quantum packets behave as they approach a saddle point in phase space.

Quantum Pendulum and Separatrix Chaos

One of the simplest anharmonic oscillators is the simple pendulum. In the classical case, the period diverges if the pendulum gets very close to going vertical. A similar thing happens in the quantum case, but because the motion has strong anharmonicity, an initial wave packet tends to spread dramatically as the parts of the wavefunction farther from vertical stretch away from the parts that are more nearly vertical. Fig. 7 is a snapshot about an eighth of a period after the wave packet was launched. The packet has already stretched out along the separatrix. A double-cat-state was used, so there is a second packet that has coherent interference with the first. To see a movie of the time evolution of the wave packet and the orbit in quantum phase space, see the YouTube video.

Fig. 7 Wavefunction of a quantum pendulum released near vertical. The phase-space portrait is very similar to the classical case, except that the phase-space distribution is stretched out along the separatrix. The initial state for the phase-space portrait was a cat state.

The simple pendulum does have a saddle point, but it is degenerate because the angle is defined modulo 2π. A simple potential that has a non-degenerate saddle point is a double-well potential.

Quantum Double-Well and Separatrix Chaos

The symmetric double-well potential has a saddle point at the mid-point between the two well minima. A wave packet approaching the saddle will split into two packets that follow the separatrices emerging from the saddle point (the unstable manifolds). This effect is seen most dramatically in the middle pane of Fig. 8. For the full video of the quantum phase-space evolution, see this YouTube video. The stretched-out distribution in phase space is highly analogous to the separatrix chaos seen in the classical system.

Fig. 8 Phase-space portraits of the Wigner distribution for a wavepacket in a double-well potential. The packet approaches the central saddle point, where the probability density splits along the unstable manifolds.

Conclusion

A common statement often made about quantum chaos is that quantum systems tend to suppress chaos, only exhibiting chaos for special types of orbits that produce quantum scars. However, from the phase-space perspective, the opposite may be true. The stretched-out Wigner distribution functions, for critical wave packets that interact with a saddle point, are very similar to the stochastic layer that forms in separatrix chaos in classical systems. In this sense, the phase-space description brings out the similarity between classical chaos and quantum chaos.


YouTube Video

YouTube Video of Dynamics in Quantum Phase Space


References

1. T. Curtright, D. Fairlie, C. Zachos, A Concise Treatise on Quantum Mechanics in Phase Space.  (World Scientific, New Jersey, 2014).

2. J. R. Nagel, A Review and Application of the Finite-Difference Time-Domain Algorithm Applied to the Schrödinger Equation, ACES Journal, Vol. 24, NO. 1, pp. 1-8 (2009)

Is There a Quantum Trajectory?

Heisenberg’s uncertainty principle is a law of physics – it cannot be violated under any circumstances, no matter how much we may want it to yield or how hard we try to bend it.  Heisenberg, as he developed his ideas after his lone epiphany like a monk on the isolated island of Helgoland off the north coast of Germany in 1925, became a bit of a zealot, like a religious convert, convinced that all we can say about reality is a measurement outcome.  In his view, there was no independent existence of an electron other than what emerged from a measuring apparatus.  Reality, to Heisenberg, was just a list of numbers in a spreadsheet—matrix elements.  He took this line of reasoning so far that he stated without exception that there could be no such thing as a trajectory in a quantum system.  When the great battle commenced between Heisenberg’s matrix mechanics and Schrödinger’s wave mechanics, Heisenberg was relentless, denying any reality to Schrödinger’s wavefunction other than as a calculation tool.  He was so strident that even Bohr, who was on Heisenberg’s side in the argument, advised Heisenberg to relent [1].  Eventually a compromise was struck, as Heisenberg’s uncertainty principle allowed Schrödinger’s wave functions to exist within limits—his uncertainty limits.

Disaster in the Poconos

Yet the idea of an actual trajectory of a quantum particle remained a type of heresy within the close quantum circles.  Years later in 1948, when a young Richard Feynman took the stage at a conference in the Poconos, he almost sabotaged his career in front of Bohr and Dirac—two of the giants who had invented quantum mechanics—by having the audacity to talk about particle trajectories in spacetime diagrams.

Feynman was making his first presentation of a new approach to quantum mechanics that he had developed based on path integrals. The challenge was that his method relied on space-time graphs in which “unphysical” things were allowed to occur.  In fact, unphysical things were required to occur, as part of the sum over many histories of his path integrals.  For instance, a key element in the approach was allowing electrons to travel backwards in time as positrons, or a process in which the electron and positron annihilate into a single photon, and then the photon decays back into an electron-positron pair—a process that is not allowed by mass and energy conservation.  But this is a possible history that must be added to Feynman’s sum.

It all looked like nonsense to the audience, and the talk quickly derailed.  Dirac pestered him with questions that he tried to deflect, but Dirac persisted like a raven.  A question was raised about the Pauli exclusion principle, about whether an orbital could have three electrons instead of the required two, and Feynman said that it could—all histories were possible and had to be summed over—an answer that dismayed the audience.  Finally, as Feynman was drawing another of his space-time graphs showing electrons as lines, Bohr rose to his feet and asked derisively whether Feynman had forgotten Heisenberg’s uncertainty principle that made it impossible to even talk about an electron trajectory.

It was hopeless.  The audience gave up and so did Feynman as the talk just fizzled out.  It was a disaster.  What had been meant to be Feynman’s crowning achievement and his entry to the highest levels of theoretical physics had been a terrible embarrassment.  He slunk home to Cornell, where he sank into one of his depressions.  At the close of the Pocono conference, Oppenheimer, the reigning king of physics, former head of the successful Manhattan Project and newly selected to head the prestigious Institute for Advanced Study at Princeton, had been thoroughly disappointed by Feynman.

But what Bohr and Dirac and Oppenheimer had failed to understand was that as long as the duration of an unphysical process was shorter than Planck’s constant divided by the energy difference involved (ΔE Δt ≈ ħ), it literally obeyed Heisenberg’s uncertainty principle.  Furthermore, Feynman’s trajectories—what became his famous “Feynman diagrams”—were meant to be merely cartoons—a shorthand way to keep track of lots of different contributions to a scattering process.  The quantum processes certainly took place in space and time, conceptually like a trajectory, but only so far as the time durations, energy differences, locations, and momentum changes all stayed within the bounds of the uncertainty principle.  Feynman had invented a bold new tool for quantum field theory, able to supply deep results quickly.  But no one at the Poconos could see it.
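The scale of the bound is easy to estimate.  As a back-of-envelope sketch (using standard CODATA values, and glossing over the factor of order one in the uncertainty relation), the lifetime allowed for a “forbidden” electron-positron pair is:

```python
# How long can an electron-positron pair borrow its rest energy under the
# energy-time uncertainty relation Delta_E * Delta_t ~ hbar?
hbar = 1.054571817e-34   # Planck's constant / 2 pi (J s)
m_e  = 9.1093837e-31     # electron mass (kg)
c    = 2.99792458e8      # speed of light (m/s)

delta_E = 2 * m_e * c**2          # energy "borrowed" to create the pair (J)
delta_t = hbar / delta_E          # lifetime allowed by the uncertainty bound (s)
print(f"Delta_E = {delta_E:.3e} J, Delta_t = {delta_t:.3e} s")
```

The pair can exist for well under a zeptosecond—far too briefly to be caught by any measurement, which is exactly why such histories can appear in the sum without violating the uncertainty principle.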

Fig. 1 The first Feynman diagram.

Coherent States

When Feynman had failed so miserably at the Pocono conference, he had taken the stage after Julian Schwinger, who had dazzled everyone with his perfectly scripted presentation of quantum field theory—the competing theory to Feynman’s.  Schwinger emerged the clear winner of the contest.  At that time, Roy Glauber (1925 – 2018) was a young physicist just taking his PhD from Schwinger at Harvard, and he later received a post-doc position at Princeton’s Institute for Advanced Study where he became part of a miniature revolution in quantum field theory that revolved around—not Schwinger’s difficult mathematics—but Feynman’s diagrammatic method.  So Feynman won in the end.  Glauber then went on to Caltech, where he filled in for Feynman’s lectures when Feynman was off in Brazil playing the bongos.  Glauber eventually returned to Harvard, where he was already thinking about the quantum aspects of photons in 1956 when news of the photon correlations in the Hanbury-Brown Twiss (HBT) experiment was published.  Three years later, when the laser was invented, he began developing a theory of photon correlations in laser light that he suspected would be fundamentally different from those in natural chaotic light.

Because of his background in quantum field theory, and especially quantum electrodynamics, it was fairly easy for Glauber to couch the quantum optical properties of coherent light in terms of Dirac’s creation and annihilation operators of the electromagnetic field.  Glauber developed a “coherent state” operator that creates a minimum-uncertainty state of the quantized electromagnetic field, related to the minimum-uncertainty wave functions derived initially by Schrödinger in the late 1920s.  The coherent state represents a laser operating well above the lasing threshold and behaves as “the most classical” wavepacket that can be constructed.  Glauber was awarded the Nobel Prize in Physics in 2005 for his work on such “Glauber states” in quantum optics.
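One hallmark of a coherent state is its Poissonian photon statistics: the variance of the photon number equals its mean, which distinguishes laser light from chaotic light.  A minimal numerical check (the mean photon number |α|² = 4 is an arbitrary illustrative choice):

```python
from math import exp, factorial

# Photon-number distribution of a coherent state |alpha>:
#   P(n) = exp(-|alpha|^2) * |alpha|^(2n) / n!
alpha2 = 4.0     # |alpha|^2, the mean photon number (illustrative value)
nmax = 60        # truncation of the number-state ladder

P = [exp(-alpha2) * alpha2**n / factorial(n) for n in range(nmax)]

norm = sum(P)
mean = sum(n * p for n, p in enumerate(P))
var = sum((n - mean)**2 * p for n, p in enumerate(P))
print(norm, mean, var)   # Poissonian: mean = variance = |alpha|^2
```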

Fig. 2 Roy Glauber

Quantum Trajectories

Glauber’s coherent states are built up from the natural modes of a harmonic oscillator.  Therefore, it should come as no surprise that these coherent-state wavefunctions in a harmonic potential behave just like classical particles with well-defined trajectories. The quadratic potential matches the quadratic argument of the Gaussian wavepacket, and the pulses propagate within the potential without broadening, as in Fig. 3, which shows a snapshot of two wavepackets propagating in a two-dimensional harmonic potential. This is a somewhat radical situation, because most wavepackets in most potentials (or even in free space) broaden as they propagate. The quadratic potential is a special case that is generally not representative of how quantum systems behave.

Fig. 3 Harmonic potential in 2D and two examples of pairs of pulses propagating without broadening. The wavepackets in the center are oscillating in line, and the wavepackets on the right are orbiting the center of the potential in opposite directions. (Movies of the quantum trajectories can be viewed at Physics Unbound.)
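These claims are easy to check numerically.  The sketch below (in units ħ = m = ω = 1, with an illustrative amplitude α = 2) expands a coherent state in oscillator eigenstates, evolves the phases, and verifies that the centroid follows the classical cosine trajectory while the variance stays pinned at the ground-state value of 1/2—the packet never broadens:

```python
import numpy as np
from numpy.polynomial.hermite import hermval
from math import factorial, sqrt, pi, cos

# Coherent state |alpha> in a 1D harmonic oscillator, hbar = m = omega = 1.
alpha, nmax = 2.0, 40
x = np.linspace(-8.0, 8.0, 2001)
dx = x[1] - x[0]

def phi(n):
    """n-th harmonic-oscillator eigenfunction."""
    coef = np.zeros(n + 1); coef[n] = 1.0
    return hermval(x, coef) * np.exp(-x**2 / 2) / sqrt(2**n * factorial(n) * sqrt(pi))

modes = [phi(n) for n in range(nmax)]
c = [np.exp(-alpha**2 / 2) * alpha**n / sqrt(factorial(n)) for n in range(nmax)]

def moments(t):
    # Each eigenstate just acquires a phase exp(-i(n+1/2)t)
    psi = sum(c[n] * np.exp(-1j * (n + 0.5) * t) * modes[n] for n in range(nmax))
    rho = np.abs(psi)**2
    rho /= rho.sum() * dx
    mean = (x * rho).sum() * dx
    var = ((x - mean)**2 * rho).sum() * dx
    return mean, var

for t in (0.0, 1.3, 2.9):
    mean, var = moments(t)
    # classical centroid sqrt(2)*alpha*cos(t); ground-state variance 1/2
    print(t, mean, sqrt(2) * alpha * cos(t), var)
```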

To illustrate this special status for the quadratic potential, the wavepackets can be launched in a potential with a quartic perturbation. The quartic potential is anharmonic—the frequency of oscillation depends on the amplitude of oscillation, unlike the harmonic oscillator, where amplitude and frequency are independent. The quartic potential is integrable, like the harmonic oscillator, and there is no avenue for chaos in the classical analog. Nonetheless, wavepackets broaden as they propagate in the quartic potential, eventually spreading out into a ring in the configuration space, as in Fig. 4.

Fig. 4 Potential with a quartic correction. The initial Gaussian pulses spread into a “ring” orbiting the center of the potential.
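The amplitude dependence of the frequency is easy to exhibit classically.  For the illustrative potential V(x) = x²/2 + εx⁴/4 (the exact form of the quartic correction used in the figures is an assumption here), substituting x = A sin θ into the period integral removes the turning-point singularity and leaves a smooth quadrature:

```python
import numpy as np

# Period of a classical oscillator in V(x) = x^2/2 + eps*x^4/4 versus amplitude A.
# Substituting x = A*sin(theta) into T = 4 * integral_0^A dx / v(x) gives
# T = 4 * integral_0^{pi/2} dtheta / sqrt(1 + eps*A^2*(1 + sin^2(theta))/2).
def period(A, eps):
    theta = np.linspace(0.0, np.pi / 2, 20001)
    integrand = 1.0 / np.sqrt(1.0 + 0.5 * eps * A**2 * (1.0 + np.sin(theta)**2))
    dtheta = theta[1] - theta[0]
    # trapezoidal rule
    return 4.0 * (integrand.sum() - 0.5 * (integrand[0] + integrand[-1])) * dtheta

print(period(0.5, 0.2), period(2.0, 0.2), 2 * np.pi)
```

For ε = 0 every amplitude gives the same period 2π, while for ε > 0 the period shrinks as the amplitude grows (a “hardening” spring)—so the different momentum components of a spreading packet oscillate at different rates, which is what smears the packet into a ring.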

A potential with integrability has as many conserved quantities of the motion as there are degrees of freedom. Because the quartic potential is integrable, the quantum wavefunction may spread, but it remains highly regular, as in the “ring” that eventually forms over time. However, integrable potentials are the exception rather than the rule. Most potentials lead to nonintegrable motion that opens the door to chaos.

A classic (and classical) potential that exhibits chaos in a two-dimensional configuration space is the famous Henon-Heiles potential. This has a four-dimensional phase space which admits classical chaos. The potential has a three-fold symmetry, which is one reason it is nonintegrable, since a particle must “decide” which way to go when it approaches a saddle point. In the quantum regime, wavepackets face the same decision, leading to a breakup of the wavepacket on top of a general broadening. This allows the wavefunction eventually to distribute across the entire configuration space, as in Fig. 5.

Fig. 5 The Henon-Heiles two-dimensional potential supports Hamiltonian chaos in the classical regime. In the quantum regime, the wavefunction spreads to eventually fill the accessible configuration space (for constant energy).
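The sensitive dependence on initial conditions can be sketched by integrating two nearby classical trajectories in the Henon-Heiles potential (the initial condition and the energy E = 1/8 below are illustrative choices; E = 1/8 keeps the motion safely bounded below the escape energy 1/6):

```python
import numpy as np

# Twin-trajectory sketch in the Henon-Heiles potential
# V = (x^2 + y^2)/2 + x^2*y - y^3/3.
def deriv(s):
    x, y, px, py = s
    return np.array([px, py, -x - 2*x*y, -y - x*x + y*y])

def rk4_step(s, dt):
    k1 = deriv(s)
    k2 = deriv(s + 0.5*dt*k1)
    k3 = deriv(s + 0.5*dt*k2)
    k4 = deriv(s + dt*k3)
    return s + dt * (k1 + 2*k2 + 2*k3 + k4) / 6.0

def energy(s):
    x, y, px, py = s
    return 0.5*(px*px + py*py) + 0.5*(x*x + y*y) + x*x*y - y**3/3.0

E = 0.125
s1 = np.array([0.0, 0.3, 0.35, 0.0])
s1[3] = np.sqrt(2.0 * (E - energy(s1)))   # choose py to sit on the energy shell
s2 = s1.copy(); s2[0] += 1e-8             # a twin displaced by 10^-8

dt = 0.01
for _ in range(20000):                    # integrate to t = 200
    s1 = rk4_step(s1, dt); s2 = rk4_step(s2, dt)

d = np.linalg.norm(s1 - s2)
print(d / 1e-8, energy(s1) - E)           # separation growth factor; energy drift
```

The separation between the twins grows by orders of magnitude while the energy stays fixed to numerical precision, which is the classical fingerprint of the chaos that the quantum wavepackets inherit at the saddle points.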

YouTube Video

Movies of quantum trajectories can be viewed at my YouTube Channel, Physics Unbound. The answer to the question “Is there a quantum trajectory?” can be seen visually as the movies run—they do exist in a very clear sense under special conditions, especially coherent states in a harmonic oscillator. And the concept of a quantum trajectory also carries over from a classical trajectory in cases when the classical motion is integrable, even in cases when the wavefunction spreads over time. However, for classical systems that display chaotic motion, wavefunctions that begin as coherent states break up into chaotic wavefunctions that fill the accessible configuration space for a given energy. The character of quantum evolution of coherent states—the most classical of quantum wavefunctions—in these cases reflects the underlying character of chaotic motion in the classical analogs. This process can be seen directly by watching the movies as a wavepacket approaches a saddle point in the potential and is split. Successive splits of the multiple wavepackets as they interact with the saddle points are what eventually distribute the full wavefunction into its chaotic form.

Therefore, the idea of a “quantum trajectory”, so thoroughly dismissed by Heisenberg, remains a phenomenological guide that can help give insight into the behavior of quantum systems—both integrable and chaotic.

As a side note, the laws of quantum physics obey time-reversal symmetry just as the classical equations do. In the third movie of “A Quantum Ballet“, wavefunctions in a double-well potential are tracked in time as they start from coherent states that break up into chaotic wavefunctions. It is like watching entropy in action as an ordered state devolves into a disordered state. But at the half-way point of the movie, the imaginary part of the wavefunction has its sign flipped, and the dynamics continue. But now the wavefunctions move from disorder into an ordered state, seemingly going against the second law of thermodynamics. Flipping the sign of the imaginary part of the wavefunction at just one instant in time plays the role of a time-reversal operation, and there is no violation of the second law.




References

[1] See Chapter 8, On the Quantum Footpath, in Galileo Unbound, D. D. Nolte (Oxford University Press, 2018)

[2] J. R. Nagel, “A Review and Application of the Finite-Difference Time-Domain Algorithm Applied to the Schrödinger Equation,” ACES Journal, Vol. 24, No. 1, pp. 1-8 (2009)

Quantum Chaos and the Cheshire Cat

Alice’s disturbing adventures in Wonderland tumbled upon her like a string of accidents as she wandered a world of chaos.  Rules were never what they seemed and shifted whenever they wanted.  She even met a cat who grinned ear-to-ear and could disappear entirely, or almost entirely, leaving only its grin.

The vanishing Cheshire Cat reminds us of another famous cat—Arnold’s Cat—that introduced the ideas of stretching and folding of phase-space volumes in non-integrable Hamiltonian systems.  But when Arnold’s Cat becomes a Quantum Cat, a central question remains: What happens to the chaotic behavior of the classical system … does it survive the transition to quantum mechanics?  The answer is surprisingly like the grin of the Cheshire Cat—the cat vanishes, but the grin remains.  In the quantum world, the grin of the classical cat lingers even after the rest of the cat has vanished.

The Cheshire Cat fades away, leaving only its grin, like a fine filament, as classical chaos fades into quantum, leaving behind a quantum scar.

The Quantum Mechanics of Classically Chaotic Systems

The simplest Hamiltonian systems are integrable—they have as many constants of the motion as degrees of freedom.  This holds for quantum systems as well as for classical.  There is also a strong correspondence between classical and quantum systems for the integrable cases—literally the Correspondence Principle—that states that quantum systems at high quantum number approach classical behavior.  Even at low quantum numbers, classical resonances are mirrored by quantum eigenfrequencies that can show highly regular spectra.

But integrable systems are rare—surprisingly rare.  Almost no real-world Hamiltonian system is integrable, because the real world warps the ideal.  No spring can displace indefinitely, and no potential is perfectly quadratic.  There are always real-world non-idealities that destroy one constant of the motion or another, opening the door to chaos.

When classical Hamiltonian systems become chaotic, they don’t do it suddenly.  Almost all transitions to chaos in Hamiltonian systems are gradual.  One of the best examples of this is the KAM theory that starts with invariant action integrals that generate invariant tori in phase space.  As nonintegrable perturbations increase, the tori break up slowly into island chains of stability as chaos infiltrates the separatrices—first as thin filaments of chaos surrounding the islands—then growing in width to take up more and more of phase space.  Even when chaos is fully developed, small islands of stability can remain—the remnants of stable orbits of the unperturbed system.

When the classical becomes quantum, chaos softens.  Quantum wave functions don’t like to be confined—they spread and they tunnel.  The separatrix of classical chaos—that barrier between regions of phase space—cannot constrain the exponential tails of wave functions.  And the origin of chaos itself—the homoclinic point of the separatrix—gets washed out.  Then the regular orbits of the classical system reassert themselves, and they appear, like the vestige of the Cheshire Cat, as a grin.

The Quantum Circus

The empty stadium is a surprisingly rich dynamical system that has unexpected structure in both the classical and the quantum domain.  Its importance in classical dynamics comes from the fact that its periodic orbits are unstable and its non-periodic orbits are ergodic (filling all available space if given long enough).  The stadium itself is empty so that particles (classical or quantum) are free to propagate between reflections from the perfectly-reflecting walls of the stadium.  The ergodicity comes from the fact that the stadium—like a classic Roman chariot-race stadium, also known as a circus—is not a circle, but has a straight stretch between two half circles.  This simple modification takes the stable orbits of the circle into the unstable orbits of the stadium.

A single classical orbit in a stadium is shown in Fig. 1. This is an ergodic orbit that is non-periodic and eventually would fill the entire stadium space. There are other orbits that are nearly periodic, such as one that bounces back and forth vertically between the linear portions, but even this orbit will eventually wander into the circular part of the stadium and then become ergodic. The big quantum-classical question is what happens to these classical orbits when the stadium is shrunk to the nanoscale?

Fig. 1 A classical trajectory in a stadium. It will eventually visit every point, a property known as ergodicity.

Simulating an evolving quantum wavefunction in free space is surprisingly simple. Given a beginning quantum wavefunction A(x,y,t0), the discrete update equation is

A(x, y, t+Δt) = A(x, y, t) + (iΔt/ħ) [ (ħ²/2m) ∇²A(x, y, t) − V(x, y) A(x, y, t) ]

where the Laplacian is evaluated by finite differences on the spatial grid [2]. Perfect reflection from the boundaries of the stadium is incorporated by imposing a boundary condition that sends the wavefunction to zero at the walls. Simple!
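A minimal sketch of such a simulation (in the spirit of the FDTD scheme of Ref. [2], with ħ = m = 1; the grid and packet parameters are illustrative choices, not those used for the figures) updates the real and imaginary parts of the wavefunction in leapfrog fashion and zeroes the wavefunction outside the stadium:

```python
import numpy as np

# Explicit leapfrog update of the Schrodinger equation in a 2D stadium billiard.
h = 0.04                                  # grid spacing
x = np.arange(-2.0, 2.0 + h, h)
y = np.arange(-1.0, 1.0 + h, h)
X, Y = np.meshgrid(x, y, indexing="ij")

a, R0 = 0.9, 0.9                          # straight half-length, end-cap radius
inside = (np.abs(X) <= a) & (np.abs(Y) <= R0)
inside |= (np.abs(X) > a) & ((np.abs(X) - a)**2 + Y**2 <= R0**2)

def lap(f):
    """Five-point discrete Laplacian (zeros outside act as hard walls)."""
    out = np.zeros_like(f)
    out[1:-1, 1:-1] = (f[2:, 1:-1] + f[:-2, 1:-1] + f[1:-1, 2:]
                       + f[1:-1, :-2] - 4.0 * f[1:-1, 1:-1]) / h**2
    return out

# Gaussian packet with momentum along x, zeroed outside the stadium
psi = np.exp(-((X - 0.3)**2 + Y**2) / 0.08) * np.exp(1j * 8.0 * X)
psi[~inside] = 0.0
R, I = psi.real.copy(), psi.imag.copy()
norm0 = np.sum(R**2 + I**2)

dt = 0.1 * h**2                           # well below the explicit stability limit
for _ in range(400):
    # alternate (leapfrog) the real and imaginary parts of the wavefunction
    R += dt * (-0.5 * lap(I)); R[~inside] = 0.0
    I -= dt * (-0.5 * lap(R)); I[~inside] = 0.0

norm = np.sum(R**2 + I**2)
print(norm / norm0)                       # probability is (nearly) conserved
```

The staggered real/imaginary update keeps the scheme stable and nearly norm-conserving, and the hard-wall condition is exactly the “send the wavefunction to zero” boundary described above.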

A snap-shot of a wavefunction evolving in the stadium is shown in Fig. 2. To see a movie of the time evolution, see my YouTube episode.

Fig. 2 Snapshot of a quantum wavefunction in the stadium. (From YouTube)

The time average of the wavefunction after a long time has passed is shown in Fig. 3. Other than the horizontal nodal line down the center of the stadium, there is little discernible structure or symmetry. This is also true for the mean squared wavefunction shown in Fig. 4, although there is some structure that may be emerging in the semi-circular regions.

Fig. 3 Time-average wavefunction after a long time.
Fig. 4 Time-average of the squared wavefunction after a long time.

On the other hand, for special initial conditions that have a lot of symmetry, something remarkable happens. Fig. 5 shows several mean-squared results for special initial conditions. There is definite structure in these cases that were given the somewhat ugly name “quantum scars” in the 1980s by Eric Heller, who was one of the first to study this phenomenon [1].

Fig. 5 Quantum scars reflect periodic (but unstable) orbits of the classical system. Quantum effects tend to quench chaos and favor regular motion.

One can superpose highly-symmetric classical trajectories onto the figures, as shown in the bottom row. All of these classical orbits go through a high-symmetry point, such as the center of the stadium (on the left image) and through the focal point of the circular mirrors (in the other two images). The astonishing conclusion of this exercise is that the highly-symmetric periodic classical orbits remain behind as quantum scars—like the Cheshire Cat’s grin—when the system is in the quantum realm. The classical orbits that produce quantum scars have the important property of being periodic but unstable. A slight perturbation from the symmetric trajectory causes it to eventually become ergodic (chaotic). These scars are regions with enhanced probability density, what might be termed “quantum trajectories”, but do not show strong interference patterns.

It is important to make the distinction that it is also possible to construct special wavefunctions that are strictly periodic, such as a wave bouncing perfectly vertically between the straight portions. This leads to large-scale interference patterns that are not the same as the quantum scars.

Quantum Chaos versus Laser Speckle

In addition to the bouncing-wave cases that do not strictly produce quantum scars, there is another “neutral” phenomenon that produces interference patterns that look a lot like scars, but are simply the random addition of lots of plane waves with the same wavelength [2]. A snapshot in time of one of these superpositions is shown in Fig. 6. To see how the waves add together, see the YouTube channel episode.

Fig. 6 The sum of 100 randomly oriented plane waves of constant wavelength. (A snapshot from YouTube.)
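Such a superposition is easy to generate (the number of waves, wavenumber, and grid below are illustrative choices).  For a real Gaussian random field of this kind, the intensity statistics satisfy ⟨ψ⁴⟩/⟨ψ²⟩² = 3—speckle statistics rather than the structure of a scar—which the sketch checks:

```python
import numpy as np

# Random-wave superposition: many plane waves with the same wavenumber k
# but random propagation directions and phases.
rng = np.random.default_rng(2)
Nwaves, k = 100, 60.0
x = np.linspace(0.0, 1.0, 256)
X, Y = np.meshgrid(x, x)

psi = np.zeros_like(X)
for th, ph in zip(rng.uniform(0, 2*np.pi, Nwaves),
                  rng.uniform(0, 2*np.pi, Nwaves)):
    psi += np.cos(k * (np.cos(th) * X + np.sin(th) * Y) + ph)
psi /= np.sqrt(Nwaves)

# Gaussian-field signature: fourth moment / (second moment)^2 approaches 3
ratio = np.mean(psi**4) / np.mean(psi**2)**2
print(ratio)
```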

YouTube Video

YouTube Video of Quantum Chaos


References

[1] E. J. Heller, “Bound-state eigenfunctions of classically chaotic Hamiltonian systems: Scars of periodic orbits,” Physical Review Letters 53, 1515 (1984)

[2] M. C. Gutzwiller, Chaos in Classical and Quantum Mechanics (Springer-Verlag, New York, 1990)

Georg Cantor meets Machine Learning: Deep Discrete Encoders

Machine learning is characterized, more than by any other aspect, by the high dimensionality of the data spaces it seeks to find patterns in.  Hence, one of the principal functions of machine learning is to reduce the dimensionality of the data to lower dimensions—a process known as dimensionality reduction.

There are two driving reasons to reduce the dimensionality of data: 

First, typical dimensionalities faced by machine learning problems can be in the hundreds or thousands.  Trying to visualize such high dimensions may sound mind-expanding, but it is really just saying that a data problem may have hundreds or thousands of different data entries for a single event.  Many, or even most, of those entries may not be independent, while many others may be pure noise—or at least not relevant to the pattern.  Deep learning dimensionality reduction seeks to find the dependences—many of them nonlinear and non-single-valued (non-invertible)—and to reject the noise channels.

Second, the geometry of high dimension is highly unintuitive.  Many of the things we take for granted in our pitifully low dimension of 3 (or 4 if you include time) just don’t hold in high dimensions.  For instance, in very high dimensions almost all random vectors in a hyperspace are orthogonal, and almost all random unit vectors in the hyperspace are equidistant.  Even the topology of landscapes in high dimensions is unintuitive—there are far more mountain ridges than mountain peaks—with profound consequences for dynamical processes such as random walks (see my Blog on a Random Walk in 10 Dimensions).  In fact, we owe our evolutionary existence to this effect!  Therefore, deep dimensionality reduction is a way to bring complex data down to a dimensionality where our intuition can be applied to “explain” the data.

But what is a dimension?  And can you find the “right” dimensionality when performing dimensionality reduction?  Once again, our intuition struggles with these questions, as first discovered by a nineteenth-century German mathematician whose mind-expanding explorations of the essence of different types of infinity shattered the very concept of dimension.

Georg Cantor

Georg Cantor (1845 – 1918) was born in Russia, and the family moved to Germany while Cantor was still young.  In 1863, he enrolled at the University of Berlin, where he sat in on lectures by Weierstrass and Kronecker.  He received his doctorate in 1867 and his Habilitation in 1869, moving into a faculty position at the University of Halle and remaining there for the rest of his career.  Cantor published a paper early in 1872 on the question of whether the representation of an arbitrary function by a Fourier series is unique.  He had found that even though the series might converge to a function almost everywhere, there surprisingly could still be an infinite number of points where the convergence failed.  Originally, Cantor was interested in the behavior of functions at these points, but his interest soon shifted to the properties of the points themselves, which became his life’s work as he developed set theory and transfinite mathematics.

Georg Cantor (1845 – 1918)

In 1878, in a letter to his friend Richard Dedekind, Cantor showed that there was a one-to-one correspondence between the real numbers and the points in any n-dimensional space.  He was so surprised by his own result that he wrote to Dedekind “I see it, but I don’t believe it.”  Previously, the ideas of a dimension, moving in a succession from one (a line) to two (a plane) to three (a volume) had been absolute.  However, with his newly-discovered mapping, the solid concepts of dimension and dimensionality began to dissolve.  This was just the beginning of a long history of altered concepts of dimension (see my Blog on the History of Fractals).

Mapping Two Dimensions to One

Cantor devised a simple mapping that is at once obvious and subtle. To take a specific example of mapping a plane to a line, one can consider the two coordinate values (x,y), both with a closed domain on [0,1]. Each can be expressed as a decimal fraction given by

x = 0.x1x2x3x4…
y = 0.y1y2y3y4…

These two values can be mapped to a single value through a simple “ping-pong” between the decimal digits as

t = 0.x1y1x2y2x3y3x4y4…

If x and y are irrational, then this presents a simple mapping of a pair of numbers (two-dimensional coordinates) to a single number. In this way, the special significance of two dimensions relative to one dimension is lost. In terms of set-theory nomenclature, the cardinality of the two-dimensional R2 is the same as that of the one-dimensional R.

Nonetheless, intuition about dimension still has its merits, and a subtle aspect of this mapping is that it contains discontinuities. These discontinuities are countably infinite, but they do disrupt any smooth transformation from 2D to 1D, which is where the concepts of intrinsic dimension are hidden. The topological dimension of the plane is clearly equal to 2, and that of the line is clearly equal to 1, determined by the dimensionality of the cuts required to separate the sets: a set of topological dimension D+1 is separated by a cut of dimension D. The area is separated by a D = 1 line, while the line is separated by a D = 0 point.

Fig. 1 A mapping of a 2D plane to a 1D line. Every point in the plane has an associated point on the line (except for a countably infinite number of special points … see the next figure.)
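The ping-pong construction can be sketched directly on digit strings (finite-precision strings stand in for the infinite decimal expansions, and the helper names are illustrative):

```python
# Cantor's "ping-pong" map from the unit square to the unit interval,
# acting on the digit strings after the decimal point.
def interleave(x_digits: str, y_digits: str) -> str:
    """Map (0.x1x2..., 0.y1y2...) to 0.x1y1x2y2... (digit strings, no '0.')."""
    return "".join(a + b for a, b in zip(x_digits, y_digits))

def split(t_digits: str) -> tuple[str, str]:
    """Inverse map: de-interleave the digits of t back into (x, y)."""
    return t_digits[0::2], t_digits[1::2]

t = interleave("1415", "9265")   # x = 0.1415, y = 0.9265
print(t)                          # "19421655" -> t = 0.19421655
print(split(t))                   # recovers ("1415", "9265")
```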

While the discontinuities help preserve the notions of dimension, and they are countably infinite (with the cardinality of the rational numbers), they pertain merely to the representation of numbers by decimals. As an example, in decimal notation the number a = 14159/10^5 has two equivalent representations

a = 0.141590000… = 0.141589999…

trailing either infinitely many 0’s or infinitely many 9’s. Despite the fact that these representations point to the same number, when it is used as one of the pair in the bijection of Fig. 1, it produces two distinctly different values of t in R. Fortunately, there is a way to literally sweep this under the rug. Any time one retrieves a number that has repeating 0’s or 9’s, simply sweep it to the right by dividing by a power of 10, into a region of trailing zeros of a different number, call it b, as in Fig. 2. The shifted version of a will not overlap with the number b, because b also ends in repeating 0’s or 9’s and so is swept farther to the right, and so on to infinity. None of these numbers overlap, each is distinct, and the mapping becomes what is known as a bijection, with one-to-one correspondence.

Fig. 2 The “measure-zero” fix. Numbers that have two equivalent representations can be shifted to the right to replace other numbers that are shifted further to the right, and so on to infinity. There is infinite room within the reals to accommodate the countably infinite number of repeating-decimal numbers.

Space-Filling Curves

When the peripatetic mathematician Giuseppe Peano learned of Cantor’s result for the mapping of 2D to 1D, he sought to demonstrate the correspondence geometrically, and he constructed a continuous curve that filled space, publishing the method in Sur une courbe, qui remplit toute une aire plane [1] in 1890.  The construction of Peano’s curve proceeds by taking a square and dividing it into 9 equal sub-squares.  Lines connect the centers of each of the sub-squares.  Then each sub-square is divided again into 9 sub-squares whose centers are all connected by lines.  At this stage, the original pattern, repeated 9 times, is connected together by 8 links, forming a single curve.  This process is repeated infinitely many times, resulting in a curve that passes through every point of the original plane square.  In this way, a line is made to fill a plane.  Where Cantor had proven abstractly that the cardinality of the real numbers was the same as the points in n-dimensional space, Peano created a specific example.

This was followed quickly by another construction, invented by David Hilbert in 1891, that divided the square into four instead of nine, simplifying the construction, but also showing that such constructions were easily generated [2].  The space-filling curves of Peano and Hilbert have the extreme properties that a one-dimensional curve approaches every point in a two-dimensional space.  These curves have topological dimensionality of 1D and a fractal dimensionality of 2D. 

Fig. 3 Peano’s (1890) and Hilbert’s (1891) plane-filling curves.  When the iterations are taken to infinity, the curves approach every point of two-dimensional space arbitrarily closely, giving them a dimension D = 2.
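Hilbert’s construction also admits a compact index-to-coordinate algorithm.  The sketch below is the standard iterative bit-manipulation form of the mapping (a modern restatement, not Hilbert’s original geometric recursion); it visits every cell of an n-by-n grid exactly once, moving one cell at a time:

```python
def d2xy(n, d):
    """Map step index d (0 <= d < n*n) along the Hilbert curve to an (x, y)
    cell of an n-by-n grid, where n is a power of two."""
    x = y = 0
    t = d
    s = 1
    while s < n:
        rx = 1 & (t // 2)
        ry = 1 & (t ^ rx)
        if ry == 0:                       # rotate/reflect the quadrant
            if rx == 1:
                x, y = s - 1 - x, s - 1 - y
            x, y = y, x
        x += s * rx                       # move into the proper quadrant
        y += s * ry
        t //= 4
        s *= 2
    return x, y

cells = [d2xy(8, d) for d in range(64)]
print(cells[:5])   # each step moves to an adjacent cell, visiting all 64 once
```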

A One-Dimensional Deep Discrete Encoder for MNIST

When performing dimensionality reduction in deep learning it is tempting to think that there is an underlying geometry to the data—as if they reside on some submanifold of the original high-dimensional space.  While this can be the case (for which dimensionality reduction is called deep metric learning), more often there is no such submanifold.  For instance, when there is a highly conditional nature to the different data channels, in which some measurements are conditional on the values of other measurements, then there is no simple contiguous space that supports the data.

It is also tempting to think that a deep learning problem has some intrinsic number of degrees of freedom which should be matched by the selected latent dimensionality for the dimensionality reduction.  For instance, if there are roughly five degrees of freedom buried within a complicated data set, then it is tempting to believe that the appropriate latent dimensionality also should be chosen to be five.  But this also is missing the mark. 

Take, for example, the famous MNIST data set of hand-written digits from 0 to 9.  Each digit example is contained in a 28-by-28 two-dimensional array that is typically flattened to a 784-element linear vector that locates that single digit example within a 784-dimensional space.  The goal is to reduce the dimensionality down to a manageable number—but what should the resulting latent dimensionality be?  How many degrees of freedom are involved in writing digits?  Furthermore, given that there are ten classes of digits, should the chosen dimensionality of the latent space be related to the number of classes?

Fig. 4 A sampling of MNIST digits.

To illustrate that the dimensionality of the latent space has little to do with degrees of freedom or the number of classes, let’s build a simple discrete encoder that encodes the MNIST data set to the one-dimensional line—following Cantor.

The deep network of the encoder can have a simple structure that terminates with a single neuron that has a piece-wise linear output.  The objective function (or loss function) measures the squared distances of the outputs of the single neuron, after transformation by the network, to the associated values 0 to 9.  And that is it!  Train the network by minimizing the loss function.

Fig. 5 A deep encoder that encodes discrete classes to a one-dimensional latent variable.
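The encoder can be sketched in a few lines of plain NumPy.  In the toy version below, synthetic Gaussian clusters stand in for the MNIST images, and the small network and training loop are illustrative stand-ins for the actual architecture, which is not specified in detail here:

```python
import numpy as np

# Toy stand-in for MNIST: ten Gaussian clusters in 64 dimensions play the
# role of the ten digit classes.
rng = np.random.default_rng(0)
centers = rng.normal(size=(10, 64))
X = np.vstack([c + 0.1 * rng.normal(size=(200, 64)) for c in centers])
labels = np.repeat(np.arange(10.0), 200)

# One hidden ReLU layer feeding a single linear output neuron: the 1-D latent.
W1 = 0.1 * rng.normal(size=(64, 32)); b1 = np.zeros(32)
W2 = 0.1 * rng.normal(size=(32, 1));  b2 = np.zeros(1)

def forward(X):
    h = np.maximum(X @ W1 + b1, 0.0)
    return h, (h @ W2 + b2).ravel()

loss0 = np.mean((forward(X)[1] - labels) ** 2)
lr = 1e-3
for _ in range(500):
    h, out = forward(X)
    err = (out - labels) / len(X)            # gradient of the MSE w.r.t. out (up to 2x)
    gW2 = h.T @ err[:, None]; gb2 = err.sum(keepdims=True)
    dh = (err[:, None] @ W2.T) * (h > 0)     # back-propagate through the ReLU
    gW1 = X.T @ dh; gb1 = dh.sum(axis=0)
    W1 -= lr * gW1; b1 -= lr * gb1
    W2 -= lr * gW2; b2 -= lr * gb2
loss = np.mean((forward(X)[1] - labels) ** 2)
print(loss0, loss)   # the squared distance to the class targets decreases
```

The loss function is exactly the squared distance of the single-neuron output to the class targets 0 through 9, and training pulls the ten clusters toward ten separated intervals on the line.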

The results of the linear encoder are shown in Fig. 6 (the transverse directions are only for ease of visualization…the classification is along the long axis). The different dots are specific digit examples, and the colors are the digit values. There is a clear trend from 0 through 9, although with this linear encoder there is substantial overlap among the point clouds.

Fig. 6 Latent space for a one-dimensional (line) encoder. The transverse dimensions are only for visualization.

Despite appearances, this one-dimensional discrete line encoder is NOT a form of regression. There is no such thing as 1.5 as the average of a 1-digit and a 2-digit. And 5 is not the average of a 4-digit and a 6-digit. The digits are images that have no intrinsic numerical value. Therefore, this one-dimensional encoder is highly discontinuous, and intuitive ideas about intrinsic dimension for continuous data remain secure.

Summary

The purpose of this Blog (apart from introducing Cantor and his ideas on dimension theory) was to highlight a key question of dimensionality reduction in representation theory: What is the intrinsic dimensionality of a dataset? The answer, in the case of discrete classes, is that there is no intrinsic dimensionality. Having 1 degree of freedom or 10 degrees of freedom, i.e. latent dimensionalities of 1 or 10, is mostly irrelevant. In ideal cases, one is just as good as the other.

On the other hand, for real-world data with its inevitable variability and finite variance, there can be reasons to choose one latent dimensionality over another. In fact, the “best” choice of dimensionality is one less than the number of classes. For instance, in the case of MNIST with its 10 classes, that is a 9-dimensional latent space. The reason this is “best” has to do with the geometry of high-dimensional simplexes, which will be the topic of a future Blog.


[1] G. Peano, “Sur une courbe, qui remplit toute une aire plane,” Mathematische Annalen 36, 157–160 (1890)

[2] D. Hilbert, “Über die stetige Abbildung einer Linie auf ein Flächenstück,” Mathematische Annalen, vol. 38, pp. 459–460 (1891)

Democracy against Authoritarians: The Physics of Public Opinion

An old joke goes that Democracy is a terrible form of government … except it’s better than all the others!

Our world today is faced with conflict between democracy and dictatorship. On the one side is the free world, where leaders are chosen by some form of representation of large numbers of citizens and sometimes even a majority. On the other side is authoritarianism where a select few are selected by a select few to govern everyone else.

[I]t has been said that democracy is the worst form of Government except all those other forms that have been tried from time to time; but there is the broad feeling in our country that the people should rule, and that public opinion expressed by all constitutional means, should shape, guide, and control the actions of Ministers who are their servants and not their masters.

Winston Churchill (1947)

An argument in favor of democracy is freedom of choice for the largest segment of the population, plus the ability to remove leaders who fail to provide for the perceived welfare of the most citizens. This makes democracy adaptive, shifting with the times. It also makes leaders accountable for their actions and crimes. An argument in favor of authoritarianism is the myth of the benevolent dictator–someone who knows what’s best for the people even if the people don’t know it themselves.

But dictators are rarely benevolent, and as they become saturated with power, they are corrupted. The criminal massacres of Ukrainian civilians perpetrated by Putin are among the strongest recent arguments against authoritarianism. A single man decides, on a whim, the life and death of thousands or maybe more. The invasion of Ukraine is so egregious and unwarranted that we wonder how the Russian people can put up with their isolated and manic leader. Yet by some measures more than 60% of the people in Russia approve of the war.

How can the free world see the invasion as the atrocity it is, while Russia’s majority sees it as a just war? The answer is a surprising result of population dynamics known as the replicator-mutator equation. The challenge for us here in the free world is to learn how to game the replicator-mutator equation to break up the monopoly of popular opinion and make Putin pay for his arrogance. This blog explains how “mass hysteria” can arise from forces within a complex environment, and how to construct a possible antidote.

Replicator-Mutator Equation

There are several simple models of population dynamics that try to explain the rise and fall of the number of individuals that belong to varying cohorts within the population. These models incorporate aspects of relative benefit of one group over another, plus the chance to change sides–defection. The dynamics under these conditions can be highly nonlinear and highly non-intuitive. One of the simplest of these models is known as the replicator-mutator model where replication follows the fitness of the cohort, and where individuals can defect to a “more fit” cohort.

The basic dynamics of the model are

$$\dot{x}_a = W_{ab}x_b - \phi x_a$$

where x_a is the fraction of the population that is in cohort a, W_ab is a transition probability, and φ is the average fitness of the full population.  The transition matrix is given by

$$W_{ab} = f_b Q_{ba}$$

where f_b is the fitness of cohort b and Q_ba is a stochastic matrix that allows for defection of an individual from one cohort to another.  The fitness of a cohort is given by

$$f_b = p_{bc}x_c$$

where p_bc is the pay-off matrix for the relative benefit of one cohort at the expense of another.  Finally, the average fitness is

$$\phi = f_b x_b$$

The Einstein implicit summation convention is assumed in all of these equations, and the metric space in which the dynamics are embedded is “flat” so that there is no essential difference between superscripts and subscripts.  There is also a conservation law that the sum over all population fractions equals unity:

$$\sum_a x_a = 1$$

In the language of population dynamics, this model has frequency-dependent fitness, with defection and pay-off, in a zero-sum game.

One of the simplest questions to answer with this model is how so many people can come to believe one thing. This is known as “opinion uniformity”.

Uniformity versus Diversity

This replicator-mutator model explains the property of opinion uniformity, as well as the opposite extreme of opinion diversity. The starting point for both is the pay-off matrix p_bc, which is assumed to be unity on the diagonal (b = c) and equal to a constant factor a off the diagonal (b ≠ c). This pay-off is symmetric so that all opinions are equally “believable”. The stochastic defection matrix is close to unity on the diagonal, and has random terms on the off-diagonal that are proportional to a constant ε. The defection matrix allows a person from one cohort to defect to the belief system of another cohort if they believe that the new cohort has more merit. Cohorts with greater merit (fitness) gain more members over time, while cohorts with lower merit have fewer members over time.

Note that the fitness increases with the number of members in the cohort. This is the bandwagon effect. A belief system is perceived to have more merit if there are more people who believe it. This clearly creates a positive feedback that would cause this cohort to grow. Even though all the cohorts have this same positive feedback, the zero-sum rule only allows one of the cohorts to grow to its maximum extent, taking members away from all the other cohorts. This is illustrated in Fig. 1. One belief system wins, taking almost the full population with it.

Fig. 1 Population fractions evolving as a function of time for a = 0.5 and a small defection rate ε = 0.02. One winner takes almost all the population. These are two views of the same data on semilog and log-log.

What allows the winner to take all is the positive feedback where the fitness of the cohort increases with the number of members, combined with the ability for that cohort to take members from other cohorts through the defection matrix.
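These dynamics are easy to reproduce. Below is a minimal Python sketch of the replicator-mutator model with a = 0.5 and ε = 0.02 as in Fig. 1, but with a smaller N and a simple random defection matrix of my own choosing, so the numbers are illustrative rather than a reproduction of the figure:

```python
# Minimal replicator-mutator sketch: symmetric pay-off, random defections.
# N, a and eps play the same roles as in the text; the specific random
# defection matrix is an illustrative choice, not the author's exact one.
import numpy as np

def simulate(N=16, a=0.5, eps=0.02, T=400.0, dt=0.1, seed=1):
    rng = np.random.default_rng(seed)
    p = np.full((N, N), a)                     # pay-off: a off the diagonal...
    np.fill_diagonal(p, 1.0)                   # ...and unity on the diagonal
    Q = eps * rng.random((N, N)) / (N - 1)     # random defection terms ~ eps
    np.fill_diagonal(Q, 0.0)
    np.fill_diagonal(Q, 1.0 - Q.sum(axis=1))   # stochastic: rows sum to one
    x = rng.random(N)
    x /= x.sum()                               # fractions sum to unity
    for _ in range(int(T / dt)):               # forward-Euler time stepping
        f = p @ x                              # fitness  f_b = p_bc x_c
        phi = f @ x                            # average fitness
        x += dt * ((f * x) @ Q - phi * x)      # xdot_a = f_b x_b Q_ba - phi x_a
        x = np.clip(x, 0.0, None)
        x /= x.sum()                           # enforce the conservation law
    return x

x = simulate()
print(np.sort(x)[-3:])   # one cohort ends up holding most of the population
```

With these parameters the bandwagon feedback concentrates most of the population into a single cohort, qualitatively as in Fig. 1.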

However, all of the cohorts are trying the same thing, and the pay-off matrix is fully symmetric and equal for all cohorts, so no cohort is intrinsically “better” than another. This property opens the door to a strong alternative to opinion uniformity. As more members are allowed to defect, a trend counter to winner-take-all develops, helping to equalize the cohorts. Then a bifurcation is passed, and the winner-take-all state converts discontinuously to a super-symmetric situation in which all opinions are held by equal numbers of people. This is illustrated in Fig. 2 for a slightly higher defection rate ε = 0.03. The parameters are identical to those in Fig. 1, but the higher defection rate stabilizes the super-symmetric state of maximum diversity.

Fig. 2 Population fractions for higher defection rate of 0.03. In super-symmetric state, all opinions are held at the same rate with maximum diversity.

These two extreme results of the replicator-mutator equation, which switch suddenly from one to the other depending on the defection rate, may seem to produce solutions neither of which is ideal for a healthy democracy. On the one hand, in the uniform case where the winning opinion is monolithic, everyone is a carbon-copy of everyone else, which is a form of cultural death (lack of diversity). But, on the other hand, one might argue that maximum opinion diversity is just as concerning, because no one can agree on anything. If all opinions are equivalent, then everyone in the society believes something different and there is no common ground. But in the diversity case, at least there is no state-level control of the population. In the case of opinion uniformity, the winning opinion can be manipulated by propaganda.

The Propaganda Machine

A government can “seed” the belief networks with propaganda that favors the fitness of what they want their citizens to hear. Because of the positive feedback, any slight advantage of one opinion over others can allow that opinion to gain large numbers through the bandwagon effect. Of course, even stronger control that stifles dissent, for instance by shutting down the free press, makes it that much more likely that the state-controlled story is believed. This may be one reason why more than 60% (as of the writing of this blog) support Putin’s war, despite the obvious lies that are being told. This is illustrated in Fig. 3 by boosting the payoff between two similar lies that the government wants its people to believe. These rise to take about 60% of the population. Members of the cohort are brainwashed, not by the government alone, but by all their neighbors who are parroting the same thing.

Fig. 3 Government propaganda acts as a “seed” that makes the propaganda grow faster than other beliefs, even for a defection rate of 0.03 which is above the threshold of Fig. 2.
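The seeding itself is a one-line change to such a model: boost the pay-off between two cohorts, mirroring the payoff(1,N) = 1 “Propaganda” lines in the Matlab program Repmut.m listed below. This Python sketch is again an illustrative toy with a small N and a random defection matrix of my own choosing:

```python
# Toy propaganda seeding: the pay-off between cohorts 0 and N-1 is boosted
# to unity, which lets the mutually reinforcing pair outgrow the rest.
import numpy as np

def simulate(N=16, a=0.5, eps=0.02, T=400.0, dt=0.1, seed=1):
    rng = np.random.default_rng(seed)
    p = np.full((N, N), a)
    np.fill_diagonal(p, 1.0)
    p[0, N-1] = p[N-1, 0] = 1.0                # seed: two mutually supporting "lies"
    Q = eps * rng.random((N, N)) / (N - 1)     # random defection terms ~ eps
    np.fill_diagonal(Q, 0.0)
    np.fill_diagonal(Q, 1.0 - Q.sum(axis=1))   # stochastic: rows sum to one
    x = np.full(N, 1.0 / N)                    # start from a uniform population
    for _ in range(int(T / dt)):
        f = p @ x                              # fitness  f_b = p_bc x_c
        phi = f @ x                            # average fitness
        x += dt * ((f * x) @ Q - phi * x)      # replicator-mutator step
        x = np.clip(x, 0.0, None)
        x /= x.sum()
    return x

x = simulate()
print(x[0] + x[-1])   # the seeded pair takes the lion's share of the population
```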

Breaking the Monopoly of Thought

How do we fight back? Not just against the Kremlin’s propaganda, but also against QAnon and Trump’s Big Lie and the pernicious fallacy of nationalism? The answer is simple: diversity of thought! The silver bullet in the replicator-mutator model is the defection matrix. The existence of a bifurcation means that a relatively small increase in the amount of diverse opinion, and in the freedom to swap opinions, can lead to a major qualitative collapse of monolithic thought, even when supported by government propaganda, as shown in Fig. 4. More people may still believe in the state-supported propaganda than the others, but it is no longer a majority.

Fig. 4 Increasing the defection rate can help equalize free opinions against the state-supported propaganda

The above models were all very homogeneous. It is more realistic that people are connected through small-world networks. In this case, there is much more diversity, as shown in Fig. 5, although the defection rate needs to be much higher to prevent a monolithic opinion from dominating. The state-supported propaganda is buried in the resulting mix of diverse ideas. Therefore, to counteract state control, people must feel free to hop about in their choice of beliefs and have access to other beliefs.

Fig. 5 The defection matrix is multiplied by the adjacency matrix of a small-world network. There is significant diversity of thought, but a relatively high defection rate is needed. The state-supported propaganda is buried in this mix.

This is a bit paradoxical. On the one hand, the connectivity of the internet has fostered the rise of conspiracy theories and other odd-ball ideas. But sustained access to multiple sources of information is the best defense against all that crazy stuff winning out. In other words, not only do we have to put up with the lunatic fringe if we are to have full diversity of thought, but we need to encourage everyone to feel free to “shop around” for different ideas, even if some of them are crazy. Our free society shouldn’t be cancelling people who have divergent opinions, because that sets us down the path to authoritarianism. As a recent ad in the New York Times said, “Cancel culture cancels culture.” Unfortunately, authoritarianism is on the rise around the world, and the US almost suffered that fate on Jan. 6, 2021. Furthermore, with Xi aligning with Putin and giving him the green light on Ukraine–cynically on the eve of the Olympic Games (of peace)–the new world order will revolve around that axis for decades to come, if the world survives that long. Diversity and freedom may be the only antidote.

Matlab Program: Repmut.m

function repmut
% https://github.itap.purdue.edu/nolte/Matlab-Programs-for-Nonlinear-Dynamics

clear
format compact

N = 63;     
p = 0.5;

mutype = 1;     % 0 = Hamming   1 = rand
pay = 1;        % 0 = Hamming   1 = 1/sqrt(N) 
ep = 0.5;      % average mutation rate: 0.1 to 0.01 typical  (0.4835)

%%%%% Set original population
x0temp = rand(1,N);    % Initial population
sx = sum(x0temp);
y0 = x0temp/sx;
Pop0 = sum(y0);


%%%%% Set Adjacency

%node = makeglobal(N);
%node = makeER(N,0.25);       % 0.5     0.25 
%node = makeSF(N,6);       % 12         6
node = makeSW(N,7,0.125);   % 15,0.5    7,0.5
[Adj,degree,Lap] = adjacency(node);

%%%%%% Set Hamming distance
for yloop = 1:N
    for xloop = 1:N
        H(yloop,xloop) = hamming(yloop-1,xloop-1);
    end
end

%%%%%%% Set Mutation matrix
if mutype == 0
    Qtemp = 1./(1+H/ep);    %Mutation matrix on Hamming
    Qsum = sum(Qtemp,2);
    mnQsum = mean(Qsum);
    
    % Normalize mutation among species
    for yloop = 1:N
        for xloop = 1:N
            Q(yloop,xloop) = Qtemp(yloop,xloop)/Qsum(xloop);
        end
    end
    
elseif mutype == 1  
    S = stochasticmatrix(N);
    Stemp = S - diag(diag(S));
    Qtemp = ep*Stemp;
    sm = sum(Qtemp,2)';
    Q = Qtemp + diag(ones(1,N) - sm);
end

figure(1)
imagesc(Q)
title('Mutation Matrix')
colormap(jet)

%%%%%%% Set payoff matrix
if pay == 1
    payoff = zeros(N,N);
    for yloop = 1:N
        payoff(yloop,yloop) = 1;
        for xloop = yloop + 1:N
            payoff(yloop,xloop) = p;
            payoff(xloop,yloop) = p;
            payoff(1,N) = 1;    % Propaganda
            payoff(N,1) = 1;
        end
    end
elseif pay == 0
    payoff = zerodiag(exp(-1*H));
end

figure(2)
imagesc(payoff)
title('Payoff Matrix')
colormap(jet)

% Run time evolution
tspan = [0 4000];
[t,x] = ode45(@quasispec,tspan,y0);

Pop0
[sz,dum] = size(t);
Popend = sum(x(sz,:))

for loop = 1:N
    fit(loop) = sum(payoff(:,loop)'.*x(sz,:));
end

phistar = sum(fit.*x(sz,:))       % final average fitness

xend = x(sz,:)
sortxend = sort(xend,'descend');
coher = sum(sortxend(1:2))

figure(3)
clf
h = colormap(lines);
for loop = 1:N
    plot(t,x(:,loop),'Color',h(round(loop*64/N),:),'LineWidth',1.25)
    hold on
end
hold off

figure(4)
clf
for loop = 1:N
    semilogx(t,x(:,loop),'Color',h(round(loop*64/N),:),'LineWidth',1.25)
    hold on
end
hold off

figure(5)
clf
for loop = 1:N
    semilogy(t,x(:,loop),'Color',h(round(loop*64/N),:),'LineWidth',1.25)
    hold on
end
hold off

figure(6)
clf
for loop = 1:N
    loglog(t,x(:,loop),'Color',h(round(loop*64/N),:),'LineWidth',1.25)
    hold on
end
hold off

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%
%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
    function yd = quasispec(~,y)
        
        for floop = 1:N
            f(floop) = sum(payoff(:,floop).*y);
        end
        
        Connect = Adj + eye(N);
        
        % Transition matrix
        for yyloop = 1:N
            for xxloop = 1:N
                W(yyloop,xxloop) = f(yyloop)*(Connect(yyloop,xxloop)*Q(yyloop,xxloop));
            end
        end
        
        phi = sum(f'.*y);   % Average fitness of population
        
        yd = W*y - phi*y;
        
    end     % end quasispec
end

Further Reading

M. A. Nowak, Evolutionary Dynamics: Exploring the Equations of Life. Cambridge, Mass.: Harvard University Press, 2006.

Life in a Solar System with a Super-sized Jupiter

There are many known super-Jupiters that orbit their stars—they are detected through a slight Doppler wobble they induce on their stars [1].  But what would become of a rocky planet also orbiting those stars as they feel the tug of both the star and the super planet?

This is not of immediate concern for us, because our solar system has had its current configuration of planets for over 4 billion years.  But wandering interstellar planets or brown dwarfs could visit our solar system, like Oumuamua did in 2017, but much bigger and able to scramble the planetary orbits. Such hypothesized astronomical objects have been given the name “Nemesis”, and it warrants thought on what living in an altered solar system might be like.

What would happen to Earth if Jupiter were 50 times bigger? Could we survive?

The Three-Body Problem

The Sun-Earth-Jupiter configuration is a three-body problem that has a long and interesting history, playing a key role in several aspects of modern dynamics [2].  There is no general analytical solution to the three-body problem.  To find the behavior of three mutually interacting bodies requires numerical solution.  However, there are subsets of the three-body problem that do yield to partial analytical approaches.  One of these is called the restricted three-body problem [3].  It consists of two massive bodies plus a third (nearly) massless body that all move in a plane.  This restricted problem was first tackled by Euler and later by Poincaré, who discovered the existence of chaos in its solutions.

The geometry of the restricted three-body problem is shown in Fig. 1. In this problem, take mass m1 = mS to be the Sun’s mass, m2 = mJ to be Jupiter’s mass, and the third (small) mass is the Earth. 

Fig. 1  The restricted 3-body problem in the plane.  The third mass is negligible relative to the first two masses that obey 2-body dynamics.

The equation of motion for the Earth is

$$\ddot{\vec{r}} = -4\pi^2\left[\frac{\vec{r}-\vec{r}_S}{\left|\vec{r}-\vec{r}_S\right|^3} + \xi\,\frac{\vec{r}-\vec{r}_J}{\left|\vec{r}-\vec{r}_J\right|^3}\right] \qquad (1)$$

(in units of AU and years, for which GM_S = 4π²), where the Sun and Jupiter orbit their common center of mass

$$\vec{r}_J(t) = \frac{R_J}{1+\xi}\left(\cos\omega_J t,\ \sin\omega_J t\right) \qquad \vec{r}_S(t) = -\frac{\xi R_J}{1+\xi}\left(\cos\omega_J t,\ \sin\omega_J t\right)$$

and the parameter ξ characterizes the strength of the perturbation of the Earth’s orbit around the Sun.  The parameters for the Jupiter-Sun system are

$$\xi = \frac{m_J}{m_S} = 0.001 \qquad R_J = 5.203~\mathrm{AU}$$

with

$$\omega_J = \frac{2\pi}{T_J} = \frac{2\pi}{11.86~\mathrm{yr}}$$

for the 11.86-year journey of Jupiter around the Sun.  Eq. (1) is a four-dimensional non-autonomous flow

$$\dot{x} = v_x \qquad \dot{v}_x = -4\pi^2\left[\frac{x - x_S}{r_S^3} + \xi\,\frac{x - x_J}{r_J^3}\right]$$

$$\dot{y} = v_y \qquad \dot{v}_y = -4\pi^2\left[\frac{y - y_S}{r_S^3} + \xi\,\frac{y - y_J}{r_J^3}\right]$$

where r_S and r_J denote the instantaneous distances from the Earth to the Sun and to Jupiter.

The solutions of an Earth orbit are shown in Fig. 2.  The natural Earth-Sun-Jupiter system has a mass ratio mJ/mS = 0.001 for Jupiter relative to the Sun.  Even in this case, Jupiter causes perturbations of the Earth’s orbit by about one percent.  If the mass of Jupiter increases, the perturbations grow larger, until around ξ = 0.06 the perturbations become severe and the orbit grows unstable.  The Earth gains energy from the momentum of the Sun-Jupiter system and can reach escape velocity.  The simulation for a mass ratio of 0.07 shows the Earth ejected from the Solar System.

Fig. 2  Orbit of Earth as a function of the size of a Jupiter-like planet.  The natural system has a Jupiter-Sun mass ratio of 0.001.  As the size of Jupiter increases, the Earth orbit becomes unstable and can acquire escape velocity to leave the Solar System. From body3.m. (Reprinted from Ref. [4])
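A minimal Python port of this simulation is sketched below (a hand-rolled RK4 in place of Matlab's ode45; the program body3.m itself is listed at the end of this post). Units are AU and years so that GM_S = 4π². For the natural mass ratio ξ = 0.001 the Earth's orbit stays nearly circular:

```python
# Restricted three-body integration: Sun and Jupiter on circular orbits about
# their barycenter, Earth as a massless third body. Units: AU and years, so
# G*M_sun = 4*pi^2. xi = Jupiter-Sun mass ratio, TJ = Jupiter period in years.
import numpy as np

def earth_orbit(xi=0.001, TJ=11.86, tmax=24.0, dt=0.001):
    wJ = 2.0 * np.pi / TJ
    RJ = 5.203                          # Jupiter's orbit radius in AU
    rS = xi * RJ / (1.0 + xi)           # Sun's offset from the barycenter
    rJ = RJ / (1.0 + xi)                # Jupiter's distance from the barycenter

    def deriv(t, s):
        x, vx, y, vy = s
        xj, yj = rJ * np.cos(wJ * t), rJ * np.sin(wJ * t)
        xs, ys = -rS * np.cos(wJ * t), -rS * np.sin(wJ * t)
        dJ3 = ((x - xj)**2 + (y - yj)**2)**1.5
        dS3 = ((x - xs)**2 + (y - ys)**2)**1.5
        ax = -4.0 * np.pi**2 * ((x - xs) / dS3 + xi * (x - xj) / dJ3)
        ay = -4.0 * np.pi**2 * ((y - ys) / dS3 + xi * (y - yj) / dJ3)
        return np.array([vx, ax, vy, ay])

    r0 = 1.0 - rS                       # Earth starts at 1 AU from the Sun
    s = np.array([r0, 0.0, 0.0, 2.0 * np.pi / np.sqrt(r0)])
    radii, t = [], 0.0
    for _ in range(int(tmax / dt)):     # classic fixed-step RK4
        k1 = deriv(t, s)
        k2 = deriv(t + dt/2, s + dt/2 * k1)
        k3 = deriv(t + dt/2, s + dt/2 * k2)
        k4 = deriv(t + dt, s + dt * k3)
        s = s + (dt/6) * (k1 + 2*k2 + 2*k3 + k4)
        t += dt
        radii.append(np.hypot(s[0], s[2]))
    return np.array(radii)

R = earth_orbit()
print(R.min(), R.max())   # the orbit radius stays close to 1 AU for xi = 0.001
```

Raising xi toward 0.06 and beyond (as in the figure) destabilizes the orbit, though the exact outcome depends on the initial conditions.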

The chances for ejection depend on the initial conditions of these simulations, but generally the danger becomes severe when Jupiter is about 50 times larger than it currently is. Otherwise the Earth remains safe from ejection. However, if the Earth is to keep its climate intact, then Jupiter should not be any larger than about 5 times its current size. At the other extreme, for a planet 70 times larger than Jupiter, the Earth may not get ejected at once, but it can take a wild ride through the solar system. A simulation for a 70x Jupiter is shown in Fig. 3. In this case, the Earth is captured for a while as a “moon” of Jupiter in a very tight orbit around the super planet as it orbits the sun before it is set free again to orbit the sun in highly elliptical orbits. Because of the premise of the restricted three-body problem, the Earth has no effect on the orbit of Jupiter.

Fig. 3 Orbit of Earth for TJ = 11.86 years and ξ = 0.069. The radius of Jupiter is RJ = 5.2. Earth is “captured” for a while by Jupiter into a very tight orbit.

Resonance

If Nemesis were to swing by and scramble the solar system, then Jupiter might move closer to the Earth. More ominously, the period of Jupiter’s orbit could come into resonance with the Earth’s period. This occurs when the ratio of orbital periods is a ratio of small integers. Resonance can amplify small perturbations, so perhaps Jupiter would become a danger to Earth. However, the forces exerted by Jupiter on the Earth change the Earth’s orbit and hence its period, preventing strict resonance from occurring, and the Earth is not ejected from the solar system even for initially rational period ratios or larger planet masses. This is related to the famous KAM theory of resonances by Kolmogorov, Arnold and Moser that tends to protect the Earth from the chaos of the solar system. More often than not in these scenarios, the Earth is either captured by the super Jupiter, or it is thrown into a large orbit that is still bound to the sun. Some examples are given in the following figures.

Fig. 4 Orbit of Earth for an initial 8:1 resonance of TJ = 8 years and ξ = 0.073. The Radius of Jupiter is R = 4. Jupiter perturbs the Earth’s orbit so strongly that the 8:1 resonance is quickly removed.
Fig. 5 Earth orbit for TJ = 12 years and ξ = 0.071. The Earth is thrown into a nearly circular orbit beyond the orbit of Saturn.

Fig. 6 Earth Orbit for TJ = 4 years and ξ = 0.0615. Earth is thrown into an orbit of high ellipticity out to the orbit of Neptune.

Life on a planet in a solar system with two large bodies has been envisioned in dramatic detail in the science fiction novel “The Three-Body Problem” by Liu Cixin about the Trisolarians of the closest known exoplanet to Earth–Proxima Centauri b.

Matlab Code: body3.m

function body3

clear

chsi0 = 1/1000;     % Jupiter-Sun mass ratio (natural value ~ 1/1000)
wj0 = 2*pi/11.86;

wj = 2*pi/8;
chsi = 73*chsi0;    % (11.86,60) (11.86,67.5) (11.86,69) (11.86,70) (4,60) (4,61.5) (8,73) (12,71) 

rj = 5.203*(wj0/wj)^0.6666      % Kepler's third law: orbit radius scales as T^(2/3)

rsun = chsi*rj/(1+chsi);
rjup = (1/chsi)*rj/(1+1/chsi);

r0 = 1-rsun;
y0 = [r0 0 0 2*pi/sqrt(r0)];

tspan = [0 300];
options = odeset('RelTol',1e-5,'AbsTol',1e-6);
[t,y] = ode45(@f5,tspan,y0,options);

figure(1)
plot(t,y(:,1),t,y(:,3))

figure(2)
plot(y(:,1),y(:,3),'k')
axis equal
axis([-6 6 -6 6])

RE = sqrt(y(:,1).^2 + y(:,3).^2);
stdRE = std(RE)

%print -dtiff -r800 threebody

    function yd = f5(t,y)
        
        xj = rjup*cos(wj*t);
        yj = rjup*sin(wj*t);
        xs = -rsun*cos(wj*t);
        ys = -rsun*sin(wj*t);
        rj32 = ((y(1) - xj).^2 + (y(3) - yj).^2).^1.5;
        r32 = ((y(1) - xs).^2 + (y(3) - ys).^2).^1.5;

        yp(1) = y(2);
        yp(2) = -4*pi^2*((y(1)-xs)/r32 + chsi*(y(1)-xj)/rj32);
        yp(3) = y(4);
        yp(4) = -4*pi^2*((y(3)-ys)/r32 + chsi*(y(3)-yj)/rj32);
 
        yd = [yp(1);yp(2);yp(3);yp(4)];

    end     % end f5

end



References:

[1] D. D. Nolte, “The Fall and Rise of the Doppler Effect,” Physics Today, vol. 73, no. 3, pp. 31-35, Mar (2020)

[2] J. Barrow-Green, Poincaré and the three body problem. London Mathematical Society, 1997.

[3] M. C. Gutzwiller, “Moon-Earth-Sun: The oldest three-body problem,” Reviews of Modern Physics, vol. 70, no. 2, pp. 589-639, Apr (1998)

[4] D. D. Nolte, Introduction to Modern Dynamics : Chaos, Networks, Space and Time, 1st ed. (Oxford University Press, 2015).

The Physics of Robinson Crusoe’s Economy

“What is a coconut worth to a cast-away on a deserted island?”

In the midst of the cast-away’s misfortune and hunger and exertion and food lies an answer that looks familiar to any physicist who speaks the words

“Assume a Lagrangian …”

It is the same process that determines how a bead slides along a bent wire in gravity or a skier navigates a ski hill.  The answer: find the balance of economic forces subject to constraints. 

Here is the history and the physics behind one of the simplest economic systems that can be conceived:  Robinson Crusoe spending his time collecting coconuts!

Robinson Crusoe in Economic History

Daniel Defoe published “The Life and Strange Surprizing Adventures of Robinson Crusoe” in 1719, about a man who is shipwrecked on a deserted island and survives there for 28 years before being rescued.  It was written in the first person, as if the author had actually lived through those experiences, and it was based on a real-life adventure story.  It is one of the first examples of realistic fiction, and it helped establish the genre of the English novel.

Several writers on economic theory made mention of Robinson Crusoe as an example of a labor economy, but it was in 1871 that Robinson Crusoe became an economic archetype.  In that year both William Stanley Jevons‘s The Theory of Political Economy and Carl Menger‘s Grundsätze der Volkswirtschaftslehre (Principles of Economics) used Robinson Crusoe to illustrate key principles of the budding marginalist revolution.

Marginalism in economic theory is the demarcation between classical economics and modern economics.  The key principle of marginalism is the principle of “diminishing returns”: the value of something declines as an individual has more of it.  This principle makes the relevant functions convex, which helps to guarantee that there are equilibrium points in the economy.  Economic equilibrium is a key concept and goal because it provides stability to economic systems.

One-Product Is a Dull Diet

The Robinson Crusoe economy is one of the simplest economic models that captures the trade-off between labor and production on one side, and leisure and consumption on the other.  The model has a single laborer for whom there are 24*7 = 168 hours in the week.  Some of these hours must be spent finding food, let’s say coconuts, while the other hours are for leisure and rest.  The production of coconuts follows a production curve

$$q = q(L)$$

that is a function of labor L.  There are diminishing returns in the finding of coconuts for a given labor, making the production curve of coconuts convex.  The amount of rest is

$$R = 168 - L$$

and there is a reciprocal production curve q(R) relating more time spent resting to fewer coconuts produced. In this model it is assumed that all coconuts that are produced are consumed.  This is known as market clearing, when no surplus is built up.

The production curve presents a continuous trade-off between consumption and leisure, but at first look there is no obvious way to decide how much to work and how much to rest.  A lazy person might be willing to go a little hungry if they can have more rest, while a busy person might want to use all waking hours to find coconuts.  The production curve represents something known as a Pareto frontier.  It is a continuous trade-off between two qualities.  Another example of a Pareto frontier is car engine efficiency versus cost.  Some consumers may care more about the up-front cost of the car than the cost of gas, while other consumers may value fuel efficiency and be willing to pay higher costs to get it. 

Continuous trade-offs always present a bit of a problem for planning: it is often not clear what the best trade-off should be. This problem is solved by introducing another concept into this little economy–the concept of “Utility”.

The utility function was introduced by the physicist Daniel Bernoulli, one of the many bountiful Bernoullis of Basel, in 1738. The utility function is a measure of how much benefit or utility a person or an enterprise gains by holding varying amounts of goods or labor. The essential problem in economic exchange is to maximize one’s utility function subject to whatever constraints are active. The utility function for Robinson Crusoe is an increasing function U(q, R) of both coconuts and rest.

This function is obviously at a maximum at maximum leisure (R = 1) and lots of coconuts (q = 1), but that point is not allowed, because it lies off the production curve q(R). Therefore the question becomes: where on the production curve can he maximize the trade-off between coconuts and leisure?

Fig. 1 shows the dynamical space for Robinson Crusoe’s economy. The space is two dimensional with axes for coconuts q and rest R. Isoclines of the utility function are shown as contours known as “indifference” curves, because the utility is constant along these curves and hence Robinson Crusoe is indifferent to his position along them. The indifference curves are cut by the production curve q(R). The equilibrium problem is to maximize utility subject to the production curve.

Fig. 1 The production space of the Robinson Crusoe economy. The production curve q(R) cuts across the isoclines of the utility function U(q,R). The contours represent “indifference” curves because the utility is constant along a contour.

When looking at dynamics under constraints, Lagrange multipliers are the best tool. Furthermore, we can impart dynamics into the model with temporal adjustments in q and R that respond to economic forces.

The Lagrangian Economy

The approach to the Lagrangian economy is identical to the Lagrangian approach in classical physics. The equation of constraint is

$$q - q(R) = 0$$

All the dynamics take place on the production curve. The initial condition starts on the curve, and the state point moves along the curve until it reaches a maximum and settles into equilibrium. The dynamics are therefore one-dimensional, with the production curve providing the link between q and R.

The Lagrangian in this simple economy is given by the utility function augmented by the equation of constraint, such that

$$L = U(q,R) + \lambda\left(q - q(R)\right)$$

where λ is the Lagrangian undetermined multiplier. The Euler-Lagrange equations for dynamics are

$$\frac{d}{dt}\frac{\partial L}{\partial \dot{q}} - \frac{\partial L}{\partial q} = -\gamma\dot{q} \qquad \frac{d}{dt}\frac{\partial L}{\partial \dot{R}} - \frac{\partial L}{\partial R} = -\gamma\dot{R}$$

where the term on the right-hand-side is a drag force with the relaxation rate γ.

The first term on the left is the momentum of the system. In economic dynamics, this is usually negligible, similar to dynamics in living systems at low Reynolds number in which all objects are moving instantaneously at their terminal velocity in response to forces. The equations of motion are therefore

$$\frac{\partial U}{\partial q} + \lambda = \gamma\dot{q} \qquad \frac{\partial U}{\partial R} - \lambda\, q'(R) = \gamma\dot{R}$$

The Lagrange multiplier can be solved from the first equation as

$$\lambda = \gamma\dot{q} - \frac{\partial U}{\partial q}$$

and the last equation converts q-dot to R-dot (using $\dot{q} = q'(R)\dot{R}$ on the constraint) to yield the single equation

$$\gamma\left(1 + q'(R)^2\right)\dot{R} = \frac{\partial U}{\partial R} + q'(R)\frac{\partial U}{\partial q}$$

which is a one-dimensional flow

$$\dot{R} = \frac{1}{\gamma\left(1 + q'(R)^2\right)}\left[\frac{\partial U}{\partial R} + q'(R)\frac{\partial U}{\partial q}\right]$$

where all q’s are expressed as R’s through the equation of constraint. The speed vanishes at the fixed point—the economic equilibrium—when

$$\frac{\partial U}{\partial R} + q'(R)\frac{\partial U}{\partial q} = 0$$

This is the point of Pareto efficient allocation. Any initial condition on the production curve will relax to this point with a rate given by γ. These trajectories are shown in Fig. 2. From the point of view of Robinson Crusoe, if he is working harder than he needs to, then he will slack off. But if there aren’t enough coconuts to make him happy, he will work harder.

Fig. 2 Motion occurs on the one-dimensional manifold defined by the production curve such that the utility is maximized at a unique point called the Pareto Efficient Allocation.

The production curve is like a curved wire, and the amount of production q is like a bead sliding on the wire. The utility function plays the role of a potential function, and the gradients of the utility function play the role of forces. Then this simple economic model is just like ordinary classical physics of point masses responding to forces constrained to lie on certain lines or surfaces. From this viewpoint, physics and economics are literally the same.
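This mechanical picture can be made concrete with a toy calculation. The utility U = q·R and production curve q(R) = √(1 − R) below are hypothetical illustrative choices (not the functions of the worked example that follows), with the week normalized so that R runs from 0 to 1; for these choices the maximum of U along the constraint sits at R = 2/3:

```python
# Relaxation to the Pareto-efficient allocation along the production curve.
# U = q*R and q(R) = sqrt(1-R) are hypothetical illustrative choices;
# gamma is the relaxation rate from the drag term.
import math

def relax(R0=0.1, gamma=1.0, dt=0.01, steps=5000):
    q = lambda R: math.sqrt(1.0 - R)     # production curve (assumed)
    U = lambda R: q(R) * R               # utility evaluated on the constraint
    h = 1e-6                             # finite-difference step
    R = R0
    for _ in range(steps):
        dUdR = (U(R + h) - U(R - h)) / (2.0 * h)  # utility gradient along the curve
        R += dt * dUdR / gamma           # one-dimensional relaxational flow
    return R

Req = relax()
print(Req)   # settles near R = 2/3, where dU/dR vanishes on the constraint
```

Any starting point on the curve climbs the utility gradient and comes to rest at the same equilibrium, as in Fig. 2.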

Worked Example

To make this problem specific, consider a utility function given by

that has a maximum in the upper right corner, and a production curve given by

that has diminishing returns. Then, the condition of equilibrium can be solved using

to yield

With the (fairly obvious) answer

For More Reading

[1] D. D. Nolte, Introduction to Modern Dynamics : Chaos, Networks, Space and Time, 2nd ed. Oxford : Oxford University Press (2019).

[2] Fritz Söllner; The Use (and Abuse) of Robinson Crusoe in Neoclassical Economics. History of Political Economy; 48 (1): 35–64. (2016)

Physics of the Flipping iPhone and the Fate of the Earth

Find an iPhone, then flip it face upwards (hopefully over a soft cushion or mattress).  What do you see?

An iPhone is a rectangular parallelepiped with three unequal dimensions and hence three unequal principal moments of inertia I1 < I2 < I3.  The corresponding axes are: normal to the face, horizontal through the short dimension, and horizontal through the long dimension. Spin the iPhone around its long axis and it keeps a nice steady spin.  Spin it around the axis pointing out of the face, and again it spins steadily. But flip it face upwards, and it almost always does a half twist. Why?

The answer is variously known as the Tennis Racket Theorem or the Intermediate Axis Theorem or even the Dzhanibekov Effect. If you don’t have an iPhone or Samsung handy, then watch this NASA video of the effect.

Stability Analysis

The flipping iPhone is a rigid body experiencing force-free motion. The Euler equations are an easy way to approach the somewhat complicated physics. These equations are

$$I_1\dot{\omega}_1 - (I_2 - I_3)\,\omega_2\omega_3 = 0$$

$$I_2\dot{\omega}_2 - (I_3 - I_1)\,\omega_3\omega_1 = 0$$

$$I_3\dot{\omega}_3 - (I_1 - I_2)\,\omega_1\omega_2 = 0$$

They all equal zero because there is no torque. First let’s assume the object is rotating mainly around the x1 axis so that ω2 and ω3 are small.  Then solving for the angular accelerations yields

$$\dot{\omega}_2 = \frac{(I_3 - I_1)}{I_2}\,\omega_3\omega_1 \qquad \dot{\omega}_3 = \frac{(I_1 - I_2)}{I_3}\,\omega_1\omega_2$$

This is a two-dimensional flow equation in the variables ω2, ω3.  Hence we can apply classic stability analysis for rotation mainly about the x1 axis. The Jacobian matrix is

$$J = \begin{pmatrix} 0 & \dfrac{(I_3 - I_1)}{I_2}\,\omega_1 \\[2ex] \dfrac{(I_1 - I_2)}{I_3}\,\omega_1 & 0 \end{pmatrix}$$

This matrix has a trace τ = 0 and a determinant Δ given by

$$\Delta = -\frac{(I_3 - I_1)(I_1 - I_2)}{I_2 I_3}\,\omega_1^2$$

Because of the ordering I1 < I2 < I3, the factor (I3 − I1) is positive and (I1 − I2) is negative, so this quantity is positive.

Armed with the trace and the determinant of a two-dimensional flow, we simply need to look at the 2D “stability space” as shown in Fig. 1. The horizontal axis is the determinant of the Jacobian matrix evaluated at the fixed point of the motion, and the vertical axis is the trace. In the case of the flipping iPhone, the Jacobian matrix is independent of both ω2 and ω3 (if they remain small), so it has global stability. When the determinant is positive, the stability depends on the trace. If the trace is positive, all motions are unstable (deviations grow exponentially). If the trace is negative, all motions are stable. The sideways parabola in the figure is known as the discriminant. Solutions within the discriminant are spirals. As the trace approaches zero, the spirals get slower and slower, until they become simple harmonic motions when the trace vanishes. This kind of marginal stability is known as a center. Centers have steady-state stability without dissipation.

Fig. 1 The stability space for two-dimensional dynamics. The vertical axis is the trace of the Jacobian matrix and the horizontal axis is the determinant. If the determinant is negative, all motions are unstable saddle points. Otherwise, stability depends on the sign of the trace, unless the trace is zero, for which case the motion has steady-state stability like celestial orbits or harmonic oscillators. (Reprinted from Ref. [1])

For the flipping iPhone (or tennis racket or book), the trace is zero and the determinant is positive for rotation mainly about the x1 axis, and the stability is therefore a “center”.  This is why the iPhone spins nicely about its axis with the smallest moment.

Let’s permute the indices to get the motion about the x3 axis with the largest moment. Then

dω1/dt = ((I2 − I3)/I1) ω3 ω2
dω2/dt = ((I3 − I1)/I2) ω3 ω1

The trace and determinant are

τ = 0
Δ = ((I3 − I2)(I3 − I1)/(I1 I2)) ω3^2

where the determinant is again positive and the stability is again a center.

But now let’s permute again so that the motion is mainly about the x2 axis with the intermediate moment.  In this case

dω1/dt = ((I2 − I3)/I1) ω2 ω3
dω3/dt = ((I1 − I2)/I3) ω2 ω1

And the trace and determinant are

τ = 0
Δ = −((I3 − I2)(I2 − I1)/(I1 I3)) ω2^2

The determinant is now negative, and from Fig. 1, this means that the stability is a saddle point. 

Saddle points in 2D have one stable manifold and one unstable manifold.  If the initial condition is just a little off the stability point, then the deviation will grow as the dynamical trajectory moves away from the equilibrium point along the unstable manifold.
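As a concrete check, here is a minimal sketch (not part of the original derivation) that evaluates the determinant of the linearized Jacobian for spin about each axis, using the moments of inertia quoted in the Fig. 2 caption. The trace is zero in every case, so the sign of the determinant alone decides the stability.

```python
# Trace/determinant test for spin about each principal axis.
# Moments of inertia taken from the Fig. 2 caption.
I1, I2, I3 = 0.3, 0.5, 0.7
w = 1.0  # spin rate about the dominant axis (arbitrary units)

dets = {
    'x1': (I3 - I1)*(I2 - I1)/(I2*I3) * w**2,   # smallest moment
    'x3': (I3 - I2)*(I3 - I1)/(I1*I2) * w**2,   # largest moment
    'x2': -(I3 - I2)*(I2 - I1)/(I1*I3) * w**2,  # intermediate moment
}

for axis, det in dets.items():
    kind = 'center (stable)' if det > 0 else 'saddle (unstable)'
    print(f'spin about {axis}: det = {det:+.3f} -> {kind}')
```

Only the intermediate axis produces a negative determinant, and hence a saddle.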

The components of the angular frequencies for each of these cases are shown in Fig. 2 for rotation mainly around x1, then x2 and then x3. A small amount of rotation about the other two axes is given as an initial condition for each case. For these calculations the full Euler equations were solved with no approximations, and the motion is fully three-dimensional.

Fig. 2 Angular frequency components for motion with initial conditions of spin mainly about, respectively, the x1, x2 and x3 axes. The x2 case shows strong nonlinearity and slow unstable dynamics that periodically reverse. (I1 = 0.3, I2 = 0.5, I3 = 0.7)

Fate of the Spinning Earth

When two of the axes have very similar moments of inertia, that is, when the object becomes more symmetric, then the unstable dynamics can get very slow. An example is shown in Fig. 3 for I2 just a bit smaller than I3. The high frequency spin remains the same for long times and then quickly reverses. During the time when the spin is nearly stable, the other angular frequencies are close to zero, and the object would have only a slight wobble to it. Yet, in time, the wobble goes from bad to worse, until the whole thing flips over. It’s inevitable for almost any real-world solid…like maybe the Earth.

Fig. 3 Angular frequencies for a slightly asymmetric rigid body. The spin remains the same for long times and then flips suddenly.

The Earth is an oblate spheroid, wider at the equator because of the centrifugal force of the rotation. If it were a perfect spheroid, then the two moments orthogonal to the spin axis would be identically equal. However, the Earth has landmasses, continents, that make the moments of inertia slightly unequal. This would have catastrophic consequences: if the Earth were perfectly rigid, then every few million years it should flip over, scrambling the seasons!

But that doesn’t happen. The reason is that the Earth has a liquid mantle and outer core that very slowly dissipate any wobble. The Earth, and virtually every celestial object that has any type of internal friction, always spins about its axis with the highest moment of inertia, which also means the system relaxes to its lowest kinetic energy for conserved L through the simple equation

T_min = L^2 / (2 I3)

So we are safe!
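That minimization is easy to verify numerically. The sketch below (my own, using the illustrative moments from Fig. 2) computes T = L^2/(2I) for each principal moment at fixed angular momentum:

```python
# For fixed angular momentum L, the rotational kinetic energy about a
# principal axis is T = L**2/(2*I), so the largest moment of inertia
# gives the lowest energy.  Moments of inertia from Fig. 2.
L = 1.0
moments = {'I1': 0.3, 'I2': 0.5, 'I3': 0.7}
energies = {name: L**2/(2*I) for name, I in moments.items()}
lowest = min(energies, key=energies.get)
print(energies, lowest)  # 'I3' (the largest moment) has the lowest energy
```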

Python Code

Here is a simple Python code to explore the intermediate axis theorem. Change the moments of inertia and change the initial conditions. Note that this program does not solve for the actual motions, that is, the configuration-space trajectories. The solution of the Euler equations gives the time evolution of the three components of the angular velocity. Incremental rotations could be applied through rotation matrices operating on the configuration space to yield the configuration-space trajectory of the flipping iPhone (link to the technical details here).

#!/usr/bin/env python3
# -*- coding: utf-8 -*-
"""
Created on Thurs Oct 7 19:38:57 2021

@author: David Nolte
Introduction to Modern Dynamics, 2nd edition (Oxford University Press, 2019)

FlipPhone Example
"""
import numpy as np
from mpl_toolkits.mplot3d import Axes3D  # registers the 3D projection
from scipy import integrate
from matplotlib import pyplot as plt

plt.close('all')
fig = plt.figure()
ax = fig.add_axes([0, 0, 1, 1], projection='3d')
ax.axis('on')

I1 = 0.45   # Moments of inertia can be changed here
I2 = 0.5
I3 = 0.55

def solve_lorenz(max_time=300.0):
    """Integrate the Euler equations and plot the angular-velocity trajectory."""

    # Flip Phone
    def flow_deriv(x_y_z, t0):
        x, y, z = x_y_z  # the three angular-velocity components
        yp1 = ((I2 - I3)/I1)*y*z
        yp2 = ((I3 - I1)/I2)*z*x
        yp3 = ((I1 - I2)/I3)*x*y
        return [yp1, yp2, yp3]

    model_title = 'Flip Phone'

    # Initial Conditions:  Change the major rotation axis here ....
    x0 = [[0.01, 1, 0.01]]

    # Solve for trajectories
    t = np.linspace(0, max_time, int(250*max_time))
    x_t = np.asarray([integrate.odeint(flow_deriv, x0i, t)
                      for x0i in x0])

    x, y, z = x_t[0, :, :].T
    lines = ax.plot(x, y, z, '-')
    plt.setp(lines, linewidth=0.5)

    ax.view_init(30, 30)
    plt.title(model_title)
    plt.savefig('Flow3D')  # save before show() so the figure isn't blank
    plt.show()

    return t, x_t

ax.set_xlim((-1.1, 1.1))
ax.set_ylim((-1.1, 1.1))
ax.set_zlim((-1.1, 1.1))

t, x_t = solve_lorenz()

plt.figure(2)
lines = plt.plot(t,x_t[0,:,0],t,x_t[0,:,1],t,x_t[0,:,2])
plt.setp(lines, linewidth=1)


[1] D. D. Nolte, Introduction to Modern Dynamics, 2nd Edition (Oxford, 2019)

To see more on the Intermediate Axis Theorem, watch this amazing YouTube video.

And here is another description of the Intermediate Axis Theorem.

Spontaneous Symmetry Breaking: A Mechanical Model

Symmetry is the canvas upon which the laws of physics are written. Symmetry defines the invariants of dynamical systems. But when symmetry breaks, the laws of physics break with it, sometimes in dramatic fashion. Take the Big Bang, for example, when a highly-symmetric form of the vacuum, known as the “false vacuum”, suddenly relaxed to a lower symmetry, creating an inflationary cascade of energy that burst forth as our Universe.

The early universe was extremely hot and energetic, so much so that all the forces of nature acted as one, described by a unified Lagrangian (as yet resisting discovery by theoretical physicists) of the highest symmetry. Yet as the universe expanded and cooled, the symmetry of the Lagrangian broke, and the unified forces split into two (gravity and electro-nuclear). As the universe cooled further, the Lagrangian (of the Standard Model) lost more symmetry as the electro-nuclear force split into the strong nuclear force and the electro-weak force. Finally, at a tiny fraction of a second after the Big Bang, the universe cooled enough that the unified electro-weak force broke into the electromagnetic force and the weak nuclear force. At each stage, spontaneous symmetry breaking occurred, and invariants of physics were broken, splitting into new behavior. In 2008, Yoichiro Nambu received the Nobel Prize in Physics for his model of spontaneous symmetry breaking in subatomic physics.

Fig. 1 The spontaneous symmetry breaking cascade after the Big Bang. From Ref.

Bifurcation Physics

Physics is filled with examples of spontaneous symmetry breaking. Crystallization and phase transitions are common examples. When the temperature is lowered on a fluid of molecules with high average local symmetry, the molecular interactions can suddenly impose lower-symmetry constraints on relative positions, and the liquid crystallizes into an ordered crystal. Even solid crystals can undergo a phase transition as one symmetry becomes energetically advantageous over another, and the crystal can change to a new symmetry.

In mechanics, any time a potential function evolves slowly with some parameter, it can start with one symmetry and evolve to another lower symmetry. The mechanical system governed by such a potential may undergo a discontinuous change in behavior.

In complex systems and chaos theory, sudden changes in behavior can be quite common as some parameter is changed continuously. A discontinuous change in behavior in response to a continuous change in a control parameter is known as a bifurcation. There are many types of bifurcation, carrying descriptive names like the pitchfork bifurcation, period-doubling bifurcation, Hopf bifurcation, and fold bifurcation, among others. The pitchfork bifurcation is a typical example, shown in Fig. 2. As a parameter is changed continuously (horizontal axis), a stable fixed point suddenly becomes unstable and two new stable fixed points emerge at the same time. This type of bifurcation is called pitchfork because the diagram looks like a three-tined pitchfork. (This is technically called a supercritical pitchfork bifurcation. In a subcritical pitchfork bifurcation the solid and dashed lines are swapped.) This is exactly the bifurcation displayed by a simple mechanical model that illustrates spontaneous symmetry breaking.

Fig. 2 Bifurcation plot of a pitchfork bifurcation. As a parameter is changed smoothly and continuously (horizontal axis), a stable fixed point suddenly splits into three fixed points: one unstable and the other two stable.
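The supercritical pitchfork has a standard textbook normal form, dx/dt = r·x − x^3 (the normal form is a generic example, not specific to this post). A minimal sketch of its fixed points:

```python
import math

def fixed_points(r):
    """Fixed points of the pitchfork normal form dx/dt = r*x - x**3."""
    if r <= 0:
        return [0.0]                         # single stable fixed point
    return [-math.sqrt(r), 0.0, math.sqrt(r)]  # 0 is now unstable

print(fixed_points(-1.0))  # [0.0]
print(fixed_points(1.0))   # [-1.0, 0.0, 1.0]
```

As r crosses zero from below, the single stable point at the origin splits into the three-tined pitchfork of Fig. 2.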

Sliding Mass on a Rotating Hoop

One of the simplest mechanical models that displays spontaneous symmetry breaking and the pitchfork bifurcation is a bead sliding without friction on a circular hoop that is spinning on the vertical axis, as in Fig. 3. When it spins very slowly, this is just a simple pendulum with a stable equilibrium at the bottom, and it oscillates with a natural oscillation frequency ω0 = sqrt(g/b), where b is the radius of the hoop and g is the acceleration due to gravity. On the other hand, when it spins very fast, the bead is flung to one side or the other by centrifugal force. The bead then oscillates around one of the two new stable fixed points, while the fixed point at the bottom of the hoop becomes unstable, because any deviation to one side or the other lets the centrifugal force kick in. (Note that the centrifugal force is a fictitious force that arises in the rotating, non-inertial frame.)

Fig. 3 A bead sliding without friction on a circular hoop rotating about a vertical axis. At high speed, the bead has a stable equilibrium to either side of the vertical.

The solution uses the Euler equations for the body frame along principal axes. In order to use the standard definitions of ω1, ω2, and ω3, the angle θ MUST be rotated around the x-axis.  This means the x-axis points out of the page in the diagram.  The y-axis is tilted up from horizontal by θ, and the z-axis is tilted from vertical by θ.  This establishes the body frame.

The components of the angular velocity are

ω1 = dθ/dt
ω2 = ω sin θ
ω3 = ω cos θ

And the moments of inertia are (assuming the bead is small)

I1 = m b^2
I2 = m b^2
I3 = 0

There is only one Euler equation that is non-trivial. This is for the x-axis and the angle θ. The x-axis Euler equation is

m b^2 (d²θ/dt²) − m b^2 ω^2 sin θ cos θ = −m g b sin θ

and solving for the angular acceleration gives

d²θ/dt² = ω^2 sin θ cos θ − (g/b) sin θ = (ω^2 cos θ − ω0^2) sin θ

This is a harmonic oscillator with a “phase transition” that occurs as ω increases from zero.  At first the stable equilibrium is at the bottom.  But when ω passes a critical threshold, the equilibrium angle begins to increase to a finite angle set by the rotation speed.

θ0 = cos⁻¹(g/(b ω^2))

The equilibrium angle can only be real if the magnitude of the argument is less than or equal to unity, which sets the critical spin rate above which the system moves to a new stable point on one side or the other:

ω_c = sqrt(g/b) = ω0

which interestingly is the natural frequency of the non-rotating pendulum. Note that there are two equivalent angles (positive and negative), so this problem has a degeneracy. 

This is an example of a dynamical phase transition that leads to spontaneous symmetry breaking and a pitchfork bifurcation. By integrating the angular acceleration we can get the effective potential for the problem. One contribution to the potential is due to gravity. The other is centrifugal force. When combined and plotted in Fig. 4 for a family of values of the spin rate ω, a pitchfork emerges naturally by tracing the minima in the effective potential. The values of the new equilibrium angles are given in Fig. 2.

Fig. 4 Effective potential as a function of angle for a family of spin rates. At the transition spin rate, the effective potential is essentially flat with zero natural frequency. The pitchfork is the dashed green line.
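The pitchfork in Fig. 4 can be traced numerically. The sketch below (my own, with illustrative values of g and b) writes the effective potential per unit m·b^2 as V(θ) = −½ ω^2 sin²θ − ω0² cos θ and locates its minimum for spin rates below and above threshold:

```python
import numpy as np

g, b = 9.8, 1.0
w0 = np.sqrt(g/b)  # natural pendulum frequency = critical spin rate

def equilibrium_angle(w):
    """Locate the minimum of the effective potential per unit m*b**2:
    V(theta) = -0.5*w**2*sin(theta)**2 - w0**2*cos(theta)."""
    theta = np.linspace(0.0, np.pi, 200001)
    V = -0.5*w**2*np.sin(theta)**2 - w0**2*np.cos(theta)
    return theta[np.argmin(V)]

print(equilibrium_angle(0.5*w0))  # below threshold: 0 (bottom of hoop)
w = 2.0*w0
print(equilibrium_angle(w), np.arccos((w0/w)**2))  # both ~1.318 rad
```

Above threshold the numerical minimum agrees with the analytic equilibrium angle cos θ0 = (ω0/ω)^2 to within the grid spacing.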

Below the transition threshold for ω, the bottom of the hoop is the equilibrium position. To find the natural frequency of oscillation, expand the acceleration expression about θ = 0

d²θ/dt² ≈ (ω^2 − ω0^2) θ

For small oscillations the natural frequency is given by

Ω = sqrt(ω0^2 − ω^2)

As the effective potential gets flatter, the natural oscillation frequency decreases until it vanishes at the transition spin frequency. As the hoop spins even faster, the new equilibrium positions emerge. To find the natural frequency of the new equilibria, expand θ around the new equilibrium θ’ = θ – θ0

d²θ′/dt² ≈ −(ω^2 − ω0^4/ω^2) θ′

which is a harmonic oscillator with oscillation angular frequency

Ω = sqrt(ω^2 − ω0^4/ω^2)

Note that this is zero frequency at the transition threshold, then rises to match the spin rate of the hoop at high frequency. The natural oscillation frequency as a function of the spin looks like Fig. 5.

Fig. 5 Angular oscillation frequency for the bead. The bifurcation occurs at the critical spin rate ω = sqrt(g/b).
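Fig. 5 is easy to reproduce from the two small-oscillation frequencies, sqrt(ω0² − ω²) about the bottom below threshold and sqrt(ω² − ω0⁴/ω²) about θ0 above it. A sketch in scaled units with ω0 = 1:

```python
import numpy as np

w0 = 1.0  # critical spin rate sqrt(g/b), set to 1 in scaled units

def osc_freq(w):
    """Natural oscillation frequency of the bead for hoop spin rate w."""
    if w < w0:
        return np.sqrt(w0**2 - w**2)    # oscillation about the bottom
    return np.sqrt(w**2 - w0**4/w**2)   # oscillation about theta_0

print(osc_freq(0.0))   # 1.0: plain pendulum frequency w0
print(osc_freq(1.0))   # 0.0: mode softening at the threshold
print(osc_freq(10.0))  # ~9.9995: approaches the spin rate itself
```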

This mechanical analog is highly relevant for the spontaneous symmetry breaking that occurs in ferroelectric crystals when they go through a ferroelectric transition. At high temperature, these crystals have no internal polarization. But as the crystal cools towards the ferroelectric transition temperature, the optical phonon modes “soften” as the phonon frequency decreases and vanishes at the transition temperature, when the crystal spontaneously polarizes in one of several equivalent directions. The observation of mode softening in a polar crystal is one signature of an impending ferroelectric phase transition. Our mass on the hoop captures this qualitative physics nicely.

Golden Behavior

For fun, let’s find the spin frequency at which the harmonic oscillation frequency about the dynamic equilibria equals the original natural frequency of the pendulum. Setting Ω = ω0 gives

ω^2 − ω0^4/ω^2 = ω0^2   ⟹   (ω/ω0)^2 − (ω0/ω)^2 = 1   ⟹   (ω/ω0)^2 = (1 + sqrt(5))/2

which is the golden ratio.  It’s spooky how often the golden ratio appears in random physics problems!
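A quick numerical sanity check (a sketch of the algebra above): with x = (ω/ω0)², setting the oscillation frequency about the dynamic equilibria equal to ω0 reduces to x − 1/x = 1, whose positive root is the golden ratio.

```python
import math

# Solve x - 1/x = 1, i.e. x**2 - x - 1 = 0, for the positive root.
x = (1 + math.sqrt(5))/2
print(x)        # 1.618..., the golden ratio
print(x - 1/x)  # ~1.0, confirming the condition
```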