Heisenberg’s uncertainty principle is a law of physics: it cannot be violated under any circumstances, no matter how much we may want it to yield or how hard we try to bend it. Heisenberg, as he developed his ideas after his lone epiphany like a monk on the isolated island of Helgoland off the north coast of Germany in 1925, became a bit of a zealot, like a religious convert, convinced that all we can say about reality is a measurement outcome. In his view, there was no independent existence of an electron other than what emerged from a measuring apparatus. Reality, to Heisenberg, was just a list of numbers in a spreadsheet: matrix elements. He took this line of reasoning so far that he stated without exception that there could be no such thing as a trajectory in a quantum system. When the great battle commenced between Heisenberg’s matrix mechanics and Schrödinger’s wave mechanics, Heisenberg was relentless, denying any reality to Schrödinger’s wavefunction other than as a calculation tool. He was so strident that even Bohr, who was on Heisenberg’s side in the argument, advised Heisenberg to relent. Eventually a compromise was struck, as Heisenberg’s uncertainty principle allowed Schrödinger’s wavefunctions to exist within limits, namely his uncertainty limits.
Disaster in the Poconos
Yet the idea of an actual trajectory of a quantum particle remained a type of heresy within the close quantum circles. Years later in 1948, when a young Richard Feynman took the stage at a conference in the Poconos, he almost sabotaged his career in front of Bohr and Dirac—two of the giants who had invented quantum mechanics—by having the audacity to talk about particle trajectories in spacetime diagrams.
Feynman was making his first presentation of a new approach to quantum mechanics that he had developed based on path integrals. The challenge was that his method relied on space-time graphs in which “unphysical” things were allowed to occur. In fact, unphysical things were required to occur, as part of the sum over many histories of his path integrals. For instance, a key element in the approach was allowing electrons to travel backwards in time as positrons, or a process in which the electron and positron annihilate into a single photon, and then the photon decays back into an electron-positron pair—a process that is not allowed by mass and energy conservation. But this is a possible history that must be added to Feynman’s sum.
It all looked like nonsense to the audience, and the talk quickly derailed. Dirac pestered him with questions that he tried to deflect, but Dirac persisted like a raven. A question was raised about the Pauli exclusion principle, about whether an orbital could have three electrons instead of the required two, and Feynman said that it could—all histories were possible and had to be summed over—an answer that dismayed the audience. Finally, as Feynman was drawing another of his space-time graphs showing electrons as lines, Bohr rose to his feet and asked derisively whether Feynman had forgotten Heisenberg’s uncertainty principle that made it impossible to even talk about an electron trajectory.
It was hopeless. The audience gave up and so did Feynman as the talk just fizzled out. It was a disaster. What had been meant to be Feynman’s crowning achievement and his entry to the highest levels of theoretical physics had been a terrible embarrassment. He slunk home to Cornell, where he sank into one of his depressions. At the close of the Pocono conference, Oppenheimer, the reigning king of physics, former head of the successful Manhattan Project and newly selected to head the prestigious Institute for Advanced Study at Princeton, had been thoroughly disappointed by Feynman.
But what Bohr and Dirac and Oppenheimer had failed to understand was that as long as the duration Δt of an unphysical process was shorter than roughly ħ/ΔE, where ΔE is the energy mismatch involved, the process was literally obeying Heisenberg’s uncertainty principle. Furthermore, Feynman’s trajectories, which became his famous “Feynman diagrams”, were meant to be merely cartoons: a shorthand way to keep track of lots of different contributions to a scattering process. The quantum processes certainly took place in space and time, conceptually like a trajectory, but only so far as time durations and energy differences, and locations and momentum changes, were all within the bounds of the uncertainty principle. Feynman had invented a bold new tool for quantum field theory, able to supply deep results quickly. But no one at the Poconos could see it.
When Feynman had failed so miserably at the Pocono conference, he had taken the stage after Julian Schwinger, who had dazzled everyone with his perfectly scripted presentation of quantum field theory, the competing theory to Feynman’s. Schwinger emerged the clear winner of the contest. At that time, Roy Glauber (1925 – 2018) was a young physicist just taking his PhD from Schwinger at Harvard, and he later received a post-doc position at Princeton’s Institute for Advanced Study, where he became part of a miniature revolution in quantum field theory that revolved around not Schwinger’s difficult mathematics but Feynman’s diagrammatic method. So Feynman won in the end. Glauber then went on to Caltech, where he filled in for Feynman’s lectures when Feynman was off in Brazil playing the bongos. Glauber eventually returned to Harvard, where he was already thinking about the quantum aspects of photons in 1956 when news of the photon correlations in the Hanbury Brown-Twiss (HBT) experiment was published. Three years later, when the laser was invented, he began developing a theory of photon correlations in laser light that he suspected would be fundamentally different from those in natural chaotic light.
Because of his background in quantum field theory, and especially quantum electrodynamics, it was fairly easy for him to couch the quantum optical properties of coherent light in terms of Dirac’s creation and annihilation operators of the electromagnetic field. Glauber developed a “coherent state” operator that was a minimum-uncertainty state of the quantized electromagnetic field, related to the minimum-uncertainty wavefunctions derived initially by Schrödinger in the late 1920s. The coherent state represents a laser operating well above the lasing threshold and behaves as “the most classical” wavepacket that can be constructed. Glauber was awarded the Nobel Prize in Physics in 2005 for his work on such “Glauber states” in quantum optics.
Glauber’s coherent states are built up from the natural modes of a harmonic oscillator. Therefore, it should come as no surprise that these coherent-state wavefunctions in a harmonic potential behave just like classical particles with well-defined trajectories. The quadratic potential matches the quadratic argument of the Gaussian wavepacket, and the pulses propagate within the potential without broadening, as in Fig. 3, which shows a snapshot of two wavepackets propagating in a two-dimensional harmonic potential. This is a somewhat radical situation, because most wavepackets in most potentials (or even in free space) broaden as they propagate. The quadratic potential is a special case that is generally not representative of how quantum systems behave.
To illustrate this special status for the quadratic potential, the wavepackets can be launched in a potential with a quartic perturbation. The quartic potential is anharmonic: the frequency of oscillation depends on the amplitude of oscillation, unlike for the harmonic oscillator, where amplitude and frequency are independent. The quartic potential is integrable, like the harmonic oscillator, and there is no avenue for chaos in the classical analog. Nonetheless, wavepackets broaden as they propagate in the quartic potential, eventually spreading out into a ring in the configuration space, as in Fig. 4.
A potential with integrability has as many conserved quantities to the motion as there are degrees of freedom. Because the quartic potential is integrable, the quantum wavefunction may spread, but it remains highly regular, as in the “ring” that eventually forms over time. However, integrable potentials are the exception rather than the rule. Most potentials lead to nonintegrable motion that opens the door to chaos.
A classic (and classical) potential that exhibits chaos in a two-dimensional configuration space is the famous Hénon-Heiles potential. This has a four-dimensional phase space which admits classical chaos. The potential has a three-fold symmetry, which is one reason it is non-integrable, since a particle must “decide” which way to go when it approaches a saddle point. In the quantum regime, wavepackets face the same decision, leading to a breakup of the wavepacket on top of a general broadening. This allows the wavefunction eventually to distribute across the entire configuration space, as in Fig. 5.
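The classical side of this picture can be sketched in a few lines of Python. The potential below is the usual dimensionless Hénon-Heiles form, V = (x² + y²)/2 + x²y − y³/3, which is an assumption here since the text does not write it out; the function names, the Runge-Kutta integrator and the initial condition are illustrative choices. The sketch locates the three symmetry-related saddle points, which all sit at energy 1/6, and follows one bound trajectory while monitoring energy conservation.

```python
import math

def potential(x, y):
    """Henon-Heiles potential in its usual dimensionless form (assumed here)."""
    return 0.5 * (x * x + y * y) + x * x * y - y ** 3 / 3.0

def derivative(state):
    """Hamilton's equations for unit mass; state = (x, y, px, py)."""
    x, y, px, py = state
    return (px, py, -(x + 2.0 * x * y), -(y + x * x - y * y))

def energy(state):
    x, y, px, py = state
    return 0.5 * (px * px + py * py) + potential(x, y)

def rk4_step(state, dt):
    """One fourth-order Runge-Kutta step."""
    def nudge(k, h):
        return tuple(s + h * ki for s, ki in zip(state, k))
    k1 = derivative(state)
    k2 = derivative(nudge(k1, 0.5 * dt))
    k3 = derivative(nudge(k2, 0.5 * dt))
    k4 = derivative(nudge(k3, dt))
    return tuple(s + dt / 6.0 * (a + 2.0 * b + 2.0 * c + d)
                 for s, a, b, c, d in zip(state, k1, k2, k3, k4))

def integrate(state, dt=0.01, steps=5000):
    for _ in range(steps):
        state = rk4_step(state, dt)
    return state

# The three saddle points related by the three-fold symmetry
SADDLES = [(0.0, 1.0),
           (math.sqrt(3.0) / 2.0, -0.5),
           (-math.sqrt(3.0) / 2.0, -0.5)]

if __name__ == "__main__":
    for x, y in SADDLES:
        print(f"saddle ({x:+.3f}, {y:+.3f}): V = {potential(x, y):.6f}")
    start = (0.0, 0.1, 0.4, 0.0)  # illustrative bound initial condition
    end = integrate(start)
    print("energy drift:", abs(energy(end) - energy(start)))
```

Trajectories with energy below the saddle energy stay trapped inside the triangular well; raising the energy toward 1/6 is where the classical motion becomes increasingly chaotic.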
Movies of quantum trajectories can be viewed at my YouTube channel, Physics Unbound. The answer to the question “Is there a quantum trajectory?” can be seen visually as the movies run: they do exist in a very clear sense under special conditions, especially for coherent states in a harmonic oscillator. The concept of a quantum trajectory also carries over from a classical trajectory in cases when the classical motion is integrable, even when the wavefunction spreads over time. However, for classical systems that display chaotic motion, wavefunctions that begin as coherent states break up into chaotic wavefunctions that fill the accessible configuration space for a given energy. The character of the quantum evolution of coherent states (the most classical of quantum wavefunctions) in these cases reflects the underlying character of chaotic motion in the classical analogs. This process can be seen directly in the movies as a wavepacket approaches a saddle point in the potential and is split. Successive splits of the multiple wavepackets as they interact with the saddle points are what eventually distribute the full wavefunction into its chaotic form.
Therefore, the idea of a “quantum trajectory”, so thoroughly dismissed by Heisenberg, remains a phenomenological guide that can help give insight into the behavior of quantum systems—both integrable and chaotic.
As a side note, the laws of quantum physics obey time-reversal symmetry just as the classical equations do. In the third movie of “A Quantum Ballet”, wavefunctions in a double-well potential are tracked in time as they start from coherent states that break up into chaotic wavefunctions. It is like watching entropy in action as an ordered state devolves into a disordered state. But at the half-way point of the movie, the imaginary part of the wavefunction has its sign flipped, and the dynamics continue. But now the wavefunctions move from disorder into an ordered state, seemingly going against the second law of thermodynamics. Flipping the sign of the imaginary part of the wavefunction at just one instant in time plays the role of a time-reversal operation, and there is no violation of the second law.
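The sign-flip trick can be checked numerically. The following is a minimal sketch, not the code behind the movies: it assumes a one-dimensional double-well potential on a small grid, units with ħ = m = 1, a three-point finite-difference Laplacian, and a truncated Taylor series for the propagator exp(−iH dt). All names and parameter values are illustrative. Because the discretized Hamiltonian is real, conjugating the wavefunction at the half-way point and continuing the same forward evolution brings the state back to (the conjugate of) where it started.

```python
import cmath
import math

def apply_hamiltonian(psi, potential, dx):
    """Apply H = -1/2 d^2/dx^2 + V(x) with a three-point finite difference."""
    n = len(psi)
    out = []
    for j in range(n):
        left = psi[j - 1] if j > 0 else 0.0
        right = psi[j + 1] if j < n - 1 else 0.0
        kinetic = -0.5 * (left - 2.0 * psi[j] + right) / (dx * dx)
        out.append(kinetic + potential[j] * psi[j])
    return out

def step(psi, potential, dx, dt, order=24):
    """Advance psi by dt using a truncated Taylor series of exp(-i H dt)."""
    result = list(psi)
    term = list(psi)
    for k in range(1, order + 1):
        h_term = apply_hamiltonian(term, potential, dx)
        term = [(-1j * dt / k) * v for v in h_term]
        result = [r + t for r, t in zip(result, term)]
    return result

def evolve(psi, potential, dx, dt, steps):
    for _ in range(steps):
        psi = step(psi, potential, dx, dt)
    return psi

def time_reversal_echo(n=64, dx=0.2, dt=0.01, steps=200):
    """Evolve forward, flip the sign of Im(psi), evolve forward again."""
    xs = [(j - n // 2) * dx for j in range(n)]
    potential = [0.02 * (x * x - 4.0) ** 2 for x in xs]  # illustrative double well
    psi0 = [cmath.exp(-0.5 * (x + 2.0) ** 2 + 1j * x) for x in xs]
    norm = math.sqrt(sum(abs(a) ** 2 for a in psi0) * dx)
    psi0 = [a / norm for a in psi0]

    psi_mid = evolve(psi0, potential, dx, dt, steps)
    flipped = [a.conjugate() for a in psi_mid]           # the time-reversal step
    psi_end = [a.conjugate()
               for a in evolve(flipped, potential, dx, dt, steps)]

    spread = max(abs(a - b) for a, b in zip(psi_mid, psi0))
    error = max(abs(a - b) for a, b in zip(psi_end, psi0))
    return spread, error

if __name__ == "__main__":
    spread, error = time_reversal_echo()
    print("change at half-way point:", spread)
    print("mismatch after the echo :", error)
```

The first number shows that the state really has evolved away from its starting point, while the second shows that the echo restores it to within numerical precision.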
See Chapter 8, On the Quantum Footpath, in Galileo Unbound, D. D. Nolte (Oxford University Press, 2018)
Harmonic oscillators are one of the fundamental elements of physical theory. They arise so often in so many different contexts that they can be viewed as a central paradigm that spans all aspects of physics. Some famous physicists have been quoted to say that the entire universe is composed of simple harmonic oscillators (SHO).
Despite the physicist’s love affair with it, the SHO is pathological! First, it has infinite frequency degeneracy which makes it prone to the slightest perturbation that can tip it into chaos, in contrast to non-harmonic cyclic dynamics that actually protects us from the chaos of the cosmos (see my Blog on Chaos in the Solar System). Second, the SHO is nowhere to be found in the classical world. Linear oscillators are purely harmonic, with a frequency that is independent of amplitude—but no such thing exists! All oscillators must be limited, or they could take on infinite amplitude and infinite speed, which is nonsense. Even the simplest of simple harmonic oscillators would be limited by nothing other than the speed of light. Relativistic effects would modify the linearity, especially through time dilation effects, rendering the harmonic oscillator anharmonic.
Therefore, for students of physics as well as practitioners, it is important to break the shackles imposed by the SHO and embrace the anharmonic oscillator as the foundation of physics. Here is a brief survey of several famous anharmonic oscillators in the history of physics, followed by the mathematical analysis of the relativistic anharmonic linear-spring oscillator.
Anharmonic oscillators have a long and venerable history with many varieties. Many of these have become central models in systems as varied as neural networks, synchronization, grandfather clocks, mechanical vibrations, business cycles, ecosystem populations and more.
Already by the mid-1600s Christiaan Huygens (1629 – 1695) knew that the pendulum becomes slower when it has larger amplitudes. The pendulum was one of the best candidates for constructing an accurate clock needed for astronomical observations and for the determination of longitude at sea. Galileo (1564 – 1642) had devised the plans for a rudimentary pendulum clock that his son attempted to construct, but the first practical pendulum clock was invented and patented by Huygens in 1657. However, Huygens’ modified verge escapement required his pendulum to swing with large amplitudes, which brought it into the regime of anharmonicity. The equations of the simple pendulum are truly simple, but the presence of the sin θ term makes it the simplest anharmonic oscillator.
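How much slower the pendulum becomes can be estimated with a short calculation. This sketch relies on the standard closed-form result for the ideal pendulum, T = 4 √(L/g) K(sin(θ₀/2)), where K is the complete elliptic integral of the first kind, and evaluates K through the arithmetic-geometric mean; the function names are illustrative.

```python
import math

def agm(a, b, iterations=12):
    """Arithmetic-geometric mean; converges quadratically."""
    for _ in range(iterations):
        a, b = 0.5 * (a + b), math.sqrt(a * b)
    return a

def period_ratio(theta0):
    """Ratio of the exact pendulum period to the small-angle period.

    Uses T / T_small = (2 / pi) K(k) with k = sin(theta0 / 2) and
    K(k) = pi / (2 agm(1, sqrt(1 - k^2))), so the ratio is 1 / agm(1, cos(theta0 / 2)).
    """
    return 1.0 / agm(1.0, math.cos(0.5 * theta0))

if __name__ == "__main__":
    for degrees in (5, 20, 45, 90, 150):
        ratio = period_ratio(math.radians(degrees))
        print(f"amplitude {degrees:3d} deg: period is {ratio:.5f} x small-angle value")
```

For a swing of a few degrees the correction is a small fraction of a percent, but at 90 degrees the period is about 18 percent longer, which is why large-amplitude escapements were a problem for timekeeping.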
Therefore, Huygens searched for the mathematical form of a tautochrone curve for the pendulum (a curve that is traversed in equal times independently of amplitude), and in the process he invented the involutes and evolutes of a curve, precursors of the calculus. The answer to the tautochrone question is a cycloid (see my Blog on Huygens’ Tautochrone Curve).
Hermann von Helmholtz
Hermann von Helmholtz (1821 – 1894) was possibly the greatest German physicist of his generation, an Einstein before Einstein, although he began as a medical doctor. His study of muscle metabolism, drawing on the early thermodynamic work of Carnot, Clapeyron and Joule, led him to explore and to express the conservation of energy in its clearest form. Because he postulated that all forms of physical processes (electricity, magnetism, heat, light and mechanics) contributed to the interconversion of energy, he sought to explore them all, bringing his research into the mainstream of physics. His laboratory in Berlin became world famous, attracting the early American physicists Henry Rowland (founder and first president of the American Physical Society) and Albert Michelson (first American Nobel prize winner).
Helmholtz also pursued a deep interest in the physics of sensory perception such as sound. This research led to his invention of the Helmholtz oscillator, a highly anharmonic relaxation oscillator in which a tuning fork was placed near an electromagnet that was powered through a mercury switch attached to the fork. As the tuning fork vibrated, the mercury came in and out of contact with it, turning the magnet on and off, which fed back on the tuning fork, and so on, enabling the device, once started, to continue oscillating without interruption. This device is called a tuning-fork resonator, and it became the basis of the first doorbell buzzers. (These are not to be confused with the Helmholtz resonances that are formed when blowing across the open neck of a beer bottle.)
Baron John Strutt, the Lord Rayleigh (1842 – 1919), like Helmholtz, was a generalist with a strong interest in the physics of sound. He was inspired by Helmholtz’ oscillator to consider general nonlinear anharmonic oscillators mathematically, which led him to consider the effects of anharmonic terms added to the harmonic oscillator equation. In a paper published in the Philosophical Magazine in 1883 with the title On Maintained Vibrations, he introduced an equation to describe self-oscillation by adding an extra term to a simple harmonic oscillator. The extra term depended on the cube of the velocity, representing a balance between the gain of energy from a steady force and natural dissipation by friction. Rayleigh suggested that this equation applied to a wide range of self-oscillating systems, such as violin strings, clarinet reeds, finger glasses, flutes and organ pipes, among others (see my Blog on Rayleigh’s Harp).
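The first systematic study of quadratic and cubic deviations from the harmonic potential was performed by the German engineer Georg Duffing (1861 – 1944) under the conditions of a harmonic drive. The Duffing equation incorporates inertia, damping, the linear spring and nonlinear deviations. In its most commonly quoted form, with damping coefficient δ, linear stiffness α, cubic stiffness β, and a drive of amplitude γ and frequency ω, it reads
$$\ddot{x} + \delta \dot{x} + \alpha x + \beta x^3 = \gamma \cos(\omega t)$$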
Duffing confirmed his theoretical predictions with careful experiments and established the lowest-order corrections to ideal masses on springs. His work was rediscovered in the 1960s after Lorenz helped launch numerical chaos studies. Duffing’s driven potential becomes especially interesting when α is negative and β is positive, creating a double-well potential. The driven double-well is a classic chaotic system (see my blog on Duffing’s Oscillator).
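A minimal numerical sketch of the double-well case follows. It assumes the commonly quoted form of the Duffing equation, x'' + δx' + αx + βx³ = γ cos(ωt), with α the linear and β the cubic stiffness; the parameter values and function names are illustrative and are not taken from Duffing's work. With α < 0 and β > 0 the potential αx²/2 + βx⁴/4 has minima at x = ±√(−α/β), and an undriven, damped mass settles into one of them.

```python
import math

def well_minima(alpha, beta):
    """Locations of the two minima of V = alpha x^2 / 2 + beta x^4 / 4 (alpha < 0 < beta)."""
    root = math.sqrt(-alpha / beta)
    return -root, root

def duffing_rhs(t, x, v, delta, alpha, beta, drive, omega):
    """Right-hand side of the Duffing equation written as a first-order system."""
    return v, -delta * v - alpha * x - beta * x ** 3 + drive * math.cos(omega * t)

def simulate(x, v, delta=0.3, alpha=-1.0, beta=1.0, drive=0.0, omega=1.2,
             dt=0.01, steps=10000):
    """Integrate with fourth-order Runge-Kutta and return the final (x, v)."""
    args = (delta, alpha, beta, drive, omega)
    t = 0.0
    for _ in range(steps):
        k1x, k1v = duffing_rhs(t, x, v, *args)
        k2x, k2v = duffing_rhs(t + 0.5 * dt, x + 0.5 * dt * k1x, v + 0.5 * dt * k1v, *args)
        k3x, k3v = duffing_rhs(t + 0.5 * dt, x + 0.5 * dt * k2x, v + 0.5 * dt * k2v, *args)
        k4x, k4v = duffing_rhs(t + dt, x + dt * k3x, v + dt * k3v, *args)
        x += dt / 6.0 * (k1x + 2.0 * k2x + 2.0 * k3x + k4x)
        v += dt / 6.0 * (k1v + 2.0 * k2v + 2.0 * k3v + k4v)
        t += dt
    return x, v

if __name__ == "__main__":
    print("well minima:", well_minima(-1.0, 1.0))
    print("undriven, damped :", simulate(0.5, 0.0))
    # With a sufficiently strong drive the mass can hop between the wells
    print("driven (drive=0.5):", simulate(0.5, 0.0, drive=0.5))
```

Switching on the drive is what opens the door to irregular hopping between the two wells; the undriven run is included only as a sanity check on the potential.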
Balthasar van der Pol
Autonomous oscillators are one of the building blocks of complex systems, providing the fundamental elements for biological oscillators, neural networks, business cycles, population dynamics, viral epidemics, and even the rings of Saturn. The most famous autonomous oscillator (after the pendulum clock) is named for a Dutch physicist, Balthasar van der Pol (1889 – 1959), who discovered the laws that govern how electrons oscillate in vacuum tubes, but the dynamical system that he developed has expanded to become the new paradigm of cyclic dynamical systems to replace the SHO (see my Blog on GrandFather Clocks.)
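What makes such an oscillator autonomous is that it settles onto the same cycle regardless of how it is started. The sketch below uses the textbook form of the van der Pol equation, x'' − μ(1 − x²)x' + x = 0, which is an assumption here since the text does not write it out; the parameter values and function names are illustrative. Starting well inside or well outside the limit cycle, the oscillation amplitude converges to roughly 2.

```python
def van_der_pol_rhs(x, v, mu):
    """Textbook van der Pol oscillator as a first-order system."""
    return v, mu * (1.0 - x * x) * v - x

def rk4_step(x, v, mu, dt):
    k1x, k1v = van_der_pol_rhs(x, v, mu)
    k2x, k2v = van_der_pol_rhs(x + 0.5 * dt * k1x, v + 0.5 * dt * k1v, mu)
    k3x, k3v = van_der_pol_rhs(x + 0.5 * dt * k2x, v + 0.5 * dt * k2v, mu)
    k4x, k4v = van_der_pol_rhs(x + dt * k3x, v + dt * k3v, mu)
    x += dt / 6.0 * (k1x + 2.0 * k2x + 2.0 * k3x + k4x)
    v += dt / 6.0 * (k1v + 2.0 * k2v + 2.0 * k3v + k4v)
    return x, v

def limit_cycle_amplitude(x0, mu=1.0, dt=0.01, transient=4000, measure=2000):
    """Discard a transient, then return the largest |x| seen over a few periods."""
    x, v = x0, 0.0
    for _ in range(transient):
        x, v = rk4_step(x, v, mu, dt)
    amplitude = 0.0
    for _ in range(measure):
        x, v = rk4_step(x, v, mu, dt)
        amplitude = max(amplitude, abs(x))
    return amplitude

if __name__ == "__main__":
    for start in (0.1, 4.0):
        print(f"start at x = {start}: amplitude -> {limit_cycle_amplitude(start):.3f}")
```

A simple harmonic oscillator would instead remember its initial amplitude forever, which is the contrast being drawn between the two paradigms.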
Turning from this general survey, let’s find out what happens when special relativity is added to the simplest SHO.
Relativistic Linear-Spring Oscillator
The theory of the relativistic one-dimensional linear-spring oscillator starts from a relativistic Lagrangian of a free particle (with no potential) yielding the generalized relativistic momentum
The Lagrangian that accomplishes this is 
where the invariant 4-velocity is
When the particle is in a potential, the Lagrangian becomes
The action integral that is minimized is
and the Lagrangian for integration of the action integral over proper time is
The relativistic modification in the potential energy term of the Lagrangian is not in the spring constant, but rather is purely a time dilation effect. This is captured by the relativistic Lagrangian
where the dot is with respect to proper time τ. The classical potential energy term in the Lagrangian is multiplied by the relativistic factor γ, which is position dependent because of the non-constant speed of the oscillator mass. The Euler-Lagrange equations are
where the subscripts in the variables are a = 0, 1 for the time and space dimensions, respectively. The derivative of the time component of the 4-vector is
From the derivative of the Lagrangian with respect to speed, the following result is derived
where E is the constant total relativistic energy. Therefore,
which provides an expression for the derivative of the coordinate time with respect to the proper time where
The position-dependent γ(x) factor is then
The Euler-Lagrange equation with a = 1 is
providing the flow equations for the (an)harmonic oscillator with respect to proper time
This flow represents a harmonic oscillator modified by the γ(x) factor, due to time dilation, multiplying the spring force term. Therefore, at relativistic speeds, the oscillator is no longer harmonic even though the spring constant remains truly a constant. The term in parentheses effectively softens the spring for larger displacement, and hence the frequency of oscillation becomes smaller.
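The flow can be explored numerically with a short sketch. Since the displayed equations are not reproduced here, the forms used below are assumptions reconstructed from the description: in units where m = k = c = 1, the time-dilation factor is taken to be γ(x) = E − x²/2, with E the conserved total energy, and the proper-time flow is dx/dτ = u, du/dτ = −γ(x) x. The function names and step sizes are illustrative. The sketch checks two things: that the quantity γ(x)² − u² stays equal to 1 along the flow, as it must for a proper velocity, and that the low-energy period reduces to the SHO value 2π.

```python
import math

def gamma_of_x(x, total_energy):
    """Assumed position-dependent time-dilation factor, units m = k = c = 1."""
    return total_energy - 0.5 * x * x

def flow(x, u, total_energy):
    """Assumed proper-time flow: dx/dtau = u, du/dtau = -gamma(x) x."""
    return u, -gamma_of_x(x, total_energy) * x

def rk4_step(x, u, total_energy, dtau):
    k1x, k1u = flow(x, u, total_energy)
    k2x, k2u = flow(x + 0.5 * dtau * k1x, u + 0.5 * dtau * k1u, total_energy)
    k3x, k3u = flow(x + 0.5 * dtau * k2x, u + 0.5 * dtau * k2u, total_energy)
    k4x, k4u = flow(x + dtau * k3x, u + dtau * k3u, total_energy)
    x += dtau / 6.0 * (k1x + 2.0 * k2x + 2.0 * k3x + k4x)
    u += dtau / 6.0 * (k1u + 2.0 * k2u + 2.0 * k3u + k4u)
    return x, u

def one_cycle(beta_max, dtau=1e-3, max_steps=10_000_000):
    """Follow one full cycle starting at the origin with speed beta_max.

    Returns (proper-time period, largest drift of gamma(x)^2 - u^2 from 1).
    """
    gamma0 = 1.0 / math.sqrt(1.0 - beta_max ** 2)
    total_energy = gamma0              # all energy is kinetic at x = 0
    x, u = 0.0, gamma0 * beta_max      # u is the proper velocity dx/dtau
    tau, drift = 0.0, 0.0
    for _ in range(max_steps):
        x_new, u_new = rk4_step(x, u, total_energy, dtau)
        drift = max(drift, abs(gamma_of_x(x_new, total_energy) ** 2 - u_new ** 2 - 1.0))
        if x < 0.0 <= x_new:           # upward zero crossing closes the cycle
            fraction = -x / (x_new - x)
            return tau + fraction * dtau, drift
        x, u, tau = x_new, u_new, tau + dtau
    raise RuntimeError("no full cycle found")

if __name__ == "__main__":
    period, drift = one_cycle(0.01)
    print(f"beta = 0.01: proper period = {period:.5f} (2 pi = {2 * math.pi:.5f})")
    period, drift = one_cycle(0.9)
    print(f"beta = 0.9 : invariant drift = {drift:.2e}")
```

Converting a proper-time trajectory to coordinate time would require accumulating γ(x) dτ along the cycle, which is left out of this sketch.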
The state-space diagram of the anharmonic oscillator is shown in Fig. 3 with respect to proper time (the time read on a clock co-moving with the oscillator mass). At low energy, the oscillator is harmonic with the natural period of the SHO. As the maximum speed exceeds β = 0.8, the period becomes longer and the trajectory less sinusoidal. The position and speed for β = 0.9999 are shown in Fig. 4. The mass travels near the speed of light as it passes the origin, producing significant time dilation at that instant. The average time dilation through a single cycle is about a factor of three, despite the large instantaneous γ = 70 when the mass passes the origin.
W. Moreau, R. Easther, and R. Neutze, “Relativistic (an)harmonic oscillator,” American Journal of Physics, vol. 62, no. 6, pp. 531-535 (1994)
“What is a coconut worth to a cast-away on a deserted island?”
In the midst of the cast-away’s misfortune and hunger and exertion and food lies an answer that looks familiar to any physicist who speaks the words
“Assume a Lagrangian …”
It is the same process that determines how a bead slides along a bent wire in gravity or a skier navigates a ski hill. The answer: find the balance of economic forces subject to constraints.
Here is the history and the physics behind one of the simplest economic systems that can be conceived: Robinson Crusoe spending his time collecting coconuts!
Robinson Crusoe in Economic History
Daniel Defoe published “The Life and Strange Surprizing Adventures of Robinson Crusoe” in 1719, about a man who is shipwrecked on a deserted island and survives there for 28 years before being rescued. It was written in the first person, as if the author had actually lived through those experiences, and it was based on a real-life adventure story. It is one of the first examples of realistic fiction, and it helped establish the genre of the English novel.
Marginalism in economic theory is the demarcation between classical economics and modern economics. The key principle of marginalism is the principle of “diminishing returns”: the value of something decreases as an individual has more of it. This principle makes functions convex, which helps to guarantee that there are equilibrium points in the economy. Economic equilibrium is a key concept and goal because it provides stability to economic systems.
One-Product Is a Dull Diet
The Robinson Crusoe economy is one of the simplest economic models that captures the trade-off between labor and production on one side, and leisure and consumption on the other. The model has a single laborer for whom there are 24 × 7 = 168 hours in the week. Some of these hours must be spent finding food, let’s say coconuts, while the other hours are for leisure and rest. The production of coconuts follows a production curve
that is a function of labor L. There are diminishing returns in the finding of coconuts for a given labor, making the production curve of coconuts convex. The amount of rest is
and there is a reciprocal production curve q(R) in which fewer coconuts are produced as more time is spent resting. In this model it is assumed that all coconuts that are produced are consumed. This is known as market clearing, when no surplus is built up.
The production curve presents a continuous trade-off between consumption and leisure, but at first look there is no obvious way to decide how much to work and how much to rest. A lazy person might be willing to go a little hungry if they can have more rest, while a busy person might want to use all waking hours to find coconuts. The production curve represents something known as a Pareto frontier. It is a continuous trade-off between two qualities. Another example of a Pareto frontier is car engine efficiency versus cost. Some consumers may care more about the up-front cost of the car than the cost of gas, while other consumers may value fuel efficiency and be willing to pay higher costs to get it.
Continuous trade-offs always present a bit of a problem for planning. It is often not clear what the best trade-off should be. This problem is solved by introducing another concept into this little economy: the concept of “Utility”.
The utility function was introduced by the physicist Daniel Bernoulli, one of the many bountiful Bernoullis of Basel, in 1738. The utility function is a measure of how much benefit or utility a person or an enterprise gains by holding varying amounts of goods or labor. The essential problem in economic exchange is to maximize one’s utility function subject to whatever constraints are active. The utility function for Robinson Crusoe is
This function is obviously a maximum at maximum leisure (R = 1) and lots of coconuts (q = 1), but this is not allowed, because it lies off the production curve q(R). Therefore the question becomes: where on the production curve can he maximize the trade-off between coconuts and leisure?
Fig. 1 shows the dynamical space for Robinson Crusoe’s economy. The space is two dimensional with axes for coconuts q and rest R. Isoclines of the utility function are shown as contours known as “indifference” curves, because the utility is constant along these curves and hence Robinson Crusoe is indifferent to his position on it. The indifference curves are cut by the production curve q(R). The equilibrium problem is to maximize utility subject to the production curve.
When looking at dynamics under constraints, Lagrange multipliers are the best tool. Furthermore, we can impart dynamics into the model with temporal adjustments in q and R that respond to economic forces.
The Lagrangian Economy
The approach to the Lagrangian economy is identical to the Lagrangian approach in classical physics. The equation of constraint is
All the dynamics take place on the production curve. The initial condition starts on the curve, and the state point moves along the curve until it reaches a maximum and settles into equilibrium. The dynamics is therefore one-dimensional, the link between q and R being the production curve.
The Lagrangian in this simple economy is given by the utility function augmented by the equation of constraint, such that
where the term on the right-hand-side is a drag force with the relaxation rate γ.
The first term on the left is the momentum of the system. In economic dynamics, this is usually negligible, similar to dynamics in living systems at low Reynolds number in which all objects are moving instantaneously at their terminal velocity in response to forces. The equations of motion are therefore
The Lagrange multiplier can be solved from the first equation as
and the last equation converts q-dot to R-dot to yield the single equation
which is a one-dimensional flow
where all q’s are expressed as R’s through the equation of constraint. The speed vanishes at the fixed point—the economic equilibrium—when
This is the point of Pareto efficient allocation. Any initial condition on the production curve will relax to this point with a rate given by γ. These trajectories are shown in Fig. 2. From the point of view of Robinson Crusoe, if he is working harder than he needs, then he will slack off. But if there aren’t enough coconuts to make him happy, he will work harder.
The production curve is like a curved wire, the amount of production q is like the bead sliding on the wire. The utility function plays the role of a potential function, and the gradients of the utility function play the role of forces. Then this simple economic model is just like ordinary classical physics of point masses responding to forces constrained to lie on certain lines or surfaces. From this viewpoint, physics and economics are literally the same.
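The bead-on-a-wire picture can be tried out with a small sketch. Everything specific below is hypothetical and chosen only for illustration: the utility U = qR, the production curve q(R) = √(1 − R) in normalized units, the parameter named rate, and the overdamped rule that the state moves along the curve at a speed proportional to the slope of the utility along that curve. For these choices the equilibrium works out analytically to R = 2/3 and q = 1/√3, and the sketch confirms that starting from too much work or too much rest relaxes to the same point, where the Lagrange condition (∂U/∂R)/(∂U/∂q) = −dq/dR holds.

```python
import math

def production(rest):
    """Hypothetical production curve with diminishing returns, normalized units."""
    return math.sqrt(1.0 - rest)

def production_slope(rest):
    """dq/dR along the hypothetical production curve."""
    return -0.5 / math.sqrt(1.0 - rest)

def utility(q, rest):
    """Hypothetical utility that grows with both coconuts and rest."""
    return q * rest

def utility_slope_on_curve(rest):
    """Total derivative dU/dR along the curve: dU/dR + dU/dq * dq/dR."""
    q = production(rest)
    dU_dR, dU_dq = q, rest            # partial derivatives of U = q R
    return dU_dR + dU_dq * production_slope(rest)

def relax(rest, rate=1.0, dt=0.01, steps=2000):
    """Overdamped motion along the curve (forward Euler, illustrative)."""
    for _ in range(steps):
        rest += dt * rate * utility_slope_on_curve(rest)
    return rest

if __name__ == "__main__":
    for start in (0.10, 0.95):
        rest = relax(start)
        print(f"start R = {start:.2f} -> R = {rest:.4f}, q = {production(rest):.4f}")
```

A different utility or production curve would move the equilibrium, but the mechanism, sliding along the constraint until the force along it vanishes, stays the same.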
To make this problem specific, consider a utility function given by
that has a maximum in the upper right corner, and a production curve given by
that has diminishing returns. Then, the condition of equilibrium can be solved using
With the (fairly obvious) answer
For More Reading
D. D. Nolte, Introduction to Modern Dynamics: Chaos, Networks, Space and Time, 2nd ed. Oxford: Oxford University Press (2019).
 Fritz Söllner; The Use (and Abuse) of Robinson Crusoe in Neoclassical Economics. History of Political Economy; 48 (1): 35–64. (2016)
“Society is founded on hero worship”, wrote Thomas Carlyle (1795 – 1881) in his 1840 lecture on “Hero as Divinity”—and the society of physicists is no different. Among physicists, the hero is the genius—the monomyth who journeys into the supernatural realm of high mathematics, engages in single combat against chaos and confusion, gains enlightenment in the mysteries of the universe, and returns home to share the new understanding. If the hero is endowed with unusual talent and achieves greatness, then mythologies are woven, creating shadows that can grow and eclipse the truth and the work of others, bestowing upon the hero recognitions that are not entirely deserved.
“Gentlemen! The views of space and time which I wish to lay before you … They are radical. Henceforth space by itself, and time by itself, are doomed to fade away into mere shadows, and only a kind of union of the two will preserve an independent reality.”
Hermann Minkowski (1908)
The greatest hero of physics of the twentieth century, without question, is Albert Einstein. He is the person most responsible for the development of “Modern Physics” that encompasses:
Relativity theory (both special and general),
Quantum theory (he invented the quantum in 1905—see my blog),
Astrophysics (his field equations of general relativity were solved by Schwarzschild in 1916 to predict event horizons of black holes, and he solved his own equations to predict gravitational waves that were discovered in 2015),
Cosmology (his cosmological constant is now recognized as the mysterious dark energy that was discovered in 1998), and
Solid state physics (his explanation of the specific heat of crystals inaugurated the field of quantum matter).
Einstein made so many seminal contributions to so many sub-fields of physics that it defies comprehension, and hence he is mythologized as a genius, able to see into the depths of reality with unique insight. He deserves his reputation as the greatest physicist of the twentieth century. He has my vote, and Time magazine named him its Person of the Century at the end of 1999. But as his shadow has grown, it has eclipsed and even assimilated the work of others: work that he initially criticized and dismissed, yet later embraced so whole-heartedly that he is mistakenly given credit for its discovery.
For instance, when we think of Einstein, the first thing that pops into our minds is probably “spacetime”. He himself wrote several popular accounts of relativity that incorporated the view that spacetime is the natural geometry within which so many of the non-intuitive properties of relativity can be understood. When we think of time being mixed with space, making it seem that position coordinates and time coordinates share an equal place in the description of relativistic physics, it is common to attribute this understanding to Einstein. Yet Einstein initially resisted this viewpoint and even disparaged it when he first heard it!
Spacetime was the brain-child of Hermann Minkowski.
Minkowski in Königsberg
Hermann Minkowski was born in 1864 in Russia to German parents who moved to the city of Königsberg (King’s Mountain) in East Prussia when he was eight years old. He entered the university in Königsberg in 1880 when he was sixteen. Within a year, when he was only seventeen years old and still a student at the university, Minkowski responded to an announcement of the Mathematics Prize of the French Academy of Sciences in 1881. When he submitted his prize-winning memoir, he could have had no idea that it was starting him down a path that would lead him years later to revolutionary views.
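The specific Prize challenge of 1881 was to find the number of representations of an integer as a sum of five squares of integers. For instance, every integer n > 33 can be expressed as the sum of five nonzero squares. As an example, $42 = 2^2 + 2^2 + 3^2 + 3^2 + 4^2$, which is the only such representation for that number. However, there are five representations for n = 53:

$$
\begin{aligned}
53 &= 1^2 + 1^2 + 1^2 + 1^2 + 7^2 \\
   &= 1^2 + 1^2 + 1^2 + 5^2 + 5^2 \\
   &= 1^2 + 2^2 + 4^2 + 4^2 + 4^2 \\
   &= 1^2 + 3^2 + 3^2 + 3^2 + 5^2 \\
   &= 2^2 + 2^2 + 2^2 + 4^2 + 5^2
\end{aligned}
$$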
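The counts quoted above (a single representation for 42, five for 53) can be checked by brute force. The following is only a minimal sketch: the function name `five_square_representations` is made up for this illustration, and a direct search like this has nothing to do with the quadratic-form methods Minkowski actually used.

```python
from itertools import combinations_with_replacement

def five_square_representations(n):
    """Return all multisets of five positive integers whose squares sum to n.

    Each representation is a tuple in nondecreasing order, so reorderings
    of the same five numbers are counted only once.
    """
    limit = int(n ** 0.5)  # no term can exceed sqrt(n)
    return [combo
            for combo in combinations_with_replacement(range(1, limit + 1), 5)
            if sum(k * k for k in combo) == n]

reps_42 = five_square_representations(42)
reps_53 = five_square_representations(53)
print(len(reps_42), reps_42)
print(len(reps_53), reps_53)
```

The search is exhaustive but slow for large n, which is part of why the general counting problem called for a real theory.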
The task of enumerating these representations draws from the theory of quadratic forms. A quadratic form is a polynomial with integer coefficients in which every term is a product of two variables, such as $ax^2 + bxy + cy^2$ and $ax^2 + by^2 + cz^2 + dxy + exz + fyz$. In number theory, one seeks to find integer solutions for which the quadratic form equals an integer. For instance, the Pythagorean relation $x^2 + y^2 = n^2$ for integers involves a quadratic form for which there are many integer solutions (x, y, n), known as Pythagorean triples, such as (3, 4, 5), (5, 12, 13) and (8, 15, 17).
The topic of quadratic forms gained special significance after the work of Bernhard Riemann, who established the properties of metric spaces based on the metric expression

$$ds^2 = \sum_{a,b=1}^{D} g_{ab}\, dx^a\, dx^b$$

for infinitesimal distance in a D-dimensional metric space. This is a generalization of Euclidean distance to more general non-Euclidean spaces that may have curvature. Minkowski would later use this expression to great advantage, developing a “Geometry of Numbers” as he delved ever deeper into quadratic forms and their uses in number theory.
Minkowski in Göttingen
After graduating with a doctoral degree in 1885 from Königsberg, Minkowski did his habilitation at the University of Bonn and began teaching, moving back to Königsberg in 1894 and then to Zurich in 1896 (where one of his students was a somewhat lazy and unimpressive Albert Einstein). A few years later he was given an offer that he could not refuse.
At the turn of the 20th century, the place to be in mathematics was at the University of Göttingen. It had a long tradition of mathematical giants that included Carl Friedrich Gauss, Bernhard Riemann, Peter Dirichlet, and Felix Klein. Under the guidance of Felix Klein, Göttingen mathematics had undergone a renaissance. For instance, Klein had attracted Hilbert from the University of Königsberg in 1895. David Hilbert had known Minkowski when they were both students in Königsberg, and Hilbert extended an invitation to Minkowski to join him in Göttingen, which Minkowski accepted in 1902.
A few years after Minkowski arrived at Göttingen, the relativity revolution broke, and both Minkowski and Hilbert began working on mathematical aspects of the new physics. They organized a colloquium dedicated to relativity and related topics, and on Nov. 5, 1907 Minkowski gave his first tentative address on the geometry of relativity.
Because Minkowski’s specialty was quadratic forms, and given his understanding of Riemann’s work, he was perfectly situated to apply his theory of quadratic forms and invariants to the Lorentz transformations derived by Poincaré and Einstein. Although Poincaré had published a paper in 1906 that showed that the Lorentz transformation was a generalized rotation in four-dimensional space, Poincaré continued to discuss space and time as separate phenomena, as did Einstein. For them, simultaneity was no longer an invariant, but events in time were still events in time and not somehow mixed with space-like properties. Minkowski recognized that Poincaré had missed an opportunity to define a four-dimensional vector space filled by four-vectors that captured all possible events in a single coordinate description without the need to separate out time and space.
Minkowski’s first attempt, presented in his 1907 colloquium, at constructing velocity four-vectors was flawed because (like so many of my mechanics students when they first take a time derivative of the four-position) he had not yet understood the correct use of proper time. But the research program he outlined paved the way for the great work that was to follow.
On Feb. 21, 1908, only 3 months after his first halting steps, Minkowski delivered a thick manuscript to the printers for an article to appear in the Göttinger Nachrichten. The title “Die Grundgleichungen für die elektromagnetischen Vorgänge in bewegten Körpern” (The Basic Equations for Electromagnetic Processes in Moving Bodies) belies the impact and importance of this very dense article. In its 60 pages (with no figures), Minkowski presents the correct form for four-velocity by taking derivatives relative to proper time, and he formalizes his four-dimensional approach to relativity that became the standard afterwards. He introduces the terms spacelike vector, timelike vector, light cone and world line. He also presents the complete four-tensor form for the electromagnetic fields. The foundational work of Levi-Civita and Ricci-Curbastro on tensors was not yet well known, so Minkowski invents his own terminology of Traktor to describe it. Most importantly, he invents the terms spacetime (Raum-Zeit) and events (Ereignisse).
Minkowski’s four-dimensional formalism of relativistic electromagnetics was more than a mathematical trick—it uncovered the presence of a multitude of invariants that were obscured by the conventional mathematics of Einstein and Lorentz and Poincaré. In Minkowski’s approach, whenever a proper four-vector is contracted with itself (its inner product), an invariant emerges. Because there are many fundamental four-vectors, there are many invariants. These invariants provide the anchors from which to understand the complex relative properties amongst relatively moving frames.
Minkowski’s master work appeared in the Nachrichten on April 5, 1908. If he had thought that physicists would embrace his visionary perspective, he was about to be woefully disabused of that notion.
Despite his impressive ability to see into the foundational depths of the physical world, Einstein did not view mathematics as the root of reality. Mathematics for him was a tool to reduce physical intuition into quantitative form. In 1908 his fame was rising as the acknowledged leader in relativistic physics, and he was not impressed or pleased with the abstract mathematical form that Minkowski was trying to stuff the physics into. Einstein called it “superfluous erudition”, and complained “since the mathematicians pounced on the relativity theory, I no longer understand it myself!”
With his collaborator Jakob Laub (also a former student of Minkowski’s), Einstein objected to more than the hard-to-follow mathematics: they believed that Minkowski’s form of the ponderomotive force was incorrect. They then proceeded to re-translate Minkowski’s elegant four-vector derivations back into ordinary vector analysis, publishing two papers in Annalen der Physik in the summer of 1908 that were politely critical of Minkowski’s approach [7-8]. Yet another of Minkowski’s students from Zurich, Gunnar Nordström, showed how to derive Minkowski’s field equations without any of the four-vector formalism.
One can only wonder why so many of his former students so easily dismissed Minkowski’s revolutionary work. Einstein had actually avoided Minkowski’s mathematics classes as a student at ETH, which may say something about Minkowski’s reputation among the students, although Einstein did appreciate the class on mechanics that he took from Minkowski. Nonetheless, Einstein missed the point! Rather than realizing the power and universality of the four-dimensional spacetime formulation, he dismissed it as obscure and irrelevant, perhaps prejudiced by his earlier dim view of his former teacher.
Raum und Zeit
It is clear that Minkowski was stung by the poor reception of his spacetime theory. It is also clear that he truly believed that he had uncovered an essential new approach to physical reality. While mathematicians were generally receptive to his work, he knew that if physicists were to adopt his new viewpoint, he needed to win them over with elegant results.
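In 1908, Minkowski presented a now-famous paper Raum und Zeit at the 80th Assembly of German Natural Scientists and Physicians (21 September 1908). In his opening address, he stated:

“The views of space and time which I wish to lay before you have sprung from the soil of experimental physics, and therein lies their strength. They are radical. Henceforth space by itself, and time by itself, are doomed to fade away into mere shadows, and only a kind of union of the two will preserve an independent reality.”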
To illustrate his arguments Minkowski constructed the most recognizable visual icon of relativity theory—the space-time diagram in which the trajectories of particles appear as “world lines”, as in Fig. 1. On this diagram, one spatial dimension is plotted along the horizontal-axis, and the value ct (speed of light times time) is plotted along the vertical-axis. In these units, a photon travels along a line oriented at 45 degrees, and the world-line (the name Minkowski gave to trajectories) of all massive particles must have slopes steeper than this. For instance, a stationary particle, that appears to have no trajectory at all, executes a vertical trajectory on the space-time diagram as it travels forward through time. Within this new formulation by Minkowski, space and time were mixed together in a single manifold—spacetime—and were no longer separate entities.
In addition to the spacetime construct, Minkowski’s great discovery was the plethora of invariants that followed from his geometry. For instance, the spacetime hyperbola

$$(ct)^2 - x^2 = s^2$$

is invariant to Lorentz transformation of the coordinates. This is just a simple statement that a vector is an entity of reality that is independent of how it is described. The length of a vector in our normal three-space does not change if we flip the coordinates around or rotate them, and the same is true for four-vectors in Minkowski space subject to Lorentz transformations.
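The invariance is easy to check numerically. The sketch below assumes one spatial dimension, the standard Lorentz boost, and the sign convention in which the squared interval is (ct)² minus x²; the function names and the sample event are chosen here purely for illustration.

```python
import math

def lorentz_boost(ct, x, beta):
    """Transform the event (ct, x) into a frame moving at speed beta = v/c."""
    gamma = 1.0 / math.sqrt(1.0 - beta ** 2)
    ct_new = gamma * (ct - beta * x)
    x_new = gamma * (x - beta * ct)
    return ct_new, x_new

def interval_squared(ct, x):
    """Squared spacetime interval (ct)^2 - x^2 (assumed sign convention)."""
    return ct ** 2 - x ** 2

event = (5.0, 3.0)  # arbitrary sample event with (ct)^2 - x^2 = 16
for beta in (0.0, 0.3, 0.6, 0.9):
    boosted = lorentz_boost(*event, beta)
    # The coordinates change from frame to frame, the interval does not.
    print(beta, boosted, interval_squared(*boosted))
```

Every boosted copy of the event lands on the same hyperbola, which is the sense in which the interval plays the role of a vector length in Minkowski space.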
In relativity theory, this property of invariance becomes especially useful because part of the mental challenge of relativity is that everything looks different when viewed from different frames. How do you get a good grip on a phenomenon if it is always changing, always relative to one frame or another? The invariants become the anchors that we can hold on to as reference frames shift and morph about us.
As an example of a fundamental invariant, the mass of a particle in its rest frame becomes an invariant mass, always with the same value. In earlier relativity theory, even in Einstein’s papers, the mass of an object was a function of its speed. How is the mass of an electron a fundamental property of physics if it is a function of how fast it is traveling? The construction of invariant mass removes this problem, and the mass of the electron becomes an immutable property of physics, independent of the frame. Invariant mass is just one of many invariants that emerge from Minkowski’s space-time description. The study of relativity, where all things seem relative, became a study of invariants, where many things never change. In this sense, the theory of relativity is a misnomer. Ironically, relativity theory became the motivation of post-modern relativism that denies the existence of absolutes, even as relativity theory, as practiced by physicists, is all about absolutes.
Despite his audacious gambit to win over the physicists, Minkowski would not live to see the fruits of his effort. He died suddenly of a ruptured appendix on Jan. 12, 1909 at the age of 44.
Arnold Sommerfeld (who went on to play a central role in the development of quantum theory) took up Minkowski’s four-vectors and systematized them in a way that was palatable to physicists. Then Max von Laue extended the formalism while he was working with Sommerfeld in Munich, publishing the first physics textbook on relativity theory in 1911 and establishing the space-time formalism for future generations of German physicists. Further support for Minkowski’s work came from his distinguished colleagues at Göttingen (Hilbert, Klein, Wiechert, Schwarzschild) as well as his former students (Born, Laue, Kaluza, Frank, Noether). With such champions, Minkowski’s work was immortalized in the methodology (and mythology) of physics, representing one of the crowning achievements of the Göttingen mathematical community.
Already in 1907 Einstein was beginning to grapple with the role of gravity in the context of relativity theory, and he knew that the special theory was just a beginning. Yet between 1908 and 1910 Einstein’s focus was on the quantum of light as he defended and extended his unique view of the photon and prepared for the first Solvay Congress of 1911. As he returned his attention to the problem of gravitation after 1910, he began to realize that Minkowski’s formalism provided a framework from which to understand the role of accelerating frames. In 1912 Einstein wrote to Sommerfeld to say 
I occupy myself now exclusively with the problem of gravitation. One thing is certain, that I have never before had to toil anywhere near as much, and that I have been infused with great respect for mathematics, which I had up until now in my naivety looked upon as a pure luxury in its more subtle parts. Compared to this problem, the original theory of relativity is child’s play.
By the time Einstein had finished his general theory of relativity and gravitation in 1915, he fully acknowledged his indebtedness to Minkowski’s spacetime formalism, without which his general theory might never have appeared.
The idea of parallel dimensions in physics has a long history dating back to Bernhard Riemann’s famous 1854 lecture on the foundations of geometry, which he gave as a requirement to attain a teaching position at the University of Göttingen. Riemann laid out a program of study that included physics problems solved in multiple dimensions, but it was Rudolf Lipschitz twenty years later who first composed a rigorous view of physics as trajectories in many dimensions. Nonetheless, the three spatial dimensions we enjoy in our daily lives remained the only true physical space until Hermann Minkowski re-expressed Einstein’s theory of relativity in 4-dimensional space-time. Even so, Minkowski’s time dimension was not on an equal footing with the three spatial dimensions. The four dimensions were entwined, but time had a different character, described by what is known as a pseudo-Riemannian metric. It is this pseudo-metric that allows squared space-time distances to be negative as easily as positive.
In 1919 Theodor Kaluza of the University of Königsberg in Prussia extended Einstein’s theory of gravitation to a fifth dimension, and physics had its first true parallel dimension. It was more than just an exercise in mathematics: adding a fifth dimension to relativistic dynamics adds new degrees of freedom that allow the dynamical 5-dimensional theory to include more than merely relativistic massive particles and the electric field they generate. In addition to electromagnetism, something akin to Einstein’s field equation of gravitation emerges. Here was a five-dimensional theory that seemed to unify E&M with gravity, a first unified theory of physics. Einstein, to whom Kaluza communicated his theory, was intrigued but hesitant to forward Kaluza’s paper for publication. It seemed too good to be true. But Einstein finally sent it to be published in the proceedings of the Prussian Academy of Sciences [Kaluza, 1921]. He later launched his own effort to explore such unified field theories more deeply.
Yet Kaluza’s theory was fully classical—if a fifth dimension can be called that—because it made no connection to the rapidly developing field of quantum mechanics. The person who took the step to make five-dimensional space-time into a quantum field theory was Oskar Klein.
Oskar Klein (1894 – 1977)
Oskar Klein was a Swedish physicist who was in the “second wave” of quantum physicists just a few years behind the titans Heisenberg and Schrödinger and Pauli. He began as a student in physical chemistry working in Stockholm under the famous Arrhenius. It was arranged for him to work in France and Germany in 1914, but he was caught in Paris at the onset of World War I. Returning to Sweden, he enlisted in military service from 1915 to 1916 and then joined Arrhenius’ group at the Nobel Institute, where he met Hendrik Kramers, Bohr’s direct assistant at Copenhagen at that time. At Kramers’ invitation, Klein traveled to Copenhagen and worked for a year with Kramers and Bohr before returning to defend his doctoral thesis in 1921 in the field of physical chemistry. Klein’s work with Bohr had opened his eyes to the possibilities of quantum theory, and he shifted his research interest away from physical chemistry. Unfortunately, there were no positions at that time in such a new field, so Klein accepted a position as assistant professor at the University of Michigan in Ann Arbor where he stayed from 1923 to 1925.
The Fifth Dimension
In an odd twist of fate, this isolation of Klein from the mainstream quantum theory being pursued in Europe freed him of the bandwagon effect and allowed him to range freely on topics of his own devising and in directions all his own. Unaware of Kaluza’s previous work, Klein expanded Minkowski’s space-time from four to five dimensions, just as Kaluza had done, but now with a quantum interpretation. This was not just an incremental step but had far-ranging consequences in the history of physics.
Klein found a way to keep the fifth dimension Euclidean in its metric properties while rolling it up compactly into a cylinder with a radius of about the Planck length, something inconceivably small. This compact fifth dimension made the manifold into something akin to an infinitesimal string. He published a short note in Nature magazine in 1926 on the possibility of identifying the electric charge within the 5-dimensional theory [Klein, 1926a]. He then returned to Sweden to take up a position at the University of Lund. This odd string-like feature of 5-dimensional space-time was picked up by Einstein and others in their search for unified field theories of physics, but the topic soon drifted from the limelight and lay dormant for nearly fifty years until the first forays were made into string theory. String theory resurrected the Kaluza-Klein theory, which has burgeoned into the vast topic of String Theory today, including Superstrings that occur in 10+1 dimensions at the frontiers of physics.
Dirac Electrons without the Spin: Klein-Gordon Equation
Once back in Europe, Klein reengaged with the mainstream trends in the rapidly developing quantum theory and in 1926 developed a relativistic quantum theory of the electron [Klein, 1926b]. Around the same time Walter Gordon also proposed this equation, which is now called the “Klein-Gordon Equation”. The equation was a classic wave equation that was second order in both space and time. This was the most natural form for a wave equation for quantum particles and Schrödinger himself had started with this form. But Schrödinger had quickly realized that the second-order time term in the equation did not capture the correct structure of the hydrogen atom, which led him to express the time-dependent term in first order and non-relativistically—which is today’s “Schrödinger Equation”. The problem was in the spin of the electron. The electron is a spin-1/2 particle, a Fermion, which has special transformation properties. It was Dirac a few years later who discovered how to express the relativistic wave equation for the electron—not by promoting the time-dependent term to second order, but by demoting the space-dependent term to first order. The first-order expression for both the space and time derivatives goes hand in hand with the Pauli spin matrices for the electron, and the Dirac Equation is the appropriate relativistically-correct wave equation for the electron.
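For orientation, the three wave equations compared in this paragraph can be written side by side. These are the standard textbook forms for a particle of mass m (with a potential V included only in the Schrödinger case); the notation is an assumption of this sketch and is not taken from the original papers.

```latex
\begin{align}
\text{Klein-Gordon:}\quad
  & \frac{1}{c^2}\frac{\partial^2 \psi}{\partial t^2} - \nabla^2 \psi
    + \frac{m^2 c^2}{\hbar^2}\,\psi = 0 \\
\text{Schrödinger:}\quad
  & i\hbar\,\frac{\partial \psi}{\partial t}
    = -\frac{\hbar^2}{2m}\nabla^2 \psi + V\psi \\
\text{Dirac:}\quad
  & i\hbar\,\frac{\partial \psi}{\partial t}
    = \left( c\,\boldsymbol{\alpha}\cdot\mathbf{p} + \beta\, m c^2 \right)\psi,
    \qquad \mathbf{p} = -i\hbar\nabla
\end{align}
```

The Klein-Gordon equation is second order in both time and space, the Schrödinger equation is first order in time but second order in space, and the Dirac equation is first order in both, with the matrices α and β carrying the spin structure.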
Klein’s relativistic quantum wave equation does turn out to be the relevant form for a spin-less particle like the pion, but the pion decays by the strong nuclear force and the Klein-Gordon equation is not a practical description. However, the Higgs boson also is a spin-zero particle, and the Klein-Gordon expression does have relevance for this fundamental exchange particle.
In those early days of the late 1920’s, the nature of the nucleus was still a mystery, especially the problem of nuclear radioactivity where a neutron could convert to a proton with the emission of an electron. Some suggested that the neutron was somehow a proton that had captured an electron in a potential barrier. Klein showed that this was impossible, that the electrons would be highly relativistic—something known as a Dirac electron—and they would tunnel with perfect probability through any potential barrier [Klein, 1929]. Therefore, Klein concluded, no nucleon or nucleus could bind an electron.
This phenomenon of unity transmission through a barrier became known as Klein tunneling. The relativistic electron transmits perfectly through an arbitrary potential barrier—independent of its width or height. This is unlike light that transmits through a dielectric slab in resonances that depend on the thickness of the slab—also known as a Fabry-Perot interferometer. The Dirac electron can have any energy, and the potential barrier can have any width, yet the electron will tunnel with 100% probability. How can this happen?
The answer has to do with the dispersion (velocity versus momentum) of the Dirac electron. As the momentum changes in a potential, the speed of the Dirac electron stays constant. In the potential barrier, the momentum flips sign, but the speed remains unchanged. This is equivalent to the effects of negative refractive index in optics. If a photon travels through a material with negative refractive index, its momentum is flipped, but its speed remains unchanged. From Fermat’s principle, it is speed which determines how a particle like a photon refracts, so if there is no speed change, then there is no reflection.
For the case of Dirac electrons in a potential with field F (expressed as a force), speed v and transverse momentum $p_y$, the transmission coefficient is given by

$$T = \exp\left(-\frac{\pi v\, p_y^2}{\hbar F}\right)$$
If the transverse momentum is zero, then the transmission is perfect. A visual schematic of the role of dispersion and potentials for Dirac electrons undergoing Klein tunneling is shown in the next figure.
In this case, even if the transverse momentum is not strictly zero, there can still be perfect transmission. It is simply a matter of matching speeds.
Graphene became famous over the past decade because its electron dispersion relation is just like a relativistic Dirac electron with a Dirac point between conduction and valence bands. Evidence for Klein tunneling in graphene systems has been growing, but clean demonstrations have remained difficult to observe.
Now, published in the Dec. 2020 issue of Science magazine—almost a century after Klein first proposed it—an experimental group at the University of California at Berkeley reports a beautiful experimental demonstration of Klein tunneling—not from a nucleus, but in an acoustic honeycomb sounding board the size of a small table—making an experimental analogy between acoustics and Dirac electrons that bears out Klein’s theory.
In this special sounding board, it is not electrons but phonons—acoustic vibrations—that have a Dirac point. Furthermore, by changing the honeycomb pattern, the bands can be shifted, just like in a p-n-p junction, to produce a potential barrier. The Berkeley group, led by Xiang Zhang (now president of Hong Kong University), fabricated the sounding board that is about a half-meter in length, and demonstrated dramatic Klein tunneling.
It is amazing how long it can take between the time a theory is first proposed and the time a clean experimental demonstration is first performed. About 90 years have elapsed since Klein first derived the phenomenon. Performing the experiment with actual relativistic electrons was prohibitive, but bringing the Dirac electron analog into the solid state has allowed the effect to be demonstrated easily.
[1921] Kaluza, T. (1921). “Zum Unitätsproblem in der Physik”. Sitzungsber. Preuss. Akad. Wiss. Berlin (Math. Phys.): 966–972.
[1926a] Klein, O. (1926). “The Atomicity of Electricity as a Quantum Theory Law”. Nature 118: 516.
[1926b] Klein, O. (1926). “Quantentheorie und fünfdimensionale Relativitätstheorie”. Zeitschrift für Physik 37 (12): 895.
[1929] Klein, O. (1929). “Die Reflexion von Elektronen an einem Potentialsprung nach der relativistischen Dynamik von Dirac”. Zeitschrift für Physik 53 (3–4): 157.
It is second nature to think of integer dimensions: A line is one dimensional. A plane is two dimensional. A volume is three dimensional. A point has no dimensions.
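It is harder to think in four dimensions and higher, but even here it is a simple extrapolation of lower dimensions. Consider the basis vectors spanning a three-dimensional space consisting of the triples of numbers

$$\mathbf{e}_1 = (1, 0, 0), \quad \mathbf{e}_2 = (0, 1, 0), \quad \mathbf{e}_3 = (0, 0, 1)$$

Then a four-dimensional hyperspace is created simply by adding a new “tuple” to the list

$$\mathbf{e}_1 = (1, 0, 0, 0), \quad \mathbf{e}_2 = (0, 1, 0, 0), \quad \mathbf{e}_3 = (0, 0, 1, 0), \quad \mathbf{e}_4 = (0, 0, 0, 1)$$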
and so on to 5 and 6 dimensions and on. Child’s play!
But how do you think of fractional dimensions? What is a fractional dimension? For that matter, what is a dimension? Even the integer dimensions began to unravel when Georg Cantor showed in 1877 that the line and the plane, which clearly had different “dimensionalities”, both had the same cardinality and could be put into a one-to-one correspondence. From then onward the concept of dimension had to be rebuilt from the ground up, leading ultimately to fractals.
Here is a short history of fractal dimension, partially excerpted from my history of dynamics in Galileo Unbound (Oxford University Press, 2018) pg. 110 ff. This blog page presents the history through a set of publications that successively altered how mathematicians thought about curves in spaces, beginning with Karl Weierstrass in 1872.
Karl Weierstrass (1872)
Karl Weierstrass (1815 – 1897) was studying convergence properties of infinite power series in 1872 when he began with a problem that Bernhard Riemann had given to his students some years earlier. Riemann had asked whether the function

$$f(x) = \sum_{n=1}^{\infty} \frac{\sin(n^2 x)}{n^2}$$

was continuous everywhere but not differentiable. This simple question about a simple series was surprisingly hard to answer (it was not solved until Hardy provided the proof in 1916). Therefore, Weierstrass conceived of a simpler infinite sum that was continuous everywhere and for which he could calculate left and right limits of derivatives at any point. This function is

$$W(x) = \sum_{n=0}^{\infty} a^n \cos(b^n \pi x)$$

where b is a large odd integer and a is positive and less than one. Weierstrass showed that the left and right derivatives failed to converge to the same value, no matter where he took his point. In short, he had discovered a function that was continuous everywhere, but had a derivative nowhere. This pathological function, called a “Monster” by Charles Hermite, is now called the Weierstrass function.
Beyond the strange properties that Weierstrass sought, the Weierstrass function would turn out to be a fractal curve (recognized much later by Besicovitch and Ursell in 1937) with a fractal (Hausdorff) dimension given by

$$D = 2 + \frac{\ln a}{\ln b}$$

although this was not proven until very recently. An example of the function is shown in Fig. 1 for a = 0.5 and b = 5. This specific curve has a fractal dimension D = 1.5693. Notably, this is a number that is greater than 1 dimension (the topological dimension of the curve) but smaller than 2 dimensions (the embedding dimension of the curve). The curve tends to fill more of the two-dimensional plane than a straight line, so its intermediate fractal dimension has an intuitive feel about it. The more “monstrous” the curve looks, the closer its fractal dimension approaches 2.
Fig. 1 Weierstrass’ “Monster” (1872) with a = 0.5, b = 5. This continuous function is nowhere differentiable. It is a fractal with fractal dimension D = 2 + ln(0.5)/ln(5) = 1.5693.
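For readers who want to reproduce a curve like Fig. 1, here is a minimal sketch. It assumes the standard cosine-series form of the Weierstrass function, W(x) = Σ aⁿ cos(bⁿ π x), truncated at a finite number of terms; the function names and the choice of 30 terms are illustrative assumptions, not the code used to make the figure.

```python
import math

def weierstrass(x, a=0.5, b=5, n_terms=30):
    """Partial sum of W(x) = sum_{n>=0} a^n cos(b^n pi x).

    The truncation at n_terms is an assumption of this sketch; the omitted
    tail is bounded by a^n_terms / (1 - a), which is negligible here.
    """
    return sum(a ** n * math.cos(b ** n * math.pi * x) for n in range(n_terms))

def weierstrass_dimension(a, b):
    """Fractal dimension D = 2 + ln(a)/ln(b) quoted in the caption."""
    return 2.0 + math.log(a) / math.log(b)

print(round(weierstrass_dimension(0.5, 5), 4))

# Sample the curve on [0, 1]; these pairs can be handed to any plotting tool.
samples = [(k / 1000, weierstrass(k / 1000)) for k in range(1001)]
```

Zooming in on any stretch of the sampled curve reveals more wiggles of the same character, which is the visual signature of the non-integer dimension.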
Georg Cantor (1883)
Partially inspired by Weierstrass’ discovery, Georg Cantor (1845 – 1918) published an example of an unusual ternary set in 1883 in “Grundlagen einer allgemeinen Mannigfaltigkeitslehre” (“Foundations of a General Theory of Aggregates”). The set generates a function (the Cantor staircase) that has a derivative equal to zero almost everywhere, yet climbs continuously from zero to unity. It is a striking example of a function that is not equal to the integral of its derivative! Cantor demonstrated that the size of his set is $2^{\aleph_0}$, the cardinality of the continuum, which is the cardinality of the real numbers. But whereas the real numbers are uniformly distributed, Cantor’s set is “clumped”. This clumpiness is an essential feature that distinguishes it from the one-dimensional number line, and it raised important questions about dimensionality. The fractal dimension of the ternary Cantor set is DH = ln(2)/ln(3) = 0.6309.
Fig. 2 The 1883 Cantor set (below) and the Cantor staircase (above, as the indefinite integral over the set).
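The middle-thirds construction behind Fig. 2 takes only a few lines. This is a sketch under the usual assumption that each generation removes the open middle third of every remaining interval; the function name is hypothetical, and exact fractions are used so that the shrinking total length is not blurred by rounding.

```python
import math
from fractions import Fraction

def cantor_intervals(generations):
    """Closed intervals left after repeatedly removing open middle thirds."""
    intervals = [(Fraction(0), Fraction(1))]
    for _ in range(generations):
        next_intervals = []
        for left, right in intervals:
            third = (right - left) / 3
            next_intervals.append((left, left + third))    # keep left third
            next_intervals.append((right - third, right))  # keep right third
        intervals = next_intervals
    return intervals

for g in range(1, 6):
    pieces = cantor_intervals(g)
    total_length = sum(r - l for l, r in pieces)
    # Piece count doubles while total length shrinks by 2/3 each generation.
    print(g, len(pieces), total_length)

cantor_dimension = math.log(2) / math.log(3)
print(round(cantor_dimension, 4))
```

Two copies at one-third scale in every generation is exactly what gives the ratio ln(2)/ln(3), while the total length heads to zero.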
Giuseppe Peano (1890)
In 1877, in a letter to his friend Richard Dedekind, Cantor showed that there was a one-to-one correspondence between the real numbers and the points in any n-dimensional space. He was so surprised by his own result that he wrote to Dedekind “I see it, but I don’t believe it.” The solid concepts of dimension and dimensionality were dissolving before his eyes. What does it mean to trace the path of a trajectory in an n-dimensional space, if all the points in n dimensions were just numbers on a line? What could such a trajectory look like? A graphic example of a plane-filling path was constructed in 1890 by Peano, who was a peripatetic mathematician with interests that wandered broadly across the landscape of the mathematical problems of his day, usually ahead of his time. Only two years after he had axiomatized linear vector spaces, Peano constructed a continuous curve that filled space.
The construction of Peano’s curve proceeds by taking a square and dividing it into 9 equal sub squares. Lines connect the centers of each of the sub squares. Then each sub square is divided again into 9 sub squares whose centers are all connected by lines. At this stage, the original pattern, repeated 9 times, is connected together by 8 links, forming a single curve. This process is repeated infinitely many times, resulting in a curve that passes through every point of the original plane square. In this way, a line is made to fill a plane. Where Cantor had proven abstractly that the cardinality of the real numbers was the same as the points in n-dimensional space, Peano created a specific example. This was followed quickly by another construction, invented by David Hilbert in 1891, that divided the square into four instead of nine, simplifying the construction, but also showing that such constructions were easily generated.
Fig. 3 Peano’s (1890) and Hilbert’s (1891) plane-filling curves. When the iterations are taken to infinity, the curves approach every point of two-dimensional space arbitrarily closely, giving them a dimension DH = DE = 2, although their topological dimensions are DT = 1.
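A finite stage of Hilbert’s construction can be generated with a few lines of integer arithmetic. The sketch below uses the widely circulated index-to-coordinate bit-manipulation routine rather than Hilbert’s original geometric description, and the function names are chosen here for illustration only.

```python
def hilbert_point(order, d):
    """Map index d (0 <= d < 4**order) to (x, y) on a 2**order grid."""
    x = y = 0
    t = d
    s = 1
    while s < 2 ** order:
        rx = 1 & (t // 2)
        ry = 1 & (t ^ rx)
        if ry == 0:
            if rx == 1:
                # Flip the quadrant built so far.
                x = s - 1 - x
                y = s - 1 - y
            x, y = y, x  # Transpose it.
        x += s * rx
        y += s * ry
        t //= 4
        s *= 2
    return x, y

def hilbert_curve(order):
    """List of grid points visited by the order-th stage of the curve."""
    return [hilbert_point(order, d) for d in range(4 ** order)]

for order in (1, 2, 3):
    path = hilbert_curve(order)
    print(order, len(path), path[:4])
```

Each stage visits every cell of the grid exactly once while only ever stepping to a neighboring cell, so refining the grid without limit pushes the path arbitrarily close to every point of the square.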
Helge von Koch (1904)
The space-filling curves of Peano and Hilbert have the extreme property that a one-dimensional curve approaches every point in a two-dimensional space. This ability of a one-dimensional trajectory to fill space mirrored the ergodic hypothesis that Boltzmann relied upon as he developed statistical mechanics. These examples by Peano, Hilbert and Boltzmann inspired searches for continuous curves whose dimensionality similarly exceeded one dimension, yet without filling space. Weierstrass’ Monster was already one such curve, existing in some dimension greater than one but not filling the plane. The construction of the Monster required infinite series of harmonic functions, and the resulting curve was single valued on its domain of real numbers.
An alternative approach was proposed by Helge von Koch (1870 – 1924), a Swedish mathematician with an interest in number theory. He suggested in 1904 that a set of straight line segments could be joined together, and then shrunk by a scale factor to act as new segments of the original pattern. The construction of the Koch curve is shown in Fig. 4. When the process is taken to its limit, it produces a curve, differentiable nowhere, which snakes through two dimensions. When three such curves are joined around the sides of an equilateral triangle, the resulting six-pointed figure resembles a snowflake, and the construction is known as “Koch’s Snowflake”.
The Koch curve begins in generation 1 with N0 = 4 elements. These are shrunk by a factor of b = 1/3 to become the four elements of the next generation, and so on. The number of elements varies with the observation scale according to the equation
$$N(b) \propto b^{-D}$$
where D is called the fractal dimension. In the example of the Koch curve, the fractal dimension is
$$D = \frac{\ln 4}{\ln 3} \approx 1.26$$
which is a number less than its embedding dimension DE = 2. The fractal is embedded in 2D but has a fractional dimension that is greater than its topological dimension DT = 1.
Fig. 4 Generation of a Koch curve (1904). The fractal dimension is D = ln(4)/ln(3) = 1.26. At each stage, four elements are reduced in size by a factor of 3. The “length” of the curve approaches infinity as the features get smaller and smaller. But the scaling of the length with size is determined uniquely by the fractal dimension.
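The construction in Fig. 4 can be sketched in a few lines. This is a minimal illustration, assuming the curve starts from the unit interval and that points in the plane are stored as complex numbers; the function name `koch_curve` is hypothetical. Each generation replaces every segment by four segments one third as long, so the total length grows as (4/3) raised to the generation number.

```python
import numpy as np


def koch_curve(generations):
    """Vertices of the Koch curve (as complex numbers) after a number of generations."""
    pts = np.array([0.0 + 0.0j, 1.0 + 0.0j])
    rot = np.exp(1j * np.pi / 3)  # 60 degree rotation that builds the central bump
    for _ in range(generations):
        a = pts[:-1]
        d = (pts[1:] - a) / 3  # one third of each segment
        new = np.empty(4 * len(a) + 1, dtype=complex)
        new[0:-1:4] = a
        new[1:-1:4] = a + d
        new[2:-1:4] = a + d + d * rot
        new[3:-1:4] = a + 2 * d
        new[-1] = pts[-1]
        pts = new
    return pts


if __name__ == "__main__":
    for g in range(6):
        pts = koch_curve(g)
        length = np.sum(np.abs(np.diff(pts)))
        print(g, len(pts) - 1, length)  # segment count and total length
    print("D =", np.log(4) / np.log(3))
```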
Waclaw Sierpinski (1915)
Waclaw Sierpinski (1882 – 1969) was a Polish mathematician studying at the Jagiellonian University in Krakow for his doctorate when he came across a theorem that every point in the plane can be defined by a single coordinate. Intrigued by such an unintuitive result, he dived deep into Cantor’s set theory after he was appointed as a faculty member at the university in Lvov. He began to construct curves that had more specific properties than the Peano or Hilbert curves, such as a curve that passes through every interior point of a unit square but that encloses an area that is only equal to 5/12 = 0.4167. Sierpinski became interested in the topological properties of such sets.
Sierpinski considered how to define a curve that was embedded in DE = 2 but that was NOT constructed as a topological dimension DT = 1 curve as the curves of Peano, Hilbert, Koch (and even his own) had been. To demonstrate this point, he described a construction that began with a topological dimension DT = 2 object, a planar triangle, from which the open set of its central inverted triangle is removed, leaving its boundary points. The process is continued iteratively to all scales. The resulting point set is shown in Fig. 5 and is called the Sierpinski gasket. What is left after all the internal triangles are removed is a point set that can be made discontinuous by cutting it at a finite set of points. This is shown in Fig. 5 by the red circles. Each circle, no matter the size, cuts the set at three points, making the resulting set discontinuous. Ten years later, Karl Menger would show that this property of discontinuous cuts determined the topological dimension of the Sierpinski gasket to be DT = 1. The embedding dimension is of course DE = 2, and the fractal dimension of the Sierpinski gasket is
$$D_H = \frac{\ln 3}{\ln 2} \approx 1.5850$$
Fig. 5 The Sierpinski gasket. The central triangle is removed (leaving its boundary) at each scale. The pattern is self-similar with a fractal dimension DH = 1.5850. Unintuitively, it has a topological dimension DT = 1.
Felix Hausdorff (1918)
The work by Cantor, Peano, von Koch and Sierpinski had created a crisis in geometry as mathematicians struggled to rescue concepts of dimensionality. An important byproduct of that struggle was a much deeper understanding of concepts of space, especially in the hands of Felix Hausdorff.
Felix Hausdorff (1868 – 1942) was born in Breslau, Prussia, and educated in Leipzig. In his early years as a doctoral student, and as an assistant professor at Leipzig, he was a practicing mathematician by day and a philosopher and playwright by night, publishing under the pseudonym Paul Mongré. He was at the University of Greifswald working on set theory when the Greek mathematician Constantin Carathéodory published a paper in 1914 that showed how to construct a p-dimensional set in a q-dimensional space. Hausdorff realized that he could apply similar ideas to the Cantor set. He showed that the outer measure of the Cantor set would jump discontinuously from infinity to zero as the fractional dimension increased smoothly. The critical value where the measure changed its character became known as the Hausdorff dimension.
For the Cantor ternary set, the Hausdorff dimension is exactly DH = ln(2)/ln(3) = 0.6309. This value for the dimension is less than the embedding dimension DE = 1 of the support (the real numbers on the interval [0, 1]), but it is also greater than DT = 0 which would hold for a countable number of points on the interval. The work by Hausdorff became well known in the mathematics community who applied the idea to a broad range of point sets like Weierstrass’s monster and the Koch curve.
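The jump in the measure can be illustrated with a small calculation. The sketch below is a simplification, not Hausdorff's full definition: it uses only the natural covering of the Cantor set at generation n (2^n intervals of length 3^-n), whereas the Hausdorff measure takes an infimum over all coverings. The function name `cantor_cover_sum` is an illustrative choice. The sum of the interval lengths raised to the power s collapses to zero above the critical dimension, blows up below it, and stays at one exactly at DH = ln(2)/ln(3).

```python
import math


def cantor_cover_sum(s, n):
    """Sum of (interval length)**s over the 2**n intervals of generation n."""
    return (2.0 * 3.0 ** (-s)) ** n


D = math.log(2) / math.log(3)  # critical dimension of the Cantor ternary set

if __name__ == "__main__":
    for s in (D - 0.05, D, D + 0.05):
        print(round(s, 4), [cantor_cover_sum(s, n) for n in (10, 100, 200)])
```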
It is important to keep a historical perspective on what was understood before and after Hausdorff’s work. For instance, although the curves of Weierstrass, von Koch and Sierpinski were understood to present a challenge to concepts of dimension, it was only after Hausdorff that mathematicians began to think in terms of fractional dimensions and to calculate the fractional dimensions of these earlier point sets. Despite the fact that Sierpinski created one of the most iconic fractals that we use as an example every day, he was unaware at the time that he was doing so. His interest was topological: creating a curve for which any cut at any point would create disconnected subsets, starting with objects (triangles) with topological dimension DT = 2. In this way, talking about the early fractal objects tends to be anachronistic, using language to describe them that had not yet been invented at that time.
This perspective is also true for the ideas of topological dimension. For instance, even Sierpinski was not fully tuned into the problems of defining topological dimension. It turns out that what he created was a curve of topological dimension DT = 1, but that would only become clear later with the work of the Austrian mathematician Karl Menger.
Karl Menger (1926)
The day that Karl Menger (1902 – 1985) was born, his father, Carl Menger (1840 – 1921), lost his job. Carl Menger was one of the founders of the famous Viennese school that established the marginalist view of economics. However, Carl was not married to Karl’s mother, which was frowned upon by polite Vienna society, so he had to relinquish his professorship. Despite his father’s reduction in status, Karl received an excellent education at a Viennese gymnasium (high school). Among his classmates were Wolfgang Pauli (Nobel Prize for Physics in 1945) and Richard Kuhn (Nobel Prize for Chemistry in 1938). When Karl began attending the University of Vienna he studied physics, but the mathematics professor Hans Hahn opened his eyes to the fascinating work on analysis that was transforming mathematics at that time, so Karl shifted his studies to mathematical analysis, specifically concerning conceptions of “curves”.
Menger made important contributions to the history of fractal dimension as well as the history of topological dimension. In his approach to defining the intrinsic topological dimension of a point set, he described the construction of a point set embedded in three dimensions that had zero volume, an infinite surface area, and a fractal dimension between 2 and 3. The object is shown in Fig. 6 and is called a Menger “sponge”. The Menger sponge is a fractal with a fractal dimension DH = ln(20)/ln(3) = 2.7268. The face of the sponge is also known as the Sierpinski carpet. The fractal dimension of the Sierpinski carpet is DH = ln(8)/ln(3) = 1.8928.
Fig. 6 Menger Sponge. Embedding dimension DE = 3. Fractal dimension DH = ln(20)/ln(3) = 2.7268. Topological dimension DT = 1: all one-dimensional metric spaces can be contained within the Menger sponge point set. Each face is a Sierpinski carpet with fractal dimension DH = ln(8)/ln(3) = 1.8928.
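All of the dimensions quoted so far follow from one rule for exactly self-similar sets: if the set is made of N copies of itself, each shrunk by a factor b, then D = ln(N)/ln(1/b). The short sketch below tabulates the values from the text. The function name `similarity_dimension` and the dictionary layout are illustrative, and the rule is assumed to apply only because these constructions are exactly self-similar.

```python
import math


def similarity_dimension(copies, scale):
    """Dimension of a set built from `copies` replicas of itself, each shrunk by `scale`."""
    return math.log(copies) / math.log(1.0 / scale)


# (number of copies, shrink factor) for each construction in the text
fractals = {
    "Cantor set": (2, 1 / 3),
    "Koch curve": (4, 1 / 3),
    "Sierpinski gasket": (3, 1 / 2),
    "Sierpinski carpet": (8, 1 / 3),
    "Menger sponge": (20, 1 / 3),
}

if __name__ == "__main__":
    for name, (copies, scale) in fractals.items():
        print(f"{name:18s} D = {similarity_dimension(copies, scale):.4f}")
```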
The striking feature of the Menger sponge is its topological dimension. Menger created a new definition of topological dimension that partially solved the crisis created by Cantor when he showed that every point on the unit square can be defined by a single coordinate. This had put a one-dimensional curve in one-to-one correspondence with a two-dimensional plane. Yet the topology of a 2-dimensional object is clearly different from the topology of a line. Menger found a simple definition that showed why 1D is different, topologically, from 2D, despite Cantor’s conundrum. The answer came from the idea of making cuts on a point set and seeing if the cut created disconnected subsets.
As a simple example, take a 1D line. The removal of a single point creates two disconnected sub-lines. The intersection of the cut with the line is 0-dimensional, and Menger showed that this defined the line as 1-dimensional. Similarly, a line cuts the unit square into two parts. The intersection of the cut with the plane is 1-dimensional, signifying that the plane is 2-dimensional. In other words, an (n-1)-dimensional intersection of the boundary of a small neighborhood with the point set indicates that the point set has a dimension of n. Generalizing this idea, looking at the Sierpinski gasket in Fig. 5, the boundary of a small circular region, if placed appropriately (as in the figure), intersects the Sierpinski gasket at three points of dimension zero. Hence, the topological dimension of the Sierpinski gasket is one-dimensional. Menger was likewise able to show that his sponge also had a topology that was one-dimensional, DT = 1, despite the embedding dimension of DE = 3. In fact, all 1-dimensional metric spaces can be fit inside a Menger Sponge.
Benoit Mandelbrot (1967)
Benoit Mandelbrot (1924 – 2010) was born in Warsaw and his family emigrated to Paris in 1935. He attended the Ecole Polytechnique where he studied under Gaston Julia (1893 – 1978) and Paul Levy (1886 – 1971). Both Julia and Levy made significant contributions to the field of self-similar point sets and made a lasting impression on Mandelbrot. He went to Cal Tech for a master’s degree in aeronautics and then a PhD in mathematical sciences from the University of Paris. In 1958 Mandelbrot joined the research staff of the IBM Thomas J. Watson Research Center in Yorktown Heights, New York where he worked for over 35 years on topics of information theory and economics, always with a view of properties of self-similar sets and time series.
In 1967 Mandelbrot published one of his early important papers on the self-similar properties of the coastline of Britain. He proposed that many natural features had statistical self similarity, which he applied to coastlines. He published the work as “How Long Is the Coast of Britain? Statistical Self-Similarity and Fractional Dimension” in Science magazine, where he showed that the length of the coastline diverged with a Hausdorff dimension equal to D = 1.25. Working at IBM, a world leader in computers, he had ready access to their power as well as their visualization capabilities. Therefore, he was one of the first to begin exploring the graphical character of self-similar maps and point sets.
During one of his sabbaticals at Harvard University he began exploring the properties of Julia sets (named after his former teacher at the Ecole Polytechnique). The Julia set is a self-similar point set that is easily visualized in the complex plane (two dimensions). As Mandelbrot studied the convergence or divergence of infinite series defined by the Julia mapping, he discovered an infinitely nested pattern that was both beautiful and complex. This has since become known as the Mandelbrot set.
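The test for convergence or divergence is simple to sketch. The code below assumes the standard quadratic map z → z² + c, which the text does not write out, along with the usual escape radius of 2. The function name `escape_time` and the iteration cap are illustrative choices. A value of c belongs (approximately) to the Mandelbrot set if the iteration starting from z = 0 never escapes.

```python
def escape_time(c, max_iter=100):
    """Number of iterations of z -> z*z + c before |z| exceeds 2 (max_iter if it never does)."""
    z = 0j
    for n in range(max_iter):
        z = z * z + c
        if abs(z) > 2.0:
            return n
    return max_iter


if __name__ == "__main__":
    # Coarse text rendering of the set in the complex plane
    for im in range(10, -11, -2):
        row = ""
        for re in range(-40, 11):
            c = complex(re / 20.0, im / 10.0)
            row += "#" if escape_time(c) == 100 else "."
        print(row)
```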
Later, in 1975, Mandelbrot coined the term fractal to describe these self-similar point sets, and he began to realize that these types of sets were ubiquitous in nature, ranging from the structure of trees and drainage basins, to the patterns of clouds and mountain landscapes. He published his highly successful and influential book The Fractal Geometry of Nature in 1982, introducing fractals to the wider public and launching a generation of hobbyists interested in computer-generated fractals. The rise of fractal geometry coincided with the rise of chaos theory that was aided by the same computing power. For instance, important geometric structures of chaos theory, known as strange attractors, have fractal geometry.
Appendix: Box Counting
When confronted by a fractal of unknown structure, one of the simplest methods to find the fractal dimension is through box counting. This method is shown in Fig. 8. The fractal set is covered by a set of boxes of size b, and the number of boxes that contain at least one point of the fractal set is counted. As the boxes are reduced in size, the number of covering boxes increases as
$$N(b) \propto b^{-D}$$
To be numerically accurate, this method must be iterated over several orders of magnitude. The number of boxes covering a fractal has this characteristic power law dependence, as shown in Fig. 8, and the fractal dimension is obtained as the slope of log N(b) plotted against log(1/b).
Fig. 8 Calculation of the fractal dimension using box counting. At each generation, the size of the grid is reduced by a factor of 3. The number of boxes that contain some part of the fractal curve increases as $N(b) \propto b^{-D}$, where b is the scale.
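Box counting is straightforward to try on a set whose dimension is already known. The sketch below is a minimal version under stated assumptions: the test set is a Sierpinski carpet built on an integer grid, the box sizes shrink by factors of 3 as in Fig. 8, and the names `carpet_points`, `box_count` and `box_counting_dimension` are hypothetical. The fitted slope should come out close to ln(8)/ln(3) = 1.8928.

```python
import numpy as np


def carpet_points(level):
    """Integer coordinates of the filled cells of a Sierpinski carpet on a 3**level grid."""
    offsets = np.array([(i, j) for i in range(3) for j in range(3) if (i, j) != (1, 1)])
    pts = np.array([[0, 0]])
    for _ in range(level):
        pts = (3 * pts[:, None, :] + offsets[None, :, :]).reshape(-1, 2)
    return pts


def box_count(points, box_size):
    """Number of boxes of side box_size that contain at least one point."""
    return len(np.unique(points // box_size, axis=0))


def box_counting_dimension(points, box_sizes):
    """Slope of log N(b) against log(1/b) from a least-squares line fit."""
    counts = [box_count(points, b) for b in box_sizes]
    inv_scale = 1.0 / np.array(box_sizes, dtype=float)
    slope, _ = np.polyfit(np.log(inv_scale), np.log(counts), 1)
    return slope


if __name__ == "__main__":
    pts = carpet_points(5)
    sizes = [1, 3, 9, 27, 81]
    print([box_count(pts, b) for b in sizes])
    print("D =", box_counting_dimension(pts, sizes))
```

For a fractal of unknown structure the points would come from data rather than a construction, and the fit should be restricted to the range of scales where the log-log plot is actually straight.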
Hardy, G. (1916). “Weierstrass’s non-differentiable function.” Transactions of the American Mathematical Society 17: 301-325.
Weierstrass, K. (1872). “Über continuirliche Functionen eines reellen Arguments, die für keinen Werth des letzteren einen bestimmten Differentialquotienten besitzen.” Communication à l’Académie Royale des Sciences II: 71-74.
Besicovitch, A. S. and H. D. Ursell (1937). “Sets of fractional dimensions: On dimensional numbers of some continuous curves.” J. London Math. Soc. 1(1): 18-25.
Shen, W. (2018). “Hausdorff dimension of the graphs of the classical Weierstrass functions.” Mathematische Zeitschrift 289(1-2): 223-266.
Cantor, G. (1883). Grundlagen einer allgemeinen Mannigfaltigkeitslehre. Leipzig, B. G. Teubner.
Peano, G. (1890). “Sur une courbe qui remplit toute une aire plane.” Mathematische Annalen 36: 157-160.
Peano, G. (1888). Calcolo geometrico secondo l’Ausdehnungslehre di H. Grassmann e preceduto dalle operazioni della logica deduttiva. Turin, Fratelli Bocca Editori.
Von Koch, H. (1904). “Sur une courbe continue sans tangente obtenue par une construction géométrique élémentaire.” Arkiv för Matematik, Astronomi och Fysik 1: 681-704.
Sierpinski, W. (1915). “Sur une courbe dont tout point est un point de ramification.” Comptes Rendus Hebdomadaires des Séances de l’Académie des Sciences de Paris 160: 302-305.
Carathéodory, C. (1914). “Über das lineare Mass von Punktmengen – eine Verallgemeinerung des Längenbegriffs.” Gött. Nachr. IV: 404-406.
Hausdorff, F. (1919). “Dimension und äußeres Maß.” Mathematische Annalen 79: 157-179.
Menger, K. (1926). “Allgemeine Räume und Cartesische Räume. I.” Communications to the Amsterdam Academy of Sciences. English translation reprinted in Edgar, Gerald A., ed. (2004), Classics on Fractals, Studies in Nonlinearity, Westview Press, Advanced Book Program, Boulder, CO.
Mandelbrot, B. (1967). “How Long Is the Coast of Britain? Statistical Self-Similarity and Fractional Dimension.” Science 156(3775): 636-638.
A chief principle of chaos theory states that even simple systems can display complex dynamics. All that is needed for chaos, roughly, is for a system to have at least three dynamical variables plus some nonlinearity.
A classic example of chaos is the driven damped pendulum. This is a mass at the end of a massless rod driven by a sinusoidal perturbation. The three variables are the angle, the angular velocity and the phase of the sinusoidal drive. The nonlinearity is provided by the cosine function in the potential energy which is anharmonic for large angles. However, the driven damped pendulum is not an autonomous system, because the drive is an external time-dependent function. To find an autonomous system—one that persists in complex motion without any external driving function—one needs only to add one more mass to a simple pendulum to create what is known as a compound pendulum, or a double pendulum.
Daniel Bernoulli and the Discovery of Normal Modes
After the invention of the calculus by Newton and Leibniz, the first wave of calculus practitioners (Leibniz, Jakob and Johann Bernoulli and von Tschirnhaus) focused on static problems, like the functional form of the catenary (the shape of a hanging chain), or on constrained problems, like the brachistochrone (the path of least time for a mass under gravity to move between two points) and the tautochrone (the path of equal time).
The next generation of calculus practitioners (Euler, Johann and Daniel Bernoulli, and D’Alembert) focused on finding the equations of motion of dynamical systems. One of the simplest of these, that yielded the earliest equations of motion as well as the first identification of coupled modes, was the double pendulum. The double pendulum, in its simplest form, is a mass on a rigid massless rod attached to another mass on a massless rod. For small-angle motion, this is a simple coupled oscillator.
Daniel Bernoulli, the son of Johann I Bernoulli, was the first to study the double pendulum, publishing a paper on the topic in 1733 in the proceedings of the Academy in St. Petersburg just as he returned from Russia to take up a post permanently in his home town of Basel, Switzerland. Because he was a physicist first and mathematician second, he performed experiments with masses on strings to attempt to understand the qualitative as well as quantitative behavior of the two-mass system. He discovered that for small motions there was a symmetric behavior that had a low frequency of oscillation and an antisymmetric motion that had a higher frequency of oscillation. Furthermore, he recognized that any general motion of the double pendulum was a combination of the fundamental symmetric and antisymmetric motions. This work by Daniel Bernoulli represents the discovery of normal modes of coupled oscillators. It is also the first statement of the combination of motions that he would use later (1753) to express for the first time the principle of superposition.
Superposition is one of the guiding principles of linear physical systems. It provides a means for the solution of differential equations. It explains the existence of eigenmodes and their eigenfrequencies. It is the basis of all interference phenomena, whether classical like Young’s double-slit experiment or quantum like Schrödinger’s cat. Today, superposition has taken center stage in quantum information sciences and helps define the spooky (and useful) properties of quantum entanglement. Therefore, normal modes, composition of motion, and superposition of harmonics on a musical string all date back to Daniel Bernoulli in the twenty years between 1733 and 1753. (Daniel Bernoulli is also the originator of the Bernoulli principle that explains why birds and airplanes fly.)
Johann Bernoulli and the Equations of Motion
Daniel Bernoulli’s father was Johann I Bernoulli. Daniel had been tutored by Johann, along with his friend Leonhard Euler, when Daniel was young. But as Daniel matured as a mathematician, he and his father began to compete against each other in international mathematics competitions (which were very common in the early eighteenth century). When Daniel beat his father in a competition sponsored by the French Academy, Johann threw Daniel out of his house and their relationship remained strained for the remainder of their lives.
Johann had a history of taking ideas from Daniel and never citing the source. For instance, when Johann published his work on equations of motion for masses on strings in 1742, he built on the work of his son Daniel from 1733 but never once mentioned it. Daniel, of course, was not happy.
In a letter dated 20 October 1742 that Daniel wrote to Euler, he said, “The collected works of my father are being printed, and I have just learned that he has inserted, without any mention of me, the dynamical problems I first discovered and solved (such as e.g. the descent of a sphere on a moving triangle; the linked pendulum, the center of spontaneous rotation, etc.).” And on 4 September 1743, when Daniel had finally seen his father’s works in print, he said, “The new mechanical problems are mostly mine, and my father saw my solutions before he solved the problems in his way …”.
Daniel clearly has the priority for the discovery of the normal modes of the linked (i.e. double or compound) pendulum, but Johann often would “improve” on Daniel’s work despite giving no credit for the initial work. As a mathematician, Johann had a more rigorous approach and could delve a little deeper into the math. For this reason, it was Johann in 1742 who came closest to writing down differential equations of motion for multi-mass systems, though he fell just short. It was D’Alembert only one year later who first wrote down the differential equations of motion for systems of masses and extended it to the loaded string for which he was the first to derive the wave equation. The D’Alembertian operator is today named after him.
Double Pendulum Dynamics
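The general dynamics of the double pendulum are best obtained from Lagrange’s equations of motion. However, setting up the Lagrangian takes careful thought, because the kinetic energy of the second mass depends on its absolute speed which is dependent on the motion of the first mass from which it is suspended. The velocity of the second mass is obtained through vector addition of velocities. With the angles θ1 and θ2 measured from the vertical, masses M1 and M2, and rod lengths R1 and R2, the kinetic energy is
$$T = \frac{1}{2} M_1 R_1^2 \dot{\theta}_1^2 + \frac{1}{2} M_2 \left[ R_1^2 \dot{\theta}_1^2 + R_2^2 \dot{\theta}_2^2 + 2 R_1 R_2 \dot{\theta}_1 \dot{\theta}_2 \cos(\theta_1 - \theta_2) \right]$$
The potential energy of the system is
$$U = -(M_1 + M_2) g R_1 \cos\theta_1 - M_2 g R_2 \cos\theta_2$$
so that the Lagrangian is
$$L = \frac{1}{2} (M_1 + M_2) R_1^2 \dot{\theta}_1^2 + \frac{1}{2} M_2 R_2^2 \dot{\theta}_2^2 + M_2 R_1 R_2 \dot{\theta}_1 \dot{\theta}_2 \cos(\theta_1 - \theta_2) + (M_1 + M_2) g R_1 \cos\theta_1 + M_2 g R_2 \cos\theta_2$$
The partial derivatives are
$$\frac{\partial L}{\partial \theta_1} = -M_2 R_1 R_2 \dot{\theta}_1 \dot{\theta}_2 \sin(\theta_1 - \theta_2) - (M_1 + M_2) g R_1 \sin\theta_1$$
$$\frac{\partial L}{\partial \theta_2} = M_2 R_1 R_2 \dot{\theta}_1 \dot{\theta}_2 \sin(\theta_1 - \theta_2) - M_2 g R_2 \sin\theta_2$$
$$\frac{\partial L}{\partial \dot{\theta}_1} = (M_1 + M_2) R_1^2 \dot{\theta}_1 + M_2 R_1 R_2 \dot{\theta}_2 \cos(\theta_1 - \theta_2)$$
$$\frac{\partial L}{\partial \dot{\theta}_2} = M_2 R_2^2 \dot{\theta}_2 + M_2 R_1 R_2 \dot{\theta}_1 \cos(\theta_1 - \theta_2)$$
and the time derivatives of the last two expressions are
$$\frac{d}{dt}\frac{\partial L}{\partial \dot{\theta}_1} = (M_1 + M_2) R_1^2 \ddot{\theta}_1 + M_2 R_1 R_2 \ddot{\theta}_2 \cos(\theta_1 - \theta_2) - M_2 R_1 R_2 \dot{\theta}_2 (\dot{\theta}_1 - \dot{\theta}_2) \sin(\theta_1 - \theta_2)$$
$$\frac{d}{dt}\frac{\partial L}{\partial \dot{\theta}_2} = M_2 R_2^2 \ddot{\theta}_2 + M_2 R_1 R_2 \ddot{\theta}_1 \cos(\theta_1 - \theta_2) - M_2 R_1 R_2 \dot{\theta}_1 (\dot{\theta}_1 - \dot{\theta}_2) \sin(\theta_1 - \theta_2)$$
Therefore, the equations of motion are
$$(M_1 + M_2) R_1 \ddot{\theta}_1 + M_2 R_2 \ddot{\theta}_2 \cos(\theta_1 - \theta_2) + M_2 R_2 \dot{\theta}_2^2 \sin(\theta_1 - \theta_2) + (M_1 + M_2) g \sin\theta_1 = 0$$
$$R_2 \ddot{\theta}_2 + R_1 \ddot{\theta}_1 \cos(\theta_1 - \theta_2) - R_1 \dot{\theta}_1^2 \sin(\theta_1 - \theta_2) + g \sin\theta_2 = 0$$
To get a sense of how this system behaves, we can make a small-angle approximation to linearize the equations to find the lowest-order normal modes. In the small-angle approximation, the equations of motion become
$$(M_1 + M_2) R_1 \ddot{\theta}_1 + M_2 R_2 \ddot{\theta}_2 + (M_1 + M_2) g \theta_1 = 0$$
$$R_1 \ddot{\theta}_1 + R_2 \ddot{\theta}_2 + g \theta_2 = 0$$
Assuming a harmonic time dependence at angular frequency ω, these have a nontrivial solution only where the determinant vanishes, where the determinant is
$$\begin{vmatrix} (M_1 + M_2)(g - R_1 \omega^2) & -M_2 R_2 \omega^2 \\ -R_1 \omega^2 & g - R_2 \omega^2 \end{vmatrix} = M_1 R_1 R_2 \omega^4 - (M_1 + M_2)(R_1 + R_2) g \omega^2 + (M_1 + M_2) g^2 = 0$$
This quartic equation is quadratic in ω² and the quadratic solution is
$$\omega^2 = \frac{g}{2 M_1 R_1 R_2} \left[ (M_1 + M_2)(R_1 + R_2) \pm \sqrt{(M_1 + M_2)^2 (R_1 + R_2)^2 - 4 M_1 (M_1 + M_2) R_1 R_2} \right]$$
This solution is still a little opaque, so taking the special case R = R1 = R2 and M = M1 = M2, it becomes
$$\omega^2 = \left( 2 \pm \sqrt{2} \right) \frac{g}{R}$$
There are two normal modes. The low-frequency mode is symmetric as both masses swing (mostly) together, while the higher frequency mode is antisymmetric with the two masses oscillating against each other. These are the motions that Daniel Bernoulli discovered in 1733.
It is interesting to note that if the string were rigid, so that the two angles were the same, then the lowest frequency would be given by ω² = (3/5) g/R, which is within 2% of the above frequency but is certainly not equal. This tells us that there is a slightly different angular deflection for the second mass relative to the first.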
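The normal modes can be checked numerically. The sketch below assumes the standard small-angle equations of the double pendulum (written out in the code comments, since they are an assumption of this example), and the function name `small_angle_modes` is hypothetical. For equal masses and lengths it returns ω² = 2 ± √2 in units of g/R, with the second mass swinging √2 times as far as the first in the symmetric mode and in the opposite direction in the antisymmetric mode.

```python
import numpy as np


def small_angle_modes(M1=1.0, M2=1.0, R1=1.0, R2=1.0, g=1.0):
    """Squared normal-mode frequencies and mode shapes of the double pendulum.

    Assumed small-angle equations of motion:
        (M1+M2) R1 th1'' + M2 R2 th2'' + (M1+M2) g th1 = 0
                 R1 th1'' +    R2 th2'' +         g th2 = 0
    """
    inertia = np.array([[(M1 + M2) * R1, M2 * R2], [R1, R2]])
    stiffness = np.array([[(M1 + M2) * g, 0.0], [0.0, g]])
    # th'' = -A th with A = inertia^-1 stiffness, so the eigenvalues of A are omega^2
    w2, vecs = np.linalg.eig(np.linalg.solve(inertia, stiffness))
    order = np.argsort(w2.real)
    return w2.real[order], vecs.real[:, order]


if __name__ == "__main__":
    w2, vecs = small_angle_modes()
    print("omega^2:", w2)
    print("theta2/theta1 per mode:", vecs[1] / vecs[0])
    print("rigid-rod estimate 3/5 versus low mode:", 0.6, w2[0])
```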
Chaos in the Double Pendulum
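The full expression for the nonlinear coupled dynamics is expressed in terms of four variables (θ1, θ2, ω1, ω2). In units where M1 = M2 = 1, R1 = R2 = 1 and g = 1 (the values used in the Python code below), the dynamical equations are
$$2 \dot{\omega}_1 + \dot{\omega}_2 \cos(\theta_1 - \theta_2) + \omega_2^2 \sin(\theta_1 - \theta_2) + 2 \sin\theta_1 = 0$$
$$\dot{\omega}_2 + \dot{\omega}_1 \cos(\theta_1 - \theta_2) - \omega_1^2 \sin(\theta_1 - \theta_2) + \sin\theta_2 = 0$$
These can be put into the normal form for a four-dimensional flow as
$$\dot{\theta}_1 = \omega_1$$
$$\dot{\theta}_2 = \omega_2$$
$$\dot{\omega}_1 = \frac{\omega_2^2 \sin(\theta_2 - \theta_1) - 2 \sin\theta_1 + \omega_1^2 \sin(\theta_2 - \theta_1) \cos(\theta_2 - \theta_1) + \sin\theta_2 \cos(\theta_2 - \theta_1)}{2 - \cos^2(\theta_2 - \theta_1)}$$
$$\dot{\omega}_2 = \frac{\omega_2^2 \sin(\theta_2 - \theta_1) \cos(\theta_2 - \theta_1) - 2 \sin\theta_1 \cos(\theta_2 - \theta_1) + 2 \omega_1^2 \sin(\theta_2 - \theta_1) + 2 \sin\theta_2}{\cos^2(\theta_2 - \theta_1) - 2}$$
The numerical solution of these equations produces a complex interplay between the angle of the first mass and the angle of the second mass. Examples of trajectory projections in configuration space are shown in Fig. 3 for E = 1. The horizontal is the first angle, and the vertical is the angle of the second mass.
The dynamics in state space are four dimensional, which makes them difficult to visualize directly. Using the technique of the Poincaré first-return map, the four-dimensional trajectories can be viewed as a two-dimensional plot where the trajectories pierce the Poincaré plane. Poincaré sections are shown in Fig. 4.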
Python Code: DoublePendulum.py
# -*- coding: utf-8 -*-
"""
Created on Oct 16 06:03:32 2020
"Introduction to Modern Dynamics" 2nd Edition (Oxford, 2019)
"""
import numpy as np
from scipy import integrate
from matplotlib import pyplot as plt

E = 1.       # Initial kinetic energy. Try 0.8 to 1.5

def flow_deriv(x_y_z_w, tspan):
    # x, y: angles of the first and second mass; z, w: their angular velocities
    x, y, z, w = x_y_z_w
    A = w**2*np.sin(y-x)
    B = -2*np.sin(x)
    C = z**2*np.sin(y-x)*np.cos(y-x)
    D = np.sin(y)*np.cos(y-x)
    EE = 2 - (np.cos(y-x))**2
    FF = w**2*np.sin(y-x)*np.cos(y-x)
    G = -2*np.sin(x)*np.cos(y-x)
    H = 2*z**2*np.sin(y-x)
    I = 2*np.sin(y)
    JJ = (np.cos(y-x))**2 - 2
    a = z
    b = w
    c = (A+B+C+D)/EE
    d = (FF+G+H+I)/JJ
    return [a, b, c, d]

repnum = 75
for reploop in range(repnum):
    # Random split of the kinetic energy E between the two angular velocities
    px1 = 2*(np.random.random()-0.499)*np.sqrt(E)
    py1 = -px1 + np.sqrt(2*E - px1**2)
    xp1 = 0   # Try 0.1
    yp1 = 0   # Try -0.2
    x_y_z_w0 = [xp1, yp1, px1, py1]
    tspan = np.linspace(1,1000,10000)
    x_t = integrate.odeint(flow_deriv, x_y_z_w0, tspan)
    siz = np.shape(x_t)[0]

    # Configuration-space trajectory for a few of the runs
    if reploop % 50 == 0:
        plt.figure(1)
        lines = plt.plot(x_t[:,0],x_t[:,1])

    # Wrap all four variables into the interval [-pi, pi)
    y1 = np.mod(x_t[:,0]+np.pi,2*np.pi) - np.pi
    y2 = np.mod(x_t[:,1]+np.pi,2*np.pi) - np.pi
    y3 = np.mod(x_t[:,2]+np.pi,2*np.pi) - np.pi
    y4 = np.mod(x_t[:,3]+np.pi,2*np.pi) - np.pi

    # Poincare section: sample the second mass each time the first angle crosses zero going up
    py = np.zeros(shape=(10*repnum,))
    yvar = np.zeros(shape=(10*repnum,))
    cnt = -1
    last = y1[1]
    for loop in range(2,siz):
        if (last < 0) and (y1[loop] > 0) and (cnt < py.size-1):
            cnt = cnt+1
            # Linear interpolation to the crossing point
            del1 = -y1[loop-1]/(y1[loop] - y1[loop-1])
            py[cnt] = y4[loop-1] + del1*(y4[loop]-y4[loop-1])
            yvar[cnt] = y2[loop-1] + del1*(y2[loop]-y2[loop-1])
        last = y1[loop]

    plt.figure(2)
    lines = plt.plot(yvar[:cnt+1],py[:cnt+1],'o',ms=1)

plt.show()
You can change the energy E near the top of the script and also the initial conditions xp1 and yp1 inside the main loop. The energy E is the initial kinetic energy imparted to the two masses. For a given initial condition, what happens to the periodic orbits as the energy E increases?
Daniel Bernoulli, “Theoremata de oscillationibus corporum filo flexili connexorum et catenae verticaliter suspensae,” Academiae Scientiarum Imperialis Petropolitanae, 6, 1732/1733
Truesdell, C. The rational mechanics of flexible or elastic bodies, 1638-1788. (Turici: O. Fussli, 1960). (This rare and artistically produced volume, that is almost impossible to find today in any library, is one of the greatest books written about the early history of dynamics.)
The task of figuring out who’s who in the Bernoulli family is a hard nut to crack. The Bernoulli name populates a dozen different theorems or physical principles in the history of science and mathematics, but each one was contributed by any of four or five different Bernoullis of different generations—brothers, uncles, nephews and cousins. What makes the task even more difficult is that any given Bernoulli might be called by several different aliases, while many of them shared the same name across generations. To make things worse, they often worked and published on each other’s problems.
To attribute a theorem to a Bernoulli is not too different from attributing something to the famous mathematical consortium called Nicolas Bourbaki. It’s more like a team than an individual. But in the case of Bourbaki, the goal was selfless anonymity, while in the case of the Bernoullis it was sometimes the opposite: bald-faced competition and one-upmanship coupled with jealousy and resentment. Fortunately, the competition tended to breed more output rather than less, and the world benefited from the family feud.
The Bernoulli Family Tree
The Bernoullis are intimately linked with the beautiful city of Basel, Switzerland, situated on the Rhine River where it leaves Switzerland and forms the border between France and Germany. The family moved there from the Netherlands in the 1600s to escape the Spanish occupation.
The first Bernoulli born in Basel was Nikolaus Bernoulli (1623 – 1708), and he had four sons: Jakob I, Nikolaus, Johann I and Hieronymous I. The “I”s in this list refer to the fact, or the problem, that many of the immediate descendants took their father’s or uncle’s name. The long-lived family heritage in the roles of mathematician and scientist began with these four brothers. Jakob Bernoulli (1654 – 1705) was the eldest, followed by Nikolaus Bernoulli (1662 – 1717), Johann Bernoulli (1667 – 1748) and then Hieronymous (1669 – 1760). In this first generation of Bernoullis, the great mathematicians were Jakob and Johann. More mathematical equations today are named after Jakob, but Johann stands out because of the longevity of his contributions, the volume and impact of his correspondence, the fame of his students, and the number of offspring who also took up mathematics. Johann was also the worst when it came to jealousy and spitefulness—against his brother Jakob, whom he envied, and specifically against his son Daniel, whom he feared would eclipse him.
Jakob Bernoulli (aka James or Jacques or Jacob)
Jakob Bernoulli (1654 – 1705) was the eldest of the first generation of brothers and also the first to establish himself as a university professor. He held the chair of mathematics at the university in Basel. While his interests ranged broadly, he is known for his correspondences with Leibniz as he and his brother Johann were among the first mathematicians to apply Leibniz’ calculus to solving specific problems. The Bernoulli differential equation is named after him. It was one of the first general differential equations to be solved after the invention of the calculus. The Bernoulli inequality is one of the earliest attempts to find the Taylor expansion of exponentiation, which is also related to Bernoulli numbers, Bernoulli polynomials and the Bernoulli triangle. A special type of curve that looks like an ellipse with a twist in the middle is the lemniscate of Bernoulli.
Jakob was 13 years older than his brother Johann Bernoulli (1667 – 1748), and Jakob tutored Johann, who showed great promise, in mathematics. Unfortunately, Johann had that awkward combination of high self esteem with low self confidence, and he increasingly sought to show that he was better than his older brother. As both brothers began corresponding with Leibniz on the new calculus, they also began to compete with one another. Driven by his insecurity, Johann also began to steal ideas from his older brother and claim them for himself.
A classic example of this is the famous brachistochrone problem that was posed by Johann in the Acta Eruditorum in 1696. Johann at this time was a professor of mathematics at Groningen in the Netherlands. He challenged the mathematical world to find the path of least time for a mass to travel under gravity between two points. He had already found one solution himself and thought that no-one else would succeed. Yet when he heard his brother Jakob was responding to the challenge, he spied out his result and then claimed it as his own. Within a year and a half there were four additional solutions, all correct, using different approaches. One of the most famous responses was by Newton (who as usual did not give up his method) but who is reported to have solved the problem in a day. Others who contributed solutions were Gottfried Leibniz, Ehrenfried Walther von Tschirnhaus, and Guillaume de l’Hôpital in addition to Jakob.
The participation of de l’Hôpital in the challenge was a particular thorn in Johann’s side, because de l’Hôpital had years earlier paid Johann to tutor him in Leibniz’ new calculus at a time when l’Hôpital knew nothing of the topic. What is today known as l’Hôpital’s theorem on ratios of limits in fact was taught to l’Hôpital by Johann. Johann never forgave l’Hôpital for publicizing the result—but l’Hôpital had the discipline to write a textbook while Johann did not. To be fair, l’Hôpital did give Johann credit in the opening of his book, but that was not enough for Johann who continued to carry his resentment.
When Jakob died of tuberculosis in 1705, Johann campaigned to replace him in his position as professor of mathematics and succeeded. In that chair, Johann had many famous students (Euler foremost among them, but also Maupertuis and Clairaut). Part of Johann’s enduring fame stems from his many associations and extensive correspondences with many of the top mathematicians of the day. For instance, he had a regular correspondence with the mathematician Varignon, and it was in one of these letters that Johann proposed the principle of virtual velocities which became a key axiom for Joseph Lagrange’s later epic work on the foundations of mechanics (see Chapter 4 in Galileo Unbound).
Johann remained in his chair of mathematics at Basel for almost 40 years. This longevity, and the fame of his name, guaranteed that he taught some of the most talented mathematicians of the age, including his most famous student Leonhard Euler, who is held by some as one of the four greatest mathematicians of all time (the others were Archimedes, Newton and Gauss).
Nikolaus I Bernoulli
Nikolaus I Bernoulli (1687 – 1759, son of Nikolaus) was the cousin of Daniel and nephew to both Jakob and Johann. He was a well-known mathematician in his time (he briefly held Galileo’s chair in Padua), though few specific discoveries are attributed to him directly. He is perhaps most famous today for posing the “St. Petersburg Paradox” of economic game theory. Ironically, he posed this paradox while his cousin Nikolaus II Bernoulli (brother of Daniel Bernoulli) was actually in St. Petersburg with Daniel.
The St. Petersburg paradox is a simple game of chance played with a fair coin where a player must buy in at a certain price in order to place $2 in a pot that doubles each time the coin lands heads, and pays out the pot at the first tail. The average pay-out of this game has infinite expectation, so it seems that anyone should want to buy in at any cost. But most people would be unlikely to buy in even for a modest $25. Why? And is this perception correct? The answer was only partially provided by Nikolaus. The definitive answer was given by his cousin Daniel Bernoulli.
Daniel Bernoulli (1700 – 1782, son of Johann I) is my favorite Bernoulli. While most of the other Bernoullis were more mathematicians than scientists, Daniel Bernoulli was more physicist than mathematician. When we speak of “Bernoulli’s principle” today, the fundamental force that allows birds and airplanes to fly, we are referring to his work on hydrodynamics. He was one of the earliest originators of economic dynamics through his invention of the utility function and diminishing returns, and he was the first to clearly state the principle of superposition, which lies at the heart today of the physics of waves and quantum technology.
While in St. Petersburg, Daniel conceived of the solution to the St. Petersburg paradox (he is the one who actually named it). To explain why few people would pay high stakes to play the game, he devised a “utility function” that had “diminishing marginal utility” in which the willingness to play depended on one’s wealth. Obviously a wealthy person would be willing to pay more than a poor person. Daniel stated:
The determination of the value of an item must not be based on the price, but rather on the utility it yields…. There is no doubt that a gain of one thousand ducats is more significant to the pauper than to a rich man though both gain the same amount.
He created a log utility function that allowed one to calculate the highest stakes a person should be willing to take based on their wealth. Indeed, a millionaire may only wish to pay $20 per game to play, in part because the average payout over a few thousand games is only about $5 per game. It is only in the limit of an infinite number of games (and an infinite bank account by the casino) that the average payout diverges.
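The two effects described above can be sketched numerically. This is only an illustration under stated assumptions: the pot pays 2^k dollars with probability 2^-k (k = 1, 2, ...), the casino's bankroll is a hypothetical cap on the payout, and utility is taken to be the natural logarithm of total wealth. The function names `capped_expected_payout` and `max_stake` are made up for this example.

```python
import math

def capped_expected_payout(bankroll):
    """Expected payout when the casino can pay at most `bankroll` dollars.

    Assumed convention: the payout is 2**k with probability 2**-k, k = 1, 2, ...
    """
    total, k = 0.0, 1
    while 2.0**k < bankroll:
        total += 1.0            # each uncapped outcome contributes 2**k * 2**-k = 1
        k += 1
    # every remaining outcome pays the whole bankroll; their total probability is 2**-(k-1)
    return total + bankroll * 2.0**-(k - 1)

def max_stake(wealth, kmax=200, tol=1e-9):
    """Largest stake c that does not lower the expected log utility of one game.

    Solves sum_k 2**-k * ln(wealth - c + 2**k) = ln(wealth) by bisection.
    Assumes wealth > 4 so that the root lies between 0 and wealth.
    """
    def gain(c):
        expected = sum(2.0**-k * math.log(wealth - c + 2.0**k)
                       for k in range(1, kmax + 1))
        return expected - math.log(wealth)

    lo, hi = 0.0, wealth        # gain(0) > 0 and gain(wealth) < 0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if gain(mid) > 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

print("expected payout, casino bankroll 2**20:", capped_expected_payout(2**20))
for wealth in (1.0e3, 1.0e6):
    print("wealth", wealth, "-> maximum stake", round(max_stake(wealth), 2))
```

With a finite bankroll the expected payout grows only logarithmically with the size of the bank, and the log-utility stake grows only logarithmically with the player's wealth, which is why even a millionaire's fair price stays modest.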
Johann II Bernoulli
Daniel’s brother Johann II (1710 – 1790) published in 1736 one of the most important texts on the theory of light during the time between Newton and Euler. Although the work looks woefully anachronistic today, it provided one of the first serious attempts at understanding the forces acting on light rays and describing them mathematically. Euler based his new theory of light, published in 1746, on much of the work laid down by Johann II. Euler came very close to proposing a wave-like theory of light, complete with a connection between frequency of wave pulses and colors, that would have preempted Thomas Young by more than 50 years. Euler, Daniel and Johann II as well as Nikolaus II were all contemporaries as students of Johann I in Basel.
Over the years, there were many more Bernoullis who followed in the family tradition. Some of these include:
Johann II Bernoulli (1710–1790; also known as Jean), son of Johann, mathematician and physicist
Johann III Bernoulli (1744–1807; also known as Jean), son of Johann II, astronomer, geographer and mathematician
Jacob II Bernoulli (1759–1789; also known as Jacques), son of Johann II, physicist and mathematician
Johann Jakob Bernoulli (1831–1913), art historian and archaeologist; noted for his Römische Ikonographie (1882 onwards) on Roman Imperial portraits
Ludwig Bernoully (1873 – 1928), German architect in Frankfurt
Hans Bernoulli (1876–1959), architect and designer of the Bernoullihäuser in Zurich and Grenchen SO
Elisabeth Bernoulli (1873-1935), suffragette and campaigner against alcoholism.
Notable marriages into the Bernoulli family include the Curies (Pierre Curie was a direct descendant of Johann I) as well as the German author Hermann Hesse (married to a direct descendant of Johann I).
 Calinger, Ronald S.. Leonhard Euler : Mathematical Genius in the Enlightenment, Princeton University Press (2015).
 Euler L and Truesdell C. Leonhardi Euleri Opera Omnia. Series secunda: Opera mechanica et astronomica XI/2. The rational mechanics of flexible or elastic bodies 1638-1788. (Zürich: Orell Füssli, 1960).
 D Speiser, Daniel Bernoulli (1700-1782), Helvetica Physica Acta 55 (1982), 504-523.
 Leibniz GW. Briefwechsel zwischen Leibniz, Jacob Bernoulli, Johann Bernoulli und Nicolaus Bernoulli. (Hildesheim: Olms, 1971).
 Hakfoort C. Optics in the age of Euler : conceptions of the nature of light, 1700-1795. (Cambridge: Cambridge University Press, 1995).
Will the next extinction-scale asteroid strike the Earth in our lifetime?
This existential question—the question of our continued existence on this planet—is rhetorical, because there are far too many bodies in our solar system to accurately calculate all trajectories of all asteroids.
The solar system is what is known as an N-body problem. And even the N is not well determined. The asteroid belt alone has over a million extinction-sized asteroids, and there are tens of millions of smaller ones that could still do major damage to life on Earth if they hit. To have a hope of calculating even one asteroid trajectory, do we ignore planetary masses that are too small? What is too small? What if we only consider the Sun, the Earth and Jupiter? This is what Euler did in 1760, and he still had to make more assumptions.
Stability of the Solar System
Once Newton published his Principia, there was a pressing need to calculate the orbit of the Moon (see my blog post on the three-body problem). This was important for navigation, because if the daily position of the moon could be known with sufficient accuracy, then ships would have a means to determine their longitude at sea. However, the Moon, Earth and Sun are already a three-body problem, which still ignores the effects of Mars and Jupiter on the Moon’s orbit, not to mention the problem that the Earth is not a perfect sphere. Therefore, to have any hope of success, toy systems that were stripped of all their obfuscating detail were needed.
Euler investigated simplified versions of the three-body problem around 1760, treating a body attracted to two fixed centers of gravity moving in the plane, and he solved it using elliptic integrals. When the two fixed centers are viewed in a coordinate frame that is rotating with the Sun-Earth system, it can come close to capturing many of the important details of the system. In 1762 Euler tried another approach, called the restricted three-body problem, where he considered a massless Moon attracted to a massive Earth orbiting a massive Sun, again all in the plane. Euler could not find general solutions to this problem, but he did stumble on an interesting special case when the three bodies remain collinear throughout their motions in a rotating reference frame.
It was not the danger of asteroids that was the main topic of interest in those days, but the question whether the Earth itself is in a stable orbit and is safe from being ejected from the Solar system. Despite steadily improving methods for calculating astronomical trajectories through the nineteenth century, this question of stability remained open.
Poincaré and the King Oscar Prize of 1889
Some years ago I wrote an article for Physics Today called “The Tangled Tale of Phase Space” that tracks the historical development of phase space. One of the chief players in that story was Henri Poincaré (1854 – 1912). Henri Poincaré was the Einstein before Einstein. He was a minor celebrity and was considered to be the greatest genius of his era. The event in his early career that helped launch him to stardom was a mathematics prize announced in 1887 to honor the birthday of King Oscar II of Sweden. The challenge problem was as simple as it was profound: Prove rigorously whether the solar system is stable.
This was the old N-body problem that had so far resisted solution, but there was a sense at that time that recent mathematical advances might make the proof possible. There was even a rumor that Dirichlet had outlined such a proof, but no trace of the outline could be found in his papers after his death in 1859.
The prize competition was announced in Acta Mathematica, written by the Swedish mathematician Gösta Mittag-Leffler. It stated:
Given a system of arbitrarily many mass points that attract each other according to Newton’s law, under the assumption that no two points ever collide, try to find a representation of the coordinates of each point as a series in a variable that is some known function of time and for all of whose values the series converges uniformly.
The timing of the prize was perfect for Poincaré who was in his early thirties and just beginning to make his mark on mathematics. He was working on the theory of dynamical systems and was developing a new viewpoint that went beyond integrating single trajectories by focusing more broadly on whole classes of solutions. The question of the stability of the solar system seemed like a good problem to use to sharpen his mathematical tools. The general problem was still too difficult, so he began with Euler’s restricted three-body problem. He made steady progress, and along the way he invented an array of new techniques for studying the general properties of dynamical systems. One of these was the Poincaré section. Another was his set of integral invariants, one of which is recognized as the conservation of volume in phase space, also known as Liouville’s theorem, although it was Ludwig Boltzmann who first derived this result (see my Physics Today article). Eventually, he believed he had proven that the restricted three-body problem was stable.
By the time Poincaré had finished his prize submission, he had invented a new field of mathematical analysis, and the judges of the prize submission recognized it. Poincaré was named the winner, and his submission was prepared for publication in the Acta. However, Mittag-Leffler was a little concerned by a technical objection that had been raised, so he forwarded the comment to Poincaré for him to look at. At first, Poincaré thought the objection could easily be overcome, but as he worked on it and delved deeper, he had a sudden attack of panic. Trajectories near a saddle point did not converge. His proof of stability was wrong!
He alerted Mittag-Leffler to stop the presses, but it was too late. The first printing had been completed and review copies had already been sent to the judges. Mittag-Leffler immediately wrote to them asking for their return while Poincaré worked nonstop to produce a corrected copy. When he had completed his reanalysis, he had discovered a divergent feature of the solution to the dynamical problem near saddle points that is recognized today as the discovery of chaos. Poincaré paid for the reprinting of his paper out of his own pocket and (almost) all of the original printing was destroyed. This embarrassing moment in the life of a great mathematician was virtually forgotten until it was brought to light by the historian Barrow-Green in 1994.
Chaos in the Poincaré Return Map
Despite the fact that his conclusions on the stability of the 3-body problem flipped, Poincaré’s new tools for analyzing dynamical systems earned him the prize. He did not stop at his modified prize submission but continued working on systematizing his methods, publishing New Methods in Celestial Mechanics in several volumes through the 1890’s. It was here that he fully explored what happens when a trajectory approaches a saddle point of dynamical equilibrium.
To visualize a periodic trajectory, Poincaré invented a mathematical tool called a “first-return map”, also known as a Poincaré section. It was a way of taking a higher dimensional continuous trajectory and turning it into a simple iterated discrete map. Therefore, one did not need to solve continuous differential equations; it was enough to just iterate the map. In this way, complicated periodic, or nearly periodic, behavior could be explored numerically. However, even armed with this weapon, Poincaré found that iterated maps became unstable as a trajectory that originated from a saddle point approached another equivalent saddle point. Because the dynamics are periodic, the outgoing and incoming trajectories are opposite ends of the same trajectory, repeated with 2-pi periodicity. Therefore, the saddle point is also called a homoclinic point, meaning that trajectories in the discrete map intersect with themselves. (If two different trajectories in the map intersect, that is called a heteroclinic point.) When Poincaré calculated the iterations around the homoclinic point, he discovered a wild and complicated pattern in which a trajectory intersected itself many times. Poincaré wrote:
[I]f one seeks to visualize the pattern formed by these two curves and their infinite number of intersections … these intersections form a kind of lattice work, a weave, a chain-link network of infinitely fine mesh; each of the two curves can never cross itself, but it must fold back on itself in a very complicated way so as to recross all the chain-links an infinite number of times. … One will be struck by the complexity of this figure, which I am not even attempting to draw. Nothing can give us a better idea of the intricacy of the three-body problem, and of all the problems of dynamics in general…
This was the discovery of chaos! Today we call this “lattice work” the “homoclinic tangle”. He could not draw it with the tools of his day … but we can!
Chirikov’s Standard Map
The restricted 3-body problem is a bit more complicated than is needed to illustrate Poincaré’s homoclinic tangle. A much simpler model is a discrete map called Chirikov’s Map or the Standard Map. It describes the Poincaré section of a periodically kicked oscillator that rotates or oscillates in the angular direction with an angular momentum J. The map has the simple form
$$J_{n+1} = J_n + \varepsilon \sin\theta_n, \qquad \theta_{n+1} = \theta_n + J_{n+1}$$
in which ε is the kick strength, the angular momentum is updated first, and then the angle variable is updated with the new angular momentum. When plotted on the (θ,J) plane, the standard map produces a beautiful kaleidograph of intertwined trajectories piercing the Poincaré plane, as shown in the figure below. The small points or dots are successive intersections of the higher-dimensional trajectory with a plane. It is possible to trace successive points by starting very close to a saddle point (on the left) and connecting successive iterates with lines. These lines merge into the black trace in the figure that emerges along the unstable manifold of the saddle point on the left and approaches the saddle point on the right generally along the stable manifold.
However, as the successive iterates approach the new saddle (which is really just the old saddle point because of periodicity), the trace crosses the stable manifold again and again, in ever wilder swings that diverge as it approaches the saddle point. This is just one trace. By calculating traces along all four stable and unstable manifolds and carrying them through to the saddle, a lattice work, or homoclinic tangle, emerges.
Two of those traces originate from the stable manifolds, so to calculate their contributions to the homoclinic tangle, one must run these traces backwards in time using the inverse Chirikov map. This is
$$\theta_n = \theta_{n+1} - J_{n+1}, \qquad J_n = J_{n+1} - \varepsilon \sin\theta_n$$
where the angle is stepped back first and the kick, of strength ε, is then removed using the recovered angle.
The four traces all intertwine at the saddle point in the figure below with a zoom in on the tangle in the next figure. This is the lattice work that Poincaré glimpsed in 1889 as he worked feverishly to correct the manuscript that won him the prize that established him as one of the preeminent mathematicians of Europe.
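The forward and inverse maps described above can be sketched in a few lines. This is a minimal illustration, not the full program: the function names `standard_map` and `inverse_standard_map` and the starting point are chosen here for the example, and no modulus is applied to the angle so that the round trip can be checked directly.

```python
import numpy as np

def standard_map(theta, J, eps):
    """One forward step: kick the angular momentum, then advance the angle."""
    J_new = J + eps * np.sin(theta)
    theta_new = theta + J_new
    return theta_new, J_new

def inverse_standard_map(theta, J, eps):
    """One backward step: step the angle back, then remove the kick."""
    theta_old = theta - J
    J_old = J - eps * np.sin(theta_old)
    return theta_old, J_old

eps = 0.97                      # same kick strength as the listing below
theta0, J0 = 0.3, 0.1           # arbitrary starting point (illustrative)

theta, J = theta0, J0
for _ in range(10):             # ten steps forward in time
    theta, J = standard_map(theta, J, eps)
for _ in range(10):             # ten steps backward in time
    theta, J = inverse_standard_map(theta, J, eps)

# the round trip should return to the starting point up to round-off
print(abs(theta - theta0), abs(J - J0))
```

Running the inverse map is what allows a trace seeded on a stable manifold to be followed away from the saddle, just as the forward map does for the unstable manifold.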
Python Code: StandmapHom.py
# -*- coding: utf-8 -*-
"""
Created on Sun Aug 2 2020
"Introduction to Modern Dynamics" 2nd Edition (Oxford, 2019)
"""
import numpy as np
from matplotlib import pyplot as plt
from numpy import linalg as LA

eps = 0.97      # kick strength of the standard map

# NOTE: the plotting calls and the bracketed component indices are missing from
# this listing as extracted. Every plt.* call and every [0]/[1] index on Vu and Vs
# below is an assumed minimal reconstruction, not the verified original.

# Scatter plot of the map: 100 random initial conditions, 200 iterates each
plt.figure(1)
for eloop in range(0,100):
    rlast = 2*np.pi*(0.5-np.random.random())
    thlast = 4*np.pi*np.random.random()
    rplot = np.zeros(shape=(200,))
    thetaplot = np.zeros(shape=(200,))
    for loop in range(0,200):
        rnew = rlast + eps*np.sin(thlast)       # update the angular momentum first
        thnew = np.mod(thlast+rnew,4*np.pi)     # then the angle, using the new momentum
        thetaplot[loop] = np.mod(thnew-np.pi,4*np.pi)
        rtemp = np.mod(rnew + np.pi,2*np.pi)
        rplot[loop] = rtemp - np.pi
        rlast = rnew
        thlast = thnew
    plt.plot(thetaplot,rplot,'o',ms=0.2)        # assumed plot call

K = eps
eps0 = 5e-7     # spacing of the seed points placed next to the saddle
J = [[1,1+K],[1,1]]     # matrix whose eigenvectors are used as seed directions
w, v = LA.eig(J)
My = w          # eigenvalues (not used below)
Vu = v[:,0] # unstable manifold
Vs = v[:,1] # stable manifold

def plot_trace(Ht, Hr, first, last, col, shift):
    # Plot iterate number col for seeds first..last-1, together with its mirror image
    x = Ht[first:last,col] + shift
    x2 = 6*np.pi - x
    y = Hr[first:last,col]
    y2 = -y
    plt.plot(x,y,linewidth=0.75)                # assumed plot call
    plt.plot(x2,y2,linewidth=0.75)              # assumed plot call

# Plot the unstable manifold
Hr = np.zeros(shape=(100,150))
Ht = np.zeros(shape=(100,150))
for eloop in range(0,100):
    seed = eps0*eloop
    roldu1 = seed*Vu[0]         # assumed index: momentum component
    thetoldu1 = seed*Vu[1]      # assumed index: angle component
    Nloop = np.ceil(-6*np.log(eps0)/np.log(eloop+2))
    flag = 1
    cnt = 0
    while flag==1 and cnt < Nloop:
        ru1 = roldu1 + K*np.sin(thetoldu1)      # forward map
        thetau1 = thetoldu1 + ru1
        roldu1 = ru1
        thetoldu1 = thetau1
        if thetau1 > 4*np.pi:
            flag = 0
        Hr[eloop,cnt] = roldu1
        Ht[eloop,cnt] = thetoldu1 + 3*np.pi
        cnt = cnt+1
for first, last, col in [(0,99,12),(5,39,15),(12,69,16),(15,89,17),(30,99,18)]:
    plot_trace(Ht, Hr, first, last, col, -2*np.pi)

# Plot the stable manifold
del Hr, Ht
Hr = np.zeros(shape=(100,150))
Ht = np.zeros(shape=(100,150))
#eps0 = 0.03
for eloop in range(0,100):
    seed = eps0*eloop
    roldu1 = seed*Vs[0]         # assumed index: momentum component
    thetoldu1 = seed*Vs[1]      # assumed index: angle component
    Nloop = np.ceil(-6*np.log(eps0)/np.log(eloop+2))
    flag = 1
    cnt = 0
    while flag==1 and cnt < Nloop:
        thetau1 = thetoldu1 - roldu1            # inverse map: step the angle back first
        ru1 = roldu1 - K*np.sin(thetau1)        # then remove the kick
        roldu1 = ru1
        thetoldu1 = thetau1
        if thetau1 > 4*np.pi:
            flag = 0
        Hr[eloop,cnt] = roldu1
        Ht[eloop,cnt] = thetoldu1
        cnt = cnt+1
for first, last, col in [(0,79,12),(4,39,15),(12,69,16),(15,89,17),(30,99,18)]:
    plot_trace(Ht, Hr, first, last, col, np.pi)

plt.xlabel('theta')             # assumed axis labels
plt.ylabel('J')
plt.show()
When Leibniz claimed in 1704, in a published article in Acta Eruditorum, to have invented the differential calculus in 1684 prior to anyone else, the British mathematicians rushed to Newton’s defense. They knew Newton had developed his fluxions as early as 1666 and certainly no later than 1676. Thus ensued one of the most bitter and partisan priority disputes in the history of math and science that pitted the continental Leibnizians against the insular Newtonians. Although a (partisan) committee of the Royal Society investigated the case and found in favor of Newton, the affair had the effect of insulating British mathematics from Continental mathematics, creating an intellectual desert as the forefront of mathematical analysis shifted to France. Only when George Green filled his empty hours with the latest advances in French analysis, as he tended his father’s grist mill, did British mathematics wake up. Green self-published his epic work in 1828 that introduced what is today called Green’s Theorem.
Yet the period from 1700 to 1828 was not a complete void for British mathematics. A few points of light shone out in the darkness: Thomas Simpson, Colin Maclaurin, Abraham de Moivre, and Brook Taylor (1685 – 1731), who came from an English family that had been elevated to minor nobility by an act of Cromwell during the English Civil War.
Growing up in Bifrons House
When Brook Taylor was ten years old, his father bought Bifrons House, one of the great English country houses, located in the county of Kent just a mile south of Canterbury. English country houses were major cultural centers and sources of employment for 300 years from the seventeenth century through the early 20th century. While usually being the country homes of nobility of all levels, from Barons to Dukes, sometimes they were owned by wealthy families or by representatives in Parliament, which was the case for the Taylors. Bifrons House had been built around 1610 in the Jacobean architectural style that was popular during the reign of James I. The house had a stately front façade, with cupola-topped square towers, gable ends to the roof, porches of a renaissance form, and extensive manicured gardens on the south side. Bifrons House remained the seat of the Taylor family until 1824 when they moved to a larger house nearby and let Bifrons first to a Marquess and then in 1828 to Lady Byron (ex-wife of Lord Byron) and her daughter Ada Lovelace (the mathematician famous for her contributions to early computer science). The Taylors sold the house in 1830 to the first Marquess Conyngham.
Taylor’s life growing up in the rarified environment of Bifrons House must have been like scenes out of the popular BBC TV drama Downton Abbey. The house had a large staff of servants and large grounds at the edge of a large park near the town of Patrixbourne. Life as the heir to the estate would have been filled with social events and fine arts that included music and painting. Taylor developed a life-long love of music during his childhood, later collaborating with Isaac Newton on a scientific investigation of music (it was never published). He was also an amateur artist, and one of the first books he published after being elected to the Royal Society was on the mathematics of linear perspective, which contained some of the early results of projective geometry.
There is a beautiful family portrait in the National Portrait Gallery in London painted by John Closterman around 1696. The portrait is of the children of John Taylor about a year after he purchased Bifrons House. The painting is notable because Brook, the heir to the family fortunes, is being crowned with a wreath by his two older sisters (who would not inherit). Brook was only about 11 years old at the time and was already famous within his family for his ability with music and numbers.
Taylor never had to go to school, being completely tutored at home until he entered St. John’s College, Cambridge, in 1701. He took mathematics classes from Machin and Keill and graduated in 1709. The allowance from his father was sufficient to allow him to lead the life of a gentleman scholar, and he was elected a member of the Royal Society in 1712 and elected secretary of the Society just two years later. During the following years he was active as a rising mathematician until 1721 when he married a woman of a good family but of no wealth. The support of a house like Bifrons always took money, and the new wife’s lack of it was enough for Taylor’s father to throw the new couple out. Unfortunately, his wife died in childbirth along with the child, so Taylor returned home in 1723. These family troubles ended his main years of productivity as a mathematician.
Methodus incrementorum directa et inversa
Under the eye of the Newtonian mathematician Keill at Cambridge, Taylor became a staunch supporter and user of Newton’s fluxions. Just after he was elected as a member of the Royal Society in 1712, he participated in an investigation of the priority for the invention of the calculus that pitted the British Newtonians against the Continental Leibnizians. The Royal Society found in favor of Newton (obviously) and raised the possibility that Leibniz learned of Newton’s ideas during a visit to England just a few years before Leibniz developed his own version of the differential calculus.
A re-evaluation of the priority dispute from today’s perspective attributes the calculus to both men. Newton clearly developed it first, but did not publish until much later. Leibniz published first and generated the excitement for the new method that dispersed its use widely. He also took an alternative route to the differential calculus that is demonstrably different than Newton’s. Did Leibniz benefit from possibly knowing Newton’s results (but not his methods)? Probably. But that is how science is supposed to work … building on the results of others while bringing new perspectives. Leibniz’ methods and his notations were superior to Newton’s, and the calculus we use today is closer to Leibniz’ version than to Newton’s.
Once Taylor was introduced to Newton’s fluxions, he latched on and helped push its development. The same year (1715) that he published a book on linear perspective for art, he also published a ground-breaking book on the use of the calculus to solve practical problems. This book, Methodus incrementorum directa et inversa, introduced several new ideas, including finite difference methods (which are used routinely today in numerical simulations of differential equations). It also considered possible solutions to the equation for a vibrating string for the first time.
The vibrating string is one of the simplest problems in “continuum mechanics”, but it posed a severe challenge to Newtonian physics of point particles. It was only much later that D’Alembert used Newton’s third law of action-reaction to eliminate internal forces to derive D’Alembert’s principle on the net force on an extended body. Yet Taylor used finite differences to treat the line mass of the string in a way that yielded a possible solution of a sine function. Taylor was the first to propose that a sine function was the form of the string displacement during vibration. This idea would be taken up later by D’Alembert (who first derived the wave equation), and by Euler (who vehemently disagreed with D’Alembert’s solutions) and Daniel Bernoulli (who was the first to suggest that it is not just a single sine function, but a sum of sine functions, that described the string’s motion, which is the principle of superposition).
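A modern finite-difference check shows why a sine shape is a natural candidate for the string. This is an illustration in today's notation, not Taylor's own derivation: the string is assumed to have unit length and fixed ends, it is sampled at equally spaced points, and the restoring force at each point is taken to be proportional to the centred second difference of the displacement.

```python
import numpy as np

N = 200                                 # number of interior points (assumed)
x = np.linspace(0.0, 1.0, N + 2)        # unit-length string with fixed ends
h = x[1] - x[0]
y = np.sin(np.pi * x)                   # trial shape: a single half sine wave

# centred second difference at the interior points
d2y = (y[2:] - 2.0 * y[1:-1] + y[:-2]) / h**2

# if the sine is a mode of the string, d2y / y is the same constant everywhere
ratio = d2y / y[1:-1]
print(ratio.min(), ratio.max(), -np.pi**2)
```

Because the ratio is the same at every point, each piece of the string feels a restoring force proportional to its own displacement, so the whole shape oscillates in unison while keeping its sine form.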
Of course, the most influential idea in Taylor’s 1715 book was his general use of an infinite series to describe a curve.
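Infinite series became a major new tool in the toolbox of analysis with the publication of John Wallis’ Arithmetica Infinitorum in 1656. Shortly afterwards many series were published such as Nikolaus Mercator’s series (1668)
$$\ln(1+x) = x - \frac{x^2}{2} + \frac{x^3}{3} - \frac{x^4}{4} + \cdots$$
And of course Isaac Newton’s generalized binomial theorem that he worked out famously during the plague years of 1665-1666
$$(1+x)^{\alpha} = 1 + \alpha x + \frac{\alpha(\alpha-1)}{2!}x^2 + \frac{\alpha(\alpha-1)(\alpha-2)}{3!}x^3 + \cdots$$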
But these consisted mainly of special cases that had been worked out one by one. What was missing was a general method that could yield a series expression for any curve.
Taylor used concepts of finite differences as well as infinitesimals to derive his formula for expanding a function as a power series around any point. His derivation in Methodus incrementorum directa et inversa is not easily recognized today. Using difference tables, and ideas from Newton’s fluxions that viewed functions as curves traced out as a function of time, he arrived at the somewhat cryptic expression
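$$x + \dot{x}\,\frac{v}{1\cdot\dot{z}} + \ddot{x}\,\frac{v^2}{1\cdot 2\,\dot{z}^2} + \dddot{x}\,\frac{v^3}{1\cdot 2\cdot 3\,\dot{z}^3} + \cdots$$
where the “dots” are time derivatives, x stands for the ordinate (the function), v is a finite difference, and z is the abscissa moving with constant speed. If the abscissa moves with unit speed, then this becomes Taylor’s Series (in modern notation)
$$f(x) = f(a) + f'(a)(x-a) + \frac{f''(a)}{2!}(x-a)^2 + \frac{f'''(a)}{3!}(x-a)^3 + \cdots$$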
The term “Taylor’s series” was probably first used by L’Huillier in 1786, although Condorcet attributed the equation to both Taylor and d’Alembert in 1784. It was Lagrange in 1797 who immortalized Taylor by claiming that Taylor’s theorem was the foundation of analysis.
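Expand sin(x) around x = π
$$\sin x = -(x-\pi) + \frac{(x-\pi)^3}{3!} - \frac{(x-\pi)^5}{5!} + \cdots$$
This is related to the expansion around x = 0 (also known as a Maclaurin series)
$$\sin x = x - \frac{x^3}{3!} + \frac{x^5}{5!} - \cdots$$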
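The expansion of sin(x) around x = π can be checked numerically with a short sketch. The function name `sin_taylor_about_pi` and the choice of eight terms are illustrative. The coefficients follow from the derivatives of the sine at π, which cycle through 0, -1, 0, +1.

```python
import math

def sin_taylor_about_pi(x, n_terms=8):
    """Partial sum of the Taylor series of sin(x) expanded about x = pi."""
    u = x - math.pi
    # only odd powers of (x - pi) appear, with alternating signs starting at -1
    return sum((-1)**(m + 1) * u**(2*m + 1) / math.factorial(2*m + 1)
               for m in range(n_terms))

for x in (2.5, 3.0, 4.0):
    print(x, sin_taylor_about_pi(x), math.sin(x))
```

Replacing x - π by x and flipping the overall sign gives back the Maclaurin series, which is the sense in which the two expansions are related.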
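To get a feel for how to apply Taylor’s theorem to a function like arctan, begin with
$$y = \arctan x, \qquad \tan y = x$$
and take the derivative of both sides
$$\left(1 + \tan^2 y\right)\frac{dy}{dx} = 1$$
Rewrite this as
$$\frac{dy}{dx} = \frac{1}{1 + \tan^2 y}$$
and substitute the expression for y
$$\frac{dy}{dx} = \frac{1}{1 + x^2} = 1 - x^2 + x^4 - x^6 + \cdots$$
and integrate term by term to arrive at
$$\arctan x = x - \frac{x^3}{3} + \frac{x^5}{5} - \frac{x^7}{7} + \cdots$$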
This is James Gregory’s famous series. Although the math here is modern and only takes a few lines, it parallels Gregory’s approach. But Gregory had to invent aspects of calculus as he went along, and his derivation covers many dense pages. In the priority dispute between Leibniz and Newton, Gregory is usually overlooked as an independent inventor of many aspects of the calculus. This is partly because Gregory acknowledged that Newton had invented it first, and he delayed publishing to give Newton priority.
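Gregory's series for the arctangent, x - x³/3 + x⁵/5 - ..., is easy to test numerically. The sketch below uses an illustrative helper named `gregory_arctan` and compares partial sums with the library arctangent. The series converges quickly for small x and very slowly at x = 1, where it gives π/4.

```python
import math

def gregory_arctan(x, n_terms):
    """Partial sum x - x^3/3 + x^5/5 - ... with n_terms terms."""
    return sum((-1)**k * x**(2*k + 1) / (2*k + 1) for k in range(n_terms))

# fast convergence well inside the interval |x| < 1
print(gregory_arctan(0.5, 20), math.atan(0.5))

# slow convergence at the edge x = 1, where 4*arctan(1) = pi
print(4.0 * gregory_arctan(1.0, 1000), math.pi)
```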
Two-Dimensional Taylor’s Series
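The ideas behind the Taylor’s series generalize to any number of dimensions. For a scalar function of two variables it takes the form (out to second order)
$$f(\mathbf{x}) \approx f(\mathbf{a}) + J(\mathbf{a})\,(\mathbf{x}-\mathbf{a}) + \frac{1}{2}(\mathbf{x}-\mathbf{a})^T H(\mathbf{a})\,(\mathbf{x}-\mathbf{a})$$
where the point is x = (x, y), the expansion point is a = (a, b), J is the Jacobian matrix (vector) and H is the Hessian matrix defined for the scalar function as
$$J = \left(\frac{\partial f}{\partial x},\ \frac{\partial f}{\partial y}\right), \qquad H = \begin{pmatrix} \dfrac{\partial^2 f}{\partial x^2} & \dfrac{\partial^2 f}{\partial x\,\partial y} \\ \dfrac{\partial^2 f}{\partial y\,\partial x} & \dfrac{\partial^2 f}{\partial y^2} \end{pmatrix}$$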
As a concrete example, consider the two-dimensional Gaussian function
The Jacobian and Hessian matrices are
which are the first- and second-order coefficients of the Taylor series.
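A second-order expansion of this kind can be tried out numerically. The sketch below assumes one specific Gaussian, f(x, y) = exp(-(x² + y²)), chosen here only for illustration. For that choice the Jacobian is f·(-2x, -2y) and the Hessian is f times the matrix with rows (4x² - 2, 4xy) and (4xy, 4y² - 2). The expansion point and the test point are arbitrary.

```python
import numpy as np

def f(p):
    """Assumed example function: the Gaussian exp(-(x^2 + y^2))."""
    x, y = p
    return np.exp(-(x**2 + y**2))

def jacobian(p):
    """First derivatives (df/dx, df/dy) of the assumed Gaussian."""
    x, y = p
    return f(p) * np.array([-2.0 * x, -2.0 * y])

def hessian(p):
    """Matrix of second derivatives of the assumed Gaussian."""
    x, y = p
    return f(p) * np.array([[4.0 * x**2 - 2.0, 4.0 * x * y],
                            [4.0 * x * y, 4.0 * y**2 - 2.0]])

def taylor2(p, a):
    """Second-order Taylor approximation of f at p, expanded about a."""
    d = np.asarray(p, dtype=float) - np.asarray(a, dtype=float)
    return f(a) + jacobian(a) @ d + 0.5 * d @ hessian(a) @ d

a = (0.5, -0.3)         # expansion point (arbitrary)
p = (0.52, -0.28)       # nearby point where the approximation is tested
print(f(p), taylor2(p, a))
```

The two printed numbers agree closely because the neglected terms are third order in the small displacement from the expansion point.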
 “A History of Bifrons House”, B. M. Thomas, Kent Archeological Society (2017)