A Short History of Neural Networks

When it comes to questions about the human condition, the question of intelligence is at the top. What is the origin of our intelligence? How intelligent are we? And how intelligent can we make other things…things like artificial neural networks?

This is a short history of the science and technology of neural networks, not just artificial neural networks but also the natural, organic type, because theories of natural intelligence are at the core of theories of artificial intelligence. Without understanding our own intelligence, we probably have no hope of creating the artificial type.

Ramón y Cajal (1888): Visualizing Neurons

The story begins with Santiago Ramón y Cajal (1853 – 1934), who received the Nobel Prize in Physiology or Medicine in 1906 for his work illuminating natural neural networks. He built on work by Camillo Golgi, using a stain to give intracellular components contrast [1], and then went further, developing his own silver emulsions like those of early photography (which was one of his hobbies). Cajal was the first to show that neurons were individual constituents of neural matter and that their contacts were sequential: axons of sets of neurons contacted the dendrites of other sets of neurons, never axon-to-axon or dendrite-to-dendrite, to create a complex communication network. This became known as the neuron doctrine, and it is a central idea of neuroscience today.

Fig. 1 One of Cajal’s published plates demonstrating neural synapses. From Link.

McCulloch and Pitts (1943): Mathematical Models

In 1941, Warren S. McCulloch (1898–1969) arrived at the Department of Psychiatry at the University of Illinois at Chicago where he met with the mathematical biology group at the University of Chicago led by Nicolas Rashevsky (1899–1972), widely acknowledged as the father of mathematical biophysics in the United States.

An itinerant member of Rashevsky’s group at the time was a brilliant, young and unusual mathematician, Walter Pitts (1923– 1969). He was not enrolled as a student at Chicago, but had simply “showed up” one day as a teenager at Rashevsky’s office door.  Rashevsky was so impressed by Pitts that he invited him to attend the group meetings, and Pitts became interested in the application of mathematical logic to biological information systems.

When McCulloch met Pitts, he realized that Pitts had the mathematical background that complemented his own views of brain activity as computational processes. Pitts was homeless at the time, so McCulloch invited him to live with his family, giving the two men ample time to work together on their mutual obsession to provide a logical basis for brain activity in the way that Turing had provided it for computation.

McCulloch and Pitts simplified the operation of individual neurons to their most fundamental character, envisioning a neural computing unit with multiple inputs (received from upstream neurons) and a single on-off output (sent to downstream neurons), with the additional possibility of feedback loops as downstream neurons fed back onto upstream neurons. They also discretized the dynamics in time, using discrete logic and time-difference equations, succeeding in devising a logical structure with rules and equations for the general operation of nets of neurons.  They published their results in a 1943 paper titled “A logical calculus of the ideas immanent in nervous activity” [2], introducing computational language and logic to neuroscience.  Their simplified neural unit became the basis for discrete logic, picked up a few years later by von Neumann as an elemental example of a logic gate upon which he began constructing the theory and design of the modern electronic computer.

Fig. 2 The only figure in McCulloch and Pitts’s “Logical Calculus”.

Donald Hebb (1949): Hebbian Learning

The basic model for learning and adjustment of synaptic weights among neurons was put forward in 1949 by the physiological psychologist Donald Hebb (1904-1985) of McGill University in Canada in a book titled The Organization of Behavior [3].

In Hebbian learning, an initially untrained network consists of many neurons with many synapses having random synaptic weights. During learning, a synapse between two neurons is strengthened when both the pre-synaptic and post-synaptic neurons are firing simultaneously. In this model, it is essential that each neuron makes many synaptic contacts with other neurons because it requires many input neurons acting in concert to trigger the output neuron. In this way, synapses are strengthened when there is collective action among the neurons. The synaptic strengths are therefore altered through a form of self-organization. A collective response of the network strengthens all those synapses that are responsible for the response, while the synapses that do not contribute weaken. Despite the simplicity of this model, it has been surprisingly robust, standing up as a general principle for the training of artificial neural networks.

Fig. 3. A Figure from Hebb’s textbook on psychology (1958). From Link.
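In modern notation, Hebb's rule is often caricatured as an outer-product weight update: the change in a synaptic weight is proportional to the product of pre-synaptic and post-synaptic activity. The short sketch below is such a caricature (not Hebb's own formulation); the function name hebbian_update and the learning rate eta are illustrative choices.

import numpy as np

# Minimal sketch of a Hebbian weight update (illustrative caricature of Hebb's rule)
def hebbian_update(W, pre, post, eta=0.01):
    # Each synapse strengthens in proportion to correlated pre/post activity
    return W + eta * np.outer(post, pre)

rng = np.random.default_rng(0)
W = rng.normal(scale=0.1, size=(3, 5))      # random initial synaptic weights
pre = np.array([1.0, 0.0, 1.0, 0.0, 1.0])   # pre-synaptic activity pattern
post = np.array([1.0, 1.0, 0.0])            # post-synaptic activity pattern
W = hebbian_update(W, pre, post)            # synapses between co-active neurons strengthen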

Hodgkin and Huxley (1952): Neuron Transporter Models

Alan Hodgkin (1914 – 1998) and Andrew Huxley (1917 – 2012) were English biophysicists who received the 1963 Nobel Prize in Physiology or Medicine for their work on the physics behind neural activation.  They constructed a differential equation for the spiking action potential for which their biggest conceptual challenge was the presence of time delays in the voltage signals that were not explained by linear models of the neural conductance. As they began exploring nonlinear models, using their experiments to guide the choice of parameters, they settled on a dynamical model in a four-dimensional phase space. One dimension was the membrane voltage; the remaining three were gating variables that controlled the sodium and potassium conductances, the two ions they had determined to be the major participants in the generation and propagation of the action potential. The nonlinear conductances of their model described the observed time delays and captured the essential neural behavior of the fast spike followed by a slow recovery. Huxley solved the equations on a hand-cranked calculator, taking over three months of tedious cranking to plot the numerical results.

Fig. 4 The Hodgkin-Huxley model of the neuron, including capacitance C, voltage V and bias current I along with the conductances of potassium (K), sodium (Na) and leakage (L) channels.

Hodgkin and Huxley published [4] their measurements and their model (known as the Hodgkin-Huxley model) in a series of six papers in 1952 that led to an explosion of research in electrophysiology. The four-dimensional Hodgkin–Huxley model stands as a classic example of the power of phenomenological modeling when combined with accurate experimental observation. Hodgkin and Huxley were able to ascertain not only the existence of ion channels in the cell membrane, but also their relative numbers, long before these molecular channels were ever directly observed using electron microscopes. The Hodgkin–Huxley model lent itself to simplifications that could capture the essential behavior of neurons while stripping off the details.

Frank Rosenblatt (1958): The Perceptron

Frank Rosenblatt (1928–1971) had a PhD in psychology from Cornell University and was in charge of the cognitive systems section of the Cornell Aeronautical Laboratory (CAL) located in Buffalo, New York.  He was tasked with fulfilling a contract from the Navy to develop an analog image processor. Drawing from the work of McCulloch and Pitts, his team constructed first a software system and then a hardware machine that adaptively updated the strengths of its inputs, which they called neural weights, as it was trained on test images. The machine was dubbed the Mark I Perceptron, and its announcement in 1958 created a small media frenzy [5]. A New York Times article reported that the perceptron was “the embryo of an electronic computer that [the navy] expects will be able to walk, talk, see, write, reproduce itself and be conscious of its existence.”

The perceptron had a simple architecture, with two layers of neurons consisting of an input layer and a processing layer, and it was programmed by adjusting the synaptic weights to the inputs. This computing machine was the first to adaptively learn its functions, as opposed to following predetermined algorithms like digital computers. It seemed like a breakthrough in cognitive science and computing, as trumpeted by the New York Times.  But within a decade, the development had stalled because the architecture was too restrictive.

Fig. 5 Frank Rosenblatt with his Perceptron. From Link.
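The learning rule at the heart of the perceptron can be sketched in a few lines. The code below is a modern software caricature, not the Mark I hardware; the function train_perceptron and the toy OR-gate data are illustrative assumptions.

import numpy as np

# Sketch of the perceptron learning rule (illustrative, not Rosenblatt's Mark I)
def train_perceptron(X, y, epochs=20, eta=0.1):
    # X: input patterns (one per row), y: target labels (+1 or -1)
    w = np.zeros(X.shape[1])
    b = 0.0
    for _ in range(epochs):
        for xi, target in zip(X, y):
            pred = 1.0 if xi @ w + b > 0 else -1.0
            if pred != target:          # adjust weights only when the output is wrong
                w += eta * target * xi
                b += eta * target
    return w, b

# Toy linearly separable problem (logical OR with -1/+1 labels)
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([-1, 1, 1, 1])
w, b = train_perceptron(X, y)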

Richard FitzHugh and Jin-Ichi Nagumo (1961): Neural van der Pol Oscillators

In 1961 Richard FitzHugh (1922–2007), a neurophysiology researcher at the National Institute of Neurological Disease and Blindness (NINDB) of the National Institutes of Health (NIH), created a surprisingly simple model of the neuron that retained only a third-order nonlinearity, the same type of nonlinearity that Rayleigh had proposed and solved in 1883 and that van der Pol extended in 1926. Around the same time that FitzHugh proposed his mathematical model [6], the electronics engineer Jin-Ichi Nagumo (1926-1999) in Japan created an electronic diode circuit with an equivalent circuit model that mimicked neural oscillations [7]. Together, this work by FitzHugh and Nagumo led to the so-called FitzHugh–Nagumo model. The conceptual importance of this model is that it demonstrated that the neuron was a self-oscillator, just like a violin string or wheel shimmy or the pacemaker cells of the heart. Once again, self-oscillators showed themselves to be common elements of a complex world—and especially of life.

Fig. 6 The FitzHugh-Nagumo model of the neuron simplifies the Hodgkin-Huxley model from four dimensions down to two dimensions of voltage V and channel activation n.
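A minimal numerical sketch of this two-dimensional flow is given below, using a common textbook form of the FitzHugh-Nagumo equations; the parameter values (a = 0.7, b = 0.8, eps = 0.08, I = 0.5) are representative choices for illustration, not values from the original papers.

import numpy as np
from scipy import integrate
from matplotlib import pyplot as plt

# FitzHugh-Nagumo flow: v is the voltage-like variable, n the slow recovery variable
def fhn_deriv(state, t, I=0.5, a=0.7, b=0.8, eps=0.08):
    v, n = state
    dv = v - v**3 / 3 - n + I       # fast dynamics with the cubic nonlinearity
    dn = eps * (v + a - b * n)      # slow recovery
    return [dv, dn]

t = np.linspace(0, 200, 2000)
traj = integrate.odeint(fhn_deriv, [-1.0, 1.0], t)
plt.plot(t, traj[:, 0])             # self-sustained relaxation oscillations of v
plt.show()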

John Hopfield (1982): Spin Glasses and Recurrent Networks

John Hopfield (1933–) received his PhD from Cornell University in 1958, advised by Al Overhauser in solid state theory, and he continued to work on a broad range of topics in solid state physics as he wandered from appointment to appointment at Bell Labs, Berkeley, Princeton, and Caltech. In the 1970s Hopfield’s interests broadened into the field of biophysics, where he used his expertise in quantum tunneling to study quantum effects in biomolecules, and expanded further to include information transfer processes in DNA and RNA. In the early 1980s, he became aware of aspects of neural network research and was struck by the similarities between McCulloch and Pitts’ idealized neuronal units and the physics of magnetism. For instance, there is a type of disordered magnetic material called a spin glass in which a large number of local regions of magnetism are randomly oriented. In the language of solid-state physics, one says that the potential energy function of a spin glass has a large number of local minima into which various magnetic configurations can be trapped. In the language of dynamics, one says that the dynamical system has a large number of basins of attraction [8].
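The analogy can be made concrete with a minimal sketch of a Hopfield-style associative memory, in which Hebbian outer products store patterns as minima of an energy landscape and asynchronous sign updates relax a corrupted state into the nearest basin of attraction. This is an illustration of the idea, not Hopfield's code; the helper functions store and recall are hypothetical names.

import numpy as np

# Minimal Hopfield-style associative memory (illustrative sketch)
def store(patterns):
    # Hebbian storage: weights are averaged outer products of +/-1 patterns
    n = patterns.shape[1]
    W = np.zeros((n, n))
    for p in patterns:
        W += np.outer(p, p)
    np.fill_diagonal(W, 0)
    return W / patterns.shape[0]

def recall(W, state, steps=10):
    # Asynchronous updates descend the energy landscape toward a stored attractor
    state = state.copy()
    rng = np.random.default_rng(1)
    for _ in range(steps):
        for i in rng.permutation(len(state)):
            state[i] = 1 if W[i] @ state >= 0 else -1
    return state

patterns = np.array([[1, -1, 1, -1, 1, -1],
                     [1, 1, 1, -1, -1, -1]])
W = store(patterns)
noisy = np.array([1, -1, 1, -1, 1, 1])    # corrupted copy of the first pattern
print(recall(W, noisy))                   # relaxes back to the stored pattern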

The Parallel Distributed Processing Group (1986): Backpropagation

David Rumelhart, a mathematical psychologist at UC San Diego, was joined by James McClelland in 1974 and then by Geoffrey Hinton in 1978 to become what they called the Parallel Distributed Processing (PDP) group. The central tenets of the PDP framework they developed were that 1) processing is distributed across many semi-autonomous neural units, 2) these units learn by adjusting the weights of their interconnections based on the strengths of their signals (i.e., Hebbian learning), and 3) memories and behaviors are an emergent property of the distributed learned weights.

PDP was an exciting framework for artificial intelligence, and it captured the general behavior of natural neural networks, but it had a serious problem: How could all of the neural weights be trained?

In 1986, Rumelhart and Hinton with the mathematician Ronald Williams developed a mathematical procedure for training neural weights called error backpropagation [9]. The idea is conceptually simple: form the mean squared error between the response of the neural network and the desired response, then use the chain rule to propagate that error backward through the network, layer by layer, to compute how much each neural weight contributes to it. Each weight is then nudged in the direction that reduces the error, and the cycle of forward pass, backward pass, and weight update is repeated until the mean squared error is minimized. In this way, large numbers of neural weights can be adjusted as the network is trained to perform a specified task.

Error backpropagation has come a long way from that early 1986 paper, and it now lies at the core of the AI revolution we are experiencing today as billions of neural weights are trained on massive datasets.
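As a toy illustration (a two-layer network in NumPy with made-up data, not the PDP group's code), the gradients of the mean squared error flow backward through the chain rule like this:

import numpy as np

# Toy two-layer network trained by backpropagation on a mean squared error (illustrative)
rng = np.random.default_rng(0)
X = rng.normal(size=(32, 4))                     # toy inputs
Y = np.sin(X.sum(axis=1, keepdims=True))         # toy targets
W1 = rng.normal(size=(4, 8))
W2 = rng.normal(size=(8, 1))
eta = 0.01                                       # learning rate
for step in range(1000):
    H = np.tanh(X @ W1)                          # forward pass: hidden layer
    Y_hat = H @ W2                               # forward pass: output
    err = Y_hat - Y                              # output error
    grad_W2 = H.T @ err                          # chain rule: gradient for output weights
    grad_W1 = X.T @ ((err @ W2.T) * (1 - H**2))  # error propagated back through tanh
    W1 -= eta * grad_W1 / len(X)                 # gradient-descent weight updates
    W2 -= eta * grad_W2 / len(X)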

Yann LeCun (1989): Convolutional Neural Networks

In 1988, I was a new post-doc at AT&T Bell Labs at Holmdel, New Jersey fresh out of my PhD in physics from Berkeley. Bell Labs liked to give its incoming employees inspirational talks and tours of their facilities, and one of the tours I took was of the neural network lab run by Lawrence Jackel that was working on computer recognition of zip-code digits. The team’s new post-doc, arriving at Bell Labs the same time as me, was Yann LeCun. It is very possible that the demo our little group watched was run by him, or at least he was there, but at the time he was a nobody, so even if I had heard his name, it wouldn’t have meant anything to me.

Fast forward to today, and Yann LeCun’s name is almost synonymous with AI. He is the Chief AI Scientist at Facebook, and his Google Scholar page reports that he gets 50,000 citations per year.

LeCun is famous for developing the convolutional neural network (CNN) in work that he published from Bell Labs in 1989 [10]. It is a biomimetic neural network that takes its inspiration from the receptive fields of the neural networks in the retina. What you think you see, when you look at something, is actually reconstructed by your brain. Your retina is a neural processor with receptive fields that are a far cry from one-to-one. Most prominent in the retina are center-surround fields, or kernels, that respond to the derivatives of the focused image instead of the image itself. It’s the derivatives that are sent up your optic nerve to your brain, which then reconstructs the image. It works as a form of image compression, so that broad uniform areas in an image are reduced to their edges.

The convolutional neural network works in the same way; it is engineered specifically to produce compressed and multiscale codes that capture broad areas as well as the fine details of an image. By constructing many different “kernel” operators at many different scales, it creates a set of features that capture the nuances of the image in a quantitative form that is then processed by training neural weights in downstream neural networks.

Fig. 7 Example of a receptive field of a CNN. The filter is the kernel (in this case a discrete 3×3 Laplace operator) that is stepped sequentially across the image field to produce the Laplacian feature map of the original image. One feature map for every different kernel becomes the input for the next level of kernels in a hierarchical scaling structure.
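The kernel operation in Fig. 7 can be sketched directly, stepping a discrete 3×3 Laplace operator across a toy image with SciPy's convolve2d; the image and its contents are illustrative assumptions.

import numpy as np
from scipy.signal import convolve2d

# Discrete 3x3 Laplace operator stepped across an image to produce a feature map
laplace = np.array([[0,  1, 0],
                    [1, -4, 1],
                    [0,  1, 0]], dtype=float)

image = np.zeros((16, 16))
image[4:12, 4:12] = 1.0                                # a bright square on a dark background
feature_map = convolve2d(image, laplace, mode='same')  # responds only at the edges of the square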

Geoff Hinton (2006): Deep Belief

It seems like Geoff Hinton has had his finger in almost every pie when it comes to how we do AI today. Backpropagation? Geoff Hinton. Rectified Linear Units? Geoff Hinton. Boltzmann Machines? Geoff Hinton. t-SNE? Geoff Hinton. Dropout regularization? Geoff Hinton. AlexNet? Geoff Hinton. The 2024 Nobel Prize in Physics? Geoff Hinton! He may not have invented all of these, but he was in the midst of it all.

Hinton received his PhD in Artificial Intelligence (a rare field at the time) from the University of Edinburgh in 1978, after which he joined the PDP group at UCSD (see above) as a post-doc. After a time at Carnegie Mellon, he joined the University of Toronto, Canada, in 1987, where he established one of the leading groups in the world on neural network research. It was from here that he launched so many of the ideas and techniques that have become the core of deep learning.

A central idea of deep learning came from Hinton’s work on Boltzmann Machines that learn statistical distributions of complex data. This type of neural network is known as an energy-based model, similar to a Hopfield network, and it has strong ties to the statistical mechanics of spin-glass systems. Unfortunately, it is notoriously difficult to train! So Hinton simplified it into a Restricted Boltzmann Machine (RBM) that was much more tractable, and layers of RBMs could be stacked into “Deep Belief Networks” [11] with a hierarchical structure that allowed the neural nets to learn layers of abstractions. These were among the first deep networks that were able to do complex tasks at the level of human capabilities (and sometimes beyond).

The breakthrough that propelled Geoff Hinton to world-wide acclaim was the success of AlexNet, a neural network constructed by his graduate student Alex Krizhevsky at Toronto in 2012, consisting of 650,000 neurons with 60 million parameters that were trained using two early Nvidia GPUs. It won the ImageNet challenge that year, enabled by its deep architecture, and it marked the start of an advance in deep learning that has continued unabated ever since.

Deep learning is now the rule in AI, supported by the Attention mechanism and Transformers that underpin the large language models, like ChatGPT and others, that are poised to disrupt all the legacy business models based on the previous silicon revolution of 50 years ago.

Further Reading

Sections of this article have been excerpted from Chapter 11 of Galileo Unbound (Oxford University Press).

References

[1] Ramón y Cajal S. (1888). Estructura de los centros nerviosos de las aves. Rev. Trim. Histol. Norm. Pat. 1, 1–10.

[2] McCulloch, W.S. and W. Pitts, A Logical Calculus of the Ideas Immanent in Nervous Activity. Bull. Math. Biophys., 1943. 5: p. 115.

[3] Hebb, D. O. (1949). The Organization of Behavior: A Neuropsychological Theory. New York: Wiley and Sons. ISBN 978-0-471-36727-7 – via Internet Archive.

[4] Hodgkin AL, Huxley AF (August 1952). “A quantitative description of membrane current and its application to conduction and excitation in nerve”. The Journal of Physiology. 117 (4): 500–44.

[5] Rosenblatt, Frank (1957). “The Perceptron—a perceiving and recognizing automaton”. Report 85-460-1. Cornell Aeronautical Laboratory.

[6] FitzHugh, Richard (July 1961). “Impulses and Physiological States in Theoretical Models of Nerve Membrane”. Biophysical Journal. 1 (6): 445–466.

[7] Nagumo, J.; Arimoto, S.; Yoshizawa, S. (October 1962). “An Active Pulse Transmission Line Simulating Nerve Axon”. Proceedings of the IRE. 50 (10): 2061–2070.

[8] Hopfield, J. J. (1982). “Neural networks and physical systems with emergent collective computational abilities”. Proceedings of the National Academy of Sciences. 79 (8): 2554–2558.

[9] Rumelhart, D. E., Hinton, G. E. and Williams, R. J., Learning representations by back-propagating errors. Nature 323, 533–536 (1986).

[10] Y. LeCun, B. Boser, J. S. Denker, D. Henderson, R. E. Howard, W. Hubbard and L. D. Jackel: Backpropagation Applied to Handwritten Zip Code Recognition, Neural Computation, 1(4):541–551, Winter 1989.

[11] G. E. Hinton, S. Osindero, and Y. W. Teh, “A fast learning algorithm for deep belief nets,” Neural Computation 18, 1527-1554 (2006).

Read more in Books by David D. Nolte at Oxford University Press

The Anharmonic Harmonic Oscillator

Harmonic oscillators are one of the fundamental elements of physical theory.  They arise so often in so many different contexts that they can be viewed as a central paradigm that spans all aspects of physics.  Some famous physicists have been quoted as saying that the entire universe is composed of simple harmonic oscillators (SHO).

Despite the physicist’s love affair with it, the SHO is pathological! First, it has infinite frequency degeneracy, which makes it prone to the slightest perturbation that can tip it into chaos, in contrast to non-harmonic cyclic dynamics that actually protect us from the chaos of the cosmos (see my Blog on Chaos in the Solar System). Second, the SHO is nowhere to be found in the classical world.  Linear oscillators are purely harmonic, with a frequency that is independent of amplitude—but no such thing exists!  All oscillators must be limited, or they could take on infinite amplitude and infinite speed, which is nonsense.  Even the simplest of simple harmonic oscillators would be limited by nothing other than the speed of light.  Relativistic effects would modify the linearity, especially through time dilation effects, rendering the harmonic oscillator anharmonic.

Despite the physicist’s love affair with it, the SHO is pathological!

Therefore, for students of physics as well as practitioners, it is important to break the shackles imposed by the SHO and embrace the anharmonic harmonic oscillator as the foundation of physics. Here is a brief survey of several famous anharmonic oscillators in the history of physics, followed by the mathematical analysis of the relativistic anharmonic linear-spring oscillator.

Anharmonic Oscillators

Anharmonic oscillators have a long venerable history with many varieties.  Many of these have become central models in systems as varied as neural networks, synchronization, grandfather clocks, mechanical vibrations, business cycles, ecosystem populations and more.

Christiaan Huygens

Already by the mid-1600s Christiaan Huygens (1629 – 1695) knew that the pendulum becomes slower when it has larger amplitudes.  The pendulum was one of the best candidates for constructing an accurate clock needed for astronomical observations and for the determination of longitude at sea.  Galileo (1564 – 1642) had devised the plans for a rudimentary pendulum clock that his son attempted to construct, but the first practical pendulum clock was invented and patented by Huygens in 1657.  However, Huygens’ modified verge escapement required his pendulum to swing with large amplitudes, which brought it into the regime of anharmonicity. The equations of the simple pendulum are truly simple, but the presence of the sin θ term makes it the simplest anharmonic oscillator.
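For reference, in modern notation the simple pendulum obeys

$$ \ddot{\theta} + \frac{g}{\ell}\,\sin\theta = 0 $$

which reduces to the simple harmonic oscillator only in the small-angle limit sin θ ≈ θ.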

Therefore, Huygens searched for the mathematical form of a tautochrone curve for the pendulum (a curve that is traversed with equal times independently of amplitude) and in the process he invented the involutes and evolutes of a curve—precursors of the calculus.  The answer to the tautochrone question is a cycloid (see my Blog on Huygens’ Tautochrone Curve).

Hermann von Helmholtz

Hermann von Helmholtz (1821 – 1894) was possibly the greatest German physicist of his generation—an Einstein before Einstein—although he began as a medical doctor.  His study of muscle metabolism, drawing on the early thermodynamic work of Carnot, Clapeyron and Joule, led him to explore and to express the conservation of energy in its clearest form.  Because he postulated that all forms of physical processes—electricity, magnetism, heat, light and mechanics—contributed to the interconversion of energy, he sought to explore them all, bringing his research into the mainstream of physics.  His laboratory in Berlin became world famous, attracting the early American physicists Henry Rowland (founder and first president of the American Physical Society) and Albert Michelson (first American Nobel prize winner).

Even the simplest of simple harmonic oscillators would be limited by nothing other than the speed of light.  

Helmholtz also pursued a deep interest in the physics of sensory perception such as sound.  This research led to his invention of the Helmholtz oscillator, which is a highly anharmonic relaxation oscillator in which a tuning fork was placed near an electromagnet that was powered by a mercury switch attached to the fork. As the tuning fork vibrated, the mercury came in and out of contact with it, turning the magnet on and off, which fed back on the tuning fork, and so on, enabling the device, once started, to continue oscillating without interruption. This device is called a tuning-fork resonator, and it became the basis of the first door-bell buzzers.  (These are not to be confused with Helmholtz resonances that are formed when blowing across the open neck of a beer bottle.)

Lord Rayleigh

Baron John Strutt, Lord Rayleigh (1842 – 1919), like Helmholtz, was a generalist and had a strong interest in the physics of sound.  He was inspired by Helmholtz’ oscillator to consider general nonlinear anharmonic oscillators mathematically, and he was led to consider the effects of anharmonic terms added to the harmonic oscillator equation.  In a paper published in the Philosophical Magazine issue of 1883 with the title On Maintained Vibrations, he introduced an equation to describe the self-oscillation by adding an extra term to a simple harmonic oscillator. The extra term depended on the cube of the velocity, representing a balance between the gain of energy from a steady force and natural dissipation by friction.  Rayleigh suggested that this equation applied to a wide range of self-oscillating systems, such as violin strings, clarinet reeds, finger glasses, flutes, organ pipes, among others (see my Blog on Rayleigh’s Harp).

Georg Duffing

The first systematic study of quadratic and cubic deviations from the harmonic potential was performed by the German engineer Georg Duffing (1861 – 1944) under the conditions of a harmonic drive. The Duffing equation incorporates inertia, damping, the linear spring and nonlinear deviations.

Fig. 1 The Duffing equation adds a nonlinear term to the spring force when alpha is positive, stiffening or weakening it for larger excursions when beta is positive or negative, respectively. And by making alpha negative and beta positive, it describes a damped driven double-well potential.

Duffing confirmed his theoretical predictions with careful experiments and established the lowest-order corrections to ideal masses on springs. His work was rediscovered in the 1960’s after Lorenz helped launch numerical chaos studies. Duffing’s driven potential becomes especially interesting when α is negative and β is positive, creating a double-well potential. The driven double-well is a classic chaotic system (see my blog on Duffing’s Oscillator).
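For reference, a standard form of the driven Duffing equation, written here with δ for the damping and F cos(ωt) for the harmonic drive (notation assumed for this sketch), is

$$ \ddot{x} + \delta\,\dot{x} + \alpha x + \beta x^{3} = F\cos(\omega t) $$

where the βx³ term stiffens (β > 0) or weakens (β < 0) the linear spring αx, and choosing α < 0 with β > 0 produces the damped, driven double-well potential described above.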

Balthasar van der Pol

Autonomous oscillators are one of the building blocks of complex systems, providing the fundamental elements for biological oscillators, neural networks, business cycles, population dynamics, viral epidemics, and even the rings of Saturn.  The most famous autonomous oscillator (after the pendulum clock) is named for a Dutch physicist, Balthasar van der Pol (1889 – 1959), who discovered the laws that govern how electrons oscillate in vacuum tubes, but the dynamical system that he developed has expanded to become the new paradigm of cyclic dynamical systems to replace the SHO (see my Blog on Grandfather Clocks).

Fig. 2 The van der Pol equation is the standard simple harmonic oscillator with a gain term that saturates for large excursions leading to a limit cycle oscillator.

Turning from this general survey, let’s find out what happens when special relativity is added to the simplest SHO [1].

Relativistic Linear-Spring Oscillator

The theory of the relativistic one-dimensional linear-spring oscillator starts from a relativistic Lagrangian of a free particle (with no potential) yielding the generalized relativistic momentum

The Lagrangian that accomplishes this is [2]

where the invariant 4-velocity is

When the particle is in a potential, the Lagrangian becomes

The action integral that is minimized is

and the Lagrangian for integration of the action integral over proper time is

The relativistic modification in the potential energy term of the Lagrangian is not in the spring constant, but rather is purely a time dilation effect.  This is captured by the relativistic Lagrangian

where the dot is with respect to proper time τ. The classical potential energy term in the Lagrangian is multiplied by the relativistic factor γ, which is position dependent because of the non-constant speed of the oscillator mass.  The Euler-Lagrange equations are

where the subscripts in the variables are a = 0, 1 for the time and space dimensions, respectively.  The derivative of the time component of the 4-vector is

From the derivative of the Lagrangian with respect to speed, the following result is derived

where E is the constant total relativistic energy.  Therefore,

which provides an expression for the derivative of the coordinate time with respect to the proper time where

The position-dependent γ(x) factor is then

The Euler-Lagrange equation with a = 1 is

which gives

providing the flow equations for the (an)harmonic oscillator with respect to proper time

This flow represents a harmonic oscillator modified by the γ(x) factor, due to time dilation, multiplying the spring force term.  Therefore, at relativistic speeds, the oscillator is no longer harmonic even though the spring constant remains truly a constant.  The term in parentheses effectively softens the spring for larger displacement, and hence the frequency of oscillation becomes smaller. 

The state-space diagram of the anharmonic oscillator is shown in Fig. 3 with respect to proper time (the time read on a clock co-moving with the oscillator mass).  At low energy, the oscillator is harmonic with a natural period of the SHO.  As the maximum speed exceeds β = 0.8, the period becomes longer and the trajectory less sinusoidal.  The position and speed for β = 0.9999 is shown in Fig. 4.  The mass travels near the speed of light as it passes the origin, producing significant time dilation at that instant.  The average time dilation through a single cycle is about a factor of three, despite the large instantaneous γ = 70 when the mass passes the origin.

Fig. 3 State-space diagram in relativistic units relative to proper time of a relativistic (an)harmonic oscillator with a constant spring constant for several relative speeds β. The anharmonicity becomes pronounced above β = 0.8.
Fig. 4 Position and speed in relativistic units relative to proper time of a relativistic (an)harmonic oscillator with a constant spring constant for β = 0.9999.  The period of oscillation in this simulation is nearly three times longer than the natural frequency at small amplitudes.
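A minimal numerical sketch of this behavior, written in coordinate time rather than the proper time used in the figures, and in relativistic units m = c = k = 1, integrates the momentum form of the equation of motion, d(γmv)/dt = -kx, with a truly constant spring constant:

import numpy as np
from scipy import integrate
from matplotlib import pyplot as plt

# Relativistic mass on a linear spring, integrated in coordinate time (m = c = k = 1)
def rel_osc(state, t):
    x, p = state                        # position and relativistic momentum
    v = p / np.sqrt(1 + p**2)           # v = p/(gamma m) with m = c = 1
    return [v, -x]                      # dp/dt = -k x with constant k

t = np.linspace(0, 60, 6000)
for x_max in [0.1, 2.0, 5.0]:           # larger amplitude pushes the peak speed toward c
    traj = integrate.odeint(rel_osc, [x_max, 0.0], t)
    plt.plot(t, traj[:, 0] / x_max)     # normalized position: the period visibly lengthens
plt.show()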

By David D. Nolte, May 29, 2022


[1] W. Moreau, R. Easther, and R. Neutze, “Relativistic (an)harmonic oscillator,” American Journal of Physics, vol. 62, no. 6, pp. 531-535, Jun. 1994

[2] D. D. Nolte, Introduction to Modern Dynamics: Chaos, Networks, Space and Time, 2nd. ed. (Oxford University Press, 2019)


This Blog Post is a Companion to the undergraduate physics textbook Modern Dynamics: Chaos, Networks, Space and Time, 2nd ed. (Oxford, 2019) introducing Lagrangians and Hamiltonians, chaos theory, complex systems, synchronization, neural networks, econophysics and Special and General Relativity.

Limit-Cycle Oscillators: The Fast and the Slow of Grandfather Clocks

Imagine in your mind the stately grandfather clock.  The long slow pendulum swinging back and forth so purposefully with such majesty.  It harks back to slower simpler times—seemingly Victorian in character, although their origins go back to Christiaan Huygens in 1656.  In introductory physics classes the dynamics of the pendulum is taught as one of the simplest simple harmonic oscillators, only a bit more complicated than a mass on a spring.

But don’t be fooled!  This simplicity is an illusion, for the pendulum clock lies at the heart of modern dynamics.  It is a nonlinear autonomous oscillator with system gain that balances dissipation to maintain a dynamic equilibrium that ticks on resolutely as long as some energy source can continue to supply it (like the heavy clock weights).

This analysis has converted the two-dimensional dynamics of the autonomous oscillator to a simple one-dimensional dynamics with a stable fixed point.

The dynamic equilibrium of the grandfather clock is known as a limit cycle, and limit cycles are the central feature of autonomous oscillators.  Autonomous oscillators are one of the building blocks of complex systems, providing the fundamental elements for biological oscillators, neural networks, business cycles, population dynamics, viral epidemics, and even the rings of Saturn.  The most famous autonomous oscillator (after the pendulum clock) is named for a Dutch physicist, Balthasar van der Pol (1889 – 1959), who discovered the laws that govern how electrons oscillate in vacuum tubes.  But this highly specialized physics problem has expanded to become the new guiding paradigm for the fundamental oscillating element of modern dynamics—the van der Pol oscillator.

The van der Pol Oscillator

The van der Pol (vdP) oscillator begins as a simple harmonic oscillator (SHO) in which the dissipation (loss of energy) is flipped to become gain of energy.  This is as simple as flipping the sign of the damping term in the SHO

where β is positive.  This 2nd-order ODE is re-written into a dynamical flow as

where γ = β/m is the system gain.  Clearly, the dynamics of this SHO with gain would lead to run-away as the oscillator grows without bound.             

But no real-world system can grow indefinitely.  It has to eventually be limited by things such as inelasticity.  One of the simplest ways to include such a limiting process in the mathematical model is to make the gain get smaller at larger amplitudes.  This can be accomplished by making the gain a function of the amplitude x as

When the amplitude x gets large, the gain decreases, becoming zero and changing sign when x = 1.  Putting this amplitude-dependent gain into the SHO equation yields

This is the van der Pol equation.  It is the quintessential example of a nonlinear autonomous oscillator.            
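For reference, a standard form consistent with the flow integrated in the code below is

$$ \ddot{x} - \epsilon\,(1 - x^{2})\,\dot{x} + \omega_0^{2}\,x = 0 $$

or, written as a two-dimensional flow, dx/dt = y and dy/dt = -ω0²x + ε(1 - x²)y, which is the form used in flow_deriv with α = ω0² and β = ε.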

When the parameter ε is large, the vdP oscillator can behave in strongly nonlinear ways, with strongly nonharmonic oscillations.  An example is shown in Fig. 1 and Fig. 2 for ω0 = 5 and ε = 2.5.  The oscillation is clearly non-harmonic.

Fig. 1 Time trace of the position and velocity of the vdP oscillator with w0 = 5 and ε = 2.5.
Fig. 2 State-space portrait of the vdP flow lines for w0 = 5 and ε = 2.5.
#!/usr/bin/env python3
# -*- coding: utf-8 -*-
"""
Created on Mon Apr 16 07:38:57 2018
@author: David Nolte
D. D. Nolte, Introduction to Modern Dynamics: Chaos, Networks, Space and Time, 2nd ed. (Oxford,2019)
"""
import numpy as np
from scipy import integrate
from matplotlib import pyplot as plt

plt.close('all')

def solve_flow(param,lim = [-3,3,-3,3],max_time=10.0):
# van der pol 2D flow 
    def flow_deriv(x_y, t0, alpha,beta):
        x, y = x_y
        return [y,-alpha*x+beta*(1-x**2)*y]
    
    plt.figure()
    xmin = lim[0]
    xmax = lim[1]
    ymin = lim[2]
    ymax = lim[3]
    plt.axis([xmin, xmax, ymin, ymax])

    N=144
    colors = plt.cm.prism(np.linspace(0, 1, N))
    
    x0 = np.zeros(shape=(N,2))
    ind = -1
    for i in range(0,12):
        for j in range(0,12):
            ind = ind + 1;
            x0[ind,0] = ymin-1 + (ymax-ymin+2)*i/11
            x0[ind,1] = xmin-1 + (xmax-xmin+2)*j/11
             
    # Solve for the trajectories
    t = np.linspace(0, max_time, int(250*max_time))
    x_t = np.asarray([integrate.odeint(flow_deriv, x0i, t, param)
                      for x0i in x0])

    for i in range(N):
        x, y = x_t[i,:,:].T
        lines = plt.plot(x, y, '-', c=colors[i])
        plt.setp(lines, linewidth=1)

    plt.title('van der Pol Oscillator')
    plt.savefig('Flow2D')
    plt.show()
    
    return t, x_t

def solve_flow2(param,max_time=20.0):
# van der pol 2D flow 
    def flow_deriv(x_y, t0, alpha,beta):
        #"""Compute the time-derivative of a Medio system."""
        x, y = x_y
        return [y,-alpha*x+beta*(1-x**2)*y]
    model_title = 'van der Pol Oscillator'
    x0 = np.zeros(shape=(2,))
    x0[0] = 0
    x0[1] = 4.5
    
    # Solve for the trajectories
    t = np.linspace(0, max_time, int(250*max_time))
    x_t = integrate.odeint(flow_deriv, x0, t, param)
 
    return t, x_t

param = (5, 2.5)             # van der Pol
lim = (-7,7,-10,10)

t, x_t = solve_flow(param,lim)

t, x_t = solve_flow2(param)
plt.figure(2)
lines = plt.plot(t, x_t[:,0], t, x_t[:,1], '-')   # time traces of position and velocity
plt.show()

Separation of Time Scales

Nonlinear systems can have very complicated behavior that may be difficult to address analytically.  This is why the numerical ODE solver is a central tool of modern dynamics.  But there is a very neat analytical trick that can be applied to tame the nonlinearities (if they are not too large) and simplify the autonomous oscillator.  This trick is called separation of time scales (also known as secular perturbation theory)—it looks for simultaneous fast and slow behavior within the dynamics.  An example of fast and slow time scales in a well-known dynamical system is found in the simple spinning top in which nutation (fast oscillations) are superposed on precession (slow oscillations).             

For the autonomous van der Pol oscillator the fast time scale is the natural oscillation frequency, while the slow time scale is the approach to the limit cycle.  Let’s assign t0 = t and t1 = εt, where ε is a small parameter.  Then t0 is the fast time scale (the natural oscillation) and t1 is the slow time scale (the approach to the limit cycle).  The solution in terms of these time scales is

where x0 is the lowest-order response, carrying the fast oscillation with slowly varying coefficients that act as an envelope, and x1 is the first-order correction. The total differential is

Similarly, to obtain a second derivative

Therefore, the vdP equation in terms of x0 and x1 is

to lowest order. Now separate the orders to zeroth and first orders in ε, respectively,

Solve the first equation (a simple harmonic oscillator)

and plug that solution into the right-hand side of the second equation to give

The key to secular perturbation theory is to confine dynamics to their own time scales.  In other words, the slow dynamics provide the envelope that modulates the fast carrier frequency.  The envelope dynamics are contained in the time dependence of the coefficients A and B.  Furthermore, the correction x1 must remain bounded (no secularly growing terms), which requires the resonant terms in the last equation to vanish.  Therefore, the dynamical equations for the envelope functions are

These can be transformed into polar coordinates. Because the envelope functions do not depend on the fast time scale, the time derivatives are

With these expressions, the slow dynamics become

where the angular velocity in the fast variable is equal to zero, leaving only the angular velocity of the unperturbed oscillator. (This is analogous to the rotating wave approximation (RWA) in optics, and also equivalent to studying the dynamics in the rotating frame of the unperturbed oscillator.)

Making a final substitution ρ = R/2 gives a very simple set of dynamical equations

These final equations capture the essential properties of the relaxation of the dynamics to the limit cycle. To lowest order (when the gain is weak) the angular frequency is unaffected, and the system oscillates at the natural frequency. The amplitude of the limit cycle equals 1. A deviation in the amplitude from 1 decays slowly back to the limit cycle making it a stable fixed point in the radial dynamics. This analysis has converted the two-dimensional dynamics of the autonomous oscillator to a simple one-dimensional dynamics with a stable fixed point on the radius variable. The phase-space portrait of this simplified autonomous oscillator is shown in Fig. 3. What could be simpler? This simplified autonomous oscillator can be found as a fundamental element of many complex systems.
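In the standard treatment of the weakly nonlinear vdP oscillator, these final slow-flow equations take the form

$$ \frac{d\rho}{dt_{1}} = \frac{\rho}{2}\left(1 - \rho^{2}\right), \qquad \frac{d\phi}{dt_{1}} = 0 $$

with the limit cycle at ρ = 1 appearing as a stable fixed point of the radial equation while the phase advances at the unperturbed frequency.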

Fig. 3 The state-space diagram of the simplified autonomous oscillator. Initial conditions relax onto the limit cycle. (Reprinted from Introduction to Modern Dynamics (Oxford, 2019) on pg. 8)

Further Reading

D. D. Nolte, Introduction to Modern Dynamics: Chaos, Networks, Space and Time, 2nd edition (Oxford University Press, 2019)

Pikovsky, A. S., M. G. Rosenblum and J. Kurths (2003). Synchronization: A Universal Concept in Nonlinear Science. Cambridge: Cambridge University Press.


This Blog Post is a Companion to the undergraduate physics textbook Modern Dynamics: Chaos, Networks, Space and Time, 2nd ed. (Oxford, 2019) introducing Lagrangians and Hamiltonians, chaos theory, complex systems, synchronization, neural networks, econophysics and Special and General Relativity.