A Short History of Chaos Theory

Chaos seems to rule our world.  Weather events, natural disasters, economic volatility, empire building—all these contribute to the complexities that buffet our lives.  It is no wonder that ancient man attributed the chaos to the gods or to the fates, infinitely far from anything we can comprehend as cause and effect.  Yet there is a balm to soothe our wounds from the slings of life—Chaos Theory—if not to solve our problems, then at least to understand them.

(Sections of this blog have been excerpted from the book Galileo Unbound, published by Oxford University Press)

Chaos Theory is the theory of complex systems governed by multiple factors that produce complicated outputs.  The power of the theory is its ability to recognize when the complicated outputs are not “random”, no matter how complicated they are, but are in fact determined by the inputs.  Furthermore, chaos theory finds structures and patterns within the output—like the fractal structures known as “strange attractors”.  Far from being random, these patterns tell us about the internal mechanics of the system, and they tell us where to look “on average” for the system behavior. 

In other words, chaos theory tames the chaos, and we no longer need to blame gods or the fates.

Henri Poincaré (1889)

The first glimpse of the inner workings of chaos was made by accident when Henri Poincaré responded to a mathematics competition held in honor of the King of Sweden.  The challenge was to prove whether the solar system was absolutely stable, or whether there was a danger that one day the Earth would be flung from its orbit.  Poincaré had already been thinking about the stability of dynamical systems so he wrote up his solution to the challenge and sent it in, believing that he had indeed proven that the solar system was stable.

His entry to the competition was the most convincing, so he was awarded the prize and instructed to submit the manuscript for publication.  The paper was already at the printers and coming off the presses when Poincaré was asked by the competition organizer to check one last part of the proof, which one of the reviewers had questioned, relating to homoclinic orbits.

Fig. 1 A homoclinic orbit is an orbit in phase space that leaves a saddle point and returns to that same point.

To Poincaré’s horror, as he checked his results against the reviewer’s comments, he found that he had made a fundamental error, and in fact the solar system would never be stable.  The problem that he had overlooked had to do with the way that orbits can cross above or below each other on successive passes, leading to a tangle of orbital trajectories that crisscrossed each other in a fine mesh.  This is known as the “homoclinic tangle”: it was the first glimpse that deterministic systems could lead to unpredictable results. Most importantly, he had developed the first mathematical tools that would be needed to analyze chaotic systems—such as the Poincaré section—but nearly half a century would pass before these tools would be picked up again. 

Poincaré paid out of his own pocket for the first printing to be destroyed and for the corrected version of his manuscript to be printed in its place [1]. No-one but the competition organizers and reviewers ever saw his first version.  Yet it was when he was correcting his mistake that he stumbled on chaos for the first time, which is what posterity remembers him for. This little episode in the history of physics went undiscovered for a century before being brought to light by Barrow-Green in her 1997 book Poincaré and the Three Body Problem [2].

Fig. 2 Henri Poincaré’s homoclinic tangle from the Standard Map. (The picture on the right is the Poincaré crater on the moon). For more details, see my blog on Poincaré and his Homoclinic Tangle.

Cartwright and Littlewood (1945)

During World War II, self-oscillations and nonlinear dynamics became strategic topics for the war effort in England. High-power magnetrons were driving long-range radar, keeping Britain alert to Luftwaffe bombing raids, and the tricky dynamics of these oscillators could be represented as a driven van der Pol oscillator. These oscillators had been studied in the 1920’s by the Dutch physicist Balthasar van der Pol (1889–1959) when he was completing his PhD thesis at the University of Utrecht on the topic of radio transmission through ionized gases. Van der Pol had built a short-wave triode oscillator to perform experiments on radio diffraction to compare with his theoretical calculations of radio transmission. Van der Pol’s triode oscillator was an engineering feat that produced the shortest wavelengths of the day, making van der Pol intimately familiar with the operation of the oscillator, and he proposed a general form of differential equation for the triode oscillator.

Fig. 3 Driven van der Pol oscillator equation.

Research on the radar magnetron led to theoretical work on driven nonlinear oscillators, including the discovery that a driven van der Pol oscillator could break up into wild and intermittent patterns. This “bad” behavior of the oscillator circuit (bad for radar applications) was the first discovery of chaotic behavior in man-made circuits.
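The driven van der Pol equation (Fig. 3) is easy to explore numerically today. Here is a minimal sketch using a fixed-step fourth-order Runge-Kutta integrator; the parameter values are illustrative choices (not the wartime magnetron parameters) that put the oscillator in its strongly nonlinear regime.

```python
import math

MU, F, W = 8.53, 1.2, 2 * math.pi / 10   # illustrative parameter choices

def rhs(t, x, v):
    """Driven van der Pol: x'' - MU*(1 - x^2)*x' + x = F*cos(W*t)."""
    return v, MU * (1.0 - x * x) * v - x + F * math.cos(W * t)

def rk4_step(t, x, v, dt):
    """One fixed-step fourth-order Runge-Kutta step."""
    k1x, k1v = rhs(t, x, v)
    k2x, k2v = rhs(t + dt/2, x + dt/2 * k1x, v + dt/2 * k1v)
    k3x, k3v = rhs(t + dt/2, x + dt/2 * k2x, v + dt/2 * k2v)
    k4x, k4v = rhs(t + dt, x + dt * k3x, v + dt * k3v)
    return (x + dt * (k1x + 2*k2x + 2*k3x + k4x) / 6,
            v + dt * (k1v + 2*k2v + 2*k3v + k4v) / 6)

t, x, v, dt = 0.0, 0.1, 0.0, 0.001
trajectory = []
for _ in range(200000):                  # two hundred time units
    x, v = rk4_step(t, x, v, dt)
    t += dt
    trajectory.append((t, x))
```

Plotting x against its own delayed copy (or against v) reveals the irregular, never-repeating waveform that made the circuit "bad" for radar.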

These irregular properties of the driven van der Pol equation were studied by Mary Lucy Cartwright (1900–1998) (the first woman mathematician to be elected a Fellow of the Royal Society) and John Littlewood (1885–1977) at Cambridge, who showed that the coexistence of two periodic solutions implied that discontinuously recurrent motion—in today’s parlance, chaos—could result, which was clearly undesirable for radar applications.  The work of Cartwright and Littlewood [3] later inspired the work by Levinson and Smale as they introduced the field of nonlinear dynamics.

Fig. 4 Mary Cartwright

Andrey Kolmogorov (1954)

The passing of the Russian dictator Joseph Stalin provided a long-needed opening for Soviet scientists to travel again to international conferences where they could meet with their western colleagues to exchange ideas.  Four Russian mathematicians were allowed to attend the 1954 International Congress of Mathematics (ICM) held in Amsterdam, the Netherlands.  One of those was Andrey Nikolaevich Kolmogorov (1903 – 1987) who was asked to give the closing plenary speech.  Despite the isolation of Russia during the Soviet years before World War II and later during the Cold War, Kolmogorov was internationally renowned as one of the greatest mathematicians of his day.

By 1954, Kolmogorov’s interests had spread into topics in topology, turbulence and logic, but no one was prepared for the topic of his plenary lecture at the ICM in Amsterdam.  Kolmogorov spoke on the dusty old topic of Hamiltonian mechanics.  He even apologized at the start for speaking on such an old topic when everyone had expected him to speak on probability theory.  Yet, in the length of only half an hour he laid out a bold and brilliant outline to a proof that the three-body problem had an infinity of stable orbits.  Furthermore, these stable orbits provided impenetrable barriers to the diffusion of chaotic motion across the full phase space of the mechanical system. The crucial consequences of this short talk were lost on almost everyone who attended as they walked away after the lecture, but Kolmogorov had discovered a deep lattice structure that constrained the chaotic dynamics of the solar system.

Kolmogorov’s approach used a result from number theory that provides a measure of how close an irrational number is to a rational one.  This is an important question for orbital dynamics, because whenever the ratio of two orbital periods is a ratio of integers, especially when the integers are small, then the two bodies will be in a state of resonance, which was the fundamental source of chaos in Poincaré’s stability analysis of the three-body problem.  After Kolmogorov had boldly presented his results at the ICM of 1954 [4], what remained was the necessary mathematical proof of Kolmogorov’s daring conjecture.  This would be provided by one of his students, V. I. Arnold, a decade later.  But before the mathematicians could settle the issue, an atmospheric scientist, using one of the first electronic computers, rediscovered Poincaré’s tangle, this time in a simplified model of the atmosphere.

Edward Lorenz (1963)

In 1960, with the help of a friend at MIT, the atmospheric scientist Edward Lorenz purchased a Royal McBee LGP-30 tabletop computer to make calculations with a simplified model he had derived for the weather.  The McBee used 113 of the latest miniature vacuum tubes and also had 1450 of the new solid-state diodes made of semiconductors rather than tubes, which helped reduce the size further, as well as reducing heat generation.  The McBee had a clock rate of 120 kHz and operated on 31-bit numbers with a 15 kB memory.  Under full load it used 1500 Watts of power to run.  But even with a computer in hand, the atmospheric equations needed to be simplified to make the calculations tractable.  Lorenz simplified the number of atmospheric equations down to twelve, and he began programming his Royal McBee. 

Progress was good, and by 1961, he had completed a large initial numerical study.  One day, as he was testing his results, he decided to save time by starting the computations midway by using mid-point results from a previous run as initial conditions.  He typed in the three-digit numbers from a paper printout and went down the hall for a cup of coffee.  When he returned, he looked at the printout of the twelve variables and was disappointed to find that they were not related to the previous full-time run.  He immediately suspected a faulty vacuum tube, as often happened.  But as he looked closer at the numbers, he realized that, at first, they tracked very well with the original run, but then began to diverge more and more rapidly until they lost all connection with the first-run numbers.  The internal numbers of the McBee had a precision of 6 decimal places, but the printer only printed three to save time and paper.  His initial conditions were correct to a part in a thousand, but this small error was magnified exponentially as the solution progressed.  When he printed out the full six digits (the resolution limit for the machine), and used these as initial conditions, the original trajectory returned.  There was no mistake.  The McBee was working perfectly.

At this point, Lorenz recalled that he “became rather excited”.  He was looking at a complete breakdown of predictability in atmospheric science.  If radically different behavior arose from the smallest errors, then no measurements would ever be accurate enough to be useful for long-range forecasting.  At a more fundamental level, this was a break with a long-standing tradition in science and engineering that clung to the belief that small differences produced small effects.  What Lorenz had discovered, instead, was that the deterministic solution to his 12 equations was exponentially sensitive to initial conditions (known today as SIC). 

The more Lorenz became familiar with the behavior of his equations, the more he felt that the 12-dimensional trajectories had a repeatable shape.  He tried to visualize this shape, to get a sense of its character, but it is difficult to visualize things in twelve dimensions, and progress was slow, so he simplified his equations even further to three variables that could be represented in a three-dimensional graph [5]. 

Fig. 5 Two-dimensional projection of the three-dimensional Lorenz Butterfly.

V. I. Arnold (1964)

Meanwhile, back in Moscow, an energetic and creative young mathematics student knocked on Kolmogorov’s door looking for an advisor for his undergraduate thesis.  The youth was Vladimir Igorevich Arnold (1937 – 2010), who showed promise, so Kolmogorov took him on as his advisee.  They worked on the surprisingly complex properties of the mapping of a circle onto itself, which Arnold filed as his dissertation in 1959.  The circle map holds close similarities with the periodic orbits of the planets, and this problem led Arnold down a path that drew tantalizingly close to Kolmogorov’s conjecture on Hamiltonian stability.  Arnold continued in his PhD with Kolmogorov, solving Hilbert’s 13th problem by showing that every function of n variables can be represented by continuous functions of a single variable.  Arnold was appointed as an assistant in the Faculty of Mechanics and Mathematics at Moscow State University.

Arnold’s habilitation topic was Kolmogorov’s conjecture, and his approach used the same circle map that had played an important role in solving Hilbert’s 13th problem.  Kolmogorov neither encouraged nor discouraged Arnold to tackle his conjecture.  Arnold was led to it independently by the similarity of the stability problem with the problem of continuous functions.  In reference to his shift to this new topic for his habilitation, Arnold stated “The mysterious interrelations between different branches of mathematics with seemingly no connections are still an enigma for me.”  [6] 

Arnold began with the problem of attracting and repelling fixed points in the circle map and made a fundamental connection to the theory of invariant properties of action-angle variables.  These provided a key element in the proof of Kolmogorov’s conjecture.  In late 1961, Arnold submitted his results to the leading Soviet physics journal—which promptly rejected it because he used forbidden terms for the journal, such as “theorem” and “proof”, and he had used obscure terminology that would confuse their usual physicist readership, terminology such as “Lebesgue measure”, “invariant tori” and “Diophantine conditions”.  Arnold withdrew the paper.
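The circle map at the center of this story can be explored directly. The sketch below iterates the sine-circle map in one common modern form (not necessarily Arnold's own notation) and estimates its winding number, the average rotation per iterate; inside a mode-locked "Arnold tongue" the winding number sticks at a rational value.

```python
import math

def winding_number(omega, K, n=10000):
    """Average rotation per iterate of the sine-circle map
    theta -> theta + omega + (K / 2*pi) * sin(2*pi * theta)."""
    theta = 0.0
    for _ in range(n):
        theta += omega + (K / (2 * math.pi)) * math.sin(2 * math.pi * theta)
    return theta / n

# K = 0: a bare rotation, so the winding number equals omega exactly;
# K = 0.9 near omega = 1/2: the map mode-locks onto winding number 1/2
w_free = winding_number(0.3, 0.0)
w_locked = winding_number(0.5, 0.9)
```

Sweeping omega at fixed K > 0 shows the winding number forming a "devil's staircase" of rational plateaus, the fingerprint of resonance in the circle map.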

Arnold later incorporated an approach pioneered by Jürgen Moser [7] and published a definitive article on the problem of small divisors in 1963 [8].  The combined work of Kolmogorov, Arnold and Moser had finally established the stability of irrational orbits in the three-body problem, the most irrational and hence most stable orbit having the frequency of the golden mean.  The term “KAM theory”, using the first initials of the three theorists, was coined in 1968 by B. V. Chirikov, who also introduced in 1969 what has become known as the Chirikov map (also known as the Standard map) that reduced the abstract circle maps of Arnold and Moser to simple iterated functions that any student can program easily on a computer to explore KAM invariant tori and the onset of Hamiltonian chaos, as in Fig. 6 [9]. 
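The Standard Map really is simple to program. A minimal sketch, using the common convention with action J, angle θ taken modulo 2π, and perturbation ε:

```python
import math

def standard_map_orbit(theta0, J0, eps, n=1000):
    """Iterate the Chirikov Standard Map
         J_{n+1}     = J_n + eps * sin(theta_n)
         theta_{n+1} = (theta_n + J_{n+1}) mod 2*pi
    and return the orbit as (theta, J) pairs."""
    theta, J = theta0, J0
    orbit = []
    for _ in range(n):
        J = J + eps * math.sin(theta)
        theta = (theta + J) % (2.0 * math.pi)
        orbit.append((theta, J))
    return orbit

# below the critical perturbation (eps_c = 0.97) the surviving KAM tori
# confine even a chaotic orbit to a bounded band of action J
orbit = standard_map_orbit(0.1, 0.0, 0.5)
```

Scatter-plotting many such orbits for increasing ε reproduces pictures like Fig. 6: island chains around elliptical points, chaotic layers near hyperbolic points, and the last golden-mean tori dissolving near ε = 0.97.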

Fig. 6 The Chirikov Standard Map when the last stable orbits are about to dissolve for ε = 0.97.

Stephen Smale (1967)

Stephen Smale was at the end of a post-graduate fellowship from the National Science Foundation when he went to Rio to work with Mauricio Peixoto.  Smale and Peixoto had met in Princeton in 1960 where Peixoto was working with Solomon Lefschetz  (1884 – 1972) who had an interest in oscillators that sustained their oscillations in the absence of a periodic force.  For instance, a pendulum clock driven by the steady force of a hanging weight is a self-sustained oscillator.  Lefschetz was building on work by the Russian Aleksandr A. Andronov (1901 – 1952) who worked in the secret science city of Gorky in the 1930’s on nonlinear self-oscillations using Poincaré’s first return map.  The map converted the continuous trajectories of dynamical systems into discrete numbers, simplifying problems of feedback and control. 

The central question of mechanical control systems, even self-oscillating systems, was how to attain stability.  By combining approaches of Poincaré and Lyapunov, as well as developing their own techniques, the Gorky school became world leaders in the theory and applications of nonlinear oscillations.  Andronov published a seminal textbook in 1937, The Theory of Oscillations, with his colleagues Vitt and Khaykin, and Lefschetz obtained and translated the book into English in 1947, introducing it to the West.  When Peixoto returned to Rio, his interest in nonlinear oscillations captured the imagination of Smale even though his main mathematical focus was on problems of topology.  On the beach in Rio, Smale had an idea that topology could help prove whether systems had a finite number of periodic points.  Peixoto had already proven this for two dimensions, but Smale wanted to find a more general proof for any number of dimensions.

Norman Levinson (1912 – 1975) at MIT became aware of Smale’s interests and sent off a letter to Rio in which he suggested that Smale should look at Levinson’s work on the triode self-oscillator (a van der Pol oscillator), as well as the work of Cartwright and Littlewood who had discovered quasi-periodic behavior hidden within the equations.  Smale was puzzled but intrigued by Levinson’s paper that had no drawings or visualization aids, so he started scribbling curves on paper that bent back upon themselves in ways suggested by the van der Pol dynamics.  During a visit to Berkeley later that year, he presented his preliminary work, and a colleague suggested that the curves looked like strips that were being stretched and bent into a horseshoe. 

Smale latched onto this idea, realizing that the strips were being successively stretched and folded under the repeated transformation of the dynamical equations.  Furthermore, because dynamics can move forward in time as well as backwards, there was a sister set of horseshoes that were crossing the original set at right angles.  As the dynamics proceeded, these two sets of horseshoes were repeatedly stretched and folded across each other, creating an infinite latticework of intersections that had the properties of the Cantor set.  Here was solid proof that Smale’s original conjecture was wrong—the dynamics had an infinite number of periodicities, and they were nested in self-similar patterns in a latticework of points that map out a Cantor-like set of points.  In the two-dimensional case, shown in the figure, the fractal dimension of this lattice is D = ln4/ln3 = 1.26, somewhere in dimensionality between a line and a plane.  Smale’s infinitely nested set of periodic points was the same tangle of points that Poincaré had noticed while he was correcting his King Oscar Prize manuscript.  Smale, using modern principles of topology, was finally able to put rigorous mathematical structure to Poincaré’s homoclinic tangle. Coincidentally, Poincaré had launched the modern field of topology, so in a sense he sowed the seeds to the solution to his own problem.

Fig. 7 The horseshoe takes regions of phase space and stretches and folds them over and over to create a lattice of overlapping trajectories.

Ruelle and Takens (1971)

The onset of turbulence was an iconic problem in nonlinear physics with a long history and a long list of famous researchers studying it.  As far back as the Renaissance, Leonardo da Vinci had made detailed studies of water cascades, sketching whorls upon whorls in charcoal in his famous notebooks.  Heisenberg, oddly, wrote his PhD dissertation on the topic of turbulence even while he was inventing quantum mechanics on the side.  Kolmogorov in the 1940’s applied his probabilistic theories to turbulence, and this statistical approach dominated most studies up to the time when David Ruelle and Floris Takens published a paper in 1971 that took a nonlinear dynamics approach to the problem rather than a statistical one, identifying strange attractors in the nonlinear dynamical Navier-Stokes equations [10].  This paper coined the phrase “strange attractor”.  One of the distinct characteristics of their approach was a short cascade of bifurcations.  A single Hopf bifurcation means the sudden appearance of an oscillation when a parameter is changed slightly.  Ruelle and Takens argued that, in contrast to the prevailing picture of turbulence as the gradual accumulation of many independent frequencies, only a few successive Hopf bifurcations are needed, each adding a new incommensurate frequency, before the motion collapses onto a strange attractor and full chaos emerges.  A few years later Gollub and Swinney experimentally verified this route to turbulence, publishing their results in 1975 [11]. 

Fig. 8 Bifurcation cascade of the logistic map.
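A cascade diagram like Fig. 8 is generated by sampling the long-time attractor of the logistic map x_{n+1} = r x_n (1 − x_n) across a range of growth rates r. A minimal sketch of the sampling at a single r:

```python
def logistic_attractor(r, n_transient=500, n_keep=100):
    """Sample the long-time attractor of x -> r*x*(1 - x) at growth rate r."""
    x = 0.5
    for _ in range(n_transient):         # discard the transient
        x = r * x * (1.0 - x)
    samples = set()
    for _ in range(n_keep):              # collect distinct attractor values
        x = r * x * (1.0 - x)
        samples.add(round(x, 6))
    return sorted(samples)

p1 = logistic_attractor(2.9)   # single fixed point below r = 3
p2 = logistic_attractor(3.2)   # period-2 after the first bifurcation
pc = logistic_attractor(3.9)   # chaotic band near r = 4
```

Plotting the sampled values against r for a few hundred values of r between 2.8 and 4.0 reproduces the forking structure of the figure.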

Feigenbaum (1978)

In 1976, computers were not yet common research tools, although hand-held calculators were.  One of the most famous of this era was the Hewlett-Packard HP-65, and Mitchell Feigenbaum pushed it to its limits.  He was particularly interested in the bifurcation cascade of the logistic map [12]—the way that bifurcations piled on top of bifurcations in a forking structure that showed increasing detail at increasingly fine scales.  Feigenbaum was, after all, a high-energy theorist and had overlapped at Cornell with Kenneth Wilson when Wilson was completing his seminal work on the renormalization group approach to scaling phenomena.  Feigenbaum recognized a strong similarity between the bifurcation cascade and the ideas of real-space renormalization where smaller and smaller boxes were used to divide up space. 

One of the key steps in the renormalization procedure was the need to identify a ratio of the sizes of smaller structures to larger structures.  Feigenbaum began by studying how the bifurcations depended on the increasing growth rate.  He calculated the threshold values r_m for each of the bifurcations, and then took the ratios of the intervals, comparing the previous interval (r_{m-1} – r_{m-2}) to the next interval (r_m – r_{m-1}).  This procedure is like the well-known method to calculate the golden ratio φ = 1.61803 from the Fibonacci series, and Feigenbaum might have expected the golden ratio to emerge from his analysis of the logistic map.  After all, the golden ratio has a scary habit of showing up in physics, just like in the KAM theory.  However, as the bifurcation index m increased in Feigenbaum’s study, this ratio settled down to a limiting value of 4.66920.  Then he did what anyone would do with an unfamiliar number that emerges from a physical calculation—he tried to see if it was a combination of other fundamental numbers, like π and Euler’s number e, and even the golden ratio.  But none of these worked.  He had found a new number that had universal application to chaos theory [13]. 
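This calculation is easy to reproduce today. The sketch below uses the superstable parameter values of the logistic map (the r where x = 1/2 lies on the period-2^m cycle) as a standard stand-in for the bifurcation thresholds, since their intervals shrink with the same limiting ratio; a rough δ of 4.7 seeds each Newton search.

```python
def g(r, m):
    """Iterate the logistic map 2^m times from x = 1/2 and return x - 1/2:
    zero exactly at the superstable parameter of the period-2^m cycle."""
    x = 0.5
    for _ in range(2 ** m):
        x = r * x * (1.0 - x)
    return x - 0.5

def superstable(m, guess):
    """Newton's method with a finite-difference derivative."""
    r, h = guess, 1e-9
    for _ in range(60):
        g0 = g(r, m)
        step = g0 / ((g(r + h, m) - g0) / h)
        r -= step
        if abs(step) < 1e-13:
            break
    return r

# superstable r for periods 1 and 2 are known in closed form
R = [2.0, 1.0 + 5.0 ** 0.5]
for m in range(2, 10):
    seed = R[-1] + (R[-1] - R[-2]) / 4.7   # extrapolate with a rough delta
    R.append(superstable(m, seed))

# the ratio of successive intervals approaches Feigenbaum's 4.66920...
deltas = [(R[m-1] - R[m-2]) / (R[m] - R[m-1]) for m in range(2, len(R))]
delta = deltas[-1]
```

By m = 9 the intervals are a few parts in a million wide, which is why Feigenbaum's hand calculator ran out of digits so quickly.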

Fig. 9 The ratio of the limits of successive cascades leads to a new universal number (the Feigenbaum number).

Gleick (1987)

By the mid-1980’s, chaos theory was seeping into a broadening range of research topics that seemed to span the full breadth of science, from biology to astrophysics, from mechanics to chemistry. A particularly active group of chaos practitioners was J. Doyne Farmer, James Crutchfield, Norman Packard and Robert Shaw, who founded the Dynamical Systems Collective at the University of California, Santa Cruz. One of the important outcomes of their work was a method to reconstruct the state space of a complex system using only its representative time series [14]. Their work helped proliferate the techniques of chaos theory into the mainstream. Many who started using these techniques were only vaguely aware of the field's long history until the science writer James Gleick wrote a best-selling history of the subject that brought chaos theory to the forefront of popular science [15]. And the rest, as they say, is history.
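The reconstruction method of Packard, Crutchfield, Farmer and Shaw builds state vectors from time-delayed copies of a single measured signal. A minimal sketch of the mechanics, with a synthetic series standing in for real data:

```python
import math

def delay_embed(series, dim=3, tau=8):
    """Build state vectors (s_i, s_{i+tau}, ..., s_{i+(dim-1)*tau})
    from a single scalar time series."""
    span = (dim - 1) * tau
    return [tuple(series[i + k * tau] for k in range(dim))
            for i in range(len(series) - span)]

# a synthetic record; a real application would take one measured variable
# of a chaotic system, such as the x-component of the Lorenz model
series = [math.sin(0.1 * n) for n in range(1000)]
vectors = delay_embed(series)
```

Plotted in three dimensions, the reconstructed vectors trace out a geometric object equivalent to the system's attractor, even though only one variable was ever measured.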

References

[1] Poincaré, H. and D. L. Goroff (1993). New methods of celestial mechanics. Edited and introduced by Daniel L. Goroff. New York, American Institute of Physics.

[2] J. Barrow-Green, Poincaré and the three body problem (London Mathematical Society, 1997).

[3] Cartwright, M. L. and J. E. Littlewood (1945). “On the non-linear differential equation of the second order. I. The equation y′′ − k(1 – yˆ2)y′ + y = bλk cos(λt + a), k large.” Journal of the London Mathematical Society 20: 180–9. Discussed in Aubin, D. and A. D. Dalmedico (2002). “Writing the History of Dynamical Systems and Chaos: Longue Durée and Revolution, Disciplines and Cultures.” Historia Mathematica, 29: 273.

[4] Kolmogorov, A. N. (1954). “On conservation of conditionally periodic motions for a small change in Hamilton’s function,” Dokl. Akad. Nauk SSSR (N.S.), 98: 527–30.

[5] Lorenz, E. N. (1963). “Deterministic Nonperiodic Flow.” Journal of the Atmospheric Sciences 20(2): 130–41.

[6] Arnold, V. I. (1997). “From superpositions to KAM theory,” Vladimir Igorevich Arnold. Selected, 60: 727–40.

[7] Moser, J. (1962). “On Invariant Curves of Area-Preserving Mappings of an Annulus,” Nachr. Akad. Wiss. Göttingen Math.-Phys, Kl. II, 1–20.

[8] Arnold, V. I. (1963). “Small denominators and problems of the stability of motion in classical and celestial mechanics (in Russian),” Usp. Mat. Nauk., 18: 91–192; Arnold, V. I. (1964). “Instability of Dynamical Systems with Many Degrees of Freedom.” Doklady Akademii Nauk SSSR 156(1): 9.

[9] Chirikov, B. V. (1969). Research concerning the theory of nonlinear resonance and stochasticity. Institute of Nuclear Physics, Novosibirsk. 4. Note: The Standard Map

J_{n+1} = J_n + ε sin θ_n
θ_{n+1} = θ_n + J_{n+1}

is plotted in Fig. 3.31 in Nolte, Introduction to Modern Dynamics (2015) on p. 139. For small perturbation ε, two fixed points appear along the line J = 0 corresponding to p/q = 1: one is an elliptical point (with surrounding small orbits) and the other is a hyperbolic point where chaotic behavior is first observed. With increasing perturbation, q elliptical points and q hyperbolic points emerge for orbits with winding numbers p/q with small denominators (1/2, 1/3, 2/3 etc.). Other orbits with larger q are warped by the increasing perturbation but are not chaotic. These orbits reside on invariant tori, known as the KAM tori, that do not disintegrate into chaos at small perturbation. The set of KAM tori is a Cantor-like set with non-zero measure, ensuring that stable behavior can survive in the presence of perturbations, such as perturbation of the Earth’s orbit around the Sun by Jupiter. However, with increasing perturbation, orbits with successively larger values of q disintegrate into chaos. The last orbits to survive in the Standard Map are the golden mean orbits with p/q = φ–1 and p/q = 2–φ. The critical value of the perturbation required for the golden mean orbits to disintegrate into chaos is surprisingly large at εc = 0.97.

[10] Ruelle, D. and F. Takens (1971). “On the Nature of Turbulence.” Communications in Mathematical Physics 20(3): 167–92.

[11] Gollub, J. P. and H. L. Swinney (1975). “Onset of Turbulence in a Rotating Fluid.” Physical Review Letters, 35(14): 927–30.

[12] May, R. M. (1976). “Simple Mathematical-Models with very complicated Dynamics.” Nature, 261(5560): 459–67.

[13] M. J. Feigenbaum, “Quantitative Universality for a Class of Non-linear Transformations,” Journal of Statistical Physics 19, 25-52 (1978).

[14] Packard, N.; Crutchfield, J. P.; Farmer, J. Doyne; Shaw, R. S. (1980). “Geometry from a Time Series”. Physical Review Letters. 45 (9): 712–716.

[15] Gleick, J. (1987). Chaos: Making a New Science. New York: Viking. p. 180.

A Short History of Multiple Dimensions

Hyperspace by any other name would sound as sweet, conjuring to the mind’s eye images of hypercubes and tesseracts, manifolds and wormholes, Klein bottles and Calabi-Yau quintics.  Forget the dimension of time—that may be the most mysterious of all—but consider the extra spatial dimensions that challenge the mind and open the door to dreams of going beyond the bounds of today’s physics.

The geometry of n dimensions studies reality; no one doubts that. Bodies in hyperspace are subject to precise definition, just like bodies in ordinary space; and while we cannot draw pictures of them, we can imagine and study them.

(Poincaré 1895)

Here is a short history of hyperspace.  It begins with advances by Möbius and Liouville and Jacobi who never truly realized what they had invented, until Cayley and Grassmann and Riemann made it explicit.  They opened Pandora’s box, and multiple dimensions burst upon the world never to be put back again, giving us today the manifolds of string theory and infinite-dimensional Hilbert spaces.

August Möbius (1827)

Although he is most famous for the single-surface strip that bears his name, one of the early contributions of August Möbius was the idea of barycentric coordinates [1], for instance using three coordinates to express the locations of points in a two-dimensional simplex—the triangle. Barycentric coordinates are used routinely today in metallurgy to describe the alloy composition in ternary alloys.
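Barycentric coordinates express a point of a triangle as three weights that sum to one, which is exactly what a ternary alloy composition needs. A minimal sketch using signed areas (the triangle and test point here are illustrative):

```python
def barycentric(p, a, b, c):
    """Barycentric coordinates (wa, wb, wc) of point p in triangle abc,
    normalized so that wa + wb + wc = 1, computed from signed areas."""
    def cross(o, u, v):
        # twice the signed area of triangle (o, u, v)
        return (u[0] - o[0]) * (v[1] - o[1]) - (u[1] - o[1]) * (v[0] - o[0])
    area = cross(a, b, c)
    wa = cross(p, b, c) / area
    wb = cross(p, c, a) / area
    wc = cross(p, a, b) / area
    return wa, wb, wc

# the centroid of any triangle has coordinates (1/3, 1/3, 1/3)
w = barycentric((1.0, 1.0), (0.0, 0.0), (3.0, 0.0), (0.0, 3.0))
```

The same weights reappear throughout computer graphics for interpolating values across a triangle, a direct descendant of Möbius' idea.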

August Möbius (1790 – 1868).

Möbius’ work was one of the first to hint that tuples of numbers could stand in for higher dimensional space, and they were an early example of homogeneous coordinates that could be used for higher-dimensional representations. However, he was too early to use any language of multidimensional geometry.

Carl Jacobi (1834)

Carl Jacobi was a master at manipulating multiple variables, leading to his development of the theory of matrices. In this context, he came to study (n-1)-fold integrals over multiple continuous-valued variables. From our modern viewpoint, he was evaluating surface integrals of hyperspheres.

Carl Gustav Jacob Jacobi (1804 – 1851)

In 1834, Jacobi found explicit solutions to these integrals and published them in a paper with the imposing title “De binis quibuslibet functionibus homogeneis secundi ordinis per substitutiones lineares in alias binas transformandis, quae solis quadratis variabilium constant; una cum variis theorematis de transformatione et determinatione integralium multiplicium” [2]. The resulting (n-1)-fold integrals are

when the space dimension is even or odd, respectively. These are the surface areas of the manifolds called (n-1)-spheres in n-dimensional space. For instance, the 2-sphere is the ordinary surface 4πr² of a sphere in our 3D space.
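In modern notation (not Jacobi's own), the quantities his integrals evaluate to are the surface areas of the (n−1)-sphere of radius r:

```latex
S_{n-1}(r) = \frac{2\pi^{n/2}}{\Gamma(n/2)}\, r^{n-1}
= \begin{cases}
\dfrac{2\pi^{k}}{(k-1)!}\, r^{2k-1}, & n = 2k,\\[1ex]
\dfrac{2^{2k+1}\pi^{k}\,k!}{(2k)!}\, r^{2k}, & n = 2k+1,
\end{cases}
```

which gives 2πr for the circle (n = 2) and 4πr² for the ordinary sphere (n = 3); the two factorial branches correspond to the even and odd cases of Jacobi's result.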

Despite the fact that we recognize these as surface areas of hyperspheres, Jacobi used no geometric language in his paper. He was still too early, and mathematicians had not yet woken up to the analogy of extending spatial dimensions beyond 3D.

Joseph Liouville (1838)

Joseph Liouville’s name is attached to a theorem that lies at the core of mechanical systems—Liouville’s Theorem that proves that volumes in high-dimensional phase space are incompressible. Surprisingly, Liouville had no conception of high dimensional space, to say nothing of abstract phase space. The story of the convoluted path that led Liouville’s name to be attached to his theorem is told in Chapter 6, “The Tangled Tale of Phase Space”, in Galileo Unbound (Oxford University Press, 2018).

Joseph Liouville (1809 – 1882)

Nonetheless, Liouville did publish a pure-mathematics paper in 1838 in Crelle’s Journal [3] that identified an invariant quantity that stayed constant during the differential change of multiple variables when certain criteria were satisfied. It was only later that Jacobi, as he was developing a new mechanical theory based on William R. Hamilton’s work, realized that the criteria needed for Liouville’s invariant quantity to hold were satisfied by conservative mechanical systems. Even then, neither Liouville nor Jacobi used the language of multidimensional geometry, but that was about to change in a quick succession of papers and books by three mathematicians who, unknown to each other, were all thinking along the same lines.

Facsimile of Liouville’s 1838 paper on invariants

Arthur Cayley (1843)

Arthur Cayley was the first to take the bold step of calling the emerging geometry of multiple variables actual space. His seminal paper “Chapters in the Analytic Theory of n-Dimensions” was published in 1843 in the Philosophical Magazine [4]. Here, for the first time, Cayley recognized that the domain of multiple variables behaved identically to multidimensional space. He used little of the language of geometry in the paper, which was mostly analysis rather than geometry, but his bold declaration for spaces of n-dimensions opened the door to a changing mindset that would soon sweep through geometric reasoning.

Arthur Cayley (1821 – 1895)

Hermann Grassmann (1844)

Grassmann’s life story, although not overly tragic, was beset by lifelong setbacks and frustrations. He was a mathematician 30 years ahead of his time, but because he was merely a high-school teacher, no one took his ideas seriously.

Somehow, in nearly a complete vacuum, disconnected from the professional mathematicians of his day, he devised an entirely new type of algebra that allowed geometric objects to have orientation. These could be combined in numerous different ways obeying numerous different laws. The simplest elements were just numbers, but these could be extended to arbitrary complexity with an arbitrary number of elements. He called his theory a theory of “Extension”, and he self-published a thick and difficult tome that contained all of his ideas [5]. He tried to enlist Möbius to help disseminate his ideas, but even Möbius could not recognize what Grassmann had achieved.

In fact, what Grassmann did achieve was vector algebra of arbitrarily high dimension. Perhaps more impressive for the time is that he actually recognized what he was dealing with. He did not know of Cayley’s work, but independently of Cayley he used geometric language for the first time describing geometric objects in high dimensional spaces. He said, “since this method of formation is theoretically applicable without restriction, I can define systems of arbitrarily high level by this method… geometry goes no further, but abstract science knows no limits.” [6]
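Grassmann's oriented products survive today as the wedge product of exterior algebra. A small sketch in component form (modern notation, not Grassmann's own), showing the two defining properties, antisymmetry and the vanishing of any vector wedged with itself:

```python
import numpy as np

def wedge(a, b):
    """Wedge (exterior) product of two vectors, represented by its
    antisymmetric component matrix (a ∧ b)_ij = a_i b_j - a_j b_i."""
    return np.outer(a, b) - np.outer(b, a)

# Vectors in 4D, where "geometry goes no further, but abstract science knows no limits"
a = np.array([1.0, 2.0, 0.0, 3.0])
b = np.array([0.0, 1.0, 4.0, 1.0])

print(np.allclose(wedge(a, b), -wedge(b, a)))  # True: orientation flips with order
print(np.allclose(wedge(a, a), 0))             # True: a ∧ a = 0
```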

Grassmann was convinced that he had discovered something astonishing and new, which he had, but no one understood him. After years of trying to get mathematicians to listen, he finally gave up, left mathematics behind, and actually achieved some fame within his lifetime in the field of linguistics. There is even a law of diachronic linguistics named after him. For the story of Grassmann’s struggles, see the blog on Grassmann and his Wedge Product.

Hermann Grassmann (1809 – 1877).

Julius Plücker (1846)

Projective geometry sounds like it ought to be a simple topic, like the projective property of perspective art as parallel lines draw together and touch at the vanishing point on the horizon of a painting. But it is far more complex than that, and it provided a separate gateway into the geometry of high dimensions.

A hint of its power comes from homogeneous coordinates of the plane. These are used to find where a point in three dimensions intersects a plane (like the plane of an artist’s canvas). Although the point on the plane is in two dimensions, it takes three homogeneous coordinates to locate it. By extension, if a point is located in three dimensions, then it has four homogeneous coordinates, as if the three-dimensional point were a projection onto 3D from a 4D space.
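The redundancy works because all nonzero scalar multiples of a homogeneous triple name the same point: recovering the planar point just divides out the third coordinate. A minimal sketch (the helper `to_cartesian` is illustrative, not from the sources discussed here):

```python
def to_cartesian(h):
    """Recover the 2D point from homogeneous coordinates (x, y, w), with w != 0."""
    x, y, w = h
    return (x / w, y / w)

# Scaling all three homogeneous coordinates leaves the projective point unchanged:
print(to_cartesian((1.0, 2.0, 1.0)))  # (1.0, 2.0)
print(to_cartesian((2.0, 4.0, 2.0)))  # (1.0, 2.0): the same point
```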

These ideas were pursued by Julius Plücker as he extended projective geometry from the work of earlier mathematicians such as Desargues and Möbius. For instance, the barycentric coordinates of Möbius are a form of homogeneous coordinates. What Plücker discovered is that space does not need to be defined by a dense set of points, but a dense set of lines can be used just as well. The set of lines is represented as a four-dimensional manifold. Plücker reported his findings in a book in 1846 [7] and expanded on the concepts of multidimensional spaces published in 1868 [8].

Julius Plücker (1801 – 1868).

Ludwig Schläfli (1851)

After Plücker, ideas of multidimensional analysis became more common, and Ludwig Schläfli (1814 – 1895), a professor at the University of Berne in Switzerland, was one of the first to fully explore analytic geometry in higher dimensions. He described multidimensional points that were located on hyperplanes, and he calculated the angles between intersecting hyperplanes [9]. He also investigated high-dimensional polytopes, from which is derived our modern “Schläfli notation“. However, Schläfli used his own terminology for these objects, emphasizing analytic properties without using the ordinary language of high-dimensional geometry.

Some of the polytopes studied by Schläfli.

Bernhard Riemann (1854)

The person most responsible for the shift in the mindset that finally accepted the geometry of high-dimensional spaces was Bernhard Riemann. In 1854 at the university in Göttingen he presented his habilitation talk “Über die Hypothesen, welche der Geometrie zu Grunde liegen” (On the hypotheses which underlie geometry). A habilitation in Germany was an examination that qualified an academic to advise their own students (somewhat like attaining tenure in US universities).

The habilitation candidate would suggest three topics, and it was usual for the first or second to be picked. Riemann’s three topics were: the representation of functions by trigonometric series (the subject of his habilitation thesis), aspects of electromagnetic theory, and a throw-away topic that he added at the last minute on the foundations of geometry (on which he had not actually done any serious work). Gauss was his faculty advisor and picked the third topic. Riemann had to develop the topic in a very short time period, starting from scratch. The effort exhausted him mentally and emotionally, and he had to withdraw temporarily from the university to regain his strength. After returning around Easter, he worked furiously for seven weeks to develop a first draft and then asked Gauss to set the examination date. Gauss initially thought to postpone to the Fall semester, but then at the last minute scheduled the talk for the next day. (For the story of Riemann and Gauss, see Chapter 4 “Geometry on my Mind” in the book Galileo Unbound (Oxford, 2018)).

Riemann gave his lecture on 10 June 1854, and it was a masterpiece. He stripped away all the old notions of space and dimensions and imbued geometry with a metric structure that was fundamentally attached to coordinate transformations. He also showed how any set of coordinates could describe space of any dimension, and he generalized ideas of space to include virtually any ordered set of measurables, whether it was of temperature or color or sound or anything else. Most importantly, his new system made explicit what those before him had alluded to: Jacobi, Grassmann, Plücker and Schläfli. Ideas of Riemannian geometry began to percolate through the mathematics world, expanding into common use after Richard Dedekind edited and published Riemann’s habilitation lecture in 1868 [10].
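Riemann's key move, attaching the metric to coordinate transformations, is still how the computation is done today: under a change from Cartesian coordinates, the metric components are g = JᵀJ, where J is the Jacobian of the transformation. A sketch for plane polar coordinates (an illustration in modern language, not Riemann's own calculation):

```python
from math import cos, sin

def polar_metric(r, theta):
    """Metric components in polar coordinates from the Jacobian of
    (r, theta) -> (x, y) = (r cos(theta), r sin(theta)):  g = J^T J."""
    J = [[cos(theta), -r * sin(theta)],
         [sin(theta),  r * cos(theta)]]
    g = [[sum(J[k][i] * J[k][j] for k in range(2)) for j in range(2)]
         for i in range(2)]
    return g  # [[1, 0], [0, r**2]], i.e. ds^2 = dr^2 + r^2 dtheta^2

print(polar_metric(2.0, 0.7))
```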

Bernhard Riemann (1826 – 1866). Image.

Georg Cantor and Dimension Theory (1878)

In discussions of multidimensional spaces, it is important to step back and ask: what is dimension? This question is not as easy to answer as it may seem. In fact, in 1878, Georg Cantor proved that there is a one-to-one mapping of the plane to the line, making it seem that lines and planes are somehow the same. He was so astonished at his own result that he wrote in a letter to his friend Richard Dedekind, “I see it, but I don’t believe it!”. A few decades later, Peano and Hilbert showed how to create space-filling curves, so that a single continuous curve can approach any point in the plane arbitrarily closely, again casting shadows of doubt on the robustness of dimension. These questions of dimensionality would not be put to rest until the work by Karl Menger around 1926, when he provided a rigorous definition of topological dimension (see the Blog on the History of Fractals).

Space-filling curves by Peano and Hilbert.
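The flavor of Cantor's plane-to-line mapping can be captured by interleaving digits: the pair (0.x₁x₂…, 0.y₁y₂…) maps to the single number 0.x₁y₁x₂y₂…. A finite-precision sketch only; Cantor had to treat the ambiguity of dual expansions such as 0.4999… = 0.5 to obtain a true one-to-one map:

```python
def interleave(x, y, digits=8):
    """Map a point (x, y) in the unit square to a single number in [0, 1]
    by interleaving decimal digits (finite precision, illustrative only)."""
    xs = f"{x:.{digits}f}"[2:]   # digit strings after the decimal point
    ys = f"{y:.{digits}f}"[2:]
    merged = "".join(a + b for a, b in zip(xs, ys))
    return float("0." + merged)

print(interleave(0.12345678, 0.87654321))  # digits alternate: 0.18273645...
```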

Hermann Minkowski and Spacetime (1908)

Most of the earlier work on multidimensional spaces was mathematical and geometric rather than physical. One of the first examples of physical hyperspace is the spacetime of Hermann Minkowski. Although Einstein and Poincaré had noted how space and time were coupled by the Lorentz transformations, they did not take the bold step of recognizing space and time as parts of a single manifold. This step was taken in 1908 [11] by Hermann Minkowski, who claimed

“Gentlemen! The views of space and time which I wish to lay before you … They are radical. Henceforth space by itself, and time by itself, are doomed to fade away into mere shadows, and only a kind of union of the two will preserve an independent reality.” (Hermann Minkowski, 1908)

For the story of Einstein and Minkowski, see the Blog on Minkowski’s Spacetime: The Theory that Einstein Overlooked.
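What Minkowski's "union of the two" preserves is the spacetime interval s² = (ct)² − x², which is unchanged by a Lorentz boost even though t and x individually are not. A quick check, in units where c = 1 (an illustrative calculation, not from Minkowski's paper):

```python
from math import sqrt

def boost(t, x, v):
    """Lorentz boost of an event (t, x) by velocity v, in units where c = 1."""
    g = 1.0 / sqrt(1.0 - v * v)      # Lorentz factor gamma
    return g * (t - v * x), g * (x - v * t)

t, x = 3.0, 1.0
t2, x2 = boost(t, x, 0.6)
print(t**2 - x**2, t2**2 - x2**2)    # the interval is the same in both frames
```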

Facsimile of Minkowski’s 1908 publication on spacetime.

Felix Hausdorff and Fractals (1918)

No story of multiple “integer” dimensions can be complete without mentioning the existence of “fractional” dimensions, also known as fractals. The individual most responsible for the concepts and mathematics of fractional dimensions was Felix Hausdorff. Before he was driven to suicide in 1942 by the Nazi persecution of Jews, he was a leading light in the intellectual life of Leipzig, Germany. By day he was a brilliant mathematician; by night he was the author Paul Mongré, writing poetry and plays.

In 1918, as the war was ending, he published the paper “Dimension and Outer Measure”, which established ways to construct sets whose measured dimensions were fractions rather than integers [12]. Benoit Mandelbrot, who coined the term “fractal” in 1975, would later popularize these sets in the 1980s. For the background on a history of fractals, see the Blog A Short History of Fractals.

Felix Hausdorff (1868 – 1942)
Example of a fractal set with embedding dimension DE = 2, topological dimension DT = 1, and fractal dimension DH = 1.585.
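The dimension 1.585 in the caption is the similarity dimension of the Sierpinski gasket: a strictly self-similar set made of N copies of itself, each shrunk by a factor s, has dimension D = log N / log s, here with N = 3 and s = 2. A one-line check:

```python
from math import log

def similarity_dimension(copies, scale):
    """Hausdorff (similarity) dimension of a strictly self-similar set
    built from 'copies' pieces, each shrunk by factor 'scale'."""
    return log(copies) / log(scale)

print(similarity_dimension(3, 2))  # Sierpinski gasket: ~1.585
print(similarity_dimension(4, 3))  # Koch curve: ~1.262
```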


The Fifth Dimension of Theodor Kaluza (1921) and Oskar Klein (1926)

The first theoretical steps to develop a theory of a physical hyperspace (in contrast to merely a geometric hyperspace) were taken by Theodor Kaluza at the University of Königsberg in Prussia. He added an additional spatial dimension to Minkowski spacetime in an attempt to unify the force of gravity with the forces of electromagnetism. Kaluza’s paper was communicated to the journal of the Prussian Academy of Science in 1921 through Einstein, who saw the unification principles as a parallel of some of his own attempts [13]. However, Kaluza’s theory was fully classical and did not include the new quantum theory that was developing at that time in the hands of Heisenberg, Bohr and Born.

Oskar Klein was a Swedish physicist in the “second wave” of quantum physicists, having studied under Bohr. Unaware of Kaluza’s work, Klein developed a quantum theory of a five-dimensional spacetime [14]. For the theory to be self-consistent, it was necessary to roll up the extra dimension into a tight cylinder. This is like a strand of spaghetti—looking at it from far away it looks like a one-dimensional string, but an ant crawling on the spaghetti can move in two dimensions: along the long direction, or looping around it in the short direction, called a compact dimension. Klein’s theory was an early attempt at what would later be called string theory. For the historical background on Kaluza and Klein, see the Blog on Oskar Klein.

The wave equations of Klein-Gordon, Schrödinger and Dirac.

John Campbell (1931): Hyperspace in Science Fiction

Art has a long history of shadowing the sciences, and the math and science of hyperspace was no exception. One of the first mentions of hyperspace in science fiction was in the story “Islands of Space”, by John Campbell [15], published in Amazing Stories Quarterly in 1931, where it was used as an extraordinary means of space travel.

In 1951, Isaac Asimov made travel through hyperspace the transportation network that connected the galaxy in his Foundation Trilogy [16].

Isaac Asimov (1920 – 1992)

John von Neumann and Hilbert Space (1932)

Quantum mechanics had developed rapidly through the 1920’s, but by the early 1930’s it was in need of an overhaul, having outstripped rigorous mathematical underpinnings. These underpinnings were provided by John von Neumann in his 1932 book on quantum theory [17]. This is the book that cemented the Copenhagen interpretation of quantum mechanics, with projection measurements and wave function collapse, while also establishing the formalism of Hilbert space.

Hilbert space is an infinite dimensional vector space of orthogonal eigenfunctions into which any quantum wave function can be decomposed. The physicists of today work and sleep in Hilbert space as their natural environment, often losing sight of its infinite dimensions that don’t seem to bother anyone. Hilbert space is more than a mere geometrical space, but less than a full physical space (like five-dimensional spacetime). Few realize that what is so often ascribed to Hilbert was actually formalized by von Neumann, among his many other accomplishments like stored-program computers and game theory.
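Decomposition in Hilbert space works just like resolving an ordinary vector along orthogonal axes, except the "axes" are orthonormal functions and there are infinitely many of them. A sketch using the sine basis on [0, π] (an illustrative example, not tied to any particular quantum system):

```python
import numpy as np

x = np.linspace(0, np.pi, 2001)
dx = x[1] - x[0]
f = x * (np.pi - x)                  # an arbitrary function to decompose

def phi(n):
    """Orthonormal basis functions on [0, pi]: sqrt(2/pi) * sin(n x)."""
    return np.sqrt(2 / np.pi) * np.sin(n * x)

# Expansion coefficients c_n = <phi_n | f>, with the inner product as a discrete integral
coeffs = [float(np.sum(phi(n) * f) * dx) for n in range(1, 20)]

# Partial reconstruction from just 19 basis functions
f_approx = sum(c * phi(n) for n, c in enumerate(coeffs, start=1))
print(np.max(np.abs(f - f_approx)))  # already tiny: the expansion converges fast
```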

John von Neumann (1903 – 1957)

Einstein-Rosen Bridge (1935)

One of the strangest entities inhabiting the theory of spacetime is the Einstein-Rosen Bridge. It is space folded back on itself in a way that punches a short-cut through spacetime. Einstein, working with his collaborator Nathan Rosen at Princeton’s Institute for Advanced Study, published a paper in 1935 that attempted to solve two problems [18]. The first problem was the Schwarzschild singularity at the radius r = 2GM/c2, known as the Schwarzschild radius or the Event Horizon. Einstein had a distaste for such singularities in physical theory and viewed them as a problem. The second problem was how to apply the theory of general relativity (GR) to point masses like an electron. Again, the GR solution for an electron blows up at the location of the particle at r = 0.

Einstein-Rosen Bridge.

To eliminate both problems, Einstein and Rosen (ER) began with the Schwarzschild metric in its usual form

where it is easy to see that it “blows up” when r = 2GM/c2 as well as at r = 0. ER realized that they could write a new form that bypasses the singularities using the simple coordinate substitution

to yield the “wormhole” metric

It is easy to see that as the new variable u goes from −∞ to +∞, this expression never blows up. The reason is simple—it removes the 1/r singularity by replacing it with 1/(r + ε). Such tricks are used routinely today in computational physics to keep computer calculations from getting too large—avoiding the divide-by-zero problem. It is also a form of regularization familiar from machine learning applications. But in the hands of Einstein, this simple “bypass” is not just math—it can provide a physical solution.
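The steps above can be written out explicitly (a standard reconstruction in modern notation, with 2m = 2GM/c² and units where c = 1). The Schwarzschild metric, the Einstein-Rosen substitution, and the resulting "wormhole" form are:

```latex
% Schwarzschild metric in its usual form
ds^2 = -\left(1 - \frac{2m}{r}\right)dt^2
       + \frac{dr^2}{1 - 2m/r}
       + r^2\left(d\theta^2 + \sin^2\theta\, d\varphi^2\right)

% Einstein-Rosen coordinate substitution
u^2 = r - 2m

% resulting "wormhole" metric, regular for all real u
ds^2 = -\frac{u^2}{u^2 + 2m}\,dt^2
       + 4\left(u^2 + 2m\right)du^2
       + \left(u^2 + 2m\right)^2\left(d\theta^2 + \sin^2\theta\, d\varphi^2\right)
```

As u runs from −∞ to +∞, the radius r = u² + 2m descends from infinity to the Schwarzschild radius and back out again: the two asymptotic regions are the two "sheets" joined by the bridge.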

It is hard to imagine that an article published in the Physical Review, especially one written about a simple variable substitution, would appear on the front page of the New York Times, even appearing “above the fold”, but such was Einstein’s fame that this is exactly the response when he and Rosen published their paper. The reason for the interest was the interpretation of the new equation—when visualized geometrically, it was like a funnel between two separated Minkowski spaces—in other words, what John Wheeler would name a “wormhole” in 1957. Even back in 1935, there was some sense that this new property of space might allow untold possibilities, perhaps even a form of travel through such a short cut.

As it turns out, the ER wormhole is not stable—it collapses on itself in an incredibly short time, so short that not even photons can get through it in time. More recent work on wormholes has shown that they can be stabilized by negative energy density, but ordinary matter cannot have negative energy density. On the other hand, the Casimir effect might provide a type of negative energy density, which raises some interesting questions about quantum mechanics and the ER bridge.

Edward Witten’s 10+1 Dimensions (1995)

A history of hyperspace would not be complete without a mention of string theory and Edward Witten’s unification of the various 10-dimensional string theories into 11-dimensional M-theory. At a string theory conference at USC in 1995 he pointed out that the 5 different string theories of the day were all related through dualities. This observation launched the second superstring revolution that continues today. In this theory, 6 extra spatial dimensions are wrapped up into complex manifolds such as the Calabi-Yau manifold.

Two-dimensional slice of a six-dimensional Calabi-Yau quintic manifold.

Prospects

There is definitely something wrong with our three-plus-one dimensions of spacetime. We claim to have achieved the pinnacle of fundamental physics with what is called the Standard Model and the Higgs boson, but dark energy and dark matter loom as the elephants in the room: giant, gaping, embarrassing and currently unsolved. By some estimates, the fraction of the energy density of the universe comprised of ordinary matter is only 5%. The other 95% is in some form unknown to physics. How can physicists claim to know anything if 95% of everything is in some unknown form?

The answer, perhaps to be uncovered sometime in this century, may be the role of extra dimensions in physical phenomena—probably not in every-day phenomena, and maybe not even in high-energy particles—but in the grand expanse of the cosmos.

By David D. Nolte, Feb. 8, 2023




References:

[1] F. Möbius, in Möbius, F. Gesammelte Werke, D. M. Saendig, Ed. (oHG, Wiesbaden, Germany, 1967), vol. 1, pp. 36-49.

[2] Carl Jacobi, “De binis quibuslibet functionibus homogeneis secundi ordinis per substitutiones lineares in alias binas transformandis, quae solis quadratis variabilium constant; una cum variis theorematis de transformatione et determinatione integralium multiplicium” (1834)

[3] J. Liouville, Note sur la théorie de la variation des constantes arbitraires. Liouville Journal 3, 342-349 (1838).

[4] A. Cayley, Chapters in the analytical geometry of n dimensions. Collected Mathematical Papers 1, 317-326, 119-127 (1843).

[5] H. Grassmann, Die lineale Ausdehnungslehre.  (Wiegand, Leipzig, 1844).

[6] H. Grassmann quoted in D. D. Nolte, Galileo Unbound (Oxford University Press, 2018) pg. 105

[7] J. Plücker, System der Geometrie des Raumes in Neuer Analytischer Behandlungsweise, Insbesondere de Flächen Sweiter Ordnung und Klasse Enthaltend.  (Düsseldorf, 1846).

[8] J. Plücker, On a New Geometry of Space (1868).

[9] L. Schläfli, J. H. Graf, Theorie der vielfachen Kontinuität. Neue Denkschriften der Allgemeinen Schweizerischen Gesellschaft für die Gesammten Naturwissenschaften 38. ([s.n.], Zürich, 1901).

[10] B. Riemann, Über die Hypothesen, welche der Geometrie zu Grunde liegen, Habilitationsvortrag. Göttinger Abhandlung 13,  (1854).

[11] Minkowski, H. (1909). “Raum und Zeit.” Jahresbericht der Deutschen Mathematiker-Vereinigung: 75-88.

[12] Hausdorff, F. (1919). “Dimension und äusseres Mass,” Mathematische Annalen, 79: 157–79.

[13] Kaluza, Theodor (1921). “Zum Unitätsproblem in der Physik”. Sitzungsber. Preuss. Akad. Wiss. Berlin. (Math. Phys.): 966–972

[14] Klein, O. (1926). “Quantentheorie und fünfdimensionale Relativitätstheorie“. Zeitschrift für Physik. 37 (12): 895

[15] John W. Campbell, Jr. “Islands of Space“, Amazing Stories Quarterly (1931)

[16] Isaac Asimov, Foundation (Gnome Press, 1951)

[17] J. von Neumann, Mathematical Foundations of Quantum Mechanics.  (Princeton University Press, ed. 1996, 1932).

[18] A. Einstein and N. Rosen, “The Particle Problem in the General Theory of Relativity,” Phys. Rev. 48(73) (1935).


Interference (New from Oxford University Press: 2023)

Read about the history of light and interference.

Available at Amazon.

Available at Oxford U Press

Available at Barnes & Noble

A Short History of Quantum Entanglement

Despite the many apparent paradoxes posed in physics—the twin and ladder paradoxes of relativity theory, Olbers’ paradox of the bright night sky, Loschmidt’s paradox of irreversible statistical fluctuations—these are resolved by a deeper look at the underlying assumptions: the twin paradox is resolved by considering shifts in reference frames, the ladder paradox is resolved by the loss of simultaneity, Olbers’ paradox is resolved by the finite age of the universe, and Loschmidt’s paradox is resolved by fluctuation theorems.  In each case, no physical principle is violated, and each paradox is fully explained.

However, there is at least one “true” paradox in physics that defies consistent explanation—quantum entanglement.  Quantum entanglement was first described by Einstein with colleagues Podolsky and Rosen in the famous EPR paper of 1935 as an argument against the completeness of quantum mechanics, and it was given its name by Schrödinger the same year in the paper where he introduced his “cat” as a burlesque consequence of entanglement. 

Here is a short history of quantum entanglement [1], from its beginnings in 1935 to the recent 2022 Nobel prize in Physics awarded to John Clauser, Alain Aspect and Anton Zeilinger.

The EPR Papers of 1935

Einstein can be considered the father of quantum mechanics, even over Planck, because of his 1905 derivation of the existence of the photon as a discrete carrier of a quantum of energy (see Einstein versus Planck).  Even so, as Heisenberg and Bohr advanced quantum mechanics in the mid 1920’s, emphasizing the underlying non-deterministic outcomes of measurements, and in particular the notion of instantaneous wavefunction collapse, they pushed the theory in directions that Einstein found increasingly disturbing and unacceptable.

This feature is an excerpt from an upcoming book, Interference: The History of Optical Interferometry and the Scientists Who Tamed Light (Oxford University Press, July 2023), by David D. Nolte.

At the invitation-only Solvay Congresses of 1927 and 1930, where all the top physicists met to debate the latest advances, Einstein and Bohr began a running debate that was epic in the history of physics, as the two top minds went head-to-head before onlookers watching in awe.  Ultimately, Einstein was on the losing end.  Although he was convinced that something was missing in quantum theory, he could not counter all of Bohr’s rejoinders, even as Einstein’s assaults became ever more sophisticated, and he left the field of battle beaten but not convinced.  Several years later he launched his last and ultimate salvo.

Fig. 1 Niels Bohr and Albert Einstein

At the Institute for Advanced Study in Princeton, New Jersey, in the 1930’s Einstein was working with Nathan Rosen and Boris Podolsky when he envisioned a fundamental paradox in quantum theory that occurred when two widely-separated quantum particles were required to share specific physical properties because of simple conservation laws for energy and momentum.  Even Bohr and Heisenberg could not deny the principles of conservation of energy and momentum, and Einstein devised a two-particle system for which these conservation principles led to an apparent violation of Heisenberg’s own uncertainty principle.  He left the details to his colleagues, with Podolsky writing up the main arguments.  They published the paper in the Physical Review in March of 1935 with the title “Can Quantum-Mechanical Description of Physical Reality Be Considered Complete?” [2].  Because of the three names on the paper (Einstein, Podolsky, Rosen), it became known as the EPR paper, and the paradox they presented became known as the EPR paradox.

When Bohr read the paper, he was initially stumped and aghast.  He felt that EPR had shaken the very foundations of the quantum theory that he and his institute had fought so hard to establish.  He also suspected that EPR had made a mistake in their arguments, and he halted all work at his institute in Copenhagen until they could construct a definitive answer.  A few months later, Bohr published a paper in the Physical Review in July of 1935, using the identical title that EPR had used, in which he refuted the EPR paradox [3].  There is not a single equation or figure in the paper, but he used his “awful incantation terminology” to maximum effect, showing that one of the EPR assumptions on the assessment of uncertainties to position and momentum was in error, and he was right.

Einstein was disgusted.  He had hoped that this ultimate argument against the completeness of quantum mechanics would stand the test of time, but Bohr had shot it down within mere months.  Einstein was particularly disappointed with Podolsky, because Podolsky had tried too hard to make the argument specific to position and momentum, leaving a loophole for Bohr to wiggle through, where Einstein had wanted the argument to rest on deeper and more general principles. 

Despite Bohr’s victory, Einstein had been correct in his initial formulation of the EPR paradox, which showed that quantum mechanics did not jibe with common notions of reality.  He and Schrödinger exchanged letters commiserating with each other and encouraging each other in their opposition to Bohr and Heisenberg.  In November of 1935, Schrödinger published a broad, mostly philosophical, paper in Naturwissenschaften [4] in which he amplified the EPR paradox with the use of an absurd—what he called burlesque—consequence of wavefunction collapse that became known as Schrödinger’s Cat.  He also gave the central property of the EPR paradox its name: entanglement.

Ironically, both Einstein’s entanglement paradox and Schrödinger’s Cat, which were formulated originally to be arguments against the validity of quantum theory, have become established quantum tools.  Today, entangled particles are the core workhorses of quantum information systems, and physicists are building larger and larger versions of Schrödinger’s Cat that may eventually merge with the physics of the macroscopic world.

Bohm and Aharonov Tackle EPR

The physicist David Bohm was a rare political exile from the United States.  He was born in the heart of Pennsylvania in the town of Wilkes-Barre, attended Penn State and then the University of California at Berkeley, where he joined Robert Oppenheimer’s research group.  While there, he became deeply involved in the fight for unions and socialism, activities for which he was called before the House Un-American Activities Committee.  He invoked his rights under the Fifth Amendment, for which he was arrested.  Although he was later acquitted, Princeton University fired him from his faculty position, and fearing another arrest, he fled to Brazil, where his US passport was confiscated by American authorities.  He had become a physicist without a country.

Fig. 2 David Bohm

Despite his personal trials, Bohm remained scientifically productive.  He published his influential textbook on quantum mechanics in the midst of his hearings, and after a particularly stimulating discussion with Einstein shortly before he fled the US, he developed and published an alternative version of quantum theory in 1952 that was fully deterministic—removing Einstein’s “God playing dice”—by creating a hidden-variable theory [5].

Hidden-variable theories of quantum mechanics seek to remove the randomness of quantum measurement by assuming that some deeper element of quantum phenomena—a hidden variable—explains each outcome.  But it is also assumed that these hidden variables are not directly accessible to experiment.  In this sense, the quantum theory of Bohr and Heisenberg was “correct” but not “complete”, because there were things that the theory could not predict or explain.

Bohm’s hidden variable theory, based on a quantum potential, was able to reproduce all the known results of standard quantum theory without invoking the random experimental outcomes that Einstein abhorred.  However, it still contained one crucial element that could not sweep away the EPR paradox—it was nonlocal.

Nonlocality lies at the heart of quantum theory.  In its simplest form, the nonlocal nature of quantum phenomenon says that quantum states span spacetime with space-like separations, meaning that parts of the wavefunction are non-causally connected to other parts of the wavefunction.  Because Einstein was fundamentally committed to causality, the nonlocality of quantum theory was what he found most objectionable, and Bohm’s elegant hidden-variable theory, that removed Einstein’s dreaded randomness, could not remove that last objection of non-causality.

After working in Brazil for several years, Bohm moved to the Technion in Israel, where he began a fruitful collaboration with Yakir Aharonov.  In addition to proposing the Aharonov-Bohm effect, in 1957 they reformulated Podolsky’s version of the EPR paradox, which relied on continuous values of position and momentum, replacing it with a much simpler model based on Stern-Gerlach measurements on spins, and extended it to the case of positronium decay into two photons with correlated polarizations.  Bohm and Aharonov reassessed experimental results on positronium decay that had been obtained by Madame Wu in 1950 at Columbia University and found them in full agreement with standard quantum theory.

John Bell’s Inequalities

John Stuart Bell had an unusual start for a physicist.  His family was too poor to give him an education appropriate to his skills, so he enrolled in vocational school, where he took practical classes that included bricklaying.  Working later as a technician in a university lab, he caught the attention of his professors, who sponsored him to attend the university.  With a degree in physics, he began working at CERN as an accelerator designer, where he again caught the attention of his supervisors, who sponsored him to attend graduate school.  He graduated with a PhD and returned to CERN as a card-carrying physicist with all the rights and privileges that entailed.

Fig. 3 John Bell

During his university days, he had been fascinated by the EPR paradox, and he continued thinking about the fundamentals of quantum theory.  On a sabbatical to the Stanford accelerator in 1963 he began putting mathematics to the EPR paradox to see whether any local hidden-variable theory could be compatible with quantum mechanics.  His analysis was fully general, so that it could rule out as-yet-unthought-of hidden-variable theories.  The result of this work was a set of inequalities that must be obeyed by any local hidden-variable theory.  Then he made a simple check using the known results of quantum measurement and showed that his inequalities are violated by quantum systems.  This ruled out the possibility of any local hidden-variable theory (but not Bohm’s nonlocal hidden-variable theory).  Bell published his analysis in 1964 [6] in an obscure journal that almost no one read…except for a curious graduate student at Columbia University who began digging into the fundamental underpinnings of quantum theory against his supervisor’s advice.

Fig. 4 Polarization measurements on entangled photons violate Bell’s inequality.

John Clauser’s Tenacious Pursuit

As a graduate student in astrophysics at Columbia University, John Clauser was supposed to be doing astrophysics.  Instead, he spent his time musing over the fundamentals of quantum theory.  In 1967 Clauser stumbled across Bell’s paper while he was in the library.  The paper caught his imagination, but he also recognized that the inequalities were not experimentally testable, because they required measurements that depended directly on hidden variables, which are not accessible.  He began thinking of ways to construct similar inequalities that could be put to an experimental test, and he wrote about his ideas to Bell, who responded with encouragement.  Clauser wrote up his ideas in an abstract for an upcoming meeting of the American Physical Society, where one of the abstract reviewers was Abner Shimony of Boston University.  Clauser was surprised weeks later when he received a telephone call from Shimony.  Shimony and his graduate student Michael Horne had been thinking along similar lines, and Shimony proposed to Clauser that they join forces.  They met in Boston, where they met Richard Holt, a graduate student at Harvard who was working on experimental tests of quantum mechanics.  Collectively, they devised a new type of Bell inequality that could be put to experimental test [7].  The result has become known as the CHSH Bell inequality (after Clauser, Horne, Shimony and Holt).
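The gap between the local and quantum predictions is easy to compute. For entangled photon polarizations, quantum mechanics predicts correlations E(a, b) = cos 2(a − b) between polarizer angles a and b; the CHSH combination S = E(a,b) − E(a,b′) + E(a′,b) + E(a′,b′) must satisfy |S| ≤ 2 for any local hidden-variable theory, yet the quantum prediction reaches 2√2 at the standard angle settings (a textbook calculation, not the CHSH paper's notation):

```python
from math import cos, radians, sqrt

def E(a, b):
    """Quantum polarization correlation for entangled photons (angles in degrees)."""
    return cos(2 * radians(a - b))

a, ap, b, bp = 0.0, 45.0, 22.5, 67.5            # the optimal CHSH settings
S = E(a, b) - E(a, bp) + E(ap, b) + E(ap, bp)
print(S, 2 * sqrt(2))  # S = 2*sqrt(2) ~ 2.83, exceeding the classical bound of 2
```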

Fig. 5 John Clauser

When Clauser took a post-doc position in Berkeley, he began searching for a way to do the experiments to test the CHSH inequality, even though Holt had a head start at Harvard.  Clauser enlisted the help of Charles Townes, who convinced one of the Berkeley faculty to loan Clauser his graduate student, Stuart Freedman, to help.  Clauser and Freedman performed the experiments using a two-photon optical decay cascade in calcium atoms and found a violation of the CHSH inequality by 5 standard deviations, publishing their result in 1972 [8]. 

Fig. 6 CHSH inequality violated by entangled photons.

Alain Aspect’s Non-locality

Just as Clauser’s life was changed when he stumbled on Bell’s obscure paper in 1967, the paper had the same effect on the life of French physicist Alain Aspect who stumbled on it in 1975.  Like Clauser, he also sought out Bell for his opinion, meeting with him in Geneva, and Aspect similarly received Bell’s encouragement, this time with the hope to build upon Clauser’s work. 

Fig. 7 Alain Aspect

In some respects, Clauser’s conceptual breakthrough had been the CHSH inequality itself, which could be tested experimentally.  The subsequent Clauser-Freedman experiments were not a conclusion but just a beginning, opening the door to deeper tests.  For instance, in the Clauser-Freedman experiments the polarizers were static and the detectors were not widely separated, which allowed the measurements to be time-like separated in spacetime.  Therefore, the fundamentally non-local nature of quantum physics had not yet been tested.

Aspect began a thorough and systematic program, one that would take him nearly a decade to complete, to test the CHSH inequality under conditions of non-locality.  He began with a much brighter source of photon pairs produced using laser excitation of calcium atoms.  This allowed him to perform the experiment in hundreds of seconds instead of the hundreds of hours required by Clauser.  With such a high data rate, Aspect was able to verify violation of the Bell inequality to 10 standard deviations, published in 1981 [9].

However, the real goal was to change the orientations of the polarizers while the photons were in flight to widely separated detectors [10].  This experiment would allow the detection to be space-like separated in spacetime.  The experiments were performed using fast-switching acousto-optic modulators, and the Bell inequality was violated to 5 standard deviations [11].  This was the most stringent test yet performed and the first to fully demonstrate the non-local nature of quantum physics.

Anton Zeilinger: Master of Entanglement

If there is one physicist today whose work encompasses the broadest range of entangled phenomena, it would be the Austrian physicist, Anton Zeilinger.  He began his career in neutron interferometry, but when he was bitten by the entanglement bug in 1976, he switched to quantum photonics because of the superior control that can be exercised using optics over sources and receivers and all the optical manipulations in between.

Fig. 8 Anton Zeilinger

Working with Daniel Greenberger and Michael Horne, Zeilinger took the essential next step past the Bohm two-particle entanglement to consider a 3-particle entangled state that had surprising properties.  While the violation of locality by the two-particle entanglement was observed through the statistical properties of many measurements, the new 3-particle entanglement could show violations on single measurements, further strengthening the arguments for quantum non-locality.  This new state is called the GHZ state (after Greenberger, Horne and Zeilinger) [12].

As the Zeilinger group in Vienna was working towards experimental demonstrations of the GHZ state, Charles Bennett of IBM proposed the possibility for quantum teleportation, using entanglement as a core quantum information resource [13].   Zeilinger realized that his experimental set-up could perform an experimental demonstration of the effect, and in a rapid re-tooling of the experimental apparatus [14], the Zeilinger group was the first to demonstrate quantum teleportation that satisfied the conditions of the Bennett teleportation proposal [15].  An Italian-UK collaboration also made an early demonstration of a related form of teleportation in a paper that was submitted first, but published after Zeilinger’s, due to delays in review [16].  But teleportation was just one of a widening array of quantum applications for entanglement that was pursued by the Zeilinger group over the succeeding 30 years [17], including entanglement swapping, quantum repeaters, and entanglement-based quantum cryptography. Perhaps most striking, he has worked on projects at astronomical observatories that entangle photons coming from cosmic sources.

By David D. Nolte Nov. 26, 2022


Read more about the history of quantum entanglement in Interference (New From Oxford University Press, 2023)

A popular account of the trials and toils of the scientists and engineers who tamed light and used it to probe the universe.


Video Lectures

Physics Colloquium on the Backstory of the 2023 Nobel Prize in Physics


Timeline

1935 – Einstein EPR

1935 – Bohr EPR

1935 – Schrödinger: Entanglement and Cat

1950 – Madame Wu positron annihilation experiment

1952 – David Bohm and Non-local hidden variables

1957 – Bohm and Aharonov version of EPR

1964 – Bell’s inequalities

1967 – Clauser reads Bell’s paper

1967 – Commins experiment with Calcium

1969 – CHSH inequality: measurable with detection inefficiencies

1972 – Clauser and Freedman experiment

1975 – Aspect reads Bell’s paper

1976 – Zeilinger reads Bell’s paper

1981 – Aspect two-photon generation source

1982 – Aspect time variable analyzers

1988 – Parametric down-conversion of EPR pairs (Shih and Alley, Ou and Mandel)

1989 – GHZ state proposed

1993 – Bennett quantum teleportation proposal

1995 – High-intensity down-conversion source of EPR pairs (Kwiat and Zeilinger)

1997 – Zeilinger quantum teleportation experiment

1999 – Observation of the GHZ state


Bibliography

[1] See the full details in: David D. Nolte, Interference: A History of Interferometry and the Scientists Who Tamed Light (Oxford University Press, July 2023)

[2] A. Einstein, B. Podolsky, N. Rosen, Can quantum-mechanical description of physical reality be considered complete? Physical Review 47, 777-780 (1935).

[3] N. Bohr, Can quantum-mechanical description of physical reality be considered complete? Physical Review 48, 696-702 (1935).

[4] E. Schrödinger, Die gegenwärtige Situation in der Quantenmechanik. Die Naturwissenschaften 23, 807-12; 823-28; 844-49 (1935).

[5] D. Bohm, A suggested interpretation of the quantum theory in terms of hidden variables .1. Physical Review 85, 166-179 (1952); D. Bohm, A suggested interpretation of the quantum theory in terms of hidden variables .2. Physical Review 85, 180-193 (1952).

[6] J. Bell, On the Einstein-Podolsky-Rosen paradox. Physics 1, 195 (1964).

[7] J. F. Clauser, M. A. Horne, A. Shimony, R. A. Holt, Proposed experiment to test local hidden-variable theories. Physical Review Letters 23, 880-884 (1969).

[8] S. J. Freedman, J. F. Clauser, Experimental test of local hidden-variable theories. Physical Review Letters 28, 938-941 (1972).

[9] A. Aspect, P. Grangier, G. Roger, Experimental tests of realistic local theories via Bell’s theorem. Physical Review Letters 47, 460-463 (1981).

[10] Alain Aspect, Bell’s Theorem: The Naive View of an Experimentalist. (2004), hal-00001079

[11] A. Aspect, J. Dalibard, G. Roger, Experimental test of Bell inequalities using time-varying analyzers. Physical Review Letters 49, 1804-1807 (1982).

[12] D. M. Greenberger, M. A. Horne, A. Zeilinger, in 1988 Fall Workshop on Bell’s Theorem, Quantum Theory and Conceptions of the Universe (George Mason Univ, Fairfax, Va, 1988), vol. 37, pp. 69-72.

[13] C. H. Bennett, G. Brassard, C. Crepeau, R. Jozsa, A. Peres, W. K. Wootters, Teleporting an unknown quantum state via dual classical and Einstein-Podolsky-Rosen channels. Physical Review Letters 70, 1895-1899 (1993).

[14]  J. Gea-Banacloche, Optical realizations of quantum teleportation, in Progress in Optics, Vol 46, E. Wolf, Ed. (2004), vol. 46, pp. 311-353.

[15] D. Bouwmeester, J.-W. Pan, K. Mattle, M. Eibl, H. Weinfurter, A. Zeilinger, Experimental quantum teleportation. Nature 390, 575-579 (1997).

[16] D. Boschi, S. Branca, F. De Martini, L. Hardy, S. Popescu, Experimental realization of teleporting an unknown pure quantum state via dual classical and Einstein-Podolsky-Rosen channels. Phys. Rev. Lett. 80, 1121-1125 (1998).

[17]  A. Zeilinger, Light for the quantum. Entangled photons and their applications: a very personal perspective. Physica Scripta 92, 1-33 (2017).

A Short History of Quantum Tunneling

Quantum physics is often called “weird” because it does things that are not allowed in classical physics and hence is viewed as non-intuitive or strange.  Perhaps the two “weirdest” aspects of quantum physics are quantum entanglement and quantum tunneling.  Entanglement allows a particle state to extend across wide expanses of space, while tunneling allows a particle to have negative kinetic energy.  Neither of these effects has a classical analog.

Quantum entanglement arose out of the Bohr-Einstein debates at the Solvay Conferences in the 1920’s and 30’s, and it was the subject of a recent Nobel Prize in Physics (2022).  The quantum tunneling story is just as old, but it was recognized much earlier, by the Nobel Prize in Physics in 1973 awarded to Brian Josephson, Ivar Giaever and Leo Esaki—each of whom was a graduate student when they discovered their respective effects, and two of whom got their big idea while attending a lecture class. 

Always go to class, you never know what you might miss, and the payoff is sometimes BIG

Ivar Giaever

Of the two effects, tunneling is the more common and the more useful in modern electronic devices (although entanglement is coming up fast with the advent of quantum information science). Here is a short history of quantum tunneling, told through a series of publications that advanced theory and experiments.

Double-Well Potential: Friedrich Hund (1927)

The first analysis of quantum tunneling was performed by Friedrich Hund (1896 – 1997), a German physicist who studied early in his career with Born in Göttingen and Bohr in Copenhagen.  He published a series of papers in 1927 in Zeitschrift für Physik [1] that solved the newly-proposed Schrödinger equation for the case of the double well potential.  He was particularly interested in the formation of symmetric and anti-symmetric states of the double well that contributed to the binding energy of atoms in molecules.  He derived the first tunneling-frequency expression for a quantum superposition of the symmetric and anti-symmetric states,

f = ν exp( −V / hν )

where f is the coherent oscillation frequency, V is the height of the potential and hν is the quantum energy of the isolated states when the atoms are far apart.  The exponential dependence on the potential height V made the tunnel effect extremely sensitive to the details of the tunnel barrier.

Fig. 1 Friedrich Hund

Electron Emission: Lothar Nordheim and Ralph Fowler (1927 – 1928)

The first to consider quantum tunneling from a bound state to a continuum state was Lothar Nordheim (1899 – 1985), a German physicist who studied under David Hilbert and Max Born at Göttingen and worked with John von Neumann and Eugene Wigner and later with Hans Bethe. In 1927 he solved the problem of a particle in a well that is separated from continuum states by a thin finite barrier [2]. Using the new Schrödinger theory, he found transmission coefficients that were finite valued, caused by quantum tunneling of the particle through the barrier. Nordheim’s square potential wells and barriers are now, literally, textbook examples that every student of quantum mechanics solves. (For a quantum simulation of wavefunction tunneling through a square barrier see the companion Quantum Tunneling YouTube video.) Nordheim later escaped the growing nationalism and anti-Semitism in Germany in the mid-1930s to become a visiting professor of physics at Purdue University in the United States, moving to a permanent position at Duke University.
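The textbook square-barrier transmission coefficient that students now solve can be evaluated in a few lines. This is a sketch with illustrative parameter values (an electron and a nanometer-scale barrier, not numbers from Nordheim’s paper), showing both the finite sub-barrier transmission and its extreme sensitivity to barrier width:

```python
import math

HBAR = 1.054571817e-34   # reduced Planck constant, J*s
M_E = 9.1093837015e-31   # electron mass, kg
EV = 1.602176634e-19     # joules per electron-volt

def transmission(E_eV, V0_eV, L_m):
    """Transmission coefficient for a particle of energy E tunneling
    through a square barrier of height V0 and width L, for E < V0."""
    E, V0 = E_eV * EV, V0_eV * EV
    kappa = math.sqrt(2 * M_E * (V0 - E)) / HBAR  # decay constant inside barrier
    return 1.0 / (1.0 + (V0**2 * math.sinh(kappa * L_m)**2) / (4 * E * (V0 - E)))

# A 0.5 eV electron incident on a 1 eV barrier, 1 nm wide:
t1 = transmission(0.5, 1.0, 1e-9)
# Halving the barrier width raises the transmission dramatically:
t2 = transmission(0.5, 1.0, 0.5e-9)
print(t1, t2)
```

The sinh term grows exponentially with the product of the decay constant and the width, which is why small changes in the barrier dominate the tunneling rate.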

Fig. 2 Nordheim square tunnel barrier and Fowler-Nordheim triangular tunnel barrier for electron tunneling from bound states into the continuum.

One of the giants of mathematical physics in the UK from the 1920s through the 1930s was Ralph Fowler (1889 – 1944). Three of his doctoral students went on to win Nobel Prizes (Chandrasekhar, Dirac and Mott) and others came close (Bhabha, Hartree, Lennard-Jones). In 1928 Fowler worked with Nordheim on a more realistic version of Nordheim’s surface electron tunneling that could explain the field emission of electrons from metals under strong electric fields. The electric field modified Nordheim’s square potential barrier into a triangular barrier (which they treated using WKB theory) to obtain the tunneling rate [3]. This type of tunnel effect is now known as Fowler-Nordheim tunneling.

Nuclear Alpha Decay: George Gamow (1928)

George Gamow (1904 – 1968) is one of the icons of mid-twentieth-century physics. He was a substantial physicist who also had a solid sense of humor that allowed him to achieve a level of cultural popularity shared by only a few of the larger-than-life physicists of his time, like Richard Feynman and Stephen Hawking. His popular books included One Two Three … Infinity as well as a favorite series of books under the rubric of Mr. Tompkins (Mr. Tompkins in Wonderland and Mr. Tompkins Explores the Atom, among others). He also wrote a history of the early years of quantum theory (Thirty Years that Shook Physics).

In 1928 Gamow was in Göttingen (the Mecca of early quantum theory) with Max Born when he realized that the radioactive decay of Uranium by alpha decay might be explained by quantum tunneling. It was known that nucleons were bound together by some unknown force in what would be an effective binding potential, but that charged alpha particles would also feel a strong electrostatic repulsive potential from a nucleus. Gamow combined these two potentials to create a potential landscape that was qualitatively similar to Nordheim’s original system of 1927, but with a potential barrier that was neither square nor triangular (like the Fowler-Nordheim situation).

Fig. 3 George Gamow

Gamow was able to make an accurate approximation that allowed him to express the decay rate in terms of an exponential term

λ ∝ exp( −2π Zα Z e² / ħv )

where Zα is the charge of the alpha particle, Z is the nuclear charge of the Uranium decay product and v is the speed of the alpha particle detected in external measurements [4].
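The exponential factor Gamow derived is now known as the Gamow factor, exp(−2πη), with the Sommerfeld parameter η = Zα Z e²/ħv, which can be written as Zα Z α c/v using the fine-structure constant. A quick numerical sketch (round illustrative numbers, not Gamow’s own calculation) shows why alpha-decay lifetimes span so many orders of magnitude:

```python
import math

ALPHA = 1 / 137.035999   # fine-structure constant
C = 2.99792458e8         # speed of light, m/s

def gamow_factor(Z_daughter, Z_alpha, v):
    """Barrier penetration factor exp(-2*pi*eta), where the Sommerfeld
    parameter eta = Z1*Z2*e^2/(hbar*v) = Z1*Z2*alpha*c/v."""
    eta = Z_daughter * Z_alpha * ALPHA * C / v
    return math.exp(-2 * math.pi * eta)

# Alpha decay with daughter charge Z = 90 (thorium, from uranium decay)
# and an alpha speed of roughly 1.4e7 m/s (illustrative value):
g1 = gamow_factor(90, 2, 1.4e7)
# A 10% faster alpha particle:
g2 = gamow_factor(90, 2, 1.54e7)
print(g1, g2, g2 / g1)  # tiny probabilities; the ratio shows the huge sensitivity to v
```

A modest change in alpha-particle speed changes the penetration probability by many orders of magnitude, which is why measured alpha-decay half-lives range from microseconds to billions of years.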

The very next day after Gamow submitted his paper, Ronald Gurney and Edward Condon of Princeton University submitted a paper [5] that solved the same problem using virtually the same approach … except missing Gamow’s surprisingly concise analytic expression for the decay rate.

Molecular Tunneling: George Uhlenbeck (1932)

Because tunneling rates fall off exponentially with the square root of the mass of the tunneling particle, electrons are far more likely to tunnel through potential barriers than atoms are. However, hydrogen is a particularly small atom and is therefore the most amenable to tunneling.

The first example of atom tunneling is associated with hydrogen in the ammonia molecule NH3. The molecule has a pyramidal structure with the nitrogen hovering above the plane defined by the three hydrogens. However, an equivalent configuration has the nitrogen hanging below the hydrogen plane. The energies of these two configurations are the same, but the nitrogen must tunnel from one side of the hydrogen plane to the other through a barrier. The presence of light-weight hydrogens that can “move out of the way” for the nitrogen makes this barrier very small (infrared energies). When the ammonia is excited into its first vibrational excited state, the molecular wavefunction tunnels through the barrier, splitting the excited level by an energy associated with a wavelength of 1.2 cm, which lies in the microwave region. This tunnel splitting was the first microwave transition observed in spectroscopy and is used in ammonia masers.
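As a quick unit check (a sketch, not a derivation), the quoted 1.2 cm splitting wavelength can be converted to a frequency and an energy, confirming that it sits in the microwave with an energy far below vibrational scales:

```python
C = 2.99792458e8       # speed of light, m/s
H = 6.62607015e-34     # Planck constant, J*s
EV = 1.602176634e-19   # joules per electron-volt

wavelength = 1.2e-2          # the 1.2 cm tunnel-splitting wavelength, in meters
freq = C / wavelength        # corresponding frequency, Hz
energy_eV = H * freq / EV    # corresponding photon energy, eV

print(freq / 1e9)    # ~25 GHz, squarely in the microwave
print(energy_eV)     # ~1e-4 eV, tiny compared to typical vibrational energies
```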

Fig. 4 Nitrogen inversion in the ammonia molecule is achieved by excitation to a vibrational excited state followed by tunneling through the barrier, proposed by George Uhlenbeck in 1932.

One of the earliest papers [6] written on the tunneling of nitrogen in ammonia was published by David Dennison and George Uhlenbeck in 1932. George Uhlenbeck (1900 – 1988) was a Dutch-American theoretical physicist. He played a critical role, with Samuel Goudsmit, in establishing the spin of the electron in 1925. Both Uhlenbeck and Goudsmit were close associates of Paul Ehrenfest at Leiden in the Netherlands. Uhlenbeck is also famous for the Ornstein-Uhlenbeck process, a generalization of Einstein’s theory of Brownian motion that can treat active transport, such as intracellular transport in living cells.

Solid-State Electron Tunneling: Leo Esaki (1957)

Although the tunneling of electrons in molecular bonds and in the field emission from metals had been established early in the century, direct use of electron tunneling in solid state devices had remained elusive until Leo Esaki (1925 – ) observed electron tunneling in heavily doped Germanium and Silicon semiconductors. Esaki joined an early precursor of Sony electronics in 1956 and was supported to obtain a PhD from the University of Tokyo. In 1957 he was working with heavily-doped p-n junction diodes and discovered a phenomenon known as negative differential resistance where the current through an electronic device actually decreases as the voltage increases.

Because the junction thickness was only about 100 atoms, or about 10 nanometers, he suspected and then proved that the electronic current was tunneling quantum mechanically through the junction. The negative differential resistance was caused by a decrease in available states to the tunneling current as the voltage increased.

Fig. 5 Esaki tunnel diode with heavily doped p- and n-type semiconductors. At small voltages, electrons and holes tunnel through the semiconductor bandgap across a junction that is only about 10 nm wide. At higher voltage, the electrons and holes have no accessible states to tunnel into, producing negative differential resistance where the current decreases with increasing voltage.

Esaki tunnel diodes were the fastest semiconductor devices of the time, and the negative differential resistance of the diode in an external circuit produced high-frequency oscillations. They were used in high-frequency communication systems. They were also radiation hard and hence ideal for the early communications satellites. Esaki was awarded the 1973 Nobel Prize in Physics jointly with Ivar Giaever and Brian Josephson.

Superconducting Tunneling: Ivar Giaever (1960)

Ivar Giaever (1929 – ) is a Norwegian-American physicist who had just joined the GE research lab in Schenectady, New York in 1958 when he read about Esaki’s tunneling experiments. He was enrolled at that time as a graduate student in physics at Rensselaer Polytechnic Institute (RPI), where he was taking a course in solid state physics and learning about superconductivity. Superconductivity is carried by pairs of electrons known as Cooper pairs that spontaneously bind together with a binding energy that produces an “energy gap” in the electron energies of the metal, but no one had ever found a way to directly measure it. The Esaki experiment made him immediately think of the equivalent experiment in which Cooper pairs might tunnel between two superconductors (through a thin oxide layer) and yield a measurement of the energy gap. The idea actually came to him during the class lecture.

The experiments used a junction between aluminum and lead (Al—Al2O3—Pb). At first, the temperature of the system was adjusted so that Al remained a normal metal and Pb was superconducting, and Giaever observed a tunnel current with a threshold related to the gap in Pb. Then the temperature was lowered so that both Al and Pb were superconducting, and a peak in the tunnel current appeared at the voltage associated with the difference in the energy gaps (predicted by Harrison and Bardeen).

Fig. 6 Diagram from Giaever “The Discovery of Superconducting Tunneling” at https://conferences.illinois.edu/bcs50/pdf/giaever.pdf

The Josephson Effect: Brian Josephson (1962)

In Giaever’s experiments, the external circuits had been designed to pick up “ordinary” tunnel currents in which individual electrons tunneled through the oxide rather than the Cooper pairs themselves. However, in 1962, Brian Josephson (1940 – ), a physics graduate student at Cambridge, was sitting in a lecture (just like Giaever) on solid state physics given by Phil Anderson (who was on sabbatical there from Bell Labs). During lecture he had the idea to calculate whether it was possible for the Cooper pairs themselves to tunnel through the oxide barrier. Building on theoretical work by Leo Falicov who was at the University of Chicago and later at Berkeley (years later I was lucky to have Leo as my PhD thesis advisor at Berkeley), Josephson found a surprising result that even when the voltage was zero, there would be a supercurrent that tunneled through the junction (now known as the DC Josephson Effect). Furthermore, once a voltage was applied, the supercurrent would oscillate (now known as the AC Josephson Effect). These were strange and non-intuitive results, so he showed Anderson his calculations to see what he thought. By this time Anderson had already been extremely impressed by Josephson (who would often come to the board after one of Anderson’s lectures to show where he had made a mistake). Anderson checked over the theory and agreed with Josephson’s conclusions. Bolstered by this reception, Josephson submitted the theoretical prediction for publication [9].

As soon as Anderson returned to Bell Labs after his sabbatical, he connected with John Rowell, who was making tunnel junction experiments, and they revised the external circuit configuration to be most sensitive to the tunneling supercurrent, which they observed in short order and submitted for publication [10]. Since then, the Josephson Effect has become a standard element of ultra-sensitive magnetometers, measurement standards for charge and voltage, and far-infrared detectors, and it has been used to construct rudimentary qubits and quantum computers.

By David D. Nolte: Nov. 6, 2022


YouTube Video

YouTube Video of Quantum Tunneling Systems


References:

[1] F. Hund, Z. Phys. 40, 742 (1927). F. Hund, Z. Phys. 43, 805 (1927).

[2] L. Nordheim, Z. Phys. 46, 833 (1928).

[3] R. H. Fowler, L. Nordheim, Proc. R. Soc. London, Ser. A 119, 173 (1928).

[4] G. Gamow, Z. Phys. 51, 204 (1928).

[5] R. W. Gurney, E. U. Condon, Nature 122, 439 (1928). R. W. Gurney, E. U. Condon, Phys. Rev. 33, 127 (1929).

[6] Dennison, D. M. and G. E. Uhlenbeck. “The two-minima problem and the ammonia molecule.” Physical Review 41(3): 313-321. (1932)

[7] L. Esaki, New phenomenon in narrow germanium p-n junctions, Phys. Rev. 109, 603-604 (1958); L. Esaki, Long journey into tunneling, Proc. IEEE 62, 825 (1974).

[8] I. Giaever, Energy Gap in Superconductors Measured by Electron Tunneling, Phys. Rev. Letters, 5, 147-148 (1960); I. Giaever, Electron tunneling and superconductivity, Science, 183, 1253 (1974)

[9] B. D. Josephson, Phys. Lett. 1, 251 (1962); B.D. Josephson, The discovery of tunneling supercurrent, Science, 184, 527 (1974).

[10] P. W. Anderson, J. M. Rowell, Phys. Rev. Lett. 10, 230 (1963); Philip W. Anderson, How Josephson discovered his effect, Physics Today 23, 11, 23 (1970)

[11] Eugen Merzbacher, The Early History of Quantum Tunneling, Physics Today 55, 8, 44 (2002)

[12] Razavy, Mohsen. Quantum Theory Of Tunneling, World Scientific Publishing Company, 2003.



Interference (New from Oxford University Press, 2023)

Read the stories of the scientists and engineers who tamed light and used it to probe the universe.

Available from Amazon.

Available from Oxford U Press

Available from Barnes & Nobles

A Short History of the Photon

The quantum of light—the photon—is a little over 100 years old.  It was born in 1905 when Einstein merged Planck’s blackbody quantum hypothesis with statistical mechanics and concluded that light itself must be quantized.  No one believed him!  Fast forward to today, and the photon is a workhorse of modern quantum technology.  Quantum encryption and communication are performed almost exclusively with photons, and many prototype quantum computers are optics based.  Quantum optics also underpins atomic and molecular optics (AMO), which is one of the hottest and most rapidly advancing frontiers of physics today.

Only after the availability of “quantum” light sources … could photon numbers be manipulated at will, launching the modern era of quantum optics.

This blog tells the story of the early days of the photon and of quantum optics.  It begins with Einstein in 1905 and ends with the demonstration of photon anti-bunching, the first fundamentally quantum optical phenomenon observed, more than seventy years later in 1977.  Across that stretch of time, the photon went from a nascent idea in Einstein’s fertile brain to the most thoroughly investigated quantum particle in the realm of physics.

The Photon: Albert Einstein (1905)

When Planck presented his quantum hypothesis in 1900 to the German Physical Society [1], his model of black body radiation retained all its classical properties but one—the quantized interaction of light with matter.  He did not think yet in terms of quanta, only in terms of steps in a continuous interaction.

The quantum break came from Einstein when he published his 1905 paper proposing the existence of the photon—an actual quantum of light that carried with it energy and momentum [2].  His reasoning was simple and iron-clad, resting on Planck’s own blackbody relation that Einstein combined with simple reasoning from statistical mechanics.  He was led inexorably to the existence of the photon.  Unfortunately, almost no one believed him (see my blog on Einstein and Planck). 

This was before wave-particle duality entered quantum thinking, so the notion that light—so clearly a wave phenomenon—could be a particle was unthinkable.  It had taken half of the 19th century to rid physics of Newton’s corpuscles and emissionist theories of light, so to bring them back at the beginning of the 20th century seemed like a great blunder.  However, Einstein persisted.

In 1909 he published a paper on the fluctuation properties of light [3] in which he proposed that the fluctuations observed in light intensity had two contributions: one from the discreteness of the photons (what we call “shot noise” today) and one from the fluctuations in the wave properties.  Einstein was proposing that both particle and wave properties contributed to intensity fluctuations, exhibiting simultaneous particle-like and wave-like properties.  This was one of the first expressions of wave-particle duality in modern physics.
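In modern notation, Einstein’s 1909 fluctuation result for blackbody radiation of spectral energy density ρ in a volume V and bandwidth dν is commonly rendered as follows (a standard textbook form; notation varies across sources):

```latex
\langle \Delta E^2 \rangle
  = \underbrace{h\nu\,\rho\,V\,d\nu}_{\text{particle (shot-noise) term}}
  + \underbrace{\frac{c^{3}}{8\pi\nu^{2}}\,\rho^{2}\,V\,d\nu}_{\text{wave (interference) term}}
```

The first term alone would follow from a gas of independent light quanta, the second alone from classical waves; blackbody light exhibits both at once.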

In 1916 and 1917 Einstein took another bold step and proposed the existence of stimulated emission [4].  Once again, his arguments were based on simple physics—this time the principle of detailed balance—and he was led to the audacious conclusion that one photon can stimulate the emission of another.  This would become the basis of the laser forty-five years later.

While Einstein was confident in the reality of the photon, others sincerely doubted its existence.  Robert Millikan (1868 – 1953) decided to put Einstein’s theory of photoelectron emission to the most stringent test ever performed.  In 1915 he painstakingly acquired the definitive dataset with the goal to refute Einstein’s hypothesis, only to confirm it in spectacular fashion [5].  Partly based on Millikan’s confirmation of Einstein’s theory of the photon, Einstein was awarded the Nobel Prize in Physics in 1921.

Einstein at a blackboard.

From that point onward, the physical existence of the photon was accepted and was incorporated routinely into other physical theories.  Compton used the energy and the momentum of the photon in 1922 to predict and measure Compton scattering of x-rays off of electrons [6].  The photon was given its modern name by Gilbert Lewis in 1926 [7].

Single-Photon Interference: Geoffrey Taylor (1909)

If a light beam is made up of a group of individual light quanta, then in the limit of very dim light, there should be just one photon passing through an optical system at a time.  Therefore, to do optical experiments on single photons, one just needs to reach the ultimate dim limit.  As simple and clear as this argument sounds, it has problems that were only sorted out after the Hanbury Brown and Twiss experiments in the 1950’s and the controversy they launched (see below).  However, in 1909, this thinking seemed like a clear approach for looking for deviations in optical processes in the single-photon limit.

In 1909, Geoffrey Ingram Taylor (1886 – 1975) was an undergraduate student at Cambridge University and performed a low-intensity Young’s double-slit experiment (encouraged by J. J. Thomson).  At that time the idea of Einstein’s photon was only 4 years old, and Bohr’s theory of the hydrogen atom was still four years away.  But Thomson believed that if photons were real, then their existence could possibly show up as deviations in experiments involving single photons.  Young’s double-slit experiment is the classic demonstration of the classical wave nature of light, so performing it under conditions when (on average) only a single photon was in transit between a light source and a photographic plate seemed like the best place to look.

G. I. Taylor

The experiment was performed by finding an optimum exposure of photographic plates in a double slit experiment, then reducing the flux while increasing the exposure time, until the single-photon limit was achieved while retaining the same net exposure of the photographic plate.  Under the lowest intensity, when only a single photon was in transit at a time (on average), Taylor performed the exposure for three months.  To his disappointment, when he developed the film, there was no significant difference between high intensity and low intensity interference fringes [8].  If photons existed, then their quantized nature was not showing up in the low-intensity interference experiment.

The reason that there is no single-photon-limit deviation in the behavior of the Young double-slit experiment is because Young’s experiment only measures first-order coherence properties.  The average over many single-photon detection events is described equally well either by classical waves or by quantum mechanics.  Quantized effects in the Young experiment could only appear in fluctuations in the arrivals of photons, but in Taylor’s day there was no way to detect the arrival of single photons. 
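Why the accumulated fringes look identical can be illustrated with a small Monte Carlo sketch (hypothetical fringe parameters, purely illustrative): each “photon” arrives independently at a position drawn from the classical fringe intensity, and piling up many single-photon events on the “plate” rebuilds exactly the classical pattern, just as Taylor found.

```python
import math, random

random.seed(1)

def sample_position():
    """Draw one photon detection position x in [0, 1) from a two-slit
    fringe pattern I(x) proportional to cos^2(pi*k*x), by rejection sampling."""
    k = 10  # number of fringes across the screen (illustrative choice)
    while True:
        x = random.random()
        if random.random() < math.cos(math.pi * k * x) ** 2:
            return x

# Accumulate single-photon detections into histogram bins (the "plate"):
bins = [0] * 50
for _ in range(20000):
    bins[int(sample_position() * 50)] += 1

# Bright and dark bins emerge exactly where classical wave theory predicts,
# even though each photon arrived one at a time.
print(max(bins), min(bins))
```

Only correlations between arrival times (second-order coherence), not the time-averaged pattern, can distinguish quantum light from classical waves, which is why Taylor’s first-order measurement saw no deviation.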

Quantum Theory of Radiation : Paul Dirac (1927)

After Paul Dirac (1902 – 1984) was awarded his doctorate from Cambridge in 1926, he received a stipend that sent him to work with Niels Bohr (1885 – 1962) in Copenhagen. His attention focused on the electromagnetic field and how it interacted with the quantized states of atoms.  Although the electromagnetic field was the classical field of light, it was also the quantum field of Einstein’s photon, and he wondered how the quantized harmonic oscillators of the electromagnetic field could be generated by quantum wavefunctions acting as operators.  He decided that, to generate a photon, the wavefunction must operate on a state that had no photons—the ground state of the electromagnetic field known as the vacuum state.

Dirac put these thoughts into their appropriate mathematical form and began work on two manuscripts.  The first manuscript contained the theoretical details of the non-commuting electromagnetic field operators.  He called the process of generating photons out of the vacuum “second quantization”.  In second quantization, the classical field of electromagnetism is converted to an operator that generates quanta of the associated quantum field out of the vacuum (and also annihilates photons back into the vacuum).  The creation operators can be applied again and again to build up an N-photon state containing N photons that obey Bose-Einstein statistics, as required by their integer spin, in agreement with Planck’s blackbody radiation. 

Dirac then showed how an interaction of the quantized electromagnetic field with quantized energy levels involved the annihilation and creation of photons as they promoted electrons to higher atomic energy levels, or demoted them through stimulated emission.  Very significantly, Dirac’s new theory explained the spontaneous emission of light from an excited electron level as a direct physical process that creates a photon carrying away the energy as the electron falls to a lower energy level.  Spontaneous emission had been explained first by Einstein more than ten years earlier when he derived the famous A and B coefficients [4], but the physical mechanism for these processes was inferred rather than derived.  Dirac, in late 1926, had produced the first direct theory of photon exchange with matter [9].

Paul Dirac in his early days.

Einstein-Podolsky-Rosen (EPR) and Bohr (1935)

The famous dialog between Einstein and Bohr at the Solvay Conferences culminated in the now famous “EPR” paradox of 1935 when Einstein published (together with B. Podolsky and N. Rosen) a paper that contained a particularly simple and cunning thought experiment. In this paper, not only was quantum mechanics under attack, but so was the concept of reality itself, as reflected in the paper’s title “Can Quantum Mechanical Description of Physical Reality Be Considered Complete?” [10].

Bohr and Einstein at Paul Ehrenfest’s house in 1925.

Einstein considered an experiment on two quantum particles that had become “entangled” (meaning they interacted) at some time in the past, and then had flown off in opposite directions. By the time their properties are measured, the two particles are widely separated. Two observers each make measurements of certain properties of the particles. For instance, the first observer could choose to measure either the position or the momentum of one particle. The other observer likewise can choose to make either measurement on the second particle. Each measurement is made with perfect accuracy. The two observers then travel back to meet and compare their measurements.   When the two experimentalists compare their data, they find perfect agreement in their values every time that they had chosen (unbeknownst to each other) to make the same measurement. This agreement occurred either when they both chose to measure position or both chose to measure momentum.

It would seem that the state of the particle prior to the second measurement was completely defined by the results of the first measurement. In other words, the state of the second particle is set into a definite state (using quantum-mechanical jargon, the state is said to “collapse”) the instant that the first measurement is made. This implies that there is instantaneous action at a distance −− violating everything that Einstein believed about reality (and violating the law that nothing can travel faster than the speed of light). He therefore had no choice but to consider this conclusion of instantaneous action to be false.  Therefore quantum mechanics could not be a complete theory of physical reality −− some deeper theory, yet undiscovered, was needed to resolve the paradox.

Bohr, on the other hand, did not hold “reality” so sacred. In his rebuttal to the EPR paper, which he published six months later under the identical title [11], he rejected Einstein’s criterion for reality. He had no problem with the two observers making the same measurements and finding identical answers. Although one measurement may affect the conditions of the second despite their great distance, no information could be transmitted by this dual measurement process, and hence there was no violation of causality. Bohr’s mind-boggling viewpoint was that reality was nonlocal, meaning that in the quantum world the measurement at one location does influence what is measured somewhere else, even at great distance. Einstein, on the other hand, could not accept a nonlocal reality.

Entangled versus separable states. When the states are separable, no measurement on photon A has any relation to measurements on photon B. However, in the entangled case, all measurements on A are related to measurements on B (and vice versa) regardless of what decision is made to make what measurement on either photon, or whether the photons are separated by great distance. The entangled wave-function is “nonlocal” in the sense that it encompasses both particles at the same time, no matter how far apart they are.

The Intensity Interferometer:  Hanbury Brown and Twiss (1956)

Optical physics was surprisingly dormant from the 1930’s through the 1940’s. Most of the research during this time was either on physical optics, like lenses and imaging systems, or on spectroscopy, which was more interested in the physical properties of the materials than in light itself. This hiatus from the photon was about to change dramatically, not driven by physicists, but driven by astronomers.

The development of radar technology during World War II enabled the new field of radio astronomy both with high-tech receivers and with a large cohort of scientists and engineers trained in radio technology. In the late 1940’s and early 1950’s radio astronomy was starting to work with long baselines to better resolve radio sources in the sky using interferometry. The first attempts used coherent references between two separated receivers to provide a common mixing signal to perform field-based detection. However, the stability of the reference was limiting, especially for longer baselines.

In 1950, a doctoral student in the radio astronomy department of the University of Manchester, R. Hanbury Brown, was given the task to design baselines that could work at longer distances to resolve smaller radio sources. After struggling with the technical difficulties of providing a coherent “local” oscillator for distant receivers, Hanbury Brown had a sudden epiphany one evening. Instead of trying to reference the field of one receiver to the field of another, what if, instead, one were to reference the intensity of one receiver to the intensity of the other, specifically correlating the noise on the intensity? To measure intensity requires no local oscillator or reference field. The size of an astronomical source would then show up in how well the intensity fluctuations correlated with each other as the distance between the receivers was changed. He did a back-of-the-envelope calculation that gave him hope that his idea might work, but he needed more rigorous proof if he was to ask for money to try out his idea. He tracked down Richard Twiss at a defense research lab, and the two worked out the theory of intensity correlations for long-baseline radio interferometry. Using facilities at the famous Jodrell Bank Radio Observatory at Manchester, they demonstrated the principle of their intensity interferometer and measured the angular size of Cygnus A and Cassiopeia A, two of the strongest radio sources in the Northern sky.

R. Hanbury Brown

One of the surprising side benefits of the intensity interferometer over field-based interferometry was insensitivity to environmental phase fluctuations. For radio astronomy the biggest source of phase fluctuations was the ionosphere, and the new intensity interferometer was immune to its fluctuations. Phase fluctuations had also been the limiting factor for the Michelson stellar interferometer which had limited its use to only about half a dozen stars, so Hanbury Brown and Twiss decided to revisit visible stellar interferometry using their new concept of intensity interferometry.

To illustrate the principle for visible wavelengths, Hanbury Brown and Twiss performed a laboratory experiment to correlate intensity fluctuations in two receivers illuminated by a common source through a beam splitter. The intensity correlations were detected and measured as a function of path length change, showing an excess correlation in noise for short path lengths that decayed as the path length increased. They published their results in Nature in 1956, immediately igniting a firestorm of protest from physicists [12].

In the 1950’s, many physicists had embraced the discrete properties of the photon and had developed a misleading mental picture of photons as individual and indivisible particles that could only go one way or another from a beam splitter, but not both. Therefore, the argument went, if the photon in an attenuated beam was detected in one detector at the output of a beam splitter, then it cannot be detected at the other. This would produce an anticorrelation in coincidence counts at the two detectors. However, the Hanbury Brown Twiss (HBT) data showed a correlation from the two detectors. This launched an intense controversy in which some of those who accepted the results called for a radical new theory of the photon, while most others dismissed the HBT results as due to systematics in the light source. The heart of this controversy was quickly understood by the Nobel laureate E. M. Purcell. He correctly pointed out that photons are bosons and are indistinguishable discrete particles and hence are likely to “bunch” together, according to quantum statistics, even under low light conditions [13]. Therefore, attenuated “chaotic” light would indeed show photodetector correlations: even when the average photon number was less than one photon at a time, the photons would still bunch.

The bunching of photons in light is a second-order effect that moves beyond the first-order interference effects of Young’s double slit, but even here the quantum nature of light is not required. A semiclassical theory of light emission from a spectral line with a natural bandwidth also predicts intensity correlations, and the correlations are precisely what would be observed for photon bunching. Therefore, even the second-order HBT results, when performed with natural light sources, do not distinguish between classical and quantum effects in the experimental results. But this reliance on natural light sources was about to change fundamentally with the invention of the laser.
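The semiclassical picture of bunching can be sketched in a few lines of Monte Carlo (an illustrative sketch with hypothetical parameters, not the HBT apparatus). For chaotic light the instantaneous intensity is exponentially distributed, which forces the normalized zero-delay correlation g(2)(0) = ⟨I²⟩/⟨I⟩² to equal 2 (bunching), whereas ideal constant-intensity light gives exactly 1:

```python
import random

def g2_zero(intensities):
    """Normalized zero-delay intensity correlation g2(0) = <I^2> / <I>^2."""
    n = len(intensities)
    mean = sum(intensities) / n
    mean_sq = sum(i * i for i in intensities) / n
    return mean_sq / mean ** 2

rng = random.Random(1)
# Chaotic (thermal) light: instantaneous intensity is exponentially distributed.
chaotic = [rng.expovariate(1.0) for _ in range(200_000)]
# Idealized constant-intensity light: no fluctuations at all.
coherent = [1.0] * 200_000

print(g2_zero(chaotic))   # close to 2: excess correlation (bunching)
print(g2_zero(coherent))  # exactly 1: no excess correlation
```

Note that nothing quantum appears in this calculation, which is exactly the point: the HBT excess correlation for chaotic light is reproduced by classical intensity fluctuations alone.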

Invention of the Laser : Ted Maiman (1960)

One of the great scientific breakthroughs of the 20th century was the nearly simultaneous yet independent realization by several researchers around 1951 (by Charles H. Townes of Columbia University, by Joseph Weber of the University of Maryland, and by Alexander M. Prokhorov and Nikolai G. Basov at the Lebedev Institute in Moscow) that clever techniques and novel apparatus could be used to produce collections of atoms that had more electrons in excited states than in ground states. Such a situation is called a population inversion. If this situation could be attained, then according to Einstein’s 1917 theory of photon emission, a single photon would stimulate the emission of a second identical photon, and these two would in turn stimulate two more, giving a total of four −− and so on. Clearly this process turns a single photon into a host of photons, all with identical energy and phase.

Theodore Maiman

Charles Townes and his research group were the first to succeed in 1953 in producing a device based on ammonia molecules that could work as an intense source of coherent photons. The initial device did not amplify visible light, but amplified microwave photons that had wavelengths of about 3 centimeters. They called the process microwave amplification by stimulated emission of radiation, hence the acronym “MASER”. Despite the significant breakthrough that this invention represented, the devices were very expensive and difficult to operate. The maser did not revolutionize technology, and some even quipped that the acronym stood for “Means of Acquiring Support for Expensive Research”. The maser did, however, launch a new field of study, called quantum electronics, that was the direct descendant of Einstein’s 1917 paper. Most importantly, the existence and development of the maser became the starting point for a device that could do the same thing for light.

The race to develop an optical maser (later to be called laser, for light amplification by stimulated emission of radiation) was intense. Many groups actively pursued this holy grail of quantum electronics. Most believed that it was possible, which made its invention merely a matter of time and effort. This race was won by Theodore H. Maiman at Hughes Research Laboratory in Malibu, California in 1960 [14]. He used a ruby crystal that was excited into a population inversion by an intense flash tube (like a flash bulb) that had originally been invented for flash photography. His approach was amazingly simple −− blast the ruby with a high-intensity pulse of light and see what comes out −− which explains why he was the first. Most other groups had been pursuing much more difficult routes because they believed that laser action would be difficult to achieve.

Perhaps the most important aspect of Maiman’s discovery was that it demonstrated that laser action was actually much simpler than people anticipated, and that laser action is a fairly common phenomenon. His discovery was quickly repeated by other groups, and then additional laser media were discovered such as helium-neon gas mixtures, argon gas, carbon dioxide gas, garnet lasers and others. Within several years, over a dozen different material and gas systems were made to lase, opening up wide new areas of research and development that continue unabated to this day. It also called for new theories of optical coherence to explain how coherent laser light interacted with matter.

Coherent States : Glauber (1963)

The HBT experiment had been performed with attenuated chaotic light that had residual coherence caused by the finite linewidth of the filtered light source. The theory of intensity correlations for this type of light was developed in the 1950’s by Emil Wolf and Leonard Mandel using a semiclassical theory in which the statistical properties of the light were based on electromagnetics without a direct need for quantized photons. The HBT results were fully consistent with this semiclassical theory. However, after the invention of the laser, new “coherent” light sources became available that required a fundamentally quantum depiction.

Roy Glauber was a theoretical physicist who received his PhD working with Julian Schwinger at Harvard. He spent several years as a post-doc at Princeton’s Institute for Advanced Study starting in 1949, at the time when quantum field theory was being developed by Schwinger, Feynman and Dyson. While Feynman was off in Brazil for a year learning to play the bongo drums, Glauber filled in for his lectures at Caltech. He returned to Harvard in 1952 as an assistant professor. He was already thinking about the quantum aspects of photons in 1956 when news of the photon correlations in the HBT experiment was published, and when the laser was invented a few years later, he began developing a theory of photon correlations in laser light that he suspected would be fundamentally different from those in natural chaotic light.

Roy Glauber

Because of his background in quantum field theory, and especially quantum electrodynamics, it was a fairly easy task to couch the quantum optical properties of coherent light in terms of Dirac’s creation and annihilation operators of the electromagnetic field. Related to the minimum-uncertainty wave functions derived initially by Schrödinger in the late 1920’s, Glauber developed a “coherent state” operator that was a minimum uncertainty state of the quantized electromagnetic field [15]. This coherent state represents a laser operating well above the lasing threshold and predicted that the HBT correlations would vanish. Glauber was awarded the Nobel Prize in Physics in 2005 for his work on such “Glauber” states in quantum optics.
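A defining property of the coherent state is that its photon-number statistics are Poissonian, with mean equal to variance and a second-order correlation g(2)(0) = 1, which is why the HBT excess correlations vanish for an ideal laser far above threshold. A quick numerical check (an illustrative sketch; the mean photon number of 4 is chosen arbitrarily):

```python
import math

def coherent_pn(n, nbar):
    """Photon-number probability P(n) = exp(-nbar) * nbar^n / n! for a
    coherent state with mean photon number nbar (a Poisson distribution)."""
    return math.exp(-nbar) * nbar ** n / math.factorial(n)

nbar = 4.0
probs = [coherent_pn(n, nbar) for n in range(60)]  # truncated sum; tail is negligible
mean = sum(n * p for n, p in enumerate(probs))
var = sum((n - mean) ** 2 * p for n, p in enumerate(probs))
g2 = sum(n * (n - 1) * p for n, p in enumerate(probs)) / mean ** 2
print(round(mean, 3), round(var, 3), round(g2, 3))  # 4.0 4.0 1.0
```

Mean equals variance (the Poisson signature), and g(2)(0) = 1 means photon arrivals are completely uncorrelated: no bunching, no antibunching, only shot noise.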

Single-Photon Optics: Kimble and Mandel (1977)

Beyond introducing coherent states, Glauber’s new theoretical approach, and parallel work by George Sudarshan around the same time [16], provided a new formalism for exploring quantum optical properties in which fundamentally quantum processes could be explored that could not be predicted using only semiclassical theory. For instance, one could envision producing photon states in which the photon arrivals at a detector could display the kind of anti-bunching that had originally been assumed (in error) by the critics of the HBT experiment. A truly one-photon state, also known as a Fock state or a number state, would be the extreme limit in which the quantum field possessed a single quantum that could be directed at a beam splitter and would emerge either from one side or the other with complete anti-correlation. However, generating such a state in the laboratory remained a challenge.

In 1975, Carmichael and Walls predicted that resonance fluorescence could produce quantized fields with lower correlations than coherent states [17]. In 1977, H. J. Kimble, M. Dagenais and L. Mandel demonstrated photon antibunching for the first time, between two photodetectors at the two output ports of a beam splitter [18], using resonance fluorescence from a beam of sodium atoms pumped by a dye laser.

This first demonstration of photon antibunching represents a major milestone in the history of quantum optics. Taylor’s first-order experiments in 1909 showed no difference between classical electromagnetic waves and a flux of photons. Similarly, the second-order HBT experiment of 1956 using chaotic light could be explained equally well using classical or quantum approaches. Even laser light (when the laser is operated far above threshold) produces classical wave effects, with only the shot noise betraying the discreteness of photon arrivals. Only after the availability of “quantum” light sources, beginning with the work of Kimble and Mandel, could photon numbers be manipulated at will, launching the modern era of quantum optics. Later experiments by them and others have continually improved the control of photon states.

By David D. Nolte, Jan. 18, 2021

Timeline:

  • 1900 – Planck (1901). “Law of energy distribution in normal spectra.” Annalen Der Physik 4(3): 553-563.
  • 1905 – A. Einstein (1905). “Generation and conversion of light with regard to a heuristic point of view.” Annalen Der Physik 17(6): 132-148.
  • 1909 – A. Einstein (1909). “On the current state of radiation problems.” Physikalische Zeitschrift 10: 185-193.
  • 1909 – G.I. Taylor: Proc. Cam. Phil. Soc. Math. Phys. Sci. 15 , 114 (1909) Single photon double-slit experiment
  • 1915 – Millikan, R. A. (1916). “A direct photoelectric determination of planck’s “h.”.” Physical Review 7(3): 0355-0388. Photoelectric effect.
  • 1916 – Einstein, A. (1916). “Strahlungs-Emission und -Absorption nach der Quantentheorie.” Verh. Deutsch. Phys. Ges. 18: 318. Einstein predicts stimulated emission
  • 1923 –Compton, Arthur H. (May 1923). “A Quantum Theory of the Scattering of X-Rays by Light Elements”. Physical Review. 21 (5): 483–502.
  • 1926 – Lewis, G. N. (1926). “The conservation of photons.” Nature 118: 874-875. Gilbert Lewis named the “photon”
  • 1927 – Dirac, P. A. M. (1927). “The quantum theory of the emission and absorption of radiation.” Proceedings of the Royal Society of London Series A 114(767): 243-265.
  • 1932 – E. P. Wigner: Phys. Rev. 40, 749 (1932)
  • 1935 – A. Einstein, B. Podolsky, N. Rosen: Phys. Rev. 47 , 777 (1935). EPR paradox.
  • 1935 – N. Bohr: Phys. Rev. 48 , 696 (1935). Bohr’s response to the EPR paradox.
  • 1956 – R. Hanbury Brown, R. W. Twiss: Nature 177 , 27 (1956) Photon bunching
  • 1963 – R. J. Glauber: Phys. Rev. 130 , 2529 (1963) Coherent states
  • 1963 – E. C. G. Sudarshan: Phys. Rev. Lett. 10, 277 (1963) Coherent states
  • 1964 – P. L. Kelley, W.H. Kleiner: Phys. Rev. 136 , 316 (1964)
  • 1966 – F. T. Arecchi, E. Gatti, A. Sona: Phys. Rev. Lett. 20 , 27 (1966); F.T. Arecchi, Phys. Lett. 16 , 32 (1966)
  • 1966 – J. S. Bell: Physics 1 , 105 (1964); Rev. Mod. Phys. 38 , 447 (1966) Bell inequalities
  • 1967 – R. F. Pfleegor, L. Mandel: Phys. Rev. 159 , 1084 (1967) Interference at single photon level
  • 1967 – M. O. Scully, W.E. Lamb: Phys. Rev. 159 , 208 (1967).  Quantum theory of laser
  • 1967 – B. R. Mollow, R. J. Glauber: Phys. Rev. 160, 1097 (1967); 162, 1256 (1967) Parametric converter
  • 1969 – M. O. Scully, W.E. Lamb: Phys. Rev. 179 , 368 (1969).  Quantum theory of laser
  • 1969 – M. Lax, W.H. Louisell: Phys. Rev. 185 , 568 (1969).  Quantum theory of laser
  • 1975 – Carmichael, H. J. and D. F. Walls (1975). Journal of Physics B-Atomic Molecular and Optical Physics 8(6): L77-L81. Photon anti-bunching predicted in resonance fluorescence
  • 1977 – H. J. Kimble, M. Dagenais and L. Mandel (1977) Photon antibunching in resonance fluorescence. Phys. Rev. Lett. 39, 691-5:  Kimble, Dagenais and Mandel demonstrate the effect of antibunching

References

• Parts of this blog are excerpted from Mind at Light Speed, D. Nolte (Free Press, 2001) that tells the story of light’s central role in telecommunications and in the future of optical and quantum computers. Further information can be found in Interference: The History of Optical Interferometry and the Scientists who Tamed Light (Oxford, 2023).

[1] Planck (1901). “Law of energy distribution in normal spectra.” Annalen Der Physik 4(3): 553-563.

[2] A. Einstein (1905). “Generation and conversion of light with regard to a heuristic point of view.” Annalen Der Physik 17(6): 132-148

[3] A. Einstein (1909). “On the current state of radiation problems.” Physikalische Zeitschrift 10: 185-193.

[4] Einstein, A. (1916). “Strahlungs-Emission und -Absorption nach der Quantentheorie.” Verh. Deutsch. Phys. Ges. 18: 318; Einstein, A. (1917). “Quantum theory of radiation.” Physikalische Zeitschrift 18: 121-128.

[5] Millikan, R. A. (1916). “A direct photoelectric determination of planck‘s “h.”.” Physical Review 7(3): 0355-0388.

[6] Compton, A. H. (1923). “A quantum theory of the scattering of x-rays by light elements.” Physical Review 21(5): 0483-0502.

[7] Lewis, G. N. (1926). “The conservation of photons.” Nature 118: 874-875.

[8] Taylor, G. I. (1910). “Interference fringes with feeble light.” Proceedings of the Cambridge Philosophical Society 15: 114-115.

[9] Dirac, P. A. M. (1927). “The quantum theory of the emission and absorption of radiation.” Proceedings of the Royal Society of London Series a-Containing Papers of a Mathematical and Physical Character 114(767): 243-265.

[10] Einstein, A., B. Podolsky and N. Rosen (1935). “Can quantum-mechanical description of physical reality be considered complete?” Physical Review 47(10): 0777-0780.

[11] Bohr, N. (1935). “Can quantum-mechanical description of physical reality be considered complete?” Physical Review 48(8): 696-702.

[12] Brown, R. H. and R. Q. Twiss (1956). “Correlation Between Photons in 2 Coherent Beams of Light.” Nature 177(4497): 27-29; Brown, R. H. and R. Q. Twiss (1956). “Test of a new type of stellar interferometer on Sirius.” Nature 178(4541): 1046-1048.

[13] Purcell, E. M. (1956). “Question of Correlation Between Photons in Coherent Light Rays.” Nature 178(4548): 1448-1450.

[14] Maiman, T. H. (1960). “Stimulated optical radiation in ruby.” Nature 187: 493.

[15] Glauber, R. J. (1963). “Photon Correlations.” Physical Review Letters 10(3): 84.

[16] Sudarshan, E. C. G. (1963). “Equivalence of semiclassical and quantum mechanical descriptions of statistical light beams.” Physical Review Letters 10(7): 277; Mehta, C. L. and E. C. Sudarshan (1965). “Relation between quantum and semiclassical description of optical coherence.” Physical Review 138(1B): B274.

[17] Carmichael, H. J. and D. F. Walls (1975). “Quantum treatment of spontaneous emission from a strongly driven 2-level atom.” Journal of Physics B-Atomic Molecular and Optical Physics 8(6): L77-L81.

[18] Kimble, H. J., M. Dagenais and L. Mandel (1977). “Photon antibunching in resonance fluorescence.” Physical Review Letters 39(11): 691-695.



Interference (New from Oxford University Press, 2023)

A popular account of the trials and toils of the scientists and engineers who tamed light and used it to probe the universe.

A Short History of Fractal Dimension

It is second nature to think of integer dimensions:  A line is one dimensional.  A plane is two dimensional. A volume is three dimensional.  A point has no dimensions.

It is harder to think in four dimensions and higher, but even here it is a simple extrapolation of lower dimensions.  Consider the basis vectors spanning a three-dimensional space, consisting of the triples of numbers

$$ \hat{e}_1 = (1, 0, 0), \quad \hat{e}_2 = (0, 1, 0), \quad \hat{e}_3 = (0, 0, 1) $$

Then a four-dimensional hyperspace is created simply by adding a new element to each “tuple” in the list

$$ \hat{e}_1 = (1, 0, 0, 0), \quad \hat{e}_2 = (0, 1, 0, 0), \quad \hat{e}_3 = (0, 0, 1, 0), \quad \hat{e}_4 = (0, 0, 0, 1) $$

and so on to 5 and 6 dimensions and beyond.  Child’s play!

But how do you think of fractional dimensions?  What is a fractional dimension?  For that matter, what is a dimension?  Even the integer dimensions began to unravel when Georg Cantor showed in 1877 that the line and the plane, which clearly had different “dimensionalities”, both had the same cardinality and could be put into a one-to-one correspondence.  From then onward the concept of dimension had to be rebuilt from the ground up, leading ultimately to fractals.

Here is a short history of fractal dimension, partially excerpted from my history of dynamics in Galileo Unbound (Oxford University Press, 2018) pg. 110 ff.  This blog page presents the history through a set of publications that successively altered how mathematicians thought about curves in spaces, beginning with Karl Weierstrass in 1872.

Karl Weierstrass (1872)

Karl Weierstrass (1815 – 1897) was studying convergence properties of infinite power series in 1872 when he began with a problem that Bernhard Riemann had given to his students some years earlier.  Riemann had asked whether the function

$$ f(x) = \sum_{n=1}^{\infty} \frac{\sin(n^2 x)}{n^2} $$

was continuous everywhere but not differentiable.  This simple question about a simple series was surprisingly hard to answer (it was not solved until Hardy provided the proof in 1916 [1]).  Therefore, Weierstrass conceived of a simpler infinite sum that was continuous everywhere and for which he could calculate left and right limits of derivatives at any point.  This function is

$$ W(x) = \sum_{n=0}^{\infty} a^n \cos(b^n \pi x) $$

where b is a large odd integer and a is positive and less than one.  Weierstrass showed that the left and right derivatives failed to converge to the same value, no matter where he took his point.  In short, he had discovered a function that was continuous everywhere, but had a derivative nowhere [2].  This pathological function, called a “Monster” by Charles Hermite, is now called the Weierstrass function.

Beyond the strange properties that Weierstrass sought, the Weierstrass function would turn out to be a fractal curve (recognized much later by Besicovitch and Ursell in 1937 [3]) with a fractal (Hausdorff) dimension given by

$$ D_H = 2 + \frac{\ln a}{\ln b} $$

although this was not proven until very recently [4].  An example of the function is shown in Fig. 1 for a = 0.5 and b = 5.  This specific curve has a fractal dimension D = 1.5693.  Notably, this is a number that is greater than 1 dimension (the topological dimension of the curve) but smaller than 2 dimensions (the embedding dimension of the curve).  The curve tends to fill more of the two-dimensional plane than a straight line, so its intermediate fractal dimension has an intuitive feel about it.  The more “monstrous” the curve looks, the closer its fractal dimension approaches 2.

Fig. 1  Weierstrass’ “Monster” (1872) with a = 0.5, b = 5.  This continuous function is nowhere differentiable.  It is a fractal with fractal dimension D = 2 + ln(0.5)/ln(5) = 1.5693.
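These numbers are easy to verify numerically (a small sketch; the 60-term truncation is an arbitrary choice, safe because a < 1 makes the series converge rapidly):

```python
import math

def weierstrass(x, a=0.5, b=5, terms=60):
    """Partial sum of the Weierstrass function W(x) = sum_n a^n cos(b^n pi x).
    With a < 1 the series converges uniformly, so 60 terms is ample."""
    return sum(a ** n * math.cos(b ** n * math.pi * x) for n in range(terms))

# Hausdorff dimension of the graph, D = 2 + ln(a)/ln(b):
a, b = 0.5, 5
D = 2 + math.log(a) / math.log(b)
print(round(D, 4))                 # 1.5693, matching Fig. 1
print(round(weierstrass(0.0), 6))  # W(0) = sum of a^n = 1/(1 - a) = 2.0
```

The same two lines give the dimension for any admissible pair (a, b); as a approaches 1 the dimension approaches 2, matching the intuition that the curve becomes more plane-filling.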

Georg Cantor (1883)

Partially inspired by Weierstrass’ discovery, Georg Cantor (1845 – 1918) published an example of an unusual ternary set in 1883 in “Grundlagen einer allgemeinen Mannigfaltigkeitslehre” (“Foundations of a General Theory of Aggregates”) [5].  The set generates a function (the Cantor staircase) that has a derivative equal to zero almost everywhere, yet which rises from zero to one.  It is a striking example of a function that is not equal to the integral of its derivative!  Cantor demonstrated that his set has the same cardinality as the real numbers.  But whereas the real numbers are uniformly distributed, Cantor’s set is “clumped”.  This clumpiness is an essential feature that distinguishes it from the one-dimensional number line, and it raised important questions about dimensionality. The fractal dimension of the ternary Cantor set is DH = ln(2)/ln(3) = 0.6309.

Fig. 2  The 1883 Cantor set (below) and the Cantor staircase (above, as the indefinite integral over the set).
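The ternary construction is simple to reproduce (a minimal sketch): each level keeps the outer thirds of every interval, so after k levels there are 2^k “boxes” of size 3^-k, box counting recovers D = ln(2)/ln(3), and the total length (2/3)^k shrinks toward the zero measure of the set:

```python
import math

def cantor_intervals(level):
    """Intervals remaining after `level` middle-third removals on [0, 1]."""
    intervals = [(0.0, 1.0)]
    for _ in range(level):
        nxt = []
        for lo, hi in intervals:
            third = (hi - lo) / 3
            nxt.append((lo, lo + third))   # keep the left third
            nxt.append((hi - third, hi))   # keep the right third
        intervals = nxt
    return intervals

k = 10
boxes = cantor_intervals(k)                      # 2^k boxes of size 3^-k
D = math.log(len(boxes)) / math.log(3 ** k)      # box-counting dimension
total_length = sum(hi - lo for lo, hi in boxes)  # (2/3)^k, vanishing measure
print(len(boxes), round(D, 4), round(total_length, 4))  # 1024 0.6309 0.0173
```

The “clumpiness” is visible in the output: a thousand disjoint pieces whose combined length is under two percent of the unit interval, yet the set they converge to is uncountable.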

Giuseppe Peano (1890)

In 1877, in a letter to his friend Richard Dedekind, Cantor showed that there was a one-to-one correspondence between the real numbers and the points in any n-dimensional space.  He was so surprised by his own result that he wrote to Dedekind “I see it, but I don’t believe it.”  The solid concepts of dimension and dimensionality were dissolving before his eyes.  What does it mean to trace the path of a trajectory in an n-dimensional space, if all the points in n dimensions were just numbers on a line?  What could such a trajectory look like?  A graphic example of a plane-filling path was constructed in 1890 by Peano [6], who was a peripatetic mathematician with interests that wandered broadly across the landscape of the mathematical problems of his day—usually ahead of his time.  Only two years after he had axiomatized linear vector spaces [7], Peano constructed a continuous curve that filled space. 

The construction of Peano’s curve proceeds by taking a square and dividing it into nine equal subsquares.  Lines connect the centers of each of the subsquares.  Then each subsquare is divided again into nine subsquares whose centers are all connected by lines.  At this stage, the original pattern, repeated nine times, is connected together by eight links, forming a single curve.  This process is repeated infinitely many times, resulting in a curve that passes through every point of the original plane square.  In this way, a line is made to fill a plane.  Where Cantor had proven abstractly that the cardinality of the real numbers was the same as the points in n-dimensional space, Peano created a specific example.  This was followed quickly by another construction, invented by David Hilbert in 1891, that divided the square into four instead of nine, simplifying the construction, but also showing that such constructions were easily generated.

Fig. 3 Peano’s (1890) and Hilbert’s (1891) plane-filling curves.  When the iterations are taken to infinity, the curves approach every point of two-dimensional space arbitrarily closely, giving them a dimension DH = DE = 2, although their topological dimensions are DT = 1.
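Hilbert's four-fold construction is simple enough to code directly. The sketch below uses the standard distance-to-coordinate recursion from the computing literature (not Hilbert's original notation) to confirm that a single one-dimensional parameter visits every cell of a 2^n × 2^n grid, stepping to an adjacent cell each time:

```python
def hilbert_d2xy(order, d):
    """Map distance d along an order-`order` Hilbert curve to (x, y)
    coordinates in a 2^order x 2^order grid (standard iterative recursion)."""
    x = y = 0
    t = d
    s = 1
    while s < (1 << order):
        rx = 1 & (t // 2)
        ry = 1 & (t ^ rx)
        if ry == 0:            # rotate/reflect the quadrant as needed
            if rx == 1:
                x, y = s - 1 - x, s - 1 - y
            x, y = y, x
        x += s * rx
        y += s * ry
        t //= 4
        s *= 2
    return x, y

order = 4
n = 1 << order
pts = [hilbert_d2xy(order, d) for d in range(n * n)]
visits_all = len(set(pts)) == n * n   # every grid cell hit exactly once
continuous = all(abs(x1 - x2) + abs(y1 - y2) == 1
                 for (x1, y1), (x2, y2) in zip(pts, pts[1:]))
print(visits_all, continuous)  # True True
```

Raising `order` refines the grid; in the limit the curve approaches every point of the square arbitrarily closely, which is exactly the sense in which its Hausdorff dimension equals 2 while its topological dimension stays 1.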

Helge von Koch (1904)

The space-filling curves of Peano and Hilbert have the extreme property that a one-dimensional curve approaches every point in a two-dimensional space.  This ability of a one-dimensional trajectory to fill space mirrored the ergodic hypothesis that Boltzmann relied upon as he developed statistical mechanics.  These examples by Peano, Hilbert and Boltzmann inspired searches for continuous curves whose dimensionality similarly exceeded one dimension, yet without filling space.  Weierstrass’ Monster was already one such curve, existing in some dimension greater than one but not filling the plane.  The construction of the Monster required infinite series of harmonic functions, and the resulting curve was single valued on its domain of real numbers. 

An alternative approach was proposed by Helge von Koch (1870—1924), a Swedish mathematician with an interest in number theory.  He suggested in 1904 that a set of straight line segments could be joined together, and then shrunk by a scale factor to act as new segments of the original pattern [8].  The construction of the Koch curve is shown in Fig. 4.  When the process is taken to its limit, it produces a curve, differentiable nowhere, which snakes through two dimensions.  When connected with other identical curves into a hexagon, the curve resembles a snowflake, and the construction is known as “Koch’s Snowflake”. 

The Koch curve begins in generation 1 with N0 = 4 elements.  These are shrunk by a factor of b = 1/3 to become the four elements of the next generation, and so on.  The number of elements varies with the observation scale according to the equation

N(b) = b^(–D)

where D is called the fractal dimension.  In the example of the Koch curve, the fractal dimension is

D = ln(4)/ln(3) = 1.26

which is a number less than its embedding dimension DE = 2.  The fractal is embedded in 2D but has a fractional dimension that is greater than its topological dimension DT = 1.

Fig. 4  Generation of a Koch curve (1904).  The fractal dimension is D = ln(4)/ln(3) = 1.26.  At each stage, four elements are reduced in size by a factor of 3.  The “length” of the curve approaches infinity as the features get smaller and smaller.  But the scaling of the length with size is determined uniquely by the fractal dimension.
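The scaling law is easy to verify numerically.  This sketch tabulates the Koch curve generation by generation, assuming the standard construction in which each segment is replaced by N0 = 4 copies scaled by b = 1/3:

```python
import math

N0 = 4            # segments replacing each segment per generation
b = 1.0 / 3.0     # scale factor per generation

# Fractal dimension from N(b) = b**(-D):  D = ln(4)/ln(3)
D = math.log(N0) / math.log(1.0 / b)

for k in range(5):
    segments = N0 ** k                 # number of segments
    length = segments * b ** k         # total length = (4/3)**k
    print(k, segments, round(length, 4))

print(round(D, 4))    # 1.2619: between D_T = 1 and D_E = 2
```

The length grows without bound as the features shrink, exactly as the caption of Fig. 4 describes, while the scaling of the segment count with scale is fixed by D alone.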

Waclaw Sierpinski (1915)

Waclaw Sierpinski (1882 – 1969) was a Polish mathematician studying for his doctorate at the Jagiellonian University in Krakow when he came across a theorem that every point in the plane can be defined by a single coordinate.  Intrigued by such an unintuitive result, he dove deeply into Cantor’s set theory after he was appointed as a faculty member at the university in Lvov.  He began to construct curves with more specific properties than the Peano or Hilbert curves, such as a curve that passes through every interior point of a unit square yet encloses an area equal to only 5/12 = 0.4167.  Sierpinski became interested in the topological properties of such sets.

Sierpinski considered how to define a curve that was embedded in DE = 2 but that was NOT constructed as a topological dimension DT = 1 curve, as the curves of Peano, Hilbert and Koch (and even his own) had been.  To demonstrate this point, he described a construction that began with a topological dimension DT = 2 object, a planar triangle, from which the open set of its central inverted triangle is removed, leaving its boundary points.  The process is continued iteratively to all scales [9].  The resulting point set is shown in Fig. 5 and is called the Sierpinski gasket.  What is left after all the internal triangles are removed is a point set that can be made discontinuous by cutting it at a finite set of points.  This is shown in Fig. 5 by the red circles.  Each circle, no matter the size, cuts the set at three points, making the resulting set discontinuous.  Ten years later, Karl Menger would show that this property of discontinuous cuts determined the topological dimension of the Sierpinski gasket to be DT = 1.  The embedding dimension is of course DE = 2, and the fractal dimension of the Sierpinski gasket is

DH = ln(3)/ln(2) = 1.5850

Fig. 5 The Sierpinski gasket.  The central triangle is removed (leaving its boundary) at each scale.  The pattern is self-similar with a fractal dimension DH = 1.5850.  Unintuitively, it has a topological dimension DT = 1.
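The gasket’s self-similarity can be checked with a discrete stand-in.  A convenient (if anachronistic) model, assumed here purely for illustration, is the Pascal’s-triangle-mod-2 pattern: on a 2^k × 2^k grid, mark cell (x, y) when the bitwise AND x & y is zero.  Halving the scale then triples the number of occupied cells, which is exactly the N = 3^k scaling behind DH = ln(3)/ln(2):

```python
import math

def gasket_count(k):
    """Occupied cells of the discrete gasket on a 2**k x 2**k grid."""
    n = 2 ** k
    return sum(1 for x in range(n) for y in range(n) if x & y == 0)

for k in range(1, 7):
    assert gasket_count(k) == 3 ** k   # count triples per halving of scale

# Dimension from the scaling N = 3**k at scale b = (1/2)**k
D = math.log(3) / math.log(2)
print(round(D, 4))    # 1.585
```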

Felix Hausdorff (1918)

The work by Cantor, Peano, von Koch and Sierpinski had created a crisis in geometry as mathematicians struggled to rescue concepts of dimensionality.  An important byproduct of that struggle was a much deeper understanding of concepts of space, especially in the hands of Felix Hausdorff. 

Felix Hausdorff (1868 – 1942) was born in Breslau, Prussia, and educated in Leipzig.  In his early years as a doctoral student, and as an assistant professor at Leipzig, he was a practicing mathematician by day and a philosopher and playwright by night, publishing under the pseudonym Paul Mongré.  He was at the University of Bonn working on set theory when the Greek mathematician Constantin Carathéodory published a paper in 1914 that showed how to construct a p-dimensional set in a q-dimensional space [10].  Hausdorff realized that he could apply similar ideas to the Cantor set.  He showed that the outer measure of the Cantor set would jump discontinuously from infinity to zero as the fractional dimension increased smoothly.  The critical value where the measure changed its character became known as the Hausdorff dimension [11]. 

For the Cantor ternary set, the Hausdorff dimension is exactly DH = ln(2)/ln(3) = 0.6309.  This value is less than the embedding dimension DE = 1 of the support (the real numbers on the interval [0, 1]), but it is also greater than the DT = 0 that would hold for a countable number of points on the interval.  The work by Hausdorff became well known in the mathematics community, which applied the idea to a broad range of point sets such as Weierstrass’s Monster and the Koch curve.

It is important to keep Hausdorff’s work in historical perspective.  For instance, although the curves of Weierstrass, von Koch and Sierpinski were understood to present a challenge to concepts of dimension, it was only after Hausdorff that mathematicians began to think in terms of fractional dimensions and to calculate the fractional dimensions of these earlier point sets.  Despite the fact that Sierpinski created one of the most iconic fractals, one that we still use as an example every day, he was unaware at the time that he was doing so.  His interest was topological: starting with objects (triangles) of topological dimension DT = 2, he was creating a curve for which a cut at any point would create disconnected subsets.  In this way, talking about the early fractal objects tends to be anachronistic, describing them with language that had not yet been invented at the time.

This perspective is also true for the ideas of topological dimension.  For instance, even Sierpinski was not fully tuned into the problems of defining topological dimension.  It turns out that what he created was a curve of topological dimension DT = 1, but that would only become clear later with the work of the Austrian mathematician Karl Menger.

Karl Menger (1926)

The day that Karl Menger (1902 – 1985) was born, his father, Carl Menger (1840 – 1921), lost his job.  Carl Menger was one of the founders of the famous Viennese school that established the marginalist view of economics.  However, Carl was not married to Karl’s mother, which was frowned upon by polite Vienna society, so he had to relinquish his professorship.  Despite his father’s reduction in status, Karl received an excellent education at a Viennese gymnasium (high school).  Among his classmates were Wolfgang Pauli (Nobel Prize in Physics, 1945) and Richard Kuhn (Nobel Prize in Chemistry, 1938).  When Karl began attending the University of Vienna he studied physics, but the mathematics professor Hans Hahn opened his eyes to the fascinating work on analysis that was transforming mathematics at that time, so Karl shifted his studies to mathematical analysis, specifically concerning conceptions of “curves”. 

Menger made important contributions to the history of fractal dimension as well as to the history of topological dimension.  In his approach to defining the intrinsic topological dimension of a point set, he described the construction of a point set embedded in three dimensions that had zero volume, an infinite surface area, and a fractal dimension between 2 and 3.  The object is shown in Fig. 6 and is called the Menger “sponge” [12].  The Menger sponge is a fractal with a fractal dimension DH = ln(20)/ln(3) = 2.7268.  The face of the sponge is also known as the Sierpinski carpet, which has a fractal dimension DH = ln(8)/ln(3) = 1.8928.

Fig. 6 Menger Sponge. Embedding dimension DE = 3. Fractal dimension DH = ln(20)/ln(3) = 2.7268. Topological dimension DT = 1: all one-dimensional metric spaces can be contained within the Menger sponge point set. Each face is a Sierpinski carpet with fractal dimension DH = ln(8)/ln(3) = 1.8928.
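Both dimensions follow from counting the pieces that survive one subdivision step.  In the standard construction, a 3 × 3 × 3 subdivision keeps a sub-cube only when at most one of its three index coordinates is the middle value, leaving 20 of 27; each face keeps 8 of its 9 sub-squares.  A minimal sketch of that bookkeeping:

```python
import math
from itertools import product

# Sub-cube (i, j, k), each index in {0, 1, 2}, survives when at
# most one index equals the middle value 1.
kept_cubes = [c for c in product(range(3), repeat=3)
              if sum(1 for i in c if i == 1) <= 1]
assert len(kept_cubes) == 20

# On a face, only the central sub-square is removed.
kept_squares = [s for s in product(range(3), repeat=2) if s != (1, 1)]
assert len(kept_squares) == 8

D_sponge = math.log(len(kept_cubes)) / math.log(3)    # ln(20)/ln(3)
D_carpet = math.log(len(kept_squares)) / math.log(3)  # ln(8)/ln(3)
print(round(D_sponge, 4), round(D_carpet, 4))         # 2.7268 1.8928
```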

The striking feature of the Menger sponge is its topological dimension.  Menger created a new definition of topological dimension that partially solved the crisis created by Cantor when he showed that every point on the unit square can be defined by a single coordinate.  This had put a one-dimensional curve in one-to-one correspondence with a two-dimensional plane.  Yet the topology of a two-dimensional object is clearly different from the topology of a line.  Menger found a simple definition that showed why 2D is different, topologically, from 3D, despite Cantor’s conundrum.  The answer came from the idea of making cuts on a point set and seeing whether the cut created disconnected subsets. 

As a simple example, take a 1D line.  The removal of a single point creates two disconnected sub-lines.  The intersection of the cut with the line is 0-dimensional, and Menger showed that this defined the line as 1-dimensional.  Similarly, a line cuts the unit square into two parts.  The intersection of the cut with the plane is 1-dimensional, signifying that the plane is 2-dimensional.  In other words, an (n-1)-dimensional intersection of the boundary of a small neighborhood with the point set indicates that the point set has a dimension of n.  Generalizing this idea to the Sierpinski gasket in Fig. 5, the boundary of a small circular region, if placed appropriately (as in the figure), intersects the gasket at three points of dimension zero.  Hence, the topological dimension of the Sierpinski gasket is one.  Menger was likewise able to show that his sponge also had a topology that was one-dimensional, DT = 1, despite its embedding dimension of DE = 3.  In fact, all 1-dimensional metric spaces can be fit inside a Menger sponge.

Benoit Mandelbrot (1967)

Benoit Mandelbrot (1924 – 2010) was born in Warsaw, and his family emigrated to Paris in 1935.  He attended the Ecole Polytechnique, where he studied under Gaston Julia (1893 – 1978) and Paul Levy (1886 – 1971).  Both Julia and Levy made significant contributions to the field of self-similar point sets and made a lasting impression on Mandelbrot.  He went to Caltech for a master’s degree in aeronautics and then earned a PhD in mathematical sciences from the University of Paris.  In 1958 Mandelbrot joined the research staff of the IBM Thomas J. Watson Research Center in Yorktown Heights, New York, where he worked for over 35 years on topics in information theory and economics, always with a view toward the properties of self-similar sets and time series.

In 1967 Mandelbrot published one of his early important papers, on the self-similar properties of the coastline of Britain.  He proposed that many natural features had statistical self-similarity, which he applied to coastlines.  He published the work as “How Long Is the Coast of Britain? Statistical Self-Similarity and Fractional Dimension” [13] in Science, where he showed that the length of the coastline diverged with a Hausdorff dimension equal to D = 1.25.  Working at IBM, a world leader in computing, he had ready access to computational power as well as visualization capabilities.  He was therefore one of the first to explore the graphical character of self-similar maps and point sets.

During one of his sabbaticals at Harvard University he began exploring the properties of Julia sets (named after his former teacher at the Ecole Polytechnique).  The Julia set is a self-similar point set that is easily visualized in the complex plane (two dimensions).  As Mandelbrot studied the convergence or divergence of the infinite iterations defined by the Julia mapping, he discovered an infinitely nested pattern that was both beautiful and complex.  This has since become known as the Mandelbrot set.

Fig. 7 Mandelbrot set.
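The membership test behind images like Fig. 7 is the escape-time iteration: a complex number c belongs to the Mandelbrot set when the orbit z → z² + c, started from z = 0, remains bounded.  A minimal sketch (the bailout radius 2 and the finite iteration cap are the usual conventions; the cap makes the test an approximation near the boundary):

```python
def in_mandelbrot(c, max_iter=100):
    """Approximate membership test: iterate z -> z*z + c from z = 0.
    Once |z| > 2 the orbit is guaranteed to diverge."""
    z = 0
    for _ in range(max_iter):
        z = z * z + c
        if abs(z) > 2:
            return False
    return True

assert in_mandelbrot(0)        # fixed point at the origin
assert in_mandelbrot(-1)       # period-2 orbit: 0, -1, 0, -1, ...
assert not in_mandelbrot(1)    # 0, 1, 2, 5, ... diverges
```

Scanning c over a grid in the complex plane and coloring each point by the iteration count at escape produces the familiar renderings of the set’s infinitely nested boundary.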

Later, in 1975, Mandelbrot coined the term fractal to describe these self-similar point sets, and he began to realize that these types of sets were ubiquitous in nature, ranging from the structure of trees and drainage basins, to the patterns of clouds and mountain landscapes.  He published his highly successful and influential book The Fractal Geometry of Nature in 1982, introducing fractals to the wider public and launching a generation of hobbyists interested in computer-generated fractals.  The rise of fractal geometry coincided with the rise of chaos theory that was aided by the same computing power.  For instance, important geometric structures of chaos theory, known as strange attractors, have fractal geometry. 

By David D. Nolte, Dec. 26, 2020

Appendix:  Box Counting

When confronted by a fractal of unknown structure, one of the simplest methods to find the fractal dimension is box counting.  This method is shown in Fig. 8.  The fractal set is covered by a set of boxes of size b, and the number of boxes that contain at least one point of the fractal set is counted.  As the boxes are reduced in size, the number of covering boxes increases as

N(b) = b^(–D)

To be numerically accurate, this method must be iterated over several orders of magnitude.  The number of boxes covering a fractal has this characteristic power-law dependence, as shown in Fig. 8, and the fractal dimension is obtained as the slope on a log-log plot.

Fig. 8  Calculation of the fractal dimension using box counting.  At each generation, the size of the grid is reduced by a factor of 3.  The number of boxes that contain some part of the fractal curve increases as N(b) = b^(–D), where b is the scale.
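The procedure in Fig. 8 can be sketched in a few lines.  As test data this uses the discrete Sierpinski-gasket pattern (cell (x, y) occupied when x & y == 0, an assumption for illustration only); the fitted slope should come out near ln(3)/ln(2) = 1.585:

```python
import math

def sierpinski_grid(k):
    """Boolean 2**k x 2**k grid; cell (x, y) occupied iff x & y == 0."""
    n = 2 ** k
    return [[(x & y) == 0 for x in range(n)] for y in range(n)]

def box_count(grid, b):
    """Number of b x b boxes containing at least one occupied cell."""
    boxes = set()
    for y, row in enumerate(grid):
        for x, filled in enumerate(row):
            if filled:
                boxes.add((x // b, y // b))
    return len(boxes)

def fractal_dimension(grid):
    """Least-squares slope of log N(b) versus log(1/b)."""
    n = len(grid)
    bs = [2 ** m for m in range(n.bit_length() - 1)]   # 1, 2, 4, ..., n/2
    xs = [math.log(1.0 / b) for b in bs]
    ys = [math.log(box_count(grid, b)) for b in bs]
    m = len(xs)
    sx, sy = sum(xs), sum(ys)
    sxx = sum(x * x for x in xs)
    sxy = sum(x * y for x, y in zip(xs, ys))
    return (m * sxy - sx * sy) / (m * sxx - sx * sx)

grid = sierpinski_grid(6)                   # 64 x 64 grid
print(round(fractal_dimension(grid), 4))    # 1.585
```

On real data the power law holds only over a finite range of scales, so the fit must span several octaves of box size, as the text notes.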


The Physics of Life, the Universe and Everything:

Read more about the history of modern dynamics and fractals in Galileo Unbound from Oxford University Press


References

[1] Hardy, G. (1916). “Weierstrass’s non-differentiable function.” Transactions of the American Mathematical Society 17: 301-325.

[2] Weierstrass, K. (1872). “Über continuirliche Functionen eines reellen Arguments, die für keinen Werth des letzteren einen bestimmten Differentialquotienten besitzen.” Communication à l’Académie Royale des Sciences II: 71-74.

[3] Besicovitch, A. S. and H. D. Ursell (1937). “Sets of fractional dimensions: On dimensional numbers of some continuous curves.” J. London Math. Soc. 1(1): 18-25.

[4] Shen, W. (2018). “Hausdorff dimension of the graphs of the classical Weierstrass functions.” Mathematische Zeitschrift. 289(1–2): 223–266.

[5] Cantor, G. (1883). Grundlagen einer allgemeinen Mannigfaltigkeitslehre. Leipzig, B. G. Teubner.

[6] Peano, G. (1890). “Sur une courbe qui remplit toute une aire plane.” Mathematische Annalen 36: 157-160.

[7] Peano, G. (1888). Calcolo geometrico secondo l’Ausdehnungslehre di H. Grassmann, preceduto dalle operazioni della logica deduttiva. Turin, Fratelli Bocca Editori.

[8] Von Koch, H. (1904). “Sur une courbe continue sans tangente obtenue par une construction géométrique élémentaire.” Arkiv för Matematik, Astronomi och Fysik 1: 681-704.

[9] Sierpinski, W. (1915). “Sur une courbe dont tout point est un point de ramification.” Comptes Rendus Hebdomadaires des Séances de l’Académie des Sciences de Paris 160: 302-305.

[10] Carathéodory, C. (1914). “Über das lineare Mass von Punktmengen – eine Verallgemeinerung des Längenbegriffs.” Gött. Nachr. IV: 404–406.

[11] Hausdorff, F. (1919). “Dimension und äusseres Mass.” Mathematische Annalen 79: 157-179.

[12] Menger, K. (1926). “Allgemeine Räume und Cartesische Räume. I.” Communications to the Amsterdam Academy of Sciences. English translation reprinted in Edgar, G. A., ed. (2004). Classics on Fractals. Studies in Nonlinearity. Westview Press, Boulder, CO.

[13] Mandelbrot, B. (1967). “How Long Is the Coast of Britain? Statistical Self-Similarity and Fractional Dimension.” Science 156(3775): 636-638.