A Commotion in the Stars: The Legacy of Christian Doppler

Christian Andreas Doppler (1803 – 1853) was born in Salzburg, Austria, to a longstanding family of stonemasons.  As a second son, he was expected to help his older brother run the business, so his father had him tested in his 18th year for his suitability for a career in business.  The examiner Simon Stampfer (1790 – 1864), an Austrian mathematician and inventor teaching at the Lyceum in Salzburg, discovered that Doppler had a gift for mathematics and was better suited for a scientific career.  Stampfer’s enthusiasm convinced Doppler’s father to enroll him in the Polytechnic Institute in Vienna (founded only a few years earlier in 1815) where he took classes in mathematics, mechanics and physics [1] from 1822 to 1825.  Doppler excelled in his courses, but was dissatisfied with the narrowness of the education, yearning for more breadth and depth in his studies and for more significance in his positions, feelings he would struggle with for his entire short life.  He left Vienna, returning to the Lyceum in Salzburg to round out his education with philosophy, languages and poetry.  Unfortunately, this four-year detour away from technical studies impeded his ability to gain a permanent technical position, so he began a temporary assistantship with a mathematics professor at Vienna.  As he approached his 30th birthday this term expired without prospects.  He was about to emigrate to America when he finally received an offer to teach at a secondary school in Prague.

To read about the attack by Joseph Petzval on Doppler’s effect and the effect it had on Doppler, see my feature article “The Fall and Rise of the Doppler Effect” in Physics Today, 73(3) 30, March (2020).

Salzburg, Austria

Doppler in Prague

Prague gave Doppler new life.  He was a professor with a position that allowed him to marry the daughter of a silver- and goldsmith from Salzburg.  He began to publish scholarly papers, and in 1837 was appointed supplementary professor of Higher Mathematics and Geometry at the Prague Technical Institute, promoted to full professor in 1841.  It was here that he met the unusual genius Bernard Bolzano (1781 – 1848), recently returned from political exile in the countryside.  Bolzano was a philosopher and mathematician who developed rigorous concepts of mathematical limits and is famous today for his part in the Bolzano-Weierstrass theorem in functional analysis, but he had been too liberal and too outspoken for the conservative Austrian regime and had been dismissed from the University in Prague in 1819.  He was forbidden to publish his work in Austrian journals, which is one reason why much of Bolzano’s groundbreaking work in functional analysis remained unknown during his lifetime.  However, he participated in the Bohemian Society for Science from a distance, recognizing the inventive tendencies in the newcomer Doppler and supporting him for membership in the Bohemian Society.  When Bolzano was allowed to return in 1842 to the Polytechnic Institute in Prague, he and Doppler became close friends as kindred spirits.

Prague, Czech Republic

On May 25, 1842, Bolzano presided as chairman over a meeting of the Bohemian Society for Science on the day that Doppler read a landmark paper on the color of stars to a meagre assembly of only five regular members of the Society [2].  The turn-out was so small that the meeting may have been held in the robing room of the Society rather than in the meeting hall itself.  Leading up to this famous moment, Doppler’s interests were peripatetic, ranging widely over mathematical and physical topics, but he had lately become fascinated by astronomy and by the phenomenon of stellar aberration.  Stellar aberration was discovered by James Bradley in 1729 and explained as the result of the Earth’s yearly motion around the Sun, causing the apparent location of a distant star to change slightly depending on the direction of the Earth’s motion.  Bradley explained this in terms of the finite speed of light and was able to estimate it to within several percent [3].  As Doppler studied Bradley aberration, he wondered how the relative motion of the Earth would affect the color of the star.  By making a simple analogy of a ship traveling with, or against, a series of ocean waves, he concluded that the frequency of impact of the peaks and troughs of waves on the ship was no different than the arrival of peaks and troughs of the light waves impinging on the eye.  Because perceived color was related to the frequency of excitation in the eye, he concluded that the color of light would be slightly shifted to the blue if approaching, and to the red if receding from, the light source. 

Doppler wave fronts from a source emitting spherical waves moving with speeds β relative to the speed of the wave in the medium.

Doppler calculated the magnitude of the effect by taking a simple ratio of the speed of the observer relative to the speed of light.  What he found was that the speed of the Earth, though sufficient to cause the detectable aberration in the position of stars, was insufficient to produce a noticeable change in color.  However, his interest in astronomy had made him familiar with binary stars where the relative motion of the light source might be high enough to cause color shifts.  In fact, in the star catalogs there were examples of binary stars that had complementary red and blue colors.  Therefore, the title of his paper, published in the Proceedings of the Royal Bohemian Society of Sciences a few months after he read it to the society, was “On the Coloured Light of the Double Stars and Certain Other Stars of the Heavens: Attempt at a General Theory which Incorporates Bradley’s Theorem of Aberration as an Integral Part” [4].
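To get a feel for the sizes involved, the first-order ratio can be evaluated numerically.  The sketch below is only illustrative: the Earth's orbital speed (about 29.8 km/s), the 550 nm reference wavelength, and the function name `first_order_shift` are assumptions introduced here, not values taken from Doppler's paper.

```python
# First-order Doppler estimate: fractional shift ~ v/c (valid for v << c).
C = 299_792_458.0  # speed of light in m/s

def first_order_shift(v, wavelength_nm):
    """Return (beta, wavelength shift in nm) for a relative speed v in m/s."""
    beta = v / C
    return beta, beta * wavelength_nm

# Assumed illustrative values: Earth's orbital speed and green light.
beta, dlam = first_order_shift(29.8e3, 550.0)
print(f"beta = {beta:.2e}, shift = {dlam:.3f} nm")
```

A shift of roughly one part in ten thousand is far below anything the eye could register as a change of color, which is consistent with Doppler's conclusion about the motion of the Earth.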

Title page of Doppler’s 1842 paper introducing the Doppler Effect.

Doppler’s analogy was correct, but like all analogies not founded on physical law, it differed in detail from the true nature of the phenomenon.  By 1842 the transverse character of light waves had been thoroughly proven through the work of Fresnel and Arago several decades earlier, yet Doppler held onto the old-fashioned notion that light was composed of longitudinal waves.  Bolzano, fully versed in the transverse nature of light, kindly published a commentary shortly afterwards [5] showing how the transverse effect for light, and a longitudinal effect for sound, were both supported by Doppler’s idea.  Yet Doppler also did not know that speeds in visual binaries were too small to produce noticeable color effects to the unaided eye.  Finally, (and perhaps the greatest flaw in his argument on the color of stars) a continuous spectrum that extends from the visible into the infrared and ultraviolet would not change color because all the frequencies would shift together preserving the flat (white) spectrum.

The simple algebraic derivation of the Doppler Effect in the 1842 publication.

Doppler’s twelve years in Prague were intense.  He was consumed by his Society responsibilities and by an extremely heavy teaching load that included personal exams of hundreds of students.  The only time he could be creative was during the night while his wife and children slept.  Overworked and running on too little rest, his health already frail with the onset of tuberculosis, Doppler collapsed, and he was unable to continue at the Polytechnic.  In 1847 he transferred to the School of Mines and Forestry in Schemnitz (modern Banská Štiavnica in Slovakia) with more pay and less work.  Yet the revolutions of 1848 swept across Europe, with student uprisings, barricades in the streets, and Hungarian liberation armies occupying the cities and universities, giving him no peace.  Providentially, his former mentor Stampfer retired from the Polytechnic in Vienna, and Doppler was called to fill the vacancy.

Although Doppler was named the Director of Austria’s first Institute of Physics and was elected to the National Academy, he ran afoul of one of the other Academy members, Joseph Petzval (1807 – 1891), who persecuted Doppler and his effect.  To read a detailed description of the attack by Petzval on Doppler’s effect and the effect it had on Doppler, see my feature article “The Fall and Rise of the Doppler Effect” in Physics Today, March issue (2020).

Christian Doppler

Voigt’s Transformation

It is difficult today to appreciate just how deeply engrained the reality of the luminiferous ether was in the psyche of the 19th century physicist.  The last of the classical physicists were reluctant even to adopt Maxwell’s electromagnetic theory for the explanation of optical phenomena, and as physicists inevitably were compelled to do so, some of their colleagues looked on with dismay and disappointment.  This was the situation for Woldemar Voigt (1850 – 1919) at the University of Göttingen, who was appointed as one of the first professors of physics there in 1883, to be succeeded in later years by Peter Debye and Max Born.  Voigt received his doctorate at the University of Königsberg under Franz Neumann, exploring the elastic properties of rock salt, and at Göttingen he spent a quarter century pursuing experimental and theoretical research into crystalline properties.  Voigt’s research, with students like Paul Drude, laid the foundation for the modern field of solid state physics.  His textbook Lehrbuch der Kristallphysik published in 1910 remained influential well into the 20th century because it adopted mathematical symmetry as a guiding principle of physics.  It was in the context of his studies of crystal elasticity that he introduced the word “tensor” into the language of physics.

At the January 1887 meeting of the Royal Society of Science at Göttingen, three months before Michelson and Morley began their reality-altering experiments at the Case Western Reserve University in Cleveland, Ohio, Voigt submitted a paper deriving the longitudinal optical Doppler effect in an incompressible medium.  He was responding to results published in 1886 by Michelson and Morley on their measurements of the Fresnel drag coefficient, which was the precursor to their later results on the absolute motion of the Earth through the ether.

Fresnel drag is the effect of light propagating through a medium that is in motion.  The French physicist Francois Arago (1786 – 1853) in 1810 had attempted to observe the effects of corpuscles of light emitted from stars propagating with different speeds through the ether as the Earth spun on its axis and traveled around the sun.  He succeeded only in observing ordinary stellar aberration.  The absence of the effects of motion through the ether motivated Augustin-Jean Fresnel (1788 – 1827) to apply his newly-developed wave theory of light to explain the null results.  In 1818 Fresnel derived an expression for the dragging of light by a moving medium that explained the absence of effects in Arago’s observations.  For light propagating through a medium of refractive index n that is moving at a speed v, the resultant velocity of light is

$$V = \frac{c}{n} + v\left(1 - \frac{1}{n^2}\right)$$

where the last term in parentheses is the Fresnel drag coefficient.  The Fresnel drag effect supported the idea of the ether by explaining why its effects could not be observed (a kind of Catch-22), but it also applied to light moving through a moving dielectric medium.  In 1851, Fizeau used an interferometer to measure the Fresnel drag coefficient for light moving through moving water, arriving at conclusions that directly confirmed the Fresnel drag effect.  The positive experiments of Fizeau, as well as the phenomenon of stellar aberration, would be extremely influential on the thoughts of Einstein as he developed his approach to special relativity in 1905.  They were also extremely influential to Michelson, Morley and Voigt.

In his paper on the absence of the Fresnel drag effect in the first Michelson-Morley experiment, Voigt pointed out that an equation of the form

$$\frac{\partial^2 \phi}{\partial x^2} + \frac{\partial^2 \phi}{\partial y^2} + \frac{\partial^2 \phi}{\partial z^2} - \frac{1}{c^2}\frac{\partial^2 \phi}{\partial t^2} = 0$$

is invariant under the transformation

$$x' = x - vt, \qquad y' = y\sqrt{1 - \frac{v^2}{c^2}}, \qquad z' = z\sqrt{1 - \frac{v^2}{c^2}}, \qquad t' = t - \frac{vx}{c^2}$$

From our modern vantage point, we immediately recognize (to within a scale factor) the Lorentz transformation of relativity theory.  The first equation is common Galilean relativity, but the last equation was something new, introducing a position-dependent time as an observer moved with speed v relative to the speed of light [6].  Using these equations, Voigt was the first to derive the longitudinal (conventional) Doppler effect from relativistic effects.

Voigt’s derivation of the longitudinal Doppler effect used a classical approach that is still used today in Modern Physics textbooks to derive the Doppler effect.  The argument proceeds by considering a moving source that emits a continuous wave in the direction of motion.  Because the wave propagates at a finite speed, the moving source chases the leading edge of the wave front, catching up by a small amount by the time a single cycle of the wave has been emitted.  The resulting compressed oscillation represents a blue shift of the emitted light.  By using his transformations, Voigt arrived at the first relativistic expression for the shift in light frequency.  At low speeds, Voigt’s derivation reverted to Doppler’s original expression.
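The source-chasing argument can be sketched in a few lines for the non-relativistic case.  The symbols used here (period T, source speed v, wave speed c, emitted frequency f and wavelength λ, and β = v/c) are notation assumed for illustration and are not taken from Voigt's paper.

```latex
% One cycle is emitted in a time T = 1/f.  During that time the leading
% edge travels cT while the source advances vT in the same direction.
\lambda' = cT - vT = \lambda\,(1 - \beta), \qquad \beta = \frac{v}{c}
% The compressed wavelength corresponds to a raised (blue-shifted) frequency:
f' = \frac{c}{\lambda'} = \frac{f}{1 - \beta} \approx f\,(1 + \beta)
\quad \text{for } \beta \ll 1
```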

A few months after Voigt delivered his paper, Michelson and Morley announced the null results of their interferometric measurements of the motion of the Earth through the ether.  In retrospect, the Michelson-Morley experiment is viewed as one of the monumental assaults on the old classical physics, helping to launch the relativity revolution.  However, in its own day, it was little more than just another null result on the ether.  It did incite FitzGerald and Lorentz to suggest that the length of the arms of the interferometer contracted in the direction of motion, with the eventual emergence of the full Lorentz transformations by 1904, seventeen years after the Michelson results.

In 1904 Einstein, working in relative isolation at the Swiss patent office, was surprisingly unaware of the latest advances in the physics of the ether.  He did not know about Voigt’s derivation of the relativistic Doppler effect (1887), just as he had not heard of Lorentz’s final version of relativistic coordinate transformations (1904).  His thinking about relativistic effects focused much farther into the past, to Bradley’s stellar aberration (1725) and Fizeau’s experiment of light propagating through moving water (1851).  Einstein proceeded on simple principles, unencumbered by the mental baggage of the day, and delivered his beautifully minimalist theory of special relativity in his famous paper of 1905 “On the Electrodynamics of Moving Bodies”, independently deriving the Lorentz coordinate transformations [7].

One of Einstein’s talents in theoretical physics was to predict new phenomena as a way to provide direct confirmation of a new theory.  This was how he later famously predicted the deflection of light by the Sun and the gravitational frequency shift of light.  In 1905 he used his new theory of special relativity to predict observable consequences that included a general treatment of the relativistic Doppler effect.  This included the effects of time dilation in addition to the longitudinal effect of the source chasing the wave.  Time dilation produced a correction to Doppler’s original expression for the longitudinal effect that became significant at speeds approaching the speed of light.  More significantly, it predicted a transverse Doppler effect for a source moving along a line perpendicular to the line of sight to an observer.  This effect had not been predicted either by Doppler or by Voigt.  The equation for the general Doppler effect for any observation angle is

$$f_{obs} = f_{src}\,\frac{\sqrt{1 - \beta^2}}{1 - \beta\cos\theta}$$

where β = v/c is the speed of the source relative to the speed of light and θ is the angle between the velocity of the source and the line of sight to the observer, measured in the frame of the observer.

Just as Doppler had been motivated by Bradley’s aberration of starlight when he conceived of his original principle for the longitudinal Doppler effect, Einstein combined the general Doppler effect with his results for the relativistic addition of velocities (also in his 1905 Annalen paper) as the conclusive treatment of stellar aberration nearly 200 years after Bradley first observed the effect.
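The longitudinal and transverse cases of the relativistic Doppler effect can be checked numerically.  The sketch below assumes one common convention, which may differ from the one Einstein used: the ratio of observed to emitted frequency is sqrt(1 − β²)/(1 − β cos θ), with θ the angle between the source velocity and the direction from the source to the observer, measured in the observer's frame.  The function name `doppler_ratio` and the value β = 0.1 are illustrative choices.

```python
import math

def doppler_ratio(beta, theta):
    """Observed/emitted frequency ratio for a source moving at beta = v/c.

    theta is the angle (radians) between the source velocity and the
    direction from the source to the observer, measured in the observer's
    frame.  This convention is an assumption of this sketch.
    """
    return math.sqrt(1.0 - beta**2) / (1.0 - beta * math.cos(theta))

beta = 0.1
approaching = doppler_ratio(beta, 0.0)          # longitudinal, blue shift
receding = doppler_ratio(beta, math.pi)         # longitudinal, red shift
transverse = doppler_ratio(beta, math.pi / 2)   # time dilation only

print(approaching, receding, transverse)
```

In the transverse case the cosine term drops out and only the time-dilation factor remains, so the observed frequency is lowered even though the source is neither approaching nor receding.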

Despite the generally positive reception of Einstein’s theory of special relativity, some of its consequences were anathema to many physicists at the time.  A key stumbling block was the question whether relativistic effects, like moving clocks running slowly, were only apparent, or were actually real, and Einstein had to fight to convince others of their reality.  When Johannes Stark (1874 – 1957) observed Doppler line shifts in ion beams called “canal rays” in 1906 (Stark received the 1919 Nobel prize in part for this discovery) [8], Einstein promptly published a paper suggesting how the canal rays could be used in a transverse geometry to directly detect time dilation through the transverse Doppler effect [9].  Thirty years passed before the experiment was performed with sufficient accuracy by Herbert Ives and G. R. Stilwell in 1938 to measure the transverse Doppler effect [10].  Ironically, even at this late date, Ives and Stilwell were convinced that their experiment had disproved Einstein’s time dilation by supporting Lorentz’ contraction theory of the electron.  The Ives-Stilwell experiment was the first direct test of time dilation, followed in 1940 by muon lifetime measurements [11].

Further Reading

D. D. Nolte, “The Fall and Rise of the Doppler Effect,” Phys. Today 73(3) 30, March 2020.


[1] pg. 15, Eden, A. (1992). The search for Christian Doppler. Wien, Springer-Verlag.

[2] pg. 30, Eden

[3] Bradley, J. (1729). “Account of a new discovered Motion of the Fix’d Stars”. Phil. Trans. 35: 637–660.

[4] C. A. DOPPLER, “Über das farbige Licht der Doppelsterne und einiger anderer Gestirne des Himmels (About the coloured light of the binary stars and some other stars of the heavens),” Proceedings of the Royal Bohemian Society of Sciences, vol. V, no. 2, pp. 465–482, (Reissued 1903) (1842).

[5] B. Bolzano, “Ein Paar Bemerkungen über die neue Theorie in Herrn Professor Ch. Doppler’s Schrift ‘Über das farbige Licht der Doppelsterne und einiger anderer Gestirne des Himmels’,” Pogg. Annalen der Physik und Chemie, vol. 60, p. 83, 1843; B. Bolzano, “Christian Doppler’s neuste Leistungen auf dem Gebiet der physikalischen Apparatenlehre, Akustik, Optik und optische Astronomie,” Pogg. Annalen der Physik und Chemie, vol. 72, pp. 530-555, 1847.

[6] W. Voigt, “Über das Doppler’sche Princip,” Göttinger Nachrichten, vol. 7, pp. 41–51, (1887). The common use of c to express the speed of light came later from Voigt’s student Paul Drude.

[7] A. Einstein, “On the electrodynamics of moving bodies,” Annalen Der Physik, vol. 17, pp. 891-921, 1905.

[8] J. Stark, W. Hermann, and S. Kinoshita, “The Doppler effect in the spectrum of mercury,” Annalen Der Physik, vol. 21, pp. 462-469, Nov 1906.

[9] A. Einstein, “Über die Möglichkeit einer neuen Prüfung des Relativitätsprinzips,” Annalen der Physik, vol. 328, pp. 197–198, 1907.

[10] H. E. Ives and G. R. Stilwell, “An experimental study of the rate of a moving atomic clock,” Journal of the Optical Society of America, vol. 28, p. 215, 1938.

[11] B. Rossi and D. B. Hall, “Variation of the Rate of Decay of Mesotrons with Momentum,” Physical Review, vol. 59, pp. 223–228, 1941.

Snell’s Law: The Five-Fold Way

The bending of light rays as they enter a transparent medium, what today is called Snell’s Law, has had a long history of independent discoveries and radically different approaches.  The general problem of refraction was known to the Greeks in the first century AD, and it was later discussed by the Arabic scholar Alhazen.  Ibn Sahl in Baghdad in 984 AD was the first to put an accurate equation to the phenomenon.  Thomas Harriot in England discussed the problem with Johannes Kepler in 1602, unaware of the work by Ibn Sahl.  Willebrord Snellius (1580–1626) in the Netherlands derived the equation for refraction in 1621, but did not publish it, though it was known to Christiaan Huygens (1629 – 1695).  René Descartes (1596 – 1650), unaware of Snellius’ work, derived the law in his Dioptrics, using his newly-invented coordinate geometry.  Huygens, in his Traité de la Lumière (written in 1678 and published in 1690), derived the law yet again, this time using his principle of secondary waves, though he acknowledged the prior work of Snellius, permanently cementing the shortened name “Snell” to the law of refraction.

Through this history and beyond, there have been many approaches to deriving Snell’s Law.  Some used ideas of momentum, while others used principles of waves.  Today, there are roughly five different ways to derive Snell’s law.  These are:

            1) Huygens’ Principle,

            2) Fermat’s Principle,

            3) Wavefront Continuity,

            4) Plane-wave Boundary Conditions, and

            5) Photon Momentum Conservation.

The approaches differ in detail, but they fall into two rough categories:  the first two fall under minimization or extremum principles, and the last three fall under continuity or conservation principles.

Snell’s Law: Huygens’ Principle

Huygens’ principle, published in 1690, states that every point on a wavefront serves as the source of a spherical secondary wave.  This was one of the first wave principles ever proposed for light (Robert Hooke had suggested that light had wavelike character based on his observations of colors in thin films) yet remains amazingly powerful even today.  It can be used not only to derive Snell’s law but also properties of light scattering and diffraction.  Huygens’ principle is a form of minimization principle:  it finds the direction of propagation (for a spherically expanding wavefront from a point where a ray strikes a surface) that yields a minimum angle (tangent to the surface) relative to a second source.  Finding the tangent to the spherical surface is a minimization problem and yields Snell’s Law.

Fig. 1 Huygens’ principle.

The use of Huygens’ principle for the derivation of Snell’s Law is shown in Fig. 1.  Two parallel incoming rays strike a surface a distance d apart.  The first point emits a secondary spherical wave into the second medium.  The wavefront propagates at a speed v2 in the second medium, compared with the speed v1 in the first medium.  In the diagram, the ratio of the distance propagated in a time t to the distance d is equal to the sine of the angle in each medium

$$\sin\theta_1 = \frac{v_1 t}{d}, \qquad \sin\theta_2 = \frac{v_2 t}{d}$$

Solving for d and equating the two equations gives

$$\frac{v_1 t}{\sin\theta_1} = \frac{v_2 t}{\sin\theta_2}$$

The speed depends on the refractive index as

$$v = \frac{c}{n}$$

which leads to Snell’s Law:

$$n_1 \sin\theta_1 = n_2 \sin\theta_2$$
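As a quick numerical illustration, Huygens’ construction gives sin θ1 / v1 = sin θ2 / v2, which with v = c/n is the relation used in the sketch below.  The function name `refraction_angle` and the air and water indices are assumptions chosen for illustration.

```python
import math

def refraction_angle(theta1_deg, n1, n2):
    """Refracted angle (degrees) from n1*sin(theta1) = n2*sin(theta2).

    Returns None when no refracted ray exists (total internal reflection).
    """
    s = n1 * math.sin(math.radians(theta1_deg)) / n2
    if abs(s) > 1.0:
        return None
    return math.degrees(math.asin(s))

# Assumed illustrative indices: air (1.00) into water (1.33).
theta2 = refraction_angle(45.0, 1.00, 1.33)
print(f"theta2 = {theta2:.2f} degrees")
```

Going the other way, from water into air at a steep enough angle, the function returns None because the secondary wavelets no longer share a common tangent in the second medium.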

Snell’s Law: Fermat’s Principle

Fermat’s principle of least time is a direct minimization problem that finds the least time it takes light to propagate from one point to another.  One of the central questions about Fermat’s principle is: why does it work?  Why is the path of least time the path light needs to take?  I’ll answer that question after we do the derivation.  The configuration of the problem is shown in Fig. 2.

Fig. 2 Fermat’s principle of least time and Feynman’s principle of stationary action leading to maximum constructive interference.

Consider a source point A and a destination point B.  Light travels in a straight line in each medium, deflecting at the point x on the figure.  The speed in medium 1 is c/n1, and the speed in medium 2 is c/n2.  What position x provides the minimum time?

The distances from A to x, and from x to B are, respectively:

The total time is

Minimize this expression by taking the derivative of the time relative to the position x and setting the result to zero

Converting the cosines to sines yields Snell’s Law
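The minimization can also be carried out numerically as a check.  The sketch below assumes a geometry that may not match the labels of Fig. 2: point A sits a height h1 above the interface, point B a depth h2 below it, the two are separated horizontally by L, and the ray crosses the interface at x.  The golden-section search and all numerical values are illustrative choices, not part of the original derivation.

```python
import math

def travel_time(x, h1, h2, L, n1, n2):
    """Travel time multiplied by c, from A=(0, h1) to B=(L, -h2) via (x, 0)."""
    return n1 * math.hypot(x, h1) + n2 * math.hypot(L - x, h2)

def least_time_crossing(h1, h2, L, n1, n2, tol=1e-12):
    """Golden-section search for the crossing point x that minimizes the time."""
    inv_phi = (math.sqrt(5.0) - 1.0) / 2.0
    a, b = 0.0, L
    while b - a > tol:
        c = b - inv_phi * (b - a)
        d = a + inv_phi * (b - a)
        if travel_time(c, h1, h2, L, n1, n2) < travel_time(d, h1, h2, L, n1, n2):
            b = d
        else:
            a = c
    return 0.5 * (a + b)

h1, h2, L, n1, n2 = 1.0, 1.0, 2.0, 1.00, 1.33   # assumed illustrative values
x = least_time_crossing(h1, h2, L, n1, n2)
sin1 = x / math.hypot(x, h1)             # sine of the angle of incidence
sin2 = (L - x) / math.hypot(L - x, h2)   # sine of the angle of refraction
print(n1 * sin1, n2 * sin2)              # the two sides of Snell's law
```

The two printed numbers agree to within the accuracy of the search, which is the statement that the path of least time obeys Snell’s Law.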

Fermat’s principle of least time can be explained in terms of wave interference.  If we think of all paths being taken by propagating waves, then those waves that take paths that differ only a little from the optimum path still interfere constructively.  This is the principle of stationarity.  The time minimizes a quadratic expression that deviates from the minimum only in second order (shown in the right part of Fig. 2).  Therefore, all “nearby” paths interfere constructively, while paths that are farther away begin to interfere destructively.  Therefore, the path of least time is also the path of stationary time and hence stationary optical path length and hence the path of maximum constructive interference.  This is the actual path taken by the wave—and the light.

Snell’s Law: Wavefront Continuity

When a wave passes across an interface between two transparent media the phase of the wave remains continuous.  This continuity of phase provides a way to derive Snell’s Law.  Consider Fig. 3.  A plane wave with wavelength λ1 is incident from medium 1 on an interface with medium 2 in which the wavelength is λ2.  The wavefronts remain continuous, but they are “kinked” at the interface.

Fig. 3 Wavefront continuity.

The waves in medium 1 and medium 2 share the part of the interface between wavefronts.  This distance is

$$d = \frac{\lambda_1}{\sin\theta_1} = \frac{\lambda_2}{\sin\theta_2}$$

The wavelengths in the two media are related to the refractive index through

$$\lambda_1 = \frac{\lambda_0}{n_1}, \qquad \lambda_2 = \frac{\lambda_0}{n_2}$$

where λ0 is the free-space wavelength.  Plugging these into the first expression yields

$$\frac{\lambda_0}{n_1 \sin\theta_1} = \frac{\lambda_0}{n_2 \sin\theta_2}$$

which relates the denominators through Snell’s Law

$$n_1 \sin\theta_1 = n_2 \sin\theta_2$$

Snell’s Law: Plane-Wave Boundary Condition

Maxwell’s four equations in integral form can each be applied to the planar interface between two refractive media.

Fig. 4 Electromagnetic boundary conditions leading to phase-matching at the planar interface.

All four boundary conditions can be written as

The only way this condition can be true for all possible values of the fields is if the phases of the wave terms are all the same (phase-matching), namely

which in turn guarantees that the transverse projection of the k-vector is continuous across the interface

and the transverse components (projections) are

where the last line states both Snell’s law of refraction and the law of reflection. Therefore, the general wave boundary condition leads immediately to Snell’s Law.
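The chain of reasoning in this section can be sketched in symbols.  The notation is assumed for illustration and may differ from the original figure: incident, reflected, and transmitted plane waves with amplitudes A_i, A_r, A_t and wave vectors k_i, k_r, k_t, an interface at z = 0, and a plane of incidence taken as the x-z plane.

```latex
% Generic form of a boundary condition at the interface z = 0
A_i\, e^{i(\vec{k}_i \cdot \vec{r} - \omega t)}
  + A_r\, e^{i(\vec{k}_r \cdot \vec{r} - \omega t)}
  = A_t\, e^{i(\vec{k}_t \cdot \vec{r} - \omega t)}
% Phase matching for every point \vec{r} on the interface
\vec{k}_i \cdot \vec{r} = \vec{k}_r \cdot \vec{r} = \vec{k}_t \cdot \vec{r}
% Continuity of the transverse (in-plane) component of the k-vector
k_{i,x} = k_{r,x} = k_{t,x}
% With k = n\omega/c in each medium
n_1 \sin\theta_i = n_1 \sin\theta_r = n_2 \sin\theta_t
```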

Snell’s Law: Momentum Conservation

Going from Maxwell’s equations for classical fields to photons keeps the same mathematical form for the transverse components for the k-vectors, but now interprets them in a different manner.  Where before there was a requirement for phase-matching the classical waves at the interface, in the photon picture the transverse k-vector becomes the transverse momentum through de Broglie’s equation

$$\vec{p} = \hbar \vec{k}$$

Therefore, continuity of the transverse k-vector is interpreted as conservation of transverse momentum of the photon across the interface.  In the figure the second medium is denser with a larger refractive index n2 > n1.  Hence, the momentum of the photon in the second medium is larger while keeping the transverse momentum projection the same.  This simple interpretation gives the same mathematical form as the previous derivation using classical boundary conditions, namely

$$p_i \sin\theta_i = p_r \sin\theta_r = p_t \sin\theta_t \quad\Rightarrow\quad n_1 \sin\theta_i = n_1 \sin\theta_r = n_2 \sin\theta_t$$

which is again Snell’s law and the law of reflection.

Fig. 5 Conservation of transverse photon momentum.


Snell’s Law has an eerie habit of springing from almost any statement that can be made about a dielectric interface. It yields the path of least time, tracks the path of maximum constructive interference, produces wavefronts that are extremally tangent to wavefronts, connects continuous wavefronts across the interface, conserves transverse momentum, and guarantees phase matching. These all sound very different, yet all lead to the same simple law of Snellius and Ibn Sahl.

This is deep physics!

Orbiting Photons around a Black Hole

The physics of a path of light passing a gravitating body is one of the hardest concepts to understand in General Relativity, but it is also one of the easiest.  It is hard because there can be no force of gravity on light even though the path of a photon bends as it passes a gravitating body.  It is easy, because the photon is following the simplest possible path—a geodesic equation for force-free motion.

         This blog picks up where my last blog left off, having there defined the geodesic equation and presenting the Schwarzschild metric.  With those two equations in hand, we could simply solve for the null geodesics (a null geodesic is the path of a light beam through a manifold).  But there turns out to be a simpler approach that Einstein came up with himself (he never did like doing things the hard way).  He just had to sacrifice the fundamental postulate that he used to explain everything about Special Relativity.

Throwing Special Relativity Under the Bus

The fundamental postulate of Special Relativity states that the speed of light is the same for all observers.  Einstein posed this postulate, then used it to derive some of the most astonishing consequences of Special Relativity, like E = mc².  This postulate is at the rock core of his theory of relativity and can be viewed as one of the simplest “truths” of our reality, or at least of our spacetime.

            Yet as soon as Einstein began thinking how to extend SR to a more general situation, he realized almost immediately that he would have to throw this postulate out.   While the speed of light measured locally is always equal to c, the apparent speed of light observed by a distant observer (far from the gravitating body) is modified by gravitational time dilation and length contraction.  This means that the apparent speed of light, as observed at a distance, varies as a function of position.  From this simple conclusion Einstein derived a first estimate of the deflection of light by the Sun, though he initially was off by a factor of 2.  (The full story of Einstein’s derivation of the deflection of light by the Sun and the confirmation by Eddington is in Chapter 7 of Galileo Unbound (Oxford University Press, 2018).)

The “Optics” of Gravity

The invariant element for a light path moving radially in the Schwarzschild geometry is

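$$ds^2 = g_{00}\,c^2 dt^2 + g_{rr}\,dr^2 = 0$$

The apparent speed of light is then

$$c(r) = \frac{dr}{dt} = c\sqrt{-\frac{g_{00}}{g_{rr}}}$$

where c(r) is always less than c, when observing it from flat space.  The “refractive index” of space is defined, as for any optical material, as the ratio of the constant speed divided by the observed speed

$$n(r) = \frac{c}{c(r)} = \sqrt{-\frac{g_{rr}}{g_{00}}}$$

Because the Schwarzschild metric has the property

$$g_{rr} = -\frac{1}{g_{00}}, \qquad g_{00} = -\left(1 - \frac{R_S}{r}\right), \qquad R_S = \frac{2GM}{c^2}$$

the effective refractive index of warped space-time is

$$n(r) = \frac{1}{1 - R_S/r}$$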

with a divergence at the Schwarzschild radius.

            The refractive index of warped space-time in the limit of weak gravity can be used in the ray equation (also known as the Eikonal equation described in an earlier blog)

where the gradient of the refractive index of space is

The ray equation is then a four-variable flow

These equations represent a 4-dimensional flow for a light ray confined to a plane.  The trajectory of any light path is found by using an ODE solver subject to the initial conditions for the direction of the light ray.  This is simple for us to do today with Python or Matlab, but it was also something that could be done long before the advent of computers by early theorists of relativity like Max von Laue  (1879 – 1960).
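A minimal sketch of such a calculation in plain Python is given below.  It makes several assumptions that are not taken from the text: the index is used in the form n(r) = 1/(1 − R_S/r) with R_S as the unit of length, the four flow variables are taken as (x, y, px, py) with p equal to n times the unit tangent of the ray, and a fixed-step fourth-order Runge-Kutta integrator stands in for a library ODE solver.  It handles only rays that stay well outside the Schwarzschild radius and does not treat capture.

```python
import math

R_S = 1.0  # Schwarzschild radius, used as the unit of length (assumed)

def index(r):
    """Effective refractive index n(r) = 1 / (1 - R_S / r), for r > R_S."""
    return 1.0 / (1.0 - R_S / r)

def flow(state):
    """Right-hand side of the four-variable flow for (x, y, px, py).

    Here p = n * (unit tangent), so dr/ds = p / n and dp/ds = grad n.
    """
    x, y, px, py = state
    r = math.hypot(x, y)
    n = index(r)
    dn_dr = -n * n * R_S / (r * r)       # derivative of 1 / (1 - R_S / r)
    return (px / n, py / n, dn_dr * x / r, dn_dr * y / r)

def rk4_step(state, h):
    """Advance the state by one fixed fourth-order Runge-Kutta step."""
    def shifted(k, f):
        return tuple(s + f * ki for s, ki in zip(state, k))
    k1 = flow(state)
    k2 = flow(shifted(k1, 0.5 * h))
    k3 = flow(shifted(k2, 0.5 * h))
    k4 = flow(shifted(k3, h))
    return tuple(s + h / 6.0 * (a + 2 * b + 2 * c + d)
                 for s, a, b, c, d in zip(state, k1, k2, k3, k4))

def deflection(b, x_far=1.0e4, h=1.0):
    """Deflection angle (radians) of a ray with impact parameter b."""
    n0 = index(math.hypot(x_far, b))
    state = (-x_far, b, n0, 0.0)          # start far away, heading in +x
    while state[0] < x_far:
        state = rk4_step(state, h)
    return -math.atan2(state[3], state[2])

alpha = deflection(b=100.0)
print(alpha, 2.0 * R_S / 100.0)   # numerical value vs. weak-field 2 R_S / b
```

For a ray that passes far from the mass, the numerical deflection should come out close to the weak-field value 2 R_S / b = 4GM/(c² b).  Rays launched with smaller impact parameters bend more strongly and need a smaller step size.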

The Relativity of Max von Laue

In the Fall of 1905 in Berlin, a young German physicist by the name of Max Laue was sitting in the physics colloquium at the University listening to another Max, his doctoral supervisor Max Planck, deliver a seminar on Einstein’s new theory of relativity.  Laue was struck by the simplicity of the theory, in this sense “simplistic” and hence hard to believe, but the beauty of the theory stuck with him, and he began to think through the consequences for experiments like the Fizeau experiment on partial ether drag.

         Armand Hippolyte Louis Fizeau (1819 – 1896) in 1851 built one of the world’s first optical interferometers and used it to measure the speed of light inside moving fluids.  At that time the speed of light was believed to be a property of the luminiferous ether, and there were several opposing theories on how light would travel inside moving matter.  One theory would have the ether fully stationary, unaffected by moving matter, and hence the speed of light would be unaffected by motion.  An opposite theory would have the ether fully entrained by matter and hence the speed of light in moving matter would be a simple sum of speeds.  A middle theory considered that only part of the ether was dragged along with the moving matter.  This was Fresnel’s partial ether drag hypothesis that he had arrived at to explain why his friend Francois Arago had not observed any contribution to stellar aberration from the motion of the Earth through the ether.  When Fizeau performed his experiment, the results agreed closely with Fresnel’s drag coefficient, which seemed to settle the matter.  Yet when Michelson and Morley performed their experiments of 1887, there was no evidence for partial drag.

         Even after the exposition by Einstein on relativity in 1905, the disagreement of the Michelson-Morley results with Fizeau’s results was not fully reconciled until Laue showed in 1907 that the velocity addition theorem of relativity gave complete agreement with the Fizeau experiment.  The velocity observed in the lab frame is found using the velocity addition theorem of special relativity. For the Fizeau experiment, water with a refractive index of n is moving with a speed v, the speed of light in the rest frame of the water is c/n, and hence the speed in the lab frame, to first order in v/c, is

$$u = \frac{c/n + v}{1 + v/(nc)} \approx \frac{c}{n} + v\left(1 - \frac{1}{n^2}\right)$$

The difference in the speed of light between the stationary and the moving water is the difference

$$\Delta u = u - \frac{c}{n} \approx v\left(1 - \frac{1}{n^2}\right)$$

where the factor in parentheses is precisely the Fresnel drag coefficient.  This was one of the first definitive “proofs” of the validity of Einstein’s theory of relativity, and it made Laue one of relativity’s staunchest proponents.  Spurred on by his success with the Fresnel drag coefficient explanation, Laue wrote the first monograph on relativity theory, publishing it in 1910. 
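Laue's argument is easy to check numerically. The following is a minimal sketch, not a reconstruction of his calculation: it applies the relativistic velocity addition theorem to light moving at c/n in the rest frame of the water and compares the result with the first-order Fresnel drag coefficient 1 - 1/n^2. The refractive index and flow speed are illustrative assumptions, and the function names are hypothetical.

```python
# Sketch: relativistic velocity addition versus Fresnel's drag coefficient.
# n_water and v_flow are illustrative assumptions, not values from the text.

C = 299_792_458.0  # speed of light in vacuum, m/s


def lab_speed_relativistic(n, v, c=C):
    """Lab-frame speed of light in a medium of index n moving at speed v."""
    u_rest = c / n  # speed of light in the rest frame of the medium
    return (u_rest + v) / (1.0 + u_rest * v / c**2)


def lab_speed_fresnel(n, v, c=C):
    """First-order prediction using the Fresnel drag coefficient (1 - 1/n^2)."""
    return c / n + v * (1.0 - 1.0 / n**2)


n_water = 1.333  # assumed refractive index of water
v_flow = 7.0     # assumed flow speed, m/s

u_rel = lab_speed_relativistic(n_water, v_flow)
u_fre = lab_speed_fresnel(n_water, v_flow)

# Effective drag coefficient extracted from the relativistic result
drag = (u_rel - C / n_water) / v_flow

print("relativistic lab speed:", u_rel)
print("Fresnel lab speed:     ", u_fre)
print("effective drag coefficient:", drag)
print("Fresnel coefficient 1 - 1/n^2:", 1.0 - 1.0 / n_water**2)
```

The two coefficients differ only by terms of order v/c, which is far below anything Fizeau could have resolved.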

Fig. 1 Front page of von Laue’s textbook, first published in 1910, on Special Relativity (this is the 4th edition, published in 1921).

A Nobel Prize for Crystal X-ray Diffraction

In 1909 Laue became a Privatdozent under Arnold Sommerfeld (1868 – 1951) at the university in Munich.  In the Spring of 1912 he was walking in the Englischer Garten on the northern edge of the city talking with Paul Ewald (1888 – 1985) who was finishing his doctorate with Sommerfeld studying the structure of crystals.  Ewald was considering the interaction of optical wavelengths with the periodic lattice when it struck Laue that x-rays would have the kind of short wavelengths that would allow the crystal to act as a diffraction grating to produce multiple diffraction orders.  Within a few weeks of that discussion, two of Sommerfeld’s students (Friedrich and Knipping) used an x-ray source and photographic film to look for the predicted diffraction spots from a copper sulfate crystal.  When the film was developed, it showed a constellation of dark spots for each of the diffraction orders of the x-rays scattered from the multiple periodicities of the crystal lattice.  Two years later, in 1914, Laue was awarded the Nobel prize in physics for the discovery.  That same year his father was elevated to the hereditary nobility in the Prussian empire and Max Laue became Max von Laue.

            Von Laue was not one to take risks, and he remained conservative in many of his interests.  He was immensely respected and played important roles in the administration of German science, but his scientific contributions after receiving the Nobel Prize were only modest.  Yet as the Nazis came to power in the early 1930’s, he was one of the few physicists to stand up and resist the Nazi take-over of German physics.  He was especially disturbed by the plight of the Jewish physicists.  In 1933 he was invited to give the keynote address at the conference of the German Physical Society in Wurzburg where he spoke out against the Nazi rejection of relativity as they branded it “Jewish science”.  In his speech he likened Einstein, the target of much of the propaganda, to Galileo.  He said, “No matter how great the repression, the representative of science can stand erect in the triumphant certainty that is expressed in the simple phrase: And yet it moves.”  Von Laue believed that truth would hold out in the face of the proscription against relativity theory by the Nazi regime.  The quote “And yet it moves” is supposed to have been muttered by Galileo just after his abjuration before the Inquisition, referring to the Earth moving around the Sun.  Although the quote is famous, it is believed to be a myth.

            In an odd side-note of history, von Laue sent his gold Nobel prize medal to Denmark for its safe keeping with Niels Bohr so that it would not be paraded about by the Nazi regime.  Yet when the Nazis invaded Denmark, to avoid having the medal fall into the hands of the Nazis, it was dissolved in aqua regia by a member of Bohr’s team, George de Hevesy.  The gold completely dissolved into an orange liquid that was stored in a beaker high on a shelf through the war.  When Denmark was finally freed, the dissolved gold was precipitated out and a new medal was struck by the Nobel committee and re-presented to von Laue in a ceremony in 1951. 

The Orbits of Light Rays

Von Laue’s interests always stayed close to the properties of light and electromagnetic radiation ever since he was introduced to the field when he studied with Woldemar Voigt at Göttingen in 1899.  This interest included the theory of relativity, and only a few years after Einstein published his theory of General Relativity and Gravitation, von Laue added to his earlier textbook on relativity by writing a second volume on the general theory.  The new volume was published in 1920 and included the theory of the deflection of light by gravity. 

         One of the very few illustrations in his second volume is of light coming into interaction with a super massive gravitational field characterized by a Schwarzschild radius.  (No one at the time called it a “black hole”, nor even mentioned Schwarzschild.  That terminology came much later.)  He shows in the drawing, how light, if incident at just the right impact parameter, would actually loop around the object.  This is the first time such a diagram appeared in print, showing the trajectory of light so strongly affected by gravity.

Fig. 2 A page from von Laue’s second volume on relativity (first published in 1920) showing the orbit of a photon around a compact mass with “gravitational cutoff” (later known as a “black hole”). The figure is drawn semi-quantitatively, but the phenomenon was clearly understood by von Laue.

Python Code

#!/usr/bin/env python3
# -*- coding: utf-8 -*-
"""
Created on Tue May 28 11:50:24 2019
@author: nolte
D. D. Nolte, Introduction to Modern Dynamics: Chaos, Networks, Space and Time, 2nd ed. (Oxford,2019)
"""

import numpy as np
from scipy import integrate
from matplotlib import pyplot as plt


def create_circle():
    # Black disk of radius 10 marking the black hole
    circle = plt.Circle((0,0), radius= 10, color = 'black')
    return circle

def show_shape(patch):
    # Add the patch to the current axes with equal scaling in x and y
    ax = plt.gca()
    ax.add_patch(patch)
    plt.axis('scaled')

def refindex(x,y):
    # Effective refractive index n and the gradient terms (nx, ny) at (x, y)
    A = 10          # Schwarzschild radius
    eps = 1e-6      # avoids division by zero at the origin
    rp0 = np.sqrt(x**2 + y**2)
    n = 1/(1 - A/(rp0+eps))
    fac = np.abs((1-9*(A/rp0)**2/8))   # approx correction to Eikonal
    nx = -fac*n**2*A*x/(rp0+eps)**3
    ny = -fac*n**2*A*y/(rp0+eps)**3
    return [n,nx,ny]

def flow_deriv(x_y_z,tspan):
    # State is (x, y, px, py); returns its derivative along the ray
    x, y, z, w = x_y_z
    [n,nx,ny] = refindex(x,y)
    yp = np.zeros(shape=(4,))
    yp[0] = z/n
    yp[1] = w/n
    yp[2] = nx
    yp[3] = ny
    return yp

# Rays launched from the left, parallel to x, over a range of impact parameters
for loop in range(-5,30):
    xstart = -100
    ystart = -2.245 + 4*loop
    [n,nx,ny] = refindex(xstart,ystart)

    y0 = [xstart, ystart, n, 0]

    tspan = np.linspace(1,400,2000)

    y = integrate.odeint(flow_deriv, y0, tspan)

    xx = y[1:2000,0]
    yy = y[1:2000,1]

    lines = plt.plot(xx,yy)
    plt.setp(lines, linewidth=1)
    plt.title('Photon Orbits')

c = create_circle()
show_shape(c)
axes = plt.gca()
axes.set_xlim(-100,100)    # view window; adjust as needed
axes.set_ylim(-100,100)

# Now set up a circular photon orbit
xstart = 0
ystart = 15

[n,nx,ny] = refindex(xstart,ystart)

y0 = [xstart, ystart, n, 0]

tspan = np.linspace(1,94,1000)

y = integrate.odeint(flow_deriv, y0, tspan)

xx = y[1:1000,0]
yy = y[1:1000,1]

lines = plt.plot(xx,yy)
plt.setp(lines, linewidth=2, color = 'black')
plt.show()

One of the most striking effects of gravity on photon trajectories is the possibility for a photon to orbit a black hole in a circular orbit. This is shown in Fig. 3 as the black circular ring for a photon at a radius equal to 1.5 times the Schwarzschild radius. This radius defines what is known as the photon sphere. However, the orbit is not stable. Slight deviations will send the photon spiraling outward or inward.

The Eikonal approximation does not strictly hold under strong gravity, but the Eikonal equations with the effective refractive index of space still yield semi-quantitative behavior. In the Python code, a correction factor is used to match the theory to the circular photon orbits, while still agreeing with trajectories far from the black hole. The results of the calculation are shown in Fig. 3. For large impact parameters, the rays are deflected through a finite angle. At a critical impact parameter, near 3 times the Schwarzschild radius, the ray loops around the black hole. For smaller impact parameters, the rays are captured by the black hole.
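The role of the correction factor can be checked with a few lines of arithmetic. On a circle of radius r the index is constant, so a ray with momentum of magnitude n needs an inward rate of change of momentum equal to n/r to stay on the circle. The sketch below (a cross-check under that assumption, using the same index profile and correction factor as the code above, with hypothetical function names) shows that the balance holds at r = 15, which is 1.5 times the Schwarzschild radius of 10 units.

```python
# Sketch: where does the corrected Eikonal "force" support a circular orbit?
# Uses the profile n = 1/(1 - A/r) and the correction factor from the listing above.

A = 10.0  # Schwarzschild radius in plot units


def index_and_radial_pull(r):
    """Return n(r) and the magnitude of the inward dp/ds used in the ray equations."""
    n = 1.0 / (1.0 - A / r)
    fac = abs(1.0 - 9.0 * (A / r) ** 2 / 8.0)  # approximate correction to the Eikonal
    pull = fac * n**2 * A / r**2               # radial magnitude of (nx, ny)
    return n, pull


def circular_orbit_mismatch(r):
    """Zero when the inward pull equals the n/r needed for a circle of radius r."""
    n, pull = index_and_radial_pull(r)
    return pull - n / r


for r in (12.0, 15.0, 20.0):
    print(r, circular_orbit_mismatch(r))
```

The mismatch is positive inside r = 15 and negative outside it, which is consistent with the orbit being unstable: a ray displaced slightly inward is pulled in further, and one displaced outward escapes.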

Fig. 3 Photon orbits near a black hole calculated using the Eikonal equation and the effective refractive index of warped space. One ray, near the critical impact parameter, loops around the black hole as predicted by von Laue. The central black circle is the black hole with a Schwarzschild radius of 10 units. The black ring is the circular photon orbit at a radius 1.5 times the Schwarzschild radius.

Photons pile up around the black hole at the photon sphere. The first image ever of the photon sphere of a black hole was made earlier this year (announced April 10, 2019). The image shows the shadow of the supermassive black hole in the center of Messier 87 (M87), an elliptical galaxy 55 million light-years from Earth. This black hole is 6.5 billion times the mass of the Sun. Imaging the photon sphere required eight ground-based radio telescopes placed around the globe, operating together to form a single telescope with an aperture the size of our planet.  The resolution of such a large telescope would allow one to image a half-dollar coin on the surface of the Moon, although this telescope operates in the radio frequency range rather than the optical.

Fig. 4 Scientists have obtained the first image of a black hole, using Event Horizon Telescope observations of the center of the galaxy M87. The image shows a bright ring formed as light bends in the intense gravity around a black hole that is 6.5 billion times more massive than the Sun.

Further Reading

D. D. Nolte, Introduction to Modern Dynamics: Chaos, Networks, Space and Time, 2nd Ed. (Oxford University Press, 2019)

B. Lavenda, The Optical Properties of Gravity, J. Mod. Phys. 8, 803-838 (2017)

The Iconic Eikonal and the Optical Path

Nature loves the path of steepest descent.  Place a ball on a smooth curved surface and release it, and it will instantaneously accelerate in the direction of steepest descent.  Shoot a laser beam from an oblique angle onto a piece of glass to hit a target inside, and the path taken by the beam is the only path that decreases the distance to the target in the shortest time.  Diffract a stream of electrons from the surface of a crystal, and quantum detection events are greatest at the positions where the troughs and peaks of the de Broglie waves converge the most.  The first example is Newton’s second law.  The second example is Fermat’s principle.  The third example is Feynman’s path-integral formulation of quantum mechanics.  They all share in common a minimization principle—the principle of least action—that the path of a dynamical system is the one that minimizes a property known as “action”.


         The principle of least action, first proposed by the French physicist Maupertuis through mechanical analogy, became a principle of Lagrangian mechanics in the hands of Lagrange, but was still restricted to mechanical systems of particles.  The principle was generalized forty years later by Hamilton, who began by considering the propagation of light waves, and ended by transforming mechanics into a study of pure geometry divorced from forces and inertia.  Optics played a key role in the development of mechanics, and mechanics returned the favor by giving optics the Eikonal Equation.  The Eikonal Equation is the “F = ma” of ray optics.  Its solutions describe the paths of light rays through complicated media.

Malus’ Theorem

Anyone who has taken a course in optics knows that Étienne-Louis Malus (1775-1812) discovered the polarization of light, but little else is taught about this French mathematician who was one of the savants Napoleon had taken along with him when he invaded Egypt in 1798.  After experiencing numerous horrors of war and plague, Malus returned to France damaged but wiser.  He discovered the polarization of light in the Fall of 1808 as he was playing with crystals of Iceland spar at sunset and happened to view the last rays of the sun reflected from the windows of the Luxembourg palace.  Iceland spar produces double images in natural light because it is birefringent.  Malus discovered that he could extinguish one of the double images of the Luxembourg windows by rotating the crystal a certain way, demonstrating that light is polarized by reflection.  The degree to which light is extinguished as a function of the angle of the polarizing crystal is known as Malus’ Law.

Frontispiece to the Description de l’Égypte, the first volume published by Joseph Fourier in 1808 based on the report of the savants of l’Institut d’Égypte that included Monge, Fourier and Malus, among many other French scientists and engineers.

         Malus had picked up an interest in the general properties of light and imaging during lulls in his ordeal in Egypt.  He was an emissionist following his compatriot Laplace, rather than an undulationist following Thomas Young.  It is ironic that the French scientists were staunchly supporting Newton on the nature of light, while the British scientist Thomas Young was trying to upend Newtonian optics.  Almost all physicists at that time were emissionists, only a few years after Young’s double-slit experiment of 1804, and few serious scientists accepted Young’s theory of the wave nature of light until Fresnel and Arago supplied the rigorous theory and experimental proofs much later in 1819. 

Malus’ Theorem states that rays perpendicular to an initial surface are perpendicular to a later surface after reflection in an optical system. This theorem is the starting point for the Eikonal ray equation, as well as for modern applications in adaptive optics. This figure shows a propagating aberrated wavefront that is “compensated” by a deformable mirror to produce a tight focus.

         As a prelude to his later discovery of polarization, Malus had earlier proven a theorem about trajectories that particles of light take through an optical system.  One of the key questions about the particles of light in an optical system was how they formed images.  The physics of light particles moving through lenses was too complex to treat at that time, but reflection was relatively easy based on the simple reflection law.  Malus proved a theorem mathematically that after a reflection from a curved mirror, a set of rays perpendicular to an initial nonplanar surface would remain perpendicular at a later surface after reflection (this property is closely related to the conservation of optical etendue).  This is known as Malus’ Theorem, and he thought it only held true after a single reflection, but later mathematicians proved that it remains true even after an arbitrary number of reflections, even in cases when the rays intersect to form an optical effect known as a caustic.  The mathematics of caustics would catch the interest of an Irish mathematician and physicist who helped launch a new field of mathematical physics.

Etienne-Louis Malus

Hamilton’s Characteristic Function

William Rowan Hamilton (1805 – 1865) was a child prodigy who taught himself thirteen languages by the time he was thirteen years old (with the help of his linguist uncle), but mathematics became his primary focus at Trinity College at the University in Dublin.  His mathematical prowess was so great that he was made the Astronomer Royal of Ireland while still an undergraduate student.  He also became fascinated by the theory of envelopes of curves, and in particular by the mathematics of caustic curves in optics. 

         In 1823 at the age of 18, he wrote a paper titled Caustics that was read to the Royal Irish Academy.  In this paper, Hamilton gave an exceedingly simple proof of Malus’ Theorem, but that was perhaps the simplest part of the paper.  Other aspects were mathematically obscure and reviewers requested further additions and refinements before publication.  Over the next four years, as Hamilton expanded this work on optics, he developed a new theory of optics, the first part of which was published as Theory of Systems of Rays in 1827 with two following supplements completed by 1833 but never published.

         Hamilton’s most important contribution to optical theory (and eventually to mechanics) he called his characteristic function.  By applying the principle of Fermat’s least time, which he called his principle of stationary action, he sought to find a single unique function that characterized every path through an optical system.  By first proving Malus’ Theorem and then applying the theorem to any system of rays using the principle of stationary action, he was able to construct two partial differential equations whose solution, if it could be found, defined every ray through the optical system.  This result was completely general and could be extended to include curved rays passing through inhomogeneous media.  Because it mapped input rays to output rays, it was the most general characterization of any defined optical system.  The characteristic function defined surfaces of constant action whose normal vectors were the rays of the optical system.  Today these surfaces of constant action are called the Eikonal function (but how it got its name is the next chapter of this story).  Using his characteristic function, Hamilton predicted a phenomenon known as conical refraction in 1832, which was subsequently observed, launching him to a level of fame unusual for an academic.

         Once Hamilton had established his principle of stationary action of curved light rays, it was an easy step to extend it to apply to mechanical systems of particles with curved trajectories.  This step produced his most famous work On a General Method in Dynamics published in two parts in 1834 and 1835 [1] in which he developed what became known as Hamiltonian dynamics.  As his mechanical work was extended by others including Jacobi, Darboux and Poincaré, Hamilton’s work on optics was overshadowed, overlooked and eventually lost.  It was rediscovered when Schrödinger, in his famous paper of 1926, invoked Hamilton’s optical work as a direct example of the wave-particle duality of quantum mechanics [2]. Yet in the interim, a German mathematician tackled the same optical problems that Hamilton had seventy years earlier, and gave the Eikonal Equation its name.

Bruns’ Eikonal

The German mathematician Heinrich Bruns (1848-1919) was engaged chiefly with the measurement of the Earth, or geodesy.  He was a professor of mathematics in Berlin and later Leipzig.  One claim to fame was that one of his graduate students was Felix Hausdorff [3] who would go on to much greater fame in the field of set theory and measure theory (the Hausdorff dimension was a precursor to the fractal dimension).  Possibly motivated by his studies done with Hausdorff on refraction of light by the atmosphere, Bruns became interested in Malus’ Theorem for the same reasons and with the same goals as Hamilton, yet was unaware of Hamilton’s work in optics. 

         The mathematical process of creating “images”, in the sense of a mathematical mapping, made Bruns think of the Greek word εἰκών (eikōn), which literally means “icon” or “image”, and he published a small book in 1895 with the title Das Eikonal in which he derived a general equation for the path of rays through an optical system.  His approach was heavily geometrical and is not easily recognized as an equation arising from variational principles.  It rediscovered most of the results of Hamilton’s paper on the Theory of Systems of Rays and was thus not groundbreaking in the sense of new discovery.  But it did reintroduce the world to the problem of systems of rays, and his name of Eikonal for the equations of the ray paths stuck, and was used with increasing frequency in subsequent years.  Arnold Sommerfeld (1868 – 1951) was one of the early proponents of the Eikonal equation and recognized its connection with action principles in mechanics. He discussed the Eikonal equation in a 1911 optics paper with Runge [4] and in 1916 used action principles to extend Bohr’s model of the hydrogen atom [5]. While the Eikonal approach was not used often, it became popular in the 1960’s when computational optics made numerical solutions possible.

Lagrangian Dynamics of Light Rays

In physical optics, one of the most important properties of a ray passing through an optical system is known as the optical path length (OPL).  The OPL is the central quantity that is used in problems of interferometry, and it is the central property that appears in Fermat’s principle that leads to Snell’s Law.  The OPL played an important role in the history of the calculus when Johann Bernoulli in 1697 used it to derive the path taken by a light ray as an analogy of a brachistochrone curve – the curve of least time taken by a particle between two points.

            The OPL between two points in a refractive medium is the sum of the piecewise product of the refractive index n with infinitesimal elements of the path length ds.  In integral form, this is expressed as

$$\mathrm{OPL} = \int n(x^a)\, ds = \int n(x^a) \sqrt{\dot{x}^a \dot{x}^a}\, ds$$

where the “dot” is a derivative with respect to s and the repeated index a is summed.  The optical Lagrangian is recognized as

$$L(x^a, \dot{x}^a) = n(x^a) \sqrt{\dot{x}^a \dot{x}^a}$$

The Lagrangian is inserted into the Euler equations to yield (after some algebra, see Introduction to Modern Dynamics pg. 336)

$$\frac{d}{ds}\left( n \frac{dx^a}{ds} \right) = \frac{\partial n}{\partial x^a}$$

This is a second-order ordinary differential equation in the variables x^a that define the ray path through the system.  It is literally a “trajectory” of the ray, and the Eikonal equation becomes the F = ma of ray optics.
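The connection between the OPL and Snell’s Law can be seen in a small numerical experiment. The sketch below is illustrative only: the geometry, the two refractive indices and the helper names are assumptions. It minimizes the optical path length of a two-segment path across a flat interface and then checks that the minimizing path satisfies n1 sin θ1 = n2 sin θ2.

```python
import math

# Illustrative setup (assumptions): source A in medium 1 (y > 0),
# target B in medium 2 (y < 0), flat interface along y = 0.
N1, N2 = 1.0, 1.5
A = (-1.0, 1.0)
B = (1.0, -1.0)


def optical_path_length(x):
    """OPL of the path A -> (x, 0) -> B: sum of index times segment length."""
    return N1 * math.hypot(x - A[0], A[1]) + N2 * math.hypot(B[0] - x, B[1])


def golden_section_minimize(f, lo, hi, tol=1e-8):
    """Minimize a unimodal function f on [lo, hi] by golden-section search."""
    invphi = (math.sqrt(5.0) - 1.0) / 2.0
    c = hi - invphi * (hi - lo)
    d = lo + invphi * (hi - lo)
    while abs(hi - lo) > tol:
        if f(c) < f(d):
            hi = d
        else:
            lo = c
        c = hi - invphi * (hi - lo)
        d = lo + invphi * (hi - lo)
    return 0.5 * (lo + hi)


# Crossing point on the interface that minimizes the OPL (Fermat's principle)
x_min = golden_section_minimize(optical_path_length, A[0], B[0])

# Sines of the angles measured from the interface normal
sin1 = (x_min - A[0]) / math.hypot(x_min - A[0], A[1])
sin2 = (B[0] - x_min) / math.hypot(B[0] - x_min, B[1])

print("crossing point x:", x_min)
print("n1 sin(theta1):", N1 * sin1)
print("n2 sin(theta2):", N2 * sin2)
```

The two printed products agree to within the tolerance of the search, which is Snell’s Law recovered from the minimization alone.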

Hamiltonian Optics

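In a paraxial system (in which the rays never make large angles relative to the optic axis) it is common to select the position z as a single parameter to define the curve of the ray path so that the trajectory is parameterized as

$$\mathrm{OPL} = \int n(x, y, z) \sqrt{1 + x'^2 + y'^2}\, dz$$

where the derivatives are with respect to z, and the effective Lagrangian is recognized as

$$L(x, y, x', y', z) = n(x, y, z) \sqrt{1 + x'^2 + y'^2}$$

The Hamiltonian formulation is derived from the Lagrangian by defining an optical Hamiltonian as the Legendre transform of the Lagrangian.  To start, the Lagrangian is expressed in terms of the generalized coordinates and momenta.  The generalized optical momenta are defined as

$$p_x = \frac{\partial L}{\partial x'} = \frac{n x'}{\sqrt{1 + x'^2 + y'^2}}, \qquad p_y = \frac{\partial L}{\partial y'} = \frac{n y'}{\sqrt{1 + x'^2 + y'^2}}$$

This relationship leads to an alternative expression for the Eikonal equation (also known as the scalar Eikonal equation) expressed as

$$\vec{p} = \nabla S, \qquad \left| \nabla S \right|^2 = n^2$$

where S(x,y,z) = const. is the eikonal function.  The momentum vectors are perpendicular to the surfaces of constant S, which are recognized as the wavefronts of a propagating wave.

            The Lagrangian can be restated as a function of the generalized momenta as

$$L = \frac{n^2}{\sqrt{n^2 - p_x^2 - p_y^2}}$$

and the Legendre transform that takes the Lagrangian into the Hamiltonian is

$$H = p_x x' + p_y y' - L = -\sqrt{n^2 - p_x^2 - p_y^2}$$

The trajectory of the rays is the solution to Hamilton’s equations of motion applied to this Hamiltonian

$$x' = \frac{\partial H}{\partial p_x}, \qquad y' = \frac{\partial H}{\partial p_y}, \qquad p_x' = -\frac{\partial H}{\partial x}, \qquad p_y' = -\frac{\partial H}{\partial y}$$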
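The Legendre transform is easy to verify numerically. The sketch below assumes the usual optical Lagrangian with z as the parameter, L = n sqrt(1 + x'^2 + y'^2), computes the momenta and the Hamiltonian directly from their definitions, and compares them with the closed forms H = -sqrt(n^2 - p_x^2 - p_y^2) and L = n^2 / sqrt(n^2 - p_x^2 - p_y^2). The numerical values and the function name are illustrative assumptions.

```python
import math


def paraxial_quantities(n, xp, yp):
    """Momenta, Lagrangian and Hamiltonian for slopes xp = dx/dz and yp = dy/dz."""
    root = math.sqrt(1.0 + xp**2 + yp**2)
    lagrangian = n * root                             # L = n sqrt(1 + x'^2 + y'^2)
    px = n * xp / root                                # p_x = dL/dx'
    py = n * yp / root                                # p_y = dL/dy'
    hamiltonian = px * xp + py * yp - lagrangian      # Legendre transform
    return px, py, lagrangian, hamiltonian


# Illustrative local index and ray slopes (assumptions)
n, xp, yp = 1.5, 0.1, -0.05
px, py, L_direct, H_direct = paraxial_quantities(n, xp, yp)

# Closed forms written in terms of the momenta
H_closed = -math.sqrt(n**2 - px**2 - py**2)
L_closed = n**2 / math.sqrt(n**2 - px**2 - py**2)

print("H from Legendre transform:", H_direct)
print("H closed form:            ", H_closed)
print("L direct:", L_direct, " L from momenta:", L_closed)
```

The Hamiltonian is the negative of the z-component of the optical momentum, which is why it comes out negative for a ray traveling in the +z direction.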

Light Orbits

If the optical rays are restricted to the x-y plane, then Hamilton’s equations of motion can be expressed relative to the path length ds, and the momenta are p_a = n dx^a/ds.  The ray equations are (simply expressing the two second-order Eikonal equations as four first-order equations)

$$\dot{x} = \frac{p_x}{n}, \qquad \dot{y} = \frac{p_y}{n}, \qquad \dot{p}_x = \frac{\partial n}{\partial x}, \qquad \dot{p}_y = \frac{\partial n}{\partial y}$$

where the dot is a derivative with respect to the element ds.

As an example, consider a radial refractive index profile in the x-y plane

$$n(r) = 1 + \exp\left( -\frac{r^2}{2\sigma^2} \right)$$

where r is the radius on the x-y plane and σ sets the width of the profile. Putting this refractive index profile into the Eikonal equations creates a two-dimensional orbit in the x-y plane. The following Python code solves for individual trajectories.

Python Code: raysimple.py

#!/usr/bin/env python3
# -*- coding: utf-8 -*-
"""
Created on Tue May 28 11:50:24 2019
@author: nolte
D. D. Nolte, Introduction to Modern Dynamics: Chaos, Networks, Space and Time, 2nd ed. (Oxford,2019)
"""

import numpy as np
from scipy import integrate
from matplotlib import pyplot as plt
from matplotlib import cm


# selection 1 = Gaussian
# selection 2 = Donut
selection = 1

def refindex(x,y):
    # Refractive index n and its gradient (nx, ny) at the point (x, y)
    if selection == 1:
        sig = 10
        n = 1 + np.exp(-(x**2 + y**2)/2/sig**2)
        nx = (-2*x/2/sig**2)*np.exp(-(x**2 + y**2)/2/sig**2)
        ny = (-2*y/2/sig**2)*np.exp(-(x**2 + y**2)/2/sig**2)
    elif selection == 2:
        sig = 10
        r2 = (x**2 + y**2)
        r1 = np.sqrt(r2)
        expon = np.exp(-r2/2/sig**2)
        n = 1+0.3*r1*expon
        # d(r1)/dx = x/r1, so the second term carries x/r1 (undefined at r1 = 0)
        nx = 0.3*r1*(-2*x/2/sig**2)*expon + 0.3*expon*x/r1
        ny = 0.3*r1*(-2*y/2/sig**2)*expon + 0.3*expon*y/r1
    return [n,nx,ny]

def flow_deriv(x_y_z,tspan):
    # State is (x, y, px, py); returns its derivative along the ray
    x, y, z, w = x_y_z
    n, nx, ny = refindex(x,y)
    yp = np.zeros(shape=(4,))
    yp[0] = z/n
    yp[1] = w/n
    yp[2] = nx
    yp[3] = ny
    return yp

# Sample the refractive index on a 100 x 100 grid spanning -20 to 20
V = np.zeros(shape=(100,100))
for xloop in range(100):
    xx = -20 + 40*xloop/100
    for yloop in range(100):
        yy = -20 + 40*yloop/100
        n,nx,ny = refindex(xx,yy)
        V[yloop,xloop] = n

# Figure 1: contour map of the refractive index profile
fig = plt.figure(1)
contr = plt.contourf(V,100, cmap=cm.coolwarm, vmin = 1, vmax = 3)
fig.colorbar(contr, shrink=0.5, aspect=5)

v1 = 0.707      # Change this initial condition
v2 = np.sqrt(1-v1**2)
y0 = [12, v1, 0, v2]     # Change these initial conditions

tspan = np.linspace(1,1700,1700)

y = integrate.odeint(flow_deriv, y0, tspan)

# Figure 2: the ray trajectory in the x-y plane
plt.figure(2)
lines = plt.plot(y[1:1550,0],y[1:1550,1])
plt.setp(lines, linewidth=0.5)
plt.show()
Gaussian refractive index profile in the x-y plane. From raysimple.py.
Ray orbits around the center of the Gaussian refractive index profile. From raysimple.py
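One way to gain confidence in a ray integration like this is to monitor conserved quantities. For any radially symmetric index n(r), the ray equations conserve the “angular momentum” x p_y - y p_x, and they also conserve p^2 - n^2, which is zero when the ray is launched with |p| = n. The sketch below is a cross-check under those assumptions rather than part of raysimple.py: it uses the same Gaussian profile but integrates with scipy’s solve_ivp, and the launch point and tolerances are illustrative choices.

```python
import numpy as np
from scipy.integrate import solve_ivp

SIGMA = 10.0  # width of the Gaussian index profile (same value as raysimple.py)


def refindex(x, y):
    """Gaussian index profile n = 1 + exp(-r^2 / (2 sigma^2)) and its gradient."""
    g = np.exp(-(x**2 + y**2) / (2.0 * SIGMA**2))
    return 1.0 + g, -x / SIGMA**2 * g, -y / SIGMA**2 * g


def ray_rhs(s, state):
    """Four first-order ray equations: dx/ds = p/n and dp/ds = grad n."""
    x, y, px, py = state
    n, nx, ny = refindex(x, y)
    return [px / n, py / n, nx, ny]


# Launch at (12, 0) heading in +y with |p| = n (illustrative initial condition)
x0, y0 = 12.0, 0.0
n0, _, _ = refindex(x0, y0)
state0 = [x0, y0, 0.0, n0]

sol = solve_ivp(ray_rhs, (0.0, 200.0), state0, rtol=1e-10, atol=1e-12)
x, y, px, py = sol.y

n_along, _, _ = refindex(x, y)
ang_mom = x * py - y * px                      # conserved for any n(r)
eikonal_residual = px**2 + py**2 - n_along**2  # stays zero if |p| = n at launch

ang_mom_drift = np.max(np.abs(ang_mom - ang_mom[0]))
max_residual = np.max(np.abs(eikonal_residual))

print("max drift in x*py - y*px:", ang_mom_drift)
print("max |p^2 - n^2|:", max_residual)
```

If either quantity drifts noticeably, the step size or tolerances of the solver are too loose for the trajectory being studied.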


An excellent textbook on geometric optics from Hamilton’s point of view is K. B. Wolf, Geometric Optics in Phase Space (Springer, 2004). Another is H. A. Buchdahl, An Introduction to Hamiltonian Optics (Dover, 1992).

A rather older textbook on geometrical optics is by J. L. Synge, Geometrical Optics: An Introduction to Hamilton’s Method (Cambridge University Press, 1962) showing the derivation of the ray equations in the final chapter using variational methods. Synge takes a dim view of Bruns’ term “Eikonal” since Hamilton got there first and Bruns was unaware of it.

A book that makes an especially strong case for the Optical-Mechanical analogy of Fermat’s principle, connecting the trajectories of mechanics to the paths of optical rays is Darryl Holm, Geometric Mechanics: Part I Dynamics and Symmetry (Imperial College Press 2008).

The Eikonal ray equation is derived from the geodesic equation (or rather as a geodesic equation) in D. D. Nolte, Introduction to Modern Dynamics, 2nd-edition (Oxford, 2019).


[1] Hamilton, W. R. “On a general method in dynamics I.” Mathematical Papers, I ,103-161: 247-308. (1834); Hamilton, W. R. “On a general method in dynamics II.” Mathematical Papers, I ,103-161: 95-144. (1835)

[2] Schrödinger, E. “Quantification of the eigen-value problem.” Annalen Der Physik 79(6): 489-527. (1926)

[3] For the fateful story of Felix Hausdorff (aka Paul Mongré) see Chapter 9 of Galileo Unbound (Oxford, 2018).

[4] Sommerfeld, A. and J. Runge. “The application of vector calculations on the basis of geometric optics.” Annalen Der Physik 35(7): 277-298. (1911)

[5] Sommerfeld, A. “The quantum theory of spectral lines.” Annalen Der Physik 51(17): 1-94. (1916)

Is the Future of Quantum Computing Bright?

There is a very real possibility that quantum computing is, and always will be, a technology of the future.  Yet if it is ever to be the technology of the now, then it needs two things: practical high-performance implementation and a killer app.  Both of these will require technological breakthroughs.  Whether this will be enough to make quantum computing real (commercializable) was the topic of a special symposium at the Conference on Lasers and Electro-Optics (CLEO) held in San Jose the week of May 6, 2019. 


            The symposium had panelists from many top groups working in quantum information science, including Jerry Chow (IBM), Mikhail Lukin (Harvard), Jelena Vuckovic (Stanford), Birgitta Whaley (Berkeley) and Jungsang Kim (IonQ).  The moderator Ben Eggleton (U Sydney) posed the question to the panel: “Will Quantum Computing Actually Work?”.  My Blog for this week is a report, in part, of what they said, and also what was happening in the hallways and the scientific sessions at CLEO.  My personal view after listening and watching this past week is that the future of quantum computers is optics.

Einstein’s Photons

 It is either ironic or obvious that the central figure behind quantum computing is Albert Einstein.  It is obvious because Einstein provided the fundamental tools of quantum computing by creating both quanta and entanglement (the two key elements to any quantum computer).  It is ironic, because Einstein turned his back on quantum mechanics, and he “invented” entanglement to actually argue that it was an “incomplete science”. 

            The actual quantum revolution did not begin with Max Planck in 1900, as so many Modern Physics textbooks attest, but with Einstein in 1905.  This was his “miracle year” when he published 5 seminal papers, each of which solved one of the greatest outstanding problems in the physics of the time.  In one of those papers he used simple arguments based on statistics, combined with the properties of light emission, to propose — actually to prove — that light is composed of quanta of energy (later to be named “photons” by Gilbert Lewis in 1926).  Although Planck’s theory of blackbody radiation contained quanta implicitly through the discrete actions of his oscillators in the walls of the cavity, Planck vigorously rejected the idea that light itself came in quanta.  He even apologized for Einstein, as he was proposing Einstein for membership in the Berlin Academy, saying that he should be admitted despite his grave error of believing in light quanta.  When Millikan set out in 1914 to prove experimentally that Einstein was wrong about photons by performing exquisite experiments on the photoelectric effect, he actually ended up proving that Einstein was right after all, which brought Einstein the Nobel Prize in 1921.

            In the early 1930’s after a series of intense and public debates with Bohr over the meaning of quantum mechanics, Einstein had had enough of the “Copenhagen Interpretation” of quantum mechanics.  In league with Schrödinger, who deeply disliked Heisenberg’s version of quantum mechanics, the two proposed two of the most iconic problems of quantum mechanics.  Schrödinger launched, as a laughable parody, his eponymously-named “Schrödinger’s Cat”, and Einstein launched what has become known as “entanglement”.  Each was intended to show the absurdity of quantum mechanics and drive a nail into its coffin, but each has been embraced so thoroughly by physicists that Schrödinger and Einstein are given the praise and glory for inventing these touchstones of quantum science. Schrödinger’s cat and entanglement both lie at the heart of the problems and the promise of quantum computers.

Between Hype and Hope

Quantum computing is stuck in a sort of limbo between hype and hope, pitched with incredible (unbelievable?) claims, yet supported by tantalizing laboratory demonstrations.  In the midst of the current revival in quantum computing interest (the first wave of interest in quantum computing was in the 1990’s, see “Mind at Light Speed”), the US Congress has passed a house resolution to fund quantum computing efforts in the United States with a commitment of $1B.  This comes on the heels of commercial efforts in quantum computing by big players like IBM, Microsoft and Google, and also is partially in response to China’s substantial financial commitment to quantum information science.  These acts, and the infusion of cash, will supercharge efforts on quantum computing.  But this comes with real danger of creating a bubble.  If there is too much hype, and if the expensive efforts under-deliver, then the bubble will burst, putting quantum computing back by decades.  This has happened before, as in the telecom and fiber optics bubble of Y2K that burst in 2001.  The optics industry is still recovering from that crash nearly 20 years later.  The quantum computing community will need to be very careful in managing expectations, while also making real strides on some very difficult and long-range problems.

            This was part of what the discussion at the CLEO symposium centered on.  Despite the charge by Eggleton to “be real” and avoid the hype, there was plenty of hype going around on the panel and plenty of optimism, tempered by caution.  I admit that there is reason for cautious optimism.  Jerry Chow showed IBM’s very real quantum computer (with a very small number of qubits) that can be accessed through the cloud by anyone.  They even built a user interface to allow users to write their own quantum codes.  Jungsang Kim of IonQ was equally optimistic, showing off their trapped-atom quantum computer with dozens of trapped ions acting as individual qubits.  Admittedly Chow and Kim have vested interests in their own products, but the technology is certainly impressive.  One of the sharpest critics, Mikhail Lukin of Harvard, was surprisingly also one of the most optimistic.  He made clear that the notion of scalable quantum computers in the near future is nonsense.  Yet he is part of a Harvard-MIT collaboration that has constructed a 51-qubit array of trapped atoms that sets a world record.  Although it cannot be used for quantum computing, it was used to simulate a complex many-body physics problem, and it found an answer that could not be calculated or predicted using conventional computers.

            The panel did come to a general consensus about quantum computing that highlights the specific challenges that the field will face as it is called upon to deliver on its hyperbole.  They each echoed an idea known as the “supremacy plot”, which is a two-axis graph of number of qubits and number of operations (also called circuit depth).  The graph has one region that is not interesting, one region that is downright laughable (at the moment), and one final area of great hope.  The region of no interest lies in the range of large numbers of qubits but low numbers of operations, or large numbers of operations on a small number of qubits.  Each of these extremes can easily be calculated on conventional computers and hence is of no practical interest.  The region that is laughable is the area of large numbers of qubits and large numbers of operations.  No one suggested that this area can be accessed in even the next 10 years.  The region that everyone is eager to reach is the region of “quantum supremacy”.  This consists of quantum computers that have enough qubits and enough operations that they cannot be simulated by classical computers.  When asked where this region is, the panel consensus was that it would require more than 50 qubits and hundreds to thousands of operations.  What makes this so exciting is that there are real technologies that are now approaching this region–and they are based on light.

The Quantum Supremacy Chart: Plot of the number of qubits and the circuit depth (number of operations or gates) in a quantum computer. The red region (“Zzzzzzz”) is where classical computers can do as well. The purple region (“Ha Ha Ha”) is a dream. The middle region (“Wow”) is the region of hope, which may soon be reached by trapped atoms and optics.
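
To get a feel for why the panel put the frontier near 50 qubits, here is a rough back-of-the-envelope sketch in Python.  It assumes the most naive classical approach, storing the full state vector of n qubits as 2^n complex amplitudes at 16 bytes each (double-precision complex).  The function names are my own, chosen for illustration, and cleverer classical methods can do better on shallow circuits, so treat this as an order-of-magnitude guide rather than a hard limit.

```python
def statevector_bytes(n_qubits: int, bytes_per_amplitude: int = 16) -> int:
    """Bytes needed to hold all 2**n complex amplitudes of an n-qubit state.

    Assumes one double-precision complex number (16 bytes) per amplitude.
    """
    if n_qubits < 0:
        raise ValueError("n_qubits must be non-negative")
    return (2 ** n_qubits) * bytes_per_amplitude


def human_readable(num_bytes: int) -> str:
    """Format a byte count using binary prefixes (KiB, MiB, ...)."""
    units = ["B", "KiB", "MiB", "GiB", "TiB", "PiB", "EiB"]
    value = float(num_bytes)
    for unit in units:
        # Stop at the first unit that keeps the number below 1024,
        # or at the largest unit we know about.
        if value < 1024 or unit == units[-1]:
            return f"{value:.0f} {unit}"
        value /= 1024


if __name__ == "__main__":
    # Every added qubit doubles the memory a brute-force simulation needs.
    for n in (30, 40, 50, 51):
        print(f"{n} qubits: {human_readable(statevector_bytes(n))}")
```

Under these assumptions, 30 qubits fit in the memory of a good workstation (16 GiB), while 50 qubits already call for 16 PiB, which is why brute-force simulation runs out of road at about the point where the “Wow” region begins.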

Chris Monroe’s Perfect Qubits

The second plenary session at CLEO featured the recent Nobel prize winners Art Ashkin, Donna Strickland and Gerard Mourou, who won the 2018 Nobel prize in physics for laser applications.  (Donna Strickland is only the third woman to win the Nobel prize in physics.)  The warm-up band for these headliners was Chris Monroe, founder of the start-up company IonQ out of the University of Maryland.  Monroe outlined the general layout of their quantum computer, which is based on trapped atoms that he called “perfect qubits”.  Each trapped atom is literally an atomic clock, with the kind of exact precision that atomic clocks come with.  The quantum properties of these atoms are as perfect as is needed for any quantum computation, and the limits on the performance of the current IonQ system are caused entirely by the classical controls that trap and manipulate the atoms.  This is where the efforts of their rapidly growing R&D team are focused.

            If trapped atoms are the perfect qubit, then the perfect quantum communication channel is the photon.  The photon in vacuum is the quintessential messenger, propagating forever and interacting with nothing.  This is why experimental cosmologists can see the photons originating from the Big Bang about 13.8 billion years ago (actually from about 380,000 years after the Big Bang, when the Universe became transparent).  In a quantum computer based on trapped atoms as the gates, photons become the perfect wires.

            On the quantum supremacy chart, Monroe plotted the two main quantum computing technologies: solid state (based mainly on superconductors but also some semiconductor technology) and trapped atoms.  The challenges to solid state quantum computers come with the scale-up to the range of 50 qubits or more that will be needed to cross the frontier into quantum supremacy.  The inhomogeneous nature of solid state fabrication, as perfected as it is for the transistor, is a central problem for a solid state solution to quantum computing.  Furthermore, as the number of solid state qubits is scaled up, it is extremely difficult to simultaneously increase the circuit depth.  In fact, circuit depth is likely to decrease (initially) as the number of qubits rises because of the two-dimensional interconnect problem that is well known to circuit designers.  Trapped atoms, on the other hand, have the advantages of the perfection of atomic clocks that can be globally interconnected through perfect photon channels, and scaling up the number of qubits can go together with increased circuit depth–at least in the view of Monroe, who admittedly has a vested interest.  But he was speaking before an audience of several thousand highly trained and highly critical optics specialists, and no scientist in front of such an audience will make a claim that cannot be supported (although the reality is always in the caveats).

The Future of Quantum Computing is Optics

The state of the art in photonic control of light now equals the sophistication of electronic control of electrons in circuits.  Each is driven by big-world applications: electronics by the consumer electronics and computer market, and photonics by the telecom industry.  Having a technology attached to a major world-wide market is a guarantee that progress is made relatively quickly with the advantages of economy of scale.  The commercial driver is profits, and the driver for funding agencies (who support quantum computing) is their mandate to foster competitive national economies that create jobs and improve standards of living.

            The yearly CLEO conference is one of the top conferences in laser science in the world, drawing in thousands of laser scientists who are working on photonic control.  Integrated optics is one of the current hot topics.  It brings many of the resources of the electronics industry to bear on photonics.  Solid state optics is mostly concerned with quantum properties of matter and its interaction with photons, and this year’s CLEO conference hosted many focused sessions on quantum sensors, quantum control, quantum information and quantum communication.  The level of external control of quantum systems is increasing at a spectacular rate.  Sitting in the audience at CLEO you get the sense that you are looking at the embryonic stages of vast new technologies that will be enlisted in the near future for quantum computing.  The challenge is that there are so many variants that it is hard to know which of these nascent technologies will win and change the world.  But the key to technological progress is diversity (as it is for society), because it is the interplay and cross-fertilization among the diverse technologies that drives each forward, and even technologies that recede still contribute to the advances of the winning technology.

            The expert panel at CLEO on the future of quantum computing punctuated their moments of hype with moments of realism as they called for new technologies to solve some of the current barriers to quantum computers.  Walking out of the panel discussion that night, and walking into one of the CLEO technical sessions the next day, you could almost connect the dots.  The enabling technologies being requested by the panel are literally being built by the audience.

            In the end, the panel had a surprisingly prosaic argument in favor of the current push to build a working quantum computer.  It is an echo of the movie Field of Dreams, with the famous quote “If you build it they will come”.  That was the plea made by Lukin, who argued that once quantum computers are in the hands of users, the killer app that will drive the future economics of quantum computers will likely emerge.  You don’t really know what to do with a quantum computer until you have one.

            Given the “perfect qubits” of trapped atoms, and the “perfect photons” of the communication channels, combined with the dizzying assortment of quantum control technologies being invented and highlighted at CLEO, it is easy to believe that the first large-scale quantum computers will be based on light.