The bending of light rays as they enter a transparent medium, what today is called Snell’s Law, has had a long history of independent discoveries and radically different approaches. The general problem of refraction was known to the Greeks in the first century AD, and it was later discussed by the Arabic scholar Alhazen. Ibn Sahl in Baghdad in 984 AD was the first to put an accurate equation to the phenomenon. Thomas Harriot in England discussed the problem with Johannes Kepler in 1602, unaware of the work by Ibn Sahl. Willebrord Snellius (1580–1626) in the Netherlands derived the equation for refraction in 1621, but did not publish it, though it was known to Christiaan Huygens (1629 – 1695). René Descartes (1596 – 1650), unaware of Snellius’ work, derived the law in his Dioptrics, using his newly-invented coordinate geometry. Huygens, in his Traité de la Lumière in 1678, derived the law yet again, this time using his principle of secondary waves, though he acknowledged the prior work of Snellius, permanently cementing the shortened name “Snell” to the law of refraction.
Through this history and beyond, there have been many approaches to deriving Snell’s Law. Some used ideas of momentum, while others used principles of waves. Today, there are roughly five different ways to derive Snell’s law. These are:
1) Huygens’ Principle,
2) Fermat’s Principle,
3) Wavefront Continuity,
4) Plane-wave Boundary Conditions, and
5) Photon Momentum Conservation.
The approaches differ in detail, but they fall into two rough categories: the first two fall under minimization or extremum principles, and the last three fall under continuity or conservation principles.
Snell’s Law: Huygens’ Principle
Huygens’ principle, set out in his Traité de la Lumière of 1678, states that every point on a wavefront serves as the source of a spherical secondary wave. This was one of the first wave principles ever proposed for light (Robert Hooke had suggested that light had wavelike character based on his observations of colors in thin films) yet remains amazingly powerful even today. It can be used not only to derive Snell’s law but also properties of light scattering and diffraction. Huygens’ principle is a form of minimization principle: it finds the direction of propagation (for a spherically expanding wavefront from a point where a ray strikes a surface) that yields a minimum angle (tangent to the surface) relative to a second source. Finding the tangent to the spherical surface is a minimization problem and yields Snell’s Law.
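The use of Huygens’ principle for the derivation of Snell’s Law is shown in Fig. 1. Two parallel incoming rays strike a surface a distance d apart. The first point emits a secondary spherical wave into the second medium. The wavefront propagates at a speed of v2 relative to the speed in the first medium of v1. In the diagram, the propagation distance in each medium during a time t, divided by the distance d, is equal to the sine of the angle
$$\sin\theta_1 = \frac{v_1 t}{d}, \qquad \sin\theta_2 = \frac{v_2 t}{d}$$
Solving for d and equating the two equations gives
$$\frac{v_1 t}{\sin\theta_1} = \frac{v_2 t}{\sin\theta_2}$$
The speed depends on the refractive index as
$$v_1 = \frac{c}{n_1}, \qquad v_2 = \frac{c}{n_2}$$
which leads to Snell’s Law:
$$n_1 \sin\theta_1 = n_2 \sin\theta_2$$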
Snell’s Law: Fermat’s Principle
Fermat’s principle of least time is a direct minimization problem that finds the least time it takes light to propagate from one point to another. One of the central questions about Fermat’s principle is: why does it work? Why is the path of least time the path light needs to take? I’ll answer that question after we do the derivation. The configuration of the problem is shown in Fig. 2.
Consider a source point A and a destination point B. Light travels in a straight line in each medium, deflecting at the point x on the figure. The speed in medium 1 is c/n1, and the speed in medium 2 is c/n2. What position x provides the minimum time?
The distances from A to x, and from x to B are, respectively:
The total time is
Minimize this expression by taking the derivative of the time relative to the position x and setting the result to zero
Converting the cosines to sines yields Snell’s Law
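The minimization can also be checked numerically. The sketch below is only an illustration under assumed geometry: the heights h1 and h2, the horizontal separation L, and the refractive indices are hypothetical values, not taken from Fig. 2. It minimizes the travel time over the crossing point x and then compares the two sides of Snell’s Law.

```python
import numpy as np
from scipy.optimize import minimize_scalar

# Hypothetical geometry (an assumption, not from Fig. 2):
# A sits a height h1 above the interface, B a depth h2 below it,
# and the two points are separated horizontally by L.
n1, n2 = 1.0, 1.5
h1, h2, L = 1.0, 1.0, 2.0
c = 1.0  # speed of light in arbitrary units

def travel_time(x):
    """Total time from A to the crossing point x and on to B."""
    d1 = np.hypot(x, h1)        # path length in medium 1
    d2 = np.hypot(L - x, h2)    # path length in medium 2
    return n1 * d1 / c + n2 * d2 / c

res = minimize_scalar(travel_time, bounds=(0.0, L), method="bounded",
                      options={"xatol": 1e-10})
x_min = res.x

# Sines of the angles measured from the surface normal
sin1 = x_min / np.hypot(x_min, h1)
sin2 = (L - x_min) / np.hypot(L - x_min, h2)
print("n1 sin(theta1) =", n1 * sin1)
print("n2 sin(theta2) =", n2 * sin2)
```

The two printed values should agree to within the tolerance of the optimizer, which is the numerical statement that the least-time path obeys Snell’s Law.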
Fermat’s principle of least time can be explained in terms of wave interference. If we think of all paths being taken by propagating waves, then those waves that take paths that differ only a little from the optimum path still interfere constructively. This is the principle of stationarity. The time minimizes a quadratic expression that deviates from the minimum only in second order (shown in the right part of Fig. 2). Therefore, all “nearby” paths interfere constructively, while paths that are farther away begin to interfere destructively. Therefore, the path of least time is also the path of stationary time and hence stationary optical path length and hence the path of maximum constructive interference. This is the actual path taken by the wave—and the light.
Snell’s Law: Wavefront Continuity
When a wave passes across an interface between two transparent media the phase of the wave remains continuous. This continuity of phase provides a way to derive Snell’s Law. Consider Fig. 3. A plane wave with wavelength λ1 is incident from medium 1 on an interface with medium 2 in which the wavelength is λ2. The wavefronts remain continuous, but they are “kinked” at the interface.
The waves in medium 1 and medium 2 share the part of the interface between wavefronts. This distance is
$$d = \frac{\lambda_1}{\sin\theta_1} = \frac{\lambda_2}{\sin\theta_2}$$
The wavelengths in the two media are related to the refractive index through
$$\lambda_1 = \frac{\lambda_0}{n_1}, \qquad \lambda_2 = \frac{\lambda_0}{n_2}$$
where λ0 is the free-space wavelength. Plugging these into the first expression yields
$$\frac{\lambda_0}{n_1 \sin\theta_1} = \frac{\lambda_0}{n_2 \sin\theta_2}$$
which relates the denominators through Snell’s Law
$$n_1 \sin\theta_1 = n_2 \sin\theta_2$$
Snell’s Law: Plane-Wave Boundary Condition
Maxwell’s four equations in integral form can each be applied to the planar interface between two refractive media.
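All four boundary conditions can be written in the general form
$$A\,e^{i(\mathbf{k}_i\cdot\mathbf{x} - \omega t)} + B\,e^{i(\mathbf{k}_r\cdot\mathbf{x} - \omega t)} = C\,e^{i(\mathbf{k}_t\cdot\mathbf{x} - \omega t)}$$
where the subscripts i, r and t label the incident, reflected and transmitted waves, and x is a point on the interface. The only way this condition can be true for all possible values of the fields is if the phases of the wave terms are all the same (phase-matching), namely
$$\mathbf{k}_i\cdot\mathbf{x} = \mathbf{k}_r\cdot\mathbf{x} = \mathbf{k}_t\cdot\mathbf{x}$$
which in turn guarantees that the transverse projection of the k-vector is continuous across the interface
$$k_{i\parallel} = k_{r\parallel} = k_{t\parallel}$$
and the transverse components (projections) are
$$k_i \sin\theta_i = k_r \sin\theta_r = k_t \sin\theta_t$$
$$n_1 \sin\theta_i = n_1 \sin\theta_r = n_2 \sin\theta_t$$
where the last line states both Snell’s law of refraction and the law of reflection. Therefore, the general wave boundary condition leads immediately to Snell’s Law.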
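Phase matching translates directly into a small calculation. The following sketch assumes a flat interface, angles measured from the surface normal, and a hypothetical helper named transmitted_k (none of this comes from the original post). It keeps the transverse component of the k-vector fixed and solves for the normal component in the second medium.

```python
import numpy as np

def transmitted_k(theta1, n1, n2, wavelength0=1.0):
    """Return the transmitted k-vector (transverse, normal) or None.

    Hypothetical helper: the transverse component is copied across the
    interface (phase matching) and the normal component follows from
    |k| = n2 * k0 in the second medium.
    """
    k0 = 2 * np.pi / wavelength0
    k_par = n1 * k0 * np.sin(theta1)       # continuous across the interface
    k_perp_sq = (n2 * k0) ** 2 - k_par ** 2
    if k_perp_sq < 0:
        return None                        # no propagating wave: total internal reflection
    return np.array([k_par, np.sqrt(k_perp_sq)])

theta1 = np.deg2rad(30.0)
k_t = transmitted_k(theta1, n1=1.0, n2=1.5)
theta2 = np.arctan2(k_t[0], k_t[1])        # refraction angle from the normal
print("n1 sin(theta1) =", 1.0 * np.sin(theta1))
print("n2 sin(theta2) =", 1.5 * np.sin(theta2))
```

When the transverse component exceeds the largest k-vector the second medium can support, the function returns None, which corresponds to total internal reflection.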
Snell’s Law: Momentum Conservation
Going from Maxwell’s equations for classical fields to photons keeps the same mathematical form for the transverse components for the k-vectors, but now interprets them in a different manner. Where before there was a requirement for phase-matching the classical waves at the interface, in the photon picture the transverse k-vector becomes the transverse momentum through de Broglie’s equation
$$\mathbf{p} = \hbar\mathbf{k}$$
Therefore, continuity of the transverse k-vector is interpreted as conservation of transverse momentum of the photon across the interface. In the figure the second medium is denser with a larger refractive index n2 > n1. Hence, the momentum of the photon in the second medium is larger while keeping the transverse momentum projection the same. This simple interpretation gives the same mathematical form as the previous derivation using classical boundary conditions, namely
$$p_i \sin\theta_i = p_r \sin\theta_r = p_t \sin\theta_t$$
$$n_1 \sin\theta_i = n_1 \sin\theta_r = n_2 \sin\theta_t$$
which is again Snell’s law and the law of reflection.
Snell’s Law has an eerie habit of springing from almost any statement that can be made about a dielectric interface. It yields the path of least time, tracks the path of maximum constructive interference, produces wavefronts that are extremally tangent to wavefronts, connects continuous wavefronts across the interface, conserves transverse momentum, and guarantees phase matching. These all sound very different, yet all lead to the same simple law of Snellius and Ibn Sahl.
The physics of a path of light passing a gravitating body is one of the hardest concepts to understand in General Relativity, but it is also one of the easiest. It is hard because there can be no force of gravity on light even though the path of a photon bends as it passes a gravitating body. It is easy, because the photon is following the simplest possible path, a geodesic equation for force-free motion.
This blog picks up where my last blog left off, having there defined the geodesic equation and presenting the Schwarzschild metric. With those two equations in hand, we could simply solve for the null geodesics (a null geodesic is the path of a light beam through a manifold). But there turns out to be a simpler approach that Einstein came up with himself (he never did like doing things the hard way). He just had to sacrifice the fundamental postulate that he used to explain everything about Special Relativity.
Throwing Special Relativity Under the Bus
The fundamental postulate of Special Relativity states that the speed of light is the same for all observers. Einstein posed this postulate, then used it to derive some of the most astonishing consequences of Special Relativity, like E = mc². This postulate is at the rock core of his theory of relativity and can be viewed as one of the simplest “truths” of our reality, or at least of our spacetime.
Yet as soon as Einstein began thinking how to extend SR to a more general situation, he realized almost immediately that he would have to throw this postulate out. While the speed of light measured locally is always equal to c, the apparent speed of light observed by a distant observer (far from the gravitating body) is modified by gravitational time dilation and length contraction. This means that the apparent speed of light, as observed at a distance, varies as a function of position. From this simple conclusion Einstein derived a first estimate of the deflection of light by the Sun, though he initially was off by a factor of 2. (The full story of Einstein’s derivation of the deflection of light by the Sun and the confirmation by Eddington is in Chapter 7 of Galileo Unbound (Oxford University Press, 2018).)
The “Optics” of Gravity
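The invariant element for a light path moving radially in the Schwarzschild geometry is
$$ds^2 = g_{00}\,c^2 dt^2 + g_{rr}\,dr^2 = 0$$
The apparent speed of light is
$$c(r) = \frac{dr}{dt} = c\sqrt{-\frac{g_{00}}{g_{rr}}}$$
where c(r) is always less than c, when observing it from flat space. The “refractive index” of space is defined, as for any optical material, as the ratio of the constant speed divided by the observed speed
$$n(r) = \frac{c}{c(r)} = \sqrt{-\frac{g_{rr}}{g_{00}}}$$
Because the Schwarzschild metric has the property
$$g_{rr} = -\frac{1}{g_{00}} = \frac{1}{1 - R_S/r}, \qquad R_S = \frac{2GM}{c^2}$$
the effective refractive index of warped space-time is
$$n(r) = \frac{1}{1 - R_S/r}$$
with a divergence at the Schwarzschild radius.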
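The refractive index of warped space-time in the limit of weak gravity can be used in the ray equation (also known as the Eikonal equation described in an earlier blog)
$$\frac{d}{ds}\left(n\frac{d\mathbf{x}}{ds}\right) = \nabla n$$
where the gradient of the refractive index of space is
$$\nabla n = -n^2\,\frac{R_S}{r^3}\,\mathbf{x}$$
with R_S the Schwarzschild radius. The ray equation is then a four-variable flow
$$\dot{x} = \frac{p_x}{n}, \qquad \dot{y} = \frac{p_y}{n}, \qquad \dot{p}_x = \frac{\partial n}{\partial x}, \qquad \dot{p}_y = \frac{\partial n}{\partial y}$$
where the momenta are p = n dx/ds and the dot is a derivative with respect to the path length s.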
These equations represent a 4-dimensional flow for a light ray confined to a plane. The trajectory of any light path is found by using an ODE solver subject to the initial conditions for the direction of the light ray. This is simple for us to do today with Python or Matlab, but it was also something that could be done long before the advent of computers by early theorists of relativity like Max von Laue (1879 – 1960).
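As a quick sanity check of the refractive-index picture, the weak-field deflection of a ray grazing the Sun can be estimated without a full ODE solve. The sketch below rests on two assumptions that are not spelled out in the text: the weak-field index is approximated as n ≈ 1 + R_S/r, and the small deflection angle is taken to be the transverse gradient of n integrated along the undeflected straight path at impact parameter b. The physical constants are rounded textbook values.

```python
import numpy as np
from scipy.integrate import quad

G = 6.674e-11        # m^3 kg^-1 s^-2
C = 2.998e8          # m/s
M_SUN = 1.989e30     # kg
R_SUN = 6.957e8      # m, impact parameter b of a ray grazing the Sun

r_s = 2 * G * M_SUN / C**2   # Schwarzschild radius of the Sun in meters

# With n ~ 1 + r_s/r and the substitution u = x/b, the transverse gradient
# integrated along the line y = b is (r_s/b) * integral of (1 + u^2)^(-3/2) du.
integral, _ = quad(lambda u: (1.0 + u * u) ** -1.5, -np.inf, np.inf)
alpha = (r_s / R_SUN) * integral       # deflection angle in radians

arcsec = np.degrees(alpha) * 3600.0
print(f"deflection = {arcsec:.2f} arcsec")
```

The dimensionless integral evaluates to 2, so the estimate reduces to 2 R_S/b = 4GM/(c²b), the full general-relativistic value for the Sun rather than the half-sized first estimate.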
The Relativity of Max von Laue
In the Fall of 1905 in Berlin, a young German physicist by the name of Max Laue was sitting in the physics colloquium at the University listening to another Max, his doctoral supervisor Max Planck, deliver a seminar on Einstein’s new theory of relativity. Laue was struck by the simplicity of the theory, in this sense “simplistic” and hence hard to believe, but the beauty of the theory stuck with him, and he began to think through the consequences for experiments like the Fizeau experiment on partial ether drag.
Armand Hippolyte Louis Fizeau (1819 – 1896) in 1851 built one of the world’s first optical interferometers and used it to measure the speed of light inside moving fluids. At that time the speed of light was believed to be a property of the luminiferous ether, and there were several opposing theories on how light would travel inside moving matter. One theory would have the ether fully stationary, unaffected by moving matter, and hence the speed of light would be unaffected by motion. An opposite theory would have the ether fully entrained by matter and hence the speed of light in moving matter would be a simple sum of speeds. A middle theory considered that only part of the ether was dragged along with the moving matter. This was Fresnel’s partial ether drag hypothesis that he had arrived at to explain why his friend Francois Arago had not observed any contribution to stellar aberration from the motion of the Earth through the ether. When Fizeau performed his experiment, the results agreed closely with Fresnel’s drag coefficient, which seemed to settle the matter. Yet when Michelson and Morley performed their experiments of 1887, there was no evidence for partial drag.
Even after the exposition by Einstein on relativity in 1905, the disagreement of the Michelson-Morley results with Fizeau’s results was not fully reconciled until Laue showed in 1907 that the velocity addition theorem of relativity gave complete agreement with the Fizeau experiment. The velocity observed in the lab frame is found using the velocity addition theorem of special relativity. For the Fizeau experiment, water with a refractive index of n is moving with a speed v and hence the speed in the lab frame is
$$u = \frac{c/n + v}{1 + \dfrac{v}{nc}} \approx \frac{c}{n} + v\left(1 - \frac{1}{n^2}\right)$$
The difference in the speed of light between the stationary and the moving water is the difference
$$\Delta u = u - \frac{c}{n} \approx v\left(1 - \frac{1}{n^2}\right)$$
where the last term is precisely the Fresnel drag coefficient. This was one of the first definitive “proofs” of the validity of Einstein’s theory of relativity, and it made Laue one of relativity’s staunchest proponents. Spurred on by his success with the Fresnel drag coefficient explanation, Laue wrote the first monograph on relativity theory, publishing it in 1910.
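The agreement between relativistic velocity addition and the Fresnel drag coefficient is easy to confirm numerically. The following sketch uses illustrative values chosen here (water with n = 1.333 flowing at 7 m/s, which are assumptions and not Fizeau’s actual parameters) and compares the exact velocity-addition result with the first-order Fresnel expression.

```python
C = 299_792_458.0   # speed of light in m/s

def lab_speed(n, v):
    """Speed of light in the lab frame for a medium of index n moving at v."""
    u = C / n                         # speed of light in the medium's rest frame
    return (u + v) / (1 + u * v / C**2)

n = 1.333    # refractive index of water (illustrative value)
v = 7.0      # flow speed in m/s (illustrative value)

exact_shift = lab_speed(n, v) - C / n     # from relativistic velocity addition
fresnel_shift = v * (1 - 1 / n**2)        # Fresnel drag prediction

print("velocity addition:", exact_shift)
print("Fresnel drag:     ", fresnel_shift)
```

For laboratory flow speeds the two numbers differ only at the level of v/c, which is why Fresnel’s coefficient matched Fizeau’s data so well.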
A Nobel Prize for Crystal X-ray Diffraction
In 1909 Laue became a Privatdozent under Arnold Sommerfeld (1868 – 1951) at the university in Munich. In the Spring of 1912 he was walking in the Englischer Garten on the northern edge of the city talking with Paul Ewald (1888 – 1985) who was finishing his doctorate with Sommerfeld studying the structure of crystals. Ewald was considering the interaction of optical wavelengths with the periodic lattice when it struck Laue that x-rays would have the kind of short wavelengths that would allow the crystal to act as a diffraction grating to produce multiple diffraction orders. Within a few weeks of that discussion, two of Sommerfeld’s students (Friedrich and Knipping) used an x-ray source and photographic film to look for the predicted diffraction spots from a copper sulfate crystal. When the film was developed, it showed a constellation of dark spots for each of the diffraction orders of the x-rays scattered from the multiple periodicities of the crystal lattice. Two years later, in 1914, Laue was awarded the Nobel prize in physics for the discovery. That same year his father was elevated to the hereditary nobility in the Prussian empire and Max Laue became Max von Laue.
Von Laue was not one to take risks, and he remained conservative in many of his interests. He was immensely respected and played important roles in the administration of German science, but his scientific contributions after receiving the Nobel Prize were only modest. Yet as the Nazis came to power in the early 1930’s, he was one of the few physicists to stand up and resist the Nazi take-over of German physics. He was especially disturbed by the plight of the Jewish physicists. In 1933 he was invited to give the keynote address at the conference of the German Physical Society in Wurzburg where he spoke out against the Nazi rejection of relativity as they branded it “Jewish science”. In his speech he likened Einstein, the target of much of the propaganda, to Galileo. He said, “No matter how great the repression, the representative of science can stand erect in the triumphant certainty that is expressed in the simple phrase: And yet it moves.” Von Laue believed that truth would hold out in the face of the proscription against relativity theory by the Nazi regime. The quote “And yet it moves” is supposed to have been muttered by Galileo just after his abjuration before the Inquisition, referring to the Earth moving around the Sun. Although the quote is famous, it is believed to be a myth.
In an odd side-note of history, von Laue sent his gold Nobel prize medal to Denmark for its safe keeping with Niels Bohr so that it would not be paraded about by the Nazi regime. Yet when the Nazis invaded Denmark, to avoid having the medals fall into the hands of the Nazis, the medal was dissolved in aqua regia by a member of Bohr’s team, George de Hevesy. The gold completely dissolved into an orange liquid that was stored in a beaker high on a shelf through the war. When Denmark was finally freed, the dissolved gold was precipitated out and a new medal was struck by the Nobel committee and re-presented to von Laue in a ceremony in 1951.
The Orbits of Light Rays
Von Laue’s interests always stayed close to the properties of light and electromagnetic radiation ever since he was introduced to the field when he studied with Woldemar Voigt at Göttingen in 1899. This interest included the theory of relativity, and only a few years after Einstein published his theory of General Relativity and Gravitation, von Laue added to his earlier textbook on relativity by writing a second volume on the general theory. The new volume was published in 1920 and included the theory of the deflection of light by gravity.
One of the very few illustrations in his second volume is of light coming into interaction with a super massive gravitational field characterized by a Schwarzschild radius. (No one at the time called it a “black hole”, nor even mentioned Schwarzschild. That terminology came much later.) He shows in the drawing, how light, if incident at just the right impact parameter, would actually loop around the object. This is the first time such a diagram appeared in print, showing the trajectory of light so strongly affected by gravity.
# -*- coding: utf-8 -*-
# Created on Tue May 28 11:50:24 2019
import numpy as np
from scipy import integrate
from matplotlib import pyplot as plt

def create_circle():
    # Black disk marking the Schwarzschild radius (10 in these units)
    circle = plt.Circle((0,0), radius= 10, color = 'black')
    return circle

def refindex(x,y):
    # Effective refractive index of space and its gradient
    A = 10        # Schwarzschild radius
    eps = 1e-6    # guards against division by zero
    rp0 = np.sqrt(x**2 + y**2)
    n = 1/(1 - A/(rp0+eps))
    fac = np.abs((1-9*(A/rp0)**2/8))   # approx correction to Eikonal
    nx = -fac*n**2*A*x/(rp0+eps)**3
    ny = -fac*n**2*A*y/(rp0+eps)**3
    return [n,nx,ny]

def flow_deriv(x_y_z,tspan):
    # Four-variable flow: position (x, y) and optical momentum (z, w)
    x, y, z, w = x_y_z
    [n,nx,ny] = refindex(x,y)
    yp = np.zeros(shape=(4,))
    yp[0] = z/n
    yp[1] = w/n
    yp[2] = nx
    yp[3] = ny
    return yp

# Fan of parallel rays entering from the left at different impact parameters
for loop in range(-5,30):
    xstart = -100
    ystart = -2.245 + 4*loop
    [n,nx,ny] = refindex(xstart,ystart)
    y0 = [xstart, ystart, n, 0]
    tspan = np.linspace(1,400,2000)
    y = integrate.odeint(flow_deriv, y0, tspan)
    xx = y[1:2000,0]
    yy = y[1:2000,1]
    lines = plt.plot(xx,yy)

# Draw the black hole on the same axes
c = create_circle()
axes = plt.gca()
axes.add_patch(c)
axes.set_aspect('equal')

# Now set up a circular photon orbit
xstart = 0
ystart = 15
[n,nx,ny] = refindex(xstart,ystart)
y0 = [xstart, ystart, n, 0]
tspan = np.linspace(1,94,1000)
y = integrate.odeint(flow_deriv, y0, tspan)
xx = y[1:1000,0]
yy = y[1:1000,1]
lines = plt.plot(xx,yy)
plt.setp(lines, linewidth=2, color = 'black')
plt.show()
One of the most striking effects of gravity on photon trajectories is the possibility for a photon to orbit a black hole in a circular orbit. This is shown in Fig. 3 as the black circular ring for a photon at a radius equal to 1.5 times the Schwarzschild radius. This radius defines what is known as the photon sphere. However, the orbit is not stable. Slight deviations will send the photon spiraling outward or inward.
The Eikonal approximation does not strictly hold under strong gravity, but the Eikonal equations with the effective refractive index of space still yield semi-quantitative behavior. In the Python code, a correction factor is used to match the theory to the circular photon orbits, while still agreeing with trajectories far from the black hole. The results of the calculation are shown in Fig. 3. For large impact parameters, the rays are deflected through a finite angle. At a critical impact parameter, near 3 times the Schwarzschild radius, the ray loops around the black hole. For smaller impact parameters, the rays are captured by the black hole.
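The role of the correction factor can be checked with a few lines of arithmetic. The sketch below re-implements the index function from the listing (dropping the small eps guard, which is an assumption made here for clarity) and evaluates the ray curvature at 1.5 Schwarzschild radii. For a ray moving tangentially, the ray equation gives a curvature equal to the normal component of the index gradient divided by n, and a circular orbit of radius r requires that curvature to be 1/r.

```python
import numpy as np

A = 10.0   # Schwarzschild radius, in the same units as the listing

def refindex(x, y):
    """Effective index and corrected gradient (eps guard omitted here)."""
    r = np.hypot(x, y)
    n = 1 / (1 - A / r)
    fac = abs(1 - 9 * (A / r) ** 2 / 8)   # correction factor from the listing
    nx = -fac * n**2 * A * x / r**3
    ny = -fac * n**2 * A * y / r**3
    return n, nx, ny

r_orbit = 1.5 * A                  # photon-sphere radius
n, nx, ny = refindex(0.0, r_orbit)

# A ray at (0, r_orbit) heading in the +x direction sees a purely radial
# gradient, so its curvature is |grad n| / n.
curvature = abs(ny) / n
print("curvature * r =", curvature * r_orbit)
```

A product of 1 means the ray bends at exactly the rate needed to stay on the circle, which is the condition the correction factor was tuned to satisfy.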
Photons pile up around the black hole at the photon sphere. The first image ever of the photon sphere of a black hole was made earlier this year (announced April 10, 2019). The image shows the shadow of the supermassive black hole in the center of Messier 87 (M87), an elliptical galaxy 55 million light-years from Earth. This black hole is 6.5 billion times the mass of the Sun. Imaging the photon sphere required eight ground-based radio telescopes placed around the globe, operating together to form a single telescope with an optical aperture the size of our planet. The resolution of such a large telescope would allow one to image a half-dollar coin on the surface of the Moon, although this telescope operates in the radio frequency range rather than the optical.
Nature loves the path of steepest descent. Place a ball on a smooth curved surface and release it, and it will instantaneously accelerate in the direction of steepest descent. Shoot a laser beam from an oblique angle onto a piece of glass to hit a target inside, and the path taken by the beam is the only path that decreases the distance to the target in the shortest time. Diffract a stream of electrons from the surface of a crystal, and quantum detection events are greatest at the positions where the troughs and peaks of the de Broglie waves converge the most. The first example is Newton’s second law. The second example is Fermat’s principle. The third example is Feynman’s path-integral formulation of quantum mechanics. They all share in common a minimization principle, the principle of least action, which states that the path of a dynamical system is the one that minimizes a property known as “action”.
The principle of least action, first proposed by the French physicist Maupertuis through mechanical analogy, became a principle of Lagrangian mechanics in the hands of Lagrange, but was still restricted to mechanical systems of particles. The principle was generalized forty years later by Hamilton, who began by considering the propagation of light waves, and ended by transforming mechanics into a study of pure geometry divorced from forces and inertia. Optics played a key role in the development of mechanics, and mechanics returned the favor by giving optics the Eikonal Equation. The Eikonal Equation is the “F = ma” of ray optics. Its solutions describe the paths of light rays through complicated media.
Anyone who has taken a course in optics knows that Étienne-Louis Malus (1775-1812) discovered the polarization of light, but little else is taught about this French mathematician who was one of the savants Napoleon had taken along with himself when he invaded Egypt in 1798. After experiencing numerous horrors of war and plague, Malus returned to France damaged but wiser. He discovered the polarization of light in the Fall of 1808 as he was playing with crystals of Iceland spar at sunset and happened to view the last rays of the sun reflected from the windows of the Luxembourg palace. Iceland spar produces double images in natural light because it is birefringent. Malus discovered that he could extinguish one of the double images of the Luxembourg windows by rotating the crystal a certain way, demonstrating that light is polarized by reflection. The degree to which light is extinguished as a function of the angle of the polarizing crystal is known as Malus’ Law.
Malus had picked up an interest in the general properties of light and imaging during lulls in his ordeal in Egypt. He was an emissionist following his compatriot Laplace, rather than an undulationist following Thomas Young. It is ironic that the French scientists were staunchly supporting Newton on the nature of light, while the British scientist Thomas Young was trying to upend Newtonian optics. Almost all physicists at that time were emissionists, only a few years after Young’s double-slit experiment of 1804, and few serious scientists accepted Young’s theory of the wave nature of light until Fresnel and Arago supplied the rigorous theory and experimental proofs much later in 1819.
As a prelude to his later discovery of polarization, Malus had earlier proven a theorem about trajectories that particles of light take through an optical system. One of the key questions about the particles of light in an optical system was how they formed images. The physics of light particles moving through lenses was too complex to treat at that time, but reflection was relatively easy based on the simple reflection law. Malus proved a theorem mathematically that after a reflection from a curved mirror, a set of rays perpendicular to an initial nonplanar surface would remain perpendicular at a later surface after reflection (this property is closely related to the conservation of optical etendue). This is known as Malus’ Theorem, and he thought it only held true after a single reflection, but later mathematicians proved that it remains true even after an arbitrary number of reflections, even in cases when the rays intersect to form an optical effect known as a caustic. The mathematics of caustics would catch the interest of an Irish mathematician and physicist who helped launch a new field of mathematical physics.
Hamilton’s Characteristic Function
William Rowan Hamilton (1805 – 1865) was a child prodigy who taught himself thirteen languages by the time he was thirteen years old (with the help of his linguist uncle), but mathematics became his primary focus at Trinity College at the University in Dublin. His mathematical prowess was so great that he was made the Astronomer Royal of Ireland while still an undergraduate student. He also became fascinated by the theory of envelopes of curves and in particular by the mathematics of caustic curves in optics.
In 1823 at the age of 18, he wrote a paper titled Caustics that was read to the Royal Irish Academy. In this paper, Hamilton gave an exceedingly simple proof of Malus’ Theorem, but that was perhaps the simplest part of the paper. Other aspects were mathematically obscure and reviewers requested further additions and refinements before publication. Over the next four years, as Hamilton expanded this work on optics, he developed a new theory of optics, the first part of which was published as Theory of Systems of Rays in 1827 with two following supplements completed by 1833 but never published.
Hamilton’s most important contribution to optical theory (and eventually to mechanics) he called his characteristic function. By applying the principle of Fermat’s least time, which he called his principle of stationary action, he sought to find a single unique function that characterized every path through an optical system. By first proving Malus’ Theorem and then applying the theorem to any system of rays using the principle of stationary action, he was able to construct two partial differential equations whose solution, if it could be found, defined every ray through the optical system. This result was completely general and could be extended to include curved rays passing through inhomogeneous media. Because it mapped input rays to output rays, it was the most general characterization of any defined optical system. The characteristic function defined surfaces of constant action whose normal vectors were the rays of the optical system. Today these surfaces of constant action are described by the Eikonal function (but how it got its name is the next chapter of this story). Using his characteristic function, Hamilton predicted a phenomenon known as conical refraction in 1832, which was subsequently observed, launching him to a level of fame unusual for an
Once Hamilton had established his principle of stationary action of curved light rays, it was an easy step to extend it to apply to mechanical systems of particles with curved trajectories. This step produced his most famous work On a General Method in Dynamics published in two parts in 1834 and 1835, in which he developed what became known as Hamiltonian dynamics. As his mechanical work was extended by others including Jacobi, Darboux and Poincaré, Hamilton’s work on optics was overshadowed, overlooked and eventually lost. It was rediscovered when Schrödinger, in his famous paper of 1926, invoked Hamilton’s optical work as a direct example of the wave-particle duality of quantum mechanics. Yet in the interim, a German mathematician tackled the same optical problems that Hamilton had seventy years earlier, and gave the Eikonal Equation its name.
The German mathematician Heinrich Bruns (1848-1919) was engaged chiefly with the measurement of the Earth, or geodesy. He was a professor of mathematics in Berlin and later Leipzig. One claim to fame was that one of his graduate students was Felix Hausdorff, who would go on to much greater fame in the field of set theory and measure theory (the Hausdorff dimension was a precursor to the fractal dimension). Possibly motivated by his studies done with Hausdorff on refraction of light by the atmosphere, Bruns became interested in Malus’ Theorem for the same reasons and with the same goals as Hamilton, yet was unaware of Hamilton’s work in optics.
The mathematical process of creating “images”, in the sense of a mathematical mapping, made Bruns think of the Greek word εἰκών which literally means “icon” or “image”, and he published a small book in 1895 with the title Das Eikonal in which he derived a general equation for the path of rays through an optical system. His approach was heavily geometrical and is not easily recognized as an equation arising from variational principles. It rediscovered most of the results of Hamilton’s paper on the Theory of Systems of Rays and was thus not groundbreaking in the sense of new discovery. But it did reintroduce the world to the problem of systems of rays, and his name of Eikonal for the equations of the ray paths stuck, and was used with increasing frequency in subsequent years. Arnold Sommerfeld (1868 – 1951) was one of the early proponents of the Eikonal equation and recognized its connection with action principles in mechanics. He discussed the Eikonal equation in a 1911 optics paper with Runge and in 1916 used action principles to extend Bohr’s model of the hydrogen atom. While the Eikonal approach was not used often, it became popular in the 1960’s when computational optics made numerical solutions possible.
Lagrangian Dynamics of Light Rays
In physical optics, one of the most important properties of a ray passing through an optical system is known as the optical path length (OPL). The OPL is the central quantity that is used in problems of interferometry, and it is the central property that appears in Fermat’s principle that leads to Snell’s Law. The OPL played an important role in the history of the calculus when Johann Bernoulli in 1697 used it to derive the path taken by a light ray as an analogy of a brachistochrone curve – the curve of least time taken by a particle between two points.
The OPL between two points in a refractive medium is the sum of the piecewise product of the refractive index n with infinitesimal elements of the path length ds. In integral form, this is expressed as
$$\mathrm{OPL} = \int n(x^a)\,ds = \int n(x^a)\sqrt{\dot{x}^a\dot{x}^a}\,ds$$
where the “dot” is a derivative with respect to s. The optical Lagrangian is recognized as
$$L(x^a,\dot{x}^a) = n(x^a)\sqrt{\dot{x}^a\dot{x}^a}$$
The Lagrangian is inserted into the Euler equations to yield (after some algebra, see Introduction to Modern Dynamics pg. 336)
$$\frac{d}{ds}\left(n\frac{dx^a}{ds}\right) = \frac{\partial n}{\partial x^a}$$
This is a second-order ordinary differential equation in the variables xa that define the ray path through the system. It is literally a “trajectory” of the ray, and the Eikonal equation becomes the F = ma of ray optics.
In a paraxial system (in which the rays never make large angles relative to the optic axis) it is common to select the position z as a single parameter to define the curve of the ray path so that the trajectory is parameterized as
$$\mathrm{OPL} = \int n(x,y,z)\sqrt{1 + x'^2 + y'^2}\,dz$$
where the derivatives are with respect to z, and the effective Lagrangian is recognized as
$$L = n(x,y,z)\sqrt{1 + x'^2 + y'^2}$$
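The Hamiltonian formulation is derived from the Lagrangian by defining an optical Hamiltonian as the Legendre transform of the Lagrangian. To start, the Lagrangian is expressed in terms of the generalized coordinates and momenta. The generalized optical momenta are defined as

$$ p_a = \frac{\partial L}{\partial \dot{x}^a} = n \frac{dx^a}{ds} $$

Because $dx^a/ds$ is a unit vector, the momenta satisfy $p_a p_a = n^2$. This relationship leads to an alternative expression for the Eikonal equation (also known as the scalar Eikonal equation) expressed as

$$ \left(\frac{\partial S}{\partial x}\right)^2 + \left(\frac{\partial S}{\partial y}\right)^2 + \left(\frac{\partial S}{\partial z}\right)^2 = n^2(x,y,z), \qquad p_a = \frac{\partial S}{\partial x^a} $$

where S(x,y,z) = const. is the eikonal function. The momentum vectors are perpendicular to the surfaces of constant S, which are recognized as the wavefronts of a propagating wave.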
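The Lagrangian can be restated as a function of the generalized momenta as

$$ L = \frac{n^2}{\sqrt{n^2 - p_x^2 - p_y^2}} $$

and the Legendre transform that takes the Lagrangian into the Hamiltonian is

$$ H = p_x \frac{dx}{dz} + p_y \frac{dy}{dz} - L = -\sqrt{n^2 - p_x^2 - p_y^2} $$

The trajectory of the rays is the solution to Hamilton’s equations of motion applied to this Hamiltonian

$$ \frac{dx^a}{dz} = \frac{\partial H}{\partial p_a}, \qquad \frac{dp_a}{dz} = -\frac{\partial H}{\partial x^a} $$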
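The Legendre transform described above can be spot-checked numerically. The sketch below assumes the standard paraxial Lagrangian L = n sqrt(1 + x'^2 + y'^2), with primes denoting derivatives with respect to z, and compares H = p_x x' + p_y y' - L against the closed form -sqrt(n^2 - p_x^2 - p_y^2). The function name `legendre_check` and the sample values are hypothetical, chosen only for illustration.

```python
import math

def legendre_check(n, xp, yp):
    """Compare H = p_x x' + p_y y' - L with -sqrt(n^2 - p_x^2 - p_y^2).

    n is the local refractive index; xp and yp are the slopes dx/dz and dy/dz.
    """
    root = math.sqrt(1.0 + xp * xp + yp * yp)
    lagrangian = n * root          # assumed paraxial form L = n sqrt(1 + x'^2 + y'^2)
    px = n * xp / root             # p_x = dL/dx'
    py = n * yp / root             # p_y = dL/dy'
    h_legendre = px * xp + py * yp - lagrangian
    h_closed = -math.sqrt(n * n - px * px - py * py)
    # The same Lagrangian rewritten in terms of the momenta
    l_from_momenta = n * n / math.sqrt(n * n - px * px - py * py)
    return h_legendre, h_closed, lagrangian, l_from_momenta

for n, xp, yp in [(1.0, 0.0, 0.0), (1.5, 0.2, -0.1), (2.0, 0.05, 0.3)]:
    h1, h2, l1, l2 = legendre_check(n, xp, yp)
    print(f"n={n}, x'={xp}, y'={yp}: H(Legendre)={h1:.6f}, H(closed)={h2:.6f}")
```

For a ray travelling exactly along the optic axis the momenta vanish and the Hamiltonian reduces to -n, which is a convenient sanity check on the sign convention.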
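If the optical rays are restricted to the x-y plane, then Hamilton’s equations of motion can be expressed relative to the path length ds, and the momenta are $p_a = n\, dx^a/ds$. The ray equations are (simply expressing the 2 second-order Eikonal equations as 4 first-order equations)

$$ \dot{x} = \frac{p_x}{n}, \qquad \dot{y} = \frac{p_y}{n}, \qquad \dot{p}_x = \frac{\partial n}{\partial x}, \qquad \dot{p}_y = \frac{\partial n}{\partial y} $$

where the dot is a derivative with respect to the element ds.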
As an example, consider a radial refractive index profile in the x-y plane

$$ n(r) = 1 + \exp\left( -\frac{r^2}{2\sigma^2} \right) $$

where r is the radius on the x-y plane (with σ = 10 in the code below). Putting this refractive index profile into the Eikonal equations creates a two-dimensional orbit in the x-y plane. The following Python code solves for individual trajectories.
Python Code: raysimple.py
# -*- coding: utf-8 -*-
"""
Created on Tue May 28 11:50:24 2019
"""
import numpy as np
from scipy import integrate
from matplotlib import pyplot as plt
from matplotlib import cm

# selection 1 = Gaussian
# selection 2 = Donut
selection = 1


def refindex(x, y):
    """Return the refractive index n and its gradient (nx, ny) at (x, y)."""
    if selection == 1:
        sig = 10
        n = 1 + np.exp(-(x**2 + y**2)/2/sig**2)
        nx = (-2*x/2/sig**2)*np.exp(-(x**2 + y**2)/2/sig**2)
        ny = (-2*y/2/sig**2)*np.exp(-(x**2 + y**2)/2/sig**2)
    elif selection == 2:
        sig = 10
        r2 = (x**2 + y**2)
        r1 = np.sqrt(r2)
        expon = np.exp(-r2/2/sig**2)
        n = 1 + 0.3*r1*expon
        # Gradient of r*exp(-r^2/2/sig^2); note dr/dx = x/r (undefined at r = 0)
        nx = 0.3*r1*(-2*x/2/sig**2)*expon + 0.3*expon*x/r1
        ny = 0.3*r1*(-2*y/2/sig**2)*expon + 0.3*expon*y/r1
    return n, nx, ny


def flow_deriv(x_y_z, tspan):
    """Ray equations: the state is (x, y, px, py), the parameter is the path length s."""
    x, y, z, w = x_y_z
    n, nx, ny = refindex(x, y)
    yp = np.zeros(shape=(4,))
    yp[0] = z/n    # dx/ds = px/n
    yp[1] = w/n    # dy/ds = py/n
    yp[2] = nx     # dpx/ds = dn/dx
    yp[3] = ny     # dpy/ds = dn/dy
    return yp


# Map of the refractive index on a 100 x 100 grid spanning -20 to 20
V = np.zeros(shape=(100, 100))
for xloop in range(100):
    xx = -20 + 40*xloop/100
    for yloop in range(100):
        yy = -20 + 40*yloop/100
        n, nx, ny = refindex(xx, yy)
        V[yloop, xloop] = n

fig = plt.figure(1)
contr = plt.contourf(V, 100, cmap=cm.coolwarm, vmin=1, vmax=3)
fig.colorbar(contr, shrink=0.5, aspect=5)
plt.show()

# Initial state vector is [x, y, px, py]
v1 = 0.707 # Change this initial condition
v2 = np.sqrt(1-v1**2)
y0 = [12, v1, 0, v2] # Change these initial conditions

tspan = np.linspace(1, 1700, 1700)
y = integrate.odeint(flow_deriv, y0, tspan)

# Plot the ray trajectory in the x-y plane
plt.figure(2)
lines = plt.plot(y[1:1550, 0], y[1:1550, 1])
plt.show()
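Two conserved quantities give a quick consistency check on the ray equations. The scalar Eikonal equation requires |p| = n along the ray, and a purely radial index profile conserves the angular-momentum-like quantity x*py - y*px. The sketch below is a minimal, dependency-free version that uses a hand-written fourth-order Runge-Kutta step instead of `odeint`. The launch point, launch angle, step size and helper names (`deriv`, `rk4_step`, `eikonal_residual`, `angular_momentum`) are assumptions made for illustration, and the initial momentum is deliberately scaled so that |p| equals the local index.

```python
import math

SIGMA = 10.0  # Gaussian width, matching selection 1 in the listing above

def refindex(x, y):
    """Gaussian index n = 1 + exp(-r^2/(2 sigma^2)) and its gradient."""
    g = math.exp(-(x * x + y * y) / (2.0 * SIGMA ** 2))
    return 1.0 + g, -x / SIGMA ** 2 * g, -y / SIGMA ** 2 * g

def deriv(state):
    """Right-hand side of the four first-order ray equations."""
    x, y, px, py = state
    n, nx, ny = refindex(x, y)
    return (px / n, py / n, nx, ny)

def rk4_step(state, ds):
    """Advance the state by one classical fourth-order Runge-Kutta step."""
    def shifted(k, factor):
        return tuple(s + factor * ki for s, ki in zip(state, k))
    k1 = deriv(state)
    k2 = deriv(shifted(k1, 0.5 * ds))
    k3 = deriv(shifted(k2, 0.5 * ds))
    k4 = deriv(shifted(k3, ds))
    return tuple(s + ds / 6.0 * (a + 2 * b + 2 * c + d)
                 for s, a, b, c, d in zip(state, k1, k2, k3, k4))

def eikonal_residual(state):
    """|p|^2 - n^2, which vanishes for a properly normalized ray."""
    x, y, px, py = state
    n, _, _ = refindex(x, y)
    return px * px + py * py - n * n

def angular_momentum(state):
    """x*py - y*px, conserved because the index depends only on r."""
    x, y, px, py = state
    return x * py - y * px

# Hypothetical launch: start at (12, 0) heading 45 degrees above the x-axis,
# with |p| set equal to the local index so the scalar Eikonal equation holds.
x0, y0, angle = 12.0, 0.0, math.radians(45.0)
n0, _, _ = refindex(x0, y0)
state = (x0, y0, n0 * math.cos(angle), n0 * math.sin(angle))
start_l = angular_momentum(state)

ds, steps = 0.05, 4000  # total path length of 200
worst_residual = 0.0
for _ in range(steps):
    state = rk4_step(state, ds)
    worst_residual = max(worst_residual, abs(eikonal_residual(state)))

momentum_drift = abs(angular_momentum(state) - start_l)
print(f"largest |p^2 - n^2| along the ray: {worst_residual:.2e}")
print(f"change in x*py - y*px: {momentum_drift:.2e}")
```

Both numbers should stay near the integrator's error level. If the initial momentum is not scaled to the local index, the quantity p^2 - n^2 is still conserved but is no longer zero, and the parameter s is then no longer the true path length.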
An excellent textbook on geometric optics from Hamilton’s point of view is K. B. Wolf, Geometric Optics in Phase Space (Springer, 2004). Another is H. A. Buchdahl, An Introduction to Hamiltonian Optics (Dover, 1992).
A rather older textbook on geometrical optics is by J. L. Synge, Geometrical Optics: An Introduction to Hamilton’s Method (Cambridge University Press, 1962) showing the derivation of the ray equations in the final chapter using variational methods. Synge takes a dim view of Bruns’ term “Eikonal” since Hamilton got there first and Bruns was unaware of it.
A book that makes an especially strong case for the Optical-Mechanical analogy of Fermat’s principle, connecting the trajectories of mechanics to the paths of optical rays is Daryl Holm, Geometric Mechanics: Part I Dynamics and Symmetry (Imperial College Press 2008).
Hamilton, W. R. “On a general method in dynamics I.” Mathematical Papers, I, 103-161: 247-308. (1834); Hamilton, W. R. “On a general method in dynamics II.” Mathematical Papers, I, 103-161: 95-144. (1835)
Schrödinger, E. “Quantification of the eigen-value problem.” Annalen der Physik 79(6): 489-527. (1926)
There is a very real possibility that quantum computing is, and always will be, a technology of the future. Yet if it is ever to be the technology of the now, then it needs two things: practical high-performance implementation and a killer app. Both of these will require technological breakthroughs. Whether this will be enough to make quantum computing real (commercializable) was the topic of a special symposium at the Conference on Lasers and ElectroOptics (CLEO) held in San Jose the week of May 6, 2019.
The symposium had panelists from many top groups working in quantum information science, including Jerry Chow (IBM), Mikhail Lukin (Harvard), Jelena Vuckovic (Stanford), Birgitta Whaley (Berkeley) and Jungsang Kim (IonQ). The moderator Ben Eggleton (U Sydney) posed the question to the panel: “Will Quantum Computing Actually Work?”. My Blog for this week is a report, in part, of what they said, and also what was happening in the hallways and the scientific sessions at CLEO. My personal view after listening and watching this past week is that the future of quantum computers is optics.
It is either ironic or obvious that the central figure behind quantum computing is Albert Einstein. It is obvious because Einstein provided the fundamental tools of quantum computing by creating both quanta and entanglement (the two key elements of any quantum computer). It is ironic because Einstein turned his back on quantum mechanics, and he “invented” entanglement in order to argue that quantum mechanics was an “incomplete science”.
The actual quantum revolution did not begin with Max Planck in 1900, as so many Modern Physics textbooks attest, but with Einstein in 1905. This was his “miracle year” when he published 5 seminal papers, each of which solved one of the greatest outstanding problems in the physics of the time. In one of those papers he used simple arguments based on statistics, combined with the properties of light emission, to propose, and actually to prove, that light is composed of quanta of energy (later to be named “photons” by Gilbert Lewis in 1926). Although Planck’s theory of blackbody radiation contained quanta implicitly through the discrete actions of his oscillators in the walls of the cavity, Planck vigorously rejected the idea that light itself came in quanta. He even apologized for Einstein, as he was proposing Einstein for membership in the Berlin Academy, saying that he should be admitted despite his grave error of believing in light quanta. When Millikan set out in 1914 to prove experimentally that Einstein was wrong about photons by performing exquisite experiments on the photoelectric effect, he actually ended up proving that Einstein was right after all, which brought Einstein the Nobel Prize in 1921.
In the early 1930s, after a series of intense and public debates with Bohr over the meaning of quantum mechanics, Einstein had had enough of the “Copenhagen Interpretation” of quantum mechanics. He and Schrödinger, who deeply disliked Heisenberg’s version of quantum mechanics, proposed two of the most iconic problems of quantum mechanics. Schrödinger launched, as a laughable parody, his eponymously-named “Schrödinger’s Cat”, and Einstein launched what has become known as “entanglement”. Each was intended to show the absurdity of quantum mechanics and drive a nail into its coffin, but each has been embraced so thoroughly by physicists that Schrödinger and Einstein are given the praise and glory for inventing these touchstones of quantum science. Schrödinger’s cat and entanglement both lie at the heart of the problems and the promise of quantum computers.
Between Hype and Hope
Quantum computing is stuck in a sort of limbo between hype and hope, pitched with incredible (unbelievable?) claims, yet supported by tantalizing laboratory demonstrations. In the midst of the current revival in quantum computing interest (the first wave of interest in quantum computing was in the 1990s, see “Mind at Light Speed”), the US Congress has passed a house resolution to fund quantum computing efforts in the United States with a commitment of $1B. This comes on the heels of commercial efforts in quantum computing by big players like IBM, Microsoft and Google, and also is partially in response to China’s substantial financial commitment to quantum information science. These acts, and the infusion of cash, will supercharge efforts on quantum computing. But this comes with real danger of creating a bubble. If there is too much hype, and if the expensive efforts under-deliver, then the bubble will burst, putting quantum computing back by decades. This has happened before, as in the telecom and fiber optics bubble of Y2K that burst in 2001. The optics industry is still recovering from that crash nearly 20 years later. The quantum computing community will need to be very careful in managing expectations, while also making real strides on some very difficult and long-range problems.
This was part of what the discussion at the CLEO symposium centered around. Despite the charge by Eggleton to “be real” and avoid the hype, there was plenty of hype going around on the panel and plenty of optimism, tempered by caution. I admit that there is reason for cautious optimism. Jerry Chow showed IBM’s very real quantum computer (with a very small number of qubits) that can be accessed through the cloud by anyone. They even built a user interface to allow users to code their own quantum codes. Jungsang Kim of IonQ was equally optimistic, showing off their trapped-atom quantum computer with dozens of trapped ions acting as individual qubits. Admittedly Chow and Kim have vested interests in their own products, but the technology is certainly impressive. One of the sharpest critics, Mikhail Lukin of Harvard, was surprisingly also one of the most optimistic. He made clear that the idea of scalable quantum computers in the near future is nonsense. Yet he is part of a Harvard-MIT collaboration that has constructed a 51-qubit array of trapped atoms that sets a world record. Although it cannot be used for quantum computing, it was used to simulate a complex many-body physics problem, and it found an answer that could not be calculated or predicted using conventional computers.
The panel did come to a general consensus about quantum computing that highlights the specific challenges that the field will face as it is called upon to deliver on its hyperbole. They each echoed an idea known as the “supremacy plot”, which is a two-axis graph of number of qubits and number of operations (also called circuit depth). The graph has one region that is not interesting, one region that is downright laughable (at the moment), and one final area of great hope. The region of no interest lies in the range of large numbers of qubits but low numbers of operations, or large numbers of operations on a small number of qubits. Each of these extremes can easily be calculated on conventional computers and hence is of no practical interest. The region that is laughable is the area of large numbers of qubits and large numbers of operations. No one suggested that this area can be accessed in even the next 10 years. The region that everyone is eager to reach is the region of “quantum supremacy”. This consists of quantum computers that have enough qubits and enough operations that they cannot be simulated by classical computers. When asked where this region is, the panel consensus was that it would require more than 50 qubits and more than hundreds or thousands of operations. What makes this so exciting is that there are real technologies that are now approaching this region, and they are based on light.
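One common back-of-the-envelope argument for why the threshold sits near 50 qubits (an assumption added here, not something the panel spelled out) is the memory needed to store a full quantum state on a classical machine: n qubits require 2^n complex amplitudes. The sketch below assumes a brute-force state-vector simulation with 16 bytes per amplitude (double-precision complex). The function names are hypothetical.

```python
def state_vector_bytes(num_qubits, bytes_per_amplitude=16):
    """Memory for a full state vector: 2**n complex amplitudes.

    bytes_per_amplitude=16 assumes double-precision complex numbers.
    """
    return (2 ** num_qubits) * bytes_per_amplitude

def human_readable(num_bytes):
    """Format a byte count using binary prefixes (KiB, MiB, ...)."""
    units = ["B", "KiB", "MiB", "GiB", "TiB", "PiB", "EiB", "ZiB", "YiB"]
    value = float(num_bytes)
    for unit in units:
        if value < 1024.0 or unit == units[-1]:
            return f"{value:.0f} {unit}"
        value /= 1024.0

for qubits in (10, 30, 40, 50, 60):
    print(f"{qubits} qubits: {human_readable(state_vector_bytes(qubits))}")
```

Each added qubit doubles the memory, so 30 qubits fit in a laptop-scale 16 GiB while 50 qubits need 16 PiB. This counts only memory for one simulation strategy; cleverer classical methods and the circuit depth both shift where the practical boundary lies.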
Chris Monroe’s Perfect Qubits
The second plenary session at CLEO featured the recent Nobel prize winners Art Ashkin, Donna Strickland and Gérard Mourou, who won the 2018 Nobel prize in physics for laser applications. (Donna Strickland is only the third woman to win the Nobel prize in physics.) The warm-up band for these headliners was Chris Monroe, founder of the start-up company IonQ out of the University of Maryland. Monroe outlined the general layout of their quantum computer, which is based on trapped atoms that he called “perfect qubits”. Each trapped atom is literally an atomic clock with the kind of exact precision that atomic clocks come with. The quantum properties of these atoms are as perfect as is needed for any quantum computation, and the limits on the performance of the current IonQ system are entirely caused by the classical controls that trap and manipulate the atoms. This is where the efforts of their rapidly growing R&D team are focused.
If trapped atoms are the perfect qubit, then the perfect quantum communication channel is the photon. The photon in vacuum is the quintessential messenger, propagating forever and interacting with nothing. This is why experimental cosmologists can see the photons originating from the Big Bang nearly 14 billion years ago (actually from about 380,000 years after the Big Bang, when the Universe became transparent). In a quantum computer based on trapped atoms as the gates, photons become the perfect wires.
On the quantum supremacy chart, Monroe plotted the two main quantum computing technologies: solid state (based mainly on superconductors but also some semiconductor technology) and trapped atoms. The challenges to solid state quantum computers come with the scale-up to the range of 50 qubits or more that will be needed to cross the frontier into quantum supremacy. The inhomogeneous nature of solid state fabrication, as perfected as it is for the transistor, is a central problem for a solid state solution to quantum computing. Furthermore, by scaling up the number of solid state qubits, it is extremely difficult to simultaneously increase the circuit depth. In fact, circuit depth is likely to decrease (initially) as the number of qubits rises because of the two-dimensional interconnect problem that is well known to circuit designers. Trapped atoms, on the other hand, have the advantages of the perfection of atomic clocks that can be globally interconnected through perfect photon channels, and scaling up the number of qubits can go together with increased circuit depth, at least in the view of Monroe, who admittedly has a vested interest. But he was speaking before an audience of several thousand highly-trained and highly-critical optics specialists, and no scientist in front of such an audience will make a claim that cannot be supported (although the reality is always in the caveats).
The Future of Quantum Computing is Optics
The state of the art of the photonic control of light equals the levels of sophistication of electronic control of the electron in circuits. Each is driven by big-world applications: electronics by the consumer electronics and computer market, and photonics by the telecom industry. Having a technology attached to a major world-wide market is a guarantee that progress is made relatively quickly with the advantages of economy of scale. The commercial driver is profits, and the driver for funding agencies (who support quantum computing) is their mandate to foster competitive national economies that create jobs and improve standards of living.
The yearly CLEO conference is one of the top conferences in laser science in the world, drawing in thousands of laser scientists who are working on photonic control. Integrated optics is one of the current hot topics. It brings many of the resources of the electronics industry to bear on photonics. Solid state optics is mostly concerned with quantum properties of matter and its interaction with photons, and this year’s CLEO conference hosted many focused sessions on quantum sensors, quantum control, quantum information and quantum communication.
The level of external control of quantum systems is increasing at a spectacular rate. Sitting in the audience at CLEO you get the sense that you are looking at the embryonic stages of vast new technologies that will be enlisted in the near future for quantum computing. The challenge is that there are so many variants that it is hard to know which of these nascent technologies will win and change the world. But the key to technological progress is diversity (as it is for society), because it is the interplay and cross-fertilization among the diverse technologies that drives each forward, and even technologies that recede still contribute to the advances of the winning technology.
The expert panel at CLEO on the future of quantum computing punctuated their moments of hype with moments of realism as they called for new technologies to solve some of the current barriers to quantum computers. Walking out of the panel discussion that night, and walking into one of the CLEO technical sessions the next day, you could almost connect the dots. The enabling technologies being requested by the panel are literally being built by the audience.
In the end, the panel had a surprisingly prosaic argument in favor of the current push to build a working quantum computer. It is an echo of the movie Field of Dreams, with the famous quote “If you build it they will come”. That was the plea made by Lukin, who argued that once quantum computers are in the hands of users, the killer app that will drive the future economics of quantum computers is likely to emerge. You don’t really know what to do with a quantum computer until you have one.
Given the “perfect qubits” of trapped atoms, and the “perfect photons” of the communication channels, combined with the dizzying assortment of quantum control technologies being invented and highlighted at CLEO, it is easy to believe that the first large-scale quantum computers will be based on light.