Harald Kirsch

genug Unfug.


Shortest Distance (II.2, p.124)*



This is not explicitly an exercise in the book. Rather, I write down for myself how to prove, with the help of the Euler-Lagrange Equation, that a straight line is the shortest path between two points in a Euclidean space.

Actually, I like the more formal approach to explaining the Euler-Lagrange Equation in Neuenschwander's book Emmy Noether's Wonderful Theorem, but motivated by Zee, I try to use it here for the first time for a very simple and obvious problem. $\newcommand{\R}{\mathbb{R}}$

So the task is now to prove by means of the Euler-Lagrange Equation that the shortest path between two points in a Euclidean space is a straight line.



Zee writes the Euler-Lagrange Equation in the form \begin{equation} \frac{d}{dx} \left( \frac{\delta \mathcal{E}}{\delta\frac{d\phi_j}{dx}} \right) - \frac{\delta\mathcal{E}}{\delta\phi_j} = 0\, . \end{equation} But I prefer the formulation in Neuenschwander's book. Given an $n$-dimensional function of one variable, $\vec{x}:\R\to\R^n$, with the components written as $x^\mu:\R\to\R$ ($\mu=1\dots n$), and the so-called Lagrangian $L:\R^{2n+1}\to\R$, the functional $J:(\R\to\R^n)\to\R$, mapping the $n$-dimensional function $\vec{x}$ to a number, is defined as \begin{equation} J(\vec{x}) = \int_a^b L(t,x^\mu(t),\dot x^\mu(t)) \,dt\,, \end{equation} where $L(t,x^\mu(t),\dot x^\mu(t))$ is short for $L(t,x^1(t),\dots,x^n(t),\dot x^1(t),\dots,\dot x^n(t))$. The theory of the Euler-Lagrange Equations (I think it is also called the action principle) now says that a function $\vec{x}$ for which $J$ becomes extremal (minimal or maximal) satisfies the Euler-Lagrange Equation(s) \begin{equation}\label{eqEL} \frac{\delta L}{\delta x^\mu} = \frac{d}{dt}\frac{\delta L}{\delta \dot x^\mu} \qquad \mu=1\dots n\,. \end{equation}

It took me some time to understand the meaning of expressions like $\delta L / \delta x^\mu$ in cases where $x^\mu$ is not a variable but a function. But this is really just the partial derivative of $L$ with respect to the respective argument, with the function inserted only afterwards. In particular, it means that the chain rule need not be invoked. As an example consider $L(a,b,c):= 3a + 2b^2 + 5c$ and let $\vec{x}(t):=t^2$ be a 1-dimensional function. This gives $L(t,x^\mu(t),\dot x^\mu(t)) = 3t + 2(x^\mu(t))^2 + 5\dot x^\mu(t) = 3t + 2t^4 + 10t$ for $\mu=1$. And the seemingly difficult differentiation is just: \begin{align} \frac{\delta L}{\delta x^\mu} &= \frac{\delta L}{\delta b}\Bigr|_{b=x^\mu} \notag\\ &= 4b\Bigr|_{b=x^\mu} \notag\\ &= 4 t^2 \notag \end{align} Note in particular that $4t^2$ is not further differentiated according to the chain rule.

The relation between Zee and Neuenschwander is the following: \begin{align} \mathcal{E} & \to L\\ \phi_j & \to x^\mu\\ x & \to t \end{align}
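
To see the recipe of equation \ref{eqEL} at work on something small, here is a sketch with a Lagrangian that is taken from neither book but chosen purely for illustration: the one-dimensional ($n=1$) case $L(a,b,c) := \tfrac{1}{2}c^2 - \tfrac{1}{2}b^2$. Differentiating with regard to the second and third argument and only then inserting the function gives \begin{align} \frac{\delta L}{\delta x} &= -b\,\Bigr|_{b=x(t)} = -x(t)\,, \notag\\ \frac{\delta L}{\delta \dot x} &= c\,\Bigr|_{c=\dot x(t)} = \dot x(t)\,, \notag\\ \frac{d}{dt}\frac{\delta L}{\delta \dot x} &= \ddot x(t)\,, \notag \end{align} so the Euler-Lagrange Equation reads $-x = \ddot x$, which is solved for example by $x(t)=\sin t$. Only the last step, the total derivative $d/dt$, sees the $t$-dependence of the inserted function.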


Now that we have the tools ready, we can actually start to solve the exercise. Let $\vec{x}(t) = (x^\mu(t))|_{\mu=1}^n$ define an arbitrary curve in the $n$-dimensional Euclidean space. From what I learned in calculus, the length of the curve between two points $\vec{x}(a)$ and $\vec{x}(b)$ is given by integrating the absolute value of the first derivative, \begin{equation}\label{eqJ} \text{length} = \int_a^b |\dot{\vec{x}}(t)|\, dt =: J(\vec{x}) \,, \end{equation} independent of the specific parameterization. Just consider $t$ as time and $\vec{x}$ as the path of a point over time. Then $|\dot{\vec{x}}(t)|$ is the speed of the point, and integrating the speed over time indeed results in the distance travelled.

Evidently the Lagrangian is $L(t,x^\mu,\dot x^\mu) = |\dot{\vec{x}}(t)| =\sqrt{\sum_\mu (\dot x^\mu)^2}$. We want to solve equation \ref{eqEL} for this $L$. Since $L$ does not directly depend on $\vec{x}$, but only on $\dot{\vec{x}}$, the left-hand side of \ref{eqEL} is equal to zero. For the right-hand side, we differentiate with regard to $\dot x^\mu$. Differentiating an absolute value is a bit cumbersome the first time, but in effect the absolute value moves to the denominator and the chain rule terms go into the numerator: \begin{equation} \frac{\delta L}{\delta \dot x^\mu} = \frac{2 \dot x^\mu}{2 \sqrt{\sum_\nu (\dot x^\nu)^2}} = \frac{\dot x^\mu}{L} \,. \end{equation} The result needs to be differentiated with respect to $t$ to provide the right-hand side of \ref{eqEL}. We saw already that the left-hand side is zero, so we get: \begin{equation} 0 = \frac{d}{dt}\frac{\dot x^\mu}{L}\;\;\;\forall \mu=1,\dots,n \end{equation} At this point, Zee says on p. 125 that "these are easy enough to solve by inspection" and that $\dot x^\mu$ is constant.

I must admit, I don't see how to solve it with this result. All I can derive is that $\dot x^\mu / L$ is constant. Further, it is not even a necessary condition that $\dot x^\mu$ is constant for the line to be straight. With $x^\mu(t)=t^2$ ($t\geq 0,\,\mu=1\dots n$), the line is perfectly straight for any $n$, while $\dot x^\mu(t)=2t$ is not constant.
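
As a quick check of this example, assuming $t>0$ so that $L\neq 0$: with $x^\mu(t)=t^2$ for all $\mu$ we get \begin{align} \dot x^\mu &= 2t\,, \notag\\ L &= \sqrt{\sum_{\nu=1}^n (2t)^2} = 2t\sqrt{n}\,, \notag\\ \frac{\dot x^\mu}{L} &= \frac{1}{\sqrt{n}}\,, \notag \end{align} which is indeed constant although $\dot x^\mu$ is not. So this straight line satisfies the equation above; it is merely traversed with non-constant speed.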


Rather, I continue by actually computing the derivative with regard to $t$. Since both numerator and denominator depend on $t$, we can apply the quotient rule: \begin{align} \frac{d}{dt} \frac{\dot x^\mu}{L} &= \frac{\ddot x^\mu L - \dot x^\mu \dot L}{L^2}\,. \end{align} By inserting into \ref{eqEL} and then dividing by $L$ (assuming $L\neq 0$) we get: \begin{align} 0 &= \ddot x^\mu L - \dot x^\mu \dot L\\ \Longrightarrow\quad \ddot x^\mu &= \dot x^\mu\frac{\dot L}{L} \,. \end{align} Each component $\ddot x^\mu$ of the acceleration is a multiple of the respective component $\dot x^\mu$ of the velocity, where the factor $\dot L/L$ is the same for all $\mu$. This means that the acceleration $\ddot{\vec{x}}$ is always parallel to the velocity $\dot{\vec{x}}$: it acts along the line of motion, speeding the point up or slowing it down, but never pushing it sideways. This again means that a point following the "curve" $\vec{x}$ can never actually curve, but always goes straight. If it curved, the acceleration would have to act in a direction other than the one in which the point is already moving.
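
For reference, the $\dot L$ appearing here can be written out explicitly. The following is only a sketch under the assumption $L\neq 0$, with the factor called $\lambda$ just for this check: \begin{align} \dot L = \frac{d}{dt}\sqrt{\sum_\nu (\dot x^\nu)^2} = \frac{\sum_\nu \dot x^\nu \ddot x^\nu}{L}\,. \notag \end{align} As a consistency check, if $\ddot x^\nu = \lambda\, \dot x^\nu$ for all $\nu$ with one common factor $\lambda$, then \begin{align} \dot L = \frac{\lambda \sum_\nu (\dot x^\nu)^2}{L} = \frac{\lambda L^2}{L} = \lambda L\,, \notag \end{align} so the common factor is indeed $\lambda = \dot L / L$, in agreement with the result above.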

What we just proved with the Euler-Lagrange Equation is that for the length of the path from $\vec{x}(a)$ to $\vec{x}(b)$, i.e. $J(\vec{x})$ in equation \ref{eqJ}, to be extremal, the path has to be a straight line.

For the proof to be really complete, we would have to show two more things:

  1. The straight line indeed minimizes $J$ and does not maximize it.
  2. $\dot{\vec{x}}$ being parallel to $\ddot{\vec{x}}$ is sufficient for the "curve" to be straight.

For now I just believe this :-)
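
Still, here is a possible sketch of both points, assuming a regular curve, i.e. $L(t)=|\dot{\vec{x}}(t)|>0$ on $[a,b]$. The symbols $\vec{e}$ and $\tau$ are introduced only for this sketch. For point 2 it seems easiest to go back to the statement that $\dot x^\mu/L$ is constant. Calling this constant unit vector $\vec{e}$, we have $\dot{\vec{x}}(t) = L(t)\,\vec{e}$ and therefore \begin{align} \vec{x}(t) = \vec{x}(a) + \vec{e}\int_a^t L(\tau)\,d\tau\,, \notag \end{align} so every point of the path lies on the straight line through $\vec{x}(a)$ with direction $\vec{e}$. For point 1, the triangle inequality for integrals gives, for any curve from $\vec{x}(a)$ to $\vec{x}(b)$, \begin{align} |\vec{x}(b)-\vec{x}(a)| = \left|\int_a^b \dot{\vec{x}}(t)\,dt\right| \leq \int_a^b |\dot{\vec{x}}(t)|\,dt = J(\vec{x})\,. \notag \end{align} No curve is shorter than $|\vec{x}(b)-\vec{x}(a)|$, and the straight line with $\dot{\vec{x}} = L\,\vec{e}$ attains this bound because $\left|\int_a^b L\,\vec{e}\,dt\right| = \int_a^b L\,dt$ for $L>0$. So the extremum is a minimum.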