Derivation of the Euler-Lagrange Equation for a Function of Several Dependent Variables


SOURCE FOR CONTENT: Classical Dynamics of Particles and Systems. Thornton and Marion. 5th Edition. 


Consider a functional

\displaystyle \phi = \phi(y_{\mu},y_{\mu}^{\prime}; x), (1)

where \mu = 1,2,...,n. By the method used in a previous section of the aforementioned text, we may write

\displaystyle y_{\mu}(\alpha, x) = y_{\mu}(0,x) +\alpha \eta_{\mu}(x). (2)

Additionally, we will find it useful to define

\displaystyle y_{\mu}^{\prime}(\alpha,x) = y_{\mu}^{\prime}(0,x)+\alpha \eta_{\mu}^{\prime}(x). (3)

Further we may also define an integral functional by way of integrating Eq.(1) over the interval x_{1}\leq x \leq x_{2}, and introducing a variational parameter \alpha we have

\displaystyle J(\alpha) = \int_{x_{1}}^{x_{2}} \phi(y_{\mu},y_{\mu};x)dx. (4)

Two necessary conditions that are used to derive the Euler-Lagrange equation include

\displaystyle \frac{\partial J(\alpha)}{\partial \alpha}\bigg\|_{\alpha=0}=0, (5)


\displaystyle \eta_{\mu}(x_{1})=\eta_{\mu}(x_{2})=0. (6)

Let us take the derivative of J(\alpha) with respect to \alpha yielding

\displaystyle \frac{\partial J}{\partial \alpha}\bigg\|_{\alpha=0}=\frac{\partial}{\partial \alpha}\int_{x_{1}}^{x_{2}}\phi(y_{\mu},y_{\mu}^{\prime};x)dx. (7)

Carrying out the derivative operator on the right-hand-side of Eq.(7) we get

\displaystyle \frac{\partial J}{\partial \alpha}=\int_{x_{1}}^{x_{2}}\sum_{\mu}\bigg\{\partial_{y_{\mu}}\phi \partial_{\alpha}y_{\mu}+\partial_{y_{\mu}^{\prime}}\phi \partial_{\alpha}y_{\mu}^{\prime}\bigg\}dx. (8)

From Eqs.(2) and (3) we see that

\displaystyle \partial_{\alpha}y_{\mu}= \eta_{\mu}(x), (9)


\displaystyle \partial_{\alpha}y_{\mu}^{\prime} = \eta_{\mu}^{\prime}(x). (10)

Thus Eq.(8) becomes…

\displaystyle \frac{\partial J}{\partial \alpha}=\int_{x_{1}}^{x_{2}}\sum_{\mu}\bigg\{\partial_{y_{\mu}}\phi \eta_{\mu}(x)+\partial_{y_{\mu}^{\prime}}\phi \eta_{\mu}^{\prime}(x)\bigg\}dx. (11)

Consider the second term under the summation. We may make use of integration by parts to obtain the following

\displaystyle \frac{\partial J}{\partial \alpha}=\int_{x_{1}}^{x_{2}}\sum_{\mu}\bigg\{\partial_{y_{\mu}}\phi +\frac{d}{dx}(\partial_{y_{\mu}^{\prime}}\phi) \bigg\}\eta_{\mu}(x)dx. (12)

By the necessary condition (Eq.(5)), it follows that

\displaystyle 0 = \sum_{\mu}\bigg\{\partial_{y_{\mu}}\phi +\frac{d}{dx}(\partial_{y_{\mu}^{\prime}}\phi) \bigg\}. (13)

Additionally, in Eq.(13) above, we have also made use of the condition that \eta_{\mu}(x_{1})=\eta_{\mu}(x_{2})=0. Since \eta_{\mu}(x) \neq 0 for any x_{1}\leq x \leq x_{2}, then the terms in the brackets must vanish, yielding the Euler-Lagrange Equation for several dependent variables.

Update: the next posts will be those discussed on my Facebook page. Namely, I intend to continue with my Research Series and my series in Tensor Calculus and General Relativity with various ancillary posts in my Astrophysics Series.


Clear Skies!

A “Proof” of the Sturm-Liouville Theorem/Problem

IMAGE CREDIT: NASA/JPL: This shows Jupiter’s Great Red Spot; a storm that has been occurring for over 300 years now. Quite recently, however, observations show that the Spot appears to be shrinking in size. 

About a week ago, I was looking through my notebooks and came across an unfinished problem posed by one of my professors. Unfortunately, I was not able to solve the problem during the semester. However, I thought it might be something interesting to consider. I did a quick search and found that the problem he gave us was to prove the Sturm-Liouville Theorem.

The main brute-force method to analytically solving a given partial differential equation is the separation of variables. This method is heavily used by physicists and in doing so transforms the initial boundary value problem (IBVP) into a Sturm-Liouville problem in which we have an ordinary differential equation and linear homogeneous boundary conditions.

Continue reading A “Proof” of the Sturm-Liouville Theorem/Problem

Helicity and Magnetic Helicity

SOURCE FOR CONTENT: Davidson, P.A., 2001, An Introduction to Magnetohydrodynamics. 3. 

The purpose of this post is to introduce the concepts of helicity as an integral invariant and its magnetic analog. More specifically, I will be showing that

\displaystyle h = \int_{V_{\omega}}\textbf{u}\cdot \omega d^{3}r  (1)

is invariant and thus correlates to the conservation of vortex line topology using the approach given in the source material above.  I will then discuss magnetic helicity and how it relates to MHD.

Consider the total derivative of \textbf{u}\cdot \omega:

\displaystyle \frac{D}{Dt}(\textbf{u}\cdot \omega) = \frac{D\textbf{u}}{Dt}\cdot \omega + \frac{D\omega}{Dt}\cdot \textbf{u}, (2)

where I have used the product rule for derivatives. Also I denote a total derivative as \frac{D}{Dt} = \frac{\partial}{\partial t}+(\nabla \cdot \textbf{v}) so as not to confuse it with the notation for an ordinary derivative. Recall the Navier-Stokes’ equation and the vorticity equation

\displaystyle \frac{\partial \textbf{u}}{\partial t}+(\nabla \cdot \textbf{v})\textbf{u}=\textbf{F}-\frac{1}{\rho}\nabla P + \nu \nabla^{2}\textbf{u}, (3)


\displaystyle \frac{\partial \omega}{\partial t}+(\nabla \cdot \textbf{v})\omega =(\nabla \cdot \omega)\textbf{u} +\eta \nabla^{2}\omega (4).

Let \textbf{F}=0 and let \eta \nabla^{2} \omega \equiv 0 and \nu \nabla^{2} \textbf{u} \equiv 0. Then,

\displaystyle \frac{D\textbf{u}}{Dt}=-\nabla \frac{P}{\rho},  (5)


\displaystyle \frac{D\omega}{Dt}=(\nabla \cdot \omega)\textbf{u}. (6)

If we substitute Eqs. (5) and (6) into Eq.(2), we get

\displaystyle \frac{D}{Dt}(\textbf{u}\cdot \omega)=-\nabla \frac{P}{\rho}\cdot \omega + (\nabla \cdot \omega)\textbf{u}\cdot \textbf{u}. (7)

Evaluating each scalar product gives

\displaystyle -2\nabla \frac{P}{\rho}\omega + \omega u^{2}\nabla = 0. (8)

If we divide by 2 and note the definition of the scalar product, we may rewrite this as follows

\displaystyle \nabla \cdot \bigg\{-\frac{P}{\rho}\omega +\frac{u^{2}}{2}\omega\bigg\}=0. (9)

With this in mind, consider now the total derivative of (\textbf{u}\cdot \omega) multiplied by a volume element \delta V given by

\displaystyle \frac{D}{Dt}(\textbf{u}\cdot \omega)\delta V = \nabla \cdot \bigg\{-\frac{P}{\rho}\omega +\frac{u^{2}}{2}\omega\bigg\}\delta V. (10)

When the above equation is integrated over a volume V_{\omega}, the total derivative becomes an ordinary time derivative. Thus, Eq.(10) becomes

\displaystyle \int_{V_{\omega}}\frac{d}{dt}(\textbf{u}\cdot \omega)dV=\iint_{S_{\omega}}\bigg\{\nabla \cdot [-\frac{P}{\rho}\omega + \frac{u^{2}}{2}\omega ]\bigg\}\cdot d\textbf{S}=0. (11)

Hence, Eq.(11) shows that helicity is an integral invariant. In this analysis, we have assumed that the fluid is inviscid. If one considers two vortex lines (as Davidson does as a mathematical example), one finds that there exists two contributions to the overall helicity, corresponding to an individual vortex line. If one considers the vorticity multiplied by a volume element, one finds that for each line, their value helicity is the same. Thus, because they are the same, we can say that the overall helicity remains invariant and hence are linked. If they weren’t linked, then the helicity would be 0. In regards to magnetic helicity we define this as

\displaystyle h_{m}=\int_{V_{B}}\textbf{A}\cdot \textbf{B}dV, (12)

where \textbf{A} is the magnetic vector potential which is described by the following relations

\displaystyle \nabla \times \textbf{A}=\textbf{B}, (13.1)


\displaystyle \nabla \cdot \textbf{A}=0. (13.2)

 Again, we see that magnetic helicity is also conserved. However, this conservation arises by means of Alfven’s theorem of flux freezing in which the magnetic field lines manage to preserve their topology.

Consequences and some Elementary Theorems of the Ideal One-Fluid Magnetohydrodynamic Equations


Priest, E. Magnetohydrodynamics of the Sun, 2014. Cambridge University Press. Ch.2.;

Davidson, P.A., 2001. An Introduction to Magnetohydrodynamics. Ch.4. 

We have seen how to derive the induction equation from Maxwell’s equations assuming no charge and assuming that the plasma velocity is non-relativistic. Thus, we have the induction equation as being

\displaystyle \frac{\partial \textbf{B}}{\partial t}=\nabla \times (\textbf{v}\times \textbf{B})+\lambda \nabla^{2}\textbf{B}. (1)

Many texts in MHD make the comparison of the induction equation to the vorticity equation

\displaystyle \frac{\partial \Omega}{\partial t}= \nabla \times (\textbf{v} \times \Omega)+\nu \nabla^{2}\Omega, (2)

where I have made use of the vector identity

\nabla \times (\textbf{X}\times \textbf{Y})=\textbf{X}(\nabla \cdot \textbf{Y})-\textbf{Y}(\nabla \cdot \textbf{X})+(\textbf{Y}\cdot \nabla)\textbf{X}-(\textbf{X}\cdot \nabla)\textbf{Y}.

Indeed, if we do compare the induction equation (Eq.(1)) to the vorticity equation (Eq.(2)) we easily see the resemblance between the two. The first term on the right hand side of Eq.(1)/ Eq.(2) determines the advection of magnetic field lines/vortex field lines; the second term on the right hand side deals with the diffusion of the magnetic field lines/vortex field lines.

From this, we can impose restrictions and thus look at the consequences of the induction equation (since it governs the evolution of the magnetic field). Furthermore, we see that we can modify the kinematic theorems of classical vortex dynamics to describe the properties of magnetic field lines. After discussing the direct consequences of the induction equation, I will discuss a few theorems of vortex dynamics and then introduce their MHD analogue.

Inherent to this is magnetic Reynold’s number. In geophysical fluid dynamics, the Reynolds number (not the magnetic Reynolds number) is a ratio between the viscous forces per volume and the inertial forces per volume given by

\displaystyle Re=\frac{ul}{V}, (3)

where u, l, V represent the typical fluid velocity, length scale and typical volume respectively. The magnetic Reynolds number is the ratio between the advective and diffusive terms of the induction equation. There are two canoncial regimes: (1) Re_{m}<<1, and (2)Re_{m}>>1 The former is sometimes called the diffusive limit and the latter is called either the Ideal limit or the infinite conductivity limit (I prefer to call it the ideal limit, since the terms infinite conductivity limit is not quite accurate).


Case I:  Re_{m}<<1

Consider again the induction equation

\displaystyle \frac{\partial \textbf{B}}{\partial t}=\nabla \times (\textbf{v}\times \textbf{B})+\lambda\nabla^{2}\textbf{B}.

If we then assume that we are dealing with incompressible flows (i.e. (\nabla \cdot \textbf{v})=0) then we can use the aforementioned vector identity to write the induction equation as

\displaystyle \frac{D\textbf{B}}{Dt}=(\textbf{B}\cdot \nabla)\textbf{v}+\lambda\nabla^{2}\textbf{B}. (4)

In the regime for which Re_{m}<<1, the induction equation for incompressible flows (Eq.(4)) assumes the form

\displaystyle \frac{\partial \textbf{B}}{\partial t}=\lambda \nabla^{2}\textbf{B}. (5)

Compare this now to the following equation,

\displaystyle \frac{\partial T}{\partial t}=\alpha \nabla^{2}T. (6)

We see that the magnetic field lines are diffused through the plasma.


Case II: Re_{m}>>1

If we now consider the case for which the advective term dominates, we see that the induction equation takes the form

\displaystyle \frac{\partial \textbf{B}}{\partial t}=\nabla \times (\textbf{v}\times \textbf{B}). (7)

Mathematically, what this suggests is that the magnetic field lines become “frozen-in” the plasma, giving rise to Alfven’s theorem of flux freezing.

Many astrophysical systems require a high magnetic Reynolds number. Such systems include the solar magnetic field (heliospheric current sheet), planetary dynamos (Earth, Jupiter, and Saturn), and galactic magnetic fields.

Kelvin’s Theorem & Helmholtz’s Theorem:

Kelvin’s Theorem: Consider a vortex tube in which we have that (\nabla \cdot \Omega)=0, in which case

\displaystyle \oint \Omega \cdot d\textbf{S}=0, (8)

and consider also the curve taken around a closed surface, (we call this curve a material curve C_{m}(t)) we may define the circulation as being

\displaystyle \Gamma = \oint_{C_{m}(t)}\textbf{v}\cdot d\textbf{l}. (9)

Thus, Kelvin’s theorem states that if the material curve is closed and it consists of identical fluid particles then the circulation, given by Eq.(9), is temporally invariant.

Helmholtz’s Theorem:

Part I: Suppose we consider a fluid element which lies on a vortex line at some initial time t=t_{0}, according to this theorem it states that this fluid element will continue to lie on that vortex line indefinitely.

Part II: This part says that the flux of vorticity

\displaystyle \Phi = \int \Omega \cdot d\textbf{S}, (10)

remains constant for each cross-sectional area and is also invariant with respect to time.


Now the magnetic analogue of Helmholtz’s Theorems are found to be Alfven’s theorem of flux freezing and conservation of magnetic flux, magnetic field lines, and magnetic topology.

The first says that fluid elements which lie along magnetic field lines will continue to do so indefinitely; basically the same for the first Helmholtz theorem.

The second requires a more detailed argument to demonstrate why it works but it says that the magnetic flux through the plasma remains constant. The third says that magnetic field lines, hence the magnetic structure may be stretched and deformed in many ways, but the magnetic topology overall remains the same.

The justification for these last two require some proof-like arguments and I will leave that to another post.

In my project, I considered the case of high magnetic Reynolds number in order to examine the MHD processes present in region of metallic hydrogen present in Jupiter’s interior.

In the next post, I will “prove” the theorems I mention and discuss the project.

Monte Carlo Simulations of Radiative Transfer: Basics of Radiative Transfer Theory (Part I)

SOURCE FOR CONTENT: Chandrasekhar, S., 1960. Radiative Transfer. 1. 


In this post, I will be discussing the basics of radiative transfer theory necessary to understand the methods used in this project. I will start with some definitions, then I will look at the radiative transfer equation and consider two simple cases of scattering.

The first definition we require is the specific intensity, which is the amount of energy associated with a specific frequency dE_{\nu} passing through an area dA constrained to a solid angle d\Omega in a time dt. We may write this mathematically as

dE_{\nu}=I_{\nu}\cos{\theta}d\nu d\Sigma d\Omega dt. (1)

We must also consider the net flux given by

\displaystyle d\nu d\Sigma dt \int I_{\nu}\cos{\theta}d\Omega, (2)

where if we integrate over all solid angles \Omega we get

\pi F_{\nu}=\displaystyle \int I_{\nu}\cos{\theta}d\Omega. (3)

Let d\Lambda be an element of the surface \Lambda in a volume V through which radiation passes. Further let \Theta and \theta denote the angles which form normals with respect to elements d\Lambda and d\Sigma. These surfaces are joined by these normals and hence we have the surface across which energy flows  includes the elements d\Lambda and d\Sigma, given by the following:

I_{\nu}\cos{\Theta}d\Sigma d\Omega^{\prime}d\nu = I_{\nu}d\nu \frac{\cos{\Theta}\cos{\theta}d\Sigma d\Lambda}{r^{2}} (4),

where d\Omega^{\prime}=d\Lambda \cos{\Theta}/r^{2} is the solid angle subtended by the surface element d\Lambda at a point P and volume element dV=ld\Sigma \cos{\theta} is the volume that is intercepted in volume V. If we take this further, and integrate over all V and \Omega we arrive at

\displaystyle \frac{d\nu}{c}\int dV \int I_{\nu} d\Omega=\frac{V}{c}d\nu \int I_{\nu}d\Omega, (5)

where if the radiation travels some distance L in the volume, then we must multiply Eq.(5) by l/c, where c is the speed of light.

We now define the integrated energy density as being

U_{\nu}=\displaystyle \frac{1}{c}\int I_{\nu}d\Omega, (6.1)

while the average intensity is

J_{\nu}=\displaystyle \frac{1}{4\pi}\int I_{\nu}d\nu, (6.2)

and the relation between these two equations is

U_{\nu}=\frac{4\pi}{c}J_{\nu}. (6.3)

I will now introduce the radiative transfer equation. This equation is a balance between the amount of radiation absorbed and the radiation that is emitted. The equation is,

\frac{dI_{\nu}}{ds}=-\epsilon \rho I_{\nu}+h_{\nu}\rho, (7)

where if we divide by \epsilon \rho we get

-\frac{1}{\epsilon_{\nu}\rho}\frac{dI_{\nu}}{ds}=I_{\nu}+U_{\nu}(\theta, \phi), (8)

where U(\theta,\phi) represents the source function given by

U_{\nu}(\theta,\phi)=\displaystyle \frac{1}{4\pi}\int_{0}^{\pi}\int_{0}^{2\pi}p(\theta,\phi;\theta^{\prime},\phi^{\prime})I_{\nu}\sin{\theta^{\prime}}d\theta^{\prime}d\phi^{\prime}. (9)

The source function is typically the ratio between the absorption and emission coefficients. One of the terms in the source function is the phase function which varies according to the specific scattering geometry. In its most general form, we can represent the phase function as an expansion of Legendre polynomials:

p(\theta, \phi; \theta^{\prime},\phi^{\prime})=\displaystyle \sum_{j=0}^{\infty}\gamma_{j}P_{j}(\mu), (10)

where we have let \mu = \cos{\theta} (in keeping with our notation in previous posts).

In Part II, we will discuss a few simple cases of scattering and their corresponding phase functions, as well as obtaining the formal solution of the radiative transfer equation. (DISCLAIMER: While this solution will be consistent in a mathematical sense, it is not exactly an insightful solution since much of the more interesting and complex cases involve the solution of either integro-differential equations or pure integral equations (a possible new topic).)


Monte Carlo Simulations of Radiative Transfer: Project Overview

The goal of this project was to simulate radiative transfer in the circumbinary disk GG Tau by using statistical methods (i.e. Monte Carlo methods). To provide some context for the project, I will describe the astrophysical system that I considered, and I will describe the objectives of the project

Of the many stars in the night sky, a significant number of them, when you are able to resolve them, are in fact binary stars. These are star systems in which two stars orbit a common center of mass known as a barycenter. For this reason, we regard GG Tau to be a close binary system; a system in which two stars are separated by a significant distance comparable to the diameters of each star.

The system of GG Tau has four known components with a hypothesized fifth component. There is a disk of material encircling the entire system which we call a circumbinary disk. Additionally, there is also an inner disk of material that surrounds the primary and secondary components referred to as a circumstellar disk.

The goal of this project was to use Monte Carlo methods to simulate radiative transfer processes present in GG Tau. This is accomplished through the use of importance sampling two essential quantities over cumulative distribution function of the form

\displaystyle \xi=\int_{0}^{L} \mathcal{P}(x)dx\equiv \psi(x_{0}),  (1)

using FORTRAN code. I will discuss the mathematical background of probability and Monte Carlo methods (stochastic processes) in more detail in a later post. The general process of radiative transfer is relatively simple. Starting with an emitter,  radiation travels from its emission point, travels a certain distance, and at this point the radiation could either be scattered into an angle different from the incidence angle, or it can be absorbed by the material. In the context of this project, the emitter (or emitters) are the primary and secondary components of GG Tau. The radiation emitted is electromagnetic radiation, or light. This light will travel a distance L after which the light is either scattered or absorbed. The code uses Eq.(1) (cumulative distribution function) to sample optical depths and scattering angles to calculate the distance traveled L. I will reserve the exact algorithm for a future post as well. In the next post, I will discuss the basics of radiative transfer theory as presented in the text “Radiative Transfer” by S. Chandrasekhar, 1960.


The following are the sources  used in this project

Wood, K., Whitney, B., Bjorkman, J., and Wolff M., 2013. Introduction to Monte Carlo Radiative Transfer.

Whitney, B. A., 2011. Monte Carlo Radiative Transfer. Bull. Astr. Soc. India.

Kitamura, Y., Kawabe, R., Omodaka, T., Ishiguro, M., and Miyama, S., 1994. Rotating Protoplanetary Gas Disk in GG Tau. ASP Conference Series. Vol. 59.

Piètu, V., Gueth, F., Hily-Blant, P., Schuster, K.F., and Pety, J., 2011. High Resolution Imaging of the GG Tauri system at 267 GHz. Astronomy & Astrophysics. 528. A81.

Skrutskie, M.F., Snell, R.L., Strom, K.M., Strom, S.E., and Edwards, S., 1993. Detection of Circumstellar Gas Associated with GG Tauri. The Astrophysical Journal. 409:422-428.

Choudhuri, A.R., 2010. Astrophysics for Physicists. 2.

Carroll B.W., Ostlie, D.A., 2007. An Introduction to Modern Astrophysics. 9,10.

Chandrasekhar, S., 1960. Radiative Transfer. 1.

Schroeder, D., 2000. Introduction to Thermal Physics. 8.

Ross, S., 2010. Introduction to Probability Model. 2,4,11.


I obtained the FORTRAN code from K. Wood via the following link. (For the sake of completeness the URL is:


Solution to Laplace’s Equation

This post deals with the familiar (to the physics student) Laplace’s equation. I am solving this equation in the context of physics, instead of a pure mathematical perspective. This problem is considered most extensively in the context of electrostatics. This equation is usually considered in the spherical polar coordinate system. A lot of finer details are also considered in a mathematical physics course where the topic of spherical harmonics is discussed. Assuming spherical-polar coordinates, Laplace’s equation is

\displaystyle \frac{1}{r^{2}}\frac{\partial}{\partial r}\bigg\{r^{2}\frac{\partial \psi}{\partial r}\bigg\}+\frac{1}{r^{2}\sin{\theta}}\frac{\partial}{\partial \theta}\bigg\{\sin{\theta}\frac{\partial \psi}{\partial \theta}\bigg\}+\frac{1}{r^{2}\sin{\theta}}\frac{\partial^{2}\psi}{\partial \phi^{2}}=0, (1)

Suppose that the function \psi\rightarrow \psi(r,\theta,\phi). Furthermore, by the method of separation of variables we suppose that the solution is a product of eigenfunctions of the form R(r)Y(\theta,\phi). Hence, Laplace’s equation becomes

\displaystyle \frac{Y}{r^{2}}\frac{d}{dr}\bigg\{r^{2}\frac{d^{2}R}{dr^{2}}\bigg\}+\frac{R}{r^{2}\sin{\theta}}\frac{\partial}{\partial \theta}\bigg\{\sin{\theta}\frac{\partial Y}{\partial \theta}\bigg\}+\frac{R}{r^{2}\sin^{\theta}}\bigg\{\frac{\partial^{2}Y}{\partial \phi^{2}}\bigg\}=0. (2)

Furthermore, we can separate further the term Y(\theta, \phi) into \Theta(\theta)\Phi(\phi). Rewriting (2) and multiplying by r^{2}\sin^{2}{\theta}R^{-1}Y^{-1}, we get

\displaystyle \frac{1}{R}\frac{d}{dr}\bigg\{r^{2}\frac{dR}{dr}\bigg\}+\frac{\sin{\theta}}{\Theta}\frac{d}{d\theta}\bigg\{\sin{\theta}\frac{d\Theta}{d\theta}\bigg\}+\frac{1}{\Phi}\frac{d^{2}\Phi}{d\phi^{2}}=0.(3)

Bringing the radial and angular component to the other side of the equation and setting the azimuthal component equal to a separation constant -m^{2}, yielding

\displaystyle -\frac{1}{R}\frac{d}{dr}\bigg\{r^{2}\frac{dR}{dr}\bigg\}-\frac{\sin{\theta}}{\Theta}\frac{d}{d\theta}\bigg\{\sin{\theta}\frac{d\Theta}{d\theta}\bigg\}=\frac{1}{\Phi}\frac{d^{2}\Phi}{d\phi^{2}}=-m^{2}.

Solving the right-hand side of the equation we get

\displaystyle \Phi(\phi)=A\exp({+im\phi}). (4)

Now, we set the azimuthal component equal to m^{2} and carry out the derivative in the angular component to get the Associated Legendre equation:

\displaystyle \frac{d}{d\mu}\bigg\{(1-\mu^{2})\frac{d\Theta}{d\mu}\bigg\}+\bigg\{l(l+1)-\frac{m^{2}}{1-\mu^{2}}\bigg\}\Theta=0, (5)

where I have let \mu=\cos{\theta}. The solutions to this equation are known as the associated Legendre functions. However, instead of solving this difficult equation by brute force methods (i.e. power series method), we consider the case for which m^{2}=0. In this case, Eq.(5) simplifies to Legendre’s differential equation discussed previously. Instead of quoting the explicit form of the Legendre polynomials, we equivalently state the Rodrigues formula for these polynomials

\displaystyle P_{l}(\mu)=\frac{1}{2^{l}l!}\frac{d^{l}}{d\mu^{l}}(\mu^{2}-1)^{l}. (6)

However, the solutions to Eq.(5) are the associated Legendre functions, not polynomials. Therefore, we use the following to determine the associated Legendre functions from the Legendre polynomials:

\displaystyle {\Theta_{l}}^{m}(\mu) = (1-\mu^{2})^{\frac{|m|}{2}}\frac{d^{|m|}}{d\mu^{|m|}}P_{l}(\mu). (7)

For real solutions |m|\leq l and for complex solutions -l \leq m \leq l, thus we can write the solutions as series whose indices run from 0 to l for the real-values and from -l to l for the complex-valued solutions.

Now we turn to the radial part which upon substitution of -l(l+1) in place of the angular component, we get

\displaystyle \frac{d}{dr}\bigg\{r^{2}\frac{dR}{dr}\bigg\}-l(l+1)R=0,

and if we evaluate the derivatives in the first term we get an Euler equation of the form

\displaystyle r^{2}\frac{d^{2}R}{dr^{2}}+2r\frac{dR}{dr}-l(l+1)R=0. (8)

Let us assume that the solution can be represented as

\displaystyle R(r)=\sum_{j=0}^{\infty}a_{j}r^{j}.

Taking the necessary derivatives and substituting into Eq.(8) gives

\displaystyle r^{2}\sum_{j=0}^{\infty}j(j-1)a_{j}r^{j-2}+\sum_{j=0}^{\infty}2rja_{j}r^{j-1}-l(l+1)\sum_{j=0}^{\infty}a_{j}r^{j}=0. (9)

Simplifying the powers of r we find that

\displaystyle \sum_{j=0}^{\infty}[j(j-1)+2j-l(l+1)]a_{j}r^{j}=0.

Now, since \displaystyle \sum_{j=0}^{\infty}a_{j}r^{j}\neq 0, this means that

\displaystyle j(j-1)+2j-l(l+1)=0. (10)

We can simplify this further to get

\displaystyle j^{2}-l^{2}+j-l=0.

Factoring out (j-l), we arrive at

\displaystyle (j-l)((j+l)+1)=0 \implies j=l ; j=-l-1.

Since this must be true for all values of l, the solution therefore becomes

\displaystyle R(r)= \sum_{l=0}^{\infty}Br^{l}+Cr^{-l-1}. (11)

Thus, we can write the solution to Laplace’s equation as

\displaystyle \psi(r,\theta,\phi) = \sum_{l=0}^{\infty} \sum_{m=0}^{\infty}[Br^{l}+Cr^{-l-1}]\Theta_{l}^{m}(\mu)A\exp({im\phi}),  (12)

for real solutions, and

\displaystyle \psi(r,\theta,\phi)=\sum_{l=0}^{\infty}\sum_{m=-l}^{l}[Br^{l}+Cr^{-l-1}]\Theta_{l}^{m}(\mu)A\exp({im\phi}), (13)

for complex solutions.

Solution to the three-dimensional Heat Equation

After taking a topics course in applied mathematics (partial differential equations), I found that there were equations that I should solve since I would later see those equations embedded into other larger-scale equations. This equation was Laplace’s equation (future post). Once I solved this equation, I realized that it becomes a differential operator when acted upon a function of at least two variables. Thus, I could solve equations such as the Schrödinger equation using a three-dimensional laplacian in spherical-polar coordinates (another future post) and the three-dimensional heat equation. I will be solving the latter.


The heat equation initial-boundary-value-problem is therefore

\frac{\partial u}{\partial t}=\frac{1}{c\rho}\nabla^{2}u+Q(x,y,z), (1.1)

subjected to the boundary conditions

u(0,0,0,t)=0, (1.2)

and the initial condition

u(x,y,z,0)= \xi(x,y,z). (1.8)

Now, this solution is not specific to a single thermodynamic system, but rather it is a more general solution in a mathematical context. However, I will be appropriating certain concepts from physics for reasons that are well understood (i.e. that time exists on the interval I=0\leq t < +\infty ). It is nonphysical or nonsensical to speak of negative time.

To start, consider any rectangular prism in which heat flows through the volume, from the origin to the point (X,Y,Z).At a time t=0, the overall heat of the volume can be regarded to be a function \xi(x,y,z). After a time period \delta t=t has passed, the heat will have traversed to the point (X,Y,Z) from the origin (think of the heat traveling along the diagonal of a cube). The goal is to find the heat as a function of the three spatial components and a single time component.  Furthermore, the boundary conditions maintain that at the origin and the final point,  the heat vanishes. In other words, these points act as heat sinks. (A sink is a point where energy can leave the system.)

To simplify the notation, we define \textbf{r}=x\hat{i}+y\hat{j}+z\hat{k}. Thus the initial-boundary-value-problem becomes

\frac{\partial u}{\partial t}=\frac{1}{c\rho}\frac{\partial^{2}u}{\partial \textbf{r}^{2}}, (2.1)

where u\rightarrow u(\textbf{r},t). Also, this definition also reduces the three-dimensional laplacian to a second-order partial derivative of u. The boundary and initial conditions are then


u(R,t)=0, (2.3)

u(\textbf{r},0)=\xi(\textbf{r}). (2.2)

Now, we assume that the solution is a product of eigenfunctions of the form

u(\textbf{r},t)=\alpha(\textbf{r})\Gamma(t). (3)

Taking the respective derivatives and dividing by the assumed form of the solution, we get

\frac{\Gamma^{\prime}(t)}{\Gamma(t)}=\frac{1}{c\rho}\frac{\alpha^{\prime\prime}(\textbf{r})}{\alpha(\textbf{r})}+Q(\textbf{r}). (4)

Now, Eq.(4) can be equal to three different values, the first of which is zero, but this solution does not help in any way,  nor is it physically significant since it produces a trivial solution. The second is \lambda^{2}>0, \lambda \neq 0, so we can apply to the time dependence equation which then becomes

\frac{\Gamma^{\prime}(t)}{\Gamma(t)}=\lambda^{2}, (5.1)

whose solution is

\Gamma(t)=\Gamma_{0}\exp({\lambda^{2}t}). (5.2)

The third case where \lambda^{2}<0, \lambda \neq 0 allows us to write the spatial equation upon rearrangement as

\alpha^{\prime\prime}(\textbf{r})+c\rho(\lambda^{2}+Q)\alpha(\textbf{r})=0, (6.1)

whose solution is

\alpha(\textbf{r})=c_{1}\cos({\sqrt[]{c\rho(\lambda^{2}+Q)}\textbf{r}})+c_{3}\sin({\sqrt[]{c\rho(\lambda^{2}+Q)}\textbf{r}}), (6.2)

where c_{3}=-c_{2}. Next we apply the boundary conditions. Let \textbf{r}=0 in \alpha(\textbf{r}):

\alpha(0)=c_{1}+0=0 \implies c_{1}=0 \implies \alpha(\textbf{r})=C\sin(\sqrt[]{c\rho(\lambda^{2}+Q)} \textbf{r}). (7.1)

and let \textbf{r}=R in (7.1) to get

\alpha(R)\implies \sin({\sqrt[]{c\rho(\lambda^{2}+Q)}R})=0\implies \sqrt[]{c\rho(\lambda^{2}+Q)}R^{2}=(n\pi)^{2}.

Solving for \lambda^{2} above gives

\lambda^{2}=\frac{1}{c\rho}\bigg\{\frac{n\pi}{R}\bigg\}^{2}-Q. (8)

Substituting into  Eq.(7.1) and simplifying gives

\alpha(\textbf{r})=C\sin\bigg\{\frac{n\pi}{c\rho}\bigg\}^{2}\bigg\{\frac{\textbf{r}}{R}\bigg\}. (9)

To find as many solutions as possible we construct a superposition of solutions of the form

u(\textbf{r},t)=u_{l}(\textbf{r},t)=\sum_{l=0}^{\infty}\alpha_{l}(\textbf{r})\Gamma(t). (10)

Rewriting \alpha(\textbf{r}) as \alpha_{l}(\textbf{r}) and using the solution for the time dependence, and also let the coefficients form a product equivalent to the indexed coefficient A_{l}, we arrive at the solution for the heat equation:

u(\textbf{r},t)=\sum_{l=0}^{\infty}A_{l}\sin\bigg\{\frac{l\pi}{c\rho}\bigg\}^{2}\bigg\{\frac{\textbf{r}}{R}\bigg\}\exp(\lambda^{2}t). (11)

Now suppose that t=0 in Eq.(11). In this case we get the initial heat distribution given by \xi(\textbf{r}):

\xi(\textbf{r})=\sum_{l=0}^{\infty} A_{l}\sin\bigg\{\frac{l\pi}{c\rho}\bigg\}^{2}\bigg\{\frac{\textbf{r}}{R}\bigg\}. (12)


Electron Scattering of a Step Potential

The following post was initially one of my assignments for an independent study in modern physics in my penultimate year as an undergraduate. While studying this problem the text that I used to verify my answer was:

R. Eisberg and R. Resnick, Quantum Physics of Atoms, Molecules, Solids, Nuclei, and Particles. John Wiley & Sons. 1985. 6. 

One of the hallmarks of quantum theory is the Schrödinger equation. There are two forms: the time-dependent and the time-independent equation. The former can be turned into the latter by way of assuming stationary states in which case there is no time evolution (i.e. the time derivative \partial_{t} \Psi(x,t)=0. \partial_{t}\Psi(x,t) denotes the derivative of the wavefunction \Psi(x,t) with respect to time.)

The wavefunction describes the state of a system, and it is found by solving the Schrödinger equation. In this post, I’ll be considering a step potential in which

V(x) = V_{0},  x < 0,  (1.1)

V(x) = 0, x > 0.  (1.2)

After solving for the wavefunction, I will calculate the reflection and transmission coefficients for the case where the energy of the electron is less than that of the step potential E < V_{0}.

First, we assume that we are dealing with stationary states, by doing so we assume that there is no time-dependence. The wavefunction \Psi(x,t) becomes an eigenfunction (eigen– is German for “characteristic” e.g. characteristic function, characteristic value (eigenvalue), and so on), \psi(x). There are requirements for this eigenfunction in the context of quantum mechanics: eigenfunction \psi(x) and its first order spatial derivative \frac{d\psi(x)}{dx} must be finite, single-valued, and continuous. Using the wavefunction, we write the time-independent Schrödinger equation as

-\frac{\hbar^{2}}{2m}\frac{d^{2}\psi(x)}{dx^{2}}+V(x)\psi(x) = E\psi(x), (2)

where \hbar is the reduced Planck’s constant \hbar \equiv h(2\pi)^{-1}, m is the mass of the particle, V(x) represents the potential, and E is the energy.

Now, in electron scattering there are two cases regarding a step potential: the case for which E < V_{0} and the case for which E>V_{0} The focus of this post is the former case. In such a case, the potential is given mathematically by Eqs. (1.1) and (1.2) and can be depicted by the image below:


Image Credit:

The first part of this problem is to solve for \psi(x) when x<0. Then Eq.(2) becomes

\frac{d^{2}\psi_{I}(x)}{dx^{2}}=-\kappa_{I}^{2}\psi_{I}(x), (3)

where \kappa_{I}^{2}\equiv \frac{-2mE}{\hbar^{2}}. Eq.(3) is a second order linear homogeneous ordinary differential equation with constant coefficients and can be solved using a characteristic equation. We assume that the solution is of the general form


which upon taking the first and second order spatial derivatives and substituting into Eq.(3) yields


We can factor out the exponential and recalling that the graph of e^{x} \neq 0 (except in the limit x\rightarrow -\infty)we can then conclude that


Hence, r = \pm i\kappa_{I}. Therefore, we can write the solution Schrödinger equation in the region x < 0 as

\psi_{I}(x) = A\exp{+i\kappa_{I}x}+B\exp{-i\kappa_{I}x}. (4)

This is the eigenfunction for the first region. Coefficients A and B will be determined later.

Now we can use the same logic for the Schrödinger equation in the region x > 0 where V(x)=V_{0}:

\frac{d^{2}\psi_{II}(x)}{dx^{2}}= -\kappa_{II}^{2}\psi_{II}(x), (5)

where \kappa_{II}\equiv \frac{\sqrt[]{2m(E-V_{0})}}{\hbar}. The general solution for this region is

\psi_{II}(x) = C\exp{+i\kappa_{II}x}+D\exp{-i\kappa_{II}x}. (6)

Now the next step is taken using two different approaches: the first using a mathematical argument and the other from a conceptual interpretation of the problem at hand.  The former is this: Suppose we let x\rightarrow \infty. What results is that the first term on the right hand side of Eq.(6) diverges (i.e. becomes arbitrarily large). The second term on the right hand side ends up going to zero (it converges). Therefore, D remains finite, hence D\neq 0. However, in order to suppress the divergence of the first term, we let C=0. Thus we arrive at the eigenfunction for x>0

\psi_{II}(x) = D\exp{-i\kappa_{II}x}. (7)

The latter argument is this: In the region x < 0, the first term of the solution represents a wave propagating in the positive x-direction, while the second denotes a wave traveling in the negative x-direction. Similarly, in the region where x > 0, the first term corresponds to a wave traveling in the positive x-direction. However, this cannot be because the energy of the wave is not sufficient enough to overcome the potential. Therefore, the only term that is relevant here for this region is the second term, for it is a wave propagating in the negative x-direction.

We now determine the coefficients A and B. Recall that the eigenfunction must satisfy the following continuity requirements

\psi_{I}(x)=\psi_{II}(x), (8.1)

\psi^{\prime}_{I}(x)=\psi^{\prime}_{II}(x), (8.2)

evaluated when x=0. Doing so in Eqs. (4) and (7), and equating them we arrive at the continuity condition for \psi(x)

A+B = D. (9)

Taking the derivative of \psi_{I}(x) and \psi_{II}(x) and evaluating them for when x=0 we arrive at

A-B = \frac{i\kappa_{II}}{i\kappa_{I}}D. (10)

If we add Eqs. (9) and (10) we get the value for the coefficient A in terms of the arbitrary constant D

A =\frac{D}{2}(1+\frac{i\kappa_{II}}{i\kappa_{I}}. (11)

Conversely, if we subtract (9) and (10) we get the value for B in terms of D

B = \frac{D}{2}(1-\frac{i\kappa_{II}}{i\kappa_{I}}. (12)


Now the reflection coefficient defined as

R = \frac{B^{*}B}{A^{*}A}, (13)

which is the probability that an incident electron (wave) will be reflected. On the other hand, the transmission coefficient is the probability that an electron will be transmitted through the potential (e.g. barrier potential).  These two also must satisfy the relation

R+T =1. (14)

This means that the probability that the electron will be reflected or transmitted is 100%. Therefore, to evaluate R (the reason why I don’t calculate T will become apparent shortly), we take the complex conjugate of A and B and using them in Eq.(13) we get

R = \frac{\frac{D}{2}(1+\frac{\kappa_{II}}{\kappa_{I}})\frac{D}{2}(1-\frac{\kappa_{II}}{\kappa_{I}})} {\frac{D}{2}(1-\frac{\kappa_{II}}{\kappa_{I}}) \frac{D}{2}(1+\frac{\kappa_{II}}{\kappa_{I}})} = 1. (15)

What this conceptually means is that the probability that the electron is reflected is 100%. This implies that it is impossible for an electron to be transmitted through the potential for this system.


Basic Equations of Ideal One-Fluid Magnetohydrodynamics (Part II)

Continuing with the derivation of the ideal one-fluid MHD equations, the next equation governs the motion of a parcel of fluid (in this case plasma). This momentum equation stems from the Navier-Stokes’ equation. The derivation of this equation will be reserved for a future post. However, the solution of this equation will not be attempted. (Incidentally, the proof of the existence and uniqueness of solutions to the Navier-Stokes’ equation is one of the Millennium problems described by the Clay Institute of Mathematics.)

The purpose of this post is to derive the momentum equation from the Navier-Stokes’ equation.


The Navier-Stokes’ equation has the form

\frac{\partial \textbf{v}}{\partial t} + (\nabla \cdot \textbf{v})\textbf{v} = \textbf{F} - \frac{1}{\rho}\nabla P+\nu \nabla^{2}\textbf{v}, (1)

where \textbf{F} represents a source of external forces, \textbf{v} is the velocity field, \nabla P is the pressure gradient, \rho is the material density, \nu is kinematic viscosity, and \nabla^{2}\textbf{v} is the laplacian of the velocity field. More specifically, it is a consequence of the viscous stress tensor whose components can cause the parcel of fluid to experience stresses and strains.

Defining the magnetic force per unit volume as

\textbf{F}=\frac{\textbf{J}\times \textbf{B}}{\rho}, (2)

where we recall that \textbf{J} is the current density defined by Ohm’s law in a previous post, and also recall from Basic Equations of Ideal One-Fluid Magnetohydrodynamics (Part I) the equation

\nabla \times \textbf{B}=\mu_{0}\textbf{J}, (3)

if we solve for the current density \textbf{J}, we get

\textbf{J}=\frac{1}{\mu_{0}\rho}[(\nabla \times \textbf{B})\times \textbf{B}], (4)

where \mu_{0} is the permeability of free space. Now we invoke the vector identity

[(\nabla \times \textbf{B})\times \textbf{B}]=(\nabla \cdot \textbf{B})\textbf{B}-\nabla \bigg\{\frac{B^{2}}{2}\bigg\}. (5)

At this point, we assume that we are dealing with laminar flows in which case the term \nu=0. The Navier-Stokes’ equation becomes the Euler equation at this point. (Despite great understanding of classical mechanics, one phenomena for which we cannot account is turbulence and sources of friction, so this assumption is made out of necessity as well as simplicity. For processes in which turbulence cannot be neglected the best we can do in this regard is to parameterize turbulence in numerical models.) Using the vector identity as well as making use of our assumption of laminar flows, ideal one-fluid momentum equation is

\frac{\partial \textbf{v}}{\partial t}+ (\nabla \cdot \textbf{v})\textbf{v}=-\frac{1}{\rho}\nabla \bigg\{P + \frac{B^{2}}{2\mu_{0}}\bigg\}+\frac{(\nabla \cdot \textbf{B})\textbf{B}}{\mu_{0}\rho}, (6)

where the additive term to the pressure \frac{B^{2}}{2\mu_{0}} is the magnetic pressure exerted on magnetic field lines and the additional term \frac{(\nabla \cdot \textbf{B})\textbf{B}}{\mu_{0}\rho} is the magnetic tension acting along the magnetic field lines.