astrophytheory

Monte Carlo Simulations of Radiative Transfer: Basics of Radiative Transfer Theory (Part I)

SOURCE FOR CONTENT: Chandrasekhar, S., 1960. Radiative Transfer. 1.

In this post, I will be discussing the basics of radiative transfer theory necessary to understand the methods used in this project. I will start with some definitions, then I will look at the radiative transfer equation and consider two simple cases of scattering.

The first definition we require is the specific intensity, which is the amount of energy associated with a specific frequency $dE_{\nu}$ passing through an area $dA$ constrained to a solid angle $d\Omega$ in a time $dt$ . We may write this mathematically as

$dE_{\nu}=I_{\nu}\cos{\theta}d\nu d\Sigma d\Omega dt. (1)$

We must also consider the net flux given by

$\displaystyle d\nu d\Sigma dt \int I_{\nu}\cos{\theta}d\Omega, (2)$

where if we integrate over all solid angles $\Omega$ we get

$\pi F_{\nu}=\displaystyle \int I_{\nu}\cos{\theta}d\Omega. (3)$

Let $d\Lambda$ be an element of the surface $\Lambda$ in a volume $V$ through which radiation passes. Further let $\Theta$ and $\theta$ denote the angles which form normals with respect to elements $d\Lambda$ and $d\Sigma$ . These surfaces are joined by these normals and hence we have the surface across which energy flows includes the elements $d\Lambda$ and $d\Sigma$ , given by the following:

$I_{\nu}\cos{\Theta}d\Sigma d\Omega^{\prime}d\nu = I_{\nu}d\nu \frac{\cos{\Theta}\cos{\theta}d\Sigma d\Lambda}{r^{2}} (4),$

where $d\Omega^{\prime}=d\Lambda \cos{\Theta}/r^{2}$ is the solid angle subtended by the surface element $d\Lambda$ at a point $P$ and volume element $dV=ld\Sigma \cos{\theta}$ is the volume that is intercepted in volume $V$ . If we take this further, and integrate over all $V$ and $\Omega$ we arrive at

$\displaystyle \frac{d\nu}{c}\int dV \int I_{\nu} d\Omega=\frac{V}{c}d\nu \int I_{\nu}d\Omega, (5)$

where if the radiation travels some distance $L$ in the volume, then we must multiply Eq.(5) by $l/c$ , where $c$ is the speed of light.

We now define the integrated energy density as being

$U_{\nu}=\displaystyle \frac{1}{c}\int I_{\nu}d\Omega, (6.1)$

while the average intensity is

$J_{\nu}=\displaystyle \frac{1}{4\pi}\int I_{\nu}d\nu, (6.2)$

and the relation between these two equations is

$U_{\nu}=\frac{4\pi}{c}J_{\nu}. (6.3)$

I will now introduce the radiative transfer equation. This equation is a balance between the amount of radiation absorbed and the radiation that is emitted. The equation is,

$\frac{dI_{\nu}}{ds}=-\epsilon \rho I_{\nu}+h_{\nu}\rho, (7)$

where if we divide by $\epsilon \rho$ we get

$-\frac{1}{\epsilon_{\nu}\rho}\frac{dI_{\nu}}{ds}=I_{\nu}+U_{\nu}(\theta, \phi), (8)$

where $U(\theta,\phi)$ represents the source function given by

$U_{\nu}(\theta,\phi)=\displaystyle \frac{1}{4\pi}\int_{0}^{\pi}\int_{0}^{2\pi}p(\theta,\phi;\theta^{\prime},\phi^{\prime})I_{\nu}\sin{\theta^{\prime}}d\theta^{\prime}d\phi^{\prime}. (9)$

The source function is typically the ratio between the absorption and emission coefficients. One of the terms in the source function is the phase function which varies according to the specific scattering geometry. In its most general form, we can represent the phase function as an expansion of Legendre polynomials:

$p(\theta, \phi; \theta^{\prime},\phi^{\prime})=\displaystyle \sum_{j=0}^{\infty}\gamma_{j}P_{j}(\mu), (10)$

where we have let $\mu = \cos{\theta}$ (in keeping with our notation in previous posts).

In Part II, we will discuss a few simple cases of scattering and their corresponding phase functions, as well as obtaining the formal solution of the radiative transfer equation. (DISCLAIMER: While this solution will be consistent in a mathematical sense, it is not exactly an insightful solution since much of the more interesting and complex cases involve the solution of either integro-differential equations or pure integral equations (a possible new topic).)

Simple Harmonic Oscillators (SHOs) (Part I)

We all experience or see this happening in our everyday experience: objects moving back and forth. In physics, these objects are called simple harmonic oscillators. While I was taking my undergraduate physics course, one of my favorite topics was SHOs because of the way the mathematics and physics work in tandem to explain something we see everyday. The purpose of this post is to engage followers to get them to think about this phenomenon in a more critical manner.

Every object has a position at which these objects tend to remain at rest, and if they are subjected to some perturbation, that object will oscillate about this equilibrium point until they resume their state of rest. If we pull or push an object with an applied force $F_{A}$ we find that this force is proportional to Hooke’s law of elasticity, that is, $F_{A}=-k\textbf{r}$ . If we consider other forces we also find that there exists a force balance between the restoring force (our applied force), a resistance force, and a forcing function, which we assume to have the form

$F=F_{forcing}+F_{A}-F_{R}= -k\textbf{r}-\beta \dot{\textbf{r}}; (1)$

note that we are assuming that the resistance force is proportional to the speed of an object. Suppose further that we are inducing these oscillations in a periodic manner by given by

$F_{forcing}=F_{0}\cos{\omega t}. (2)$

Now, to be more precise, we really should define the position vector. So, $\textbf{r}=x\hat{i}+y\hat{j}+z\hat{k}$ . Therefore, we actually have a system of three second order linear non-homogeneous ordinary differential equations in three variables:

$m\ddot{ x}+\beta \dot{x}+kx=F_{0}\cos{\omega t}, (3.1)$

$m\ddot{y}+\beta \dot{y}+ky=F_{0}\cos{\omega t}, (3.2)$

$m\ddot{z}+\beta \dot{z}+kz=F_{0}\cos{\omega t}. (3.3)$

(QUICK NOTE: In the above equations, I am using the Newtonian notation for derivatives, only for convenience.) I will just make some simplifications. I will divide both sides by the mass, and I will define the following parameters: $\gamma \equiv \beta/m$ , $\omega_{0} \equiv k/m$ , and $\alpha \equiv F_{0}/m$ . Furthermore, I am only going to consider the $y$ component of this system. Thus, the equation that we seek to solve is

$\ddot{y}+\gamma \dot{y}+\omega_{0}y=\alpha\cos{\omega t}. (4)$

Now, in order to solve this non-homogeneous equation, we use the method of undetermined coefficients. By this we mean to say that the general solution to the non-homogeneous equation is of the form

$y = Ay_{1}(t)+By_{2}(t)+Y(t), (5)$

where $Y(t)$ is the particular solution to the non-homogeneous equation and the other two terms are the fundamental solutions of the homogeneous equation:

$\ddot{y}_{h}+\gamma \dot{y}_{h}+\omega_{0} y_{h} = 0. (6)$

Let $y_{h}(t)=D\exp{(\lambda t)}$ . Taking the first and second time derivatives, we get $\dot{y}_{h}(t)=\lambda D\exp{(\lambda t)}$ and $\ddot{y}_{h}(t)=\lambda^{2}D\exp{(\lambda t)}$ . Therefore, Eq. (6) becomes, after factoring out the exponential term,

$D\exp{(\lambda t)}[\lambda^{2}+\gamma \lambda +\omega_{0}]=0. (7)$

Since $D\exp{(\lambda t)}\neq 0$ , it follows that

$\lambda^{2}+\gamma \lambda +\omega_{0}=0. (8)$

This is just a disguised form of a quadratic equation whose solution is obtained by the quadratic formula:

$\lambda =\frac{-\gamma \pm \sqrt[]{\gamma^{2}-4\omega_{0}}}{2}. (9)$

Part II of this post will discuss the three distinct cases for which the discriminant $\sqrt[]{\gamma^{2}-4\omega_{0}}$ is greater than, equal to , or less than 0, and the consequent solutions. I will also obtain the solution to the non-homogeneous equation in that post as well.

Monte Carlo Simulations of Radiative Transfer: Project Overview

The goal of this project was to simulate radiative transfer in the circumbinary disk GG Tau by using statistical methods (i.e. Monte Carlo methods). To provide some context for the project, I will describe the astrophysical system that I considered, and I will describe the objectives of the project

Of the many stars in the night sky, a significant number of them, when you are able to resolve them, are in fact binary stars. These are star systems in which two stars orbit a common center of mass known as a barycenter. For this reason, we regard GG Tau to be a close binary system; a system in which two stars are separated by a significant distance comparable to the diameters of each star.

The system of GG Tau has four known components with a hypothesized fifth component. There is a disk of material encircling the entire system which we call a circumbinary disk. Additionally, there is also an inner disk of material that surrounds the primary and secondary components referred to as a circumstellar disk.

The goal of this project was to use Monte Carlo methods to simulate radiative transfer processes present in GG Tau. This is accomplished through the use of importance sampling two essential quantities over cumulative distribution function of the form

$\displaystyle \xi=\int_{0}^{L} \mathcal{P}(x)dx\equiv \psi(x_{0}), (1)$

using FORTRAN code. I will discuss the mathematical background of probability and Monte Carlo methods (stochastic processes) in more detail in a later post. The general process of radiative transfer is relatively simple. Starting with an emitter, radiation travels from its emission point, travels a certain distance, and at this point the radiation could either be scattered into an angle different from the incidence angle, or it can be absorbed by the material. In the context of this project, the emitter (or emitters) are the primary and secondary components of GG Tau. The radiation emitted is electromagnetic radiation, or light. This light will travel a distance $L$ after which the light is either scattered or absorbed. The code uses Eq.(1) (cumulative distribution function) to sample optical depths and scattering angles to calculate the distance traveled $L$ . I will reserve the exact algorithm for a future post as well. In the next post, I will discuss the basics of radiative transfer theory as presented in the text “Radiative Transfer” by S. Chandrasekhar, 1960.

The following are the sources used in this project

Wood, K., Whitney, B., Bjorkman, J., and Wolff M., 2013. Introduction to Monte Carlo Radiative Transfer.

Whitney, B. A., 2011. Monte Carlo Radiative Transfer. Bull. Astr. Soc. India.

Kitamura, Y., Kawabe, R., Omodaka, T., Ishiguro, M., and Miyama, S., 1994. Rotating Protoplanetary Gas Disk in GG Tau. ASP Conference Series. Vol. 59.

Piètu, V., Gueth, F., Hily-Blant, P., Schuster, K.F., and Pety, J., 2011. High Resolution Imaging of the GG Tauri system at 267 GHz. Astronomy & Astrophysics. 528. A81.

Skrutskie, M.F., Snell, R.L., Strom, K.M., Strom, S.E., and Edwards, S., 1993. Detection of Circumstellar Gas Associated with GG Tauri. The Astrophysical Journal. 409:422-428.

Choudhuri, A.R., 2010. Astrophysics for Physicists. 2.

Carroll B.W., Ostlie, D.A., 2007. An Introduction to Modern Astrophysics. 9,10.

Chandrasekhar, S., 1960. Radiative Transfer. 1.

Schroeder, D., 2000. Introduction to Thermal Physics. 8.

Ross, S., 2010. Introduction to Probability Model. 2,4,11.

I obtained the FORTRAN code from K. Wood via the following link. (For the sake of completeness the URL is: www-star.st-and.ac.uk/~kw25/research/montecarlo/montecarlo.html)

Monte Carlo Simulations of Radiative Transfer: Overview

FEATURED IMAGE: This is one of the plots that I will be explaining in Post IV. This plot represents the radiation flux emanating from the primary and secondary components of the system considered.

I realize I haven’t posted anything new recently. This is because I have been working on finishing the research project that I worked on during my final semester as an undergraduate.

However, I now intend to share the project on this blog. I will be starting a series of posts in which I will attempt to explain my project and the results I have obtained and open up a dialogue as to any improvements and/or questions that you may have.

Here is the basic overview of the series and projected content:

Post I: Project Overview:

In this first post, I will introduce the project. I will describe the goals of the project and, in particular, I will describe the nature of the system considered. I will also give a list of references used and mention any and all acknowledgements.

Posts II & III: Radiative Transfer Theory

These posts will most likely be the bulk of the concepts used in the project. I will be defining the quantities used in the research, developing the radiative transfer equation, formulating the problems of radiative transfer, and using the more elementary methods to solving the radiative transfer equation.

Post IV: Monte Carlo Simulations

This post will largely be concerned with the mathematics and physics of Monte Carlo simulations both in the context of probability theory and Ising models. I will describe both and explain how it applies to Monte Carlo simulations of Radiative Transfer.

Post V: Results

I will discuss the results that the simulations produced and explain the data displayed in the posts. I will then discuss the conclusions that I have made and I will also explain the shortcomings of the methods used to produce the results I obtained. I will then open it up to questions or suggestions.

Basic Equations of Ideal One-Fluid Magnetohydrodynamics (Part III & IV)

FEATURED IMAGE CREDIT: U.R. Christensen from the Nature article: “Earth Science: A Sheet-Metal Dynamo”

The image shows the overall distortion of magnetic field lines inside the core and its’ effect on the magnetic field outside.

This post shall continue to derive the principal equations of ideal one-fluid magnetohydrodynamics. Here I shall derive the continuity equation and the vorticity equation. Furthermore, I shall show that the Boussinesq approximation results in a zero-valued divergence. I have consulted the following works while researching this topic:

Davidson, P.A., 2001. An Introduction to Magnetohydrodynamics. 3-4,6.

Murphy, N., 2016. Ideal Magnetohydrodynamics. Lecture presentation. Harvard-Smithsonian Center for Astrophysics.

Consider a fluid element through which fluid passes. The mass of this element can be represented as the volume integral of the material density $\rho$ :

$\displaystyle m=\iiint \rho dV. (1)$

Take the first order time derivative of the mass, we arrive at

$\displaystyle \dot{m}=\frac{d}{dt}\iiint\rho dV = \iiint \frac{\partial \rho}{\partial t}dV. (2).$

We now make a slight notation change; let the triple integral be represented as

$\displaystyle \dot{m} = \int_{V}\frac{\partial \rho}{\partial t}dV. (3)$

Now, the mass flux through a surface element $dV = \hat{n}dV$ is $\rho \textbf{v} \cdot dV$ . Thus, the integral of the mass flux is

$\displaystyle \int_{V}(\rho\cdot \textbf{v})dV = \dot{m}. (4)$

We may write this as

$-\displaystyle \oint_{\partial V}(\rho\cdot \textbf{v})dV= \int_{V}\frac{\partial \rho}{\partial t}dV. (5)$

Now, we invoke Gauss’s theorem (also known as the Divergence Theorem) of the form

$\displaystyle \int_{V}(\nabla \cdot \rho \textbf{v})dV=-\oint_{\partial V}(\rho \cdot \textbf{v})dV, (6)$

we get

$\displaystyle \int_{V}\bigg\{\frac{\partial \rho}{\partial t}+(\nabla \cdot \rho \textbf{v})\bigg\}dV=0. (7)$

Since the integral cannot be zero, the integrand must be. Therefore, we get the continuity equation:

$\displaystyle \frac{\partial \rho}{\partial t}+ (\nabla \cdot \rho \textbf{v})=0. (8)$

Recall the following equation from a previous post (specifically Eq. (6) of that post)

$\displaystyle \frac{\partial \textbf{v}}{\partial t}+(\nabla\cdot \textbf{v})\textbf{v}=-\frac{1}{\rho}\nabla\bigg\{P+\frac{B^{2}}{2\mu_{0}}\bigg\}+\frac{(\nabla\cdot \textbf{B})\textbf{B}}{\mu_{0}\rho}, (8)$

Now we define the concept of vorticity. Conceptually, this refers to the rotation of the fluid within its velocity field. Mathematically, we define the vorticity $\Omega$ to be the curl of the fluid velocity:

$\Omega \equiv \nabla \times \textbf{v}. (9)$

Now recall, the vector identity

$\displaystyle \nabla \frac{\textbf{v}^{2}}{2}=(\nabla \cdot \textbf{v})\textbf{v}+\textbf{v}\times \Omega, (10.1)$

which upon rearrangement is

$(\nabla \cdot \textbf{v})\textbf{v}=\nabla \frac{\textbf{v}^{2}}{2}-\textbf{v}\times\Omega. (10.2)$

Using Eq.(10.2) with Eq.(8) we get

$\displaystyle \frac{\partial \textbf{v}}{\partial t}=\textbf{v} \times \Omega -\nabla \bigg\{\frac{P}{\rho}-\frac{\textbf{v}^{2}}{2} \bigg\}+\nu \nabla^{2} \textbf{v}. (11)$

Recall our definition of vorticity. Upon taking the curl of Eq.(11) we arrive at a variation of the induction equation (see post)

$\displaystyle \frac{\partial \Omega}{\partial t}=\nabla \times (\textbf{v} \times \Omega)+\nu \nabla^{2}\Omega. (12.1)$

Next, we invoke another vector identity

$\displaystyle \nabla \times (\textbf{v} \times \Omega) = (\Omega \cdot \nabla)\textbf{v}-(\textbf{v} \cdot \nabla)\Omega (12.2)$

Using Eq. (12.2) in Eq. (12.1) yields the vorticity equation

$\displaystyle \frac{\partial \Omega}{\partial t}+(\nabla \cdot \textbf{v})\Omega = (\nabla \cdot \Omega)\textbf{v}+\nu \nabla^{2}\Omega. (13)$

Returning to the continuity equation:

$\displaystyle \frac{\partial \rho}{\partial t}+(\nabla \cdot \rho \textbf{v})=0.$

I will now show that the flow does not diverge. In other words, there are no sources in the fluid velocity field. The Boussinesq approximation’s main assertion is that of isopycnal flow (i.e. flow of constant density). Therefore, let $\rho = \rho_{0}>0$ . Substitution into the continuity equation yields the following

$\displaystyle 0+ \rho_{0}(\nabla \cdot \textbf{v})=0. (14)$

The temporal derivative is a derivative of a non-zero constant which is itself zero. This just leaves $\rho_{0}(\nabla \cdot \textbf{v})=0$ . Now, since $\rho_{0}\neq 0$ , this then means that

$\displaystyle (\nabla \cdot \textbf{v})=0. (15)$

Hence, the divergence of the fluid’s velocity field is zero.

Solution to Laplace’s Equation

This post deals with the familiar (to the physics student) Laplace’s equation. I am solving this equation in the context of physics, instead of a pure mathematical perspective. This problem is considered most extensively in the context of electrostatics. This equation is usually considered in the spherical polar coordinate system. A lot of finer details are also considered in a mathematical physics course where the topic of spherical harmonics is discussed. Assuming spherical-polar coordinates, Laplace’s equation is

$\displaystyle \frac{1}{r^{2}}\frac{\partial}{\partial r}\bigg\{r^{2}\frac{\partial \psi}{\partial r}\bigg\}+\frac{1}{r^{2}\sin{\theta}}\frac{\partial}{\partial \theta}\bigg\{\sin{\theta}\frac{\partial \psi}{\partial \theta}\bigg\}+\frac{1}{r^{2}\sin{\theta}}\frac{\partial^{2}\psi}{\partial \phi^{2}}=0, (1)$

Suppose that the function $\psi\rightarrow \psi(r,\theta,\phi)$ . Furthermore, by the method of separation of variables we suppose that the solution is a product of eigenfunctions of the form $R(r)Y(\theta,\phi)$ . Hence, Laplace’s equation becomes

$\displaystyle \frac{Y}{r^{2}}\frac{d}{dr}\bigg\{r^{2}\frac{d^{2}R}{dr^{2}}\bigg\}+\frac{R}{r^{2}\sin{\theta}}\frac{\partial}{\partial \theta}\bigg\{\sin{\theta}\frac{\partial Y}{\partial \theta}\bigg\}+\frac{R}{r^{2}\sin^{\theta}}\bigg\{\frac{\partial^{2}Y}{\partial \phi^{2}}\bigg\}=0. (2)$

Furthermore, we can separate further the term $Y(\theta, \phi)$ into $\Theta(\theta)\Phi(\phi)$ . Rewriting (2) and multiplying by $r^{2}\sin^{2}{\theta}R^{-1}Y^{-1}$ , we get

$\displaystyle \frac{1}{R}\frac{d}{dr}\bigg\{r^{2}\frac{dR}{dr}\bigg\}+\frac{\sin{\theta}}{\Theta}\frac{d}{d\theta}\bigg\{\sin{\theta}\frac{d\Theta}{d\theta}\bigg\}+\frac{1}{\Phi}\frac{d^{2}\Phi}{d\phi^{2}}=0.(3)$

Bringing the radial and angular component to the other side of the equation and setting the azimuthal component equal to a separation constant $-m^{2}$ , yielding

$\displaystyle -\frac{1}{R}\frac{d}{dr}\bigg\{r^{2}\frac{dR}{dr}\bigg\}-\frac{\sin{\theta}}{\Theta}\frac{d}{d\theta}\bigg\{\sin{\theta}\frac{d\Theta}{d\theta}\bigg\}=\frac{1}{\Phi}\frac{d^{2}\Phi}{d\phi^{2}}=-m^{2}.$

Solving the right-hand side of the equation we get

$\displaystyle \Phi(\phi)=A\exp({+im\phi}). (4)$

Now, we set the azimuthal component equal to $m^{2}$ and carry out the derivative in the angular component to get the Associated Legendre equation:

$\displaystyle \frac{d}{d\mu}\bigg\{(1-\mu^{2})\frac{d\Theta}{d\mu}\bigg\}+\bigg\{l(l+1)-\frac{m^{2}}{1-\mu^{2}}\bigg\}\Theta=0, (5)$

where I have let $\mu=\cos{\theta}$ . The solutions to this equation are known as the associated Legendre functions. However, instead of solving this difficult equation by brute force methods (i.e. power series method), we consider the case for which $m^{2}=0$ . In this case, Eq.(5) simplifies to Legendre’s differential equation discussed previously. Instead of quoting the explicit form of the Legendre polynomials, we equivalently state the Rodrigues formula for these polynomials

$\displaystyle P_{l}(\mu)=\frac{1}{2^{l}l!}\frac{d^{l}}{d\mu^{l}}(\mu^{2}-1)^{l}. (6)$

However, the solutions to Eq.(5) are the associated Legendre functions, not polynomials. Therefore, we use the following to determine the associated Legendre functions from the Legendre polynomials:

$\displaystyle {\Theta_{l}}^{m}(\mu) = (1-\mu^{2})^{\frac{|m|}{2}}\frac{d^{|m|}}{d\mu^{|m|}}P_{l}(\mu). (7)$

For real solutions $|m|\leq l$ and for complex solutions $-l \leq m \leq l$ , thus we can write the solutions as series whose indices run from 0 to l for the real-values and from -l to l for the complex-valued solutions.

Now we turn to the radial part which upon substitution of $-l(l+1)$ in place of the angular component, we get

$\displaystyle \frac{d}{dr}\bigg\{r^{2}\frac{dR}{dr}\bigg\}-l(l+1)R=0,$

and if we evaluate the derivatives in the first term we get an Euler equation of the form

$\displaystyle r^{2}\frac{d^{2}R}{dr^{2}}+2r\frac{dR}{dr}-l(l+1)R=0. (8)$

Let us assume that the solution can be represented as

$\displaystyle R(r)=\sum_{j=0}^{\infty}a_{j}r^{j}.$

Taking the necessary derivatives and substituting into Eq.(8) gives

$\displaystyle r^{2}\sum_{j=0}^{\infty}j(j-1)a_{j}r^{j-2}+\sum_{j=0}^{\infty}2rja_{j}r^{j-1}-l(l+1)\sum_{j=0}^{\infty}a_{j}r^{j}=0. (9)$

Simplifying the powers of r we find that

$\displaystyle \sum_{j=0}^{\infty}[j(j-1)+2j-l(l+1)]a_{j}r^{j}=0.$

Now, since $\displaystyle \sum_{j=0}^{\infty}a_{j}r^{j}\neq 0$ , this means that

$\displaystyle j(j-1)+2j-l(l+1)=0. (10)$

We can simplify this further to get

$\displaystyle j^{2}-l^{2}+j-l=0.$

Factoring out $(j-l)$ , we arrive at

$\displaystyle (j-l)((j+l)+1)=0 \implies j=l ; j=-l-1.$

Since this must be true for all values of l, the solution therefore becomes

$\displaystyle R(r)= \sum_{l=0}^{\infty}Br^{l}+Cr^{-l-1}. (11)$

Thus, we can write the solution to Laplace’s equation as

$\displaystyle \psi(r,\theta,\phi) = \sum_{l=0}^{\infty} \sum_{m=0}^{\infty}[Br^{l}+Cr^{-l-1}]\Theta_{l}^{m}(\mu)A\exp({im\phi}), (12)$

for real solutions, and

$\displaystyle \psi(r,\theta,\phi)=\sum_{l=0}^{\infty}\sum_{m=-l}^{l}[Br^{l}+Cr^{-l-1}]\Theta_{l}^{m}(\mu)A\exp({im\phi}), (13)$

for complex solutions.

Solution to Legendre’s Differential Equation

Typically covered in a first course on ordinary differential equations, this problem finds applications in the solution of the Schrödinger equation for a one-electron atom (i.e. Hydrogen). In fact, this equation is a smaller problem that results from using separation of variables to solve Laplace’s equation. One finds that the angular equation is satisfied by the Associated Legendre functions. However, if it is assumed that $m=0$ then the equation reduces to Legendre’s equation.

The equation can be stated as

$\displaystyle (1-x^{2})\frac{d^{2}y}{dx^{2}}-2x\frac{dy}{dx}+l(l+1)y(x)=0. (1)$

The power series method starts with the assumption

$\displaystyle y(x)=\sum_{j=0}^{\infty}a_{j}x^{j}. (2)$

Next, we require the first and second order derivatives

$\displaystyle \frac{dy}{dx}=\sum_{j=1}^{\infty}ja_{j}x^{j-1}, (3)$

and

$\displaystyle \frac{d^{2}y}{dx^{2}}=\sum_{j=2}^{\infty}j(j-1)a_{j}x^{j-2}. (4)$

Substitution yields

$\displaystyle (1-x^{2})\sum_{j=2}^{\infty}j(j-1)a_{j}x^{j-2}-2x\sum_{j=1}^{\infty}ja_{j}x^{j-1}+l(l+1)\sum_{j=0}^{\infty}a_{j}x^{j}=0, (5)$

Distribution of the first terms gives

$\displaystyle \sum_{j=2}^{\infty}j(j-1)a_{j}x^{j-2}-\sum_{j=0}^{\infty}j(j-1)a_{j}x^{j}-2ja_{j}x^{j}+l(l+1)a_{j}x^{j}=0, (6)$

where in the second summation term we have rewritten the index to start at zero since the terms for which $j=0,1$ are zero, and hence have no effect on the overall sum. This allows us to write the sum this way. Next, we introduce a dummy variable. Therefore, let $m=j-2\implies j=m+2\implies m+1=j-1$ . Thus, the equation becomes

$\displaystyle \sum_{j=0}^{\infty}\bigg\{(j+2)(j+1)a_{j+2}-j(j-1)a_{j}-2ja_{j}+l(l+1)a_{j}\bigg\}x^{j}=0. (7)$

In order for this to be true for all values of j, we require the coefficients of $x^{j}$ equal zero. Solving for $a_{j+2}$ we get

$\displaystyle a_{j+2}=\frac{j(j+1)-l(l+1)}{(j+2)(j+1)}. (8)$

It becomes evident that the terms $a_{2},a_{3},a_{4},...$ are dependent on the terms $a_{0}$ and $a_{1}$ . The first term deals with the even solution and the second deals with the odd solution. If we let $p=j+2$ and solve for $a_{p}$ , we arrive at the term $a_{p-2}$ and we can obtain the next term $a_{p-4}$ . (I am not going to go through the details. The derivation is far too tedious. If one cannot follow there is an excellent video on YouTube that goes through a complete solution of Legendre’s ODE where they discuss all finer details of the problem. I am solving this now so that I can solve more advanced problems later on.) A pattern begins to emerge which we may express generally as:

$\displaystyle a_{p-2n}=\frac{(-1)^{n}(2p-2n)!}{(n!)(2^{p})(p-n)!(p-2n)!}. (9)$

Now, for even terms $0\leq n \leq \frac{p}{2}$ and for odd terms $0 \leq n \leq \frac{p-1}{2}$ . Thus for the even solution we have

$\displaystyle P_{p}(x)=\sum_{n=0}^{\frac{p}{2}}\frac{(-1)^{n}(2p-2n)!}{(n!)(2^{p})(p-n)!(p-2n)!}, (10.1)$

and for the odd solution we have

$\displaystyle P_{p}(x)=\sum_{n=0}^{\frac{p-1}{2}}\frac{(-1)^{n}(2p-2n)!}{(n!)(2^{p})(p-n)!(p-2n)!}. (10.2)$

These two equations make up the even and odd solution to Legendre’s equation. They are an explicit general formula for the Legendre polynomials. Additionally we see that they can readily be used to derive Rodrigues’ formula

$\displaystyle \frac{1}{2^{p}p!}\frac{d^{p}}{dx^{p}}\bigg\{(x^{2}-1)^{p}\bigg\}, (11)$

and that we can relate Legendre polynomials to the Associated Legendre function via the equation

$\displaystyle P_{l}^{m}(x)= (1-x^{2})^{\frac{|m|}{2}}\frac{d^{|m|}P_{l}(x)}{dx^{|m|}}, (12)$

where I have let $p=l$ so as to preserve a more standard notation. This is the general rule that we will use to solve the associated Legendre differential equation when solving the Schrödinger equation for a one-electron atom.

Solution to the three-dimensional Heat Equation

After taking a topics course in applied mathematics (partial differential equations), I found that there were equations that I should solve since I would later see those equations embedded into other larger-scale equations. This equation was Laplace’s equation (future post). Once I solved this equation, I realized that it becomes a differential operator when acted upon a function of at least two variables. Thus, I could solve equations such as the Schrödinger equation using a three-dimensional laplacian in spherical-polar coordinates (another future post) and the three-dimensional heat equation. I will be solving the latter.

The heat equation initial-boundary-value-problem is therefore

$\frac{\partial u}{\partial t}=\frac{1}{c\rho}\nabla^{2}u+Q(x,y,z), (1.1)$

subjected to the boundary conditions

$u(0,0,0,t)=0, (1.2)$

and the initial condition

$u(x,y,z,0)= \xi(x,y,z). (1.8)$

Now, this solution is not specific to a single thermodynamic system, but rather it is a more general solution in a mathematical context. However, I will be appropriating certain concepts from physics for reasons that are well understood (i.e. that time exists on the interval $I=0\leq t < +\infty$ ). It is nonphysical or nonsensical to speak of negative time.

To start, consider any rectangular prism in which heat flows through the volume, from the origin to the point $(X,Y,Z)$ .At a time $t=0$ , the overall heat of the volume can be regarded to be a function $\xi(x,y,z)$ . After a time period $\delta t=t$ has passed, the heat will have traversed to the point $(X,Y,Z)$ from the origin (think of the heat traveling along the diagonal of a cube). The goal is to find the heat as a function of the three spatial components and a single time component. Furthermore, the boundary conditions maintain that at the origin and the final point, the heat vanishes. In other words, these points act as heat sinks. (A sink is a point where energy can leave the system.)

To simplify the notation, we define $\textbf{r}=x\hat{i}+y\hat{j}+z\hat{k}.$ Thus the initial-boundary-value-problem becomes

$\frac{\partial u}{\partial t}=\frac{1}{c\rho}\frac{\partial^{2}u}{\partial \textbf{r}^{2}}, (2.1)$

where $u\rightarrow u(\textbf{r},t)$ . Also, this definition also reduces the three-dimensional laplacian to a second-order partial derivative of u. The boundary and initial conditions are then

$u(0,t)=0,(2.2)$

$u(R,t)=0, (2.3)$

$u(\textbf{r},0)=\xi(\textbf{r}). (2.2)$

Now, we assume that the solution is a product of eigenfunctions of the form

$u(\textbf{r},t)=\alpha(\textbf{r})\Gamma(t). (3)$

Taking the respective derivatives and dividing by the assumed form of the solution, we get

$\frac{\Gamma^{\prime}(t)}{\Gamma(t)}=\frac{1}{c\rho}\frac{\alpha^{\prime\prime}(\textbf{r})}{\alpha(\textbf{r})}+Q(\textbf{r}). (4)$

Now, Eq.(4) can be equal to three different values, the first of which is zero, but this solution does not help in any way, nor is it physically significant since it produces a trivial solution. The second is $\lambda^{2}>0, \lambda \neq 0$ , so we can apply to the time dependence equation which then becomes

$\frac{\Gamma^{\prime}(t)}{\Gamma(t)}=\lambda^{2}, (5.1)$

whose solution is

$\Gamma(t)=\Gamma_{0}\exp({\lambda^{2}t}). (5.2)$

The third case where $\lambda^{2}<0, \lambda \neq 0$ allows us to write the spatial equation upon rearrangement as

$\alpha^{\prime\prime}(\textbf{r})+c\rho(\lambda^{2}+Q)\alpha(\textbf{r})=0, (6.1)$

whose solution is

$\alpha(\textbf{r})=c_{1}\cos({\sqrt[]{c\rho(\lambda^{2}+Q)}\textbf{r}})+c_{3}\sin({\sqrt[]{c\rho(\lambda^{2}+Q)}\textbf{r}}), (6.2)$

where $c_{3}=-c_{2}$ . Next we apply the boundary conditions. Let $\textbf{r}=0$ in $\alpha(\textbf{r})$ :

$\alpha(0)=c_{1}+0=0 \implies c_{1}=0 \implies \alpha(\textbf{r})=C\sin(\sqrt[]{c\rho(\lambda^{2}+Q)} \textbf{r}). (7.1)$

and let $\textbf{r}=R$ in (7.1) to get

$\alpha(R)\implies \sin({\sqrt[]{c\rho(\lambda^{2}+Q)}R})=0\implies \sqrt[]{c\rho(\lambda^{2}+Q)}R^{2}=(n\pi)^{2}.$

Solving for $\lambda^{2}$ above gives

$\lambda^{2}=\frac{1}{c\rho}\bigg\{\frac{n\pi}{R}\bigg\}^{2}-Q. (8)$

Substituting into Eq.(7.1) and simplifying gives

$\alpha(\textbf{r})=C\sin\bigg\{\frac{n\pi}{c\rho}\bigg\}^{2}\bigg\{\frac{\textbf{r}}{R}\bigg\}. (9)$

To find as many solutions as possible we construct a superposition of solutions of the form

$u(\textbf{r},t)=u_{l}(\textbf{r},t)=\sum_{l=0}^{\infty}\alpha_{l}(\textbf{r})\Gamma(t). (10)$

Rewriting $\alpha(\textbf{r})$ as $\alpha_{l}(\textbf{r})$ and using the solution for the time dependence, and also let the coefficients form a product equivalent to the indexed coefficient $A_{l}$ , we arrive at the solution for the heat equation:

$u(\textbf{r},t)=\sum_{l=0}^{\infty}A_{l}\sin\bigg\{\frac{l\pi}{c\rho}\bigg\}^{2}\bigg\{\frac{\textbf{r}}{R}\bigg\}\exp(\lambda^{2}t). (11)$

Now suppose that $t=0$ in Eq.(11). In this case we get the initial heat distribution given by $\xi(\textbf{r})$ :

$\xi(\textbf{r})=\sum_{l=0}^{\infty} A_{l}\sin\bigg\{\frac{l\pi}{c\rho}\bigg\}^{2}\bigg\{\frac{\textbf{r}}{R}\bigg\}. (12)$

Deriving the speed of light from Maxwell’s equations

We are all familiar with the concept of the speed of light. It is the speed beyond which no object may travel. Many seem to associate Einstein for the necessity of this universal constant, and while it is inherent to his theory of special and general theories of relativity, it was not necessarily something he discovered. It is, in fact, a consequence of the Maxwell equations from my first post. I will be deriving the speed of light quantity using the four field equations of electrodynamics, and I will explain how Einstein used this fact to challenge Newtonian relativity in his theory of special relativity (I am not as familiar with general relativity). The reason for this post is just to demonstrate the origin of a well-known concept; the speed of light.

We start with Maxwell’s equations of electrodynamics

$\nabla \cdot \textbf{E}=\frac{\rho}{\epsilon_{0}}, (1)$

$\nabla \cdot \textbf{B}=0, (2)$

$\nabla \times \textbf{E}=-\frac{\partial \textbf{B}}{\partial t}, (3)$

$\nabla \times \textbf{B}=\mu_{0}\textbf{j}+\mu_{0}\epsilon_{0}\frac{\partial \textbf{E}}{\partial t}. (4)$

Now, we let $\rho =0$ , which means that the charge density must be zero, and we also let the current density $\textbf{j}=0$ . Moreover, note that the form of the wave equation as

$\frac{\partial^{2} u}{\partial t^{2}}=\frac{1}{v^{2}}\nabla^{2}u. (5)$

This equation describes the change in position of material in three dimensions (choose whichever coordinate system you like) propagating through some amount of time, with some velocity v.

After making these assumptions, we arrive at

$\nabla \cdot \textbf{E}=0, (6)$

$\nabla \cdot \textbf{B}=0, (7)$

$\nabla \times \textbf{E}=-\frac{\partial \textbf{B}}{\partial t}, (8)$

$\nabla \times \textbf{B}=\mu_{0}\epsilon_{0}\frac{\partial \textbf{E}}{\partial t}. (9)$

Also note the vector identity $\nabla \times (\nabla \times \textbf{A})=\nabla(\nabla\cdot\textbf{A})-\nabla^{2}\textbf{A}$ . Now, take the curl of Eqs.(8) and (9), and we get

$\frac{1}{\mu_{0}\epsilon_{0}}\nabla^{2}\textbf{E}=\frac{\partial^{2}\textbf{E}}{\partial t^{2}}, (10)$

and

$\frac{1}{\mu_{0}\epsilon_{0}}\nabla^{2}\textbf{B}=\frac{\partial^{2}\textbf{B}}{\partial t^{2}}, (11)$

where we have used Eqs. (6), (7), (8), and (9) to simplify the expressions. Eqs. (10) and (11) are the electromagnetic wave equations. Note the form of these equations and how they compare to Eq. (5). They are identical, and upon inspection one can see that the velocity with which light travels is

$\frac{1}{c^{2}}=\frac{1}{\mu_{0}\epsilon_{0}} \implies c=\sqrt[]{\mu_{0}\epsilon_{0}}, (12)$

where $\mu_{0}$ is the permeability of free space and $\epsilon_{0}$ is the permittivity of free space.

Most waves on Earth require a medium to travel. Sound waves, for example are actually pressure waves that move by collisions of the individual molecules in the air. For some time, light was thought to require a medium to travel. So it was proposed that since light can travel through the vacuum of space, there must exist a universal medium dubbed “the ether”. This “ether” was sought after most notably in the famous Michelson-Morley experiment, in which an interferometer was constructed to measure the Earth’s velocity through this medium. However, when they failed to find any evidence that the “ether” existed, the new way of thinking was that it didn’t exist. It turned out that light doesn’t need a medium to travel through space. Technically-speaking, space itself acts as the medium through which light travels.

In Newtonian relativity, it was assumed that time and space were separate constructs and were regarded as absolute. In other words, it was the speed that changed. What this meant is that even as speeds became very large, space and time remained the same. What Einstein did was that he saw the consequence of Maxwell’s equations and regarded this speed as absolute, and allowed space and time (really spacetime) to vary. In Einstein’s theory of special relativity, as one approaches the speed of light, time slows down, and objects become contracted. These phenomena are known as time dilation and length contraction:

$\delta t = \frac{\delta t_{0}}{\sqrt[]{1-v^{2}/c^{2}}}, (13)$

$\delta l = l_{0}\sqrt[]{1-v^{2}/c^{2}}. (14)$

These phenomena will be discussed in more detail in a future post. Thus, Maxwell’s formulation of the electrodynamic field equations led Einstein to change the way we perceive the fundamental concepts of space and time.

Electron Scattering of a Step Potential

The following post was initially one of my assignments for an independent study in modern physics in my penultimate year as an undergraduate. While studying this problem the text that I used to verify my answer was:

R. Eisberg and R. Resnick, Quantum Physics of Atoms, Molecules, Solids, Nuclei, and Particles. John Wiley & Sons. 1985. 6.

One of the hallmarks of quantum theory is the Schrödinger equation. There are two forms: the time-dependent and the time-independent equation. The former can be turned into the latter by way of assuming stationary states in which case there is no time evolution (i.e. the time derivative $\partial_{t} \Psi(x,t)=0$ . $\partial_{t}\Psi(x,t)$ denotes the derivative of the wavefunction $\Psi(x,t)$ with respect to time.)

The wavefunction describes the state of a system, and it is found by solving the Schrödinger equation. In this post, I’ll be considering a step potential in which

$V(x) = V_{0}, x < 0, (1.1)$

$V(x) = 0, x > 0. (1.2)$

After solving for the wavefunction, I will calculate the reflection and transmission coefficients for the case where the energy of the electron is less than that of the step potential $E < V_{0}$ .

First, we assume that we are dealing with stationary states, by doing so we assume that there is no time-dependence. The wavefunction $\Psi(x,t)$ becomes an eigenfunction (eigen– is German for “characteristic” e.g. characteristic function, characteristic value (eigenvalue), and so on), $\psi(x)$ . There are requirements for this eigenfunction in the context of quantum mechanics: eigenfunction $\psi(x)$ and its first order spatial derivative $\frac{d\psi(x)}{dx}$ must be finite, single-valued, and continuous. Using the wavefunction, we write the time-independent Schrödinger equation as

$-\frac{\hbar^{2}}{2m}\frac{d^{2}\psi(x)}{dx^{2}}+V(x)\psi(x) = E\psi(x), (2)$

where $\hbar$ is the reduced Planck’s constant $\hbar \equiv h(2\pi)^{-1}$ , m is the mass of the particle, $V(x)$ represents the potential, and $E$ is the energy.

Now, in electron scattering there are two cases regarding a step potential: the case for which $E < V_{0}$ and the case for which $E>V_{0}$ The focus of this post is the former case. In such a case, the potential is given mathematically by Eqs. (1.1) and (1.2) and can be depicted by the image below:

Image Credit: http://physics.gmu.edu/~dmaria/590%20Web%20Page/public_html/qm_topics/potential/barrier/STUDY-GUIDE.htm

The first part of this problem is to solve for $\psi(x)$ when $x<0$ . Then Eq.(2) becomes

$\frac{d^{2}\psi_{I}(x)}{dx^{2}}=-\kappa_{I}^{2}\psi_{I}(x), (3)$

where $\kappa_{I}^{2}\equiv \frac{-2mE}{\hbar^{2}}$ . Eq.(3) is a second order linear homogeneous ordinary differential equation with constant coefficients and can be solved using a characteristic equation. We assume that the solution is of the general form

$\psi_{I}(x)=\exp{rx},$

which upon taking the first and second order spatial derivatives and substituting into Eq.(3) yields

$r^{2}\exp{rx}+\kappa_{I}^{2}\exp{rx}=0.$

We can factor out the exponential and recalling that the graph of $e^{x} \neq 0$ (except in the limit $x\rightarrow -\infty$ )we can then conclude that

$r^{2}+\kappa_{I}^{2}=0$ .

Hence, $r = \pm i\kappa_{I}$ . Therefore, we can write the solution Schrödinger equation in the region $x < 0$ as

$\psi_{I}(x) = A\exp{+i\kappa_{I}x}+B\exp{-i\kappa_{I}x}. (4)$

This is the eigenfunction for the first region. Coefficients A and B will be determined later.

Now we can use the same logic for the Schrödinger equation in the region $x > 0$ where $V(x)=V_{0}$ :

$\frac{d^{2}\psi_{II}(x)}{dx^{2}}= -\kappa_{II}^{2}\psi_{II}(x), (5)$

where $\kappa_{II}\equiv \frac{\sqrt[]{2m(E-V_{0})}}{\hbar}$ . The general solution for this region is

$\psi_{II}(x) = C\exp{+i\kappa_{II}x}+D\exp{-i\kappa_{II}x}. (6)$

Now the next step is taken using two different approaches: the first using a mathematical argument and the other from a conceptual interpretation of the problem at hand. The former is this: Suppose we let $x\rightarrow \infty$ . What results is that the first term on the right hand side of Eq.(6) diverges (i.e. becomes arbitrarily large). The second term on the right hand side ends up going to zero (it converges). Therefore, D remains finite, hence $D\neq 0$ . However, in order to suppress the divergence of the first term, we let $C=0$ . Thus we arrive at the eigenfunction for $x>0$

$\psi_{II}(x) = D\exp{-i\kappa_{II}x}. (7)$

The latter argument is this: In the region $x < 0$ , the first term of the solution represents a wave propagating in the positive x-direction, while the second denotes a wave traveling in the negative x-direction. Similarly, in the region where $x > 0$ , the first term corresponds to a wave traveling in the positive x-direction. However, this cannot be because the energy of the wave is not sufficient enough to overcome the potential. Therefore, the only term that is relevant here for this region is the second term, for it is a wave propagating in the negative x-direction.

We now determine the coefficients A and B. Recall that the eigenfunction must satisfy the following continuity requirements

$\psi_{I}(x)=\psi_{II}(x), (8.1)$

$\psi^{\prime}_{I}(x)=\psi^{\prime}_{II}(x), (8.2)$

evaluated when $x=0$ . Doing so in Eqs. (4) and (7), and equating them we arrive at the continuity condition for $\psi(x)$

$A+B = D. (9)$

Taking the derivative of $\psi_{I}(x)$ and $\psi_{II}(x)$ and evaluating them for when $x=0$ we arrive at

$A-B = \frac{i\kappa_{II}}{i\kappa_{I}}D. (10)$

If we add Eqs. (9) and (10) we get the value for the coefficient A in terms of the arbitrary constant D

$A =\frac{D}{2}(1+\frac{i\kappa_{II}}{i\kappa_{I}}. (11)$

Conversely, if we subtract (9) and (10) we get the value for B in terms of D

$B = \frac{D}{2}(1-\frac{i\kappa_{II}}{i\kappa_{I}}. (12)$

Now the reflection coefficient defined as

$R = \frac{B^{*}B}{A^{*}A}, (13)$

which is the probability that an incident electron (wave) will be reflected. On the other hand, the transmission coefficient is the probability that an electron will be transmitted through the potential (e.g. barrier potential). These two also must satisfy the relation

$R+T =1. (14)$

This means that the probability that the electron will be reflected or transmitted is 100%. Therefore, to evaluate R (the reason why I don’t calculate T will become apparent shortly), we take the complex conjugate of A and B and using them in Eq.(13) we get

$R = \frac{\frac{D}{2}(1+\frac{\kappa_{II}}{\kappa_{I}})\frac{D}{2}(1-\frac{\kappa_{II}}{\kappa_{I}})} {\frac{D}{2}(1-\frac{\kappa_{II}}{\kappa_{I}}) \frac{D}{2}(1+\frac{\kappa_{II}}{\kappa_{I}})} = 1. (15)$

What this conceptually means is that the probability that the electron is reflected is 100%. This implies that it is impossible for an electron to be transmitted through the potential for this system.

Share this:

Share this:

Share this:

Share this:

Share this:

Share this:

Share this:

Share this:

Share this:

Share this: