# A “Proof” of the Sturm-Liouville Theorem/Problem

IMAGE CREDIT: NASA/JPL: This shows Jupiter’s Great Red Spot, a storm that has been raging for over 300 years. Quite recently, however, observations have shown that the Spot appears to be shrinking in size.

About a week ago, I was looking through my notebooks and came across an unfinished problem posed by one of my professors. Unfortunately, I was not able to solve the problem during the semester. However, I thought it might be something interesting to consider. I did a quick search and found that the problem he gave us was to prove the Sturm-Liouville Theorem.

The main brute-force method for analytically solving a given partial differential equation is separation of variables. This method is heavily used by physicists; it transforms the initial boundary value problem (IBVP) into a Sturm-Liouville problem, in which we have an ordinary differential equation together with linear homogeneous boundary conditions.

# A Narrow, Technical Problem in Partial Differential Equations

While I was in school, one of my professors set this problem for my classmates and me, challenging us to solve it over the next few days. I found the challenge intriguing, so I thought it was worth sharing. The problem was this:

Show that

$\displaystyle v(x,t) = \int_{-\infty}^{\infty} f(x-y,t)g(y)dy, (1.1)$

where $g(y)$ has finite support and $f$ itself solves the heat equation, satisfies the PDE

$\displaystyle \frac{\partial v}{\partial t} = \kappa \frac{\partial^{2}v}{\partial x^{2}}. (1.2)$

First off, what does finite support mean? Mathematically speaking, the support of a function is the subset of its domain on which the function is nonzero; a function has finite (or compact) support when that subset is bounded. (Just as a quick note: the proper definitions require an understanding of mathematical analysis and measure theory, something which I have not studied in detail, so take that explanation with a grain of salt.)

As for the solution, we can rewrite the given PDE as

$\displaystyle \frac{\partial v}{\partial t} - \kappa \frac{\partial^{2}v}{\partial x^{2}} = 0. (2)$

To verify this, we compute the first-order time derivative and the second-order spatial derivative of $v$:

$\displaystyle \frac{\partial v}{\partial t} = \frac{\partial}{\partial t}\int_{-\infty}^{\infty} f(x-y,t)g(y)dy, (3.1)$

and

$\displaystyle \frac{\partial^{2} v}{\partial x^{2}} = \frac{\partial^{2}}{\partial x^{2}}\int_{-\infty}^{\infty} f(x-y,t)g(y)dy. (3.2)$

Next, we substitute Eqs. (3.1) and (3.2) into Eq.(2), yielding

$\displaystyle \frac{\partial}{\partial t}\int_{-\infty}^{\infty} f(x-y,t)g(y)dy -\kappa \frac{\partial^{2}}{\partial x^{2}}\int_{-\infty}^{\infty} f(x-y,t)g(y)dy = 0. (4)$

Note that, for integrands as well behaved as this one (the finite support of $g$ keeps the integral under control), we may differentiate under the integral sign: taking the derivative of the integral is equivalent to integrating the derivative. Combined with the fact that the difference of two integrals is the integral of the difference (proofs of these facts are typically covered in a course in real analysis), this gives

$\displaystyle \int_{-\infty}^{\infty} \bigg\{\frac{\partial}{\partial t}f(x-y,t)-\kappa\frac{\partial^{2}}{\partial x^{2}}f(x-y,t)\bigg\}g(y)dy = 0. (5)$

Since $f$ itself satisfies the heat equation, the expression contained in the brackets vanishes identically. This means that

$\displaystyle \int_{-\infty}^{\infty} 0 \cdot g(y)dy = 0. (6)$

This implies that the function $\displaystyle v(x,t)$ does satisfy the given PDE (Eq.(2)).
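As a sanity check, the claim can also be verified numerically: take $f$ to be the heat kernel (a known solution of Eq.(2)), take $g$ to be a smooth bump with finite support, form $v$ by numerical quadrature, and compare the two sides of the PDE with finite differences. A minimal Python sketch; the particular kernel, bump function, grid, and sample point are all my own choices, purely for illustration:

```python
import numpy as np

kappa = 1.0

def f(x, t):
    # Heat kernel: a known solution of f_t = kappa * f_xx
    return np.exp(-x**2 / (4 * kappa * t)) / np.sqrt(4 * np.pi * kappa * t)

def g(y):
    # Smooth bump with finite support on (-1, 1)
    out = np.zeros_like(y)
    inside = np.abs(y) < 1
    out[inside] = np.exp(-1.0 / (1 - y[inside] ** 2))
    return out

y = np.linspace(-1.5, 1.5, 4001)  # grid covering the support of g
dy = y[1] - y[0]

def v(x, t):
    # v(x, t) = \int f(x - y, t) g(y) dy, approximated by a Riemann sum
    return np.sum(f(x - y, t) * g(y)) * dy

# Compare v_t with kappa * v_xx at a sample point using central differences
x0, t0, h = 0.3, 0.5, 1e-3
v_t  = (v(x0, t0 + h) - v(x0, t0 - h)) / (2 * h)
v_xx = (v(x0 + h, t0) - 2 * v(x0, t0) + v(x0 - h, t0)) / h ** 2
print(abs(v_t - kappa * v_xx))  # small residual
```

The residual is limited only by the quadrature and finite-difference step sizes, consistent with $v$ satisfying the PDE.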

References:

Definition of Support in Mathematics: https://en.wikipedia.org/wiki/Support_(mathematics)

# Observing the Variable Star W Ursae Majoris

While I was an undergraduate, one of my smaller research projects involved observing the variable star W Ursae Majoris.

In general, there are six types of binary star systems: Optical double, Visual binary, Astrometric binary, Eclipsing binary, Spectrum binary, and Spectroscopic binary.

In this project, my classmate and I were interested in the eclipsing binary (EW) W Ursae Majoris. An eclipsing binary is a binary system in which one of the stars will pass in front of its companion, effectively causing an eclipse. We are able to observe this by way of generating the light curves of the system. An example light curve is shown below:

(Image was obtained at the URL:  https://imagine.gsfc.nasa.gov/educators/hera_college/binary-model.html)

The graph shows a plot of intensity over time (which in this case spans an orbital period). Observations of an EW should show dips in the combined intensity of the two stars. What is really fascinating to me is how much information we can extract from this graph. For example, the duration of a dip can indicate the relative masses of the stars. If one star has mass $m_{1}$ and the other has mass $m_{2}$ such that $m_{2}>m_{1}$, and the decrease in intensity of the system lasts a significant time, we can infer that the star passing in front of its companion is the one of mass $m_{1}$; by default, the star being “eclipsed” is the one of mass $m_{2}$. Conversely, if the intensity decreases but only for a short while, the positions are reversed: $m_{2}$ passes in front (relatively speaking) and $m_{1}$ is being “eclipsed”. (I am assuming that the barycenter, i.e. the system’s center of mass, is equidistant from the centers of the two stars.)

Binary stars can also be classified by whether or not the components of the system are in contact. More precisely, there are three kinds of close binaries: detached, semi-detached, and contact. Contact binaries have sub-categories of their own: near contact, contact, overcontact, and double contact.

An equipotential surface map of a system (assuming that the binary system has a mass ratio of 2:1, which may be incorrect as most W UMa binaries have a mass ratio of 10:1) is shown below:

Image Credit: Fig.1 of Terrell, D., Eclipsing Binary Stars: Past, Present, and Future. JAAVSO Vol. 30, 2001.

To quickly elaborate, each type of contact binary fills its inner Lagrangian surface (also known as its Roche lobe) to some extent. In the context of our project, W Ursae Majoris is an overcontact eclipsing binary system. This type of binary overfills its inner Lagrangian surface, and as a result processes such as mass transfer and accretion can occur. The diagram below shows the orbital evolution of the W UMa-type EW AC Bootis (in addition to being its own binary system, W UMa also lends its name to a class of close binaries).

Image Credit: Fig. 15 of Alton, K., A Unified Roche-Model Light Curve Solution for the W UMa Binary AC Bootis. JAAVSO. Vol. 38, 2010.

The objective of the project was to image the eclipsing binary, measure its apparent magnitude, process the images, and obtain a light curve. To observe the system, a classmate and I made use of the 20″ Ritchey-Chrétien telescope at the university observatory. Using the attached CCD camera, we set a sequence of images to be taken every two minutes. W UMa has a period of approximately 8 hours; however, due to time constraints (as much as I would have liked to continue, the weather was not conducive to observations exceeding two hours), we ended up imaging for only around two hours.

By the end of the session, we had taken roughly 40-50 images. Additionally, the software used to capture the images measured the magnitude of W UMa at the time each image was taken. This allowed us to use Excel (and later on MATLAB) to obtain a partial light curve. However, because the light curve is partial, we can say that an eclipse (and a short one at that) occurs, yet we cannot determine whether the local minimum depicted in the graph below is a primary or a secondary minimum; we simply do not have enough data.

In addition to the partial light curve above, we were able to process the images (using Registax v.6). Below is a stacked image of W UMa. The big blob near the center of the image is the binary; the telescope cannot resolve it into its individual components.

References:

Carroll, B.W., and Ostlie, D.A., An Introduction to Modern Astrophysics. 2017. Cambridge University Press. Ch. 7.

Catalog and Atlas of Eclipsing Binaries (CALEB): Types of Binary Stars

http://www.daviddarling.info/encyclopedia/C/close_binary.html

American Association of Variable Star Observers (AAVSO) URL: https://www.aavso.org/vsots_wuma

Journal of American Association of Variable Star Observers: Figure References

# Coordinates and Transformations of Coordinates

SOURCES FOR CONTENT:  Neuenschwander, D.E., Tensor Calculus for Physics, 2015. Johns Hopkins University Press.

In this post, I continue the introduction to tensor calculus by discussing coordinates and coordinate transformations as applied to relativity theory. (A side note: I have acquired Misner, Thorne, and Wheeler’s Gravitation and will be using it sparingly, given its reputation and the weight of its material.)

# Derivation of the Finite-Difference Equations

In my final semester, my course load included a graduate course that had two modules: astronomical instrumentation and numerical modeling. The latter focused on developing the equations of motion of geophysical fluid dynamics (see Research in Magnetohydrodynamics). Those equations are then converted into an algorithm based on a specific numerical method for approximating the exact differential equation.

The purpose of this post is to derive the finite-difference equations. Specifically, I will derive the forward, backward, and centered first-order difference equations. We start with the Taylor expansions of $f$ about the point $x$, evaluated at $x \pm h$:

$\displaystyle f(x+h)=\sum_{n=0}^{\infty}\frac{h^{n}}{n!}\frac{d^{n}f}{dx^{n}}, (1)$

and

$\displaystyle f(x-h)=\sum_{n=0}^{\infty}(-1)^{n}\frac{h^{n}}{n!}\frac{d^{n}f}{dx^{n}}. (2)$

Let $f(x_{j})=f_{j}, f(x_{j}+h)=f_{j+1}, f(x_{j}-h)=f_{j-1}$. Therefore, if we consider the following differences…

$\displaystyle f_{j+1}-f_{j}=hf^{\prime}_{j}+f^{\prime \prime}_{j}\frac{h^{2}}{2!}+...+f^{(n)}_{j}\frac{h^{n}}{n!}, (3)$

and

$\displaystyle f_{j}-f_{j-1}=hf^{\prime}_{j}-\frac{h^{2}}{2!}f^{\prime \prime}_{j}+...\mp \frac{h^{n}}{n!}f^{(n)}_{j}, (4)$

and

$\displaystyle f_{j+1}-f_{j-1}=2hf^{\prime}_{j}+\frac{2h^{3}}{3!}f^{\prime\prime\prime}_{j}+..., (5)$

where the even-order terms cancel in the last difference. Solving each for $f^{\prime}_{j}$ and keeping only the leading term, we get

$\displaystyle f^{\prime}_{j}=\frac{f_{j+1}-f_{j}}{h}+\mathcal{O}(h), (6)$

$\displaystyle f^{\prime}_{j}=\frac{f_{j}-f_{j-1}}{h}+\mathcal{O}(h), (7)$

and

$\displaystyle f^{\prime}_{j}=\frac{f_{j+1}-f_{j-1}}{2h}+\mathcal{O}(h^{2}),$

where the first is the forward difference, the second is the backward difference, and the last is the centered difference. The $\mathcal{O}(\cdot)$ notation collects the neglected higher-order (quadratic, cubic, quartic, etc.) terms: the forward and backward differences are first-order accurate, while the centered difference is second-order accurate. One can use similar logic to derive the second-order finite-difference equations.
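These error orders are easy to confirm numerically. In the short Python sketch below (the test function, evaluation point, and step sizes are arbitrary choices of mine), halving $h$ should roughly halve the forward and backward errors but quarter the centered error:

```python
import numpy as np

f, fp = np.sin, np.cos   # test function and its exact derivative
x = 1.2

fwd_errs, ctr_errs = [], []
for h in (0.1, 0.05):
    forward  = (f(x + h) - f(x)) / h            # forward difference
    backward = (f(x) - f(x - h)) / h            # backward difference
    centered = (f(x + h) - f(x - h)) / (2 * h)  # centered difference
    fwd_errs.append(abs(forward - fp(x)))
    ctr_errs.append(abs(centered - fp(x)))
    print(h, abs(forward - fp(x)), abs(backward - fp(x)), abs(centered - fp(x)))

# Error ratios under halving of h: ~2 for one-sided, ~4 for centered
print(fwd_errs[0] / fwd_errs[1], ctr_errs[0] / ctr_errs[1])
```

The printed ratios come out close to 2 and 4 respectively, matching the $\mathcal{O}(h)$ and $\mathcal{O}(h^{2})$ truncation errors.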

# Helicity and Magnetic Helicity

SOURCE FOR CONTENT: Davidson, P.A., 2001, An Introduction to Magnetohydrodynamics. 3.

The purpose of this post is to introduce the concepts of helicity as an integral invariant and its magnetic analog. More specifically, I will be showing that

$\displaystyle h = \int_{V_{\omega}}\textbf{u}\cdot \omega d^{3}r (1)$

is invariant and thus correlates to the conservation of vortex line topology using the approach given in the source material above.  I will then discuss magnetic helicity and how it relates to MHD.

Consider the total derivative of $\textbf{u}\cdot \omega$:

$\displaystyle \frac{D}{Dt}(\textbf{u}\cdot \omega) = \frac{D\textbf{u}}{Dt}\cdot \omega + \frac{D\omega}{Dt}\cdot \textbf{u}, (2)$

where I have used the product rule for derivatives. I denote the material (total) derivative by $\frac{D}{Dt} = \frac{\partial}{\partial t}+(\textbf{u}\cdot \nabla)$ so as not to confuse it with the notation for an ordinary derivative. Recall the Navier-Stokes equation and the vorticity equation

$\displaystyle \frac{\partial \textbf{u}}{\partial t}+(\textbf{u}\cdot \nabla)\textbf{u}=\textbf{F}-\frac{1}{\rho}\nabla P + \nu \nabla^{2}\textbf{u}, (3)$

and

$\displaystyle \frac{\partial \omega}{\partial t}+(\textbf{u}\cdot \nabla)\omega =(\omega \cdot \nabla)\textbf{u} +\nu \nabla^{2}\omega. (4)$

Let $\textbf{F}=0$ and neglect the diffusive terms in Eqs.(3) and (4), i.e. take the fluid to be inviscid. Then,

$\displaystyle \frac{D\textbf{u}}{Dt}=-\nabla \frac{P}{\rho}, (5)$

and

$\displaystyle \frac{D\omega}{Dt}=(\omega \cdot \nabla)\textbf{u}. (6)$

If we substitute Eqs. (5) and (6) into Eq.(2), we get

$\displaystyle \frac{D}{Dt}(\textbf{u}\cdot \omega)=-\omega \cdot \nabla \bigg(\frac{P}{\rho}\bigg) + \textbf{u}\cdot (\omega \cdot \nabla)\textbf{u}. (7)$

For the second term, the product rule gives $\textbf{u}\cdot (\omega \cdot \nabla)\textbf{u}=(\omega \cdot \nabla)\big(\frac{u^{2}}{2}\big)$, so that

$\displaystyle \frac{D}{Dt}(\textbf{u}\cdot \omega)=-\omega \cdot \nabla \bigg(\frac{P}{\rho}\bigg)+(\omega \cdot \nabla)\bigg(\frac{u^{2}}{2}\bigg). (8)$

Since $\nabla \cdot \omega = 0$, both terms may be written as a single divergence:

$\displaystyle \frac{D}{Dt}(\textbf{u}\cdot \omega)=\nabla \cdot \bigg\{\bigg(\frac{u^{2}}{2}-\frac{P}{\rho}\bigg)\omega\bigg\}. (9)$
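A key manipulation in this derivation is the product-rule identity $\textbf{u}\cdot (\omega \cdot \nabla)\textbf{u}=(\omega \cdot \nabla)\big(\frac{u^{2}}{2}\big)$, which holds for any smooth field. A quick symbolic check with sympy (the test velocity field below is an arbitrary choice of mine, not anything physical):

```python
import sympy as sp

x, y, z = sp.symbols('x y z')
coords = (x, y, z)

# Arbitrary smooth test velocity field (purely for illustration)
u = sp.Matrix([sp.sin(y) * z, x**2, sp.cos(x) * y])

# Vorticity: w = curl(u)
w = sp.Matrix([
    sp.diff(u[2], y) - sp.diff(u[1], z),
    sp.diff(u[0], z) - sp.diff(u[2], x),
    sp.diff(u[1], x) - sp.diff(u[0], y),
])

# Left side: u . (w . grad) u
lhs = sum(u[i] * sum(w[j] * sp.diff(u[i], coords[j]) for j in range(3))
          for i in range(3))

# Right side: (w . grad)(u^2 / 2)
half_u2 = sum(c**2 for c in u) / 2
rhs = sum(w[j] * sp.diff(half_u2, coords[j]) for j in range(3))

print(sp.simplify(lhs - rhs))  # 0
```

The difference simplifies to zero, as the product rule guarantees for any smooth $\textbf{u}$.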

With this in mind, consider now the total derivative of $(\textbf{u}\cdot \omega)$ multiplied by a material volume element $\delta V$ (which is conserved for an incompressible fluid):

$\displaystyle \frac{D}{Dt}\big[(\textbf{u}\cdot \omega)\delta V\big] = \nabla \cdot \bigg\{\bigg(\frac{u^{2}}{2}-\frac{P}{\rho}\bigg)\omega\bigg\}\delta V. (10)$

When the above equation is integrated over a material volume $V_{\omega}$ whose surface $S_{\omega}$ is made up of vortex lines (so that $\omega \cdot d\textbf{S}=0$ on $S_{\omega}$), the total derivative becomes an ordinary time derivative, and the divergence theorem turns the right-hand side into a vanishing surface integral. Thus, Eq.(10) becomes

$\displaystyle \frac{d}{dt}\int_{V_{\omega}}(\textbf{u}\cdot \omega)dV=\oint_{S_{\omega}}\bigg(\frac{u^{2}}{2}-\frac{P}{\rho}\bigg)\omega \cdot d\textbf{S}=0. (11)$

Hence, Eq.(11) shows that helicity is an integral invariant. In this analysis, we have assumed that the fluid is inviscid. If one considers two linked vortex tubes (as Davidson does as a mathematical example), one finds that the overall helicity receives one contribution from each tube, and that the two contributions are equal and fixed by the vortex fluxes. Because these contributions are conserved, the overall helicity remains invariant, which reflects the fact that the tubes stay linked; if they were not linked, the helicity would be 0. As for magnetic helicity, we define this as

$\displaystyle h_{m}=\int_{V_{B}}\textbf{A}\cdot \textbf{B}dV, (12)$

where $\textbf{A}$ is the magnetic vector potential which is described by the following relations

$\displaystyle \nabla \times \textbf{A}=\textbf{B}, (13.1)$

and

$\displaystyle \nabla \cdot \textbf{A}=0. (13.2)$

Again, we see that magnetic helicity is also conserved. However, this conservation arises by means of Alfven’s theorem of flux freezing, in which the magnetic field lines preserve their topology.

# Method of Successive Approximations-Elementary Methods to Solving Integral Equations

SOURCES: Tricomi, F.G., Integral Equations. 1985. Dover. 1.

Wazwaz, A.M., A First Course in Integral Equations. 2015. 3.6

In regards to solving integral equations, there are two main types of equations: Volterra and Fredholm equations, the first of which is the topic of this post. First, a brief introduction. Just as there are equations relating an unknown function to its derivatives, there also exists a type of equation that involves the integral of the unknown function. There is also a class of equations called integro-differential equations (see my series on Monte Carlo and Radiative transfer), in which the equation involves the function, its derivative, and its integral.

Here, I will be dealing with Volterra integral equations. More specifically, I will be considering the Volterra equation of the second kind (VESK) of the form:

$\displaystyle \alpha(x)=h(x)+\lambda\int_{0}^{x}K(x,y)\alpha(y)dy, (1)$

where $y\in [0,x]$. The function $K(x,y)$ in the integrand is called the kernel. The kernel arises from the conversion of an initial value problem: solving the integral equation is equivalent to solving the corresponding IVP for a differential equation, with the initial conditions built into the integral equation itself rather than imposed near the end of the solution.

The fact that we are dealing with Volterra equations means that the kernel is subject to the condition:

$\displaystyle K(x,y)=0 \quad \text{for } y > x.$

The typical way to solve this involves the method of successive approximations (some call this Picard’s method of successive approximations). I first came across this method while taking my differential equations course, in the context of the existence and uniqueness of solutions to first-order differential equations.

The method is as follows:

Suppose we have the equation given in (1). We then define an initial function $\alpha_{0}(x)=h(x).$ Then the next iteration of $\alpha$ is

$\displaystyle \alpha_{1}(x)=h(x)+\lambda\int_{0}^{x}K(x,y)h(y)dy. (2)$

Naturally, the next term $\alpha_{2}(x)$ is

$\displaystyle \alpha_{2}(x)=h(x)+\lambda\int_{0}^{x}K(x,y)h(y)dy+\lambda^{2}\int_{0}^{x}K(x,z)dz \int_{0}^{z}K(z,y)h(y)dy, (3)$

or more simply

$\displaystyle \alpha_{2}(x)=h(x)+\lambda\int_{0}^{x}K(x,y)\alpha_{1}(y)dy. (4)$

The reason I chose to include Eq.(3) is that, while reading about this method, I found the traditional expression (Eq.(4)) left much to be desired: it didn’t demonstrate the “successive” part of successive approximations. In general, we may convince ourselves that

$\displaystyle \alpha_{n}(x)=h(x)+\lambda\int_{0}^{x}K(x,y)\alpha_{n-1}(y)dy. (5)$

Once the general expression for $\alpha_{n}(x)$ is determined, we may determine the exact solution $\alpha(x)$ via

$\displaystyle \alpha(x)=\lim_{n\rightarrow \infty}\alpha_{n}(x). (6)$

This is probably one of the simpler methods of solving integral equations: it requires no real analysis, yet it yields solutions for simple integral equations. I will discuss a few other methods of solution in future posts in this series.
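As a concrete illustration, consider the VESK $\alpha(x)=1+\int_{0}^{x}\alpha(y)dy$ (that is, $h(x)=1$, $K=1$, $\lambda=1$), whose exact solution is $e^{x}$. A minimal numerical sketch of the method, with the iterates built on a grid; the grid size and iteration count here are arbitrary choices of mine:

```python
import numpy as np

# Solve alpha(x) = 1 + \int_0^x alpha(y) dy; exact solution: exp(x)
N = 2001
x = np.linspace(0.0, 1.0, N)
dx = x[1] - x[0]

def integral_0_to_x(a):
    # Cumulative trapezoidal approximation of \int_0^x a(y) dy
    return np.concatenate(([0.0], np.cumsum((a[1:] + a[:-1]) * dx / 2)))

alpha = np.ones(N)              # alpha_0(x) = h(x) = 1
for _ in range(30):             # alpha_n = h + lambda * (integral of alpha_{n-1})
    alpha = 1.0 + integral_0_to_x(alpha)

print(np.max(np.abs(alpha - np.exp(x))))  # small error
```

Each pass reproduces one more term of the Taylor series of $e^{x}$, which is exactly the “successive” structure of Eqs.(2)-(5).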

# “Proof” of Alfven’s Theorem of Flux Freezing

SOURCE FOR CONTENT: Choudhuri, A.R., 2010. Astrophysics for Physicists. Ch. 8.

In the previous post we saw the consequences of the different regimes of the magnetic Reynolds number, under which either diffusion or advection of the magnetic field dominates. In this post, I shall give a “proof” of Alfven’s theorem of flux freezing. (I hesitate to call it a proof since it lacks the mathematical rigor that one associates with a proof.) Note that throughout this post I will be working under the assumption of a high magnetic Reynolds number.

Alfven’s Theorem of Flux Freezing: Suppose we have a surface $S$ located within a plasma at some initial time $t_{0}$. The magnetic flux linked with the surface $S$ is

$\displaystyle \int_{S}\textbf{B}\cdot d\textbf{S}. (1)$

At some later time $t^{\prime}$, the elements of plasma that made up $S$ at $t_{0}$ will have moved and will constitute some different surface $M$. The magnetic flux linked with $M$ at $t^{\prime}$ is

$\displaystyle \int_{M}\textbf{B}\cdot d\textbf{M}, (2)$

from which we may mathematically state the theorem as

$\displaystyle \int_{S}\textbf{B}\cdot d\textbf{S}=\int_{M}\textbf{B}\cdot d\textbf{M}. (3)$

Since the magnetic field evolves in time according to the induction equation, we may equivalently express Eq.(3) as

$\displaystyle \frac{d}{dt}\int_{S}\textbf{B}\cdot d\textbf{S}=0. (4)$

To confirm this, note that the magnetic flux may change in two ways: (1) through intrinsic variation of the magnetic field strength, or (2) through movement of the surface. It follows that

$\displaystyle \frac{d}{dt}\int_{S}\textbf{B}\cdot d\textbf{S}=\int_{S}\frac{\partial \textbf{B}}{\partial t}\cdot d\textbf{S}+\int_{S}\textbf{B}\cdot \frac{d}{dt}(d\textbf{S}). (5)$

Now, consider again the two surfaces. Suppose that $M$ is displaced some distance relative to $S$, and that this displacement occurs during a time interval $\delta t$, so that $t^{\prime}=t_{0}+\delta t$. The two surfaces, together with the strip swept out by the moving boundary, form a closed surface; an element of the strip has vector area $\delta t\, \textbf{v}\times \delta\textbf{l}$, where $\delta\textbf{l}$ is a line element of the boundary. Since $\nabla \cdot \textbf{B}=0$, the net flux through this closed surface vanishes, and we may write the difference of the surface elements as

$\displaystyle d\textbf{M}-d\textbf{S}-\delta t \oint \textbf{v}\times \delta \textbf{l}=0. (6)$

Recalling the definition of a derivative, we may apply it to the second term on the right-hand side of Eq.(5) to get

$\displaystyle \frac{d}{dt}(d\textbf{S})=\lim_{\delta t\rightarrow 0}\frac{d\textbf{M}-d\textbf{S}}{\delta t}=\oint \textbf{v}\times \delta \textbf{l}. (7)$

Thus the term becomes

$\displaystyle \int_{S}\textbf{B}\cdot \frac{d}{dt}(d\textbf{S})=\int \oint \textbf{B}\cdot (\textbf{v}\times \delta\textbf{l})=\int \oint (\textbf{B}\times \textbf{v})\cdot \delta \textbf{l}. (8)$

The contributions from line elements interior to the surface cancel, leaving a single line integral around the boundary (call it path $C$). Recalling Stokes’ theorem,

$\displaystyle \oint_{\partial \Omega}\textbf{F}\cdot d\textbf{r}=\int\int_{\Omega} (\nabla \times \textbf{F})\cdot d\Omega,$

and applying it to Eq.(8) we arrive at

$\displaystyle \frac{d}{dt}\int_{S}\textbf{B}\cdot d\textbf{S}=\oint_{\partial S}(\textbf{B}\times\textbf{v}) \cdot \delta \textbf{l}=\int\int_{S}\bigg\{\nabla \times (\textbf{B}\times\textbf{v})\bigg\}\cdot d\textbf{S}. (9)$

Recall that we are dealing with a high magnetic Reynolds number; using the corresponding form of the induction equation in Eq.(9), we arrive at

$\displaystyle \frac{d}{dt}\int_{S}\textbf{B}\cdot d\textbf{S}=\int\int_{S}d\textbf{S}\cdot \bigg\{ \frac{\partial \textbf{B}}{\partial t}-\nabla\times (\textbf{v}\times \textbf{B})\bigg\}= 0. (10)$

Thus, this completes the “proof”.

# Deriving the Bessel Function of the First Kind for Zeroth Order

NOTE: I verified the solution using the following text: Boyce, W. and DiPrima, R. Elementary Differential Equations.

In this post, I shall be deriving the Bessel function of the first kind for the zeroth order Bessel differential equation. Bessel’s equation is encountered when solving differential equations in cylindrical coordinates and is of the form

$\displaystyle x^{2}\frac{d^{2}y}{dx^{2}}+x\frac{dy}{dx}+(x^{2}-\nu^{2})y(x)=0, (1)$

where setting $\nu = 0$ gives the zeroth-order Bessel equation. I shall be making use of the Frobenius series assumption

$\displaystyle y(x)=\sum_{j=0}^{\infty}a_{j}x^{j+r}, (2)$

where upon taking the first and second order derivatives gives us

$\displaystyle \frac{dy}{dx}=\sum_{j=0}^{\infty}(j+r)a_{j}x^{j+r-1}, (3)$

and

$\displaystyle \frac{d^{2}y}{dx^{2}}=\sum_{j=0}^{\infty}(j+r)(j+r-1)a_{j}x^{j+r-2}. (4)$

Substituting into Eq.(1) with $\nu=0$, we arrive at

$\displaystyle x^{2}\sum_{j=0}^{\infty}(j+r)(j+r-1)a_{j}x^{j+r-2}+x\sum_{j=0}^{\infty}(j+r)a_{j}x^{j+r-1}+x^{2}\sum_{j=0}^{\infty}a_{j}x^{j+r}=0. (5)$

Distribution and simplification of Eq.(5) yields

$\displaystyle \sum_{j=0}^{\infty}\bigg\{(j+r)(j+r-1)+(j+r)\bigg\}a_{j}x^{j+r}+\sum_{j=0}^{\infty}a_{j}x^{j+r+2}=0. (6)$

If we evaluate the terms in which $j=0$ and $j=1$, we get the following

$\displaystyle a_{0}\bigg\{r(r-1)+r\bigg\}x^{r}+a_{1}\bigg\{(1+r)r+(1+r)\bigg\}x^{r+1}+\sum_{j=2}^{\infty}\bigg\{[(j+r)(j+r-1)+(j+r)]a_{j}+a_{j-2}\bigg\}x^{j+r}=0, (7)$

where the second summation has been re-indexed ($j \rightarrow j-2$) so that every term carries $x^{j+r}$. Consider now the indicial equation (the coefficient of $a_{0}x^{r}$),

$\displaystyle r(r-1)+r=0, (8)$

which upon solving gives $r=r_{1}=r_{2}=0$ (a repeated root). From the summation in Eq.(7), we obtain the recurrence relation

$\displaystyle a_{j}(r)=\frac{-a_{j-2}(r)}{[(j+r)(j+r-1)+(j+r)]}=\frac{-a_{j-2}(r)}{(j+r)^{2}}. (9)$

To determine $J_{0}(x)$ we let $r=0$ in which case the recurrence relation becomes

$\displaystyle a_{j}=\frac{-a_{j-2}}{j^{2}}, (10)$

where $j=2,4,6,...$. Thus we have

$\displaystyle J_{0}(x)=a_{0}x^{0}+a_{1}x+... (11)$

From Eq.(7), the coefficient of $x^{r+1}$ must also vanish; with $r=0$ this forces $a_{1}=0$. The recurrence relation then makes all the odd coefficients vanish: $a_{3}=a_{5}=a_{7}=\cdots=0$. Now let $j=2k$, where $k\in \mathbb{Z}^{+}$; the recurrence relation becomes

$\displaystyle a_{2k}=\frac{-a_{2k-2}}{(2k)^{2}}. (12)$

In general, iterating the recurrence relation gives, for any value of $k$,

$\displaystyle a_{2k}=\frac{(-1)^{k}a_{0}}{2^{2k}(k!)^{2}}. (13)$

Thus our solution for the Bessel function of the first kind of order zero is

$\displaystyle J_{0}(x)=a_{0}\bigg\{1+\sum_{k=1}^{\infty}\frac{(-1)^{k}x^{2k}}{2^{2k}(k!)^{2}}\bigg\}, (14)$

where the conventional normalization takes $a_{0}=1$.
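A quick way to check Eq.(14) is to sum the series numerically with $a_{0}=1$ and compare against tabulated values of $J_{0}$; for instance, $J_{0}(0)=1$ and $J_{0}(2)\approx 0.22389$. A short Python sketch:

```python
import math

def j0_series(x, terms=30):
    # Partial sum of Eq.(14) with a_0 = 1; note that 2^(2k) = 4^k
    total = 1.0
    for k in range(1, terms):
        total += (-1) ** k * x ** (2 * k) / (4 ** k * math.factorial(k) ** 2)
    return total

print(j0_series(0.0))  # 1.0
print(j0_series(2.0))  # ~0.22389, the tabulated value of J0(2)
```

The factorials in the denominator make the series converge rapidly, so thirty terms are far more than enough for moderate $x$.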

# Basics of Tensor Calculus and General Relativity-Vectors and Introduction to Tensors (Part II-Continuation of Vectors)

SOURCE FOR CONTENT: Neuenschwander, D.E., 2015. Tensor Calculus for Physics. Johns Hopkins University Press. Ch.1

In the preceding post of this series, we saw how we may define a vector in the traditional sense. There is another formulation which is the focus of this post. One becomes familiar with this formulation typically in a second course in quantum mechanics or of a similar form in an introductory linear algebra course.

A vector may be written in two forms: a ket vector,

$\displaystyle |A\rangle=\begin{pmatrix} a_{1}\\ a_{2}\\ \vdots \\ a_{N} \end{pmatrix}, (1.1)$

or a bra vector (conjugate vector):

$\displaystyle \langle A|=\begin{pmatrix} a_{1} & a_{2} & \cdots & a_{N} \end{pmatrix}. (1.2)$

Additionally, if the components of $|A\rangle$ are complex, then the conjugate vector takes the form

$\displaystyle \langle A|=\begin{pmatrix} a_{1}^{*} & a_{2}^{*} & \cdots & a_{N}^{*} \end{pmatrix}. (1.3)$

In words, if the vector has complex components, the conjugate (bra) vector is built from the complex conjugates of those components.

We may form the inner product of two vectors of the same dimension $N$ by the following

$\displaystyle \langle A|B\rangle = a_{1}^{*}b_{1}+a_{2}^{*}b_{2}+\cdots+a_{N}^{*}b_{N}=\sum_{i=1}^{N}a_{i}^{*}b_{i}. (2)$

Conversely we may form the outer product as follows

$\displaystyle |A\rangle \langle B|= \begin{pmatrix} a_{1}b_{1}^{*} & a_{1}b_{2}^{*} & \cdots & a_{1}b_{M}^{*}\\ a_{2}b_{1}^{*} & a_{2}b_{2}^{*} & \cdots & a_{2}b_{M}^{*}\\ \vdots & \vdots & \ddots & \vdots \\ a_{N}b_{1}^{*} & a_{N}b_{2}^{*}& \cdots & a_{N}b_{M}^{*}\\ \end{pmatrix}. (3)$

Additionally, one typically defines a vector as a linear combination of basis vectors which we shall express by the following:

$\displaystyle \hat{i}=|1\rangle \equiv \begin{pmatrix} 1 \\ 0 \\ 0 \\ \end{pmatrix},$

$\displaystyle \hat{j} = |2\rangle \equiv \begin{pmatrix} 0 \\ 1 \\ 0 \\ \end{pmatrix},$

$\displaystyle \hat{k}=|3\rangle \equiv \begin{pmatrix} 0 \\ 0 \\ 1 \\ \end{pmatrix},$

where in general the basis vectors satisfy $\langle i|j \rangle = \delta_{ij}$. Therefore we can write any arbitrary vector in the following way

$\displaystyle |A\rangle= a_{1}|1\rangle + a_{2}|2\rangle + ... + a_{n}|n\rangle. (4)$

Moreover, any conjugate vector may be written as

$\displaystyle \langle B|= b_{1}\langle 1| + b_{2}\langle 2| + ... + b_{m}\langle m|. (5)$

Let $\mathcal{P} = |A\rangle \langle B|$ denote the outer product, which may also be written in component form as $\mathcal{P}_{ij}=\langle i|\mathcal{P}|j\rangle$. For a complete orthonormal basis $\{|\mu\rangle\}$, we have the condition that

$\displaystyle \textbf{1} = \sum_{\mu}|\mu\rangle \langle \mu|, (6)$

and

$\displaystyle |A\rangle = \sum_{\mu}|\mu\rangle \langle \mu|A\rangle =\sum_{\mu}A^{\mu}|\mu\rangle, (7)$

wherein $\displaystyle A^{\mu}\equiv \langle \mu|A\rangle$. The first relation expresses the completeness of an orthonormal set of basis vectors, and the second is the expansion of $|A\rangle$ in that basis.
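These operations map directly onto matrix arithmetic, which makes them easy to experiment with. A small numpy sketch (the specific vectors are arbitrary examples of mine): kets are column vectors, bras are their conjugate transposes, and summing $|\mu\rangle \langle \mu|$ over an orthonormal basis reproduces the identity of Eq.(6).

```python
import numpy as np

A = np.array([1.0 + 2.0j, 3.0j, 2.0])   # components of |A>
B = np.array([2.0, 1.0 - 1.0j, 0.0])    # components of |B>

inner = np.vdot(A, B)          # <A|B> = sum_i a_i^* b_i, Eq.(2)
outer = np.outer(A, B.conj())  # |A><B|, Eq.(3)

print(inner)                   # (-1-7j)

# Completeness, Eq.(6): summing |mu><mu| over an orthonormal basis gives 1
basis = np.eye(3, dtype=complex)            # |1>, |2>, |3> as columns
identity = sum(np.outer(basis[:, m], basis[:, m].conj()) for m in range(3))
print(np.allclose(identity, np.eye(3)))     # True
```

Note that `np.vdot` conjugates its first argument, which is exactly the bra-ket convention for the inner product.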

The next post will discuss the transformations of coordinates of vectors using both the matrix formulation and using partial derivatives.