Notes on Mathematical Methods for Physicists, Chapter 2


Chapter 2 Determinants & Matrices

Determinants

We begin the study of matrices by solving linear equations, which will lead us to determinants and matrices. The concept of a determinant and the notation were introduced by the renowned German mathematician and philosopher Gottfried Wilhelm von Leibniz.

Homogeneous Linear Equations

Suppose we have three homogeneous equations in three unknowns $x_1, x_2, x_3$ (or, more generally, $n$ equations with $n$ unknowns):

$$a_1 x_1 + a_2 x_2 + a_3 x_3 = 0, \qquad b_1 x_1 + b_2 x_2 + b_3 x_3 = 0, \qquad c_1 x_1 + c_2 x_2 + c_3 x_3 = 0.$$

The problem is to determine under what conditions there is any solution apart from the trivial one $x_1 = x_2 = x_3 = 0$. Using vector notation $\mathbf{a} = (a_1, a_2, a_3)$, $\mathbf{x} = (x_1, x_2, x_3)$, etc., the system becomes $\mathbf{a} \cdot \mathbf{x} = 0$, $\mathbf{b} \cdot \mathbf{x} = 0$, $\mathbf{c} \cdot \mathbf{x} = 0$. These three vector equations have the geometrical interpretation that $\mathbf{x}$ is orthogonal to $\mathbf{a}$, $\mathbf{b}$, and $\mathbf{c}$.

If the volume spanned by $\mathbf{a}, \mathbf{b}, \mathbf{c}$, given by the determinant (or triple scalar product)

$$D_3 = (\mathbf{a} \times \mathbf{b}) \cdot \mathbf{c} = \begin{vmatrix} a_1 & a_2 & a_3 \\ b_1 & b_2 & b_3 \\ c_1 & c_2 & c_3 \end{vmatrix},$$

is not zero, then only the trivial solution $x_1 = x_2 = x_3 = 0$ exists.

Conversely, if the aforementioned determinant of the coefficients vanishes, then one of the row vectors is a linear combination of the other two, and a nontrivial solution exists. Because the equations are homogeneous, only the ratios of $x_1 : x_2 : x_3$ are relevant, and one finds

$$x_1 : x_2 : x_3 = (a_2 b_3 - a_3 b_2) : (a_3 b_1 - a_1 b_3) : (a_1 b_2 - a_2 b_1).$$

This is Cramer's rule for homogeneous linear equations.

Inhomogeneous Linear Equations

Simple example: consider two inhomogeneous equations in two unknowns,

$$a_1 x_1 + a_2 x_2 = a_3, \qquad b_1 x_1 + b_2 x_2 = b_3.$$

The solution can be written as ratios of determinants, the denominator being the determinant of the coefficients and the numerators being obtained by replacing the relevant column with the right-hand-side values:

$$x_1 = \frac{\begin{vmatrix} a_3 & a_2 \\ b_3 & b_2 \end{vmatrix}}{\begin{vmatrix} a_1 & a_2 \\ b_1 & b_2 \end{vmatrix}}, \qquad x_2 = \frac{\begin{vmatrix} a_1 & a_3 \\ b_1 & b_3 \end{vmatrix}}{\begin{vmatrix} a_1 & a_2 \\ b_1 & b_2 \end{vmatrix}}.$$

This is Cramer's rule for inhomogeneous linear equations.

Definitions

Before defining a determinant, we need to introduce some related concepts and definitions.

When we write two-dimensional ($m \times n$) arrays of items, we identify the item in the $i$th horizontal row and the $j$th vertical column by the index set $ij$; note that the row index is conventionally written first.

Starting from a set of $n$ objects in some reference order (e.g., the number sequence $1, 2, \ldots, n$), we can make a permutation of them to some other order; the total number of distinct permutations that are possible is $n!$ (choose the first object in $n$ ways, then choose the second in $n-1$ ways, etc.).

Every permutation of the $n$ objects can be reached from the reference order by a succession of pairwise interchanges (e.g., $123 \to 312$ can be reached by the successive steps $123 \to 132 \to 312$). Although the number of pairwise interchanges needed for a given permutation depends on the path (compare the above example with $123 \to 213 \to 231 \to 321 \to 312$), for a given permutation the number of interchanges will always be either even or odd. Thus a permutation can be identified as having either even or odd parity.

It is convenient to introduce the Levi-Civita symbol, which for an $n$-object system is denoted by $\epsilon_{ij\cdots k}$, where $\epsilon$ has $n$ subscripts, each of which identifies one of the objects. The symbol is defined to be $+1$ if $ij\cdots k$ is an even permutation of the reference order, $-1$ if it is an odd permutation, and $0$ if any index is repeated.

We now define a determinant of order $n$ to be an $n \times n$ square array of numbers (or functions), with the array conventionally written within vertical bars (not parentheses, braces, or any other type of brackets), as follows:

$$D_n = \begin{vmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ a_{n1} & a_{n2} & \cdots & a_{nn} \end{vmatrix}.$$

The determinant has the value

$$D_n = \sum_{ij\cdots k} \epsilon_{ij\cdots k}\, a_{1i}\, a_{2j} \cdots a_{nk},$$

where the sum runs over the $n!$ permutations $ij\cdots k$ of the column indices $1, 2, \ldots, n$.
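As a quick illustration of this permutation-sum definition, here is a minimal Python/NumPy sketch (my own code; the helper names and the test matrix are not from the text) that evaluates a small determinant directly from the Levi-Civita sum and compares it with numpy.linalg.det.

```python
import numpy as np
from itertools import permutations

def parity(perm):
    """Return +1 for an even permutation of range(n), -1 for an odd one."""
    perm = list(perm)
    sign = 1
    for i in range(len(perm)):
        while perm[i] != i:
            j = perm[i]
            perm[i], perm[j] = perm[j], perm[i]  # one pairwise interchange
            sign = -sign
    return sign

def det_by_permutations(a):
    """Sum epsilon_{ij...k} * a[0,i] * a[1,j] * ... * a[n-1,k] over all permutations."""
    n = a.shape[0]
    return sum(parity(p) * np.prod([a[row, p[row]] for row in range(n)])
               for p in permutations(range(n)))

a = np.array([[1.0, 2.0, 3.0],
              [4.0, 5.0, 6.0],
              [7.0, 8.0, 10.0]])
print(det_by_permutations(a))   # -3.0
print(np.linalg.det(a))         # -3.0 (up to floating-point rounding)
```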

Properties of Determinants

Take determinants of order 3 for example.

A basic property is that interchanging any two rows (or any two columns) of a determinant changes its sign.

It follows that any determinant with two rows equal, or two columns equal, has the value zero. To prove this, interchange the two identical rows or columns; the determinant both remains the same and changes sign, and therefore must have the value zero.

An extension of the above is that if two rows (or two columns) are proportional, the determinant is zero.

The value of a determinant is unchanged if a multiple of one row is added (column by column) to another row, or if a multiple of one column is added (row by row) to another column.

If each element in a row or each element in a column is zero, the determinant has the value zero.

Laplacian Development by Minors

The fact that a determinant of order $n$ expands into $n!$ terms means that it is important to identify efficient means for determinant evaluation. One approach is to expand in terms of minors. The minor corresponding to $a_{ij}$, denoted $M_{ij}$ (or $M_{ij}^{(n)}$ if we need to identify it as coming from a determinant of order $n$), is the determinant (of order $n-1$) produced by striking out row $i$ and column $j$ of the original determinant. We then get the Laplacian development about row $i$:

$$D_n = \sum_{j=1}^{n} (-1)^{i+j}\, a_{ij}\, M_{ij}.$$
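Below is a small recursive sketch of the Laplacian development about the first row (my own code and test matrix), again checked against numpy.linalg.det. Its cost still grows roughly as $n!$, which is exactly why more efficient methods matter.

```python
import numpy as np

def det_by_minors(a):
    """Laplace expansion about row 0: sum_j (-1)**j * a[0, j] * M_0j."""
    n = a.shape[0]
    if n == 1:
        return a[0, 0]
    total = 0.0
    for j in range(n):
        minor = np.delete(np.delete(a, 0, axis=0), j, axis=1)  # strike row 0, column j
        total += (-1) ** j * a[0, j] * det_by_minors(minor)
    return total

a = np.array([[2.0, 1.0, 0.0],
              [1.0, 3.0, 1.0],
              [0.0, 1.0, 2.0]])
print(det_by_minors(a), np.linalg.det(a))   # 8.0  8.0 (approximately)
```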

Linear Equation Systems

For the inhomogeneous equation system

$$\sum_{j=1}^{n} a_{ij}\, x_j = h_i, \qquad i = 1, 2, \ldots, n,$$

we define $D$ to be the determinant of the coefficients $a_{ij}$, and $D_k$ to be the determinant obtained from $D$ by replacing its $k$th column with the right-hand-side quantities $h_1, h_2, \ldots, h_n$.

Then we have

$$x_k = \frac{D_k}{D}.$$

This is Cramer's rule.

If $D$ is nonzero, the above construction of the $x_k$ is definitive and unique, so that there will be exactly one solution to the equation set.
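A minimal sketch of Cramer's rule in Python/NumPy (my own function name and example system); numpy.linalg.det supplies the determinants $D$ and $D_k$. It is mainly of theoretical interest, since elimination is far cheaper for large systems.

```python
import numpy as np

def cramer_solve(a, h):
    """Solve A x = h by Cramer's rule: x_k = D_k / D."""
    d = np.linalg.det(a)
    if abs(d) < 1e-12:
        raise ValueError("determinant is (numerically) zero: no unique solution")
    x = np.empty(len(h))
    for k in range(len(h)):
        ak = a.copy()
        ak[:, k] = h          # replace column k with the right-hand sides
        x[k] = np.linalg.det(ak) / d
    return x

a = np.array([[2.0, 1.0], [1.0, 3.0]])
h = np.array([3.0, 5.0])
print(cramer_solve(a, h))      # [0.8 1.4]
print(np.linalg.solve(a, h))   # same result
```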

Determinants & Linear Dependence

If the coefficients of $n$ linear forms in $n$ variables form a nonzero determinant, the forms are linearly independent; if the determinant of the coefficients is zero, the forms exhibit linear dependence.

Linearly Dependent Equations

Situation 1

All the equations are homogeneous (which means all the right-hand-side quantities $h_i$ are zero). Then one or more of the equations in the set will be equivalent to linear combinations of others, and we will have fewer than $n$ independent equations in our $n$ variables. We can then assign one (or in some cases, more than one) variable an arbitrary value, obtaining the others as functions of the assigned variables. We thus have a manifold (i.e., a parameterized set) of solutions to our equation system.

Situation 2

A second case is where we have (or combine equations so that we have) the same linear form in two equations, but with different values of the right-hand quantities $h_i$. In that case the equations are mutually inconsistent, and the equation system has no solution.

Situation 3

A third, related case is where we have a duplicated linear form, but with a common value of $h$. This also leads to a solution manifold.

Numerical Evaluation

There are many methods for evaluating determinants, especially when using computers. Here we use Gauss elimination, a versatile procedure that can be used for evaluating determinants, for solving linear equation systems, and (as we will see later) even for matrix inversion.

Gauss elimination: use row operations to bring the determinant into a form in which all the entries in the lower triangle of the determinant are zero. Then the only effective contribution is the product of the diagonal elements (with the sign changed for each row interchange, and corrected for any factor taken out of a row).
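Here is a compact sketch of determinant evaluation by Gauss elimination with partial pivoting (my own implementation and test matrix); row interchanges flip the sign, and the result is the signed product of the diagonal (pivot) elements.

```python
import numpy as np

def det_by_gauss(a):
    """Reduce A to upper-triangular form by row operations; det = signed product of pivots."""
    a = a.astype(float).copy()
    n = a.shape[0]
    sign = 1.0
    for k in range(n):
        pivot = np.argmax(np.abs(a[k:, k])) + k      # partial pivoting
        if np.isclose(a[pivot, k], 0.0):
            return 0.0
        if pivot != k:
            a[[k, pivot]] = a[[pivot, k]]            # row interchange flips the sign
            sign = -sign
        for i in range(k + 1, n):
            a[i, k:] -= (a[i, k] / a[k, k]) * a[k, k:]   # leaves the determinant unchanged
    return sign * np.prod(np.diag(a))

a = np.array([[1.0, 2.0, 3.0],
              [4.0, 5.0, 6.0],
              [7.0, 8.0, 10.0]])
print(det_by_gauss(a), np.linalg.det(a))   # -3.0  -3.0 (up to rounding)
```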

Matrices

Matrices are arrays of numbers or functions that obey the laws that define matrix algebra.

Basic Definitions

A matrix is a set of numbers or functions arranged in a square or rectangular array. A matrix with $m$ (horizontal) rows and $n$ (vertical) columns is known as an $m \times n$ matrix. As when we introduced determinants, when row and column indices or dimensions are mentioned together, it is customary to write the row indicator first.

A matrix for which $m = n$ is termed square; one consisting of a single column (an $m \times 1$ matrix) is often called a column vector, while a matrix with only one row (therefore $1 \times n$) is a row vector.

Equality

If $A$ and $B$ are matrices, $A = B$ only if $a_{ij} = b_{ij}$ for all values of $i$ and $j$. A necessary but not sufficient condition for equality is that both matrices have the same dimensions.

Addition , Subtraction

Addition and subtraction are defined only for matrices $A$ and $B$ of the same dimensions, in which case $C = A \pm B$, with $c_{ij} = a_{ij} \pm b_{ij}$ for all values of $i$ and $j$. Addition is commutative ($A + B = B + A$) and also associative ($(A + B) + C = A + (B + C)$). A matrix with all elements zero, called a null matrix or zero matrix, can be written either as $O$ or as a simple zero. Thus for all $A$,

$$A + O = O + A = A.$$

Multiplication (by a Scalar)

Here we have $B = \alpha A$, with $b_{ij} = \alpha\, a_{ij}$ for all values of $i$ and $j$. This operation is commutative, with $\alpha A = A \alpha$.

Note that the definition of multiplication by a scalar causes every element of the matrix to be multiplied by the scalar factor (in contrast to a determinant, where a scalar factor multiplies only a single row or column), so for an $n \times n$ matrix $A$ we have

$$\det(\alpha A) = \alpha^{n} \det(A).$$

Matrix Multiplication (Inner Product)

Matrix multiplication is not an element-by-element operation like addition or multiplication by a scalar. The inner product $C = AB$ of matrices $A$ and $B$ is defined as

$$c_{ij} = \sum_{k} a_{ik}\, b_{kj},$$

which requires that the number of columns of $A$ equal the number of rows of $B$.

This definition causes the $ij$ element of $C$ to be formed from the entire $i$th row of $A$ and the entire $j$th column of $B$. And as you can realize, matrix multiplication is in general not commutative: $AB \neq BA$.

It is useful to define the commutator of $A$ and $B$,

$$[A, B] = AB - BA,$$

which, as stated above, will in many cases be nonzero.

But matrix multiplication is associative, meaning that $(AB)C = A(BC)$.
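A quick NumPy check of these statements, using example matrices of my own choosing: the product depends on the order of the factors, the commutator is nonzero, and multiplication is nevertheless associative.

```python
import numpy as np

A = np.array([[1, 2], [3, 4]])
B = np.array([[0, 1], [1, 0]])
C = np.array([[2, 0], [0, 5]])

print(A @ B)                                   # [[2 1] [4 3]]
print(B @ A)                                   # [[3 4] [1 2]]  -> AB != BA in general
print(A @ B - B @ A)                           # the commutator [A, B], here nonzero
print(np.allclose((A @ B) @ C, A @ (B @ C)))   # True: multiplication is associative
```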

Unit Matrix

By direct matrix multiplication, it is possible to show that a square matrix with elements of value unity on its principal diagonal (the elements with $i = j$), and zeros everywhere else, will leave unchanged any matrix with which it can be multiplied. For example, in three dimensions the unit matrix has the form

$$\mathbb{1}_3 = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix};$$

note that it is not a matrix all of whose elements are unity. Giving such a matrix the name $\mathbb{1}$, we have

$$\mathbb{1} A = A \mathbb{1} = A.$$

Remember that $\mathbb{1}$ must be square, of the dimension required to make the indicated multiplication possible.

The previously introduced null matrices have only zero elements, so it is also obvious that for all $A$,

$$AO = OA = O.$$

Diagonal Matrices

If a matrix has nonzero elements only for $i = j$, it is said to be diagonal.

Matrix Inverse

It will often be the case that, given a square matrix $A$, there will be a square matrix $B$ such that $AB = BA = \mathbb{1}$. A matrix $B$ with this property is called the inverse of $A$ and is given the name $A^{-1}$. If $A^{-1}$ exists, it must be unique.

Every nonzero real (or complex) number has a nonzero multiplicative inverse, often written $a^{-1} = 1/a$. But the corresponding property does not hold for matrices; there exist nonzero matrices that do not have inverses. To demonstrate this, consider two nonzero matrices whose product is a null matrix, for example

$$A = \begin{pmatrix} 1 & 1 \\ 0 & 0 \end{pmatrix}, \qquad B = \begin{pmatrix} 1 & 0 \\ -1 & 0 \end{pmatrix}, \qquad AB = O.$$

If $A$ has an inverse, we can multiply the equation $AB = O$ on the left by $A^{-1}$, thereby obtaining

$$A^{-1} A B = B = A^{-1} O = O.$$

Since we started with a matrix $B$ that was nonzero, this is an inconsistency, and we are forced to conclude that $A^{-1}$ does not exist. A matrix without an inverse is said to be singular, so our conclusion is that $A$ is singular. Note that in our derivation, we had to be careful to multiply both members of $AB = O$ from the left, because multiplication is noncommutative. Alternatively, assuming $B^{-1}$ to exist, we could multiply this equation on the right by $B^{-1}$, obtaining

$$A B B^{-1} = A = O B^{-1} = O.$$

This is inconsistent with the nonzero $A$ with which we started; we conclude that $B$ is also singular. Summarizing, there are nonzero matrices that do not have inverses; they are identified as singular.

The algebraic properties of real and complex numbers (including the existence of inverses for all nonzero numbers) define what mathematicians call a field. The properties we have identified for matrices are different; they form what is called a ring.

A closed, but cumbersome, formula for the inverse of a matrix exists; it expresses the elements of $A^{-1}$ in terms of the determinants that are the minors of $A$. That formula, the derivation of which can be found in several of the Additional Readings, is

$$(A^{-1})_{ij} = \frac{(-1)^{i+j}\, M_{ji}}{\det(A)},$$

where $M_{ji}$ is the minor obtained by striking out row $j$ and column $i$ of $A$.

We describe here a well-known method that is computationally more efficient than the equation above, namely the Gauss-Jordan procedure.

Example Gauss-Jordan Matrix Inversion

The Gauss-Jordan method is based on the fact that there exist matrices $M$ such that the product $MA$ will leave an arbitrary matrix $A$ unchanged, except with (a) one row multiplied by a constant, or (b) one row replaced by the original row minus a multiple of another row, or (c) the interchange of two rows.

By using these transformations, the rows of a matrix can be altered (by matrix multiplication) in the same way as we manipulated the rows of determinants. If $A$ is nonsingular, we apply the same sequence of transformations $M_k \cdots M_2 M_1$ to both sides of the equation $A A^{-1} = \mathbb{1}$; if that sequence reduces $A$ to $\mathbb{1}$, then we get

$$(M_k \cdots M_2 M_1)\, A\, A^{-1} = (M_k \cdots M_2 M_1)\, \mathbb{1} \quad\Longrightarrow\quad A^{-1} = M_k \cdots M_2 M_1,$$

so the row operations that reduce $A$ to the unit matrix, applied to a unit matrix, produce $A^{-1}$.

What we need to do is to find out how to reduce $A$ to $\mathbb{1}$ using the same kind of row operations that we applied to determinants. Here is a concrete example:

Write, side by side, the matrix $A$ and a unit matrix of the same size, and perform the same operations on each until $A$ has been converted to a unit matrix, which means that the original unit matrix will have been changed into $A^{-1}$:

Multiply the rows as necessary to set to unity all elements of the first column of the left matrix,

Subtracting the first row from the second and third rows, we obtain

Divide the second row by its diagonal element, and then subtract the appropriate multiples of it from the first and third rows,

Divide the third row by its diagonal element. Then, as the last step, the appropriate multiples of the third row are subtracted from each of the first two rows. Our final pair is
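Since the intermediate matrices of the worked example are not reproduced here, the following sketch (my own code; the test matrix is arbitrary) carries out the whole Gauss-Jordan procedure: the matrix and a unit matrix are written side by side and row-reduced together until the left block becomes a unit matrix.

```python
import numpy as np

def gauss_jordan_inverse(a):
    """Invert A by writing [A | 1] and row-reducing until the left block is 1."""
    n = a.shape[0]
    aug = np.hstack([a.astype(float), np.eye(n)])   # the matrix and a unit matrix, side by side
    for k in range(n):
        pivot = np.argmax(np.abs(aug[k:, k])) + k   # choose the largest pivot (for stability)
        if np.isclose(aug[pivot, k], 0.0):
            raise ValueError("matrix is singular")
        aug[[k, pivot]] = aug[[pivot, k]]
        aug[k] /= aug[k, k]                         # scale the pivot row so the pivot is 1
        for i in range(n):
            if i != k:
                aug[i] -= aug[i, k] * aug[k]        # clear column k in every other row
    return aug[:, n:]

a = np.array([[2.0, 1.0, 1.0],
              [1.0, 3.0, 2.0],
              [1.0, 0.0, 0.0]])
print(gauss_jordan_inverse(a) @ a)   # approximately the 3x3 unit matrix
```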

Derivatives of Determinants

The formula giving the inverse of a matrix in terms of its minors enables us to write a compact formula for the derivative of a determinant $\det(A)$ when the matrix $A$ has elements that depend on some variable $x$. To carry out the differentiation with respect to the dependence on a particular element $a_{ij}$, we write $\det(A)$ as its expansion in minors about the elements of row $i$, so we have

$$\frac{\partial \det(A)}{\partial a_{ij}} = (-1)^{i+j}\, M_{ij}.$$

Applying now the chain rule to allow for the $x$ dependence of all the elements of $A$, we get

$$\frac{d \det(A)}{dx} = \sum_{i,j} (-1)^{i+j}\, \frac{d a_{ij}}{dx}\, M_{ij}.$$
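A quick numerical sanity check of this formula (my own example, chosen so that $\det A = x^3 - 2$ and the exact derivative $3x^2$ is known):

```python
import numpy as np

def minor(a, i, j):
    """Determinant of A with row i and column j struck out."""
    return np.linalg.det(np.delete(np.delete(a, i, axis=0), j, axis=1))

def ddet_dx(a, dadx):
    """d(det A)/dx = sum_ij (-1)**(i+j) * (da_ij/dx) * M_ij."""
    n = a.shape[0]
    return sum((-1) ** (i + j) * dadx[i, j] * minor(a, i, j)
               for i in range(n) for j in range(n))

x = 1.7
a = np.array([[x, 1.0], [2.0, x**2]])        # det A = x**3 - 2
dadx = np.array([[1.0, 0.0], [0.0, 2*x]])    # elementwise derivative dA/dx
print(ddet_dx(a, dadx))                      # 8.67
print(3 * x**2)                              # 8.67, the exact derivative
```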

 

Systems of Linear Equations

Suppose that $A$ is an $n \times n$ square matrix, while $\mathbf{x}$ and $\mathbf{h}$ are column vectors. Consider the matrix equation $A\mathbf{x} = \mathbf{h}$, which is equivalent to a system of $n$ linear equations. If $A^{-1}$ exists, we can multiply both sides on the left by it to obtain $\mathbf{x} = A^{-1}\mathbf{h}$.

This tells us two things: (a) that if we can evaluate $A^{-1}$, we can compute the solution $\mathbf{x}$; and (b) that the existence of $A^{-1}$ means that this equation system has a unique solution.

This result is important enough to be emphasized: a square matrix $A$ is singular if and only if $\det(A) = 0$.

Determinant Product Theorem

The determinant product theorem states that

$$\det(AB) = \det(A)\, \det(B).$$

The proof of this theorem is dull; I will just skip it.

Note that, since $A A^{-1} = \mathbb{1}$ and $\det(\mathbb{1}) = 1$, the product theorem gives $\det(A^{-1}) = 1/\det(A)$.

Rank of a Matrix

The concept of matrix singularity can be refined by introducing the notion of the rank of a matrix. If the elements of a matrix are viewed as the coefficients of a set of linear forms, a square matrix is assigned a rank equal to the number of linearly independent forms that its elements describe. Thus, an $n \times n$ nonsingular matrix will have rank $n$, while a singular matrix will have a rank less than $n$. The rank provides a measure of the extent of the singularity; if the rank is $n-1$, the matrix describes one linear form that is dependent on the others, etc. A further and more systematic discussion appears in Chapter 6.

Transpose , Adjoint , Trace

Transpose

The transpose of a matrix is the matrix that results from interchanging its row and column indices. This operation corresponds to subjecting the array to reflection about its principal diagonal. If a matrix is not square, its transpose will not even have the same shape as the original matrix. The transpose of $A$, denoted $A^{T}$ (or sometimes $\tilde{A}$), thus has elements

$$(A^{T})_{ij} = a_{ji}.$$

Note that transposition will convert a column vector into a row vector. A matrix that is unchanged by transposition is called symmetric.

Adjoint

The adjoint of a matrix $A$, denoted $A^{\dagger}$, is obtained by both complex conjugating and transposing it. Thus,

$$(A^{\dagger})_{ij} = a_{ji}^{*}.$$

Trace

The trace, a quantity defined for square matrices, is the sum of the elements on the principal diagonal. Thus, for an $n \times n$ matrix $A$,

$$\mathrm{tr}(A) = \sum_{i=1}^{n} a_{ii}.$$

Some properties of the trace:

$$\mathrm{tr}(A + B) = \mathrm{tr}(A) + \mathrm{tr}(B), \qquad \mathrm{tr}(AB) = \mathrm{tr}(BA).$$

The second property holds even if $AB \neq BA$, that is, even when $A$ and $B$ do not commute.

Considering the trace of the matrix product $ABC$, if we group the factors as $(AB)C$, we easily see that

$$\mathrm{tr}(ABC) = \mathrm{tr}\big((AB)\,C\big) = \mathrm{tr}\big(C\,(AB)\big) = \mathrm{tr}(CAB).$$

Repeating this process, we also find $\mathrm{tr}(ABC) = \mathrm{tr}(BCA)$. Note, however, that we cannot equate any of these quantities to $\mathrm{tr}(ACB)$ or to the trace of any other noncyclic permutation of these matrices.
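A NumPy spot-check of the cyclic invariance of the trace, using random matrices of my own choosing:

```python
import numpy as np

rng = np.random.default_rng(0)
A, B, C = (rng.standard_normal((3, 3)) for _ in range(3))

print(np.isclose(np.trace(A @ B @ C), np.trace(B @ C @ A)))   # True (cyclic permutation)
print(np.isclose(np.trace(A @ B @ C), np.trace(C @ A @ B)))   # True (cyclic permutation)
print(np.isclose(np.trace(A @ B @ C), np.trace(A @ C @ B)))   # generally False (noncyclic)
```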

Operations on Matrix Products

There are some properties of the determinant and trace of matrix products:

$$\det(AB) = \det(BA) = \det(A)\,\det(B), \qquad \mathrm{tr}(AB) = \mathrm{tr}(BA),$$

whether or not $A$ and $B$ commute. Together with

$$\mathrm{tr}(A + B) = \mathrm{tr}(A) + \mathrm{tr}(B), \qquad \mathrm{tr}(\alpha A) = \alpha\, \mathrm{tr}(A),$$

the properties above establish that the trace is a linear operator. Since similar relations do not exist for the determinant, it is not a linear operator.

For other operations on matrix products, there are the reversal rules

$$(AB)^{T} = B^{T} A^{T}, \qquad (AB)^{\dagger} = B^{\dagger} A^{\dagger}, \qquad (AB)^{-1} = B^{-1} A^{-1}.$$

Matrix Representation of Vectors

 

I have nothing to say here, because it is easy to understand. (I am going to use column matrices to represent vectors, and may sometimes use boldface symbols as well.)

Orthogonal Matrices

A real matrix is termed orthogonal if its transpose is equal to its inverse. Thus, if $S$ is orthogonal, we may write

$$S^{T} = S^{-1}, \qquad\text{or equivalently}\qquad S^{T} S = S S^{T} = \mathbb{1}.$$

Since, for orthogonal $S$, $\det(S^{T})\,\det(S) = [\det(S)]^{2} = \det(\mathbb{1}) = 1$, we see that $\det(S) = \pm 1$. It is easy to prove that if $S_1$ and $S_2$ are each orthogonal, then so are $S_1 S_2$ and $S_2 S_1$.
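A short NumPy illustration with example matrices of my own: a rotation and a reflection are orthogonal, their determinants are $\pm 1$, and their product is again orthogonal.

```python
import numpy as np

theta = 0.3
S1 = np.array([[np.cos(theta), -np.sin(theta)],
               [np.sin(theta),  np.cos(theta)]])   # a rotation: a typical orthogonal matrix
S2 = np.array([[1.0, 0.0], [0.0, -1.0]])           # a reflection, also orthogonal

print(np.allclose(S1.T @ S1, np.eye(2)))           # True: S^T = S^{-1}
print(np.linalg.det(S1), np.linalg.det(S2))        # 1.0 and -1.0: det = +/- 1
print(np.allclose((S1 @ S2).T @ (S1 @ S2), np.eye(2)))   # True: the product is orthogonal
```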

Unitary Matrices

By definition, a matrix whose adjoint is also its inverse is identified as unitary. One way of expressing this relationship is

$$U^{\dagger} = U^{-1}, \qquad\text{i.e.,}\qquad U^{\dagger} U = U U^{\dagger} = \mathbb{1}.$$

If all the elements of a unitary matrix are real, the matrix is also orthogonal.

Since for any matrix $\det(A^{T}) = \det(A)$, and therefore $\det(A^{\dagger}) = [\det(A)]^{*}$, application of the determinant product theorem to a unitary matrix $U$ leads to

$$\det(U^{\dagger})\,\det(U) = |\det(U)|^{2} = 1, \qquad\text{so}\qquad |\det(U)| = 1.$$

We observe that if $U$ and $V$ are both unitary, then $UV$ and $VU$ will be unitary as well. This is a generalization of the earlier result that the matrix product of two orthogonal matrices is also orthogonal.

Hermitian Matrices

A matrix is identified as Hermitian, or, synonymously, self-adjoint, if it is equal to its adjoint: $H = H^{\dagger}$. To be self-adjoint, a matrix must be square, and in addition, its elements must satisfy

$$h_{ij} = h_{ji}^{*}.$$

We see that the principal diagonal elements must be real.

Note that if two matrices $H_1$ and $H_2$ are Hermitian, it is not necessarily true that $H_1 H_2$ or $H_2 H_1$ is Hermitian; however, the anticommutator $H_1 H_2 + H_2 H_1$, if nonzero, will be Hermitian, and the commutator $[H_1, H_2] = H_1 H_2 - H_2 H_1$, if nonzero, will be anti-Hermitian, meaning that $[H_1, H_2]^{\dagger} = -[H_1, H_2]$.

Extraction of a Row or Column

It is useful to define column vectors $\hat{\mathbf{e}}_i$ that are zero except for the $i$th element, which is unity; examples are

$$\hat{\mathbf{e}}_1 = \begin{pmatrix} 1 \\ 0 \\ 0 \end{pmatrix}, \qquad \hat{\mathbf{e}}_2 = \begin{pmatrix} 0 \\ 1 \\ 0 \end{pmatrix}, \qquad \hat{\mathbf{e}}_3 = \begin{pmatrix} 0 \\ 0 \\ 1 \end{pmatrix}.$$

One use of these vectors is to extract a single column from a matrix. For example, if $A$ is a matrix with three columns, then $A\hat{\mathbf{e}}_2$ is the 2nd column of $A$.

The row vector $\hat{\mathbf{e}}_i^{T}$ can be used in a similar fashion to extract a row from an arbitrary matrix, as in $\hat{\mathbf{e}}_i^{T} A$, which is the $i$th row of $A$.

These unit vectors will also have many uses in other contexts.
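In NumPy this reads, for a small example matrix of my own:

```python
import numpy as np

A = np.arange(1, 10).reshape(3, 3)       # [[1 2 3], [4 5 6], [7 8 9]]
e2 = np.array([[0.0], [1.0], [0.0]])     # column vector e_2 (unity in the 2nd position)

print(A @ e2)      # the 2nd column of A, as a column vector: [[2], [5], [8]]
print(e2.T @ A)    # the 2nd row of A, as a row vector: [[4. 5. 6.]]
```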

Direct Product

A second procedure for multiplying matrices, known as the direct product or Kronecker product, combines an $m \times m$ matrix $A$ and an $n \times n$ matrix $B$ to make the direct product matrix $C = A \otimes B$, which is of dimension $mn \times mn$ and has elements

$$C_{\alpha\beta} = A_{ij}\, B_{kl},$$

with $\alpha = n(i-1) + k$, $\beta = n(j-1) + l$. The direct product uses the indices of the first factor as major and those of the second factor as minor; it is therefore a noncommutative process. It is, however, associative.

Example Direct Products

Some examples of the direct product. If $A$ and $B$ are both $2 \times 2$ matrices, then

$$A \otimes B = \begin{pmatrix} a_{11} B & a_{12} B \\ a_{21} B & a_{22} B \end{pmatrix} = \begin{pmatrix} a_{11} b_{11} & a_{11} b_{12} & a_{12} b_{11} & a_{12} b_{12} \\ a_{11} b_{21} & a_{11} b_{22} & a_{12} b_{21} & a_{12} b_{22} \\ a_{21} b_{11} & a_{21} b_{12} & a_{22} b_{11} & a_{22} b_{12} \\ a_{21} b_{21} & a_{21} b_{22} & a_{22} b_{21} & a_{22} b_{22} \end{pmatrix}.$$

Some properties:

$$(A \otimes B)(C \otimes D) = (AC) \otimes (BD),$$

$$A \otimes (B + C) = A \otimes B + A \otimes C, \qquad (A + B) \otimes C = A \otimes C + B \otimes C.$$

The last two require the matrices being added to be of the same dimensions.
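NumPy provides the direct product as numpy.kron; here is a quick check (my own example matrices) of its noncommutativity and of the mixed-product property quoted above.

```python
import numpy as np

A = np.array([[1, 2], [3, 4]])
B = np.array([[0, 1], [1, 0]])
C = np.array([[2, 0], [1, 1]])
D = np.array([[1, 1], [0, 2]])

print(np.kron(A, B))                                 # the 4x4 direct product A (x) B
print(np.allclose(np.kron(A, B), np.kron(B, A)))     # False: the direct product is not commutative
print(np.allclose(np.kron(A, B) @ np.kron(C, D),
                  np.kron(A @ C, B @ D)))            # True: (A(x)B)(C(x)D) = (AC)(x)(BD)
```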

Example Dirac Matrices

In the original, nonrelativistic formulation of quantum mechanics, agreement between theory and experiment for electronic systems required the introduction of the concept of electron spin (intrinsic angular momentum), both to provide a doubling in the number of available states and to explain phenomena involving the electron's magnetic moment. The concept was introduced in a relatively ad hoc fashion; the electron needed to be given spin quantum number 1/2, and that could be done by assigning it a two-component wave function, with the spin-related properties described using the Pauli matrices

$$\sigma_1 = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}, \qquad \sigma_2 = \begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix}, \qquad \sigma_3 = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}.$$

Of relevance here is the fact that these matrices anticommute (which means $\sigma_i \sigma_j + \sigma_j \sigma_i = 0$ for $i \neq j$) and have squares that are unit matrices:

$$\sigma_i^{2} = \mathbb{1}_2, \qquad i = 1, 2, 3.$$
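These relations are easy to verify numerically; here is a minimal sketch:

```python
import numpy as np

s1 = np.array([[0, 1], [1, 0]], dtype=complex)
s2 = np.array([[0, -1j], [1j, 0]])
s3 = np.array([[1, 0], [0, -1]], dtype=complex)
sigma = [s1, s2, s3]
I2 = np.eye(2)

for i in range(3):
    print(np.allclose(sigma[i] @ sigma[i], I2))        # squares are unit matrices
    for j in range(i + 1, 3):
        anti = sigma[i] @ sigma[j] + sigma[j] @ sigma[i]
        print(np.allclose(anti, 0))                     # distinct Pauli matrices anticommute
```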

In 1927, P. A. M. Dirac developed a relativistic formulation of quantum mechanics applicable to spin-1/2 particles. To do this it was necessary to place the spatial and time variables on an equal footing, and Dirac proceeded by converting the relativistic expression for the kinetic energy to an expression that was first order in both the energy and the momentum (parallel quantities in relativistic mechanics). He started from the relativistic equation for the energy of a free particle,

$$E^{2} = c^{2}\, \mathbf{p} \cdot \mathbf{p} + m^{2} c^{4}.$$

Note that in the passage to quantum mechanics, the quantities $E$ and $\mathbf{p}$ are to be replaced by the differential operators $i\hbar\, \partial/\partial t$ and $-i\hbar\nabla$, and the entire equation is applied to a wave function.

It was desirable to have a formulation that would yield a two-component wave function in the nonrelativistic limit and therefore might be expected to contain the Pauli matrices. Dirac made the observation that a key to the solution of his problem was to exploit the fact that the Pauli matrices could be taken together as a vector

$$\boldsymbol{\sigma} = (\sigma_1, \sigma_2, \sigma_3).$$

Each Pauli matrix should be regarded as a single component of this vector.

Then, $\boldsymbol{\sigma}$ could be combined with the momentum vector $\mathbf{p}$ to yield the identity

$$(\boldsymbol{\sigma} \cdot \mathbf{p})\,(\boldsymbol{\sigma} \cdot \mathbf{p}) = (\mathbf{p} \cdot \mathbf{p})\, \mathbb{1}_2,$$

where $\mathbb{1}_2$ denotes a $2 \times 2$ unit matrix. The identity allows us, at the price of going to $2 \times 2$ matrices, to linearize the quadratic occurrences of $E$ and $\mathbf{p}$. We first write

$$E^{2}\, \mathbb{1}_2 - c^{2}\, (\boldsymbol{\sigma} \cdot \mathbf{p})(\boldsymbol{\sigma} \cdot \mathbf{p}) = m^{2} c^{4}\, \mathbb{1}_2.$$

Then we factor the left-hand side of the equation and apply both sides of the resulting equation to a two-component wave function that we will call $\psi$:

$$(E\, \mathbb{1}_2 + c\, \boldsymbol{\sigma} \cdot \mathbf{p})\,(E\, \mathbb{1}_2 - c\, \boldsymbol{\sigma} \cdot \mathbf{p})\, \psi = m^{2} c^{4}\, \psi.$$

The meaning of this equation becomes clearer if we make the additional definition

$$\psi' = \frac{1}{m c^{2}}\, (E\, \mathbb{1}_2 - c\, \boldsymbol{\sigma} \cdot \mathbf{p})\, \psi.$$

Then,

$$(E\, \mathbb{1}_2 + c\, \boldsymbol{\sigma} \cdot \mathbf{p})\, \psi' = m c^{2}\, \psi, \qquad (E\, \mathbb{1}_2 - c\, \boldsymbol{\sigma} \cdot \mathbf{p})\, \psi = m c^{2}\, \psi'.$$

Define that , , we get

 

Use the direct product notation to condense the equation into a simpler form

 

where $\Psi$ is the four-component wave function built from the two-component wave functions $\psi$ and $\psi'$:

 

and the terms on the left-hand side have the indicated structure because

 

It has become customary to identify the matrices above as $\gamma^{\mu}$ ($\mu = 0, 1, 2, 3$) and refer to them as Dirac matrices, with

 

The matrices resulting from the individual components of $\boldsymbol{\sigma}$ are ($\gamma^{k}$ for $k = 1, 2, 3$)

 

The so-called Dirac equation can be written as

 

To put this matrix equation into the specific form known as the Dirac equation, we multiply both sides of it (on the left) by $\gamma^{0}$. Noting that $(\gamma^{0})^{2} = \mathbb{1}_4$ and giving the resulting matrix combinations new names, we reach

 

The Dirac gamma matrices have properties similar to those of the Pauli matrices; in particular, they satisfy the anticommutation relations

$$\gamma^{\mu} \gamma^{\nu} + \gamma^{\nu} \gamma^{\mu} = 2 g^{\mu\nu}\, \mathbb{1}_4,$$

where $g^{\mu\nu} = \mathrm{diag}(1, -1, -1, -1)$ is the metric of special relativity.
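The anticommutation relations can also be checked numerically. The sketch below builds the gamma matrices as direct products in the standard Dirac representation, $\gamma^0 = \sigma_3 \otimes \mathbb{1}_2$ and $\gamma^k = i\sigma_2 \otimes \sigma_k$; this is one common choice and is assumed here, since the example above does not fix the representation explicitly.

```python
import numpy as np

s1 = np.array([[0, 1], [1, 0]], dtype=complex)
s2 = np.array([[0, -1j], [1j, 0]])
s3 = np.array([[1, 0], [0, -1]], dtype=complex)
I2 = np.eye(2)

# Dirac representation: gamma^0 = sigma_3 (x) 1_2,  gamma^k = i sigma_2 (x) sigma_k
gammas = [np.kron(s3, I2)] + [np.kron(1j * s2, s) for s in (s1, s2, s3)]
g = np.diag([1.0, -1.0, -1.0, -1.0])       # metric tensor g^{mu nu}

for mu in range(4):
    for nu in range(4):
        anti = gammas[mu] @ gammas[nu] + gammas[nu] @ gammas[mu]
        assert np.allclose(anti, 2 * g[mu, nu] * np.eye(4))
print("gamma matrices satisfy {gamma^mu, gamma^nu} = 2 g^{mu nu} 1_4")
```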

 

In the nonrelativistic limit, the four-component Dirac equation for an electron reduces to a two-component equation in which each component satisfies the Schrödinger equation , with the Pauli and Dirac matrices having completely disappeared. In this limit, the Pauli matrices reappear if we add to the Schrödinger equation an additional term arising from the intrinsic magnetic moment of the electron. The passage to the nonrelativistic limit provides justification for the seemingly arbitrary introduction of a two-component wavefunction and use of the Pauli matrices for discussions of spin angular momentum.

A set of matrices that satisfies the same anticommutation properties as the Pauli matrices is called a Clifford algebra. The Pauli matrices generate a Clifford algebra of dimension 4 (the number of linearly independent such matrices), while the Dirac matrices are members of a Clifford algebra of dimension 16. A complete basis for this Clifford algebra with convenient Lorentz transformation properties consists of the 16 matrices

$$\mathbb{1}_4, \qquad \gamma^{\mu}, \qquad \sigma^{\mu\nu} = \frac{i}{2}\,[\gamma^{\mu}, \gamma^{\nu}] \;(\mu < \nu), \qquad \gamma^{5}\gamma^{\mu}, \qquad \gamma^{5} = i\gamma^{0}\gamma^{1}\gamma^{2}\gamma^{3}.$$

Functions of Matrices

Using Taylor expansions, we can define functions of matrices. For example, the exponential of a matrix $A$ is

$$e^{A} = \sum_{n=0}^{\infty} \frac{A^{n}}{n!}.$$

For the Pauli matrices, we have the Euler-like identity: for real $\theta$ and a real unit vector $\hat{\mathbf{n}}$,

$$e^{\,i\theta\,(\hat{\mathbf{n}} \cdot \boldsymbol{\sigma})} = \mathbb{1}_2 \cos\theta + i\,(\hat{\mathbf{n}} \cdot \boldsymbol{\sigma}) \sin\theta,$$

which follows because $(\hat{\mathbf{n}} \cdot \boldsymbol{\sigma})^{2} = \mathbb{1}_2$.

For the Dirac matrices, since $(\gamma^{0})^{2} = \mathbb{1}_4$ while $(\gamma^{k})^{2} = -\mathbb{1}_4$ for $k = 1, 2, 3$, the analogous identities are

$$e^{\,i\theta\gamma^{0}} = \mathbb{1}_4 \cos\theta + i\gamma^{0} \sin\theta, \qquad e^{\,i\theta\gamma^{k}} = \mathbb{1}_4 \cosh\theta + i\gamma^{k} \sinh\theta.$$

Hermitian and unitary matrices are related in that, given a Hermitian matrix $H$, the matrix defined as

$$U = e^{\,iH}$$

is unitary if $H$ is Hermitian, because

$$U^{\dagger} = \big(e^{\,iH}\big)^{\dagger} = e^{-iH^{\dagger}} = e^{-iH} = U^{-1}.$$
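A numerical check with scipy.linalg.expm, using a Hermitian test matrix of my own:

```python
import numpy as np
from scipy.linalg import expm

H = np.array([[1.0, 2.0 - 1.0j],
              [2.0 + 1.0j, -0.5]])              # a Hermitian matrix: H = H^dagger
U = expm(1j * H)

print(np.allclose(H, H.conj().T))               # True: H is Hermitian
print(np.allclose(U.conj().T @ U, np.eye(2)))   # True: U = exp(iH) is unitary
print(abs(np.linalg.det(U)))                    # 1.0 (up to rounding): |det U| = 1
```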

 

Another result is that any Hermitian matrix $H$ satisfies the trace formula

$$\det\!\big(e^{H}\big) = e^{\,\mathrm{tr}(H)}.$$

Finally , we note that the multiplication of two diagonal matrices produces a matrix that is also diagonal , with elements that are the products of the corresponding elements of the multiplicands. This result implies that an arbitrary function of a diagonal matrix will also be diagonal , with diagonal elements that are that function of the diagonal elements of the original matrix.

Example Exponential of a Diagonal Matrix

If a matrix $A$ is diagonal, then its $n$th power is also diagonal, with the original diagonal matrix elements raised to the $n$th power. For example, given a diagonal matrix of the form

$$A = \begin{pmatrix} a_1 & 0 \\ 0 & a_2 \end{pmatrix},$$

then

$$A^{n} = \begin{pmatrix} a_1^{n} & 0 \\ 0 & a_2^{n} \end{pmatrix}.$$

We can now compute

$$e^{A} = \sum_{n=0}^{\infty} \frac{A^{n}}{n!} = \begin{pmatrix} e^{a_1} & 0 \\ 0 & e^{a_2} \end{pmatrix}.$$
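The same result can be reached by brute force, summing the Taylor series numerically (my own sketch, with an arbitrary diagonal matrix):

```python
import numpy as np

A = np.diag([0.5, -1.0])            # a diagonal matrix

# Partial sums of exp(A) = sum_n A**n / n!
E = np.zeros_like(A)
term = np.eye(2)
for n in range(1, 30):
    E = E + term
    term = term @ A / n

print(E)                            # diagonal, with the exponentials of the original diagonal elements
print(np.diag(np.exp(np.diag(A))))  # [[exp(0.5), 0], [0, exp(-1.0)]], the same result
```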

 

A final and important result is the Baker-Hausdorff formula, which is used in the coupled-cluster expansions that yield highly accurate electronic-structure calculations on atoms and molecules:

$$e^{A}\, B\, e^{-A} = B + [A, B] + \frac{1}{2!}\,[A, [A, B]] + \frac{1}{3!}\,[A, [A, [A, B]]] + \cdots.$$
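A numerical check of this expansion (my own sketch): $A$ is given a small norm so that a handful of nested commutators already reproduces the exact similarity transform computed with scipy.linalg.expm.

```python
import numpy as np
from scipy.linalg import expm

def comm(x, y):
    return x @ y - y @ x

rng = np.random.default_rng(1)
A = 0.1 * rng.standard_normal((3, 3))    # small norm, so a few terms suffice
B = rng.standard_normal((3, 3))

exact = expm(A) @ B @ expm(-A)

series = B.copy()
nested = B.copy()
fact = 1.0
for n in range(1, 8):
    nested = comm(A, nested)             # [A, [A, ... [A, B] ...]]
    fact *= n
    series = series + nested / fact

print(np.max(np.abs(exact - series)))    # tiny: the truncated expansion converges quickly
```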

 

Some Important Facts in Exercises

(1)

 

(2)

 

(3)

 

(4) Jacobi identity:

$$[A, [B, C]] + [B, [C, A]] + [C, [A, B]] = 0.$$

(5)

