Special Relativity/Mathematical approach

Revision as of 08:07, 29 April 2022 by imported>ShakespeareFan00 (Stray formatting ? - Lint)


Vectors

Physical effects involve things acting on other things to produce a change of position, tension etc. These effects usually depend upon the strength, angle of contact, separation etc of the interacting things rather than on any absolute reference frame so it is useful to describe the rules that govern the interactions in terms of the relative positions and lengths of the interacting things rather than in terms of any fixed viewpoint or coordinate system. Vectors were introduced in physics to allow such relative descriptions.

The use of vectors in elementary physics often avoids any real understanding of what they are. They are a new concept, as unique as numbers themselves, which have been related to the rest of mathematics and geometry by a series of formulae such as linear combinations, scalar products etc.

Vectors are defined as "directed line segments", which means they are lines drawn in a particular direction. The introduction of time as a geometric entity makes this definition rather archaic; a better definition might be that a vector is information arranged as a continuous succession of points in space and time. Vectors have length and direction, the direction being from earlier to later.

Vectors are represented by lines terminated with arrow symbols to show the direction. A point that moves from the left to the right for about three centimetres can be represented as:

If a vector is represented within a coordinate system it has components along each of the axes of the system. These components do not normally start at the origin of the coordinate system.

The vector represented by the bold arrow has components a, b and c which are lengths on the coordinate axes. If the vector starts at the origin the components become simply the coordinates of the end point of the vector and the vector is known as the position vector of the end point.

Addition of Vectors

If two vectors are connected so that the end point of one is the start of the next the sum of the two vectors is defined as a third vector drawn from the start of the first to the end of the second:

c is the sum of a and b:

c = a + b

If the components of a are $x_1, y_1, z_1$ and the components of b are $x_2, y_2, z_2$ then the components of the sum of the two vectors are $(x_1+x_2)$, $(y_1+y_2)$ and $(z_1+z_2)$. In other words, when vectors are added it is the components that add numerically rather than the lengths of the vectors themselves.
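The component rule can be sketched in a few lines of Python. The helper names `add` and `length` and the sample vectors are illustrative assumptions, not part of the text; the example shows that the components add while the lengths do not.

```python
import math

def add(a, b):
    """Add two vectors component by component."""
    return tuple(x + y for x, y in zip(a, b))

def length(v):
    """Euclidean length of a vector given its components."""
    return math.sqrt(sum(x * x for x in v))

a = (3.0, 0.0, 0.0)
b = (0.0, 4.0, 0.0)
c = add(a, b)

print(c)                      # (3.0, 4.0, 0.0)
print(length(a) + length(b))  # 7.0
print(length(c))              # 5.0
```

The length of the sum (5) is smaller than the sum of the lengths (7) because a and b point in different directions.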

Rules of Vector Addition

1. Commutativity a + b = b + a

2. Associativity (a + b) + c = a + (b + c)

If the zero vector (which has no length) is labelled as 0

3. a + (-a) = 0

4. a + 0 = a

Rules of Vector Multiplication by a Scalar

The discussion of components and vector addition shows that if vector a has components a,b,c then qa has components qa, qb, qc. The meaning of vector multiplication is shown below:

The bottom vector c is added three times which is equivalent to multiplying it by 3.

1. Distributive laws q(a + b) = qa + qb and (q + p)a = qa + pa

2. Associativity q(pa) = qpa

Also 1 a = a

If the rules of vector addition and multiplication by a scalar apply to a set of elements they are said to define a vector space.
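As a minimal sketch, the rules above can be checked for ordinary triples of numbers. The helpers `add` and `scale` and the sample values are assumptions chosen for illustration; a check on a few values is not a proof, only a sanity test.

```python
def add(a, b):
    """Component-wise vector addition."""
    return tuple(x + y for x, y in zip(a, b))

def scale(q, a):
    """Multiply every component of a by the scalar q."""
    return tuple(q * x for x in a)

a, b, c = (1, 2, 3), (4, -1, 0), (2, 2, 5)
q, p = 3, -2
zero = (0, 0, 0)

rules = {
    "a + b = b + a": add(a, b) == add(b, a),
    "(a + b) + c = a + (b + c)": add(add(a, b), c) == add(a, add(b, c)),
    "a + (-a) = 0": add(a, scale(-1, a)) == zero,
    "a + 0 = a": add(a, zero) == a,
    "q(a + b) = qa + qb": scale(q, add(a, b)) == add(scale(q, a), scale(q, b)),
    "(q + p)a = qa + pa": scale(q + p, a) == add(scale(q, a), scale(p, a)),
    "q(pa) = (qp)a": scale(q, scale(p, a)) == scale(q * p, a),
    "1a = a": scale(1, a) == a,
}

for rule, holds in rules.items():
    print(rule, holds)
```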

Linear Combinations and Linear Dependence

An element of the form:

$q_1\mathbf{a_1} + q_2\mathbf{a_2} + q_3\mathbf{a_3} + \dots + q_m\mathbf{a_m}$

is called a linear combination of the vectors.

The set of all linear combinations of a given set of vectors is called the span of those vectors. The word span is used because the scalars (q) can take any value, which means that every point in the subset of the vector space defined by the span can be reached by some linear combination.

Suppose there is a set of vectors $(\mathbf{a_1}, \mathbf{a_2}, \dots, \mathbf{a_m})$. If it is possible to express one of these vectors as a linear combination of the others then the set is said to be linearly dependent. If it is not possible to express any one of the vectors as a linear combination of the others, the set is said to be linearly independent.

In other words, if there are values of the scalars such that:

(1). $\mathbf{a_1} = q_2\mathbf{a_2} + q_3\mathbf{a_3} + \dots + q_m\mathbf{a_m}$

the set is said to be linearly dependent.

There is a way of determining linear dependence. From (1) it can be seen that if $q_1$ is set to minus one then:

$q_1\mathbf{a_1} + q_2\mathbf{a_2} + q_3\mathbf{a_3} + \dots + q_m\mathbf{a_m} = 0$

So in general, if a linear combination in which the scalars are not all zero sums to the zero vector then the set of vectors $(\mathbf{a_1}, \mathbf{a_2}, \dots, \mathbf{a_m})$ is not linearly independent.

If two vectors are linearly dependent then they lie along the same line (wherever a and b lie on the line, scalars can be found to produce a linear combination which is a zero vector). If three vectors are linearly dependent they lie on the same line or on a plane (collinear or coplanar).
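For three vectors in three dimensions, one standard test (not stated in the text, so treat it as a swapped-in technique) is that the vectors are linearly dependent exactly when the determinant of the matrix formed from their components is zero. The sketch below uses integer components so that the comparison with zero is exact; with floating point values a tolerance would be needed. The names `det3` and `is_dependent` are hypothetical.

```python
def det3(u, v, w):
    """Determinant of the 3x3 matrix whose rows are u, v and w."""
    return (u[0] * (v[1] * w[2] - v[2] * w[1])
            - u[1] * (v[0] * w[2] - v[2] * w[0])
            + u[2] * (v[0] * w[1] - v[1] * w[0]))

def is_dependent(u, v, w):
    """True if the three vectors are linearly dependent (coplanar or collinear)."""
    return det3(u, v, w) == 0

a1 = (1, 0, 0)
a2 = (0, 1, 0)
a3 = (2, 3, 0)   # a3 = 2*a1 + 3*a2, so the three vectors are coplanar

print(is_dependent(a1, a2, a3))          # True
print(is_dependent(a1, a2, (0, 0, 1)))   # False

# The combination 2*a1 + 3*a2 + (-1)*a3 sums to the zero vector.
combo = tuple(2 * x + 3 * y - z for x, y, z in zip(a1, a2, a3))
print(combo)                             # (0, 0, 0)
```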

Dimension

If a vector space contains a set of n linearly independent vectors but every set of n+1 vectors is linearly dependent, the space is said to have a dimension of n. Any such set of n linearly independent vectors is said to be a basis of the vector space.

Scalar Product

Also known as the 'dot product' or 'inner product'. The scalar product is a way of removing the problem of angular measures from the relationship between vectors and, as Weyl put it, a way of comparing the lengths of vectors that are arbitrarily inclined to each other.

Consider two vectors with a common origin:

The projection of 𝐚 on the adjacent side is:

P=|𝐚|cosθ

Where |𝐚| is the length of 𝐚.

The scalar product is defined as:

(2) 𝐚.𝐛=|𝐚||𝐛|cosθ

Notice that cosθ is zero if 𝐚 and 𝐛 are perpendicular. This means that if the scalar product is zero the vectors composing it are orthogonal (perpendicular to each other).

(2) also allows cosθ to be defined as:

cosθ=𝐚.𝐛/(|𝐚||𝐛|)

The definition of the scalar product also allows a definition of the length of a vector in terms of the concept of a vector itself. The scalar product of a vector with itself is:

𝐚.𝐚=|𝐚||𝐚|cos0

cos 0 (the cosine of zero) is one so:

$\mathbf{a}.\mathbf{a} = a^2$

which is our first direct relationship between vectors and scalars. This can be expressed as:

(3) $a = \sqrt{\mathbf{a}.\mathbf{a}}$

where a is the length of $\mathbf{a}$.

Properties:

1. Linearity [G𝐚+H𝐛].𝐜=G𝐚.𝐜+H𝐛.𝐜

2. symmetry 𝐚.𝐛=𝐛.𝐚

3. Positive definiteness 𝐚.𝐚 is greater than or equal to 0

4. Distributivity for vector addition (𝐚+𝐛).𝐜=𝐚.𝐜+𝐛.𝐜

5. Schwarz inequality $|\mathbf{a}.\mathbf{b}| \le ab$

6. Parallelogram equality $|\mathbf{a}+\mathbf{b}|^2 + |\mathbf{a}-\mathbf{b}|^2 = 2(|\mathbf{a}|^2 + |\mathbf{b}|^2)$

From the point of view of vector physics the most important property of the scalar product is the expression of the scalar product in terms of coordinates.

7. $\mathbf{a}.\mathbf{b} = a_1b_1 + a_2b_2 + a_3b_3$

This gives us the length of a vector in terms of coordinates (Pythagoras' theorem) from:

8. $\mathbf{a}.\mathbf{a} = a^2 = a_1^2 + a_2^2 + a_3^2$
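A small numerical check of the Schwarz inequality and the parallelogram equality, using the coordinate form of the scalar product. The function `dot` and the sample vectors are assumptions made for illustration.

```python
import math

def dot(a, b):
    """Scalar product in terms of components: a1*b1 + a2*b2 + a3*b3."""
    return sum(x * y for x, y in zip(a, b))

a = (1, 2, 2)
b = (3, 0, -4)

len_a = math.sqrt(dot(a, a))   # 3.0
len_b = math.sqrt(dot(b, b))   # 5.0

# Schwarz inequality: |a.b| <= ab
print(abs(dot(a, b)) <= len_a * len_b)   # True

# Parallelogram equality: |a+b|^2 + |a-b|^2 = 2(|a|^2 + |b|^2)
s = tuple(x + y for x, y in zip(a, b))
d = tuple(x - y for x, y in zip(a, b))
print(dot(s, s) + dot(d, d), 2 * (dot(a, a) + dot(b, b)))   # 68 68
```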

The derivation of 7 is:

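$\mathbf{a} = a_1\mathbf{i} + a_2\mathbf{j} + a_3\mathbf{k}$

where $\mathbf{i}, \mathbf{j}, \mathbf{k}$ are unit vectors along the coordinate axes. From (4)

$\mathbf{a}.\mathbf{b} = (a_1\mathbf{i} + a_2\mathbf{j} + a_3\mathbf{k}).\mathbf{b} = a_1\mathbf{i}.\mathbf{b} + a_2\mathbf{j}.\mathbf{b} + a_3\mathbf{k}.\mathbf{b}$

but $\mathbf{b} = b_1\mathbf{i} + b_2\mathbf{j} + b_3\mathbf{k}$

so:

$\mathbf{a}.\mathbf{b} = b_1a_1\mathbf{i}.\mathbf{i} + b_2a_1\mathbf{i}.\mathbf{j} + b_3a_1\mathbf{i}.\mathbf{k} + b_1a_2\mathbf{j}.\mathbf{i} + b_2a_2\mathbf{j}.\mathbf{j} + b_3a_2\mathbf{j}.\mathbf{k} + b_1a_3\mathbf{k}.\mathbf{i} + b_2a_3\mathbf{k}.\mathbf{j} + b_3a_3\mathbf{k}.\mathbf{k}$

$\mathbf{i}.\mathbf{j}$, $\mathbf{i}.\mathbf{k}$, $\mathbf{j}.\mathbf{k}$, etc. are all zero because the vectors are orthogonal, also $\mathbf{i}.\mathbf{i}$, $\mathbf{j}.\mathbf{j}$ and $\mathbf{k}.\mathbf{k}$ are all one (these are unit vectors defined to be 1 unit in length).

Using these results:

$\mathbf{a}.\mathbf{b} = a_1b_1 + a_2b_2 + a_3b_3$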
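Combining the coordinate form with definition (2) gives the angle between two vectors directly from their components. The following is a sketch under that assumption; `angle_degrees` is a hypothetical helper name.

```python
import math

def dot(a, b):
    """Scalar product from components."""
    return sum(x * y for x, y in zip(a, b))

def length(a):
    """Length of a vector, a = sqrt(a.a)."""
    return math.sqrt(dot(a, a))

def angle_degrees(a, b):
    """Angle between a and b from cos(theta) = a.b / (|a||b|)."""
    return math.degrees(math.acos(dot(a, b) / (length(a) * length(b))))

a = (1.0, 0.0, 0.0)
b = (1.0, 1.0, 0.0)

print(round(angle_degrees(a, b), 6))   # 45.0
print(dot((1, 0, 0), (0, 1, 0)))       # 0, so these two vectors are orthogonal
```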

Matrices

Matrices are sets of numbers arranged in a rectangular array. They are especially important in linear algebra because they can be used to represent the elements of linear equations.

11a + 2b = c

5a + 7b = d

The constants in the equation above can be represented as a matrix:

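$\mathbf{A} = \begin{bmatrix} 11 & 2 \\ 5 & 7 \end{bmatrix}$

The elements of matrices are usually denoted symbolically using lower case letters:

$\mathbf{A} = \begin{bmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{bmatrix}$

Matrices are said to be equal if all of the corresponding elements are equal.

Eg: if $a_{ij} = b_{ij}$

Then $\mathbf{A} = \mathbf{B}$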

Matrix Addition

Matrices are added by adding the individual elements of one matrix to the corresponding elements of the other matrix.

$c_{ij} = a_{ij} + b_{ij}$

or 𝐂=𝐀+𝐁

Matrix addition has the following properties:

1. Commutativity 𝐀+𝐁=𝐁+𝐀

2. Associativity (𝐀+𝐁)+𝐂=𝐀+(𝐁+𝐂)

and

3. $\mathbf{A} + (-\mathbf{A}) = 0$

4. 𝐀+0=𝐀

From matrix addition it can be seen that the product of a matrix 𝐀 and a number p is simply p𝐀 where every element of the matrix is multiplied individually by p.
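Element-wise addition and multiplication by a number can be sketched with nested lists. The function names and the second matrix B are illustrative assumptions; A is the coefficient matrix from the example above.

```python
def mat_add(A, B):
    """Add two matrices of the same shape element by element."""
    return [[a + b for a, b in zip(row_a, row_b)] for row_a, row_b in zip(A, B)]

def mat_scale(p, A):
    """Multiply every element of A by the number p."""
    return [[p * a for a in row] for row in A]

A = [[11, 2], [5, 7]]
B = [[1, 0], [-5, 3]]

print(mat_add(A, B))      # [[12, 2], [0, 10]]
print(mat_scale(3, A))    # [[33, 6], [15, 21]]

# Adding A to itself three times gives the same result as multiplying by 3.
print(mat_add(mat_add(A, A), A) == mat_scale(3, A))   # True
```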

Transpose of a Matrix

A matrix is transposed when the rows and columns are interchanged:

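$\mathbf{A} = \begin{bmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{bmatrix}$

$\mathbf{A^T} = \begin{bmatrix} a_{11} & a_{21} & a_{31} \\ a_{12} & a_{22} & a_{32} \\ a_{13} & a_{23} & a_{33} \end{bmatrix}$

Notice that the principal diagonal elements stay the same after transposition.

A matrix is symmetric if it is equal to its transpose, eg: $a_{kj} = a_{jk}$.

It is skew symmetric if $\mathbf{A^T} = -\mathbf{A}$, eg: $a_{kj} = -a_{jk}$. The principal diagonal of a skew symmetric matrix is composed of elements that are zero.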
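A minimal sketch of transposition and of the two symmetry tests, where skew symmetry means that the transpose equals the negative of the matrix. The function names and sample matrices are assumptions for illustration.

```python
def transpose(A):
    """Interchange the rows and columns of A."""
    return [list(col) for col in zip(*A)]

def is_symmetric(A):
    """True if A equals its transpose."""
    return transpose(A) == A

def is_skew_symmetric(A):
    """True if the transpose of A equals -A."""
    return transpose(A) == [[-x for x in row] for row in A]

A = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
print(transpose(A))   # [[1, 4, 7], [2, 5, 8], [3, 6, 9]]

S = [[1, 2], [2, 3]]      # symmetric
K = [[0, 2], [-2, 0]]     # skew symmetric, zeros on the principal diagonal
print(is_symmetric(S), is_skew_symmetric(K))   # True True
```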

Other Types of Matrix

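Diagonal matrix: all elements above and below the principal diagonal are zero.

$\begin{bmatrix} 4 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 2 \end{bmatrix}$

Unit matrix: denoted by I, is a diagonal matrix where all elements of the principal diagonal are 1.

$\begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}$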

Matrix Multiplication

Matrix multiplication is defined in terms of the problem of determining the coefficients in linear transformations.

Consider a set of linear transformations between 2 coordinate systems that share a common origin and are related to each other by a rotation of the coordinate axes.


Two Coordinate Systems Rotated Relative to Each Other

If there are 3 coordinate systems, x, y, and z these can be transformed from one to another:

$x_1 = a_{11}y_1 + a_{12}y_2$

$x_2 = a_{21}y_1 + a_{22}y_2$

$y_1 = b_{11}z_1 + b_{12}z_2$

$y_2 = b_{21}z_1 + b_{22}z_2$

$x_1 = c_{11}z_1 + c_{12}z_2$

$x_2 = c_{21}z_1 + c_{22}z_2$


By substitution:

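$x_1 = a_{11}(b_{11}z_1 + b_{12}z_2) + a_{12}(b_{21}z_1 + b_{22}z_2)$

$x_2 = a_{21}(b_{11}z_1 + b_{12}z_2) + a_{22}(b_{21}z_1 + b_{22}z_2)$

$x_1 = (a_{11}b_{11} + a_{12}b_{21})z_1 + (a_{11}b_{12} + a_{12}b_{22})z_2$

$x_2 = (a_{21}b_{11} + a_{22}b_{21})z_1 + (a_{21}b_{12} + a_{22}b_{22})z_2$

Therefore:

$c_{11} = (a_{11}b_{11} + a_{12}b_{21})$

$c_{12} = (a_{11}b_{12} + a_{12}b_{22})$

$c_{21} = (a_{21}b_{11} + a_{22}b_{21})$

$c_{22} = (a_{21}b_{12} + a_{22}b_{22})$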


The coefficient matrices are:

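$\mathbf{A} = \begin{bmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{bmatrix}$

$\mathbf{B} = \begin{bmatrix} b_{11} & b_{12} \\ b_{21} & b_{22} \end{bmatrix}$

$\mathbf{C} = \begin{bmatrix} c_{11} & c_{12} \\ c_{21} & c_{22} \end{bmatrix}$

From the linear transformation the product of A and B is defined as:

$\mathbf{C} = \mathbf{AB} = \begin{bmatrix} (a_{11}b_{11} + a_{12}b_{21}) & (a_{11}b_{12} + a_{12}b_{22}) \\ (a_{21}b_{11} + a_{22}b_{21}) & (a_{21}b_{12} + a_{22}b_{22}) \end{bmatrix}$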

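In the discussion of scalar products it was shown that, for a plane, the scalar product is calculated as $\mathbf{a}.\mathbf{b} = a_1b_1 + a_2b_2$, where $a_1, a_2$ and $b_1, b_2$ are the coordinates of the vectors a and b.

Now mathematicians define the rows and columns of a matrix as vectors:

A Column vector is $\mathbf{b} = \begin{bmatrix} b_{11} \\ b_{21} \end{bmatrix}$

And a Row vector $\mathbf{a} = \begin{bmatrix} a_{11} & a_{12} \end{bmatrix}$

Matrices can be described as vectors eg:

$\mathbf{A} = \begin{bmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{bmatrix} = \begin{bmatrix} \mathbf{a_1} \\ \mathbf{a_2} \end{bmatrix}$

and

$\mathbf{B} = \begin{bmatrix} b_{11} & b_{12} \\ b_{21} & b_{22} \end{bmatrix} = \begin{bmatrix} \mathbf{b_1} & \mathbf{b_2} \end{bmatrix}$

Matrix multiplication is then defined as the scalar products of the vectors so that:

$\mathbf{C} = \begin{bmatrix} \mathbf{a_1}.\mathbf{b_1} & \mathbf{a_1}.\mathbf{b_2} \\ \mathbf{a_2}.\mathbf{b_1} & \mathbf{a_2}.\mathbf{b_2} \end{bmatrix}$

From the definition of the scalar product $\mathbf{a_1}.\mathbf{b_1} = a_{11}b_{11} + a_{12}b_{21}$ etc.

In the general case:

$\mathbf{C} = \begin{bmatrix} \mathbf{a_1}.\mathbf{b_1} & \mathbf{a_1}.\mathbf{b_2} & \dots & \mathbf{a_1}.\mathbf{b_n} \\ \mathbf{a_2}.\mathbf{b_1} & \mathbf{a_2}.\mathbf{b_2} & \dots & \mathbf{a_2}.\mathbf{b_n} \\ \vdots & \vdots & & \vdots \\ \mathbf{a_m}.\mathbf{b_1} & \mathbf{a_m}.\mathbf{b_2} & \dots & \mathbf{a_m}.\mathbf{b_n} \end{bmatrix}$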

This is described as the multiplication of rows into columns (eg: row vectors into column vectors). The first matrix must have the same number of columns as there are rows in the second matrix or the multiplication is undefined.

After matrix multiplication the product matrix has the same number of rows as the first matrix and columns as the second matrix:

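$\begin{bmatrix} 1 & 3 & 4 \\ 6 & 3 & 2 \end{bmatrix}$ times $\begin{bmatrix} 2 \\ 3 \\ 7 \end{bmatrix}$ has 2 rows and 1 column: $\begin{bmatrix} 39 \\ 35 \end{bmatrix}$

ie: first row is 1*2+3*3+4*7=39 and second row is 6*2+3*3+2*7=35

$\mathbf{AB} = \begin{bmatrix} 1 & 3 & 2 \\ 2 & -1 & 3 \end{bmatrix}$ times $\begin{bmatrix} 2 & 3 & 4 \\ 3 & 2 & 1 \\ 5 & 1 & 3 \end{bmatrix}$ has 2 rows and 3 columns: $\begin{bmatrix} 21 & 11 & 13 \\ 16 & 7 & 16 \end{bmatrix}$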

Notice that 𝐁𝐀 cannot be determined because the number of columns in the first matrix must equal the number of rows in the second matrix to perform matrix multiplication.
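The rows-into-columns rule can be sketched as follows, reproducing the two worked products above. The function `matmul` is an illustrative helper. The entry -1 in the second row of A is an assumption: it is the value for which the stated product row 16, 7, 16 comes out correctly.

```python
def matmul(A, B):
    """Multiply rows of A into columns of B; shapes must be compatible."""
    if len(A[0]) != len(B):
        raise ValueError("columns of first matrix must equal rows of second")
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]

# A 2x3 matrix times a 3x1 column gives a 2x1 column.
print(matmul([[1, 3, 4], [6, 3, 2]], [[2], [3], [7]]))   # [[39], [35]]

A = [[1, 3, 2], [2, -1, 3]]               # 2 rows, 3 columns
B = [[2, 3, 4], [3, 2, 1], [5, 1, 3]]     # 3 rows, 3 columns
print(matmul(A, B))                       # [[21, 11, 13], [16, 7, 16]]

# BA is undefined: B has 3 columns but A has only 2 rows.
try:
    matmul(B, A)
except ValueError as err:
    print("BA is undefined:", err)
```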


Properties of Matrix Multiplication

1. Not commutative $\mathbf{AB} \neq \mathbf{BA}$

2. Associative 𝐀(𝐁𝐂)=(𝐀𝐁)𝐂

(k𝐀)𝐁=k(𝐀𝐁)=𝐀(k𝐁)

3. Distributive for matrix addition

(𝐀+𝐁)𝐂=𝐀𝐂+𝐁𝐂

matrix multiplication is not commutative so 𝐂(𝐀+𝐁)=𝐂𝐀+𝐂𝐁 is a separate case.

4. The cancellation law is not always true:

𝐀𝐁=0 does not mean 𝐀=0 or 𝐁=0

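There is a case where matrix multiplication is commutative. This involves the scalar matrix, where the values of the principal diagonal are all equal and all other elements are zero. Eg:

$\mathbf{S} = \begin{bmatrix} k & 0 & 0 \\ 0 & k & 0 \\ 0 & 0 & k \end{bmatrix}$

In this case $\mathbf{AS} = \mathbf{SA} = k\mathbf{A}$. If the scalar matrix is the unit matrix: $\mathbf{AI} = \mathbf{IA} = \mathbf{A}$.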
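These properties can be illustrated with small 2x2 matrices. The matrices below are arbitrary choices made for the sketch, and `matmul` is a hypothetical helper.

```python
def matmul(A, B):
    """Multiply rows of A into columns of B."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]

A = [[1, 2], [3, 4]]
B = [[0, 1], [1, 0]]
print(matmul(A, B))   # [[2, 1], [4, 3]]
print(matmul(B, A))   # [[3, 4], [1, 2]]  so AB and BA differ

# No cancellation law: neither P nor Q is zero, but PQ is the zero matrix.
P = [[1, 0], [0, 0]]
Q = [[0, 0], [0, 1]]
print(matmul(P, Q))   # [[0, 0], [0, 0]]

# A scalar matrix commutes with A and multiplies every element by k.
k = 5
S = [[k, 0], [0, k]]
print(matmul(A, S) == matmul(S, A) == [[k * a for a in row] for row in A])   # True
```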

Linear Transformations

A simple linear transformation such as:

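$x_1 = a_{11}y_1 + a_{12}y_2$

$x_2 = a_{21}y_1 + a_{22}y_2$

can be expressed as:

$\mathbf{x} = \mathbf{A}\mathbf{y}$

eg:

$\begin{bmatrix} x_1 \\ x_2 \end{bmatrix} = \begin{bmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{bmatrix} \begin{bmatrix} y_1 \\ y_2 \end{bmatrix}$

and $y_1 = b_{11}z_1 + b_{12}z_2$

$y_2 = b_{21}z_1 + b_{22}z_2$

as: $\mathbf{y} = \mathbf{B}\mathbf{z}$

Using the associative law:

$\mathbf{x} = \mathbf{A}(\mathbf{B}\mathbf{z}) = \mathbf{A}\mathbf{B}\mathbf{z} = \mathbf{C}\mathbf{z}$

and so:

$\mathbf{C} = \mathbf{AB} = \begin{bmatrix} (a_{11}b_{11} + a_{12}b_{21}) & (a_{11}b_{12} + a_{12}b_{22}) \\ (a_{21}b_{11} + a_{22}b_{21}) & (a_{21}b_{12} + a_{22}b_{22}) \end{bmatrix}$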

as before.
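A short numerical sketch of the same argument: applying B and then A to z gives the same result as applying the single matrix C = AB. The numbers are arbitrary assumptions, and `matmul` is a hypothetical helper.

```python
def matmul(A, B):
    """Multiply rows of A into columns of B."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]

A = [[1, 2], [3, 4]]
B = [[0, 1], [1, 1]]
z = [[5], [6]]            # coordinates written as a column matrix

y = matmul(B, z)          # y = Bz
x_two_steps = matmul(A, y)            # x = A(Bz)
C = matmul(A, B)                      # C = AB
x_one_step = matmul(C, z)             # x = Cz

print(C)                  # [[2, 3], [4, 7]]
print(x_two_steps)        # [[28], [62]]
print(x_one_step)         # [[28], [62]]
```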

Indicial Notation

Consider a simple rotation of coordinates:

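$x^{\mu}$ is defined as $x_1$, $x_2$

$x^{\nu}$ is defined as $x_1'$, $x_2'$

The scalar product can be written as:

$\mathbf{s}.\mathbf{s} = g_{\mu\nu}x^{\mu}x^{\nu}$

Where:

$g_{\mu\nu} = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}$

and is called the metric tensor for this 2D space.

$\mathbf{s}.\mathbf{s} = g_{11}x_1x_1' + g_{12}x_1x_2' + g_{21}x_2x_1' + g_{22}x_2x_2'$

Now, $g_{11}=1$, $g_{12}=0$, $g_{21}=0$, $g_{22}=1$ so:

$\mathbf{s}.\mathbf{s} = x_1x_1' + x_2x_2'$

If there is no rotation of coordinates the scalar product is:

$\mathbf{s}.\mathbf{s} = x_1x_1 + x_2x_2$

$s^2 = x_1^2 + x_2^2$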

Which is Pythagoras' theorem.
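The double sum over the two indices can be written out explicitly. This is a minimal sketch in which `metric_product` is a hypothetical helper and the metric is the 2D unit matrix used above.

```python
def metric_product(g, x, y):
    """Sum g[mu][nu] * x[mu] * y[nu] over both indices."""
    n = len(x)
    return sum(g[mu][nu] * x[mu] * y[nu]
               for mu in range(n) for nu in range(n))

g = [[1, 0], [0, 1]]   # metric tensor of the 2D Euclidean plane
x = (3, 4)

# With no rotation the product of x with itself is x1^2 + x2^2.
print(metric_product(g, x, x))   # 25
```

The off-diagonal terms contribute nothing because g12 and g21 are zero, which is why only the squared terms survive.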

The Summation Convention

Indexes that appear as both subscripts and superscripts are summed over.

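$g_{\mu\nu}x^{\mu}x^{\nu} = g_{11}x_1x_1' + g_{12}x_1x_2' + g_{21}x_2x_1' + g_{22}x_2x_2'$

by promoting ν to a superscript it is taken out of the summation, ie:

$g_{\mu}{}^{\nu}x^{\mu}x^{\nu} = g_{1}{}^{\nu}x_1x_{\nu}' + g_{2}{}^{\nu}x_2x_{\nu}'$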

Matrix Multiplication in Indicial Notation

Consider:

Columns times rows:

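$\begin{bmatrix} x_1 \\ x_2 \end{bmatrix}$ times $\begin{bmatrix} y_1 & y_2 \end{bmatrix}$ = $\begin{bmatrix} x_1y_1 & x_1y_2 \\ x_2y_1 & x_2y_2 \end{bmatrix}$

Matrix product $\mathbf{XY} = x_iy_j$ Where i = 1, 2 j = 1, 2

There being no summation the indexes are both subscripts.

Rows times columns: $\begin{bmatrix} x_1 & x_2 \end{bmatrix}$ times $\begin{bmatrix} y_1 \\ y_2 \end{bmatrix}$ = $\begin{bmatrix} x_1y_1 + x_2y_2 \end{bmatrix}$

Matrix product $\mathbf{XY} = \delta_{ij}x^iy^j$

Where $\delta_{ij}$ is known as the Kronecker delta and has the value 0 when $i \neq j$ and 1 when $i = j$. It is the indicial equivalent of the unit matrix:

$\begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}$

There being summation one value of i is a subscript and the other a superscript.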

A matrix in general can be specified by any of:

$M_{ij}$, $M^{ij}$, $M^{j}{}_{i}$, $M_{i}{}^{j}$ depending on which subscript or superscript is being summed over.
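The difference between the two products in this section can be sketched directly: columns times rows keeps every pair of indices, while rows times columns sums over the pairs selected by the Kronecker delta. The sample values and the helper `delta` are assumptions for illustration.

```python
x = (1, 2)
y = (3, 4)

# Columns times rows: no summation, one element x_i * y_j for each pair (i, j).
outer = [[xi * yj for yj in y] for xi in x]
print(outer)   # [[3, 4], [6, 8]]

def delta(i, j):
    """Kronecker delta: 1 when i equals j, otherwise 0."""
    return 1 if i == j else 0

# Rows times columns: sum delta_ij * x_i * y_j over both indices.
inner = sum(delta(i, j) * x[i] * y[j] for i in range(2) for j in range(2))
print(inner)   # 11
```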

Vectors in Indicial Notation

A vector can be expressed as a sum of basis vectors.

$\mathbf{x} = a_1\mathbf{e}_1 + a_2\mathbf{e}_2 + a_3\mathbf{e}_3$

In indicial notation this is: $\mathbf{x} = a^i\mathbf{e}_i$

Linear Transformations in indicial notation

Consider 𝐱=𝐀𝐲 where 𝐀 is a coefficient matrix and 𝐱 and 𝐲 are coordinate matrices.

In indicial notation this is:

$x^{\mu} = A^{\mu}{}_{\nu}x^{\nu}$

which becomes:

$x_1 = a_{11}x_1' + a_{12}x_2' + a_{13}x_3'$

$x_2 = a_{21}x_1' + a_{22}x_2' + a_{23}x_3'$

$x_3 = a_{31}x_1' + a_{32}x_2' + a_{33}x_3'$

The Scalar Product in indicial notation

In indicial notation the scalar product is:

$\mathbf{x}.\mathbf{y} = \delta_{ij}x^iy^j$

Analysis of curved surfaces and transformations

It became apparent at the start of the nineteenth century that issues such as Euclid's parallel postulate required the development of a new type of geometry that could deal with curved surfaces and real and imaginary planes. At the foundation of this approach is Gauss's analysis of curved surfaces which allows us to work with a variety of coordinate systems and displacements on any type of surface.

Elementary geometric analysis is useful as an introduction to Special Relativity because it suggests the physical meaning of the coefficients that appear in coordinate transformations.

Suppose there is a line on a surface. The length of this line can be expressed in terms of a coordinate system. A short length of line Δs in a two dimensional space may be expressed in terms of Pythagoras' theorem as:

$\Delta s^2 = \Delta x^2 + \Delta y^2$

Suppose there is another coordinate system on the surface with two axes: $x_1$, $x_2$. How can the length of the line be expressed in terms of these coordinates? Gauss tackled this problem and his analysis is quite straightforward for two coordinate axes:

Figure 1:

It is possible to use elementary differential geometry to describe displacements along the plane in terms of displacements on the curved surfaces:

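$\Delta Y = \Delta x_1\frac{\delta Y}{\delta x_1} + \Delta x_2\frac{\delta Y}{\delta x_2}$

$\Delta Z = \Delta x_1\frac{\delta Z}{\delta x_1} + \Delta x_2\frac{\delta Z}{\delta x_2}$

The displacement of a short line is then assumed to be given by a formula, called a metric, such as Pythagoras' theorem

$\Delta S^2 = \Delta Y^2 + \Delta Z^2$

The values of $\Delta Y$ and $\Delta Z$ can then be substituted into this metric:

$\Delta S^2 = \left(\Delta x_1\frac{\delta Y}{\delta x_1} + \Delta x_2\frac{\delta Y}{\delta x_2}\right)^2 + \left(\Delta x_1\frac{\delta Z}{\delta x_1} + \Delta x_2\frac{\delta Z}{\delta x_2}\right)^2$

Which, when expanded, gives the following:

$\Delta S^2 =$

$\left(\frac{\delta Y}{\delta x_1}\frac{\delta Y}{\delta x_1} + \frac{\delta Z}{\delta x_1}\frac{\delta Z}{\delta x_1}\right)\Delta x_1\Delta x_1$

$+ \left(\frac{\delta Y}{\delta x_2}\frac{\delta Y}{\delta x_1} + \frac{\delta Z}{\delta x_2}\frac{\delta Z}{\delta x_1}\right)\Delta x_2\Delta x_1$

$+ \left(\frac{\delta Y}{\delta x_1}\frac{\delta Y}{\delta x_2} + \frac{\delta Z}{\delta x_1}\frac{\delta Z}{\delta x_2}\right)\Delta x_1\Delta x_2$

$+ \left(\frac{\delta Y}{\delta x_2}\frac{\delta Y}{\delta x_2} + \frac{\delta Z}{\delta x_2}\frac{\delta Z}{\delta x_2}\right)\Delta x_2\Delta x_2$

This can be represented using summation notation:

$\Delta S^2 = \sum_{i=1}^{2}\sum_{k=1}^{2}\left(\frac{\delta Y}{\delta x_i}\frac{\delta Y}{\delta x_k} + \frac{\delta Z}{\delta x_i}\frac{\delta Z}{\delta x_k}\right)\Delta x_i\Delta x_k$

Or, using indicial notation:

$\Delta S^2 = g_{ik}\Delta x_i\Delta x_k$

Where:

$g_{ik} = \left(\frac{\delta Y}{\delta x_i}\frac{\delta Y}{\delta x_k} + \frac{\delta Z}{\delta x_i}\frac{\delta Z}{\delta x_k}\right)$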
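As an illustration only, the coefficients can be evaluated for a concrete choice of coordinates. The sketch below assumes plane polar coordinates, Y = x1 cos(x2) and Z = x1 sin(x2), which is a hypothetical example not taken from the text; the partial derivatives are written out by hand.

```python
import math

def metric(x1, x2):
    """Coefficients g_ik for the assumed mapping Y = x1*cos(x2), Z = x1*sin(x2)."""
    # Partial derivatives of Y and Z with respect to x1 and x2.
    dY = (math.cos(x2), -x1 * math.sin(x2))
    dZ = (math.sin(x2), x1 * math.cos(x2))
    # g_ik = dY/dx_i * dY/dx_k + dZ/dx_i * dZ/dx_k
    return [[dY[i] * dY[k] + dZ[i] * dZ[k] for k in range(2)]
            for i in range(2)]

g = metric(2.0, 0.5)
for row in g:
    print(row)
```

For this assumed mapping the result is close to g11 = 1, g12 = g21 = 0 and g22 = x1 squared, so the length of a short line becomes the sum of the squared radial step and x1 squared times the squared angular step.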

If the coordinates are not merged then Δs is dependent on both sets of coordinates. In matrix notation:

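$\Delta s^2 = \mathbf{g}\Delta\mathbf{x}\Delta\mathbf{x}$

becomes:

$\begin{bmatrix} \Delta x_1 & \Delta x_2 \end{bmatrix}$ times $\begin{bmatrix} a & b \\ c & d \end{bmatrix}$ times $\begin{bmatrix} \Delta x_1 \\ \Delta x_2 \end{bmatrix}$

Where a, b, c, d stand for the values of $g_{ik}$.

Therefore:

$\begin{bmatrix} \Delta x_1a + \Delta x_2c & \Delta x_1b + \Delta x_2d \end{bmatrix}$ times $\begin{bmatrix} \Delta x_1 \\ \Delta x_2 \end{bmatrix}$

Which is:

$(\Delta x_1a + \Delta x_2c)\Delta x_1 + (\Delta x_1b + \Delta x_2d)\Delta x_2 = \Delta x_1^2a + \Delta x_1\Delta x_2(c+b) + \Delta x_2^2d$

So:

$\Delta s^2 = \Delta x_1^2a + \Delta x_1\Delta x_2(c+b) + \Delta x_2^2d$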

$\Delta s^2$ is a bilinear form that depends on both $\Delta x_1$ and $\Delta x_2$. It can be written in matrix notation as:

$\Delta s^2 = \Delta\mathbf{x^T}\mathbf{A}\Delta\mathbf{x}$

Where A is the matrix containing the values in $g_{ik}$. This is a special case of the bilinear form known as the quadratic form because the same matrix ($\Delta\mathbf{x}$) appears twice; in the generalised bilinear form $\mathbf{B} = \mathbf{x^T}\mathbf{A}\mathbf{y}$ the matrices $\mathbf{x}$ and $\mathbf{y}$ are different.
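The quadratic form can be checked numerically against its expansion. Multiplying out the row, the matrix and the column directly gives a Δx1² + (b + c) Δx1 Δx2 + d Δx2². The matrix entries below are arbitrary assumptions and `quadratic_form` is a hypothetical helper.

```python
def quadratic_form(A, dx):
    """Evaluate dx^T A dx as a double sum over the indices."""
    n = len(dx)
    return sum(dx[i] * A[i][k] * dx[k] for i in range(n) for k in range(n))

A = [[2, 1], [3, 5]]     # a, b on the first row; c, d on the second
dx = (1, 2)

(a, b), (c, d) = A
expanded = a * dx[0] ** 2 + (b + c) * dx[0] * dx[1] + d * dx[1] ** 2

print(quadratic_form(A, dx), expanded)   # 30 30
```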

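If the surface is a Euclidean plane then the values of $g_{ik}$ are:

$\begin{bmatrix} \frac{\delta Y}{\delta x_1}\frac{\delta Y}{\delta x_1} + \frac{\delta Z}{\delta x_1}\frac{\delta Z}{\delta x_1} & \frac{\delta Y}{\delta x_2}\frac{\delta Y}{\delta x_1} + \frac{\delta Z}{\delta x_2}\frac{\delta Z}{\delta x_1} \\ \frac{\delta Y}{\delta x_2}\frac{\delta Y}{\delta x_1} + \frac{\delta Z}{\delta x_2}\frac{\delta Z}{\delta x_1} & \frac{\delta Y}{\delta x_2}\frac{\delta Y}{\delta x_2} + \frac{\delta Z}{\delta x_2}\frac{\delta Z}{\delta x_2} \end{bmatrix}$

Which become:

$g_{\mu\nu} = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}$

So the matrix A is the unit matrix I and:

$\Delta s^2 = \Delta\mathbf{x^T}\mathbf{I}\Delta\mathbf{x}$

and:

$\Delta s^2 = \Delta x_1^2 + \Delta x_2^2$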

Which recovers Pythagoras' theorem yet again.

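If the surface is derived from some other metric such as $\Delta s^2 = -\Delta Y^2 + \Delta Z^2$ then the values of $g_{ik}$ are:

$\begin{bmatrix} -\frac{\delta Y}{\delta x_1}\frac{\delta Y}{\delta x_1} + \frac{\delta Z}{\delta x_1}\frac{\delta Z}{\delta x_1} & -\frac{\delta Y}{\delta x_2}\frac{\delta Y}{\delta x_1} + \frac{\delta Z}{\delta x_2}\frac{\delta Z}{\delta x_1} \\ -\frac{\delta Y}{\delta x_2}\frac{\delta Y}{\delta x_1} + \frac{\delta Z}{\delta x_2}\frac{\delta Z}{\delta x_1} & -\frac{\delta Y}{\delta x_2}\frac{\delta Y}{\delta x_2} + \frac{\delta Z}{\delta x_2}\frac{\delta Z}{\delta x_2} \end{bmatrix}$

Which becomes:

$g_{\mu\nu} = \begin{bmatrix} -1 & 0 \\ 0 & 1 \end{bmatrix}$

Which allows the original metric to be recovered ie: $\Delta s^2 = -\Delta x_1^2 + \Delta x_2^2$.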

It is interesting to compare the geometrical analysis with the transformation based on matrix algebra that was derived in the section on indicial notation above:

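$\mathbf{s}.\mathbf{s} = g_{11}x_1x_1' + g_{12}x_1x_2' + g_{21}x_2x_1' + g_{22}x_2x_2'$

Now,

$g_{\mu\nu} = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}$

ie: $g_{11}=1$, $g_{12}=0$, $g_{21}=0$, $g_{22}=1$ so:

$\mathbf{s}.\mathbf{s} = x_1x_1' + x_2x_2'$

If there is no rotation of coordinates the scalar product is:

$\mathbf{s}.\mathbf{s} = x_1x_1 + x_2x_2$

$s^2 = x_1^2 + x_2^2$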

Which recovers Pythagoras' theorem. However, the reader may have noticed that Pythagoras' theorem had been assumed from the outset in the derivation of the scalar product (see above).

The geometrical analysis shows that if a metric is assumed and the conditions that allow differential geometry are present then it is possible to derive one set of coordinates from another. This analysis can also be performed using matrix algebra with the same assumptions.

The example above used a simple two dimensional Pythagorean metric. Some other metric, such as the metric of a 4D Minkowskian space:

$\Delta S^2 = -\Delta T^2 + \Delta X^2 + \Delta Y^2 + \Delta Z^2$

could be used instead of Pythagoras' theorem.
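A minimal sketch of such a metric in code, assuming the sign convention in which the time term is negative and the space terms are positive, with time measured in units where the speed of light is one. Both the convention and the function name `interval_squared` are assumptions made for this example.

```python
def interval_squared(dT, dX, dY, dZ):
    """Evaluate g_ik * dx_i * dx_k with an assumed Minkowski metric diag(-1, 1, 1, 1)."""
    g = [[-1, 0, 0, 0],
         [0, 1, 0, 0],
         [0, 0, 1, 0],
         [0, 0, 0, 1]]
    d = (dT, dX, dY, dZ)
    return sum(g[i][k] * d[i] * d[k] for i in range(4) for k in range(4))

print(interval_squared(5, 3, 4, 0))   # 0, a displacement along a light ray
print(interval_squared(0, 3, 4, 0))   # 25, a purely spatial displacement
print(interval_squared(5, 0, 0, 0))   # -25, a purely temporal displacement
```

Unlike the Pythagorean case, the result can be zero or negative even though the displacement itself is not zero.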