User:TStein/Gauge invariance

Gauge invariance is a principle usually first encountered in electricity and magnetism with the scalar potential V and the vector potential A. It is sometimes thought of, and presented as, a quirk of electromagnetism that detracts from the physical nature of these potentials. The reality is that gauge invariance is a universal and fundamental property of energy and momentum. It is a cornerstone of many modern theories of physics, including gauge theory and general relativity.

Gauge Invariance in Introductory Physics

The simplest forms of gauge invariance are encountered early in physics with the introduction of energy and momentum. The terminology of gauge invariance is not used at that stage, but these early examples illustrate some important properties of gauge transformations and gauge invariance.

Simple Gauge Invariance: Energy in a Conservative System

The work, W, done in changing the height of an object of mass m by a distance ∆y is W = mg∆y, where g is the acceleration of gravity. If an object is raised and lowered many times along a complex path such as a roller coaster, it becomes convenient to define a potential energy U = mgy, where y is not a change of height but the height measured from some arbitrary reference level called the ground, which we define to have y = 0. With this definition the numerical value of U becomes arbitrary to an additive constant. This arbitrary constant is the simplest example of a gauge.

This ambiguity of energy to an additive constant does not affect the motion of the object, of course. In real life, energy is only measured when it is changed. It is the change in energy ∆U = mg∆y that does work and that quantity is invariant for any choice of gauge.
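The point can be illustrated with a minimal numerical sketch. The mass, heights and ground levels below are hypothetical values chosen only for illustration, and the `ground` argument plays the role of the gauge choice.

```python
G = 9.81  # m/s^2, assumed value for the acceleration of gravity

def potential_energy(mass, height, ground=0.0):
    """U = m g (y - y_ground); `ground` is the arbitrary gauge choice."""
    return mass * G * (height - ground)

def energy_change(mass, y_start, y_end, ground=0.0):
    """Change in U between two heights for a given choice of ground."""
    return potential_energy(mass, y_end, ground) - potential_energy(mass, y_start, ground)

# Same 2 kg object raised from 1 m to 4 m, with three different 'grounds'.
grounds = (0.0, -10.0, 250.0)
values_at_top = [potential_energy(2.0, 4.0, g0) for g0 in grounds]
changes = [energy_change(2.0, 1.0, 4.0, g0) for g0 in grounds]

print(values_at_top)  # U itself differs from gauge to gauge
print(changes)        # the change in U does not
```

The individual values of U depend on where the ground is placed, while the change in U agrees for every choice up to floating-point rounding.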

Simple Gauge Invariance: Momentum

It is not as widely noted, but momentum can also be expressed in an infinite number of gauges. Momentum is far worse than energy in this respect, because it is arbitrary to an additive constant vector! Momentum, p, is defined as the product of mass, m, and velocity, v: p = mv. But what shall we define as zero velocity? If a car and a Mack truck collide, shall we use the reference frame of the car (defining its velocity as zero), the reference frame of the truck, or the ground? What about the fact that both vehicles are on a spinning Earth orbiting a Sun which is itself orbiting within a moving galaxy?

Different reference frames give different numerical values for p corresponding to different gauges. They differ by additive constant vectors yet the physics remains the same. For instance, the force on a particle F is the time rate of the change of its momentum F = dp/dt and is invariant to the choice of the gauge. In fact, all physical quantities must be gauge invariant.
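As a rough illustration, the sketch below treats a one-dimensional, perfectly inelastic collision (an assumption made here for simplicity) and views it from several frames related by ordinary Galilean velocity shifts. The masses and velocities are hypothetical.

```python
def car_momentum_change(m_car, m_truck, v_car, v_truck, frame_velocity):
    """Momentum change of the car in a 1D perfectly inelastic collision,
    as seen from a frame moving at `frame_velocity` relative to the ground."""
    u_car = v_car - frame_velocity      # velocities in the chosen frame
    u_truck = v_truck - frame_velocity
    u_final = (m_car * u_car + m_truck * u_truck) / (m_car + m_truck)
    return m_car * u_final - m_car * u_car

M_CAR, M_TRUCK = 1200.0, 9000.0   # kg (hypothetical)
V_CAR, V_TRUCK = 25.0, -15.0      # m/s relative to the ground (hypothetical)
frames = (0.0, 25.0, -15.0, 300.0)  # ground, car frame, truck frame, a fast frame

initial_momenta = [M_CAR * (V_CAR - frame) for frame in frames]
impulses = [car_momentum_change(M_CAR, M_TRUCK, V_CAR, V_TRUCK, frame) for frame in frames]

print(initial_momenta)  # p of the car depends on the frame (the 'gauge')
print(impulses)         # the change in p, and hence the average force, does not
```

The car's momentum takes a different value in each frame, but the change in its momentum during the collision is the same in all of them.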

Full Blown Gauge Invariance: Electrodynamics

What you may not be told is that the equation W = −∆U depends on the physics not changing with time. If Earth were losing mass (so that g was changing with time), then even the change in potential energy would become ambiguous, since the potential formulation only deals with changes in position. In addition, we may want to define a ground that varies with position, such as for a roller coaster on a hill. How do we deal with these types of gauges?

In summary, if the physics of a situation does not vary with time, we can define a potential energy that varies with position. If the physics of a situation does not vary with location (here or at the top of Mt. Everest), then we can define a momentum that varies with time. Both of these quantities are arbitrary to an additive gauge. Yet the force (F = −dU/dx or F = dp/dt), and therefore the equations of motion, are invariant to the choice of gauge. But what if the physics varies with both time and location? What if we want to make a gauge change to an accelerating (or non-inertial) reference frame, or to one that varies with position such as the roller coaster on the hill? Can we still define a force in terms of U and p, and will it still be gauge invariant? The answer to all these questions is yes, although we will need to discuss gauge invariance in electrodynamics before we get there.

Electrostatics

Gauge invariance is encountered in all its messy glory in Electrodynamics. Simple gauge invariance is reintroduced with the definition of the (Scalar) Potential, V, in electrostatics. A movable charge Q is pushed around with a force, F due to a stationary charge(s) q. The force on Q varies with location relative to q in a well understood way. By calculating the work done in moving the charge we can define an (arbitrary to a constant gauge depending on what we call ‘ground’) electric potential energy. It is a simple matter to divide by Q, to get something that does not depend on the charge being moved but only on the charges q doing the work (say in a battery). We call that something the electric potential V (due to the battery or whatever the voltage source is). It is not surprising that V is arbitrary to the same type of gauge as the potential energy, U, since it differs from U only by a multiplicative constant, Q. (Care must be taken here to note that if the charges q that are creating V are moving then V is much more arbitrary than to a constant and the energy equation of electrostatics, W = −Q∆V, fails without additional terms.)

Magnetostatics

The real fun begins with magnetostatics, though. Currents of charge create a magnetic field which loops around the current like tree rings around the center of a trunk. All currents have magnetic fields that loop around them. The reverse is also true: magnetic fields always come in loops, and wherever there is a magnetic field loop there has to be, someplace within that loop, a current (normal to the loop) that created it. We say that the current is a circulation source for the magnetic field, and we define a mathematical quantity called the curl with which you can calculate the ‘circulation source’ of any field that is looping around. If we know the magnetic field everywhere in a region we can easily determine the current that caused it, since it is proportional to curl B.

What makes this interesting is that it can be shown mathematically that the magnetic field B can itself be determined as the curl of something else, which we call the vector potential A: B = curl A. By this definition, A circulates around the magnetic field B, which we already know circulates around the current. This would be a mere curiosity, at this point, except for the fact that it is sometimes much easier to calculate a vector potential A that corresponds to a given current than to calculate the magnetic field created by that same current. From there it is easy to calculate the magnetic field B = curl A.

Because of the nature of the curl, there are an infinite number of A fields that will generate the same B. These choices are called gauges. For example, in a particular gauge, A points in the same direction as the current and can be straightforwardly calculated from the current with essentially the same equation that relates the charge, q, to the potential, V, in electrostatics. (This is not a coincidence and makes complete sense when discussed in terms of relativity.) The vector potential in this gauge looks similar to the current although it is spread out. Other gauges will have that same pattern but will have added to it a field that does not loop but instead moves outward (or inward) from arbitrary locations.

Mathematically, we say that A is arbitrary not just to a constant but to an infinite set of fields created by a mathematical operator called a gradient. You can add whole classes of functions (more technically fields) to A and have it still produce the same curl (in other words the same magnetic field B). Yikes! Is it really that bad? Well it is that bad although it is also necessary and an important quality of nature shared with the energy and momentum of all systems. This quality is called gauge invariance and is the same property that we saw in a much more limited form for U, p, and V. In fact, the reason that A has the same property as p is that A represents a sort of potential or stored momentum per unit charge in the same way that V represents stored energy per unit charge. (If you know anything about relativity then all kinds of bells should be going off in your head. More on that later.)

Physical interpretation of gauge invariance in magnetostatics

To understand what this means, consider an analogy of A with the circulation of water cascading down a giant, infinitely long screw at a constant speed. Here we are interested in the horizontal motion only. From this point of view the water flows in circles around the center of the screw just as A flows around B. In this analogy, the magnitude and direction of A at any given point are represented by the magnitude and direction of the horizontal motion of the water. If we took the curl of this flow (curl A = B) we would find a magnetic field that points straight up or down within the screw, depending on whether the screw turns clockwise or counterclockwise. Gauge invariance says we can warp the screw any way we like, as long as we do the same to each level, and it will produce the same circulation. The warping would drastically affect A (how the water flows), but it would not affect how much it loops. The shape of the loops would change, but not what is causing the circulation.

To put it in a more mathematical form, the gauge invariance of A corresponds to adding to it the gradient of an arbitrary function. (In this case that function is the way we warped the screw.) The gradient is well named in that it calculates the slope or grade of the lay of the land (or of any other function that changes with position) from knowing how its height varies with location. For hills, larger gradients correspond to steeper slopes, which produce faster flows of water falling down them. Water flows outwards in all directions from the top of a hill, as does the (negative of the) gradient of any function near a maximum. Most importantly, from our perspective, no matter how complex a set of hills and valleys is, it is impossible to have a gradient such that water will flow in a continuous loop. Hills and valleys cannot produce circulation. For the same reason, the curl of a gradient is always zero, and none of the warping of your screw will change the circulation at all. Completing our analogy, it becomes apparent that one can add the gradient of any function to A and still produce the same B.
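A small numerical sketch can make this concrete. It assumes a uniform field B0 along z, one particular vector potential for it, and an arbitrary 'warping' function f; all three are choices made here purely for illustration. The curl is estimated by central finite differences, so the agreement is only approximate.

```python
import math

B0 = 3.0  # hypothetical uniform magnetic field strength along z

def curl_z(field, x, y, h=1e-4):
    """z-component of the curl, dAy/dx - dAx/dy, by central differences."""
    d_ay_dx = (field(x + h, y)[1] - field(x - h, y)[1]) / (2 * h)
    d_ax_dy = (field(x, y + h)[0] - field(x, y - h)[0]) / (2 * h)
    return d_ay_dx - d_ax_dy

def a_circular(x, y):
    """One vector potential whose curl is B0: A circulates around the z axis."""
    return (-0.5 * B0 * y, 0.5 * B0 * x)

def grad_f(x, y):
    """Gradient of the arbitrary function f = sin(x) * y**2 + x**3."""
    return (math.cos(x) * y**2 + 3 * x**2, 2 * math.sin(x) * y)

def a_warped(x, y):
    """The same potential in another gauge: A + grad f."""
    ax, ay = a_circular(x, y)
    gx, gy = grad_f(x, y)
    return (ax + gx, ay + gy)

points = [(0.3, -1.2), (2.0, 0.7), (-1.5, 0.4)]
for x, y in points:
    print(a_circular(x, y), a_warped(x, y),          # A differs between gauges
          curl_z(a_circular, x, y), curl_z(a_warped, x, y))  # B does not
```

The two vector potentials look very different point by point, yet both give a curl close to B0 at every sampled location, because the gradient part contributes no circulation.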

This raises the question: which A is the correct A? Which is the correct gauge? You should be able to guess the correct answer already. All of them! What matters is how A changes and not A itself. In certain situations the change of A with time, dA/dt, like that of momentum, is a physical invariant. In most cases, though, A alone is not sufficient. One needs both A and V. (The same is true for V. Only in certain situations does V have a physical meaning by itself.) To understand this, it is first necessary to study what occurs in more complicated situations than stationary charges (electrostatics) or constant currents (magnetostatics). One needs to examine electrodynamics.

Electrodynamics

It is unfortunate that magnetostatics comes before electrodynamics, because it is in electrodynamics that we can see what A means physically. Imagine an infinite coil of wire with a current flowing in the coils. There is no magnetic field outside in this case, but there is an A field. In the simplest gauge, A loops around in the same manner as the current, but it extends beyond the coil and grows weaker as you move outward. The vector potential is directly proportional to the current, so doubling the current doubles A everywhere. Something interesting happens, though, if A is changed by increasing (or decreasing) the current. The changing A will create an electrical force (per unit charge) E opposite to the direction in which A is changing. This effect is well known and should be considered a law of nature as firmly established as the law of gravity. This induced electric field is proportional to the time rate of change of A, reminiscent of how the force depends on the time rate of change of a momentum! Because of this, it is natural to interpret A as a potential momentum per unit charge, in the same manner that the potential V is a potential energy per unit charge.

But wait a minute! What about the gauge invariance of A? How can something ‘physical’ like a force depend directly on something that is arbitrary to entire classes of fields? To understand that, we have to understand that the full set of forces depends not on A alone but on A together with V, and that the full force equation as determined from A and V is more complicated. Further, any gauge transformation of A must be accompanied by a corresponding transformation of V that keeps all ‘physical’ quantities the same. Together both transformations form a gauge transformation. The most important thing to understand about gauge invariance is that it is a property neither of A nor of V alone, but of both together. The three components of A (pointing in the three spatial dimensions) together with V represent the four components of another quantity called the four-vector potential, and it is this quantity as a whole that undergoes the gauge transformation. The exact prescription is that if you add the gradient of a function to A (A → A + grad f) then you must subtract the time rate of change of that same function from V (V → V − ∂f/∂t).
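The prescription can be checked numerically in one dimension. The sketch below assumes the standard combination E = −∂V/∂x − ∂A/∂t for the electric field, which merges the electrostatic and induced contributions described above; the particular V, A and f are arbitrary functions chosen only for illustration.

```python
import math

def d_dx(func, x, t, h=1e-5):
    """Partial derivative with respect to x by central differences."""
    return (func(x + h, t) - func(x - h, t)) / (2 * h)

def d_dt(func, x, t, h=1e-5):
    """Partial derivative with respect to t by central differences."""
    return (func(x, t + h) - func(x, t - h)) / (2 * h)

# Hypothetical potentials in one gauge.
def v_old(x, t): return x**2
def a_old(x, t): return x * t**2

# Arbitrary gauge function f = sin(x) * t**3 and its exact partial derivatives.
def df_dx(x, t): return math.cos(x) * t**3
def df_dt(x, t): return 3 * math.sin(x) * t**2

# Gauge-transformed potentials: A -> A + grad f, V -> V - df/dt.
def a_new(x, t): return a_old(x, t) + df_dx(x, t)
def v_new(x, t): return v_old(x, t) - df_dt(x, t)

def e_field(v, a, x, t):
    """E = -dV/dx - dA/dt (1D form, assumed here)."""
    return -d_dx(v, x, t) - d_dt(a, x, t)

x0, t0 = 0.4, 1.3
e_old = e_field(v_old, a_old, x0, t0)
e_new = e_field(v_new, a_new, x0, t0)
print(a_old(x0, t0), a_new(x0, t0))  # the potentials differ between gauges
print(e_old, e_new)                  # the field does not
```

Changing A alone, or V alone, would change E; only the paired transformation leaves it untouched.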

Physical Meaning of Gauge Invariance

Gauge invariance is not just an odd property of A and V but is a result of their being related to momentum and energy respectively. This becomes more apparent when we recall the simple cases of gauge invariance for energy and momentum. To obtain the simple gauge invariance for U we moved to another reference system that is shifted up or down by a constant value. To obtain the simple gauge invariance of p we shifted to a uniformly moving reference frame. (We do not need to shift to a different reference frame to have a gauge change; it just makes it easy to visualize.) Mathematically, for these simple cases, the gauge change of U corresponds to a gauge transformation with f = −mg y_0 t (a shift of the ground by a height y_0) and that of p to f = m v_0x x (for a velocity shift v_0x in the x direction). Since there is no explicit spatial dependence in the first function and no explicit time dependence in the second, the first changes the energy but not the momentum, while the second does the reverse.

To extend this analogy to more complicated situations it is helpful to consider non-inertial systems. (This consideration is not necessary for gauge invariance, but it helps in understanding it.) Let us examine the case of a rocket accelerating at a constant rate a. A ball of mass m, let go inside the rocket at its nose, will fall toward the bottom as the floor accelerates toward it. An observer inside the rocket will conclude that there is a force F = −ma (the − means that it points downward) and would naturally conclude that there is a potential energy U = may that produces this force. An observer in a ‘stationary’ reference frame outside the rocket will see the floor of the rocket accelerate up so that it picks up an additional speed of at. When the stationary observer predicts what the observer in the rocket sees, it is natural to transfer this speed to the accelerated reference frame. From this point of view it is the ball which picks up a speed −at and therefore gains a momentum of −mat. This change of momentum occurs because of a change of potential momentum p_pot = +mat. (This potential momentum is, of course, due to the floor accelerating up to hit the ball. Note as well that the potential momentum is in the same direction as the rocket's acceleration.)

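The stationary viewer sees U = 0 and p_pot = +mat. The observer on the rocket sees U = may and p_pot = 0. Yet both predict the same force. To see that these two descriptions are related by a gauge transformation with f = mayt, notice that grad f = mat and −∂f/∂t = −may. Starting from the rocket observer's description, adding grad f to p_pot gives +mat and adding −∂f/∂t to U gives 0, which is exactly the stationary viewer's description.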

We can extend this analogy to other accelerating systems. We could, for instance, have another rocket accelerating at a different rate, with an observer looking into the first rocket and predicting what a viewer in the first rocket will see. Such a person will see a different mixture of U and p_pot but will predict the same force (as they must, since they are looking at the same system). The force equation will have the same form as in the first two cases, F = −grad U − ∂p_pot/∂t, involving changes in U and p_pot with distance and time respectively. (It should be noted that F will be more complicated for non-conservative forces like those of electrodynamics, but the results are the same.) We can imagine more complicated reference systems with different velocities and accelerations for each point in space. Each of these systems will yield different combinations of U and p_pot that are valid for the view inside the rocket. From this we see that gauge invariance reflects the fact that different viewers looking into the same system from different perspectives must predict the same force for the perspective of the common system. In this sense gauge invariance is akin in spirit to the equivalence principle of general relativity, although it is different in that it does not involve gravity explicitly.
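For the rocket example, the force formula F = −grad U − ∂p_pot/∂t can be evaluated for both observers with a short sketch. The mass, acceleration and evaluation point are hypothetical, and the derivatives are taken by finite differences along the single direction y.

```python
M, A_ROCKET = 2.0, 3.0  # hypothetical mass (kg) and rocket acceleration (m/s^2)

def force(potential_energy, potential_momentum, y, t, h=1e-5):
    """F = -dU/dy - d(p_pot)/dt, both derivatives by central differences."""
    du_dy = (potential_energy(y + h, t) - potential_energy(y - h, t)) / (2 * h)
    dp_dt = (potential_momentum(y, t + h) - potential_momentum(y, t - h)) / (2 * h)
    return -du_dy - dp_dt

# Observer on the rocket: U = m a y, p_pot = 0.
rocket_view = (lambda y, t: M * A_ROCKET * y, lambda y, t: 0.0)
# 'Stationary' observer: U = 0, p_pot = +m a t.
stationary_view = (lambda y, t: 0.0, lambda y, t: M * A_ROCKET * t)

f_rocket = force(*rocket_view, y=1.5, t=0.8)
f_stationary = force(*stationary_view, y=1.5, t=0.8)
print(f_rocket, f_stationary)  # both close to -m a
```

The two descriptions split the effect differently between U and p_pot, but the predicted force is −ma in both.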

Gauge invariance is a necessary ‘symmetry’ that all physical systems need to have. Indeed it is a fundamental postulate of all modern theories of physics such as general relativity and gauge theory.

Mathematics of Gauge Invariance

Gauge Invariance in Classical Mechanics (Lagrangian)

In classical physics, the laws of physics as expressed by Newton's 2nd Law (F = ma) can be equivalently expressed in an alternative manner involving a quantity called the Lagrangian. Although this method does not look the same as the more familiar F = ma and is not as intuitive, it is just as fundamental (for non-dissipative systems). Further, it has the advantage, unlike Newton's 2nd Law, of being valid for quantum mechanics. One could argue that, since the laws of quantum mechanics are more fundamental, the Lagrangian is the more fundamental of the two.

Introduction to the Lagrangian

The Lagrangian has a different initial goal, although it produces the same end result. The Lagrangian, L, usually has the form L = T − U, where T is the kinetic energy. For the simple situation of a single object of mass m moving at a speed v, the kinetic energy is T = ½ mv². The potential energy, U, is assumed to be a function of position only. The Lagrangian is naturally a function of position, x, and speed, v. (Here we consider motion in one dimension only. Allowing forces and motion in other directions is a straightforward extension of the one-dimensional case and does not change the physics.) The 1D Lagrangian has the form

L(x,v)={\frac {1}{2}}mv^{2}-U\left(x\right).

If we watch the motion of an object along a given path as its Lagrangian varies and calculate the quantity of L∆t for a given small time interval ∆t then add all these quantities up we will obtain another quantity called the action S,

S=\int Ldt=\int \left[{\frac {1}{2}}mv^{2}-U(x)\right]dt.

The action, S, is not a function of v, x, or t; rather, it is a function of the path the object took, in other words it depends on x(t).

The reason why the Lagrangian and the action are important is that if you calculated the action for all paths (including paths that the object would not naturally take) you would find that the action of the path that the particle actually takes has the smallest action S. (For the technically correct, sometimes the correct path maximizes S instead of minimizing it, but most often the correct path has minimal S.)
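The claim can be tried out numerically for one simple case. The sketch below assumes uniform gravity, U = mgy, a ball thrown up at t = 0 that returns to y = 0 at a fixed final time, and a path discretised into short straight segments. The perturbed paths share the same endpoints but wobble by an amount `eps`, a hypothetical parameter introduced here.

```python
import math

M, G = 1.0, 9.81          # hypothetical mass (kg); assumed g (m/s^2)
T_TOTAL, N = 2.0, 200     # total flight time (s) and number of segments
DT = T_TOTAL / N

def action(path):
    """Discretised action: sum over segments of (T - U) * dt with U = m g y."""
    total = 0.0
    for y0, y1 in zip(path, path[1:]):
        v = (y1 - y0) / DT
        y_mid = 0.5 * (y0 + y1)
        total += (0.5 * M * v**2 - M * G * y_mid) * DT
    return total

def true_path():
    """Newtonian trajectory y = v0 t - g t^2 / 2 that returns to y = 0 at T_TOTAL."""
    v0 = 0.5 * G * T_TOTAL
    return [v0 * (i * DT) - 0.5 * G * (i * DT) ** 2 for i in range(N + 1)]

def perturbed_path(eps):
    """Same endpoints, but displaced in between by eps * sin(pi t / T_TOTAL)."""
    return [y + eps * math.sin(math.pi * i / N) for i, y in enumerate(true_path())]

s_true = action(true_path())
for eps in (-0.5, -0.1, 0.1, 0.5):
    print(eps, action(perturbed_path(eps)) - s_true)  # positive in every case
```

Every perturbed path tried has a larger action than the Newtonian one, which is what the least-action statement predicts for this simple potential.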

Our mission then, should you choose to accept it, is to find the path x(t) that minimizes S. Fortunately, that has already been done. The correct path x(t) must satisfy the following Euler equation:

{\frac {d}{dt}}\left({\frac {\partial L}{\partial v}}\right)-{\frac {\partial L}{\partial x}}=0.

Turning the crank we find that, after all that work, we have managed to derive Newton's 2nd Law of motion. Indeed, one use of the Lagrangian is as a Newton's 2nd Law generator for coordinate systems where it is easier to derive T and U than the acceleration a. That is not what concerns us here, though. Examining the argument of the time derivative in the first term for the simple case of T = ½ mv², we notice that this quantity,

{\frac {\partial L}{\partial v}}={\frac {\partial (T-U)}{\partial v}}={\frac {\partial \left({\frac {1}{2}}mv^{2}-U(x)\right)}{\partial v}}=mv=p,

is just the momentum.

Further if we calculate the quantity,

pv-L=(mv)v-\left[{\frac {1}{2}}mv^{2}-U\left(x\right)\right]={\frac {1}{2}}mv^{2}+U\left(x\right)={\frac {p^{2}}{2m}}+U\left(x\right)=H,

we will have calculated the total energy H in terms of our derived momentum, p. Not only is the Lagrangian a Newton's 2nd law generator it is a p and H generator as well!

Gauge invariance in the Lagrangian Formulation

We are finally at the point where we can show that all of classical mechanics is gauge invariant. Consider the transformation of the Lagrangian L to a new Lagrangian L' such that L' = L + df/dt, where f is any function of x and t whatsoever! It is straightforward to show that this new Lagrangian (L') must produce exactly the same equations of motion as the first. If we examine the part of the action due to df/dt,

\int _{t_{i}}^{t_{f}}{\frac {df}{dt}}dt=f(t_{f})-f(t_{i})={\text{constant}},

we obtain a constant that does not vary with the path. (For the intermediate step we used the fundamental theorem of calculus and the fact that the limits of the integral go from an initial time t_i to a final time t_f.) Since the action of this part of L' does not vary with the path, it will not affect the process of minimizing the action. We say that the Lagrangian is invariant to the addition of the total time derivative of a function of space and time. If we calculate p and H for this new Lagrangian we find that they are:

p'=p+{\frac {\partial f}{\partial x}},H'=H-{\frac {\partial f}{\partial t}},

where we have used the fact that df/dt = ∂f/∂t + (∂f/∂x)(dx/dt) = ∂f/∂t + v ∂f/∂x. Note that extending these equations to three dimensions yields the gauge transformation equations of A and V. Here we are transforming the total energy and the total momentum, though, instead of the potential energy and potential momentum. Notice, as well, that p and H come in pairs: if we transform one we have to transform the other.
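The path independence of the extra piece of the action can be checked with a small numerical sketch. The gauge function f and the two trial paths below are arbitrary choices made for illustration; both paths start at x = 0 when t = 0 and end at x = 1 when t = 1, and the integral is approximated by a midpoint sum.

```python
import math

def f(x, t):
    """Arbitrary gauge function, chosen only for illustration."""
    return x**2 * t + math.sin(x)

def total_df_dt(x, v, t):
    """df/dt = df/dt|_x + v * df/dx for the f above."""
    partial_t = x**2
    partial_x = 2 * x * t + math.cos(x)
    return partial_t + v * partial_x

def extra_action(path, velocity, t_i=0.0, t_f=1.0, steps=20000):
    """Midpoint-rule estimate of the integral of df/dt along x(t) = path(t)."""
    dt = (t_f - t_i) / steps
    total = 0.0
    for k in range(steps):
        t = t_i + (k + 0.5) * dt
        total += total_df_dt(path(t), velocity(t), t) * dt
    return total

straight = extra_action(lambda t: t, lambda t: 1.0)         # x = t
curved = extra_action(lambda t: t**2, lambda t: 2.0 * t)    # x = t^2
expected = f(1.0, 1.0) - f(0.0, 0.0)
print(straight, curved, expected)
```

Both paths give the same contribution, equal to the difference of f between the endpoints, so adding df/dt to L cannot change which path minimizes the action.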

In the previous section we showed that gauge invariance is a necessary property of nature, similar to the equivalence principle. Here we have shown that all of classical mechanics is gauge invariant. What we thought were immutable properties called momentum and energy are instead quite flexible and subject to gauge transformations. This flexibility is usually hidden because we have chosen a gauge where the total potential momentum = 0. Potential energy and potential momentum are tied together just as the total energy and momentum are. Only in special circumstances can we specify a particular gauge where one or the other is zero. Otherwise we must have both. Conservative systems are one of those special cases, where we can choose the potential momentum to be 0, and we do. Electrodynamics is the first non-dissipative force we encounter for which this simple gauge becomes impossible, and therefore it seems stranger than it should.

Gauge Invariance in Quantum Mechanics (Schrödinger's Equation)

Classical mechanics, while valid for the majority of physical properties around us is only an approximation for what is really happening. In the majority of the physical processes in our world, where classical mechanics is valid, we can talk about one object acting on another with a well known force at a well known location and time. We know exactly where an object is and where it will be. Yet that is an approximation that is only valid when we have the interaction of a very large number of particles.

In a world where there are only a few particles interacting with each other, such as one photon of light hitting an electron that is ‘orbiting’ a nucleus, the laws of physics are quite different. The theory that describes how small numbers of ‘particles’ interact with each other is known as quantum mechanics. In quantum mechanics everything is a particle (or quantized). Even light (which scientists normally describe as a wave) is actually formed of many particles called photons. Particles are distinguished from waves in that they come in packets of energy (and momentum) that cannot be subdivided. A photon of light carries a specific amount of energy and momentum that it can give up (either all or in part) to another particle when it collides. Unlike a wave on a beach, which interacts with the entire beach at once, a photon interacts with only one other particle at a time.

But quantum mechanical particles are very different from particles as we ordinarily understand them, because of the uncertainty principle. The upshot of this is that we cannot know exactly where a particle is at a given time. What we can do is describe the probability of where the particle is. We can also describe an average momentum, energy, and location, in addition to other average quantities. We can say that if the particle is here it will have this energy and momentum, and if it is there it will have this other energy and momentum, but we cannot say exactly where it is. If we manage to localize it, for instance by having it pass through a narrow opening, then this will cause the uncertainty in the momentum to increase so that we will not know where it will be in the future. In short, quantum mechanical particles travel like waves but hit like particles.

Introduction to Schrödinger's Equation

The key quantity that describes these physical parameters is a function of position and time called the wave function, ψ. It is a complex function, in that every location has a real and an imaginary number associated with it. This amplitude can alternatively be expressed in terms of a magnitude and a phase. The phase tells what portion of the magnitude is real and what portion is imaginary. For example, a phase of 0° is positive and real, 90° is positive and imaginary, 180° is negative and real, and 270° is negative and imaginary. Angles in between are specific mixtures of real and imaginary; for example, 45° corresponds to 0.707 real and 0.707 imaginary (for unit magnitude). The convention of physics is to say that complex numbers cannot be used to describe anything physical. All ‘physical’ quantities such as probability must involve ‘squares’ of the wave function. (Mathematicians have a tendency to disparage such statements. Imaginary numbers are no less real than real numbers; both are abstractions used to describe the ‘physical’ world. It is a useful rule for physicists, though, since this property is enforced by the model we use.) The probability (per unit volume) of a particle being in a given unit volume, for instance, is expressed as ψ*ψ, which is a real number. (The quantity ψ* is the same as ψ with the sign of the imaginary part reversed.) The energy (per unit volume) at a given location is ψ*Hψ, which again is a real quantity. (Note that H is an energy operator: it operates on functions to produce different functions. Therefore, as a general rule, one cannot exchange (mathematically we say commute) H with ψ or ψ*: ψ*Hψ ≠ ψ*ψH ≠ Hψ*ψ.)

In this model of quantum mechanics no ‘physical quantity’ can depend on the phase of ψ. For example, ψ*ψ has the same value if ψ is real and positive as it has if ψ is negative and imaginary, or any combination of real and imaginary parts with the same magnitude. Further, this phase does not have to be the same everywhere for this to work. You can change the phase of ψ any way you want at each individual point in space and ψ* will be changed in exactly the right way to cancel that change out. This property is known as phase invariance and it is equivalent to gauge invariance.
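A minimal sketch of this cancellation is given below. The wave packet and the position-dependent phase are hypothetical functions invented for the illustration; any other choices would serve equally well.

```python
import cmath
import math

def psi(x):
    """Hypothetical wave function: a Gaussian envelope times a plane wave."""
    return math.exp(-x**2) * cmath.exp(2.0j * x)

def local_phase(x):
    """Arbitrary phase (in radians) that differs from point to point."""
    return 0.7 * math.sin(3 * x) + x**2

def psi_rotated(x):
    """The same wave function with its phase changed independently at each x."""
    return cmath.exp(1j * local_phase(x)) * psi(x)

def density(wave, x):
    """Probability density psi* psi at x (a real number)."""
    value = wave(x)
    return (value.conjugate() * value).real

for x in (-1.0, -0.3, 0.0, 0.8, 1.7):
    print(psi(x), psi_rotated(x), density(psi, x), density(psi_rotated, x))
```

The complex values of ψ change at every point, but ψ*ψ is unchanged up to rounding.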

To obtain a quantum expression one starts with a classical expression and then replaces all the relevant quantities by operators, which are things (for example derivatives) that change one function into another. For example, the classical expression for energy is:

H={\frac {\left(\mathbf {p} -\mathbf {A} \right)^{2}}{2m}}+U(x).

Here we have included a potential momentum term A in anticipation of this phase invariance. (Remember that our conventional gauge choice is A = 0, but that is not true for other gauges.) Note as well that this is a generic expression valid for all simple systems of A and U. (In electrodynamics one would simply replace A by qA and U by qV, where q is the charge of the moving particle.) The operators for x and t, and for any function of x and t, are very simple: you just multiply by the relevant quantity. The operator for p_x is p_x → −iħ ∂/∂x and that for H is H → iħ ∂/∂t. (You may be hearing more bells from relativity here.) With this the quantum mechanical Schrödinger equation is formed:

i\hbar {\frac {\partial }{\partial t}}\psi ={\frac {1}{2m}}\left(-i\hbar {\frac {\partial }{\partial x}}-A\right)^{2}\psi +U(x)\psi

Gauge invariance of Schrödinger's Equation

We are now in a position to show that gauge invariance is equivalent to phase invariance and is inherent in Schrödinger's equation. The phase rotation of ψ can be expressed as ψ → ψ' = e^{i f(x,t)/ħ} ψ, where e^y is the exponential of y. Phase invariance says that we should have the same equation for ψ' as for ψ. If we solve for ψ in terms of ψ' we get ψ = e^{−i f(x,t)/ħ} ψ'. Plugging that expression for ψ into the Schrödinger equation we see that:

H\psi =i\hbar {\frac {\partial }{\partial t}}\left(e^{-i{\frac {f(x,t)}{\hbar }}}\psi '\right)=i\hbar \left(-{\frac {i}{\hbar }}{\frac {\partial f(x,t)}{\partial t}}e^{-i{\frac {f(x,t)}{\hbar }}}\psi '+e^{-i{\frac {f(x,t)}{\hbar }}}{\frac {\partial \psi '}{\partial t}}\right)=

e^{-i{\frac {f(x,t)}{\hbar }}}\left({\frac {\partial f(x,t)}{\partial t}}+i\hbar {\frac {\partial }{\partial t}}\right)\psi ',

where we have used the product rule for differentiation and the derivative of the exponential function. We then apply the same procedure to the first of the momentum operators:

\left(p-A\right)\psi =\left(-i\hbar {\frac {\partial }{\partial x}}-A\right)e^{-i{\frac {f(x,t)}{\hbar }}}\psi '=e^{-i{\frac {f(x,t)}{\hbar }}}\left(-i\hbar \left(-{\frac {i}{\hbar }}{\frac {\partial f(x,t)}{\partial x}}+{\frac {\partial }{\partial x}}\right)-A\right)\psi '=

e^{-i{\frac {f(x,t)}{\hbar }}}\left(p-{\frac {\partial f(x,t)}{\partial x}}-A\right)\psi '

using the same steps as for H and using p = −iħ ∂/∂x for the last step. Notice that pulling the phase term (the exponential) out front had the effect of subtracting an additional term, ∂f(x,t)/∂x, from p. Pulling this phase factor through the second, identical momentum term works in exactly the same way, and our final result is:

e^{-i{\frac {f(x,t)}{\hbar }}}\left[{\frac {1}{2m}}\left(p-\left(A+{\frac {\partial f(x,t)}{\partial x}}\right)\right)^{2}+\left(U(x)-{\frac {\partial f(x,t)}{\partial t}}\right)\right]\psi '=e^{-i{\frac {f(x,t)}{\hbar }}}H\psi ',

which, after canceling the same phase factor on both sides of the equation, is an equation for ψ' that is the same as our original equation for ψ if we let A' = A + ∂f(x,t)/∂x and U' = U − ∂f(x,t)/∂t. These are, of course, our equations for gauge invariance. Again we see that gauge invariance is a fundamental property of nature and any physical theory needs to include it, including quantum mechanics.
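The key step, pulling the phase factor through one factor of (p − A), can be checked numerically. The sketch below works in units with ħ = 1 at a fixed time, uses central differences for the derivative, and picks a hypothetical ψ', A and f purely for illustration.

```python
import cmath
import math

HBAR = 1.0  # units with hbar = 1 (an assumption of this sketch)

def ddx(func, x, h=1e-5):
    """Derivative of a complex-valued function by central differences."""
    return (func(x + h) - func(x - h)) / (2 * h)

def f(x): return math.sin(x)             # gauge function at a fixed time
def df_dx(x): return math.cos(x)

def a_old(x): return 0.3 * x             # A in the original gauge
def a_new(x): return a_old(x) + df_dx(x) # A' = A + df/dx

def psi_new(x):
    """Hypothetical psi' (Gaussian envelope times a plane wave)."""
    return cmath.exp(-0.5 * x**2 + 1.5j * x)

def psi_old(x):
    """psi = exp(-i f / hbar) * psi'."""
    return cmath.exp(-1j * f(x) / HBAR) * psi_new(x)

def p_minus_a(wave, vector_potential, x):
    """(p - A) acting on `wave`, with p = -i hbar d/dx."""
    return -1j * HBAR * ddx(wave, x) - vector_potential(x) * wave(x)

for x in (-0.9, 0.2, 1.4):
    lhs = p_minus_a(psi_old, a_old, x)
    rhs = cmath.exp(-1j * f(x) / HBAR) * p_minus_a(psi_new, a_new, x)
    print(abs(lhs - rhs))  # small, limited by the finite-difference step
```

Acting with (p − A) on ψ gives the same result as the phase factor times (p − A') acting on ψ', which is the identity used twice in the derivation above.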

Gauge Invariance and Special Relativity

As we have seen, gauge invariance is a general property applicable to the potentials of both classical and quantum mechanics. Here we will show that gauge invariance is a general property of special relativity as well. Consider a moving bus. If the bus is moving perfectly smoothly, with no accelerations either to turn or to speed up or slow down, then people inside the bus cannot tell that they are moving. From their point of view, the bus is stationary and the world is moving backwards. The technical term for such a smoothly moving reference frame is an inertial system. If they checked, the people on the bus would see the same laws of physics applying inside the bus as those used in the reference frame of the ground. The numbers they used and the numbers they calculated would be different, but the equations would be the same. In fact, any person moving in any inertial frame in any direction will see themselves as being stationary while the rest of the world moves, and each of these inertial frames would have the same equations with nothing to distinguish them. So which is the correct stationary system? The answer should not surprise you by now: all of them. There is no absolute stationary frame. What matters is the relative motion of one reference frame to another. (After all, even the Earth is moving.)

Introduction to Relativity

Relativity describes what a person in one reference frame observes when they 'look' into another reference frame that is moving relative to it. Imagine that a person on the bus sets up a grid with a ruler and a clock on each corner of the grid that are all synchronized in his reference frame. Using this coordinate system he is able to measure the location and time of any event in his reference frame. He may label a particular event, such as a blown tire, as (t',x',y',z') where t' is the time that the event occurred and the other coordinates represent the location of the event. The person on the ground sets up her own (stationary to her) grid with her own synchronized clocks. In her reference frame, the grid on the bus is moving with a speed v and the blown tire occurs at a time and location of (t,x,y,z). For a given time and location of any event in the primed reference frame (t',x',y',z'), relativity allows us to calculate the observed time and location in the unprimed frame (t,x,y,z). Such a calculation is called a transformation and depends on the relative velocity between the two reference frames.

The correct transformation that is valid for all relative velocities is called a Lorentz transformation and has been verified over a large range of speeds. For ordinary relative speeds the Lorentz transformation agrees very well with everyday experience. But for speeds near the speed of light (or for extremely sensitive measurements at slower speeds) the transformation has a number of oddities that we typically do not experience. The person on the ground and the person on the bus will disagree about almost everything that we would expect them to agree on. Each will see the clocks of the other system running slower and the distances between the other's grid points as shorter. Worse, each will observe the clocks of the other system as unsynchronized, even though each worked painstakingly to synchronize the clocks in their own reference frame. They will agree, however, on a few things, which we will call invariants. For instance, they will both agree about what a clock reads in the clock's own reference system (but only at that clock's location). This time is called an object's proper time, τ.
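As a rough numerical illustration of these oddities, the sketch below applies the standard Lorentz transformation for motion along x to events on the bus. The function name `bus_to_ground`, the choice v = 0.6c, and the one-light-second clock spacing are assumptions made for this example only.

```python
import math

C = 299_792_458.0  # speed of light in m/s


def bus_to_ground(t_bus, x_bus, v):
    """Map an event (t', x') in the bus frame to (t, x) in the ground frame.

    The bus is assumed to move at speed v along +x relative to the ground.
    """
    gamma = 1.0 / math.sqrt(1.0 - (v / C) ** 2)
    t = gamma * (t_bus + v * x_bus / C ** 2)
    x = gamma * (x_bus + v * t_bus)
    return t, x


v = 0.6 * C  # illustrative relative speed

# Two ticks of a single bus clock at x' = 0, one second apart in bus time.
t_first, _ = bus_to_ground(0.0, 0.0, v)
t_second, _ = bus_to_ground(1.0, 0.0, v)
dilation = t_second - t_first  # ground time between the ticks, about 1.25 s

# Two bus clocks one light-second apart, both reading t' = 0.
t_rear, _ = bus_to_ground(0.0, 0.0, v)
t_front, _ = bus_to_ground(0.0, C, v)
desync = t_front - t_rear  # about 0.75 s: not simultaneous on the ground

print(dilation, desync)
```

The first result shows the ground observer seeing the bus clock run slow; the second shows that clocks synchronized on the bus are not synchronized according to the ground.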

An important invariant is that observers in all reference systems find a speed limit that nothing in their own reference system can exceed. This speed limit is given the symbol c and has the same value in all reference systems (c ≈ 300,000 km/s ≈ 186,000 mi/s). In fact this 'speed limit', and the fact that everyone measures it as having the same value, may be seen as the cause of all this oddity. Imagine that event A shoots out something moving at c which then causes an event B at a different location. Different reference systems will see the trip starting at different times, taking different amounts of time, and covering different distances. They will agree, however, that the particle traveled at the same speed c and reached the location just in time to cause event B. If the time and distance traveled during the trip are (∆t,∆x,∆y,∆z), then the square of the distance traveled is ∆x² + ∆y² + ∆z², which everyone agrees equals (c∆t)² even though they disagree about almost everything else.

Therefore the value of the quantity (∆s)² = (c∆t)² − (∆x² + ∆y² + ∆z²) is the same for all reference systems, and it is identically zero if the events A and B are linked by a particle traveling at speed c. This quantity is called the interval, and it is important because it is invariant for any pair of events A and B whatsoever. The interval is positive whenever something traveling at c from A would reach the location of B before B occurs. Intervals that are positive or zero are said to be causal, in that event A could affect or cause event B if it happened to send out a particle at speed c. Zero intervals are barely causal. Negative intervals indicate that A cannot cause B, since B occurs before news from A (moving at speed c) could reach it.
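The invariance of the interval can be checked numerically. The following is a minimal sketch, assuming units in which distances are measured in light-seconds so that c = 1, and assuming motion along x only. The helper names `boost` and `interval_sq` and the sample separations are made up for this illustration.

```python
import math

C = 1.0  # speed of light, with distances in light-seconds (assumed units)


def boost(dt, dx, v):
    """Lorentz-transform a separation (dt, dx) into a frame moving at +v along x."""
    gamma = 1.0 / math.sqrt(1.0 - (v / C) ** 2)
    return gamma * (dt - v * dx / C ** 2), gamma * (dx - v * dt)


def interval_sq(dt, dx):
    """Squared interval (c*dt)^2 - dx^2 for a separation along x."""
    return (C * dt) ** 2 - dx ** 2


v = 0.8 * C  # illustrative relative speed

# Separations (dt, dx) between events A and B as seen from the ground.
separations = {
    "light signal (zero interval)": (2.0, 2.0),
    "causal (positive interval)": (2.0, 1.0),
    "not causal (negative interval)": (1.0, 2.0),
}

results = {}
for label, (dt, dx) in separations.items():
    dt_moving, dx_moving = boost(dt, dx, v)
    results[label] = (interval_sq(dt, dx), interval_sq(dt_moving, dx_moving), dt_moving)
    print(label, results[label])
```

In each case the two interval values agree even though ∆t and ∆x separately do not. For the negative-interval pair the boosted ∆t even changes sign, so the two observers disagree about which event came first, which is consistent with neither event being able to cause the other.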

Gauge Invariance of the relativistic momentum-energy

Products and quotients of invariants are themselves invariant, which allows us to form many other invariants from the ones we already have: c, τ, (∆s)². One important invariant relates the energy and momentum of a particle: E² − (pc)² = (mc²)², where E is the energy of the particle, p is its momentum, and m is its mass. Defining a total energy H = E + V and a total momentum P = p + A, where V and A represent stored energy and momentum respectively, we can solve for the total energy

H={\sqrt {\left(\mathbf {P} -\mathbf {A} \right)^{2}c^{2}+\left(mc^{2}\right)^{2}}}+V,

where A and V are functions of time and position. The force on the particle is F = dp/dt = dP/dt − dA/dt. We will solve for this force using the Hamiltonian formulation,

{\frac {\partial H}{\partial x}}=-{\frac {dP_{x}}{dt}},

{\frac {\partial H}{\partial P_{x}}}={\frac {dx}{dt}}=v_{x},

and the corresponding equations for the y and z components. Care must be taken here to distinguish between the partial derivatives and the full derivatives. Using these equations we arrive at:

-{\frac {\left(\mathbf {P} -\mathbf {A} \right)c^{2}}{\sqrt {\left(\mathbf {P} -\mathbf {A} \right)^{2}c^{2}+\left(mc^{2}\right)^{2}}}}\cdot {\frac {\partial \mathbf {A} }{\partial x}}+{\frac {\partial V}{\partial x}}=-{\frac {dP_{x}}{dt}},

{\frac {\left(P_{x}-A_{x}\right)c^{2}}{\sqrt {\left(\mathbf {P} -\mathbf {A} \right)^{2}c^{2}+\left(mc^{2}\right)^{2}}}}=v_{x}

and the corresponding equations for the y and z components. Plugging the second equation (and its y and z counterparts) into the first and then solving for the force, we obtain:

F_{x}={\frac {dP_{x}}{dt}}-{\frac {dA_{x}}{dt}}=\mathbf {v} \cdot {\frac {\partial \mathbf {A} }{\partial x}}-{\frac {\partial V}{\partial x}}-{\frac {dA_{x}}{dt}}.

Expanding the dot product and using the chain rule for full derivatives (d/dt = ∂/∂t + v_x∂/∂x + v_y∂/∂y + v_z∂/∂z) we get:

F_{x}=\left(v_{y}{\frac {\partial A_{y}}{\partial x}}-v_{y}{\frac {\partial A_{x}}{\partial y}}\right)+\left(v_{z}{\frac {\partial A_{z}}{\partial x}}-v_{z}{\frac {\partial A_{x}}{\partial z}}\right)+\left(-{\frac {\partial V}{\partial x}}-{\frac {\partial A_{x}}{\partial t}}\right)

after rearranging terms. From here it is straightforward to see that a gauge transformation of V' = V + ∂f/∂t, A'_x = A_x − ∂f/∂x, A'_y = A_y − ∂f/∂y, A'_z = A_z − ∂f/∂z leads to the same force, since partial derivatives commute for sufficiently smooth functions. Once again gauge invariance rules the day.

Incidentally, the last equation can be placed in a more illuminating form if we use the cross product and the curl operator:

\mathbf {F} =\mathbf {v} \times \left(\nabla \times \mathbf {A} \right)+\left(-\nabla V-{\frac {\partial \mathbf {A} }{\partial t}}\right),

which, when A is replaced by eA and V by eV, is the Lorentz force law for a particle of charge e in a magnetic field B = ∇ × A and an electric field E = −∇V − ∂A/∂t. These are the standard definitions of E and B in terms of A and V.
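The gauge invariance of the force can also be checked numerically. The sketch below evaluates the component formula for F_x with central finite differences, once for a set of potentials and once for the gauge-transformed potentials V' = V + ∂f/∂t, A' = A − ∇f. The particular A, V, and f, the sample point, the velocity, and the step size are arbitrary choices made for this illustration and are not taken from the text.

```python
import math

H = 1e-5  # step size for central differences (an arbitrary small value)


def partial(func, point, axis):
    """Central-difference partial derivative at point = (t, x, y, z).

    axis selects the variable: 0 = t, 1 = x, 2 = y, 3 = z.
    """
    up, down = list(point), list(point)
    up[axis] += H
    down[axis] -= H
    return (func(*up) - func(*down)) / (2 * H)


def force_x(A, V, point, v):
    """F_x from the component formula, with A(t, x, y, z) returning (Ax, Ay, Az)."""
    Ax = lambda *p: A(*p)[0]
    Ay = lambda *p: A(*p)[1]
    Az = lambda *p: A(*p)[2]
    _, vy, vz = v
    return (vy * (partial(Ay, point, 1) - partial(Ax, point, 2))
            + vz * (partial(Az, point, 1) - partial(Ax, point, 3))
            - partial(V, point, 1)
            - partial(Ax, point, 0))


# Arbitrary smooth potentials, chosen only for this example.
def A(t, x, y, z):
    return (y * z * t, x * x * z, math.sin(t) * x * y)


def V(t, x, y, z):
    return x * y + t * z * z


# Arbitrary gauge function f = t*x*y*z + sin(x + 2t), with analytic derivatives.
def df_dt(t, x, y, z):
    return x * y * z + 2 * math.cos(x + 2 * t)


def grad_f(t, x, y, z):
    return (t * y * z + math.cos(x + 2 * t), t * x * z, t * x * y)


def A_new(t, x, y, z):
    return tuple(a - g for a, g in zip(A(t, x, y, z), grad_f(t, x, y, z)))


def V_new(t, x, y, z):
    return V(t, x, y, z) + df_dt(t, x, y, z)


point = (0.3, 1.0, -0.5, 2.0)  # (t, x, y, z)
v = (0.1, -0.2, 0.4)           # (vx, vy, vz)

f_before = force_x(A, V, point, v)
f_after = force_x(A_new, V_new, point, v)
print(f_before, f_after)
```

The potentials themselves change at the sample point, yet the two force values agree to within the finite-difference error.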

Gauge invariance and rotating systems

As a guide to the physical nature of the potential momentum, and as an example where we use the mechanical equivalent of the Lorentz force law, consider a coordinate system rotating uniformly with an angular speed ω about a particular axis. An observer in a stationary reference frame will see the rotating reference frame accelerating away (due to centripetal acceleration) from any given point in space. The magnitude of this centripetal acceleration is ω²a, where a is the shortest distance to the axis of rotation, and its direction is inward (toward the axis). Further, the stationary observer would see the rotating reference frame rotating into any given point in space with a constant velocity that depends on the distance from the axis of rotation. This velocity is ω×r, where r is the location of the object, and it points in the direction the coordinate system is rotating.

Putting both of these together the inertial observer would predict that the rotating observer would see a potential momentum A =

-m\omega ^{2}at{\hat {a}}+m\mathbf {\omega } \times \mathbf {r}

that is valid in the rotating system. (Here m is the mass of an object at location r in the rotating coordinate system.) The inertial observer would also predict that the rotating observer would see no potential energy, U = 0.

But what would the rotating observer see? To find out, let's use the force equation we derived above from the relativistic Hamiltonian. To do that we need to calculate the partial time derivative and the curl of the potential momentum. The partial derivative with respect to time of the second term is zero, as is the curl of the first term. (A simple drawing of the first term will show that there is no circulation anywhere and therefore no curl. It is also straightforward to show this by taking the curl in cylindrical coordinates.) Solving for the force we find that F =

m\omega ^{2}a{\hat {a}}+\mathbf {v} \times (2m\mathbf {\omega } ),

where the first term is the 'centrifugal force' and is similar to the electric field portion of the Lorentz force law above and the second term is the 'Coriolis force' which is similar to the magnetic field portion of the Lorentz force law.
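This result can be cross-checked numerically. The following sketch assumes the rotation axis is the z axis, so that a â = (x, y, 0) and ω × r = (−ωy, ωx, 0), and evaluates F = v × (∇ × A) − ∂A/∂t with U = 0 by central differences. The mass, angular speed, sample point, and velocity are arbitrary values chosen for the example.

```python
H = 1e-5  # step size for central differences (an arbitrary small value)

M = 2.0      # mass (arbitrary value)
OMEGA = 3.0  # angular speed about the z axis (arbitrary value)


def potential_momentum(t, x, y, z):
    """A = -m w^2 a t a_hat + m (w x r), for rotation about the z axis."""
    return (M * (-OMEGA ** 2 * x * t - OMEGA * y),
            M * (-OMEGA ** 2 * y * t + OMEGA * x),
            0.0)


def partial(func, point, axis):
    """Central-difference partial derivative; axis 0 = t, 1 = x, 2 = y, 3 = z."""
    up, down = list(point), list(point)
    up[axis] += H
    down[axis] -= H
    return (func(*up) - func(*down)) / (2 * H)


def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def force(A, point, v):
    """F = v x (curl A) - dA/dt (partial in t), taking the potential energy as zero."""
    def component(i):
        return lambda *p: A(*p)[i]

    Ax, Ay, Az = component(0), component(1), component(2)
    curl = (partial(Az, point, 2) - partial(Ay, point, 3),
            partial(Ax, point, 3) - partial(Az, point, 1),
            partial(Ay, point, 1) - partial(Ax, point, 2))
    dA_dt = tuple(partial(c, point, 0) for c in (Ax, Ay, Az))
    magnetic_like = cross(v, curl)
    return tuple(magnetic_like[i] - dA_dt[i] for i in range(3))


point = (0.7, 1.5, -0.4, 0.2)  # (t, x, y, z)
v = (0.3, 1.1, -0.6)           # velocity in the rotating frame

numeric = force(potential_momentum, point, v)

# Centrifugal term m w^2 a a_hat plus Coriolis term v x (2 m w).
_, x, y, _ = point
coriolis = cross(v, (0.0, 0.0, 2 * M * OMEGA))
expected = (M * OMEGA ** 2 * x + coriolis[0],
            M * OMEGA ** 2 * y + coriolis[1],
            coriolis[2])
print(numeric, expected)
```

For these sample values both tuples come out close to (40.2, −10.8, 0), matching the centrifugal and Coriolis terms.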

How would the observer in the rotating system interpret this, though? The first term is easy to deal with in terms of a conserved centrifugal potential energy, −½mω²a². The second is trickier, since it has the same difficulty we faced with the magnetic field term in magnetostatics above. Unlike the observer in the constantly accelerating rocket, the rotating observer cannot simply choose a gauge where A = 0. There is no a priori choice for A. The best that observer can do (without information from an inertial reference frame) is to use the analogy between the Coriolis force and the magnetic force and to apply the methods of magnetostatics to the problem. But the vector potential of magnetostatics is explicitly gauge invariant, which brings us around to where we started.