Any theory can be made covariant?



tashirosgt
2009-Aug-12, 07:23 PM
I've been trying to read the paper http://www.pitt.edu/~jdnorton/papers/decades.pdf, which DrRocket recommended in another thread. Not being an expert in general relativity, I probably miss much of what is being said. One passage that did pique my interest is on page 815:


Kretschmann's point is that there must be something more to a relativity principle than covariance. For he argues that we can take any theory and reformulate it so that it is covariant under any group of transformations we pick; the problem is not physical, it is merely a challenge to mathematical ingenuity. In brief, general covariance is physically vacuous.

I can understand how a mathematical expression can be altered to be invariant under a given set of transformations. But I don't understand how any physical theory can be made invariant and still be the same physical theory. Can someone explain that?

DrRocket
2009-Aug-12, 11:07 PM
I've been trying to read the paper which DrRocket recommended in another thread. Not being an expert in general relativity, I probably miss much of what is being said. One passage that did pique my interest is on page 815



I can understand how a mathematical expression can be altered to be invariant under a given set of transformations. But I don't understand how any physical theory can be made invariant and still be the same physical theory. Can someone explain that?

If you recall I offered that link as an example of how the notion of covariance was turned into a hash in philosophical discussions.

What seems to be offered in that passage is an assertion of Kretschmann, without proof, that any theory can be made invariant under any group of transformations. I'm inclined to not believe that assertion without proof.

You cannot alter any mathematical expression to be invariant under any group of transformations. In fact the group of transformations can define the theory.

You can develop special relativity starting with 4-space and the Lorentz metric -- Minkowski. The Lorentz transformations then arise as the group of transformations that preserve inner products under the metric, subject to a couple of other restrictions. The metric is not preserved by all of GL(4,R), and that is what makes special relativity different from Newtonian mechanics.
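A minimal numerical sketch of that last point, under stated assumptions: the metric signature (-, +, +, +), units with c = 1, and the helper names `minkowski`, `apply_matrix`, `boost_x` and `STRETCH` are all illustrative choices, not anything taken from the paper. It checks that a boost along x leaves the Minkowski inner product of two 4-vectors unchanged, while an arbitrary invertible matrix from GL(4,R) (here a stretch of the x axis) does not.

```python
import math

# Diagonal of the Minkowski metric, signature (-, +, +, +) (an assumed convention).
ETA = [-1.0, 1.0, 1.0, 1.0]

def minkowski(u, v):
    """Inner product of two 4-vectors under the diagonal Minkowski metric."""
    return sum(e * a * b for e, a, b in zip(ETA, u, v))

def apply_matrix(matrix, vec):
    """Multiply a 4x4 matrix (given as a list of rows) by a 4-vector."""
    return [sum(m * x for m, x in zip(row, vec)) for row in matrix]

def boost_x(beta):
    """Lorentz boost along x with speed beta, in units where c = 1."""
    gamma = 1.0 / math.sqrt(1.0 - beta * beta)
    return [
        [gamma, -gamma * beta, 0.0, 0.0],
        [-gamma * beta, gamma, 0.0, 0.0],
        [0.0, 0.0, 1.0, 0.0],
        [0.0, 0.0, 0.0, 1.0],
    ]

# An element of GL(4,R) that is not a Lorentz transformation:
# it stretches the x axis by a factor of 2.
STRETCH = [
    [1.0, 0.0, 0.0, 0.0],
    [0.0, 2.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, 0.0],
    [0.0, 0.0, 0.0, 1.0],
]

u = [2.0, 1.0, 0.5, -1.0]
v = [1.0, 3.0, -2.0, 0.25]

original = minkowski(u, v)
boosted = minkowski(apply_matrix(boost_x(0.6), u), apply_matrix(boost_x(0.6), v))
stretched = minkowski(apply_matrix(STRETCH, u), apply_matrix(STRETCH, v))

print("original :", original)
print("boosted  :", boosted)    # agrees with original up to rounding
print("stretched:", stretched)  # differs from original
```

The boost is only one element of the Lorentz group, so this is a spot check rather than a proof; the general condition is that a matrix Λ preserves the inner product exactly when Λᵀ η Λ = η.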

Ken G
2009-Aug-16, 06:32 AM
I suspect what Kretschmann means is analogous to what Dirac did to quantum mechanics-- he took a theory that was not generally covariant under Lorentz transformations, and used "ingenuity" to make it generally covariant. I agree with DrRocket that a blanket statement that any theory can be made generally covariant with regard to any given set of transformations requires proof, but I am willing to accept the proposition as a potentially valid hypothesis. Then the question is, what does that imply in regard to being "physically vacuous"? That's hard to interpret, but it seems to me he cannot take a natural meaning for that phrase and still be right. I would say a natural meaning for the phrase "physically vacuous" is that it cannot predict any observations, as opposed to "physically informative", which means it can. That cannot be Kretschmann's meaning, however, because with that meaning his statement would be incorrect for pretty obvious reasons.

So what is Kretschmann's meaning of "physically vacuous"? I'm going to guess that he means something more along the lines of, it is not amazing that the theory that describes our universe so well is generally covariant according to some group of transformations, for that is the same as saying that the universe follows laws (a statement that is not physically vacuous, so he is not even correct given that meaning, but it does make the point that we should not be amazed some relativity applies). Thus I think his argument boils down to:
1) for science to work, all observers must see things that can be translated between them in predictable ways, to recover a concept of objective reality, and
2) the translations should obey symmetry principles that can be accommodated in some group of transformations, and then
3) that group of transformations will generate the covariance requirement of that brand of relativity. In short, there must always be some kind of relativity, or science doesn't work.

As I said, this is actually not a physically vacuous statement, it is a physically profound statement, because it justifies physics. But if you take it for granted that physics has to work, then you can view this as an obvious corollary, and that is in some small sense "vacuous." But what Kretschmann also appears to be overlooking is that most of the battle here is finding which flavor of relativity is the right one, in other words, what is the specific group of transformations that the laws of physics should be generally covariant in regard to. That is hardly vacuous, that is the astonishing breakthrough of Einstein's relativity.

So I think the problem here is, when people say a theory is "generally covariant", it can be unclear if they mean covariant with regard to some group of transformations, or covariant with regard to the Lorentz group of transformations (I won't get into general relativity, it's hard enough to make correct statements about special relativity). Most people mean the latter, so the aspect of that which is highly non-vacuous is the prescribing of the group that works. What amazes physicists is not that the laws are covariant in regard to some group, it is that they are covariant in regard to this particular group. Galilean relativity involves covariance in regard to a far more intuitive seeming group, and when that turned out to be wrong, then it could have been any complicated group you could imagine. If it's not the simple, intuitive one, then it's Pandora's Box, you might think. But surprise, there is a different, highly non-intuitive, yet fairly simple group that does work. That's not "vacuous", that's one of the most profound physical discoveries of all time.

tashirosgt
2009-Aug-16, 04:29 PM
I have less trouble with the word "invariance" than the word "covariance". There are technical definitions for "covariant" and "contra-variant" tensors. (It would not surprise me if one can create "invariants" from these and vice-versa.) The Wikipedia article offers the definition of "General covariance". It says that the form of physical laws should not change under coordinate transformations. However it does not explain the technical meaning of "form".

Suppose we have a mathematical expression written in terms of scalar variables in one coordinate system. If we change coordinate systems, we can replace each variable in the original expression with an expression that represents its value in the new coordinate system. If we define "form" in such a way that the new expression obtained in this manner is of the same "form" then nothing miraculous has happened when the "form" is preserved by transformation of coordinates.

How can we use transformation of coordinates to define a class of mathematical expressions (or equations, relations etc.) that have "true physical content"? I think we must define "form" in some way so that not all mathematical expressions retain the same "form" when coordinates are changed. The ones that do retain the same "form" become the expressions that describe the physical theory.

The way that physics is usually presented in elementary textbooks is that certain equations are asserted to be true in some standard coordinate system. Then it is asserted that in a new coordinate system, the applicable law is obtained by doing a change of coordinates. This does isolate a limited set of correct equations, but the change of coordinate machinery is not (by itself) what is distinguishing true equations from false equations.

Perhaps Kretschmann was asserting that the "General covariance" property (however Einstein defined it) was a relatively trivial definition of "form" and that Einstein's theory requires the same information that elementary textbooks give in addition to "General covariance" before it describes a particular physical theory.

Ken G
2009-Aug-17, 04:50 AM
I have less trouble with the word "invariance" than the word "covariance". There are technical definitions for "covariant" and "contra-variant" tensors.
It is my impression that this is a fairly coincidental usage of the same word "covariant"; it's not really the same thing.

The Wikipedia article offers the definition of "General covariance". It says that the form of physical laws should not change under coordinate transformations. However it does not explain the technical meaning of "form".
The technical meaning is not all that technical-- it just means you can write the equations down without specifying the coordinates being used. The equations are totally independent of the coordinates, so the equations transcend the coordinates. The significance of this is that we can imagine the equations are something real, and the coordinates are something arbitrary. The former has more to do with the objective universe, the latter has more to do with the subjective observer. This separation is quite important for salvaging the meaning of objectivity, and even science itself. So we have to have some concept of general covariance, but we do better-- we find the actual group of symmetry transformations.



Suppose we have a mathematical expression written in terms of scalar variables in one coordinate system. If we change coordinate systems, we can replace each variable in the original expression with an expression that represents its value in the new coordinate system. If we define "form" in such a way that the new expression obtained in this manner is of the same "form" then nothing miraculous has happened when the "form" is preserved by transformation of coordinates.
If you are using the technical meaning of "scalar", it holds that the values of the scalars won't depend on the coordinates used. So any theory built entirely from scalars will always be generally covariant, automatically. But I suspect you take the more informal meaning of "scalar" as a "number". In that case the point is, if your formula calls some number X, then that's all you are allowed to write in the form of that formula-- you cannot also state the coordinates that give the value of X that makes the formula true. Any formula is true in some coordinates that you can pick to make the formula work in any specific situation; the trick is to find formulae that work in coordinates that you can specify without knowing the specific situation. In short, they need to work in any coordinates.


The way that physics is usually presented in elementary textbooks is that certain equations are asserted to be true in some standard coordinate system.
That would be a very unfortunate concept for a textbook to imply, because it's tantamount to saying the equation is only true for a particular observer. That would be like saying physics isn't objective, and it would be toast at that point. At the very least, a formula must be true for a whole class of observers, a class of coordinates. These classes can also be organized in terms of the symmetry operations that transform between them, like a fixed translation of an origin, or motion at constant speed.


Then it is asserted that in a new coordinate system, the applicable law is obtained by doing a change of coordinates. This does isolate a limited set of correct equations, but the change of coordinate machinery is not (by itself) what is distinguishing true equations from false equations.
That approach would basically turn physics back to the time of the Greeks, where they thought the Earth was a "unique" reference frame. Then by Galileo and Newton, it was realized that whole classes of observers could use the same equations (the symmetries of fixed translation or motion at constant speed), and Einstein generalized that to all observers-- Einstein's symmetry is "swap observers", period.



Perhaps Kretschmann was asserting that the "General covariance" property (however Einstein defined it) was a relatively trivial definition of "form" and that Einstein's theory requires the same information that elementary textbooks give in addition to "General covariance" before it describes a particular physical theory.
If he was saying that, and if he imagined that elementary textbook formulae only work in a particular coordinate system and have to be translated into all others, then he was missing the whole point of general relativity. That statement is like saying that physics only works, in its "pristine" form if you will, for a single observer, and all other observers must translate their experiences to the perspective of that one observer to make sense of them. Instead, GR makes the diametrically opposite assertion, that translation is never necessary: any observer can use physics in exactly the same way, as if it was all built just for them.

tashirosgt
2009-Aug-17, 06:57 PM
Using the phrase that "you can write the equations without specifying the coordinates being used" sounds more specific than saying the equations keep the "same form" in various coordinate systems, but whatever idea these phrases are supposed to communicate is still unclear.

In general, an equation is a string of symbols with an "=" mark near the middle. The individual symbols do not have to represent numbers or n-tuples of numbers and the operations do not have to represent the ordinary arithmetic of multiplication and division. The definition of "=" can be complicated.

When it comes time to set up experiments in laboratories, the abstract symbolic equations must be translated into some specific set of (ordinary algebraic) equations where the symbols do represent sets of numbers and the operations do represent ordinary arithmetic (including the arithmetic of the complex numbers).

How do we decide whether the thing represented by a given abstract symbol indeed remains the same when we change reference frames or not? We have to make a definition of "sameness" in terms of ordinary numbers and operations. For example, to show a "purely geometric" object like a vector remains the same, we show the function that defines its length remains invariant. There is a similar way to handle "direction" by looking at ratios of functions of its coordinates. An old fashioned view of science and mathematics is that we have a Platonic idea of a vector and when it comes to checking whether two vectors expressed in different coordinate systems are equal "we just know" how to do it, e.g. what formulas for length to use in cartesian representations or spherical polar representations etc. The modern view would be that you are certainly free to make up mathematical systems involving things called vectors that are unrelated to numbers or ordinary arithmetic. But if you want to claim that some mathematical object represented by an n-tuple of numbers is an example of a vector, you have to supply a lot of detail about what you are using for the definition of equality and how you define the vector operations.
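To make the "what formula for length to use" point concrete, here is a small sketch in two dimensions (plane polar rather than spherical polar coordinates, purely to keep it short). The function names and the sample numbers are illustrative assumptions. A vector attached to the point (x, y) is given by Cartesian components, converted to polar components, and its length is then computed with the formula appropriate to each representation.

```python
import math

def polar_components(x, y, vx, vy):
    """Convert the Cartesian components (vx, vy) of a vector attached to the
    point (x, y) into polar components (v_r, v_theta), using the Jacobian of
    the map (x, y) -> (r, theta). Assumes (x, y) is not the origin."""
    r = math.hypot(x, y)
    v_r = (x * vx + y * vy) / r
    v_theta = (x * vy - y * vx) / (r * r)
    return r, v_r, v_theta

def length_cartesian(vx, vy):
    """Length formula for Cartesian components."""
    return math.sqrt(vx * vx + vy * vy)

def length_polar(r, v_r, v_theta):
    """Length formula for polar components, from ds^2 = dr^2 + r^2 dtheta^2."""
    return math.sqrt(v_r * v_r + (r * v_theta) ** 2)

x, y, vx, vy = 3.0, 4.0, 1.0, -2.0
r, v_r, v_theta = polar_components(x, y, vx, vy)

len_cart = length_cartesian(vx, vy)
len_pol = length_polar(r, v_r, v_theta)
# Reusing the Cartesian formula on the polar components gives a different number.
len_naive = math.sqrt(v_r * v_r + v_theta * v_theta)

print(len_cart, len_pol, len_naive)
```

The two lengths agree only because the polar formula carries the extra factor of r; applying the Cartesian formula to the polar components does not reproduce the length. That extra factor is exactly the kind of supplied detail the paragraph above says is needed before an n-tuple can be called a vector.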


I think saying that a law of physics (as an equation) can be written without reference to a coordinate system actually makes complicated claims about referring things to coordinate systems. I think it claims at least the following:
1. We have defined a unique way to translate each symbol in the law into some mathematical object in each coordinate system
2. There is a mathematical definition for how to judge whether two objects expressed in different coordinate systems are equal (i.e. "the same")
3. The left hand and right hand side of the physical law (when completely worked out) also can be associated with some type of mathematical object in each coordinate system. We have also defined what it means for representations of this object in two different coordinate systems to represent the same object
4. Suppose we translate the physical law to a coordinate system and change coordinates of all the symbols involved in the law (including the final result of the left and right hand sides) so they are expressed in a second coordinate system. Then by translating from the second coordinate system to the abstract symbols of the physical law we can recover the same value for each individual symbol in the physical law as we originally had.


One distinction that needs to be made in this discussion is the distinction between a general change of coordinates and the particular changes of coordinates that define a change of "reference frame". I have yet to find a clear statement on the web about what a "reference frame" is. In particular, is the idea of a "reference frame" always associated with mechanics (mass, force etc.) or is it a more general idea?

Ken G
2009-Aug-17, 08:14 PM
Using the phrase that "you can write the equations without specifying the coordinates being used" sounds more specific than saying the equations keep the "same form" in various coordinate systems, but whatever idea these phrases are supposed to communicate is still unclear.
It means you can write down the equation before you know anything about the coordinate system it is meant to be applied to. Then, once you know the coordinates, you don't have to change the equation in any way. At this point you have still not substituted any numbers into that equation-- at that stage, the coordinates come in.

When it comes time to set up experiments in laboratories, the abstract symbolic equations must be translated into some specific set of (ordinary algebraic) equations where the symbols do represent sets of numbers and the operations do represent ordinary arithmetic (including the arithmetic of the complex numbers).
Correct, and the translation depends on the coordinates. But the equation being translated, and each and every symbol in that equation, will have the same meaning for all of them.


How do we decide whether the thing represented by a given abstract symbol indeed remains the same when we change reference frames or not?
Because its meaning is very simple-- it will have meanings like "the velocity of the particle", or the "location of the particle." That symbol will have that meaning for every observer. What numbers they associate with each symbol is coordinate-dependent, but not the meaning of the symbol. That's the difference between a "form" and a bunch of numbers. It is the former that is "objectively real" in physics, prior to even specifying any observer. The latter only achieves an objectively real meaning after you supply the observer.


We have to make a definition of "sameness" in terms of ordinary numbers and operations.
No, it is quite important (for objective physics) that we do not need actual numbers for that. We only need to supply the same definition of a symbol, and that does not require any specific numbers. Saying the symbols have the same meaning for all observers just means they have the same definition for all observers, because definition is exactly the same thing as meaning. When someone tells you the meaning of a velocity or a position, they just have to tell you how to get it, they do not need to supply a single number. Sometimes, for pedagogical reasons, we find it helps to give examples that use actual numbers, but that is not required for meaning, and after a while we find we need no examples at all.


For example, to show a "purely geometric" object like a vector remains the same, we show the function that defines its length remains invariant.
True, but this requires only manipulations of the vector definitions and identities. No coordinates are ever needed, so no vectors are ever given any numbers when we show those things.


The modern view would be that you are certainly free to make up mathematical systems involving things called vectors that are unrelated to numbers or ordinary arithmetic.
Actually, the definitions of vectors assert ways to apply arithmetic to them, so this is not true.

But if you want to claim that some mathematical object represented by an n-tuple of numbers is an example of a vector, you have to supply a lot of detail about what you are using for the definition of equality and how you define the vector operations.
It is never necessary to think of vectors as "n-tuples", as proofs about them don't require that. But even if you do think about them that way, which many people like to do (as do I, often), you still never need to supply numerical values to any of those n-tuples to understand what they are or to prove things about them. The numbers are just a crutch, you only need the arithmetic relationships themselves. The concept of "addition" transcends the numbers you may happen to be adding, for example, and proofs about addition do not need to supply any numbers.



I think saying that a law of physics (as an equation) can be written without reference to a coordinate system actually makes complicated claims about referring things to coordinate systems. I think it claims at least the following:
1. We have defined a unique way to translate each symbol in the law into some mathematical object in each coordinate system
2. There is a mathematical definition for how to judge whether two objects expressed in different coordinate systems are equal (i.e. "the same")
3. The left hand and right hand side of the physical law (when completely worked out) also can be associated with some type of mathematical object in each coordinate system. We have also defined what it means for representations of this object in two different coordinate systems to represent the same object
4. Suppose we translate the physical law to a coordinate system and change coordinates of all the symbols involved in the law (including the final result of the left and right hand sides) so they are expressed in a second coordinate system. Then by translating from the second coordinate system to the abstract symbols of the physical law we can recover the same value for each individual symbol in the physical law as we originally had.
But none of those are "claims", they are simply the way mathematics works in physics. Were any of them not true, physics would not be possible, and so far they have all proven to be true, so we say that physics is possible, and hope they remain true.


One distinction that needs to be made in this discussion is the distinction between a general change of coordinates and the particular changes of coordinates that define a change of "reference frame". I have yet to find a clear statement on the web about what a "reference frame" is. In particular, is the idea of a "reference frame" always associated with mechanics (mass, force etc.) or is it a more general idea?
I agree that the concept of "reference frame" is quite ambiguous, and all kinds of misconceptions arise as a result. For example, there is a convention for choosing global coordinates that is used in special relativity, and is associated with the Lorentz transformation viewed as a global coordinate change. This is, however, just a convention; it doesn't work in general relativity, and it leads one to all kinds of clearly unphysical conclusions if one takes the global coordinatization too literally. Here's how I prefer to think about it:
1) A reference frame is defined by the motion of the observer, so contains no more information than that motion. As that motion is a local property of the observer, a reference frame is an entirely local concept.
2) Global coordinatizations are completely arbitrary in principle, but in practice we'd like to use space and time in our coordinates in such a way that they take on their local physical meanings for some observers whom we'd like to know how to specify. As such, we can restrict what we mean by a "global coordinatization" to mean the instructions for transforming into the local reference frames of an array of observers, as a function of the way we are labeling those observers. This array must be rich enough to have a local observer at every event or process that is relevant to our interests in adopting that global coordinatization.
3) The laws of dynamics are expressed in covariant form for each of those observers, so they are still just one set of laws, not a set for each observer. It is an entirely separate step in which we take the dynamics that each observer experiences locally, and translate them, using the array of transformations from part (2), into a global description of the "objective reality" under consideration.

DrRocket
2009-Aug-17, 08:36 PM
I have less trouble with the word "invariance" than the word "covariance". There are technical definitions for "covariant" and "contra-variant" tensors.

There is some trouble with the terminology. I agree with you.

Contravariant and covariant with respect to tensors and forms is simply terminology to determine the difference between acting on the tangent bundle or co-tangent bundle of a manifold. That one is reasonably clear, (except that I tend to forget which is which, but for the record covariant vectors come from the dual space, which is the co-tangent space).

Invariance is also clear. A thing is invariant if it does not change under whatever transformation you are considering at the time. So, in Minkowski space the inner product is invariant under Lorentz transformations.

"Covariant" and "generally covariant" as applied to an entire theory seems to be an invention of Einstein. It is not a normal mathematical term.

Einstein was at something of a disadvantage in that when he formulated general relativity, the mathematical apparatus necessary to properly describe the theory was either unknown or just being discovered. That makes his achievement all the more impressive, but it also makes it a bit difficult for someone in the present era to fully understand the context in which he was speaking.

It appears that by "generally covariant" what Einstein meant was a theory that is independent of any particular choice of coordinates. In more modern terminology we find that general relativity can be formulated in the language of differentiable manifolds and differential forms without recourse to any coordinate system at all. The "invariants" are then those things that one can describe in such terms and that they are "generally covariant" becomes trivial since the formulation does not involve coordinates at all. That was not so obvious at the time that Einstein was doing his formulation of GR.

tashirosgt
2009-Aug-18, 06:02 PM
It looks like there is no precise mathematical definition for "general covariance", so perhaps Kretschmann's claim cannot be proven or disproven mathematically. (I am not sure how representative the material on the web is of current scientific opinion, but the scholarly material that I have read there says that Einstein eventually conceded Kretschmann's point, whatever it was.)

Another critical term which needs a precise definition is "coordinate free". For example, as DrRocket mentions, one can write statements about differentiable manifolds in symbolic terms and say things that do not refer to a coordinate system. However, I have never seen a mathematical definition of "manifold" that avoided mentioning "charts" and "atlases", which are concepts that involve coordinate systems. So there are "coordinate free" approaches to mathematical objects whose definitions depend on coordinates.

There is another meaning of "coordinate free". There are definitions of mathematical objects that never mention coordinates. An example of this is the modern set of axioms for things like a "vector space" and "inner product space". (This is usually not the presentation of vectors given by elementary textbooks. Such books define vectors in terms of directed line segments or cartesian coordinates.) The modern set of axioms never mentions line segments or coordinates. A consequence of this is that when you want to make up an example of a vector space in terms of a coordinate system, you have many possible choices. There is no unique way of defining the length of a vector, or the angle between two vectors, in terms of coordinates. When you introduce these definitions (in order to cover the material presented in the elementary texts) you are adding on assumptions and definitions in order to make one specific example of the general idea of a vector space.
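A short sketch of that non-uniqueness, assuming the vector space is pairs of real numbers. Both functions below satisfy the inner product axioms (symmetric, bilinear, positive definite); the weighted one and its coefficients are an arbitrary illustrative choice, and all names are hypothetical. The same two coordinate pairs get different lengths and a different angle depending on which inner product is adopted.

```python
import math

def dot_standard(u, v):
    """The inner product used in elementary texts: u0*v0 + u1*v1."""
    return u[0] * v[0] + u[1] * v[1]

def dot_weighted(u, v):
    """Another valid inner product, given by the symmetric positive definite
    matrix [[2, 1], [1, 3]] (an arbitrary illustrative choice)."""
    return 2 * u[0] * v[0] + u[0] * v[1] + u[1] * v[0] + 3 * u[1] * v[1]

def length(u, inner):
    """Length of u as defined by the chosen inner product."""
    return math.sqrt(inner(u, u))

def angle(u, v, inner):
    """Angle between u and v as defined by the chosen inner product."""
    return math.acos(inner(u, v) / (length(u, inner) * length(v, inner)))

u = (1.0, 0.0)
v = (0.0, 1.0)

for name, inner in (("standard", dot_standard), ("weighted", dot_weighted)):
    print(name, length(u, inner), length(v, inner), angle(u, v, inner))
```

Under the standard choice the two pairs are unit vectors at a right angle; under the weighted choice they have lengths √2 and √3 and are no longer perpendicular. Nothing in the vector space axioms alone picks one of these over the other.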

--------------

KenG,

At least we agree that phrasing a law of physics in a "coordinate free" manner does have implications when we use coordinates. You are defending the Platonic view of mathematics. You suggest that defining the length of a vector in cartesian coordinates follows from the mathematical definition or assumed properties of a vector. This is not true in modern axiomatics, as I mentioned above.

The Platonic view is defensible as a philosophical or sociological outlook. You can say "My idea of a vector is something so simple and clear that if you propose a coordinate system, I'll have no trouble telling you how to associate a coordinate with a vector and telling you the formula for its length". This is a claim about a human capability (and it may be true!). However, it is not a mathematical principle that can be used to write a mathematical proof.

-------------
DrRocket,

I think your remarks confirm my emphasis on invariants as the fundamental things to use in defining what we mean by a law of physics being coordinate free. As I understand the modern approach, we begin with a "group of transformations" that represent changes of representation between a set of coordinate systems. So we are not claiming to deal with "all possible coordinate systems" or even "all possible changes of representation". Within this limited context, there will be certain functions of the coordinates that are invariant under a change of coordinates. These (or functions of them) are the candidates for defining the objects in a physical theory, i.e. the symbols that we can use in a "coordinate free" formula.

---------------------------

My intuition is that knowing the set of invariants of a group of coordinate transformations does not imply a specific coordinate free theory of physics. I see nothing that prevents us from finding a set of (empirically) wrong but consistent formulae using the invariants as symbols. On the other hand, I still find Kretschmann's claim surprising. I conjecture it could be a claim made in the context where one has freedom of choice about which coordinate systems to use and which groups of transformations to use. Perhaps it is saying "Give me a set of functions defined in terms of some coordinate system. I'll show you how to define other coordinate systems and a group of transformations among these coordinate systems that leaves those functions invariant".

The changes of coordinates that Einstein and Kretschmann were talking about were probably restricted to those that were a change of "reference frame". I still don't grasp a precise definition for that phrase.

hhEb09'1
2009-Aug-18, 06:44 PM
But I don't understand how any physical theory can be made invariant and still be the same physical theory. Can someone explain that?
It's an exercise in Misner, Thorne, and Wheeler :)

Basically (http://www.bautforum.com/against-mainstream/2274-fatal-flaw-geocentrism-7.html#post38800), you have to cast Newtonian physics in covariant form. There's a form of the Newtonian physical laws such that they are valid in any reference frame, and they reduce to the usual Newton's laws in ordinary frames. But the laws are extremely ugly.
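A much reduced illustration of the idea, and not the construction from Misner, Thorne, and Wheeler: it assumes only time-independent changes of the spatial coordinates in Euclidean space, with q^i the new coordinates, g_{ij} the Euclidean metric written in them, and F^i the force components in those coordinates. Newton's second law then takes a form that is the same in every such coordinate system, at the price of carrying the connection coefficients along:

```latex
% Cartesian form
m\,\ddot{x}^{i} = F^{i}

% Form valid in any time-independent spatial coordinates q^i
m\left(\ddot{q}^{i} + \Gamma^{i}{}_{jk}\,\dot{q}^{j}\dot{q}^{k}\right) = F^{i},
\qquad
\Gamma^{i}{}_{jk} = \tfrac{1}{2}\, g^{il}\left(\partial_{j} g_{lk} + \partial_{k} g_{lj} - \partial_{l} g_{jk}\right)

% Example: plane polar coordinates, g = \mathrm{diag}(1, r^{2}),
% with \Gamma^{r}{}_{\theta\theta} = -r and \Gamma^{\theta}{}_{r\theta} = 1/r
m\left(\ddot{r} - r\,\dot{\theta}^{2}\right) = F^{r},
\qquad
m\left(\ddot{\theta} + \tfrac{2}{r}\,\dot{r}\,\dot{\theta}\right) = F^{\theta}
```

In Cartesian coordinates the Γ terms vanish and the usual law comes back. Handling accelerated or rotating frames as well needs more machinery than this sketch contains.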

So I think the problem here is, when people say a theory is "generally covariant", it can be unclear if they mean covariant with regard to some group of transformations, or covariant with regard to the Lorentz group of transformations (I won't get into general relativity, it's hard enough to make correct statements about special relativity). I'm going to disagree with that. General covariance means that a physical theory is covariant with any general transformation. It applies only to general relativity (in the list of Newton, SR, and GR).

One of the reasons that it is an issue is that Einstein used it ("general covariance") as one of his three main criteria in evaluating physical theories, when he was developing general relativity (the other two were the equivalence principle, and Mach's principle). But if it is true that any physical theory (say, Newton's) can be cast into an equivalent generally covariant form, then it lessens the force of the criterion. That is, the criterion would be physically vacuous--it would not be able to discriminate between the theories.

DrRocket is right (as usual) that there is a historical perspective at work here.

Ken G
2009-Aug-18, 06:59 PM
At least we agree that phrasing a law of physics in a "coordinate free" manner does have implications when we use coordinates. You are defending the Platonic view of mathematics. You suggest that defining the length of a vector in cartesian coordinates follows from the mathematical definition or assumed properties of a vector. This is not true in modern axiomatics, as I mentioned above. I did not say that was true, I said that we can use vectors in physics because we know how to connect them with measurements, and measurements are reference-frame dependent, which is associated with the concept of coordinate dependence. Nevertheless, it is quite important for objective science that the proofs about vectors never use anything but the axioms, as all proofs depend only on the axioms. Therefore, if the axioms can be expressed in a coordinate-free way, then all the proofs can be done without any references to coordinates, or reference frames, or individual observers, which was the point of using them in the first place.

The entire body of mathematical knowledge about vectors can be expressed in the form of all the coordinate-free definitions, and all the coordinate-free proofs, so that mathematical body is entirely coordinate free. There are no "assumptions" here, nor Platonic philosophy, it's just the definition of mathematics. Now, it is certainly true that in practice, we often find it convenient to work in a coordinate system, but this relates to how vectors are useful to us, not to what we know about vectors. The latter is entirely coordinate free, it is the former that needs coordinates and reference frames-- the interface between the mathematics and the physics is where all that comes in. That's how physics stays objective, by borrowing from the objective character of mathematical proofs. Nevertheless, the physics itself is not provable, it merely has to be objective and accurate.


The Platonic view is defensible as a philosophical or sociological outlook.

Again, none of that has anything to do with the definition of mathematics, under whose aegis the vector concept resides.


You can say "My idea of a vector is something so simple and clear that if you propose a coordinate system, I'll have no trouble telling you how to associate a coordinate with a vector and telling you the formula for its length". This is a claim about a human capability (and it may be true!). However, it is not a mathematical principle that can be used to write a mathematical proof. Correct, nor did I ever imply otherwise. You are saying one of the reasons vectors are useful in physics. The other is that the proofs are coordinate free, so are objective. Taken together, we have the whole point of using vectors in objective science.

DrRocket
2009-Aug-18, 07:29 PM
Another critical term which needs a precise definition is "coordinate free". For example, as DrRocket mentions, one can write statements about differentiable manifolds in symbolic terms and say things that do not refer to a coordinate system. However, I have never seen a mathematical definition of "manifold" that avoided mentioning "charts" and "atlases", which are concepts that involve coordinate systems. So there are "coordinate free" approaches to mathematical objects whose definitions depend on coordinates.

There is another meaning of "coordinate free". There are definitions of mathematical objects that never mention coordinates. An example of this is the modern set of axioms for things like a "vector space" and "inner product space". (This is usually not the presentation of vectors given by elementary textbooks. Such books define vectors in terms of directed line segments or cartesian coordinates.) The modern set of axioms never mentions line segments or coordinates. A consequence of this is that when you want to make up an example of a vector space in terms of a coordinate system, you have many possible choices. There is no unique way to defining the length of a vector, angle between two vectors in terms of coordinates. When you introduce these definitions (in order to cover the material presented in the elementary texts) you are adding-on assumptions and definitions in order to make one specific example of the general idea of a vector space.



To do manifold theory you need the notion of charts and atlases to define what is meant by a differentiable function on the manifold. However charts are just open sets coupled with a homeomorphism from the open set to some Euclidean space such that the composition of one map with the inverse of another is a smooth function on the overlap of their domains. You do not need explicit coordinates, or equivalently the chart defines a local coordinate system.

Once you have enough charts to define an atlas, you have the ring of smooth functions on the manifold. From that you can define tangent vector fields, differential forms, tensors, ... all algebraically and without explicit mention of any coordinate systems. This is a compact way to develop the theory of differentiable manifolds. You can see this done in Helgason's book Differential Geometry and Symmetric Spaces.

Any good algebra book will develop linear algebra as the abstract study of vector spaces, and not use coordinates. Basis vectors will be discussed, but it is best to not invoke the existence of a basis unless absolutely necessary. This is a great help when you make the transition to infinite-dimensional vector spaces, since in that setting bases are not very important.

For the physicists who gasped at the last sentence, I will expand a bit. In the infinite-dimensional setting you must be careful about what you mean by a basis. The definition from elementary algebra works, but you get what is called a Hamel basis. That is quite different from the notion used for Hilbert spaces which is a "complete orthonormal set", since a Hamel basis is typically uncountable. The idea for a "Hilbert space basis" has no generalizable counterpart in the theory of Banach spaces or of general topological vector spaces. There you find yourself quite naturally working in a coordinate free setting.

Ken G
2009-Aug-18, 07:45 PM
I'm going to disagree with that. General covariance means that a physical theory is covariant with any general transformation. It applies only to general relativity (in the list of Newton, SR, and GR).

You are right, I was mixing two issues that need to be kept separate:
issue #1) General covariance just means that the form of the laws uses mathematical expressions that are coordinate-free, and more importantly, true even before specifying the observer who is using them. It is a general requirement on the mathematical language one is allowed to use in physics, to preserve objectivity. It is still not "physically vacuous", because requiring that such mathematical language be used severely constrains what kinds of physical laws are "allowed", i.e., it is an objectivity requirement. For example, an improper physical law would be "whatever the observer wants to happen is what will happen", because that cannot be expressed in a coordinate-free way, since it depends explicitly on the observer. However, if you presume that the universe must be objective, or at least that physics must be able to work on some component that is objective, then you can say there is no further requirement expressed by general covariance.

issue #2) There is much more to relativity than general covariance, and this was the second feature I was incorrectly mixing together into the "general covariance" label (probably because this is how the term is often used in practice, given the various ambiguities around it that we have discussed). What is also crucial to physics is that we have use of the mathematical concept of an underlying group, which is the group of the transformations between all possible observers (real and hypothetical) that might generate observations of some process. Relativity provides the appropriate representation of this group, which is a technical term meaning that it allows us to bring simple algebraic methods to the problem of transforming between observers. So unless Einstein was wrong, there is an additional constraint on the laws of physics, not covered under "general covariance", but which is a much more specific requirement, so is even more "non-vacuous": the requirement that the appropriate transformations between observers be the Lorentz group (remember I'm not including any gravity here, this is hard enough).

So even in the absence of gravity, we expect algebraic laws that are generally covariant in their form (just to make sure we are dealing with objective laws), but we also expect algebraic laws whose form is specific to involving representations of the Lorentz group (to make sure they automatically incorporate the correct ways to transform between different observers). If you have both of these requirements (which is what many people colloquially mean by "generally covariant", even though hhEb09'1 is quite correct it is improper usage), then when testing that law, you know you can focus entirely on its local comoving physics, i.e., you can test it using an observer and apparatus in the same location and moving with the physical process being analyzed. If it works for that observer, you automatically have the complete law that can be used by any observer who is analyzing that physical process, using their own measuring equipment. And that is how the laws of physics are supposed to work, and do seem to work: they should be the same laws for all observers, as though they were "built just for you", as I said above. No other situation could recover a concept of physical objective truth.

Of course, the waters get muddied quite a bit when you throw in gravity, where nonlocal transformations get much more complicated, and quantum entanglement, where even the very nature of a nonlocal coupling becomes more bizarre. But let's stick to special relativity of classical systems for this discussion!

And thankfully, DrRocket and I are in complete agreement on this thread, which might amaze some of you by now!

DrRocket
2009-Aug-18, 09:14 PM
And thankfully, DrRocket and I are in complete agreement on this thread, which might amaze some of you by now!

That happens whenever you are right.

tashirosgt
2009-Aug-19, 02:55 AM
A psychological observation: Imagine the rare occasion when a physicist is going to lecture on his physical theory to an audience who will have no trouble following him. He considers two methods of presentation.

1) He can announce this theory using abstract symbols and not talk about coordinates. Then he can examine various coordinate systems that physicists like to use and show that the coordinate interpretation of each symbol that he used is invariant under transformations of coordinates. (Applause. Audience is amazed.)

2) He can lecture about the coordinate systems that physicists like to use and study the transformations among them and use things like the theory of group representations to find functions that are invariant under these transformations. Then he can announce his physical theory using symbols that are defined in terms of these invariant functions. (Less applause. Coordinate free nature of the theory is obvious. Audience is amazed at his tolerance for algebraic computations.)

I think method 1) is psychologically more impressive. Unless a person suspects that there is an as-yet-undiscovered important coordinate system, I don't think it is really a stronger argument in favor of his physical theory. Does it make sense to talk about important undiscovered coordinate systems or is it obvious that there aren't any?

----------------


General covariance means that a physical theory is covariant with any general transformation.

Then I think that a "general transformation" must have some special meaning to physicists so that it isn't really a "general transformation" in the sense of "any possible transformation to any possible type of coordinate system". Also the issue of defining a "reference frame" and an "inertial reference frame" further limits the scope of the transformations we care about. A basic question and limitation on coordinate systems is "What aspects of a physical situation must be given coordinates?". From the imprecise definitions of "inertial frame" that I have found on the web, I gather that transformations from one "inertial reference frame" to another have special interest.

DrRocket
2009-Aug-19, 03:21 AM
A psychological observation: Imagine the rare occasion when a physicist is going to lecture on his physical theory to an audience who will have no trouble following him. He considers two methods of presentation.



He is talking to himself.

When Feynman gave his first talk and used what are now called Feynman diagrams, the audience had several "monster minds" in it. As I recall no one understood what he was talking about. It took time and exposure before his approach was accepted.

hhEb09'1
2009-Aug-19, 03:46 AM
issue #1) General covariance just means that the form of the laws uses mathematical expressions that are coordinate-free, and more importantly, true even before specifying the observer who is using them. It is a general requirement on the mathematical language one is allowed to use in physics, to preserve objectivity. It is still not "physically vacuous", because requiring that such mathematical language be used severely constrains what kinds of physical laws are "allowed", i.e., it is an objectivity requirement. For example, an improper physical law would be "whatever the observer wants to happen is what will happen", because that cannot be expressed in a coordinate-free way, since it depends explicitly on the observer. However, if you presume that the universe must be objective, or at least that physics must be able to work on some component that is objective, then you can say there is no further requirement expressed by general covariance.

Well, that's all that author was saying, I'm sure. :)

The point is that general covariance does not discriminate between general relativity and Newton's theory--contrary to what Einstein first thought.
But let's stick to special relativity of classical systems for this discussion!
I think the OP opened the door farther than that. :)



Then I think that a "general transformation" must have some special meaning to physicists so that it isn't really a "general transformation" in the sense of "any possible transformation to any possible type of coordinate system".

No, no special meaning really.

tashirosgt
2009-Aug-19, 06:18 AM
Is there a set of invariants that are invariant for all possible transformations? There are "large" groups such as the group of all linear transformations of an n-dimensional real space and I suppose these have invariants. But it would be surprising to me if these invariants continue to "invary" under non-linear transformations. Plus the paper in the original post is referring to diffeomorphisms, which restricts the type of transformations that are used. So I don't understand how "general covariance" can claim to apply to all possible transformations of coordinates if it depends on having a set of invariants.

Ken G
2009-Aug-19, 06:43 AM
Does it make sense to talk about important undiscovered coordinate systems or is it obvious that there aren't any?
Nothing is obvious. If Einstein was right, and the observations are firmly in his camp so far, there aren't any, but no one can say what that really means. All our knowledge is conditional on our experience, that is inescapable.

DrRocket
2009-Aug-19, 09:35 PM
Is there a set of invariants that are invariant for all possible transformations? There are "large" groups such as the group of all linear transformations of an n-dimensional real space and I suppose these have invariants. But it would be surprising to me if these invariants continue to "invary" under non-linear transformations. Plus the paper in the original post is referring to diffeomorphisms, which restricts the type of transformations that are used. So I don't understand how "general covariance" can claim to apply to all possible transformations of coordinates if it depends on having a set of invariants.

The set of all linear transformations on an n-dimensional real space is not a group. Some of them fail to have inverses. Since the 0 transformation is one of the elements, the only thing invariant under all such transformations is the 0 vector.
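A minimal numerical sketch of that point, assuming NumPy; the dimension and the sample vector are arbitrary choices made for illustration, not anything from the thread.

```python
import numpy as np

n = 3
zero_map = np.zeros((n, n))        # the 0 transformation on R^n
v = np.array([1.0, -2.0, 0.5])     # an arbitrary nonzero vector (illustrative)

# The zero map has determinant 0, so it has no inverse; the set of ALL
# linear maps therefore cannot be a group under composition.
det_zero = np.linalg.det(zero_map)

# Anything invariant under every linear map must in particular be fixed
# by the zero map, which sends every vector to the zero vector.
v_is_invariant = np.array_equal(zero_map @ v, v)
origin_is_invariant = np.array_equal(zero_map @ np.zeros(n), np.zeros(n))

print(det_zero, v_is_invariant, origin_is_invariant)
```

Restricting to the maps with nonzero determinant removes the problem; those do form a group, the general linear group.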

"General covariance" is not a mathematical term. As I said earlier it seems to be an invention of Einstein, and the import is that physics should be formulated in a coordinate free way. That is somewhat different from being invariant under some specified transformation group.

It is necessary to be a bit careful with "invariant" and "covariant" when you see those terms being used by physicists. They appear to use those terms a bit more loosely than what you would see from a mathematician. For instance the term "invariant" is sometimes used without any clear indication of the transformation group that one has in mind. In special relativity the term can be safely assumed to mean "invariant under the action of the Poincare group". In general relativity it is not so clear, but "general covariance" seems to refer to quantities that are definable without recourse to any coordinate system, which is the physicist's version of "tensor".

You need to be very careful when trying to apply ordinary mathematical terminology to work being explained by physicists. For instance in special relativity you will see statements made with regard to something or other being a "4-vector". To a mathematician it is perfectly obvious when something is or is not a 4-vector, you just look at the number of components needed to describe it, and if the number is 4, end of story. But a physicist means that the object described in physical terms when transformed using a Lorentz transformation is the same physical object that would be seen by an observer in an inertial reference frame moving at the speed associated with the transformation.

DrRocket
2009-Aug-19, 09:52 PM
One distinction that needs to be made in this discussion is the distinction between a general change of coordinates and the particular changes of coordinates that define a change of "reference frame". I have yet to find a clear statement on the web about what a "reference frame" is. In particular, is the idea of a "reference frame" always associated with mechanics (mass, force, etc.) or is it a more general idea?

You are having trouble understanding the language used by physicists as opposed to the language used by mathematicians. That is understandable.

They are NOT the same language. Sometimes the same word may mean different things in the two languages.

Reference frames are appropriate to special relativity. In SR you are working in Minkowski 4-space. You start by picking an orthonormal basis with respect to a quadratic form with signature (+,-,-,-) or (-,+,+,+). That defines "time" and "space" for one arbitrary observer. Next determine the transformations that preserve that quadratic form and that also preserve the direction of time (orthochronous transformations). It turns out that this consists of translations plus a class of linear transformations. The linear transformations are called the Lorentz group and consist of spatial rotations, plus transformations that result in the usual length contraction and time dilation discussed in elementary treatments of SR. Those are parameterized by gamma or equivalently speed.
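As a rough illustration of "transformations that preserve that quadratic form", here is a small NumPy sketch. It assumes units with c = 1 and the signature (+,-,-,-); the helper names `boost_x` and `interval` and the sample event are made up for this example.

```python
import numpy as np

# Minkowski quadratic form with signature (+,-,-,-), units with c = 1.
ETA = np.diag([1.0, -1.0, -1.0, -1.0])

def boost_x(beta):
    """Lorentz boost along x for speed beta (|beta| < 1), as a 4x4 matrix."""
    gamma = 1.0 / np.sqrt(1.0 - beta**2)
    L = np.eye(4)
    L[0, 0] = L[1, 1] = gamma
    L[0, 1] = L[1, 0] = -gamma * beta
    return L

def interval(x):
    """Value of the quadratic form t^2 - x^2 - y^2 - z^2 on a 4-vector."""
    return x @ ETA @ x

L = boost_x(0.6)                         # gamma = 1.25 for beta = 0.6
event = np.array([2.0, 1.0, 0.5, -3.0])  # an arbitrary (t, x, y, z)

# A linear map preserves the form exactly when L^T ETA L = ETA.
preserves_form = np.allclose(L.T @ ETA @ L, ETA)
same_interval = np.isclose(interval(L @ event), interval(event))

print(preserves_form, same_interval)
```

The boost is parameterized by the speed alone, which matches the remark that these transformations are parameterized by gamma or, equivalently, speed.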

So, "reference frame" refers to the result of transformations of the original basis vectors under the Lorentz transformations. That is how a mathematician would look at it. But a physicist takes the viewpoint that a second observer has associated with him a gamma, and that determines his perspective and his reference frame and the Lorentz transformation is a result of that perspective. The two approaches are equivalent, but the language of the mathematician and the physicist get in the way of communication.

I doubt that you will find this on the web. The only way I know to see this is to try to understand both perspectives, get yourself confused and then figure out what is going on.

It helps to recognize that physicists are very sloppy in their use of mathematics and mathematical terminology. Don't go astray by expecting rigor where there is none.

tashirosgt
2009-Aug-20, 04:51 AM
Thank you, DrRocket. You have clarified matters. I find that account of "general covariance" and "general transformations" believable since it has a more restricted context than a mathematician's first impression of those terms.

Ken G
2009-Aug-20, 07:43 AM
A couple technical points, since this is actually a discussion in physics. A reference frame is defined by an observer in physics, not by transformations or basis vectors. You have the "reference frame of the observer", and the term really doesn't mean anything else. In other words, the concept could predate mathematics altogether, though it wouldn't have much quantitative usefulness. It is also true that a reference frame is not a mathematically rigorous concept, but that's because it is a physics concept. It is also true that without the mathematics, the concept of the reference frame of an observer would be more like creative writing than like quantitatively predictive physics, so physics does need to borrow all those mathematical notions that have been talked about to make it work. The rigor lives in the math, that's why physics borrows it. That physics is not rigorous is the nature of the animal, it's a feature not a bug-- the real world is messy, and trying to figure it out with the brains of highly advanced apes is messier still. Rigor is good whenever it can be obtained, and perhaps physicists settle for less of it than they could possibly get, but they still know they'll never have complete rigor in physics because it wouldn't meet the needs of real-life physics.

To keep this distinction precise, it is necessary to recognize that the Lorentz group is not the linear transformations that preserve the quadratic form. That's the "general linear group" that shows up in the representation of the Lorentz group, which allows us to use algebra on the abstract concept of the Lorentz group. The distinction is not of any computational importance for making predictions, but does matter when thinking about what are the differences between the ontological entities in physics and in math. The actual physics group that is the Lorentz group is the action of mapping between the perceptions of different observers, which amount to some abstract (and probably unknowable) relationship between observers, and the group operation is the composition of these mappings. The identity operation appears because when you apply the map from observer A to observer B, and then apply the map from observer B to observer A, you better end up back where you started, or physics is in big trouble as an objective science.

hhEb09'1
2009-Aug-20, 10:45 AM
They are NOT the same language. Sometimes the same word may mean different things in the two languages.

I dunno if I'd put it that way.

It helps to recognize that physicists are very sloppy in their use of mathematics and mathematical terminology.

OK, I might put it that way. :)

But seriously you must have some words, some specific examples in mind. I'd like to see what you mean.

DrRocket
2009-Aug-20, 01:26 PM
To keep this distinction precise, it is necessary to recognize that the Lorentz group is not the linear transformations that preserve the quadratic form. That's the "general linear group" that shows up in the representation of the Lorentz group, which allows us to use algebra on the abstract concept of the Lorentz group.

The general linear group in dimension n, GL(n,R), is the group of all invertible linear transformations. It is not the Lorentz group. It is quite a bit larger. It consists of all linear transformations with non-zero determinant. The Lorentz group is quite a bit smaller. All Lorentz transformations have determinant +/- 1.

The Lorentz group is the subgroup of GL(4,R) that preserves the Minkowski inner product. One can show that preservation of the Minkowski inner product is equivalent to preservation of the associated quadratic form. In short the Lorentz group is precisely the group of linear transformations that preserve the quadratic form.
http://en.wikipedia.org/wiki/Lorentz_group
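Two of the claims above can be checked numerically. This is only a sketch assuming NumPy, with c = 1 and signature (+,-,-,-); the function names `Q`, `inner` and `inner_from_Q`, the random vectors, and the particular matrices are illustrative choices. The first check uses the polarization identity, which recovers the inner product from the quadratic form and is the reason preserving one is equivalent to preserving the other. The second shows determinants of +1 and -1.

```python
import numpy as np

ETA = np.diag([1.0, -1.0, -1.0, -1.0])   # Minkowski form, signature (+,-,-,-)

def Q(x):
    """Quadratic form."""
    return x @ ETA @ x

def inner(x, y):
    """Minkowski inner product."""
    return x @ ETA @ y

def inner_from_Q(x, y):
    """Polarization identity: the inner product rebuilt from Q alone."""
    return 0.5 * (Q(x + y) - Q(x) - Q(y))

rng = np.random.default_rng(0)
x, y = rng.normal(size=4), rng.normal(size=4)
polarization_ok = np.isclose(inner(x, y), inner_from_Q(x, y))

# A boost along x with beta = 0.6 (gamma = 1.25) and a spatial inversion.
gamma, beta = 1.25, 0.6
boost = np.eye(4)
boost[0, 0] = boost[1, 1] = gamma
boost[0, 1] = boost[1, 0] = -gamma * beta
parity = np.diag([1.0, -1.0, -1.0, -1.0])

det_boost = np.linalg.det(boost)                  # determinant +1
det_parity_boost = np.linalg.det(parity @ boost)  # determinant -1

print(polarization_ok, det_boost, det_parity_boost)
```

Taking determinants of both sides of L^T ETA L = ETA gives det(L)^2 = 1, which is where the +/- 1 comes from.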

An analogy is the orthogonal group O(4) that preserves the usual positive definite inner product on 4-space.

The general group of isometries on Minkowski space includes not only the linear transformations but also translations. This full group is called the Poincare group, the Lorentz group being the isotropy subgroup; i.e., the isometries that preserve the origin.

The group that determines special relativity is actually a bit more specialized. For that one specializes to the subgroup of the full Lorentz group that preserves the orientation of time-like vectors, the subgroup of orthochronous Lorentz transformations. One also generally limits attention to the "proper" orthochronous transformations, those with determinant +1.
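The two restrictions can be read off a matrix directly: "proper" from the sign of the determinant, "orthochronous" from the sign of the time-time entry. The sketch below assumes NumPy and a (t, x, y, z) ordering of components; the function `classify` and the sample matrices are illustrative only, and the input is assumed to already be a Lorentz transformation.

```python
import numpy as np

def classify(L):
    """Return (proper, orthochronous) for a Lorentz matrix L.

    proper:        determinant is +1 rather than -1
    orthochronous: L[0, 0] > 0, i.e. the direction of time is preserved
    Assumes L already satisfies L^T ETA L = ETA; this is not checked here.
    """
    proper = bool(np.linalg.det(L) > 0)
    orthochronous = bool(L[0, 0] > 0)
    return proper, orthochronous

identity = np.eye(4)
parity = np.diag([1.0, -1.0, -1.0, -1.0])        # spatial inversion
time_reversal = np.diag([-1.0, 1.0, 1.0, 1.0])   # flips the time axis

for name, L in [("identity", identity),
                ("parity", parity),
                ("time reversal", time_reversal),
                ("parity * time reversal", parity @ time_reversal)]:
    print(name, classify(L))
```

The four combinations of the two flags correspond to the four pieces of the full Lorentz group; only the piece containing the identity is both proper and orthochronous.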

There is a nice treatment of the Lorentz group in The Geometry of Minkowski Spacetime by Gregory L. Naber.

Ken G
2009-Aug-20, 09:01 PM
The general linear group on dimension n, GL(n,R) is the group of all invertible linear transformations. It is not the Lorentz group.

That's why I was not talking about that general linear group, I was referring to the group you were talking about, the Lorentz transformations. That group is the general linear group of the Lorentz group. Certainly, the distinction is often glossed over, especially by mathematicians, because the Lorentz group and its general linear group "work the same way", indeed that is the purpose of representation theory. The sole context where the distinction actually does matter is just the context of your earlier post-- understanding the difference between physics and mathematics, especially in regard to the concept of a reference frame.


The Lorentz group is the subgroup of GL(4,R) that preserves the Minkowski inner product. One could generally get away with that statement, but again, not in the present context. In the present context, we should say that the subgroup of GL(4,R) that preserves the Minkowski inner product is the general linear group of the representation of the Lorentz group. Then we are being clear about what is the physics that could exist independently of the discovery of representation theory, and what is the mathematics that physics takes advantage of because of the discovery of representation theory. To the layman, I'm saying that the physical concept of the Lorentz group is an abstract animal that describes the physical mappings between different inertial observers in our universe, but it is made useful to physics by borrowing the mathematical notion of representation theory and its general linear group. Without the latter, we wouldn't have Lorentz transformations, so the whole concept of the Lorentz group would lack application or usefulness in real problems involving quantitative measurements.


In short the Lorentz group is precisely the group of linear transformations that preserve the quadratic form.
http://en.wikipedia.org/wiki/Lorentz_group

Again, that is the general linear group corresponding to the representation of the Lorentz group. As the two function exactly the same way, mathematicians (like that Wiki link, and yourself) would have no reason to distinguish them-- all animals that do the same things in mathematics are the same animals in mathematics. But when we ask a physics question like "what is a reference frame, and what is the difference between the Lorentz group in physics and the way that group functions in mathematics", only then does this distinction matter. Succinctly, to have a concept of a Lorentz group in physics requires no mathematical language whatsoever except for the definition of a group, but to use that concept in quantitative applications, that requires representation theory and its general linear group.

Put differently, the "Lorentz group" means something different in physics than in mathematics, along the lines of the different languages you were talking about above. In physics, it is a statement about observers who are carrying accelerometers that read zero all the time. We'd have a Lorentz group for the mappings between said observers, even if we had never discovered either relativity or representation theory. It just wouldn't be terribly useful to know we had a Lorentz group there, it would just be a definition whose meaning rests entirely in observations (being physics, after all). To make it quantitatively useful, we need to translate into the mathematics usage, borrowing from representation theory, so the difference between the two ways of talking about the Lorentz group would never matter to any quantitative application, it is an ontological distinction that appears only in the context of your earlier post about what physics is. (We seem to be knee-deep in ontology lately!)


The general group of isometries on Minkowski space includes not only the linear transformations but also translations. This full group is called the Poincare group, the Lorentz group being the isotropy subgroup; i.e., the isometries that preserve the origin.

Yes, that is a technical point that in physics translates into saying that the Lorentz group is really only the group of mappings between observers at the same location in spacetime. Translation symmetries allow us to get a bit sloppy and pretend it applies to mappings between all observers, or equivalently, we deal with global inertial frames in special relativity, even though a reference frame is technically only local, being associated with a single observer. This distinction gets into even more differences between SR and GR, but in the present context of the differences between physics and mathematics language, we needn't muddy those waters, and we can just stick to SR and talk about the Lorentz group when we really should say the Poincare group. We'll let the ghosts of Lorentz and Poincare hash out who should get the credit there, it doesn't create problems in SR applications because translations between origins are ignorable.

The group that determines special relativity is actually a bit more specialized. For that one specializes to the subgroup of the full Lorentz group that preserves the orientation of time-like vectors, the subgroup of orthochronous Lorentz transformations. One also generally limits attention to the "proper" orthochronous transformations, those with determinant +1.
That is true, if one adds further physical constraints on the observers, you get further restrictions to the group. An interesting point for sure, but still a technical detail to the basic distinctions we are talking about. In physics, the "orthochronous" and "proper" restrictions you are referring to must connect to some symmetry constraints on the observers. I'm not sure what they are without some thought, but I'd guess that the "orthochronous" restriction is that an observer moving forward in time is viewed as equivalent to one moving backward in time, so by convention we restrict to those moving forward in time. I'd also guess that the "proper" restriction is that if you map from observer A to B, then back to A, you need a true identity transformation there, not just a transformation to a physically equivalent description. The latter is more general, but the added generality is meaningless, so again by convention we "mod out" these various symmetries to obtain the minimum physically interesting set. It's pure simplifying convention, a la Occam-- we might someday discover there are differences between these seemingly equivalent observers, even in SR, and then we'll need to use the full group again.
There is a nice treatment of the Lorentz group in The Geometry of Minkowski Spacetime by Gregory L. Naber.
Most likely a mathematician...

DrRocket
2009-Aug-20, 09:36 PM
That's why I was not talking about that general linear group, I was referring to the group you were talking about, the Lorentz transformations. That group is the general linear group of the Lorentz group. Certainly, the distinction is often glossed over, especially by mathematicians, because the Lorentz group and its general linear group "work the same way", indeed that is the purpose of representation theory. The sole context where the distinction actually does matter is just the context of your earlier post-- understanding the difference between physics and mathematics, especially in regard to the concept of a reference frame.
One could generally get away with that statement, but again, not in the present context. In the present context, we should say that the subgroup of GL(4,R) that preserves the Minkowski inner product is the general linear group of the representation of the Lorentz group. Then we are being clear about what is the physics that could exist independently of the discovery of representation theory, and what is the mathematics that physics takes advantage of because of the discovery of representation theory. To the layman, I'm saying that the physical concept of the Lorentz group is an abstract animal that describes the physical mappings between different inertial observers in our universe, but it is made useful to physics by borrowing the mathematical notion of representation theory and its general linear group. Without the latter, we wouldn't have Lorentz transformations, so the whole concept of the Lorentz group would lack application or usefulness in real problems involving quantitative measurements.

Again, that is the general linear group corresponding to the representation of the Lorentz group. As the two function exactly the same way, mathematicians (like that Wiki link, and yourself) would have no reason to distinguish them-- all animals that do the same things in mathematics are the same animals in mathematics. But when we ask a physics question like "what is a reference frame, and what is the difference between the Lorentz group in physics and the way that group functions in mathematics", only then does this distinction matter. Succinctly, to have a concept of a Lorentz group in physics requires no mathematical language whatsoever except for the definition of a group, but to use that concept in quantitative applications, that requires representation theory and its general linear group.

Put differently, the "Lorentz group" means something different in physics than in mathematics, along the lines of the different languages you were talking about above. In physics, it is a statement about observers who are carrying accelerometers that read zero all the time. We'd have a Lorentz group for the mappings between said observers, even if we had never discovered either relativity or representation theory. It just wouldn't be terribly useful to know we had a Lorentz group there, it would just be a definition whose meaning rests entirely in observations (being physics, after all). To make it quantitatively useful, we need to translate into the mathematics usage, borrowing from representation theory, so the difference between the two ways of talking about the Lorentz group would never matter to any quantitative application, it is an ontological distinction that appears only in the context of your earlier post about what physics is. (We seem to be knee-deep in ontology lately!)

Yes, that is a technical point that in physics translates into saying that the Lorentz group is really only the group of mappings between observers at the same location in spacetime. Translation symmetries allow us to get a bit sloppy and pretend it applies to mappings between all observers, or equivalently, we deal with global inertial frames in special relativity, even though a reference frame is technically only local, being associated with a single observer. This distinction gets into even more differences between SR and GR, but in the present context of the differences between physics and mathematics language, we needn't muddy those waters, and we can just stick to SR and talk about the Lorentz group when we really should say the Poincare group.
That is true, if one adds further physical constraints on the observers, you get further restrictions to the group. An interesting point for sure, but still a technical detail to the basic distinctions we are talking about. In physics, the "orthochronous" and "proper" restrictions you are referring to must connect to some symmetry constraints on the observers. I'm not sure what they are without some thought, but I'd guess that the "orthochronous" restriction is that an observer moving forward in time is viewed as equivalent to one moving backward in time, so by convention we restrict to those moving forward in time. I'd also guess that the "proper" restriction is that if you map from observer A to B, then back to A, you need a true identity transformation there, not just a transformation to a physically equivalent description. The latter is more general, but the added generality is meaningless, so again by convention we "mod out" these various symmetries to obtain the minimum physically interesting set.

We need to get straight on terminology.

The "general linear group" is by definition all invertible linear transformations on a finite-dimensional vector space. It is called GL(n,R) if the dimension of the space is n and underlying field is the real numbers. It is called GL(n,C) if the vector space is a complex vector space. That is the only context in which the term "general linear group" is used in mathematics.

"Representation theory" in mathematics refers to the study of the irreducible unitary representations of a group on a Hilbert space. That has nothing to do with the subject at hand. In the case of abelian groups the irreducible unitary representations take place on a one-dimensional Hilbert space and the resulting theory is the theory of the Fourier transform. In the case of non-abelian Lie groups the theory has implications for quantum mechanics. I suppose that the representations of the Lorentz group may have some interest to physicists, but I don't know about them.

The normal terminology for all transformations that preserve the Minkowski inner product is the "Poincare group" or the "inhomogeneous Lorentz group". It differs from the usual Lorentz group in that it admits translations. Within the Lorentz group one usually singles out the transformations of determinant 1 that are also orthochronous (preserve the orientation of time-like vectors). Those are called proper Lorentz transformations. Those give you the "physically interesting" set.

There is one other mild subtlety. The choice of an origin for Minkowski space is arbitrary. If you simply "forget" about the origin, but continue to look at vectors as displacements, then the object that you get is called an affine space. It is just like a vector space, except that there is no distinguished origin and you think of vectors as "little arrows" and add with the parallelogram law. The Minkowski space of physics is really an affine space. This is not terribly important unless one is being more rigorous than is usually necessary, but the idea comes in handy from time to time.

I understand what you are trying to say with respect to Lorentz transformations, but the term "general linear group of the Lorentz transformations" is confusing because those words have a very specific meaning in a closely related context. I think you are trying to say that the set of all linear transformations that preserve the Minkowski inner product is too big and contains transformations that are not physically interesting. I agree with that. That is why one singles out the "proper" "orthochronous" elements. Proper here means of determinant 1 rather than -1 and has the physical effect of maintaining the usual orientation of the 3 spatial coordinates (a right-handed frame). "Orthochronous" means that timelike vectors in the future portion of the light cone stay in the future direction.
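
As a concrete check of these definitions, here is a minimal Python sketch. It assumes coordinates ordered (x, y, z, t) with c = 1 and the inner product diag(1, 1, 1, -1), a convention chosen here so that the time-time entry sits in the 4,4 position of the matrix. The helper names (boost_x, preserves_eta, is_proper, is_orthochronous) are made up for the illustration. It tests a boost, a spatial reflection P, a time reversal T, and the product PT against the three conditions.

```python
import math

# Assumed convention (not fixed by the thread): coordinates (x, y, z, t), c = 1,
# inner product eta = diag(1, 1, 1, -1), so the time-time entry is index [3][3].
ETA = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, -1]]


def matmul(a, b):
    """Plain matrix product of two square lists-of-lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def transpose(a):
    return [list(row) for row in zip(*a)]


def det(m):
    """Determinant by Laplace expansion along the first row (fine for 4x4)."""
    if len(m) == 1:
        return m[0][0]
    total = 0.0
    for j, entry in enumerate(m[0]):
        minor = [row[:j] + row[j + 1:] for row in m[1:]]
        total += (-1) ** j * entry * det(minor)
    return total


def diag(*entries):
    n = len(entries)
    return [[entries[i] if i == j else 0 for j in range(n)] for i in range(n)]


def boost_x(v):
    """Hypothetical helper: boost along x with speed v, |v| < 1."""
    g = 1.0 / math.sqrt(1.0 - v * v)
    return [[g, 0, 0, -g * v],
            [0, 1, 0, 0],
            [0, 0, 1, 0],
            [-g * v, 0, 0, g]]


def preserves_eta(L, tol=1e-12):
    """True if L^T eta L = eta, i.e. L preserves the Minkowski inner product."""
    lhs = matmul(matmul(transpose(L), ETA), L)
    return all(abs(lhs[i][j] - ETA[i][j]) <= tol
               for i in range(4) for j in range(4))


def is_proper(L, tol=1e-12):
    """True if the determinant is +1."""
    return abs(det(L) - 1.0) <= tol


def is_orthochronous(L):
    """True if the time-time entry is positive (future stays future)."""
    return L[3][3] > 0


P = diag(-1, 1, 1, 1)    # reflection of one spatial axis
T = diag(1, 1, 1, -1)    # time reversal
PT = matmul(P, T)

for name, L in [("boost", boost_x(0.6)), ("P", P), ("T", T), ("PT", PT)]:
    print(name, preserves_eta(L), is_proper(L), is_orthochronous(L))
```

Under these assumptions all four matrices preserve the inner product, but only the boost passes the other two tests. PT has determinant +1 yet reverses time, which is why the determinant condition alone does not single out the physically interesting set.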

I have no idea what you mean by "representation theory" or "representation of the Lorentz group" in this context unless you are simply talking about the correspondence between a linear transformation and a matrix, once one has selected a basis. In the case of Minkowski space one usually wants to choose a basis of orthonormal vectors (orthonormal here means norm +/- 1). One can do that, and the construction is interesting. It is a bit more difficult than in the Hilbert space case with a positive-definite inner product, but the result is similar. Whenever possible I find it easier to define the Lorentz group without reference to a basis, and only invoke a specific basis when necessary. That makes it easier in most cases to decide what is invariant and to get the geometric flavor without a bunch of calculations involving lots of indices getting in the way. But for the Lorentz group the 4,4 position in the matrix is important and one must refer to it from time to time.

So, I think we are saying similar things, but with wildly divergent terminologies in which similarities in words do not reflect similarities in meanings.

Ken G
2009-Aug-20, 10:38 PM
The "general linear group" is by definition all invertible linear transformations on a finite-dimensional vector space.
Yes, and as such they depend on the vector space. Since we are talking about representation theory, the vector space is the representation space of the Lorentz group. So we can say that Lorentz transformations are the general linear group on the representation space of the Lorentz group, but that's the same as what I said above: they are the general linear group of the representation of the Lorentz group.


"Representation theory" in mathematics refers to the study of the irreducible unitary representations of a group on a Hilbert space. That has nothing to do with the subject at hand. You'll have to take that statement up with Wikipedia ( http://en.wikipedia.org/wiki/Representation_theory), I'm content to use "representation theory" to mean "the theory involving the representation of groups", which has everything to do with the subject at hand. For example, from http://en.wikipedia.org/wiki/Group_representation we have:
"A representation of a group G on a vector space V over a field K is a group homomorphism from G to GL(V), the general linear group on V."
There's a nice general definition of this vastly useful concept that appears all over the place in physics. Why should either mathematicians or physicists want to limit the concept to Hilbert spaces? Banach spaces are fine for doing physics.
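
The quoted definition can be illustrated on a small case. The sketch below rests on assumptions that are not taken from the thread: it takes G to be the additive group of rapidities (real numbers under addition), V to be R^4 with coordinates (x, y, z, t) and c = 1, and a hypothetical map rho that sends each rapidity to the matrix of a boost along x. The homomorphism property then reads rho(a + b) = rho(a) rho(b).

```python
import math


def rho(phi):
    """Hypothetical map: rapidity phi -> 4x4 boost along x on (x, y, z, t)."""
    c, s = math.cosh(phi), math.sinh(phi)
    return [[c, 0.0, 0.0, -s],
            [0.0, 1.0, 0.0, 0.0],
            [0.0, 0.0, 1.0, 0.0],
            [-s, 0.0, 0.0, c]]


def matmul(a, b):
    """Plain matrix product of two square lists-of-lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def close(a, b, tol=1e-12):
    """Entrywise comparison up to a small tolerance."""
    return all(abs(x - y) <= tol
               for row_a, row_b in zip(a, b)
               for x, y in zip(row_a, row_b))


IDENTITY = [[1.0 if i == j else 0.0 for j in range(4)] for i in range(4)]

a, b = 0.3, 0.5
# Group operation in G (addition) goes to the group operation in GL(V) (product).
print(close(matmul(rho(a), rho(b)), rho(a + b)))
# The identity of G (zero rapidity) goes to the identity matrix.
print(close(rho(0.0), IDENTITY))
# Inverses go to inverses, so every rho(phi) is invertible and lies in GL(V).
print(close(matmul(rho(a), rho(-a)), IDENTITY))
```

This only covers the one-parameter family of boosts along a single axis, so it is a representation of that subgroup and not of the whole Lorentz group.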

I suppose that the representations of the Lorentz group may have some interest to physicists, but I don't know about them.
Well, since the Lorentz transformations involve representations of the Lorentz group (physics meaning) on the "4-vector" (Minkowski) representation space, you do know about them.


The normal terminology for all transformations that preserve the Minkowski inner product is the "Poincare group" or the "inhomogeneous Lorentz group".
I've already posited that, it's the normal terminology for mathematics, which is not interested in what physics is. But physics is interested in what physics is, so when having a discussion about how physics uses mathematics, and what the differences are there, one needs to use the physics concepts of these ontological entities like the Lorentz group (which I gave above).
I understand what you are trying to say with respect to Lorentz transformations, but the term "general linear group of the Lorentz transformations" is confusing because those words have a very specific meaning in a closely related context.
Well, I talked about the "general linear group of the representation of the Lorentz group," which is perfectly correct and not confusing at all, if you merely recognize that I am taking the physics meaning of the Lorentz group, which I claimed (in the absence of gravity) was:

The members of the Lorentz group are the mappings between inertial observers, and the group operation is the composition of those mappings.

Note this uses no mathematical concept other than a "group", so does not as yet have any representation or any Lorentz transformations, it is just a physical entity with a little help from a specific math concept that is studiable entirely via observations. However, with a bit more help from mathematics, we get the concept of a group representation, which allows us to do quantitative algebra on this physical concept, and when we also look at observations, we find the appropriate representation space is Minkowski space, and the general linear group is the Lorentz transformations (glossing over when we should really be saying the Poincare group and so forth, as that's hardly the issue here).
I think you are trying to say that the set of all linear transformations that preserve the Minkowski inner product is too big and contains transformations that are not physically interesting. I agree with that.
Yes, but that's all the sidelight about Poincare vs. Lorentz. The real issue is about what physics is and how it uses mathematics, and how the Lorentz group plays into that. It all stems from the issue in physics "what is a reference frame", without that context we would be arguing two halves of the same coin, there's no other distinction.


I have no idea what you mean by "representation theory" or "representation of the Lorentz group" in this context unless you are simply talking about the correspondence between a linear transformation and a matrix, once one has selected a basis.
No, I'm using the definitions from the Wiki link I gave above. Representation theory is the way to make abstract groups useful in physics, and it applies not just to Hilbert spaces, but to when you have a vector space with an inner product. Physics almost always does, including Minkowski space.

So, I think we are saying similar things, but with wildly divergent terminologies in which similarities in words do not reflect similarities in meanings.
It is certainly true that you are thinking about these entities like a mathematician, which is natural because they are concepts that borrow heavily from mathematics. But the context of the thread is the different language of physics and math, and what does a reference frame mean in physics. So I'm explaining these same concepts from that same perspective: what do they mean in physics. It's just as you say, there's a different meaning there, and it can cause a lot of miscommunication, especially around issues like the role of rigor, another important context of this discussion.

DrRocket
2009-Aug-21, 01:55 AM
Yes, and as such they depend on the vector space. Since we are talking about representation theory, the vector space is the representation space of the Lorentz group. So we can say that Lorentz transformations are the general linear group on the representation space of the Lorentz group, but that's the same as what I said above: they are the general linear group of the representation of the Lorentz group.

You'll have to take that statement up with Wikipedia ( http://en.wikipedia.org/wiki/Representation_theory), I'm content to use "representation theory" to mean "the theory involving the representation of groups", which has everything to do with the subject at hand. For example, from http://en.wikipedia.org/wiki/Group_representation we have:
"A representation of a group G on a vector space V over a field K is a group homomorphism from G to GL(V), the general linear group on V."



There's a nice general definition of this vastly useful concept that appears all over the place in physics. Why should either mathematicians or physicists want to limit the concept to Hilbert spaces? Banach spaces are fine for doing physics.
Well, since the Lorentz transformations involve representations of the Lorentz group (physics meaning) on the "4-vector" (Minkowski) representation space, you do know about them.
....., I'm using the definitions from the Wiki link I gave above. Representation theory is the way to make abstract groups useful in physics, and it applies not just to Hilbert spaces, but to when you have a vector space with an inner product. Physics almost always does, including Minkowski space.
It is certainly true that you are thinking about these entities like a mathematician, which is natural because they are concepts that borrow heavily from mathematics. But the context of the thread is the different language of physics and math, and what does a reference frame mean in physics. So I'm explaining these same concepts from that same perspective: what do they mean in physics. It's just as you say, there's a different meaning there, and it can cause a lot of miscommunication, especially around issues like the role of rigor, another important context of this discussion.

For what it is worth that Wiki article is trying to be all things to all people, fairly common for Wiki.

What they say is true. However, if you run into a mathematician who says that he works in "representation theory" that will generally mean someone who is concerned with the analysis that surrounds the irreducible unitary representations of a Lie group on a Hilbert space. This is a natural extension of Fourier analysis. The reason that one looks at Hilbert spaces in this context is that representation theory in this sense is a generalization of Fourier analysis on locally compact abelian groups. There the unitary representations are simply homomorphisms from the group to the unit circle in the complex plane, i.e. complex numbers of magnitude one which are unitary operators on the one-dimensional Hilbert space of the complex numbers themselves. In this setting this seems rather overdone, but it makes sense when you look to generalize Fourier analysis to the non-abelian setting and has deep implications for quantum mechanics. The representations of the Heisenberg group (3x3 matrices with 1's on the diagonal and 0's below the diagonal) are intimately related to the non-commutation of operators which is the Heisenberg uncertainty principle.
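
The non-commutativity of the Heisenberg group can be seen directly with integer matrices. The following is a minimal Python sketch, assuming the 3x3 upper-triangular form described above. The names H, mul, X, Y and Z are chosen here for illustration. It shows that two basic elements fail to commute and that they differ exactly by a central element, which is the group-level counterpart of a non-zero commutator.

```python
def H(a, b, c):
    """Heisenberg group element: 1's on the diagonal, 0's below it."""
    return ((1, a, c),
            (0, 1, b),
            (0, 0, 1))


def mul(m, n):
    """Product of two 3x3 matrices stored as tuples of tuples."""
    return tuple(tuple(sum(m[i][k] * n[k][j] for k in range(3))
                       for j in range(3))
                 for i in range(3))


X = H(1, 0, 0)
Y = H(0, 1, 0)
Z = H(0, 0, 1)   # generates the centre of the group

# X and Y do not commute.
print(mul(X, Y) == mul(Y, X))
# They fail to commute by exactly the central element Z: XY = Z(YX).
print(mul(X, Y) == mul(Z, mul(Y, X)))
# Z commutes with an arbitrary element of the group.
g = H(2, -3, 5)
print(mul(Z, g) == mul(g, Z))
```

Integer entries keep the arithmetic exact, so the comparisons are plain equality tests and need no tolerance.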

You can certainly look at homomorphisms into the group of operators on a Banach space, or for that matter any topological vector space, but only in the context of a Hilbert space does the study of "unitary" operators have meaning, and it is fruitful to study them in that context. Banach spaces are useful in that setting since the continuous operators on Hilbert space are themselves a Banach space, and in fact are a special kind of Banach space, called a C* algebra. The study of C* algebras is a significant area of functional analysis.

I don't understand the sentence that I have bolded. Any topological vector space with an inner product (in this context I mean a positive-definite inner product), that is complete with respect to the metric generated by the inner product is a Hilbert space. And any inner product space over the real or complex numbers can be completed and the result is a Hilbert space. Also, any finite-dimensional real or complex inner product space is automatically a Hilbert space, since it is of necessity complete. So, from the perspective of a physicist if you have an inner product space, you may as well go all the way to the associated Hilbert space, since there is nothing to be lost and everything to be gained.

Ken G
2009-Aug-21, 06:12 AM
The reason that one looks at Hilbert spaces in this context is that representation theory in this sense is a generalization of Fourier analysis on locally compact abelian groups. There the unitary representations are simply homomorphisms from the group to the unit circle in the complex plane, i.e. complex numbers of magnitude one which are unitary operators on the one-dimensional Hilbert space of the complex numbers themselves. In this setting this seems rather overdone, but it makes sense when you look to generalize Fourier analysis to the non-abelian setting and has deep implications for quantum mechanics. The representations of the Heisenberg group (3x3 matrices with 1's on the diagonal and 0's below the diagonal) are intimately related to the non-commutation of operators which is the Heisenberg uncertainty principle.
Don't get me wrong, I do think this stuff is cool. Without the theorems of mathematics, it would be very hard to make sense of the physical world.
So, from the perspective of a physicist if you have an inner product space, you may as well go all the way to the associated Hilbert space, since there is nothing to be lost and everything to be gained.
You're right, inner products are the key quantities for physics, and Hilbert spaces are pretty much the crux of all physical geometry.

grav
2009-Aug-23, 03:48 AM
Is there a set of invariants that are invariant for all possible transformations? There are "large" groups such as the group of all linear transformations of an n-dimensional real space and I suppose these have invariants. But it would be surprising to me if these invariants continue to "invary" under non-linear transformations. Plus the paper in the original post is referring to diffeomorphisms, which restricts the type of transformations that are used. So I don't understand how "general covariance" can claim to apply to all possible transformations of coordinates if it depends on having a set of invariants.
I have been wanting to reply to this for a while now, since I have started a thread using just such invariants to find what must be true regardless of theory, as some of this thread appears to be about. Unfortunately it has been placed in ATM and I have so far not been able to get it out, although it uses only mainstream methods. It seems strange, but because of where it is, that limits my ability to discuss this topic with you guys, regrettably, although much of what you are describing is over my head anyway. But replying with what can only possibly be construed as mainstream: yes, there are invariants that can be used to describe any theory within its domain of applicability. Well, really just one, but many formulas can be derived from it, and that is the times that read upon clocks when they coincide in the same place. All observers must agree upon that regardless of any theory used to describe physics. The formulas that are derived using that one invariant must apply to any theory in physics within the domain of the formulas. One might be surprised how much can be determined that way.