Harald Kirsch

genug Unfug.

$\def\v#1{\mathfrak{#1}} \def\vx{\v{x}} \def\vy{\v{y}} \def\vz{\v{z}} \def\mA#1#2{a_{#1 #2}} \def\ma#1#2{a^{#1}_{#2}} \def\t#1{\tilde #1} \def\tx{\t{x}} \def\ty{\t{y}} \def\d#1{\partial #1} \def\dd#1{\partial_{#1}} \def\pderiv#1{\frac{\partial}{\partial #1}} $


Contravariant, Covariant, Tensor

(I) Contravariance

For some time now I have struggled to understand what covariant and contravariant mean in the context of vectors and tensors. By writing this series of articles, I primarily explain the concepts to myself, but others may like it too.

I started reading Raum, Zeit, Materie by Hermann Weyl, a classic that explains these concepts really well. Well, at least after having read about the topic in other books and on Wikipedia, Weyl's book seems to have helped me overcome the last hurdle to understanding. Much more so anyway than statements like "a tensor is something that transforms like a tensor" in an otherwise nice book.

Let me describe what I learned. In the discussion of a vector space $V$ over a field $K$ the terms covariant and contravariant become relevant in particular when more than one basis comes into play. Something varies (changes) either along with the change from one basis to the other — "co-", or against it — "contra-".

Let the $n$-dimensional vector space $V$ have a basis $$ \v{X} = (\vx_1,\dots,\vx_n) , $$ which in particular means that an arbitrary element $\vz\in V$ has a unique representation with respect to $\v{X}$ as \begin{equation} \vz = x_1\vx_1 +\dots+ x_n\vx_n = \sum_{i=1}^n x_i\vx_i \,, \qquad x_i\in K . \label{eq:zFromX} \end{equation} Here is a point that can easily lead to confusion, since the $n$-tuple $(x_1,\dots,x_n)$ might be called a "vector" in other contexts. But $\vz$, as an element of the vector space $V$, is a vector, while $(x_1,\dots,x_n)$ are just the coordinates of $\vz$ with respect to $\v{X}$.

We should look at the vectors in $V$ as opaque items for which we do not know any inner structure. They are not numbers, nor tuples of numbers, nor anything else we can manipulate directly. All we know about them is that we can add them and multiply them by a scalar from $K$ to get another one of them. The coordinates are merely a handle for the vector, a view, a representation, but with a kink: they only make sense if we know the respective basis. Without reference to the basis, $(x_1,\dots,x_n)$ are not coordinates, but just a meaningless bunch of numbers.

We see how arbitrary the coordinates are as soon as we introduce another basis $$\v{Y} = (\vy_1,\dots,\vy_n),$$ different from $\v{X}$. Now the same $\vz$ has different coordinates $(y_1,\dots,y_n)$ such that \begin{equation} \vz = y_1\vy_1+\dots +y_n\vy_n = \sum_{j=1}^n y_j\vy_j . \label{eq:zFromY} \end{equation} Luckily, the two sets of coordinates are not completely arbitrary but have a relation to each other, dictated by the relation between the two bases, as can be derived as follows.
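To make this concrete for myself, here is a small sketch. It assumes, purely for illustration, that $V$ is the plane and that its vectors can be written as pairs of numbers, which is exactly the inner structure we otherwise pretend not to know. The bases `X` and `Y`, the helper `combine` and all numbers are made-up example values, not part of the derivation.

```python
def combine(coords, basis):
    """Return sum_i coords[i] * basis[i], with vectors given as tuples."""
    dim = len(basis[0])
    return tuple(sum(c * b[k] for c, b in zip(coords, basis)) for k in range(dim))

# Two different bases of the plane (hypothetical example values).
X = ((1, 0), (0, 1))
Y = ((1, 1), (-1, 1))

x_coords = (3, 1)   # coordinates of z with respect to X
y_coords = (2, -1)  # coordinates of the same z with respect to Y

z_from_X = combine(x_coords, X)
z_from_Y = combine(y_coords, Y)

# Different coordinate tuples, one and the same vector.
print(z_from_X, z_from_Y)
```

The tuples `(3, 1)` and `(2, -1)` have nothing obvious in common, yet each of them, together with its basis, describes the same $\vz$.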

In the same way as $\vz$ is a weighted sum of basis vectors, each $\vy_j$ of the second basis is a weighted sum of the basis vectors $\vx_i$: \begin{equation} \vy_j = \mA{1}{j}\vx_1+\dots+\mA{n}{j}\vx_n = \sum_{i=1}^n \mA{i}{j}\vx_i , \qquad \mA{i}{j}\in K, \, \forall j\in\{1,\dots,n\}. \label{eq:switchbase} \end{equation} So for a given $j$, the $\mA{i}{j}$ are the coordinates of $\vy_j$ with respect to basis $\v{X}$. Now we can ask how $\vz$, given in basis $\v{Y}$, looks when we build each $\vy_j$ from the $\vx_i$: \begin{align} \vz &= \sum_{j=1}^n y_j\vy_j \\ &= \sum_{j=1}^n y_j\left( \sum_{i=1}^n \mA{i}{j} \vx_i\right)\\ &= \sum_{i=1}^n \left(\sum_{j=1}^n \mA{i}{j} y_j\right) \vx_i . \end{align} By matching the term in parentheses with the coefficient $x_i$ in equation \eqref{eq:zFromX}, and invoking the uniqueness of coordinates given a basis, we get \begin{equation} x_i = \sum_{j=1}^n \mA{i}{j} y_j. \end{equation} Comparing this to equation \eqref{eq:switchbase}, $$ \vy_j= \sum_{i=1}^n \mA{i}{j}\vx_i , $$ we see that on the one hand the $\mA{i}{j}$ transform the basis vectors from $\v{X}$ to $\v{Y}$, while on the other hand they transform the coordinates in the opposite direction, from $\v{Y}$ coordinates to $\v{X}$ coordinates.
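The derivation can be checked numerically. The following is a minimal sketch, again assuming the plane with vectors written as pairs; the basis `X`, the coefficients `a` and the coordinates `y_coords` are arbitrary example values, and `combine` is just a helper for weighted sums. Note that the code counts indices from 0, so `a[i][j]` stands for $\mA{i+1}{j+1}$.

```python
def combine(coords, basis):
    """Return sum_i coords[i] * basis[i], with vectors given as tuples."""
    dim = len(basis[0])
    return tuple(sum(c * b[k] for c, b in zip(coords, basis)) for k in range(dim))

n = 2
X = ((2, 0), (1, 1))   # first basis (example values)
a = ((1, 2),
     (1, -1))          # coefficients a_ij (example values)

# Basis vectors: y_j = sum_i a_ij x_i, i.e. the a_ij lead from X to Y.
Y = tuple(combine([a[i][j] for i in range(n)], X) for j in range(n))

# Coordinates: x_i = sum_j a_ij y_j, i.e. the same a_ij lead from Y to X.
y_coords = (2, 1)
x_coords = tuple(sum(a[i][j] * y_coords[j] for j in range(n)) for i in range(n))

# Both coordinate tuples must describe the same vector z.
z_from_Y = combine(y_coords, Y)
z_from_X = combine(x_coords, X)
print(Y, x_coords, z_from_X, z_from_Y)
```

In the first sum the index $i$ of $\mA{i}{j}$ is summed over, in the second it is the index $j$. The same numbers are used both times, but they are applied in opposite directions.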

This is how the term contravariant comes about: the coordinates transform contravariantly to the basis vectors. Typically one just says that the coordinates are (or transform) contravariant.
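One consequence, not spelled out in the derivation but following from it, is that taking coordinates in the same direction as the basis, from $\v{X}$ to $\v{Y}$, requires the inverse of the matrix formed by the $\mA{i}{j}$. Here is a small sketch for $n=2$, with made-up values for `a` and `x_coords` and exact arithmetic via Python's `fractions` module. The closed-form inverse used here is specific to $2\times 2$ matrices.

```python
from fractions import Fraction

a = ((1, 2),
     (1, -1))          # coefficients a_ij leading from basis X to basis Y (example values)

# Inverse of the 2x2 matrix a via the closed formula; needs det != 0.
det = a[0][0] * a[1][1] - a[0][1] * a[1][0]
b = ((Fraction(a[1][1], det), Fraction(-a[0][1], det)),
     (Fraction(-a[1][0], det), Fraction(a[0][0], det)))

# Coordinates from X to Y use the inverse: y_j = sum_i b_ji x_i.
x_coords = (4, 1)
y_coords = tuple(sum(b[j][i] * x_coords[i] for i in range(2)) for j in range(2))

# Going back with a itself recovers the X coordinates: x_i = sum_j a_ij y_j.
x_again = tuple(sum(a[i][j] * y_coords[j] for j in range(2)) for i in range(2))
print(y_coords, x_again)
```

So while the basis goes from $\v{X}$ to $\v{Y}$ with the $\mA{i}{j}$, the coordinates go from $\v{X}$ to $\v{Y}$ with the inverse matrix, which is just another way of saying that they vary "contra".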

The next part will show that there are also $n$-tuples that are transformed by the $\mA{i}{j}$ in the same direction as the basis and are hence called covariant.