In this article we’re going to talk about tensors. They’re not too difficult to understand, especially if you know what a functional is, so we’re going to start with the basics and slowly work up to the definition.

Coming up is a short overview of vector spaces and their duals (when you have one you always have the other). We then talk about adding extra structure by way of the set of all homomorphisms between spaces. We need this extra structure to be able to talk about maps between spaces, and where these maps come from is not always obvious. Finally, we focus our attention on the maps which take you from vector spaces (or their duals) to the underlying field of numbers (scalars). These special maps (also called functionals) are the simplest type of tensor. From here it is but a short leap of faith to consider functionals which take you from, not one, but many vector spaces and covector spaces to the underlying field of scalars.

Some people say that tensors are the workhorse of general relativity. Why is that? In my opinion there are two reasons. First, because general relativity, like all physical theories, is a framework used to measure stuff. When you measure something you tend to get a number representing the value of your measurement; this number can then be compared with other numbers to arrive at an opinion about the thing you are measuring. A tensor is a particular kind of mathematical object which always returns a number. Always. But that's not all (because there are many mathematical objects which return numbers). Tensors are also really, really good at converting many types of physical objects (things which can be described by vectors and covectors) into numbers. They also behave very nicely when you change coordinate systems, something which physicists tend to do a lot. Furthermore, when a basis is present, tensors can be neatly described by arrays or matrices of ordinary numbers, making them even easier to use!

Secondly, tensors are sensitive to curvature. They can do this because they carry extra components that keep track of tiny differences between distances as measured on a curved surface and distances measured as if the surface were flat. Ordinary vectors just can't do this.

But what's so special about curvature? Well, for starters, ordinary classical calculus hates curved things. Then, to make matters worse, in the early 20th century Einstein proposed that gravity was a manifestation of curved spacetime (and not a flat force field emitted by an object with mass). This looked like a great idea, but calculus had to be re-invented for curved surfaces, and the tensor was born (with the help of Tullio Levi-Civita and Gregorio Ricci-Curbastro).

Vector Spaces

Suppose you have two vector spaces V and W. A vector v from V can be associated with a vector w from W by way of a map. A map in this sense is a function (i.e. it eats a vector and spits out another vector) and is denoted by f. Right now, maps or functions are not really concretely defined; they certainly do not come with the vector spaces and they certainly do not come with the underlying field, so where do they come from and how are they allowed to exist? Technically, for us to be able to talk about vector spaces and maps we need to impose extra structure on the vector space. A plain vanilla vector space is simply a set of objects called vectors (which satisfy a whole bunch of rules called the vector space axioms), grouped together into a set called V. Then there is a set of other objects called scalars (which satisfy a whole bunch of rules called the field axioms), grouped together into a set called \mathbb{F}. Then there are 4 binary operations called 1) vector space addition, 2) vector space scalar multiplication, 3) field addition, and 4) field multiplication. Thus, we write an \mathbb{F}-vector space as (V,\mathbb{F}), but really, if you want to be precise, it is

(V,+_V,\cdot_V;\mathbb{F},+_{\mathbb{F}},\cdot_{\mathbb{F}})
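If it helps to see this bookkeeping laid out concretely, here is a minimal sketch in Python of a vector space as a bundle of four operations. The names (VectorSpace, vadd, smul, and so on) are illustrative choices of my own, not standard objects:

```python
# A minimal sketch of the tuple (V, +_V, ._V ; F, +_F, ._F).
# The names here (VectorSpace, vadd, smul, fadd, fmul) are
# illustrative, not standard.
from dataclasses import dataclass
from typing import Any, Callable

@dataclass
class VectorSpace:
    vadd: Callable[[Any, Any], Any]  # +_V : V x V -> V
    smul: Callable[[Any, Any], Any]  # ._V : F x V -> V
    fadd: Callable[[Any, Any], Any]  # +_F : F x F -> F
    fmul: Callable[[Any, Any], Any]  # ._F : F x F -> F

# R^2 over R, with the usual componentwise operations:
R2 = VectorSpace(
    vadd=lambda u, v: (u[0] + v[0], u[1] + v[1]),
    smul=lambda a, v: (a * v[0], a * v[1]),
    fadd=lambda a, b: a + b,
    fmul=lambda a, b: a * b,
)

print(R2.vadd((1, 2), (3, 4)))  # (4, 6)
print(R2.smul(2.0, (1, 2)))     # (2.0, 4.0)
```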

If we have two vector spaces V and W we can include some extra structure which we will call the set of all homomorphisms from V to W, denoted \mbox{Hom}(V,W). This set contains objects called maps or functions (which satisfy a condition called linearity, described below); they look like f\,:\,V \longrightarrow W and they map f\,:\,v\mapsto w. Thus, the complete picture of where we are working is now

(V,+_V,\cdot_V;\mathbb{F},+_{\mathbb{F}},\cdot_{\mathbb{F}},\mbox{Hom}(V,W))

Note that \mbox{Hom}(V,W) is itself a vector space with its own addition and scalar multiplication operations called +_H and \cdot_H , defined by

+_H \,:\, \mbox{Hom}(V,W) \times \mbox{Hom}(V,W) \longrightarrow \mbox{Hom}(V,W)

which maps the pair like so: (f,g) \mapsto f +_H g \in \mbox{Hom}(V,W), and similarly for \cdot_H. Since the result of the addition operation is an element of \mbox{Hom}(V,W), it must itself be a map from V to W, hence

f +_H g \,:\, V \longrightarrow W

and so it takes vectors and maps them according to V \ni v\mapsto f(v) +_W g(v) =: (f+_H g)(v) \in W.
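To see the pointwise definition in action, here is a small numpy sketch, under the assumption that V = \mathbb{R}^3 and W = \mathbb{R}^2, so that linear maps can be represented as 2x3 matrices:

```python
import numpy as np

# Linear maps f, g : R^3 -> R^2, represented as 2x3 matrices.
f = np.array([[1.0, 0.0, 2.0],
              [0.0, 1.0, 0.0]])
g = np.array([[0.0, 1.0, 0.0],
              [3.0, 0.0, 1.0]])

v = np.array([1.0, 2.0, 3.0])

# (f +_H g)(v) is defined pointwise as f(v) +_W g(v):
lhs = (f + g) @ v        # apply the summed map to v
rhs = f @ v + g @ v      # sum the two images in W
print(np.allclose(lhs, rhs))  # True
```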

Linearity

We impose a condition on these maps to ensure that scaling and adding vectors before the mapping gives the same result as scaling and adding their images after the mapping – this makes the map a linear map and it satisfies

f(a_1 v_1 + a_2 v_2) = a_1 f(v_1) + a_2 f(v_2) = a_1 w_1 + a_2 w_2

For convenience I've re-labelled the f(v_i) on the right hand side to emphasise that they are vectors in W and not V. In fact, in tensor analysis it is really important to keep track of which side of the mapping you are on, and in some cases it can get awfully confusing. Take for example the above linearity condition for a linear map. The f(v_1) on the right hand side is both i) the function f applied to a vector v_1 \in V and ii) a vector w_1 \in W. Likewise, the plus on the left hand side is the addition operator that comes with the vector space V but the plus on the right hand side is the addition operator that comes with the vector space W; one may write the above with subscripts to be really precise:

f(a_1 \cdot_V v_1 +_V a_2 \cdot_V v_2) = a_1 \cdot_W f(v_1) +_W a_2 \cdot_W f(v_2)

However, the a_1 and a_2 are elements of the same field \mathbb{F} that comes with both vector spaces, and the multiplication of a_1 with f(v_1) is the scalar multiplication that comes with the vector space W – a different operation from the scalar multiplication that comes with V, even though both draw their scalars from the same field.
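A quick numerical sanity check of the linearity condition, using a random matrix as the linear map f (a sketch, nothing deep):

```python
import numpy as np

rng = np.random.default_rng(0)
f = rng.normal(size=(2, 3))   # a (random) linear map R^3 -> R^2
v1 = rng.normal(size=3)
v2 = rng.normal(size=3)
a1, a2 = 2.0, -0.5            # scalars from the field

# f(a1 v1 + a2 v2) should equal a1 f(v1) + a2 f(v2):
print(np.allclose(f @ (a1 * v1 + a2 * v2),
                  a1 * (f @ v1) + a2 * (f @ v2)))  # True
```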

The point here is that writing even a simple expression like the linearity condition involves quite a bit of thinking about where you are and which operation you are using. Recall that with vector spaces you actually get quite a lot of structure! Not only do you get a set of vectors V together with two operations called addition and scalar multiplication,

+_V \,:\, V \times V \longrightarrow V

\cdot_V \,:\, \mathbb{F} \times V \longrightarrow V

but you also get the addition and product operations that come with the underlying field! That's two types of addition and two types of multiplication! We can see these in action. Consider the expression u + v for two vectors u,v \in V. Which plus operator is being used here? It's the plus operator that comes with the vector space, thus u +_V v. OK, what about (a+b)v for a vector v \in V and two scalars a,b \in \mathbb{F}? Well, that plus is from the field, and then there is a product which comes from the vector space (vector spaces allow scalar multiplication between elements of the underlying field and vectors); hence, to be absolutely clear, we could write (a +_{\mathbb{F}} b) \cdot_V v.
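Here is the distinction in a two-line numpy check (assuming V = \mathbb{R}^3 over the reals): the first plus below is field addition of scalars, while the plus on the right hand side is vector addition in V:

```python
import numpy as np

a, b = 2.0, 3.0                 # scalars: elements of F
v = np.array([1.0, -1.0, 4.0])  # a vector in V = R^3

# (a +_F b) ._V v  ==  (a ._V v) +_V (b ._V v)
# The first + is field addition; the second is vector addition.
print(np.allclose((a + b) * v, a * v + b * v))  # True
```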

In light of this, we can no longer say "a vector space" because now we have to be specific about the underlying field; thus we say an \mathbb{F}-vector space.

Dual Vector Spaces

So we are halfway there. We've introduced vector spaces, their underlying fields, and a bit of extra structure (a new set) containing all the homomorphisms from one vector space to another, different, vector space. Note that maps from a vector space to itself (for example, scalar multiplication by a fixed scalar, v \mapsto a \cdot_V v) are also linear maps, but they live in a different set called the set of endomorphisms (or \mbox{End}(V)) because they are not maps between different vector spaces; they are maps from a vector space into itself.

So the set \mbox{Hom}(V,W) takes care of the maps between different vector spaces, and the set \mbox{End}(V) takes care of the maps from within the vector space. Now we introduce more structure by considering maps which go from the vector space to the underlying field! These maps are called functionals and look like this: f\,:\,V \longrightarrow \mathbb{F}. They map vectors to numbers (if you want to call elements of the underlying field "numbers"). The set of all linear functionals f on a vector space (V,\mathbb{F}) is actually more than a set: it satisfies the axioms of an entirely new vector space (just as the set of homomorphisms did). It is called the dual vector space and consists of functionals (not vectors). You could call it "functional space" but we don't; we could also call it the set of all linear maps from the vector space to the field, i.e. \mbox{Hom}(V,\mathbb{F}). We denote the dual space by V^* to signify that it has something to do with the original vector space V.
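Jumping ahead slightly (this secretly uses a basis, which we introduce next), here is what a concrete functional on \mathbb{R}^3 looks like in numpy – a row of coefficients that eats a column of components:

```python
import numpy as np

# A concrete functional on V = R^3: f(v) = 2 v^1 - v^2 + 3 v^3.
# As a row of coefficients it eats a column of components.
f = np.array([2.0, -1.0, 3.0])
v = np.array([1.0, 4.0, 2.0])

print(f @ v)  # 4.0 -- a single number, i.e. an element of F
```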

Unfortunately, functionals require a bit of work to define. The reason is that a vector is an object, but a number is a number – two very different things. The homomorphisms and endomorphisms were easy to define because we were mapping apples to apples and oranges to oranges. How do we turn a vector into a number!?

Basis

To answer the question we need a basis. A basis is even more structure we need to add to our vector space and it is, yet again, a set. If we impose a basis on our vector space then we have an additional set of privileged vectors, let's call them \{e_1,e_2,\dots,e_n\} with representative e_j, which are all linearly independent (you can't represent any one of them as a linear combination of the others) and which span the entire vector space (which just means every vector can be written as a sum of (scaled) basis vectors). There is also a dual basis, denoted by \{e^{*1}, e^{*2}, \dots, e^{*n}\}, with representative e^{*i} (note the dimension of the dual vector space equals the dimension of the vector space, so we must be talking about finite-dimensional spaces – trust me, there's a theorem). Now, since the dual basis vector e^{*i} is a functional and not a vector, it eats a vector and spits out a number. So we could write something like e^{*i}(v) = 5: there could well be a vector that, when the dual basis vector eats it (acts on it), the result is the number 5. In general, however, we don't know what number we get. But we do know the number when a dual basis vector eats a basis vector; the number we get is the Kronecker delta and it looks like this:

e^{*i}(e_j) = \delta_j^i

Now we see why we wrote the dual basis vectors with the index as a superscript! The Kronecker delta must be written with one index upstairs and one downstairs, and downstairs is reserved for the basis vector index, so the dual basis vector index goes upstairs (there's also the Einstein summation convention to worry about, later).
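For the standard basis of \mathbb{R}^n the dual basis is just the rows of the identity matrix, and the Kronecker delta relation can be checked directly:

```python
import numpy as np

n = 3
E = np.eye(n)      # column j is the basis vector e_j of R^n
Estar = np.eye(n)  # row i is the dual basis vector e^{*i}

# Entry (i, j) of the product is e^{*i}(e_j); it should be the
# Kronecker delta, i.e. the identity matrix.
pairings = Estar @ E
print(np.allclose(pairings, np.eye(n)))  # True
```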

Just like a vector can be written in terms of its basis vectors

v = v^1 e_1 + v^2 e_2 + \cdots + v^n e_n

where the basis coefficients {v^i} are elements from the underlying field, the dual vectors can be written in terms of their dual basis vectors and coefficients:

f = f_1 e^{*1} + f_2 e^{*2} + \cdots + f_n e^{*n}

where the dual basis coefficients {f_i} are elements from the underlying field.

The final ingredient needed to see how functionals (dual vectors) turn vectors into numbers is to consider (in terms of bases) what happens when a dual vector eats a vector. First we write out the action in terms of basis vectors:

f(v) = f_i e^{*i} (v^j e_j)

That was easy. Now by virtue of linearity let’s re-arrange:

f_i v^j e^{*i} (e_j)

But we know that when a dual basis vector eats a regular basis vector we get the Kronecker delta, so:

f_i v^j e^{*i}(e_j) = f_i v^j \delta_j^i

The Kronecker delta equals 1 when i = j and 0 otherwise, so the double sum collapses to a single sum and we are left with

f_i v^i \in \mathbb{F}

These guys are numbers from the underlying field, f_i,\,v^i \in \mathbb{F}, so the multiplication here is just field multiplication. And since this works for arbitrary components of vectors and functionals, it means dual vectors are indeed capable of taking vectors to numbers.

If we denote by \langle\,,\,\rangle the map which takes a dual vector and a vector to the field, as in \langle\,,\,\rangle\,:\, V^* \times V \longrightarrow \mathbb{F}, and call this map the inner product (strictly speaking it is the natural pairing between V^* and V, but the name inner product is common), then the action of a dual vector on a vector (which returns a number), given by f(v), is the same operation:

\langle f\,,\, v \rangle = f_i v^i = (f_1 v^1 + \cdots + f_n v^n) \in \mathbb{F}
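In components this pairing is nothing more exotic than a dot product, as a quick numpy check shows (the particular numbers are arbitrary):

```python
import numpy as np

# Components f_i of a functional and v^i of a vector:
f = np.array([1.0, 0.0, -2.0])
v = np.array([3.0, 5.0, 1.0])

# <f, v> = f_i v^i, summing over i:
print(np.dot(f, v))                        # 1.0
print(sum(f[i] * v[i] for i in range(3)))  # the same sum, written out
```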

Tensors

We have met maps which take vectors to the field and called them dual vectors (or functionals), f\,:\,V \longrightarrow \mathbb{F}. We have met maps which take dual vectors and vectors to the field (inner products), \langle\,,\,\rangle\,:\,V^* \times V \longrightarrow \mathbb{F}. Can we keep going and define maps which take more than one dual vector and more than one vector to the field? Yes we can.

We are asking if we can define something that looks like this:

T\,:\,\underbrace{V^* \times \cdots \times V^*}_{p} \times \underbrace{V \times \cdots \times V}_{q} \longrightarrow \mathbb{F}

In a way, this is like a super-sized version of an inner product. Instead of taking just one dual vector and one vector and getting a number back, you take multiple dual vectors and multiple vectors and get a number back. The above object, required to be linear in each slot separately (i.e. multilinear), is called a (p,q)-tensor.
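As a concrete (and completely arbitrary) example, here is a (1,1)-tensor on \mathbb{R}^3 implemented in numpy, anticipating the component description later in this article: it eats one dual vector and one vector and returns a single number:

```python
import numpy as np

# A (1,1)-tensor on R^3, stored as its component array T^i_j
# (arbitrary numbers, purely for illustration).
T = np.arange(9.0).reshape(3, 3)

def eat(omega, v):
    # T(omega, v) = omega_i T^i_j v^j: one dual vector and one
    # vector go in, a single number comes out.
    return np.einsum('i,ij,j->', omega, T, v)

omega = np.array([1.0, 0.0, 1.0])
v = np.array([0.0, 2.0, 0.0])
print(eat(omega, v))  # 16.0 -- a number in F
```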

A (0,1)-tensor maps a vector to a number, i.e. T^0{}_1\,:\,V \longrightarrow \mathbb{F}. Thus, (0,1)-tensors are functionals; they are the duals. Similarly, a (1,0)-tensor is a vector (identifying a vector v with the map that eats one dual vector f and returns the number f(v)). So the vector space contains all the (1,0)-tensors and the dual space contains all the (0,1)-tensors. A (0,0)-tensor is just a scalar, an element of the underlying field. The inner product is a (1,1)-tensor. A (0,n)-tensor that is totally antisymmetric is called an n-form.

The set of all (p,q)-tensors is called a tensor space of type (p,q) and is denoted by \mathcal{T}^p_q. Just as a vector space has addition and scalar multiplication operations, a tensor space has the tensor product operator given by

\otimes\,:\, \mathcal{T}^p_q \times \mathcal{T}^{p^{\prime}}_{q^{\prime}} \longrightarrow \mathcal{T}^{p+p^{\prime}}_{q+q^{\prime}}

Thus, given a (p,q)-tensor \mu and a (p^{\prime},q^{\prime})-tensor \nu, so that \mu \in \mathcal{T}^p_q and \nu \in \mathcal{T}^{p^{\prime}}_{q^{\prime}}, their tensor product is given by

\mu \otimes \nu\,(\omega_1,\dots,\omega_p,\xi_1,\dots,\xi_{p^{\prime}},u_1,\dots,u_q,v_1,\dots,v_{q^{\prime}}) \\ \quad\quad = \mu(\omega_1,\dots,\omega_p,u_1,\dots,u_q)\cdot_{\mathbb{F}} \nu(\xi_1,\dots,\xi_{p^{\prime}},v_1,\dots,v_{q^{\prime}})
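For the simplest case, take \mu and \nu both to be (0,1)-tensors (functionals) on \mathbb{R}^2; the defining property above then says (\mu \otimes \nu)(u,w) = \mu(u) \cdot \nu(w), which we can verify numerically:

```python
import numpy as np

# mu and nu are both (0,1)-tensors (functionals) on R^2.
mu = np.array([1.0, 2.0])
nu = np.array([3.0, -1.0])

u = np.array([1.0, 1.0])
w = np.array([2.0, 0.0])

# Apply the (0,2)-tensor mu (x) nu to the pair (u, w) ...
lhs = np.einsum('i,j,i,j->', mu, nu, u, w)
# ... and compare with the field product mu(u) . nu(w):
rhs = np.dot(mu, u) * np.dot(nu, w)
print(np.isclose(lhs, rhs))  # True
```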

Tensor Basis

A tensor can be represented by its components with respect to a basis (the basis vectors, obviously, come from the associated vector and dual spaces) in the following way

T^{a_1 \dots a_p}{}_{b_1 \dots b_q} = T(e^{*a_1},\dots,e^{*a_p},e_{b_1},\dots,e_{b_q}) \in \mathbb{F}
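Here is the same recipe in numpy for a (1,1)-tensor on \mathbb{R}^2 with the standard basis (a sketch; the helper apply_T stands in for the abstract tensor). Feeding in basis covectors and vectors recovers exactly the component array:

```python
import numpy as np

# A (1,1)-tensor on R^2; apply_T stands in for the abstract tensor.
T = np.array([[1.0, 2.0],
              [3.0, 4.0]])

def apply_T(omega, v):
    return np.einsum('i,ij,j->', omega, T, v)

# Feed in the standard basis: component T^a_b = T(e^{*a}, e_b).
E = np.eye(2)
components = np.array([[apply_T(E[a], E[:, b]) for b in range(2)]
                       for a in range(2)])
print(np.allclose(components, T))  # True
```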

Tensor Space

Let us collect together all (p,q)-tensors into a set \mathcal{T}^p_q V and introduce two new binary operations \oplus \,:\, \mathcal{T}^p_q V \times \mathcal{T}^p_q V \longrightarrow \mathcal{T}^p_q V and \odot \,:\, \mathbb{F} \times \mathcal{T}^p_q V \longrightarrow \mathcal{T}^p_q V. Then (\mathcal{T}^p_q V,\oplus,\odot) is a vector space called the tensor space and will be denoted by \mathcal{T}^p_q(V).
