In this article we’re going to talk about tensors. They’re not too difficult to understand, especially if you know what a functional is, but often their representation can be awkward. So we’re going to start with the basics and slowly work up to the definition of an actual tensor. We’ll start with a short overview of vector spaces and their duals (when you have one you always have the other). We will then talk about adding extra structure to our space by way of including the set of all homomorphisms between vector spaces. We will certainly need this extra structure to be able to talk about maps between spaces, and where these maps come from is not always obvious. Finally, we focus our attention on the maps which take you from vector spaces (or their duals) to the underlying field of numbers (or scalars). These maps (also called functionals) are the simplest type of tensor. From here it is but a short leap of faith to consider functionals which take you from not one, but many vector spaces and duals to the underlying field of scalars.

Some people say that tensors are the workhorse of general relativity. Why is that? Why are they workhorses? In my opinion there are two reasons. First, because General Relativity, like all physical theories, is a framework used to measure stuff. When you measure something you would like to get a number back representing the value of your measurement. Measuring a thing should always result in a number; this number can then be used to compare against other numbers (other measurements) and arrive at an opinion about the thing that you are measuring. Tensors are useful here because they always return a number. Always. But hang on, aren’t there other mathematical objects that return numbers? What makes tensors so special? That’s the second reason: tensors are really, really good at converting many types of physical objects (things which can be described by vectors and covectors) into numbers, and they behave very nicely when you change coordinate systems, something which physicists tend to do a lot. Furthermore, when a basis is present, tensors can be neatly described by arrays or matrices of ordinary numbers, making them even easier to use! And this is why tensors are workhorses.

If you want to get into a bit more detail, tensors are also sensitive to curvature. They manage this because they can have extra components that keep track of tiny differences between distances as measured on a curved surface and distances as they would be measured on a flat one. Ordinary vectors just can’t do this.

But what’s so special about curvature? Well, for starters, ordinary classical calculus hates curved things. Then, to make matters worse, in the early 20th century Einstein proposed that gravity was a manifestation of curved spacetime (and not a force field emitted by a massive object across flat space). This seemed like a great idea, but calculus had to be re-invented for curved spaces, and tensor calculus was born (with the help of Tullio Levi-Civita and Gregorio Ricci-Curbastro, whose tensor calculus was already waiting in the wings).

Vector Spaces

But before we get into curved spacetimes, let’s go back to the beginning. Suppose you have two vector spaces V and W. A vector v from V can be associated with a vector w from W by way of a map. A map in this sense is a function (i.e. it eats a vector and spits out another vector) and is denoted by f. Right now, maps or functions are not really concretely defined; they certainly do not come naturally with the vector spaces and they certainly do not come with the underlying field, so where do they come from and how are they even allowed to exist? Technically, for us to be able to talk about vector spaces and maps we need to impose extra structure on the vector space. As you should know, a plain vanilla vector space is simply a set of objects called vectors (and they satisfy a whole bunch of rules called the vector space axioms) which we group together into a set called V. Then there is a set of different objects called scalars (and they satisfy a whole bunch of rules called the field axioms) which we group together into a set called \mathbb{F}. Then there are four binary operations called 1) vector space addition, 2) vector space scalar multiplication, 3) field addition, and 4) field multiplication. Thus, we write an \mathbb{F}-vector space as (V,\mathbb{F}), but really if you want to be precise it is

(V,+_V,\cdot_V;\mathbb{F},+_{\mathbb{F}},\cdot_{\mathbb{F}})
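
To make this concrete (a toy example of my own, not essential to the argument): the familiar plane with real scalars is, in this long-winded notation,

(\mathbb{R}^2,+_{\mathbb{R}^2},\cdot_{\mathbb{R}^2};\mathbb{R},+_{\mathbb{R}},\cdot_{\mathbb{R}})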

If we have two vector spaces V and W we can include some extra structure which we will call the set of all homomorphisms from V to W, denoted \mbox{Hom}(V,W). This set contains objects called maps or functions (they satisfy a rule called linearity which we will describe below); they look like this: f\,:\,V \longrightarrow W and they map f\,:\,v\mapsto w. Thus, the complete picture of where we are working is now

(V,+_V,\cdot_V;\mathbb{F},+_{\mathbb{F}},\cdot_{\mathbb{F}},\mbox{Hom}(V,W))

Note that \mbox{Hom}(V,W) is itself a vector space with its own addition and scalar multiplication operations called +_H and \cdot_H , defined by

+_H \,:\, \mbox{Hom}(V,W) \times \mbox{Hom}(V,W) \longrightarrow \mbox{Hom}(V,W)

which maps the pair like so: (f,g) \mapsto f +_H g \in \mbox{Hom}(V,W), and similarly for \cdot_H. Since the result of the addition operation is an element of \mbox{Hom}(V,W), it must itself be a map from V to W, hence

f +_H g \,:\, V \longrightarrow W

and so it takes vectors and maps them according to V \ni v\mapsto f(v) +_W g(v) =: (f+_H g)(v) \in W.
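
As a quick sketch of how this works (my own toy example, assuming V = W = \mathbb{R}^2): let f(x,y) = (y,x) and g(x,y) = (x,0). Then the sum f +_H g is the map

(f +_H g)(x,y) = f(x,y) +_W g(x,y) = (y,x) + (x,0) = (x+y,\,x)

Note that f +_H g was defined entirely in terms of the addition +_W that W already carries; we got the vector space structure on \mbox{Hom}(V,W) for free.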

Linearity

We impose a condition on these maps to ensure that when you add and scale vectors before the mapping, the result is the same as mapping the vectors first and then adding and scaling afterwards. This makes the map a linear map and it satisfies

f(a_1 v_1 + a_2 v_2) = a_1 f(v_1) + a_2 f(v_2) = a_1 w_1 + a_2 w_2

For convenience I’ve re-labelled the f(v)’s on the right hand side to emphasise that they are vectors in W and not V. In fact, in tensor analysis it is really important to keep track of which side of the mapping you are on, and in some cases it can get awfully confusing. Take for example the above linearity condition for a linear map. The f(v_1) on the right hand side is both i) the function f applied to a vector v_1 \in V and ii) a vector w_1 \in W. Likewise, the plus on the left hand side is the addition operator that comes with the vector space V but the plus on the right hand side is the addition operator that comes with the vector space W; one may write the above with subscripts to be really precise:

f(a_1 \cdot_V v_1 +_V a_2 \cdot_V v_2) = a_1 \cdot_W f(v_1) +_W a_2 \cdot_W f(v_2)

The a_1 and a_2, however, are elements of the one field \mathbb{F} that comes with both vector spaces. The multiplication of a_1 with f(v_1) is the scalar multiplication \cdot_W that comes with the vector space W; it is a different operation from the scalar multiplication \cdot_V that comes with V, even though both draw their scalars from the same field \mathbb{F}.
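
To see the condition in action on a concrete map (again a toy example of my own, with V = W = \mathbb{R}^2): take f(x,y) = (x+y,\,2x). Scaling first and mapping second gives the same answer as mapping first and scaling second:

f(a \cdot_V (x,y)) = f(ax,\,ay) = (ax+ay,\,2ax) = a \cdot_W (x+y,\,2x) = a \cdot_W f(x,y)

and similarly for addition, so f is linear.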

The point here is that writing even a simple expression such as the linearity condition involves quite a bit of thinking about where you are and which operation you are using. Recall that with a vector space you actually get quite a lot of structure! Not only do you get a set of vectors V, you also get two operations called addition and scalar multiplication

+_V \,:\, V \times V \longrightarrow V

\cdot_V \,:\, \mathbb{F} \times V \longrightarrow V

but, don’t forget, you also get the addition and product operations that come with the underlying field! That’s two more types of addition and multiplication! We can see these in action. Consider the expression u + v for two vectors u,v \in V. Which plus operator is being used here? It’s the plus operator that comes with the vector space, thus it should be written as u +_V v. OK, what about (a+b)v for a vector v \in V and two scalars a,b \in \mathbb{F}? Well, that plus is from the field, and then there is a product which comes from the vector space (vector spaces allow scalar multiplication between elements of the underlying field and vectors), hence, to be absolutely clear, we could write (a +_{\mathbb{F}} b) \cdot_V v.
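
In fact, one of the vector space axioms ties these operations together in a single line. Fully labelled, distributivity of scalar multiplication over field addition reads:

(a +_{\mathbb{F}} b) \cdot_V v = (a \cdot_V v) +_V (b \cdot_V v)

One innocent-looking equation, three different operations.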

In light of this, we can no longer just say “a vector space”, because now we have to be specific about the underlying field; thus we say an \mathbb{F}-vector space.

Dual Vector Spaces

So we are halfway there. We’ve introduced vector spaces, their underlying fields, and a bit of extra structure (a new set) containing all the homomorphisms from one vector space to another, different, vector space. Note that maps from a vector space into itself (for example, the map v \mapsto a \cdot_V v which scales every vector by some fixed scalar a) live in a different set called the set of endomorphisms (or \mbox{End}(V) ), because they are not maps between different vector spaces, they are maps from a vector space into itself.

So the set \mbox{Hom}(V,W) takes care of the maps between different vector spaces, and the set \mbox{End}(V) takes care of the maps from a vector space into itself. Now we introduce more structure by considering the maps which go from the vector space to the underlying field! These maps are called functionals and they look like this: f\,:\,V \longrightarrow \mathbb{F}. They map vectors to numbers (if you want to call elements of the underlying field “numbers”). The set of all linear functionals f on a vector space (V,\mathbb{F}) is more than a set: it satisfies the axioms of an entirely new vector space (just as the set of homomorphisms did above). It is called the dual vector space and consists of functionals (not vectors); you could call it “functional space” but we don’t. We could also describe it as the set of all linear maps from the vector space to the field, i.e. \mbox{Hom}(V,\mathbb{F}). We denote the dual space by V^* to signify that it has something to do with the original vector space V.
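
A quick concrete taste before we worry about how to build such maps in general (assuming V = \mathbb{R}^2 purely for illustration): the map f(x,y) = 3x + 2y is a linear functional, and

f(1,2) = 3 \cdot 1 + 2 \cdot 2 = 7 \in \mathbb{R}

A vector went in, a number came out.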

Unfortunately, functionals require a bit of work to define. The reason is that a vector is an object, but a number is a number: two very different things. The homomorphisms and endomorphisms were easy to define because we were mapping apples to apples and oranges to oranges. How do we turn a vector into a number!? Gosh.

Basis

To answer the question we need a basis. A basis is even more structure we need to add to our vector space and it is, yet again, another set. If we impose a basis on our vector space then we have an additional set of privileged vectors, let’s call them \{e_1,e_2,\dots,e_n\} with representative e_j, which are all linearly independent (you can’t represent any one of them as a linear combination of the others) and which also span the entire vector space (which just means every vector can be written as a sum of (scaled) basis vectors). There is also a dual basis, denoted by \{e^{*1}, e^{*2}, \dots, e^{*n}\} with representative e^{*i} (note that the dimension of the dual vector space equals the dimension of the vector space, so we must be talking about finite dimensional spaces; trust me, there’s a theorem). Now, since the dual vector e^{*i} is a functional and not a vector, it eats a vector and spits out a number. So we could write something like e^{*i}(v) = 5, indicating that the result is an element of the field. In general, however, we don’t know what number we get; it could be a 5 or it could be 500. But we do know the number when a basis dual vector eats a basis vector: the number we get is the Kronecker delta and it looks like this:

e^{*i}(e_j) = \delta_j^i

Now we see why we wrote the dual basis vectors with the index as a superscript! The Kronecker delta must be written with one index upstairs and one downstairs; downstairs is reserved for the basis vector index, so the dual basis vector index goes upstairs (there’s also the Einstein summation convention to worry about: whenever an index appears once upstairs and once downstairs in the same term, a sum over that index is implied).
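
For reference, the Kronecker delta is nothing mysterious; it is just shorthand for “1 on matching indices, 0 otherwise”:

\delta_j^i = \begin{cases} 1 & \mbox{if } i = j \\ 0 & \mbox{if } i \neq j \end{cases}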

Just like a vector can be written in terms of its basis vectors, here’s one:

v = v^1 e_1 + v^2 e_2 + \cdots + v^n e_n

where the basis coefficients {v^i} are elements from the underlying field, the dual vectors can also be written in terms of their dual basis vectors and coefficients:

f = f_1 e^{*1} + f_2 e^{*2} + \cdots + f_n e^{*n}

where the dual basis coefficients {f_i} are elements from the underlying field.
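
A toy example we will reuse shortly (my own numbers, with n = 2 and \mathbb{F} = \mathbb{R}): take

v = 3 e_1 + 5 e_2, \qquad f = 2 e^{*1} + 1\, e^{*2}

so that the components are v^1 = 3, v^2 = 5 and f_1 = 2, f_2 = 1.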

The final ingredient needed to see how functionals (dual vectors) turn vectors into numbers is to consider (in terms of bases) what happens when a dual vector eats a vector. First we write out the action in terms of basis vectors (the Einstein summation convention is at work here: the sums over i and j are implied):

f(v) = f_i e^{*i} (v^j e_j)

That was easy. Now by virtue of linearity let’s re-arrange:

f_i v^j e^{*i} (e_j)

But we know that when a dual basis vector eats a regular basis vector we get the Kronecker delta, so:

f_i v^j e^{*i}(e_j) = f_i v^j \delta_j^i

The Kronecker delta just equals 1 when i = j (and 0 when i \neq j), so the sum collapses and we are left with

f_i v^i \in \mathbb{F}

These guys are numbers from the underlying field, f_i,\,v^i \in \mathbb{F}, so the multiplication here is just field multiplication (and the implied sum over i is field addition). Since this works for the components of any vector and any functional, dual vectors really are capable of taking vectors to numbers.
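
Running the toy numbers from earlier through this machinery, with every cross term written out so the Kronecker delta can do its job:

f(v) = (2 e^{*1} + 1\, e^{*2})(3 e_1 + 5 e_2) = 6\, \delta_1^1 + 10\, \delta_2^1 + 3\, \delta_1^2 + 5\, \delta_2^2 = 6 + 0 + 0 + 5 = 11 \in \mathbb{R}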

Side note: If we denote by \langle\,,\,\rangle the map which takes a dual vector and a vector to the field, as in \langle\,,\,\rangle\,:\, V^* \times V \longrightarrow \mathbb{F}, and call this map the inner product (strictly speaking it is the natural pairing of V^* with V), then the action of a dual vector on a vector (which returns a number), given by f(v), is the same operation:

\langle f\,,\, v \rangle = f_i v^i = (f_1 v^1 + \cdots + f_n v^n) \in \mathbb{F}

Tensors

We have now met maps which take vectors to the field and called them dual vectors (or functionals), f\,:\,V \longrightarrow \mathbb{F}. We have met maps which take dual vectors and vectors to the field (inner products), \langle\,,\,\rangle\,:\,V^* \times V \longrightarrow \mathbb{F}. Can we keep going and define maps which take more than one dual vector and more than one vector to the field? Yes we can.

We are asking if we can define something that looks like this:

T\,:\,\underbrace{V^* \times \cdots \times V^*}_{p \mbox{ copies}} \times \underbrace{V \times \cdots \times V}_{q \mbox{ copies}} \longrightarrow \mathbb{F}

In a way, this is like a super multiple version of an inner product. Instead of taking just one dual vector and one vector and getting a number back, you can take multiple duals and multiple vectors and get a number back. The above object, when it is linear in each of its slots (multilinear), is called a (p,q)-tensor.

A (0,1)-tensor maps a vector to a number in the field, i.e. T^0{}_1\,:\,V \longrightarrow \mathbb{F}. Thus, (0,1)-tensors are functionals; they are the duals. Similarly, a (1,0)-tensor is a vector (it eats one dual vector and returns a number). So the vector space contains all the (1,0)-tensors and the dual space contains all the (0,1)-tensors. A (0,0)-tensor is simply a scalar, an element of the underlying field. The inner product above is a (1,1)-tensor. A (0,n)-tensor that is totally antisymmetric is called an n-form.
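
The most famous example, given the general relativity motivation we started with (and getting slightly ahead of ourselves, since components are only defined in the Tensor Basis section below): the metric g is a (0,2)-tensor. It eats two vectors and returns a number,

g\,:\,V \times V \longrightarrow \mathbb{F}, \qquad g(u,v) = g_{ij}\, u^i v^j \in \mathbb{F}

and this number plays the role of the “dot product” of u and v.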

The set of all (p,q)-tensors is called a tensor space of type (p,q) and is denoted by \mathcal{T}^p_q. Just as a vector space has addition and scalar multiplication operations, a tensor space has the tensor product operator given by

\otimes\,:\, \mathcal{T}^p_q \times \mathcal{T}^{p^{\prime}}_{q^{\prime}} \longrightarrow \mathcal{T}^{p+p^{\prime}}_{q+q^{\prime}}

Thus, given a (p,q)-tensor \mu and a (p^{\prime},q^{\prime})-tensor \nu, that is \mu \in \mathcal{T}^p_q and \nu \in \mathcal{T}^{p^{\prime}}_{q^{\prime}}, their tensor product is given by

\mu \otimes \nu\, (\omega_1,\dots,\omega_p,\xi_1,\dots,\xi_{p^{\prime}},u_1,\dots,u_q,v_1,\dots,v_{q^{\prime}}) \\ \quad\quad = \mu(\omega_1,\dots,\omega_p,u_1,\dots,u_q)\cdot_{\mathbb{F}} \nu(\xi_1,\dots,\xi_{p^{\prime}},v_1,\dots,v_{q^{\prime}})
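
The simplest instance of this formula (a small example of my own): take two (0,1)-tensors f, g \in \mathcal{T}^0_1, i.e. two dual vectors. Their tensor product f \otimes g is a (0,2)-tensor which acts on a pair of vectors as

f \otimes g\, (u, v) = f(u) \cdot_{\mathbb{F}} g(v)

Each factor eats its own argument, and the two resulting numbers are multiplied in the field.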

Tensor Basis

A tensor can be represented by its components with respect to a basis (the basis vectors, obviously, come from the associated vector space and its dual) in the following way

T^{a_1 \dots a_p}{}_{b_1 \dots b_q} = T(e^{*a_1},\dots,e^{*a_p},e_{b_1},\dots,e_{b_q}) \in \mathbb{F}
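
Continuing the small example from above: the components of the (0,2)-tensor f \otimes g are obtained by feeding it basis vectors,

(f \otimes g)_{ij} = f \otimes g\, (e_i, e_j) = f(e_i) \cdot_{\mathbb{F}} g(e_j) = f_i\, g_j

which is exactly the n \times n array of numbers you would write down as a matrix if you wanted the “grid of numbers” picture of this tensor.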

Tensor Space

Let us collect together all (p,q)-tensors into the set \mathcal{T}^p_q from before and introduce two new binary operations \oplus \,:\, \mathcal{T}^p_q \times \mathcal{T}^p_q \longrightarrow \mathcal{T}^p_q and \odot \,:\, \mathbb{F} \times \mathcal{T}^p_q \longrightarrow \mathcal{T}^p_q. Then (\mathcal{T}^p_q,\oplus,\odot) is a vector space called the tensor space.
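
As with +_H on \mbox{Hom}(V,W) earlier, these operations are defined pointwise (a sketch, mirroring that construction): for S, T \in \mathcal{T}^p_q, a \in \mathbb{F}, and arguments \omega_1,\dots,\omega_p \in V^* and v_1,\dots,v_q \in V,

(S \oplus T)(\omega_1,\dots,\omega_p,v_1,\dots,v_q) = S(\omega_1,\dots,\omega_p,v_1,\dots,v_q) +_{\mathbb{F}} T(\omega_1,\dots,\omega_p,v_1,\dots,v_q)

(a \odot T)(\omega_1,\dots,\omega_p,v_1,\dots,v_q) = a \cdot_{\mathbb{F}} T(\omega_1,\dots,\omega_p,v_1,\dots,v_q)

Everything on the right hand side happens in the field \mathbb{F}, which is why the result is again a (p,q)-tensor.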

References

Schuller, F. P. Lecture 8: Tensor space theory I (video lecture).

Nakahara, M. (2003). Geometry, Topology and Physics, 2nd ed. Institute of Physics Publishing. ISBN 0-7503-0606-8.