What Is a Tensor, Really? A Physicist's Honest Answer

The Transformation Trap

If you ask a mathematician what a tensor is, they will tell you it is a multilinear map from a product of vector spaces and their duals to the real numbers. If you ask a physicist, they will likely tell you that a tensor is an object that transforms like a tensor. This physicist's definition is notoriously infuriating. It sounds entirely circular and offers no conceptual foothold for an undergraduate trying to survive a general relativity or advanced electrodynamics course. You are handed an array of numbers, told to multiply it by Jacobian matrices, and left wondering what fundamental reality you are actually describing. To understand general relativity, to truly see how spacetime bends and matter flows, we must discard the circular definitions and look at the underlying geometric reality.

Series Note: This post is part of A Physicist's Guide to General Relativity, a structured series from tensors to black holes.

Final Warning Before Tensor Suffering Begins

At some point in this journey, you will encounter a sentence so horrifyingly abstract that you will briefly consider abandoning physics entirely and becoming a potato farmer instead. This is normal. Every relativist, geometer, and field theorist before you has survived the same transformation-induced psychological event. Before you start agonizing over what a tensor fundamentally is, it is worth adopting a pragmatic mindset. As Ray D'Inverno states in his textbook:

“We shall be concerned more with what you do with tensors rather than what tensors actually are.”

~ Ray D'Inverno

This operational approach allows us to see tensors as tools that manipulate geometry, which is far more useful than getting bogged down in abstract algebra.

What a Tensor is NOT

We must immediately kill the most pervasive misconception in early physics education: a tensor is not a matrix, and it is not simply an array of numbers. A matrix is just a grid. It is a way of writing numbers down on a piece of paper. A tensor, by contrast, is a real, physical, geometric object that exists independently of any coordinate system you choose to draw on the universe. An electric field vector exists whether you use Cartesian coordinates, spherical coordinates, or no coordinates at all. When you change your coordinate system, the grid of numbers you use to describe the tensor will change, but the tensor itself remains perfectly invariant. Confusing a tensor with its matrix of components is like confusing a physical mountain with the contour map you drew of it.

Vectors and the Dual Space

To build a tensor, we start with vectors. Geometrically, a vector is an arrow living in a vector space \(V\) at a specific point in spacetime, representing things like velocity or displacement. To describe this vector algebraically, we introduce a set of basis vectors \(\mathbf{e}_\mu\) (which need not have unit length) and state that our vector \(\mathbf{V}\) is a linear combination of these basis vectors. Throughout this text, we will use the Einstein summation convention, where a repeated index, appearing once as an upper (contravariant) index and once as a lower (covariant) index, is implicitly summed over all spacetime dimensions, from 0 to 3, without writing a summation symbol. We write the vector as: \[ \mathbf{V} = V^\mu \mathbf{e}_\mu \tag{1} \]

However, vectors are only half the story. Associated with every vector space \(V\) is a dual space, denoted \(V^*\). The objects that live in this dual space are called covectors, or 1-forms. A covector is a linear machine: you feed it a vector, and it spits out a real number. The canonical physics example of a covector is the gradient of a scalar field, \(\partial_\mu \phi\). While you can visualize a vector as an arrow, you should visualize a covector as a series of parallel, infinitely extended contour planes. When an arrow (a vector) pierces these planes, you simply count the number of planes it crosses to get a real number.

Geometric interpretation of a contravariant vector and a covector (1-form). A covector can be visualized as a family of equally spaced hypersurfaces; acting on a vector counts how many hypersurfaces are crossed, producing a scalar.
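The "covector eats a vector" picture can be checked numerically. The sketch below is only an illustration: the component values and the change-of-basis matrix `A` are made-up assumptions, not taken from any physical system. It shows that vector components and covector components change in opposite ways under a change of basis, so the number produced by their pairing does not change.

```python
import numpy as np

# Components of a vector V and a covector w in some "old" basis (illustrative values).
V_old = np.array([2.0, 1.0, 0.0, -1.0])
w_old = np.array([0.5, 3.0, -2.0, 1.0])

# Hypothetical invertible change of basis: new basis vector e'_a = sum_b A[b, a] e_b.
A = np.array([
    [2.0, 1.0, 0.0, 0.0],
    [0.0, 1.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, 3.0],
    [0.0, 0.0, 0.0, 1.0],
])

# Vector components transform with the inverse matrix (contravariantly);
# covector components transform with the matrix itself (covariantly).
V_new = np.linalg.solve(A, V_old)
w_new = A.T @ w_old

# The pairing w(V) = w_mu V^mu is the same real number in both bases.
pairing_old = w_old @ V_old
pairing_new = w_new @ V_new
print(pairing_old, pairing_new)
```

Both printed numbers agree, which is the algebraic counterpart of counting how many contour planes the arrow crosses.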

The Formal Definition

We are now ready to state exactly what a tensor is. A tensor \(T\) of rank \((p, q)\) is a multilinear map that takes \(p\) covectors and \(q\) vectors as inputs and outputs a single real number. Mathematically, it is a map from the Cartesian product of the dual spaces and vector spaces to the real numbers \(\mathbb{R}\): \[ T: \underbrace{V^* \times \cdots \times V^*}_{p \text{ times}} \times \underbrace{V \times \cdots \times V}_{q \text{ times}} \to \mathbb{R} \tag{2} \] The word multilinear means that the output is linear in each slot separately, with all other inputs held fixed. If you scale any one of the input vectors or covectors by a constant, the output scales by that same constant; if you replace one input with a sum of two, the output is the sum of the two corresponding outputs. That is the entire definition. A tensor is just a machine with slots. A rank \((0,2)\) tensor has two slots for vectors. A rank \((2,0)\) tensor has two slots for covectors. A rank \((1,1)\) tensor has one slot for a vector and one slot for a covector.
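As a rough illustration of the "machine with slots" idea, the following sketch treats a rank \((0,2)\) tensor as a Python function of two vectors and checks linearity in each slot. The component array `T` and the helper name `T_map` are hypothetical choices made for this example only.

```python
import numpy as np

# Components of a hypothetical rank (0,2) tensor in some basis (illustrative values).
T = np.arange(16, dtype=float).reshape(4, 4)

def T_map(u, v):
    """Feed one vector into each of the two slots and return a real number."""
    return np.einsum("mn,m,n->", T, u, v)

rng = np.random.default_rng(0)
u, v, w = rng.normal(size=(3, 4))  # three arbitrary test vectors
a, b = 2.5, -1.5

# Linearity in the first slot, with the second input held fixed.
first_lhs = T_map(a * u + b * w, v)
first_rhs = a * T_map(u, v) + b * T_map(w, v)

# Linearity in the second slot, with the first input held fixed.
second_lhs = T_map(u, a * v + b * w)
second_rhs = a * T_map(u, v) + b * T_map(u, w)

print(np.isclose(first_lhs, first_rhs), np.isclose(second_lhs, second_rhs))
```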

Components vs The Tensor Itself

Here is where most physics textbooks fail the reader. When you see a symbol like \(T^{\mu\nu}\) in a textbook, you are explicitly not looking at the tensor. You are looking at the components of the tensor evaluated in a specific coordinate basis. Just as a vector is built from components multiplied by basis vectors, a tensor is built from its components multiplied by basis tensor products. For a rank \((2,0)\) tensor, the full geometric object is written using the tensor product operator \(\otimes\): \[ \mathbf{T} = T^{\mu\nu} \mathbf{e}_\mu \otimes \mathbf{e}_\nu \tag{3} \] The object \(\mathbf{T}\) is invariant; it does not care about your coordinates. The components \(T^{\mu\nu}\) are just the shadows this object casts onto your specific basis vectors \(\mathbf{e}_\mu\) and \(\mathbf{e}_\nu\). If you rotate your basis vectors, the shadows must change shape precisely to ensure that the physical object \(\mathbf{T}\) stays exactly the same.
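To make the distinction between components and object concrete, here is a small sketch in three dimensions. The two bases and the component values are assumptions picked for illustration. The object \(\mathbf{T}\) is assembled from components and basis vectors as in equation (3); the components differ between the two bases, while the assembled object does not.

```python
import numpy as np

# Columns of each matrix are basis vectors written in a fixed reference frame.
basis_old = np.eye(3)
basis_new = np.array([   # hypothetical skewed and stretched basis
    [1.0, 1.0, 0.0],
    [0.0, 2.0, 0.0],
    [0.0, 0.0, 0.5],
])

def assemble(components, basis):
    """Build T = T^{mu nu} e_mu (x) e_nu as an array in the reference frame."""
    return np.einsum("mn,im,jn->ij", components, basis, basis)

# Components in the old basis (illustrative values).
T_old = np.array([
    [1.0, 2.0, 0.0],
    [2.0, 3.0, 1.0],
    [0.0, 1.0, 4.0],
])
T_object = assemble(T_old, basis_old)

# Components in the new basis: one inverse basis matrix per upper index.
B_inv = np.linalg.inv(basis_new)
T_new = np.einsum("mi,nj,ij->mn", B_inv, B_inv, T_object)

# Different "shadows", same object.
print(np.allclose(T_old, T_new))                            # components differ
print(np.allclose(assemble(T_new, basis_new), T_object))    # object agrees
```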

Deriving the Transformation Law

We can now see exactly why the infamous transformation law exists. It is not an arbitrary rule; it is a mathematical inevitability derived from demanding that the geometric object remains invariant under a change of coordinates (Carroll, Ch. 1).

Derivation: Contravariant Vector Transformation

Let us change our coordinate system from \(x^\mu\) to a new set of coordinates \(x^{\mu'}\). By the chain rule, the new (primed) basis vectors are expressed in terms of the old (unprimed) ones through the partial derivatives of the old coordinates with respect to the new: \[ \mathbf{e}_{\mu'} = \frac{\partial x^\mu}{\partial x^{\mu'}} \mathbf{e}_\mu \] This equation simply states that the new basis vectors are a linear combination of the old ones. Now, let us require that a physical vector \(\mathbf{V}\) remains completely unchanged by this coordinate switch. Its expression in the new primed basis must equal its expression in the old unprimed basis: \[ \mathbf{V} = V^{\mu'} \mathbf{e}_{\mu'} \] We substitute the transformed basis vectors into this expression to see how the components must behave to compensate: \[ \mathbf{V} = V^{\mu'} \left( \frac{\partial x^\mu}{\partial x^{\mu'}} \mathbf{e}_\mu \right) \] Because \(\mathbf{V}\) is also equal to \(V^\mu \mathbf{e}_\mu\) by definition, we can equate the coefficients of each unprimed basis vector \(\mathbf{e}_\mu\) on both sides: \[ V^\mu = \frac{\partial x^\mu}{\partial x^{\mu'}} V^{\mu'} \] To find how the components in the new frame relate to those in the old frame, we multiply both sides by the inverse Jacobian matrix \(\frac{\partial x^{\nu'}}{\partial x^\mu}\) and sum over \(\mu\). By the chain rule, the two Jacobians contract to a Kronecker delta: \[ \frac{\partial x^{\nu'}}{\partial x^\mu} \frac{\partial x^\mu}{\partial x^{\mu'}} = \delta^{\nu'}_{\mu'} \] leaving us with: \[ V^{\nu'} = \frac{\partial x^{\nu'}}{\partial x^\mu} V^\mu \] This gives the transformation law for a rank \((1,0)\) contravariant tensor. For a rank \((2,0)\) tensor like \(T^{\mu\nu}\), we simply apply one Jacobian matrix for every index to ensure the total multilinear object remains invariant: \[ T^{\mu'\nu'} = \frac{\partial x^{\mu'}}{\partial x^\mu} \frac{\partial x^{\nu'}}{\partial x^\nu} T^{\mu\nu} \]
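The transformation law can be tried out on the simplest nontrivial case: going from Cartesian coordinates \((x, y)\) to polar coordinates \((r, \theta)\) in the plane. The sketch below is a two-dimensional toy, not a spacetime calculation, and the point and vector components are arbitrary illustrative choices. The helper name `jacobian_polar_from_cartesian` is made up for this example.

```python
import numpy as np

def jacobian_polar_from_cartesian(x, y):
    """Jacobian d(r, theta)/d(x, y) evaluated at the point (x, y)."""
    r = np.hypot(x, y)
    return np.array([
        [x / r,      y / r],       # dr/dx,     dr/dy
        [-y / r**2,  x / r**2],    # dtheta/dx, dtheta/dy
    ])

x, y = 3.0, 4.0                    # point where the vector lives (r = 5)
V_cart = np.array([1.0, 2.0])      # components (V^x, V^y), illustrative

J = jacobian_polar_from_cartesian(x, y)

# Rank (1,0): V^{nu'} = (dx^{nu'}/dx^mu) V^mu
V_polar = J @ V_cart

# Rank (2,0): one Jacobian per index.
T_cart = np.outer(V_cart, V_cart)
T_polar = np.einsum("am,bn,mn->ab", J, J, T_cart)

print(V_polar)   # components (V^r, V^theta)
```

Because `T_cart` was built as an outer product of `V_cart` with itself, transforming it with two Jacobians gives the same result as transforming the vector first and then taking the outer product.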

The Metric Tensor

The most important tensor in general relativity is the metric tensor, denoted \(g_{\mu\nu}\). It is a symmetric, rank \((0,2)\) tensor, meaning it takes two vectors as inputs and outputs a scalar. Physically, it is the machine that defines distances, angles, and the dot product in curved spacetime. Following Carroll's conventions, we use a mostly-plus metric signature \((-,+,+,+)\). The metric dictates the fundamental spacetime interval \(ds^2\) between two infinitesimally close events: \[ ds^2 = g_{\mu\nu} dx^\mu dx^\nu \tag{4} \] Beyond defining geometry, the metric tensor serves as a crucial mathematical tool for manipulating other tensors by mapping vectors into the dual space.
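As a minimal sketch of equation (4), the code below evaluates the interval for a few displacements using the flat Minkowski metric in the mostly-plus signature. Flat spacetime, units with \(c = 1\), and the specific displacement values are assumptions made for illustration.

```python
import numpy as np

# Minkowski metric with signature (-,+,+,+), assuming units where c = 1.
eta = np.diag([-1.0, 1.0, 1.0, 1.0])

def interval(dx, g=eta):
    """ds^2 = g_{mu nu} dx^mu dx^nu for a small displacement dx."""
    return np.einsum("mn,m,n->", g, dx, dx)

light_ray = np.array([1.0, 1.0, 0.0, 0.0])   # moves one unit of x per unit of t
at_rest   = np.array([1.0, 0.0, 0.0, 0.0])   # pure time displacement
spatial   = np.array([0.0, 2.0, 0.0, 0.0])   # pure space displacement

print(interval(light_ray))   # null: zero
print(interval(at_rest))     # timelike: negative in this signature
print(interval(spatial))     # spacelike: positive in this signature
```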

Derivation: Raising and Lowering Indices

Because \(g_{\mu\nu}\) defines an isomorphism between the vector space \(V\) and its dual space \(V^*\), we can convert a contravariant vector component \(V^\mu\) into a covariant covector component \(V_\nu\) simply by contracting it with the metric. The metric acts as a lowering operator: \[ V_\nu = g_{\mu\nu} V^\mu \] To reverse the process, we require the inverse metric tensor \(g^{\mu\nu}\). By definition, this inverse metric satisfies the identity \(g^{\mu\lambda} g_{\lambda\nu} = \delta^\mu_\nu\), where \(\delta^\mu_\nu\) is the Kronecker delta. We raise an index by contracting with this inverse metric: \[ V^\mu = g^{\mu\nu} V_\nu \] This operational rule applies similarly to higher-rank tensors, manipulating them one slot at a time. For example, lowering one index of a rank \((2,0)\) tensor transforms it into a rank \((1,1)\) tensor: \[ T^\mu_{\phantom{\mu}\nu} = g_{\nu\lambda} T^{\mu\lambda} \]
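Index gymnastics maps directly onto `numpy.einsum`. The sketch below uses the flat Minkowski metric as a stand-in for \(g_{\mu\nu}\) and arbitrary illustrative components; in a curved spacetime the metric array would depend on position, which this example does not model.

```python
import numpy as np

eta = np.diag([-1.0, 1.0, 1.0, 1.0])   # g_{mu nu}, flat-space assumption
eta_inv = np.linalg.inv(eta)            # g^{mu nu}

V_up = np.array([2.0, 1.0, 0.0, 3.0])   # V^mu (illustrative values)

# Lower the index: V_nu = g_{mu nu} V^mu
V_down = np.einsum("mn,m->n", eta, V_up)

# Raise it again: V^mu = g^{mu nu} V_nu
V_back = np.einsum("mn,n->m", eta_inv, V_down)

# Lower the second index of a rank (2,0) tensor:
# T^mu_nu = g_{nu lambda} T^{mu lambda}
T_up = np.outer(V_up, V_up)
T_mixed = np.einsum("nl,ml->mn", eta, T_up)

print(V_down)   # only the time component changes sign
```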

Object Name	Rank (p,q)	Physics Example
Scalar	(0,0)	Rest mass \(m\), Temperature \(T\)
Vector (Contravariant)	(1,0)	Four-velocity \(U^\mu\)
Covector (Covariant)	(0,1)	Four-momentum \(p_\mu\)
Metric	(0,2)	Spacetime metric \(g_{\mu\nu}\)

Two More Essential Physics Examples

In general relativity, spacetime curvature is sourced by the stress-energy tensor, \(T^{\mu\nu}\). This is a symmetric rank \((2,0)\) tensor that encapsulates the density and flux of energy and momentum in spacetime. Physically, the component \(T^{\mu\nu}\) measures the flux of the \(\mu\)-component of four-momentum across a surface of constant \(x^\nu\) coordinate. For instance, \(T^{00}\) is the energy density (flux of energy through time), \(T^{0i}\) represents momentum density, and \(T^{ij}\) represents the spatial stresses: pressure on the diagonal and shear stress off it. It completely characterizes the matter and radiation in a given region.

In electrodynamics, the electric and magnetic fields are unified into a single geometric object called the electromagnetic field tensor, \(F_{\mu\nu}\). This is an antisymmetric rank \((0,2)\) tensor. Because it is antisymmetric (\(F_{\mu\nu} = -F_{\nu\mu}\)), its diagonal components must be zero. The time-space components \(F_{0i}\) encode the electric field, while the space-space components \(F_{ij}\) encode the magnetic field. Writing Maxwell's equations in terms of \(F_{\mu\nu}\) reduces them from four separate vector-calculus equations to just two compact, coordinate-independent tensor equations.
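The structure of \(F_{\mu\nu}\) is easy to see by building it from field values. The sketch below assumes one common sign convention for the mostly-plus signature, \(F_{0i} = -E_i\) and \(F_{ij} = \epsilon_{ijk} B_k\), in units with \(c = 1\); other texts differ by overall signs. The field values and the helper name `field_tensor` are illustrative.

```python
import numpy as np

def field_tensor(E, B):
    """Covariant components F_{mu nu}, assuming F_{0i} = -E_i and
    F_{ij} = eps_{ijk} B_k (a convention choice; signs vary between texts)."""
    Ex, Ey, Ez = E
    Bx, By, Bz = B
    return np.array([
        [0.0, -Ex, -Ey, -Ez],
        [Ex,  0.0,  Bz, -By],
        [Ey, -Bz,  0.0,  Bx],
        [Ez,  By, -Bx,  0.0],
    ])

eta_inv = np.diag([-1.0, 1.0, 1.0, 1.0])   # inverse Minkowski metric

E = np.array([1.0, 2.0, 3.0])    # illustrative field values
B = np.array([0.5, -1.0, 2.0])
F = field_tensor(E, B)

# Raise both indices, then contract fully to get a scalar.
F_up = np.einsum("ma,nb,ab->mn", eta_inv, eta_inv, F)
invariant = np.einsum("mn,mn->", F, F_up)

print(np.allclose(F, -F.T))                  # antisymmetry
print(invariant, 2.0 * (B @ B - E @ E))      # F_{mu nu} F^{mu nu} = 2(B^2 - E^2)
```

The full contraction \(F_{\mu\nu}F^{\mu\nu}\) has no free indices, so it is a scalar that all observers agree on even though they disagree about \(\mathbf{E}\) and \(\mathbf{B}\) separately.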

The Covariant Derivative

If tensors are strictly defined by how their components transform, we encounter a severe problem when taking derivatives. In flat Minkowski space described with inertial (Cartesian) coordinates, taking the partial derivative of a tensor yields another valid tensor. In curved spacetime, or even in flat space described with curvilinear coordinates, it does not (Schutz, Ch. 5).

Derivation: Why Partial Derivatives Fail as Tensors

Let us evaluate the partial derivative of a vector \(V^\nu\) in a new coordinate system \(x^{\mu'}\) to see if it obeys the tensor transformation law. The vector itself transforms contravariantly as \(V^{\nu'} = \frac{\partial x^{\nu'}}{\partial x^\nu} V^\nu\). The derivative operator transforms via the chain rule as \(\partial_{\mu'} = \frac{\partial x^\mu}{\partial x^{\mu'}} \partial_\mu\). Applying the transformed operator to the transformed vector yields: \[ \partial_{\mu'} V^{\nu'} = \frac{\partial x^\mu}{\partial x^{\mu'}} \partial_\mu \left( \frac{\partial x^{\nu'}}{\partial x^\nu} V^\nu \right) \] Because the Jacobian matrix \(\frac{\partial x^{\nu'}}{\partial x^\nu}\) is not composed of constants in a curved or curvilinear coordinate system, we must use the product rule to distribute the partial derivative \(\partial_\mu\). This gives two distinct terms: \[ \partial_{\mu'} V^{\nu'} = \frac{\partial x^\mu}{\partial x^{\mu'}} \frac{\partial x^{\nu'}}{\partial x^\nu} \partial_\mu V^\nu + \frac{\partial x^\mu}{\partial x^{\mu'}} \left( \frac{\partial^2 x^{\nu'}}{\partial x^\mu \partial x^\nu} \right) V^\nu \] The first term is exactly how a rank \((1,1)\) tensor is required to transform. However, the second term containing the second derivative \(\frac{\partial^2 x^{\nu'}}{\partial x^\mu \partial x^\nu}\) is a mathematical anomaly that ruins the transformation. This term exists specifically because the coordinate grids themselves bend and change across the manifold. Because of this extra term, the operation \(\partial_\mu V^\nu\) is fundamentally not a tensor in curved spacetime.

To fix this fatal flaw, we introduce the covariant derivative, denoted by \(\nabla_\mu\), which corrects for the twisting of the coordinate basis across spacetime.

Derivation: The Covariant Derivative Correction Term

To cancel out the non-tensorial anomaly from the partial derivative, we must manually add a correction piece. We define the covariant derivative of a contravariant vector as: \[ \nabla_\mu V^\nu = \partial_\mu V^\nu + \Gamma^\nu_{\mu\lambda} V^\lambda \] The added term, \(\Gamma^\nu_{\mu\lambda}\), is the Christoffel symbol (or connection coefficient). It explicitly measures how the basis vectors twist and scale as you move through spacetime. The Christoffel symbols are not tensors themselves; they transform with their own anomalous second-derivative term. However, this anomaly is precisely designed to cancel out the bad term from the partial derivative, leaving the total operator \(\nabla_\mu V^\nu\) to transform perfectly as a rank \((1,1)\) tensor. For a covariant vector (covector), the correction is subtracted rather than added: \[ \nabla_\mu V_\nu = \partial_\mu V_\nu - \Gamma^\lambda_{\mu\nu} V_\lambda \]
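A flat-space toy example shows both the problem and the fix. Take the constant vector field pointing along the Cartesian \(x\)-axis and describe it in polar coordinates \((r, \theta)\). Its polar components vary from point to point, so the partial derivatives are nonzero even though the field itself is constant; adding the Christoffel term removes this spurious change. The polar Christoffel symbols used here (\(\Gamma^r_{\theta\theta} = -r\), \(\Gamma^\theta_{r\theta} = \Gamma^\theta_{\theta r} = 1/r\)) are quoted as standard results rather than derived, and the partial derivatives are worked out by hand.

```python
import numpy as np

def christoffel_polar(r):
    """Gamma[nu, mu, lam] = Gamma^nu_{mu lam} for plane polar coordinates.
    Index 0 is r, index 1 is theta."""
    Gamma = np.zeros((2, 2, 2))
    Gamma[0, 1, 1] = -r          # Gamma^r_{theta theta}
    Gamma[1, 0, 1] = 1.0 / r     # Gamma^theta_{r theta}
    Gamma[1, 1, 0] = 1.0 / r     # Gamma^theta_{theta r}
    return Gamma

def V(r, theta):
    """Polar components (V^r, V^theta) of the constant Cartesian field e_x."""
    return np.array([np.cos(theta), -np.sin(theta) / r])

def partial_V(r, theta):
    """dV[mu, nu] = partial_mu V^nu, differentiated by hand."""
    return np.array([
        [0.0,             np.sin(theta) / r**2],   # d/dr of (V^r, V^theta)
        [-np.sin(theta), -np.cos(theta) / r],      # d/dtheta of (V^r, V^theta)
    ])

r, theta = 2.0, 0.7   # arbitrary evaluation point
dV = partial_V(r, theta)

# nabla_mu V^nu = partial_mu V^nu + Gamma^nu_{mu lam} V^lam
nabla_V = dV + np.einsum("nml,l->mn", christoffel_polar(r), V(r, theta))

print(dV)        # nonzero: the components change with position
print(nabla_V)   # zero up to rounding: the field itself does not change
```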

In general relativity, gravity is described entirely by the metric tensor \(g_{\mu\nu}\). Consequently, the metric dictates exactly how spacetime curves, which means the Christoffel symbols must be derived directly from the metric itself. This unique geometric linkage follows from two physical requirements: the metric is covariantly constant (metric compatibility) and spacetime is torsion-free.

Derivation: Christoffel Symbols from the Metric

We begin with the metric compatibility condition, which says that lengths and angles measured by the metric are preserved under parallel transport. Mathematically, this means the covariant derivative of the metric is zero everywhere: \(\nabla_\lambda g_{\mu\nu} = 0\). Using the rule for the covariant derivative of a rank \((0,2)\) tensor (one subtracted connection term per lower index), we expand this into its partial derivatives and connection coefficients: \[ \partial_\lambda g_{\mu\nu} - \Gamma^\sigma_{\lambda\mu} g_{\sigma\nu} - \Gamma^\sigma_{\lambda\nu} g_{\mu\sigma} = 0 \tag{A1} \] Next, we generate two more equations of the same form by cyclically permuting the indices \((\lambda, \mu, \nu)\): \[ \partial_\mu g_{\nu\lambda} - \Gamma^\sigma_{\mu\nu} g_{\sigma\lambda} - \Gamma^\sigma_{\mu\lambda} g_{\nu\sigma} = 0 \tag{A2} \] \[ \partial_\nu g_{\lambda\mu} - \Gamma^\sigma_{\nu\lambda} g_{\sigma\mu} - \Gamma^\sigma_{\nu\mu} g_{\lambda\sigma} = 0 \tag{A3} \] We now enforce the torsion-free condition, which demands that the Christoffel symbols are symmetric in their lower indices: \(\Gamma^\sigma_{\mu\nu} = \Gamma^\sigma_{\nu\mu}\). We add equations (A2) and (A3) together, and then subtract equation (A1). Using this symmetry together with the symmetry of the metric tensor, four of the six Christoffel terms cancel in pairs and the remaining two are equal, leaving: \[ \partial_\mu g_{\nu\lambda} + \partial_\nu g_{\lambda\mu} - \partial_\lambda g_{\mu\nu} - 2\Gamma^\sigma_{\mu\nu} g_{\sigma\lambda} = 0 \] We rearrange this expression to isolate the remaining Christoffel term on one side: \[ 2\Gamma^\sigma_{\mu\nu} g_{\sigma\lambda} = \partial_\mu g_{\nu\lambda} + \partial_\nu g_{\lambda\mu} - \partial_\lambda g_{\mu\nu} \] Finally, to strip away the metric \(g_{\sigma\lambda}\) and solve explicitly for \(\Gamma\), we multiply both sides by half the inverse metric \(\frac{1}{2}g^{\rho\lambda}\). Since contracting the metric with its inverse yields a Kronecker delta (\(g_{\sigma\lambda}g^{\rho\lambda} = \delta^\rho_\sigma\)), the left side reduces to \(\Gamma^\rho_{\mu\nu}\). We arrive at the fundamental equation connecting the metric to the connection: \[ \Gamma^\rho_{\mu\nu} = \frac{1}{2} g^{\rho\lambda} \left( \partial_\mu g_{\nu\lambda} + \partial_\nu g_{\lambda\mu} - \partial_\lambda g_{\mu\nu} \right) \]
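The final formula can be evaluated numerically for any metric given as a function of the coordinates. The sketch below is one possible approach, using central finite differences for \(\partial_\lambda g_{\mu\nu}\); a symbolic algebra package would be an equally reasonable choice. As a test case it assumes the unit 2-sphere with coordinates \((\theta, \phi)\) and metric \(\mathrm{diag}(1, \sin^2\theta)\), whose nonzero symbols are known to be \(\Gamma^\theta_{\phi\phi} = -\sin\theta\cos\theta\) and \(\Gamma^\phi_{\theta\phi} = \Gamma^\phi_{\phi\theta} = \cot\theta\). The function names are made up for this example.

```python
import numpy as np

def sphere_metric(x):
    """Metric g_{mu nu} of the unit 2-sphere at x = (theta, phi)."""
    theta, _phi = x
    return np.diag([1.0, np.sin(theta) ** 2])

def christoffel(metric, x, h=1e-5):
    """Gamma[rho, mu, nu] = Gamma^rho_{mu nu}, using central finite
    differences with step h for the derivatives of the metric."""
    x = np.asarray(x, dtype=float)
    n = x.size
    # dg[lam, mu, nu] = partial_lam g_{mu nu}
    dg = np.zeros((n, n, n))
    for lam in range(n):
        step = np.zeros(n)
        step[lam] = h
        dg[lam] = (metric(x + step) - metric(x - step)) / (2.0 * h)
    g_inv = np.linalg.inv(metric(x))
    # bracket[mu, nu, lam] = d_mu g_{nu lam} + d_nu g_{lam mu} - d_lam g_{mu nu}
    bracket = (dg
               + np.einsum("nml->mnl", dg)
               - np.einsum("lmn->mnl", dg))
    return 0.5 * np.einsum("rl,mnl->rmn", g_inv, bracket)

theta = 1.0
Gamma = christoffel(sphere_metric, [theta, 0.3])

print(Gamma[0, 1, 1], -np.sin(theta) * np.cos(theta))   # Gamma^theta_{phi phi}
print(Gamma[1, 0, 1], np.cos(theta) / np.sin(theta))    # Gamma^phi_{theta phi}
```

The output is symmetric in its last two indices by construction, matching the torsion-free condition used in the derivation.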

With the covariant derivative and Christoffel symbols defined, we finally possess the mathematical machinery necessary to describe how objects move in a curved spacetime. In flat space, a free particle moves in a straight line, keeping its velocity vector perfectly constant. In curved spacetime, we cannot define a globally straight line. Instead, we define parallel transport, the process of dragging a vector along a curve while keeping it as constant as the curved geometry allows.

Derivation: Parallel Transport Equation

Consider a vector \(V^\mu\) carried along a parameterized curve \(x^\mu(\lambda)\). To measure how this vector changes strictly along the path, we take the directional covariant derivative, known as the absolute derivative \(\frac{DV^\mu}{d\lambda}\). We construct this by projecting the standard covariant derivative \(\nabla_\nu V^\mu\) along the tangent vector of the curve \(\frac{dx^\nu}{d\lambda}\): \[ \frac{DV^\mu}{d\lambda} = \frac{dx^\nu}{d\lambda} \nabla_\nu V^\mu \] We then expand the covariant derivative into its constituent partial derivative and Christoffel symbol correction terms: \[ \frac{DV^\mu}{d\lambda} = \frac{dx^\nu}{d\lambda} \left( \partial_\nu V^\mu + \Gamma^\mu_{\nu\rho} V^\rho \right) \] Distributing the tangent vector across the terms inside the parentheses yields: \[ \frac{DV^\mu}{d\lambda} = \frac{dx^\nu}{d\lambda} \partial_\nu V^\mu + \Gamma^\mu_{\nu\rho} \frac{dx^\nu}{d\lambda} V^\rho \] By applying the multivariable chain rule, we recognise that the first term \(\frac{dx^\nu}{d\lambda} \partial_\nu V^\mu\) is exactly the ordinary derivative \(\frac{dV^\mu}{d\lambda}\). Replacing it gives us the full absolute derivative: \[ \frac{DV^\mu}{d\lambda} = \frac{dV^\mu}{d\lambda} + \Gamma^\mu_{\nu\rho} \frac{dx^\nu}{d\lambda} V^\rho \] For a vector to be parallel transported, it must remain as constant as the geometry allows along the curve, meaning its absolute derivative must vanish. Here \(\lambda\) is the curve parameter, so we keep \(\rho\) as the summed index to avoid a clash: \[ \frac{DV^\mu}{d\lambda} = \frac{dV^\mu}{d\lambda} + \Gamma^\mu_{\nu\rho} \frac{dx^\nu}{d\lambda} V^\rho = 0 \]
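The parallel transport equation is an ordinary differential equation for the components \(V^\mu(\lambda)\), so it can be integrated numerically. The sketch below is a toy under stated assumptions: the unit 2-sphere with coordinates \((\theta, \phi)\), its standard Christoffel symbols quoted rather than derived, a circle of constant colatitude \(\theta_0 = \pi/3\) as the curve, and a hand-written fourth-order Runge-Kutta integrator. The function names are hypothetical.

```python
import numpy as np

def christoffel_sphere(theta):
    """Gamma[mu, nu, rho] = Gamma^mu_{nu rho} on the unit 2-sphere.
    Index 0 is theta, index 1 is phi."""
    Gamma = np.zeros((2, 2, 2))
    Gamma[0, 1, 1] = -np.sin(theta) * np.cos(theta)
    Gamma[1, 0, 1] = Gamma[1, 1, 0] = np.cos(theta) / np.sin(theta)
    return Gamma

def transport_along_latitude(V0, theta0, steps=2000):
    """Integrate dV^mu/dlam = -Gamma^mu_{nu rho} (dx^nu/dlam) V^rho with RK4
    along the curve theta = theta0, phi = lam, for lam from 0 to 2*pi."""
    tangent = np.array([0.0, 1.0])         # dx^nu/dlam for this curve
    Gamma = christoffel_sphere(theta0)     # constant along this particular curve

    def rhs(V):
        return -np.einsum("mnr,n,r->m", Gamma, tangent, V)

    V = np.array(V0, dtype=float)
    h = 2.0 * np.pi / steps
    for _ in range(steps):
        k1 = rhs(V)
        k2 = rhs(V + 0.5 * h * k1)
        k3 = rhs(V + 0.5 * h * k2)
        k4 = rhs(V + h * k3)
        V = V + (h / 6.0) * (k1 + 2.0 * k2 + 2.0 * k3 + k4)
    return V

theta0 = np.pi / 3
V_start = np.array([1.0, 0.0])             # initially pointing along theta
V_end = transport_along_latitude(V_start, theta0)

# Length measured with the metric g = diag(1, sin^2 theta0).
g = np.diag([1.0, np.sin(theta0) ** 2])
norm_start = V_start @ g @ V_start
norm_end = V_end @ g @ V_end

print(V_end)                  # components after one full loop
print(norm_start, norm_end)   # the length is preserved
```

For this particular latitude the vector returns with its \(\theta\)-component reversed: its length is unchanged, but it has rotated relative to where it started, which is a direct signature of curvature.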

Tensors are the ultimate language of general relativity because they guarantee that our physical laws do not depend on the arbitrary coordinate grids we draw on the universe. When mass and energy bend the metric tensor, the Christoffel symbols shift, altering the covariant derivatives and forcing parallel transported trajectories to converge or diverge. What we call gravity is not a force pulling on objects; it is merely matter attempting to travel in a straight line through a curved tensorial reality. Once you see tensors as physical, coordinate-independent constraints rather than matrix algorithms, the abstract mathematics of Einstein's universe suddenly becomes a tangible, geometric landscape.

FAQ

Is a scalar a tensor?

Yes. A scalar is a tensor of rank \((0,0)\). It requires zero vectors and zero covectors as inputs, and simply exists as a single real number that is completely invariant under coordinate transformations. Your rest mass is the same number regardless of the coordinate system used to measure it.

What is the difference between covariant and contravariant?

These terms describe how the components of a geometric object respond to a change in basis. Contravariant components (upper indices) transform inversely to the basis vectors; these describe standard vectors like velocity. Covariant components (lower indices) transform identically to the basis vectors; these describe covectors, like the gradient.

Can any matrix be treated as a tensor?

Absolutely not. You can fill a 4x4 matrix with any random numbers you like, but unless those numbers transform according to the strict multilinear tensor transformation laws when you change coordinates, the matrix does not represent a physical tensor. A matrix is just an array; a tensor is a geometric object whose components are tied together by how they must change between coordinate systems.

What is the trace of a tensor?

The trace of a tensor is obtained by an operation called contraction, where one covariant index and one contravariant index are set equal and summed over. For a rank \((1,1)\) tensor \(T^\mu{}_\nu\), the trace is \(T^\mu{}_\mu = T^0{}_0 + T^1{}_1 + T^2{}_2 + T^3{}_3\). Because the Jacobian attached to the upper index and the inverse Jacobian attached to the lower index cancel during this summation, the trace is an invariant scalar.
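The cancellation can be seen in a few lines. In this sketch the mixed components and the transformation matrix `Lam` (standing in for a Jacobian at one point) are arbitrary illustrative choices.

```python
import numpy as np

# Mixed components T^mu_nu in some basis (illustrative values).
T_mixed = np.arange(16.0).reshape(4, 4)

# Hypothetical invertible transformation matrix, playing the role of
# the Jacobian dx^{mu'}/dx^mu at a single point.
Lam = np.array([
    [2.0, 1.0, 0.0, 0.0],
    [0.0, 1.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, 3.0],
    [0.0, 0.0, 0.0, 1.0],
])
Lam_inv = np.linalg.inv(Lam)

# T^{a}_{b} (new) = Lam^a_m (Lam^{-1})^n_b T^m_n:
# a Jacobian for the upper index, an inverse Jacobian for the lower one.
T_new = np.einsum("am,nb,mn->ab", Lam, Lam_inv, T_mixed)

trace_old = np.einsum("mm->", T_mixed)
trace_new = np.einsum("aa->", T_new)

print(trace_old, trace_new)   # same scalar, although the components differ
```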
