Advanced Quantum Theory IV Michaelmas

Ben Hoare

These lecture notes accompany the first half of the Advanced Quantum Theory IV module held in the Michaelmas term of the 2025-2026 academic year as part of the Master of Mathematics degree at Durham University.

Please send comments and corrections to Ben Hoare at ben.hoare-at-durham.ac.uk.

Durham, 6 October 2025

last updated 12 December 2025

1 Introduction

1.1 Course details

This course provides an introduction to quantum field theory.

Acknowledgements. This version of Advanced Quantum Theory IV has been inherited from Silvia Nagy, Nabil Iqbal and Charlotte Sleight in reverse chronological order and largely follows the structure of their course. In turn, it also owes a lot to previous versions of the course by Marija Zamaklar and Kasper Peeters, the notes by D. Tong and the books by M. Peskin & D. Schroeder and M. Srednicki.

Outline.

  1. Overview of the course. The Lorentz group.

  2. The Lorentz and Poincaré groups. Lagrangian methods for classical field theory.

  3. Noether’s theorem. Hamiltonian formalism.

  4. Quantizing a free complex scalar field.

  5. The Hamiltonian and the energy of the vacuum. Single particle and multiparticle states.

  6. Propagators and causality. Feynman propagator.

  7. Interacting quantum field theories.

  8. Wick’s theorem.

  9. Feynman diagrams and Feynman rules.

  10. Scattering (non-examinable).

More details. More details about the course, including information about lectures, problem classes and homework assignments, can be found on Blackboard at

https://blackboard.durham.ac.uk/ultra/courses/_68483_1/outline

Additional resources. There are many great books and notes on Quantum Field Theory. Some of these include:

You can also find previous versions of the lecture notes for this course on Blackboard.

1.2 Why quantum field theory?

Quantum field theory is the mathematical framework behind some of our most well-tested and precise theories of the natural world. It is the language of nearly all modern research in quantum physics. The goal of this course is to introduce the principles and methods of quantum field theory and use these to calculate observable quantities. Quantum field theory can be challenging. This stems from the fact that we still do not have a rigorous understanding of how it works. As a result there are many different approaches to the same questions. While challenging, this is also exciting. Quantum field theory lies at the cutting edge of modern theoretical and mathematical physics.

Let us discuss why we need quantum field theory. Consider the following map of physical theories

The goal of quantum field theory is to describe the physics of the very small and very fast. Some examples include:

  1. Light. We are familiar with the idea of light as a wave but it can also behave like a particle. “Light particles” are called photons. It is natural to expect that these particles are “small” and move at the speed of light. Therefore, a quantum mechanical treatment of light needs quantum field theory.

  2. Particle colliders. These are machines, such as the LHC at CERN, that accelerate individual particles to very high speeds and then collide them. They are at the cutting edge of modern experimental physics and are key for discovering new particles and understanding different states of matter. To work out what happens in the collision processes requires quantum field theory.

  3. High temperature. Heating something up causes the particles in it to move around more quickly. If the speed of the particles comes close to the speed of light then quantum field theory is needed to describe the physics. This does not happen very often, but it does describe the very early universe, shortly after the “Big Bang”.

Quantum field theory is a theory that successfully combines quantum mechanics and special relativity. It is instructive to ask why it is not possible to simply make quantum mechanics relativistically invariant and why we need to introduce a new formalism. Consider a free particle with mass \(m\). In non-relativistic classical mechanics the energy of this particle is \[E_{\text{non-rel}} = \frac{\vec{p}^2}{2m} ~,\] where \(\vec{p}\) is the spatial momentum of the particle. For a relativistic particle this is corrected to \[E_{\text{rel}} = \sqrt{\vec{p}^2 c^2 + m^2 c^4} ~,\] where \(c\) is the speed of light. Expanding this equation for small momentum we find \[E_{\text{rel}} = mc^2 + \frac{\vec{p}^2}{2m} + \mathcal{O}(|\vec{p}|^4) ~.\] Setting \(\vec{p} = 0\), we see that \(E_{\text{rel}} = mc^2\) for a relativistic particle at rest. This says that a single particle has an intrinsic amount of energy \(mc^2\), which is called the rest energy. The non-relativistic kinetic energy appears as a correction to this rest energy. This tells us that particles always have energy, even if they are at rest. In turn, this suggests that we may be able to create particles by supplying enough energy. In particular, if we supply energy \(E = 2mc^2\) we might expect to be able to create a particle/antiparticle pair.
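As a quick numerical check of the small-momentum expansion, the following sketch compares the exact relativistic energy with the rest energy plus the non-relativistic kinetic term. It assumes units with \(m = c = 1\) and an illustrative momentum \(p = 0.01\); the function names are hypothetical and chosen only for this example. The leftover should be close to the next term of the Taylor series, \(-p^4/(8m^3c^2)\).

```python
import math

def relativistic_energy(p, m=1.0, c=1.0):
    """Exact energy E = sqrt(p^2 c^2 + m^2 c^4) of a free particle."""
    return math.sqrt(p**2 * c**2 + m**2 * c**4)

def low_momentum_energy(p, m=1.0, c=1.0):
    """Rest energy m c^2 plus the non-relativistic kinetic energy p^2 / (2m)."""
    return m * c**2 + p**2 / (2.0 * m)

p = 0.01  # illustrative small momentum in units where m = c = 1

# Difference between the exact energy and the two leading terms.
remainder = relativistic_energy(p) - low_momentum_energy(p)

# Next term of the Taylor expansion, -p^4 / (8 m^3 c^2), with m = c = 1.
predicted = -p**4 / 8.0

print(remainder, predicted)
```

The two printed numbers agree up to corrections of order \(p^6\), which is the sense in which the remainder is \(\mathcal{O}(|\vec{p}|^4)\).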

On the other hand, in quantum mechanics we know that quantities such as position, momentum, time and energy are uncertain and can fluctuate. More precisely, if we consider a state that is not an energy eigenstate then it will have a spread of energies \(\Delta E\). If \(\Delta E \approx 2mc^2\) then this suggests that the fluctuations of energy will be enough to create a particle/antiparticle pair. Therefore, if we are interested in combining quantum mechanics and special relativity into a single theory we will need to account for states where the particle number can change. This is not allowed in traditional quantum mechanics where we consider the Schrödinger equation for a fixed number of particles \(N\). Therefore, we need a new formalism in which the space of states allows the particle number to change. This is the formalism of quantum field theory.

Let us now work out an approximation for when we expect quantum field theory to be experimentally relevant. Consider a relativistic particle of mass \(m\) in a box of size \(L\). Since the position of the particle is known to an accuracy of at least \(L\), it follows from the Heisenberg uncertainty relation \[\begin{equation} \label{eq:heisenberguncertainty} \Delta q \Delta p \geq \frac{\hbar}{2} ~, \end{equation}\] that there is a lower bound on the uncertainty in the momentum \[\begin{equation} \label{eq:puncertainty} \Delta p \geq \frac{\hbar}{2L} ~. \end{equation}\] Assuming the particle is in a highly relativistic regime, i.e. \(|\vec{p}| \gg m c\), we have \[\begin{equation} \label{eq:relenergy} E_{\text{rel}} = \sqrt{\vec{p}^2 c^2 + m^2 c^4} \approx |\vec{p}| c ~. \end{equation}\] From eqs. \(\eqref{eq:puncertainty}\) and \(\eqref{eq:relenergy}\) we find the following uncertainty in the energy \[\Delta E \approx c \Delta p \geq \frac{c \hbar}{2L} ~.\] As we have argued, we expect the effects of changing particle number to become important when \(\Delta E\) exceeds \(2mc^2\). Comparing these two expressions and dropping numerical factors of order one, we see that if the particle is localised within a distance of order \[L_{\text{Compton}} \equiv \frac{\hbar}{mc} ~,\] then both quantum and relativistic effects become important. Note that \(L_{\text{Compton}}\) combines both \(\hbar\) (Planck’s constant) and \(c\) (the speed of light) and it indicates that if we try to confine a particle of mass \(m\) to a box of size smaller than \(L_{\text{Compton}}\) then we expect quantum fluctuations of the energy to create particle/antiparticle pairs from the vacuum. \(L_{\text{Compton}}\) is called the Compton wavelength and for an electron is approximately \(10^{-12}\,\text{m}\).
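To get a feel for the numbers, here is a minimal sketch that evaluates \(\hbar/(mc)\) for an electron in SI units. The constants are standard CODATA values typed in by hand, and the helper name `compton_length` is an assumption of this example. It also compares the lower bound \(c\hbar/(2L)\) at \(L = L_{\text{Compton}}\) with the pair-creation threshold \(2mc^2\), to illustrate that the two agree only up to a factor of order one.

```python
HBAR = 1.054571817e-34         # reduced Planck constant, J s
C = 299792458.0                # speed of light, m / s
M_ELECTRON = 9.1093837015e-31  # electron mass, kg

def compton_length(mass):
    """Length scale hbar / (m c) at which pair creation becomes relevant."""
    return HBAR / (mass * C)

L_electron = compton_length(M_ELECTRON)

# Lower bound on the energy uncertainty for a box of size L_electron,
# compared with the energy needed for a particle/antiparticle pair.
delta_E_bound = C * HBAR / (2.0 * L_electron)
pair_threshold = 2.0 * M_ELECTRON * C**2
ratio = delta_E_bound / pair_threshold

print(L_electron)  # roughly 3.9e-13 m
print(ratio)       # 1/4, a factor of order one
```

The value \(\hbar/(mc) \approx 3.9 \times 10^{-13}\,\text{m}\) is the reduced Compton wavelength; multiplying by \(2\pi\) gives \(h/(mc) \approx 2.4 \times 10^{-12}\,\text{m}\). Both are consistent with the order-of-magnitude estimate quoted above.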

Note that we have not shown that relativistic quantum mechanics does not work. It is an instructive exercise to try to build such a theory and see that it leads to inconsistencies such as negative probabilities and negative energies.

Before we start with a review of special relativity and Lorentz invariance, let us briefly describe some of the key features of quantum field theory. In classical mechanics, the degrees of freedom are real- or complex-valued functions of time \(q_a(t)\), \(p_a(t)\). Here \(a\) runs over the different degrees of freedom, e.g., \(a=1,2,3\) for a particle moving in \(\mathbb{R}^3\). In quantum mechanics, we canonically quantize these degrees of freedom following a well-established algorithm leading to operators \(\hat q_a\), \(\hat p_a\), which obey a Heisenberg uncertainty relation such as \(\eqref{eq:heisenberguncertainty}\).

In classical field theory, the basic degree of freedom is a field such as \(\phi(\vec{x},t)\). Here \(\phi(\vec{x},t)\) is a real scalar field and is a function from space-time to the real numbers \[\phi: \mathbb{R}^{1,3} \to \mathbb{R}~.\] We can have other types of field, e.g., a complex scalar field, or a vector field such as \(\vec{E}(\vec{x},t)\), the electric field from electrodynamics. The field \(\phi(\vec{x},t)\) has a conjugate momentum \(\pi(\vec{x},t)\). Canonically quantizing the classical fields \(\phi(\vec{x},t)\) and \(\pi(\vec{x},t)\) leads us to a quantum system with operators \(\hat \phi(\vec{x})\) and \(\hat \pi(\vec{x})\). This quantum system is a quantum field theory.

Since the classical fields \(\phi(\vec{x},t)\) and \(\pi(\vec{x},t)\) can take different values at each point in space, they contain infinitely many degrees of freedom. This leads to infinitely many quantum operators, one at each point in space for each field, in the quantum field theory. Understanding how to organise and work with these infinities is one of the main successes of quantum field theory.

The development of quantum field theory leads to a number of deep physical insights that we will understand in this course:

1.3 Conventions

In this course we use natural units, i.e. \(\hbar = c = 1\). Factors of \(\hbar\) and \(c\) can be reinstated by dimensional analysis. We use signature \((-,+,+,+)\) for \(\mathbb{R}^{1,3}\).

2 Special Relativity and Lorentz Invariance

In this course we will study quantum field theory on the flat space-time \(\mathbb{R}^{1,3}\). The symmetry, or isometry, group of flat Minkowski space is the Poincaré group, which includes both Lorentz transformations and translations. Therefore, the laws of physics that we derive from quantum field theory should be invariant under these symmetries.

2.1 Rotational invariance in 2 dimensions

Let us start with a simpler example and consider flat 2-dimensional Euclidean space \(\mathbb{R}^2\). We let \((x,y)\) denote coordinates on \(\mathbb{R}^2\). Consider a second coordinate system \((x',y')\) related to the first by \[\begin{equation} \label{eq:2x2rotations} \vec{x}' = \begin{pmatrix} x' \\ y' \end{pmatrix} = \begin{pmatrix} \cos\theta & \sin\theta \\ -\sin\theta & \cos\theta \end{pmatrix} \begin{pmatrix} x \\ y \end{pmatrix} = R(\theta) \vec x ~. \end{equation}\] The matrix \(R(\theta)\) is an element of the group of rotations \(\mathrm{SO}(2)\) of \(\mathbb{R}^2\).

While the individual components \(x'\) and \(y'\) change under the rotation, the length of the vector is invariant. This is geometrically clear, but we can also check it explicitly by computing \[\vec x'{}^2 = x'{}^2 + y'{}^2 = (x \cos \theta + y \sin \theta)^2 + (y \cos \theta - x \sin \theta)^2 = x^2 + y^2 = \vec x^2 ~.\] More generally, the dot product of two vectors \(\vec{v} = (v^x,v^y)\) and \(\vec{w} = (w^x,w^y)\) is invariant, i.e., \(\vec{v}'\cdot \vec{w}' = \vec{v}\cdot \vec{w}\). The dot product can be written as \(\vec{v}\cdot \vec{w} = \vec{v}^T \vec{w}\), hence the condition that the dot product is invariant can then be written as \[\vec{v}'{}^T \vec{w}' = \vec{v}^T R^T R \vec{w} = \vec{v}^T \vec{w} ~.\] Since this should hold for any two vectors \(\vec{v}\) and \(\vec{w}\) it follows that \(R^T R = I\) where \(I\) is the identity matrix. This is the definition of the orthogonal group \(\mathrm{O}(2)\) in 2 dimensions. Elements of the special orthogonal group \(\mathrm{SO}(2)\) also satisfy \(\det R = 1\). It is easy to check that the rotation matrix \(R(\theta)\) \(\eqref{eq:2x2rotations}\) satisfies both \(R^T R = I\) and \(\det R = 1\).
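The properties \(R^T R = I\), \(\det R = 1\) and the invariance of the dot product are easy to confirm numerically. The following is a small sketch using NumPy; the angle and the two vectors are arbitrary illustrative choices, and the helper `rotation` is a name introduced only for this example.

```python
import numpy as np

def rotation(theta):
    """Rotation matrix R(theta) in the convention used in the text."""
    return np.array([[np.cos(theta), np.sin(theta)],
                     [-np.sin(theta), np.cos(theta)]])

theta = 0.7                 # arbitrary illustrative angle
R = rotation(theta)
v = np.array([1.0, 2.0])    # arbitrary test vectors
w = np.array([-3.0, 0.5])

orthogonal = np.allclose(R.T @ R, np.eye(2))          # R^T R = I
unit_det = np.isclose(np.linalg.det(R), 1.0)          # det R = 1
dot_invariant = np.isclose((R @ v) @ (R @ w), v @ w)  # v'.w' = v.w

print(orthogonal, unit_det, dot_invariant)
```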

It will also be useful to recall the triangle inequality. If we have three vectors \(\vec{v}\), \(\vec{w}\) and \(\vec{x} = \vec{v} + \vec{w}\), then the length of \(\vec{x}\) is less than or equal to the length of \(\vec{v}\) plus the length of \(\vec{w}\) \[|\vec{x}| = |\vec{v} + \vec{w}| \leq |\vec{v}| + |\vec{w}| ~,\] where \(|\vec{v}| = \sqrt{\vec{v}^2}\) denotes the length of \(\vec{v}\).

2.2 Basic kinematics of special relativity

Einstein defined special relativity from the following two postulates:

Informally, if person A is at rest in an inertial frame and person B moves past at a constant speed, then they are also in an inertial frame.

We group space and time into space-time, which we denote \(\mathbb{R}^{1,3}\). Here, \(1\) denotes that we have one time direction, while \(3\) represents the three space directions. We label a point in space-time by \(x^\mu = (t,x^1,x^2,x^3) = (x^0,x^1,x^2,x^3)\). Depending on the context, we will use both \(t\) and \(x^0\) to denote time. We have also written \(x^\mu\) with a raised index. This is important since \(x_\mu\) with a lowered index will differ from \(x^\mu\) by a minus sign in the time component.

Now consider the following transformation to a new coordinate system \(x'{}^\mu\) \[\begin{equation} \label{eq:lorentzboost} x'{}^\mu = \begin{pmatrix} t' \\ x'{}^1 \\ x'{}^2 \\ x'{}^3 \end{pmatrix} =\begin{pmatrix} \gamma (t - v x^1) \\ \gamma (x^1 - vt) \\ x^2 \\ x^3 \end{pmatrix} ~, \qquad \gamma = \frac{1}{\sqrt{1-v^2}} ~, \end{equation}\] where \(v\in (-1,1)\). This is a Lorentz boost and is a transformation that mixes space and time. To understand what it means, consider a particle sitting at rest at the origin of the first coordinate system \[\begin{pmatrix} t \\ x^1 \\ x^2 \\ x^3 \end{pmatrix}_{\text{particle}} = \begin{pmatrix} \tau \\ 0 \\ 0 \\ 0 \end{pmatrix} ~,\] where \(\tau\in\mathbb{R}\) parametrises the worldline of the particle. In the second coordinate system we find \[\begin{pmatrix} t' \\ x'{}^1 \\ x'{}^2 \\ x'{}^3 \end{pmatrix}_{\text{particle}} = \begin{pmatrix} \gamma\tau \\ -v\gamma\tau \\ 0 \\ 0 \end{pmatrix} ~.\] This means that in this coordinate system the particle is moving since the position \(x'{}^1\) is not constant. The second set of coordinates corresponds to an inertial frame that is moving with speed \(v\) with respect to the original frame. If we consider two subsequent Lorentz boosts parametrised by speeds \(v_1\) and \(v_2\), we find that this is equivalent to a single Lorentz boost parametrised by speed \(\frac{v_1+v_2}{1+v_1v_2}\). We can also consider a particle moving at the speed of light, which is equal to \(1\) \[\begin{pmatrix} t \\ x^1 \\ x^2 \\ x^3 \end{pmatrix}_{\text{particle}} = \begin{pmatrix} \tau \\ \tau \\ 0 \\ 0 \end{pmatrix} ~.\] In the second coordinate system we find \[\begin{pmatrix} t' \\ x'{}^1 \\ x'{}^2 \\ x'{}^3 \end{pmatrix}_{\text{particle}} = \begin{pmatrix} \gamma(1-v)\tau \\ \gamma(1-v)\tau \\ 0 \\ 0 \end{pmatrix} ~.\] This means that the speed has not changed, hence the speed of light is the same in all inertial frames.
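The composition rule for boosts and the invariance of the speed of light can both be checked numerically. The sketch below writes the boost as a \(4\times 4\) matrix acting on \((t,x^1,x^2,x^3)\); the helper name `boost` and the speeds \(0.6\) and \(0.7\) are illustrative assumptions.

```python
import numpy as np

def boost(v):
    """Matrix of the Lorentz boost along x^1 with speed v (units with c = 1)."""
    gamma = 1.0 / np.sqrt(1.0 - v**2)
    return np.array([[gamma, -v * gamma, 0.0, 0.0],
                     [-v * gamma, gamma, 0.0, 0.0],
                     [0.0, 0.0, 1.0, 0.0],
                     [0.0, 0.0, 0.0, 1.0]])

v1, v2 = 0.6, 0.7  # illustrative speeds, both less than 1

# Two successive boosts versus one boost with the combined speed.
composed = boost(v2) @ boost(v1)
v_total = (v1 + v2) / (1.0 + v1 * v2)
same_boost = np.allclose(composed, boost(v_total))

# A light ray x^1 = t stays on x'^1 = t' after the boost.
light = boost(v1) @ np.array([1.0, 1.0, 0.0, 0.0])
light_speed = light[1] / light[0]

print(same_boost, v_total, light_speed)
```

Note that the combined speed stays below \(1\) even though \(v_1 + v_2 > 1\).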

Note that if we reinstate \(c\) by rescaling \(t\to c t\), \(t' \to c t'\) and \(v \to \frac{v}{c}\), the non-relativistic limit is given by taking \(c \to \infty\). Taking this limit the transformation \(\eqref{eq:lorentzboost}\) becomes \[x'{}^\mu = \begin{pmatrix} t' \\ x'{}^1 \\ x'{}^2 \\ x'{}^3 \end{pmatrix} =\begin{pmatrix} t \\ (x^1 - vt) \\ x^2 \\ x^3 \end{pmatrix} ~,\] which is a Galilean boost familiar from Newtonian mechanics. In this limit we now have \(v \in (-\infty,\infty)\) and speeds are additive, i.e., two Galilean boosts by speeds \(v_1\) and \(v_2\) are equivalent to a single Galilean boost by speed \(v_1 + v_2\).

In flat 2-dimensional Euclidean space we saw that certain quantities (lengths and dot products) are invariant under rotations. We now would like to construct similar invariants for Lorentz boosts. To do so we introduce the Minkowski metric on our flat 1+3-dimensional Lorentzian space-time \[\eta_{\mu\nu} = \begin{pmatrix} - 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix}{\vphantom{\begin{pmatrix}0\\0\\0\end{pmatrix}}}_{\!\mu\nu} ~.\] The space-time metric allows us to lower indices \[x_\mu = (x_0,x_1,x_2,x_3) = \eta_{\mu\nu} x^\nu = (-x^0,x^1,x^2,x^3) ~,\] where we use the Einstein summation convention for repeated indices, i.e., \[\eta_{\mu\nu} x^\nu \equiv \sum_\nu \eta_{\mu\nu} x^\nu ~.\] The vector with a raised index is known as the contravariant vector, while the vector with a lowered index is the covariant vector. Due to the difference in the sign of the first component they transform differently under Lorentz transformations.

We also introduce the inverse metric \[\eta^{\mu\nu} = \begin{pmatrix} - 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix}{\vphantom{\begin{pmatrix}0\\0\\0\end{pmatrix}}}^{\!\mu\nu} ~,\] which we can use to raise indices \[x^\mu = \eta^{\mu\nu} x_\nu ~.\] In the case of flat 1+3-dimensional Lorentzian space-time the matrix form of the metric and its inverse are the same as we have written them. This is not always the case and is a feature of the simplicity of flat space-time and our choice of coordinates. The statement that \(\eta^{\mu\nu}\) is the inverse of \(\eta_{\mu\nu}\) can be written as \[\eta^{\mu\nu}\eta_{\nu\rho} = \delta^\mu_\rho ~,\] where \(\delta^\mu_\rho\) is the identity matrix \[\delta^\mu_\rho = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix}{\vphantom{\begin{pmatrix}0\\0\\0\end{pmatrix}}}^{\!\mu}_{\!\rho} ~.\]

We can now construct a scalar product between two vectors \(x^\mu\) and \(y^\mu\) \[x \cdot y = x^\mu y^\nu \eta_{\mu\nu} = x^T \eta y = \begin{pmatrix}x^0&x^1&x^2&x^3\end{pmatrix} \begin{pmatrix} - 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix}\begin{pmatrix}y^0\\y^1\\y^2\\y^3\end{pmatrix} = -x^0y^0 + x^1y^1 + x^2y^2 + x^3y^3 ~.\] This is similar to the familiar dot product on Euclidean space, except that there is an extra minus sign appearing in front of the product of the time components of \(x^\mu\) and \(y^\mu\). This sign is important and can be understood as distinguishing the time direction from the space directions and ultimately is the origin of the different nature of time and space. Note that there are many different ways of writing the scalar product by raising and lowering indices and renaming dummy indices, i.e., indices that are summed over. For example, \[x\cdot y = x^\mu y^\nu \eta_{\mu\nu} = x^\mu y_\mu = x_\mu y^\mu ~.\]

Now that we have a scalar product between two vectors, we can ask what is the most general transformation of the coordinates that leaves this invariant. In other words, what is the most general \(4\times 4\) matrix \(\Lambda\) such that \[x' \cdot y' = (x')^\mu (y')_\mu = x^\mu y_\mu = x \cdot y ~, \qquad (x')^\mu = \Lambda^\mu{}_\nu x^\nu~, \qquad (y')^\mu = \Lambda^\mu{}_\nu y^\nu ~.\] Substituting in, we see that we require that \[x^T \Lambda^T \eta \Lambda y = x^T \eta y ~,\] which implies \[\begin{equation} \label{eq:lorentzrelation} \Lambda^T \eta \Lambda = \eta ~, \qquad \Lambda^\mu{}_\nu \eta_{\mu\rho}\Lambda^\rho{}_\sigma = \eta_{\nu\sigma} ~, \end{equation}\] where we have written the relation both as a matrix equation and explicitly with indices. This relation can be understood as saying that the transformations we are interested in leave the Minkowski metric invariant. Matrices \(\Lambda\) that satisfy this property are elements of a Lie group denoted \(\mathrm{O}(1,3)\), which is commonly known as the Lorentz group.

Finally, we check that the Lorentz boost in eq. \(\eqref{eq:lorentzboost}\) satisfies the property \(\eqref{eq:lorentzrelation}\). From eq. \(\eqref{eq:lorentzboost}\) we read off \[\begin{equation} \label{eq:lorentzboostmatrix} \Lambda^\mu{}_\nu = \begin{pmatrix} \gamma & -v\gamma & 0 & 0 \\ -v\gamma & \gamma & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix}{\vphantom{\begin{pmatrix}0\\0\\0\end{pmatrix}}}^{\!\mu}{\vphantom{\begin{pmatrix}0\\0\\0\end{pmatrix}}}_{\!\nu} ~. \end{equation}\] Substituting into the left-hand side of \(\eqref{eq:lorentzrelation}\), we find \[\begin{split} \begin{pmatrix} \gamma & -v\gamma & 0 & 0 \\ -v\gamma & \gamma & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix} & \begin{pmatrix} -1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix} \begin{pmatrix} \gamma & -v\gamma & 0 & 0 \\ -v\gamma & \gamma & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix} \\ & = \begin{pmatrix} -\gamma^2(1-v^2) & 0 & 0 & 0 \\ 0 & \gamma^2(1-v^2) & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix} = \begin{pmatrix} -1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix} ~, \end{split}\] as required.
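The same check can be carried out numerically, together with the invariance of the scalar product \(x\cdot y = x_\mu y^\mu\). The following sketch assumes a boost speed \(v = 0.8\) and two arbitrary test vectors; `boost_x` is a helper name introduced for this example.

```python
import numpy as np

eta = np.diag([-1.0, 1.0, 1.0, 1.0])  # Minkowski metric, signature (-,+,+,+)

def boost_x(v):
    """Boost along x^1 with speed v (units with c = 1)."""
    gamma = 1.0 / np.sqrt(1.0 - v**2)
    Lam = np.eye(4)
    Lam[0, 0] = Lam[1, 1] = gamma
    Lam[0, 1] = Lam[1, 0] = -v * gamma
    return Lam

Lam = boost_x(0.8)                   # illustrative boost speed
x = np.array([2.0, 1.0, -1.0, 0.5])  # arbitrary contravariant vectors
y = np.array([1.0, 3.0, 0.0, 2.0])

x_lower = eta @ x                    # x_mu = eta_{mu nu} x^nu
dot_before = x_lower @ y             # x_mu y^mu
dot_after = (eta @ (Lam @ x)) @ (Lam @ y)

preserves_metric = np.allclose(Lam.T @ eta @ Lam, eta)

print(preserves_metric, dot_before, dot_after)
```

Here `x_lower` differs from `x` only in the sign of its first component, as expected from lowering an index with \(\eta_{\mu\nu}\).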

2.3 Group theory of the Lorentz group

The Lorentz boost in eq. \(\eqref{eq:lorentzboostmatrix}\) is an example of a Lorentz transformation. Let us now determine the full set of Lorentz transformations that satisfy the property \(\eqref{eq:lorentzrelation}\). If we take the determinant of eq. \(\eqref{eq:lorentzrelation}\) we find \[(\det \Lambda)^2 \det\eta = \det\eta \qquad \Rightarrow \qquad \det \Lambda = \pm 1 ~.\] We can also set \(\nu = \sigma = 0\) in \(\eqref{eq:lorentzrelation}\) to give \[-(\Lambda^0{}_0)^2 + (\Lambda^1{}_0)^2 + (\Lambda^2{}_0)^2 + (\Lambda^3{}_0)^2 = -1 ~,\] which, noting that we are considering real matrices \(\Lambda\) to ensure the transformed coordinates are real, implies that \[(\Lambda^0{}_0)^2 \geq 1 \qquad \Rightarrow \qquad \Lambda^0{}_0 \geq 1 \text{~or~} \Lambda^0{}_0 \leq -1 ~.\] Therefore, the set of all \(\Lambda\) can be split into four disjoint sets labelled by the sign of their determinant and the sign of \(\Lambda^0{}_0\).

We start by considering the set with \(\det \Lambda = 1\) and \(\Lambda^0{}_0 \geq 1\), which forms a subgroup of the Lorentz group. This subgroup is denoted \(\mathrm{SO}^+(1,3)\). This is the subgroup that is continuously connected to the identity, hence it is useful to consider the corresponding Lie algebra that describes the infinitesimal behaviour of the Lie group. Elements of the Lie algebra are often referred to as generators of the Lie group. To construct the Lie algebra, we consider a Lorentz transformation that takes the following form \[\begin{equation} \label{eq:identitycomponent} \Lambda = \exp (\mathcal{M}) ~, \end{equation}\] where \(\mathcal{M}\) is a \(4\times 4\) matrix. To determine the set of all possible elements \(\mathcal{M}\) of the Lie algebra, we assume \(\mathcal{M}\) is small and expand the exponential in eq. \(\eqref{eq:identitycomponent}\) \[\Lambda = I + \mathcal{M} + \mathcal{O}(\mathcal{M}^2) ~, \qquad \Lambda^\mu{}_\nu = \delta^\mu_\nu + \mathcal{M}^\mu{}_\nu + \mathcal{O}(\mathcal{M}^2) ~.\] Now substituting this into the group property \(\eqref{eq:lorentzrelation}\) we find \[\delta^\mu_\nu \eta_{\mu\rho}\delta^\rho_\sigma + \mathcal{M}^\mu{}_\nu \eta_{\mu\rho}\delta^\rho_\sigma + \delta^\mu_\nu \eta_{\mu\rho} \mathcal{M}^\rho{}_\sigma = \eta_{\nu\sigma} ~,\] where we have dropped terms of \(\mathcal{O}(\mathcal{M}^2)\). Simplifying the first term on the left-hand side, we find that it cancels with the single term on the right-hand side. Therefore, we are left with \[\mathcal{M}_{\sigma\nu} + \mathcal{M}_{\nu\sigma} = 0 ~,\] i.e., the matrix \(\mathcal{M}\) with lowered indices is antisymmetric. There are six linearly independent \(4 \times 4\) antisymmetric matrices. We consider the following basis of such matrices \[\begin{equation} \label{eq:so13down} (M^{\rho\sigma})_{\mu\nu} = \delta^\sigma_\mu\delta^\rho_\nu - \delta^\rho_\mu\delta^\sigma _\nu ~. \end{equation}\] In this expression the indices \(\rho\) and \(\sigma\) label the elements of the basis: for each choice of \(\rho = 0,1,2,3\) and \(\sigma = 0,1,2,3\), eq. \(\eqref{eq:so13down}\) defines a \(4\times4\) matrix. Since the right-hand side is antisymmetric in these indices, i.e., \(M^{\rho\sigma} = -M^{\sigma\rho}\), only six of the 16 matrices are linearly independent, as expected. The indices \(\mu\) and \(\nu\) label the components of the \(4\times 4\) matrices. Since the right-hand side is also antisymmetric in these indices, we see that these matrices are antisymmetric as required. Finally, we can raise the index \(\mu\) on the basis \(\eqref{eq:so13down}\) to find a basis for the generators of the Lorentz group \[\begin{equation} \label{eq:so13generators} (M^{\rho\sigma})^\mu{}_\nu = \eta^{\sigma\mu}\delta^\rho_\nu - \eta^{\rho\mu}\delta^\sigma _\nu ~. \end{equation}\]

We are now in a position to write down the most general Lorentz transformation connected to the identity. A general element of the Lie algebra is given by an arbitrary linear combination of all the elements of the basis of generators \[\mathcal{M}(\omega_{\rho\sigma}) = \omega_{\rho\sigma}M^{\rho \sigma} ~.\] A general Lorentz transformation is then given by the exponential of this expression, i.e., \[\Lambda(\omega_{\rho\sigma}) = \exp(\omega_{\rho\sigma}M^{\rho \sigma}) ~.\] The antisymmetry of \(M^{\rho \sigma}\) means that we can also take \(\omega_{\rho\sigma}\) to be antisymmetric, i.e. \(\omega_{\rho\sigma} = -\omega_{\sigma\rho}\), without loss of generality. Note that in these expressions we have suppressed the matrix indices \(\mu\) and \(\nu\).

Since Lorentz transformations form a group, the product of two Lorentz transformations is also a Lorentz transformation. How Lorentz transformations are composed with each other is encoded in the commutator of Lie algebra generators, e.g., through the Baker–Campbell–Hausdorff formula. Direct computation using the definition \(\eqref{eq:so13generators}\) shows that \[\phantom{}[M^{\mu\nu}, M^{\rho\sigma}] = -\eta^{\nu\rho}M^{\mu\sigma} + \eta^{\mu\rho} M^{\nu\sigma} + \eta^{\nu\sigma} M^{\mu\rho} - \eta^{\mu\sigma}M^{\nu\rho} ~,\] where again we have suppressed the matrix indices. This commutator defines the Lorentz algebra.
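The direct computation can also be delegated to a computer. The sketch below builds the matrices \((M^{\rho\sigma})^\mu{}_\nu\) from eq. \(\eqref{eq:so13generators}\) and checks the commutation relation for all \(4^4\) index combinations; the helper names `generator` and `commutator` are assumptions of this example.

```python
import itertools
import numpy as np

# Inverse Minkowski metric eta^{mu nu}; it has the same entries as eta_{mu nu}.
eta = np.diag([-1.0, 1.0, 1.0, 1.0])

def generator(rho, sigma):
    """(M^{rho sigma})^mu_nu = eta^{sigma mu} delta^rho_nu - eta^{rho mu} delta^sigma_nu."""
    M = np.zeros((4, 4))
    for mu in range(4):      # row index (raised)
        for nu in range(4):  # column index (lowered)
            M[mu, nu] = eta[sigma, mu] * (rho == nu) - eta[rho, mu] * (sigma == nu)
    return M

def commutator(A, B):
    return A @ B - B @ A

algebra_holds = True
for mu, nu, rho, sigma in itertools.product(range(4), repeat=4):
    lhs = commutator(generator(mu, nu), generator(rho, sigma))
    rhs = (-eta[nu, rho] * generator(mu, sigma)
           + eta[mu, rho] * generator(nu, sigma)
           + eta[nu, sigma] * generator(mu, rho)
           - eta[mu, sigma] * generator(nu, rho))
    algebra_holds = algebra_holds and bool(np.allclose(lhs, rhs))

print(algebra_holds)
```

As a single instance, the check includes \([M^{01}, M^{12}] = -M^{02}\): the commutator of a boost and a rotation is another boost.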

Let us now consider some specific examples of Lorentz transformations. First we look at the generators \(M^{0i}\). From eq. \(\eqref{eq:so13generators}\) we find that \[M^{01} = \begin{pmatrix} 0 & 1 & 0 & 0 \\ 1 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \end{pmatrix} ~, \qquad M^{02} = \begin{pmatrix} 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 0 \\ 1 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \end{pmatrix} ~, \qquad M^{03} = \begin{pmatrix} 0 & 0 & 0 & 1 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 1 & 0 & 0 & 0 \end{pmatrix} ~.\] We see that the labelling of the basis denotes which two dimensions are being transformed. The generators \(M^{0i}\) correspond to boosts in the \(x^i\) direction. Focusing on \(M^{01}\) (\(M^{02}\) and \(M^{03}\) behave similarly) we have (\(\omega_{01} = \frac12 \omega\), \(\omega_{02} = \omega_{03} = \omega_{12} = \omega_{13} = \omega_{23} = 0\)) \[\Lambda_{B,01} = \exp(\omega M^{01}) = \begin{pmatrix} \cosh \omega & \sinh \omega & 0 & 0 \\ \sinh \omega & \cosh \omega & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix} ~.\] This agrees with eq. \(\eqref{eq:lorentzboostmatrix}\) if we relate the boost velocity \(v\) to the Lie algebra parameter \(\omega\) as \[\sinh\omega = \frac{-v}{\sqrt{1-v^2}} ~.\] The Lie algebra parameter \(\omega\) is often called the rapidity.
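To see the exponential at work, the following sketch sums the power series of \(\exp(\omega M^{01})\) and compares the result with the boost matrix. It assumes an illustrative speed \(v = 0.6\) and uses a truncated Taylor series (the helper `matrix_exp` is written for this example and is not a library routine).

```python
import numpy as np

def matrix_exp(A, terms=40):
    """Truncated power series exp(A) = sum_n A^n / n!, adequate for small matrices."""
    result = np.eye(A.shape[0])
    term = np.eye(A.shape[0])
    for n in range(1, terms):
        term = term @ A / n  # term is now A^n / n!
        result = result + term
    return result

# Boost generator M^{01}.
M01 = np.zeros((4, 4))
M01[0, 1] = M01[1, 0] = 1.0

v = 0.6                                # illustrative boost speed
gamma = 1.0 / np.sqrt(1.0 - v**2)
omega = np.arcsinh(-v * gamma)         # rapidity from sinh(omega) = -v / sqrt(1 - v^2)

Lam = matrix_exp(omega * M01)

expected = np.eye(4)
expected[0, 0] = expected[1, 1] = gamma
expected[0, 1] = expected[1, 0] = -v * gamma

matches_boost = np.allclose(Lam, expected)
print(matches_boost, omega)
```

For \(v = 0.6\) one has \(\gamma = 1.25\) and \(\sinh\omega = -0.75\), so \(\cosh\omega = \gamma\) as required.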

Now let us look at the generators \(M^{ij}\). From eq. \(\eqref{eq:so13generators}\) we find that \[M^{12} = \begin{pmatrix} 0 & 0 & 0 & 0 \\ 0 & 0 & -1 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 0 \end{pmatrix} ~, \qquad M^{13} = \begin{pmatrix} 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & -1 \\ 0 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \end{pmatrix} ~, \qquad M^{23} = \begin{pmatrix} 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & -1 \\ 0 & 0 & 1 & 0 \end{pmatrix} ~.\] Focusing on \(M^{12}\) (\(M^{13}\) and \(M^{23}\) behave similarly) we have (\(\omega_{12} = \frac12 \omega\), \(\omega_{01} = \omega_{02} = \omega_{03} = \omega_{13} = \omega_{23} = 0\)) \[\Lambda_{R,12} = \exp(\omega M^{12}) = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & \cos\omega & -\sin\omega & 0 \\ 0 & \sin\omega & \cos\omega & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix} ~.\] This takes the form of an \(\mathrm{SO}(2)\) rotation of the \(x^1\) and \(x^2\) directions. Together with the rotations generated by \(M^{13}\) and \(M^{23}\) this gives the group of spatial rotations \(\mathrm{SO}(3)\), which is a subgroup of \(\mathrm{SO}^+(1,3)\).

Finally, let us return to those transformations that do not have \(\det\Lambda = 1\) and \(\Lambda^0{}_0 \geq 1\). Two important examples are time reversal, which acts as \(T: (x^0,x^1,x^2,x^3) \to (-x^0,x^1,x^2,x^3)\), and parity, which acts as \(P: (x^0,x^1,x^2,x^3) \to (x^0,-x^1,-x^2,-x^3)\). These change the direction of time and of all space directions respectively. In matrix form they are given by \[\Lambda_T = \begin{pmatrix} -1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix} ~, \qquad \Lambda_P = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & -1 & 0 & 0 \\ 0 & 0 & -1 & 0 \\ 0 & 0 & 0 & -1 \end{pmatrix} ~.\] We can then recover the full Lorentz group \(\mathrm{O}(1,3)\) by composing time reversal, parity and their composition with elements of the identity component \(\mathrm{SO}^+(1,3)\) discussed above.

2.4 Translations and the Poincaré group

Having understood Lorentz transformations, which are the generalisation of rotations in flat Euclidean space to a relativistic flat Lorentzian space-time, we now turn to space-time translations, i.e., transformations of the form \[x^\mu \to x'{}^\mu = x^\mu - a^\mu ~,\] where \(a^\mu\) is a constant vector. These are symmetries of (sufficiently small) regions of empty space-time. The time component of \(a^\mu\) corresponds to a translation of the origin of time, while the space components correspond to translations of the origin of space. This tells us that the laws of physics are the same now and in the future and are the same no matter where we are located.

The combination of space-time translations with Lorentz transformations results in a Lie group called the Poincaré group. The Poincaré group acts on the coordinates of space-time as \[\begin{equation} \label{eq:Poincaretransformations} x^\mu \to x'{}^\mu = \Lambda^\mu{}_\nu x^\nu - a^\mu ~. \end{equation}\] It has ten generators, six Lorentz generators \(M^{\rho\sigma}\) and four translations \(P^\mu\).

The translations do not commute with the Lorentz transformations and the full structure of the commutation relations of the corresponding Lie algebra is given by \[\begin{split} &\phantom{}[M^{\mu\nu}, M^{\rho\sigma}] = -\eta^{\nu\rho}M^{\mu\sigma} + \eta^{\mu\rho} M^{\nu\sigma} + \eta^{\nu\sigma} M^{\mu\rho} - \eta^{\mu\sigma}M^{\nu\rho} ~, \\ &\phantom{}[P^\mu,M^{\nu\rho}] = -\eta^{\mu\nu}P^\rho + \eta^{\mu\rho}P^\nu ~, \qquad [P^\mu,P^\nu] = 0 ~. \end{split}\] These are the commutation relations of the Poincaré algebra.
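At the level of finite transformations, the fact that translations and Lorentz transformations do not commute can be illustrated directly from \(x' = \Lambda x - a\). The sketch below is a minimal example with arbitrarily chosen boosts, translation vectors and test point; `boost_x` and `poincare` are hypothetical helper names. It checks that applying \((\Lambda_1, a_1)\) followed by \((\Lambda_2, a_2)\) is the single transformation \((\Lambda_2\Lambda_1, \Lambda_2 a_1 + a_2)\), and that a boost followed by a translation differs from the same operations in the opposite order.

```python
import numpy as np

def boost_x(v):
    """Boost along x^1 with speed v (units with c = 1)."""
    gamma = 1.0 / np.sqrt(1.0 - v**2)
    Lam = np.eye(4)
    Lam[0, 0] = Lam[1, 1] = gamma
    Lam[0, 1] = Lam[1, 0] = -v * gamma
    return Lam

def poincare(Lam, a, x):
    """Poincare transformation x -> Lam x - a."""
    return Lam @ x - a

# Arbitrary illustrative data.
Lam1, a1 = boost_x(0.5), np.array([1.0, 0.0, 2.0, 0.0])
Lam2, a2 = boost_x(-0.3), np.array([0.0, 1.0, 0.0, 0.0])
x = np.array([0.5, -1.0, 0.0, 3.0])

# Composition: Lam2 (Lam1 x - a1) - a2 = (Lam2 Lam1) x - (Lam2 a1 + a2).
two_steps = poincare(Lam2, a2, poincare(Lam1, a1, x))
one_step = poincare(Lam2 @ Lam1, Lam2 @ a1 + a2, x)
composition_ok = np.allclose(two_steps, one_step)

# Boost then translate versus translate then boost.
identity, zero = np.eye(4), np.zeros(4)
boost_then_translate = poincare(identity, a1, poincare(Lam1, zero, x))
translate_then_boost = poincare(Lam1, zero, poincare(identity, a1, x))
order_matters = not np.allclose(boost_then_translate, translate_then_boost)

print(composition_ok, order_matters)
```

The translation vector is itself acted on by the Lorentz transformation in the composition law, which is the group-level counterpart of the non-vanishing commutator \([P^\mu, M^{\nu\rho}]\).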

2.5 The twin paradox

We have seen that the symmetries of flat 1+3-dimensional Lorentzian space-time \(\mathbb{R}^{1,3}\) with the Minkowski metric consist of Lorentz transformations and translations. Together these form the Poincaré group. Working with these transformations is not that different to working with ordinary rotations of flat Euclidean space, except that we need to keep track of signs. The signs turn out to be important: they have significant consequences and capture the fact that time is different from space.

As an example, let us consider the twin paradox.

Here we have one twin that stays at home and follows the space-time trajectory OAB. The second twin first travels along OC and then CB. When the two twins meet again, the travelling twin is younger. To see this we note that the invariant notion of time elapsed along each trajectory is measured with the Minkowski metric. Therefore, if the stationary twin OAB measures time \(\Delta t\), the travelling twin OCB will measure time \(2\sqrt{(\frac{\Delta t}{2})^2 - (\Delta x)^2} < \Delta t\). The twin paradox is not really a paradox but is a consequence of the fact that we use the Minkowski metric to measure the invariant, or proper, time and that the travelling twin accelerates between two inertial frames.
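As a worked numerical example, the sketch below evaluates both proper times. It assumes, consistently with the formula above, that the turning point C lies at spatial distance \(\Delta x\) from the stay-at-home twin at time \(\Delta t/2\); the values \(\Delta t = 10\) and \(\Delta x = 4\) (in units with \(c = 1\)) are illustrative, and `proper_time` is a helper name introduced for this example.

```python
import math

def proper_time(dt, dx):
    """Proper time along a straight timelike segment with coordinate
    separations dt and dx (units with c = 1)."""
    return math.sqrt(dt**2 - dx**2)

delta_t = 10.0  # time measured by the stay-at-home twin (illustrative)
delta_x = 4.0   # distance of the turning point C (illustrative)

# Stay-at-home twin: one segment with no spatial displacement.
home_time = proper_time(delta_t, 0.0)

# Travelling twin: two segments, each covering dt / 2 and dx.
traveller_time = 2.0 * proper_time(delta_t / 2.0, delta_x)

print(home_time, traveller_time)  # 10.0 and 6.0
```

In contrast to the Euclidean triangle inequality, the path with the detour accumulates less proper time than the straight path.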

2.6 Realising Poincaré transformations on fields

Now we have understood how Poincaré transformations act on the coordinates of space-time \(x^\mu\), see eq. \(\eqref{eq:Poincaretransformations}\), which we can write in index-free notation as \(x' = \Lambda x - a\), we turn to the action of Poincaré transformations on fields. Fields are functions of space-time and are the building blocks of classical and quantum field theory.

2.6.1 Scalar fields

The simplest example of a field is a real scalar field \(\phi(x)\), which is a map from \(\mathbb{R}^{1,3}\) to \(\mathbb{R}\) \[\phi: \mathbb{R}^{1,3} \to \mathbb{R}~.\] Since \(\phi(x)\) depends on \(x\) it will also transform under Poincaré transformations and we say that the field transforms in a representation of the Poincaré group. When we go from the first coordinate system \(x\) to the second \(x'\) we find a transformed scalar field \(\phi'(x')\) that is related to the original scalar field by \[\begin{equation} \label{eq:transformedphi} \phi'(x') = \phi(x) ~. \end{equation}\]

First considering just translations, i.e., \[x'{}^\mu = x^\mu - a^\mu ~,\] we have \[\phi'(x') = \phi'(x-a) = \phi(x) ~,\] where the last equality follows from the definition \(\eqref{eq:transformedphi}\). This in turn implies \[\phi'(x) = \phi(x + a) ~.\] Taking \(a\) to be small we can consider the infinitesimal transformation and expand the right-hand side in powers of \(a\) \[\phi'(x) = \phi(x) + a^\mu \frac{\partial}{\partial x^\mu} \phi(x) + \mathcal{O}(a^2) ~.\] Since \(\frac{\partial x^\nu}{\partial x^\mu} = \delta_\mu^\nu\) it follows that \(\frac{\partial}{\partial x^\mu}\) transforms as an object with a lowered index under Lorentz transformations. Denoting \(\frac{\partial}{\partial x^\mu} = \partial_\mu\), we say that under an infinitesimal translation the scalar field transforms as \[\phi'(x) = \phi(x) + \delta\phi(x) + \mathcal{O}(a^2) ~, \qquad \delta\phi(x) = a^\mu\partial_\mu \phi(x) ~.\] Returning to finite transformations we recall that any transformation connected to the identity can be written as the exponential of an algebra element. Therefore, there should exist a \(P_\mu\) such that \[\phi'(x) = \phi(x+a) = \exp(a^\mu P_\mu) \phi(x) ~.\] Expanding both sides of the second equality in powers of \(a\) we can read off that \[\begin{equation} \label{eq:pdef} P_\mu = \partial_\mu ~. \end{equation}\] That is, the generator of translations on scalar fields is the derivative with respect to the coordinates on space-time.
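
The statement \(\phi(x+a) = \exp(a^\mu P_\mu)\phi(x)\) with \(P_\mu = \partial_\mu\) can be illustrated in one dimension with a polynomial, for which the exponential series terminates. The following is a minimal sketch under that simplifying assumption; the polynomial and the values of \(x\) and \(a\) are arbitrary.

```python
from math import factorial


def derivative(coeffs):
    """Derivative of c0 + c1 x + c2 x^2 + ... given as [c0, c1, c2, ...]."""
    return [n * c for n, c in enumerate(coeffs)][1:]


def evaluate(coeffs, x):
    """Value of the polynomial at x."""
    return sum(c * x ** n for n, c in enumerate(coeffs))


def exp_translation(coeffs, a, x):
    """Apply exp(a d/dx) to the polynomial and evaluate the result at x.
    The sum stops once the repeated derivative is identically zero."""
    total, term, n = 0.0, list(coeffs), 0
    while term:
        total += a ** n / factorial(n) * evaluate(term, x)
        term = derivative(term)
        n += 1
    return total


phi = [1.0, -2.0, 0.5, 3.0]  # 1 - 2x + x^2/2 + 3x^3
x, a = 0.7, 0.3
lhs = exp_translation(phi, a, x)  # exp(a d/dx) phi evaluated at x
rhs = evaluate(phi, x + a)        # phi evaluated at x + a
print(lhs, rhs)
```

The two printed numbers agree up to rounding, which is Taylor's theorem written in the language of generators.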

Now let us turn to Lorentz transformations, for which the transformation of \(x\) is now given by \[x'{}^\mu = \Lambda^\mu{}_\nu x^\nu = \exp(\omega_{\rho\sigma}M^{\rho\sigma})^\mu{}_\nu x^\nu ~.\] Recalling the definition of the transformed scalar field \(\eqref{eq:transformedphi}\) we have \[\phi'(x') = \phi'(\Lambda x) = \phi(x) ~,\] which in turn implies \[\begin{equation} \label{eq:phiplorentz1} \phi'(x) = \phi(\Lambda^{-1} x) = \phi(\exp(-\omega_{\rho\sigma}M^{\rho\sigma}) x) ~. \end{equation}\] We would now like to find a set of generators \(L^{\rho\sigma}\) such that \[\begin{equation} \label{eq:phiplorentz2} \phi'(x) = \exp(\omega_{\rho\sigma}L^{\rho\sigma}) \phi(x) ~. \end{equation}\] Taking \(\omega\) to be small and considering infinitesimal transformations, we can expand the right-hand sides of eq. \(\eqref{eq:phiplorentz1}\) and eq. \(\eqref{eq:phiplorentz2}\) in powers of \(\omega\) and equate them to give \[\phi(x) - (\omega_{\rho\sigma}M^{\rho\sigma})^\mu{}_\nu x^\nu\partial_\mu \phi(x) + \mathcal{O}(\omega^2)= \phi(x) + \omega_{\rho\sigma}L^{\rho\sigma} \phi(x) + \mathcal{O}(\omega^2) ~.\] Substituting in \((M^{\rho\sigma})^\mu{}_\nu = \eta^{\sigma\mu}\delta^\rho_\nu - \eta^{\rho\mu}\delta^\sigma_\nu\) from eq. \(\eqref{eq:so13generators}\) and comparing coefficients of \(\omega_{\rho\sigma}\) we find \[\begin{equation} \label{eq:ldef} L^{\rho\sigma} = x^\sigma \partial^\rho - x^\rho \partial^\sigma ~. \end{equation}\] Therefore, under an infinitesimal Lorentz transformation the scalar field transforms as \[\phi'(x) = \phi(x) + \delta\phi(x) + \mathcal{O}(\omega^2) ~, \qquad \delta\phi(x) = \omega_{\rho\sigma} (x^\sigma \partial^\rho - x^\rho \partial^\sigma) \phi(x) ~.\] As expected, the change in the field depends on how far we are from the origin of the transformation. Note the difference between the matrices \(M^{\rho\sigma}\) and the differential operators \(L^{\rho\sigma}\).
Both satisfy the Lorentz algebra but the former realise the action on the space-time coordinates, while the latter realise the action on scalar fields. We say that they realise different representations of the Lie algebra and Lie group, and the space-time coordinates and scalar fields transform in these different representations. We also say that the space-time coordinates and scalar fields themselves form different representations. When we talk about representations, whether we are referring to generators or the objects that transform should be understood from context.

Finally, it is possible to explicitly check that the differential operators \(L^{\rho\sigma}\) and \(P_\mu\) defined in eq. \(\eqref{eq:ldef}\) and eq. \(\eqref{eq:pdef}\) satisfy the Poincaré algebra \[\begin{split} &\phantom{}[L^{\mu\nu}, L^{\rho\sigma}] = -\eta^{\nu\rho}L^{\mu\sigma} + \eta^{\mu\rho} L^{\nu\sigma} + \eta^{\nu\sigma} L^{\mu\rho} - \eta^{\mu\sigma}L^{\nu\rho} ~, \\ &\phantom{}[P^\mu,L^{\nu\rho}] = -\eta^{\mu\nu}P^\rho + \eta^{\mu\rho}P^\nu ~, \qquad [P^\mu,P^\nu] = 0 ~. \end{split}\]
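
As an illustration of such a check, consider the mixed commutator acting on an arbitrary test function. Writing \(P^\mu = \partial^\mu\) and, relabelling the indices in eq. \(\eqref{eq:ldef}\), \(L^{\nu\rho} = x^\rho\partial^\nu - x^\nu\partial^\rho\), the only input needed is \([\partial^\mu, x^\rho] = \eta^{\mu\sigma}\partial_\sigma x^\rho = \eta^{\mu\rho}\). This gives \[\phantom{}[P^\mu, L^{\nu\rho}] = [\partial^\mu, x^\rho]\partial^\nu - [\partial^\mu, x^\nu]\partial^\rho = \eta^{\mu\rho}\partial^\nu - \eta^{\mu\nu}\partial^\rho = -\eta^{\mu\nu}P^\rho + \eta^{\mu\rho}P^\nu ~,\] in agreement with the second relation. The commutator \([P^\mu,P^\nu] = 0\) follows since partial derivatives commute, and the first relation can be checked along the same lines with more bookkeeping.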

2.6.2 Vector fields

There are many different types of fields. Consider a vector field \(A^\mu(x)\), which will be important for studying gauge theories, as another example. The vector field is different from a scalar field since it carries an extra index. This means that the transformation law is now given by \[A'{}^\mu(x') = \Lambda^\mu{}_\nu A^\nu(x) ~.\] Going through the same analysis as for the scalar field, we find that the form of the generators acting on vector fields is \[L^{\rho\sigma} = M^{\rho\sigma} + x^\sigma \partial^\rho - x^\rho \partial^\sigma ~.\] The first term acts on \(A^\mu(x)\) at a single point in space-time treating it as a vector, while the final two terms act on the space-time coordinate \(x\).

2.6.3 Spinor fields

Spinor fields are another important type of field. Their transformation law is more complicated and requires introducing the double cover of the Lorentz group. Physical considerations mean that their components are anticommuting fields rather than commuting fields. While the matter in our universe is largely described by spinor fields, we will focus on scalar fields and their quantization to understand the basics of quantum field theory.

3 Lagrangian Methods and Classical Field Theory

3.1 Lagrangian methods for classical mechanics

Before we introduce our first field theories, let us recall the Lagrangian formalism for classical mechanics. Let us take a system with \(N\) degrees of freedom \(q_a\), where \(a = 1,\dots, N\). Then we can define a Lagrangian \(L(q_1,\dots,q_N,\dot{q}_1,\dots,\dot{q}_N)\) where \(\dot{q}_a = \frac{dq_a}{dt}\) that encodes the dynamics of the system. A typical Lagrangian might be \[L(q_1,\dots,q_N,\dot{q}_1,\dots,\dot{q}_N) = \sum_{a=1}^N \frac{1}{2}m\dot{q}_a^2 - V(q_1,\dots,q_N) ~,\] which takes the form of kinetic minus potential energy.

From the Lagrangian we can then determine the action \[S[q_a] = \int_{t_i}^{t_f} dt\, L(q_a,\dot{q}_a) ~.\] The action is a functional and is a map from the space of particle trajectories to \(\mathbb{R}\). The principle of least action tells us that the solution to the classical equations of motion is the one that extremises the action, i.e., if we consider a variation of the path \(q_a(t) \to q_a(t) + \delta q_a(t)\) then the variation of the action should be zero \[0 = \delta_q S[q_a] = \int_{t_i}^{t_f} dt \, \delta_q L(q_a,\dot q_a) = \int_{t_i}^{t_f} dt \, \sum_{a=1}^N \Big(\frac{\partial L}{\partial q_a} \delta q_a + \frac{\partial L}{\partial \dot{q}_a} \delta \dot{q}_a \Big) = \int_{t_i}^{t_f} dt \, \sum_{a=1}^N \Big(\frac{\partial L}{\partial q_a} - \frac{d}{dt} \frac{\partial L}{\partial \dot{q}_a}\Big) \delta q_a ~,\] where we have integrated by parts and neglected a boundary term in the final equality. Formally, we demand that \(\delta q_a(t_i) = \delta q_a(t_f) = 0\). Demanding that this holds for all variations \(\delta q_a(t)\) we find that the coefficient of \(\delta q_a(t)\) in the integrand must vanish for each \(a\). This gives the Euler-Lagrange equations of motion \[\frac{\partial L}{\partial q_a} - \frac{d}{dt} \frac{\partial L}{\partial \dot{q}_a} = 0 ~.\] We have obtained \(N\) equations of motion, which for classical mechanics are ODEs, from a single scalar \(L\) highlighting the benefit of the Lagrangian formalism.
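
The statement that a solution of the equations of motion extremises the action can be illustrated numerically. The sketch below discretises the action of a harmonic oscillator, \(L = \frac12 m\dot q^2 - \frac12 k q^2\), and compares the change in \(S\) under a small variation vanishing at the endpoints for two paths with the same endpoints: the classical solution and a straight line. The parameter values, the grid and the choice of variation are assumptions made for the example.

```python
import math

M_PARTICLE, K_SPRING = 1.0, 1.0  # hypothetical mass and spring constant
T_FINAL, N_STEPS = 1.0, 2000     # time interval [0, T_FINAL] and grid size
DT = T_FINAL / N_STEPS
OMEGA = math.sqrt(K_SPRING / M_PARTICLE)


def action(path):
    """Discretised S = sum dt (m qdot^2 / 2 - k q^2 / 2) using midpoints."""
    total = 0.0
    for i in range(N_STEPS):
        q0, q1 = path(i * DT), path((i + 1) * DT)
        velocity = (q1 - q0) / DT
        midpoint = 0.5 * (q0 + q1)
        total += DT * (0.5 * M_PARTICLE * velocity ** 2
                       - 0.5 * K_SPRING * midpoint ** 2)
    return total


def change_in_action(path, eps=1e-3):
    """S[q + eps dq] - S[q] with dq(t) = sin(pi t / T) vanishing at both ends."""
    def varied(t):
        return path(t) + eps * math.sin(math.pi * t / T_FINAL)
    return action(varied) - action(path)


def classical(t):
    """Solution of m q'' = -k q with q(0) = 1 and zero initial velocity."""
    return math.cos(OMEGA * t)


def straight(t):
    """Straight line with the same endpoints; not a solution."""
    return 1.0 + (math.cos(OMEGA * T_FINAL) - 1.0) * t / T_FINAL


change_classical = change_in_action(classical)
change_straight = change_in_action(straight)
print(change_classical, change_straight)
```

For the classical path the change is of order \(\epsilon^2\), while for the straight line it is of order \(\epsilon\), reflecting a non-vanishing first variation.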

3.2 Lagrangian methods for classical field theory

Our goal is to write down a Lagrangian for a classical scalar field \(\phi(x) = \phi(t,\vec{x})\). To gain some insight into the structure of the Lagrangian, consider a classical mechanical system with degrees of freedom \(q_a(t)\) and the index \(a\) running over the sites of a cubic lattice. As we take the size of the lattice to infinity and the lattice spacing to zero, we see that the lattice approximates a continuum \(\mathbb{R}^3\). We then have that \(q_a \to \phi(t,\vec{x})\) and \(\sum_{a=1}^N \to \int d^3 x\). We can think of the space coordinate \(\vec{x}\) as replacing the index \(a\) and labelling the degrees of freedom.

We are therefore led to consider a Lagrangian of the form \[L = \int d^3x \, \mathcal{L}(\phi(x),\partial_\mu \phi(x)) ~,\] where \(\mathcal{L}\) is called the Lagrangian density. Substituting this into the general form of the action we find \[S[\phi] = \int dt d^3x \, \mathcal{L}(\phi(x),\partial_\mu \phi(x)) = \int d^4x \, \mathcal{L}(\phi(x),\partial_\mu \phi(x)) ~,\] where we recall that \(x^0 = t\). While it is possible to consider more complicated actions, this is the general form that we will consider. Let us note a few of its properties:

The Euler-Lagrange equations of motion are again found by demanding that the variation of the action under \(\phi(x) \to \phi(x) + \delta \phi(x)\) vanishes, i.e., \[\begin{split} 0 = \delta_\phi S[\phi] & = \int d^4x \, \delta_\phi \mathcal{L}(\phi(x),\partial_\mu \phi(x)) = \int d^4x \, \Big( \frac{\partial\mathcal{L}}{\partial\phi}\delta\phi + \frac{\partial\mathcal{L}}{\partial(\partial_\mu\phi)}\partial_\mu\delta\phi\Big) \\ & = \int d^4x \, \Big( \frac{\partial\mathcal{L}}{\partial\phi} - \partial_\mu \Big(\frac{\partial\mathcal{L}}{\partial(\partial_\mu\phi)}\Big)\Big)\delta\phi ~. \end{split}\] From this we can read off the Euler-Lagrange equation of motion for a classical scalar field theory \[\begin{equation} \label{eq:el} \frac{\partial\mathcal{L}}{\partial\phi} - \partial_\mu \Big(\frac{\partial\mathcal{L}}{\partial(\partial_\mu\phi)}\Big) = 0 ~. \end{equation}\] Since the field \(\phi\) is a function of space-time, the Lagrangian formalism yields a PDE from a single scalar density \(\mathcal{L}\). When a field configuration satisfies the Euler-Lagrange equations, we say that it is on-shell. Whenever we consider an on-shell solution, the action is stationary under small variations about this field configuration.

3.2.1 Action of a real scalar field

Let us now construct the action of a real scalar field by imposing some physical requirements on the Lagrangian density \(\mathcal{L}\). First, the action should be invariant under Poincaré transformations. Under translations \(x' = x - a\) and Lorentz transformations \(x' = \Lambda x\), the measure \(d^4x\) is invariant since \(a\) is constant and the Jacobian of the Lorentz transformation is \(|\det\Lambda|\), which is equal to 1. Therefore, requiring the action is invariant amounts to requiring that the Lagrangian density transforms as a scalar under Poincaré transformations. Second, to yield non-trivial dynamics the Euler-Lagrange equation \(\eqref{eq:el}\) should be a non-trivial PDE. In practice, this means that \(\mathcal{L}\) must depend on \(\partial_\mu \phi\).

The simplest Lagrangian density for a real scalar field that satisfies these two properties is \[\begin{equation} \label{eq:lagscalar} \mathcal{L}(\phi,\partial_\mu\phi) = -\frac12 \partial_\mu\phi\partial^\mu\phi - V(\phi) ~. \end{equation}\] The first term is called the kinetic term, while \(V(\phi)\) is a function of one variable called the potential of the scalar field. The corresponding action of the real scalar field is \[\begin{equation} \label{eq:actionscalar} S[\phi] = \int d^4x \, \Big(-\frac12 \partial_\mu\phi\partial^\mu\phi - V(\phi)\Big) ~. \end{equation}\]

Expanding out the Lagrangian density \(\eqref{eq:lagscalar}\) we find \[\mathcal{L}= -\frac12\eta^{\mu\nu}\partial_\mu\phi\partial_\nu\phi - V(\phi) = \frac12 (\partial_t \phi)^2 - \frac12 \vec{\nabla}\phi\cdot\vec{\nabla}\phi - V(\phi) ~.\] This is invariant under Lorentz transformations since we have contracted all the indices using the Minkowski metric and its inverse. Note that this has introduced a relative minus sign between the time and space derivatives. In analogy with classical mechanics, we can think of the first term \(\frac12 (\partial_t \phi)^2\) as a kinetic energy and the remaining two terms as minus a potential energy.

The simplest choice for \(V(\phi)\) is \[\begin{equation} \label{eq:freepotential} V(\phi) = \frac12 m^2\phi^2 ~. \end{equation}\] With this choice of potential the classical scalar field theory with Lagrangian density \(\eqref{eq:lagscalar}\) describes a system of non-interacting particles with mass \(m\), i.e., a free massive real scalar field. To see this let us work out the Euler-Lagrange equation for this system. Computing the first term in eq. \(\eqref{eq:el}\) we have \[\frac{\partial\mathcal{L}}{\partial\phi} = - \frac{\partial V}{\partial\phi} = -m^2 \phi ~,\] since the kinetic term does not depend explicitly on \(\phi\). The second term in eq. \(\eqref{eq:el}\) is \[\partial_\mu \Big(\frac{\partial\mathcal{L}}{\partial(\partial_\mu\phi)}\Big) = \partial_\mu \Big(\frac{\partial}{\partial(\partial_\mu\phi)}\Big(-\frac12\eta^{\rho\sigma}\partial_\rho\phi\partial_\sigma\phi\Big)\Big) ~,\] where we have renamed dummy indices to ensure we are not using the same index twice. To compute this expression we use the identity \[\frac{\partial}{\partial(\partial_\sigma\phi)}(\partial_\rho\phi) = \delta_\rho^\sigma ~.\] This follows since the left-hand side vanishes if \(\sigma \neq \rho\) and equals 1 if \(\sigma = \rho\), which is the definition of the Kronecker delta. At first sight, it may be slightly surprising that the index \(\sigma\) is raised on the right-hand side. However, this can be shown to be consistent with Lorentz transformations. Using this identity gives \[\begin{split} \partial_\mu \Big(\frac{\partial\mathcal{L}}{\partial(\partial_\mu\phi)}\Big) & = \partial_\mu \Big(\frac{\partial}{\partial(\partial_\mu\phi)}\Big(-\frac12\eta^{\rho\sigma}\partial_\rho\phi\partial_\sigma\phi\Big)\Big) \\ & = \partial_\mu \Big(-\frac12\eta^{\rho\sigma} (\delta_\rho^\mu \partial_\sigma \phi + \partial_\rho\phi \delta_\sigma^\mu)\Big) = - \partial_\mu (\eta^{\mu\sigma} \partial_\sigma \phi) = - \partial_\mu\partial^\mu \phi ~. 
\end{split}\] Therefore, we find \[\frac{\partial\mathcal{L}}{\partial\phi} - \partial_\mu \Big(\frac{\partial\mathcal{L}}{\partial(\partial_\mu\phi)}\Big) = -m^2\phi - \partial_\mu (-\partial^\mu\phi) ~,\] hence, the Euler-Lagrange equation is \[\partial_\mu\partial^\mu \phi - m^2\phi = 0 ~.\] This is the Klein-Gordon equation. It is a linear PDE that describes a relativistically-invariant free wave equation. The differential operator \(\partial_\mu\partial^\mu = \eta^{\mu\nu}\partial_\mu\partial_\nu = -\partial_t^2 + \vec{\nabla}\cdot\vec{\nabla}\) takes the expected form for a wave equation, with the speed of wave propagation equal to 1 since we have set the speed of light \(c=1\).
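
As a rough numerical illustration, the sketch below checks by finite differences that a plane wave \(\phi = \cos(kx - \omega t)\) solves the Klein-Gordon equation when \(\omega^2 = k^2 + m^2\), and fails to do so for a different choice of \(\omega\). For simplicity it works in 1+1 dimensions; the values of \(m\), \(k\), the sample point and the step size are arbitrary assumptions.

```python
import math


def kg_residual(phi, t, x, mass, h=1e-3):
    """Finite-difference estimate of (-d_t^2 + d_x^2 - m^2) phi at (t, x)."""
    d2t = (phi(t + h, x) - 2 * phi(t, x) + phi(t - h, x)) / h ** 2
    d2x = (phi(t, x + h) - 2 * phi(t, x) + phi(t, x - h)) / h ** 2
    return -d2t + d2x - mass ** 2 * phi(t, x)


def plane_wave(k, omega):
    """Real plane wave cos(k x - omega t) as a function of (t, x)."""
    return lambda t, x: math.cos(k * x - omega * t)


mass, k = 0.7, 1.3
on_shell = plane_wave(k, math.sqrt(k ** 2 + mass ** 2))
off_shell = plane_wave(k, k)  # massless dispersion relation, wrong for m != 0

res_on = kg_residual(on_shell, 0.4, 0.9, mass)
res_off = kg_residual(off_shell, 0.4, 0.9, mass)
print(res_on, res_off)
```

The first residual vanishes up to discretisation error, while the second equals \((\omega^2 - k^2 - m^2)\phi = -m^2\phi\) at the sample point.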

3.2.2 Action of a complex scalar field

A complex scalar field \(\Phi(x)\) is a map from \(\mathbb{R}^{1,3}\) to \(\mathbb{C}\) \[\Phi:\mathbb{R}^{1,3}\to\mathbb{C}~.\] The action of a free massive complex scalar field is \[\begin{equation} \label{eq:actioncomplexscalar} S[\Phi,\Phi^*] = \int d^4x \, \big(-\partial_\mu \Phi^*\partial^\mu \Phi - m^2 \Phi^* \Phi\big) ~. \end{equation}\] The complex scalar field has real and imaginary parts. If we write \(\Phi = \frac{1}{\sqrt{2}} (\phi + i \psi)\) we find the action of two free massive real scalar fields \[\begin{split} S[\Phi= \frac{1}{\sqrt{2}} (\phi + i \psi),\Phi^*= \frac{1}{\sqrt{2}} (\phi - i \psi)] = \int d^4x \, \Big(-\frac12\partial_\mu\phi\partial^\mu\phi - \frac12\partial_\mu\psi\partial^\mu\psi - \frac{m^2}{2} \phi^2 - \frac{m^2}{2}\psi^2\Big) ~. \end{split}\] When these two real fields have equal mass we typically encode them in a single complex scalar field \(\Phi\) and its conjugate \(\Phi^*\). To compute the Euler-Lagrange equations we formally treat \(\Phi\) and \(\Phi^*\) as independent and vary with respect to each of them. The action is linear in each of \(\Phi\) and \(\Phi^*\) separately, hence the Euler-Lagrange equations are simply \[\begin{equation} \label{eq:complexscalarequations} \partial_\mu\partial^\mu \Phi - m^2\Phi = 0 ~, \qquad \partial_\mu\partial^\mu \Phi^* - m^2\Phi^* = 0 ~. \end{equation}\] Note that these two equations are the conjugate of each other.
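
To see how the decomposition into two real fields comes about, substitute \(\Phi = \frac{1}{\sqrt{2}}(\phi + i\psi)\) with \(\phi\) and \(\psi\) real into the two terms of the Lagrangian density: \[\partial_\mu \Phi^*\partial^\mu \Phi = \frac12 (\partial_\mu\phi - i \partial_\mu\psi)(\partial^\mu\phi + i\partial^\mu\psi) = \frac12 \partial_\mu\phi\partial^\mu\phi + \frac12 \partial_\mu\psi\partial^\mu\psi ~, \qquad \Phi^*\Phi = \frac12 (\phi^2 + \psi^2) ~.\] The cross terms \(\frac{i}{2}(\partial_\mu\phi\partial^\mu\psi - \partial_\mu\psi\partial^\mu\phi)\) cancel, so the two real fields do not couple to each other.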

3.3 Noether’s theorem

Noether’s theorem states that a continuous symmetry leads to a conserved current and is one of the most important results in mathematical and theoretical physics.

3.3.1 Continuous symmetries

To understand what is meant by a continuous symmetry, let us consider the example of a free massive complex scalar field. The action \(\eqref{eq:actioncomplexscalar}\) is manifestly invariant under the following phase rotation of \(\Phi\) \[\begin{equation} \label{eq:u1sym} \Phi \to e^{i\alpha}\Phi ~, \qquad \Phi^* \to e^{-i\alpha}\Phi^* ~, \end{equation}\] where \(\alpha\) is a constant. Explicitly, we have \[\begin{split} S[e^{i\alpha}\Phi, e^{-i\alpha}\Phi^*] & = \int d^4x \, \big(-e^{-i\alpha} \partial_\mu \Phi^* e^{i\alpha} \partial^\mu \Phi - m^2 e^{-i\alpha} \Phi^* e^{i\alpha}\Phi \big) \\ & = \int d^4x \, \big(-\partial_\mu \Phi^*\partial^\mu \Phi - m^2 \Phi^* \Phi\big) = S[\Phi,\Phi^*] ~. \end{split}\] This is a global continuous \(\mathrm{U}(1)\) symmetry. It is global because \(\alpha\) is a constant, hence it acts in the same way at all points in space-time, and it is continuous because it is a symmetry for all values of \(\alpha \in [0,2\pi)\). There are also symmetries that are not global, e.g., gauge symmetries, and symmetries that are not continuous, i.e., discrete symmetries.

The fact that the transformation \(\eqref{eq:u1sym}\) describes a continuous symmetry allows us to consider its infinitesimal version \[\Phi \to (1+i\alpha) \Phi + \mathcal{O}(\alpha^2) ~, \qquad \Phi^* \to (1-i\alpha) \Phi^* + \mathcal{O}(\alpha^2) ~,\] which will be important for proving Noether’s theorem. This is in contrast to discrete symmetries, such as \[\begin{equation} \label{eq:chargecong} \Phi \to \Phi^* ~, \qquad \Phi^* \to \Phi ~, \end{equation}\] which do not have an infinitesimal version and for which there is no straightforward generalisation of Noether’s theorem. Note that the discrete symmetry \(\eqref{eq:chargecong}\) is known as charge conjugation, or \(C\) for short. Together with time-reversal, \(T\), and parity, \(P\), charge conjugation plays an important role in quantum field theory.
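
Both symmetries can be illustrated pointwise. The sketch below evaluates the Lagrangian density of the free massive complex scalar field from the values of \(\Phi\) and \(\partial_\mu\Phi\) at a single space-time point, and compares it before and after a phase rotation with constant \(\alpha\) and after charge conjugation. The sample values, the mostly-plus signature and the function names are assumptions made for the example.

```python
import cmath

ETA_DIAG = (-1.0, 1.0, 1.0, 1.0)  # assumption: mostly-plus signature


def lagrangian_density(field, grad, mass):
    """-d_mu(Phi^*) d^mu(Phi) - m^2 Phi^* Phi from Phi and the four d_mu Phi."""
    kinetic = -sum(e * (d.conjugate() * d).real for e, d in zip(ETA_DIAG, grad))
    return kinetic - mass ** 2 * abs(field) ** 2


def rotate(field, grad, alpha):
    """Global U(1) rotation: a constant phase multiplies Phi and d_mu Phi alike."""
    phase = cmath.exp(1j * alpha)
    return phase * field, [phase * d for d in grad]


def conjugate(field, grad):
    """Charge conjugation Phi -> Phi^*."""
    return field.conjugate(), [d.conjugate() for d in grad]


mass = 0.7
field = 0.3 + 1.1j
grad = [0.5 - 0.2j, 1.0 + 0.4j, -0.7j, 0.25 + 0j]

l_original = lagrangian_density(field, grad, mass)
l_rotated = lagrangian_density(*rotate(field, grad, 0.9), mass)
l_conjugated = lagrangian_density(*conjugate(field, grad), mass)
print(l_original, l_rotated, l_conjugated)
```

All three numbers agree up to rounding. If \(\alpha\) depended on \(x\), the derivative would pick up an extra term and the kinetic term would no longer be invariant, which is the observation used in the proof of Noether’s theorem.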

3.3.2 Proof of Noether’s theorem

To prove Noether’s theorem for classical scalar field theory, we consider a field theory for a set of scalar fields \(\phi_a\), where \(a = 1,\dots,N\), with action \(S[\phi_a]\). We now suppose that the theory is invariant under a continuous symmetry. This means that there is an infinitesimal variation of the fields \[\begin{equation} \label{eq:globalsym} \phi_a(x) \to \phi_a(x) + \epsilon \Delta\phi_a(x) + \mathcal{O}(\epsilon^2) ~, \end{equation}\] where \(\epsilon\) is an infinitesimal parameter, under which the action is invariant, i.e., \[S[\phi_a(x) + \epsilon \Delta\phi_a(x)] - S[\phi_a(x)] = \mathcal{O}(\epsilon^2) \qquad \Rightarrow \qquad \delta_\epsilon S[\phi_a] = 0 ~.\]

Now let us consider a variation where we allow \(\epsilon\) to be an arbitrary function of space-time, i.e. \[\begin{equation} \label{eq:locsym} \phi_a(x) \to \phi_a(x) + \epsilon(x) \Delta\phi_a(x) + \mathcal{O}(\epsilon^2) ~. \end{equation}\] The action will no longer be invariant and this may seem a strange thing to do. However, the variation of the action must vanish when \(\epsilon(x)\) is constant, hence it should take the form \[\begin{equation} \label{eq:consform} \delta_{\epsilon(x)} S[\phi_a] = \int d^4 x \, j^\mu \partial_\mu \epsilon(x) ~, \end{equation}\] assuming the field theory is local. Here \(j^\mu(\phi_a, \partial_\nu \phi_a, \dots)\) is a function of the fields and their derivatives that depends on the theory that we are studying. If we now integrate by parts, and assume that all the fields and their derivatives fall off sufficiently fast at infinity, we find \[\delta_{\epsilon(x)} S[\phi_a] = - \int d^4 x \, \partial_\mu j^\mu \epsilon(x) ~.\] This holds for any choice of field configuration. Restricting to field configurations \(\phi_a\) that satisfy the Euler-Lagrange equations, i.e., they are on-shell, the variation of the action vanishes for any variation of the fields, including \(\eqref{eq:locsym}\). Therefore, for on-shell field configurations, we have \[0 = \delta_{\epsilon(x)} S[\phi_a] = - \int d^4 x \, \partial_\mu j^\mu \epsilon(x) ~.\] This holds for any choice of \(\epsilon(x)\), hence, whenever the fields satisfy the Euler-Lagrange equations, it follows that \[\begin{equation} \label{eq:divergenceless} \partial_\mu j^\mu = 0 ~. \end{equation}\] Therefore, there exists a divergenceless vector field \(j^\mu\), which is usually called a conserved current. This proves Noether’s theorem. Moreover, it gives us an algorithm to construct the conserved current.

The existence of a divergenceless current is important since it means that there exists a conserved charge. Expanding out eq. \(\eqref{eq:divergenceless}\), it reads \[\partial_t j^0 (t,x^i) + \partial_i j^i (t,x^i) = 0 ~.\] Integrating this equation over space gives \[\partial_t \int d^3x \, j^0 + \int d^3x \, \partial_i j^i = 0 ~.\] The second term on the left-hand side is the integral of a total derivative, which means that we can neglect it if all fields and their derivatives fall off sufficiently fast at infinity. In this case we find the following expression \[\frac{d}{dt} Q(t) = 0 ~, \qquad Q = \int d^3 x \, j^0(t,x^i) ~.\] In other words, there exists a conserved charge \(Q\), which is independent of time.

As an example, let us consider a free massive complex scalar field with action \(\eqref{eq:actioncomplexscalar}\) and construct the conserved current and charge for the \(\mathrm{U}(1)\) symmetry \(\eqref{eq:u1sym}\). The infinitesimal variations of the field \(\Phi\) and its conjugate \(\Phi^*\) are \[\delta_\alpha \Phi(x) = i \alpha \Phi(x) ~, \qquad \delta_\alpha \Phi^*(x) = - i \alpha \Phi^*(x) ~.\] We now let the symmetry parameter \(\alpha\) depend on space-time and consider the variations \[\delta_{\alpha(x)} \Phi(x) = i \alpha(x) \Phi(x) ~, \qquad \delta_{\alpha(x)} \Phi^*(x) = - i \alpha(x) \Phi^*(x) ~.\] The variation of the action \(\eqref{eq:actioncomplexscalar}\) gives \[\begin{split} \delta_{\alpha(x)} S[\Phi,\Phi^*] & = \int d^4x \, \big(-\partial_\mu(\delta_{\alpha(x)} \Phi^*) \partial^\mu \Phi -\partial_\mu\Phi^*\partial^\mu(\delta_{\alpha(x)} \Phi) -m^2 \delta_{\alpha(x)} \Phi^* \Phi -m^2 \Phi^* \delta_{\alpha(x)} \Phi\big) \\ & = \int d^4x \, \big(-\partial_\mu(-i\alpha\Phi^*) \partial^\mu \Phi -\partial_\mu\Phi^*\partial^\mu(i\alpha\Phi) -m^2 (-i\alpha \Phi^*) \Phi -m^2 \Phi^* (i\alpha\Phi)\big) \\ & = \int d^4x \, i\big(\Phi^* \partial^\mu \Phi - \Phi \partial^\mu \Phi^*\big) \partial_\mu\alpha ~. \end{split}\] This takes the form \(\eqref{eq:consform}\), which allows us to read off the expression for the conserved current \[\begin{equation} \label{eq:u1current} j^\mu = i(\Phi^* \partial^\mu \Phi - \Phi \partial^\mu \Phi^*) ~. \end{equation}\] The corresponding conserved charge \(Q\) is \[Q = \int d^3x\, j^0 = \int d^3x\, i\big(\Phi \partial_t\Phi^* - \Phi^* \partial_t \Phi \big) ~.\] It is possible to explicitly check that the current \(\eqref{eq:u1current}\) is conserved using the Euler-Lagrange equations \(\eqref{eq:complexscalarequations}\) that follow from the action \(\eqref{eq:actioncomplexscalar}\).
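
The explicit check is short. Taking the divergence of the current \(\eqref{eq:u1current}\), the terms with one derivative on each field cancel, and the Euler-Lagrange equations \(\eqref{eq:complexscalarequations}\) can be used on the remaining terms: \[\partial_\mu j^\mu = i\big(\partial_\mu\Phi^* \partial^\mu \Phi + \Phi^* \partial_\mu\partial^\mu \Phi - \partial_\mu\Phi \partial^\mu \Phi^* - \Phi \partial_\mu\partial^\mu \Phi^*\big) = i\big(\Phi^* \partial_\mu\partial^\mu \Phi - \Phi \partial_\mu\partial^\mu \Phi^*\big) = i m^2 \big(\Phi^*\Phi - \Phi\Phi^*\big) = 0 ~.\] Note that the current is only conserved on-shell, as expected from the proof.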

3.3.3 Alternative derivation of the current

The action is written as the integral over space-time of the Lagrangian density \(\mathcal{L}(\phi_a,\partial_\mu\phi_a)\). In many situations, not only is the action invariant under a symmetry, but the Lagrangian density is as well. Let us consider the variation of the Lagrangian density under the symmetry variation \(\eqref{eq:globalsym}\) \[\delta_\epsilon\mathcal{L}= \sum_{a=1}^N \Big(\frac{\partial\mathcal{L}}{\partial\phi_a} \epsilon \Delta\phi_a + \frac{\partial\mathcal{L}}{\partial(\partial_\mu\phi_a)} \partial_\mu(\epsilon \Delta \phi_a)\Big) ~.\] If the Lagrangian density is invariant then this variation vanishes for constant \(\epsilon\) \[\sum_{a=1}^N \Big(\frac{\partial\mathcal{L}}{\partial\phi_a} \Delta\phi_a + \frac{\partial\mathcal{L}}{\partial(\partial_\mu\phi_a)} \partial_\mu(\Delta \phi_a)\Big) = 0 ~.\] Together with the Euler-Lagrange equations \[\frac{\partial\mathcal{L}}{\partial\phi_a} - \partial_\mu\Big(\frac{\partial\mathcal{L}}{\partial(\partial_\mu\phi_a)}\Big) = 0 ~,\] this implies that on-shell \[\sum_{a=1}^N \Big( \partial_\mu\Big(\frac{\partial\mathcal{L}}{\partial(\partial_\mu\phi_a)}\Big) \Delta\phi_a + \frac{\partial\mathcal{L}}{\partial(\partial_\mu\phi_a)} \partial_\mu(\Delta \phi_a)\Big) = \sum_{a=1}^N \partial_\mu\Big(\frac{\partial\mathcal{L}}{\partial(\partial_\mu\phi_a)} \Delta\phi_a \Big) = 0 ~.\] Therefore, we have identified a divergenceless current \[\begin{equation} \label{eq:divcurrent} j^\mu = \sum_{a=1}^N \frac{\partial\mathcal{L}}{\partial(\partial_\mu\phi_a)} \Delta\phi_a ~. \end{equation}\] It is possible to check that this gives the same \(\mathrm{U}(1)\) conserved current \(\eqref{eq:u1current}\) for the free massive complex scalar field with action \(\eqref{eq:actioncomplexscalar}\).
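
For the complex scalar field this check goes as follows. Treating \(\Phi\) and \(\Phi^*\) as the two independent fields, the variations are \(\Delta\Phi = i\Phi\) and \(\Delta\Phi^* = -i\Phi^*\), while the Lagrangian density in eq. \(\eqref{eq:actioncomplexscalar}\) gives \[\frac{\partial\mathcal{L}}{\partial(\partial_\mu\Phi)} = -\partial^\mu\Phi^* ~, \qquad \frac{\partial\mathcal{L}}{\partial(\partial_\mu\Phi^*)} = -\partial^\mu\Phi ~.\] Substituting into eq. \(\eqref{eq:divcurrent}\) we find \[j^\mu = (-\partial^\mu\Phi^*)(i\Phi) + (-\partial^\mu\Phi)(-i\Phi^*) = i(\Phi^*\partial^\mu\Phi - \Phi\partial^\mu\Phi^*) ~,\] which coincides with eq. \(\eqref{eq:u1current}\).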

When the Lagrangian density is invariant under a symmetry this method is often a quick way to find the conserved current. However, it is important to emphasise that eq. \(\eqref{eq:divcurrent}\) only holds when the Lagrangian density is invariant and that this is a stronger condition than the invariance of the integrated action. It is possible to improve eq. \(\eqref{eq:divcurrent}\) to work in the general case. In particular, if \(\delta_\epsilon\mathcal{L}= \partial_\mu F^\mu\) then, assuming that the fields and their derivatives fall off sufficiently fast at infinity, the action is invariant and the conserved current is \[j^\mu = \sum_{a=1}^N \frac{\partial\mathcal{L}}{\partial(\partial_\mu\phi_a)} \Delta\phi_a - F^\mu~.\] However, it is typically more straightforward to use the algorithm described in subsection 3.3.2.

3.4 Hamiltonian formalism

To conclude our discussion of classical field theory let us introduce the Hamiltonian formalism. To do so we recall how to construct the Hamiltonian in classical mechanics. Given a system with \(N\) degrees of freedom \(q_a\), where \(a=1,\dots,N\), the canonical conjugate momentum is defined as \[p_a = \frac{\partial L}{\partial \dot{q}_a} ~.\] The Hamiltonian is then defined as \[H(q_a,p_a) = \sum_{a=1}^N p_a \dot{q}_a - L(q_a,\dot{q}_a) ~,\] where we substitute in for \(\dot{q}_a\) in terms of \(p_a\).

Proceeding by analogy for field theory, we construct the canonical conjugate momentum for a field \(\phi_a(t,\vec{x})\) from the Lagrangian density \[\pi_a(t,\vec{x}) = \frac{\partial\mathcal{L}}{\partial(\partial_t\phi_a(t,\vec{x}))} ~.\] The Hamiltonian density is then defined as \[\mathcal{H}(\phi_a,\pi_a,\vec{\nabla}\phi_a) = \sum_{a=1}^N \pi_a(t,\vec{x}) \partial_t \phi_a(t,\vec{x}) - \mathcal{L}(\phi_a,\partial_\mu\phi_a) ~,\] where we substitute in for \(\partial_t \phi_a\) in terms of \(\pi_a\). The Hamiltonian is then given by the integral of the Hamiltonian density over space \[H = \int d^3x \, \mathcal{H}~.\] The Hamiltonian can be interpreted as the energy of the system. For field theories that are invariant under time translations, which will be the case for all those we consider, this energy is conserved. Moreover, it can be found as the Noether charge associated with time translation symmetry.

As an example let us consider a free massive real scalar field with Lagrangian density \(\eqref{eq:lagscalar}\) and potential \(\eqref{eq:freepotential}\). Expanding out the Lagrangian density we find \[\mathcal{L}= -\frac12\eta^{\mu\nu}\partial_\mu\phi\partial_\nu\phi - \frac{1}{2}m^2\phi^2 = \frac12 (\partial_t \phi)^2 - \frac12 \vec{\nabla}\phi\cdot\vec{\nabla}\phi - \frac{1}{2}m^2\phi^2 ~.\] Therefore, \[\pi(t,\vec{x}) = \frac{\partial\mathcal{L}}{\partial(\partial_t\phi)} = \partial_t \phi ~.\] The Hamiltonian density is then given by \[\begin{split} \mathcal{H}& = \pi\partial_t \phi - \Big(\frac12 (\partial_t \phi)^2 - \frac12 \vec{\nabla}\phi\cdot\vec{\nabla}\phi - \frac{1}{2}m^2\phi^2\Big) \\ & = \pi^2 - \Big(\frac12 \pi^2 - \frac12 \vec{\nabla}\phi\cdot\vec{\nabla}\phi - \frac{1}{2}m^2\phi^2\Big) \\ & = \frac12 \pi^2 + \frac12 \vec{\nabla}\phi\cdot\vec{\nabla}\phi + \frac{1}{2}m^2\phi^2 ~, \end{split}\] and the Hamiltonian is the integral of this expression over space \[H = \int d^3x \, \Big(\frac12 \pi^2 + \frac12 \vec{\nabla}\phi\cdot\vec{\nabla}\phi + \frac{1}{2}m^2\phi^2\Big) ~.\] This expression is a manifestly positive-definite quantity, consistent with its interpretation as the energy of the system.
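
Conservation of this energy can be illustrated numerically. The sketch below puts the free massive real scalar field on a periodic lattice in 1+1 dimensions, evolves Hamilton's equations \(\partial_t\phi = \pi\), \(\partial_t\pi = \partial_x^2\phi - m^2\phi\) with a leapfrog scheme, and compares the lattice energy before and after. The lattice size, spacing, time step and initial profile are all assumptions made for the example, and the discretisation is only one of many possible choices.

```python
import math

N_SITES, DX, MASS = 64, 0.1, 1.0  # hypothetical lattice parameters
DT, N_STEPS = 0.01, 1000          # time step and number of steps


def energy(phi, pi):
    """Lattice H = sum dx (pi^2 / 2 + (d_x phi)^2 / 2 + m^2 phi^2 / 2)."""
    total = 0.0
    for j in range(N_SITES):
        gradient = (phi[(j + 1) % N_SITES] - phi[j]) / DX
        total += DX * 0.5 * (pi[j] ** 2 + gradient ** 2 + MASS ** 2 * phi[j] ** 2)
    return total


def force(phi):
    """d_t pi = d_x^2 phi - m^2 phi with a periodic lattice Laplacian."""
    return [(phi[(j + 1) % N_SITES] - 2 * phi[j] + phi[j - 1]) / DX ** 2
            - MASS ** 2 * phi[j] for j in range(N_SITES)]


def evolve(phi, pi):
    """Leapfrog (kick-drift-kick) integration of Hamilton's equations."""
    phi, pi = list(phi), list(pi)
    for _ in range(N_STEPS):
        pi = [p + 0.5 * DT * f for p, f in zip(pi, force(phi))]
        phi = [q + DT * p for q, p in zip(phi, pi)]
        pi = [p + 0.5 * DT * f for p, f in zip(pi, force(phi))]
    return phi, pi


phi0 = [math.sin(2 * math.pi * j / N_SITES) for j in range(N_SITES)]
pi0 = [0.0] * N_SITES
phi1, pi1 = evolve(phi0, pi0)

e0, e1 = energy(phi0, pi0), energy(phi1, pi1)
relative_drift = abs(e1 - e0) / e0
print(e0, e1, relative_drift)
```

The energy is positive and stays constant up to a small error set by the time step, even though \(\phi\) and \(\pi\) themselves change substantially over the evolution.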

Note that even though we started with a Lorentz invariant field theory, Lorentz symmetry is no longer manifest in the Hamiltonian formalism. This is because we have picked out the time direction and used it to define the conjugate momentum, which is not a covariant procedure. This is always an issue with the Hamiltonian formalism, but even so, physical quantities will always be Lorentz invariant or covariant.

Finally, let us consider a free massive complex scalar field with action \(\eqref{eq:actioncomplexscalar}\). Expanding out the Lagrangian density we find \[\mathcal{L}= \partial_t\Phi^*\partial_t\Phi - \vec{\nabla}\Phi^* \cdot \vec{\nabla}\Phi - m^2 \Phi^* \Phi ~.\] Recalling that we formally treat \(\Phi\) and \(\Phi^*\) as independent, the conjugate momenta are given by \[\begin{equation} \label{eq:complexscalarconjugatemomenta} \Pi_{\Phi} = \frac{\partial \mathcal{L}}{\partial (\partial_t \Phi)} = \partial_t\Phi^* ~, \qquad \Pi_{\Phi^*} = \frac{\partial \mathcal{L}}{\partial(\partial_t \Phi^*)} = \partial_t\Phi = \Pi_{\Phi}^* ~. \end{equation}\] From now on we will drop the subscript \(\Phi\), hence the Hamiltonian density is \[\mathcal{H}= \Pi \partial_t \Phi + \Pi^* \partial_t \Phi^* - \mathcal{L}= \Pi\Pi^* + \vec{\nabla}\Phi^* \cdot \vec{\nabla}\Phi + m^2 \Phi^* \Phi ~.\] Again the corresponding Hamiltonian \[H = \int d^3x \,\big( \Pi\Pi^* + \vec{\nabla}\Phi^* \cdot \vec{\nabla}\Phi + m^2 \Phi^* \Phi \big)~,\] is a manifestly positive-definite quantity, consistent with its interpretation as the energy of the system.
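
To make the intermediate step explicit, we use \(\Pi = \partial_t\Phi^*\) and \(\Pi^* = \partial_t\Phi\) from eq. \(\eqref{eq:complexscalarconjugatemomenta}\) to write \[\Pi \partial_t \Phi + \Pi^* \partial_t \Phi^* = 2\Pi\Pi^* ~, \qquad \mathcal{L}= \Pi\Pi^* - \vec{\nabla}\Phi^* \cdot \vec{\nabla}\Phi - m^2 \Phi^* \Phi ~,\] so that subtracting the Lagrangian density leaves a single factor of \(\Pi\Pi^* = |\Pi|^2\) and flips the sign of the gradient and mass terms. Each of the three terms in \(\mathcal{H}\) is the modulus squared of a complex quantity, which is why the Hamiltonian is non-negative.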

4 Quantum Field Theory in Canonical Formalism

Let us now turn to quantum field theory. We will explain how to quantize classical field theories in the canonical formalism using the Hamiltonian. There is an alternative approach based on the Lagrangian known as the path integral formalism.

We start by recalling how we quantize classical mechanics using the canonical formalism. Consider a classical system with degrees of freedom \(q_a\) and conjugate momenta \(p_a\). We also have a Hamiltonian \(H(q_a,p_a)\), which can be constructed from a Lagrangian \(L(q_a,\dot q_a)\). These degrees of freedom have Poisson brackets given by \[\{q_a,q_b\} = 0 ~, \qquad \{p_a,p_b\} = 0 ~, \qquad \{q_a,p_b\} = \delta_{ab} ~.\] The standard algorithm (or quantization) takes these classical variables and promotes them to hermitian operators \(\hat q_a\) and \(\hat{p}_a\), which now act on a suitable Hilbert space, while the Poisson brackets are replaced by commutators. In the Schrödinger picture, where states \(|\psi\rangle\) evolve in time and operators are time-independent, the algebra of operators is \[\phantom{}[\hat q_a,\hat q_b] = 0 ~, \qquad [\hat{p}_a,\hat{p}_b] = 0 ~, \qquad [\hat q_a,\hat p_b] = i\delta_{ab} ~,\] where we recall that we have set \(\hbar = 1\). Reinstating \(\hbar\), it would appear on the right-hand side of the last of these commutation relations. Alternatively, in the Heisenberg picture the operators evolve in time and these commutators should be understood as holding at equal time, i.e., \[\begin{equation} \label{eq:heisenbergcommutators} \phantom{}[\hat q_a(t),\hat q_b(t)] = 0 ~, \qquad [\hat{p}_a(t),\hat{p}_b(t)] = 0 ~, \qquad [\hat q_a(t),\hat p_b(t)] = i\delta_{ab} ~. \end{equation}\] The commutation relations for operators at different times take a more complicated form depending explicitly on the time delay.

We would now like to follow the analogous algorithm for field theory, i.e., our aim is to quantize the classical theory for a real scalar field. The classical degrees of freedom are encoded in the scalar field itself \(\phi(t,\vec{x})\) and its conjugate momentum \(\pi(t,\vec{x})\). We would like to promote these to operators \(\hat\phi(t,\vec{x})\) and \(\hat \pi(t,\vec{x})\) that satisfy the following canonical equal-time commutation relations \[\phantom{}[\hat \phi(t,\vec{x}),\hat \phi(t,\vec{y})] = 0 ~, \qquad [\hat\pi(t,\vec{x}),\hat\pi(t,\vec{y})] = 0 ~, \qquad [\hat \phi(t,\vec{x}),\hat \pi(t,\vec{y})] = i\delta^{(3)}(\vec{x}-\vec{y}) ~.\] The right-hand side of the last commutator contains a delta function in space, which is the continuum analogue of the Kronecker delta in \(\eqref{eq:heisenbergcommutators}\). Our aim now amounts to building the quantum theory that satisfies these relations, understanding the quantum version of the Hamiltonian, constructing the space of states, determining the eigenvalues of the Hamiltonian, and more.

This is a difficult task and we will start by doing it for the free massive theory, i.e., when the potential \(V(\phi)\) is quadratic in \(\phi\). In this case the problem is tractable. The Euler-Lagrange equation for a free massive real scalar field is the Klein-Gordon equation \[(\partial_\mu\partial^\mu -m^2) \phi(t,\vec{x}) = (-\partial_t^2 + \vec{\nabla}^2 - m^2) \phi(t,\vec{x}) = 0 ~.\] This is a linear PDE in a system with translational invariance. Whenever we have translational invariance it makes sense to Fourier transform to momentum space \[\begin{equation} \label{eq:phimomentum} \phi(t,\vec{x}) = \int \frac{d^3k}{(2\pi)^3} \, e^{i\vec{k}\cdot \vec{x}} \tilde\phi(t,\vec{k}) ~. \end{equation}\] The inverse of this relation is given by \[\tilde\phi(t,\vec{k}) = \int d^3x \, e^{-i\vec{k}\cdot \vec{x}} \phi(t,\vec{x}) ~.\] If we now act on the scalar field \(\eqref{eq:phimomentum}\) with the Klein-Gordon operator \[-\partial_t^2 + \vec{\nabla}^2 - m^2 ~,\] we find \[\begin{equation} \label{eq:kleingordonmomentum} (\partial_t^2 + \vec{k}^2 + m^2)\tilde\phi(t,\vec{k}) = 0 ~, \end{equation}\] since each instance of \(\vec{\nabla}\) brings down a factor of \(i\vec{k}\) from the exponential. Note that the operation of multiplying by \(e^{i\vec{k}\cdot \vec{x}}\) and integrating over momentum space is invertible. Eq. \(\eqref{eq:kleingordonmomentum}\) is the equation for a simple harmonic oscillator with frequency \(\omega_{\vec{k}} = \sqrt{\vec{k}^2 + m^2}\) \[(\partial_t^2 + \omega_{\vec{k}}^2)\tilde\phi(t,\vec{k}) = 0 ~.\] Therefore, classically, each Fourier mode of a free massive real scalar field is a simple harmonic oscillator. Note that if the potential had not been quadratic in \(\phi\) the above analysis would not have worked so simply, hence why solving and quantizing interacting classical field theories is in general a difficult task.
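The statement that a Fourier mode oscillating with frequency \(\omega_{\vec{k}}\) solves the Klein-Gordon equation can be checked numerically. The following is a minimal sketch, assuming a reduction to one spatial dimension and a central finite-difference approximation of the derivatives; the function names and the step size `h` are illustrative choices, not part of the notes.

```python
import math


def omega(k, mass):
    """Dispersion relation omega_k = sqrt(k^2 + m^2) (one spatial dimension assumed)."""
    return math.sqrt(k * k + mass * mass)


def plane_wave(t, x, k, mass):
    """Real part of exp(i k x - i omega_k t), a single Fourier mode."""
    return math.cos(k * x - omega(k, mass) * t)


def klein_gordon_residual(t, x, k, mass, h=1e-3):
    """Finite-difference estimate of (-d_t^2 + d_x^2 - m^2) phi at the point (t, x)."""
    phi = lambda tt, xx: plane_wave(tt, xx, k, mass)
    d2t = (phi(t + h, x) - 2.0 * phi(t, x) + phi(t - h, x)) / h**2
    d2x = (phi(t, x + h) - 2.0 * phi(t, x) + phi(t, x - h)) / h**2
    return -d2t + d2x - mass**2 * phi(t, x)


if __name__ == "__main__":
    # The residual should vanish up to discretisation error of order h^2.
    print(klein_gordon_residual(t=0.4, x=-1.1, k=1.3, mass=0.7))
```

The residual vanishes only because the frequency is tied to the momentum through \(\omega_{\vec{k}} = \sqrt{\vec{k}^2 + m^2}\); replacing `omega` by any other function of `k` leaves a residual proportional to the field itself.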

4.1 Quantizing a simple harmonic oscillator

To gain insight into the quantization of a free massive real scalar field, let us recall the quantization of a simple harmonic oscillator. Classically, a simple harmonic oscillator with frequency \(\omega\) has a single degree of freedom \(q(t)\) with the Lagrangian \[L(q,\dot q) = \frac12 \dot q^2 - \frac12 \omega^2 q^2 ~.\] The Euler-Lagrange equation is \[\ddot q(t) = - \omega^2 q(t) ~.\] The general solution to this equation is given by \[q(t) = \frac{1}{\sqrt{2\omega}} \big(a e^{-i\omega t} + b e^{+ i \omega t}\big) ~,\] where \(a\) and \(b\) are integration constants and the normalisation of \(\frac{1}{\sqrt{2\omega}}\) is for convenience. Since \(q(t)\) is a real degree of freedom we require that \(q(t) = q^*(t)\), hence \(b = a^*\) and \[\begin{equation} \label{eq:qoscillator} q(t) = \frac{1}{\sqrt{2\omega}} \big(a e^{-i\omega t} + a^* e^{+ i \omega t}\big) ~. \end{equation}\] In the Hamiltonian formalism we have \[p = \frac{\partial L}{\partial \dot q} = \dot q ~, \qquad H = p \dot q - L = \frac12 (p^2 + \omega^2 q^2) ~.\]

To quantize we promote the degrees of freedom and their conjugate momenta to operators acting on a Hilbert space. To represent this we send \(q \to \hat q\) and \(p \to \hat p\), such that the Hamiltonian becomes \[\hat H = \frac12 (\hat p^2 + \omega^2 \hat q^2) ~.\] Working in the Heisenberg picture where the operators depend on time, the analogue of the expansion \(\eqref{eq:qoscillator}\) is \[\begin{equation} \label{eq:hatq} \hat q(t) = \frac{1}{\sqrt{2\omega}} \big(\hat a e^{-i\omega t} + \hat a^\dagger e^{+ i \omega t}\big) ~, \end{equation}\] where \(\hat a\) and \(\hat a ^\dagger\) are quantum operators that we take to be independent of time. In the context of a simple harmonic oscillator they are called ladder operators. Computing \(\hat p\) we find \[\begin{equation} \label{eq:hatp} \hat p(t) = \dot{\hat{q}}(t) = -i \sqrt{\frac{\omega}{2}} \big(\hat a e^{-i\omega t} - \hat a^\dagger e^{+ i \omega t}\big) ~. \end{equation}\]

The canonical equal-time commutation relations for \(\hat q(t)\) and \(\hat p(t)\) are \[\begin{equation} \label{eq:commutationrelations} \phantom{}[\hat q(t),\hat p(t)] = i ~, \qquad [\hat q(t),\hat q(t)] = [\hat p(t),\hat p(t)] = 0 ~. \end{equation}\] Solving the expressions for \(\hat q(t)\) \(\eqref{eq:hatq}\) and \(\hat p(t)\) \(\eqref{eq:hatp}\) for \(\hat a\) and \(\hat a^\dagger\) gives \[\hat a = \Big( \sqrt{\frac{\omega}{2}} \hat q(t) + \frac{i}{\sqrt{2\omega}} \hat p(t) \Big) e^{+i\omega t} ~, \qquad \hat a^\dagger = \Big( \sqrt{\frac{\omega}{2}} \hat q(t) - \frac{i}{\sqrt{2\omega}} \hat p(t) \Big) e^{-i\omega t} ~.\] We can now use the commutators \(\eqref{eq:commutationrelations}\) to work out the commutation relations for \(\hat a\) and \(\hat a^\dagger\). Doing so we find \[\phantom{}[\hat a,\hat a^\dagger ] = 1 ~, \qquad [\hat a,\hat a] = [\hat a^\dagger,\hat a^\dagger] = 0 ~.\]

The Hamiltonian expressed in terms of \(\hat a\) and \(\hat a^\dagger\) reads \[\hat H = \frac{\omega}{2} (\hat a^\dagger \hat a + \hat a \hat a^\dagger) = \omega \Big(\hat a^\dagger \hat a + \frac12\Big) ~,\] where we have used the commutator of \(\hat a\) and \(\hat a^\dagger\) to uniformise their ordering. Importantly, the ladder operators diagonalise the Hamiltonian. To see this we compute \[\phantom{}[\hat H,\hat a^\dagger] = \Big[\omega \Big(\hat a^\dagger \hat a + \frac12\Big),\hat a^\dagger\Big] = \omega \hat a^\dagger[\hat a,\hat a^\dagger] = \omega \hat a^\dagger ~,\] and similarly \[\phantom{}[\hat H,\hat a] = \Big[\omega \Big(\hat a^\dagger \hat a + \frac12\Big),\hat a\Big] = \omega [\hat a^\dagger,\hat a] \hat a = - \omega \hat a ~.\] This means that the ladder operators \(a^\dagger\) and \(a\) raise and lower the energy respectively. To see this explicitly, let us consider an energy eigenstate \(|E\rangle\) with energy \(E\), i.e., \[\hat H |E\rangle = E |E\rangle ~.\] Here \(\hat H\) is an operator, while \(E\) is a number, hence this is an eigenvalue equation. Now we consider the state \(\hat a^\dagger |E\rangle\) and act on it with the Hamiltonian. Doing so we find \[\hat H \hat a^\dagger |E\rangle = \big(\hat a^\dagger \hat H + [\hat H,\hat a^\dagger]\big) |E\rangle = \big(\hat a^\dagger \hat H + \omega \hat a^\dagger\big) |E\rangle = (E+\omega) \hat a^\dagger |E\rangle ~.\] Therefore, \(\hat a^\dagger |E\rangle\) is also an eigenstate of the Hamiltonian with energy \(E+\omega\), i.e., \(\hat a^\dagger\) has raised the energy by \(\omega\). Similarly, we can consider the state \(\hat a |E\rangle\). Acting with the Hamiltonian gives \[\hat H \hat a |E\rangle = \big(\hat a \hat H + [\hat H,\hat a]\big) |E\rangle = \big(\hat a \hat H - \omega \hat a\big) |E\rangle = (E-\omega) \hat a |E\rangle ~,\] and we see that \(\hat a |E\rangle\) is an eigenstate with energy \(E-\omega\), i.e., \(\hat a\) has lowered the energy by \(\omega\).

The ladder operators allow us to construct the full set of energy eigenstates. We define a state \(|0\rangle\) called the ground state by requiring it to satisfy \(\hat a|0\rangle = 0\), i.e., it is annihilated by the annihilation, or lowering, operator \(\hat a\). The energy of the ground state is \[\hat H |0\rangle = \omega \Big(\hat a^\dagger \hat a + \frac12\Big) |0\rangle = \frac{\omega}{2} |0\rangle ~.\] A tower of (unnormalised) excited states can then be constructed by repeatedly acting with the creation, or raising, operator \(\hat a^\dagger\) \[|n\rangle = (\hat a^\dagger)^n |0\rangle ~, \qquad \hat H |n\rangle = \omega \Big(n + \frac12\Big) |n\rangle ~.\]
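The ladder-operator algebra above can be illustrated with finite matrices. The sketch below assumes a truncation of the Hilbert space to the first `dim` number states and uses the normalised basis, in which \(\hat a |n\rangle = \sqrt{n}\,|n-1\rangle\); both the truncation and the helper names are assumptions made for illustration. In a truncated space the relation \([\hat a,\hat a^\dagger] = 1\) necessarily fails in the last diagonal entry, while \([\hat H,\hat a^\dagger] = \omega \hat a^\dagger\) still holds exactly.

```python
import math


def ladder_matrices(dim):
    """Return (a, a_dagger) as dim x dim matrices in the normalised number basis."""
    a = [[0.0] * dim for _ in range(dim)]
    for n in range(1, dim):
        a[n - 1][n] = math.sqrt(n)  # a|n> = sqrt(n)|n-1>
    a_dag = [[a[j][i] for j in range(dim)] for i in range(dim)]  # real transpose
    return a, a_dag


def matmul(x, y):
    """Plain matrix product of two square matrices stored as nested lists."""
    size = len(x)
    return [[sum(x[i][k] * y[k][j] for k in range(size)) for j in range(size)]
            for i in range(size)]


def commutator(x, y):
    """Return [x, y] = xy - yx."""
    xy, yx = matmul(x, y), matmul(y, x)
    size = len(x)
    return [[xy[i][j] - yx[i][j] for j in range(size)] for i in range(size)]


def hamiltonian(dim, omega):
    """H = omega (a_dagger a + 1/2) in the truncated space."""
    a, a_dag = ladder_matrices(dim)
    number = matmul(a_dag, a)
    return [[omega * (number[i][j] + (0.5 if i == j else 0.0)) for j in range(dim)]
            for i in range(dim)]


if __name__ == "__main__":
    H = hamiltonian(5, omega=2.0)
    print([H[n][n] for n in range(5)])  # diagonal entries omega * (n + 1/2)
```

Because the truncated \(\hat H\) is diagonal, its eigenvalues can be read off directly and reproduce the spectrum \(\omega(n + \frac12)\) for \(n < \texttt{dim}\).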

4.2 Quantizing a free complex scalar field

The canonical quantizations of free massive real and complex scalar fields are very similar. The main difference is that for the real case we impose an additional reality condition on the field. Therefore, we will focus on the complex case. Recall the action of a free massive complex scalar field \[\begin{equation} \label{eq:actioncomplexscalar2} S[\Phi,\Phi^*] = \int d^4x \, \big(-\partial_\mu \Phi^*\partial^\mu \Phi - m^2 \Phi^* \Phi\big) ~. \end{equation}\] The Euler-Lagrange equations in position space are \[\begin{equation} \label{eq:complexscalarequations2} \partial_\mu\partial^\mu \Phi - m^2\Phi = 0 ~, \qquad \partial_\mu\partial^\mu \Phi^* - m^2\Phi^* = 0 ~, \qquad \partial_\mu\partial^\mu = - \partial_t^2 + \vec{\nabla}^2 ~. \end{equation}\] Since these equations are the conjugate of each other, we will just focus on the equation for \(\Phi\), which in momentum space becomes \[(\partial_t^2 + \omega_{\vec{k}}^2) \tilde\Phi(t,\vec{k}) = 0 ~, \qquad \omega_{\vec{k}} = \sqrt{\vec{k}^2 + m^2} ~.\] As for a free massive real scalar field, the Euler-Lagrange equation in momentum space for fixed \(\vec{k}\) is just that of a simple harmonic oscillator. Therefore, the general solution is given by \[\tilde\Phi(t,\vec{k}) = \frac{1}{\sqrt{2\omega_{\vec{k}}}} \big(a_{\vec{k}} e^{-i\omega_{\vec{k}}t} + b^*_{-\vec{k}} e^{+i\omega_{\vec{k}}t} \big) ~,\] where \(a_{\vec{k}}\) and \(b_{\vec{k}}\) are momentum-dependent integration constants. 
We can now return to position space using the Fourier transform to give \[\Phi(t,\vec{x}) = \int \frac{d^3k}{(2\pi)^3} \, \frac{1}{\sqrt{2\omega_{\vec{k}}}} \big(a_{\vec{k}} e^{-i\omega_{\vec{k}}t} + b^*_{-\vec{k}} e^{+i\omega_{\vec{k}}t} \big) e^{+i \vec{k}\cdot \vec{x}} ~.\] It is conventional to make the change of variables \(\vec{k} \to - \vec{k}\) in the second term to give \[\Phi(t,\vec{x}) = \int \frac{d^3k}{(2\pi)^3} \, \frac{1}{\sqrt{2\omega_{\vec{k}}}} \big(a_{\vec{k}} e^{-i\omega_{\vec{k}}t + i \vec{k}\cdot \vec{x}} + b^*_{\vec{k}} e^{+i\omega_{\vec{k}}t -i \vec{k}\cdot \vec{x}} \big) ~.\] Introducing \(k^\mu = (\omega_{\vec{k}},\vec{k})^\mu\) and recalling \(x^\mu = (t,\vec{x})^\mu\), we see that the exponents now can be written in the manifestly Lorentz invariant form \[\Phi(t,\vec{x}) = \int \frac{d^3k}{(2\pi)^3} \, \frac{1}{\sqrt{2\omega_{\vec{k}}}} \big(a_{\vec{k}} e^{+ik_\mu x^\mu} + b^*_{\vec{k}} e^{-i k_\mu x^\mu} \big) ~.\] If we were considering a free massive real scalar field, then imposing reality would tell us that \(a_{\vec{k}} = b_{\vec{k}}\). For a free massive complex scalar field they are independent.

4.2.1 Operator-valued Fourier expansions

To quantize a free massive complex scalar field we now let the field \(\Phi(t,\vec{x})\) and its conjugate momentum \(\Pi(t,\vec{x})\) become operators, which we represent by sending \[\Phi(t,\vec{x}) \to \hat \Phi(t,\vec{x}) ~, \qquad \Pi(t,\vec{x}) \to \hat \Pi(t,\vec{x}) ~,\] and similarly for their conjugates \[\Phi^*(t,\vec{x}) \to \hat \Phi^\dagger(t,\vec{x}) ~, \qquad \Pi^*(t,\vec{x}) \to \hat \Pi^\dagger(t,\vec{x}) ~,\] where \(\dagger\) represents the hermitian conjugate of an operator, as opposed to \(*\), which denotes the complex conjugate of a number.

In the Heisenberg picture, we then have the following expansion of the operator \(\hat \Phi(t,\vec{x})\) \[\begin{equation} \label{eq:complexphi} \hat \Phi(t,\vec{x}) = \int \frac{d^3k}{(2\pi)^3} \, \frac{1}{\sqrt{2\omega_{\vec{k}}}} \big(\hat a_{\vec{k}} e^{-i\omega_{\vec{k}}t + i \vec{k}\cdot \vec{x}} + \hat b^\dagger_{\vec{k}} e^{+i\omega_{\vec{k}}t -i \vec{k}\cdot \vec{x}} \big) ~, \end{equation}\] where \(\hat a_{\vec{k}}\) and \(\hat b_{\vec{k}}\) are now operators. Similarly, we have its hermitian conjugate \[\begin{equation} \label{eq:complexphihat} \hat \Phi^\dagger(t,\vec{x}) = \int \frac{d^3k}{(2\pi)^3} \, \frac{1}{\sqrt{2\omega_{\vec{k}}}} \big(\hat b_{\vec{k}} e^{-i\omega_{\vec{k}}t + i \vec{k}\cdot \vec{x}} + \hat a^\dagger_{\vec{k}} e^{+i\omega_{\vec{k}}t -i \vec{k}\cdot \vec{x}} \big) ~. \end{equation}\]

The expression for the conjugate momentum of \(\Phi\) can be found using its classical definition \(\eqref{eq:complexscalarconjugatemomenta}\) \[\begin{equation} \label{eq:complexpi} \hat \Pi(t,\vec{x}) = \partial_t \hat \Phi^\dagger(t,\vec{x}) = - i \int \frac{d^3k}{(2\pi)^3} \, \sqrt{\frac{\omega_{\vec{k}}}{2}} \big(\hat b_{\vec{k}} e^{-i\omega_{\vec{k}}t +i \vec{k}\cdot \vec{x}} - \hat a^\dagger_{\vec{k}} e^{+i\omega_{\vec{k}}t - i \vec{k}\cdot \vec{x}} \big) ~, \end{equation}\] while for its hermitian conjugate we have \[\begin{equation} \label{eq:complexpihat} \hat \Pi^\dagger(t,\vec{x}) = \partial_t \hat \Phi(t,\vec{x}) = - i \int \frac{d^3k}{(2\pi)^3} \, \sqrt{\frac{\omega_{\vec{k}}}{2}} \big(\hat a_{\vec{k}} e^{-i\omega_{\vec{k}}t +i \vec{k}\cdot \vec{x}} - \hat b^\dagger_{\vec{k}} e^{+i\omega_{\vec{k}}t - i \vec{k}\cdot \vec{x}} \big) ~. \end{equation}\]

Therefore, the fields \(\Phi(t,\vec{x})\) and \(\Pi(t,\vec{x})\) and their hermitian conjugates are determined in terms of the operators \(\hat a_{\vec{k}}\) and \(\hat b_{\vec{k}}\) and their hermitian conjugates. It will transpire that these operators can be used to diagonalise the Hamiltonian of the quantum field theory for a free massive complex scalar. Moreover, they will have the physical interpretation of creation and annihilation operators of particles.

Let us recall the canonical equal-time commutation relations between the scalar field and its conjugate momentum \[\phantom{}[\hat\Phi(t,\vec{x}),\hat\Pi(t,\vec{y})] = [\hat\Phi^\dagger(t,\vec{x}),\hat\Pi^\dagger(t,\vec{y})] = i \delta^{(3)}(\vec{x} - \vec{y}) ~,\] with the remaining equal-time commutators vanishing. As we will show, these relations imply the following commutation relations for the creation and annihilation operators \[\phantom{}[\hat a_{\vec{k}},\hat a_{\vec{k}'}^\dagger] = (2\pi)^3 \delta^{(3)}(\vec{k} - \vec{k}') ~, \qquad \phantom{}[\hat b_{\vec{k}},\hat b_{\vec{k}'}^\dagger] = (2\pi)^3 \delta^{(3)}(\vec{k} - \vec{k}') ~,\] with the remaining commutators equal to zero. Therefore, we can think of each Fourier mode of a free massive complex scalar field as being two independent simple harmonic oscillators with creation operators \(\hat a_{\vec{k}}^\dagger\) and \(\hat b_{\vec{k}}^\dagger\). The delta functions mean that, as far as these commutation relations are concerned, the Fourier modes do not talk to each other.

4.2.2 Algebra of creation and annihilation operators

Let us now derive the algebra of creation and annihilation operators. To do this, the following identity will be useful \[\begin{equation} \label{eq:identity} \int d^3x \, e^{i\vec{p}\cdot\vec{x}} = (2\pi)^3 \delta^{(3)} (\vec{p}) ~. \end{equation}\] We would like to find expressions for \(\hat a_{\vec{k}}\), \(\hat b_{\vec{k}}\) and their hermitian conjugates in terms of \(\hat\Phi(t,\vec{x})\), \(\hat\Pi(t,\vec{x})\) and their hermitian conjugates. Let us take the expression \(\eqref{eq:complexphi}\) for \(\hat\Phi(t,\vec{x})\), multiply by \(e^{i\vec{k}'\cdot\vec{x}}\) and integrate over \(\vec{x}\) \[\begin{split} \int d^3x \, e^{i\vec{k}'\cdot\vec{x}} \hat\Phi(t,\vec{x}) & = \int d^3x \, e^{i\vec{k}'\cdot\vec{x}} \int \frac{d^3k}{(2\pi)^3} \, \frac{1}{\sqrt{2\omega_{\vec{k}}}} \big(\hat a_{\vec{k}} e^{-i\omega_{\vec{k}}t + i \vec{k}\cdot \vec{x}} + \hat b^\dagger_{\vec{k}} e^{+i\omega_{\vec{k}}t -i \vec{k}\cdot \vec{x}} \big) \\ & = \int d^3x \int \frac{d^3k}{(2\pi)^3} \, \frac{1}{\sqrt{2\omega_{\vec{k}}}} \big(\hat a_{\vec{k}} e^{-i\omega_{\vec{k}}t + i (\vec{k} + \vec{k}')\cdot \vec{x}} + \hat b^\dagger_{\vec{k}} e^{+i\omega_{\vec{k}}t -i (\vec{k} -\vec{k}')\cdot \vec{x}} \big) \\ & = \int d^3k \, \frac{1}{\sqrt{2\omega_{\vec{k}}}} \big(\hat a_{\vec{k}} e^{-i\omega_{\vec{k}}t}\delta^{(3)}(\vec{k} + \vec{k}') +\hat b^\dagger_{\vec{k}} e^{+i\omega_{\vec{k}}t}\delta^{(3)}(\vec{k} - \vec{k}')\big) \\ & =\frac{1}{\sqrt{2\omega_{\vec{k}'}}} \big(\hat a_{-\vec{k}'} e^{-i\omega_{\vec{k}'}t}+\hat b^\dagger_{\vec{k}'} e^{+i\omega_{\vec{k}'}t}\big)~, \end{split}\] where we recall that \(\omega_{-\vec{k}} = \omega_{\vec{k}}\). 
Similarly, starting from the expression \(\eqref{eq:complexpihat}\) for \(\hat \Pi^\dagger(t,\vec{x})\), we find \[\int d^3x\,e^{i\vec{k}'\cdot\vec{x}} \hat\Pi^\dagger(t,\vec{x}) =-i\sqrt{\frac{\omega_{\vec{k}'}}{2}} \big(\hat a_{-\vec{k}'} e^{-i\omega_{\vec{k}'}t}-\hat b^\dagger_{\vec{k}'} e^{+i\omega_{\vec{k}'}t}\big)~.\] Taking linear combinations of these two expressions we find \[\begin{split} \hat a_{\vec{k}} &= \int d^3x\, e^{+i\omega_{\vec{k}}t -i \vec{k}\cdot \vec{x}} \Big( \sqrt{\frac{\omega_{\vec{k}}}{2}}\hat\Phi(t,\vec{x}) + \frac{i}{\sqrt{2\omega_{\vec{k}}}}\hat\Pi^\dagger(t,\vec{x})\Big) ~, \\ \hat b^\dagger_{\vec{k}} &= \int d^3x\, e^{-i\omega_{\vec{k}}t +i \vec{k}\cdot \vec{x}} \Big( \sqrt{\frac{\omega_{\vec{k}}}{2}}\hat\Phi(t,\vec{x}) - \frac{i}{\sqrt{2\omega_{\vec{k}}}}\hat\Pi^\dagger(t,\vec{x})\Big) ~, \end{split}\] where we have renamed \(\vec{k}' \to -\vec{k}\) in the expression for \(\hat a_{\vec{k}}\) and \(\vec{k}' \to \vec{k}\) in the expression for \(\hat b^\dagger_{\vec{k}}\).

As expected, we can obtain the corresponding expressions for \(\hat a_{\vec{k}}^\dagger\) and \(\hat b_{\vec{k}}\) by taking the hermitian conjugate \[\begin{split} \hat a^\dagger_{\vec{k}} &= \int d^3x\, e^{-i\omega_{\vec{k}}t +i \vec{k}\cdot \vec{x}} \Big( \sqrt{\frac{\omega_{\vec{k}}}{2}}\hat\Phi^\dagger(t,\vec{x}) - \frac{i}{\sqrt{2\omega_{\vec{k}}}}\hat\Pi(t,\vec{x})\Big) ~, \\ \hat b_{\vec{k}} &= \int d^3x\, e^{+i\omega_{\vec{k}}t -i \vec{k}\cdot \vec{x}} \Big( \sqrt{\frac{\omega_{\vec{k}}}{2}}\hat\Phi^\dagger(t,\vec{x}) + \frac{i}{\sqrt{2\omega_{\vec{k}}}}\hat\Pi(t,\vec{x})\Big) ~. \end{split}\]

We can now directly compute the commutator \[\begin{split} \phantom{}[\hat a_{\vec{k}},\hat a_{\vec{k}'}^\dagger] = \frac12 \int d^3x \int d^3x' \, e^{it(\omega_{\vec{k}}-\omega_{\vec{k}'}) - i \vec{k}\cdot\vec{x} + i \vec{k}'\cdot\vec{x}'} \Big( & \sqrt{\omega_{\vec{k}}\omega_{\vec{k}'}}[\hat\Phi(t,\vec{x}),\hat\Phi^\dagger(t,\vec{x}')] -i\sqrt{\frac{\omega_{\vec{k}}}{\omega_{\vec{k}'}}}[\hat\Phi(t,\vec{x}),\hat\Pi(t,\vec{x}')] \\ &\!\!\!\!\!\!\!\!\!\!\!\! + i\sqrt{\frac{\omega_{\vec{k}'}}{\omega_{\vec{k}}}}[\hat\Pi^\dagger(t,\vec{x}),\hat\Phi^\dagger(t,\vec{x}')] + \frac{1}{\sqrt{\omega_{\vec{k}}\omega_{\vec{k}'}}}[\hat\Pi^\dagger(t,\vec{x}),\hat\Pi(t,\vec{x}')]\Big) ~. \end{split}\] The commutators of \(\hat\Phi(t,\vec{x})\) with \(\hat\Phi^\dagger(t,\vec{x}')\) and \(\hat\Pi(t,\vec{x}')\) with \(\hat\Pi^\dagger(t,\vec{x})\) vanish, while the other two commutators give delta functions and we find \[\phantom{}[\hat a_{\vec{k}},\hat a_{\vec{k}'}^\dagger] = \frac12 \int d^3x \int d^3x' \, e^{it(\omega_{\vec{k}}-\omega_{\vec{k}'}) - i \vec{k}\cdot\vec{x} + i \vec{k}'\cdot\vec{x}'}\Big( \sqrt{\frac{\omega_{\vec{k}}}{\omega_{\vec{k}'}}} + \sqrt{\frac{\omega_{\vec{k}'}}{\omega_{\vec{k}}}}\Big)\delta^{(3)}(\vec{x} - \vec{x}') ~.\] Now we can use the delta function to integrate over \(\vec{x}'\), which amounts to setting \(\vec{x}'\) to \(\vec{x}\), yielding \[\phantom{}[\hat a_{\vec{k}},\hat a_{\vec{k}'}^\dagger] = \frac12 \int d^3x \, e^{it(\omega_{\vec{k}}-\omega_{\vec{k}'}) - i (\vec{k}-\vec{k'})\cdot\vec{x} }\Big( \sqrt{\frac{\omega_{\vec{k}}}{\omega_{\vec{k}'}}} + \sqrt{\frac{\omega_{\vec{k}'}}{\omega_{\vec{k}}}}\Big)~.\] Using the identity \(\eqref{eq:identity}\), the remaining integral over \(\vec{x}\) gives a delta function \[\phantom{}[\hat a_{\vec{k}},\hat a_{\vec{k}'}^\dagger] = \frac12 (2\pi)^3 \delta^{(3)}(\vec{k}-\vec{k}')e^{it(\omega_{\vec{k}}-\omega_{\vec{k}'})}\Big( \sqrt{\frac{\omega_{\vec{k}}}{\omega_{\vec{k}'}}} + 
\sqrt{\frac{\omega_{\vec{k}'}}{\omega_{\vec{k}}}}\Big)~.\] Finally, since \(\delta^{(3)}(\vec{k}-\vec{k}')\) only has support when \(\vec{k} = \vec{k}'\), i.e., it is zero if \(\vec{k} \neq \vec{k}'\), we can freely set \(\vec{k} = \vec{k}'\). Doing so the commutator simplifies to \[\phantom{}[\hat a_{\vec{k}},\hat a_{\vec{k}'}^\dagger] = (2\pi)^3 \delta^{(3)}(\vec{k}-\vec{k}') ~.\] Similarly, the other non-vanishing commutator is \[\phantom{}[\hat b_{\vec{k}},\hat b_{\vec{k}'}^\dagger] = (2\pi)^3 \delta^{(3)}(\vec{k}-\vec{k}') ~,\] with all the other commutators equal to zero.

4.2.3 The Hamiltonian and the energy of the vacuum

Let us recall the classical Hamiltonian for a free massive complex scalar field \[H = \int d^3x \,\big( \Pi\Pi^* + \vec{\nabla}\Phi^* \cdot \vec{\nabla}\Phi + m^2 \Phi^* \Phi \big)~.\] The quantum Hamiltonian is then \[\hat H = \int d^3x \,\big( \hat\Pi\hat\Pi^\dagger + \vec{\nabla}\hat\Phi^\dagger \cdot \vec{\nabla}\hat\Phi + m^2 \hat\Phi^\dagger \hat\Phi \big)~,\] which we construct by letting the field \(\Phi\) and its conjugate momentum \(\Pi\) become operators. Substituting in the Fourier expansions \(\eqref{eq:complexphi}\), \(\eqref{eq:complexphihat}\), \(\eqref{eq:complexpi}\) and \(\eqref{eq:complexpihat}\), we find \[\begin{equation} \label{eq:quantumhamiltonian} \hat H = \int \frac{d^3k}{(2\pi)^3} \, \omega_{\vec{k}} \big(\hat a_{\vec{k}}^\dagger \hat a_{\vec{k}} + \hat b_{\vec{k}} \hat b^\dagger_{\vec{k}} \big) ~. \end{equation}\] Using the commutator \([\hat b_{\vec{k}},\hat b_{\vec{k}'}^\dagger] = (2\pi)^3 \delta^{(3)}(\vec{k}-\vec{k}')\) this can be rewritten as \[\hat H = \int \frac{d^3k}{(2\pi)^3} \, \omega_{\vec{k}} \big(\hat a_{\vec{k}}^\dagger \hat a_{\vec{k}} + \hat b_{\vec{k}}^\dagger \hat b_{\vec{k}} +(2\pi)^3 \delta^{(3)}(0) \big) ~.\]

To construct the Hilbert space, we begin by defining the vacuum state \(|0\rangle\), which is the state that is annihilated by the annihilation operators \[\hat a_{\vec{k}} |0\rangle = \hat b_{\vec{k}} |0\rangle = 0 \quad \text{for all $\vec{k}$}~.\] This is the state that has no particles or antiparticles and is meant to represent empty space-time. Let us compute the energy of empty space-time, i.e. \(\hat H |0\rangle\). We have \[\hat H |0\rangle = \int \frac{d^3k}{(2\pi)^3} \, \omega_{\vec{k}} \big(\hat a_{\vec{k}}^\dagger \hat a_{\vec{k}} + \hat b_{\vec{k}}^\dagger \hat b_{\vec{k}} +(2\pi)^3 \delta^{(3)}(0) \big) |0\rangle = \int \frac{d^3k}{(2\pi)^3} \, \omega_{\vec{k}} \, (2\pi)^3 \delta^{(3)}(0) |0\rangle ~,\] hence the vacuum state \(|0\rangle\) is an energy eigenstate, but with infinite energy \[E_0 = \int \frac{d^3k}{(2\pi)^3} \, \omega_{\vec{k}} \, (2\pi)^3 \delta^{(3)}(0) ~.\] This energy is infinite in two ways. First, we have a delta function evaluated at 0. Recalling \[(2\pi)^3\delta^{(3)}(\vec{p}) = \lim_{V\to\infty} \int_V d^3x\, e^{i\vec{p}\cdot\vec{x}} ~,\] hence \[(2\pi)^3\delta^{(3)}(0) = \lim_{V\to\infty} \int_V d^3x\, 1 = \lim_{V\to\infty} V ~,\] we see that this infinity is related to the infinite volume of space. We can thus compute the vacuum energy density \[\epsilon_0 = \frac{E_0}{V} = \int \frac{d^3k}{(2\pi)^3} \, \omega_{\vec{k}} ~.\] This equation tells us that each Fourier mode with creation operators \(\hat a_{\vec{k}}^\dagger\) and \(\hat b_{\vec{k}}^\dagger\) contributes \(\omega_{\vec{k}}\) to the vacuum energy density, which is the contribution of two independent simple harmonic oscillators, each with zero-point energy \(\frac12\omega_{\vec{k}}\). Second, the vacuum energy density is itself infinite, since \(\omega_{\vec{k}}\) grows with \(|\vec{k}|\) and the integral over momentum space diverges. This second infinity is more complicated to deal with since it is related to large momenta and high energies. For now we will simply argue that only differences in energies are measurable, hence we ignore it.

To do this we introduce the idea of normal ordering. Given an operator \(\mathcal{O}\) that is written in terms of creation and annihilation operators, we define its normal ordered version \(\,:\!\mathcal{O}\!:\,\) to be the same expression except with all the creation operators moved to the left and all the annihilation operators moved to the right in each term. Therefore, the normal ordered version of the Hamiltonian \(\eqref{eq:quantumhamiltonian}\) is \[\begin{equation} \label{eq:quantumh} \,:\!\hat H\!:\, = \int \frac{d^3k}{(2\pi)^3} \, \omega_{\vec{k}} \big(\hat a_{\vec{k}}^\dagger \hat a_{\vec{k}} + \hat b_{\vec{k}}^\dagger \hat b_{\vec{k}} \big) ~, \end{equation}\] which annihilates the vacuum state \(|0\rangle\) \[\,:\!\hat H\!:\, |0\rangle = 0 ~,\] and the vacuum energy is now zero, i.e. \(E_0 = 0\).

This may seem somewhat ad hoc. However, it is important to note that the notion of quantization is ambiguous. We expect that classical physics is an approximation to quantum physics, not the other way round. Indeed, we could have written the classical Hamiltonian as \[H = \int d^3x \,\big( \Pi^*\Pi + \vec{\nabla}\Phi \cdot \vec{\nabla}\Phi^* + m^2 \Phi \Phi^* \big)~,\] or even \[\begin{split} H = \int \frac{d^3k}{(2\pi)^3} \, \frac12 \Big( & \big(\tilde \Pi(t,-\vec{k}) + i \omega_{\vec{k}} \tilde\Phi^*(t,-\vec{k}) \big) \big(\tilde \Pi^*(t,\vec{k}) - i \omega_{\vec{k}} \tilde\Phi(t,\vec{k}) \big) \\& \quad + \big(\tilde \Pi^*(t,-\vec{k}) + i \omega_{\vec{k}} \tilde\Phi(t,-\vec{k}) \big) \big(\tilde \Pi(t,\vec{k}) - i \omega_{\vec{k}} \tilde\Phi^*(t,\vec{k}) \big) \Big) ~. \end{split}\] When we canonically quantize, each expression can give a different quantum operator with a different vacuum energy. However, they will all have the same normal ordered form. This gives us a justification for ignoring the infinite vacuum energy. However, we should also be careful since there are physical setups where differences in the vacuum energies can be measured. An example of this is the Casimir force, which is a force acting on the boundaries of space arising from quantum fluctuations. Beyond free theories, removing infinities in quantum field theory is subtle and needs to be done consistently. The formalism of removing infinities is known as regularisation and renormalisation.
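To make the large-momentum divergence of the vacuum energy density concrete, one can cut the momentum integral off at \(|\vec{k}| = \Lambda\) and watch how the result grows with \(\Lambda\). The sketch below is only an illustration: the hard cutoff `cutoff` is an assumed regulator that does not appear in these notes, and the radial integral \(\frac{1}{2\pi^2}\int_0^\Lambda dk\, k^2 \sqrt{k^2+m^2}\), obtained by going to spherical coordinates, is evaluated with Simpson's rule.

```python
import math


def vacuum_energy_density(cutoff, mass, steps=2000):
    """Simpson estimate of (1 / (2 pi^2)) * int_0^cutoff k^2 sqrt(k^2 + m^2) dk.

    `cutoff` is a hypothetical hard momentum cutoff used only for illustration.
    """
    if steps % 2:
        steps += 1  # Simpson's rule needs an even number of intervals
    h = cutoff / steps

    def integrand(k):
        return k * k * math.sqrt(k * k + mass * mass)

    total = integrand(0.0) + integrand(cutoff)
    for i in range(1, steps):
        total += (4.0 if i % 2 else 2.0) * integrand(i * h)
    return total * h / 3.0 / (2.0 * math.pi**2)


if __name__ == "__main__":
    for cutoff in (10.0, 20.0, 40.0):
        print(cutoff, vacuum_energy_density(cutoff, mass=1.0))
```

For \(m = 0\) the integral is exactly \(\Lambda^4/(8\pi^2)\), and for \(\Lambda \gg m\) the massive result grows in the same way, which is why no finite limit exists as the cutoff is removed.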

4.2.4 Single particle states

Having discussed the vacuum state, let us now consider single particle states. For this it is useful to compute the commutator \[\begin{split} \phantom{}[\,:\!\hat H\!:\,,\hat a_{\vec{k}}^\dagger] & = \Big[ \int \frac{d^3k'}{(2\pi)^3} \, \omega_{\vec{k}'} \big(\hat a_{\vec{k}'}^\dagger \hat a_{\vec{k}'} + \hat b_{\vec{k}'}^\dagger \hat b_{\vec{k}'} \big) , \hat a_{\vec{k}}^\dagger \Big] \\ & = \int \frac{d^3k'}{(2\pi)^3} \, \omega_{\vec{k}'} \hat a_{\vec{k}'}^\dagger [\hat a_{\vec{k}'},\hat a_{\vec{k}}^\dagger] \\ & = \int \frac{d^3k'}{(2\pi)^3} \, \omega_{\vec{k}'} \hat a_{\vec{k}'}^\dagger (2\pi)^3\delta^{(3)}(\vec{k}-\vec{k}') \\ & = \omega_{\vec{k}} \hat a_{\vec{k}}^\dagger ~. \end{split}\] Similarly, we have \[\phantom{}[\,:\!\hat H\!:\,,\hat a_{\vec{k}}] = - \omega_{\vec{k}} \hat a_{\vec{k}} ~,\] and the same commutation relations for \(\hat b_{\vec{k}}^\dagger\) and \(\hat b_{\vec{k}}\) \[\phantom{}[\,:\!\hat H\!:\,,\hat b_{\vec{k}}^\dagger] = \omega_{\vec{k}} \hat b_{\vec{k}}^\dagger ~, \qquad \phantom{}[\,:\!\hat H\!:\,,\hat b_{\vec{k}}] = - \omega_{\vec{k}} \hat b_{\vec{k}} ~.\] At this point it is worth noting that these relations imply that the time dependence of the operators \(\hat\Phi(t,\vec{x})\), \(\hat\Phi^\dagger(t,\vec{x})\), \(\hat\Pi(t,\vec{x})\) and \(\hat\Pi^\dagger(t,\vec{x})\) in eqs. \(\eqref{eq:complexphi}\), \(\eqref{eq:complexphihat}\), \(\eqref{eq:complexpi}\) and \(\eqref{eq:complexpihat}\) agrees with the familiar time evolution of operators in the Heisenberg picture \[\hat{\mathcal{O}}(t,\vec{x}) = e^{i \,:\!\hat H\!:\, t} \, \hat{\mathcal{O}}(0,\vec{x}) \, e^{-i \,:\!\hat H\!:\, t} ~.\]

Now let us consider an excited state \(\hat a_{\vec{k}}^\dagger|0\rangle\) and compute its energy \[\,:\!\hat H\!:\, \hat a_{\vec{k}}^\dagger|0\rangle = [\,:\!\hat H\!:\, ,\hat a_{\vec{k}}^\dagger]|0\rangle + \hat a_{\vec{k}}^\dagger\,:\!\hat H\!:\,|0\rangle = \omega_{\vec{k}} \hat a_{\vec{k}}^\dagger |0\rangle ~,\] where the second term vanishes since \(\,:\!\hat H\!:\,\) annihilates the vacuum. The energy of this state is \[\omega_{\vec{k}} = \sqrt{\vec{k}^2 + m^2} ~,\] which is precisely the energy of a single relativistic particle with mass \(m\) and momentum \(\vec{k}\). Therefore, we interpret \(\hat a_{\vec{k}}^\dagger|0\rangle\) as a single particle state. An identical calculation shows that the energy of the state \(\hat b_{\vec{k}}^\dagger|0\rangle\) is also \(\omega_{\vec{k}}\).

To better understand the difference between the state \(\hat a_{\vec{k}}^\dagger|0\rangle\) and the state \(\hat b_{\vec{k}}^\dagger|0\rangle\) we recall that the action of a free massive complex scalar field has a \(\mathrm{U}(1)\) symmetry acting as \(\Phi \to e^{i\alpha}\Phi\). The associated classical conserved charge is \[Q = -i \int d^3x \, \big(\partial_t\Phi^* \Phi - \Phi^* \partial_t \Phi \big) = -i \int d^3x \, \big(\Pi \Phi - \Phi^* \Pi^* \big) ~,\] where by convention we have redefined \(Q \to - Q\) compared to the definition of the conserved charge in subsection 3.3.2. Promoting this to an operator in the quantum theory we have \[\hat Q = -i \int d^3x \,\big(\hat\Pi\hat\Phi - \hat\Phi^\dagger\hat\Pi^\dagger\big) ~.\] Substituting in the Fourier expansions \(\eqref{eq:complexphi}\), \(\eqref{eq:complexphihat}\), \(\eqref{eq:complexpi}\) and \(\eqref{eq:complexpihat}\), we find the normal ordered form of the \(\mathrm{U}(1)\) conserved charge \[\begin{equation} \label{eq:quantumq} \,:\!\hat Q\!:\, = \int \frac{d^3k}{(2\pi)^3} \, \big(\hat a_{\vec{k}}^\dagger \hat a_{\vec{k}} - \hat b_{\vec{k}}^\dagger \hat b_{\vec{k}} \big) ~. \end{equation}\] Classically this conserved charge is independent of time. We can check that it commutes with the Hamiltonian \[\phantom{}[\,:\!\hat H\!:\,,\,:\!\hat Q\!:\,] = 0 ~,\] as expected for an operator that is independent of time in the quantum theory. Since the charge and Hamiltonian commute, they can be simultaneously diagonalised. Acting with \(\,:\!\hat{Q}\!:\,\) on the vacuum and the single particle states we find \[\,:\!\hat Q\!:\, |0\rangle = 0 ~, \qquad \,:\!\hat Q\!:\, \hat a_{\vec{k}}^\dagger |0\rangle = + \hat a_{\vec{k}}^\dagger |0\rangle ~, \qquad \,:\!\hat Q\!:\, \hat b_{\vec{k}}^\dagger |0\rangle = - \hat b_{\vec{k}}^\dagger |0\rangle ~.\] Therefore, these states are also charge eigenstates. 
The vacuum state has charge \(0\), while the single particle states \(\hat a_{\vec{k}}^\dagger |0\rangle\) have charge \(+1\) and the single particle states \(\hat b_{\vec{k}}^\dagger |0\rangle\) have charge \(-1\). We say that \(\hat a_{\vec{k}}^\dagger\) creates particles while \(\hat b_{\vec{k}}^\dagger\) creates antiparticles.

We can also quantize the classical momentum \[P^i = - \int d^3 x\, \big(\partial_t \Phi^* \partial^i \Phi + \partial^i \Phi^* \partial_t \Phi \big) = - \int d^3 x\, \big(\Pi \partial^i \Phi + \partial^i \Phi^* \Pi^* \big) ~,\] which is the Noether charge associated to space translation symmetry. Doing so we find \[\,:\!\hat P^i\!:\, = \int \frac{d^3k}{(2\pi)^3} \, k^i \big(\hat a_{\vec{k}}^\dagger \hat a_{\vec{k}} + \hat b_{\vec{k}}^\dagger \hat b_{\vec{k}} \big) ~,\] and \[\,:\!\hat P^i\!:\, \hat a_{\vec{k}}^\dagger |0\rangle = k^{i} \hat a_{\vec{k}}^\dagger |0\rangle ~, \qquad \,:\!\hat P^i\!:\, \hat b_{\vec{k}}^\dagger |0\rangle = k^{i} \hat b_{\vec{k}}^\dagger |0\rangle ~.\] Since our single particle states are momentum eigenstates this means that they are delocalised in space.

To summarise:

  • The vacuum state \(|0\rangle\) is empty. Provided we measure energy and charge using normal ordering, it has zero energy and momentum, and zero charge.

  • The state \(\hat a_{\vec{k}}^\dagger |0\rangle\) is a single particle state interpreted as a particle, with energy \(\omega_{\vec{k}}\), mass \(m\), momentum \(\vec{k}\) and charge \(+1\).

  • The state \(\hat b_{\vec{k}}^\dagger |0\rangle\) is a single particle state interpreted as an antiparticle, with energy \(\omega_{\vec{k}}\), mass \(m\), momentum \(\vec{k}\) and charge \(-1\).

4.2.5 Multiparticle states and the Fock space

We can now construct the full space of multiparticle states by acting with the creation operators \(\hat a_{\vec{k}}^\dagger\) and \(\hat b_{\vec{k}}^\dagger\) multiple times. Consider the following state with \(N_a\) particles and \(N_b\) antiparticles \[\begin{equation} \label{eq:multiparticlestate} \hat a_{\vec{p}_1}^\dagger \hat a_{\vec{p}_2}^\dagger \dots \hat a_{\vec{p}_{N_a}}^\dagger \hat b_{\vec{q}_1}^\dagger \hat b_{\vec{q}_2}^\dagger \dots \hat b_{\vec{q}_{N_b}}^\dagger |0\rangle =|\{\vec{p}_1,\vec{p}_2,\dots,\vec{p}_{N_a}\};\{\vec{q}_1,\vec{q}_2,\dots,\vec{q}_{N_b}\}\rangle ~. \end{equation}\] The operators \[\hat{\mathcal{N}}_A = \int \frac{d^3k}{(2\pi)^3} \, \hat a_{\vec{k}}^\dagger \hat a_{\vec{k}} ~, \qquad \hat{\mathcal{N}}_B = \int \frac{d^3k}{(2\pi)^3} \, \hat b_{\vec{k}}^\dagger \hat b_{\vec{k}} ~,\] simply count the number of particles and the number of antiparticles respectively. Since \[\phantom{}[\,:\!\hat H\!:\,, \hat{\mathcal{N}}_A ] = [\,:\!\hat H\!:\,, \hat{\mathcal{N}}_B ] = 0 ~,\] the numbers of particles and antiparticles are independently conserved in a free theory. However, this is no longer the case when the theory becomes interacting.

The charge of the multiparticle state \(\eqref{eq:multiparticlestate}\) is then given by the difference of the number of particles and antiparticles. That is, the state \(\eqref{eq:multiparticlestate}\) is an eigenstate of the conserved charge \(\,:\!\hat Q\!:\,\) \(\eqref{eq:quantumq}\) with eigenvalue \[Q = N_a - N_b ~.\] Note that this charge will still be conserved in the interacting theory assuming the action remains invariant under the continuous \(\mathrm{U}(1)\) symmetry \(\eqref{eq:u1sym}\).

Similarly, the multiparticle state \(\eqref{eq:multiparticlestate}\) is an eigenstate of the Hamiltonian \(\,:\!\hat H\!:\,\) \(\eqref{eq:quantumh}\) and momentum operator \(\,:\!\hat P^i\!:\,\) with eigenvalues \[E = \sum_{j=1}^{N_a} \omega_{\vec{p}_j} + \sum_{j=1}^{N_b} \omega_{\vec{q}_j} ~, \qquad P^i = \sum_{j=1}^{N_a} p_j^i + \sum_{j=1}^{N_b} q_j^i ~,\] i.e., the sums of the energies and momenta of the constituent particles and antiparticles. Here \(j\) labels the particles and antiparticles, while \(i\) is the spatial index. We have constructed a space of states that allows the particle number to change. The state space is constructed by acting with creation operators in all possible ways on the vacuum state and is called the Fock space.
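The eigenvalues above can be tabulated for a concrete state. The following is a minimal sketch, assuming a multiparticle state specified by lists of 3-momenta; the function names and the numerical values are illustrative and are not part of the notes.

```python
import math

def omega(k, m):
    """Energy omega_k = sqrt(|k|^2 + m^2) of a single (anti)particle with 3-momentum k."""
    return math.sqrt(sum(c * c for c in k) + m * m)

def fock_state_eigenvalues(particle_momenta, antiparticle_momenta, m):
    """Return (E, P, Q) for the state with the given particle and antiparticle momenta."""
    all_momenta = list(particle_momenta) + list(antiparticle_momenta)
    # Energy and momentum add up over particles and antiparticles alike.
    energy = sum(omega(k, m) for k in all_momenta)
    momentum = tuple(sum(k[i] for k in all_momenta) for i in range(3))
    # The U(1) charge counts particles minus antiparticles.
    charge = len(particle_momenta) - len(antiparticle_momenta)
    return energy, momentum, charge

# Illustrative state: two particles and one antiparticle of mass m = 1.
E, P, Q = fock_state_eigenvalues(
    particle_momenta=[(3.0, 0.0, 0.0), (0.0, 0.0, 0.0)],
    antiparticle_momenta=[(-3.0, 0.0, 0.0)],
    m=1.0,
)
print(E, P, Q)
```

Note that antiparticles contribute positively to the energy and momentum, and negatively only to the charge.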

It is interesting to look at what happens if we interchange two particles. Consider the two-particle state \[|\{\vec{p}_1,\vec{p}_2\};\{\}\rangle = \hat a_{\vec{p}_1}^\dagger \hat a_{\vec{p}_2}^\dagger |0\rangle ~.\] Since creation operators commute we have \[|\{\vec{p}_2,\vec{p}_1\};\{\}\rangle = \hat a_{\vec{p}_2}^\dagger \hat a_{\vec{p}_1}^\dagger |0\rangle = \hat a_{\vec{p}_1}^\dagger \hat a_{\vec{p}_2}^\dagger |0\rangle = |\{\vec{p}_1,\vec{p}_2\};\{\}\rangle ~.\] Therefore, because the creation operators commute the state is unchanged if we interchange the two particles. This means that there is no meaning to the ordering of momenta in \(\eqref{eq:multiparticlestate}\) and there is no way to distinguish different particles.

Finally, the states we have constructed have definite momentum. We can also create states that have a definite position by taking suitable superpositions of momentum eigenstates. This is done by considering the original field \(\hat \Phi(t,\vec{x})\) \(\eqref{eq:complexphi}\) \[\hat \Phi(t,\vec{x}) = \int \frac{d^3k}{(2\pi)^3} \, \frac{1}{\sqrt{2\omega_{\vec{k}}}} \big(\hat a_{\vec{k}} e^{-i\omega_{\vec{k}}t + i \vec{k}\cdot \vec{x}} + \hat b^\dagger_{\vec{k}} e^{+i\omega_{\vec{k}}t -i \vec{k}\cdot \vec{x}} \big) ~.\] Acting with this operator on the vacuum state \(|0\rangle\) we find \[\hat \Phi(t,\vec{x}) |0\rangle = \int \frac{d^3k}{(2\pi)^3} \, \frac{1}{\sqrt{2\omega_{\vec{k}}}}\hat b^\dagger_{\vec{k}} e^{+i\omega_{\vec{k}}t -i \vec{k}\cdot \vec{x}}|0\rangle ~.\] The annihilation operator \(\hat a_{\vec{k}}\) annihilates the vacuum while the integral of \(\hat b^\dagger_{\vec{k}}|0\rangle\) creates a superposition of antiparticles. Therefore, \(\hat \Phi(t,\vec{x})\) creates an antiparticle at position \(\vec{x}\). Similarly, if we act with \(\hat \Phi^\dagger(t,\vec{x})\) \(\eqref{eq:complexphihat}\) on the vacuum state \(|0\rangle\) we find \[\hat \Phi^\dagger(t,\vec{x}) |0\rangle = \int \frac{d^3k}{(2\pi)^3} \, \frac{1}{\sqrt{2\omega_{\vec{k}}}}\hat a^\dagger_{\vec{k}} e^{+i\omega_{\vec{k}}t -i \vec{k}\cdot \vec{x}}|0\rangle ~,\] hence \(\hat \Phi^\dagger(t,\vec{x})\) creates a particle at position \(\vec{x}\).

4.3 Propagators and causality

We have now quantized a free massive complex scalar field and understood the Fock space of states and their energies and charges. Therefore, we are now ready to discuss how particles propagate from one point to another.

From this point on in the notes we will drop the hats on operators. It should be clear from context whether an object is an operator or a number.

4.3.1 Lorentz invariant integration measures

Before we discuss propagators in detail, let us return to the Lorentz covariance of our quantum operators. This will be important for ensuring that the expected properties of a relativistic theory, such as nothing propagating faster than the speed of light, are satisfied. We have seen that expressions such as \[\Phi(t,\vec{x}) = \int \frac{d^3k}{(2\pi)^3} \, \frac{1}{\sqrt{2\omega_{\vec{k}}}} \big( a_{\vec{k}} e^{-i\omega_{\vec{k}}t + i \vec{k}\cdot \vec{x}} + b^\dagger_{\vec{k}} e^{+i\omega_{\vec{k}}t -i \vec{k}\cdot \vec{x}} \big) ~,\] can be written in the form \[\Phi(t,\vec{x}) = \int \frac{d^3k}{(2\pi)^3} \, \frac{1}{\sqrt{2\omega_{\vec{k}}}} \big( a_{\vec{k}} e^{+i k_\mu x^\mu } + b^\dagger_{\vec{k}} e^{-i k_\mu x^\mu} \big) ~,\] where \(k^\mu = (\omega_{\vec{k}},\vec{k})^\mu\), casting the exponents in a manifestly Lorentz invariant form. However, the measure \(d^3k\) is not manifestly Lorentz invariant. Since we only integrate over the space components of the momentum it will transform by factors of \(\gamma\) under boosts.

To construct an invariant measure we consider the manifestly Lorentz invariant integral \[\int d^4 k \,\delta(-k^2 - m^2) \theta(k^0) ~,\] where \(k^2 = k_\mu k^\mu\) and \(\theta(x)\) is the Heaviside step function \[\theta(x) = \begin{cases} 1 ~, \qquad x \geq 0 ~, \\ 0 ~, \qquad x < 0 ~. \end{cases}\] Here we restrict to the connected subgroup of the Lorentz group so that \(k^0\) does not change sign under Lorentz transformations. This is the integral over four-momenta \(k^\mu\) constrained to lie on the Lorentz invariant surface \(k^2 + m^2 = 0\) with positive \(k^0\). This surface is a 3-dimensional hyperboloid in momentum space and is often referred to as the mass shell. We can rewrite the integral as \[\int d^3k \int dk^0 \,\delta((k^0)^2 - \omega_{\vec{k}}^2 ) \theta(k^0) ~.\] The argument of the delta function vanishes if \[k^0 = \pm \sqrt{\vec{k}^2 + m^2} = \pm \omega_{\vec{k}} ~.\] However, only the positive branch contributes due to the factor of \(\theta(k^0)\) in the integrand. Therefore, we have \[\int d^3k \int dk^0 \,\delta((k^0)^2 - \omega_{\vec{k}}^2 ) \theta(k^0) = \int d^3k \int dk^0 \,\delta(k^0 - \omega_{\vec{k}} ) \Big| \frac{d(k^0)^2}{dk^0}\Big|^{-1} = \int \frac{d^3 k}{2\omega_{\vec{k}}} ~.\] It follows that the measure \[\begin{equation} \label{eq:lorentzinvariantmeasure} \frac{d^3 k}{2\omega_{\vec{k}}} ~, \end{equation}\] is a Lorentz invariant measure that we can use to integrate over the 3-dimensional hyperboloid discussed above. 
Moreover, this implies that the Lorentz invariant delta function on the mass shell is \[2\omega_{\vec{k}} \delta^{(3)} (\vec{k} - \vec{k}') ~.\] We can then use the commutation relations \[\phantom{}[a_{\vec{k}},a_{\vec{k}'}^\dagger] = (2\pi)^3 \delta^{(3)}(\vec{k}-\vec{k}') ~, \qquad \phantom{}[b_{\vec{k}},b_{\vec{k}'}^\dagger] = (2\pi)^3 \delta^{(3)}(\vec{k}-\vec{k}') ~,\] to infer that the relativistically normalised creation and annihilation operators are given by \[\sqrt{2\omega_{\vec{k}}} a_{\vec{k}} ~, \qquad \sqrt{2\omega_{\vec{k}}} a_{\vec{k}}^\dagger ~, \qquad \sqrt{2\omega_{\vec{k}}} b_{\vec{k}} ~, \qquad \sqrt{2\omega_{\vec{k}}} b_{\vec{k}}^\dagger ~.\]
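The invariance of the measure \(d^3k/2\omega_{\vec{k}}\) can also be checked directly for a boost. The following is a minimal numerical sketch, assuming a boost with velocity \(\beta\) along the \(z\) direction, under which \(k_x\) and \(k_y\) are unchanged; invariance then requires the Jacobian \(\partial k_z'/\partial k_z\) to equal \(\omega'/\omega\). The function names and the sample momentum are illustrative.

```python
import math

def omega(kx, ky, kz, m):
    """On-shell energy sqrt(|k|^2 + m^2)."""
    return math.sqrt(kx * kx + ky * ky + kz * kz + m * m)

def boost_z(kx, ky, kz, m, beta):
    """Boost an on-shell momentum along z; returns (kz', omega')."""
    gamma = 1.0 / math.sqrt(1.0 - beta * beta)
    w = omega(kx, ky, kz, m)
    return gamma * (kz + beta * w), gamma * (w + beta * kz)

def jacobian_dkz(kx, ky, kz, m, beta, h=1e-6):
    """Central finite-difference estimate of d kz' / d kz at fixed kx, ky."""
    plus, _ = boost_z(kx, ky, kz + h, m, beta)
    minus, _ = boost_z(kx, ky, kz - h, m, beta)
    return (plus - minus) / (2.0 * h)

# Illustrative sample point and boost velocity.
kx, ky, kz, m, beta = 0.3, -1.2, 0.7, 1.0, 0.6
w = omega(kx, ky, kz, m)
kz_boosted, w_boosted = boost_z(kx, ky, kz, m, beta)
jac = jacobian_dkz(kx, ky, kz, m, beta)

# d^3k' / w' = jac * d^3k / w', so invariance requires jac / w' to equal 1 / w.
print(jac / w_boosted, 1.0 / w)
```

The two printed numbers agree up to the finite-difference error, reflecting that the factor of \(\gamma\) picked up by \(d^3k\) is compensated by the transformation of \(\omega_{\vec{k}}\).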

4.3.2 Commutators and the light-cone

In the process of quantization we have made use of the Hamiltonian formalism and a particular choice of time, or equivalently a choice of rest frame. In particular, our starting point was the canonical equal-time commutation relation \[\phantom{}[\Phi(t,\vec{x}),\Pi(t,\vec{y})] = i \delta^{(3)}(\vec{x}-\vec{y}) ~,\] with both operators evaluated at the same time. Therefore, it is not clear that the resulting quantum theory is Lorentz covariant and that our quantization procedure did not break this symmetry.

One way to test this is to consider operators that are timelike or spacelike separated. We expect that if an event happens at space-time point \(x\), then an event at space-time point \(y\) can affect it if they are timelike separated, i.e. \((x-y)^2 < 0\), and cannot affect it if they are spacelike separated, i.e. \((x-y)^2 > 0\). The surface separating these two regions is a cone in \(\mathbb{R}^{1,3}\) known as the light-cone. Since Lorentz invariance requires that no information propagates faster than the speed of light, it follows that any two measurements in the quantum theory that are spacelike separated should not affect each other, that is they commute \[\phantom{}[\mathcal{O}_1(x),\mathcal{O}_2(y)] = 0 ~, \qquad (x-y)^2 > 0 ~.\] Note that here the two operators are not necessarily evaluated at the same time. This condition is satisfied by the canonical equal-time commutation relations since the right-hand side is only ever non-zero if \(x = y\), which is on the light-cone.

Let us now consider a non-trivial example. From the Fourier expansions \(\eqref{eq:complexphi}\) and \(\eqref{eq:complexphihat}\) we find that \[\begin{equation} \label{eq:commutatortest} \phantom{}[\Phi(x),\Phi^\dagger(y)] = \int \frac{d^3k}{(2\pi)^32\omega_{\vec{k}}} \, \big(e^{+ i k\cdot (x-y)} - e^{-i k \cdot (x-y)} \big) ~, \end{equation}\] where we recall that \(k^\mu = (\omega_{\vec{k}},\vec{k})^\mu\). The first term comes from the commutator of \(a_{\vec{k}}\) and \(a_{\vec{k}}^\dagger\), while the second comes from the commutator of \(b_{\vec{k}}\) and \(b_{\vec{k}}^\dagger\). The right-hand side of \(\eqref{eq:commutatortest}\) is not immediately zero. It is proportional to the identity operator, which is a consequence of considering a free theory and it will typically be more complicated in an interacting theory. To simplify the computation of the integral, we note that it is Lorentz covariant since it is built from the Lorentz invariant measure \(\eqref{eq:lorentzinvariantmeasure}\) and the scalar product \(k \cdot (x-y) = k_\mu (x-y)^\mu\). Therefore, we can use a Lorentz transformation to pick \(x-y\) to lie in a particular direction and use Lorentz covariance to reconstruct the full result.

For timelike separated \(x\) and \(y\) we can choose \(x-y\) to point in the time direction, i.e., there exists a Lorentz transformation such that \[\Lambda^\mu{}_\nu (x-y)^\nu = (x'{}^0-y'{}^0 ,0,0,0)^\mu ~.\] The integral on the right-hand side of the commutator \(\eqref{eq:commutatortest}\) becomes \[\int \frac{d^3k}{(2\pi)^32\omega_{\vec{k}}} \, \big(e^{-i \omega_{\vec{k}} (x'{}^0-y'{}^0)} - e^{+ i \omega_{\vec{k}} (x'{}^0-y'{}^0)} \big) ~,\] which is a non-vanishing oscillatory function of \((x'{}^0-y'{}^0)\).

For spacelike separated \(x\) and \(y\) we can choose \(x-y\) to be zero in the time direction, i.e., there exists a Lorentz transformation such that \[\Lambda^\mu{}_\nu (x-y)^\nu = (0,\vec{x}'-\vec{y}')^\mu ~.\] The integral on the right-hand side of the commutator \(\eqref{eq:commutatortest}\) becomes \[\int \frac{d^3k}{(2\pi)^32\omega_{\vec{k}}} \, \big(e^{+i \vec{k} \cdot (\vec{x}'-\vec{y}')} - e^{- i \vec{k}\cdot(\vec{x}'-\vec{y}')} \big) ~.\] Changing variables \(\vec{k} \to - \vec{k}\) in the second term gives \[\int \frac{d^3k}{(2\pi)^32\omega_{\vec{k}}} \, \big(e^{+i \vec{k} \cdot (\vec{x}'-\vec{y}')} - e^{+ i \vec{k}\cdot(\vec{x}'-\vec{y}')} \big) = 0 ~,\] where we recall that \(\omega_{-\vec{k}} = \omega_{\vec{k}}\). By Lorentz covariance we find that \[\phantom{}[\Phi(x),\Phi^\dagger(y)] = 0 ~, \qquad (x-y)^2 > 0 ~,\] as required. The same turns out to be true for any commutator of spacelike separated local operators. Moreover it holds in both free and interacting theories. In quantum field theory information cannot propagate outside of the light-cone and we say that the theory is causal.

4.3.3 Particle propagation

An alternative way to probe the causal structure of quantum field theory is to consider a particle at space-time point \(x\) and compute the amplitude for finding it at a different space-time point \(y\). One answer to this question is given by the overlap \[D(y-x) = \langle 0| \Phi(y) \Phi^\dagger(x) |0\rangle ~,\] where \(\langle 0| = |0\rangle^\dagger\) is the conjugate transpose of the vacuum state, which by definition satisfies \[\langle 0| a^\dagger_{\vec{k}} =\langle 0| b^\dagger_{\vec{k}} = 0 \quad \text{for all $\vec{k}$}~.\] Therefore, we have \[\begin{split} \Phi^\dagger(t,\vec{x}) |0\rangle & = \int \frac{d^3k}{(2\pi)^3} \, \frac{1}{\sqrt{2\omega_{\vec{k}}}} a_{\vec{k}}^\dagger e^{+i\omega_{\vec{k}}t - i\vec{k}\cdot\vec{x}} |0\rangle ~, \\ \langle 0|\Phi(t,\vec{x}) & = (\Phi^\dagger(t,\vec{x}) |0\rangle)^\dagger = \langle 0|\int \frac{d^3k}{(2\pi)^3} \, \frac{1}{\sqrt{2\omega_{\vec{k}}}} a_{\vec{k}} e^{-i\omega_{\vec{k}}t + i\vec{k}\cdot\vec{x}} ~, \end{split}\] hence \[D(y-x) = \int \frac{d^3k}{(2\pi)^3} \, \frac{d^3k'}{(2\pi)^3} \frac{1}{\sqrt{2\omega_{\vec{k}}}\sqrt{2\omega_{\vec{k}'}}} e^{+ i k' \cdot y -i k\cdot x } \langle 0| a_{\vec{k}'} a_{\vec{k}}^\dagger |0\rangle ~.\] Commuting \(a_{\vec{k}'}\) past \(a_{\vec{k}}^\dagger\) we find \[D(y-x) = \int \frac{d^3k}{(2\pi)^32\omega_{\vec{k}}} \, e^{+i k\cdot(y-x)} ~.\] This function is known as the propagator or correlator and again it is Lorentz covariant. Note that the propagator for antiparticles is the same as for particles \[\langle 0| \Phi^\dagger(y) \Phi(x) |0\rangle = \langle 0| \Phi(y) \Phi^\dagger(x) |0\rangle = D(y-x) ~,\] since \(b_{\vec{k}'}\) and \(b_{\vec{k}}^\dagger\) obey the same commutation relation as \(a_{\vec{k}'}\) and \(a_{\vec{k}}^\dagger\).

It is possible to evaluate \(D(y-x)\) explicitly. Unlike the commutator \(\eqref{eq:commutatortest}\), it does not vanish when \(x\) and \(y\) are spacelike separated. Choosing \(y^0 - x^0 = 0\), for large spacelike separations, \(|\vec{y}-\vec{x}| \to \infty\), it decays exponentially as \[D(y-x) \sim \exp(-m|\vec{y}-\vec{x}|) ~.\] The fact that \(D(y-x)\) is non-vanishing tells us that there is entanglement in the vacuum of a quantum field theory. In particular, there is quantum entanglement between different points in space.

However, as we have already seen, this does not mean that it is possible to send information faster than the speed of light. In particular, we have \[\begin{split} 0 = \langle 0| [\Phi(x),\Phi^\dagger(y)] |0\rangle & = \langle 0|\Phi(x)\Phi^\dagger(y)|0\rangle - \langle 0|\Phi^\dagger(y) \Phi(x)|0\rangle \\ & = D(x-y) - D(y-x) ~, \qquad (x-y)^2 > 0 ~. \end{split}\] This tells us that the amplitude for a particle to propagate from \(y\) to \(x\) is the same as the amplitude for a particle to propagate from \(x\) to \(y\) if \(x\) and \(y\) are spacelike separated. It follows that, in any measurement, the amplitudes for these two events cancel. The same holds for real scalar fields except now the particle is its own antiparticle.

4.4 Feynman propagator

It turns out that the most useful propagator in quantum field theory is not the propagator \(D(y-x)\), but the Feynman propagator. To define the Feynman propagator we introduce the notion of time ordering, which orders operators by time with operators evaluated at later times placed to the left and operators evaluated at earlier times placed to the right \[\overset{\leftarrow}{\mathrm{T}}(\mathcal{O}_1(y)\mathcal{O}_2(x)) = \begin{cases} \mathcal{O}_1(y)\mathcal{O}_2(x) ~, \qquad y^0 > x^0 ~, \\ \mathcal{O}_2(x)\mathcal{O}_1(y) ~, \qquad y^0 < x^0 ~. \end{cases}\] We can also write this as \[\overset{\leftarrow}{\mathrm{T}}(\mathcal{O}_1(y)\mathcal{O}_2(x)) = \theta(y^0 - x^0) \mathcal{O}_1(y)\mathcal{O}_2(x) + \theta(x^0 - y^0) \mathcal{O}_2(x)\mathcal{O}_1(y) ~,\] where \(\theta(x)\) is the Heaviside step function \[\theta(x) = \begin{cases} 1 ~, \qquad x \geq 0 ~, \\ 0 ~, \qquad x < 0 ~. \end{cases}\]

The Feynman propagator is defined to be \[G(y-x) = \langle 0|\overset{\leftarrow}{\mathrm{T}}(\Phi(y)\Phi^\dagger(x))|0\rangle ~,\] i.e., the expectation value of the time ordered product of \(\Phi\) and \(\Phi^\dagger\). We will return to the question of why this is a useful definition when we discuss interacting quantum field theories and scattering. The Feynman propagator is therefore given by \[G(y-x) = \theta(y^0 - x^0) D(y-x) + \theta(x^0 - y^0) D(x-y) ~,\] i.e., if \(y^0\) is in the future of \(x^0\) it is the amplitude for a particle to propagate from \(x\) to \(y\), and if \(x^0\) is in the future of \(y^0\) it is the amplitude for a particle to propagate from \(y\) to \(x\). Substituting in the Fourier expansions \(\eqref{eq:complexphi}\) and \(\eqref{eq:complexphihat}\) we find \[\begin{equation} \begin{split}\label{eq:feynmanprop3} G(y-x) & = \theta(y^0 - x^0) \int \frac{d^3k}{(2\pi)^32\omega_{\vec{k}}} \, e^{+i k\cdot(y-x)} + \theta(x^0 - y^0) \int \frac{d^3k}{(2\pi)^32\omega_{\vec{k}}} \, e^{+i k\cdot(x-y)} \\ & = \theta(y^0 - x^0) \int \frac{d^3k}{(2\pi)^32\omega_{\vec{k}}}\, e^{-i\omega_{\vec{k}}(y^0-x^0) + i \vec{k}\cdot(\vec{y}-\vec{x})} \\ & \quad + \theta(x^0 - y^0) \int \frac{d^3k}{(2\pi)^32\omega_{\vec{k}}} \, e^{-i\omega_{\vec{k}}(x^0-y^0) + i \vec{k}\cdot(\vec{x}-\vec{y})} ~. \end{split} \end{equation}\] Therefore, if \(y^0 > x^0\) then the time dependence of the Feynman propagator is \(e^{-i\omega_{\vec{k}}(y^0-x^0)}\), while if \(x^0 > y^0\) then the time dependence is \(e^{-i\omega_{\vec{k}}(x^0-y^0)}\). This means that the Feynman propagator oscillates with the positive-frequency wave propagating into the future. We cannot use the space dependence to distinguish the behaviour in the two regions since we can change the sign in the exponent by the change of variables \(\vec{k} \to -\vec{k}\). 
Note that the Feynman propagator satisfies \(G(x-y) = G(y-x)\) by definition, and again it is the same for antiparticles and particles \[\langle 0|\overset{\leftarrow}{\mathrm{T}}(\Phi(y)\Phi^\dagger(x))|0\rangle = \langle 0|\overset{\leftarrow}{\mathrm{T}}(\Phi^\dagger(y)\Phi(x))|0\rangle = G(y-x) ~.\]

It turns out that we can write the Feynman propagator as an integral over the four-momenta \(k^\mu\), with the expression \(\eqref{eq:feynmanprop3}\) recovered by integrating over \(k^0\). Explicitly, we have \[\begin{equation} \label{eq:feynmanprop4} G(y-x) = -i \int \frac{d^4k}{(2\pi)^4}\, \frac{e^{+i k\cdot(y-x)}}{k^2 + m^2} ~, \end{equation}\] with a suitable prescription to ensure the integral over \(k^0\) is well-defined.

To integrate over \(k^0\) we extend to the complex plane and use contour integrals. Recall that, if we have a meromorphic function \(f(z)\) of a complex variable \(z\), then the integral over an anticlockwise closed contour \(\Gamma\) in the complex plane is \[\begin{equation} \label{eq:residueintegral} \oint_\Gamma dz\, f(z) = 2\pi i \sum_{i} \mathop{\mathrm{res}}_{z_i} f(z) ~, \end{equation}\] where \(i\) runs over the poles of \(f(z)\) inside \(\Gamma\), which are located at \(z=z_i\), and \(\mathop{\mathrm{res}}_{z_i} f(z)\) denotes the residue of \(f(z)\) at \(z=z_i\). In particular, we have \[\oint_{|z| = 1} dz \, \frac{1}{z} = 2\pi i ~,\] since the residue of \(z^{-1}\) at \(z=0\) is equal to \(1\). The integral over a clockwise closed contour is minus the integral over the anticlockwise closed contour.
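The residue formula \(\eqref{eq:residueintegral}\) is easy to test numerically. The following is a minimal sketch, assuming we discretise the unit circle into equally spaced points; the function name and the test functions other than \(1/z\) are illustrative choices.

```python
import cmath
import math

def contour_integral_unit_circle(f, n=2000):
    """Approximate the anticlockwise integral of f over |z| = 1 using n sample points."""
    total = 0j
    for j in range(n):
        theta = 2.0 * math.pi * j / n
        z = cmath.exp(1j * theta)
        dz = 1j * z * (2.0 * math.pi / n)  # dz = i z dtheta
        total += f(z) * dz
    return total

# Simple pole at z = 0 with residue 1: expect 2 pi i.
I1 = contour_integral_unit_circle(lambda z: 1.0 / z)

# Simple pole at z = 1/2 inside the contour with residue e^{1/2}: expect 2 pi i e^{1/2}.
I2 = contour_integral_unit_circle(lambda z: cmath.exp(z) / (z - 0.5))

# Pole at z = 2 lies outside the contour: expect 0.
I3 = contour_integral_unit_circle(lambda z: 1.0 / (z - 2.0))

print(I1, I2, I3)
```

Only the poles enclosed by the contour contribute, which is why the choice of how to go around the poles matters in what follows.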

We can write the integral \(\eqref{eq:feynmanprop4}\) as \[G(y-x) = -i \int \frac{d^3k}{(2\pi)^3} \int_{-\infty}^{\infty} \frac{dk^0}{2\pi} \, \frac{e^{- i k^0 (y^0-x^0)+ i \vec{k}\cdot(\vec{y}-\vec{x})}}{-(k^0)^2 + \vec{k}^2 + m^2} ~.\] Let us consider the integral over \(k^0\) \[I(\vec{k}) = -i \int_{-\infty}^{\infty} \frac{dk^0}{2\pi} \, \frac{e^{- i k^0 (y^0-x^0)+ i \vec{k}\cdot(\vec{y}-\vec{x})}}{-(k^0)^2 + \vec{k}^2 + m^2} = i \int_{-\infty}^{\infty} \frac{dk^0}{2\pi} \, \frac{e^{- i k^0 (y^0-x^0)+ i \vec{k}\cdot(\vec{y}-\vec{x})}}{(k^0 + \omega_{\vec{k}})(k^0 - \omega_{\vec{k}})} ~.\] We see that the integrand has singularities at \(k^0 = \pm \omega_{\vec{k}}\). These singularities mean that the integral is not well-defined as an ordinary real integral. To define it, we let \(k^0\) be a complex variable and rewrite \(I(\vec{k})\) as a contour integral in the complex plane. We can now interpret the singularities as poles of a meromorphic function and give a prescription for how to go around the poles.

The Feynman propagator is defined by using the following contour

That is, we go below the pole at \(-\omega_{\vec{k}}\) and above the pole at \(+\omega_{\vec{k}}\). To use eq. \(\eqref{eq:residueintegral}\) we need a closed contour. We can either close the contour in the upper half plane or the lower half plane, i.e., \(k^0 \to +i\infty\) or \(k^0 \to - i\infty\) respectively. We make this choice by requiring that the contribution from this part of the contour vanishes.

If \(y^0 > x^0\) then we close the contour in the lower half plane since \(e^{-i k^0 (y^0 - x^0)}\) decays exponentially as \(k^0 \to -i\infty\). This means that we have a clockwise closed contour going around the pole at \(k^0 = +\omega_{\vec{k}}\). Therefore, using eq. \(\eqref{eq:residueintegral}\) with an overall minus sign due to the clockwise contour, we find \[I(\vec{k}) = + \mathop{\mathrm{res}}_{k^0 = + \omega_{\vec{k}}} \frac{e^{- i k^0 (y^0-x^0)+ i \vec{k}\cdot(\vec{y}-\vec{x})}}{(k^0 + \omega_{\vec{k}})(k^0 - \omega_{\vec{k}})} = \frac{e^{- i \omega_{\vec{k}} (y^0-x^0)+ i \vec{k}\cdot(\vec{y}-\vec{x})}}{2\omega_{\vec{k}}} ~, \qquad y^0 > x^0 ~.\]

On the other hand, if \(x^0 > y^0\) then we close the contour in the upper half plane since \(e^{-i k^0 (y^0 - x^0)}\) decays exponentially as \(k^0 \to +i\infty\). This means that we have an anticlockwise closed contour going around the pole at \(k^0 = -\omega_{\vec{k}}\). Therefore, using eq. \(\eqref{eq:residueintegral}\), we find \[I(\vec{k}) = - \mathop{\mathrm{res}}_{k^0 = - \omega_{\vec{k}}} \frac{e^{- i k^0 (y^0-x^0)+ i \vec{k}\cdot(\vec{y}-\vec{x})}}{(k^0 + \omega_{\vec{k}})(k^0 - \omega_{\vec{k}})} = \frac{e^{- i \omega_{\vec{k}} (x^0-y^0)- i \vec{k}\cdot(\vec{x}-\vec{y})}}{2\omega_{\vec{k}}} ~, \qquad x^0 > y^0 ~.\]

Together, this gives \[\begin{split} G(y-x) & = -i \int \frac{d^4k}{(2\pi)^4} \, \frac{e^{+i k\cdot(y-x)}}{k^2 + m^2} = \int \frac{d^3k}{(2\pi)^3} \, I(\vec{k}) \\ & = \theta(y^0 - x^0) \int \frac{d^3k}{(2\pi)^32\omega_{\vec{k}}} \, e^{-i\omega_{\vec{k}}(y^0-x^0) + i \vec{k}\cdot(\vec{y}-\vec{x})} \\ & \quad + \theta(x^0 - y^0) \int \frac{d^3k}{(2\pi)^32\omega_{\vec{k}}} \, e^{-i\omega_{\vec{k}}(x^0-y^0) + i \vec{k}\cdot(\vec{x}-\vec{y})} ~, \end{split}\] where we have made the change of variables \(\vec{k} \to - \vec{k}\) in the second term. As claimed, we have recovered the form \(\eqref{eq:feynmanprop3}\) of the Feynman propagator, i.e., written as an integral over \(\vec{k}\).
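The result of the \(k^0\) integral can be checked numerically. The following is a minimal sketch, assuming we regulate the poles by replacing \(\omega_{\vec{k}}^2 \to \omega_{\vec{k}}^2 - i\epsilon\) with a small but finite \(\epsilon\), which moves the poles off the real axis in the same way as the contour above, and assuming we drop the spatial factor \(e^{i\vec{k}\cdot(\vec{y}-\vec{x})}\). With \(T = y^0 - x^0\) and \(\kappa = \sqrt{\omega^2 - i\epsilon}\), the residue computation then gives \(e^{-i\kappa|T|}/2\kappa\) for both signs of \(T\). The function name, cutoff and step size are illustrative.

```python
import cmath
import math

def k0_integral(T, omega, eps, L=400.0, h=0.002):
    """Trapezoidal estimate of i * int dk0/(2 pi) e^{-i k0 T} / ((k0)^2 - omega^2 + i eps)
    over the real interval [-L, L]."""
    n = int(round(2.0 * L / h))
    total = 0j
    for j in range(n + 1):
        k0 = -L + j * h
        weight = 0.5 if j in (0, n) else 1.0
        total += weight * cmath.exp(-1j * k0 * T) / (k0 * k0 - omega * omega + 1j * eps)
    return 1j * total * h / (2.0 * math.pi)

# Illustrative values; eps is finite so the integrand is smooth on the real axis.
omega, eps, T = 1.0, 0.2, 1.3
kappa = cmath.sqrt(omega**2 - 1j * eps)            # pole with negative imaginary part
exact = cmath.exp(-1j * kappa * abs(T)) / (2.0 * kappa)

later = k0_integral(+T, omega, eps)    # y^0 > x^0: lower half plane closure
earlier = k0_integral(-T, omega, eps)  # x^0 > y^0: upper half plane closure
print(abs(later - exact), abs(earlier - exact))
```

Both differences are small, and as \(\epsilon \to 0\) the closed form reduces to the time dependence \(e^{-i\omega_{\vec{k}}|y^0-x^0|}/2\omega_{\vec{k}}\) found above.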

The integral \(\eqref{eq:feynmanprop4}\), together with a suitable integration contour, is therefore a manifestly Lorentz covariant form of the Feynman propagator. To indicate that we are using the contour that gives the Feynman propagator we write \[\begin{equation} \label{eq:iepsilon} G(y-x) = -i \int \frac{d^4k}{(2\pi)^4} \, \frac{e^{+i k\cdot(y-x)}}{k^2 + m^2 - i\epsilon} ~, \end{equation}\] where \(\epsilon\) is a positive infinitesimal quantity. The presence of \(i\epsilon\) gives the location of the poles a small imaginary part. If we now integrate over \(k^0 \in \mathbb{R}\) this has the same effect as going around the poles as we did above. Solving for the locations of the poles we find \[(k^0)^2 = \omega_{\vec{k}}^2 - i\epsilon \quad \Rightarrow \quad k^0 = \pm \omega_{\vec{k}} \sqrt{1-\frac{i\epsilon}{\omega_{\vec{k}}^2}} = \pm \omega_{\vec{k}} \mp \frac{i\epsilon}{2\omega_{\vec{k}}} + \mathcal{O}(\epsilon^2) ~.\] Therefore, the pole at \(k^0 = +\omega_{\vec{k}}\) picks up a small negative imaginary part and the pole at \(k^0 = - \omega_{\vec{k}}\) picks up a small positive imaginary part.

This prescription is often called the Feynman or \(i\epsilon\) prescription.
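The displacement of the poles can be confirmed with a few lines of complex arithmetic. The following is a minimal sketch, assuming illustrative values of \(\omega_{\vec{k}}\) and \(\epsilon\); the function name is hypothetical.

```python
import cmath

def feynman_poles(omega, eps):
    """Solve (k0)^2 = omega^2 - i eps; returns the poles near +omega and -omega."""
    root = cmath.sqrt(omega**2 - 1j * eps)  # principal branch: positive real part
    return root, -root

omega, eps = 2.0, 1e-3
k_plus, k_minus = feynman_poles(omega, eps)

# Leading-order expansion: +omega - i eps / (2 omega) and -omega + i eps / (2 omega).
approx_plus = omega - 1j * eps / (2.0 * omega)
approx_minus = -omega + 1j * eps / (2.0 * omega)

print(k_plus, k_minus)
print(abs(k_plus - approx_plus), abs(k_minus - approx_minus))
```

The pole near \(+\omega_{\vec{k}}\) lies just below the real axis and the pole near \(-\omega_{\vec{k}}\) just above it, so integrating along the real axis is equivalent to going above the former and below the latter.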

The Feynman propagator is also a Green’s function for the Klein-Gordon equation \[(\partial^2 - m^2)\phi(x) = 0 ~,\] where \(\partial^2 = \partial_\mu \partial^\mu\). Explicitly, we have \[\begin{split} (\partial^2 - m^2)G(y-x) & = - i (\partial^2- m^2) \int \frac{d^4k}{(2\pi)^4} \, \frac{e^{+i k\cdot(y-x)}}{k^2 + m^2 - i\epsilon} \\ & = -i \int \frac{d^4k}{(2\pi)^4} \, e^{+i k\cdot(y-x)}\frac{-k^2 - m^2}{k^2 + m^2 - i\epsilon} \\ & = i \int \frac{d^4k}{(2\pi)^4} \, e^{+i k\cdot(y-x)} = i \delta^{(4)}(y-x) ~. \end{split}\] Note that we did not make use of the integration contour in this derivation. Choosing different integration contours we can define different propagators or Green’s functions. The retarded propagator corresponds to going above both poles, or equivalently giving both poles a small negative imaginary part, i.e., using the following contour

The advanced propagator corresponds to going below both poles, or equivalently giving both poles a small positive imaginary part, i.e., using the following contour

The different Green’s functions can be used to find classical solutions of the Klein-Gordon PDE in different physical settings. The retarded propagator is given by \[G_R(y-x) = \theta(y^0 - x^0) (D(y-x) - D(x-y)) ~,\] while the advanced propagator is \[G_A(y-x) = \theta(x^0 - y^0) (D(x-y) - D(y-x)) ~,\] hence \(G_A(y-x) = G_R(x-y)\).

5 Interacting Quantum Field Theories

Now that we have quantized a free massive complex scalar field, we turn to quantum field theories with interactions. We will mostly focus on interacting theories of real scalar fields. The action of a free massive real scalar field is \[S[\phi] = \int d^4x\,\Big(-\frac12 \partial_\mu\phi\partial^\mu\phi-\frac{m^2}{2} \phi^2 \Big) ~.\] Quantizing, we find the following Fourier expansion \[\begin{equation} \label{eq:realscalar} \phi(t,\vec{x}) = \int \frac{d^3k}{(2\pi)^3} \frac{1}{\sqrt{2\omega_{\vec{k}}}} \big(a_{\vec{k}} e^{-i \omega_{\vec{k}} t + i \vec{k}\cdot \vec{x}} + a^\dagger_{\vec{k}} e^{+i \omega_{\vec{k}} t - i \vec{k}\cdot \vec{x}} \big) ~. \end{equation}\] For the real scalar field the particle is its own antiparticle, the creation and annihilation operators satisfy the usual commutation relation \[\phantom{}[a_{\vec{k}},a^\dagger_{\vec{k}'}] = (2\pi)^3\delta^{(3)}(\vec{k}-\vec{k}') ~,\] and the normal ordered Hamiltonian takes the form \[\,:\!H_0\!:\, = \int \frac{d^3k}{(2\pi)^3} \, \omega_{\vec{k}} a^\dagger_{\vec{k}} a_{\vec{k}} ~.\] From now on we will denote the normal ordered Hamiltonian by \(H_0\). Introducing the vacuum state annihilated by \(a_{\vec{k}}\), and single particle and multiparticle states by acting with \(a_{\vec{k}}^\dagger\), we construct the state space, or Fock space. These states are eigenstates of the Hamiltonian and we can compute its eigenvalues using \[\phantom{}[H_0,a_{\vec{k}}] = -\omega_{\vec{k}} a_{\vec{k}} ~, \qquad [H_0,a^\dagger_{\vec{k}}] = \omega_{\vec{k}} a^\dagger_{\vec{k}} ~.\] Therefore, we know the spectrum of the Hamiltonian and the time dependence of the quantum field, given by eq. \(\eqref{eq:realscalar}\), and its conjugate momentum in the Heisenberg picture.

The simplest interacting quantum field theory is called \(\phi^4\) theory, which has the action \[S[\phi] = \int d^4x\,\Big(-\frac12 \partial_\mu\phi\partial^\mu\phi-\frac{m^2}{2} \phi^2 - \frac{\lambda}{4!} \phi^4 \Big) ~.\] The action is no longer quadratic in the field \(\phi\) and the quartic term allows the particles to interact with each other. The parameter \(\lambda\) is known as a coupling constant and controls the strength of the interaction. The larger \(\lambda\) is, the stronger the interaction. It is not known how to solve general interacting quantum field theories exactly in the coupling constant. Therefore, we assume that \(\lambda\) is small and series expand in powers of \(\lambda\). At leading order we have a free massive real scalar field, which we know how to solve. We can then try to solve the interacting theory order by order in \(\lambda\). This is known as perturbation theory.

More generally, we could consider the following interacting quantum field theory for a massive real scalar field \[\begin{equation} \label{eq:polynomial_potential} S[\phi] = \int d^4x\,\Big(-\frac12 \partial_\mu\phi\partial^\mu\phi-\frac{m^2}{2} \phi^2 - \sum_{n\geq 3} \frac{\lambda_n}{n!} \phi^n \Big) ~, \end{equation}\] with coupling constants \(\lambda_n\). Since we have set \(\hbar = c = 1\), we can measure times, lengths and masses in terms of a single mass scale. We denote the mass dimension of a quantity by square brackets. In particular, we have \([\text{mass}] = - [\text{time}] = - [\text{length}] = 1\) so that \([\hbar]=[c]=0\). The action should have the same units as \(\hbar\), which means it should have vanishing mass dimension, i.e. \([S] = 0\). Since \(x\) is a point in space-time we have \([x] = -1\), which means that \([d^4x] = -4\) and \([\partial_\mu] = 1\). It follows that the mass dimensions of the field \(\phi\), the mass \(m\) and the coupling constants \(\lambda_n\) are \[\phantom{}[\phi] = 1 ~, \qquad \phantom{}[m] = 1 ~, \qquad \phantom{}[\lambda_n] = 4-n ~.\] Since the couplings carry mass dimension, to make sense of perturbation theory we need the coupling constants to be small with respect to something with the same mass dimension. If we are considering a process at some energy scale \(E\), then the dimensionless quantity of interest is \[\begin{equation} \label{eq:combination} \lambda_n E^{n-4} ~. \end{equation}\] We see that for \(n < 4\), this combination is small at high energies and large at low energies. We call such couplings, i.e. \(\lambda_3\), relevant since they are relevant at low energies. Since in a relativistic theory we have \(E \geq m\), we can make this combination small by taking \(\lambda_3 \ll m\). For \(n = 4\), the combination \(\eqref{eq:combination}\) does not depend on the energy scale. We say that the coupling is marginal and we can demand \(\lambda_4 \ll 1\). 
For \(n > 4\), the combination \(\eqref{eq:combination}\) is large at high energies and small at low energies. We call such couplings, i.e. \(\lambda_{5,6,\dots}\), irrelevant since they are not relevant at low energies. Typically, we cannot avoid high-energy processes in quantum field theory and irrelevant operators lead to non-renormalisable theories, which means the theory is incomplete and new physics is needed at some energy scale. It is important to note that this classical analysis may receive quantum corrections, e.g., marginal operators may become relevant or irrelevant in the quantum theory.
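The classical power counting above can be summarised in a short script. The following is a minimal sketch in four space-time dimensions, using \([\lambda_n] = 4 - n\) and the dimensionless combination \(\lambda_n E^{n-4}\); the function names and numerical values are illustrative.

```python
def coupling_mass_dimension(n):
    """Mass dimension [lambda_n] = 4 - n of the phi^n coupling in four dimensions."""
    return 4 - n

def classify_coupling(n):
    """Classical classification of the phi^n coupling by its mass dimension."""
    dim = coupling_mass_dimension(n)
    if dim > 0:
        return "relevant"
    if dim == 0:
        return "marginal"
    return "irrelevant"

def dimensionless_strength(lam, n, energy):
    """The combination lambda_n * E^(n-4) controlling perturbation theory at scale E."""
    return lam * energy ** (n - 4)

for n in range(3, 7):
    low = dimensionless_strength(0.1, n, 1.0)    # low energy scale
    high = dimensionless_strength(0.1, n, 10.0)  # high energy scale
    print(n, coupling_mass_dimension(n), classify_coupling(n), low, high)
```

As stressed in the text, this is a classical analysis only, and quantum corrections may change the classification of a marginal coupling.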

5.1 The interaction picture

Our starting point is to construct the Hamiltonian of the interacting \(\phi^4\) theory \[H = \int d^3x \, \Big( \frac12 \pi^2 + \frac12 \vec{\nabla}\phi\cdot\vec{\nabla}\phi + \frac12 m^2 \phi^2 + \frac\lambda{4!} \phi^4\Big) ~.\] We split this Hamiltonian into two parts, a free part \[H_0 = \int d^3x \, \Big( \frac12 \pi^2 + \frac12 \vec{\nabla}\phi\cdot\vec{\nabla}\phi + \frac12 m^2 \phi^2 \Big) ~,\] and an interacting part \[H_{\text{int}} = \frac\lambda{4!} \int d^3 x\, \phi^4 ~,\] such that \[H = H_0 + H_{\text{int}} ~.\]

If we know an operator at some reference time \(t_0\), then the time evolution of operators in the Heisenberg picture tells us that the value of the operator at some other time \(t\) is \[\mathcal{O}_H(t) = e^{i H (t-t_0)} \mathcal{O}_H(t_0) e^{-i H(t-t_0)} = e^{i H (t-t_0)} \mathcal{O}_S e^{-i H(t-t_0)} ~,\] where we take the Schrödinger picture operator to be the Heisenberg picture operator evaluated at the reference time. Here, the subscripts \(H\) and \(S\) denote the Heisenberg and Schrödinger pictures respectively. Note that this implies that the Hamiltonian is equal in the two pictures, \(H_H = H_S\), and we drop the subscripts. Similarly, for states we have that \[|\Psi_S(t)\rangle = e^{-iH(t-t_0)}|\Psi_S(t_0)\rangle = e^{-iH(t-t_0)} |\Psi_H\rangle ~,\] i.e., the Schrödinger picture state at time \(t\) is given by the time evolution of the state at time \(t_0\), which we take to be the Heisenberg picture state.

In general, exponentiating the full Hamiltonian is difficult due to the presence of interactions. However, we do know how to exponentiate the free Hamiltonian \(H_0\). Indeed, we have derived the expression for the quantum field at general time \(t\) in the free theory, which is given in eq. \(\eqref{eq:realscalar}\). Therefore, we define \(\mathcal{O}_I(t)\) to be the operator evolved using only the Schrödinger picture free Hamiltonian \(H_{0S}\) \[\begin{equation} \label{eq:intoperators} \mathcal{O}_I(t) = e^{i H_{0S} (t-t_0)} \mathcal{O}_S e^{-i H_{0S}(t-t_0)} ~. \end{equation}\] This defines the interaction picture, denoted by the subscript \(I\). The Heisenberg picture operator is now related to the interaction picture operator as \[\mathcal{O}_I(t) = e^{i H_{0S} (t-t_0)} e^{-i H (t-t_0)} \mathcal{O}_H(t) e^{i H(t-t_0)} e^{-i H_{0S} (t-t_0)} ~.\] Correspondingly, we define states in the interaction picture as \[\begin{equation} \label{eq:intpicstates} |\Psi_I(t)\rangle = e^{iH_{0S}(t-t_0)}|\Psi_S(t)\rangle = e^{iH_{0S}(t-t_0)} e^{-iH(t-t_0)} |\Psi_H\rangle ~. \end{equation}\] Schematically, the interaction picture is a hybrid of the Heisenberg and Schrödinger pictures, where operators evolve with \(H_0\) and states evolve with \(H_{\text{int}}\).

Let us now construct the interaction picture field, which is an operator in the interacting quantum field theory. Parametrising the Schrödinger picture field and its conjugate momentum in the interacting theory as \[\begin{split} \phi_S(\vec{x}) & = \int \frac{d^3k}{(2\pi)^3} \frac{1}{\sqrt{2\omega_{\vec{k}}}} \big(a_{\vec{k}} e^{-i \omega_{\vec{k}} t_0 + i \vec{k}\cdot \vec{x}} + a^\dagger_{\vec{k}} e^{+i \omega_{\vec{k}} t_0 - i \vec{k}\cdot \vec{x}} \big) ~, \\ \pi_S(\vec{x}) & = -i \int \frac{d^3k}{(2\pi)^3} \sqrt{\frac{\omega_{\vec{k}}}{2}} \big(a_{\vec{k}} e^{-i \omega_{\vec{k}} t_0 + i \vec{k}\cdot \vec{x}} - a^\dagger_{\vec{k}} e^{+i \omega_{\vec{k}} t_0 - i \vec{k}\cdot \vec{x}} \big) ~, \end{split}\] we find that the free Hamiltonian in the Schrödinger picture is \[H_{0S} = \int \frac{d^3 k}{(2\pi)^3} \omega_{\vec{k}} a_{\vec{k}}^\dagger a_{\vec{k}} ~.\] While the time dependence of the field and free Hamiltonian will take a complicated form in the Heisenberg picture, in the interaction picture it is simple. Starting from the canonical commutation relations \[\phantom{}[\phi(\vec{x}),\phi(\vec{y})] = 0 ~, \qquad [\pi(\vec{x}),\pi(\vec{y})] = 0 ~, \qquad [\phi(\vec{x}),\pi(\vec{y})] = i\delta^{(3)}(\vec{x}-\vec{y}) ~,\] we find that \[\phantom{}[a_{\vec{k}},a^\dagger_{\vec{k}'}] = (2\pi)^3 \delta^{(3)}(\vec{k} - \vec{k}') ~, \qquad [H_{0S},a_{\vec{k}}^\dagger] = \omega_{\vec{k}} a_{\vec{k}}^\dagger ~, \qquad [H_{0S},a_{\vec{k}}] = - \omega_{\vec{k}} a_{\vec{k}} ~,\] with the remaining commutators vanishing. Therefore, the interaction picture field in the interacting theory is given by \[\begin{equation} \label{eq:interactionpicturefield} \phi_I(t,\vec{x}) = \int \frac{d^3k}{(2\pi)^3} \frac{1}{\sqrt{2\omega_{\vec{k}}}} \big(a_{\vec{k}} e^{-i \omega_{\vec{k}} t + i \vec{k}\cdot \vec{x}} + a^\dagger_{\vec{k}} e^{+i \omega_{\vec{k}} t - i \vec{k}\cdot \vec{x}} \big) ~, \end{equation}\] i.e., it takes the same form as the Heisenberg picture field in the free theory.
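
As a sketch of the intermediate step, the commutator \([H_{0S},a_{\vec{k}}] = -\omega_{\vec{k}} a_{\vec{k}}\) can be exponentiated using the standard identity \(e^{X} Y e^{-X} = Y + [X,Y] + \frac{1}{2!}[X,[X,Y]] + \dots\) with \(X = iH_{0S}(t-t_0)\). Each nested commutator contributes one further factor of \(-i\omega_{\vec{k}}(t-t_0)\), so that \[e^{iH_{0S}(t-t_0)}\, a_{\vec{k}}\, e^{-iH_{0S}(t-t_0)} = \sum_{n=0}^{\infty} \frac{\big(-i\omega_{\vec{k}}(t-t_0)\big)^n}{n!}\, a_{\vec{k}} = e^{-i\omega_{\vec{k}}(t-t_0)}\, a_{\vec{k}} ~, \qquad e^{iH_{0S}(t-t_0)}\, a^\dagger_{\vec{k}}\, e^{-iH_{0S}(t-t_0)} = e^{+i\omega_{\vec{k}}(t-t_0)}\, a^\dagger_{\vec{k}} ~.\] Conjugating the Schrödinger picture field with \(e^{iH_{0S}(t-t_0)}\), these phases combine with the factors \(e^{\mp i\omega_{\vec{k}} t_0}\) already present to give \(e^{\mp i\omega_{\vec{k}} t}\), which is how the reference time \(t_0\) is replaced by \(t\) in the interaction picture field.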

The relation between the Heisenberg and interaction pictures suggests that it will be useful to define the following time evolution operator \[\begin{equation} \label{eq:timeevolution} U(t,t_0) = e^{i H_{0S} (t-t_0)} e^{-i H(t-t_0)} ~, \end{equation}\] such that \[\begin{equation} \label{eq:heisenberginteraction} \mathcal{O}_I(t) = U(t,t_0) \mathcal{O}_H(t) U^\dagger(t,t_0) ~, \qquad |\Psi_I(t)\rangle = U(t,t_0)|\Psi_H\rangle ~. \end{equation}\] In this definition of \(U(t,t_0)\) the final time \(t\) is allowed to vary, but the reference time \(t_0\) is always kept fixed. Schematically, \(U(t,t_0)\) corresponds to first evolving with the full Hamiltonian and then undoing the part of the evolution associated with the free Hamiltonian. Note that we cannot combine the two exponentials in the definition of the time evolution operator \(\eqref{eq:timeevolution}\) since they do not commute.

To understand the time evolution operator \(\eqref{eq:timeevolution}\) let us consider its time derivative \[\begin{split} i \frac{d}{dt} U(t,t_0) & = i \Big(\frac{d}{dt} e^{i H_{0S}(t-t_0)} \Big) e^{-i H(t-t_0)} + i e^{i H_{0S}(t-t_0)} \Big(\frac{d}{dt} e^{-i H(t-t_0)} \Big) \\ & = i \Big(e^{i H_{0S}(t-t_0)} (iH_{0S}) e^{-i H(t-t_0)} + e^{i H_{0S}(t-t_0)} (-iH) e^{-i H(t-t_0)}\Big) \\ & = e^{i H_{0S}(t-t_0)} (H - H_{0S}) e^{-i H(t-t_0)} \\ & = e^{i H_{0S}(t-t_0)} (H_{\text{int}})_{S} e^{-i H(t-t_0)} \\ & = e^{i H_{0S}(t-t_0)} (H_{\text{int}})_{S} e^{-i H_{0 S}(t-t_0)} e^{i H_{0S} (t-t_0)} e^{-i H(t-t_0)} \\ & = H_I(t) U(t,t_0) ~, \end{split}\] where we have defined \[H_I(t) = (H_{\text{int}})_I(t) = e^{i H_{0S}(t-t_0)} (H_{\text{int}})_S e^{-i H_{0S}(t-t_0)} ~,\] i.e., the interacting part of the Hamiltonian in the interaction picture. In the interacting \(\phi^4\) theory we have \[H_I(t) = \frac\lambda{4!} \int d^3 x\, e^{i H_{0S}(t-t_0)} \phi(t_0,\vec{x})^4 e^{-i H_{0S}(t-t_0)} = \frac\lambda{4!} \int d^3 x\,\phi_I(t,\vec{x})^4 ~,\] and we see that \(H_I\) takes a simple form when written in terms of the interaction picture field \(\phi_I\).

The differential equation for the time evolution operator \(U(t,t_0)\) with \(t > t_0\) \[\begin{equation} \label{eq:diffeq} i \frac{d}{dt} U(t,t_0) = H_I(t) U(t,t_0) ~, \end{equation}\] subject to the initial condition \(U(t_0,t_0)=1\), which follows from its definition \(\eqref{eq:timeevolution}\), is solved by Dyson’s formula \[\begin{equation} \label{eq:utt0} U(t,t_0)=\mathrm{T}\hspace{-1pt}\overset{\longleftarrow}{\exp}\big(-i\int_{t_0}^t dt'\, H_I(t')\big) ~, \qquad t > t_0 ~. \end{equation}\] Here \(\mathrm{T}\hspace{-1pt}\overset{\longleftarrow}{\exp}\) denotes the time ordered exponential, which time orders all the operators in its argument. As an example of how this works consider the series expansion of the time ordered exponential in eq. \(\eqref{eq:utt0}\) \[\mathrm{T}\hspace{-1pt}\overset{\longleftarrow}{\exp}\big(-i\int_{t_0}^t dt'\, H_I(t')\big) = 1 -i \overset{\leftarrow}{\mathrm{T}}\Big(\int_{t_0}^t dt'\, H_I(t')\Big) - \frac12 \overset{\leftarrow}{\mathrm{T}}\Big(\int_{t_0}^t dt'\, H_I(t')\int_{t_0}^t dt''\, H_I(t'')\Big) + \dots ~.\] The second term on the right-hand side contains a single operator and there is nothing to order, hence we can drop the time ordering symbol. However, in the third term, and subsequent terms, we have more than one operator and we need to split the integrals in order to implement the time ordering. 
Explicitly, we can write \[\begin{split} \overset{\leftarrow}{\mathrm{T}}\Big(\int_{t_0}^t dt'\, H_I(t')\int_{t_0}^t dt''\, H_I(t'')\Big) & = \overset{\leftarrow}{\mathrm{T}}\Big(\int_{t_0}^t dt' \int_{t_0}^{t'} dt'' \, H_I(t') H_I(t'')\Big) + \overset{\leftarrow}{\mathrm{T}}\Big(\int_{t_0}^t dt' \int_{t'}^t dt'' \, H_I(t') H_I(t'')\Big) \\ & = \int_{t_0}^t dt' \int_{t_0}^{t'} dt'' \, H_I(t') H_I(t'') + \int_{t_0}^t dt' \int_{t'}^t dt'' \, H_I(t'') H_I(t') ~, \end{split}\] since in the first term we have \(t_0 \leq t'' \leq t'\), hence \(t' \geq t''\), while in the second term we have \(t' \leq t'' \leq t\), hence \(t''\geq t'\).
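
As a short worked step, the two terms in this splitting are in fact equal. Relabelling \(t' \leftrightarrow t''\) in the second term and then exchanging the order of integration over the region \(t_0 \leq t'' \leq t' \leq t\) gives \[\int_{t_0}^t dt' \int_{t'}^t dt'' \, H_I(t'') H_I(t') = \int_{t_0}^t dt'' \int_{t''}^t dt' \, H_I(t') H_I(t'') = \int_{t_0}^t dt' \int_{t_0}^{t'} dt'' \, H_I(t') H_I(t'') ~.\] The factor of 2 cancels the \(\frac12\) from the expansion of the exponential, so the second order term in the series is \[- \frac12 \overset{\leftarrow}{\mathrm{T}}\Big(\int_{t_0}^t dt'\, H_I(t')\int_{t_0}^t dt''\, H_I(t'')\Big) = - \int_{t_0}^t dt' \int_{t_0}^{t'} dt'' \, H_I(t') H_I(t'') ~,\] with the later time always standing to the left.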

We can explicitly check that eq. \(\eqref{eq:utt0}\) solves the differential equation \(\eqref{eq:diffeq}\) as follows \[\begin{split} i \frac{d}{dt} U(t,t_0) & = i \frac{d}{dt}\mathrm{T}\hspace{-1pt}\overset{\longleftarrow}{\exp}\big(-i\int_{t_0}^t dt'\, H_I(t')\big) \\ & = \overset{\leftarrow}{\mathrm{T}}\Big(H_I(t) \exp\big(-i\int_{t_0}^t dt'\, H_I(t')\big)\Big) \\ & = H_I(t)\mathrm{T}\hspace{-1pt}\overset{\longleftarrow}{\exp}\big(-i\int_{t_0}^t dt'\, H_I(t')\big) = H_I(t) U(t,t_0) ~, \end{split}\] where to go from the first to second line we do not need to account for the fact that the operators \(H_I(t)\) do not necessarily commute at different times since the ordering is taken care of by the time ordering. To go from the second to third line we note that we can take \(H_I(t)\) out to the left of the time ordering since \(t>t_0\) implies that the integral is only over times that are less than \(t\). Finally, eq. \(\eqref{eq:utt0}\) satisfies the initial condition \(U(t_0,t_0)=1\) since the integral from \(t_0\) to \(t_0\) trivially vanishes. Therefore, by the uniqueness of solutions to differential equations, we have shown that the time evolution operator \(\eqref{eq:timeevolution}\) is equal to the time ordered exponential in eq. \(\eqref{eq:utt0}\) for \(t>t_0\).
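
The equality between the defining product \(\eqref{eq:timeevolution}\) and Dyson's formula \(\eqref{eq:utt0}\) can also be checked numerically in a toy model. The following is a minimal sketch, not part of the derivation above: it assumes a two-level system with \(H_0 = \frac{\omega}{2}\sigma_z\), \(H_{\text{int}} = g\sigma_x\) and \(t_0 = 0\), and all function names and parameter values are illustrative choices. The time ordered exponential is approximated by a product of short-time factors with later times to the left, and is compared with the exponential of the integral in which the ordering is ignored.

```python
import math

# Toy two-level model (an assumption for illustration, not the field theory):
# H0 = (omega/2) sigma_z, Hint = g sigma_x, reference time t0 = 0.
OMEGA, G = 1.0, 0.3
H0 = [[OMEGA / 2, 0.0], [0.0, -OMEGA / 2]]
HINT = [[0.0, G], [G, 0.0]]
H = [[H0[i][j] + HINT[i][j] for j in range(2)] for i in range(2)]
IDENTITY = [[1.0, 0.0], [0.0, 1.0]]


def matmul(a, b):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def scale(m, x):
    """Multiply a 2x2 matrix by a scalar."""
    return [[x * m[i][j] for j in range(2)] for i in range(2)]


def exp_minus_i(m):
    """exp(-i m) for a traceless Hermitian 2x2 matrix m.

    Such a matrix squares to theta^2 times the identity, hence
    exp(-i m) = cos(theta) - i sin(theta) m / theta.
    """
    theta = math.sqrt(abs(m[0][0]) ** 2 + abs(m[0][1]) ** 2)
    if theta == 0.0:
        return IDENTITY
    c, s = math.cos(theta), math.sin(theta) / theta
    return [[c * IDENTITY[i][j] - 1j * s * m[i][j] for j in range(2)]
            for i in range(2)]


def h_int_picture(t):
    """H_I(t) = exp(i H0 t) Hint exp(-i H0 t)."""
    return matmul(matmul(exp_minus_i(scale(H0, -t)), HINT),
                  exp_minus_i(scale(H0, t)))


def u_definition(t):
    """U(t, 0) = exp(i H0 t) exp(-i H t), the defining product."""
    return matmul(exp_minus_i(scale(H0, -t)), exp_minus_i(scale(H, t)))


def u_time_ordered(t, steps=2000):
    """Time ordered exponential as a product of short-time factors,
    with later times placed to the left."""
    dt = t / steps
    u = IDENTITY
    for k in range(steps):
        u = matmul(exp_minus_i(scale(h_int_picture((k + 0.5) * dt), dt)), u)
    return u


def u_unordered(t, steps=2000):
    """exp(-i * integral of H_I) with the time ordering ignored."""
    dt = t / steps
    total = [[0.0, 0.0], [0.0, 0.0]]
    for k in range(steps):
        h = h_int_picture((k + 0.5) * dt)
        total = [[total[i][j] + dt * h[i][j] for j in range(2)]
                 for i in range(2)]
    return exp_minus_i(total)


def max_diff(a, b):
    """Largest entrywise difference between two 2x2 matrices."""
    return max(abs(a[i][j] - b[i][j]) for i in range(2) for j in range(2))


T_FINAL = 2.0
err_ordered = max_diff(u_time_ordered(T_FINAL), u_definition(T_FINAL))
err_unordered = max_diff(u_unordered(T_FINAL), u_definition(T_FINAL))
print(err_ordered, err_unordered)
```

In this toy model the time ordered product should agree with the defining product up to discretisation error, while the unordered exponential differs at order \(g^2\) because \(H_I(t)\) does not commute with itself at different times.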

At this point it is useful to generalise the time evolution operator in the interaction picture such that both the initial and final times can vary. To do this we take the time ordered exponential in eq. \(\eqref{eq:utt0}\) and allow both limits of the integral to vary \[\begin{equation} \label{eq:ut2t1} U(t_2,t_1)=\begin{cases} \mathrm{T}\hspace{-1pt}\overset{\longleftarrow}{\exp}\big(-i\displaystyle{\int_{t_1}^{t_2}} dt'\, H_I(t')\big) ~, \qquad & t_2 > t_1 ~, \\ \mathrm{T}\hspace{-1pt}\overset{\longrightarrow}{\exp}\big(+i\displaystyle{\int_{t_2}^{t_1}} dt'\, H_I(t')\big) ~, \qquad & t_2 < t_1 ~. \end{cases} \end{equation}\] The generalisation of the original definition \(\eqref{eq:timeevolution}\) is given by \[\begin{equation} \label{eq:ut2t1alt} U(t_2,t_1) = e^{iH_{0S} (t_2-t_0)}e^{-i H (t_2-t_1)} e^{-i H_{0S}(t_1 -t_0)} ~, \end{equation}\] which follows if we use \(|\Psi_I(t_2)\rangle = U(t_2,t_0)|\Psi_H\rangle\) and \(|\Psi_I(t_1)\rangle = U(t_1,t_0)|\Psi_H\rangle\) to determine \[|\Psi_I(t_2)\rangle = U(t_2,t_1)|\Psi_I(t_1)\rangle ~.\] Note that the \(t_0\) dependence in the definition \(\eqref{eq:ut2t1}\) is hidden in \(H_I(t')\). We can interpret eq. \(\eqref{eq:ut2t1alt}\) as relating the time evolution operator between states in the Schrödinger and interaction pictures, generalising \(\eqref{eq:intoperators}\).

The operator \(U(t_2,t_1)\) has a number of useful properties. First, we have the composition property \[\begin{equation} \label{eq:identity1} U(t_3,t_2)U(t_2,t_1) = U(t_3,t_1) ~, \end{equation}\] i.e., evolving in time from \(t_1\) to \(t_2\) and then from \(t_2\) to \(t_3\) is equivalent to evolving in time from \(t_1\) to \(t_3\). Second, \(U(t_2,t_1)\) is unitary \[U^\dagger(t_2,t_1) U(t_2,t_1) = 1 ~,\] and the hermitian conjugate of the time evolution operator corresponds to evolving back from \(t_2\) to \(t_1\) \[|\Psi_I(t_1)\rangle = U(t_2,t_1)^\dagger|\Psi_I(t_2)\rangle = U(t_1,t_2) |\Psi_I(t_2)\rangle ~.\] This implies that \[\begin{equation} \label{eq:identity2} U(t_3,t_1)U^\dagger(t_2,t_1) = U(t_3,t_2) ~, \end{equation}\] i.e., evolving back from \(t_2\) to \(t_1\) and then from \(t_1\) to \(t_3\) is equivalent to evolving in time from \(t_2\) to \(t_3\).
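
As a quick consistency check, both properties can be read off from eq. \(\eqref{eq:ut2t1alt}\). For the composition property the adjacent free exponentials cancel and the two exponentials of the full Hamiltonian combine, since \(H\) commutes with itself, \[\begin{split} U(t_3,t_2)U(t_2,t_1) & = e^{iH_{0S}(t_3-t_0)} e^{-iH(t_3-t_2)} e^{-iH_{0S}(t_2-t_0)}\, e^{iH_{0S}(t_2-t_0)} e^{-iH(t_2-t_1)} e^{-iH_{0S}(t_1-t_0)} \\ & = e^{iH_{0S}(t_3-t_0)} e^{-iH(t_3-t_1)} e^{-iH_{0S}(t_1-t_0)} = U(t_3,t_1) ~. \end{split}\] Taking the hermitian conjugate reverses the order of the three factors and flips the sign of each exponent, \[U^\dagger(t_2,t_1) = e^{iH_{0S}(t_1-t_0)} e^{-iH(t_1-t_2)} e^{-iH_{0S}(t_2-t_0)} = U(t_1,t_2) ~,\] so that \(U^\dagger(t_2,t_1)U(t_2,t_1) = U(t_1,t_2)U(t_2,t_1) = U(t_1,t_1) = 1\).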

Let us now consider the vacuum in the interacting theory. We have already introduced the Heisenberg picture vacuum in the free theory \(|0_H\rangle\), which satisfies \(H_0|0_H\rangle = 0\), where \(H_0\) is the free Hamiltonian, and we normalise such that \(\langle 0_H|0_H\rangle = 1\). We similarly define the physical Heisenberg picture vacuum in the interacting theory \(|\Omega_H\rangle\), which satisfies \(H |\Omega_H\rangle = E_0 |\Omega_H\rangle\), where \(H\) is the full Hamiltonian, and we again normalise such that \(\langle\Omega_H|\Omega_H\rangle = 1\). From now on we drop the subscript \(H\) on the physical Heisenberg picture vacuum.

In the interacting theory the state corresponding to the free vacuum will evolve non-trivially in time. Therefore, we define the following interaction picture states in the far past and far future \[|0\rangle = \lim_{t\to-\infty} |0_I(t)\rangle ~, \qquad \langle 0| = \lim_{t\to+\infty} \langle 0_I(t)| ~.\] We then claim that \[\begin{equation} \label{eq:asymptoticstatelemma} \lim_{s\to-\infty}\langle\Psi_I(t)|U(t,s)|0\rangle = \langle\Psi_H|\Omega\rangle\langle\Omega|0_H\rangle ~, \qquad \lim_{t\to+\infty}\langle 0|U(t,s)|\Psi_I(s)\rangle = \langle 0_H|\Omega\rangle\langle\Omega|\Psi_H\rangle ~, \end{equation}\] for arbitrary interaction picture states \(\langle\Psi_I(t)|\) and \(|\Psi_I(s)\rangle\). To show this we write the left-hand side of the first identity in the Schrödinger picture using eq. \(\eqref{eq:intpicstates}\) and eq. \(\eqref{eq:ut2t1alt}\) \[\lim_{s\to-\infty}\langle\Psi_S(t)|e^{-iH(t-s)} |0_S(s)\rangle ~,\] and insert a complete set of energy eigenstates \[1 = |\Omega_S(s)\rangle\langle\Omega_S(s)| + \sum_n |n_S(s)\rangle\langle n_S(s)| ~,\] where the sum over \(n\) represents the sum and integration over all states other than the physical vacuum. We also assume that there is a mass gap, i.e., the energies of all other states are strictly greater than \(E_0\), the energy of the physical vacuum. Therefore, we have \[\begin{split} & \lim_{s\to-\infty}\langle\Psi_S(t)|e^{-iH(t-s)} \Big(|\Omega_S(s)\rangle\langle\Omega_S(s)| + \sum_n |n_S(s)\rangle\langle n_S(s)|\Big)|0_S(s)\rangle \\ & = \lim_{s\to-\infty}e^{-iE_0(t-s)} \langle\Psi_S(t)|\Omega_S(s)\rangle\langle\Omega_S(s)|0_S(s)\rangle + \sum_n \lim_{s\to-\infty}e^{-iE_n(t-s)} \langle\Psi_S(t)|n_S(s)\rangle\langle n_S(s)|0_S(s)\rangle ~. \end{split}\] At this point we can either use the Riemann-Lebesgue lemma or replace \(s\to s(1-i\epsilon)\) with \(\epsilon > 0\) before taking the limit. 
This, together with the assumption that \(E_n > E_0\), implies that \[\lim_{s\to-\infty} \frac{e^{-iE_n(t-s)}}{e^{-iE_0(t-s)}} \to \lim_{s\to-\infty} \frac{e^{-iE_n(t-s)}}{e^{-iE_0(t-s)}}e^{\epsilon s(E_n-E_0)} = 0 ~,\] since \(E_n > E_0\). Note that this prescription is equivalent to the one we used to define the Feynman propagator. Therefore, the contributions from all other states become vanishingly small compared to the contribution from the physical vacuum in the limit and we can drop the sum over \(n\) to leave \[\lim_{s\to-\infty}e^{-iE_0(t-s)} \langle\Psi_S(t)|\Omega_S(s)\rangle\langle\Omega_S(s)|0_S(s)\rangle = \lim_{s\to-\infty}\langle\Psi_S(t)|e^{-iH (t-s)} |\Omega_S(s)\rangle\langle\Omega_S(s)|0_S(s)\rangle ~,\] which is the right-hand side of the first identity in eq. \(\eqref{eq:asymptoticstatelemma}\) in the Schrödinger picture. A similar argument holds for the second identity in eq. \(\eqref{eq:asymptoticstatelemma}\).

At this point we have introduced the formalism needed to write down an expression for the time ordered correlation function \(\langle\Omega|\overset{\leftarrow}{\mathrm{T}}\big(\phi(y)\phi(x)\big)|\Omega\rangle\) in the interacting theory where \(\phi\) denotes the Heisenberg picture field. Recalling the relations \(\eqref{eq:heisenberginteraction}\) between operators and states in the Heisenberg and interaction pictures and using the identities \(\eqref{eq:asymptoticstatelemma}\) we find \[\begin{equation} \label{eq:expr1} \langle\Omega|\overset{\leftarrow}{\mathrm{T}}\big(\phi(y)\phi(x)\big)|\Omega\rangle= \frac{\langle 0| \overset{\leftarrow}{\mathrm{T}}\big(\phi_I(y) \phi_I(x) U(+\infty,-\infty)\big)|0\rangle}{\langle 0_H|\Omega\rangle\langle\Omega|0_H\rangle} ~. \end{equation}\] To show this we start by letting \(\overset{\leftarrow}{\mathrm{T}}\big(\phi(y)\phi(x)\big)|\Omega\rangle = |\Psi_H(y,x)\rangle\), i.e., a Heisenberg picture state that depends on \(y\) and \(x\). We can then use the second identity in eq. \(\eqref{eq:asymptoticstatelemma}\) to give \[\langle\Omega|\Psi_H(y,x)\rangle = \frac{\langle 0|U(+\infty,s)|\Psi_I(s;y,x)\rangle}{\langle 0_H|\Omega\rangle} = \frac{\langle 0|U(+\infty,s)U(s,t_0)|\Psi_H(y,x)\rangle}{\langle 0_H|\Omega\rangle} = \frac{\langle 0|U(+\infty,t_0)|\Psi_H(y,x)\rangle}{\langle 0_H|\Omega\rangle}~,\] where we have used the relation between states in the Heisenberg and interaction pictures and the property \(\eqref{eq:identity1}\) of the time evolution operator. Also using the first identity in eq. 
\(\eqref{eq:asymptoticstatelemma}\) to substitute in for \(|\Omega\rangle\) we are left with \[\langle\Omega|\overset{\leftarrow}{\mathrm{T}}\big(\phi(y)\phi(x)\big)|\Omega\rangle =\frac{\langle 0| U(+\infty,t_0) \overset{\leftarrow}{\mathrm{T}}\big(\phi(y)\phi(x)\big) U(t_0,-\infty)|0\rangle}{\langle 0_H|\Omega\rangle\langle\Omega|0_H\rangle} ~.\] Now assuming \(y^0 > x^0\) and using the relation between operators in the Heisenberg and interaction pictures we find \[\begin{split} \langle\Omega|\overset{\leftarrow}{\mathrm{T}}\big(\phi(y)\phi(x)\big)|\Omega\rangle &= \frac{\langle 0| U(+\infty,t_0) U^\dagger(y^0,t_0)\phi_I(y) U(y^0,t_0) U^\dagger(x^0,t_0) \phi_I(x) U(x^0,t_0) U(t_0,-\infty)|0\rangle}{\langle 0_H|\Omega\rangle\langle\Omega|0_H\rangle} \\ & = \frac{\langle 0| U(+\infty,y^0) \phi_I(y) U(y^0,x^0) \phi_I(x) U(x^0,-\infty)|0\rangle}{\langle 0_H|\Omega\rangle\langle\Omega|0_H\rangle} \\ & = \frac{\langle 0| \overset{\leftarrow}{\mathrm{T}}\big(\phi_I(y) \phi_I(x) U(+\infty,-\infty)\big)|0\rangle}{\langle 0_H|\Omega\rangle\langle\Omega|0_H\rangle} ~, \end{split}\] where to go from the first line to the second line we use the properties \(\eqref{eq:identity1}\) and \(\eqref{eq:identity2}\) of the time evolution operator. To go from the second to the third line we make use of the assumption that \(+\infty > y^0 > x^0 > -\infty\) and observe that if we order using time ordering then this automatically puts each operator in the correct place. We find the same result for \(x^0 > y^0\).

We can similarly show that \[\begin{equation} \label{eq:expr2} 1 = \langle\Omega|\Omega\rangle = \frac{\langle 0| U(+\infty,-\infty)|0\rangle}{\langle 0_H|\Omega\rangle\langle\Omega|0_H\rangle} ~, \end{equation}\] hence, combining eq. \(\eqref{eq:expr1}\) and eq. \(\eqref{eq:expr2}\), we have \[\langle\Omega|\overset{\leftarrow}{\mathrm{T}}\big(\phi(y)\phi(x)\big)|\Omega\rangle = \frac{\langle 0| \overset{\leftarrow}{\mathrm{T}}\big(\phi_I(y) \phi_I(x) U(+\infty,-\infty)\big)|0\rangle}{\langle 0| U(+\infty,-\infty)|0\rangle} ~.\] Substituting in for \(U(t_2,t_1)\) using eq. \(\eqref{eq:ut2t1}\) gives \[\langle\Omega|\overset{\leftarrow}{\mathrm{T}}\big(\phi(y)\phi(x)\big)|\Omega\rangle= \frac{\langle 0| \overset{\leftarrow}{\mathrm{T}}\Big(\phi_I(y) \phi_I(x) \exp\big(-i\displaystyle{\int_{-\infty}^{+\infty}} dt'\, H_I(t')\big) \Big)|0\rangle}{\langle 0| \mathrm{T}\hspace{-1pt}\overset{\longleftarrow}{\exp}\big(-i\displaystyle{\int_{-\infty}^{+\infty}} dt'\, H_I(t')\big)|0\rangle} ~.\] Since \(H_I\) is a polynomial in the field \(\phi_I\), we can expand the exponentials to any order in \(\lambda\). The computation of the time ordered correlation function \(\langle\Omega|\overset{\leftarrow}{\mathrm{T}}\big(\phi(y)\phi(x)\big)|\Omega\rangle\) then reduces to computing expectation values of time ordered products of the field \(\phi_I\) in the free vacuum.

5.2 Wick’s theorem

We have introduced two different notions of ordering. Time ordering, \(\overset{\leftarrow}{\mathrm{T}}(\dots\mathcal{O}_{n+1}(x_{n+1})\mathcal{O}_{n}(x_{n})\dots)\), which defines the correlation functions of interest, and normal ordering, \(\,:\!\mathcal{O}_{n+1}(x_{n+1})\mathcal{O}_{n}(x_{n})\dots\!:\,\), which is useful when we work with creation and annihilation operators. Therefore, to compute time ordered correlation functions we would like to relate the two.

Let us start with the product of two fields. We can write the interaction picture field \(\eqref{eq:interactionpicturefield}\) as \[\phi_I (x) = \phi_I^+ (x) + \phi_I^- (x) ~,\] where \[\phi_I^+ (x) = \int \frac{d^3k}{(2\pi)^3} \frac{1}{\sqrt{2\omega_{\vec{k}}}} a_{\vec{k}} e^{+ik\cdot x} ~, \qquad \phi_I^- (x) = \int \frac{d^3k}{(2\pi)^3} \frac{1}{\sqrt{2\omega_{\vec{k}}}} a^\dagger_{\vec{k}} e^{-ik\cdot x} ~.\] Recall that the time dependence of the interaction picture field is the same as that of the free field since we construct it by evolving the Schrödinger picture field with the free Hamiltonian in the Schrödinger picture as in eq. \(\eqref{eq:intoperators}\). This implies that \[\phi_I^+ (x) |0\rangle = 0 ~, \qquad \langle 0|\phi_I^-(x) = 0 ~.\] Now let us consider the time ordered product for \(y^0 > x^0\) \[\overset{\leftarrow}{\mathrm{T}}\big(\phi_I(y)\phi_I(x)\big) = (\phi_I^+ (y) + \phi_I^- (y))(\phi_I^+ (x) + \phi_I^- (x)) ~.\] To normal order the right-hand side we move \(\phi_I^- (x)\) to the left and \(\phi_I^+ (y)\) to the right. Since these have a non-vanishing commutator, we find \[\overset{\leftarrow}{\mathrm{T}}\big(\phi_I(y)\phi_I(x)\big) = \,:\!\phi_I(y)\phi_I(x)\!:\, + [\phi_I^+ (y),\phi_I^- (x)] ~, \qquad y^0 > x^0 ~.\] Similarly, for \(x^0 > y^0\) we have \[\overset{\leftarrow}{\mathrm{T}}\big(\phi_I(y)\phi_I(x)\big) = \,:\!\phi_I(y)\phi_I(x)\!:\, + [\phi_I^+ (x),\phi_I^- (y)] ~, \qquad x^0 > y^0 ~.\]
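
As a worked step, the commutator appearing here can be evaluated explicitly using \([a_{\vec{k}},a^\dagger_{\vec{k}'}] = (2\pi)^3\delta^{(3)}(\vec{k}-\vec{k}')\), \[\begin{split} [\phi_I^+ (y),\phi_I^- (x)] & = \int \frac{d^3k}{(2\pi)^3} \int \frac{d^3k'}{(2\pi)^3} \frac{1}{\sqrt{4\omega_{\vec{k}}\omega_{\vec{k}'}}}\, [a_{\vec{k}},a^\dagger_{\vec{k}'}]\, e^{+ik\cdot y - ik'\cdot x} \\ & = \int \frac{d^3k}{(2\pi)^3} \frac{1}{2\omega_{\vec{k}}}\, e^{+ik\cdot (y-x)} ~, \end{split}\] where \(k^0 = \omega_{\vec{k}}\). The result contains no creation or annihilation operators, so the commutator is an ordinary function of \(y-x\) multiplying the identity operator.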

At this point it is useful to introduce the following notation for the contraction of two fields

\[\require{mathtools} \overbracket{\phi_I(y) \phi_I(x)} = \begin{cases} [\phi_I^+ (y),\phi_I^- (x)] ~, \qquad y^0 > x^0 ~, \\ [\phi_I^+ (x),\phi_I^- (y)] ~, \qquad y^0 < x^0 ~. \end{cases}\]

Therefore, we have

\[\begin{equation} \label{eq:timenormal} \overset{\leftarrow}{\mathrm{T}}\big(\phi_I(y)\phi_I(x)\big) = \,:\!\phi_I(y)\phi_I(x)\!:\, + \overbracket{\phi_I(y) \phi_I(x)} ~. \end{equation}\]

Taking the expectation value of this relation in the free vacuum, the contribution from the normal ordered product vanishes and we are left with the Feynman propagator on the left-hand side, hence

\[\langle 0|\overbracket{\phi_I(y) \phi_I(x)}|0\rangle = G(y-x) ~.\]

Moreover, the contraction is equal to the Feynman propagator times the identity operator and we will often write

\[\overbracket{ \phi_I(y) \phi_I(x)} = G(y-x) ~,\]

leaving the identity operator implicit. We can understand the relation \(\eqref{eq:timenormal}\) as allowing us to separate time ordered products of fields into parts that vanish and parts that do not when we take the expectation value in the free vacuum.

Wick’s theorem tells us how to do this for an arbitrary number of fields. It says that \[\overset{\leftarrow}{\mathrm{T}}\big(\phi_I(x_n)\dots\phi_I(x_2)\phi_I(x_1)\big) =\sum\begin{array}{l} \text{all possible contractions with} \\ \text{uncontracted fields normal ordered} \end{array} ~.\] That is, we list all possible ways of contracting the fields in our time ordered product, sum over these and normal order the fields that are not contracted. For three fields we find

\[\begin{split} \overset{\leftarrow}{\mathrm{T}}\big(\phi_I(x_3)\phi_I(x_2)\phi_I(x_1)\big) &= \,:\!\phi_I(x_3)\phi_I(x_2)\phi_I(x_1)\!:\, + \, \phi_I(x_3) \overbracket{\phi_I(x_2)\phi_I(x_1)} \\ & \quad + \phi_I(x_2) \overbracket{\phi_I(x_3)\phi_I(x_1)} + \phi_I(x_1) \overbracket{\phi_I(x_3)\phi_I(x_2)} ~, \end{split}\]

while for four fields we have

\[\begin{split} \overset{\leftarrow}{\mathrm{T}}\big(\phi_I(x_4) \phi_I(x_3)\phi_I(x_2)\phi_I(x_1)\big) & = \,:\!\phi_I(x_4) \phi_I(x_3)\phi_I(x_2)\phi_I(x_1)\!:\, \\ & \quad + \,:\!\phi_I(x_4) \phi_I(x_3)\!:\, \overbracket{\phi_I(x_2)\phi_I(x_1)} + \,:\!\phi_I(x_4) \phi_I(x_2)\!:\, \overbracket{\phi_I(x_3)\phi_I(x_1)} \\ & \quad + \,:\!\phi_I(x_4) \phi_I(x_1)\!:\, \overbracket{\phi_I(x_3)\phi_I(x_2)} + \,:\!\phi_I(x_3) \phi_I(x_2)\!:\, \overbracket{\phi_I(x_4)\phi_I(x_1)} \\ & \quad + \,:\!\phi_I(x_3) \phi_I(x_1)\!:\, \overbracket{\phi_I(x_4)\phi_I(x_2)} + \,:\!\phi_I(x_2) \phi_I(x_1)\!:\, \overbracket{\phi_I(x_4)\phi_I(x_3)} \\ & \quad + \overbracket{\phi_I(x_4) \phi_I(x_3)} \overbracket{\phi_I(x_2)\phi_I(x_1)} + \overbracket{\phi_I(x_4) \phi_I(x_2)} \overbracket{\phi_I(x_3)\phi_I(x_1)} \\ & \quad + \overbracket{\phi_I(x_4) \phi_I(x_1)} \overbracket{\phi_I(x_3)\phi_I(x_2)} ~. \end{split}\]

The first line on the right-hand side has zero contractions, each term in the next three lines has one contraction, and the terms in the final two lines are fully contracted. Each time we see a contraction we replace the contracted fields with the Feynman propagator. For example, we write

\[\begin{split} \,:\!\phi_I(x_4) \phi_I(x_3)\!:\, \overbracket{\phi_I(x_2)\phi_I(x_1)} & = G(x_2-x_1)\,:\!\phi_I(x_4) \phi_I(x_3)\!:\, ~, \\ \overbracket{\phi_I(x_4) \phi_I(x_3)}\overbracket{\phi_I(x_2)\phi_I(x_1)} & = G(x_4-x_3)G(x_2-x_1) ~. \end{split}\]

The proof of Wick’s theorem then follows by induction.

Now that we have understood how time ordering and normal ordering are related, we can use this to simplify the computation of expectation values of time ordered products of fields. In particular, when we take the expectation value, any term with an uncontracted field will vanish due to the normal ordering. Therefore, as a corollary of Wick’s theorem, we have \[\langle 0|\overset{\leftarrow}{\mathrm{T}}\big(\phi_I(x_n)\dots\phi_I(x_2)\phi_I(x_1)\big)|0\rangle = \sum \text{terms with all fields contracted} ~.\] For example, taking the expectation value of the time ordered product of two fields, we find

\[\begin{equation} \label{eq:twopoint} \langle 0|\overset{\leftarrow}{\mathrm{T}}\big(\phi_I(y)\phi_I(x)\big)|0\rangle = \langle 0|\big(\!\!\,:\!\phi_I(y)\phi_I(x)\!:\, + \overbracket{\phi_I(y) \phi_I(x)}\big)|0\rangle = G(y-x) \langle 0|0\rangle = G(y-x) ~, \end{equation}\]

while for four fields \[\begin{equation} \label{eq:fourpoint} \langle 0|\overset{\leftarrow}{\mathrm{T}}\big(\phi_I(x_4)\phi_I(x_3)\phi_I(x_2)\phi_I(x_1)\big)|0\rangle = G(x_4-x_3)G(x_2-x_1) + G(x_4-x_2)G(x_3-x_1) + G(x_4-x_1)G(x_3-x_2) ~. \end{equation}\] We can represent relations such as this using diagrams. In particular, we represent the Feynman propagator by a line between two points. The expectation value \(\langle 0|\overset{\leftarrow}{\mathrm{T}}\big(\phi_I(x_n)\dots\phi_I(x_2)\phi_I(x_1)\big)|0\rangle\) is then the sum of all possible diagrams with each point connected to exactly one other point by a line. The diagrams representing the expectation value of the time ordered product of two and four fields, given in eqs. \(\eqref{eq:twopoint}\) and \(\eqref{eq:fourpoint}\) respectively, are a single line joining \(x\) and \(y\), and the three ways of joining the points \(x_1\), \(x_2\), \(x_3\) and \(x_4\) in pairs by two lines.

5.3 Feynman diagrams and Feynman rules

These diagrams become more useful when we look at correlation functions in the interacting theory such as \[\begin{equation} \label{eq:twopointfunction} \langle\Omega|\overset{\leftarrow}{\mathrm{T}}\big(\phi(y)\phi(x)\big)|\Omega\rangle= \frac{\langle 0| \overset{\leftarrow}{\mathrm{T}}\Big(\phi_I(y) \phi_I(x) \exp\big(-i\displaystyle{\int_{-\infty}^{+\infty}} dt'\, H_I(t')\big) \Big)|0\rangle}{\langle 0| \mathrm{T}\hspace{-1pt}\overset{\longleftarrow}{\exp}\big(-i\displaystyle{\int_{-\infty}^{+\infty}} dt'\, H_I(t')\big)|0\rangle} ~. \end{equation}\] Let us denote the numerator and denominator of the right-hand side as \[\begin{equation} \begin{split} N & = \langle 0| \overset{\leftarrow}{\mathrm{T}}\Big(\phi_I(y) \phi_I(x) \exp\big(-i\int_{-\infty}^{+\infty} dt'\, H_I(t')\big) \Big)|0\rangle ~, \\ D & = \langle 0| \mathrm{T}\hspace{-1pt}\overset{\longleftarrow}{\exp}\big(-i\int_{-\infty}^{+\infty} dt'\, H_I(t')\big)|0\rangle ~. \end{split}\label{eq:numden} \end{equation}\] Expanding out the exponential in powers of \(\lambda\), we have \[\exp\big(-i\int_{-\infty}^{+\infty} dt'\, H_I(t')\big) = 1 - i \frac{\lambda}{4!} \int d^4 x \, \phi_I(x)^4 + \mathcal{O}(\lambda^2) ~,\] for \(\phi^4\) theory. We can similarly expand out \(N\) and \(D\) in eq. \(\eqref{eq:numden}\) in powers of \(\lambda\) \[N = N_0 + \lambda N_1 + \lambda^2 N_2 + \dots ~, \qquad D = D_0 + \lambda D_1 + \lambda^2 D_2 + \dots ~.\]

First computing \(N\) we have \[N = \langle 0| \overset{\leftarrow}{\mathrm{T}}\Big(\phi_I(y) \phi_I(x) \Big(1 - i \frac{\lambda}{4!} \int d^4 z \phi_I(z)^4 + \mathcal{O}(\lambda^2) \Big) \Big)|0\rangle ~,\] and we can read off \[\begin{split} N_0 & = \langle 0| \overset{\leftarrow}{\mathrm{T}}\big(\phi_I(y) \phi_I(x) \big) |0\rangle ~, \\ N_1 & = -i \frac{1}{4!} \int d^4z \, \langle 0|\overset{\leftarrow}{\mathrm{T}}\big(\phi_I(y) \phi_I(x) \phi_I(z) \phi_I(z) \phi_I(z) \phi_I(z)\big)|0\rangle ~. \end{split}\] From Wick’s theorem, we know that to determine this expectation value we should sum over all possible contractions of the six fields. Since each field needs to be contracted with another field we have \(5\times 3 \times 1 = 15\) ways of contracting the fields. To see this we pick any field, which we can contract with any of the five remaining fields. We then pick any of the four fields that are left, which we can contract with any of the three remaining fields. Finally, we are left with two fields, which we have no choice but to contract with each other.
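
The counting argument above can be checked by brute force. The following is a minimal sketch, with hypothetical function names and field labels, that enumerates every way of joining a list of fields into pairs using exactly the recursion described in the text: pick the first field, contract it with any of the remaining ones, and repeat on what is left.

```python
def pairings(items):
    """Yield every way of splitting `items` into unordered pairs.

    The first item is paired with each of the others in turn and the
    remaining items are then paired recursively.
    """
    if not items:
        yield []
        return
    first, rest = items[0], items[1:]
    for i, partner in enumerate(rest):
        remaining = rest[:i] + rest[i + 1:]
        for tail in pairings(remaining):
            yield [(first, partner)] + tail


def double_factorial(n):
    """n!! = n (n-2) (n-4) ... down to 1 for odd n."""
    result = 1
    while n > 1:
        result *= n
        n -= 2
    return result


# Labels are illustrative: four free fields, and the six fields in N_1
# with the four copies of phi_I(z) distinguished as z_a, ..., z_d.
four = list(pairings(["x4", "x3", "x2", "x1"]))
six = list(pairings(["y", "x", "z_a", "z_b", "z_c", "z_d"]))
odd = list(pairings(["x3", "x2", "x1"]))

print(len(four), len(six), len(odd))  # 3 15 0
```

For \(2m\) fields the number of full contractions is \((2m-1)!! = (2m-1)\times(2m-3)\times\dots\times 1\), which gives 3 for four fields and 15 for six. For an odd number of fields there is no way to contract every field, so the free vacuum expectation value vanishes.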

There are two distinct classes of contractions. The first class is when the field \(\phi_I(y)\) is contracted with the field \(\phi_I(x)\)

\[\langle 0|\overbracket{\phi_I(y) \phi_I(x)}\overbracket{ \phi_I(z) \phi_I(z) }\overbracket{\phi_I(z) \phi_I(z)}|0\rangle = G(y-x)G(0)^2 ~.\]

Three of the possible contractions are in this class since once we have contracted \(\phi_I(y)\) and \(\phi_I(x)\) we are left with three ways of contracting the four copies of \(\phi_I(z)\). The second class is when the fields \(\phi_I(y)\) and \(\phi_I(x)\) are contracted with different copies of \(\phi_I(z)\)

\[\langle 0|\overbracket{\phi_I(y) \phi_I(z)}\overbracket{ \phi_I(x) \phi_I(z) }\overbracket{\phi_I(z) \phi_I(z)}|0\rangle = G(y-z)G(x-z)G(0) ~.\]

Twelve of the possible contractions are in this class since we can contract \(\phi_I(y)\) with any of the four copies of \(\phi_I(z)\), and then we can contract \(\phi_I(x)\) with any of the remaining three copies of \(\phi_I(z)\). Note that since \(3+12 = 15\) we have found all the possible contractions.
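
The split of the 15 contractions into the two classes can also be verified by enumeration. The sketch below is illustrative only: the labels `z1` to `z4` are hypothetical names used to tell apart the four copies of \(\phi_I(z)\), and the class names simply follow the order in which the classes are introduced in the text.

```python
from collections import Counter
from fractions import Fraction


def pairings(items):
    """Yield every way of splitting `items` into unordered pairs."""
    if not items:
        yield []
        return
    first, rest = items[0], items[1:]
    for i, partner in enumerate(rest):
        remaining = rest[:i] + rest[i + 1:]
        for tail in pairings(remaining):
            yield [(first, partner)] + tail


def classify(pairing):
    """First class: y is contracted with x.
    Second class: y and x are each contracted with a copy of z."""
    return "first" if ("y", "x") in pairing else "second"


fields = ["y", "x", "z1", "z2", "z3", "z4"]
counts = Counter(classify(p) for p in pairings(fields))

# Combine each count with the 1/4! that multiplies the integral in N_1.
weights = {name: Fraction(n, 24) for name, n in counts.items()}

print(counts["first"], counts["second"])    # 3 12
print(weights["first"], weights["second"])  # 1/8 1/2
```

Since `pairings` always lists the pair containing `y` first and in the order `("y", partner)`, testing for the tuple `("y", "x")` is enough to identify the first class.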

We can represent these contractions using diagrams as follows

These diagrams can be given a physical interpretation. In the first contraction the particle moves between \(x\) and \(y\) while a particle-antiparticle pair is created from the vacuum at point \(z\), propagates and then comes back together at the same point and annihilates. We call such contributions disconnected since the diagram has two components that are not connected to each other. We denote the disconnected contribution by \(N^{\text{disc}}_1\). In the second contraction as the particle moves between \(x\) and \(y\) it emits another particle at \(z\), which propagates and then comes back to the same point where it is reabsorbed. We call such contributions connected since they are depicted by a single connected diagram. We denote the connected contribution by \(N^{\text{conn}}_1\). Together, we find that the full contribution at \(\mathcal{O}(\lambda)\) is \[\begin{equation} \label{eq:n1} N_1 = -i \frac{1}{4!} \int d^4z \, \big(3 G(y-x)G(0)^2 + 12G(y-z)G(x-z)G(0)\big) ~. \end{equation}\]

The diagrams we are drawing are known as Feynman diagrams and are an efficient way to represent contractions. Instead of using Wick’s theorem directly and working out all possible contractions, we can instead formulate a set of rules, known as Feynman rules, that allow us to write down an analytic expression for each Feynman diagram.

In \(\phi^4\) theory, the Feynman diagrams contributing to the numerator \(N\) \(\eqref{eq:numden}\) at \(\mathcal{O}(\lambda^k)\) are constructed by drawing two external points each attached to a leg and \(k\) vertices each with four legs. The different diagrams are then given by connecting the \(2+4k\) legs in all possible ways. For the correlation function of \(n\) fields, the Feynman diagrams contributing to the corresponding numerator are constructed in the same way, except we now start with \(n\) external points. To determine the contribution to the numerator at a given order we first draw all inequivalent Feynman diagrams with labelled external points, but without labelling the vertices. The position space Feynman rules for determining the contribution from a given Feynman diagram are:

  1. For each line connecting two points \(x\) and \(y\) we write down a factor of the propagator \(G(y-x)\).

  2. For each vertex at point \(z\) we write down \(-i\lambda \displaystyle{\int} d^4z\).

  3. Finally, we divide by the symmetry factor.

Let us discuss the symmetry factor in more detail. We have seen that different contractions can give the same Feynman diagram. Any permutation of the four legs attached to a vertex will give an equivalent expression. We have already accounted for this factor of \(4!\) by associating the integral \(-i\lambda \displaystyle{\int} d^4z\) to each vertex, rather than \(-i\displaystyle{\frac{\lambda}{4!} \int} d^4z\). We have also not included the factor of \(\displaystyle{\frac{1}{k!}}\) that comes from expanding the exponential \[\exp\big(-i\frac{\lambda}{4!}\displaystyle{\int} d^4x \phi_I(x)^4 \big) ~,\] since any permutation of the vertices will also give an equivalent expression by permuting the associated integration variables. However, not all of these permutations are necessarily independent and we may now have overcounted the number of contractions. This happens when the Feynman diagram has some symmetry and the degree of the symmetry is called the symmetry factor.

Consider the following two diagrams

In the first diagram permuting the two legs connected to each other does not give an independent contraction, hence the symmetry factor is 2. Similarly, the second diagram, which we call the double bubble diagram, has a symmetry factor of 8. These symmetry factors result in the same combinatoric factors we found above when computing \(N_1\) using Wick’s theorem. For the first diagram, we found a combinatoric factor of 12 in eq. \(\eqref{eq:n1}\) and \(\displaystyle{\frac{12}{4!} = \frac{1}{2}}\), i.e. the inverse of the symmetry factor. The second diagram appeared together with the two external points connected by a line. The symmetry factor of the latter is 1, hence the total symmetry factor of the disconnected diagram is still 8. For this disconnected diagram we found a combinatoric factor of 3 in eq. \(\eqref{eq:n1}\) and \(\displaystyle{\frac{3}{4!} = \frac{1}{8}}\), again the inverse of the symmetry factor.

The rules for writing down the symmetry factor in \(\phi^4\) theory are:

  1. Let \(g\) be the number of permutations of fully internal vertices, i.e., not connected to an external point, that give an identical diagram. If only the identity permutation gives an identical diagram then \(g=1\). When computing \(g\) it is useful to explicitly label the internal vertices to check that the permutation gives the same set of propagators between labelled vertices.

  2. Let \(d\) be the number of double bubble diagrams.

  3. Let \(\beta\) be the number of lines connecting a vertex to itself.

  4. Let \(\alpha_n\), \(n=1,2,3,4\), be the number of pairs of vertices connected by \(n\) lines.

  5. The symmetry factor is then given by \[S = g \, 2^d \, 2^\beta \, \prod_{n=1}^4 (n!)^{\alpha_n} ~.\]

The generalisation to the interacting theory of a real scalar field with a polynomial potential \(\eqref{eq:polynomial_potential}\) is:

  1. Let \(\operatorname{deg}(V)\) be the degree of the polynomial.

  2. Let \(g\) be the number of permutations of fully internal vertices, i.e., not connected to an external point, that give an identical diagram. If only the identity permutation gives an identical diagram then \(g=1\). When computing \(g\) it is useful to explicitly label the internal vertices to check that the permutation gives the same set of propagators between labelled vertices.

  3. Let \(d_k\), \(k=1,\dots,\lfloor\frac{\operatorname{deg}(V)}{2}\rfloor\) be the number of \(k\)-bubbles. A \(k\)-bubble is defined as \(k\) lines connecting a vertex to itself, i.e., double bubbles for \(k=2\), triple bubbles for \(k=3\), and so on.

  4. Let \(\beta\) be the number of lines connecting a vertex to itself.

  5. Let \(\alpha_n\), \(n=1,\dots,\operatorname{deg}(V)\), be the number of pairs of vertices connected by \(n\) lines.

  6. The symmetry factor is then given by \[S = g \, \Big(\prod_{k=1}^{\lfloor\frac{\operatorname{deg}(V)}{2}\rfloor}(k!)^{d_k}\Big) \, 2^\beta \, \Big(\prod_{n=1}^{\operatorname{deg}(V)} (n!)^{\alpha_n}\Big) ~.\]
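The rules above translate directly into a short function. The following is a minimal sketch, assuming the counts \(g\), \(d_k\), \(\beta\) and \(\alpha_n\) have already been read off the diagram by hand; the function name `symmetry_factor` and the dictionary-based interface are our own illustrative choices. The sunset example is a standard \(\phi^4\) diagram that is not discussed in the text above and is included only to exercise the \(\alpha_n\) factor.

```python
from math import factorial

def symmetry_factor(g, bubbles, beta, alpha):
    """Evaluate S = g * prod_k (k!)^{d_k} * 2^beta * prod_n (n!)^{alpha_n}.

    g       : number of permutations of internal vertices giving an identical diagram
    bubbles : dict mapping k to d_k, the number of k-bubbles
    beta    : number of lines connecting a vertex to itself
    alpha   : dict mapping n to alpha_n, the number of pairs of vertices joined by n lines
    """
    s = g * 2 ** beta
    for k, d_k in bubbles.items():
        s *= factorial(k) ** d_k
    for n, alpha_n in alpha.items():
        s *= factorial(n) ** alpha_n
    return s

# phi^4 examples from the text. Entries with k = 1 or n = 1 contribute
# a factor of 1! = 1, so they do not affect the result.
tadpole = symmetry_factor(g=1, bubbles={1: 1}, beta=1, alpha={1: 2})
double_bubble = symmetry_factor(g=1, bubbles={2: 1}, beta=2, alpha={})

# Assumed extra example: the two-vertex "sunset" diagram, in which the
# two vertices are joined by three lines.
sunset = symmetry_factor(g=1, bubbles={}, beta=0, alpha={3: 1})

print(tadpole, double_bubble, sunset)  # prints: 2 8 6
```

For \(\phi^4\) theory only \(k=2\) bubbles contribute a non-trivial factor, so the product over \(k\) reduces to the factor \(2^d\) in the \(\phi^4\) formula.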

We have seen that using Feynman diagrams and the Feynman rules we arrive at the same expression \(\eqref{eq:n1}\) that we found by directly using Wick’s theorem. We could now proceed to higher orders in \(\lambda\), which would involve diagrams with more than one vertex, and the correlation function of \(n\) fields, which would involve diagrams with \(n\) external points.

Now let us compute \(D\) \[D = \langle 0| \overset{\leftarrow}{\mathrm{T}}\Big( 1 - i \frac{\lambda}{4!} \int d^4 z \phi_I(z)^4 + \mathcal{O}(\lambda^2) \Big)|0\rangle ~,\] and we can read off \[\begin{split} D_0 & = \langle 0|0\rangle = 1 ~, \\ D_1 & = -i \frac{1}{4!} \int d^4z \, \langle 0|\overset{\leftarrow}{\mathrm{T}}\big( \phi_I(z) \phi_I(z) \phi_I(z) \phi_I(z)\big)|0\rangle ~. \end{split}\] To compute \(D_1\) we can use Feynman diagrams and Feynman rules. There are no external fields, hence the diagrams we consider have no external points. Since we are interested in the contribution at \(\mathcal{O}(\lambda)\) we have a single vertex. Connecting the four legs we find the double bubble diagram

whose contribution we have already computed to be \[D_1 = -i\frac18 \int d^4z G(0)^2 ~.\]

Returning to the full two-point correlation function \(\eqref{eq:twopointfunction}\), we have \[\langle\Omega|\overset{\leftarrow}{\mathrm{T}}\big(\phi(y)\phi(x)\big)|\Omega\rangle= \frac{N_0 + \lambda N_1 + \dots}{D_0 + \lambda D_1 + \dots} = \frac{N_0}{D_0} + \lambda \Big(\frac{N_1}{D_0} - \frac{N_0 D_1}{D_0^2}\Big) + \mathcal{O}(\lambda^2) ~.\] Summarising, we have found \[\begin{aligned} N_0& = G(y-x) ~, \qquad &N_1 & = N_1^{\text{disc}} + N_1^{\text{conn}} = -i \frac18 \int d^4z \, G(y-x)G(0)^2 - i \frac12 \int d^4z G(y-z)G(x-z)G(0) ~, \\ D_0& = 1 ~, \qquad &D_1 &= -i \frac18 \int d^4z \, G(0)^2 ~. \end{aligned}\] Noting that \[\begin{equation} \label{eq:discconnidentity} N_1^{\text{disc}} = N_0 D_1 ~, \end{equation}\] and substituting in we find \[\begin{equation} \label{eq:twopointfunctionresult} \langle\Omega|\overset{\leftarrow}{\mathrm{T}}\big(\phi(y)\phi(x)\big)|\Omega\rangle= G(y-x)- i \frac\lambda2 \int d^4z G(y-z)G(x-z)G(0) + \mathcal{O}(\lambda^2) ~. \end{equation}\]
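The algebra of this expansion can be checked with exact rational arithmetic. The following is a minimal sketch, assuming hypothetical stand-in numbers for the integrals (which are in fact divergent); the values carry no physical meaning and serve only to illustrate that, once \(N_1^{\text{disc}} = N_0 D_1\) is imposed, the coefficient of \(\lambda\) in the ratio reduces to \(N_1^{\text{conn}}\).

```python
from fractions import Fraction

# Hypothetical stand-in values, chosen only to exercise the algebra.
N0 = Fraction(3)       # plays the role of G(y-x)
D0 = Fraction(1)
D1 = Fraction(7)       # plays the role of the double bubble contribution
N1_conn = Fraction(5)  # plays the role of the connected contribution
N1_disc = N0 * D1      # the identity N1_disc = N0 * D1
N1 = N1_disc + N1_conn

def ratio(lam):
    """The ratio N/D truncated at first order in numerator and denominator."""
    return (N0 + lam * N1) / (D0 + lam * D1)

# Coefficient of lambda in the expansion of the ratio.
first_order = N1 / D0 - N0 * D1 / D0**2

# The remainder after subtracting the first-order expansion scales as lambda^2.
lam = Fraction(1, 1000)
remainder = ratio(lam) - (N0 / D0 + lam * first_order)

print(first_order == N1_conn)      # prints: True
print(float(remainder / lam**2))   # stays of order one as lam is decreased
```

Since the remainder divided by \(\lambda^2\) stays bounded as \(\lambda\) decreases, the truncation error of the first-order result is indeed of second order in \(\lambda\).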

The relation \(\eqref{eq:discconnidentity}\) means that the contribution from the denominator exactly cancels the contribution from the disconnected diagram in the numerator. The disconnected diagram contains a double bubble, which is an example of a vacuum bubble diagram, a diagram with no external legs. The denominator, which consists of only vacuum bubble diagrams, cancels the contribution to the numerator from those disconnected diagrams that contain vacuum bubbles. This holds to all orders in perturbation theory. Therefore, we have \[\begin{split} \langle\Omega|\overset{\leftarrow}{\mathrm{T}}\big(\phi(y)\phi(x)\big)|\Omega\rangle & =\frac{\Big(\sum\text{all Feynman diagrams with two external points}\Big)} {\Big(\sum \text{all vacuum bubble diagrams}\Big)} \\ & =\frac{\Big(\sum\begin{array}{l} \text{all Feynman diagrams with two} \\ \text{external points and no vacuum bubbles} \end{array}\Big) \Big(\sum \text{all vacuum bubble diagrams}\Big)} {\Big(\sum \text{all vacuum bubble diagrams}\Big)} \\ & =\Big(\sum\begin{array}{l} \text{all Feynman diagrams with two} \\ \text{external points and no vacuum bubbles} \end{array}\Big) ~. \end{split}\] We can interpret the vacuum bubble diagrams as capturing the relation between the free and interacting vacuums, while those diagrams without vacuum bubbles describe the propagation of the particle. Generalising to correlation functions of \(n\) fields we have \[\langle\Omega|\overset{\leftarrow}{\mathrm{T}}\big(\phi(x_n)\dots\phi(x_2)\phi(x_1)\big)|\Omega\rangle =\Big(\sum\begin{array}{l} \text{all Feynman diagrams with $n$} \\ \text{external points and no vacuum bubbles} \end{array}\Big) ~.\]

Feynman calculus gives a simple algorithm to determine the perturbative expansion of \(n\)-point time ordered correlation functions. We draw the Feynman diagrams and apply the Feynman rules. It also gives a physical interpretation to each term in the expansion in terms of physical and virtual particles. At this point we could try to evaluate the integral in \(\eqref{eq:twopointfunctionresult}\). This is possible, but needs to be done carefully since almost all integrals involving loops of virtual particles are infinite. Here, the infinity arises from \(G(0)\), which represents the amplitude for a particle to end up exactly where it started. This is sensitive to the short-distance structure of the theory, or equivalently high-energy processes. Therefore, this requires the formalism of regularisation and renormalisation.

6 Scattering

Having understood how to compute time ordered correlation functions of fields in an interacting theory using Feynman diagrams and Feynman rules, we can relate them to quantities we can measure through the S-matrix.

6.1 LSZ reduction formula and the S-matrix

To model a particle physics experiment we can consider preparing a state with two particles and colliding them with each other. We can then ask what is the probability that the final state has two, three, four, and so on, particles as a function of the energy and momentum of the incoming particles.

To compute this we would like to define an inner product of the form \[\langle\text{$n$ particles out}|\text{2 particles in}\rangle ~.\] We will show that such quantities can be related to time ordered correlation functions in the vacuum, and thus to Feynman diagrams and Feynman rules.

We start by understanding our initial and final, or asymptotic, states. We take an “in” state to mean a state in the far past, i.e. \(t\to-\infty\), and an “out” state to mean a state in the distant future, i.e. \(t\to+\infty\). We can construct such states explicitly in the free theory where they are created with the creation operators \(a_{\vec{k}}^\dagger\). Recall the expression for the Heisenberg picture field \[\phi(t,\vec{x}) = \int \frac{d^3k}{(2\pi)^3} \, \frac{1}{\sqrt{2\omega_{\vec{k}}}} \big(a_{\vec{k}} e^{+ik\cdot x} + a_{\vec{k}}^\dagger e^{-ik\cdot x}\big) ~, \qquad k^\mu = (\omega_{\vec{k}},\vec{k})^\mu ~,\] which we can invert to obtain an expression for the creation operator in terms of the field \[a_{\vec{k}}^\dagger = \int d^3x\, e^{-i\omega_{\vec{k}}t + i\vec{k}\cdot \vec{x}} \Big(\sqrt{\frac{\omega_{\vec{k}}}{2}} \phi(t,\vec{x}) - \frac{i}{\sqrt{2\omega_{\vec{k}}}} \partial_t \phi(t,\vec{x})\Big) ~.\] We can use \(a_{\vec{k}}^\dagger\) to create a single particle state with momentum \(\vec{k}\) in the free theory by acting on the free vacuum, \(a_{\vec{k}}^\dagger|0\rangle\).

We assume that the same expression holds in the interaction picture of the interacting theory. The creation operator will now be time dependent \[\begin{equation} \label{eq:creation} a_{\vec{k}}^\dagger(t) = \int d^3x\, e^{-i\omega_{\vec{k}}t + i\vec{k}\cdot \vec{x}} \Big(\sqrt{\frac{\omega_{\vec{k}}}{2}} \phi_I(t,\vec{x}) - \frac{i}{\sqrt{2\omega_{\vec{k}}}} \partial_t \phi_I(t,\vec{x})\Big) ~, \end{equation}\] where \(\phi_I(t,\vec{x})\) is the interaction picture field, and can be used to create initial and final states. In other words, we take \[a_{\vec{k}}^{\text{in}\dagger} = \lim_{t\to-\infty} a_{\vec{k}}^\dagger(t) ~, \qquad a_{\vec{k}}^{\text{out}\dagger} = \lim_{t\to+\infty} a_{\vec{k}}^\dagger(t) ~.\] The motivation for this is that we are modelling a scattering experiment where all the physics depending on the interaction takes place in a localised region of space-time. This means that we can neglect the interaction in the far past and distant future and the theory essentially becomes free. Let us note that this does not always work, e.g., sometimes the fields in the action are not associated to asymptotic states. However, it typically works whenever the theory is weakly coupled, which is the case for all the theories that we consider.

With this assumption, we can now formulate the question we would like to answer more precisely. Let us consider the scattering of an initial state \(|\text{in}\rangle\) of two incoming \(\phi\) particles with momenta \(\vec{k}_1\) and \(\vec{k}_2\) into a final state \(|\text{out}\rangle\) of two outgoing \(\phi\) particles with momenta \(\vec{k}_1'\) and \(\vec{k}_2'\). The amplitude for this to happen is given by \[\langle\text{out}|\text{in}\rangle = \frac{\langle 0| a_{\vec{k}_1'}^{\text{out}}a_{\vec{k}_2'}^{\text{out}}U(+\infty,-\infty)a_{\vec{k}_1}^{\text{in}\dagger}a_{\vec{k}_2}^{\text{in}\dagger}|0\rangle}{\langle 0|U(+\infty,-\infty)|0\rangle} ~.\] Inner products of this type, measuring the amplitude for a state in the far past evolving into a different state in the distant future, are known as S-matrix elements.

Since the out state operators are defined at future infinity and the in state operators are defined at past infinity we can insert a time ordering without loss of generality, i.e. \[\begin{equation} \label{eq:smatrixelement} \langle\text{out}|\text{in}\rangle = \frac{\langle 0| \overset{\leftarrow}{\mathrm{T}}\big(a_{\vec{k}_1'}^{\text{out}}a_{\vec{k}_2'}^{\text{out}}U(+\infty,-\infty)a_{\vec{k}_1}^{\text{in}\dagger}a_{\vec{k}_2}^{\text{in}\dagger}\big)|0\rangle}{\langle 0|U(+\infty,-\infty)|0\rangle} ~. \end{equation}\] To relate this expression to the time ordered correlation functions of fields we observe that we can write the difference of \(a_{\vec{k}}^{\text{out}\dagger}\) and \(a_{\vec{k}}^{\text{in}\dagger}\) as \[a_{\vec{k}}^{\text{out}\dagger} - a_{\vec{k}}^{\text{in}\dagger} = \int_{-\infty}^\infty dt\, \partial_t a_{\vec{k}}^{\dagger}(t) ~.\] Substituting in for \(a_{\vec{k}}^\dagger(t)\) from eq. \(\eqref{eq:creation}\) and recalling that \(\omega_{\vec{k}}^2 = \vec{k}^2 + m^2\) we find \[\begin{equation} \begin{split} a_{\vec{k}}^{\text{out}\dagger} - a_{\vec{k}}^{\text{in}\dagger} & = \int_{-\infty}^\infty dt\, \partial_t \,\Big( \int d^3x\, e^{-i\omega_{\vec{k}}t + i\vec{k}\cdot \vec{x}} \Big(\sqrt{\frac{\omega_{\vec{k}}}{2}} \phi_I(t,\vec{x}) - \frac{i}{\sqrt{2\omega_{\vec{k}}}} \partial_t \phi_I(t,\vec{x})\Big)\Big) \\ & = -\frac{i}{\sqrt{2\omega_{\vec{k}}}} \int d^4x \, e^{-i\omega_{\vec{k}}t + i\vec{k}\cdot \vec{x}} \big(\partial_t^2 + \omega_{\vec{k}}^2\big)\phi_I(t,\vec{x}) \\ & = -\frac{i}{\sqrt{2\omega_{\vec{k}}}} \int d^4x \, e^{-i\omega_{\vec{k}}t + i\vec{k}\cdot \vec{x}} \big(\partial_t^2 + \vec{k}^2 +m^2\big)\phi_I(t,\vec{x}) \\ & = -\frac{i}{\sqrt{2\omega_{\vec{k}}}} \int d^4x \, \big(e^{-i\omega_{\vec{k}}t + i\vec{k}\cdot \vec{x}} \partial_t^2\phi_I(t,\vec{x}) + \phi_I(t,\vec{x}) (-\vec{\nabla}^2 +m^2) e^{-i\omega_{\vec{k}}t + i\vec{k}\cdot \vec{x}}\big) \\ & = -\frac{i}{\sqrt{2\omega_{\vec{k}}}} \int d^4x \, e^{-i\omega_{\vec{k}}t + 
i\vec{k}\cdot \vec{x}} \big(\partial_t^2 - \vec{\nabla}^2 +m^2\big) \phi_I(t,\vec{x}) ~, \end{split}\label{eq:inoutalgebra} \end{equation}\] where two of the four terms cancel when going from the first to second line, and to go from the penultimate to final line we have integrated by parts and dropped a total derivative, assuming that all fields and their derivatives fall off sufficiently fast at spatial infinity. While it makes sense to neglect the interactions between particles in the far past and distant future, in a quantum field theory particles can have self interactions, which we cannot neglect. Such self interactions can modify the physical mass of the particles from \(m\) to \(M = m + \mathcal{O}(\lambda)\) where \(\lambda\) is the coupling constant. Making this replacement we can rewrite eq. \(\eqref{eq:inoutalgebra}\) in the Lorentz covariant form \[\begin{equation} \label{eq:creationrelation} a_{\vec{k}}^{\text{out}\dagger} - a_{\vec{k}}^{\text{in}\dagger} = -\frac{i}{\sqrt{2\omega_{\vec{k}}}} \int d^4x \, e^{+ik\cdot x}\big(-\partial_\mu\partial^\mu +M^2\big) \phi_I(x) ~. \end{equation}\] This equation tells us that the change in the operator that creates a particle over time is proportional to the Klein-Gordon operator acting on the corresponding field. In a free theory we have \((-\partial_\mu\partial^\mu +m^2) \phi(x) = 0\), hence the creation operator is time independent. In an interacting theory this is no longer the case and the creation operator is time dependent. Taking the hermitian conjugate we find the analogous expression for the annihilation operators \[\begin{equation} \label{eq:annihilationrelation} a_{\vec{k}}^{\text{out}} - a_{\vec{k}}^{\text{in}} = +\frac{i}{\sqrt{2\omega_{\vec{k}}}} \int d^4x \, e^{-ik\cdot x}\big(-\partial_\mu\partial^\mu +M^2\big) \phi_I(x) ~. 
\end{equation}\] Returning to the numerator of the S-matrix element \(\eqref{eq:smatrixelement}\) \[\langle 0| \overset{\leftarrow}{\mathrm{T}}\big(a_{\vec{k}_1'}^{\text{out}}a_{\vec{k}_2'}^{\text{out}}U(+\infty,-\infty)a_{\vec{k}_1}^{\text{in}\dagger}a_{\vec{k}_2}^{\text{in}\dagger}\big)|0\rangle ~,\] we can use the relations \(\eqref{eq:creationrelation}\) and \(\eqref{eq:annihilationrelation}\) to replace \(a^{\text{in}\dagger}\) by \(a^{\text{out}\dagger}\) and \(a^{\text{out}}\) by \(a^{\text{in}}\). Doing so one by one, and assuming that no incoming momentum is equal to an outgoing momentum, any term that still contains a creation or annihilation operator will vanish. This is because the time ordering moves the creation operators \(a^{\text{out}\dagger}\) to the left, where they annihilate the conjugate vacuum \(\langle 0|\), and the annihilation operators \(a^{\text{in}}\) to the right, where they annihilate the vacuum \(|0\rangle\). The resulting expression is then given by \[\begin{split} & \langle 0| \overset{\leftarrow}{\mathrm{T}}\big(a_{\vec{k}_1'}^{\text{out}}a_{\vec{k}_2'}^{\text{out}}U(+\infty,-\infty)a_{\vec{k}_1}^{\text{in}\dagger}a_{\vec{k}_2}^{\text{in}\dagger}\big)|0\rangle \\ & = \frac{i^4}{\sqrt{2\omega_{\vec{k}_1}2\omega_{\vec{k}_2}2\omega_{\vec{k}_1'}2\omega_{\vec{k}_2'}}} \int d^4x_1 \int d^4 x_2\int d^4 x_1'\int d^4x_2'\, e^{+ik_1\cdot x_1 + i k_2\cdot x_2 - i k_1'\cdot x_1' - i k_2'\cdot x_2'} \\ & \hspace{5ex} \times (-\partial_1^2 +M^2)(-\partial_2^2 +M^2)(-\partial_{1'}^2 + M^2)(-\partial_{2'}^2 + M^2) \langle 0|\overset{\leftarrow}{\mathrm{T}}\big(\phi_I(x_1)\phi_I(x_2)U(+\infty,-\infty) \phi_I(x_1')\phi_I(x_2')\big)|0\rangle ~, \end{split}\] where \(\partial_i^2 = \displaystyle{\frac{\partial^2}{\partial x_i{}^\mu\partial x_i{}_\mu}}\) and \(\partial_{i'}^2 = \displaystyle{\frac{\partial^2}{\partial x_i'{}^\mu\partial x_i'{}_\mu}}\). 
It follows that \[\begin{split} \langle\text{out}|\text{in}\rangle & = \frac{i^4}{\sqrt{2\omega_{\vec{k}_1}2\omega_{\vec{k}_2}2\omega_{\vec{k}_1'}2\omega_{\vec{k}_2'}}} \int d^4x_1 \int d^4 x_2\int d^4 x_1'\int d^4x_2'\, e^{+ik_1\cdot x_1 + i k_2\cdot x_2 - i k_1'\cdot x_1' - i k_2'\cdot x_2'} \\ & \hspace{5ex} \times (-\partial_1^2 +M^2)(-\partial_2^2 +M^2)(-\partial_{1'}^2 + M^2)(-\partial_{2'}^2 + M^2) \langle\Omega|\overset{\leftarrow}{\mathrm{T}}\big(\phi(x_1)\phi(x_2)\phi(x_1')\phi(x_2')\big)|\Omega\rangle ~. \end{split}\] This is the LSZ reduction formula for 2 to 2 scattering. The generalisation to \(n\) incoming particles and \(r\) outgoing particles is \[\begin{split} \langle\text{out}|\text{in}\rangle & = \frac{i^{n+r}}{\sqrt{2\omega_{\vec{k}_1}\dots 2\omega_{\vec{k}_n}2\omega_{\vec{k}_1'}\dots 2\omega_{\vec{k}_r'}}} \int d^4x_1 \hspace{1ex} \dots \int d^4 x_n\int d^4 x_1'\hspace{1ex}\dots\int d^4x_r'\, e^{+ik_1\cdot x_1 + \dots + i k_n\cdot x_n - i k_1'\cdot x_1' - \dots - i k_r'\cdot x_r'} \\ & \hspace{5ex} \times (-\partial_1^2 +M^2)\dots (-\partial_n^2 +M^2)(-\partial_{1'}^2 + M^2)\dots(-\partial_{r'}^2 + M^2) \langle\Omega|\overset{\leftarrow}{\mathrm{T}}\big(\phi(x_1)\dots\phi(x_n)\phi(x_1')\dots\phi(x_r')\big)|\Omega\rangle ~. \end{split}\] The form of this expression suggests that it is more naturally written in momentum space. 
Defining the Fourier transformation of the time ordered correlation function \[\begin{split} &\langle\Omega|\overset{\leftarrow}{\mathrm{T}}\big(\tilde\phi(k_1)\dots\tilde\phi(k_n)\tilde\phi(k_1')\dots\tilde\phi(k_r')\big)|\Omega\rangle \\&= \int\prod_{i=1}^n d^4x_i e^{+ik_i\cdot x_i} \int\prod_{i=1}^{r} d^4x_i' e^{-ik_i'\cdot x_i'} \langle\Omega|\overset{\leftarrow}{\mathrm{T}}\big(\phi(x_1)\dots\phi(x_n)\phi(x_1')\dots\phi(x_r')\big)|\Omega\rangle ~, \end{split}\] the LSZ formula becomes \[\begin{equation} \begin{split}\label{eq:scattering} \langle\text{out}|\text{in}\rangle= \frac{i^{n+r}}{\sqrt{2\omega_{\vec{k}_1}\dots 2\omega_{\vec{k}_n}2\omega_{\vec{k}_1'}\dots 2\omega_{\vec{k}_r'}}} (k_1^2+M^2)\dots&(k_n^2+M^2)(k_1'{}^2+M^2)\dots(k_r'{}^2+M^2) \\ & \times\langle\Omega|\overset{\leftarrow}{\mathrm{T}}\big(\tilde\phi(k_1)\dots\tilde\phi(k_n)\tilde\phi(k_1')\dots\tilde\phi(k_r')\big)|\Omega\rangle ~, \end{split} \end{equation}\] where we recall that we have assumed that \(k_i \neq k_j'\) for any \(i\) and \(j\). The LSZ formula tells us that, to determine the S-matrix element up to an overall normalisation related to the definition of asymptotic states, we compute the momentum space time ordered correlation function, multiply by a factor of \(k_i^2 +M^2\) for each incoming particle and \(k_i'{}^2 + M^2\) for each outgoing particle, and place the momenta on-shell.

We have seen in subsection 5.3 that vacuum bubble diagrams do not contribute to time ordered correlation functions. Moreover, the non-trivial contribution to an S-matrix element is captured by connected Feynman diagrams, i.e., diagrams where each external point and vertex is connected to every other external point or vertex by a propagator or series of propagators. The S-matrix element for \(n\) incoming particles and \(r\) outgoing particles can be constructed by first computing the connected contributions for \(n'\leq n\) incoming particles and \(r'\leq r\) outgoing particles. We then sum over all products of connected contributions subject to the condition that the total number of incoming and outgoing particles is \(n\) and \(r\) respectively. One of the terms in this sum will come from the connected Feynman diagram with \(n\) incoming particles and \(r\) outgoing particles, while the remaining terms will originate from products of disconnected Feynman diagrams. From now on, we will focus only on computing the contribution to \(\eqref{eq:scattering}\) from connected diagrams \[\begin{split} \langle\text{out}|\text{in}\rangle_{\text{conn}}=\frac{i^{n+r}}{\sqrt{2\omega_{\vec{k}_1}\dots 2\omega_{\vec{k}_n}2\omega_{\vec{k}_1'}\dots 2\omega_{\vec{k}_r'}}} (k_1^2+M^2)\dots&(k_n^2+M^2)(k_1'{}^2+M^2)\dots(k_r'{}^2+M^2) \\ & \times\langle\Omega|\overset{\leftarrow}{\mathrm{T}}\big(\tilde\phi(k_1)\dots\tilde\phi(k_n)\tilde\phi(k_1')\dots\tilde\phi(k_r')\big)|\Omega\rangle_{\text{conn}} ~. \end{split}\]

6.2 Scattering in \(\phi^4\)

Let us now consider an example and compute 2 to 2 scattering in \(\phi^4\) theory. We will scatter two incoming particles with momenta \(\vec{k}_1\) and \(\vec{k}_2\) into two outgoing particles with momenta \(\vec{k}_1'\) and \(\vec{k}_2'\). To implement the LSZ formula, we first work out the Fourier transform \[\begin{split} & \langle\Omega|\overset{\leftarrow}{\mathrm{T}}\big(\tilde\phi(k_1)\tilde\phi(k_2)\tilde\phi(k_1')\tilde\phi(k_2')\big)|\Omega\rangle_{\text{conn}} \\ & = \int d^4x_1 \int d^4 x_2 \int d^4 x_1' \int d^4 x_2' \, e^{+ik_1\cdot x_1 + i k_2\cdot x_2 - i k_1'\cdot x_1' - i k_2'\cdot x_2'} \langle\Omega|\overset{\leftarrow}{\mathrm{T}}\big(\phi(x_1)\phi(x_2)\phi(x_1')\phi(x_2')\big)|\Omega\rangle_{\text{conn}} ~. \end{split}\] We expand in powers of \(\lambda\). At leading order we have four external points and no vertices. As we have seen in subsection 5.2, it is not possible to connect the external points to give a connected Feynman diagram. Therefore, at leading order there is no contribution to the connected time ordered correlation function.

Even though there is no contribution to the connected time ordered correlation function it is still useful to look at the contribution to the S-matrix element. At leading order we expect to find the free theory result and the S-matrix element is simply given by \[\langle\text{out}|\text{in}\rangle_{\text{free}} = \langle 0|a_{\vec{k}_1'}a_{\vec{k}_2'}a_{\vec{k}_1}^\dagger a_{\vec{k}_2}^\dagger|0\rangle = (2\pi)^3 \delta^{(3)} (\vec{k}_1'-\vec{k}_1) (2\pi)^3 \delta^{(3)}(\vec{k}_2' - \vec{k}_2) + (2\pi)^3 \delta^{(3)} (\vec{k}_1'-\vec{k}_2) (2\pi)^3 \delta^{(3)}(\vec{k}_2' - \vec{k}_1) ~.\] We immediately see that the incoming momenta are equal to the outgoing momenta and the LSZ formula is not directly applicable.

At \(\mathcal{O}(\lambda)\) we have a single connected Feynman diagram

and we can use the position space Feynman rules to write down \[\begin{equation} \label{eq:positionspace} \langle\Omega|\overset{\leftarrow}{\mathrm{T}}\big(\phi(x_1)\phi(x_2)\phi(x_1')\phi(x_2')\big)|\Omega\rangle_{\text{conn}} = -i\lambda \int d^4z\, G(x_1-z)G(x_2-z)G(x_1'-z)G(x_2'-z) + \mathcal{O}(\lambda^2) ~. \end{equation}\] To understand what happens when we Fourier transform let us consider a single propagator. Substituting in for the Feynman propagator using eq. \(\eqref{eq:iepsilon}\), we have \[\begin{split} \int d^4 x \, e^{\pm ik\cdot x} G(x-z) & = - i \int d^4x \, e^{\pm ik\cdot x} \int \frac{d^4p}{(2\pi)^4} \, \frac{e^{+i p\cdot (x-z)}}{p{}^2 + m^2 - i\epsilon} \\ & = -i \int \frac{d^4p}{(2\pi)^4} \, \frac{e^{-i p\cdot z}}{p{}^2 + m^2 - i\epsilon} (2\pi)^4\delta^{(4)}(p\pm k) \\ & = -i \frac{e^{\pm i k\cdot z}}{k{}^2 + m^2 - i\epsilon} ~, \end{split}\] where to go from the first to second line we integrate over \(x\) using the identity \(\eqref{eq:identity}\) and to go from the second to third line we integrate over \(p\) using the delta function. Fourier transforming all four propagators in eq. \(\eqref{eq:positionspace}\) we find \[\begin{split} &\langle\Omega|\overset{\leftarrow}{\mathrm{T}}\big(\tilde\phi(k_1)\tilde\phi(k_2)\tilde\phi(k_1')\tilde\phi(k_2')\big)|\Omega\rangle_{\text{conn}} \\ & = -i\lambda \int d^4z\,\frac{(-i)^4 e^{+i(k_1+k_2-k_1'-k_2')\cdot z}}{(k_1^2 + m^2 - i\epsilon)(k_2^2 + m^2 - i\epsilon)(k_1'{}^2 + m^2 - i\epsilon)(k_2'{}^2 + m^2 - i\epsilon)} +\mathcal{O}(\lambda^2) ~. \end{split}\] Now integrating over \(z\) yields \[\begin{split} & \langle\Omega|\overset{\leftarrow}{\mathrm{T}}\big(\tilde\phi(k_1)\tilde\phi(k_2)\tilde\phi(k_1')\tilde\phi(k_2')\big)|\Omega\rangle_{\text{conn}} \\ & = -i\lambda (2\pi)^4\delta^{(4)}(k_1+k_2-k_1'-k_2') \frac{(-i)^4}{(k_1^2 + m^2 - i\epsilon)(k_2^2 + m^2 - i\epsilon)(k_1'{}^2 + m^2 - i\epsilon)(k_2'{}^2 + m^2 - i\epsilon)} + \mathcal{O}(\lambda^2) ~. 
\end{split}\] The integral over the position of the vertex implements conservation of momentum at that vertex, i.e., the momentum flowing in should equal the momentum flowing out. Applying the LSZ reduction formula \(\eqref{eq:scattering}\) and using that \(M = m + \mathcal{O}(\lambda)\), we find that the factors of \(i(k^2+M^2) = i(k^2+m^2) + \mathcal{O}(\lambda)\) cancel the factors of \(\displaystyle{\frac{-i}{k^2+m^2 -i\epsilon}}\) where we recall that \(\epsilon\) is infinitesimal. This cancellation continues to all orders in perturbation theory. To compute the contribution that is left after the cancellation we can restrict to a special class of connected Feynman diagrams known as one-particle irreducible diagrams. These are diagrams that you cannot split into two diagrams by cutting a single internal leg. Finally, we are left with \[\langle\text{out}|\text{in}\rangle_{\text{conn}} = (2\pi)^4\delta^{(4)}(k_1+k_2-k_1'-k_2') \frac{1}{\sqrt{2\omega_{\vec{k}_1}2\omega_{\vec{k}_2}2\omega_{\vec{k}_1'} 2\omega_{\vec{k}_2'}}} (-i\lambda) ~.\] This is the S-matrix element for 2 to 2 scattering in \(\phi^4\) theory at \(\mathcal{O}(\lambda)\).

Much of the structure of this S-matrix element follows from kinematics. The overall delta function that conserves momentum will always be present for any connected Feynman diagram, as will the factors of \(\displaystyle{\frac{1}{\sqrt{2\omega_{\vec{k}}}}}\). It is conventional to write \[\langle\text{out}|\text{in}\rangle_{\text{conn}} = (2\pi)^4\delta^{(4)}(k_1+k_2-k_1'-k_2') \frac{1}{\sqrt{2\omega_{\vec{k}_1}2\omega_{\vec{k}_2}2\omega_{\vec{k}_1'} 2\omega_{\vec{k}_2'}}} i\mathcal{M} ~,\] where the prefactor contains the kinematics and the invariant matrix element \(i\mathcal{M}\) captures the dynamics. We can calculate \(i\mathcal{M}\) directly using the momentum space Feynman rules for \(\phi^4\) theory:

  1. To compute the \(n\) to \(r\) S-matrix element at \(\mathcal{O}(\lambda^k)\) draw \(n+r\) external lines and \(k\) vertices.

  2. Label each external line with an incoming or outgoing on-shell momentum and construct all inequivalent one-particle irreducible Feynman diagrams.

  3. Assign an off-shell momentum \(p\) to each internal line and write a factor of \(\displaystyle{\frac{-i}{p^2+m^2-i\epsilon}}\).

  4. For each vertex write a factor of \(-i\lambda\) and impose momentum conservation.

  5. Integrate over any internal momenta not fixed by momentum conservation with measure \(\displaystyle{\int \frac{d^4p}{(2\pi)^4}}\).

  6. Divide by the symmetry factor.

These Feynman rules can be represented as follows

It is straightforward to see that these rules give the correct 2 to 2 S-matrix element at \(\mathcal{O}(\lambda)\) in \(\phi^4\) theory, i.e., \(i \mathcal{M} = -i\lambda\), which comes from the Feynman diagram

Note that at this order the 2 to 2 invariant matrix element does not depend on the momenta of the incoming and outgoing particles.

6.3 Scattering in \(\phi^2\sigma\)

Let us now consider a theory of two interacting scalar fields \[S[\phi,\sigma] = \int d^4x\, \Big(-\frac12 \partial_\mu\phi\partial^\mu\phi - \frac12\partial_\mu\sigma \partial^\mu\sigma -\frac12 m_\phi^2\phi^2 - \frac12 m_\sigma^2 \sigma^2 - \frac{g}{2}\phi^2 \sigma\Big) ~.\] This theory has two types of particles associated with the fields \(\phi\) and \(\sigma\). These particles have masses \(m_\phi\) and \(m_\sigma\) respectively. As there are two types of particles we have two propagators \[\tilde G_\phi(p) = \frac{-i}{p^2+m_\phi^2-i\epsilon} ~, \qquad \tilde G_\sigma(p) = \frac{-i}{p^2+m_\sigma^2-i\epsilon} ~.\] We represent \(\phi\) particles by solid lines and \(\sigma\) particles by dashed lines. The interaction is cubic, indicating that we have a vertex with three legs in perturbation theory. Two of these legs correspond to \(\phi\) particles, hence are solid lines, and the third to a \(\sigma\) particle, hence is a dashed line. Each cubic vertex is associated with a factor of \(-i g\) since the factor of \(2\) that comes from interchanging the two \(\phi\) fields is compensated by the factor of \(\displaystyle{\frac12}\) in the interaction term. The Feynman rules are given by

We again compute 2 to 2 scattering in this theory with two incoming \(\phi\) particles with momenta \(\vec{k}_1\) and \(\vec{k}_2\) and two outgoing \(\phi\) particles with momenta \(\vec{k}_1'\) and \(\vec{k}_2'\). To compute the 2 to 2 invariant matrix element \(i\mathcal{M}\) we first draw the connected Feynman diagrams

and then use the Feynman rules to write down \[\begin{equation} \label{eq:invmatphisigma} i\mathcal{M} = (-ig)^2 \Big(\frac{-i}{(k_1+k_2)^2+m_{\sigma}^2 - i\epsilon} + \frac{-i}{(k_1-k_1')^2+m_{\sigma}^2 - i\epsilon} + \frac{-i}{(k_1-k_2')^2+m_{\sigma}^2 - i\epsilon}\Big) ~. \end{equation}\] Therefore, the 2 to 2 invariant matrix element at \(\mathcal{O}(g^2)\) depends on the momenta of the incoming and outgoing particles.

The invariant matrix element in a scalar field theory is a Lorentz invariant quantity. Since the particles we scatter are on-shell, i.e., \(k_1^2 = k_2^2 = k_1'{}^2 = k_2'{}^2 = -m_\phi^2\), the only undetermined Lorentz invariant combinations of momenta are the Mandelstam variables \[s = (k_1 + k_2)^2 ~, \qquad t = (k_1 - k_1')^2 ~, \qquad u = (k_1 - k_2')^2 ~.\] For any 2 to 2 scattering process the invariant matrix element should be a function of these variables. This is the case for the invariant matrix element \(\eqref{eq:invmatphisigma}\), which can be written in terms of the Mandelstam variables as \[i\mathcal{M} = (-ig)^2 \Big(\frac{-i}{s+m_{\sigma}^2} + \frac{-i}{t+m_{\sigma}^2} + \frac{-i}{u+m_{\sigma}^2}\Big) ~,\] where we have dropped the infinitesimal \(i\epsilon\). The Feynman diagrams corresponding to the three terms are referred to as the s-channel, t-channel and u-channel diagrams.
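To make the Mandelstam variables concrete, the following minimal sketch evaluates \(s\), \(t\), \(u\) and \(i\mathcal{M}\) for one choice of on-shell momenta in the centre-of-mass frame, using the mostly-plus signature in which \(k^2 = -m_\phi^2\). The numerical values of the masses, coupling, momentum and scattering angle, as well as the helper names, are illustrative assumptions. The sketch also checks the standard identity \(s+t+u = -4m_\phi^2\), which follows from momentum conservation and the on-shell conditions but is not derived in the text above.

```python
import math

def minkowski_dot(a, b):
    """Inner product of two four-vectors with signature (-,+,+,+)."""
    return -a[0] * b[0] + sum(ai * bi for ai, bi in zip(a[1:], b[1:]))

def square(a):
    return minkowski_dot(a, a)

def combine(a, b, sign=1):
    """Return a + sign * b componentwise."""
    return tuple(ai + sign * bi for ai, bi in zip(a, b))

# Illustrative (assumed) parameter values.
m_phi, m_sigma, g = 1.0, 0.5, 0.1
p, theta = 2.0, 0.7                 # momentum magnitude and scattering angle
E = math.sqrt(p**2 + m_phi**2)      # on-shell energy of each phi particle

# Centre-of-mass frame momenta k^mu = (omega, kx, ky, kz).
k1 = (E, 0.0, 0.0, p)
k2 = (E, 0.0, 0.0, -p)
k1p = (E, p * math.sin(theta), 0.0, p * math.cos(theta))
k2p = (E, -p * math.sin(theta), 0.0, -p * math.cos(theta))

s = square(combine(k1, k2))
t = square(combine(k1, k1p, -1))
u = square(combine(k1, k2p, -1))

# i M = (-i g)^2 * sum over channels of -i / (channel + m_sigma^2),
# with the infinitesimal i*epsilon dropped.
iM = (-1j * g) ** 2 * sum(-1j / (x + m_sigma**2) for x in (s, t, u))

print(s, t, u)
print(s + t + u + 4 * m_phi**2)  # vanishes up to rounding
print(iM)
```

The parameter values are chosen so that no denominator vanishes; for other choices, \(s + m_\sigma^2\) can be zero, in which case the \(i\epsilon\) can no longer be dropped.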

7 Concluding Comments

We have now finished our introduction to quantum field theory. The course continues by developing the framework we have introduced, exploring the path integral formalism, computing Feynman integrals and analysing the resulting physics. Let us briefly summarise some of the key points we have discussed:

  1. There are different yet indistinguishable copies of elementary particles. We have seen this explicitly for scalar fields with particles created by a creation operator \(a_{\vec{p}}^\dagger\) that commutes with itself. It is also the case for spinor fields, such as the electron field, and other types of field.

  2. There is a relationship between the statistics of particles (their behaviour under exchange) and their spin (their behaviour under rotation). We have seen that scalar fields, which have spin 0, lead to commuting particles. More generally, fields with integer spin, such as the photon field or Higgs field, lead to commuting particles and fields with half integer spin, such as the electron field, lead to anticommuting particles.

  3. Particles can be created and destroyed, and antiparticles exist. We have seen that the Hilbert space of quantum field theory does not have a fixed particle number, which is important for relativistic invariance. Moreover, the complex scalar field has two creation operators \(b_{\vec{p}}^\dagger\) and \(a_{\vec{p}}^\dagger\) with opposite charge that create particles and antiparticles respectively. For a real scalar field the particle is its own antiparticle.

  4. Forces can be interpreted as the exchange of particles. The scattering process discussed in subsection 6.3 suggests that the \(\phi\) particles are feeling a force mediated by the \(\sigma\) particle, leading to a non-trivial 2 to 2 scattering. To make this more concrete, we would like to extract a potential \(V(x)\) for the force from the scattering process. Potentials are a concept in non-relativistic physics, hence we should take the non-relativistic limit of the invariant matrix element. The s-channel diagram is intrinsically relativistic since two \(\phi\) particles annihilate each other and emit a \(\sigma\) particle. Moreover, in non-relativistic physics particles are distinguishable, hence only one of the t-channel and u-channel diagrams can have a non-trivial non-relativistic limit. Let us pick the t-channel diagram and interpret it as describing an incoming particle with momentum \(k_1\) scattering off a potential to an outgoing particle with momentum \(k_1'\). The potential is induced by the exchange of a \(\sigma\) particle with a second background \(\phi\) particle, which has incoming momentum \(k_2\) and outgoing momentum \(k_2'\).

    In non-relativistic quantum mechanics the matrix element for scattering a momentum eigenstate with momentum \(\vec{k}\) to an eigenstate with momentum \(\vec{k}'\) off a potential \(V(x)\) is \[\langle\vec{k}'|iT|\vec{k}\rangle \propto -i\tilde V(\vec{k}'-\vec{k})(2\pi) \delta(E_{\text{non-rel}}(\vec{k}) - E_{\text{non-rel}}(\vec{k}')) ~.\] This is the Born approximation and \(\tilde V(\vec{k}'-\vec{k})\) is the Fourier transform of the interaction potential evaluated at the momentum transfer. Comparing this to the non-relativistic limit of the contribution from the t-channel diagram in eq. \(\eqref{eq:invmatphisigma}\) \[i\mathcal{M}_{\text{non-rel}} \propto i g^2 \Big(\frac{1}{(\vec{k}_1'{} - \vec{k}_1)^2 + m_\sigma^2}\Big) ~,\] we find that the Fourier transform of the interaction potential is \[\tilde V(\vec{q}) \propto - \frac{g^2}{\vec{q}^2 + m_\sigma^2} ~.\] Fourier transforming back to position space we find \[V(\vec{x}) \propto -\int \frac{d^3q}{(2\pi)^3} e^{+i\vec{q}\cdot\vec{x}} \frac{g^2}{\vec{q}^2 + m_\sigma^2} = -\frac{g^2}{4\pi}\frac{1}{|\vec{x}|} e^{-m_\sigma |\vec{x}|} ~.\] Therefore, in the non-relativistic limit the propagating \(\phi\) particle feels an attractive force due to the exchange of a \(\sigma\) particle with the background \(\phi\) particle. The potential is proportional to \(\displaystyle{-\frac{1}{r} e^{-m_\sigma r}}\), with \(r = |\vec{x}|\), and hence depends on the mass of the exchanged particle. The non-zero mass of the \(\sigma\) particle means that the potential decays exponentially in space. Heuristically, \(\sigma\) particles are hard to create from the vacuum and they do not travel far when they are. If we instead exchanged a massless particle then the potential would simply be proportional to \(\displaystyle{\frac1r}\), recovering the form of the Coulomb potential in electrodynamics. In quantum electrodynamics this force originates from the exchange of massless photons.
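For completeness, let us sketch the two steps used in point 4. First, in the non-relativistic limit the energies are \(E \approx m_\phi + \frac{\vec{k}^2}{2m_\phi}\), so that \[t = (k_1 - k_1')^2 = -(E_1 - E_1')^2 + (\vec{k}_1 - \vec{k}_1')^2 \approx (\vec{k}_1' - \vec{k}_1)^2 ~,\] since \((E_1 - E_1')^2\) is of order \(|\vec{k}|^4/m_\phi^2\) and is suppressed relative to the spatial term. Second, the Fourier transform can be evaluated in spherical coordinates. Writing \(q = |\vec{q}|\) and \(r = |\vec{x}|\), and assuming \(r > 0\) and \(m_\sigma > 0\), we have \[\int \frac{d^3q}{(2\pi)^3} \frac{e^{i\vec{q}\cdot\vec{x}}}{\vec{q}^2 + m_\sigma^2} = \frac{1}{4\pi^2}\int_0^\infty dq\, \frac{q^2}{q^2+m_\sigma^2} \int_{-1}^{1} d\cos\theta\, e^{iqr\cos\theta} = \frac{1}{2\pi^2 r}\int_0^\infty dq\, \frac{q \sin(qr)}{q^2+m_\sigma^2} = \frac{1}{4\pi r} e^{-m_\sigma r} ~,\] where the final step uses the standard integral \(\displaystyle{\int_0^\infty dq\, \frac{q\sin(qr)}{q^2+m_\sigma^2} = \frac{\pi}{2} e^{-m_\sigma r}}\), which can be obtained by closing the contour in the upper half-plane and picking up the pole at \(q = i m_\sigma\).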

Before we finish, let us return to \(\phi^4\) theory and try to compute 2 to 2 scattering at \(\mathcal{O}(\lambda^2)\). The Feynman diagrams are given by

These diagrams have loops, which means that the Feynman rules require us to integrate over an unfixed momentum. For example, the contribution from the first diagram is \[i\mathcal{M}^{(1)} = (-i\lambda)^2 \int \frac{d^4p}{(2\pi)^4} \frac{-i}{(p+k_1+k_2)^2 + m^2-i\epsilon}\frac{-i}{p^2+m^2 -i\epsilon} ~.\] The large \(p^2\) behaviour of this integral is captured by \[i\mathcal{M}^{(1)} \sim \lambda^2 \int^\Lambda \frac{d^4p}{(2\pi)^4} \frac{1}{(p^2)^2} \sim \log \Lambda ~,\] where \(\Lambda\) is a large momentum cutoff. We see that as \(\Lambda \to \infty\) this integral diverges.
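To make the logarithm explicit, here is a rough sketch under assumptions we have not justified in this course. Suppose we Wick rotate to a Euclidean loop momentum \(p_E\) and only keep the region \(\mu < |p_E| < \Lambda\), where \(\mu\) is an illustrative lower scale above which the large-momentum approximation is valid. Using that the unit three-sphere has volume \(2\pi^2\), so that \(d^4p_E = 2\pi^2 |p_E|^3\, d|p_E|\) after integrating over angles, we find \[\int_{\mu < |p_E| < \Lambda} \frac{d^4p_E}{(2\pi)^4} \frac{1}{(p_E^2)^2} = \frac{2\pi^2}{(2\pi)^4}\int_\mu^\Lambda \frac{d|p_E|}{|p_E|} = \frac{1}{8\pi^2} \log\frac{\Lambda}{\mu} ~.\] The result grows without bound as \(\Lambda \to \infty\), but only logarithmically, which is the mildest type of divergence.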

This is telling us that information at large momentum, or equivalently short distances, has an effect on scattering amplitudes, even if we scatter particles with small momenta. Organising this flow of information can be achieved using the formalism of regularisation and renormalisation, and effective field theory. Our understanding of physics should depend on the energy, or distance, scale that we are probing.