1 Introduction

For almost a century, the theory of general relativity (GR) has been known to describe the force of gravity with impeccable agreement with observations. Despite all the successes of GR the search for alternatives has been an ongoing challenge since its formulation. Far from a purely academic exercise, the existence of consistent alternatives to describe the theory of gravitation is actually essential to test the theory of GR. Furthermore, the open questions that remain behind the puzzles at the interface between gravity/cosmology and particle physics such as the hierarchy problem, the old cosmological constant problem and the origin of the late-time acceleration of the Universe have pushed the search for alternatives to GR.

While it was not formulated in this language at the time, from a more modern particle physics perspective GR can be thought of as the unique theory of a massless spin-2 particle [287, 483, 175, 225, 76], and so in order to find alternatives to GR one should break one of the underlying assumptions behind this uniqueness theorem. Breaking Lorentz invariance and the notion of spin along with it is probably the most straightforward option, since non-Lorentz invariant theories include a great amount of additional freedom. This possibility has been explored at length in the literature; see for instance [398] for a review. Nevertheless, Lorentz invariance is observationally well constrained by both particle physics and astrophysics. Another possibility is to maintain Lorentz invariance and the notion of spin that goes with it but to consider gravity as being the representation of a higher spin. This idea has also been explored; see for instance [466, 52] for further details. In this review, we shall explore yet another alternative: maintaining the notion that gravity is propagated by a spin-2 particle but considering this particle to be massive. From the particle physics perspective, this extension seems most natural since we know that the particles that carry the electroweak forces have to acquire a mass through the Higgs mechanism.

Giving a mass to a spin-2 (and spin-1) field is an old idea and in this review we shall summarize the approach of Fierz and Pauli, which dates back to 1939 [226]. While the theory of a massive spin-2 field is in principle simple to derive, complications arise when we include interactions between this spin-2 particle and other particles as should be the case if the spin-2 field is to describe the graviton.

At the linear level, the theory of a massless spin-2 field enjoys a linearized diffeomorphism (diff) symmetry, just as a photon enjoys a U(1) gauge symmetry. But unlike for a photon, coupling the spin-2 field with external matter forces this symmetry to be realized non-linearly, in a different way. As a result, GR is a fully non-linear theory, which enjoys non-linear diffeomorphism invariance (also known as general covariance or coordinate invariance). Even though this symmetry is broken when dealing with a massive spin-2 field, the non-linearities are inherited by the field. So, unlike a single isolated massive spin-2 field, a theory of massive gravity is always fully non-linear (and as a consequence non-renormalizable) just as for GR. The fully non-linear equivalent to GR for massive gravity has been a much more challenging theory to obtain. In this review we will summarize a few different approaches to deriving consistent theories of massive gravity and will focus on recent progress. See Ref. [309] for an earlier review on massive gravity, as well as Refs. [134] and [336] for other reviews relating Galileons and massive gravity.

When dealing with a theory of massive gravity two elements have been known to be problematic since the seventies. First, a massive spin-2 field propagates five degrees of freedom no matter how small its mass. At first this seems to suggest that even in the massless limit, a theory of massive gravity could never resemble GR, i.e., a theory of a massless spin-2 field with only two propagating degrees of freedom. This subtlety is at the origin of the vDVZ discontinuity (van Dam-Veltman-Zakharov [465, 497]). The resolution behind that puzzle was provided by Vainshtein two years later and lies in the fact that the extra degree of freedom responsible for the vDVZ discontinuity gets screened by its own interactions, which dominate over the linear terms in the massless limit. This process is now relatively well understood [463] (see also Ref. [35] for a recent review). The Vainshtein mechanism also comes hand in hand with its own set of peculiarities like strong coupling and superluminalities, which we will discuss in this review.

A second element of concern in dealing with a theory of massive gravity is the realization that most non-linear extensions of Fierz-Pauli massive gravity are plagued with a ghost, now known as the Boulware-Deser (BD) ghost [75]. The past decade has seen a revival of interest in massive gravity with the realization that this BD ghost could be avoided either in a model of soft massive gravity (not a single massive pole for the graviton but rather a resonance) as in the DGP (Dvali-Gabadadze-Porrati) model or its extensions [208, 209, 207], or in a three-dimensional model of massive gravity as in ‘new massive gravity’ (NMG) [66] or more recently in a specific ghost-free realization of massive gravity (also known as dRGT in the literature) [144].

With these developments several new possibilities have become a reality:

  • First, one can now more rigorously test massive gravity as an alternative to GR. We will summarize the different phenomenologies of these models and their theoretical as well as observational bounds through this review. Except in specific cases, the graviton mass is typically bounded to be a few times the Hubble parameter today, that is \(m \lesssim 10^{-30} - 10^{-33}\) eV depending on the exact model. In all of these models, if the graviton had a mass much smaller than \(10^{-33}\) eV, its effect would be unseen in the observable Universe and such a mass would thus be irrelevant. Fortunately there is still to date an open window of opportunity for the graviton mass to be within an interesting range and to provide potentially new observational signatures.

  • Second, these developments have opened up the door for theories of interacting metrics, a success long awaited. Massive gravity was first shown to be expressible on an arbitrary reference metric in [296]. It was then shown that the reference metric could have its own dynamics leading to the first consistent formulation of bi-gravity [293]. In bi-gravity two metrics are interacting and the mass spectrum is that of a massless spin-2 field interacting with a massive spin-2 field. It can, therefore, be seen as the theory of general relativity interacting (fully non-linearly) with a massive spin-2 field. This is a remarkable new development in both field theory and gravity.

  • The formulation of massive gravity and bi-gravity in the vielbein language was shown to be both analytic and much more natural, and allowed for a general formulation of multi-gravity [314] where an arbitrary number of spin-2 fields may interact together.

  • Finally, still within the theoretical progress front, all of these successes provided full and definite proof for the absence of Boulware-Deser ghosts in these types of theories; see [295], which has then been translated into a multitude of other languages. This also opens the door for new types of theories that can propagate fewer degrees than naively thought.

Independent of this, developments in massive gravity, bi-gravity and multi-gravity have also opened up new theoretical avenues, which we will summarize, and these remain very much an active area of progress. On the phenomenological front, a genuine task force has been devoted to finding both exact and approximate solutions in these types of gravitational theories, including the ones relevant for black holes and for cosmology. We shall summarize these in the review.

This review is organized as follows: We start by setting the formalism for massive and massless spin-1 and -2 fields in Section 2 and emphasize the Stückelberg language both for the Proca and the Fierz-Pauli fields. In Part I we then derive consistent theories using a higher-dimensional framework, either using a braneworld scenario à la DGP in Section 4, or via a discretization [Footnote 1] (or Kaluza-Klein reduction) of the extra dimension in Section 5. This second approach leads to the theory of ghost-free massive gravity (also known as dRGT) which we review in more depth in Part II. Its formulation is summarized in Section 6, before tackling other interesting aspects such as the fate of the BD ghost in Section 7, deriving its decoupling limit in Section 8, and various extensions in Section 9. The Vainshtein mechanism and other related aspects are discussed in Section 10. The phenomenology of ghost-free massive gravity is then reviewed in Part III including a discussion of solar-system tests, gravitational waves, weak lensing, pulsars, black holes and cosmology. We then conclude with other related theories of massive gravity in Part IV, including new massive gravity, Lorentz breaking theories of massive gravity and non-local versions.

Notations and conventions: Throughout this review, we work in units where the reduced Planck constant ℏ and the speed of light c are set to unity. The gravitational Newton constant \(G_N\) is related to the Planck scale by \(8\pi {G_N} = M_{{\rm{Pl}}}^{- 2}\). Unless specified otherwise, d represents the number of spacetime dimensions. We use the mainly + convention (−+ ⋯ +) and space indices are denoted by i, j, ⋯ = 1, ⋯, d − 1 while 0 represents the time-like direction, \(x^0 = t\).

We also use the symmetric convention: \((a,b) = {1 \over 2}(ab + ba)\) and \([a,b] = {1 \over 2}(ab - ba)\). Throughout this review, square brackets around a tensor indicate its trace, for instance \([{\mathbb X}] = {\mathbb X}_\mu ^\mu\), \([{{\mathbb X}^2}] = {\mathbb X}_\nu^\mu {\mathbb X}_\mu ^\nu\), etc. We also use the notation \(\Pi_{\mu\nu} = \partial_\mu\partial_\nu\pi\) and \({\mathcal I} = \delta_\nu^\mu\). The symbols \(\varepsilon_{\mu\nu\alpha\beta}\) and \(\varepsilon_{abcde}\) represent the Levi-Civita symbol in four and five dimensions respectively, with \(\varepsilon_{0123} = \varepsilon_{01234} = 1 = \varepsilon^{0123}\).

2 Massive and Interacting Fields

2.1 Proca field

2.1.1 Maxwell kinetic term

Before jumping into the subtleties of massive spin-2 fields and gravity in general, we start this review with massless and massive spin-1 fields as a warm-up. Consider a Lorentz vector field \(A_\mu\) living on a four-dimensional Minkowski manifold. We focus this discussion on four dimensions; the extension to d dimensions is straightforward. Restricting ourselves to Lorentz invariant and local actions for now, the kinetic term can be decomposed into three possible contributions:

$${\mathcal L}_{{\rm{kin}}}^{{\rm{spin}} - 1} = {a_1}{{\mathcal L}_1} + {a_2}{{\mathcal L}_2} + {a_3}{{\mathcal L}_3}\,,$$
(2.1)

where \(a_{1,2,3}\) are so far arbitrary dimensionless coefficients and the possible kinetic terms are given by

$${{\mathcal L}_1} = {\partial _\mu}{A^\nu}{\partial ^\mu}{A_\nu}$$
(2.2)
$${{\mathcal L}_2} = {\partial _\mu}{A^\mu}{\partial _\nu}{A^\nu}$$
(2.3)
$${{\mathcal L}_3} = {\partial _\mu}{A^\nu}{\partial _\nu}{A^\mu}\,,$$
(2.4)

where in this section, indices are raised and lowered with respect to the flat Minkowski metric. The second and third contributions are equivalent up to a boundary term, so we set \(a_3 = 0\) without loss of generality.

We now proceed to establish the behavior of the different degrees of freedom (dofs) present in this theory. A priori, a Lorentz vector field \(A_\mu\) in four dimensions could have up to four dofs, which we can split as a transverse contribution \(A_\mu ^ \bot\) satisfying \({\partial ^\mu}A_\mu ^ \bot = 0\) bearing a priori three dofs and a longitudinal mode χ with \(A_\mu = A_\mu^\bot + \partial_\mu\chi\).

2.1.1.1 Helicity-0 mode

Focusing on the longitudinal (or helicity-0) mode χ, the kinetic term takes the form

$${\mathcal L}_{\mathrm{kin}}^{\chi}=(a_1+a_2) \partial_\mu\partial_\nu \chi \partial^\mu\partial^\nu\chi= (a_1+a_2)(\square \chi)^2\,,$$
(2.5)

where \(\square = \eta^{\mu\nu}\partial_\mu\partial_\nu\) represents the d’Alembertian in flat Minkowski space and the second equality holds after integrations by parts. We directly see that unless \(a_1 = -a_2\), the kinetic term for the field χ bears higher time (and space) derivatives. As a well known consequence of Ostrogradsky’s theorem [421], two dofs are actually hidden in χ with an opposite sign kinetic term. This can be seen by expressing the propagator \(\square^{-2}\) as the sum of two propagators with opposite signs:

$$\frac{1}{\square^2}=\lim\limits_{m\rightarrow 0} \frac{1}{2m^2}\left(\frac{1}{\square-m^2}-\frac{1}{\square+m^2}\right)\,,$$
(2.6)

signaling that one of the modes always couples the wrong way to external sources. The mass m of this mode is arbitrarily low which implies that the theory (2.1) with \(a_3 = 0\) and \(a_1 + a_2 \neq 0\) is always sick. Alternatively, one can see the appearance of the Ostrogradsky instability by introducing a Lagrange multiplier \(\tilde\chi(x)\), so that the kinetic action (2.5) for χ is equivalent to

$$\mathcal L_{\mathrm{kin}}^{\chi}=(a_1+a_2)\left(\tilde\chi \square \chi -\frac 14 \tilde\chi^2\right)\,,$$
(2.7)

after integrating out the Lagrange multiplier [Footnote 2] \(\tilde\chi \equiv 2\square\chi\). We can now perform the change of variables \(\chi = \phi_1 + \phi_2\) and \(\tilde\chi = {\phi _1} - {\phi _2}\), giving the resulting Lagrangian for the two scalar fields \(\phi_{1,2}\)

$$\mathcal L_{\mathrm{kin}}^{\chi}=(a_1+a_2)\left(\phi_1 \square \phi_1-\phi_2 \square \phi_2-\frac 14 (\phi_1-\phi_2)^2\right)\,.$$
(2.8)
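The two steps leading to (2.7) and (2.8) can be checked explicitly. The following is a short worked sketch that uses only the definitions above, drops total derivatives (denoted by ≃), and omits the overall factor \((a_1 + a_2)\):

$$\begin{array}{l} \square\chi - {1 \over 2}\tilde\chi = 0 \;\Rightarrow\; \tilde\chi = 2\square\chi\,, \qquad \tilde\chi\,\square\chi - {1 \over 4}\tilde\chi^2 = 2(\square\chi)^2 - (\square\chi)^2 = (\square\chi)^2\,, \\ \tilde\chi\,\square\chi = (\phi_1 - \phi_2)\square(\phi_1 + \phi_2) = \phi_1\square\phi_1 - \phi_2\square\phi_2 + (\phi_1\square\phi_2 - \phi_2\square\phi_1) \simeq \phi_1\square\phi_1 - \phi_2\square\phi_2\,. \end{array}$$

The first line is the equation of motion of the Lagrange multiplier and its substitution back into (2.7). In the second line the mixed terms cancel because \(\phi_1\square\phi_2 \simeq \phi_2\square\phi_1\) after two integrations by parts, which is what leaves the two kinetic terms with opposite signs.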

As a result, the two scalar fields \(\phi_{1,2}\) always enter with opposite kinetic terms, signaling that one of them is always a ghost [Footnote 3]. The only way to prevent this generic pathology is to make the specific choice \(a_1 + a_2 = 0\), which corresponds to the well-known Maxwell kinetic term.
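The propagator identity (2.6) used earlier in this subsection can also be verified in one line. This is only a formal sketch, treating \(\square\) as a commuting symbol (as it is in momentum space):

$$\frac{1}{2m^2}\left(\frac{1}{\square - m^2} - \frac{1}{\square + m^2}\right) = \frac{1}{2m^2}\,\frac{(\square + m^2) - (\square - m^2)}{(\square - m^2)(\square + m^2)} = \frac{1}{\square^2 - m^4}\;\longrightarrow\;\frac{1}{\square^2}\quad \text{as}\quad m \rightarrow 0\,.$$

The two terms in the bracket enter with opposite overall signs for any value of m, which is the statement that one of the two modes is ghost-like.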

2.1.1.2 Helicity-1 mode and gauge symmetry

Now that the form of the local and covariant kinetic term has been uniquely established by the requirement that no ghost rides on top of the helicity-0 mode, we focus on the remaining transverse mode \(A_\mu ^ \bot\),

$${\mathcal L}_{{\rm{kin}}}^{{\rm{helicity}} - {\rm{1}}} = {a_1}\, {\left({{\partial _\mu}A_\nu ^ \bot} \right)^2}\, ,$$
(2.9)

which has the correct normalization if \(a_1 = -1/2\). As a result, the only possible local kinetic term for a spin-1 field is the Maxwell one:

$${\mathcal L}_{{\rm{kin}}}^{{\rm{spin}} - {\rm{1}}} = - {1 \over 4}F_{\mu \nu}^2$$
(2.10)

with \(F_{\mu\nu} = \partial_\mu A_\nu - \partial_\nu A_\mu\). Restricting ourselves to a massless spin-1 field (with no potential and other interactions), the resulting Maxwell theory satisfies the following U(1) gauge symmetry:

$${A_\mu} \rightarrow {A_\mu} + {\partial _\mu}\xi .$$
(2.11)

This gauge symmetry projects out two of the naive four degrees of freedom. This can be seen at the level of the Lagrangian directly, where the gauge symmetry (2.11) allows us to fix the gauge of our choice. For convenience, we perform a (3 + 1)-split and choose Coulomb gauge \(\partial_i A^i = 0\), so that only two dofs are present in \(A_i\), i.e., \(A_i\) contains no longitudinal mode: writing \({A_i} = A_i^t + {\partial _i}{A^l}\) with \({\partial ^i}A_i^t = 0\), the Coulomb gauge sets the longitudinal mode \(A^l = 0\). The time-component \(A^0\) does not exhibit a kinetic term,

$${\mathcal L}_{{\rm{kin}}}^{{\rm{spin}} - {\rm{1}}} = {1 \over 2}{({\partial _t}{A_i})^2} + {1 \over 2}{({\partial _i}{A^0})^2} - {1 \over 2}{({\partial _i}{A_j})^2}\, ,$$
(2.12)

and appears instead as a Lagrange multiplier imposing the constraint

$${\partial _i}{\partial ^i}{A_0} \equiv 0\, .$$
(2.13)

The Maxwell action has therefore only two propagating dofs in \(A_i^t\),

$${\mathcal L}_{{\rm{kin}}}^{{\rm{spin}} - {\rm{1}}} = - {1 \over 2}{({\partial _\mu}A_i^t)^2}\, .$$
(2.14)

To summarize, the Maxwell kinetic term for a vector field and the fact that a massless vector field in four dimensions only propagates 2 dofs are not a choice but have been imposed upon us by the requirement that no ghost rides along with the helicity-0 mode. The resulting theory is enriched by a U(1) gauge symmetry which in turn freezes the helicity-0 mode when no mass term is present. We now ‘promote’ the theory to a massive vector field.

2.1.2 Proca mass term

Starting with the Maxwell action, we consider a covariant mass term \(A_\mu A^\mu\) corresponding to the Proca action

$${{\mathcal L}_{{\rm{Proca}}}} = - {1 \over 4}F_{\mu \nu}^2 - {1 \over 2}{m^2}{A_\mu}{A^\mu}\, ,$$
(2.15)

and emphasize that the presence of a mass term does not change the fact that the kinetic term has been uniquely fixed by the requirement of the absence of ghosts. An immediate consequence of the Proca mass term is the breaking of the U(1) gauge symmetry (2.11), so that the Coulomb gauge can no longer be chosen and the longitudinal mode is now dynamical. To see this, let us use the previous decomposition \({A_\mu} = A_\mu ^ \bot + {\partial _\mu}\hat\chi\) and notice that the mass term now introduces a kinetic term for the helicity-0 mode \(\chi = m\hat\chi\),

$${{\mathcal L}_{{\rm{Proca}}}} = - {1 \over 2}{({\partial _\mu}A_\nu ^ \bot)^2} - {1 \over 2}{m^2}{(A_\mu ^ \bot)^2} - {1 \over 2}{({\partial _\mu}\chi)^2}\, .$$
(2.16)

A massive vector field thus propagates three dofs, namely two in the transverse modes \(A_\mu ^ \bot\) and one in the longitudinal mode χ. Physically, this can be understood by the fact that a massive vector field does not propagate along the light-cone, and the fluctuations along the line of propagation correspond to an additional physical dof.
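The step from (2.15) to (2.16) can be sketched as follows. This is a worked check under the stated decomposition, using only \(\partial^\mu A_\mu^\bot = 0\) and dropping total derivatives (≃):

$$\begin{array}{l} F_{\mu\nu} = \partial_\mu A_\nu^\bot - \partial_\nu A_\mu^\bot\,, \qquad - {1 \over 4}F_{\mu\nu}^2 \simeq - {1 \over 2}(\partial_\mu A_\nu^\bot)^2\,, \\ - {1 \over 2}m^2(A_\mu^\bot + \partial_\mu\hat\chi)^2 \simeq - {1 \over 2}m^2(A_\mu^\bot)^2 - {1 \over 2}m^2(\partial_\mu\hat\chi)^2 = - {1 \over 2}m^2(A_\mu^\bot)^2 - {1 \over 2}(\partial_\mu\chi)^2\,. \end{array}$$

The field strength does not see the longitudinal mode because partial derivatives commute. The cross term \(- m^2 A^{\bot\mu}\partial_\mu\hat\chi\) integrates by parts to \(m^2\hat\chi\,\partial_\mu A^{\bot\mu} = 0\), so the helicity-0 mode acquires its kinetic term from the mass term alone.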

Before moving to the Abelian Higgs mechanism, which provides a dynamical way to give a mass to bosons, we first comment on the discontinuity in number of dofs between the massive and massless case. When considering the Proca action (2.16) with the properly normalized fields \(A_\mu ^ \bot\) and χ, one does not recover the massless Maxwell action (2.9) or (2.10) when sending the boson mass m → 0. A priori, this seems to signal the presence of a discontinuity which would allow us to distinguish between for instance a massless photon and a massive one no matter how tiny the mass. In practice, however, the difference is physically indistinguishable so long as the photon couples to external sources in a way which respects the U(1) symmetry. Note however that quantum anomalies remain sensitive to the mass of the field so the discontinuity is still present at this level, see Refs. [197, 204].

To physically tell the difference between a massless vector field and a massive one with tiny mass, one has to probe the system, or in other words include interactions with external sources

$${{\mathcal L}_{{\rm{sources}}}} = - {A_\mu}{J^\mu}\, .$$
(2.17)

The U(1) symmetry present in the massless case is preserved only if the external sources are conserved, \(\partial_\mu J^\mu = 0\). Such a source produces a vector field which satisfies

$$\square A_\mu^\bot=J_\mu$$
(2.18)

in the massless case. The exchange amplitude between two conserved sources \(J_\mu\) and \(J'_\mu\) mediated by a massless vector field is given by

$${\mathcal A}_{JJ'}^{{\rm{massless}}} = \int {{\rm{d}}^4}x\, A_\mu ^ \bot {J'}^\mu = \int {{\rm{d}}^4}x\, {J'}^\mu {1 \over {\square}}{J_\mu}\,.$$
(2.19)

On the other hand, if the vector field is massive, its response to the source \(J_\mu\) is instead

$$(\square-m^2) A_\mu^\bot=J_\mu \quad \text{and}\quad \square \chi=0\,.$$
(2.20)

In that case, one needs to consider both the transverse and the longitudinal modes of the vector field in the exchange amplitude between the two sources \(J_\mu\) and \(J'_\mu\). Fortunately, a conserved source does not excite the longitudinal mode and the exchange amplitude is uniquely given by the transverse mode,

$${\mathcal A}_{JJ'}^{{\rm{massive}}} = \int {{\rm{d}}^4}x\,(A_\mu ^ \bot + {\partial _\mu}\chi){J'}^\mu = \int {{\rm{d}}^4}x\,{J'}^\mu {1 \over {\square \, - \,{m^2}}}{J_\mu}\, .$$
(2.21)

As a result, the exchange amplitude between two conserved sources is the same in the limit m → 0 no matter whether the vector field is intrinsically massive and propagates 3 dofs or if it is massless and only propagates 2 dofs. It is, therefore, impossible to probe the difference between an exactly massless vector field and a massive one with arbitrarily small mass.
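The decoupling of the longitudinal mode and the continuity of the amplitude can be made explicit in two short steps. This is a sketch that assumes the fields and sources fall off fast enough at infinity for boundary terms to vanish:

$$\int {\rm{d}}^4x\,\partial_\mu\chi\,{J'}^\mu = - \int {\rm{d}}^4x\,\chi\,\partial_\mu {J'}^\mu = 0\,, \qquad \lim\limits_{m \rightarrow 0}\int {\rm{d}}^4x\,{J'}^\mu {1 \over {\square - m^2}}J_\mu = \int {\rm{d}}^4x\,{J'}^\mu {1 \over \square}J_\mu\,.$$

The first relation uses only the conservation of the source, and the second shows that the massive amplitude reduces to the massless one as the mass is sent to zero.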

Notice that in the massive case no U(1) symmetry is present and the source need not be conserved. However, the previous argument remains unchanged so long as \(\partial_\mu J^\mu\) goes to zero in the massless limit at least as quickly as the mass itself. If this condition is violated, then the helicity-0 mode ought to be included in the exchange amplitude (2.21). In parallel, in the massless case the non-conserved source provides a new kinetic term for the longitudinal mode which then becomes dynamical.

2.1.3 Abelian Higgs mechanism for electromagnetism

Associated with the absence of an intrinsic discontinuity in the massless limit is the existence of a Higgs mechanism for the vector field whereby the vector field acquires a mass dynamically. As we shall see later, the situation is different for gravity where no equivalent dynamical Higgs mechanism has been discovered to date. Nevertheless, the tools used to describe the Abelian Higgs mechanism and in particular the introduction of a Stückelberg field will prove useful in the gravitational case as well.

To describe the Abelian Higgs mechanism, we start with a vector field with associated Maxwell tensor \(F_{\mu\nu}\) and a complex scalar field ϕ with quartic potential

$${{\mathcal L}_{{\rm{AH}}}} = - {1 \over 4}F_{\mu \nu}^2 - {1 \over 2}\left({{{\mathcal D}_\mu}\phi} \right){\left({{{\mathcal D}^\mu}\phi} \right)^{\ast}} - \lambda {\left({\phi {\phi ^{\ast}} - \Phi _0^2} \right)^2} .$$
(2.22)

The covariant derivative, \({{\mathcal D}_\mu} = {\partial _\mu} - iq{A_\mu}\), ensures the existence of the U(1) symmetry, which in addition to (2.11) transforms the scalar field as

$$\phi \rightarrow \phi {e^{iq\xi}}\,.$$
(2.23)

Splitting the complex scalar field ϕ into its norm and phase, \(\phi = \varphi e^{i\chi}\), we see that the covariant derivative plays the role of the mass term for the vector field when the scalar field acquires a non-vanishing vacuum expectation value (vev),

$${{\mathcal L}_{{\rm{AH}}}} = - {1 \over 4}F_{\mu \nu}^2 - {1 \over 2}{\varphi ^2}{\left({q{A_\mu} - {\partial _\mu}\chi} \right)^2} - {1 \over 2}{({\partial _\mu}\varphi)^2} - \lambda {\left({{\varphi ^2} - \Phi _0^2} \right)^2}\, .$$
(2.24)
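The passage from (2.22) to (2.24) can be checked directly. The sketch below assumes the polar split \(\phi = \varphi e^{i\chi}\) with real fields φ and χ:

$$\begin{array}{l} {\mathcal D}_\mu\phi = (\partial_\mu - iqA_\mu)(\varphi e^{i\chi}) = e^{i\chi}\left[\partial_\mu\varphi + i\varphi\,(\partial_\mu\chi - qA_\mu)\right]\,, \\ ({\mathcal D}_\mu\phi)({\mathcal D}^\mu\phi)^{\ast} = (\partial_\mu\varphi)^2 + \varphi^2(qA_\mu - \partial_\mu\chi)^2\,, \qquad \phi\phi^{\ast} = \varphi^2\,. \end{array}$$

The phase drops out of the modulus, the real and imaginary parts of the bracket do not mix, and the potential depends on the norm only.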

The Higgs field φ can be made arbitrarily massive by setting λ ≫ 1 in such a way that its dynamics may be neglected and the field can be treated as frozen at φ ≡ Φ0 = const. The resulting theory is that of a massive vector field,

$${{\mathcal L}_{{\rm{AH}}}} = - {1 \over 4}F_{\mu \nu}^2 - {1 \over 2}\Phi _0^2{\left({q{A_\mu} - {\partial _\mu}\chi} \right)^2}\, ,$$
(2.25)

where the phase χ of the complex scalar field plays the role of a Stückelberg field which restores the U(1) gauge symmetry in the massive case,

$${A_\mu} \rightarrow {A_\mu} + \;{\partial _\mu}\xi (x)$$
(2.26)
$$\chi \rightarrow \chi + q \xi (x)\, .$$
(2.27)
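As a quick check of (2.26) and (2.27), and of how the Proca theory is recovered, one can follow the combination that enters (2.25). This is a sketch under the assumption that q ≠ 0, so that the gauge parameter can be chosen as \(\xi = -\chi/q\):

$$\begin{array}{l} qA_\mu - \partial_\mu\chi \;\rightarrow\; q(A_\mu + \partial_\mu\xi) - \partial_\mu(\chi + q\xi) = qA_\mu - \partial_\mu\chi\,, \\ \xi = - \chi/q:\quad \chi \rightarrow 0\,, \qquad {\mathcal L}_{{\rm{AH}}} \rightarrow - {1 \over 4}F_{\mu\nu}^2 - {1 \over 2}q^2\Phi_0^2\,A_\mu A^\mu\,. \end{array}$$

In this unitary gauge the Lagrangian has the Proca form (2.15) with \(m^2 = q^2\Phi_0^2\).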

In this formalism, the U(1) gauge symmetry is restored at the price of explicitly introducing a Stückelberg field which transforms in such a way as to make the mass term invariant. The symmetry ensures that the vector field \(A_\mu\) propagates only 2 dofs, while the Stückelberg field χ propagates the third dof. While no equivalent to the Higgs mechanism exists for gravity, the same Stückelberg trick to restore the symmetry can be used in that case. Since in that context the broken symmetry is coordinate transformation invariance (full diffeomorphism invariance or covariance), four Stückelberg fields should in principle be included in the context of massive gravity, as we shall see below.

2.1.4 Interacting spin-1 fields

Now that we have introduced the notion of a massless and a massive spin-1 field, let us look at N interacting spin-1 fields. We start with N free and massless gauge fields, \(A_\mu ^{(a)}\), with a = 1, ⋯, N, and respective Maxwell tensors \(F_{\mu \nu}^{(a)} = {\partial _\mu}A_\nu^{(a)} - {\partial _\nu}A_\mu ^{(a)}\),

$${\mathcal L}_{{\rm{kin}}}^{{\rm{N}}\, {\rm{spin}} - {\rm{1}}} = - {1 \over 4}\sum\limits_{a = 1}^N {{{\left({F_{\mu \nu}^{(a)}} \right)}^2}} .$$
(2.28)

The theory is then manifestly Abelian and invariant under N copies of U(1) (i.e., the symmetry group is \(U(1)^N\), which is Abelian, as opposed to U(N), which would correspond to a Yang-Mills theory and would not be Abelian).

However, in addition to these N gauge invariances, the kinetic term is invariant under global rotations in field space,

$$A_\mu ^{(a)} \rightarrow \tilde A_\mu ^{(a)} = O_{\;b}^aA_\mu ^{(b)},$$
(2.29)

where \(O_b^a\) is a (global) rotation matrix. Now let us consider some interactions between these different fields. At the linear level (quadratic level in the action), the most general set of interactions is

$${{\mathcal L}_{{\rm{int}}}} = - {1 \over 2}\sum\limits_{a,b} {{{\mathcal I}_{ab}}A_\mu ^{(a)}A_\nu ^{(b)}{\eta ^{\mu \nu}}} ,$$
(2.30)

where \({{\mathcal I}_{ab}}\) is an arbitrary symmetric matrix with constant coefficients. For an arbitrary rank-N matrix, all N copies of U(1) are broken, and the theory then propagates N additional helicity-0 modes, for a total of 3N independent polarizations in four spacetime dimensions. However, if the rank r of \({\mathcal I}\) is r < N, i.e., if some of the eigenvalues of \({\mathcal I}\) vanish, then there are N − r special directions in field space which receive no interactions, and the theory thus keeps N − r independent copies of U(1). The theory then propagates r massive spin-1 fields and N − r massless spin-1 fields, for a total of 3r + 2(N − r) = 2N + r independent polarizations in four dimensions.

We can see this statement more explicitly in the case of N spin-1 fields by diagonalizing the mass matrix \({\mathcal I}\). As mentioned previously, the kinetic term is invariant under field space rotations, (2.29), so one can use this freedom to work in a field representation where the mass matrix \({\mathcal I}\) is diagonal,

$${{\mathcal I}_{ab}} = {\rm{diag}}\left({m_1^2, \cdots ,m_N^2} \right).$$
(2.31)

In this representation the gauge fields are the mass eigenstates and the mass spectrum is simply given by the eigenvalues of \({{\mathcal I}_{ab}}\).
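A minimal illustration of this counting is the case N = 2 with a rank-1 mass matrix. The specific matrix and the names \(A_\mu^{(\pm)}\) below are chosen here for illustration only:

$${\mathcal I}_{ab} = {m^2 \over 2}\left(\begin{array}{rr} 1 & -1 \\ -1 & 1 \end{array}\right)\,, \qquad A_\mu^{(\pm)} = {1 \over \sqrt 2}\left(A_\mu^{(1)} \pm A_\mu^{(2)}\right)\,, \qquad {\mathcal L}_{{\rm{int}}} = - {m^2 \over 4}\left(A_\mu^{(1)} - A_\mu^{(2)}\right)^2 = - {1 \over 2}m^2\left(A_\mu^{(-)}\right)^2\,.$$

The eigenvalues are 0 and \(m^2\): the combination \(A_\mu^{(+)}\) receives no mass term and keeps its U(1) symmetry (2 polarizations), while \(A_\mu^{(-)}\) is a Proca field (3 polarizations), for a total of 5 polarizations in four dimensions.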

2.2 Spin-2 field

As we have seen in the case of a vector field, as long as it is local and Lorentz-invariant, the kinetic term is uniquely fixed by the requirement that no ghost be present. Moving now to a spin-2 field, the same argument applies exactly and the Einstein-Hilbert term appears naturally as the unique kinetic term free of any ghost-like instability. This is possible thanks to a symmetry which projects out all unwanted dofs, namely diffeomorphism invariance (linear diffs at the linearized level, and non-linear diffs/general covariance at the non-linear level).

2.2.1 Einstein-Hilbert kinetic term

We consider a symmetric Lorentz tensor field \(h_{\mu\nu}\). The kinetic term can be decomposed into four possible local contributions (assuming Lorentz invariance and ignoring terms which are equivalent upon integration by parts):

$${\mathcal L}_{{\rm{kin}}}^{{\rm{spin - 2}}} = {1 \over 2}\;{\partial ^\alpha}{h^{\mu \nu}}\left({{b_1}{\partial _\alpha}{h_{\mu \nu}} + 2{b_2}{\partial _{(\mu}}{h_{\nu)\alpha}} + {b_3}{\partial _\alpha}h{\eta _{\mu \nu}} + 2{b_4}{\partial _{(\mu}}h{\eta _{\nu)\alpha}}} \right),$$
(2.32)

where \(b_{1,2,3,4}\) are dimensionless coefficients which are to be determined in the same way as for the vector field. We split the 10 components of the symmetric tensor field \(h_{\mu\nu}\) into a transverse tensor \(h_{\mu \nu}^T\) (which carries 6 components) and a vector field \(\chi_\mu\) (which carries 4 components),

$${h_{\mu \nu}} = h_{\mu \nu}^T + 2{\partial _{(\mu}}{\chi _{\nu)}}.$$
(2.33)

Just as in the case of the spin-1 field, an arbitrary kinetic term of the form (2.32) with untuned coefficients \(b_i\) would contain higher derivatives for \(\chi_\mu\) which in turn would imply a ghost. As we shall see below, avoiding a ghost within the kinetic term automatically leads to gauge-invariance. After substitution of \(h_{\mu\nu}\) in terms of \(h_{\mu \nu}^T\) and \(\chi_\mu\), the potentially dangerous parts are

$$\begin{array}{*{20}c} {{\mathcal L}_{{\rm{kin}}}^{{\rm{spin - 2}}} \supset ({b_1} + {b_2}){\chi ^\mu}{\square ^2}{\chi _\mu} + ({b_1} + 3{b_2} + 2{b_3} + 4{b_4}){\chi ^\mu}\square {\partial _\mu}{\partial _\nu}{\chi ^\nu}} \\ {- 2{h^{T\mu \nu}}\left({({b_2} + {b_4}){\partial _\mu}{\partial _\nu}{\partial _\alpha}{\chi ^\alpha} + ({b_1} + {b_2}){\partial _\mu}\square {\chi _\nu}} \right.} \\ {\left. {+ ({b_3} + {b_4})\square {\partial _\alpha}{\chi ^\alpha}{\eta _{\mu \nu}}} \right).} \end{array}$$
(2.34)

Preventing these higher derivative terms from arising sets

$${b_4} = - {b_3} = - {b_2} = {b_1},$$
(2.35)

or in other words, the unique (local and Lorentz-invariant) kinetic term one can write for a spin-2 field is the Einstein-Hilbert term

$${\mathcal L}_{{\rm{kin}}}^{{\rm{spin}} - {\rm{2}}} = - {1 \over 4}{h^{\mu \nu}}\hat {\mathcal E}_{\mu \nu}^{\alpha \beta}{h_{\alpha \beta}} = - {1 \over 4}{h^{T\mu \nu}}\hat {\mathcal E}_{\mu \nu}^{\alpha \beta}h_{\alpha \beta}^T,$$
(2.36)

where \(\hat {\mathcal E}\) is the Lichnerowicz operator

$$\hat {\mathcal E}_{\mu \nu}^{\alpha \beta}{h_{\alpha \beta}} = - {1 \over 2}\left({\square {h_{\mu \nu}} - 2{\partial _{(\mu}}{\partial _\alpha}h_{\nu)}^\alpha + {\partial _\mu}{\partial _\nu}h - {\eta _{\mu \nu}}(\square h - {\partial _\alpha}{\partial _\beta}{h^{\alpha \beta}})} \right),$$
(2.37)

and we have set b1 = −1/4 to follow standard conventions. As a result, the kinetic term for the tensor field is invariant under the following gauge transformation,

$${h_{\mu \nu}} \rightarrow {h_{\mu \nu}} + {\partial _{(\mu}}{\xi _{\nu)}}.$$
(2.38)
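This gauge invariance can be traced back to the tuning (2.35). As a short check, substituting \(b_2 = -b_1\), \(b_3 = -b_1\) and \(b_4 = b_1\) into the four coefficient combinations that appear in (2.34) gives

$$\begin{array}{l} b_1 + b_2 = 0\,, \qquad b_1 + 3b_2 + 2b_3 + 4b_4 = (1 - 3 - 2 + 4)\,b_1 = 0\,, \\ b_2 + b_4 = 0\,, \qquad b_3 + b_4 = 0\,. \end{array}$$

All terms involving \(\chi_\mu\) in (2.34) therefore drop out, in agreement with the second equality in (2.36). In the split (2.33), the transformation (2.38) amounts to the shift \(\chi_\mu \rightarrow \chi_\mu + {1 \over 2}\xi_\mu\) at fixed \(h_{\mu\nu}^T\), so a kinetic term that does not depend on \(\chi_\mu\) is invariant under it.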

We emphasize that the form of the kinetic term and its gauge invariance are independent of whether or not the tensor field has a mass (as long as we restrict ourselves to a local and Lorentz-invariant kinetic term). However, just as in the case of a massive vector field, this gauge invariance cannot be maintained by a mass term or any other self-interacting potential. So only in the massless case does this symmetry remain exact. Out of the 10 components of a tensor field, the gauge symmetry removes 2 × 4 = 8 of them, leaving a massless tensor field with only two propagating dofs, as is well known from the propagation of gravitational waves in four dimensions.

In d ≥ 3 spacetime dimensions, gravitational waves have \(d(d + 1)/2 - 2d = d(d - 3)/2\) independent polarizations. This means that in three dimensions there are no gravitational waves and in five dimensions they have five independent polarizations.
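For concreteness, the counting can be tabulated for the lowest dimensions. This is simple arithmetic from the formula above: the components of a symmetric tensor, minus twice the number of gauge parameters:

$$\begin{array}{c|ccc} d & 3 & 4 & 5 \\ \hline d(d + 1)/2 & 6 & 10 & 15 \\ 2d & 6 & 8 & 10 \\ d(d - 3)/2 & 0 & 2 & 5 \end{array}$$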

2.2.2 Fierz-Pauli mass term

As seen in Section 2.2.1, for a local and Lorentz-invariant theory, the linearized kinetic term is uniquely fixed by the requirement that longitudinal modes propagate no ghost, which in turn prevents that operator from exciting these modes altogether. Just as in the case of a massive spin-1 field, we shall see in what follows that the longitudinal modes can nevertheless be excited when including a mass term. In what follows we restrict ourselves to linear considerations and spare any non-linearity discussions for Parts I and II. See also [327] for an analysis of the linearized Fierz-Pauli theory using Bardeen variables.

In the case of a spin-2 field \(h_{\mu\nu}\), we are a priori free to choose between two possible mass terms \(h_{\mu \nu}^2\) and \(h^2\), so that the generic mass term can be written as a combination of both,

$${{\mathcal L}_{{\rm{mass}}}} = - {1 \over 8}{m^2}\left({h_{\mu \nu}^2 - A{h^2}} \right),$$
(2.39)

where A is a dimensionless parameter. Just as in the case of the kinetic term, the stability of the theory constrains the phase space very strongly and we shall see that only for A = 1 is the theory stable at that order. The presence of this mass term breaks diffeomorphism invariance. Restoring it requires the introduction of four Stückelberg fields \(\chi_\mu\) which transform under linear diffeomorphisms in such a way as to make the mass term invariant, just as in the Abelian-Higgs mechanism for electromagnetism [174]. Including the four linearized Stückelberg fields, the resulting mass term

$${{\mathcal L}_{{\rm{mass}}}} = - {1 \over 8}{m^2}\left({{{({h_{\mu \nu}} + 2{\partial _{(\mu}}{\chi _{\nu)}})}^2} - A{{(h + 2{\partial _\alpha}{\chi ^\alpha})}^2}} \right),$$
(2.40)

is invariant under the simultaneous transformations:

$$h_{\mu\nu} \rightarrow h_{\mu\nu} + \partial_{(\mu}\xi_{\nu)}\,,$$
(2.41)
$$\chi_\mu \rightarrow \chi_\mu - {1 \over 2}\xi_\mu\,.$$
(2.42)
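To make the invariance explicit, one can follow the combination that enters (2.40) through the transformations (2.41) and (2.42); this is only a short worked step:

$$h_{\mu\nu} + 2\partial_{(\mu}\chi_{\nu)} \rightarrow h_{\mu\nu} + \partial_{(\mu}\xi_{\nu)} + 2\partial_{(\mu}\chi_{\nu)} - \partial_{(\mu}\xi_{\nu)} = h_{\mu\nu} + 2\partial_{(\mu}\chi_{\nu)}.$$

Its trace \(h + 2\partial_\alpha\chi^\alpha\) is invariant for the same reason, so both terms of (2.40) are separately invariant for any value of A.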

This mass term then provides a kinetic term for the Stückelberg fields

$${\mathcal L}_{{\rm{kin}}}^\chi = - {1 \over 2}{m^2}\left({{{({\partial _\mu}{\chi _\nu})}^2} - A{{({\partial _\alpha}{\chi ^\alpha})}^2}} \right),$$
(2.43)

which is precisely of the same form as the kinetic term considered for a spin-1 field (2.1) in Section 2.1.1 with \(a_3 = 0\) and \(a_2 = A a_1\). Now the same logic as in Section 2.1.1 applies: singling out the longitudinal component of these Stückelberg fields, it follows that the only combination which does not involve higher derivatives is \(a_2 = a_1\), or in other words A = 1. As a result, the only possible mass term one can consider which is free from an Ostrogradsky instability is the Fierz-Pauli mass term

$${\mathcal L}_{\rm FP\,mass} = - {1 \over 8} m^2 \left((h_{\mu\nu} + 2\partial_{(\mu}\chi_{\nu)})^2 - (h + 2\partial_\alpha\chi^\alpha)^2\right).$$
(2.44)

In unitary gauge, i.e., in the gauge where the Stückelberg fields \(\chi^a\) are set to zero, the Fierz-Pauli mass term simply reduces to

$${\mathcal L}_{\rm FP\,mass} = - {1 \over 8} m^2 \left(h_{\mu\nu}^2 - h^2\right),$$
(2.45)

where once again the indices are raised and lowered with respect to the Minkowski metric.

2.2.2.1 Propagating degrees of freedom

To identify the propagating degrees of freedom we may further split the Stückelberg fields \(\chi^a\) into a transverse and a longitudinal mode,

$${\chi ^a} = {1 \over m}{A^a} + {1 \over {{m^2}}}{\eta ^{ab}}{\partial _b}\pi ,$$
(2.46)

(where the normalization with negative factors of m has been introduced for further convenience).

In terms of \(h_{\mu\nu}\) and the Stückelberg fields \(A_\mu\) and π, the linearized Fierz-Pauli action is

$$\begin{array}{*{20}c} {\mathcal L}_{\rm FP} = - {1 \over 4} h^{\mu\nu}\hat{\mathcal E}_{\mu\nu}^{\alpha\beta} h_{\alpha\beta} - {1 \over 2} h^{\mu\nu}(\Pi_{\mu\nu} - [\Pi]\eta_{\mu\nu}) - {1 \over 8} F_{\mu\nu}^2 \\ - {1 \over 8} m^2 (h_{\mu\nu}^2 - h^2) - {1 \over 2} m (h^{\mu\nu} - h\eta^{\mu\nu})\partial_{(\mu}A_{\nu)}, \end{array}$$
(2.47)

with \(F_{\mu\nu} = \partial_\mu A_\nu - \partial_\nu A_\mu\) and \(\Pi_{\mu\nu} = \partial_\mu\partial_\nu\pi\), and all the indices are raised and lowered with respect to the Minkowski metric.

Terms on the first line represent the kinetic terms for the different fields while the second line represents the mass terms and mixing.

We see that the kinetic term for the field π is hidden in the mixing with \(h_{\mu\nu}\). To make the field content explicit, we may diagonalize this mixing by shifting \(h_{\mu\nu} = \tilde h_{\mu\nu} + \pi\eta_{\mu\nu}\), and the linearized Fierz-Pauli action becomes

$$\begin{array}{*{20}c} {\mathcal L}_{\rm FP} = - {1 \over 4}\tilde h^{\mu\nu}\hat{\mathcal E}_{\mu\nu}^{\alpha\beta}\tilde h_{\alpha\beta} - {3 \over 4}(\partial\pi)^2 - {1 \over 8}F_{\mu\nu}^2 \\ - {1 \over 8}m^2(\tilde h_{\mu\nu}^2 - \tilde h^2) + {3 \over 2}m^2\pi^2 + {3 \over 4}m^2\pi\tilde h \\ - {1 \over 2}m(\tilde h^{\mu\nu} - \tilde h\eta^{\mu\nu})\partial_{(\mu}A_{\nu)} + {3 \over 2}m\pi\partial_\alpha A^\alpha. \end{array}$$
(2.48)
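The origin of the kinetic term for π can be traced in two steps. The sketch below assumes the explicit form of the linearized Einstein operator of Section 2.2.1, \(\hat{\mathcal E}_{\mu\nu}^{\alpha\beta}h_{\alpha\beta} = -{1 \over 2}\left(\square h_{\mu\nu} - 2\partial_{(\mu}\partial_\alpha h^\alpha_{\nu)} + \partial_\mu\partial_\nu h - \eta_{\mu\nu}(\square h - \partial_\alpha\partial_\beta h^{\alpha\beta})\right)\), which is not reproduced in this section, and drops total derivatives. Acting on the pure-trace part of the shift one finds

$$\hat{\mathcal E}_{\mu\nu}^{\alpha\beta}(\pi\eta_{\alpha\beta}) = -(\Pi_{\mu\nu} - [\Pi]\eta_{\mu\nu}),$$

so that the two π-dependent terms of (2.47) that carry derivatives become

$$-{1 \over 4}(\tilde h + \pi\eta)^{\mu\nu}\hat{\mathcal E}_{\mu\nu}^{\alpha\beta}(\tilde h + \pi\eta)_{\alpha\beta} = -{1 \over 4}\tilde h^{\mu\nu}\hat{\mathcal E}_{\mu\nu}^{\alpha\beta}\tilde h_{\alpha\beta} + {1 \over 2}\tilde h^{\mu\nu}(\Pi_{\mu\nu} - [\Pi]\eta_{\mu\nu}) - {3 \over 4}\pi\square\pi,$$

$$-{1 \over 2}(\tilde h + \pi\eta)^{\mu\nu}(\Pi_{\mu\nu} - [\Pi]\eta_{\mu\nu}) = -{1 \over 2}\tilde h^{\mu\nu}(\Pi_{\mu\nu} - [\Pi]\eta_{\mu\nu}) + {3 \over 2}\pi\square\pi.$$

In the sum the mixing between \(\tilde h_{\mu\nu}\) and \(\Pi_{\mu\nu}\) cancels, and the remainder \({3 \over 4}\pi\square\pi\) equals \(-{3 \over 4}(\partial\pi)^2\) after one integration by parts, which is the kinetic term of the helicity-0 mode.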

This decomposition allows us to identify the different degrees of freedom present in massive gravity (at least at the linear level): hμν represents the helicity-2 mode as already present in GR and propagates 2 dofs, Aμ represents the helicity-1 mode and propagates 2 dofs, and finally π represents the helicity-0 mode and propagates 1 dof, leading to a total of five dofs as is to be expected for a massive spin-2 field in four dimensions.

The degrees of freedom have not yet been split into their mass eigenstates but on doing so one can easily check that all the degrees of freedom have the same positive mass square \(m^2\).

Most of the phenomenology and theoretical consistency of massive gravity is related to the dynamics of the helicity-0 mode. The coupling to matter occurs via the coupling \(h_{\mu\nu}T^{\mu\nu} = \tilde h_{\mu\nu}T^{\mu\nu} + \pi T\), where T is the trace of the external stress-energy tensor. We see that the helicity-0 mode couples directly to conserved sources (unlike in the case of the Proca field) but the helicity-1 mode does not. In most of what follows we will thus be able to ignore the helicity-1 mode.

2.2.2.2 Higgs mechanism for gravity

As we shall see in Section 9.1, the graviton mass can also be promoted to a scalar function of one or many other fields (for instance of a different scalar field), m = m(ψ). We can thus wonder whether a dynamical Higgs mechanism for gravity can be considered, where the field(s) ψ start in a phase for which the graviton mass vanishes, m(ψ) = 0, and dynamically evolve to acquire a non-vanishing vev for which m(ψ) ≠ 0. Following the same logic as the Abelian Higgs mechanism for electromagnetism, this strategy can only work if the number of dofs in the massless phase m = 0 is the same as that in the massive case m ≠ 0. Simply promoting the mass to a function of an external field is thus not sufficient, since the graviton helicity-0 and -1 modes would otherwise be infinitely strongly coupled as m → 0.

To date no candidate has been proposed for which the graviton mass could dynamically evolve from a vanishing value to a finite one without falling into such strong coupling issues. This does not imply that a Higgs mechanism for gravity does not exist, only that none has been found as yet. For instance on AdS, there could be a Higgs mechanism as proposed in [431], where the mass term comes from integrating out some conformal fields with slightly unusual (but not unphysical) ‘transparent’ boundary conditions. This mechanism is specific to AdS and to the existence of a time-like boundary and would not apply on Minkowski or dS.

2.2.3 Van Dam-Veltman-Zakharov discontinuity

As in the case of spin-1, the massive spin-2 field propagates more dofs than the massless one. Nevertheless, these new excitations bear no observational signatures for the spin-1 field when considering an arbitrarily small mass, as seen in Section 2.1.2. The main reason for that is that the helicity-0 polarization of the photon couples only to the divergence of external sources, which vanishes for conserved sources. As a result no external sources directly excite the helicity-0 mode of a massive spin-1 field. For the spin-2 field, on the other hand, the situation is different as the helicity-0 mode can now couple to the trace of the stress-energy tensor, and so generic sources will excite not only the two helicity-2 polarizations of the graviton but also a third helicity-0 polarization, which could in principle have dramatic consequences. To see this more explicitly, let us compute the gravitational exchange amplitude between two sources \(T^{\mu\nu}\) and \(T'^{\mu\nu}\) in both the massive and massless gravitational cases.

In the massless case, the theory is diffeomorphism invariant. When considering coupling to external sources, of the form \(h_{\mu\nu}T^{\mu\nu}\), we thus need to ensure that the symmetry is preserved, which implies that the stress-energy tensor \(T^{\mu\nu}\) should be conserved, \(\partial_\mu T^{\mu\nu} = 0\). When computing the gravitational exchange amplitude between two sources we thus restrict ourselves to conserved ones. In the massive case, there is a priori no reason to restrict ourselves to conserved sources, so long as their divergences cancel in the massless limit m → 0.

2.2.3.1 Massive spin-2 field

Let us start with the massive case, and consider the response to a conserved external source Tμν,

$${\mathcal L} = - {1 \over 4}h^{\mu\nu}\hat{\mathcal E}_{\mu\nu}^{\alpha\beta}h_{\alpha\beta} - {m^2 \over 8}(h_{\mu\nu}^2 - h^2) + {1 \over 2M_{\rm Pl}}h_{\mu\nu}T^{\mu\nu}.$$
(2.49)

The linearized Einstein equation is then

$$\hat{\mathcal E}_{\mu\nu}^{\alpha\beta}h_{\alpha\beta} + {1 \over 2}m^2(h_{\mu\nu} - h\eta_{\mu\nu}) = {1 \over M_{\rm Pl}}T_{\mu\nu}\,.$$
(2.50)

To solve this modified linearized Einstein equation for hμν we consider the trace and the divergence separately,

$$h = - {1 \over {3{m^2}{M_{{\rm{Pl}}}}}}\left({T + {2 \over {{m^2}}}{\partial _\alpha}{\partial _\beta}{T^{\alpha \beta}}} \right)$$
(2.51)
$$\partial_\mu h^\mu_\nu = {1 \over m^2 M_{\rm Pl}}\left(\partial_\mu T^\mu_\nu - {1 \over 3}\partial_\nu T - {2 \over 3m^2}\partial_\nu\partial_\alpha\partial_\beta T^{\alpha\beta}\right).$$
(2.52)

As is already apparent at this level, the massless limit m → 0 is not smooth, which is at the origin of the vDVZ discontinuity (for instance we see immediately that for a conserved source the linearized Ricci scalar vanishes, \(\partial_\mu\partial_\nu h^{\mu\nu} - \square h = 0\); see Refs. [465, 497]. This linearized vDVZ discontinuity was recently pointed out again in [193]). As has been known for many decades, this discontinuity (or the fact that the Ricci scalar vanishes) is an artefact of the linearized theory and is resolved by the Vainshtein mechanism [463] as we shall see later.

Plugging these expressions back into the modified Einstein equation, we get

$$\begin{array}{*{20}c} \left(\square - m^2\right)h_{\mu\nu} = - {1 \over M_{\rm Pl}}\left[T_{\mu\nu} - {1 \over 3}T\eta_{\mu\nu} - {2 \over m^2}\partial_{(\mu}\partial_\alpha T^\alpha_{\nu)} + {1 \over 3m^2}\partial_\mu\partial_\nu T \right. \\ \left. + {1 \over 3m^2}\partial_\alpha\partial_\beta T^{\alpha\beta}\eta_{\mu\nu} + {2 \over 3m^4}\partial_\mu\partial_\nu\partial_\alpha\partial_\beta T^{\alpha\beta}\right] \end{array}$$
(2.53)
$$= - {1 \over M_{\rm Pl}}\left[\tilde\eta_{\mu(\alpha}\tilde\eta_{\nu\beta)} - {1 \over 3}\tilde\eta_{\mu\nu}\tilde\eta_{\alpha\beta}\right]T^{\alpha\beta},$$
(2.54)

with

$$\tilde\eta_{\mu\nu} = \eta_{\mu\nu} - {1 \over m^2}\partial_\mu\partial_\nu.$$
(2.55)

The propagator for a massive spin-2 field is thus given by

$$G_{\mu \nu \alpha \beta}^{{\rm{massive}}}(x,x\prime) = {{f_{\mu \nu \alpha \beta}^{{\rm{massive}}}} \over {\square - {m^2}}},$$
(2.56)

where \(f_{\mu\nu\alpha\beta}^{\rm massive}\) is the polarization tensor,

$$f_{\mu \nu \alpha \beta}^{{\rm{massive}}} = {\tilde \eta _{\mu (\alpha}}{\tilde \eta _{\nu \beta)}} - {1 \over 3}{\tilde \eta _{\mu \nu}}{\tilde \eta _{\alpha \beta}}.$$
(2.57)

In Fourier space we have

$$\begin{array}{*{20}c} f_{\mu\nu\alpha\beta}^{\rm massive}(p_\mu, m) = {2 \over 3m^4}p_\mu p_\nu p_\alpha p_\beta + \eta_{\mu(\alpha}\eta_{\nu\beta)} - {1 \over 3}\eta_{\mu\nu}\eta_{\alpha\beta} \\ + {1 \over m^2}\left(p_\alpha p_{(\mu}\eta_{\nu)\beta} + p_\beta p_{(\mu}\eta_{\nu)\alpha} - {1 \over 3}p_\mu p_\nu\eta_{\alpha\beta} - {1 \over 3}p_\alpha p_\beta\eta_{\mu\nu}\right). \end{array}$$
(2.58)

The amplitude exchanged between two sources \(T_{\mu\nu}\) and \(T'_{\mu\nu}\) via a massive spin-2 field is thus given by

$${\mathcal A}_{TT'}^{\rm massive} = \int d^4x\; h_{\mu\nu}T'^{\mu\nu} = \int d^4x\; T'^{\mu\nu}{f_{\mu\nu\alpha\beta}^{\rm massive} \over \square - m^2}\;T^{\alpha\beta}.$$
(2.59)

As mentioned previously, to compare this result with the massless case, the sources ought to be conserved in the massless limit, \(\partial_\mu T^\mu_\nu, \partial_\mu T'^\mu_\nu \to 0\) as m → 0. The gravitational exchange amplitude in the massless limit is thus given by

$${\mathcal A}_{TT'}^{m \rightarrow 0} = \int d^4x\; T'^{\mu\nu}{1 \over \square}\left(T_{\mu\nu} - {1 \over 3}T\eta_{\mu\nu}\right).$$
(2.60)

We now compare this result with the amplitude exchanged by a purely massless graviton.

2.2.3.2 Massless spin-2 field

In the massless case, the equation of motion (2.50) reduces to the linearized Einstein equation

$$\hat{\mathcal E}_{\mu\nu}^{\alpha\beta}h_{\alpha\beta} = {1 \over M_{\rm Pl}}T_{\mu\nu},$$
(2.61)

where diffeomorphism invariance requires the stress-energy to be conserved, \(\partial_\mu T^\mu_\nu = 0\). In this case the transverse part of this equation is trivially satisfied (as a consequence of the Bianchi identity which follows from symmetry). Since the theory is invariant under diffeomorphism transformations (2.38), we are free to choose a gauge, for instance de Donder (or harmonic) gauge

$$\partial_\mu h^\mu_\nu = {1 \over 2}\partial_\nu h.$$
(2.62)

In de Donder gauge, the Einstein equation then reduces to

$$\square h_{\mu\nu} = - {2 \over M_{\rm Pl}}\left(T_{\mu\nu} - {1 \over 2}T\eta_{\mu\nu}\right).$$
(2.63)
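The reduction to this wave equation can be sketched in two lines for the massless case, m = 0. The sketch assumes the standard explicit form of the linearized Einstein operator of Section 2.2.1 (not reproduced in this section) and the harmonic condition \(\partial_\mu h^\mu_\nu = {1 \over 2}\partial_\nu h\). In that gauge all terms involving divergences of \(h_{\mu\nu}\) combine into

$$\hat{\mathcal E}_{\mu\nu}^{\alpha\beta}h_{\alpha\beta} = -{1 \over 2}\left(\square h_{\mu\nu} - {1 \over 2}\eta_{\mu\nu}\square h\right),$$

and taking the trace of (2.61) gives \({1 \over 2}\square h = T/M_{\rm Pl}\). Substituting this back,

$$\square h_{\mu\nu} = -{2 \over M_{\rm Pl}}T_{\mu\nu} + {1 \over 2}\eta_{\mu\nu}\square h = -{2 \over M_{\rm Pl}}\left(T_{\mu\nu} - {1 \over 2}T\eta_{\mu\nu}\right),$$

which exhibits the factor 1/2 in front of the trace that distinguishes the massless polarization tensor from the massive one.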

The propagator for a massless spin-2 field is thus given by

$$G_{\mu \nu \alpha \beta}^{{\rm{massless}}} = {{f_{\mu \nu \alpha \beta}^{{\rm{massless}}}} \over \square},$$
(2.64)

where \(f_{\mu\nu\alpha\beta}^{\rm massless}\) is the polarization tensor,

$$f_{\mu \nu \alpha \beta}^{{\rm{massless}}} = {\eta _{\mu (\alpha}}{\eta _{\nu \beta)}} - {1 \over 2}{\eta _{\mu \nu}}{\eta _{\alpha \beta}}.$$
(2.65)

The amplitude exchanged between two sources Tμν and T′μν, via a genuinely massless spin-2 field is thus given by

$${\mathcal A}_{TT'}^{\rm massless} = - {2 \over M_{\rm Pl}}\int d^4x\; T'^{\mu\nu}{1 \over \square}\left(T_{\mu\nu} - {1 \over 2}T\eta_{\mu\nu}\right),$$
(2.66)

and differs from the result (2.60) in the small mass limit. This difference between the massless limit of the massive propagator and the massless propagator (and gravitational exchange amplitude) is a well-known fact and was first pointed out by van Dam, Veltman and Zakharov in 1970 [465, 497]. The resolution to this ‘problem’ lies within the Vainshtein mechanism [463]. In 1972, Vainshtein showed that a theory of massive gravity becomes strongly coupled at a low energy scale when the graviton mass is small. As a result, the linear theory is no longer appropriate to describe the theory in the limit of small mass and one should keep track of the non-linear interactions (very much as we do when approaching the Schwarzschild radius in GR). We shall see in Section 10.1 how a special set of interactions dominates in the massless limit and is responsible for the screening of the extra degrees of freedom present in massive gravity.
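The mismatch between the two tensor structures can be made concrete with a small numerical sketch. The Python code below builds the polarization tensors (2.57), in Fourier space, and (2.65), and contracts them with two static point-like sources. It only compares the tensor structures: overall factors of \(M_{\rm Pl}\) and the propagator denominators are left out. The mostly-plus signature, the function names and the sample momentum are assumptions made for the illustration.

```python
import numpy as np

ETA = np.diag([-1.0, 1.0, 1.0, 1.0])  # assumed mostly-plus Minkowski metric


def f_massive(p, m):
    """Polarization tensor (2.57) in Fourier space, with eta~ = eta + p p / m^2."""
    et = ETA + np.outer(p, p) / m**2
    sym = 0.5 * (np.einsum("ma,nb->mnab", et, et) + np.einsum("mb,na->mnab", et, et))
    return sym - np.einsum("mn,ab->mnab", et, et) / 3.0


def f_massless():
    """Polarization tensor (2.65)."""
    sym = 0.5 * (np.einsum("ma,nb->mnab", ETA, ETA) + np.einsum("mb,na->mnab", ETA, ETA))
    return sym - 0.5 * np.einsum("mn,ab->mnab", ETA, ETA)


def contract(T1, f, T2):
    """T1^{mu nu} f_{mu nu alpha beta} T2^{alpha beta} for sources with upper indices."""
    return np.einsum("mn,mnab,ab->", T1, f, T2)


# Two static point-like sources: only T^{00} is non-zero and the exchanged
# momentum is purely spatial, so p_mu T^{mu nu} = 0 holds exactly.
p = np.array([0.0, 0.3, -0.2, 0.5])  # p_mu, lower index (illustrative values)
T = np.zeros((4, 4))
T[0, 0] = 1.0
T_prime = np.zeros((4, 4))
T_prime[0, 0] = 1.0

massive = contract(T_prime, f_massive(p, m=1e-3), T)
massless = contract(T_prime, f_massless(), T)
print(massive, massless, massive / massless)
```

For these sources the contractions are 2/3 and 1/2 respectively, independently of how small m is taken, so the ratio of 4/3 survives the limit m → 0. For a traceless source such as radiation, the trace terms drop out and the two contractions coincide.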

Another ‘non-GR’ effect was also recently pointed out in Ref. [280] where a linear analysis showed that massive gravity predicts different spin-orientations for spinning objects.

2.3 From linearized diffeomorphism to full diffeomorphism invariance

When considering the massless and non-interacting spin-2 field in Section 2.2.1, the linear gauge invariance (2.38) is exact. However, if this field is to be probed and communicates with the rest of the world, the gauge symmetry is forced to include non-linear terms, which in turn forces the kinetic term to become fully non-linear. The result is the well-known fully covariant Einstein-Hilbert term \(M_{\rm Pl}^2\sqrt{-g}R\), where R is the scalar curvature associated with the metric \(g_{\mu\nu} = \eta_{\mu\nu} + h_{\mu\nu}/M_{\rm Pl}\).

To see this explicitly, let us start with the linearized theory and couple it to an external source \(T_0^{\mu\nu}\), via the coupling

$${\mathcal L}_{\rm matter}^{\rm linear} = {1 \over 2M_{\rm Pl}}h_{\mu\nu}T_0^{\mu\nu}.$$
(2.67)

This coupling preserves diffeomorphism invariance if the source is conserved, \(\partial_\mu T_0^{\mu\nu} = 0\). To be more explicit, let us consider a massless scalar field φ which satisfies the Klein-Gordon equation □φ = 0. A natural choice for the stress-energy tensor \(T_0^{\mu\nu}\) is then

$$T_0^{\mu\nu} = \partial^\mu\varphi\,\partial^\nu\varphi - {1 \over 2}(\partial\varphi)^2\eta^{\mu\nu},$$
(2.68)

so that the Klein-Gordon equation automatically guarantees the conservation of the stress-energy tensor on-shell at the linear level and linearized diffeomorphism invariance. However, the very coupling between the scalar field and the spin-2 field affects the Klein-Gordon equation in such a way that beyond the linear order, the stress-energy tensor given in (2.68) fails to be conserved. When considering the coupling (2.67), the Klein-Gordon equation receives corrections of the order of hμν/Mpl

$$\square \varphi = {1 \over {{M_{{\rm{Pl}}}}}}\left({{\partial ^\alpha}({h_{\alpha \beta}}{\partial ^\beta}\varphi) - {1 \over 2}{\partial _\alpha}(h_\beta ^\beta {\partial ^\alpha}\varphi)} \right),$$
(2.69)

implying a failure of conservation of \(T_0^{\mu\nu}\) at the same order,

$$\partial_\mu T_0^{\mu\nu} = {\partial^\nu\varphi \over M_{\rm Pl}}\left(\partial^\alpha(h_{\alpha\beta}\partial^\beta\varphi) - {1 \over 2}\partial_\alpha(h_\beta^\beta\partial^\alpha\varphi)\right).$$
(2.70)
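The step from (2.69) to (2.70) uses only the definition (2.68). Taking its divergence, the two terms with three derivatives cancel,

$$\partial_\mu T_0^{\mu\nu} = \square\varphi\,\partial^\nu\varphi + \partial^\mu\varphi\,\partial_\mu\partial^\nu\varphi - \partial^\alpha\varphi\,\partial^\nu\partial_\alpha\varphi = \square\varphi\,\partial^\nu\varphi,$$

so the divergence is proportional to the scalar equation of motion. It vanishes when □φ = 0 and reproduces (2.70) once the right-hand side of (2.69) is inserted for □φ.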

The resolution is of course to include non-linear corrections in h/MPl in the coupling with external matter,

$${{\mathcal L}_{{\rm{matter}}}} = {1 \over {2{M_{{\rm{Pl}}}}}}{h_{\mu \nu}}T_0^{\mu \nu} + {1 \over {2M_{{\rm{Pl}}}^2}}{h_{\mu \nu}}{h_{\alpha \beta}}T_1^{\mu \nu \alpha \beta} + \cdots ,$$
(2.71)

and promote diffeomorphism invariance to a non-linearly realized gauge symmetry, symbolically,

$$h \rightarrow h + \partial \xi + {1 \over {{M_{{\rm{Pl}}}}}}\partial (h\xi) + \cdots ,$$
(2.72)

so this gauge invariance is automatically satisfied on-shell order by order in h/Mpl, i.e., the scalar field (or general matter field) equations of motion automatically imply the appropriate relation for the stress-energy tensor to all orders in h/MPl. The resulting symmetry is the well-known fully non-linear coordinate transformation invariance (or full diffeomorphism invariance or covarianceFootnote 4), which requires the stress-energy tensor to be covariantly conserved. To satisfy this symmetry, the kinetic term (2.36) should then be promoted to a fully non-linear contribution,

$${\mathcal L}_{{\rm{kin}}\,{\rm{linear}}}^{{\rm{spin}} - {\rm{2}}} = - {1 \over 4}{h^{\mu \nu}}\hat {\mathcal E}_{\mu \nu}^{\alpha \beta}{h_{\alpha \beta}}\quad \rightarrow \quad {\mathcal L}_{{\rm{kin}}\;{\rm{covariant}}}^{{\rm{spin}} - {\rm{2}}} = {{M_{{\rm{Pl}}}^2} \over 2}\sqrt {- g} R[g].$$
(2.73)

Just as the linearized version \(h^{\mu\nu}\hat{\mathcal E}_{\mu\nu}^{\alpha\beta}h_{\alpha\beta}\) was unique, the non-linear realization \(\sqrt{-g}R\) is also unique.Footnote 5 As a result, any theory of an interacting spin-2 field is necessarily fully non-linear and leads to the theory of gravity where non-linear diffeomorphism invariance (or covariance) plays the role of the local gauge symmetry that projects out four out of the potential six degrees of freedom of the graviton and prevents the excitation of any ghost by the kinetic term.

The situation is very different from that of a spin-1 field as seen earlier, where coupling with other fields can be implemented at the linear order without affecting the U(1) gauge symmetry. The difference is that in the case of a U(1) symmetry, there is a unique nonlinear completion of that symmetry, i.e., the unique nonlinear completion of a U(1) is nothing else but a U(1). Thus any nonlinear Lagrangian which preserves the full U(1) symmetry will be a consistent interacting theory. On the other hand, for spin-2 fields, there are two, and only two, ways to nonlinearly complete linear diffs, one as linear diffs in the full theory and the other as full non-linear diffs. While it is possible to write self-interactions which preserve linear diffs, there are no interactions between matter and \(h_{\mu\nu}\) which preserve linear diffs. Thus any theory of gravity must exhibit full nonlinear diffs, and this is in this sense what leads us to GR.

2.4 Non-linear Stückelberg decomposition

2.4.1 On the need for a reference metric

We have introduced the spin-2 field \(h_{\mu\nu}\) as the perturbation about flat spacetime. When considering the theory of a field of given spin it is only natural to work with Minkowski as our spacetime metric, since the notion of spin follows from that of Poincaré invariance. Now when extending the theory non-linearly, we may also extend the theory about a different reference metric. When dealing with a reference metric different from Minkowski, one loses the interpretation of the field as massive spin-2, but one can still get a consistent theory. One could also wonder whether it is possible to write a theory of massive gravity without the use of a reference metric at all. This interesting question was investigated in [75], where it was shown that the only consistent alternative is to consider a function of the metric determinant. However, as shown in [75], the consistent function of the determinant is the cosmological constant and does not provide a mass for the graviton.

2.4.2 Non-linear Stückelberg

Full diffeomorphism invariance (or covariance) indicates that the theory should be built out of scalar objects constructed out of the metric gμν and other tensors. However, as explained previously a theory of massive gravity requires the notion of a reference metricFootnote 6 fμν (which may be Minkowski fμν = ημν) and at the linearized level, the mass for gravity was not built out of the full metric gμν, but rather out of the fluctuation hμν about this reference metric which does not transform as a tensor under general coordinate transformations. As a result the mass term breaks covariance.

This result is already transparent at the linear level where the mass term (2.39) breaks linearized diffeomorphism invariance. Nevertheless, that gauge symmetry can always be ‘formally’ restored using the Stückelberg trick, which amounts to replacing the reference metric (so far we have been working with the flat Minkowski metric as the reference) by

$${\eta _{\mu \nu}} \rightarrow ({\eta _{\mu \nu}} - {2 \over {{M_{{\rm{Pl}}}}}}{\partial _{(\mu}}{\chi _{\nu)}}),$$
(2.74)

and transforming \(\chi_\mu\) under linearized diffeomorphism in such a way that the combination \(h_{\mu\nu} + 2\partial_{(\mu}\chi_{\nu)}\) remains invariant. Now that the symmetry is non-linearly realized and replaced by general covariance, this Stückelberg trick should also be promoted to a fully covariant realization.

Following the same Stückelberg trick non-linearly, one can ‘formally restore’ covariance by including four Stückelberg fields \(\phi^a\) (a = 0, 1, 2, 3) and promoting the reference metric \(f_{\mu\nu}\), which may or may not be Minkowski, to a tensor [446, 27],

$${f_{\mu \nu}} \rightarrow {\tilde f_{\mu \nu}} = {\partial _\mu}{\phi ^a}{\partial _\nu}{\phi ^b}{f_{ab}}$$
(2.75)

As we can see from this last expression, \(\tilde f_{\mu\nu}\) transforms as a tensor under coordinate transformations as long as each of the four fields \(\phi^a\) transforms as a scalar. We may now construct the theory of massive gravity as a scalar Lagrangian of the tensors \(\tilde f_{\mu\nu}\) and \(g_{\mu\nu}\). In unitary gauge, where the Stückelberg fields are \(\phi^a = x^a\), we simply recover \(\tilde f_{\mu\nu} = f_{\mu\nu}\).

This Stückelberg trick for massive gravity dates back to Green and Thorn [267] and to Siegel [446], who introduced it within the context of open string theory. In the same way as the massless graviton naturally emerges in the closed string sector, open strings also have spin-2 excitations, but their lowest energy state is massive at tree level (they only become massless once quantum corrections are considered). Thus at the classical level, open strings contain a description of massive excitations of a spin-2 field, where gauge invariance is restored thanks to the same Stückelberg fields as introduced in this section. In open string theory, these Stückelberg fields naturally arise from the ghost coordinates. When constructing the non-linear theory of massive gravity from extra dimensions, we shall see that in that context the Stückelberg fields naturally arise as the shift from the extra dimension.

For later convenience, it will be useful to construct the following tensor quantity,

$${\mathbb X}^\mu_{\;\nu} = g^{\mu\alpha}\tilde f_{\alpha\nu} = \partial^\mu\phi^a\partial_\nu\phi^b f_{ab},$$
(2.76)

in unitary gauge, \({\mathbb X} = {g^{- 1}}f\).

2.4.3 Alternative Stückelberg trick

An alternative way to Stückelberize the reference metric is to express it as

$${g^{ac}}{f_{cb}} \rightarrow {\mathbb Y}_b^a = {g^{\mu \nu}}{\partial _\mu}{\phi ^a}{\partial _\nu}{\phi ^c}{f_{cb}}.$$
(2.77)

As nicely explained in Ref. [14], both matrices \({\mathbb X}^\mu_{\;\nu}\) and \({\mathbb Y}^a_{\;b}\) have the same eigenvalues, so one can choose either one of them in the definition of the massive gravity Lagrangian without any distinction. The formulation in terms of \({\mathbb Y}\) rather than \({\mathbb X}\) was originally used in Ref. [94], although unsuccessfully as the potential proposed there exhibits the BD ghost instability (see for instance Ref. [60]).
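The equality of the eigenvalues follows from the fact that the two matrices are products of the same two factors taken in opposite order. The following Python sketch illustrates this numerically for randomly chosen configurations; the matrix `J` stands for \(\partial_\mu\phi^a\), and all names and numerical values are assumptions made for the illustration. Rather than sorting possibly complex eigenvalues, it compares the traces of the first four powers, which fix the characteristic polynomial of a 4 × 4 matrix.

```python
import numpy as np

rng = np.random.default_rng(0)
eta = np.diag([-1.0, 1.0, 1.0, 1.0])


def random_symmetric(scale):
    """Small random symmetric 4x4 perturbation (illustrative only)."""
    a = rng.normal(size=(4, 4))
    return scale * (a + a.T) / 2


g = eta + random_symmetric(0.1)                 # dynamical metric g_{mu nu}
f = eta + random_symmetric(0.1)                 # reference metric f_{ab}
J = np.eye(4) + 0.1 * rng.normal(size=(4, 4))   # J[a, mu] = d_mu phi^a
g_inv = np.linalg.inv(g)

X = g_inv @ J.T @ f @ J   # X^mu_nu = g^{mu alpha} d_alpha phi^a d_nu phi^b f_ab
Y = J @ g_inv @ J.T @ f   # Y^a_b  = g^{mu nu} d_mu phi^a d_nu phi^c f_cb


def trace_invariants(M):
    """Traces of M, M^2, M^3, M^4, which determine the eigenvalues of a 4x4 matrix."""
    return np.array([np.trace(np.linalg.matrix_power(M, k)) for k in range(1, 5)])


print(trace_invariants(X))
print(trace_invariants(Y))
```

The two printed arrays agree to numerical precision, so any scalar potential built from the eigenvalues takes the same value in either formulation.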

2.4.4 Helicity decomposition

If we now focus on the flat reference metric, \(f_{\mu\nu} = \eta_{\mu\nu}\), we may further split the Stückelberg fields as \(\phi^a = x^a - {1 \over M_{\rm Pl}}\chi^a\) and identify the index a with a Lorentz index.Footnote 7 We then obtain the non-linear generalization of the Stückelberg trick used in Section 2.2.2,

$$\eta_{\mu\nu} \rightarrow \tilde f_{\mu\nu} = \eta_{\mu\nu} - {2 \over M_{\rm Pl}}\partial_{(\mu}\chi_{\nu)} + {1 \over M_{\rm Pl}^2}\partial_\mu\chi^a\partial_\nu\chi^b\eta_{ab}$$
(2.78)
$$\begin{array}{*{20}c} = \eta_{\mu\nu} - {2 \over M_{\rm Pl}m}\partial_{(\mu}A_{\nu)} - {2 \over M_{\rm Pl}m^2}\Pi_{\mu\nu} \\ + {1 \over M_{\rm Pl}^2m^2}\partial_\mu A^\alpha\partial_\nu A_\alpha + {2 \over M_{\rm Pl}^2m^3}\partial_\mu A^\alpha\Pi_{\nu\alpha} + {1 \over M_{\rm Pl}^2m^4}\Pi_{\mu\nu}^2, \end{array}$$
(2.79)

where in the second equality we have used the split of \(\chi^a\) performed in (2.46) in terms of the helicity-0 and -1 modes, and all indices are raised and lowered with respect to \(\eta_{\mu\nu}\).

In other words, the fluctuations about flat spacetime are promoted to the tensor \(H_{\mu\nu}\),

$$h_{\mu\nu} = M_{\rm Pl}\left(g_{\mu\nu} - \eta_{\mu\nu}\right)\quad \rightarrow \quad H_{\mu\nu} = M_{\rm Pl}\left(g_{\mu\nu} - \tilde f_{\mu\nu}\right)$$
(2.80)

with

$$H_{\mu\nu} = h_{\mu\nu} + 2\partial_{(\mu}\chi_{\nu)} - {1 \over M_{\rm Pl}}\eta_{ab}\partial_\mu\chi^a\partial_\nu\chi^b$$
(2.81)
$$\begin{array}{*{20}c} = h_{\mu\nu} + {2 \over m}\partial_{(\mu}A_{\nu)} + {2 \over m^2}\Pi_{\mu\nu} \\ - {1 \over M_{\rm Pl}m^2}\partial_\mu A^\alpha\partial_\nu A_\alpha - {2 \over M_{\rm Pl}m^3}\partial_\mu A^\alpha\Pi_{\nu\alpha} - {1 \over M_{\rm Pl}m^4}\Pi_{\mu\nu}^2. \end{array}$$
(2.82)

The Stückelberg fields are introduced to restore the gauge invariance (full diffeomorphism invariance). We can now always set a gauge where \(h_{\mu\nu}\) is transverse and traceless at the linearized level and \(A_\mu\) is transverse. In this gauge the quantities \(h_{\mu\nu}\), \(A_\mu\) and π represent the helicity decomposition of the metric: \(h_{\mu\nu}\) is the helicity-2 part of the graviton, \(A_\mu\) the helicity-1 part and π the helicity-0 part. The fact that these quantities continue to correctly identify the physical degrees of freedom non-linearly in the limit \(M_{\rm Pl} \to \infty\) is non-trivial and has been derived in [143].

2.4.5 Non-linear Fierz-Pauli

The most straightforward non-linear extension of the Fierz-Pauli mass term is as follows

$$\mathcal L^{\mathrm{(nl1)}}_{\mathrm{FP}}=-m^2 {M_{{\rm{Pl}}}^2}\sqrt{-g} \left([(\mathbb{I}-\mathbb{X})^2]-[\mathbb{I}-\mathbb{X}]^2\right),$$
(2.83)

this mass term is then invariant under non-linear coordinate transformations. This non-linear formulation was used for instance in [27]. Alternatively, one may also generalize the Fierz-Pauli mass non-linearly as follows [75]

$$\mathcal L^{\mathrm{(nl2)}}_{\mathrm{FP}}=-m^2{M_{{\rm{Pl}}}^2}\sqrt{-g}\sqrt{\det \mathbb X} \left([(\mathbb{I}-\mathbb{X}^{-1})^2]-[\mathbb{I}-\mathbb{X}^{-1}]^2\right).$$
(2.84)

A priori, the linear Fierz-Pauli action for massive gravity can be extended non-linearly in an arbitrary number of ways. However, as we shall see below, most of these generalizations generate a ghost non-linearly, known as the Boulware-Deser (BD) ghost. In Part II, we shall see that the extension of the Fierz-Pauli to a non-linear theory free of the BD ghost is unique (up to two constant parameters).

2.5 Boulware-Deser ghost

The easiest way to see the appearance of a ghost at the non-linear level is to follow the Stückelberg trick non-linearly and observe the appearance of an Ostrogradsky instability [111, 173], although the original formulation was performed in unitary gauge in [75] in the ADM language (Arnowitt, Deser and Misner, see Ref. [29]). In this section we shall focus on the flat reference metric, \(f_{\mu\nu} = \eta_{\mu\nu}\).

Focusing solely on the helicity-0 mode π to start with, the tensor \({\mathbb X}^\mu_{\;\nu}\) defined in (2.76) is expressed as

$$\mathbb X^\mu_{\;\nu} = \delta^\mu_{\;\nu} - {2 \over M_{\rm Pl}m^2}\Pi^\mu_{\;\nu} + {1 \over M_{\rm Pl}^2m^4}\Pi^\mu_\alpha\Pi^\alpha_\nu,$$
(2.85)

where at this level all indices are raised and lowered with respect to the flat reference metric \(\eta_{\mu\nu}\). The Fierz-Pauli mass term (2.83) then reads

$$\mathcal L_{{\rm FP},\pi}^{({\rm nl}1)} = - {4 \over m^2}\left([\Pi^2] - [\Pi]^2\right) + {4 \over M_{\rm Pl}m^4}\left([\Pi^3] - [\Pi][\Pi^2]\right) - {1 \over M_{\rm Pl}^2m^6}\left([\Pi^4] - [\Pi^2]^2\right).$$
(2.86)
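The special status of the quadratic term in (2.86) can be made explicit with a one-line identity. With \(\Pi_{\mu\nu} = \partial_\mu\partial_\nu\pi\) and indices contracted with the flat metric,

$$[\Pi^2] - [\Pi]^2 = \partial_\mu\left(\partial_\nu\pi\,\partial^\mu\partial^\nu\pi - \partial^\mu\pi\,\square\pi\right),$$

since the two third-derivative terms generated on the right-hand side, \(\partial_\nu\pi\,\square\partial^\nu\pi\) and \(\partial^\mu\pi\,\partial_\mu\square\pi\), cancel each other. The quadratic term therefore does not contribute to the equations of motion, whereas no analogous rewriting exists for the cubic and quartic combinations appearing in (2.86).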

Upon integration by parts, we notice that the quadratic term in (2.86) is a total derivative, which is another way to see the special structure of the Fierz-Pauli mass term. Unfortunately this special fact does not propagate to higher order and the cubic and quartic interactions are genuine higher order operators which lead to equations of motion with quartic and cubic derivatives. In other words these higher order operators \([\Pi^3] - [\Pi][\Pi^2]\) and \([\Pi^4] - [\Pi^2]^2\) propagate an additional degree of freedom which, by Ostrogradsky’s theorem, always enters as a ghost. While at the linear level these operators might be irrelevant, their existence implies that one can always find an appropriate background configuration \(\pi = \pi_0 + \delta\pi\), such that the ghost is manifest

$${\mathcal L}_{{\rm FP},\pi}^{({\rm nl}1)} = {4 \over M_{\rm Pl}m^4}Z^{\mu\nu\alpha\beta}\partial_\mu\partial_\nu\delta\pi\,\partial_\alpha\partial_\beta\delta\pi,$$
(2.87)

with \(Z^{\mu\nu\alpha\beta} = 3\partial^\mu\partial^\alpha\pi_0\eta^{\nu\beta} - \square\pi_0\eta^{\mu\alpha}\eta^{\nu\beta} - 2\partial^\mu\partial^\nu\pi_0\eta^{\alpha\beta} + \cdots\). This implies that non-linearly (or around a non-trivial background), the Fierz-Pauli mass term propagates an additional degree of freedom which is a ghost, namely the BD ghost. The mass of this ghost depends on the background configuration \(\pi_0\),

$$m_{{\rm{ghost}}}^2 \sim {{{M_{{\rm{Pl}}}}{m^4}} \over {{\partial ^2}{\pi _0}}}.$$
(2.88)

As we shall see below, the resolution of the vDVZ discontinuity lies in the Vainshtein mechanism for which the field takes a large vacuum expectation value, \(\partial^2\pi_0 \gtrsim M_{\rm Pl}m^2\), which in the present context would lead to a ghost with an extremely low mass, \(m_{\rm ghost}^2 \lesssim m^2\).
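The estimate follows directly from (2.88). Taking the background to be at least as large as the scale at which the Vainshtein mechanism operates, \(\partial^2\pi_0 \gtrsim M_{\rm Pl}m^2\), one finds

$$m_{\rm ghost}^2 \sim {M_{\rm Pl}m^4 \over \partial^2\pi_0} \lesssim {M_{\rm Pl}m^4 \over M_{\rm Pl}m^2} = m^2,$$

so the ghost is at most as heavy as the graviton itself and cannot be pushed above the regime of validity of the low-energy theory.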

Choosing another non-linear extension for the Fierz-Pauli mass term as in (2.84) does not seem to help much,

$$\begin{array}{*{20}c} \mathcal L_{{\rm FP},\pi}^{({\rm nl}2)} = - {4 \over m^2}\left([\Pi^2] - [\Pi]^2\right) - {4 \over M_{\rm Pl}m^4}\left([\Pi]^3 - 4[\Pi][\Pi^2] + 3[\Pi^3]\right) + \cdots \\ \rightarrow {4 \over M_{\rm Pl}m^4}\left([\Pi][\Pi^2] - [\Pi^3]\right) + \cdots \end{array}$$
(2.89)

where we have integrated by parts on the second line, and we recover exactly the same type of higher derivatives already at the cubic level, so the BD ghost is also present in (2.84).

Alternatively the mass term was also generalized to include curvature invariants as in Ref. [69]. This theory was shown to be ghost-free at the linear level on FLRW but not yet non-linearly.

2.5.1 Function of the Fierz-Pauli mass term

As an extension of the Fierz-Pauli mass term, one could instead write a more general function of it, as considered in Ref. [75]

$${{\mathcal L}_{F({\rm{FP}})}} = - {m^2}\sqrt {- g} F\left({{g^{\mu \nu}}{g^{\alpha \beta}}({H_{\mu \alpha}}{H_{\nu \beta}} - {H_{\mu \nu}}{H_{\alpha \beta}})} \right),$$
(2.90)

however, one can easily see that, if a mass term is actually present, i.e., F′ ≠ 0, there is no analytic choice of the function F which would circumvent the non-linear propagation of the BD ghost. Expanding F in a Taylor series, we see for instance that the only choice that prevents the cubic higher-derivative interactions in π, \([{\Pi ^3}] - [\Pi ][{\Pi ^2}]\), is F′(0) = 0, which removes the mass term at the same time. If F(0) ≠ 0 but F′(0) = 0, the theory is massless about the specific reference metric, but infinitely strongly coupled about other backgrounds.

Instead, to prevent the presence of the BD ghost fully non-linearly (or equivalently about any background), one should construct the mass term (or rather potential term) in such a way that all the higher-derivative operators involving the helicity-0 mode \({({\partial ^2}\pi)^n}\) are total derivatives. This is precisely what is achieved in the “ghost-free” model of massive gravity presented in Part II. In Part I, which follows, we shall use higher-dimensional GR to get some insight and intuition on how to construct a consistent theory of massive gravity.

3 Part I Massive Gravity from Extra Dimensions

4 Higher-Dimensional Scenarios

As seen in Section 2.5, the ‘most natural’ non-linear extension of the Fierz-Pauli mass term bears a ghost. Constructing consistent theories of massive gravity has actually been a challenging task for years, and higher-dimensional scenarios can provide excellent frameworks for explicit realizations of massive gravity. The main motivation behind relying on higher-dimensional gravity is twofold:

  • The five-dimensional theory is explicitly covariant.

  • A massless spin-2 field in five dimensions has five degrees of freedom, which is the correct number of dofs for a massive spin-2 field in four dimensions without the pathological BD ghost.

While string theory and other higher-dimensional theories naturally give rise to massive gravitons, they usually include a massless zero-mode. Furthermore, in the simplest models, as soon as the first massive mode is relevant so is an infinite tower of massive (Kaluza-Klein) modes, and one is never in a regime where a single massive graviton dominates. At least this was the situation until the Dvali-Gabadadze-Porrati (DGP) model [208, 209, 207] provided the first explicit model of (soft) massive gravity, based on a higher-dimensional braneworld model.

In the DGP model the graviton has a soft mass in the sense that its propagator does not have a simple pole at a fixed value m, but rather admits a resonance. Considering the Källén-Lehmann spectral representation [331, 374], the spectral density function \(\rho ({\mu ^2})\) in DGP is of the form

$${\rho _{{\rm{DGP}}}}({\mu ^2})\sim {{{m_0}} \over {\pi \mu}}{1 \over {{\mu ^2} + m_0^2}}\, ,$$
(3.1)

and so DGP corresponds to a theory of massive gravity with a resonance of width \(\Delta m \sim {m_0}\) about m = 0.

In a Kaluza-Klein decomposition of a flat extra dimension we have, on the other hand, an infinite tower of massive modes with spectral density function

$${\rho _{{\rm{KK}}}}({\mu ^2})\sim \sum\limits_{n = 0}^\infty \delta ({\mu ^2} - {(n{m_0})^2})\, .$$
(3.2)

We shall see in the section on deconstruction (5) how one can truncate this infinite tower by performing a discretization in real space rather than in momentum space à la Kaluza-Klein, so as to obtain a theory of a single massive graviton

$${\rho _{{\rm{MG}}}}({\mu ^2})\sim \delta ({\mu ^2} - m_0^2)\, ,$$
(3.3)

or a theory of multi-gravity (with N-interacting gravitons),

$${\rho _{{\rm{multi - gravity}}}}({\mu ^2})\sim \sum\limits_{n = 0}^N \delta ({\mu ^2} - {(n{m_0})^2})\, .$$
(3.4)

In this language, bi-gravity is the special case of multi-gravity where N = 2. These different spectral representations, together with the cascading gravity extension of DGP are represented in Figure 1.

Figure 1

Spectral representation of different models. (a) DGP (b) higher-dimensional cascading gravity and (c) multi-gravity. Bi-gravity is the special case of multi-gravity with one massless mode and one massive mode. Massive gravity is the special case where only one massive mode couples to the rest of the standard model and the other modes decouple. (a) and (b) are models of soft massive gravity where the graviton mass can be thought of as a resonance.

Recently, another higher dimensional embedding of bi-gravity was proposed in Ref. [495]. Rather than performing a discretization of the extra dimension, the idea behind this model is to consider a two-brane DGP model, where the radion or separation between these branes is stabilized via a Goldberger-Wise stabilization mechanism [255] where the brane and the bulk include a specific potential for the radion. At low energy the mass spectrum can be truncated to a massless mode and a massive mode, reproducing a bi-gravity theory. However, the stabilization mechanism involves a relatively low scale and the correspondence breaks down above it. Nevertheless, this provides a first proof of principle for how to embed such a model in a higher-dimensional picture without discretization and could be useful to tackle some of the open questions of massive gravity.

In what follows we review how five-dimensional gravity is a useful starting point in order to generate consistent four-dimensional theories of massive gravity, either for soft-massive gravity à la DGP and its extensions, or for hard massive gravity following a deconstruction framework.

The DGP model has played the role of a precursor for many developments in modified and massive gravity and it is beyond the scope of this review to summarize all of them. In this review we briefly summarize the DGP model and some key aspects of its phenomenology, and refer the reader to other reviews (see for instance [232, 390, 234]) for more details on the subject.

In this section, A, B, C ⋯ = 0, …, 4 represent five-dimensional spacetime indices and μ, ν, α ⋯ = 0, …, 3 label four-dimensional spacetime indices. \(y = {x^4}\) represents the fifth additional dimension, \(\{{x^A}\} = \{{x^\mu},y\}\). The five-dimensional metric is given by \({}^{(5)}{g_{AB}}(x,y)\) while the four-dimensional metric is given by \({g_{\mu \nu}}(x)\). The five-dimensional scalar curvature is \({}^{(5)}R[G]\) while R = R[g] is the four-dimensional scalar curvature. We use the same notation for the Einstein tensor, where \({}^{(5)}{G_{AB}}\) is the five-dimensional one and \({G_{\mu \nu}}\) represents the four-dimensional one built out of \({g_{\mu \nu}}\).

When working in the Einstein-Cartan formalism of gravity, \({\mathbb A},{\mathbb B},{\mathbb C}\) label five-dimensional Lorentz indices and a,b,c = ⋯ label the four-dimensional ones.

5 The Dvali-Gabadadze-Porrati Model

The idea behind the DGP model [209, 208, 207] is to start with a four-dimensional braneworld in an infinite-size extra dimension. A priori gravity would then be fully five-dimensional, with respective Planck scale \({M_5}\), but the matter fields localized on the brane could lead to an induced curvature term on the brane with respective Planck scale \({M_{{\rm{Pl}}}}\). See [22] for a potential embedding of this model within string theory.

At small distances the induced curvature dominates and gravity behaves as in four dimensions, while at large distances the leakage of gravity within the extra dimension weakens the force of gravity. The DGP model is thus a model of modified gravity in the infrared, and as we shall see, the graviton effectively acquires a soft mass, or resonance.

5.1 Gravity induced on a brane

We start with the five-dimensional action for the DGP model [209, 208, 207] with a brane localized at y = 0,

$$S = \int {{{\rm{d}}^4}x\,{\rm{d}}y\left({{{M_5^3} \over 4}\sqrt {{- ^{(5)}}g} {\;^{(5)}}R + \delta (y)\left[ {\sqrt {- g} {{M_{{\rm{Pl}}}^2} \over 2}R[g] + {{\mathcal L}_m}(g,\,{\psi _i})} \right]} \right)} \, ,$$
(4.1)

where \({\psi _i}\) represent matter field species confined to the brane with stress-energy tensor \({T_{\mu \nu}}\). This brane is considered to be an orbifold brane enjoying a ℤ2-orbifold symmetry (so that the physics at y < 0 is the mirror copy of that at y > 0). We choose the convention where −∞ < y < ∞, which is why we have a factor of \(M_5^3/4\) rather than the \(M_5^3/2\) we would have had if we had only considered one side of the brane, for instance y ≥ 0.

The five-dimensional Einstein equations of motion are then given by

$$M_5^3{\;^{(5)}}{G_{AB}} = 2\delta (y){\,^{(5)}}{T_{AB}}$$
(4.2)

with

$$^{(5)}{T_{AB}} = \left({- M_{{\rm{Pl}}}^2{G_{\mu \nu}} + {T_{\mu \nu}}} \right)\delta _A^\mu \delta _B^\nu \, .$$
(4.3)

The Israel matching condition on the brane [323] can be obtained by integrating this equation over \(\int\nolimits_{- \varepsilon}^\varepsilon {{\rm{d}}y}\) and taking the limit ε → 0, so that the jump in the extrinsic curvature across the brane is related to the Einstein tensor and stress-energy tensor of the matter field confined on the brane.

5.1.1 Perturbations about flat spacetime

In DGP the four-dimensional graviton is effectively massive. To see this explicitly, we look at perturbations about flat spacetime

$${\rm{d}}s_5^2 = \left({{\eta _{AB}} + {h_{AB}}\left({x,y} \right)} \right){\rm{d}}{x^A}\,{\rm{d}}{x^B}\, .$$
(4.4)

Since at this level we are dealing with five-dimensional GR, we are free to set the five-dimensional gauge of our choice and choose five-dimensional de Donder gauge (a discussion about the brane-bending mode will follow)

$${\partial _A}h_B^A = {1 \over 2}{\partial _B}h_A^A\, .$$
(4.5)

In this gauge the five-dimensional Einstein tensor is simply

$$^{{\rm{(5)}}}{G_{AB}} = - {1 \over 2}{\square_5}\left({{h_{AB}} - {1 \over 2}h_{C}^C{\eta _{AB}}} \right)\, ,$$
(4.6)

where \({\square_5} = \square + \partial _y^2\) is the five-dimensional d’Alembertian and □ is the four-dimensional one.

Since there is no source along the μy or yy directions ((5)Tμy = 0 = (5)Tyy), we can immediately infer that

$${\square_5}{h_{\mu y}} = 0\quad \Rightarrow \quad {h_{\mu y}} = 0$$
(4.7)
$${\square_5}\left({{h_{yy}} - h_{\mu}^\mu} \right) = 0\quad \Rightarrow \quad {h_{yy}} = h_{\mu}^\mu$$
(4.8)

up to a homogeneous mode, which in this setup we set to zero. This does not properly account for the brane-bending mode, but for the sake of this analysis it will give the correct expression for the metric fluctuation \({h_{\mu \nu}}\). We will see in Section 4.2 how to keep track of the brane-bending mode, which is partly encoded in \({h_{yy}}\).

Using these relations in the five-dimensional de Donder gauge, we deduce the relation for the purely four-dimensional part of the metric perturbation,

$${\partial _\mu}h_{\nu}^\mu = {\partial _\nu}h_{\mu}^\mu \, .$$
(4.9)

Using these relations in the projected Einstein equation, we get

$${1 \over 2}M_5^3\left[{\square + \partial _y^2} \right]\left({{h_{\mu \nu}} - h{\eta _{\mu \nu}}} \right) = - \delta (y)\left({2{T_{\mu \nu}} + M_{{\rm{Pl}}}^2\left({{\square h_{\mu \nu}} - {\partial _\mu}{\partial _\nu}h} \right)} \right)\, ,$$
(4.10)

where \(h \equiv h_\alpha ^\alpha = {\eta ^{\mu \nu}}{h_{\mu \nu}}\) is the four-dimensional trace of the perturbations.

Solving this equation with the requirement that hμν → 0 as y → ±∞, we infer the following profile for the perturbations along the extra dimension

$${h_{\mu \nu}}(x,y) = {e^{- \vert y\vert \sqrt {-\square}}}{h_{\mu \nu}}(x)\, ,$$
(4.11)

where the □ should really be thought of as acting in Fourier space, and \({h_{\mu \nu}}(x)\) is set from the boundary conditions on the brane. Integrating the Einstein equation across the brane, from −ε to +ε, we get

$$\begin{array}{*{20}c}{{1 \over 2}\lim\limits_{\varepsilon \rightarrow 0} M_5^3\left[ {{\partial _y}{h_{\mu \nu}}(x,y) - {\partial _y}h(x,y){\eta _{\mu \nu}}} \right]_{- \varepsilon}^\varepsilon + M_{{\rm{Pl}}}^2\left({\square {h_{\mu \nu}}(x,0) - {\partial _\mu}{\partial _\nu}h(x,0)} \right)} \\{= - 2{T_{\mu \nu}}(x)\, ,}\end{array}$$
(4.12)

yielding the modified linearized Einstein equation on the brane

$$M_{{\rm{Pl}}}^2\left[ {(\square{h_{\mu \nu}} - {\partial _\mu}{\partial _\nu}h) - {m_0}\sqrt {-\square} \left({{h_{\mu \nu}} - h{\eta _{\mu \nu}}} \right)} \right] = - 2{T_{\mu \nu}}\, ,$$
(4.13)

where all the metric perturbations are the ones localized at y = 0 and the constant mass scale m0 is given by

$${m_0} = {{M_5^3} \over {M_{{\rm{Pl}}}^2}}\, .$$
(4.14)
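As a quick numerical illustration of the steps from (4.11) to (4.13), the sketch below works in Fourier space with \(\sqrt {- \square} \to k\) (an assumption made here for spacelike momenta). It checks that the profile \({e^{- k\vert y\vert}}\) solves the bulk equation away from the brane and that its y-derivative jumps by −2k across y = 0, which is what turns the bulk term into the \({m_0}\sqrt {- \square}\) contribution on the brane. The function names and numerical values are illustrative only.

```python
import math

def profile(y, k):
    """Bulk profile exp(-k|y|) from (4.11), with sqrt(-box) replaced by k."""
    return math.exp(-k * abs(y))

def bulk_residual(y, k, step=1e-4):
    """Finite-difference value of (-k^2 + d^2/dy^2) acting on the profile, for |y| > step."""
    second = (profile(y + step, k) - 2.0 * profile(y, k) + profile(y - step, k)) / step**2
    return second - k**2 * profile(y, k)

def derivative_jump(k, eps=1e-6):
    """Jump of d(profile)/dy across y = 0, estimated with one-sided differences."""
    right = (profile(2.0 * eps, k) - profile(eps, k)) / eps
    left = (profile(-eps, k) - profile(-2.0 * eps, k)) / eps
    return right - left

def effective_mass_squared(k, m5_cubed, mpl_squared):
    """Scale-dependent mass m^2(k) = m0 * k, with m0 = M5^3 / MPl^2 as in (4.14)."""
    return (m5_cubed / mpl_squared) * k

if __name__ == "__main__":
    k = 2.0
    print("bulk residual at y = 0.5:", bulk_residual(0.5, k))
    print("derivative jump:", derivative_jump(k), "expected about", -2.0 * k)
    print("m^2(k):", effective_mass_squared(k, m5_cubed=2.0, mpl_squared=4.0))
```

Inserting the jump −2k into the first term of (4.12) gives \(- M_5^3k\left({{h_{\mu \nu}} - h{\eta _{\mu \nu}}} \right)\), which matches the mass term of (4.13) once \({m_0} = M_5^3/M_{{\rm{Pl}}}^2\) is used.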

Interestingly, we see the special Fierz-Pauli combination \({h_{\mu \nu}} - h{\eta _{\mu \nu}}\) appearing naturally from the five-dimensional nature of the theory. At this level, this corresponds to a linearized theory of massive gravity with a scale-dependent effective mass \({m^2}(\square) = {m_0}\sqrt {- \square}\), which in Fourier space reads \({m^2}(k) = {m_0}k\). We could now follow the same procedure as derived in Section 2.2.3 and obtain the expression for the sourced metric fluctuation on the brane

$${h_{\mu \nu}} = - {2 \over {M_{{\rm{Pl}}}^2}}{1 \over {\square - {m_0}\sqrt {-\square}}}\left({{T_{\mu \nu}} - {1 \over 3}T{\eta _{\mu \nu}} + {1 \over {3{m_0}\sqrt {-\square}}}{\partial _\mu}{\partial _\nu}T} \right)\, ,$$
(4.15)

where \(T = {\eta ^{\mu \nu}}{T_{\mu \nu}}\) is the trace of the four-dimensional stress-energy tensor localized on the brane. This yields the following gravitational exchange amplitude between two conserved sources \({T_{\mu \nu}}\) and \({T\prime_{\mu \nu}}\),

$${\mathcal A}_{TT\prime}^{{\rm{DGP}}} = \int {{{\rm{d}}^4}} x\;{h_{\mu \nu}}{T\prime^{\mu \nu}} = \int {{{\rm{d}}^4}} x\,{T\prime^{\mu \nu}}{{f_{\mu \nu \alpha \beta}^{{\rm{massive}}}} \over {\square - {m_0}\sqrt {-\square}}}\;{T^{\alpha \beta}}\, ,$$
(4.16)

where the polarization tensor \(f_{\mu \nu \alpha \beta}^{{\rm{massive}}}\) is the same as that given for Fierz-Pauli in (2.57) in terms of \({m_0}\). In particular the polarization tensor includes the standard factor of \(- {1 \over 3}T{\eta _{\mu \nu}}\) as opposed to \(- {1 \over 2}T{\eta _{\mu \nu}}\) as would be the case in GR. This is again the manifestation of the vDVZ discontinuity, which is cured by the Vainshtein mechanism as for Fierz-Pauli massive gravity. See [165] for the explicit realization of the Vainshtein mechanism in DGP, which is where it was first shown to work explicitly.

5.1.2 Spectral representation

In Fourier space the propagator for the graviton in DGP is given by

$$\tilde G_{\mu \nu \alpha \beta}^{{\rm{massive}}}(k) = f_{\mu \nu \alpha \beta}^{{\rm{massive}}}(k,{m_0})\;\tilde {\mathcal G}(k)\, ,$$
(4.17)

with the massive polarization tensor fmassive defined in (2.58) and

$$\tilde {\mathcal G}(k) = {1 \over {{k^2} + {m_0}k}}\, ,$$
(4.18)

which can be written in the Källén-Lehmann spectral representation as a sum of free propagators with mass μ,

$$\tilde {\mathcal G}(k) = \int\nolimits_0^\infty {{{\rho ({\mu ^2})} \over {{k^2} + {\mu ^2}}}} {\rm{d}}{\mu ^2}\, ,$$
(4.19)

with the spectral density \(\rho ({\mu ^2})\)

$$\rho ({\mu ^2}) = {1 \over \pi}{{{m_0}} \over \mu}{1 \over {{\mu ^2} + m_0^2}}\, ,$$
(4.20)

which is represented in Figure 1. As already emphasized, the graviton in DGP cannot be thought of as a single massive mode, but rather as a resonance peaked about μ = 0.

We see that the spectral density is positive for any \({\mu ^2} > 0\), confirming the fact that about the normal (flat) branch of DGP there is no ghost.

Notice as well that in the massless limit m0 → 0, we see appearing a representation of the Dirac delta function,

$$\lim\limits_{{m_0} \rightarrow 0} {1 \over \pi}{{{m_0}} \over \mu}{1 \over {{\mu ^2} + m_0^2}} = \delta ({\mu ^2})\, ,$$
(4.21)

and so the massless mode is singled out in the massless limit of DGP (with the different tensor structure given by \(f_{\mu \nu \alpha \beta}^{{\rm{massive}}} \ne f_{\mu \nu \alpha \beta}^{(0)}\), which is the origin of the vDVZ discontinuity; see Section 2.2.3).
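The spectral representation (4.19) with the density (4.20) can be checked numerically. The sketch below is an illustration under stated assumptions rather than part of the original derivation: it uses the substitution \(\mu = {m_0}\tan \theta\) (a choice made here to map the integral onto a finite interval), a hand-written Simpson rule, and arbitrary test values of k and \({m_0}\). It also evaluates the fraction of the spectral weight lying below a cutoff \({\mu _{\max}}\), which in closed form is \((2/\pi)\arctan ({\mu _{\max}}/{m_0})\) and tends to one as \({m_0} \to 0\), in line with (4.21).

```python
import math

def simpson(f, a, b, n=2000):
    """Composite Simpson rule on [a, b] with an even number n of intervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3.0

def spectral_propagator(k, m0):
    """Integral of rho(mu^2) / (k^2 + mu^2) over mu^2, using mu = m0 * tan(theta)."""
    def integrand(theta):
        c2 = math.cos(theta) ** 2
        s2 = math.sin(theta) ** 2
        return (2.0 / math.pi) * c2 / (k * k * c2 + m0 * m0 * s2)
    return simpson(integrand, 0.0, math.pi / 2.0)

def dgp_propagator(k, m0):
    """Closed form (4.18): 1 / (k^2 + m0 * k)."""
    return 1.0 / (k * k + m0 * k)

def weight_below(mu_max, m0):
    """Integral of rho(mu^2) over 0 < mu < mu_max, in closed form."""
    return (2.0 / math.pi) * math.atan(mu_max / m0)

if __name__ == "__main__":
    for k, m0 in [(1.0, 0.5), (0.3, 2.0)]:
        print(k, m0, spectral_propagator(k, m0), dgp_propagator(k, m0))
    for m0 in [1.0, 1e-2, 1e-6]:
        print("m0 =", m0, "weight below mu = 1:", weight_below(1.0, m0))
```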

5.2 Brane-bending mode

5.2.1 Five-dimensional gauge-fixing

In Section 4.1.1 we have remained vague about the gauge-fixing and the implications for the brane position. The brane-bending mode is actually important to keep track of in DGP and we shall do that properly in what follows by keeping all the modes.

We work in the five-dimensional ADM split with the lapse \(N = 1/\sqrt {{g^{yy}}} = 1 + {1 \over 2}{h_{yy}}\), the shift \({N_\mu} = {g_{\mu y}}\) and the four-dimensional part of the metric, \({g_{\mu \nu}}(x,y) = {\eta _{\mu \nu}} + {h_{\mu \nu}}(x,y)\). The five-dimensional Einstein-Hilbert term is then expressed as

$${\mathcal L}_{\rm{R}}^{(5)} = {{M_5^3} \over 4}\sqrt {- g} N\left({R[g] + {{[K]}^2} - [{K^2}]} \right)\, ,$$
(4.22)

where square brackets correspond to the trace of a tensor with respect to the four-dimensional metric gμν and Kμν is the extrinsic curvature

$${K_{\mu \nu}} = {1 \over {2N}}\left({{\partial _y}{g_{\mu \nu}} - {D_\mu}{N_\nu} - {D_\nu}{N_\mu}} \right)\, ,$$
(4.23)

and Dμ is the covariant derivative with respect to gμν.

First notice that the five-dimensional de Donder gauge choice (4.5) can be made using the five-dimensional gauge fixing term

$${\mathcal L}_{{\rm{Gauge - Fixing}}}^{(5)} = - {{M_5^3} \over 8}{\left({{\partial _A}h_{B}^A - {1 \over 2}{\partial _B}h_{A}^A} \right)^2}$$
(4.24)
$$\begin{array}{*{20}c} {= - {{M_5^3} \over 8}\left[ {{{\left({{\partial _\mu}{h^\mu}_\nu - {1 \over 2}{\partial _\nu}h + {\partial _y}{N_\nu} - {1 \over 2}{\partial _\nu}{h_{yy}}} \right)}^2}} \right.\quad} \\ {\left. {\quad \quad \quad \quad \quad + {{\left({{\partial _\mu}{N^\mu} + {1 \over 2}{\partial _y}{h_{yy}} - {1 \over 2}{\partial _y}h} \right)}^2}} \right],} \end{array}$$
(4.25)

where we keep the same notation as previously, h = ημνhμν is the four-dimensional trace.

After fixing the de Donder gauge (4.5), we can make the additional gauge transformation \({x^A} \to {x^A} + {\xi ^A}\) and remain in de Donder gauge provided \({\xi ^A}\) satisfies, at the linear level, \({\square_5}{\xi ^A} = 0\). This residual gauge freedom can be used to further fix the gauge on the brane (see [389] for more details; we only summarize their derivation here).

5.2.2 Four-dimensional Gauge-fixing

Keeping the brane at the fixed position y = 0 imposes \({\xi ^y} = 0\), since we need \({\xi ^y}(y = 0) = 0\) and \({\xi ^y}\) should be bounded as y → ∞ (the situation is slightly different in the self-accelerating branch and this mode can lead to a ghost, see Section 4.4 as well as [361, 98]).

Using the bulk profile \({h_{AB}}(x,y) = {e^{- \sqrt {- \square} \vert y\vert}}{h_{AB}}(x)\) and integrating over the extra dimension, we obtain the contribution from the bulk on the brane (including the contribution from the gauge-fixing term) in terms of the gauge invariant quantity

$$\begin{array}{*{20}c} {\tilde h_{\mu \nu}} = {h_{\mu \nu}} + {2 \over {\sqrt {- \square}}}{\partial _{(\mu}}{N_{\nu)}} = - {2 \over {\sqrt {- \square}}}{K_{\mu \nu}}\\ S_{{\rm{bulk}}}^{{\rm{integrated}}} = {{M_5^3} \over 4}\int {{{\rm{d}}^4}} x\left[ {- {1 \over 2}{{\tilde h}^{\mu \nu}}\sqrt {- \square} \left({{{\tilde h}_{\mu \nu}} - {1 \over 2}\tilde h{\eta _{\mu \nu}}} \right) + {1 \over 2}{h_{yy}}\sqrt {- \square} \left({\tilde h - {1 \over 2}{h_{yy}}} \right)} \right]\, . \end{array}$$
(4.26)

Notice again a factor of 2 difference from [389] which arises from the fact that we integrate from y = −∞ to y = +∞ imposing a ℤ2-mirror symmetry at y = 0, rather than considering only one side of the brane as in [389]. Both conventions are perfectly reasonable.

The integrated bulk action (4.27) is invariant under the residual linearized gauge symmetry

$${h_{\mu \nu}} \rightarrow {h_{\mu \nu}} + 2{\partial _{(\mu}}{\xi _{\nu)}}$$
(4.27)
$${N_\mu} \rightarrow {N_\mu} - \sqrt {- \square}\, {\xi _\mu}$$
(4.28)
$${h_{yy}} \rightarrow {h_{yy}}$$
(4.29)

which keeps both \({\tilde h_{\mu \nu}}\) and \({h_{yy}}\) invariant. The residual gauge symmetry can be used to set the gauge on the brane, and at this level from (4.27) we can see that the most convenient gauge fixing term is [389]

$${\mathcal L}_{{\rm{Residual}}\;{\rm{Gauge - Fixing}}}^{(4)} = - {{M_{{\rm{Pl}}}^2} \over 4}{\left({{\partial _\mu}{h^\mu}_\nu - {1 \over 2}{\partial _\nu}h + {m_0}{N_\nu}} \right)^2}\, ,$$
(4.30)

with again \({m_0} = M_5^3/M_{{\rm{Pl}}}^2\), so that the induced Lagrangian on the brane (including the contribution from the residual gauge fixing term) is

$${S_{{\rm{boundary}}}} = {{M_{{\rm{Pl}}}^2} \over 4}\int {{\rm{d}^4}} x\left[ {{1 \over 2}{h^{\mu \nu}}\square({h_{\mu \nu}} - {1 \over 2}h{\eta _{\mu \nu}}) - 2{m_0}{N_\mu}\left({{\partial _\alpha}{h^{\alpha \mu}} - {1 \over 2}{\partial ^\mu}h} \right) - m_0^2{N_\mu}{N^\mu}} \right].$$
(4.31)

Combining the five-dimensional action from the bulk (4.27) with that on the boundary (4.31) we end up with the linearized action on the four-dimensional DGP brane [389]

$$\begin{array}{*{20}c} {S_{{\rm{DGP}}}^{({\rm{lin}})} = {{M_{{\rm{Pl}}}^2} \over 4}\int {{{\rm{d}}^4}x\left[ {{1 \over 2}{h^{\mu \nu}}\left[ {\square - {m_0}\sqrt {- \square}} \right]({h_{\mu \nu}} - {1 \over 2}h{\eta _{\mu \nu}}) - {m_0}{N^\mu}{\partial _\mu}{h_{yy}}} \right.}} \\ {\left. {- {m_0}{N^\mu}\left[ {\sqrt {- \square} + {m_0}} \right]{N_\mu} - {{{m_0}} \over 4}{h_{yy}}\sqrt {- \square} ({h_{yy}} - 2h)} \right].} \end{array}$$
(4.32)

As shown earlier, we recover the theory of a massive graviton in four dimensions, with a soft mass \({m^2}(\square) = {m_0}\sqrt {- \square}\). This analysis has allowed us to keep track of the physical origin of all the modes, including the brane-bending mode, which is especially relevant when deriving the decoupling limit as we shall see below.

The kinetic mixing between these different modes can be diagonalized by performing the change of variables [389]

$${h_{\mu \nu}} = {1 \over {{M_{{\rm{Pl}}}}}}\left({{{h\prime}_{\mu \nu}} + \pi {\eta _{\mu \nu}}} \right)$$
(4.33)
$${N_\mu} = {1 \over {{M_{{\rm{Pl}}}}\sqrt {{m_0}}}}{N\prime_\mu} + {1 \over {{M_{{\rm{Pl}}}}{m_0}}}{\partial _\mu}\pi$$
(4.34)
$${h_{yy}} = - {{2\sqrt {- \square}} \over {{m_0}{M_{{\rm{Pl}}}}}}\pi \, ,$$
(4.35)

so we see that the mode π is directly related to hyy. In the case of Section 4.1.1, we had set hyy = 0 and the field π is then related to the brane bending mode. In either case we see that the extrinsic curvature Kμν carries part of this mode.

Omitting the mass terms and other relevant operators, the action is diagonalized in terms of the different graviton modes at the linearized level: \({h\prime_{\mu \nu}}\) (which encodes the helicity-2 mode), \({N\prime_\mu}\) (which is part of the helicity-1 mode) and π (helicity-0 mode),

$$S_{{\rm{DGP}}}^{({\rm{lin}})} = {1 \over 4}\int {{{\rm{d}}^4}} x\left[ {{1 \over 2}{{h\prime}^{\mu \nu}}\square ({{h\prime}_{\mu \nu}} - {1 \over 2}h\prime{\eta _{\mu \nu}}) - {{N\prime}^\mu}\sqrt {- \square} {{N\prime}_\mu} + 3\pi \square \pi} \right]\, .$$
(4.36)

5.2.3 Decoupling limit

We will be discussing the meaning of ‘decoupling limits’ in more depth in the context of multi-gravity and ghost-free massive gravity in Section 8. The main idea behind the decoupling limit is to separate the physics of the different modes. Here we are interested in following the interactions of the helicity-0 mode without the complications from the standard helicity-2 interactions that already arise in GR. For this purpose we can take the limit MPl → ∞ while simultaneously sending \({m_0} = M_5^3/M_{{\rm{Pl}}}^2 \to 0\) while keeping the scale \(\Lambda = {(m_0^2{M_{{\rm{Pl}}}})^{1/3}}\) fixed. This is the scale at which the first interactions arise in DGP.

In DGP the decoupling limit should be taken by considering the full five-dimensional theory, as was performed in [389]. The four-dimensional Einstein-Hilbert term does not give rise to any operators below the Planck scale, so in order to look for the irrelevant operators that come in at the lowest possible scale, it is sufficient to focus on the boundary term from the five-dimensional action. It includes operators of the form

$${\mathcal L}_{{\rm{boundary}}}^{(5)} \supset {m_0}M_{{\rm{Pl}}}^2\partial {\left({{{{{h\prime}_{\mu \nu}}} \over {{M_{{\rm{Pl}}}}}}} \right)^n}{\left({{{{{N\prime}_\mu}} \over {\sqrt {{m_0}}{M_{{\rm{Pl}}}}}}} \right)^k}{\left({{{\partial \pi} \over {{m_0}{M_{{\rm{Pl}}}}}}} \right)^\ell}\, ,$$
(4.37)

with integer powers n, k, ℓ ≥ 0 and n + k + ℓ ≥ 3 since we are dealing with interactions. The scale at which such an operator arises is

$${\Lambda _{n,k,\ell}} = {\left({M_{{\rm{Pl}}}^{n + k + \ell - 2}m_0^{k/2 + \ell - 1}} \right)^{1/(n + 3k/2 + 2\ell - 3)}}$$
(4.38)

and it is easy to see that the lowest possible scale is \({\Lambda _3} = {({M_{{\rm{Pl}}}}m_0^2)^{1/3}}\), which arises for n = 0, k = 0 and ℓ = 3; it is thus a cubic interaction in the helicity-0 mode π which involves four derivatives. Since it is only a cubic interaction, we can scan all the possible ways π enters at the cubic level in the five-dimensional Einstein-Hilbert action. The relevant pieces are the ones from the extrinsic curvature in (4.22), and in particular the combination \(N({[K]^2} - [{K^2}])\), with

$$N = 1 + {1 \over 2}{e^{- \sqrt {- \square} y}}{h_{yy}}$$
(4.39)
$${K_{\mu \nu}} = - {1 \over 2}(1 - {1 \over 2}{e^{- \sqrt {- \square} y}}{h_{yy}})({\partial _\mu}{N_\nu} + {\partial _\nu}{N_\mu})\, .$$
(4.40)
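Returning briefly to the scale (4.38), the statement that \({\Lambda _3}\) is the lowest one can be checked by brute force. The sketch below writes \({\Lambda _{n,k,\ell}} = M_{{\rm{Pl}}}^a\,m_0^b\) with exact rational exponents and scans small integer powers. Since a + b = 1 and \({m_0} \ll {M_{{\rm{Pl}}}}\), the lowest scale is the one with the largest b. The search range `max_power` and the function names are assumptions made for this illustration; the case n = 3, k = ℓ = 0 is skipped because the operator is then marginal and defines no scale.

```python
from fractions import Fraction
from itertools import product

def scale_exponents(n, k, l):
    """Return (a, b) with Lambda_{n,k,l} = MPl**a * m0**b from (4.38), or None if marginal."""
    denom = Fraction(n) + Fraction(3 * k, 2) + 2 * l - 3
    if denom == 0:
        return None  # dimension-four operator: no associated scale
    a = Fraction(n + k + l - 2) / denom
    b = (Fraction(k, 2) + l - 1) / denom
    return a, b

def lowest_scale(max_power=8):
    """Scan 0 <= n, k, l <= max_power with n + k + l >= 3 and return the largest-b entry."""
    best = None
    for n, k, l in product(range(max_power + 1), repeat=3):
        if n + k + l < 3:
            continue  # not an interaction
        exps = scale_exponents(n, k, l)
        if exps is None:
            continue
        if best is None or exps[1] > best[1][1]:
            best = ((n, k, l), exps)
    return best

if __name__ == "__main__":
    powers, (a, b) = lowest_scale()
    print("lowest scale at (n, k, l) =", powers, "with MPl^%s * m0^%s" % (a, b))
```

Within the scanned range the maximum is reached only for (n, k, ℓ) = (0, 0, 3), with exponents a = 1/3 and b = 2/3, i.e. \({({M_{{\rm{Pl}}}}m_0^2)^{1/3}}\).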

Integrating \({m_0}M_{{\rm{Pl}}}^2N({[K]^2} - [{K^2}])\) along the extra dimension, we obtain the cubic contribution in π on the brane (using the relations (4.34) and (4.35))

$${{\mathcal L}_{{\Lambda _3}}} = {1 \over {2\Lambda _3^3}}{(\partial \pi)^2}\square \pi \, .$$
(4.41)

So the decoupling limit of DGP arises at the scale Λ3 and reduces to a cubic Galileon for the helicity-0 mode with no interactions for the helicity-2 and -1 modes,

$$\begin{array}{*{20}c} {{\mathcal L_{{\rm{DL}}\;{\rm{DGP}}}} = {1 \over 8}{{h\prime}^{\mu \nu}}\square \left({{{h\prime}_{\mu \nu}} - {1 \over 2}h\prime{\eta _{\mu \nu}}} \right) - {1 \over 4}{{N\prime}^\mu}\sqrt {- \square} {{N\prime}_\mu}} \\ {+ {3 \over 2}\pi \square \pi + {1 \over {2\Lambda _3^3}}{{(\partial \pi)}^2}\square \pi .\quad \quad} \end{array}$$
(4.42)

5.3 Phenomenology of DGP

The phenomenology of DGP is extremely rich and has led to many developments. In what follows we review one of the most important implications of DGP for cosmology, which is the existence of self-accelerating solutions. The cosmology and phenomenology of DGP were first derived in [159, 163] (see also [388, 385, 387, 386]).

5.3.1 Friedmann equation in de Sitter

To get some intuition on how cosmology gets modified in DGP, we first look at de Sitter-like solutions and then infer the full Friedmann equation in a FLRW-geometry. We thus start with five-dimensional Minkowski in de Sitter slicing (this can be easily generalized to FLRW-slicing),

$${\rm{d}}s_5^2 = {b^2}(y)\,\left({{\rm{d}}{y^2} + \gamma _{\mu \nu}^{({\rm{dS}})}{\rm{d}}{x^\mu}{\rm{d}}{x^\nu}} \right)\, ,$$
(4.43)

where \(\gamma _{\mu \nu}^{{\rm{(dS)}}}\) is the four-dimensional de Sitter metric with constant Hubble parameter H, \(\gamma _{\mu \nu}^{{\rm{(dS)}}}\,{\rm{d}}{x^\mu}\,{\rm{d}}{x^\nu} = - {\rm{d}}{t^2} + {a^2}(t)\,{\rm{d}}{x^2}\), and the scale factor is given by a(t) = exp(Ht). The metric (4.43) is indeed Minkowski in de Sitter slicing if the warp factor b(y) is given by

$$b(y) = {e^{\epsilon H\vert y\vert}},\quad {\rm{with}}\quad \epsilon = \pm 1\, ,$$
(4.44)

and the modulus |y| has been imposed by the ℤ2-orbifold symmetry. As we shall see, the branch ϵ = +1 corresponds to the self-accelerating branch of DGP and ϵ = −1 is the stable, normal branch of DGP.

We can now derive the Friedmann equation on the brane by integrating over the 00-component of the Einstein equation (4.2) with the source (4.3) and consider some energy density T00 = ρ. The four-dimensional Einstein tensor gives the standard contribution G00 = 3H2 on the brane and so we obtain the modified Friedmann equation

$${{M_5^3} \over 2}\left[ {\underset {\varepsilon \rightarrow 0} {\lim} \int\nolimits_{- \varepsilon}^\varepsilon {{}^{(5)}{G_{00}}} {\rm{d}}y} \right] + 3{\rm{M}}_{{\rm{Pl}}}^2{H^2} = \rho \, ,$$
(4.45)

with \({}^{(5)}{G_{00}} = 3\left({{H^2} - b''(y)/b(y)} \right)\), so

$$\underset {\varepsilon \rightarrow 0} {\lim} \int\nolimits_{- \varepsilon}^\varepsilon {{}^{(5)}{G_{00}}} {\rm{d}}y = - 6\epsilon H\, ,$$
(4.46)
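The delta-function contribution behind (4.46) can be made explicit numerically. The sketch below is a check under a regularization chosen here for illustration: |y| is replaced by \(\sqrt {{y^2} + {\delta ^2}}\) in the warp factor (4.44), so that b″/b is finite, and \(3({H^2} - b''/b)\) is integrated over a thin slab around the brane with a Simpson rule. The slab half-width, the smoothing length δ and the function names are arbitrary assumptions.

```python
import math

def g00_smoothed(y, hubble, branch, delta):
    """3 (H^2 - b''/b) for b = exp(branch * H * sqrt(y^2 + delta^2)), a smoothed |y|."""
    s = math.sqrt(y * y + delta * delta)
    dlog = branch * hubble * y / s              # (ln b)'
    d2log = branch * hubble * delta**2 / s**3   # (ln b)''
    # b''/b = (ln b)'' + ((ln b)')^2
    return 3.0 * (hubble**2 - (d2log + dlog**2))

def junction_integral(hubble, branch, half_width=1e-2, delta=1e-4, n=20000):
    """Simpson integral of the smoothed G00 over [-half_width, half_width] (n even)."""
    h = 2.0 * half_width / n
    total = (g00_smoothed(-half_width, hubble, branch, delta)
             + g00_smoothed(half_width, hubble, branch, delta))
    for i in range(1, n):
        y = -half_width + i * h
        total += (4 if i % 2 else 2) * g00_smoothed(y, hubble, branch, delta)
    return total * h / 3.0

if __name__ == "__main__":
    for branch in (+1, -1):
        value = junction_integral(1.0, branch)
        print("branch", branch, "integral", value, "expected about", -6.0 * branch)
```

As δ and the slab width are taken to zero (with δ much smaller than the width), the result approaches −6ϵH for both branches.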

leading to the modified Friedmann equation,

$${H^2} - \epsilon {m_0}H = {1 \over {3M_{{\rm{Pl}}}^2}}\rho \, ,$$
(4.47)

where the five-dimensional nature of the theory is encoded in the new term −ϵm0H (this new contribution can be seen to arise from the helicity-0 mode of the graviton and could have been derived using the decoupling limit of DGP). For reasons which will become clear in what follows, the choice ϵ = −1 corresponds to the stable branch of DGP while the other choice ϵ = +1 corresponds to the self-accelerating branch of DGP. As is already clear from the higher-dimensional perspective, when ϵ = +1, the warp factor grows in the bulk (unless we think of the junction conditions the other way around), which is already signaling towards a pathology for that branch of solution.
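Since (4.47) is a quadratic equation in H, it can be solved in closed form. The sketch below is a minimal illustration: it returns the larger root for a given energy density and branch, with units and parameter values chosen arbitrarily for the example. In vacuum it gives H = 0 on the normal branch and H = m0 on the ϵ = +1 branch (where H = 0 is also a root, not returned here), and for \(\rho \gg M_{{\rm{Pl}}}^2m_0^2\) both branches approach the standard \({H^2} \simeq \rho /3M_{{\rm{Pl}}}^2\).

```python
import math

def hubble_rate(rho, m0, mpl, branch):
    """Larger root H of H^2 - branch*m0*H = rho / (3 MPl^2), with branch = +1 or -1."""
    source = rho / (3.0 * mpl**2)
    return 0.5 * (branch * m0 + math.sqrt(m0**2 + 4.0 * source))

def friedmann_residual(hubble, rho, m0, mpl, branch):
    """Left-hand side minus right-hand side of (4.47); zero on a solution."""
    return hubble**2 - branch * m0 * hubble - rho / (3.0 * mpl**2)

if __name__ == "__main__":
    m0, mpl = 0.2, 1.0
    for branch in (+1, -1):
        for rho in (0.0, 3.0, 3.0e4):
            h = hubble_rate(rho, m0, mpl, branch)
            print("branch", branch, "rho", rho, "H", h,
                  "GR value", math.sqrt(rho / (3.0 * mpl**2)))
```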

5.3.2 General Friedmann equation

This modified Friedmann equation has been derived assuming a constant H, which is only consistent if the energy density is constant (i.e., a cosmological constant). We can now derive the generalization of this Friedmann equation for non-constant H. This amounts to accounting for \(\dot H\) and other derivative corrections which might have been omitted in deriving this equation by assuming that H was constant. But the Friedmann equation corresponds to the Hamiltonian constraint equation, and higher derivatives (e.g., \(\dot H \supset \ddot a\) and higher derivatives of H) would imply that this equation is no longer a constraint. This loss of constraint would imply that the theory admits a new degree of freedom about generic backgrounds, namely the BD ghost (see the discussion of Section 7).

However, in DGP we know that the BD ghost is absent (this is ensured by the five-dimensional nature of the theory, in five dimensions we start with five dofs, and there is thus no sixth BD mode). So the Friedmann equation cannot include any derivatives of H, and the Friedmann equation obtained assuming a constant H is actually exact in FLRW even if H is not constant. So the constraint (4.47) is the exact Friedmann equation in DGP for any energy density ρ on the brane.

The same trick can be used for massive gravity and bi-gravity and the Friedmann equations (12.51), (12.52) and (12.54) are indeed free of any derivatives of the Hubble parameter.

5.3.3 Observational viability of DGP

Independently of the ghost issue in the self-accelerating branch of the model, there has been a vast amount of investigation on the observational viability of both the self-accelerating branch and the normal (stable) branch of DGP, first because many of these observations can apply equally well to the stable branch of DGP (modulo a minus sign in some of the cases), and second and foremost because DGP represents an excellent archetype in which ideas of modified gravity can be tested.

Observational tests of DGP fall into the following two main categories:

  • Tests of the Friedmann equation. This test was performed mainly using Supernovae, but also using Baryonic Acoustic Oscillations and the CMB so as to fix the background history of the Universe [162, 217, 221, 286, 391, 23, 405, 481, 304, 382, 462]. Current observations seem to slightly disfavor the additional term in the Friedmann equation of DGP, even in the normal branch where the late-time acceleration of the Universe is due to a cosmological constant as in ΛCDM. These put bounds on the graviton mass in DGP to the order of \({m_0} \lesssim {10^{- 1}}{H_0}\), where \({H_0}\) is the Hubble parameter today (see Ref. [492] for the latest bounds at the time of writing, including data from Planck). Effectively this means that in order for DGP to be consistent with observations, the graviton mass can have no effect on the late-time acceleration of the Universe.

  • Tests of an extra fifth force, either within the solar system, or during structure formation (see for instance [362, 260, 452, 451, 222, 482] Refs. [453, 337, 442] for N-body simulations as well as Ref. [17, 441] using weak lensing).

    The evasion of fifth force experiments is discussed in more detail within the context of the Vainshtein mechanism in Section 10.1 and thereafter. See Refs. [388, 385, 387, 386, 444] for a five-dimensional study dedicated to DGP. The study of cosmological perturbations within the context of DGP was also performed in depth, for instance in [367, 92].

5.4 Self-acceleration branch

The cosmology of DGP has led to a major conceptual breakthrough, namely the realization that the Universe could be ‘self-accelerating’. When choosing the ϵ = +1 branch of DGP, the Friedmann equation in the vacuum reduces to [159, 163]

$${H^2} - {m_0}H = 0\, ,$$
(4.48)

which admits a non-trivial solution H = m0 in the absence of any cosmological constant or vacuum energy. In itself this would not solve the old cosmological constant problem as the vacuum energy ought to be set to zero on its own, but it can lead to a model of ‘dark gravity’ where the amount of acceleration is governed by the scale m0 which is stable against quantum corrections.

This realization has opened a new field of study in its own right. It is beyond the scope of this review on massive gravity to summarize all the interesting developments that arose in the past decade and we simply focus on a few elements namely the presence of a ghost in this self-accelerating branch as well as a few cosmological observations.

5.4.1 Ghost

The existence of a ghost on the self-accelerating branch of DGP was first pointed out in the decoupling limit [389, 411], where the helicity-0 mode of the graviton is shown to enter with the wrong sign kinetic term in this branch of solutions. We emphasize that the issue of the ghost in the self-accelerating branch of DGP is completely unrelated to the sixth BD ghost in some theories of massive gravity. In DGP there are five dofs, one of which is a ghost. The analysis was then generalized in the fully fledged five-dimensional theory by K. Koyama in [360] (see also [263, 361] and [98]).

When perturbing about Minkowski, it was shown that the graviton has an effective mass \({m^2} = {m_0}\sqrt {- \square}\). When perturbing on top of the self-accelerating solution a similar analysis can be performed and one can show that in the vacuum the graviton has an effective mass at precisely the Higuchi bound, \(m_{{\rm{eff}}}^2 = 2{H^2}\) (see Ref. [307]). When matter or a cosmological constant is included on the brane, the graviton mass shifts either inside the forbidden Higuchi region \(0 < m_{{\rm{eff}}}^2 < 2{H^2}\), or outside \(m_{{\rm{eff}}}^2 > 2{H^2}\). We summarize the three cases following [360, 98]:

  • In [307] it was shown that when the effective mass is within the forbidden Higuchi-region, the helicity-0 mode of graviton has the wrong sign kinetic term and is a ghost.

  • Outside this forbidden region, when \(m_{{\rm{eff}}}^2 > 2{H^2}\), the zero-mode of the graviton is healthy but there exists a new normalizable brane-bending mode in the self-accelerating branch (Footnote 8) which is a genuine degree of freedom. For \(m_{{\rm{eff}}}^2 > 2{H^2}\) the brane-bending mode was shown to be a ghost.

  • Finally, at the critical mass \(m_{{\rm{eff}}}^2 = 2{H^2}\) (which happens when neither matter nor a cosmological constant is present on the brane), the brane-bending mode takes the role of the helicity-0 mode of the graviton, so that the graviton still has five degrees of freedom, and this mode was shown to be a ghost as well.

In summary, independently of the matter content of the brane, so long as the graviton is massive \(m_{{\rm{eff}}}^2 > 0\), the self-accelerating branch of DGP exhibits a ghost. See also [210] for an exact non-perturbative argument studying domain walls in DGP. In the self-accelerating branch of DGP domain walls bear a negative gravitational mass. This non-perturbative solution can also be used as an argument for the instability of that branch.

5.4.2 Evading the ghost?

Different ways to remove the ghosts were discussed for instance in [325] where a second brane was included. In this scenario it was then shown that the graviton could be made stable but at the cost of including a new spin-0 mode (that appears as the mode describing the distance between the branes).

Alternatively it was pointed out in [233] that if the sign of the extrinsic curvature was flipped, the self-accelerating solution on the brane would be stable.

Finally, a stable self-acceleration was also shown to occur in the massless case \(m_{{\rm{eff}}}^2 = 0\) by relying on Gauss-Bonnet terms in the bulk and a self-sourced AdS5 solution [156]. The five-dimensional theory is then similar to that of DGP (4.1) but with the addition of a five-dimensional Gauss-Bonnet term \({\mathcal R}_{{\rm{GB}}}^2\) in the bulk and the wrong sign five-dimensional Einstein-Hilbert term,

$$\begin{array}{*{20}c} {S = \int {{{\rm{d}}^5}} x\left[ {\sqrt {{- ^{({\rm{5}})}}g} \left({- {{M_5^3} \over 4}{\,^{({\rm{5}})}}R{[^{({\rm{5}})}}g] - {{M_5^3{\ell ^2}} \over 4}{\,^{({\rm{5}})}}R_{{\rm{GB}}}^2{[^{({\rm{5}})}}g]} \right)\quad \quad \quad \quad \quad \quad \quad \quad} \right.} \\{\left. {+ \delta (y)\left[ {\sqrt {- g} {{M_{{\rm{Pl}}}^2} \over 2}R + {L_m}(g,{\psi _i})} \right]} \right]\, .} \end{array}$$
(4.49)

The idea is not so dissimilar from that of new massive gravity (see Section 13), where here the wrong sign kinetic term in five dimensions is balanced by the Gauss-Bonnet term in such a way that the graviton has the correct sign kinetic term on the self-sourced AdS5 solution. The length scale ℓ is related to this AdS length scale, and the self-accelerating branch admits a stable (ghost-free) de Sitter solution with \(H \sim {\ell ^{- 1}}\).

We do not discuss this model any further in what follows since the graviton admits a zero (massless) mode. It is feasible that this model can be understood as a bi-gravity theory where the massive mode is a resonance. It would also be interesting to see how this model fits in with the Galileon theories [412] which admit stable self-accelerating solutions.

In what follows, we go back to the standard DGP model be it the self-accelerating branch (ϵ = 1) or the normal branch (ϵ = − 1).

5.5 Degravitation

One of the main motivations behind modifying gravity in the infrared is to tackle the old cosmological constant problem. The idea behind ‘degravitation’ [211, 212, 26, 216] is that if gravity is modified in the IR, then a cosmological constant (or the vacuum energy) could have a smaller impact on the geometry. In these models, we would live with a large vacuum energy (be it at the TeV scale or at the Planck scale) but only observe a small amount of late-time acceleration due to the modification of gravity. In order for a theory of modified gravity to potentially tackle the old cosmological constant problem via degravitation it needs to have the two following properties:

  1. First, gravity must be weaker in the infrared and effectively massive [216] so that the effect of IR sources can be degravitated.

  2. Second, there must exist some (nearly) static attractor solutions towards which the system can evolve at late time for arbitrary values of the vacuum energy or cosmological constant.

5.5.1 Flat solution with a cosmological constant

The first requirement is present in DGP, but as was shown in [216] in DGP gravity is not ‘sufficiently weak’ in the IR to allow degravitation solutions. Nevertheless, it was shown in [164] that the normal branch of DGP satisfies the second requirement for any negative value of the cosmological constant. In these solutions the five-dimensional spacetime is not Lorentz invariant, but in a way which would not (at this background level) be observed when confined on the four-dimensional brane.

For positive values of the cosmological constant, DGP does not admit a (nearly) static solution. This can be understood at the level of the decoupling limit using the arguments of [216] and generalized for other mass operators.

Inspired by the form of the graviton mass in DGP, \({m^2}(\square) = {m_0}\sqrt {- \square}\), we can generalize the form of the graviton mass to

$${m^2}(\square) = m_0^2{\left({{{- \square} \over {m_0^2}}} \right)^\alpha}\, ,$$
(4.50)

with α a positive dimensionless constant. α = 1 corresponds to a modification of the kinetic term. As shown in [153], any such modification leads to ghosts, so we do not consider this case here. α > 1 corresponds to a UV modification of gravity, and so we focus on α < 1.
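As a minimal numerical sketch of the parametrization (4.50), the function below evaluates the soft mass in Fourier space. It assumes the replacement −□ → k² with k a Euclidean momentum magnitude, which is our simplification and not spelled out in the text; the name `soft_mass_squared` and the sample values are purely illustrative.

```python
def soft_mass_squared(k, m0, alpha):
    """Fourier-space form of (4.50), assuming -Box -> k^2 (Euclidean momentum).

    Returns m0^2 * (k^2 / m0^2)^alpha.
    """
    if k < 0 or m0 <= 0:
        raise ValueError("expected k >= 0 and m0 > 0")
    return m0**2 * (k**2 / m0**2) ** alpha


m0 = 1.0  # hypothetical mass scale, arbitrary units
for alpha in (0.0, 0.25, 0.5, 1.0):
    # alpha = 0: hard mass m0^2; alpha = 1/2: DGP-like m0 * k; alpha = 1: k^2
    values = [soft_mass_squared(k, m0, alpha) for k in (0.01, 0.1, 1.0)]
    print(alpha, values)
```

For α = 1/2 the function returns m0 k, matching the DGP form quoted above, while for α < 1/2 the mass falls off more slowly towards the IR, which is the regime of interest for degravitation.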

In the decoupling limit the helicity-2 mode decouples from the helicity-0 mode, which behaves (symbolically) as follows [216]

$$3\square \pi - {1 \over {{M_{{\rm{Pl}}}}m_0^{4(1 - \alpha)}}}\square {\left({{\square ^{1 - \alpha}}\pi} \right)^2} + \cdots = - {1 \over {{M_{{\rm{Pl}}}}}}T\, ,$$
(4.51)

where T is the trace of the stress-energy tensor of external matter fields. At the linearized level, matter couples to the metric \({g_{\mu \nu}} = {\eta _{\mu \nu}} + {1 \over {{M_{{\rm{Pl}}}}}}(h_{\mu \nu}\prime + \pi {\eta _{\mu \nu}})\). We now check under which conditions we can still recover a nearly static metric in the presence of a cosmological constant Tμν = −ΛCCgμν. In the linearized limit of GR this leads to the profile for the helicity-2 mode (which in that case corresponds to a linearized de Sitter solution)

$${h\prime_{\mu \nu}} = - {{{\Lambda _{{\rm{CC}}}}} \over {6{M_{{\rm{Pl}}}}}}{\eta _{\rho \sigma}}{x^\rho}{x^\sigma}{\eta _{\mu \nu}}\, .$$
(4.52)

One way we can obtain a static solution in this extended theory of massive gravity at the linear level is by ensuring that the solution for π cancels out that of \(h_{\mu \nu}\prime\) so that the metric gμν remains flat. \(\pi = + {{{\Lambda _{{\rm{CC}}}}} \over {6{M_{{\rm{Pl}}}}}}{\eta _{\mu \nu}}{x^\mu}{x^\nu}\) is actually the solution of (4.51) when only the term \(3\square \pi\) contributes and all the other operators vanish for \(\pi \propto {x_\mu}{x^\mu}\). This is the case if α < 1/2 as shown in [216]. This explains why in the case of DGP, which corresponds to the borderline scenario α = 1/2, one can never fully degravitate a cosmological constant.
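The cancellation described above can be checked with exact rational arithmetic. The sketch below only tests the leading term of (4.51), i.e. that the quadratic profile for π satisfies 3□π = −T/MPl with T = −4ΛCC (the trace of Tμν = −ΛCCημν at linear order), and that it cancels the helicity-2 profile (4.52) in the combination coupling to matter. It does not test the vanishing of the higher operators for α < 1/2. The mostly-plus signature and all numerical values are assumptions made for illustration.

```python
from fractions import Fraction

ETA_DIAG = [-1, 1, 1, 1]  # Minkowski metric, mostly-plus signature (assumed)


def x_squared(x):
    """eta_{mu nu} x^mu x^nu."""
    return sum(ETA_DIAG[m] * x[m] ** 2 for m in range(4))


def pi_profile(x, lam, m_pl):
    """Helicity-0 profile pi = +Lambda_CC / (6 M_Pl) * x^2."""
    return Fraction(lam) / (6 * Fraction(m_pl)) * x_squared(x)


def h_prime_coefficient(x, lam, m_pl):
    """Coefficient of eta_{mu nu} in the helicity-2 profile (4.52)."""
    return -Fraction(lam) / (6 * Fraction(m_pl)) * x_squared(x)


def box(f, x, step=Fraction(1)):
    """eta^{mu nu} d_mu d_nu f by central differences (exact for quadratics)."""
    total = Fraction(0)
    for m in range(4):
        up, down = list(x), list(x)
        up[m] += step
        down[m] -= step
        total += ETA_DIAG[m] * (f(up) - 2 * f(x) + f(down)) / step**2
    return total


lam, m_pl = Fraction(3), Fraction(7)  # hypothetical Lambda_CC and M_Pl
x = [Fraction(2), Fraction(-1), Fraction(5), Fraction(1, 3)]

lhs = 3 * box(lambda y: pi_profile(y, lam, m_pl), x)
trace_T = -4 * lam                     # eta^{mu nu} T_{mu nu}
rhs = -trace_T / m_pl
print("3 box(pi) == -T/M_Pl:", lhs == rhs)
print("h' + pi eta vanishes:", h_prime_coefficient(x, lam, m_pl) + pi_profile(x, lam, m_pl) == 0)
```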

5.5.2 Extensions

This realization has motivated the search for theories of massive gravity with 0 ≤ α < 1/2, and especially the extension of DGP to higher dimensions where the parameter α can get as close to zero as required. This is the main motivation behind higher dimensional DGP [359, 240] and cascading gravity [135, 148, 132, 149] as we review in what follows. (In [433] it was also shown how a regularized version of higher dimensional DGP could be free of the strong coupling and ghost issues).

Note that α ≡ 0 corresponds to a hard mass gravity. Within the context of DGP, such a model with an ‘auxiliary’ extra dimension was proposed in [235, 133] where we consider a finite-size large extra dimension which breaks five-dimensional Lorentz invariance. The five-dimensional action is motivated by the five-dimensional gravity with scalar curvature in the ADM decomposition \({}^{(5)}R = R[g] + {[K]^2} - [{K^2}]\), but discarding the contribution from the four-dimensional curvature R[g]. As in DGP, the four-dimensional curvature still appears induced on the brane

$$S = {{{\rm{M}}_{{\rm{Pl}}}^2} \over 2}\int\nolimits_0^\ell {\rm{d}} y\int {{{\rm{d}}^4}} x\sqrt {- g} \,\left({{m_0}\,\left({{{[K]}^2} - [{K^2}]} \right) + \delta (y)R[g]} \right)\, ,$$
(4.53)

where ℓ is the size of the auxiliary extra dimension, gμν is a four-dimensional metric, and we set the lapse to one and the shift to zero (the shift can be kept and will contribute to the four-dimensional Stückelberg field which restores four-dimensional invariance, but at this level it is easier to work in the gauge where the shift is set to zero and reintroduce the Stückelberg fields directly in four dimensions). Imposing the Dirichlet conditions gμν (x, y = 0) = fμν, we are left with a theory of massive gravity at y = 0, with reference metric fμν and hard mass m0. Here again the special structure \(({[K]^2} - [{K^2}])\) inherited (or rather inspired) from five-dimensional gravity ensures the Fierz-Pauli structure and the absence of ghosts at the linearized level. Up to cubic order in perturbations it was shown in [138] that the theory is free of ghosts and its decoupling limit is that of a Galileon.

Furthermore, it was shown in [133] that it satisfies both requirements presented above to potentially help degravitate a cosmological constant. Unfortunately at higher orders this model is plagued with the BD ghost [291] unless the boundary conditions are chosen appropriately [59]. For this reason we will not review this model any further in what follows and focus instead on the ghost-free theory of massive gravity derived in [137, 144].

5.5.3 Cascading gravity

5.5.3.1 Deficit angle

It is well known that a tension on a cosmic string does not cause the cosmic string to inflate but rather creates a deficit angle in the two spatial dimensions orthogonal to the string. Similarly, if we consider a four-dimensional brane embedded in six-dimensional gravity, then a tension on the brane leads to the following flat geometry

$${\rm{d}}s_6^2 = {\eta _{\mu \nu}}\,{\rm{d}}{x^\mu}\,{\rm{d}}{x^\nu}\, + \,{\rm{d}}{r^2}\, + \,{r^2}\,\left({1 - {{\Delta \theta} \over {2\pi}}} \right){\rm{d}}{\theta ^2}\, ,$$
(4.54)

where the two extra dimensions are expressed in polar coordinates {r, θ} and Δθ is a constant which parameterizes the deficit angle in this canonical geometry. This deficit angle is related to the tension on the brane ΛCC and the six-dimensional Planck scale (assuming six-dimensional gravity)

$$\Delta \theta = 2\pi {{{\Lambda _{{\rm{CC}}}}} \over {M_6^4}}\, .$$
(4.55)

For a positive tension ΛCC > 0, this creates a positive deficit angle and since Δθ cannot be larger than 2π, the maximal tension on the brane is \(M_6^4\). For a negative tension, on the other hand, there is no such bound as it creates a surplus of angle, see Figure 2.
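The relation (4.55) and the resulting bound on the tension can be made concrete with a few lines of code. This is only an illustrative sketch in natural units; the function names and the value chosen for the six-dimensional Planck scale are hypothetical.

```python
import math


def deficit_angle(tension, m6):
    """Deficit angle of (4.55): 2 pi * Lambda_CC / M_6^4 (natural units assumed)."""
    return 2.0 * math.pi * tension / m6**4


def angular_factor(tension, m6):
    """Factor (1 - Delta theta / 2 pi) multiplying r^2 d theta^2 in (4.54)."""
    return 1.0 - deficit_angle(tension, m6) / (2.0 * math.pi)


def max_tension(m6):
    """Largest positive tension, reached when the deficit angle equals 2 pi."""
    return m6**4


M6 = 2.0  # hypothetical six-dimensional Planck scale
for tension in (-8.0, 0.0, 4.0, max_tension(M6)):
    print(tension, deficit_angle(tension, M6), angular_factor(tension, M6))
```

A negative tension gives a negative deficit angle (a surplus of angle) with no lower bound, while the angular factor in (4.54) reaches zero at the maximal tension.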

Figure 2: Codimension-2 brane with positive (resp. negative) tension leading to a positive (resp. negative) deficit angle in the two extra dimensions.

This interesting feature has led to many potential ways to tackle the cosmological constant by considering our Universe to live in a 3 + 1-dimensional brane embedded in two or more large extra dimensions. (See Refs. [4, 3, 408, 414, 80, 470, 458, 459, 86, 82, 247, 333, 471, 81, 426, 409, 373, 85, 460, 155] for the supersymmetric large-extra-dimension scenario as an alternative way to tackle the cosmological constant problem). Extending DGP to more than one extra dimension could thus provide a natural way to tackle the cosmological constant problem.

5.5.3.2 Spectral representation

Furthermore, in n extra dimensions the gravitational potential is diluted as \(V(r) \sim {r^{- 1 - n}}\). If the propagator has a Källén-Lehmann spectral representation with spectral density \(\rho ({\mu ^2})\), the Newtonian potential has the following spectral representation

$$V(r) = \int\nolimits_0^\infty {{{\rho ({\mu ^2}){e^{- \mu r}}} \over r}} \,{\rm{d}}{\mu ^2}\, .$$
(4.56)

In a higher-dimensional DGP scenario, the gravitational potential behaves as in higher dimensions at large distances, \(V(r) \sim {r^{- (1 + n)}}\), which implies \(\rho ({\mu ^2}) \sim {\mu ^{n - 2}}\) in the IR as depicted in Figure 1.

Working back in terms of the spectral representation of the propagator as given in (4.19), this means that the propagator goes to 1/k in the IR as μ → 0 when n = 1 (as we know from DGP), while it goes to a constant for n > 1. So for more than one extra dimension, the theory tends towards that of a hard mass graviton in the far IR, which corresponds to α → 0 in the parametrization of (4.50). Following the arguments of [216] such a theory should thus be a good candidate to tackle the cosmological constant problem.
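The link between the IR behaviour of the spectral density and the large-distance fall-off of the potential can be illustrated numerically. The sketch below inserts the pure power law ρ(μ²) = μ^(n−2) into (4.56) for all μ, which is a toy assumption (the text only fixes the IR behaviour), and compares a simple midpoint quadrature with the closed form 2Γ(n)/r^(n+1) that follows from dμ² = 2μ dμ. The function name, cutoff and step count are illustrative choices.

```python
import math


def potential(r, n, steps=20000, cutoff=60.0):
    """V(r) from (4.56) with the toy density rho(mu^2) = mu^(n - 2).

    Midpoint rule in mu on [0, cutoff / r]; the tail beyond the cutoff is
    suppressed by exp(-cutoff) and is neglected.
    """
    mu_max = cutoff / r
    d_mu = mu_max / steps
    total = 0.0
    for i in range(steps):
        mu = (i + 0.5) * d_mu
        rho = mu ** (n - 2)
        # d(mu^2) = 2 mu d(mu)
        total += rho * math.exp(-mu * r) / r * 2.0 * mu * d_mu
    return total


for n in (1, 2, 3):
    numeric = potential(1.0, n)
    closed_form = 2.0 * math.gamma(n)  # value of 2 Gamma(n) / r^(n + 1) at r = 1
    slope = math.log(potential(2.0, n) / numeric) / math.log(2.0)
    print(n, numeric, closed_form, slope)  # slope should approach -(1 + n)
```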

5.5.3.3 A brane on a brane

Both the spectral representation and the fact that codimension-two (and higher) branes can accommodate a cosmological constant while remaining flat have made the field of higher-codimension branes particularly interesting.

However, as shown in [240] and [135, 148, 132, 149], the straightforward extension of DGP to two large extra dimensions leads to ghost issues (sixth mode with the wrong sign kinetic term, see also [290, 70]) as well as divergence problems (see Refs. [256, 131, 130, 423, 422, 355, 83]).

To avoid these issues, one can consider simply applying the DGP procedure step by step and consider a 4 + 1-dimensional DGP brane embedded in six dimensions. Our Universe would then be on a 3 + 1-dimensional DGP brane embedded in the 4 + 1 one (note that we only consider one side of the brane here, which explains the factor of 2 difference compared with (4.1))

$$\begin{array}{*{20}c} {S = {{M_6^4} \over 2}\int {{{\rm{d}}^6}x{{\sqrt {- {g_6}}}^{(6)}}R + {{M_5^3} \over 2}} \int {{{\rm{d}}^5}x{{\sqrt {- {g_5}}}^{(5)}}R}} \\{+ {{M_{{\rm{Pl}}}^2} \over 2}\int {\,{{\rm{d}}^4}x{{\sqrt {- {g_4}}}^{(4)}}R +} \int {\,{{\rm{d}}^4}x{L_{{\rm{matter}}}}({g_4},\,\psi)} \, .} \end{array}$$
(4.57)

This model has two cross-over scales: \({m_5} = M_5^3/M_{{\rm{Pl}}}^2\) which characterizes the scale at which one crosses from the four-dimensional to the five-dimensional regime, and \({m_6} = M_6^4/M_5^3\) yielding the crossing from a five-dimensional to a six-dimensional behavior. Of course we could also have a simultaneous crossing if m5 = m6. In what follows we focus on the case where MPl > M5 > M6.

Performing the same linearized analysis as in Section 4.1.1 we can see that the four-dimensional theory of gravity is effectively massive with the soft mass in Fourier space

$${m^2}(k) = {{\pi {m_5}} \over 4}{{\sqrt {m_6^2 - {k^2}}} \over {{\rm{arcth}}\sqrt {{{{m_6} - k} \over {{m_6} + k}}}}}\, .$$
(4.58)

We see that the 4 + 1-dimensional brane plays the role of a regulator (a divergence occurs in the limit m5 → 0).
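To get a feel for the soft mass (4.58), it can be evaluated numerically. The sketch below makes several assumptions that are ours rather than the text's: k is taken to be a positive Euclidean momentum magnitude, "arcth" is read as the inverse hyperbolic tangent (its argument is below one for k < m6), and the branch k > m6 is obtained by the analytic continuation arctanh(ix) = i arctan(x). The function name and sample scales are hypothetical.

```python
import math


def cascading_mass_squared(k, m5, m6):
    """Soft mass m^2(k) of (4.58), continued to k > m6 (our assumption)."""
    if k <= 0 or m5 <= 0 or m6 <= 0:
        raise ValueError("expected positive k, m5 and m6")
    prefactor = math.pi * m5 / 4.0
    if k == m6:
        # Both branches tend to (m6 + k) = 2 m6 as k -> m6.
        return prefactor * 2.0 * m6
    if k < m6:
        x = math.sqrt((m6 - k) / (m6 + k))
        if x < 0.5:
            denominator = math.atanh(x)
        else:
            # Equivalent form of atanh(x) that stays finite when x rounds to 1.
            denominator = 0.5 * math.log((1.0 + x) ** 2 * (m6 + k) / (2.0 * k))
        return prefactor * math.sqrt(m6**2 - k**2) / denominator
    x = math.sqrt((k - m6) / (k + m6))
    return prefactor * math.sqrt(k**2 - m6**2) / math.atan(x)


m5, m6 = 1.0, 1.0  # hypothetical cross-over scales, arbitrary units
for k in (1e-9, 1e-3, 0.5, 1.0, 10.0, 1e6):
    m2 = cascading_mass_squared(k, m5, m6)
    print(k, m2, m2 / (m5 * k))  # last column tends to 1 for k >> m6
```

Under these assumptions, for k ≫ m6 the result approaches m5 k, the five-dimensional DGP form of the soft mass, whereas for k ≪ m6 the mass decreases only logarithmically, which is close to the hard-mass behaviour α → 0 discussed above.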

In this six-dimensional model, there are effectively two new scalar degrees of freedom (arising from the extra dimensions). We can ensure that both of them have the correct sign kinetic term by

  • Either smoothing out the brane [240, 148] (this means that one should really consider a six-dimensional curvature on both the smoothed 4 + 1 and on the 3 + 1-dimensional branes, which is something one would naturally expect; see Footnote 9).

  • Or by including some tension on the 3 + 1 brane (which is also something natural since the setup is designed to degravitate a large cosmological constant on that brane). This was shown to be ghost free in the decoupling limit in [135] and in the full theory in [150].

As already mentioned, in models with two large extra dimensions there is a maximal value of the cosmological constant that can be considered, which is related to the six-dimensional Planck scale. Since that scale is in turn related to the effective mass of the graviton and since observations set that scale to be relatively small, the model can only take care of a relatively small cosmological constant. Nevertheless, it still provides a proof of principle on how to evade Weinberg’s no-go theorem [484].

The extension of cascading gravity to more than two extra dimensions was considered in [149]. It was shown in that case how the 3 + 1 brane remains flat for arbitrary values of the cosmological constant on that brane (within the regime of validity of the weak-field approximation). See Figure 3 for a picture of how the scalar potential adapts itself along the extra dimensions to accommodate a cosmological constant on the brane.

Figure 3: Seven-dimensional cascading scenario and solution for one of the metric potentials F on the (5 + 1)-dimensional brane in a seven-dimensional cascading gravity scenario with tension on the (3 + 1)-dimensional brane located at y = z = 0, in the case where \(M_6^4/M_5^3 = M_7^5/M_6^4 = {m_7}\). y and z represent the two extra dimensions on the (5 + 1)-dimensional brane. Image reproduced with permission from [149], copyright APS.

6 Deconstruction

As for DGP and its extensions, to get some insight on how to construct a four-dimensional theory of a single massive graviton, we can start with five-dimensional general relativity. This time, we consider the extra dimension to be compactified and of finite size R, with periodic boundary conditions. It is then natural to perform a Kaluza-Klein decomposition and to obtain a tower of Kaluza-Klein graviton modes in four dimensions. The zero mode is then massless and the higher modes are all massive with mass separation m = 1/R. Since the graviton mass is constant in this formalism we omit the subscript 0 in the rest of this review.

Rather than starting directly with a Kaluza-Klein decomposition (discretization in Fourier space), we perform instead a discretization in real space, known as “deconstruction” of five-dimensional gravity [24, 25, 170, 168, 28, 443, 340]. The deconstruction framework helps make the connection with massive gravity more explicit. However, we can also obtain multi-gravity out of it which is then completely equivalent to the Kaluza-Klein decomposition (after a non-linear field redefinition).

The idea behind deconstruction is simply to ‘replace’ the continuous fifth dimension y by a series of N sites yj separated by a distance ℓ = R/N, so that the five-dimensional metric is replaced by a set of interacting metrics depending only on x.

In what follows, we review the procedure derived in [152] to recover four-dimensional ghost-free massive gravity as well as bi- and multi-gravity out of five-dimensional GR. The procedure works in any dimension and we only focus on deconstructing five-dimensional GR for the sake of concreteness.

6.1 Formalism

6.1.1 Metric versus Einstein-Cartan formulation of GR

Before going further, let us first describe five-dimensional general relativity in its Einstein-Cartan formulation, where we introduce a set of vielbein \(e_A^{\rm{a}}\), so that the relation between the metric and the vielbein is simply,

$${g_{AB}}(x,y) = e_A^a(x,y)e_B^b(x,y){\eta _{ab}}\, ,$$
(5.1)

where, as mentioned previously, the capital Latin letters label five-dimensional spacetime indices, while the letters a, b, c, … label five-dimensional Lorentz indices.

Under the torsionless condition, de+ωe = 0, the antisymmetric spin connection ω, is uniquely determined in terms of the vielbeins

$$\omega _A^{ab} = {1 \over 2}e_A^c(O_{\, \, \, \, c}^{ab} - O_c^{\, \, ab} - O_{\, \, c}^{b\, \, a})\, ,$$
(5.2)

with \({O^{ab}}_c = 2{e^{aA}}{e^{bB}}{\partial _{[A}}{e_{B]c}}\). In the Einstein-Cartan formulation of GR, we introduce a 2-form Riemann curvature,

$${{\mathcal R}^{ab}} = {\rm{d}}{\omega ^{ab}} + \omega _{\, \, c}^a \wedge {\omega ^{cb}}\, ,$$
(5.3)

and up to boundary terms, the Einstein-Hilbert action is then given in the respective metric and the vielbein languages by (here in five dimensions for definiteness),

$$S_{{\rm{EH}}}^{(5)} = {{M_5^3} \over 2}\int \, {{\rm{d}}^4}x\, {\rm{d}}y\sqrt {- g} \, {R^{(5)}}[g]$$
(5.4)
$$={{M_5^3} \over {2\times 3!}}\int\varepsilon_{abcde} \, \mathcal{R}^{ab}\wedge e^{c} \wedge e^{d} \wedge e^{e}\,,$$
(5.5)

where R(5)[g ] is the scalar curvature built out of the five-dimensional metric gμν and M5 is the five-dimensional Planck scale.

The counting of the degrees of freedom in both languages is of course equivalent and goes as follows: In d spacetime dimensions, the metric has d(d + 1)/2 independent components. Covariance removes 2d of them (Footnote 10), which leads to \({{\mathcal N}_d} = d(d - 3)/2\) independent degrees of freedom. In four dimensions, we recover the usual \({{\mathcal N}_4} = 2\) independent polarizations for gravitational waves. In five dimensions, this leads to \({{\mathcal N}_5} = 5\) degrees of freedom, which is the same number of degrees of freedom as a massive spin-2 field in four dimensions. This is as expected from the Kaluza-Klein philosophy (massless bosons in d + 1 dimensions have the same number of degrees of freedom as massive bosons in d dimensions; this counting does not directly apply to fermions).

In the Einstein-Cartan formulation, the counting goes as follows: The vielbein has \({d^2}\) independent components. Covariance removes 2d of them, and the additional global Lorentz invariance removes an additional d(d − 1)/2, leading once again to a total of \({{\mathcal N}_d} = d(d - 3)/2\) independent degrees of freedom.
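The two countings above are simple enough to be tabulated directly. The following sketch (with illustrative function names) confirms that the metric and vielbein countings agree with d(d − 3)/2 for a range of dimensions.

```python
def dof_metric(d):
    """Metric counting: d(d + 1)/2 components minus 2d removed by covariance."""
    components = d * (d + 1) // 2
    return components - 2 * d


def dof_vielbein(d):
    """Vielbein counting: d^2 components minus 2d (covariance)
    minus d(d - 1)/2 (Lorentz invariance)."""
    components = d * d
    lorentz = d * (d - 1) // 2
    return components - 2 * d - lorentz


for d in range(3, 8):
    print(d, dof_metric(d), dof_vielbein(d), d * (d - 3) // 2)
```

In particular the d = 5 entry reproduces the five degrees of freedom of a massive spin-2 field in four dimensions.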

In GR one usually considers the metric and the vielbein formulation as being fully equivalent. However, this perspective is true only in the bosonic sector. The limitations of the metric formulation become manifest when coupling gravity to fermions. For such couplings one requires the vielbein formulation of GR. For instance, in four spacetime dimensions, the covariant action for a Dirac fermion ψ at the quadratic order is given by (see Ref. [392]),

$$S_{\rm{Dirac}}=\int {{1} \over {3!}}\varepsilon_{abcd}\ e^a\wedge e^b \wedge e^c\ \left[ {{i} \over {2}}\bar \psi \gamma^d\, \overleftrightarrow{D}\, \psi-{{m} \over {4}}e^d \bar \psi \psi \right]\,,$$
(5.6)

where the γa’s are the Dirac matrices and D represents the covariant derivative, \(D\psi = d\psi - {1 \over 8}{\omega ^{ab}}[{\gamma _a},{\gamma _b}]\psi\).

In the bosonic sector, one can convert the covariant action of bosonic fields (e.g., scalar fields, vector fields, etc.) between the vielbein and the metric language without much confusion; however, this is not possible for the covariant Dirac action, or for other half-integer-spin fields. For these types of matter fields, the Einstein-Cartan formulation of GR is more fundamental than its metric formulation. When in doubt, one should always start with the vielbein formulation. This is especially important in the case of deconstruction when a discretization in the metric language is not equivalent to a discretization in the vielbein variables. The same holds for Kaluza-Klein decomposition, a point which might have been under-appreciated in the past.

6.1.2 Gauge-fixing

The discretization process breaks covariance and so before starting this procedure it is wise to fix the gauge (failure to do so leads to spurious degrees of freedom which then become ghosts in the four-dimensional description). We thus start in five spacetime dimensions by setting the gauge

$${G_{AB}}(x,y)\, {\rm{d}}{x^A}\, {\rm{d}}{x^B} = {\rm{d}}{y^2} + {g_{\mu \nu}}(x,y)\, {\rm{d}}{x^\mu}\, {\rm{d}}{x^\nu}\, ,$$
(5.7)

meaning that the lapse is set to unity and the shift to zero. Notice that one could in principle only set the lapse to unity and keep the shift present throughout the discretization. From a four-dimensional point of view, the shift will then ‘morally’ play the role of the Stückelberg fields; however, they do so only after a cumbersome field redefinition. So for the sake of clarity and simplicity, in what follows we first gauge-fix the shift and then, once the four-dimensional theory is obtained, restore gauge invariance by use of the Stückelberg trick presented previously.

In vielbein language, we fix the five-dimensional coordinate system and use four Lorentz transformations to set

$${e^a} = \left(\begin{array}{*{20}c} {e_\mu ^a\, {\rm{d}}{x^\mu}} \\ {{\rm{d}}y} \\ \end{array} \right)\, ,$$
(5.8)

and use the remaining six Lorentz transformations to set

$$\omega _y^{ab} = {e^{\mu [a}}{\partial _y}e_\mu ^{b]} = 0\, .$$
(5.9)

In this gauge, the five-dimensional Einstein-Hilbert term (5.4), (5.5) is given by

$$S_{{\rm{EH}}}^{(5)} = {{M_5^3} \over 2}\int \, {{\rm{d}}^4}x\, {\rm{d}}y\sqrt {- g} \, \left({R[g] + {{[K]}^2} - [{K^2}]} \right)$$
(5.10)
$$\begin{array}{*{20}c} {= {{M_5^3} \over 4}\int \left({{\varepsilon _{abcd}}\, {R^{ab}} \wedge {e^c} \wedge {e^d} - {K^a} \wedge {K^b} \wedge {e^c} \wedge {e^d}} \right.} \\{\left. {+ 2{K^a} \wedge {\partial _y}{e^b} \wedge {e^c} \wedge {e^d}} \right) \wedge {\rm{d}}y\, ,\quad \quad \quad \quad \quad} \end{array}$$
(5.11)

where R[g] is the four-dimensional curvature built out of the four-dimensional metric gμν, Rab is the 2-form curvature built out of the four-dimensional vielbein \(e_\mu ^a\) and its associated connection \({\omega ^{ab}} = \omega _\mu ^{ab}d{x^\mu},\,{R^{ab}} = d{\omega ^{ab}} + {\omega ^a}_c\wedge{\omega ^{cb}}\), and \(K_{\,\,\,\nu}^\mu = {g^{\mu \alpha}}{K_{\alpha \nu}}\) is the extrinsic curvature,

$${K_{\mu \nu}} = {1 \over 2}{\partial _y}{g_{\mu \nu}} = e_{(\mu}^a{\partial _y}e_{\nu)}^b\, {\eta _{ab}}$$
(5.12)
$$K_\mu ^a = {e^{\nu a}}{K_{\mu \nu}}\, .$$
(5.13)

6.1.3 Discretization in the vielbein

One could in principle go ahead and perform the discretization directly at the level of the metric, but this would not lead to a consistent truncated theory of massive gravity (Footnote 11). As explained previously, the vielbein is more fundamental than the metric itself, and in what follows we discretize the theory keeping the vielbein as the fundamental object:

$$y \hookrightarrow y_j$$
(5.14)
$$e^a_\mu(x,y) \hookrightarrow {e_j}_\mu ^a(x)=e^a_\mu(x,y_j)$$
(5.15)
$$\partial_y e^a_\mu(x,y) \hookrightarrow m_N\left({e_{j+1}}_\mu ^a-{e_j}_\mu ^a\right).$$
(5.16)
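The replacement rules (5.14)–(5.16) amount to sampling the fields on the sites and trading the y-derivative for a forward difference. The sketch below applies them to a single hypothetical periodic profile standing in for one vielbein component at fixed x. It assumes that m_N is the inverse site spacing N/R and uses periodic wrap-around at the last site, in line with the periodic boundary conditions of the compact dimension; the profile and all names are illustrative only.

```python
import math

RADIUS = 1.0  # size R of the compact dimension (arbitrary units, assumed)


def profile(y):
    """Hypothetical periodic stand-in for one vielbein component at fixed x."""
    return 1.0 + 0.1 * math.cos(2.0 * math.pi * y / RADIUS)


def profile_dy(y):
    """Exact y-derivative of the toy profile, used only for comparison."""
    return -0.1 * (2.0 * math.pi / RADIUS) * math.sin(2.0 * math.pi * y / RADIUS)


def deconstruct(n_sites):
    """Rule (5.15): sample the profile on the sites y_j = j R / N."""
    spacing = RADIUS / n_sites
    return [profile(j * spacing) for j in range(n_sites)]


def discrete_dy(site_values):
    """Rule (5.16): m_N (e_{j+1} - e_j), with periodic wrap-around."""
    n_sites = len(site_values)
    m_n = n_sites / RADIUS  # truncation scale, assumed equal to the inverse spacing
    return [m_n * (site_values[(j + 1) % n_sites] - site_values[j])
            for j in range(n_sites)]


def max_error(n_sites):
    """Largest deviation of the discrete derivative from the continuum one."""
    spacing = RADIUS / n_sites
    approx = discrete_dy(deconstruct(n_sites))
    return max(abs(approx[j] - profile_dy(j * spacing)) for j in range(n_sites))


for n_sites in (10, 100, 1000):
    print(n_sites, max_error(n_sites))
```

The error shrinks roughly like 1/N, so the continuum derivative is only recovered for a large number of sites; for the two-site case relevant to massive gravity the difference is a genuinely new interaction rather than an approximation.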

The gauge choice (5.9) then implies

$$\omega^{ab}_y=e^{\mu [ a}\partial_y e^{b]}_\mu =0 \quad \hookrightarrow \quad {e_{j + 1}}^{\mu [a}{e_j}_\mu ^{b]} = 0\,,$$
(5.17)

where the arrow ↪ represents the deconstruction of five-dimensional gravity. We have also introduced the ‘truncation scale’, \({m_N} = Nm = {\ell ^{- 1}} = N{R^{- 1}}\), i.e., the scale of the highest mode in the discretized theory. After discretization, we see the Deser-van Nieuwenhuizen [187] condition appearing in Eq. (5.17), which corresponds to the symmetric vielbein condition. This is a sufficient condition to allow for a formulation back into the metric language [410, 314, 172]. Note, however, that as mentioned in [152], we have not assumed that this symmetric vielbein condition was true, we simply derived it from the discretization procedure in the five-dimensional gauge choice \(\omega _y^{ab} = 0\). In terms of the extrinsic curvature, this implies

$$K^a_\mu \hookrightarrow m_N\left(e_{j+1}{}^a_\mu-e_{j}{}^a_\mu\right)\,.$$
(5.18)

This can be written back in the metric language as follows

$$g_{\mu\nu}(x,y) \hookrightarrow g_{j, \mu\nu}(x)=g_{\mu\nu}(x,y_j)$$
(5.19)
$$K^{\mu}_{\nu} \hookrightarrow -m_N \mathcal{K}^{\mu}_{\nu}[g_{j},\, g_{j+1}]\equiv-m_N\left(\delta^{\mu}_{\nu} -\left(\sqrt{g_{j}^{-1}\, g_{j+1}}\right)^{\mu}_{\nu}\right)\,,$$
(5.20)

where the square root in the extrinsic curvature appears after converting back to the metric language. The square root exists as long as the metrics gj and gj+1 have the same signature and \(g_j^{- 1}{g_{j + 1}}\) has positive eigenvalues, so if both metrics were diagonal the ‘time’ direction associated with each metric would be the same, which is a meaningful requirement.

From the metric language, we thus see that the discretization procedure amounts to converting the extrinsic curvature to an interaction between neighboring sites through the building block \({\mathcal K}_\nu ^\mu [{g_j},{g_{j + 1}}]\).
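The appearance of the square root can be checked on a toy example with exact arithmetic. In the sketch below the vielbein on site j is flat and the one on site j+1 is a hand-picked constant matrix S, stored as E[a][mu], chosen so that ηS is symmetric (our way of imposing the symmetric vielbein condition in this simple setting) and with positive eigenvalues. One then verifies that \(e_j^{-1}e_{j+1}\) squares to \(g_j^{-1}g_{j+1}\), so that the discretized extrinsic curvature (5.18) equals \(-m_N{\mathcal K}\) as in (5.20). All matrices, the value of m_N and the helper names are assumptions made for illustration.

```python
from fractions import Fraction


def mat(rows):
    return [[Fraction(v) for v in row] for row in rows]


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]


def transpose(a):
    return [[a[j][i] for j in range(4)] for i in range(4)]


def sub(a, b):
    return [[a[i][j] - b[i][j] for j in range(4)] for i in range(4)]


def scale(c, a):
    return [[c * a[i][j] for j in range(4)] for i in range(4)]


ETA = mat([[-1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]])
IDENTITY = mat([[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]])

# Hypothetical vielbeins: site j is flat, site j+1 is a constant deformation S.
e_j = IDENTITY
e_j1 = mat([
    [Fraction(3, 2), Fraction(1, 10), 0, 0],
    [Fraction(-1, 10), 1, Fraction(1, 10), 0],
    [0, Fraction(1, 10), 1, 0],
    [0, 0, 0, 1],
])


def metric(e):
    """g_{mu nu} = e^a_mu eta_{ab} e^b_nu."""
    return matmul(transpose(e), matmul(ETA, e))


g_j, g_j1 = metric(e_j), metric(e_j1)
ratio = matmul(ETA, g_j1)   # g_j^{-1} g_{j+1}; here g_j = eta is its own inverse
root = e_j1                 # e_j^{-1} e_{j+1}, since e_j is the identity

eta_s = matmul(ETA, e_j1)
print("symmetric vielbein condition:", eta_s == transpose(eta_s))
print("root squared equals g_j^{-1} g_{j+1}:", matmul(root, root) == ratio)

m_n = Fraction(4)                           # hypothetical truncation scale
k_discrete = scale(m_n, sub(e_j1, e_j))     # right-hand side of (5.18)
curly_k = sub(IDENTITY, root)               # building block of (5.20)
print("K = -m_N curly K:", k_discrete == scale(-m_n, curly_k))
```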

6.2 Ghost-free massive gravity

6.2.1 Simplest discretization

In this subsection we focus on deriving a consistent theory of massive gravity from the discretization procedure (5.19, 5.20). For this, we consider a discretization with only two sites j = 1, 2 and will only be considered in the four-dimensional action induced on one site (say site 1), rather than the sum of both sites. This picture is analogous in spirit to a braneworld picture where we induce the action at one point along the extra dimension. This picture gives the theory of a unique dynamical metric, expressed in terms of a reference metric which corresponds to the fixed metric on the other site. We emphasize that this picture corresponds to a trick to build a consistent theory of massive gravity, and would otherwise be more artificial than its multi-gravity extension. However, as we shall see later, massive gravity can be seen as a perfectly consistent limit of multi (or bi-)gravity where the massless spin-2 field (and other fields in the multi-case) decouple and is thus perfectly acceptable.

To simplify the notation for this two-site case, we write the vielbein on both sites as \(e_1 = e\), \(e_2 = f\), and similarly for the metrics \(g_{1,\mu\nu} = g_{\mu\nu}\) and \(g_{2,\mu\nu} = f_{\mu\nu}\). Out of the five-dimensional action for GR, we obtain the theory of massive gravity in four dimensions (on site 1),

$$S^{(5)}_{\rm{EH}} \hookrightarrow S^{(4)}_{\rm{mGR}}\,,$$
(5.21)

with

$$S_{{\rm{mGR}}}^{(4)} = {{M_{{\rm{Pl}}}^2} \over 2}\int {{{\rm{d}}^4}} x\sqrt {- g} \, \left({R[g] + {m^2}\left({{{[{\mathcal K}]}^2} - [{{\mathcal K}^2}]} \right)} \right)$$
(5.22)
$$= {{M_{{\rm{Pl}}}^2} \over 4}\int \, {\varepsilon _{abcd}}\left({{R^{ab}} \wedge {e^c} \wedge {e^d} + {m^2}{{\mathcal A}^{abcd}}(e,f)} \right)\, ,$$
(5.23)

with the mass term in the vielbein language

$${{\mathcal A}^{abcd}}(e,f) = ({f^a} - {e^a}) \wedge ({f^b} - {e^b}) \wedge {e^c} \wedge {e^d}\, ,$$
(5.24)

or the mass term building block in the metric language,

$${\mathcal K}_\nu ^\mu = \delta _\nu ^\mu - \left({\sqrt {{g^{- 1}}f}} \right)_\nu ^\mu \, .$$
(5.25)

Here we introduced the four-dimensional Planck scale, \(M_{{\rm{Pl}}}^2 = M_5^3\int {dy}\), where in this case we limit the integral about one site.

The theory of massive gravity (5.22), or equivalently (5.23), is one special example of a ghost-free theory of massive gravity (i.e., one for which the BD ghost is absent). In terms of the ‘Stückelbergized’ tensor \({\mathbb X}\) introduced in Eq. (2.76), we see that

$${\mathcal K}_\nu ^\mu = \delta _\nu ^\mu - \left({\sqrt {\mathbb X}} \right)_\nu ^\mu \, ,$$
(5.26)

or in other words,

$${\mathbb X}_\nu ^\mu = \delta _\nu ^\mu - 2{\mathcal K}_\nu ^\mu + {\mathcal K}_\alpha ^\mu {\mathcal K}_\nu ^\alpha \, ,$$
(5.27)

and the mass term can be written as

$${{\mathcal L}_{{\rm{mass}}}} = - {{{m^2}M_{{\rm{Pl}}}^2} \over 2}\sqrt {- g} \left({[{{\mathcal K}^2}] - {{[{\mathcal K}]}^2}} \right)$$
(5.28)
$$=-{{m^{2}M_{\rm{Pl}}^2} \over {2}}\sqrt{-g} \left([(\mathbb{I}-\sqrt{\mathbb{X}})^2]-[\mathbb{I}-\sqrt{\mathbb{X}}]^2\right)\,.$$
(5.29)

This is also a generalization of the Fierz-Pauli mass term, albeit more complicated at first sight than the ones considered in (2.83) or (2.84), but, as we shall see, one which remains free of the BD ghost, as is proven in depth in Section 7. We emphasize that the idea of the approach is not to give a proof of the absence of ghost (which is provided later) but rather to provide an intuitive argument for why the mass term takes its very peculiar structure.

6.2.2 Generalized mass term

This mass term is not the unique acceptable generalization of Fierz-Pauli gravity, and by considering more general discretization procedures we can generate the entire two-parameter family of acceptable potentials for gravity, which will also be shown to be ghost-free in Section 7.

Rather than considering the straightforward discretization \(e(x,y) \hookrightarrow e_j(x)\), we could consider the average value on one site, weighted with an arbitrary weight r,

$$e(x,y)\hookrightarrow r e_j+(1-r)e_{j+1}\,.$$
(5.30)

The mass term at one site is then generalized to

$$K^a\wedge K^b \wedge e^c \wedge e^d \hookrightarrow m^2 \mathcal{A}^{abcd}_{r,s}(e_j,e_{j+1})\,,$$
(5.31)

and the most general action for massive gravity with reference vielbein is thus (see Footnote 12)

$${S_{{\rm{mGR}}}} = {{M_{{\rm{Pl}}}^2} \over 4}\int \, {\varepsilon _{abcd}}\left({{R^{ab}} \wedge {e^c} \wedge {e^d} + {m^2}{\mathcal A}_{r,s}^{abcd}(e,\, f)} \right)\, ,$$
(5.32)

with

$${\mathcal A}_{r,s}^{abcd}(e,\, f) = ({f^a} - {e^a}) \wedge ({f^b} - {e^b}) \wedge ((1 - r){e^c} + r{f^c}) \wedge ((1 - s){e^d} + s{f^d})\, ,$$

for any r, s ∈ ℝ.

In particular for the two-site case, this generates the two-parameter family of mass terms

$$\begin{array}{*{20}c} {{\mathcal A}_{r,s}^{abcd}(e,f) = {c_0}\, {e^a} \wedge {e^b} \wedge {e^c} \wedge {e^d} + {c_1}\, {e^a} \wedge {e^b} \wedge {e^c} \wedge {f^d}\quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad} \\{+ {c_2}\,{e^a} \wedge {e^b} \wedge {f^c} \wedge {f^d} + {c_3}\, {e^a} \wedge {f^b} \wedge {f^c} \wedge {f^d} + {c_4} {f^a} \wedge {f^b} \wedge {f^c} \wedge {f^d}} \end{array}$$
(5.33)
$$\equiv {{\mathcal A}_{1 - r,1 - s}}(f,e)\, ,$$
(5.34)

with \(c_0 = (1 - s)(1 - r)\), \(c_1 = (-2 + 3s + 3r - 4rs)\), \(c_2 = (1 - 3s - 3r + 6rs)\), \(c_3 = (r + s - 4rs)\) and \(c_4 = rs\). This corresponds to the most general potential which, by construction, includes neither a cosmological constant nor a tadpole. One can also always include a cosmological constant for such models, which would naturally arise from a cosmological constant in the five-dimensional picture.
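The coefficients quoted above can be cross-checked with a short sketch. It assumes that, once contracted with \(\varepsilon_{abcd}\), the wedge products of the one-forms e and f can be treated as commuting variables, so that \({\mathcal A}_{r,s}\) reduces to an ordinary quartic polynomial. The helper names are hypothetical.

```python
from fractions import Fraction


def poly_mul(p, q):
    """Multiply two polynomials given as coefficient lists (index = power of f)."""
    out = [Fraction(0)] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out


def expanded_coefficients(r, s):
    """Expand (f - e)^2 ((1-r) e + r f) ((1-s) e + s f).

    Entry k of the result multiplies e^(4-k) f^k, i.e. it plays the role of c_k.
    """
    r, s = Fraction(r), Fraction(s)
    diff = [Fraction(-1), Fraction(1)]  # f - e
    return poly_mul(poly_mul(diff, diff), poly_mul([1 - r, r], [1 - s, s]))


def quoted_coefficients(r, s):
    """The closed forms c_0, ..., c_4 as quoted in the text."""
    r, s = Fraction(r), Fraction(s)
    return [
        (1 - s) * (1 - r),
        -2 + 3 * s + 3 * r - 4 * r * s,
        1 - 3 * s - 3 * r + 6 * r * s,
        r + s - 4 * r * s,
        r * s,
    ]
```

For r = s = 0 the expansion reduces to \((f-e)^2 e^2\), the simplest discretization. The coefficients always sum to zero, since the potential vanishes when e = f, and exchanging \(e \leftrightarrow f\) together with \((r,s) \to (1-r,1-s)\) reverses their order, in line with (5.34).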

We see that in the vielbein language, the expression for the mass term is extremely natural and simple. In fact this form was guessed at already for special cases in Ref. [410] and even earlier in [502]. However, the crucial analysis on the absence of ghosts and the reason for these terms was incorrect in both of these presentations. Subsequently, after the development of the consistent metric formulation, the generic form of the mass terms was given in Refs. [95] (see Footnote 13) and [314].

In the metric language, this corresponds to the following Lagrangian for dRGT massive gravity [144], or its generalization to arbitrary reference metric [296]

$${{\mathcal L}_{{\rm{mGR}}}} = {{M_{{\rm{Pl}}}^2} \over 2}\int {{{\rm{d}}^4}} x\sqrt {- g} \left({R + {{{m^2}} \over 2}\left({{{\mathcal L}_2}[{\mathcal K}] + {\alpha _3}{{\mathcal L}_3}[{\mathcal K}] + {\alpha _4}{{\mathcal L}_4}[{\mathcal K}]} \right)} \right)\, ,$$
(5.35)

where the two parameters are related to the two discretization parameters r, s as

$${\alpha _3} = r + s,\quad {\rm{and}}\quad {\alpha _4} = rs\, ,$$
(5.36)

and for any tensor Q, we define the scalar \({\mathcal L}_n\) symbolically as

$${{\mathcal L}_n}[Q] = \varepsilon \varepsilon {Q^n}\, ,$$
(5.37)

for any n = 0, ⋯, d, where d is the number of spacetime dimensions. ε is the Levi-Civita antisymmetric symbol, so for instance in four dimensions, \({{\mathcal L}_2}[Q] = {\varepsilon ^{\mu \nu \alpha \beta}}{\varepsilon _{\mu \prime \nu \prime \alpha \beta}}Q_\mu ^{\mu \prime}Q_\nu ^{\nu \prime} = 2!({[Q]^2} - [{Q^2}])\), so we recover the mass term expressed in (5.28). Their explicit form is given in what follows in the relations (6.11)-(6.13) or (6.16)-(6.18).

This procedure is easily generalizable to any number of dimensions, and massive gravity in d dimensions has (d − 2) free parameters which are related to the (d − 2) discretization parameters.

6.3 Multi-gravity

In Section 5.2, we showed how to obtain massive gravity from considering the five-dimensional Einstein-Hilbert action on one site (see Footnote 14). Instead, in this section we integrate over the whole of the extra dimension, which corresponds to summing over all the sites after discretization. Following the procedure of [152], we consider N = 2M + 1 sites to start, which leads to multi-gravity [314], and then focus on the two-site case leading to bi-gravity [293].

Starting with the five-dimensional action (5.12) and applying the discretization procedure (5.31) with \({\mathcal A}_{r,\,s}^{abcd}\) given in (5.33), we get

$$\begin{array}{*{20}c} {{S_{N\,{\rm{mGR}}}} = {{M_4^2} \over 4}\sum\limits_{j = 1}^N {\int}\, {\varepsilon _{abcd}}\left({{R^{ab}}[{e_j}] \wedge e_j^c \wedge e_j^d + m_N^2{\mathcal A}_{{r_j},{s_j}}^{abcd}({e_j},\,{e_{j + 1}})} \right)\quad \quad \quad} \\ {= {{M_4^2} \over 2}\sum\limits_{j = 1}^N {\int {{{\rm{d}}^4}}} x\sqrt {- {g_j}} \left({R[{g_j}] + {{m_N^2} \over 2}\sum\limits_{n = 0}^4 {\alpha _n^{(j)}} {{\mathcal L}_n}({{\mathcal K}_{j,j + 1}})} \right)\, ,} \end{array}$$
(5.38)

with \(M_4^2 = M_5^3R = M_5^3/m\), \(\alpha _2^{(j)} = - 1/2\), and in this deconstruction framework we obtain no cosmological constant nor tadpole, \(\alpha _0^{(j)} = \alpha _1^{(j)} = 0\) at any site j (but we keep them for generality). In the mass Lagrangian, we use the shorthand notation \({\mathcal K}_{j,j+1}\) for the tensor \({\mathcal K}_{\,\,\,\,\,\nu}^\mu [{g_j},{g_{j + 1}}]\). This is a special case of the multi-gravity presented in [314] (see also [417] for other ‘topologies’ in the way the multiple gravitons interact), where each metric only interacts with two other metrics, i.e., with its closest neighbors, leading to 2N free parameters. For any fixed j, one has \(\alpha _3^{(j)} = ({r_j} + {s_j})\), and \(\alpha _4^{(j)} = {r_j}{s_j}\).

To see the mass spectrum of this multi-gravity theory, we perform a Fourier decomposition, which is what one would obtain (after a field redefinition) by performing a KK decomposition rather than a real space discretization. KK decomposition and deconstruction are thus perfectly equivalent (after a non-linear, but benign, field redefinition; see Footnote 15). We define the discrete Fourier transform of the vielbein variables,

$$\tilde e_{\mu ,n}^a = {1 \over {\sqrt N}}\sum\limits_{j = 1}^N {e_{\mu ,j}^a} {e^{i{{2\pi} \over N}nj}}\, ,$$
(5.39)

with the inverse map,

$$e_{\mu ,j}^a = {1 \over {\sqrt N}}\sum\limits_{n = - M}^M {\tilde e_{\mu ,n}^a} {e^{- i{{2\pi} \over N}nj}}\, .$$
(5.40)

In terms of the Fourier transform variables, the multi-gravity action then reads at the linear level

$${\mathcal L} = \sum\limits_{n = - M}^M {\left[ {(\partial {{\tilde h}_n})(\partial {{\tilde h}_{- n}}) + m_n^2{{\tilde h}_n}{{\tilde h}_{- n}}} \right]} + {{\mathcal L}_{{\rm{int}}}}$$
(5.41)

with \(M_{{\rm{Pl}}}^{- 1}{{\tilde h}_{\mu \nu,n}} = {{\tilde e}^a}_{\mu, n}\tilde e_{\nu,n}^b{\eta _{ab}} - {\eta _{\mu \nu}}\), where \(M_{\rm Pl}\) represents the four-dimensional Planck scale, \({M_{{\rm{Pl}}}} = {M_4}\sqrt N\). The reality condition on the vielbein imposes \(\tilde e_n = \tilde e^*_{-n}\) and similarly for \({\tilde h_n}\). The mass spectrum is then

$${m_n} = {m_N}\sin \left({{n \over N}} \right) \approx nm\quad {\rm{for}}\quad n \ll N.$$
(5.42)

The counting of the degrees of freedom in multi-gravity goes as follows: the theory contains 2M massive spin-2 fields with five degrees of freedom each and one massless spin-2 field with two degrees of freedom, corresponding to a total of 10M + 2 degrees of freedom. In the continuum limit, we also need to account for the zero mode of the lapse and the shift which have been gauge fixed in five dimensions (see Ref. [443] for a nice discussion of this point). This leads to three additional degrees of freedom, summing up to a total of 5N degrees of freedom, each a function of the four coordinates \(x^\mu\).
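The Fourier decomposition and the counting above can be illustrated with a small sketch acting on a single real component per site. It assumes the standard discrete Fourier phase \(e^{2\pi i\, n j/N}\) with N = 2M + 1 sites; the function names are hypothetical.

```python
import cmath


def fourier_modes(site_values):
    """Discrete Fourier modes n = -M..M of N = 2M + 1 real site values."""
    n_sites = len(site_values)
    m_max = (n_sites - 1) // 2
    modes = {}
    for n in range(-m_max, m_max + 1):
        total = sum(
            value * cmath.exp(2j * cmath.pi * n * site / n_sites)
            for site, value in enumerate(site_values, start=1)
        )
        modes[n] = total / n_sites ** 0.5
    return modes


def site_values_from_modes(modes, n_sites):
    """Inverse map: rebuild the values at sites 1..N from the modes."""
    values = []
    for site in range(1, n_sites + 1):
        total = sum(
            mode * cmath.exp(-2j * cmath.pi * n * site / n_sites)
            for n, mode in modes.items()
        )
        values.append(total / n_sites ** 0.5)
    return values


def count_degrees_of_freedom(m_max):
    """2M massive spin-2 fields, one massless field, three zero modes."""
    massive = 5 * 2 * m_max
    massless = 2
    zero_modes = 3
    return massive + massless + zero_modes
```

For real site values the modes come out in complex-conjugate pairs, which is the reality condition stated above, and the total count equals 5(2M + 1) = 5N for any M.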

6.4 Bi-gravity

Let us end this section with the special case of bi-gravity. Bi-gravity can also be derived from the deconstruction paradigm, just as massive gravity and multi-gravity, but the idea has been investigated for many years (see for instance [436, 324]). Like massive gravity, bi-gravity was for a long time thought to host a BD ghost parasite, but a ghost-free realization was recently proposed by Hassan and Rosen [293], and bi-gravity is thus experiencing a revival of interest. This extension is nothing other than the ghost-free massive gravity Lagrangian for a dynamical reference metric with the addition of an Einstein-Hilbert term for the now dynamical reference metric.

6.4.1 Bi-gravity from deconstruction

Let us consider a two-site discretization with periodic boundary conditions, j = 1, 2, 3, with quantities at the site j = 3 being identified with those at the site j = 1. As in Section 5.2, we denote by \({g_{\mu \nu}} = e_\mu ^ae_\nu ^b{\eta _{ab}}\) and by \({f_{\mu \nu}} = f_\mu ^af_\nu ^b{\eta _{ab}}\) the metrics and vielbeins at the respective locations \(y_1\) and \(y_2\).

Then applying the discretization procedure highlighted in Eqs. (5.14, 5.15, 5.18, 5.19 and 5.20) and summing over the extra dimension, we obtain the bi-gravity action

$$\begin{array}{*{20}c} S_{\rm{bi-gravity}}={{M_{\rm{Pl}}^2}\over {2}} \int {\rm{d}}^4x \sqrt{-g} R[g]+ {{M_f^2}\over {2}} \int {\rm{d}}^4x \sqrt{-f} R[f]\\ + {{M_{\rm{Pl}}^2} {m^{2}}\over{4}} \int {\rm{d}}^4 x \sqrt{-g} \sum\nolimits_{n=0}^4\alpha_n \mathcal{L}_n[\mathcal{K}[g,f]]\,, \end{array}$$
(5.43)

where \({\mathcal K}[g,f]\) is given in (5.25) and we use the notation \(M_g = M_{\rm Pl}\). We can equally well write the mass terms in terms of \({\mathcal K}[f,g]\) rather than \({\mathcal K}[g,f]\), as performed in (6.21).

Notice that the most naive discretization procedure would lead to \(M_g = M_{\rm Pl} = M_f\), but these can be generalized either ‘by hand’ by changing the weight of each site during the discretization, or by considering a non-trivial configuration along the extra dimension (for instance warping along the extra dimension; see Footnote 16), or most simply by performing a conformal rescaling of the metric at each site.

Here, \({{\mathcal L}_0}[{\mathcal K}[g,f]]\) corresponds to a cosmological constant for the metric \(g_{\mu\nu}\), and the special combination \(\sum\nolimits_{n = 0}^4 {{{(- 1)}^n}C_4^n{{\mathcal L}_n}[{\mathcal K}[g,f]]}\), where the \(C_n^m\) are the binomial coefficients, is the cosmological constant for the metric \(f_{\mu\nu}\), so only \({\mathcal L}_{2,3,4}\) correspond to genuine interactions between the two metrics.
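The statement about the special combination can be checked on a diagonal \({\mathcal K}\). The sketch below assumes that \({\mathcal L}_n[Q] = n!\,(4-n)!\,e_n\), with \(e_n\) the n-th elementary symmetric polynomial of the eigenvalues of Q, which is consistent with \({\mathcal L}_0 = 4!\), \({\mathcal L}_1 = 3![Q]\) and \({\mathcal L}_2 = 2!([Q]^2-[Q^2])\). With this assumption the combination equals \(4!\det(\mathbb{I}-{\mathcal K}) = 4!\det\sqrt{g^{-1}f}\), so that multiplying by \(\sqrt{-g}\) gives \(4!\sqrt{-f}\). Function names are hypothetical.

```python
from itertools import combinations
from math import comb, factorial, prod


def elementary_symmetric(values, n):
    """n-th elementary symmetric polynomial of the given values."""
    return sum(prod(subset) for subset in combinations(values, n))


def script_l(n, eigenvalues):
    """L_n[Q] in four dimensions, written in terms of the eigenvalues of Q."""
    return factorial(n) * factorial(4 - n) * elementary_symmetric(eigenvalues, n)


def f_cosmological_constant(k_eigenvalues):
    """The combination sum_n (-1)^n C(4, n) L_n[K] for a diagonal K."""
    return sum(
        (-1) ** n * comb(4, n) * script_l(n, k_eigenvalues) for n in range(5)
    )
```

Since the result depends on \({\mathcal K}\) only through \(\det(\mathbb{I} - {\mathcal K})\), it carries no interaction between the two metrics beyond the volume element of f.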

In the deconstruction framework, we naturally obtain \(\alpha_2 = 1\) and no tadpole nor cosmological constant for either metric.

6.4.2 Mass eigenstates

In this formulation of bi-gravity, both metrics g and f carry a superposition of the massless and the massive spin-2 field. As already emphasized, the notion of mass (and of spin) only makes sense for a field living in Minkowski, and so to analyze the mass spectrum, we expand both metrics about flat spacetime,

$${g_{\mu \nu}} = {\eta _{\mu \nu}} + {1 \over {{M_{{\rm{Pl}}}}}}\delta {g_{\mu \nu}}$$
(5.44)
$${f_{\mu \nu}} = {\eta _{\mu \nu}} + {1 \over {{M_f}}}\delta {f_{\mu \nu}}\, .$$
(5.45)

The general mass spectrum about different backgrounds is richer and provided in [300]. Here we only focus on a background which preserves Lorentz invariance (in principle we could also include other maximally symmetric backgrounds which have the same amount of symmetry as Minkowski).

Working about Minkowski, then to quadratic order in h, the action for bi-gravity reads (for

$$S_{{\rm{bi - gravity}}}^{(2)} = \int {{{\rm{d}}^4}} x\left[ {- {1 \over 4}\delta {g^{\mu \nu}}\hat {\mathcal E}_{\mu \nu}^{\alpha \beta}\delta {g_{\alpha \beta}} - {1 \over 4}\delta {f^{\mu \nu}}\hat {\mathcal E}_{\mu \nu}^{\alpha \beta}\delta {f_{\alpha \beta}} - {1 \over 8}m_{{\rm{eff}}}^2\left({h_{\mu \nu}^2 - {h^2}} \right)} \right]\, ,$$
(5.46)

where all indices are raised and lowered with respect to the flat Minkowski metric and the Lichnerowicz operator \(\hat {\mathcal E}_{\mu \nu}^{\alpha \beta}\) was defined in (2.37). We see appearing the Fierz-Pauli mass term combination \(h_{\mu \nu}^2 - {h^2}\) introduced in (2.44) for the massive field, with the effective Planck scale \(M_{\rm eff}\) and the effective mass \(m_{\rm eff}\) defined as [293]

$$M_{\rm{eff}}^2=\left(M_{\rm{Pl}}^{-2}+M_f^{-2}\right)^{-1}$$
(5.47)
$$m_{{\rm{eff}}}^2 = {m^2}{{M_{{\rm{Pl}}}^2} \over {M_{{\rm{eff}}}^2}}\, .$$
(5.48)

The massive field h is given by

$${h_{\mu \nu}} = {M_{{\rm{eff}}}}\left({{1 \over {{M_{{\rm{Pl}}}}}}\delta {g_{\mu \nu}} - {1 \over {{M_f}}}\delta {f_{\mu \nu}}} \right) = {M_{{\rm{eff}}}}\left({{g_{\mu \nu}} - {f_{\mu \nu}}} \right)\, ,$$
(5.49)

while the other combination represents the massless field \(\ell_{\mu\nu}\),

$${\ell _{\mu \nu}} = {M_{{\rm{eff}}}}\left({{1 \over {{M_f}}}\delta {g_{\mu \nu}} + {1 \over {{M_{{\rm{Pl}}}}}}\delta {f_{\mu \nu}}} \right)\, ,$$
(5.50)

so that in terms of the light and heavy spin-2 fields (or more precisely in terms of the two mass eigenstates h and \(\ell\)), the quadratic action for bi-gravity reproduces that of a massless spin-2 field \(\ell\) and a Fierz-Pauli massive spin-2 field h with mass \(m_{\rm eff}\),

$$\begin{array}{rl} S_{{\rm{bi - gravity}}}^{(2)} = \int {\rm{d}}^4 x \Big[ & - {1 \over 4}{h^{\mu \nu}}\left[ \hat {\mathcal E}_{\mu \nu}^{\alpha \beta} + {1 \over 2}m_{{\rm{eff}}}^2\left( \delta _\mu ^\alpha \delta _\nu ^\beta - {\eta ^{\alpha \beta}}{\eta _{\mu \nu}} \right) \right]{h_{\alpha \beta}} \\ & - {1 \over 4}{\ell ^{\mu \nu}}\hat {\mathcal E}_{\mu \nu}^{\alpha \beta}{\ell _{\alpha \beta}} \Big]\, . \end{array}$$
(5.51)

As explained in [293], in the case where there is a large hierarchy between the two Planck scales \(M_{\rm Pl}\) and \(M_f\), the massive particle is always the one that enters at the lower Planck mass and the massless one the one that has the larger Planck scale. For instance if \(M_f \gg M_{\rm Pl}\), the massless particle is mainly given by \(\delta f_{\mu\nu}\) and the massive one mainly by \(\delta g_{\mu\nu}\). This means that in the limit \(M_f \to \infty\) while keeping \(M_{\rm Pl}\) fixed, we recover the theory of massive gravity and a fully decoupled massless graviton, as will be explained in Section 8.2.
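The change of variables to mass eigenstates can be illustrated numerically with a minimal sketch in which a single real number stands in for each tensor component of \(\delta g_{\mu\nu}\) and \(\delta f_{\mu\nu}\). The function names and sample values are hypothetical.

```python
import math


def effective_planck_scale(m_pl, m_f):
    """M_eff = (M_Pl^-2 + M_f^-2)^(-1/2), as in (5.47)."""
    return 1.0 / math.sqrt(m_pl ** -2 + m_f ** -2)


def mass_eigenstates(delta_g, delta_f, m_pl, m_f):
    """Massive (h) and massless (ell) combinations, as in (5.49) and (5.50)."""
    m_eff = effective_planck_scale(m_pl, m_f)
    h = m_eff * (delta_g / m_pl - delta_f / m_f)
    ell = m_eff * (delta_g / m_f + delta_f / m_pl)
    return h, ell
```

The map is a rotation: \(h^2 + \ell^2 = \delta g^2 + \delta f^2\) for any values of the two Planck scales, which is why the two kinetic terms of (5.46) turn into canonical kinetic terms for h and \(\ell\). For \(M_f \gg M_{\rm Pl}\) the rotation angle goes to zero, so that h approaches \(\delta g\) and \(\ell\) approaches \(\delta f\).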

6.5 Coupling to matter

So far we have only focused on an empty five-dimensional bulk with no matter. It is natural, though, to consider matter fields living in five dimensions, χ (x, y) with Lagrangian (in the gauge choice (5.7))

$${{\mathcal L}_{{\rm{matter}}}} = \sqrt {- g} \left({- {1 \over 2}{{({\partial _\mu}\chi)}^2} - {1 \over 2}{{({\partial _y}\chi)}^2} - V(\chi)} \right)\, ,$$
(5.52)

in addition to arbitrary potentials (we focus on the case of a scalar field for simplicity, but the same philosophy can be applied to higher-spin species be it bosons or fermions). Then applying the same discretization scheme used for gravity, every matter field then comes in N copies

$$\chi(x,y)\hookrightarrow \chi^{(j)}(x)=\chi(x,y_j)\,,$$
(5.53)

for j = 1, ⋯, N and each field \(\chi^{(j)}\) is coupled to the associated vielbein \(e^{(j)}\) or metric \(g_{\mu \nu}^{(j)} = e_\mu ^{(j)}{}^ae_\nu ^{(j)}{}^b{\eta _{ab}}\) at the same site. In the discretization procedure, the gradient along the extra dimension yields a mixing (interaction) between fields located on neighboring sites,

$$\int {\rm{d}} y \, (\partial_y\chi)^2 \hookrightarrow R \sum\nolimits_{j=1}^N\, m^2 (\chi^{(j+1)}(x)- \chi^{(j)}(x))^2\,,$$
(5.54)

(assuming again periodic boundary conditions, χ(N +1) = χ(1)). The discretization procedure could be also performed using a more complicated definition of the derivative along y involving more than two sites, which leads to further interactions between the different fields.
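As a minimal sketch of the nearest-neighbor mixing in (5.54), the sum over sites can be evaluated for a list of field values at one spacetime point. The overall factor R is left out, and the function name is a hypothetical choice.

```python
def gradient_mixing(chi_sites, m):
    """Sum over sites of m^2 (chi^(j+1) - chi^(j))^2 with chi^(N+1) = chi^(1)."""
    n_sites = len(chi_sites)
    return sum(
        m ** 2 * (chi_sites[(j + 1) % n_sites] - chi_sites[j]) ** 2
        for j in range(n_sites)
    )
```

A configuration that takes the same value on every site gives no contribution, so the mixing term only costs energy for field profiles that vary from site to site.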

In the two-site derivative formulation, the action for matter is then

$$\begin{array}{*{20}c} {S_{{\rm{matter}}}} \hookrightarrow {1 \over m}\int {{{\rm{d}}^4}} x\sum\limits_j {\sqrt {- {g^{(j)}}}} \left({- {1 \over 2}{g^{(j)\, \mu \nu}}{\partial _\mu}{\chi ^{(j)}}{\partial _\nu}{\chi ^{(j)}}} \right. \\ \left. {\quad \quad \quad \quad \quad \quad \quad - {1 \over 2}{m^2}{{({\chi ^{(j + 1)}} - {\chi ^{(j)}})}^2} - V({\chi ^{(j)}})} \right). \\ \end{array}$$
(5.55)

The coupling to gauge fields or fermions can be derived in the same way, and the vielbein formalism makes it natural to extend the action (5.6) to five dimensions and apply the discretization procedure. Interestingly, in the case of fermions, the fields on neighboring sites would not directly couple to one another, but they would couple to both the vielbein \(e^{(j)}\) at the same site and the one \(e^{(j-1)}\) on the neighboring site.

Notice, however, that the current full proofs for the absence of the BD ghost do not include such couplings between matter fields living on different metrics (or vielbeins), nor matter fields coupling directly to more than one metric (vielbein).

6.6 No new kinetic interactions

In GR, diffeomorphism invariance uniquely fixes the kinetic term to be the Einstein-Hilbert one

$${{\mathcal L}_{EH}} = \sqrt {- g} R \,,$$
(5.56)

(see, for instance, Refs. [287, 483, 175, 225, 76] for the uniqueness of GR for the theory of a massless spin-2 field).

In more than four dimensions, the GR action can be supplemented by additional Lovelock invariants [383] which respect diffeomorphism invariance and are expressed in terms of higher powers of the Riemann curvature but lead to second order equations of motion. In four dimensions there is only one non-trivial additional Lovelock invariant, corresponding to the Gauss-Bonnet term, but it is topological and thus does not affect the theory, unless other degrees of freedom such as a scalar field are included.

So, when dealing with the theory of a single massless spin-2 field in four dimensions the only allowed kinetic term is the well-known Einstein-Hilbert one. Now when it comes to the theory of a massive spin-2 field, diffeomorphism invariance is broken and so in addition to the allowed potential terms described in (6.9)-(6.13), one could consider other kinetic terms which break diffeomorphism invariance.

This possibility was explored in Refs. [231, 310, 230] where it was shown that in four dimensions, the following derivative interaction \({\mathcal L}_3^{{\rm{(der)}}}\) is ghost-free at leading order (i.e., there are no higher derivatives for the Stückelberg fields when introducing the Stückelberg fields associated with linear diffeomorphism),

$${\mathcal L}_3^{({\rm{der}})} = {\varepsilon ^{\mu \nu \, \rho \sigma}}{\varepsilon ^{\mu \prime \nu \prime \rho \prime \sigma \prime}}{h_{\sigma \sigma \prime}}{\partial _\rho}{h_{\mu \mu \prime}}{\partial _{\rho \prime}}{h_{\nu \nu \prime}}\, .$$
(5.57)

So this new derivative interaction would be allowed for a theory of a massive spin-2 field which does not couple to matter. Note that this interaction can only be considered if the spin-2 field is massive in the first place, so this interaction can only be present if the Fierz-Pauli mass term (2.44) is already present in the theory.

Now let us turn to a theory of gravity. In that case, we have seen that the coupling to matter forces linear diffeomorphisms to be extended to fully non-linear diffeomorphism. So to be viable in a theory of massive gravity, the derivative interaction (5.57) should enjoy a ghost-free non-linear completion (the absence of ghost non-linearly can be checked for instance by restoring non-linear diffeomorphism using the non-linear Stückelberg decomposition (2.80) in terms of the helicity-1 and -0 modes given in (2.46), or by performing an ADM analysis as will be performed for the mass term in Section 7.) It is easy to check that by itself \({\mathcal L}_3^{{\rm{(der)}}}\) has a ghost at quartic order and so other non-linear interactions should be included for this term to have any chance of being ghost-free.

Within the deconstruction paradigm, the non-linear completion of \({\mathcal L}_3^{{\rm{(der)}}}\) could have a natural interpretation as arising from the five-dimensional Gauss-Bonnet term after discretization. Exploring this avenue would indeed lead to a new kinetic interaction of the form \(\sqrt {- g}\, {{\mathcal K}_{\mu \nu}}{{\mathcal K}_{\alpha \beta}}\,{}^*{R^{\mu \nu \alpha \beta}}\), where \({}^*R\) is the dual Riemann tensor [339, 153]. However, a simple ADM analysis shows that such a term propagates more than five degrees of freedom and thus has an Ostrogradsky ghost (similar to the BD ghost). As a result this new kinetic interaction (5.57) does not have a natural realization from a five-dimensional point of view (at least in its metric formulation; see Ref. [153] for more details).

We can push the analysis even further and show that no matter what the higher order interactions are, as soon as \({\mathcal L}_3^{{\rm{(der)}}}\) is present it will always lead to a ghost and so such an interaction is never acceptable [153].

As a result, the Einstein-Hilbert kinetic term is the only allowed kinetic term in Lorentz-invariant (massive) gravity.

This result shows how special and unique the Einstein-Hilbert term is. Even without imposing diffeomorphism invariance, the stability of the theory fixes the kinetic term to be nothing else than the Einstein-Hilbert term and thus forces diffeomorphism invariance at the level of the kinetic term. Even without requiring coordinate transformation invariance, the Riemann curvature remains the building block of the kinetic structure of the theory, just as in GR.

Before summarizing the derivation of massive gravity from higher dimensional deconstruction/Kaluza-Klein decomposition, we briefly comment on other ‘apparent’ modifications of the kinetic structure like in f(R)-gravity (see for instance Refs. [89, 354, 46] for f(R) massive gravity and their implications to cosmology).

Such kinetic terms à la f(R) are also possible without a mass term for the graviton. In that case diffeomorphism invariance allows us to perform a change of frame. In the Einstein frame, f(R) gravity is seen to correspond to a theory of gravity with a scalar field, and the same result will hold in f(R) massive gravity (in that case the scalar field couples non-trivially to the Stückelberg fields). As a result f(R) is not a genuine modification of the kinetic term but rather a standard Einstein-Hilbert term together with a new scalar degree of freedom, which is not a degree of freedom of the graviton but rather an independent scalar degree of freedom which couples non-minimally to matter (see Ref. [128] for a review on f(R)-gravity).

7 Part II Ghost-free Massive Gravity

8 Massive, Bi- and Multi-Gravity Formulation: A Summary

The previous ‘deconstruction’ framework gave an intuitive argument for the emergence of a potential of the form (6.3) (or (6.1) in the vielbein language) and its bi- and multi-metric generalizations. In deconstruction or Kaluza-Klein decomposition a certain type of interaction arises naturally and we have seen that the whole spectrum of allowed potentials (or interactions) could be generated by extending the deconstruction procedure to a more general notion of derivative or by involving the mixing of more sites in the definition of the derivative along the extra dimensions. We here summarize the most general formulation for the theories of massive gravity about a generic reference metric, bi-gravity and multi-gravity and provide a dictionary between the different languages used in the literature.

The general action for ghost-free (or dRGT) massive gravity [144] in the vielbein language is [95, 314] (see however Footnote 13 with respect to Ref. [95], see also Refs. [502, 410] for earlier work)

$${S_{{\rm{mGR}}}} = {{M_{{\rm{Pl}}}^2} \over 4}\, \int {\left({{\varepsilon _{abcd}}{R^{ab}} \wedge {e^c} \wedge {e^d} + {m^2}{{\mathcal L}^{({\rm{mass}})}}(e,\, f)} \right),}$$
(6.1)

with

$$\begin{array}{*{20}c} {{{\mathcal L}^{({\rm{mass}})}}(e,f) = {\varepsilon _{abcd}}\left[ {{c_0}\, {e^a} \wedge {e^b} \wedge {e^c} \wedge {e^d} + {c_1}\, {e^a} \wedge {e^b} \wedge {e^c} \wedge {f^d}} \right.} \\ {\quad \quad \quad \quad + {c_2}\, {e^a} \wedge {e^b} \wedge {f^c} \wedge {f^d} + {c_3}\, {e^a} \wedge {f^b} \wedge {f^c} \wedge {f^d}} \\ {\left. {+ {c_4}\, {f^a} \wedge {f^b} \wedge {f^c} \wedge {f^d}} \right],\,\,\,\quad \quad \quad} \\ \end{array}$$
(6.2)

or in the metric language [144],

$${S_{{\rm{mGR}}}} = {{M_{{\rm{Pl}}}^2} \over 2}\int {{{\rm{d}}^4}x\sqrt {- g} \left({R + {{{m^2}} \over 2}\sum\limits_{n = 0}^4 {{\alpha _n}} {{\mathcal L}_n}[{\mathcal K}[g,f]]} \right)\, .}$$
(6.3)

In what follows we will use the notation for the overall potential of massive gravity

$${\mathcal U} = - {{M_{{\rm{Pl}}}^2} \over 4}\sqrt {- g} \sum\limits_{n = 0}^4 {{\alpha _n}} {{\mathcal L}_n}[{\mathcal K}[g,f]] = - {{\mathcal L}^{({\rm{mass}})}}(e,f)\, ,$$
(6.4)

so that

$${{\mathcal L}_{{\rm{mGR}}}} = M_{{\rm{Pl}}}^2{{\mathcal L}_{{\rm{GR}}}}[g] - {m^2}\;{\mathcal U}[g,f]\, ,$$
(6.5)

where \({\mathcal L}_{\rm GR}[g]\) is the standard GR Einstein-Hilbert Lagrangian for the dynamical metric \(g_{\mu\nu}\) and \(f_{\mu\nu}\) is the reference metric, and for bi-gravity,

$${{\mathcal L}_{{\rm{bi - gravity}}}} = M_{{\rm{Pl}}}^2{{\mathcal L}_{{\rm{GR}}}}[g] + M_f^2{{\mathcal L}_{{\rm{GR}}}}[f] - {m^2}\;{\mathcal U}[g,f]\, ,$$
(6.6)

where both gμν and fμν are then dynamical metrics.

Both massive gravity and bi-gravity break one copy of diff invariance and so the Stückelberg fields can be introduced in exactly the same way in both cases, \({\mathcal U}[g,f] \to {\mathcal U}[g,\tilde f]\), where the Stückelbergized metric \({\tilde f_{\mu \nu}}\) was introduced in (2.75) (or alternatively \({\mathcal U}[g,f] \to {\mathcal U}[\tilde g,f]\)). Thus bi-gravity is by no means an alternative to introducing the Stückelberg fields as is sometimes stated.

In these formulations, \({\mathcal L}_0\) (or the term proportional to \(c_0\)) corresponds to a cosmological constant, \({\mathcal L}_1\) to a tadpole, \({\mathcal L}_2\) to the mass term and \({\mathcal L}_{3,4}\) to allowed higher order interactions. The presence of the tadpole \({\mathcal L}_1\) would imply a non-zero vev. The presence of the potentials \({\mathcal L}_{3,4}\) without \({\mathcal L}_2\) would lead to infinitely strongly coupled degrees of freedom and would thus be pathological. We recall that \({\mathcal K}[g,f]\) is given in terms of the metrics g and f as

$${\mathcal K}_{\,\,\,\nu}^\mu [g,f] = \delta _{\,\,\,\nu}^\mu - \left({\sqrt {{g^{- 1}}f}} \right)_\nu ^\mu \, ,$$
(6.7)

and the Lagrangians \({\mathcal L}_n\) are defined as follows in arbitrary dimensions d [144]

$${{\mathcal L}_n}[Q] = - (d - n)!\sum\limits_{m = 1}^n {{{(- 1)}^m}} {{(n - 1)!} \over {(n - m)!(d - n + m)!}}[{Q^m}]{\mathcal L}_{n - m}[Q]\, ,$$
(6.8)

with \({\mathcal L}_0[Q] = d!\) and \({\mathcal L}_1[Q] = (d - 1)!\,[Q]\), or equivalently in four dimensions [292]

$${{\mathcal L}_0}[Q] = {\varepsilon ^{\mu \nu \alpha \beta}}{\varepsilon _{\mu \nu \alpha \beta}}$$
(6.9)
$${{\mathcal L}_1}[Q] = {\varepsilon ^{\mu \nu \alpha \beta}}{\varepsilon _{\mu \prime \nu \alpha \beta}}\, Q_\mu ^{\mu \prime}$$
(6.10)
$${{\mathcal L}_2}[Q] = {\varepsilon ^{\mu \nu \alpha \beta}}{\varepsilon _{\mu \prime \nu \prime \alpha \beta}}Q_\mu ^{\mu \prime}Q_\nu ^{\nu \prime}$$
(6.11)
$${{\mathcal L}_3}[Q] = {\varepsilon ^{\mu \nu \alpha \beta}}{\varepsilon _{\mu \prime \nu \prime \alpha \prime \beta}}Q_\mu ^{\mu \prime}Q_\nu ^{\nu \prime}Q_\alpha ^{\alpha \prime}$$
(6.12)
$${{\mathcal L}_4}[Q] = {\varepsilon ^{\mu \nu \alpha \beta}}{\varepsilon _{\mu \prime \nu \prime \alpha \prime \beta \prime}}Q_\mu ^{\mu \prime}Q_\nu ^{\nu \prime}Q_\alpha ^{\alpha \prime}Q_\beta ^{\beta \prime}\, .$$
(6.13)

We have introduced the constant \({{\mathcal L}_0}\) (\({{\mathcal L}_0} = 4!\) and \(\sqrt {- g}\,{{\mathcal L}_0}\) is nothing other than the cosmological constant) and the tadpole \({\mathcal L}_1\) for completeness. Notice however that not all these five Lagrangians are independent and the tadpole can always be re-expressed in terms of a cosmological constant and the other potential terms.

Alternatively, we may express these scalars as follows [144]

$${{\mathcal L}_0}[Q] = 4!$$
(6.14)
$${{\mathcal L}_1}[Q] = 3!\, [Q]$$
(6.15)
$${{\mathcal L}_2}[Q] = 2!({[Q]^2} - [{Q^2}])$$
(6.16)
$${{\mathcal L}_3}[Q] = ({[Q]^3} - 3[Q][{Q^2}] + 2[{Q^3}])$$
(6.17)
$${{\mathcal L}_4}[Q] = ({[Q]^4} - 6{[Q]^2}[{Q^2}] + 3{[{Q^2}]^2} + 8[Q][{Q^3}] - 6[{Q^4}])\, .$$
(6.18)

These are easily generalizable to any number of dimensions, and in d dimensions we find d such independent scalars.
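The equivalence between the ε-contractions, the trace expressions (6.14)-(6.18) and the recursive definition (6.8) can be checked by brute force on an explicit 4 × 4 matrix. The sketch below assumes the normalization \(\varepsilon^{\mu\nu\alpha\beta}\varepsilon_{\mu\nu\alpha\beta} = 4!\) implied by \({\mathcal L}_0 = 4!\), and reads the last factor in (6.8) as the lower-order scalar \({\mathcal L}_{n-m}[Q]\). All function names are hypothetical.

```python
from fractions import Fraction
from itertools import permutations
from math import factorial

DIM = 4


def permutation_sign(perm):
    """Sign of a permutation, computed by counting inversions."""
    sign = 1
    for i in range(len(perm)):
        for j in range(i + 1, len(perm)):
            if perm[i] > perm[j]:
                sign = -sign
    return sign


def trace_of_power(q, m):
    """[Q^m], the trace of the m-th matrix power of q."""
    power = [[int(i == j) for j in range(DIM)] for i in range(DIM)]
    for _ in range(m):
        power = [[sum(power[i][k] * q[k][j] for k in range(DIM))
                  for j in range(DIM)] for i in range(DIM)]
    return sum(power[i][i] for i in range(DIM))


def script_l_epsilon(n, q):
    """epsilon epsilon Q^n: n index pairs joined by Q, the rest contracted."""
    total = 0
    for upper in permutations(range(DIM)):
        for lower in permutations(range(DIM)):
            if upper[n:] != lower[n:]:
                continue  # the remaining indices are contracted directly
            term = permutation_sign(upper) * permutation_sign(lower)
            for slot in range(n):
                term *= q[lower[slot]][upper[slot]]
            total += term
    return total


def script_l_traces(n, q):
    """The trace expressions (6.14)-(6.18)."""
    t1, t2, t3, t4 = (trace_of_power(q, m) for m in range(1, 5))
    return [
        24,
        6 * t1,
        2 * (t1 ** 2 - t2),
        t1 ** 3 - 3 * t1 * t2 + 2 * t3,
        t1 ** 4 - 6 * t1 ** 2 * t2 + 3 * t2 ** 2 + 8 * t1 * t3 - 6 * t4,
    ][n]


def script_l_recursive(n, q):
    """The recursion (6.8) in d = 4, with L_{n-m} as the last factor."""
    if n == 0:
        return Fraction(factorial(DIM))
    total = Fraction(0)
    for m in range(1, n + 1):
        weight = Fraction((-1) ** m * factorial(n - 1),
                          factorial(n - m) * factorial(DIM - n + m))
        total += weight * trace_of_power(q, m) * script_l_recursive(n - m, q)
    return -factorial(DIM - n) * total
```

For the identity matrix all five scalars evaluate to 4!, which gives a quick sanity check of the combinatorial factors.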

The multi-gravity action is a generalization to multiple interacting spin-2 fields with the same form for the interactions, and bi-gravity is the special case of two metrics (N = 2), [314]

$${S_N} = {{M_{{\rm{Pl}}}^2} \over 4}\sum\limits_{j = 1}^N {\int {\left({{\varepsilon _{abcd}}{R^{ab}}[{e_j}] \wedge e_j^c \wedge e_j^d + m_N^2{{\mathcal L}^{({\rm{mass}})}}({e_j},{e_{j + 1}})} \right)\,}} ,$$
(6.19)

or

$${S_N} = {{M_{{\rm{Pl}}}^2} \over 2}\sum\limits_{j = 1}^N {\int {{{\rm{d}}^4}x\sqrt {- {g_j}} \left({R[{g_j}] + {{m_N^2} \over 2}\sum\limits_{n = 0}^4 {\alpha _n^{(j)}} {{\mathcal L}_n}[{\mathcal K}[{g_j},{g_{j + 1}}]]} \right)\,}} .$$
(6.20)

8.1 Inverse argument

We could have written this set of interactions in terms of \({\mathcal K}[f,g]\) rather than \({\mathcal K}[g,f]\),

$$\begin{array}{*{20}c} {{\mathcal U} = {{M_{{\rm{Pl}}}^2{m^2}} \over 4}\int {{{\rm{d}}^4}} x\sqrt {- g} \sum\limits_{n = 0}^4 {{\alpha _n}} {{\mathcal L}_n}[{\mathcal K}[g,f]]\quad} \\ {\, = {{M_{{\rm{Pl}}}^2{m^2}} \over 4}\int {{{\rm{d}}^4}} x\sqrt {- f} \sum\limits_{n = 0}^4 {{{\tilde \alpha}_n}} {{\mathcal L}_n}[{\mathcal K}[f,g]]\, ,} \\ \end{array}$$
(6.21)

with

$$\left(\begin{array}{*{20}c} {{{\tilde \alpha}_0}} \\ {{{\tilde \alpha}_1}} \\ {{{\tilde \alpha}_2}} \\ {{{\tilde \alpha}_3}} \\ {{{\tilde \alpha}_4}} \\ \end{array} \right) = \left(\begin{array}{*{20}c} 1 & 0 & 0 & 0 & 0 \\ {- 4} & {- 1} & 0 & 0 & 0 \\ 6 & 3 & 1 & 0 & 0 \\ {- 4} & {- 3} & {- 2} & {- 1} & 0 \\ 1 & 1 & 1 & 1 & 1 \\ \end{array} \right)\left(\begin{array}{*{20}c} {{\alpha _0}} \\ {{\alpha _1}} \\ {{\alpha _2}} \\ {{\alpha _3}} \\ {{\alpha _4}} \\ \end{array} \right)$$
(6.22)

Interestingly, the absence of a tadpole and cosmological constant for, say, the metric g implies α0 = α1 = 0, which in turn implies the absence of a tadpole and cosmological constant for the other metric f, \({\tilde \alpha _0} = {\tilde \alpha _1} = 0\), and thus \({\tilde \alpha _2} = {\alpha _2} = 1\).
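Since exchanging the roles of g and f twice must return the original coefficients, the matrix in (6.22) should square to the identity. The short sketch below checks this, together with the statement that α0 = α1 = 0 implies vanishing \({\tilde \alpha _0}\), \({\tilde \alpha _1}\) and \({\tilde \alpha _2} = {\alpha _2}\). The variable names and the sample values of α3, α4 are illustrative assumptions.

```python
from fractions import Fraction

# Matrix of (6.22): alpha_tilde = SWAP @ alpha
SWAP = [
    [1, 0, 0, 0, 0],
    [-4, -1, 0, 0, 0],
    [6, 3, 1, 0, 0],
    [-4, -3, -2, -1, 0],
    [1, 1, 1, 1, 1],
]

def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def apply(m, v):
    return [sum(row[k] * v[k] for k in range(len(v))) for row in m]

IDENTITY = [[1 if i == j else 0 for j in range(5)] for i in range(5)]

# No tadpole, no cosmological constant, alpha_2 = 1; alpha_3 and alpha_4 are
# arbitrary sample values.
alpha = [0, 0, 1, Fraction(1, 3), Fraction(-1, 7)]
alpha_tilde = apply(SWAP, alpha)
```

Applying `SWAP` a second time to `alpha_tilde` recovers `alpha`, as expected for an exchange of the two metrics.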

8.2 Alternative variables

Alternatively, another fully equivalent convention has also been used in the literature [292] in terms of \({\mathbb X}_{\,\,\,\nu}^\mu = {g^{\mu \alpha}}{f_{\alpha \nu}}\) defined in (2.76),

$${\mathcal U} = - {{M_{{\rm{Pl}}}^2} \over 4}\sqrt {- g} \sum\limits_{n = 0}^4 {{{{\beta _n}} \over {n!}}} {{\mathcal L}_n}[\sqrt {\mathbb{X} }]\, ,$$
(6.23)

which is equivalent to (6.4) with \({{\mathcal L}_0} = 4!\) and

$$\left(\begin{array}{*{20}c} {{\beta _0}} \\ {{\beta _1}} \\ {{\beta _2}} \\ {{\beta _3}} \\ {{\beta _4}} \\ \end{array} \right) = \left(\begin{array}{*{20}c} 1 & 1 & 1 & 1 & 1 \\ 0 & {- 1} & {- 2} & {- 3} & {- 4} \\ 0 & 0 & 2 & 6 & {12} \\ 0 & 0 & 0 & {- 6} & {- 24} \\ 0 & 0 & 0 & 0 & {24} \\ \end{array} \right)\left(\begin{array}{*{20}c} {{\alpha _0}} \\ {{\alpha _1}} \\ {{\alpha _2}} \\ {{\alpha _3}} \\ {{\alpha _4}} \\ \end{array} \right)\, ,$$
(6.24)

or the inverse relation,

$$\left(\begin{array}{*{20}c} {{\alpha _0}} \\ {{\alpha _1}} \\ {{\alpha _2}} \\ {{\alpha _3}} \\ {{\alpha _4}} \\ \end{array} \right) = {1 \over {24}}\left(\begin{array}{*{20}c} {24} & {24} & {12} & 4 & 1 \\ 0 & {- 24} & {- 24} & {- 12} & {- 4} \\ 0 & 0 & {12} & {12} & 6 \\ 0 & 0 & 0 & {- 4} & {- 4} \\ 0 & 0 & 0 & 0 & 1 \\ \end{array} \right)\left(\begin{array}{*{20}c} {{\beta _0}} \\ {{\beta _1}} \\ {{\beta _2}} \\ {{\beta _3}} \\ {{\beta _4}} \\ \end{array} \right)\, ,$$
(6.25)

so that in order to avoid a tadpole and a cosmological constant we need to set for instance β4 = − (24β0 + 24β1 + 12β2 + 4β3) and β3 = −6(4β0 + 3β1 + β2).
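The two matrices in (6.24) and (6.25) can be checked against each other with exact rational arithmetic. The sketch below also verifies that the quoted choice of β3 and β4 indeed gives α0 = α1 = 0. The function `no_tadpole_no_cc` and the sample values of β0, β1, β2 are hypothetical helpers introduced only for this check.

```python
from fractions import Fraction as F

# (6.24): beta = BETA_FROM_ALPHA @ alpha
BETA_FROM_ALPHA = [
    [1, 1, 1, 1, 1],
    [0, -1, -2, -3, -4],
    [0, 0, 2, 6, 12],
    [0, 0, 0, -6, -24],
    [0, 0, 0, 0, 24],
]

# (6.25): alpha = ALPHA_FROM_BETA @ beta (overall factor 1/24 included)
ALPHA_FROM_BETA = [[F(x, 24) for x in row] for row in [
    [24, 24, 12, 4, 1],
    [0, -24, -24, -12, -4],
    [0, 0, 12, 12, 6],
    [0, 0, 0, -4, -4],
    [0, 0, 0, 0, 1],
]]

def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def apply(m, v):
    return [sum(row[k] * v[k] for k in range(len(v))) for row in m]

def no_tadpole_no_cc(b0, b1, b2):
    """Fix beta_3 and beta_4 as quoted in the text, given beta_0..beta_2."""
    b3 = -6 * (4 * b0 + 3 * b1 + b2)
    b4 = -(24 * b0 + 24 * b1 + 12 * b2 + 4 * b3)
    return [b0, b1, b2, b3, b4]

IDENTITY = [[1 if i == j else 0 for j in range(5)] for i in range(5)]
alpha = apply(ALPHA_FROM_BETA, no_tadpole_no_cc(1, 2, 3))
```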

8.3 Expansion about the reference metric

In the vielbein language the mass term is extremely simple, as can be seen in Eq. (6.1) with \({\mathcal A}\) defined in (2.60). Back to the metric language, this means that the mass term takes a remarkably simple form when writing the dynamical metric gμν in terms of the reference metric and a difference \({\tilde h_{\mu \nu}} = 2{h_{\mu \nu}} + h_{\mu \nu}^2\) as

$${g_{\mu \nu}} = {f_{\mu \nu}} + 2{h_{\mu \nu}} + {h_{\mu \alpha}}{h_{\nu \beta}}{f^{\alpha \beta}}\, ,$$
(6.26)

where \({f^{\alpha \beta}} = {({f^{- 1}})^{\alpha \beta}}\). The mass term is then expressed as

$${\mathcal U} = - {{M_{{\rm{Pl}}}^2} \over 4}\sqrt {- f} \sum\limits_{n = 0}^4 {{\kappa _n}} {{\mathcal L}_n}[{f^{\mu \alpha}}{h_{\alpha \nu}}]\, ,$$
(6.27)

where the \({{\mathcal L}_n}\) have the same expression as in (6.9)-(6.13), so each \({{\mathcal L}_n}[{f^{\mu \alpha}}{h_{\alpha \nu}}]\) is genuinely nth order in hμν. The expression (6.27) is thus at most quartic order in hμν but is valid to all orders in hμν (there is no assumption that h be small). In other words, the mass term (6.27) is not an expansion in hμν truncated to a finite (quartic) order, but rather a fully equivalent way to rewrite the mass Lagrangian in terms of the variable hμν rather than gμν. Of course the kinetic term is intrinsically non-linear and includes an infinite expansion in hμν. A generalization of such parameterizations is provided in [300].

The relation between the coefficients κn and αn is given by

$$\left(\begin{array}{*{20}c} {{\kappa _0}} \\ {{\kappa _1}} \\ {{\kappa _2}} \\ {{\kappa _3}} \\ {{\kappa _4}} \\ \end{array} \right) = \left(\begin{array}{*{20}c} 1 & 0 & 0 & 0 & 0 \\ 4 & 1 & 0 & 0 & 0 \\ 6 & 3 & 1 & 0 & 0 \\ 4 & 3 & 2 & 1 & 0 \\ 1 & 1 & 1 & 1 & 1 \\ \end{array} \right)\left(\begin{array}{*{20}c} {{\alpha _0}} \\ {{\alpha _1}} \\ {{\alpha _2}} \\ {{\alpha _3}} \\ {{\alpha _4}} \\ \end{array} \right)\, .$$
(6.28)

The quadratic expansion about a background different from the reference metric was derived in Ref. [278]. Notice however that even though the mass term may not appear as having an exact Fierz-Pauli structure as shown in [278], it still has the correct structure to avoid any BD ghost, about any background [295, 294, 300, 297].

9 Evading the BD Ghost in Massive Gravity

The deconstruction framework gave an intuitive approach on how to construct a theory of massive gravity or multiple interacting ‘gravitons’. This led to the ghost-free dRGT theory of massive gravity and its bi- and multi-gravity extensions in a natural way. However, these developments were only possible a posteriori.

The deconstruction framework was proposed earlier (see Refs. [24, 25, 168, 28, 443, 168, 170]) directly in the metric language and, despite starting from a perfectly healthy five-dimensional theory of GR, the discretization in the metric language leads to the standard BD issue (this also holds in a KK decomposition when truncating the KK tower at some finite energy scale). Knowing that massive gravity (or multi-gravity) can be naturally derived from a healthy five-dimensional theory of GR is thus not a sufficient argument for the absence of the BD ghost, and a great amount of effort was devoted to that proof, which is known by now in a multitude of different forms and languages.

Within this review, one cannot do justice to all the independent proofs that have been formulated by now in the literature. We thus focus on a few of them: the Hamiltonian analysis in the ADM language, as well as the analysis in the Stückelberg language. One of the proofs in the vielbein formalism will be used in the multi-gravity case, and thus we do not emphasize that proof in the context of massive gravity, although it is perfectly applicable (and actually very elegant) in that case. Finally, after deriving the decoupling limit in Section 8.3, we also briefly review how it can be used to prove the absence of ghost more generically.

We note that even though the original argument on how the BD ghost could be circumvented in the full nonlinear theory was presented in [137] and [144], the absence of BD ghost in “ghost-free massive gravity” or dRGT has been the subject of many discussions [12, 13, 345, 342, 95, 341, 344, 96] (see also [350, 351, 349, 348, 352] for related discussions in bi-gravity). By now the confusion has been clarified; see for instance [295, 294, 400, 346, 343, 297, 15, 259] for thorough proofs addressing all the issues raised in the previous literature. (See also [347] for the proof of the absence of ghosts in other closely related models.)

9.1 ADM formulation

9.1.1 ADM formalism for GR

Before going onto the subtleties associated with massive gravity, let us briefly summarize how the counting of the number of degrees of freedom can be performed in the ADM language using the Hamiltonian for GR. Using an ADM decomposition (where this time, we single out the time, rather than the extra dimension as was performed in Part I),

$${\rm{d}}{s^2} = - {N^2}{\rm{d}}{t^2} + {\gamma _{ij}}({\rm{d}}{x^i} + {N^i}{\rm{d}}t)\;({\rm{d}}{x^j} + {N^j}{\rm{d}}t)\,,$$
(7.1)

with the lapse N, the shift \({N^i}\) and the 3-dimensional space metric γij. In this section indices are raised and lowered with respect to γij and dots represent derivatives with respect to t. In terms of these variables, the action density for GR is

$${{\mathcal L}_{{\rm{GR}}}} = {{M_{{\rm{Pl}}}^2} \over 2}\int {\rm{d}} t\left({\sqrt {- g} R + {\partial _t}\left[ {\sqrt {- g} [k]} \right]} \right)$$
(7.2)
$$= {{M_{{\rm{Pl}}}^2} \over 2}\int {\rm{d}} tN\sqrt \gamma \left({{}^{(3)}R[\gamma ] + {{[k]}^2} - [{k^2}]} \right)\,,$$
(7.3)

where \({}^{(3)}R\) is the three-dimensional scalar curvature built out of γ (no time derivatives in \({}^{(3)}R\)) and \({k_{ij}}\) is the three-dimensional extrinsic curvature,

$${k_{ij}} = {1 \over {2N}}({\dot \gamma _{ij}} - {\nabla _{(i}}{N_{j)}})\,.$$
(7.4)

The GR action can thus be expressed in a way which has no double or higher time derivatives and only first time-derivatives squared of γij. This means that neither the shift nor the lapse is truly dynamical and they do not have any associated conjugate momenta. The conjugate momentum associated with γ is,

$${p^{ij}} = {{\partial \sqrt {- g} R} \over {\partial {{\dot \gamma}_{ij}}}}\,.$$
(7.5)

We can now construct the Hamiltonian density for GR in terms of the 12 phase space variables (γij and pij carry 6 components each),

$${{\mathcal H}_{{\rm{GR}}}} = N{{\mathcal R}_0}(\gamma ,p) + {N^i}{{\mathcal R}_i}(\gamma ,p)\,.$$
(7.6)

So we see that in GR, both the shift and the lapse play the role of Lagrange multipliers. Thus they propagate a first-class constraint each which removes 2 phase space degrees of freedom per constraint. The counting of the number of degrees of freedom in phase space thus goes as follows:

$$(2 \times 6) - 2\;{\rm{lapse\;constraints}} - 2 \times 3\;{\rm{shift\;constraints}} = 4 = 2 \times 2\,,$$
(7.7)

corresponding to a total of 4 degrees of freedom in phase space, or 2 independent degrees of freedom in field space. This is the very well-known and established result that in four dimensions GR propagates 2 physical degrees of freedom, or gravitational waves have two polarizations.

This result is fully generalizable to any number of dimensions, and in d spacetime dimensions, gravitational waves carry d (d − 3)/2 polarizations. We now move to the case of massive gravity.

9.1.2 ADM counting in massive gravity

We now amend the GR Lagrangian with a potential \({\mathcal U}\). As already explained, this can only be performed by breaking covariance (with the exception of a cosmological constant). This potential could be a priori an arbitrary function of the metric, but contains no derivatives and so does not affect the definition of the conjugate momenta pij. This translates directly into a potential at the level of the Hamiltonian density,

$${\mathcal H} = N{{\mathcal R}_0}(\gamma ,p) + {N^i}{{\mathcal R}_i}(\gamma ,p) + {m^2}{\mathcal U}({\gamma _{ij}},{N^i},N)\,,$$
(7.8)

where the overall potential for ghost-free massive gravity is given in (6.4).

If \({\mathcal U}\) depends non-linearly on the shift or the lapse then these are no longer directly Lagrange multipliers (if they are non-linear, they still appear at the level of the equations of motion, and so they do not propagate a constraint for the metric but rather for themselves). As a result for an arbitrary potential one is left with (2 × 6) degrees of freedom in the three-dimensional metric and its momentum conjugate and no constraint is present to reduce the phase space. This leads to 6 degrees of freedom in field space: the two usual transverse polarizations for the graviton (as we have in GR), in addition to two ‘vector’ polarizations and two ‘scalar’ polarizations.

These 6 polarizations correspond to the five healthy massive spin-2 field degrees of freedom in addition to the sixth BD ghost, as explained in Section 2.5 (see also Section 7.2).

This counting is also generalizable to an arbitrary number of dimensions: in d spacetime dimensions, a massive spin-2 field should propagate the same number of degrees of freedom as a massless spin-2 field in d + 1 dimensions, that is (d + 1)(d − 2)/2 polarizations. However, an arbitrary potential would allow for d (d − 1)/2 independent degrees of freedom, which is 1 too many excitations, always corresponding to one BD ghost degree of freedom in an arbitrary number of dimensions.
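The counting formulas quoted in this section can be tabulated directly. The following sketch (with illustrative function names) evaluates the number of polarizations for a massless graviton, a massive graviton, and a generic potential in d spacetime dimensions, so that the single extra BD mode can be seen for any d.

```python
def massless_polarizations(d):
    """Polarizations of a massless spin-2 field in d spacetime dimensions."""
    return d * (d - 3) // 2

def massive_polarizations(d):
    """Polarizations of a healthy massive spin-2 field in d dimensions."""
    return (d + 1) * (d - 2) // 2

def generic_potential_dofs(d):
    """Field-space dofs when no constraint survives (arbitrary potential)."""
    return d * (d - 1) // 2

def bd_ghost_count(d):
    """Excess of the generic counting over the healthy massive counting."""
    return generic_potential_dofs(d) - massive_polarizations(d)

# Tabulate (d, massless, massive, generic) for a few dimensions.
table = [(d, massless_polarizations(d), massive_polarizations(d),
          generic_potential_dofs(d)) for d in range(3, 8)]
```

In four dimensions this reproduces the numbers 2, 5 and 6 used in the text, and the excess is one mode in every dimension.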

The only way this counting can be wrong is if the constraints for the shift and the lapse cannot be inverted for the shift and the lapse themselves, and thus at least one of the equations of motion from the shift or the lapse imposes a constraint on the three-dimensional metric γij. This loophole was first presented in [138] and an example was provided in [137]. It was then used in [144] to explain how the ‘no-go’ on the presence of a ghost in massive gravity could be circumvented. Finally, this argument was then carried through fully non-linearly in [295] (see also [342] for the analysis in 1 + 1 dimensions as presented in [144]).

9.1.3 Eliminating the BD ghost

9.1.3.1 Linear Fierz-Pauli massive gravity

Fierz-Pauli massive gravity is special in that at the linear level (quadratic in the Hamiltonian), the lapse remains linear, so it still acts as a Lagrange multiplier generating a primary second-class constraint. We define the metric perturbation as \({h_{\mu \nu}} = {M_{{\rm{Pl}}}}({g_{\mu \nu}} - {\eta _{\mu \nu}})\), where for simplicity and definiteness we take Minkowski as the reference metric fμν = ημν, although most of what follows can easily be generalized to an arbitrary reference metric fμν. Expanding the lapse as N = 1 + δN, we have \({h_{00}} = \delta N + {\gamma _{ij}}{N^i}{N^j}\) and \({h_{0i}} = {\gamma _{ij}}{N^j}\). In the ADM decomposition, the Fierz-Pauli mass term is then (see Eq. (2.45))

$$\begin{array}{*{20}c} {{{\mathcal U}^{(2)}} = - {m^{- 2}}{{\mathcal L}_{{\rm{FP}}\,{\rm{mass}}}} = {1 \over 8}(h_{\mu \nu}^2 - {h^2})\quad \;\quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad}\\ {= {1 \over 8}(h_{ij}^2 - {{(h_i^i)}^2} - 2(N_i^2 - \delta Nh_i^i))\,,}\\ \end{array}$$
(7.9)

and is linear in the lapse. This is sufficient to deduce that it will keep imposing a constraint on the three-dimensional phase space variables {γij, pij} and remove at least half of the unwanted BD ghost. The shifts, on the other hand, enter non-linearly already in the Fierz-Pauli theory, so their equations of motion impose a relation for themselves rather than a constraint for the three-dimensional metric. As a result the Fierz-Pauli theory (at that order) propagates three more degrees of freedom than GR, which gives the usual five degrees of freedom of a massive spin-2 field. Non-linearly, however, the Fierz-Pauli mass term involves a non-linear term in the lapse in such a way that the constraint associated with it disappears and Fierz-Pauli massive gravity has a ghost at the non-linear level, as pointed out in [75]. This is in complete agreement with the discussion in Section 2.5, and is a complementary way to see the issue.

In Ref. [111], the most general potential was considered up to quartic order in the hμν, and it was shown that there is no choice of such potential (apart from a pure cosmological constant) which would prevent the lapse from entering non-linearly. While this result is definitely correct, it does not however imply the absence of a constraint generated by the set of shift and lapse Nμ = {N, Ni}. Indeed there is no reason to believe that the lapse should necessarily be the quantity that generates the constraint necessary to remove the BD ghost. Rather it can be any combination of the lapse and the shift.

9.1.3.2 Example on how to evade the BD ghost non-linearly

As an instructive example presented in [137], consider the following Hamiltonian,

$${\mathcal H} = N{\tilde{\mathcal C}_0}(\gamma ,p) + {N^i}{\tilde{\mathcal C}_i}(\gamma ,p) + {m^2}{\mathcal U},$$
(7.10)

with the following example for the potential

$${\mathcal U} = V(\gamma ,p){{{\gamma _{ij}}{N^i}{N^j}} \over {2N}}\,.$$
(7.11)

In this example neither the lapse nor the shift enter linearly, and one might worry about the loss of the constraint to project out the BD ghost. However, upon solving for the shift and substituting back into the Hamiltonian (this is possible since the lapse is not dynamical), we get

$${\mathcal H} = N\left({{{\tilde{\mathcal C}}_0}(\gamma ,p) - {{{\gamma ^{ij}}{{\tilde{\mathcal C}}_i}{{\tilde{\mathcal C}}_j}} \over {2{m^2}V(\gamma ,p)}}} \right)\,,$$
(7.12)

and the lapse now appears as a Lagrange multiplier generating a constraint, even though it was not linear in (7.10). This could have been seen more easily, without the need to explicitly integrate out the shift, by computing the Hessian

$${L_{\mu \nu}} = {{{\partial ^2}{\mathcal H}} \over {\partial {N^\mu}\partial {N^\nu}}} = {m^2}{{{\partial ^2}{\mathcal U}} \over {\partial {N^\mu}\partial {N^\nu}}}\,.$$
(7.13)

In the example (7.10), one has

$${L_{\mu \nu}} = {{{m^2}V(\gamma ,p)} \over {{N^3}}}\left({\begin{array}{*{20}c} {N_i^2} & {- N\,{N_i}} \\ {- N\,{N_j}} & {{N^2}\,{\gamma _{ij}}} \\ \end{array}} \right)\qquad \Rightarrow \qquad \det \;({L_{\mu \nu}}) = 0\,.$$
(7.14)

The Hessian cannot be inverted, which means that the equations of motion cannot be solved for all the shift and the lapse. Instead, one of these ought to be solved for the three-dimensional phase space variables which corresponds to the primary second-class constraint. Note that this constraint is not associated with a symmetry in this case and while the Hamiltonian is then pure constraint in this toy example, it will not be in general.
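Both statements about the toy model can be checked with exact rational arithmetic: the Hessian (7.14) is degenerate for arbitrary values of the lapse and shift, and eliminating the shift through its own equation of motion turns (7.10) into the form (7.12), linear in the lapse. The sketch below assumes a diagonal γij purely to keep the inverse metric trivial; all function names and numerical values are illustrative.

```python
from fractions import Fraction as F

def det(m):
    """Exact determinant by Laplace expansion along the first row."""
    if len(m) == 1:
        return m[0][0]
    total = 0
    for j, entry in enumerate(m[0]):
        minor = [row[:j] + row[j + 1:] for row in m[1:]]
        total += (-1) ** j * entry * det(minor)
    return total

def hessian(V, m2, N, shift, gamma_diag):
    """Matrix of (7.14) in the basis (N, N^1, N^2, N^3)."""
    lowered = [g * n for g, n in zip(gamma_diag, shift)]      # N_i
    shift_sq = sum(l * n for l, n in zip(lowered, shift))     # N_i N^i
    pref = m2 * V / N ** 3
    first_row = [pref * shift_sq] + [-pref * N * l for l in lowered]
    rest = [[-pref * N * lowered[i]]
            + [pref * N ** 2 * (gamma_diag[i] if i == j else 0)
               for j in range(3)]
            for i in range(3)]
    return [first_row] + rest

def hamiltonian(N, shift, C0, C, V, m2, gamma_diag):
    """Toy Hamiltonian (7.10) with the potential (7.11)."""
    shift_sq = sum(g * n * n for g, n in zip(gamma_diag, shift))
    linear = sum(n * c for n, c in zip(shift, C))
    return N * C0 + linear + m2 * V * shift_sq / (2 * N)

def shift_solution(N, C, V, m2, gamma_diag):
    """Solve C_i + m^2 V gamma_ij N^j / N = 0 for the shift N^i."""
    return [-N * c / (m2 * V * g) for c, g in zip(C, gamma_diag)]

def reduced_hamiltonian(N, C0, C, V, m2, gamma_diag):
    """Right-hand side of (7.12)."""
    c_sq = sum(c * c / g for c, g in zip(C, gamma_diag))      # gamma^ij C_i C_j
    return N * (C0 - c_sq / (2 * m2 * V))

# Arbitrary sample values (assumptions made only for this check).
V, m2, N = F(2), F(1, 3), F(3, 2)
gamma_diag = [F(1), F(2), F(5)]
C0, C = F(7), [F(1), F(-2), F(3)]
any_shift = [F(1, 2), F(-1), F(4)]
on_shell = shift_solution(N, C, V, m2, gamma_diag)
```

The degeneracy holds off-shell (for `any_shift`), which is what signals the constraint without having to integrate out the shift.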

Finally, one could also have deduced the existence of a constraint by performing the linear change of variable

$${N_i} \rightarrow {n_i} = {{{N_i}} \over N}\,,$$
(7.15)

in terms of which the Hamiltonian is then explicitly linear in the lapse,

$${\mathcal H} = N\left({{{\tilde{\mathcal C}}_0}(\gamma ,p) + {n^i}{{\tilde{\mathcal C}}_i}(\gamma ,p) + {m^2}V(\gamma ,p){{{\gamma _{ij}}{n^i}{n^j}} \over 2}} \right)\,,$$
(7.16)

and generates a constraint that can be read for {ni, γij, pij}.

9.1.3.3 Condition to evade the ghost

To summarize, the condition to eliminate (at least half of) the BD ghost is that the determinant of the Hessian Lμν defined in (7.13) vanishes, as explained in [144]. This was shown to be the case in the ghost-free theory of massive gravity (6.3) [(6.1)] exactly in some cases and up to quartic order, and then fully non-linearly in [295]. We summarize the derivation in the general case in what follows.

Ultimately, this means that in massive gravity we should be able to find a new shift ni related to the original one as follows Ni = f0(γ, n) + Nf1(γ, n), such that the Hamiltonian takes the following factorizable form

$${\mathcal H} = ({{\mathcal A}_1}(\gamma ,p) + N{{\mathcal C}_1}(\gamma ,p)){\mathcal F}(\gamma ,p,n) + ({{\mathcal A}_2}(\gamma ,p) + N{{\mathcal C}_2}(\gamma ,p))\,.$$
(7.17)

In this form, the equation of motion for the shift is manifestly independent of the lapse, and integrating over the shift ni manifestly keeps the Hamiltonian linear in the lapse, with the constraint \({{\mathcal C}_1}(\gamma, p){\mathcal F}(\gamma, p,{n^i}(\gamma)) + {{\mathcal C}_2}(\gamma, p) = 0\). However, such a field redefinition has not (yet) been found. Instead, the new shift ni found below does the next best thing (which is entirely sufficient): a. it keeps the Hamiltonian linear in the lapse and b. it keeps its own equation of motion independent of the lapse, which is sufficient to infer the presence of a primary constraint.

9.1.3.4 Primary constraint

We now proceed by deriving the primary second-class constraint present in ghost-free (dRGT) massive gravity. The proof works equally well for any reference metric at no extra cost, and so we consider a general reference metric in its own ADM decomposition, while keeping the dynamical metric in its original ADM form (since we work in unitary gauge, we may not simplify the metric further),

$${g_{\mu \nu}}{\rm{d}}{x^\mu}\;{\rm{d}}{x^\nu} = - {N^2}\;{\rm{d}}{t^2} + {\gamma _{ij}}({\rm{d}}{x^i} + {N^i}{\rm{d}}t)\;({\rm{d}}{x^j} + {N^j}{\rm{d}}t)$$
(7.18)
$${f_{\mu \nu}}\;{\rm{d}}{x^\mu}\;{\rm{d}}{x^\nu} = - {\bar {\mathcal N}^2}\;{\rm{d}}{t^2} + {\bar f_{ij}}({\rm{d}}{x^i} + {\bar {\mathcal N}^i}\;{\rm{d}}t)\;({\rm{d}}{x^j} + {\bar {\mathcal N}^j}\;{\rm{d}}t)\,,$$
(7.19)

and denote again by pij the conjugate momentum associated with γij. \({\bar f_{ij}}\) is not dynamical in massive gravity so there is no conjugate momentum associated with it. The bars on the reference metric are there to denote that these quantities are parameters of the theory and not dynamical variables. The proof for a dynamical reference metric and for multi-gravity works equally well; this is performed in Section 7.4.

Proceeding as in the previous example, we perform a change of variables similar to (7.15) (only more complicated, but which remains linear in the lapse when expressed in terms of ni) [295, 296]

$${N^i} \rightarrow {n^i}\quad {\rm{defined\ as}}\quad {N^i} - {\bar {\mathcal N} ^i} = (\bar {\mathcal N} \delta _j^i + N\;D_j^i)\;{n^j}\,,$$
(7.20)

where the matrix \(D_j^i\) satisfies the following relation

$$D_k^iD_j^k = ({P^{- 1}})_k^i{\gamma ^{k\ell}}{\bar f_{\ell j}}\,,$$
(7.21)

with

$$P_j^i = \delta _j^i + ({n^i}{\bar f_{j\ell}}{n^\ell} - {n^k}{\bar f_{k\ell}}{n^\ell}\delta _j^i)\,.$$
(7.22)

In what follows we use the definition

$$\tilde D_{\;j}^i = \kappa D_{\;j}^i\,,$$
(7.23)

with

$$\kappa = \sqrt {1 - {n^i}{n^j}{{\bar f}_{ij}}} \,.$$
(7.24)

The field redefinition naturally involves a square root through the expression of the matrix D in (7.21), which should come as no surprise from the square root structure of the potential term. For the potential to be writable in the metric language, the square root in the definition of the tensor \({\mathcal K}_{\,\,\,\nu}^\mu\) should exist, which in turn implies that the square root in the definition of \(D_j^i\) in (7.21) must also exist. While complicated, the important point to notice is that this field redefinition remains linear in the lapse (and so does not spoil the standard constraints of GR).

The Hamiltonian for massive gravity is then

$$\begin{array}{*{20}c} {{{\mathcal H}_{{\rm{mGR}}}} = {{\mathcal H}_{{\rm{GR}}}} + {m^2}{\mathcal U}\quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad} \\ {= N\,{{\mathcal R}_0}(\gamma ,p) + \left({{{\bar {\mathcal N}}^i} + \left({\bar {\mathcal N}\delta _j^i + N\;D_j^i} \right){n^j}} \right)\,{{\mathcal R}_i}(\gamma ,p)} \\ {+ {m^2}\,{\mathcal U}(\gamma ,{N^i}(n),N)\,,\quad \quad \quad \quad \quad \quad \quad \quad \quad \quad} \\ \end{array}$$
(7.25)

where \({\mathcal U}\) includes the new contributions from the mass term. \({\mathcal U}(\gamma, {N^i},N)\) is neither linear in the lapse N, nor in the shift Ni. There is actually no choice of potential \({\mathcal U}\) which would keep it linear in the lapse beyond cubic order [111]. However, as we shall see, when expressed in terms of the redefined shift ni, the non-linearities in the shift absorb all the original non-linearities in the lapse, and \({\mathcal U}(\gamma, {n^i},N)\) becomes linear in the lapse. In itself this is not sufficient to prove the presence of a constraint, as the integration over the shift ni could in turn lead to a higher-order dependence on the lapse in the Hamiltonian. Explicitly,

$${\mathcal U}(\gamma ,{N^i}({n^j}),N) = N\,{{\mathcal U}_0}(\gamma ,{n^j}) + \bar {\mathcal N} \,{{\mathcal U}_1}(\gamma ,{n^j})\,,$$
(7.26)

with

$${{\mathcal U}_0} = - {{M_{{\rm{Pl}}}^2} \over 4}\sqrt \gamma \sum\limits_{n = 0}^3 {{{(4 - n){\beta _n}} \over {n!}}} {{\mathcal L}_n}[\tilde D_{\,j}^i]$$
(7.27)
$$\begin{array}{*{20}c} {{{\mathcal U}_1} = - {{M_{{\rm{Pl}}}^2} \over 4}\sqrt \gamma \left({3!{\beta _1}\kappa + 2{\beta _2}D_{\,j}^iP_{\,i}^j} \right.\quad \quad \quad \quad \quad \quad \quad} \\ {\left. {+ {\beta _3}\kappa \left[ {2D_{\;k}^{\left[ k \right.}{n^{\left. i \right]}}{{\bar f}_{ij}}D_{\,\ell}^j{n^\ell} + D_{\;i}^{\left[ i \right.}D_{\;j}^{\left. j \right]}} \right]} \right) - {{M_{{\rm{Pl}}}^2} \over 4}{\beta _4}\sqrt {\bar f} \,,} \\ \end{array}$$
(7.28)

where the β’s are expressed in terms of the α’s as in (6.24). For the purpose of this analysis it is easier to work with that notation.

The structure of the potential is such that the equations of motion with respect to the shift are independent of the lapse and impose the following relations in terms of \({\bar n_i} = {n^j}\,{\bar f_{ij}}\),

$${m^2}\sqrt \gamma \left[ {3!{\beta _1}{{\bar n}_i} + 4{\beta _2}\tilde D_{\,\left[ j \right.}^j{{\bar n}_{\left. i \right]}} + {\beta _3}\tilde D_{\;j}^{\left[ j \right.}\left({\tilde D_{\;k}^{\left. k \right]}{{\bar n}_i} - 2\tilde D_{\;i}^{\left. k \right]}{{\bar n}_k}} \right)} \right] = \kappa {{\mathcal R}_i}(\gamma ,p)\,,$$
(7.29)

which entirely fixes the three shifts ni in terms of γij and pij as well as the reference metric \({\bar f_{ij}}\) (note that \({\bar {\mathcal N}^i}\) entirely disappears from these equations of motion).

The two requirements defined previously are thus satisfied: a. The Hamiltonian is linear in the lapse and b. the equations of motion with respect to the shift ni are independent of the lapse, which is sufficient to infer the presence of a primary constraint. This primary constraint is derived by varying with respect to the lapse and evaluating the shift on the constraint surface (7.29),

$${{\mathcal C}_0} = {{\mathcal R}_0}(\gamma ,p) + D_{\,j}^i{n^j}{{\mathcal R}_i}(\gamma ,p) + {m^2}{{\mathcal U}_0}(\gamma ,n(\gamma ,p)) \approx 0\,,$$
(7.30)

where the symbol “≈” means on the constraint surface. The existence of this primary constraint is sufficient to infer the absence of BD ghost. If we were dealing with a generic system (which could allow for some spontaneous parity violation), it could still be in principle that there are no secondary constraints associated with \({{\mathcal C}_0} = 0\) and the theory propagates 5.5 physical degrees of freedom (11 dofs in phase space). However, physically this never happens since the theory of gravity we are dealing with preserves parity and is Lorentz invariant. Indeed, to have 5.5 physical degrees of freedom, one of the variables should have an equation of motion which is linear in time derivatives. Lorentz invariance then implies that it must also be linear in space derivatives, which would then violate parity. However, this is only an intuitive argument and the real proof is presented below. Indeed, ghost-free massive gravity admits a secondary constraint which was explicitly found in [294].

9.1.3.5 Secondary constraint

Let us imagine we start with initial conditions that satisfy the constraints of the system, in particular the modified Hamiltonian constraint (7.30). As the system evolves the constraint (7.30) needs to remain satisfied. This means that the modified Hamiltonian constraint ought to be independent of time, or in other words it should commute with the Hamiltonian. This requirement generates a secondary constraint,

$${{\mathcal C}_2} \equiv {{\rm{d}} \over {{\rm{d}}t}}{{\mathcal C}_0} = \{{{\mathcal C}_0},{H_{{\rm{mGR}}}}\} \approx \{{{\mathcal C}_0},{H_1}\} \approx 0\,,$$
(7.31)

with \({H_1} = \int {{{\rm{d}}^{\rm{3}}}x\,{{\mathcal H}_1}}\) and

$${{\mathcal H}_1} = \left({{{\bar {\mathcal N}}^i} + \bar {\mathcal N} {n^i}(\gamma ,p)} \right){{\mathcal R}_i} + {m^2}\bar {\mathcal N} {{\mathcal U}_1}(\gamma ,n(\gamma ,p))\,.$$
(7.32)

Finding the precise form of this secondary constraint requires a very careful analysis of the Poisson bracket algebra of this system. This formidable task led to some confusion at first (see Ref. [345]) but the constraint was then successfully derived in [294] (see also [258, 259] and [343]). Deriving the whole set of Poisson brackets is beyond the scope of this review and we simply give the expression for the secondary constraint,

$$\begin{array}{*{20}c} {{{\mathcal C}_2} \equiv {{\mathcal C}_0}{\nabla _i}({{\bar {\mathcal N}}^i} + \bar {\mathcal N} {n^i}) + {m^2}\bar {\mathcal N} ({\gamma _{ij}}p_\ell ^\ell - 2{p_{ij}}){\mathcal U}_1^{ij}\quad \quad \quad \quad \quad \quad \quad \quad \quad \;} \\ {+ 2{m^2}\bar {\mathcal N} \sqrt \gamma {\nabla _i}{\mathcal U}_{1\,j}^iD_{\;k}^j{n^k} + ({{\mathcal R}_j}D_{\;k}^i{n^k} - \sqrt \gamma \bar {\mathcal B} _j^{\;i}){\nabla _i}({{\bar {\mathcal N}}^j} + \bar {\mathcal N} {n^j})} \\ {+ \left({{\nabla _i}{{\mathcal R}_0} + {\nabla _i}{{\mathcal R}_j}D_{\;k}^j{n^k}} \right)({{\bar {\mathcal N}}^i} + \bar {\mathcal N} {n^i})\;,\;\quad \quad \quad \quad \quad \quad \quad \quad \quad \;} \\ \end{array}$$
(7.33)

where unless specified otherwise, all indices are raised and lowered with respect to the dynamical metric γij, and the covariant derivatives are also taken with respect to the same metric. We also define

$${\mathcal U}_1^{ij} = {1 \over {\sqrt \gamma}}{{\partial {{\mathcal U}_1}} \over {\partial {\gamma _{ij}}}}$$
(7.34)
$$\begin{array}{*{20}c} {{{\bar {\mathcal B}}_{ij}} = - {{M_{{\rm{Pl}}}^2} \over 4}\left[ {({{\tilde D}^{- 1}})_{\;j}^k{{\bar f}_{ik}}\left({3{\beta _1}{{\mathcal L}_0}[\tilde D] + 2{\beta _2}{{\mathcal L}_1}[\tilde D] + {{{\beta _3}} \over 2}{{\mathcal L}_2}[\tilde D]} \right)} \right.} \\ {\left. {- {\beta _2}{{\bar f}_{ij}} + 2{\beta _3}{{\bar f}_{i\left[ k \right.}}\tilde D_{\left. {\,j} \right]}^k} \right]\;\,.\quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \;} \\ \end{array}$$
(7.35)

The important point to notice is that the secondary constraint (7.33) only depends on the phase space variables γij, pij and not on the lapse N. Thus it constrains the phase space variables rather than the lapse and provides a genuine secondary constraint in addition to the primary one (7.30) (indeed one can check that \({{\mathcal C}_2}{\vert_{{{\mathcal C}_0} = 0}} \ne 0\)).

Finally, we should also check that this secondary constraint is also maintained in time. This was performed in [294], by inspecting the condition

$${{\rm{d}} \over {{\rm{d}}t}}{{\mathcal C}_2} = \{{{\mathcal C}_2},{H_{{\rm{mGR}}}}\} \approx 0\,.$$
(7.36)

This condition should be satisfied without further constraining the phase space variables, which would otherwise imply that fewer than five degrees of freedom are propagating. Since five fully fledged dofs are propagating at the linearized level, the same must happen non-linearly (see Footnote 17). Rather than a constraint on {γij, pij}, (7.36) must be solved for the lapse. This is only possible if both of the following conditions are satisfied

$$\{{{\mathcal C}_2}(x),{{\mathcal H}_1}(y)\}\rlap{/}{\approx}0\quad {\rm{and}}\quad \{{{\mathcal C}_2}(x),{{\mathcal C}_0}(y)\}\rlap{/}{\approx}0\,.$$
(7.37)

As shown in [294], since these conditions do not vanish at the linear level (the constraints reduce to the Fierz-Pauli ones in that case), we can deduce that they cannot vanish non-linearly and thus the condition (7.36) fixes the expression for the lapse rather than constraining further the phase space dofs. Thus there is no tertiary constraint on the phase space.

To conclude, we have shown in this section that ghost-free (or dRGT) massive gravity is indeed free from the BD ghost and the theory propagates five physical dofs about generic backgrounds. We now present the proof in other languages, but stress that the proof developed in this section is sufficient to infer the absence of BD ghost.
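The bookkeeping behind this conclusion can be summarized with the usual Dirac counting, in which each first-class constraint removes two phase-space dimensions (as in (7.7)) and each second-class constraint removes one. The small sketch below applies this rule to the cases discussed in this section; the function name and its signature are illustrative assumptions.

```python
from fractions import Fraction

def field_space_dofs(phase_space_dim, first_class=0, second_class=0):
    """Half the dimension of the reduced phase space (Dirac counting)."""
    reduced = phase_space_dim - 2 * first_class - second_class
    return Fraction(reduced, 2)

# GR: 12 phase-space variables, lapse + 3 shifts give 4 first-class constraints.
gr = field_space_dofs(12, first_class=4)

# Generic potential: no constraint survives, so the BD ghost is present.
generic = field_space_dofs(12)

# Primary constraint C_0 only: the hypothetical "5.5 dof" situation.
primary_only = field_space_dofs(12, second_class=1)

# Ghost-free massive gravity: C_0 and C_2 form a second-class pair.
drgt = field_space_dofs(12, second_class=2)
```

This gives 2, 6, 11/2 and 5 field-space degrees of freedom respectively, matching the numbers quoted in the text.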

9.1.3.6 Secondary constraints in bi- and multi-gravity

In bi- or multi-gravity where all the metrics are dynamical the Hamiltonian is pure constraint (every term is linear in one of the lapses, as can be seen explicitly already from (7.25) and (7.26)).

In this case, the evolution equations of the primary constraints can always be solved for their respective Lagrange multipliers (lapses), which can always be set to zero. Setting the lapses to zero would be unphysical in a theory of gravity and instead one should take a ‘bifurcation’ of the Dirac constraint analysis as explained in [48]. Rather than solving for the Lagrange multipliers, we can choose to use the evolution equation of some of the primary constraints to provide additional secondary constraints.

Choosing this bifurcation leads to statements which are then continuous with the massive gravity case and one recovers the correct number of degrees of freedom. See Ref. [48] for an enlightening discussion.

9.2 Absence of ghost in the Stückelberg language

9.2.1 Physical degrees of freedom

Another way to see the absence of ghost in massive gravity is to work directly in the Stückelberg language for massive spin-2 fields introduced in Section 2.4. If the four scalar fields ϕa were dynamical, the theory would propagate six degrees of freedom (the two usual helicity-2 modes, whose dynamics is encoded in the standard Einstein-Hilbert term, and the four Stückelberg fields). To remove the sixth mode, corresponding to the BD ghost, one needs to check that not all four Stückelberg fields are dynamical but only three of them. See also [14] for a theory of two Stückelberg fields.

Stated more precisely, in the Stückelberg language beyond the DL, if \({{\mathcal E}_a}\) is the equation of motion with respect to the field \({\phi ^a}\), the correct requirement for the absence of ghost is that the Hessian defined as

$${{\mathcal A}_{ab}} = - {{\delta {{\mathcal E}_a}} \over {\delta {{\ddot \phi}^b}}} = {{{\delta ^2}{\mathcal L}} \over {\delta {{\dot \phi}^a}\delta {{\dot \phi}^b}}}$$
(7.38)

be not invertible, so that the dynamics of not all four Stückelberg fields may be derived from it. This is the case if

$$\det \;({{\mathcal A}_{ab}}) = 0\,,$$
(7.39)

as first explained in Ref. [145]. This condition was successfully shown to arise in a number of situations for the ghost-free theory of massive gravity with potential given in (6.3) or equivalently in (6.1) in Ref. [145] and then more generically in Ref. [297]. For illustrative purposes, we start by showing how this constraint arises in a simple two-dimensional realization of ghost-free massive gravity before deriving the more general proof.

9.2.2 Two-dimensional case

Consider massive gravity on a two-dimensional space-time, \({\rm d}s^2 = -N^2\,{\rm d}t^2 + \gamma\,({\rm d}x + N_x\,{\rm d}t)^2\), with the two Stückelberg fields \(\phi^{0,1}\) [145]. In this case the graviton potential can only have one independent non-trivial term (excluding the tadpole),

$${\mathcal U} = - {{M_{{\rm{Pl}}}^2} \over 4}N\sqrt \gamma ({{\mathcal L}_2}({\mathcal K}) + 1)\,.$$
(7.40)

In light-cone coordinates,

$${\phi ^ \pm} = {\phi ^0} \pm {\phi ^1}$$
(7.41)
$${{\mathcal D}_ \pm} = {1 \over {\sqrt \gamma}}{\partial _x} \pm {1 \over N}\left[ {{\partial _t} - {N_x}{\partial _x}} \right]\,,$$
(7.42)

the potential is thus

$${\mathcal U} = - {{M_{{\rm{Pl}}}^2} \over 4}N\sqrt \gamma \sqrt {({{\mathcal D}_ -}{\phi ^ -})({{\mathcal D}_ +}{\phi ^ +})} \,.$$
(7.43)

The Hessian of this Lagrangian with respect to the two Stückelberg fields \(\phi^\pm\) is then

$$\begin{array}{*{20}c} {{{\mathcal A}_{ab}} = {{{\delta ^2}{{\mathcal L}_{{\rm{mGR}}}}} \over {\delta {{\dot \phi}^a}\delta {{\dot \phi}^b}}} = - {m^2}{{{\delta ^2}{\mathcal U}} \over {\delta {{\dot \phi}^a}\delta {{\dot \phi}^b}}}\;\quad \quad \quad \quad \quad \quad \quad} \\ {\propto \left({\begin{array}{*{20}c} {{{({{\mathcal D}_ -}{\phi ^ -})}^2}} & {- ({{\mathcal D}_ -}{\phi ^ -})({{\mathcal D}_ +}{\phi ^ +})} \\ {- ({{\mathcal D}_ -}{\phi ^ -})({{\mathcal D}_ +}{\phi ^ +})} & {{{({{\mathcal D}_ +}{\phi ^ +})}^2}} \\ \end{array}} \right)\,\;,} \\ \end{array}$$
(7.44)

and is clearly non-invertible, which shows that not both Stückelberg fields are dynamical. In this special case, the Hamiltonian is actually pure constraint as shown in [145], and there are no propagating degrees of freedom. This is as expected for a massive spin-two field in two dimensions.
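The non-invertibility can be made explicit in one line. Writing \(X \equiv {\mathcal D}_-\phi^-\) and \(Y \equiv {\mathcal D}_+\phi^+\) (a shorthand introduced here only for compactness), the determinant of the matrix appearing in (7.44) vanishes identically,

$$\det \left({\begin{array}{*{20}c} {{X^2}} & {- XY} \\ {- XY} & {{Y^2}} \\ \end{array}} \right) = {X^2}{Y^2} - {(XY)^2} = 0\,,$$

so the Hessian has at most rank one for any configuration of the fields, which is the two-dimensional instance of the condition (7.39).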

As shown in Refs. [144, 145] the square root can be traded for an auxiliary non-dynamical variable \(\lambda^\mu_{\;\nu}\). In this two-dimensional example, the mass term (7.43) can be rewritten with the help of an auxiliary non-dynamical variable \(\lambda\) as

$${\mathcal U} = - {{M_{{\rm{Pl}}}^2} \over 4}N\sqrt \gamma \left({\lambda + {1 \over {2\lambda}}({{\mathcal D}_ -}{\phi ^ -})({{\mathcal D}_ +}{\phi ^ +})} \right)\,.$$
(7.45)

A similar trick will be used in the full proof.

9.2.3 Full proof

The full proof in the minimal model (corresponding to \(\alpha_2 = 1\), \(\alpha_3 = -2/3\) and \(\alpha_4 = 1/6\) in (6.3), or \(\beta_2 = \beta_3 = 0\) in the alternative formulation (6.23)) was derived in Ref. [297]. We briefly review the essence of the argument; the full technical derivation is beyond the scope of this review, and we refer the reader to Refs. [297] and [15] for a fully-fledged derivation.

Using a set of auxiliary variables \(\lambda^a_{\;b}\) (with \(\lambda_{ab} = \lambda_{ba}\), so these auxiliary variables contain ten elements in four dimensions) as explained previously, we can rewrite the potential term in the minimal model as [79, 342],

$${\mathcal U} = {{M_{{\rm{Pl}}}^2} \over 4}\sqrt {- g} ([\lambda ] + [{\lambda ^{- 1}}\cdot Y])\,,$$
(7.46)

where the matrix Y has been defined in (2.77) and is equivalent to X used previously. Upon integration over the auxiliary variable λ we recover the square-root structure as mentioned in Ref. [144]. We now perform an ADM decomposition as in (7.1) which implies the ADM decomposition on the matrix Y,

$$Y_{\,b}^a = {g^{\mu \nu}}{\partial _\mu}{\phi ^a}{\partial _\nu}{\phi ^c}{f_{cb}} = - {{\mathcal{D}}_t}{\phi ^a}{{\mathcal{D}}_t}{\phi ^c}{f_{cb}} + V_{\,b}^a\,,$$
(7.47)

with

$${{\mathcal D}_t} = {1 \over N}({\partial _t} - {N^i}{\partial _i})$$
(7.48)
$$V_{\;b}^a = {\gamma ^{ij}}{\partial _i}{\phi ^a}{\partial _j}{\phi ^c}{f_{cb}}\,.$$
(7.49)

Since the matrix V uses a projection along the 3 spatial directions, it is genuinely a rank-3 matrix rather than rank 4. This implies that \(\det V = 0\). Notice that we consider an arbitrary reference metric f, as the proof does not depend on it and can be done for any f at no extra cost [297]. The canonical momentum conjugate to \(\phi^a\) is given by

$${p_a} = {1 \over 2}\tilde \alpha {({\lambda ^{- 1}})_{ab}}{{\mathcal{D}}_0}{\phi ^b}\,,$$
(7.50)

with

$$\tilde \alpha = 2M_{{\rm{Pl}}}^2{m^2}\sqrt \gamma \,.$$
(7.51)

In terms of these conjugate momenta, the equation of motion with respect to \(\lambda^{ab}\) then imposes the relation (after multiplying with the matrix α λ on both sides),

$${\lambda ^{ac}}{C_{ab}}{\lambda ^{bd}} = {V^{ab}}\,,$$
(7.52)

with the matrix Cab defined as

$${C_{ab}} = {\tilde \alpha ^2}{f_{ab}} + {p_a}{p_b}\,.$$
(7.53)

Since det V = 0, as mentioned previously, the equation of motion (7.52) is only consistent if we also have det C = 0. This is the first constraint found in [297] which is already sufficient to remove (half) the BD ghost,

$${{\mathcal C}_1} \equiv {{\det C} \over {\det f}} = {\tilde \alpha ^2} + {({f^{- 1}})^{ab}}{p_a}{p_b} = 0\,,$$
(7.54)

which is the primary constraint on a subset of physical phase space variables \(\{\gamma_{ij}, p_a\}\) (by construction \(\det f \ne 0\)). The secondary constraint is then derived by commuting \({{\mathcal C}_1}\) with the Hamiltonian. Following the derivation of [297], we get on the constraint surface

$${{\mathcal C}_2} = {1 \over {{{\tilde \alpha}^2}N}}{{{\rm{d}}{{\mathcal C}_1}} \over {{\rm{d}}t}} = {1 \over {{{\tilde \alpha}^2}N}}\int {\rm{d}} y\,\{{{\mathcal C}_1}(y),\;H(x)\}$$
(7.55)
$$\begin{array}{*{20}c} {\propto - {\gamma ^{- 1/2}}{\gamma _{ij}}{\pi ^{ij}} - 2{{\tilde \alpha} \over \gamma}{{({\lambda ^{- 1}})}_{ab}}{\partial _i}{\phi ^a}{\gamma ^{ij}}\nabla _j^{(f)}{p^b}} \\ {\equiv 0\,,\quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad} \\ \end{array}$$
(7.56)

where \(\pi^{ij}\) is the momentum conjugate associated with \(\gamma_{ij}\), and \(\nabla^{(f)}\) is the covariant derivative associated with f.
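The step from \(\det C = 0\) to the scalar form of the primary constraint in (7.54) is an instance of the matrix determinant lemma: for \(C_{ab}\) as in (7.53), in four dimensions one has \(\det C = \tilde\alpha^6 \det f\,(\tilde\alpha^2 + (f^{-1})^{ab}p_a p_b)\), so the two expressions in (7.54) agree up to the non-vanishing factor \(\tilde\alpha^6\), which does not affect the constraint. The following is a minimal sketch that checks this identity with exact rational arithmetic. The diagonal (Minkowski) reference metric, the sample momenta and the helper names `det` and `check_primary_constraint` are illustrative assumptions, not part of the derivation of [297].

```python
from fractions import Fraction
from itertools import permutations


def det(m):
    """Determinant via the Leibniz formula (adequate for a 4x4 matrix)."""
    n = len(m)
    total = Fraction(0)
    for perm in permutations(range(n)):
        # Sign of the permutation from its number of inversions.
        inversions = sum(
            1 for i in range(n) for j in range(i + 1, n) if perm[i] > perm[j]
        )
        term = Fraction(-1) ** inversions
        for row, col in enumerate(perm):
            term *= m[row][col]
        total += term
    return total


def check_primary_constraint(alpha, f_diag, p):
    """Return (det C, alpha^6 det f (alpha^2 + f^{-1} p p)) for diagonal f."""
    n = len(f_diag)
    # C_ab = alpha^2 f_ab + p_a p_b, as in (7.53); alpha stands in for alpha-tilde.
    C = [
        [alpha**2 * (f_diag[a] if a == b else 0) + p[a] * p[b] for b in range(n)]
        for a in range(n)
    ]
    det_f = Fraction(1)
    for entry in f_diag:
        det_f *= entry
    # (f^{-1})^{ab} p_a p_b for a diagonal reference metric.
    p_squared = sum(p[a] ** 2 / f_diag[a] for a in range(n))
    return det(C), alpha ** (2 * (n - 1)) * det_f * (alpha**2 + p_squared)


# Assumed sample values: Minkowski f and arbitrary rational momenta.
minkowski = [Fraction(-1), Fraction(1), Fraction(1), Fraction(1)]
momenta = [Fraction(3), Fraction(1, 2), Fraction(-2), Fraction(5, 3)]
lhs, rhs = check_primary_constraint(Fraction(7, 4), minkowski, momenta)
print(lhs == rhs)
```

Choosing momenta that satisfy \(\tilde\alpha^2 + (f^{-1})^{ab}p_a p_b = 0\), for instance \(p_a = (\tilde\alpha, 0, 0, 0)\) with this reference metric, makes both returned values vanish, as the constraint requires.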

9.2.4 Stückelberg method on arbitrary backgrounds

When working about different non-Minkowski backgrounds, one can instead generalize the definition of the helicity-0 mode as was performed in [400]. The essence of the argument is to perform a rotation in field space so that the fluctuations of the Stückelberg fields about a curved background form a vector field in the new basis, and one can then employ the standard treatment for a vector field. See also [10] for another study of the Stückelberg fields in an FLRW background.

Recently, a covariant Stückelberg analysis valid about any background was performed in Ref. [369] using the BRST formalism. Interestingly, this method also allows one to derive the decoupling limit of massive gravity about any background.

In what follows, we review the approach derived in [400], which provides yet another independent argument for the absence of ghost in all generality. The proofs presented in Sections 7.1 and 7.2 work to all orders about a trivial background while in [400], the proof is performed about a generic (curved) background, and the analysis can thus stop at quadratic order in the fluctuations. Both types of analysis are equivalent so long as the fields are analytic, which is the case if one wishes to remain within the regime of validity of the theory.

Consider a generic background metric, which in unitary gauge (i.e., in the coordinate system {x} where the Stückelberg background fields are given by \({\phi ^a}(x) = {x^\mu}\delta _\mu ^a\)) is given by \(g_{\mu \nu}^{{\rm{bg}}} = e_\mu ^a(x)e_\nu ^b(x){\eta _{ab}}\), and the background Stückelberg fields are given by \(\phi _{{\rm{bg}}}^a(x) = {x^a} - A_{{\rm{bg}}}^a(x)\).

We now add fluctuations about that background,

$${\phi ^a} = \phi _{{\rm{bg}}}^a - {a^a} = {x^a} - {A^a}$$
(7.57)
$${g_{\mu \nu}} = g_{\mu \nu}^{{\rm{bg}}} + {h_{\mu \nu}},$$
(7.58)

with \({A^a} = A_{{\rm{bg}}}^a + {a^a}\).

9.2.4.1 Flat background metric

First, note that if we consider a flat background metric to start with, then at zeroth order in h, the ghost-free potential is of the form [400], (this can also be seen from [238, 419])

$${{\mathcal L}_A} = - {1 \over 4}FF(1 +\partial A + \cdots)\,,$$
(7.59)

with \(F_{ab} = \partial_a A_b - \partial_b A_a\). This means that for a symmetric Stückelberg background configuration, i.e., if the matrix \({\partial _\mu}\phi _{{\rm{bg}}}^a\) is symmetric, then \(F_{ab}^{{\rm{bg}}} = 0\), and at quadratic order in the fluctuation a, the action has a U(1)-symmetry. This symmetry is lost non-linearly, but is still relevant when looking at quadratic fluctuations about arbitrary backgrounds. Now using the split about the background, \({A^a} = A_{{\rm{bg}}}^a + {a^a}\), this means that up to quadratic order in the fluctuations \(a^a\), the action at zeroth order in the metric fluctuation is of the form [400]

$${\mathcal L}_a^{(2)} = {\bar B^{\mu \alpha \nu \beta}}{f_{\mu \nu}}{f_{\alpha \beta}},$$
(7.60)

with \(f_{\mu\nu} = \partial_\mu a_\nu - \partial_\nu a_\mu\), and \({\bar B^{\mu \alpha \nu \beta}}\) is a set of constant coefficients which depend on \(A_{{\rm{bg}}}^a\). This quadratic action has an accidental U(1)-symmetry which is responsible for projecting out one of the four dofs naively present in the four Stückelberg fluctuations \(a^a\). Had we considered any other potential term, the U(1) symmetry would have been generically lost and all four Stückelberg fields would have been dynamical.

9.2.4.2 Non-symmetric background Stückelberg

If the background configuration is not symmetric, then at every point one needs to perform first an internal Lorentz transformation \(\Lambda(x)\) in the Stückelberg field space, so as to align them with the coordinate basis and recover a symmetric configuration for the background Stückelberg fields. In this new Lorentz frame, the Stückelberg fluctuation is \({\tilde a^\mu} = \Lambda^\mu_{\;\nu}(x)\,{a^\nu}\). As a result, to quadratic order in the Stückelberg fluctuation the part of the ghost-free potential which is independent of the metric fluctuation and its curvature goes symbolically as (7.60) with f replaced by \(f \to \tilde f + (\partial \Lambda){\Lambda ^{- 1}}\tilde a\) (with \({\tilde f_{\mu \nu}} = {\partial _\mu}{\tilde a_\nu} - {\partial _\nu}{\tilde a_\mu}\)). Interestingly, the Lorentz boost \((\partial \Lambda){\Lambda ^{- 1}}\) now plays the role of a mass term for what looks like a gauge field \(\tilde a\). This mass term breaks the U(1) symmetry, but there is still no kinetic term for \(\tilde a_0\), very much as in a Proca theory. This part of the potential is thus manifestly ghost-free (in the sense that it provides a dynamics for only three of the four Stückelberg fields, independently of the background).

Next, we consider the mixing with the metric fluctuation h while still assuming zero curvature. At linear order in h, the ghost-free potential (6.3) goes as follows

$${\mathcal L}_{Ah}^{(2)} = {h^{\mu \nu}}\sum\limits_{n = 1}^3 {{c_n}} X_{\mu \nu}^{(n)} + hF(\partial A + \cdots)\,,$$
(7.61)

where the tensors \(X_{\mu \nu}^{(n)}\) are similar to the ones found in the decoupling limit, but now expressed in terms of the symmetric full four Stückelberg fields rather than just π, i.e., replacing \(\partial_\mu\partial_\nu\pi\) by \(\partial_\mu A_\nu + \partial_\nu A_\mu\) in the respective expressions (8.29), (8.30) and (8.31) for \(X_{\mu \nu}^{(1,2,3)}\). Starting with the symmetric configuration for the Stückelberg fields, then since we are working at the quadratic level in perturbations, one of the \(A_\mu\) in the \(X_{\mu \nu}^{(n)}\) is taken to be the fluctuation \(a_\mu\), while the others are taken to be the background field \(A_\mu ^{{\rm{bg}}}\). As a result, in the first terms \(hX\) in (7.61), \(\partial_0 a_0\) cannot come at the same time as \(h_{00}\) or \(h_{0i}\), and we can thus integrate by parts the time derivative acting on any \(a_0\), leading to a harmless first time derivative on \(h_{ij}\), and no time evolution for \(a_0\).

As for the second type of term in (7.61), since F = 0 on the background field \(A_\mu ^{{\rm{bg}}}\), the second type of term is forced to be proportional to \(f_{\mu\nu}\) and cannot involve any \(\partial_0 a_0\) at all. As a result \(a_0\) is not dynamical, which ensures that the theory is free from the BD ghost.

This part of the argument generalizes easily to non-symmetric background Stückelberg configurations, and the same replacement \(f \to \tilde f + (\partial \Lambda){\Lambda ^{- 1}}\tilde a\) still ensures that \(\tilde a_0\) acquires no dynamics from (7.61).

9.2.4.3 Background curvature

Finally, to complete the argument, we consider the effect of background curvature, so that \(g_{\mu \nu}^{{\rm{bg}}} \ne {\eta _{\mu \nu}}\) with \(g_{\mu \nu}^{{\rm{bg}}} = e_\mu ^a(x)e_\nu ^b(x){\eta _{ab}}\). The space-time curvature is another source of ‘misalignment’ between the coordinates and the Stückelberg fields. To rectify this misalignment, we could go two ways: either perform a local change of coordinates so as to align the background metric \(g_{\mu \nu}^{{\rm{bg}}}\) with the flat reference metric \(\eta_{\mu\nu}\) (i.e., going to a local inertial frame), or the other way around, i.e., express the flat reference metric in terms of the curved background metric, \({\eta _{ab}} = e_a^\mu e_b^\nu g_{\mu \nu}^{{\rm{bg}}}\), using the inverse vielbein, \(e_a^\mu \equiv ({e^{- 1}})_a^\mu\). Then the building block of ghost-free massive gravity is the matrix \({\mathbb X}\), defined previously as

$${\mathbb X}_\nu ^\mu = ({g^{- 1}}\eta)_\nu ^\mu = {g^{\mu \gamma}}(e_{\,a}^\alpha {\partial _\gamma}{\phi ^a})(e_{\,b}^\beta {\partial _\nu}{\phi ^b})g_{\alpha \beta}^{{\rm{bg}}}\,.$$
(7.62)

As a result, the whole formalism derived previously is directly applicable with the only subtlety that the Stückelberg fields \(\phi^a\) should be replaced by their ‘vielbein-dependent’ counterparts, i.e., \({\partial _\mu }{A_\nu } \to g_{\mu \nu }^{{\rm{bg}}} - g_{\nu \alpha }^{{\rm{bg}}}e_{\;a}^\alpha {\partial _\mu }{\phi ^a}\). In terms of the Stückelberg field fluctuation \(a^a\), this implies the replacement \({a^a} \to {\bar a_\mu} = g_{\mu \nu}^{{\rm{bg}}}e_{\;a}^\nu {a^a}\), and symbolically, \(f \to \bar f + (\partial \Sigma){\Sigma ^{- 1}}\bar a\), with \(\Sigma = g^{\rm bg}e\). The situation is thus the same as when we were dealing with a non-symmetric Stückelberg background configuration: after integration by parts (which might involve harmless curvature contributions), the potential can be written in a way which never involves any time derivative on \(\bar a_0\). As a result, \(\bar a_\mu\) plays the role of an effective Proca vector field which only propagates three degrees of freedom, and this about any curved background metric. The beauty of this argument lies in the correct identification of the proper degrees of freedom when dealing with a curved background metric.

9.3 Absence of ghost in the vielbein formulation

Finally, we can also prove the absence of ghost for dRGT in the vielbein formalism, either directly at the level of the Lagrangian in some special cases as shown in [171] or in full generality in the Hamiltonian formalism, as shown in [314]. The latter proof also works in all generality for a multi-gravity theory and will thus be presented in more depth in what follows, but we first focus on a special case presented in Ref. [171].

Let us start with massive gravity in the vielbein formalism (6.1). As was the case in Part II, we work with the symmetric vielbein condition, \(e_\mu ^af_\nu ^b{\eta _{ab}} = e_\nu ^af_\mu ^b{\eta _{ab}}\). For simplicity we specialize further to the case where \(f_\mu ^a = \delta _\mu ^a\), so that the symmetric vielbein condition imposes \(e_{a\mu} = e_{\mu a}\). Under this condition, the vielbein contains as many independent components as the metric. The symmetric vielbein condition ensures that one is able to reformulate the theory in a metric language. In d spacetime dimensions, there are a priori d (d + 1)/2 independent components in the symmetric vielbein.

Varying the action (6.1) with respect to the vielbein leads to the modified Einstein equation,

$${G_a} = {t_a} = - {{{m^2}} \over 2}{\varepsilon _{abcd}}\left({4{c_0}\,{e^b} \wedge {e^c} \wedge {e^d} + 3{c_1}\,{e^b} \wedge {e^c} \wedge {f^d}} \right.$$
(7.63)
$$\left. {+ 2{c_2}\,{e^b} \wedge {f^c} \wedge {f^d} + {c_3}\,{f^b} \wedge {f^c} \wedge {f^d}} \right),$$
(7.64)

with \(G_a = \varepsilon_{abcd}\,\omega^{bc}\wedge e^d\). From the Bianchi identity, \({\mathcal D}{G_a} = {\rm{d}}{G_a} - \omega _{\;a}^b{G_b} = 0\), we infer the d constraints

$${\mathcal D}{t_a} = {\rm{d}}{t_a} - \omega _{\;a}^b{t_b} = 0\,,$$
(7.65)

leading to d (d − 1)/2 independent components in the vielbein. This is still one component too many, unless an additional constraint is found. The idea behind the proof in Ref. [171] is then to use the Bianchi identities to infer an additional constraint of the form,

$${m^a} \wedge {G_a} = {m^a} \wedge {t_a}\,,$$
(7.66)

where \(m^a\) is an appropriate one-form which depends on the specific coefficients of the theory. Such a constraint is present at the linear level for Fierz-Pauli massive gravity, and it was further shown in Ref. [171] that special choices of coefficients for the theory lead to remarkably simple analogous relations fully non-linearly. To give an example, we consider all the coefficients \(c_n\) to vanish except \(c_1 \ne 0\). In that case the Bianchi identity (7.65) implies

$${\mathcal D}{t_a} = 0\qquad \Rightarrow \qquad \omega _{\;cb}^b = 0\,,$$
(7.67)

where, similarly to (5.2), the torsionless connection is given in terms of the vielbein as

$$\omega _\mu ^{ab} = {1 \over 2}e_\mu ^c(o_{\;\;\;\;c}^{ab} - o_c^{\;ab} - o_{\;\;c}^{b\;\;a})\,,$$
(7.68)

with \({o^{ab}}_c = 2{e^{a\mu}}{e^{b\nu}}{\partial _{\left[ \mu \right.}}{e_{\left. \nu \right]}}_c\). The Bianchi identity (7.67) then implies \(e_a^{\,\,\,\,b}{\partial _{\left[ b \right.}}e_{\left. a \right]}^a = 0\), so that we obtain an extra constraint of the form (7.66) with \(m^a = e^a\). Ref. [171] derived similar constraints for other parameters of the theory.

9.4 Absence of ghosts in multi-gravity

We now turn to the proof for the absence of ghost in multi-gravity and follow the vielbein formulation of Ref. [314]. In this subsection we use the notation that uppercase Latin indices represent d-dimensional Lorentz indices, A, B, ⋯ = 0, ⋯, d − 1, while lowercase Latin indices represent the (d − 1)-dimensional Lorentz indices along the space directions after ADM decomposition, a, b, ⋯ = 1, ⋯, d − 1. Greek indices represent d-dimensional spacetime indices μ, ν, ⋯ = 0, ⋯, d − 1, while the ‘middle’ of the Latin alphabet indices i, j, ⋯ represent pure space indices i, j, ⋯ = 1, ⋯, d − 1. Finally, capital indices label the metric and span over I, J, K, ⋯ = 1, ⋯, N.

Let us start with N non-interacting spin-2 fields. The theory then has N copies of coordinate transformation invariance (the coordinate system associated with each metric can be changed separately), as well as N copies of Lorentz invariance. At this level, for each vielbein \(e_{(J)}\), J = 1, ⋯, N, we may use part of the Lorentz freedom to work in the upper triangular form for the vielbein,

$${e_{(J)}}_{\;\mu}^A = \left({\begin{array}{*{20}c} {{N_{(J)}}} & {N_{(J)}^i{e_{(J)}}_{\;i}^a} \\ 0 & {{e_{(J)}}_{\;i}^a} \\ \end{array}} \right)\,,\qquad {e_{(J)}}_{\;A}^\mu = \left({\begin{array}{*{20}c} {{N_{(J)}}^{- 1}} & 0 \\ {- N_{(J)}^i{N_{(J)}}^{- 1}} & {{e_{(J)}}_{\;a}^i} \\ \end{array}} \right)\,,$$
(7.69)

leading to the standard ADM decomposition for the metric,

$$\begin{array}{*{20}c} {{g_{(J)\mu \nu}}\;{\rm{d}}{x^\mu}\;{\rm{d}}{x^\nu} = {e_{(J)}}_{\;\mu}^A{e_{(J)}}_{\;\nu}^B{\eta _{AB}}{\rm{d}}{x^\mu}{\rm{d}}{x^\nu}\quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad} \\ {= - {N_{(J)}}^2{\rm{d}}{t^2} + {\gamma _{(J)}}_{ij}\left({{\rm{d}}{x^i} + N_{(J)}^i{\rm{d}}t} \right)\;({\rm{d}}{x^j} + {N_{(J)}}^j{\rm{d}}t)\,,} \\ \end{array}$$
(7.70)

with the three-dimensional metric \({\gamma _{(J)}}_{ij} = {e_{(J)}}_{\;i}^a{e_{(J)}}_{\;j}^b{\delta _{ab}}\). Starting with non-interacting fields, we simply take N copies of the GR action,

$${L_{N{\rm{GR}}}} = \int {\rm{d}} t\sum\limits_{J = 1}^N {\sqrt {- {g_{(J)}}}} {R_{(J)}}\,,$$
(7.71)

and the Hamiltonian in terms of the vielbein variables then takes the form (7.6)

$${{\mathcal H}_{N{\rm{GR}}}} = \int {\;{{\rm{d}}^d}} x\sum\limits_{J = 1}^N {\left({{\pi _{(J)}}_a^{\;i}\dot e_{(J)\;\;i}^{\;\;\;\;\;\;a} + {N_{(J)}}{{\mathcal C}_{(J)}}_0 + N_{(J)}^i{{\mathcal C}_{(J)}}_i - {1 \over 2}\lambda _{(J)}^{ab}{{\mathcal P}_{(J)}}_{ab}} \right)} \,,$$
(7.72)

where \({\pi _{(J)}}_a^{\,\,i}\) is the conjugate momentum associated with the vielbein \({e_{(J)}}_{\,\,i}^a\) and the constraints \({{\mathcal C}_{(J)0,i}} = {{\mathcal C}_{0,i}}({e_{(J)}},{\pi _{(J)}})\) are the ones mentioned previously in (7.6) (now expressed in the vielbein variables) and are related to diffeomorphism invariance. In the vielbein language there are an additional d (d − 1)/2 primary constraints for each vielbein field

$${{\mathcal P}_{(J)}}_{ab} = {e_{(J)}}_{\left[ {ai} \right.}\;{\pi _{(J)}}_{\left. b \right]}^{\;\;i},$$
(7.73)

related to the residual local Lorentz symmetry still present after fixing the upper triangular form for the vielbeins.

Now, rather than using part of each of the N Lorentz frames to set all the vielbeins in the upper triangular form (7.69), we only use one Lorentz boost to set one of the vielbeins in that form, say \(e_{(1)}\), and ‘unboost’ the N − 1 other frames, so that for any of the other vielbeins one has

$${e_{(J)}}_{\;\mu}^A = \left({\begin{array}{*{20}c} {{N_{(J)}}{{\tilde \gamma}_{(J)}} + N_{(J)}^i{e_{(J)}}_{\;i}^a{p_{(J)}}_a} & {{N_{(J)}}p_{(J)}^a + N_{(J)}^i{e_{(J)}}_{\;i}^b{S_{(J)}}_b^a} \\ {{e_{(J)}}_{\;i}^a{p_{(J)}}_a} & {{e_{(J)}}_{\;i}^b{S_{(J)}}_b^a} \\ \end{array}} \right)$$
(7.74)
$${S_{(J)}}_b^a = \delta _b^a + {({\tilde \gamma _{(J)}} + 1)^{- 1}}p_{(J)}^a{p_{(J)}}_b$$
(7.75)
$${\tilde \gamma _{(J)}} = \sqrt {1 + {p_{(J)}}_ap_{(J)}^a}$$
(7.76)

where \({p_{(J)}}_a\) is the boost that would bring that vielbein into the upper triangular form.

We now consider arbitrary interactions between the N fields of the form (6.1),

$${L_{N\,{\rm{int}}}} = \sum\limits_{{J_1}, \cdots ,{J_d} = 1}^N {{\alpha _{{J_1}, \cdots ,{J_d}}}} {\varepsilon _{{a_1} \cdots {a_d}}}\,e_{({J_1})}^{{a_1}} \wedge \cdots \wedge e_{({J_d})}^{{a_d}}\,,$$
(7.77)

where for concreteness we assume d ≤ N, otherwise the formalism is exactly the same (there is some redundancy in this formulation, i.e., some interactions are repeated, but this has no consequence for the argument). Since the vielbeins \({e_{(J)}}_0^A\) are linear in their respective shifts and lapse \({N_{(J)}},N_{(J)}^i\), and the vielbeins \({e_{(J)}}_i^A\) do not depend on any shift or lapse, it is easy to see that the general set of interactions (7.77) leads to a Hamiltonian which is also linear in every shift and lapse,

$${{\mathcal H}_{N\,{\rm{int}}}} = \sum\limits_{J = 1}^N {\left({{N_{(J)}}{\mathcal C}_{(J)}^{{\rm{int}}}(e,p) + N_{(J)}^i{\mathcal C}_{(J)\,i}^{{\rm{int}}}(e,p)} \right)} \,.$$
(7.78)

Indeed the wedge structure of (6.1) or (7.77) ensures that there is one and only one vielbein with time-like index \({e_{(J)}}_0^A\) for every term \({\varepsilon _{{a_1} \ldots {a_d}}}e_{({J_1})}^{{a_1}}\wedge \ldots \wedge e_{({J_d})}^{{a_d}}\).

Notice that for the interactions, the terms \({\mathcal C}_{(J)0,i}^{{\rm{int}}}\) can depend on all the N vielbeins \(e_{(J')}\) and all the N − 1 ‘boosts’ \(p_{(J')}\) (as mentioned previously, part of one Lorentz frame is set so that \(p_{(1)} = 0\) and \(e_{(1)}\) is in the upper triangular form). Following the procedure of [314], we can now solve for the N − 1 remaining boosts by using (N − 1) of the N shift equations of motion

$${{\mathcal C}_{(J)\,i}}(e,\pi) + {\mathcal C}_{(J)\,i}^{{\rm{int}}}(e,p) = 0\qquad \forall \;\;J = 1, \cdots ,N\,.$$
(7.79)

Now assuming that all N vielbeins are interacting (i.e., there is no vielbein \(e_{(J)}\) which does not appear at least once in the interactions (7.77) which mix different vielbeins), the shift equations (7.79) will involve all the N − 1 boosts and can be solved for them without spoiling the linearity in any of the N lapses \(N_{(J)}\). As a result, the N − 1 lapses \(N_{(J)}\) for J = 2, ⋯, N are Lagrange multipliers for (N − 1) first-class constraints. The lapse \(N_{(1)}\) for the first vielbein combines with the remaining shift \(N_{(1)}^i\) to generate the one remaining copy of diffeomorphism invariance.

We now have all the ingredients to count the number of dofs in phase space: We start with \(d^2\) components in each of the N vielbeins \(e_{(J)i}^a\) and associated conjugate momenta, that is a total of \(2 \times d^2 \times N\) phase space variables. We then have 2 × d (d − 1)/2 × N constraints associated with the \(\lambda _{(J)}^{ab}\). There is one copy of diffeomorphism invariance removing 2 × (d + 1) phase space dofs (with Lagrange multipliers \(N_{(1)}\) and \(N_{(1)}^i\)) and (N − 1) additional first-class constraints with Lagrange multipliers \(N_{(J \ge 2)}\) removing 2 × (N − 1) dofs. As a result we end up with

$$\begin{array}{*{20}c} {\left({2 \times {d^2} \times N} \right) - 2 \times {{d(d - 1)} \over 2} \times N - 2 \times (d + 1) - 2 \times (N - 1)} \\ {= \left({{d^2}N - 2N + d(N - 2)} \right){\rm{phase}}\;{\rm{space}}\;{\rm{dofs}}\;\quad \quad \quad \quad \quad \quad \;\;} \\ {= {1 \over 2}\left({{d^2}N - 2N + d(N - 2)} \right){\rm{field}}\;{\rm{space}}\;{\rm{dofs}}\quad \quad \quad \quad \quad \quad \;} \\ \end{array}$$
(7.80)
$$\begin{array}{*{20}c} {= {1 \over 2}\left({{d^2} - d - 2} \right){\rm{dofs}}\;{\rm{for}}\;{\rm{a}}\;{\rm{massless}}\;{\rm{spin - 2}}\;{\rm{field}}\quad \quad \quad \quad \quad \quad \quad \;} \\ {+ {1 \over 2}\left({{d^2} + d - 2} \right) \times (N - 1)\;{\rm{dofs}}\;{\rm{for}}\;(N - 1)\;{\rm{massive}}\;{\rm{spin - 2}}\;{\rm{fields}}\,,} \\ \end{array}$$
(7.81)

which is the correct counting in (d + 1) spacetime dimensions, and the theory is thus free of any BD ghost.
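The arithmetic of this counting is easy to check mechanically. The sketch below simply transcribes the bookkeeping described in the text (phase space variables minus constraints) and compares it with the sum of one massless and N − 1 massive spin-2 fields; the function names are hypothetical and `d` denotes the number of space dimensions, as in the counting above.

```python
def multigravity_dofs(d: int, n_fields: int) -> int:
    """Field-space dofs obtained from the phase-space counting of (7.80)."""
    phase_space = 2 * d * d * n_fields                # vielbeins e^a_i and their momenta
    phase_space -= 2 * (d * (d - 1) // 2) * n_fields  # constraints associated with lambda^{ab}
    phase_space -= 2 * (d + 1)                        # one copy of diffeomorphism invariance
    phase_space -= 2 * (n_fields - 1)                 # N - 1 remaining first-class constraints
    return phase_space // 2


def massless_spin2_dofs(d: int) -> int:
    """Dofs of a massless spin-2 field in d + 1 spacetime dimensions."""
    return (d * d - d - 2) // 2


def massive_spin2_dofs(d: int) -> int:
    """Dofs of a massive spin-2 field in d + 1 spacetime dimensions."""
    return (d * d + d - 2) // 2


# Check the decomposition (7.81) over a range of dimensions and field numbers.
for d in range(2, 8):
    for n_fields in range(1, 6):
        expected = massless_spin2_dofs(d) + (n_fields - 1) * massive_spin2_dofs(d)
        assert multigravity_dofs(d, n_fields) == expected

print(multigravity_dofs(3, 2))  # bi-gravity in four spacetime dimensions
```

For d = 3 and N = 2 this reproduces the seven dofs of bi-gravity in four spacetime dimensions (two for the massless graviton and five for the massive one).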

10 Decoupling Limits

10.1 Scaling versus decoupling

Before moving to the decoupling of massive gravity and bi-gravity, let us make a brief interlude concerning the correct identification of degrees of freedom. The Stückelberg trick used previously to identify the correct degrees of freedom works in all generality, but care must be used when taking a “decoupling limit” (i.e., scaling limit) as will be done in Section 8.2.

Imagine the following gauge field theory

$${\mathcal L} = - {1 \over 2}{m^2}{A_\mu}{A^\mu}\,,$$
(8.1)

i.e., the Proca mass term without any kinetic Maxwell term for the gauge field. Since there are no dynamics in this theory, there are no degrees of freedom. Nevertheless, one could still proceed and use the same split \({A_\mu} = A_\mu ^ \bot + {\partial _\mu}\chi/m\) as performed previously,

$${\mathcal L} = - {1 \over 2}{m^2}A_\mu ^ \bot {A^{\bot \,\mu}} + m({\partial _\mu}{A^{\bot \,\mu}})\chi - {1 \over 2}{(\partial \chi)^2}\,,$$
(8.2)

so as to introduce what appears to be a kinetic term for the mode χ. At this level the theory is still invariant under \(\chi \to \chi + m\xi\) and \(A_\mu ^ \bot \to A_\mu ^ \bot - {\partial _\mu}\xi\), and so while there appears to be a dynamical degree of freedom χ, the symmetry makes that degree of freedom unphysical, so that (8.2) still propagates no physical degree of freedom.

Now consider the m → 0 scaling limit of (8.2) while keeping \(A_\mu ^ \bot\) and χ finite. In that scaling limit, the theory reduces to
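For completeness, the expansion behind (8.2) can be sketched as follows; the only steps are the substitution of the split into (8.1) and one integration by parts, with total derivatives dropped:

$$ - {1 \over 2}{m^2}\left({A_\mu ^ \bot + {{{\partial _\mu}\chi} \over m}} \right)\left({{A^{\bot \,\mu}} + {{{\partial ^\mu}\chi} \over m}} \right) = - {1 \over 2}{m^2}A_\mu ^ \bot {A^{\bot \,\mu}} - m{A^{\bot \,\mu}}{\partial _\mu}\chi - {1 \over 2}{(\partial \chi)^2}\,,$$

and \(- m{A^{\bot \,\mu}}{\partial _\mu}\chi\) becomes \(+ m({\partial _\mu}{A^{\bot \,\mu}})\chi\) after integration by parts. The combination \(A_\mu ^ \bot + {\partial _\mu}\chi/m\) is manifestly unchanged when \(\chi\) is shifted by \(m\xi\) and \(A_\mu ^ \bot\) by \(- {\partial _\mu}\xi\), which is the origin of the symmetry that keeps χ unphysical.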

$${{\mathcal L}_{m \rightarrow 0}} = - {1 \over 2}{(\partial \chi)^2}\,,$$
(8.3)

i.e., one degree of freedom with no symmetry, which implies that the theory (8.3) propagates one degree of freedom. This is correct and thus means that (8.3) is not a consistent decoupling limit of (8.2), since the number of degrees of freedom is different already at the linear level. In the rest of this review, we will call a decoupling limit a specific type of scaling limit which preserves the same number of physical propagating degrees of freedom in the linear theory. As suggested by the name, a decoupling limit is a special kind of limit in which some of the degrees of freedom of the original theory might decouple from the rest, but the total number of degrees of freedom remains identical. For the theory (8.2), this means that the scaling ought to be taken not with \(A_\mu ^ \bot\) fixed but rather with \(\tilde A_\mu ^ \bot = m A_\mu ^ \bot\) fixed, so that every term in (8.2) stays finite. This is indeed a consistent rescaling which leads to finite contributions in the limit m → 0,

$${{\mathcal L}_{m \rightarrow 0}} = - {1 \over 2}\tilde A_\mu ^ \bot {\tilde A^{\bot \,\mu}} + ({\partial _\mu}{\tilde A^{\bot \,\mu}})\chi - {1 \over 2}{(\partial \chi)^2}\,,$$
(8.4)

which clearly propagates no degrees of freedom.

This procedure is true in all generality: a decoupling limit is a special scaling limit where all the fields in the original theory are scaled with the highest possible power of the scale in such a way that the decoupling limit is finite.

A decoupling limit of a theory never changes the number of physical degrees of freedom of a theory. At best it ‘decouples’ some of them in such a way that they are inaccessible from another sector.

Before looking at the massive gravity limit of bi-gravity and other decoupling limits of massive and bi-gravity, let us start by describing the different scaling limits that can be taken. We start with a bi-gravity theory where the two spin-2 fields have respective Planck scales Mg and Mf and the interactions between the two metrics arises at the scale m. In order to stick to the relevant points we perform the analysis in four dimensions, but the following arguments extend trivially to arbitrary dimensions.

  • Non-interacting Limit: The most natural question to ask is what happens in the limit where the interactions between the two fields are ‘switched off’, i.e., when sending the scale m → 0 (the limit m → 0 is studied more carefully in Sections 8.3 and 8.4). In that case, if the two Planck scales \(M_{g,f}\) remain fixed as m → 0, we then recover two massless non-interacting spin-2 fields (each carrying 2 helicity-2 modes), in addition to a decoupled sector containing a helicity-0 mode and a helicity-1 mode. In bi-gravity matter fields couple only to one metric, and this remains the case in the limit m → 0, so that the two massless spin-2 fields live in two fully decoupled sectors even when matter is included.

  • Massive Gravity: Alternatively, we may look at the limit where one of the spin-2 fields (say fμν) decouples. This can be studied by sending its respective Planck scale to infinity. The resulting limit corresponds to a massive spin-2 field (carrying five dofs) and a decoupled massless spin-2 field carrying 2 dofs. This is nothing other than the massive gravity limit of bi-gravity (which includes a fully decoupled massless sector).

    If one considers matter coupling to the metric which scales in such a way that a non-trivial solution for fμν survives in the \({M_f} \to \infty \,\lim {\rm{it}}\,{f_{\mu v}} \to {\overset - f _{\mu v}}\), we then obtain a massive gravity sector on an arbitrary non-dynamical reference metric \({\overset - f _{\mu v}}\). The dynamics of the massless spin-2 field fully decouples from that of the massive sector.

  • Other Decoupling Limits Finally, one can look at combinations of the previous limits, and the resulting theory depends on how fast Mf, Mg → ∞ compared to how fast m → 0. For instance if one takes the limit Mf, Mg → ∞ and m → 0, while keeping both Mg/Mf and \(\Lambda _3^3 = {M_g}{m^2}\) fixed, then we obtain what is called the Λ3-decoupling limit of bi-gravity (derived in Section 8.4), where the dynamics of the two helicity-2 modes (which are both massless in that limit), and that of the helicity-1 and -0 modes can be followed without keeping track of the standard non-linearities of GR.

    If on top of this Λ3-decoupling limit one further takes Mf → ∞, then one of the massless spin-2 fields fully decouples (no communication between that field and the helicity-1 and -0 modes). If, on the other hand, we take the additional limit m → 0 on top of the Λ3-decoupling limit, then the helicity-0 and -1 modes fully decouple from both helicity-2 modes.

In all of these decoupling limits, the number of dofs remains the same as in the original theory; some fields are simply decoupled from the rest of the standard gravitational sector. This prevents any communication between these decoupled fields and the gravitational sector, and so from the point of view of the gravitational sector it appears as if these decoupled fields did not exist.

It is worth stressing that all of these limits are perfectly sensible and lead to sensible theories (from a theoretical viewpoint). This is important since if one of these scaling limits led to a pathological theory, it would have severe consequences for the parent bi-gravity theory itself.

Similar decoupling limits could be taken in multi-gravity: out of N interacting spin-2 fields, we could obtain for instance N decoupled massless spin-2 fields and 3(N − 1) decoupled dofs in the helicity-0 and -1 modes.
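The statement that no dofs are lost in these limits can be checked with simple bookkeeping. The sketch below assumes that N interacting spin-2 fields propagate one massless graviton (2 dofs) and N − 1 massive ones (5 dofs each), and that each decoupled helicity-0/-1 sector carries 1 + 2 = 3 dofs; the function names are purely illustrative.

```python
def dofs_interacting(n_fields: int) -> int:
    """Dofs of N interacting spin-2 fields, assuming one massless
    graviton (2 dofs) and N - 1 massive gravitons (5 dofs each)."""
    if n_fields < 1:
        raise ValueError("need at least one spin-2 field")
    return 2 + 5 * (n_fields - 1)


def dofs_decoupled(n_fields: int) -> int:
    """Dofs after decoupling: N massless spin-2 fields (2 dofs each)
    plus 3(N - 1) dofs in the helicity-1 (2) and helicity-0 (1) modes."""
    if n_fields < 1:
        raise ValueError("need at least one spin-2 field")
    return 2 * n_fields + 3 * (n_fields - 1)


if __name__ == "__main__":
    for n in (2, 3, 5):
        print(n, dofs_interacting(n), dofs_decoupled(n))
```

For bi-gravity (N = 2) both counts give 5 + 2 = 7, the massive plus massless graviton content quoted above.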

In what follows we focus on the massive gravity limit of bi-gravity, when Mf → ∞.

10.2 Massive gravity as a decoupling limit of bi-gravity

10.2.1 Minkowski reference metric

In the following two sections we review the decoupling arguments given previously in the literature, (see for instance [154]). We start with the theory of bi-gravity presented in Section 5.4 with the action (5.43)

$$\begin{array}{*{20}c} {{{\mathcal L}_{{\rm{bi - gravity}}}} = {{M_g^2} \over 2}\sqrt {- g} R[g] + {{M_f^2} \over 2}\sqrt {- f} R[f] + {1 \over 4}{m^2}M_{{\rm{Pl}}}^2\sqrt {- g} \,{{\mathcal L}_m}(g,f)} \\ {+ \sqrt {- g} {\mathcal L}_g^{({\rm{matter}})}({g_{\mu \nu}},{\psi _g}) + \sqrt {- f} {\mathcal L}_f^{({\rm{matter}})}({f_{\mu \nu}},{\psi _f})\,,\quad \quad} \\ \end{array}$$
(8.5)

with \({{\mathcal L}_m}(g,f) = \sum\nolimits_{n = 0}^4 {{\alpha _n}{{\mathcal L}_n}[{\mathcal K}(g,f)]}\) as defined in (6.3) and where \({\mathcal K}_\nu^\mu = \delta _\nu^\mu - \sqrt {{g^{\mu \alpha}}{f_{\alpha \nu}}}\). We also allow for the coupling to matter with different species ψg,f living on each metric.

We now consider matter fields ψf such that fμν = ημν is a solution to the equations of motion (so for instance there is no overall cosmological constant living on the metric fμν). In that case we can write that metric fμν as

$${f_{\mu \nu}} = {\eta _{\mu \nu}} + {1 \over {{M_f}}}{\chi _{\mu \nu}}\,,$$
(8.6)

We may now take the limit Mf → ∞ while keeping the scales Mg and m and all the fields χ, g, ψf,g fixed. We then recover massive gravity plus a completely decoupled massless spin-2 field χμν, and a fully decoupled matter sector ψf living on flat space

$$\begin{array}{*{20}c} {{{\mathcal L}_{{\rm{bi - gravity}}}}\overset {{M_f} \rightarrow \infty}{\rightarrow}{{\mathcal L}_{{\rm{MG}}}}(g,\eta) + \sqrt {- g} {\mathcal L}_g^{({\rm{matter}})}({g_{\mu \nu}},{\psi _g})\quad \quad \quad \quad \quad\;\;} \\ {+ {1 \over 2}{\chi ^{\mu \nu}}\hat {\mathcal E} _{\mu \nu}^{\alpha \beta}{\chi _{\alpha \beta}} + {\mathcal L}_f^{({\rm{matter}})}({\eta _{\mu \nu}},{\psi _f})\,,} \\ \end{array}$$
(8.7)

where the massive gravity Lagrangian \({\mathcal L}_{\rm MG}\) is given in (6.3). That massive gravity Lagrangian remains fully non-linear in this limit and is expressed in terms of the full metric gμν and the reference metric ημν. While the metric fμν is ‘frozen’ in this limit, we emphasize however that the massless spin-2 field χμν is itself not frozen: its dynamics is captured through the kinetic term \({\chi^{\mu \nu}}\hat{\mathcal E}_{\mu \nu}^{\alpha \beta}{\chi_{\alpha \beta}}\), but that spin-2 field decouples from its own matter sector ψf (although this can be accommodated by scaling the matter fields ψf accordingly in the limit Mf → ∞ so as to maintain some interactions).

At the level of the equations of motion, in the limit Mf → ∞ we obtain the massive gravity modified Einstein equation for gμν, the free massless linearized Einstein equation for χμν, which fully decouples, and the equation of motion for all the matter fields ψf on flat spacetime (see also Ref. [44]).

10.2.2 (A)dS reference metric

To consider massive gravity with an (A)dS reference metric as a limit of bi-gravity, we include a cosmological constant for the metric f into (8.5)

$${{\mathcal L}_{{\rm{CC}},{\rm{f}}}} = - M_f^2\int {{{\rm{d}}^4}} x\sqrt {- f} {\Lambda _f}\,.$$
(8.8)

There can also be in principle another cosmological constant living on top of the metric gμν, but this can be included into the potential \({\mathcal U}(g,f)\). The background field equations of motion are then given by

$$M_f^2{G_{\mu \nu}}[f] + {{{m^2}M_{{\rm{Pl}}}^2} \over {4\sqrt {- f}}}\left({{\delta \over {\delta {f^{\mu \nu}}}}\sqrt {- g} \,{\mathcal U}(g,f)} \right) = {T_{\mu \nu}}({\psi _f}) - M_f^2{\Lambda _f}{f_{\mu \nu}}$$
(8.9)
$$M_{{\rm{Pl}}}^2{G_{\mu \nu}}[g] + {{{m^2}M_{{\rm{Pl}}}^2} \over {4\sqrt {- g}}}\left({{\delta \over {\delta {g^{\mu \nu}}}}\sqrt {- g} \,{\mathcal U}(g,f)} \right) = {T_{\mu \nu}}({\psi _g})\,.$$
(8.10)

Taking now the limit Mf → ∞ while keeping the cosmological constant Λf fixed, the background solution for the metric fμν is nothing other than dS (or AdS depending on the sign of Λf). So we can now express the metric fμν as

$${f_{\mu \nu}} = {\gamma _{\mu \nu}} + {1 \over {{M_f}}}{\chi _{\mu \nu}}\,,$$
(8.11)

where γμν is the dS metric with Hubble parameter \(H = \sqrt {{\Lambda _f}/3}\). Taking the limit Mf → ∞, we recover massive gravity on (A)dS plus a completely decoupled massless spin-2 field χμν,

$$\begin{array}{*{20}c} {{{\mathcal L}_{{\rm{bi - gravity}}}} - M_f^2\int {{{\rm{d}}^4}} x\sqrt {- f} {\Lambda _f}\;\overset {{M_f} \rightarrow \infty}{\rightarrow} {{M_{{\rm{Pl}}}^2} \over 2}\sqrt {- g} R + {{{m^2}} \over 4}{\mathcal U}(g,\gamma)\quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad} \\ {+ {1 \over 2}{\chi ^{\mu \nu}}\hat \varepsilon _{\mu \nu}^{\alpha \beta}{\chi _{\alpha \beta}}\,,} \\ \end{array}$$
(8.12)

where once again the scales MPl and m are kept fixed in the limit Mf → ∞. γμν now plays the role of a non-trivial reference metric for massive gravity. This corresponds to a theory of massive gravity on a more general reference metric as presented in [296]. Here again the Lagrangian for massive gravity is given in (6.3) with now \({\mathcal K}_\nu^\mu (g) = \delta _\nu^\mu - \sqrt {{g^{\mu \alpha}}{\gamma _{\alpha \nu}}}\). The massive gravity action remains fully non-linear in the limit Mf → ∞ and is expressed solely in terms of the full metric gμν and the reference metric γμν, while the excitations χμν for the massless graviton remain dynamical but fully decouple from the massive sector.

10.2.3 Arbitrary reference metric

As is already clear from the previous discussion, to recover massive gravity on a non-trivial reference metric as a limit of bi-gravity, one needs to scale the matter Lagrangian that couples to what will become the reference metric (say the metric f for definiteness) in such a way that the Riemann curvature of f remains finite in that decoupling limit. For a macroscopic description of the matter living on f this is in principle always possible. For instance one can consider a point source of mass MBH living on the metric f. Then, taking the limit Mf, MBH → ∞ while keeping the ratio MBH/Mf fixed leads to a theory of massive gravity on a Schwarzschild reference metric and a decoupled massless graviton. However, some care needs to be taken to see how this works when the dynamics of the matter sourcing f is included.

As soon as the dynamics of the matter field is considered, one has to send the scale of that field to infinity so that it maintains some nonzero effect on f in the limit Mf → ∞ i.e.,

$$\underset {{M_f} \rightarrow \infty} {\lim} {1 \over {M_f^2}}{T^{\mu \nu}} = \underset {{M_f} \rightarrow \infty} {\lim} {1 \over {\sqrt {- f} M_f^2}}{{\delta \sqrt {- f} {\mathcal L}_f^{({\rm{matter}})}} \over {\delta {f_{\mu \nu}}}} \rightarrow {\rm{finite}}\,.$$
(8.13)

Nevertheless, this can be achieved in such a way that the fluctuations of the matter fields remain finite and decouple in the limit Mf → ∞. We note that this scaling is the key difference between the decoupling limit of bi-gravity on a Minkowski reference metric derived in Section 8.2.1, where the matter fields scale as \(\lim_{M_f \to \infty} {1 \over {M_f^2}}{T^{\mu \nu}} \to 0\), and the decoupling limit of bi-gravity on an arbitrary reference metric derived here.

As an example, suppose that the Lagrangian for the matter (for example a scalar field) sourcing the f metric is

$${\mathcal L}_f^{({\rm{matter}})} = \sqrt {- f} \left({- {1 \over 2}{f^{\mu \nu}}{\partial _\mu}\chi {\partial _\nu}\chi - {V_0}F\left({{\chi \over \lambda}} \right)} \right)$$
(8.14)

where F(X) is an arbitrary dimensionless function of its argument. Then choosing χ to take the form

$$\chi = {M_f}\bar \chi + \delta \chi \,,$$
(8.15)

and rescaling \({V_0} = M_f^2{\bar V_0}\) and \(\lambda = {M_f}\bar \lambda\), then on taking the limit Mf → ∞ keeping \(\bar \chi\), \(\delta \chi\) and \(\bar \lambda\) fixed, since

$${\mathcal L}_f^{({\rm{matter}})} \rightarrow M_f^2\sqrt {- f} \left({- {1 \over 2}{f^{\mu \nu}}{\partial _\mu}\bar \chi {\partial _\nu}\bar \chi - {{\bar V}_0}F\left({{{\bar \chi} \over {\bar \lambda}}} \right)} \right) + {\rm{fluctuations}}\,,$$
(8.16)

we find that the background stress energy blows up in such a way that \({1 \over {M_f^2}}{T^{\mu \nu}}\) remains finite and nontrivial, and in addition the background equations of motion for \(\bar \chi\) remain well-defined and nontrivial in this limit,

$${\square_f}\bar \chi = {{{{\bar V}_0}} \over {\bar \lambda}}F\prime\left({{{\bar \chi} \over {\bar \lambda}}} \right)\,.$$
(8.17)

This implies that even in the limit Mf → ∞, fμν can remain consistently as a nontrivial sourced metric which is a solution of some dynamical equations sourced by matter. In addition the action for the fluctuations δχ asymptotes to a free theory which is coupled only to the fluctuations of fμν, which are themselves completely decoupled from the fluctuations of the metric g and matter fields coupled to g.
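To make the scaling explicit, one can expand the matter Lagrangian (8.14) in powers of Mf after inserting (8.15) and the rescaled \(\bar V_0\), \(\bar\lambda\). The following is only a sketch: it assumes that F admits a Taylor expansion around \(\bar\chi/\bar\lambda\), holds the metric f fixed, and writes F, F′, F″ for the function and its derivatives evaluated at \(\bar\chi/\bar\lambda\):

$$\begin{array}{*{20}c} {{\mathcal L}_f^{({\rm{matter}})} = \sqrt{-f}\Big[ M_f^2\Big(- {1 \over 2}f^{\mu\nu}\partial_\mu\bar\chi\,\partial_\nu\bar\chi - \bar V_0 F\Big) + M_f\Big(- f^{\mu\nu}\partial_\mu\bar\chi\,\partial_\nu\delta\chi - {{\bar V_0} \over {\bar\lambda}}F'\,\delta\chi\Big)} \\ {- {1 \over 2}f^{\mu\nu}\partial_\mu\delta\chi\,\partial_\nu\delta\chi - {{\bar V_0} \over {2\bar\lambda^2}}F''\,\delta\chi^2 + {\mathcal O}(M_f^{-1})\Big]\,.} \\ \end{array}$$

The first bracket is the background piece in (8.16). The term linear in δχ is, after integration by parts, proportional to the background equation (8.17) and so drops out on-shell. What survives as Mf → ∞ is quadratic in δχ, i.e., a free theory with an effective mass set by \(\bar V_0 F''/\bar\lambda^2\), while all cubic and higher self-interactions of δχ are suppressed by inverse powers of Mf.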

As a result, massive gravity with an arbitrary reference metric can be seen as a consistent limit of bi-gravity in which the additional degrees of freedom in the metric f and in the matter that sources the background decouple. Thus all solutions of massive gravity may be seen as Mf → ∞ decoupling limits of solutions of bi-gravity. This will be discussed in more depth in Section 8.4. For an arbitrary reference metric which can be locally written as a small departure about Minkowski, the decoupling limit is derived in Eq. (8.81).

Having derived massive gravity as a consistent decoupling limit of bi-gravity, we could of course do the same for any multi-metric theory. For instance, out of N interacting fields, we could take a limit so as to decouple one of the metrics. We then obtain the theory of (N − 1) interacting fields, all of which are massive, and one decoupled massless spin-2 field.

10.3 Decoupling limit of massive gravity

We now turn to a different type of decoupling limit, whose aim is to disentangle the dofs present in massive gravity itself and analyze the ‘irrelevant interactions’ (in the usual EFT sense) that arise at the lowest possible scale. One could naively think that such interactions arise at the scale given by the graviton mass, but this is not so. In a generic theory of massive gravity with Fierz-Pauli at the linear level, the first irrelevant interactions typically arise at the scale \(\Lambda_5 = (m^4 M_{\rm Pl})^{1/5}\). For the setups we have in mind, m ≪ Λ5 ≪ MPl. But we shall see that interactions arising at such a low-energy scale are always pathological (reminiscent of the BD ghost [111, 173]), and in ghost-free massive gravity the first (irrelevant) interactions actually arise at the scale \(\Lambda_3 = (m^2 M_{\rm Pl})^{1/3}\).

We start by deriving the decoupling limit in the absence of vectors (helicity-1 modes) and then include them in the following section 8.3.4. Since we are interested in the decoupling limit about flat spacetime, we look at the case where Minkowski is a vacuum solution to the equations of motion. This is the case in the absence of a cosmological constant and a tadpole and we thus focus on the case where α0 = α1 = 0 in (6.3).

10.3.1 Interaction scales

In GR, the interactions of the helicity-2 mode arise at the very high energy scale, namely the Planck scale. In massive gravity a new scale enters and we expect some interactions to arise at a lower energy scale given by a geometric combination of the Planck scale and the graviton mass. The potential term \(M_{{\rm{Pl}}}^2{m^2}\sqrt {- g} {{\mathcal L}_n}[{\mathcal K}[g,\eta ]]\) (6.3) includes generic interactions between the canonically normalized helicity-0 (π), helicity-1 (Aμ), and helicity-2 modes (hμν) introduced in (2.48)

$$\begin{array}{*{20}c} {{{\mathcal L}_{j,k,\ell}} = {m^2}M_{{\rm{Pl}}}^2{{\left({{h \over {M_{{\rm{Pl}}}}}} \right)}^j}{{\left({{{\partial A} \over {mM_{{\rm{Pl}}}}}} \right)}^{2k}}{{\left({{{{\partial ^2}\pi} \over {{m^2}M_{{\rm{Pl}}}}}} \right)}^\ell}} \\ {= \Lambda _{j,k,\ell}^{4 - (j + 4k + 3\ell)}\;{h^j}{{(\partial A)}^{2k}}{{({\partial ^2}\pi)}^\ell}\,,\quad} \\ \end{array}$$
(8.18)

at the scale

$${\Lambda _{j,k,\ell}} = {\left({{m^{2k + 2\ell - 2}}M_{{\rm{Pl}}}^{j + 2k + \ell - 2}} \right)^{1/(j + 4k + 3\ell - 4)}}\,,$$
(8.19)

and with j, k, ℓ ∈ ℕ, and j + 2k + ℓ > 2.

Clearly, the lowest interaction scale is \(\Lambda_{j=0,k=0,\ell=3} \equiv \Lambda_5 = (M_{\rm Pl}m^4)^{1/5}\), which arises for an operator of the form \((\partial^2\pi)^3\). If present, such an interaction leads to an Ostrogradsky instability, which is another manifestation of the BD ghost as identified in [173].

Even if that very interaction is absent, there is actually an infinite set of dangerous interactions of the form \((\partial^2\pi)^\ell\) which arise at the scale \(\Lambda_{j=0,k=0,\ell\geq 3}\), with

$${\Lambda _5} = {({M_{{\rm{Pl}}}}{m^4})^{1/5}} \leq ({\Lambda _{j = 0,k = 0,\ell \geq 3}}) < {\Lambda _3} = {({M_{{\rm{Pl}}}}{m^2})^{1/3}}\,.$$
(8.20)

with \(\Lambda_{j=0,k=0,\ell\to\infty} = \Lambda_3\).

Any interaction with j > 0 or k > 0 automatically leads to a larger scale, so all the interactions arising at a scale between Λ5 (inclusive) and Λ3 are of the form \((\partial^2\pi)^\ell\) and carry an Ostrogradsky instability. For DGP we have already seen that there are no interactions at a scale below Λ3. In what follows we show that the same remains true for the ghost-free theory of massive gravity proposed in (6.3). To see this, let us identify the interactions with j = k = 0 and arbitrary power ℓ for \((\partial^2\pi)^\ell\).
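The hierarchy of scales implied by (8.19) is easy to tabulate. The sketch below writes \(\Lambda_{j,k,\ell} = m^a M_{\rm Pl}^b\) and returns the exponents (a, b) as exact fractions; since a + b = 1 for every operator, with m ≪ MPl a larger a means a lower scale. The helper name `scale_exponents` is illustrative, and the helper is restricted (as an assumption of this sketch) to operators of dimension larger than four, for which the scale is a genuine suppression scale.

```python
from fractions import Fraction


def scale_exponents(j: int, k: int, l: int):
    """Return (a, b) with Lambda_{j,k,l} = m**a * M_Pl**b, following (8.19).

    j, k, l count the powers of h, (dA)^2 and d^2 pi respectively.
    """
    if min(j, k, l) < 0 or j + 2 * k + l <= 2:
        raise ValueError("need j, k, l >= 0 and j + 2k + l > 2")
    dim = j + 4 * k + 3 * l - 4  # operator dimension minus four
    if dim <= 0:
        raise ValueError("operator is not irrelevant (dimension <= 4)")
    a = Fraction(2 * k + 2 * l - 2, dim)      # power of the graviton mass m
    b = Fraction(j + 2 * k + l - 2, dim)      # power of the Planck mass
    return a, b


if __name__ == "__main__":
    print("Lambda_5 :", scale_exponents(0, 0, 3))   # (d^2 pi)^3
    print("h (d^2 pi)^2 :", scale_exponents(1, 0, 2))
    print("(dA)^2 d^2 pi :", scale_exponents(0, 1, 1))
```

In this parametrization Λ5 corresponds to a = 4/5 and Λ3 to a = 2/3; pure \((\partial^2\pi)^\ell\) operators have 2/3 < a ≤ 4/5, while any operator with j + k ≥ 1 has a ≤ 2/3, in line with the ordering (8.20).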

10.3.2 Operators below the scale Λ3

We now express the potential term \(M_{{\rm{Pl}}}^2{m^2}\sqrt {- g} {{\mathcal L}_n}[{\mathcal K}]\) introduced in (6.3) using the metric in terms of the helicity-0 mode, where we recall that the quantity \({\mathcal K}\) is defined in (6.7) as \({\mathcal K}^\mu_{\ \nu} [g,\tilde f] = \delta^\mu_{\ \nu} - (\sqrt {{g^{- 1}}\tilde f})^\mu_{\ \nu}\), where \(\tilde f\) is the ‘Stückelbergized’ reference metric given in (2.78). Since we are interested in interactions without the helicity-2 and -1 modes (j = k = 0), it is sufficient to follow the behaviour of the helicity-0 mode and so we have

$$\left. {\begin{array}{*{20}c} {{{\left. {{{\tilde f}_{\mu \nu}}} \right\vert}_{h = A = 0}} = {\eta _{\mu \nu}} - {2 \over {{M_{{\rm{Pl}}}}{m^2}}}{\Pi _{\mu \nu}} + {1 \over {M_{{\rm{Pl}}}^2{m^4}}}\Pi _{\mu \nu}^2}\quad \\ {{{\left. {{g^{\mu \nu}}} \right\vert}_{h = 0}} = {\eta ^{\mu \nu}}\quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad} \\ \end{array}} \right\}\quad \quad \Rightarrow \quad {\mathcal K}_{\;\;\nu}^\mu {\vert _{h = A = 0}} = {{\Pi _\nu ^\mu} \over {{M_{{\rm{Pl}}}}{m^2}}}\,,$$
(8.21)

with again \(\Pi_{\mu\nu} = \partial_\mu\partial_\nu\pi\) and \(\Pi _{\mu \nu}^2: = {\eta ^{\alpha \beta}}{\Pi _{\mu \alpha}}{\Pi _{\nu\beta}}\).

As a result, we infer that up to the scale Λ3 (excluded), the potential in (6.3) is

$${\left. {{{\mathcal L}_{{\rm{mass}}}} = {{{m^2}M_{{\rm{Pl}}}^2} \over 4}\sqrt {- g} \sum\limits_{n = 2}^4 {{\alpha _n}} {{\mathcal L}_n}[{\mathcal K}[g,\tilde f]]} \right\vert _{h = A = 0}}$$
(8.22)
$$\begin{array}{*{20}c} {= {{{m^2}M_{{\rm{Pl}}}^2} \over 4}\sum\limits_{n = 2}^4 {{\alpha _n}} {{\mathcal L}_n}\left[ {{{{\Pi _{\mu \nu}}} \over {{M_{{\rm{Pl}}}}{m^2}}}} \right] \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad\;} \\ {= {1 \over 4}{\epsilon ^{\mu \nu \alpha \beta}}{\epsilon _{\mu {\prime} \nu {\prime} \alpha {\prime} \beta {\prime}}}\left({{{{\alpha _2}} \over {{m^2}}}\delta _\mu ^{\mu {\prime}}\delta _\nu ^{\nu {\prime}} + {{{\alpha _3}} \over {{M_{{\rm{Pl}}}}{m^4}}}\delta _\mu ^{\mu {\prime}}\Pi _\nu ^{\nu {\prime}} + {{{\alpha _4}} \over {M_{{\rm{Pl}}}^2{m^6}}}\Pi _\mu ^{\mu {\prime}}\Pi _\nu ^{\nu {\prime}}} \right)\Pi _\alpha ^{\alpha {\prime}}\Pi _\beta ^{\beta {\prime}}\,,} \\ \end{array}$$
(8.23)

where as mentioned earlier we focus on the case without a cosmological constant and tadpole, i.e., α0 = α1 = 0. All of these interactions are total derivatives. So even though the ghost-free theory of massive gravity does in principle involve some interactions with higher derivatives of the form \((\partial^2\pi)^\ell\), it does so in a very precise way so that all of these terms combine into a total derivative and are therefore harmless (Footnote 22).
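As an illustration of how this works in the simplest case (a standard manipulation spelled out here, with \(\Pi_{\mu\nu} = \partial_\mu\partial_\nu\pi\), indices raised with the Minkowski metric, and square brackets denoting the trace), the quadratic combination multiplying α2 is proportional to \([\Pi]^2 - [\Pi^2]\), which can be written explicitly as

$$[\Pi]^2 - [\Pi^2] = (\square\pi)^2 - \partial_\mu\partial_\nu\pi\,\partial^\mu\partial^\nu\pi = \partial_\mu\left(\partial^\mu\pi\,\square\pi - \partial_\nu\pi\,\partial^\mu\partial^\nu\pi\right)\,.$$

Expanding the right-hand side produces the two terms on the left plus two three-derivative terms, \(\partial^\mu\pi\,\partial_\mu\square\pi\) and \(-\partial_\nu\pi\,\partial^\nu\square\pi\), which cancel each other. The cubic and quartic combinations follow the same pattern, the cancellation being guaranteed by the antisymmetry of the two Levi-Civita symbols.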

As a result the potential term constructed in Part II (and derived from the deconstruction framework) is free of any interactions of the form \((\partial^2\pi)^\ell\). This means that the BD ghost as identified in the Stückelberg language in [173] is absent in this theory. However, at this level, the BD ghost could still reappear through different operators at the scale Λ3 or higher.

10.3.3 Λ3-decoupling limit

Since there are no operators all the way up to the scale Λ3 (excluded), we can take the decoupling limit by sending MPl → ∞, m → 0 and maintaining the scale Λ3 fixed.

The operators that arise at the scale Λ3 are the ones of the form (8.18) with either j = 1, k = 0 and arbitrary ℓ ≥ 2 or with j = 0, k = 1 and arbitrary ℓ ≥ 1. The second case leads to vector interactions of the form \((\partial A)^2(\partial^2\pi)^\ell\) and will be studied in Section 8.3.4. For now we focus on the first kind of interactions, of the form \(h\,(\partial^2\pi)^\ell\),

$${\mathcal L}_{{\rm{mass}}}^{{\rm{dec}}} = {h_{\mu \nu}}{\bar X_{\mu \nu}}\,,$$
(8.24)

with [144] (see also refs. [137] and [143])

$$\begin{array}{*{20}c} {{{\left. {{{\bar X}_{\mu \nu}} = {\delta \over {\delta {h_{\mu \nu}}}}{{\mathcal L}_{{\rm{mass}}}}} \right\vert}_{h = A = 0}}\quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad} \\ {{{\left. {= {{M_{{\rm{Pl}}}^2{m^2}} \over 4}{\delta \over {\delta {h_{\mu \nu}}}}\left({\sqrt {- g} \sum\limits_{n = 2}^4 {{\alpha _n}} {{\mathcal L}_n}[{\mathcal K}[g,\tilde f]]} \right)} \right\vert}_{h = A = 0}}\,.} \\ \end{array}$$
(8.25)

Using the fact that

$${\left. {{{\delta {{\mathcal K}^n}} \over {\delta {h^{\mu \nu}}}}} \right\vert _{h = A = 0}} = {n \over 2}(\Pi _{\mu \nu}^{n - 1} - \Pi _{\mu \nu}^n)\,,$$
(8.26)

we obtain

$${\bar X_{\mu \nu}} = {{\Lambda _3^3} \over 8}\sum\limits_{n = 2}^4 {{\alpha _n}} \left({{{4 - n} \over {\Lambda _3^{3n}}}X_{\mu \nu}^{(n)}[\Pi ] + {n \over {\Lambda _3^{3(n - 1)}}}X_{\mu \nu}^{(n - 1)}[\Pi ]} \right)\,,$$
(8.27)

where the tensors \(X_{\mu \nu}^{(n)}\) are constructed out of Πμν, symbolically \(X^{(n)} \sim \Pi^n\), but in such a way that they are transverse and that their resulting equations of motion never involve more than two derivatives on each field,

$${X^{(0)}}_{\,\mu {\prime}}^\mu [Q] = {\varepsilon ^{\mu \nu \alpha \beta}}{\varepsilon _{\mu {\prime} \nu \alpha \beta}}$$
(8.28)
$${X^{(1)}}_{\,\mu {\prime}}^\mu [Q] = {\varepsilon ^{\mu \nu \alpha \beta}}{\varepsilon _{\mu {\prime} \nu {\prime} \alpha \beta}}\;Q_\nu ^{\nu {\prime}}$$
(8.29)
$${X^{(2)}}_{\,\mu {\prime}}^\mu [Q] = {\varepsilon ^{\mu \nu \alpha \beta}}{\varepsilon _{\mu {\prime} \nu {\prime} \alpha {\prime} \beta}}\;Q_\nu ^{\nu {\prime}}\;Q_\alpha ^{\alpha {\prime}}$$
(8.30)
$${X^{(3)}}_{\,\mu {\prime}}^\mu [Q] = {\varepsilon ^{\mu \nu \alpha \beta}}{\varepsilon _{\mu {\prime} \nu {\prime} \alpha {\prime} \beta {\prime}}}\;Q_\nu ^{\nu {\prime}}\;Q_\alpha ^{\alpha {\prime}}Q_\beta ^{\beta {\prime}}$$
(8.31)
$${X^{(n \geq 4)}}_{\,\mu \prime}^\mu [Q] = 0\,,$$
(8.32)

where we have included \(X^{(0)}\) and \(X^{(n \geq 4)}\) for completeness (these become relevant for instance in the context of bi-gravity). The generalization of these tensors to arbitrary dimensions is straightforward, and in d spacetime dimensions there are d such tensors, symbolically \(X^{(n)} = \varepsilon\varepsilon\,\Pi^n\delta^{d-n-1}\) for n = 0, ⋯, d − 1.

Since we are dealing with the decoupling limit with MPl → ∞, the metric is flat, \({g_{\mu \nu}} = {\eta _{\mu \nu}} + M_{{\rm{Pl}}}^{- 1}{h_{\mu \nu}} \to {\eta _{\mu \nu}}\), and all indices are raised and lowered with respect to the Minkowski metric. These tensors can be written more explicitly as follows

$$X_{\mu \nu}^{(0)}[Q] = 3!{\eta _{\mu \nu}}$$
(8.33)
$$X_{\mu \nu}^{(1)}[Q] = 2!([Q]{\eta _{\mu \nu}} - {Q_{\mu \nu}})$$
(8.34)
$$X_{\mu \nu}^{(2)}[Q] = ({[Q]^2} - [{Q^2}]){\eta _{\mu \nu}} - 2([Q]{Q_{\mu \nu}} - Q_{\mu \nu}^2)$$
(8.35)
$$\begin{array}{*{20}c} {X_{\mu \nu}^{(3)}[Q] = ({{[Q]}^3} - 3[Q][{Q^2}] + 2[{Q^3}]){\eta _{\mu \nu}}\quad \quad \quad \quad \quad \quad \quad \quad \quad \quad} \\ {- 3({{[Q]}^2}{Q_{\mu \nu}} - 2[Q]Q_{\mu \nu}^2 - [{Q^2}]{Q_{\mu \nu}} + 2Q_{\mu \nu}^3){.}} \\ \end{array}$$
(8.36)

Note that they also satisfy the recursive relation

$$X_{\mu \nu}^{(n)} = {1 \over {4 - n}}(- n\Pi _{\,\mu}^\alpha \delta _\nu ^\beta + {\Pi ^{\alpha \beta}}{\eta _{\mu \nu}})\;X_{\alpha \beta}^{(n - 1)},$$
(8.37)

with \(X_{\mu \nu}^{(0)} = 3!{\eta _{\mu \nu}}\).
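The closed forms (8.33)–(8.36) and the recursion (8.37) can be cross-checked on an explicit matrix. The sketch below treats Q as a mixed-index matrix \(Q^\mu_{\ \nu}\), so that ημν is represented by the identity and traces are ordinary matrix traces; this is a choice made for the illustration, as are all function names. Exact rational arithmetic is used so that the comparison is not affected by rounding.

```python
from fractions import Fraction

DIM = 4  # spacetime dimension


def identity():
    return [[Fraction(int(i == j)) for j in range(DIM)] for i in range(DIM)]


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(DIM)) for j in range(DIM)]
            for i in range(DIM)]


def trace(a):
    return sum(a[i][i] for i in range(DIM))


def lincomb(*terms):
    """Sum of (coefficient, matrix) pairs."""
    return [[sum(c * mat[i][j] for c, mat in terms) for j in range(DIM)]
            for i in range(DIM)]


def x_explicit(n, q):
    """Closed forms (8.33)-(8.36), with eta represented by the identity."""
    one, q2 = identity(), matmul(q, q)
    q3 = matmul(q2, q)
    t1, t2, t3 = trace(q), trace(q2), trace(q3)
    if n == 0:
        return lincomb((6, one))
    if n == 1:
        return lincomb((2 * t1, one), (-2, q))
    if n == 2:
        return lincomb((t1**2 - t2, one), (-2 * t1, q), (2, q2))
    if n == 3:
        return lincomb((t1**3 - 3 * t1 * t2 + 2 * t3, one),
                       (-3 * (t1**2 - t2), q), (6 * t1, q2), (-6, q3))
    raise ValueError("closed forms are listed for n = 0..3 only")


def x_recursive(n, q):
    """Build X^(n) from X^(0) = 3! * identity using the recursion (8.37)."""
    x = lincomb((6, identity()))
    for step in range(1, n + 1):
        qx = matmul(q, x)
        x = lincomb((Fraction(-step, 4 - step), qx),
                    (trace(qx) / (4 - step), identity()))
    return x


def x4_numerator(q):
    """Numerator of (8.37) at n = 4; it should vanish, matching (8.32)."""
    qx = matmul(q, x_explicit(3, q))
    return lincomb((-4, qx), (trace(qx), identity()))


# Arbitrary symmetric sample matrix used only for the check.
SAMPLE_Q = [[Fraction(v) for v in row] for row in
            [[1, 2, 0, -1], [2, 3, 1, 0], [0, 1, -2, 4], [-1, 0, 4, 5]]]

if __name__ == "__main__":
    for n in range(4):
        print(n, x_explicit(n, SAMPLE_Q) == x_recursive(n, SAMPLE_Q))
```

The vanishing of the n = 4 numerator is the Cayley–Hamilton theorem for a 4 × 4 matrix, which is one way to see why the series of tensors terminates.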

10.3.3.1 Decoupling limit

From the expression of these tensors in terms of the fully antisymmetric Levi-Civita tensors, it is clear that the tensors are transverse and that the equations of motion of \({h^{\mu \nu}}{\bar X_{\mu \nu}}\) with respect to both h and π never involve more than two derivatives. This decoupling limit is thus free of the Ostrogradsky instability, which is the way the BD ghost would manifest itself in this language. This decoupling limit is actually free of any ghost-like instability, and the whole theory is free of the BD ghost even beyond the decoupling limit, as we shall see in depth in Section 7.

Not only does the potential term proposed in (6.3) remove any potential interactions of the form \((\partial^2\pi)^\ell\) which could have arisen at an energy between \(\Lambda_5 = (M_{\rm Pl}m^4)^{1/5}\) and Λ3, but it also ensures that the interactions that arise at the scale Λ3 are healthy.

As already mentioned, in the decoupling limit MPl → ∞ the metric reduces to Minkowski and the standard Einstein-Hilbert term simply reduces to its linearized version. As a result, neglecting the vectors for now, the full Λ3-decoupling limit of ghost-free massive gravity is given by

$$\begin{array}{*{20}c} {{{\mathcal L}_{{\Lambda _3}}} = - {1 \over 4}{h^{\mu \nu}}\hat {\mathcal E} _{\mu \nu}^{\alpha \beta}{h_{\alpha \beta}} + {1 \over 8}{h^{\mu \nu}}\left({2{\alpha _2}X_{\mu \nu}^{(1)} + {{2{\alpha _2} + 3{\alpha _3}} \over {\Lambda _3^3}}X_{\mu \nu}^{(2)} + {{{\alpha _3} + 4{\alpha _4}} \over {\Lambda _3^6}}X_{\mu \nu}^{(3)}} \right)} \\ {= - {1 \over 4}{h^{\mu \nu}}\hat {\mathcal E} _{\mu \nu}^{\alpha \beta}{h_{\alpha \beta}} + {h^{\mu \nu}}\sum\limits_{n = 1}^3 {{{{a_n}} \over {\Lambda _3^{3(n - 1)}}}} X_{\mu \nu}^{(n)}\,,\quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \;} \\ \end{array}$$
(8.38)

with a1 = α2/4, a2 = (2α2 + 3α3)/8 and a3 = (α3 + 4α4)/8, and the correct normalization should be α2 = 1.

10.3.3.2 Unmixing and Galileons

As was already the case at the linearized level for the Fierz-Pauli theory (see Eqs. (2.47) and (2.48)) the kinetic term for the helicity-0 mode appears mixed with the helicity-2 mode. It is thus convenient to diagonalize these two modes by performing the following shift,

$${h_{\mu \nu}} = {\tilde h_{\mu \nu}} + {\alpha _2}\pi {\eta _{\mu \nu}} - {{2{\alpha _2} + 3{\alpha _3}} \over {2\Lambda _3^3}}{\partial _\mu}\pi {\partial _\nu}\pi \,,$$
(8.39)

where the non-linear term has been included to unmix the coupling \({h^{\mu \nu}}X_{\mu \nu}^{(2)}\), leading to the following decoupling limit [137]

$${{\mathcal L}_{{\Lambda _3}}} = - {1 \over 4}\left[ {{{\tilde h}^{\mu \nu}}\hat{\mathcal E}_{\mu \nu}^{\alpha \beta}{{\tilde h}_{\alpha \beta}} + \sum\limits_{n = 2}^5 {{{{c_n}} \over {\Lambda _3^{3(n - 2)}}}} {\mathcal L}_{({\rm{Gal}})}^{(n)}[\pi ] - {{2({\alpha _3} + 4{\alpha _4})} \over {\Lambda _3^6}}{{\tilde h}^{\mu \nu}}X_{\mu \nu}^{(3)}} \right]\,,$$
(8.40)

where we introduced the Galileon Lagrangians \({\mathcal L}_{({\rm{Gal}})}^{(n)}[\pi ]\) as defined in Ref. [412]

$${\mathcal L}_{({\rm{Gal}})}^{(n)}[\pi ] = {1 \over {(6 - n)!}}{(\partial \pi)^2}{{\mathcal L}_{n - 2}}[\Pi ]$$
(8.41)
$$= - {2 \over {n(5 - n)!}}\pi {{\mathcal L}_{n - 1}}[\Pi ]\,,$$
(8.42)

where the Lagrangians \({\mathcal L}_n[Q] = \varepsilon\varepsilon\,Q^n\delta^{4-n}\) for a tensor \(Q^\mu_{\ \nu}\) are defined in (6.9)–(6.13), or more explicitly in (6.14)–(6.18), leading to the explicit form for the Galileon Lagrangians

$${\mathcal L}_{({\rm{Gal}})}^{(2)}[\pi ] = {(\partial \pi)^2}$$
(8.43)
$${\mathcal L}_{({\rm{Gal}})}^{(3)}[\pi ] = {(\partial \pi)^2}[\Pi ]$$
(8.44)
$${\mathcal L}_{({\rm{Gal}})}^{(4)}[\pi ] = {(\partial \pi)^2}({[\Pi ]^2} - [{\Pi ^2}])$$
(8.45)
$${\mathcal L}_{({\rm{Gal}})}^{(5)}[\pi ] = {(\partial \pi)^2}({[\Pi ]^3} - 3[\Pi ][{\Pi ^2}] + 2[{\Pi ^3}])\,,$$
(8.46)

and the coefficients cn are given in terms of the αn as follows,

$$\begin{array}{*{20}c} {{c_2} = 3\alpha _2^2\,,\quad \quad \quad \quad \quad \quad \quad \quad \quad \quad} & {\quad {c_3} = {3 \over 2}{\alpha _2}(2{\alpha _2} + 3{\alpha _3})\,,\quad \quad \quad} \\ {{c_4} = {1 \over 4}(4\alpha _2^2 + 9\alpha _3^2 + 16{\alpha _2}({\alpha _3} + {\alpha _4}))\,,} & {\quad {c_5} = {5 \over 8}(2{\alpha _2} + 3{\alpha _3})({\alpha _3} + 4{\alpha _4})\,.} \\ \end{array}$$
(8.47)

Setting α2 = 1, we indeed recover the same normalization of \(-{3 \over 4}(\partial\pi)^2\) for the helicity-0 mode found in (2.48).
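For a given choice of the parameters αn, the coefficients an of (8.38) and cn of (8.47) can be evaluated directly. The short sketch below simply transcribes those two sets of relations using exact fractions; the function names are illustrative and not part of any established code.

```python
from fractions import Fraction


def mixing_coefficients(alpha2, alpha3, alpha4):
    """Coefficients (a_1, a_2, a_3) of the h X^(n) couplings in (8.38)."""
    a2, a3, a4 = Fraction(alpha2), Fraction(alpha3), Fraction(alpha4)
    return (a2 / 4, (2 * a2 + 3 * a3) / 8, (a3 + 4 * a4) / 8)


def galileon_coefficients(alpha2, alpha3, alpha4):
    """Coefficients (c_2, c_3, c_4, c_5) of the Galileon terms, from (8.47)."""
    a2, a3, a4 = Fraction(alpha2), Fraction(alpha3), Fraction(alpha4)
    c2 = 3 * a2**2
    c3 = Fraction(3, 2) * a2 * (2 * a2 + 3 * a3)
    c4 = Fraction(1, 4) * (4 * a2**2 + 9 * a3**2 + 16 * a2 * (a3 + a4))
    c5 = Fraction(5, 8) * (2 * a2 + 3 * a3) * (a3 + 4 * a4)
    return (c2, c3, c4, c5)


if __name__ == "__main__":
    c2, c3, c4, c5 = galileon_coefficients(1, 0, 0)
    # Overall factor -1/4 in (8.40) times c_2 gives the kinetic normalization.
    print("kinetic normalization:", -c2 / 4)
    print("a_n:", mixing_coefficients(1, 0, 0))
```

With α2 = 1 the kinetic normalization evaluates to −3/4, as stated above. One can also read off from these expressions that c5 and a3 both vanish when α3 + 4α4 = 0, and that c3 vanishes when 2α2 + 3α3 = 0.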

10.3.3.3 X(3)-coupling

In general, the last coupling \({\tilde h^{\mu \nu}}X_{\mu \nu}^{(3)}\) between the helicity-2 and helicity-0 mode cannot be removed by a local field redefinition. The non-local field redefinition

$${\tilde h_{\mu \nu}} \rightarrow {\tilde h_{\mu \nu}} + G_{\mu \nu \alpha \beta}^{{\rm{massless}}}\,{X^{(3)\,\alpha \beta}}\,,$$
(8.48)

where \(G_{\mu \nu\alpha \beta}^{{\rm{massless}}}\) is the propagator for a massless spin-2 field as defined in (2.64), fully diagonalizes the helicity-0 and -2 modes at the price of introducing non-local interactions for π.

Note however that these non-local interactions do not hide any new degrees of freedom. Furthermore, about some specific backgrounds, the field redefinition is local. Indeed, focusing on static and spherically symmetric configurations, we may consider π = π0(r) and \({\tilde h_{\mu \nu}}\) given by

$${\tilde h_{\mu \nu}}\;{\rm{d}}{x^\mu}\;{\rm{d}}{x^\nu} = - \psi (r)\;{\rm{d}}{t^2} + \phi (r)\;{\rm{d}}{r^2}\,,$$
(8.49)

so that

$${\tilde h^{\mu \nu}}X_{\mu \nu}^{(3)} = - \psi \prime (r){\pi\prime_0}{(r)^3}\,.$$
(8.50)

The standard kinetic term for ψ sets ψ′(r) = ϕ(r)/r as in GR, and the \(X^{(3)}\) coupling can be absorbed via the field redefinition \(\phi \to \bar \phi - 2({\alpha _3} + 4{\alpha _4})\,{\pi_0'}{(r)^3}/(r\Lambda _3^{6})\), leading to the following new sextic interactions for π,

$${\tilde h^{\mu \nu}}X_{\mu \nu}^{(3)} \rightarrow - {1 \over {{r^2}}}{\pi\prime_0}{(r)^6}\,.$$
(8.51)

Interestingly, this new order-6 term satisfies all the relations of a Galileon interaction but cannot be expressed covariantly in a local way. See [61] for more details on spherically symmetric configurations with the \(X^{(3)}\)-coupling.

10.3.4 Vector interactions in the Λ3-decoupling limit

As can be seen from the relation (8.19), the scale associated with interactions mixing two helicity-1 fields with an arbitrary number of fields π (j = 0, k = 1 and arbitrary ℓ) is also Λ3. So at that scale, there are actually an infinite number of interactions when including the mixing between the helicity-1 and -0 modes (however, as mentioned previously, since the vector field always appears quadratically it is always consistent to set it to zero, as was done previously).

The full decoupling limit including these interactions has been derived in Ref. [419], (see also Ref. [238]) using the vielbein formulation of massive gravity as in (6.1) and we review the formalism and the results in what follows.

In addition to the Stückelberg fields associated with local covariance, in the vielbein formulation one also needs to introduce 6 additional Stückelberg fields ωab associated with local Lorentz invariance, ωab = − ωba. These are non-dynamical since they never appear with derivatives, and can thus be treated as auxiliary fields which can be integrated out. It is however useful to keep them in the decoupling limit action, so as to retain a closed-form expression. In terms of the Lorentz Stückelberg fields, the full decoupling limit of massive gravity in four dimensions at the scale Λ3 is then (before diagonalization) [419]

$$\begin{array}{*{20}c} {{\mathcal L}_{{\Lambda _3}}^{(0)} = - {1 \over 4}{h^{\mu \nu}}\hat{\mathcal E}_{\mu \nu}^{\alpha \beta}{h_{\alpha \beta}} + {1 \over 2}{h^{\mu \nu}}\sum\limits_{n = 1}^3 {{{{\alpha _n}} \over {\Lambda _3^{3(n - 1)}}}} X_{\mu \nu}^{(n)}\quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \;} \\ {+ {{3{\beta _1}} \over 8}\delta _{abcd}^{\alpha \beta \gamma \delta}\delta _\alpha ^a\left({\delta _\beta ^bF_{\;\gamma}^c\omega _{\;\delta}^d + 2[\omega _{\;\beta}^b\omega _{\;\gamma}^c + {1 \over 2}\delta _\beta ^b\omega _{\;\mu}^c\omega _{\;\gamma}^\mu ](\delta + \Pi)_\delta ^d} \right)\quad \quad} \\ {+ {{{\beta _2}} \over 8}\delta _{abcd}^{\alpha \beta \gamma \delta}(\delta + \Pi)_\alpha ^a\left({2\delta _\beta ^bF_{\;\gamma}^c\omega _{\;\delta}^d + [\omega _{\;\beta}^b\omega _{\;\gamma}^c + \delta _\beta ^b\omega _{\;\mu}^c\omega _{\;\gamma}^\mu ](\delta + \Pi)_\delta ^d} \right)\quad} \\ {+ {{{\beta _3}} \over {48}}\delta _{abcd}^{\alpha \beta \gamma \delta}(\delta + \Pi)_\alpha ^a(\delta + \Pi)_\beta ^b\left({3F_{\;\gamma}^c\omega _{\;\delta}^d + \omega _{\;\mu}^c\omega _{\;\gamma}^\mu (\delta + \Pi)_\delta ^d} \right),\quad \quad \quad} \\ \end{array}$$
(8.52)

(the superscript (0) indicates that this decoupling limit is taken with Minkowski as a reference metric), with \(F_{ab} = \partial_a A_b - \partial_b A_a\), and the coefficients βn are related to the αn as in (6.28).

The auxiliary Lorentz Stückelberg fields carry all the non-linear mixing between the helicity-0 and -1 modes,

$${\omega _{ab}} = \int\nolimits_0^\infty {\rm{d}} u\,{e^{- 2u}}{e^{- u\Pi _a^{a\prime}}}{F_{a\prime b\prime}}{e^{- u\Pi _b^{b\prime}}}$$
(8.53)
$$= \sum\limits_{n,m} {{{(n + m)!} \over {{2^{1 + n + m}}n!m!}}} {(- 1)^{n + m}}{({\Pi ^n}\,F\,{\Pi ^m})_{ab}}\,.$$
(8.54)

In some special cases these sets of interactions can be resummed exactly, as was first performed in [139], (see also Refs. [364, 456]).
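The series (8.54) follows from (8.53) by expanding both exponentials and using \(\int_0^\infty {\rm d}u\, u^N e^{-2u} = N!/2^{N+1}\). Integrating the u-derivative of the integrand in (8.53) also shows that, whenever the integral converges, ω satisfies the linear relation 2ω + Πω + ωΠ = F, which gives a simple numerical test of the series. The sketch below is only an illustration: it treats Π and F as plain matrices in units where Λ3 = 1, assumes the entries of Π are small enough for the truncated series to converge, and uses hypothetical helper names.

```python
from math import factorial


def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def omega_series(pi, f, order=40):
    """Truncation of the double sum (8.54) to terms with n + m <= order."""
    dim = len(pi)
    powers = [[[float(i == j) for j in range(dim)] for i in range(dim)]]
    for _ in range(order):
        powers.append(matmul(powers[-1], pi))  # powers[n] = Pi^n
    omega = [[0.0] * dim for _ in range(dim)]
    for n in range(order + 1):
        for m in range(order + 1 - n):
            coeff = ((-1) ** (n + m) * factorial(n + m)
                     / (2 ** (1 + n + m) * factorial(n) * factorial(m)))
            term = matmul(matmul(powers[n], f), powers[m])
            for i in range(dim):
                for j in range(dim):
                    omega[i][j] += coeff * term[i][j]
    return omega


def residual(pi, f, omega):
    """Largest entry of |2 omega + Pi omega + omega Pi - F|."""
    left, right = matmul(pi, omega), matmul(omega, pi)
    dim = len(pi)
    return max(abs(2 * omega[i][j] + left[i][j] + right[i][j] - f[i][j])
               for i in range(dim) for j in range(dim))


# Sample data for the check: a small symmetric Pi and an antisymmetric F.
SAMPLE_PI = [[0.05, 0.02, 0.0, 0.01],
             [0.02, -0.03, 0.04, 0.0],
             [0.0, 0.04, 0.06, -0.02],
             [0.01, 0.0, -0.02, 0.03]]
SAMPLE_F = [[0.0, 1.0, -2.0, 0.5],
            [-1.0, 0.0, 3.0, 1.5],
            [2.0, -3.0, 0.0, -1.0],
            [-0.5, -1.5, 1.0, 0.0]]

if __name__ == "__main__":
    om = omega_series(SAMPLE_PI, SAMPLE_F)
    print("residual:", residual(SAMPLE_PI, SAMPLE_F, om))
```

For Π = 0 the series reduces to ω = F/2, and for symmetric Π and antisymmetric F the result stays antisymmetric, consistent with ωab = −ωba.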

This decoupling limit includes non-linear combinations of the second-derivative tensor Πμν and the first derivative Maxwell tensor Fμν. Nevertheless, the structure of the interactions is gauge invariant for Aμ, and there are no higher derivatives on in the equation of motion for A, so the equations of motions for both the helicity-1 and -2 modes are manifestly second order and propagating the correct degrees of freedom. The situation is more subtle for the helicity-0 mode. Taking the equation of motion for that field would lead to higher derivatives on π itself as well as on the helicity-1 field. Since this theory has been proven to be ghost-free by different means (see Section 7), it must be that the higher derivatives in that equation are nothing else but the derivative of the equation of motion for the helicity-1 mode similarly as what happens in Section 7.2.

When working beyond the decoupling limit, even the equation of motion with respect to the helicity-1 mode is no longer manifestly well-behaved; however, the Stückelberg fields are then no longer the correct representation of the physical degrees of freedom. As we shall see below, the proper number of degrees of freedom is nonetheless maintained when working beyond the decoupling limit.

10.3.5 Beyond the decoupling limit

10.3.5.1 Physical degrees of freedom

In Section 8.3, we have introduced four Stückelberg fields ϕa which transform as scalar fields under coordinate transformations, so that the action of massive gravity is invariant under coordinate transformations. Furthermore, the action is also invariant under global Lorentz transformations in the field space,

$${x^\mu} \rightarrow {x^\mu}\,,\qquad {g_{\mu \nu}} \rightarrow {g_{\mu \nu}}\,,\quad {\rm{and}}\quad {\phi ^a} \rightarrow \tilde \Lambda _{\,b}^a{\phi ^b}\,.$$
(8.55)

In the DL, taking MPl → ∞, all fields are living on flat space-time, so in that limit, there is an additional global Lorentz symmetry acting this time on the space-time,

$${x^\mu} \rightarrow \bar \Lambda _\nu ^\mu \,{x^\nu}\,,\qquad {h_{\mu \nu}} \rightarrow \bar \Lambda _{\,\mu}^\alpha \bar \Lambda _{\,\nu}^\beta {h_{\alpha \beta}}\,,\quad {\rm{and}}\quad {\phi ^a} \rightarrow {\phi ^a}\,.$$
(8.56)

The internal and space-time Lorentz symmetries are independent (the internal one is always present while the space-time one is only there in the DL). In the DL we can identify both groups and work in the representation of the single group, so that the action is invariant under

$${x^\mu} \rightarrow \Lambda _\nu ^\mu \,{x^\nu}\,,\qquad {h_{\mu \nu}} \rightarrow \Lambda _{\,\mu}^\alpha \Lambda _{\,\nu}^\beta {h_{\alpha \beta}},\quad {\rm{and}}\quad {\phi ^a} \rightarrow \Lambda _{\,b}^a{\phi ^b}\,.$$
(8.57)

The Stückelberg fields ϕa then behave as Lorentz vectors under this identified group, and π defined previously behaves as a Lorentz scalar. The helicity-0 mode of the graviton also behaves as a scalar in this limit, and π captures the behavior of the graviton helicity-0 mode. So in the DL, the right requirement for the absence of the BD ghost is indeed that the equations of motion for π remain at most second order in (time) derivatives, as was pointed out in [173] (see also [111]). However, beyond the DL, the helicity-0 mode of the graviton does not behave as a scalar field and neither does the π in the split of the Stückelberg fields. So beyond the DL there is no reason to anticipate that π captures a whole degree of freedom, and indeed it does not. Beyond the DL, the equation of motion for π will typically involve higher derivatives, but the correct requirement for the absence of ghost is different, as explained in Section 7.2. One should instead go back to the original four scalar Stückelberg fields ϕa and check that out of these four fields only three are dynamical. This has been shown to be the case in Section 7.2. These three degrees of freedom, together with the two standard graviton polarizations, then give the correct five degrees of freedom and circumvent the BD ghost.

Recently, much progress has been made in deriving the decoupling limit about arbitrary backgrounds, see Ref. [369].

10.3.6 Decoupling limit on (Anti) de Sitter

10.3.6.1 Linearized theory and Higuchi bound

Before deriving the decoupling limit of massive gravity on (Anti) de Sitter, we first need to analyze the linearized theory so as to infer the proper canonical normalization of the propagating dofs and the proper scaling in the decoupling limit, similarly to what was done for massive gravity with a flat reference metric. For simplicity we focus on (3 + 1) dimensions here, and when relevant give the result in arbitrary dimensions. Linearized massive gravity on (A)dS was first derived in [307, 308]. Since we are concerned with the decoupling limit of ghost-free massive gravity, we follow in this section the procedure presented in [154]. We also focus on the dS case first before commenting on the extension to AdS.

At the linearized level about dS, ghost-free massive gravity reduces to the Fierz-Pauli action with \({g_{\mu \nu}} = {\gamma _{\mu \nu}} + {\tilde h_{\mu \nu}} = {\gamma _{\mu \nu}} + {h_{\mu \nu}}/{M_{{\rm{Pl}}}}\), where γμν is the dS metric with constant Hubble parameter H0,

$${\mathcal L}_{{\rm{MG}},\,{\rm{dS}}}^{(2)} = - {1 \over 4}{h^{\mu \nu}}({{\mathcal{\hat E}}_{{\rm{dS}}}})_{\mu \nu}^{\alpha \beta}\,{h_{\alpha \beta}} - {{{m^2}} \over 8}{\gamma ^{\mu \nu}}{\gamma ^{\alpha \beta}}({H_{\mu \alpha}}{H_{\nu \beta}} - {H_{\mu \nu}}{H_{\alpha \beta}})\,,$$
(8.58)

where Hμν is the tensor fluctuation as introduced in (2.80), although now considered about the dS metric,

$$\begin{array}{*{20}c} {{H_{\mu \nu}} = {h_{\mu \nu}} + 2{{{\nabla _{(\mu}}{A_{\nu)}}} \over m} + 2{{{\Pi _{\mu \nu}}} \over {{m^2}}}\quad \quad \quad \quad \quad \quad \quad \quad \quad \quad} \\ {- {1 \over {{M_{{\rm{Pl}}}}}}\left[ {{{{\nabla _\mu}{A_\alpha}} \over m} + {{{\Pi _{\mu \alpha}}} \over {{m^2}}}} \right]\left[ {{{{\nabla _\nu}{A_\beta}} \over m} + {{{\Pi _{\nu \beta}}} \over {{m^2}}}} \right]{\gamma ^{\alpha \beta}}\,,} \\ \end{array}$$
(8.59)

with \({\Pi _{\mu \nu}} = {\nabla _\mu}{\nabla _\nu}\pi\), ∇ being the covariant derivative with respect to the dS metric γμν, and indices raised and lowered with respect to this same metric. Similarly, \({({\hat {\mathcal E}}_{{\rm{dS}}})}\) is now the Lichnerowicz operator on de Sitter,

$$\begin{array}{*{20}c} {({{\hat {\mathcal E}}_{{\rm{dS}}}})_{\mu \nu}^{\alpha \beta}\,{h_{\alpha \beta}} = - {1 \over 2}\left[ {{\square h_{\mu \nu}} - 2{\nabla _{(\mu}}{\nabla _\alpha}h_{\nu)}^\alpha + {\nabla _\mu}{\nabla _\nu}h\quad \quad \quad \quad \quad \quad \quad \quad \quad \quad} \right.} \\ {\left. {- {\gamma _{\mu \nu}}(\square h - {\nabla _\alpha}{\nabla _\beta}{h^{\alpha \beta}}) + 6H_0^2\left({{h_{\mu \nu}} - {1 \over 2}h{\gamma _{\mu \nu}}} \right)} \right]\,.} \\ \end{array}$$
(8.60)

So at the linearized level and neglecting the vector fields, the helicity-0 and -2 modes of massive gravity on dS behave as

$$\begin{array}{*{20}c} {{\mathcal L}_{{\rm{MG}},\,{\rm{dS}}}^{(2)} = - {1 \over 4}{h^{\mu \nu}}({{\hat {\mathcal E}}_{{\rm{dS}}}})_{\mu \nu}^{\alpha \beta}\,{h_{\alpha \beta}} - {{{m^2}} \over 8}(h_{\mu \nu}^2 - {h^2}) - {1 \over 8}F_{\mu \nu}^2\quad \quad \quad \;} \\ {- {1 \over 2}{h^{\mu \nu}}({\Pi _{\mu \nu}} - [\Pi ]{\gamma _{\mu \nu}}) - {1 \over {2{m^2}}}([{\Pi ^2}] - {{[\Pi ]}^2})\,.} \\ \end{array}$$
(8.61)

After integration by parts, \([{\Pi ^2}] = {[\Pi ]^2} - 3{H^2}{(\partial \pi)^2}\). The helicity-2 and -0 modes are thus diagonalized as in flat space-time by setting \({h_{\mu \nu}} = {\bar h_{\mu \nu}} + \pi {\gamma _{\mu \nu}}\),

$$\begin{array}{*{20}c} {{\mathcal L}_{{\rm{MG}},\,{\rm{dS}}}^{(2)} = - {1 \over 4}{{\bar h}^{\mu \nu}}({{\hat {\mathcal E}}_{{\rm{dS}}}})_{\mu \nu}^{\alpha \beta}\,{{\bar h}_{\alpha \beta}} - {{{m^2}} \over 8}(\bar h_{\mu \nu}^2 - {{\bar h}^2}) - {1 \over 8}F_{\mu \nu}^2\quad \quad \;} \\ {- {3 \over 4}\left({1 - 2{{\left({{H \over m}} \right)}^2}} \right)\left({{{(\partial \pi)}^2} - {m^2}\bar h\pi - 2{m^2}{\pi ^2}} \right)\,.} \\ \end{array}$$
(8.62)
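The integration by parts used to pass from (8.61) to (8.62) can be spelled out as follows. This is a sketch that assumes the curvature convention \([{\nabla _\mu},{\nabla _\nu}]{V^\mu} = {R_{\alpha \nu}}{V^\alpha}\) and the four-dimensional dS Ricci tensor \({R_{\mu \nu}} = 3{H^2}{\gamma _{\mu \nu}}\); the symbol ≃ denotes equality up to total derivatives.

$$\begin{array}{rcl} {[{\Pi ^2}]} & = & {\nabla _\mu}{\nabla _\nu}\pi \,{\nabla ^\mu}{\nabla ^\nu}\pi \simeq - {\nabla _\nu}\pi \,{\nabla _\mu}{\nabla ^\nu}{\nabla ^\mu}\pi = - {\nabla _\nu}\pi \left({{\nabla ^\nu}\square \pi + {R^\nu}_\alpha {\nabla ^\alpha}\pi} \right) \\ & \simeq & {(\square \pi)^2} - {R_{\mu \nu}}{\nabla ^\mu}\pi {\nabla ^\nu}\pi = {[\Pi ]^2} - 3{H^2}{(\partial \pi)^2}\,. \end{array}$$

The last term of (8.61) therefore contributes \(+ {{3{H^2}} \over {2{m^2}}}{(\partial \pi)^2}\). Combined with the \(- {3 \over 4}{(\partial \pi)^2}\) that the shift of the helicity-2 field generates just as in flat space, this gives the prefactor \(- {3 \over 4}(1 - 2{H^2}/{m^2})\) of the kinetic term in (8.62). In d dimensions the same steps go through with 3H2 replaced by (d − 1)H2.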

The most important difference from linearized massive gravity on Minkowski is that the properly canonically normalized helicity-0 mode is now instead

$$\phi = \sqrt {1 - 2{{{H^2}} \over {{m^2}}}} \;\pi \,.$$
(8.63)

For a standard coupling of the form \({1 \over {{M_{{\rm{Pl}}}}}}\pi T\), where T is the trace of the stress-energy tensor, as we would infer from the coupling \({1 \over {{M_{{\rm{Pl}}}}}}{h_{\mu \nu}}{T^{\mu \nu}}\) after the shift \({h_{\mu \nu}} = {\bar h_{\mu \nu}} + \pi {\gamma _{\mu \nu}}\), this means that the properly normalized helicity-0 mode couples as

$${\mathcal L}_{{\rm{helicity - 0}}}^{{\rm{matter}}} = {m \over {{M_{{\rm{Pl}}}}\sqrt {{m^2} - 2{H^2}}}}\phi T\,,$$
(8.64)

and that coupling vanishes in the massless limit. This might suggest that in the massless limit m → 0, the helicity-0 mode decouples, which would imply the absence of the standard vDVZ discontinuity on (Anti) de Sitter [358, 430], unlike what was found on Minkowski, see Section 2.2.3, which confirms the Newtonian approximation presented in [186].

While this observation is correct on AdS, in the dS case one cannot take the massless limit without simultaneously sending H → 0 at least at the same rate. As a result, it would be incorrect to deduce that the helicity-0 mode decouples in the massless limit of massive gravity on dS.

To be more precise, the linearized action (8.62) is free from ghosts and tachyons only if m ≡ 0, which corresponds to GR, or if \({m^2} > 2{H^2}\), which corresponds to the well-known Higuchi bound [307, 190]. In d spacetime dimensions, the Higuchi bound is \({m^2} > (d - 2){H^2}\). In other words, on dS there is a forbidden range for the graviton mass: a theory with \(0 < {m^2} < 2{H^2}\) or with \({m^2} < 0\) always excites at least one ghost degree of freedom. Notice that this ghost (which we shall refer to as the Higuchi ghost from now on) is distinct from the BD ghost, which corresponded to an additional sixth degree of freedom. Here the theory propagates five dofs (in four dimensions) and is thus free from the BD ghost (at least at this level), but at least one of the five dofs is a ghost. When \(0 < {m^2} < 2{H^2}\), the ghost is the helicity-0 mode, while for \({m^2} < 0\), the ghost is the helicity-1 mode (at quadratic order the helicity-1 mode comes in as \(- {{{m^2}} \over 4}F_{\mu \nu}^2\)). Furthermore, when \({m^2} < 0\), both the helicity-2 and -0 modes are also tachyonic, although this is arguably not necessarily a severe problem, especially not if the graviton mass is of the order of the Hubble parameter today, as it would take an amount of time comparable to the age of the Universe to see the effect of this tachyonic behavior. Finally, the case \({m^2} = 2{H^2}\) (or \({m^2} = (d - 2){H^2}\) in d spacetime dimensions) represents the partially massless case where the helicity-0 mode disappears. As we shall see in Section 9.3, this is nothing other than a linear artefact and non-linearly the helicity-0 mode always reappears, so the PM case is infinitely strongly coupled and always pathological.
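The origin of the bound for the helicity-0 mode can be read off directly from (8.62). The following short check only tracks the sign of the helicity-0 kinetic term on dS (H2 > 0) and is meant as an illustration, not as a full stability analysis:

$$- {3 \over 4}\left({1 - {{2{H^2}} \over {{m^2}}}} \right){(\partial \pi)^2}\,,\qquad 1 - {{2{H^2}} \over {{m^2}}}\;\left\{ {\begin{array}{ll} { > 0} & {{\rm{for}}\;{m^2} > 2{H^2}\;{\rm{or}}\;{m^2} < 0\,,} \\ { = 0} & {{\rm{for}}\;{m^2} = 2{H^2}\,,} \\ { < 0} & {{\rm{for}}\;0 < {m^2} < 2{H^2}\,.} \\ \end{array}} \right.$$

For 0 < m2 < 2H2 the kinetic term has the wrong sign (the Higuchi ghost), and at m2 = 2H2 it vanishes (the partially massless point). For m2 < 0 the sign of the helicity-0 kinetic term is healthy, consistent with the ghost then sitting in the helicity-1 sector, whose quadratic term is proportional to −m2F2.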

A summary of the different bounds is provided below as well as in Figure 4:

  • \({m^2} < 0\): Helicity-1 modes are ghosts, helicity-2 and -0 modes are tachyonic, sick theory,

  • \({m^2} = 0\): General Relativity: two healthy (helicity-2) degrees of freedom, healthy theory,

  • \(0 < {m^2} < 2{H^2}\): One “Higuchi ghost” (helicity-0 mode) and four healthy degrees of freedom (helicity-2 and -1 modes), sick theory,

  • \({m^2} = 2{H^2}\): Partially Massless Gravity: Four healthy degrees of freedom (helicity-2 and -1 modes), and one infinitely strongly coupled dof (helicity-0 mode), sick theory,

  • \({m^2} > 2{H^2}\): Massive Gravity on dS: Five healthy degrees of freedom, healthy theory.

Figure 4: Degrees of freedom for massive gravity on a maximally symmetric reference metric. The only theoretically allowed regions are the upper left green region and the line m = 0 corresponding to GR.

10.3.6.2 Massless and decoupling limit
  • As one can see from Figure 4, in the case where \({H^2} < 0\) (corresponding to massive gravity on AdS), one can take the massless limit m → 0 while keeping the AdS length scale fixed in that limit. In that limit, the helicity-0 mode decouples from external matter sources and there is no vDVZ discontinuity. Notice however that the helicity-0 mode is nevertheless still strongly coupled at a low energy scale.

    When considering the decoupling limit m → 0, MPl → ∞ of massive gravity on AdS, we have a choice in how we treat the scale H in that limit. Keeping the AdS length scale fixed in that limit could lead to an interesting phenomenology in its own right, but is yet to be explored in depth.

  • In the dS case, the Higuchi forbidden region prevents us from taking the massless limit while keeping the scale H fixed. As a result, the massless limit is only consistent if H → 0 simultaneously as m → 0 and we thus recover the vDVZ discontinuity at the linear level in that limit.

    When considering the decoupling limit m → 0, MPl → ∞ of massive gravity on dS, we also have to send H → 0. If H/m → 0 in that limit, we then recover the same decoupling limit as for massive gravity on Minkowski, and all the results of Section 8.3 apply. The case of interest is thus when the ratio H/m remains fixed in the decoupling limit.

10.3.6.3 Decoupling limit

When taking the decoupling limit of massive gravity on dS, there are two additional contributions to take into account:

  • First, as mentioned in Section 8.3.5, care needs to be applied to properly identify the helicity-0 mode on a curved background. In the case of (A)dS, the formalism was provided in Ref. [154] by embedding a d-dimensional de Sitter spacetime into a flat (d + 1)-dimensional spacetime where the standard Stückelberg trick could be applied. As a result the ‘covariant’ fluctuation defined in (2.80) and used in (8.59) needs to be generalized to (see Ref. [154] for details)

    $$\begin{array}{*{20}c} {{1 \over {{M_{{\rm{Pl}}}}}}{H_{\mu \nu}} = {1 \over {{M_{{\rm{Pl}}}}}}{h_{\mu \nu}} + {2 \over {\Lambda _3^3}}{\Pi _{\mu \nu}} - {1 \over {\Lambda _3^6}}\Pi _{\mu \nu}^2\quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad} \\ {+ {1 \over {\Lambda _3^3}}{{{H^2}} \over {{m^2}}}\left({{{(\partial \pi)}^2}({\gamma _{\mu \nu}} - {2 \over {\Lambda _3^3}}{\Pi _{\mu \nu}}) - {1 \over {\Lambda _3^6}}{\Pi _{\mu \alpha}}{\Pi _{\nu \beta}}{\partial ^\alpha}\pi {\partial ^\beta}\pi} \right)} \\ {+ {H^2}{{{H^2}} \over {{m^2}}}{{{{(\partial \pi)}^4}} \over {\Lambda _3^3}} + \cdots \,.\quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad} \\ \end{array}$$
    (8.65)

    Any corrections in the third line vanish in the decoupling limit and can thus be ignored, but the corrections of order H2 in the second line lead to new non-trivial contributions.

  • Second, as already encountered at the linearized level, what were total derivatives in Minkowski (for instance the combination \([{\Pi ^2}] - {[\Pi ]^2}\)) now lead to new contributions on de Sitter. After integration by parts, \({m^{- 2}}([{\Pi ^2}] - {[\Pi ]^2}) = - 3({H^2}/{m^2}){(\partial \pi)^2}\), consistently with the linearized analysis above. This was the origin of the new kinetic structure for massive gravity on de Sitter and will have further effects in the decoupling limit when considering similar contributions from \({{\mathcal L}_{3,4}}(\Pi)\), where \({{\mathcal L}_{3,4}}\) are defined in (6.12, 6.13) or more explicitly in (6.17, 6.18).
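The statement that the third line of (8.65) drops out while the second line survives follows from simple power counting. As a sketch, write H2 = (H/m)2 m2 and keep both the ratio H/m and Λ3 fixed as m → 0 (the scaling assumed in this subsection). Then

$${1 \over {\Lambda _3^3}}{{{H^2}} \over {{m^2}}}{(\partial \pi)^2}{\gamma _{\mu \nu}} \to {\rm{finite}}\,,\qquad {H^2}\,{{{H^2}} \over {{m^2}}}\,( \cdots) = {m^2}{\left({{{{H^2}} \over {{m^2}}}} \right)^2}( \cdots) \to 0\,,$$

where (⋯) stands for the remaining π-dependent factor, which is built out of π and Λ3 only and is therefore held fixed. Any term carrying an explicit power of H2 beyond the ratio H2/m2 is thus suppressed by a positive power of m and disappears in the decoupling limit.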

Taking these two effects into account, we obtain the full decoupling limit for massive gravity on de Sitter,

$${\mathcal L}_{{\Lambda _3}}^{({\rm{dS}})} = {\mathcal L}_{{\Lambda _3}}^{(0)} + {{{H^2}} \over {{m^2}}}\sum\limits_{n = 2}^5 {{{{\lambda _n}} \over {\Lambda _3^{3(n - 1)}}}} {\mathcal L}_{({\rm{Gal}})}^{(n)}[\pi ]\,,$$
(8.66)

where \({\mathcal L}_{{\Lambda _3}}^{(0)}\) is the full Lagrangian obtained in the decoupling limit in Minkowski and given in (8.52), and \({\mathcal L}_{{\rm{(Gal)}}}^{(n)}\) are the Galileon Lagrangians as encountered previously. Notice that while the ratio H/m remains fixed, this decoupling limit is taken with H, m → 0, so all the fields in (8.66) live on a Minkowski metric. The constant coefficients λn depend on the free parameters of the ghost-free theory of massive gravity; for the theory (6.3) with α1 = 0 and α2 = 1, we have

$${\lambda _2} = {3 \over 2}\,,\quad {\lambda _3} = {3 \over 4}(1 + 2{\alpha _3})\,,\quad {\lambda _4} = {1 \over 4}(- 1 + 6{\alpha _4})\,,\quad {\lambda _5} = - {3 \over {16}}({\alpha _3} + 4{\alpha _4})\,.$$
(8.67)

At this point we may perform the same field redefinition (8.39) as in flat space and obtain the following semi-diagonalized decoupling limit,

$${\mathcal L}_{{\Lambda _3}}^{({\rm{dS}})} = - {1 \over 4}{h^{\mu \nu}}\hat{\mathcal E}_{\mu \nu}^{\alpha \beta}{h_{\alpha \beta}} + {{{\alpha _3} + 4{\alpha _4}} \over {8\Lambda _3^9}}{h^{\mu \nu}}X_{\mu \nu}^{(3)} + \sum\limits_{n = 2}^5 {{{{{\tilde c}_n}} \over {\Lambda _3^{3(n - 2)}}}} {\mathcal L}_{({\rm{Gal}})}^{(n)}[\pi ]$$
(8.68)

where the contributions from the helicity-1 modes are the same as the ones provided in (8.52), and the new coefficients \({\tilde c_n} = - {c_n}/4 + ({H^2}/{m^2}){\lambda _n}\) cancel identically for \({m^2} = 2{H^2}\), α3 = −1 and α4 = −α3/4 = 1/4, as pointed out in [154]; the same result holds for bi-gravity, as pointed out in [301]. Interestingly, for these specific parameters, the helicity-0 mode loses its kinetic term and any self-mixing, as well as any mixing with the helicity-2 mode. Nevertheless, the mixing between the helicity-1 and -0 modes presented in (8.52) is still alive. There is no choice of parameters that would remove the mixing with the helicity-1 mode and, as a result, the helicity-0 mode generically reappears through that mixing. The loss of its kinetic term implies that the field is infinitely strongly coupled on a configuration with zero vev for the helicity-1 mode, and the theory is thus ill-defined. This was confirmed in various independent studies, see Refs. [185, 147].
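As a quick arithmetic check of the special point quoted above, the coefficients (8.67) can be evaluated at α3 = −1, α4 = 1/4. This only uses (8.67) and the definition of the coefficients \({\tilde c_n}\); the values of cn themselves are not re-derived here.

$${\lambda _2} = {3 \over 2}\,,\quad {\lambda _3} = {3 \over 4}(1 - 2) = - {3 \over 4}\,,\quad {\lambda _4} = {1 \over 4}\left({- 1 + {6 \over 4}} \right) = {1 \over 8}\,,\quad {\lambda _5} = - {3 \over {16}}(- 1 + 1) = 0\,.$$

With H2/m2 = 1/2, the condition \({\tilde c_n} = 0\) is then equivalent to cn = 2λn. For n = 2 this reads c2 = 3, which matches the kinetic prefactor \(- {3 \over 4}(1 - 2{H^2}/{m^2})\) of (8.62) provided the quadratic Galileon is normalized as \({(\partial \pi)^2}\) (an assumption about the normalization made for this check). The same choice of parameters satisfies α3 + 4α4 = 0, which also removes the mixing \({h^{\mu \nu}}X_{\mu \nu}^{(3)}\) in (8.68).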

10.4 Λ3-decoupling limit of bi-gravity

We now proceed to derive the Λ3-decoupling limit of bi-gravity, and we will see how to recover the decoupling limit about any reference metric (including Minkowski and de Sitter) as special cases. As already seen in Section 8.3.4, the full DL is better formulated in the vielbein language, even though in that case Stückelberg fields ought to be introduced for both the broken diffeomorphism invariance and the broken local Lorentz invariance. Yet, this is a small price to pay to keep the action in a much simpler form. We thus proceed in the rest of this section by deriving the Λ3-decoupling limit of bi-gravity, starting from its vielbein formulation. We follow the derivation and formulation presented in [224]. As previously, we focus on (3 + 1)-spacetime dimensions, although the whole formalism is trivially generalizable to arbitrary dimensions.

We start with the action (5.43) for bi-gravity, with the interaction

$${{\mathcal L}_{g,f}} = {{M_{{\rm{Pl}}}^2{m^2}} \over 4}\int {{{\rm{d}}^4}} x\sqrt {- g} \sum\limits_{n = 0}^4 {{\alpha _n}} {{\mathcal L}_n}[{\mathcal K}[g,f]]$$
(8.69)
$$\begin{array}{*{20}c} {= - {{M_{{\rm{Pl}}}^2{m^2}} \over 2}{\varepsilon _{abcd}}\int {\left[ {{{{\beta _0}} \over {4!}}{e^a} \wedge {e^b} \wedge {e^c} \wedge {e^d} + {{{\beta _1}} \over {3!}}{f^a} \wedge {e^b} \wedge {e^c} \wedge {e^d}} \right.} \quad \quad \quad \quad \quad \quad \quad} \\ {\left. {+ {{{\beta _2}} \over {2!2!}}{f^a} \wedge {f^b} \wedge {e^c} \wedge {e^d} + {{{\beta _3}} \over {3!}}{f^a} \wedge {f^b} \wedge {f^c} \wedge {e^d} + {{{\beta _4}} \over {4!}}{f^a} \wedge {f^b} \wedge {f^c} \wedge {f^d}} \right]\,,} \\ \end{array}$$
(8.70)

where the relation between the α’s and the β’s is given in (6.28).

We now introduce Stückelberg fields ϕa = xa − χa for diffs and \(\Lambda _b^a\) for the local Lorentz symmetry. In the case of massive gravity, there was no ambiguity in how to perform this ‘Stückelbergization’, but in the case of bi-gravity one can ‘Stückelbergize’ either the metric fμν or the metric gμν. In other words, the broken diffs and local Lorentz symmetries can be restored by performing either one of the two replacements in (8.69),

$$f_\mu ^a \rightarrow \tilde f_\mu ^a = {\Lambda ^a}_bf_c^b(\phi (x))\,{\partial _\mu}{\phi ^c}\,.$$
(8.71)

or alternatively

$$e_\mu ^a \rightarrow \tilde e_\mu ^a = {\Lambda ^a}_be_c^b(\phi (x))\,{\partial _\mu}{\phi ^c}\,.$$
(8.72)

For now we stick to the first choice (8.71) but keep in mind that this freedom has deep consequences for the theory, and is at the origin of the duality presented in Section 10.7.

Since we are interested in the decoupling limit, we now perform the following splits (see Ref. [419] for more details),

$$\begin{array}{*{20}c} {e_\mu ^a = \bar e_\mu ^a + {1 \over {2{M_{{\rm{Pl}}}}}}h_\mu ^a\,,\qquad f_\mu ^a = \bar e_\mu ^a + {1 \over {2{M_f}}}v_\mu ^a} \\ {{\Lambda ^a}_b = {e^{{{\hat \omega}^a}_{\;\;b}}} = {\delta ^a}_b + {{\hat \omega}^a}_{\;\;b} + {1 \over 2}{{\hat \omega}^a}_{\;\;c}{{\hat \omega}^c}_{\;\;b} + \cdots \quad \quad \quad} \\ {\hat \omega _{\;\;\;b}^a = {{\omega _{\;\;\;b}^a} \over {m{M_{{\rm{Pl}}}}}}\quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad} \\ {{\partial _\mu}{\phi ^a} = {\partial _\mu}\left({{x^a} + {{{A^a}} \over {m{M_{{\rm{Pl}}}}}} + {{{\partial ^a}\pi} \over {\Lambda _3^3}}} \right)\quad \quad \quad \quad \quad \quad} \\ \end{array}$$
(8.73)

and perform the scaling or decoupling limit,

$${M_{{\rm{Pl}}}} \rightarrow \infty \,,\quad {M_f} \rightarrow \infty \,,\quad m \rightarrow 0$$
(8.74)

while keeping

$$\begin{array}{*{20}c} {{\Lambda _3} = {{({m^2}{M_{{\rm{Pl}}}})}^{{1 \over 3}}} \rightarrow {\rm{constant}}\,,\quad \,{M_{{\rm{Pl}}}}/{M_f} \rightarrow {\rm{constant}}\,,\quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad} \\ {{\rm{and}}\,\quad {\beta _n} \rightarrow {\rm{constant}}\,.} \\ \end{array}$$
(8.75)

Before performing any change of variables (any diagonalization), in addition to the quadratic kinetic terms for h, v and A, there are three contributions to the decoupling limit of bi-gravity:

  1. Mixing of the helicity-0 mode with the helicity-1 mode Aμ, as derived in (8.52),

  2. Mixing of the helicity-0 mode with the helicity-2 mode \(h_\mu ^a\), as derived in (8.40),

  3. Mixing of the helicity-0 mode with the new helicity-2 mode \(\upsilon _\mu ^a\),

noticing that before field redefinitions, the helicity-0 mode does not self-interact (its self-interactions are constructed so as to be total derivatives).

As already explained in Section 8.3.6, the first contribution ❶ arising from the mixing between the helicity-0 and -1 modes is the same (in the decoupling limit) as what was obtained in Minkowski (and is independent of the coefficients βn or αn). This implies that these contributions can be read off directly from the last three lines of (8.52). They are the most complicated part of the decoupling limit but remain unaffected by the dynamics of \(\upsilon _\mu ^a\), i.e., unaffected by the bi-gravity nature of the theory. This statement simply follows from scaling considerations: in the decoupling limit there cannot be any mixing between the helicity-1 mode and either of the two helicity-2 modes. As a result, the helicity-1 modes only mix with themselves and the helicity-0 mode. Hence, in the scaling limit (8.74, 8.75) the helicity-1 mode decouples from the massless spin-2 field.

Furthermore, the first line of (8.52), which corresponds to the dynamics of \(h_\mu ^a\) and the helicity-0 mode, is also unaffected by the bi-gravity nature of the theory. Hence, the second contribution ❷ is also the same as previously derived. As a result, the only new ingredient in bi-gravity is the mixing ❸ between the helicity-0 mode and the second helicity-2 mode \(\upsilon _\mu ^a\), given by a mixing of the form hμνXμν.

Unsurprisingly, these new contributions have the same form as ❷, with three distinctions: First, the way the coefficients enter in the expressions gets modified ever so slightly (β1 → β1/3 and β3 → 3β3). Second, in the mass term the space-time index for \(\upsilon _\mu ^a\) ought to be dressed with the Stückelberg field,

$$v_\mu ^a \rightarrow v_b^a{\partial _\mu}{\phi ^b} = v_b^a(\delta _\mu ^b + \Pi _\mu ^b/\Lambda _3^3)\,.$$
(8.76)

Finally, and most importantly, the helicity-2 field \(\upsilon _a^\mu\) (which enters in the mass term) is now a function of the ‘Stückelbergized’ coordinates ϕa, which in the decoupling limit means that for the mass term

$$v_b^a = v_b^a[{x^\mu} + {\partial ^\mu}\pi /\Lambda _3^3] \equiv v_b^a[\tilde x]\,.$$
(8.77)

These two effects do not need to be taken into account for the υ that enters in its standard curvature term as it is Lorentz and diff invariant.
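The dependence on the shifted coordinate in (8.77) can be unpacked, at least formally, by a Taylor expansion around x. The following is a sketch in which convergence of the series is simply assumed:

$$v_b^a[\tilde x] = v_b^a[x] + {{{\partial ^\mu}\pi} \over {\Lambda _3^3}}{\partial _\mu}v_b^a[x] + {1 \over 2}{{{\partial ^\mu}\pi \,{\partial ^\nu}\pi} \over {\Lambda _3^6}}{\partial _\mu}{\partial _\nu}v_b^a[x] + \cdots = \sum\limits_{n \ge 0} {{1 \over {n!}}} {{{\partial ^{{\mu _1}}}\pi \cdots {\partial ^{{\mu _n}}}\pi} \over {\Lambda _3^{3n}}}{\partial _{{\mu _1}}} \cdots {\partial _{{\mu _n}}}v_b^a[x]\,.$$

Written in terms of x, the mass term therefore contains an infinite tower of couplings between derivatives of υ and powers of \(\partial \pi/\Lambda _3^3\). Since Λ3 is held fixed, none of these terms is suppressed in the decoupling limit.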

Taking these three considerations into account, one obtains the decoupling limit for bi-gravity,

$$\begin{array}{*{20}c} {{\mathcal L}_{{\Lambda _3}}^{({\rm{bi - gravity}})} = {\mathcal L}_{{\Lambda _3}}^{(0)} - {1 \over 4}{v^{\mu \nu}}[x]\hat{\mathcal E}_{\mu \nu}^{\alpha \beta}{v_{\alpha \beta}}[x]\quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \;} \\ {- {1 \over 2}{{{M_{{\rm{Pl}}}}} \over {{M_f}}}{v^{\mu \beta}}[\tilde x]\left({\delta _\beta ^\nu + {{\Pi _\beta ^\nu} \over {\Lambda _3^3}}} \right)\sum\limits_{n = 0}^3 {{{{{\tilde \beta}_{n + 1}}} \over {\Lambda _3^{3(n - 1)}}}} X_{\mu \nu}^{(n)}[\Pi ]\,,} \\ \end{array}$$
(8.78)

with \({\tilde \beta _n} = {\beta _n}/[(4 - n)!(n - 1)!]\). Modulo the non-trivial dependence on the coordinate \(\tilde x = x + \partial \pi/\Lambda _3^3\), this is a remarkably simple decoupling limit for bi-gravity. Out of this decoupling limit we can re-derive all the decoupling limits found previously very elegantly.
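For concreteness, the rescaled coefficients entering (8.78) evaluate as follows, reading the definition above with both factorials in the denominator:

$${\tilde \beta _1} = {{{\beta _1}} \over {3!\,0!}} = {{{\beta _1}} \over 6}\,,\quad {\tilde \beta _2} = {{{\beta _2}} \over {2!\,1!}} = {{{\beta _2}} \over 2}\,,\quad {\tilde \beta _3} = {{{\beta _3}} \over {1!\,2!}} = {{{\beta _3}} \over 2}\,,\quad {\tilde \beta _4} = {{{\beta _4}} \over {0!\,3!}} = {{{\beta _4}} \over 6}\,.$$

One way to understand these factors is that the n-th term of (8.70) carries the weight βn/[n!(4 − n)!], and singling out one of its n factors of f multiplies this by n, giving n/[n!(4 − n)!] = 1/[(n − 1)!(4 − n)!]. This reading is offered as a plausibility check and not as a substitute for the derivation in [224].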

Notice as well the presence of a tadpole for υ if β1 ≠ 0. When this tadpole vanishes (as well as the one for h), one can further take the limit Mf → ∞ keeping all the other β’s fixed as well as Λ3, and recover straight away the decoupling limit of massive gravity on Minkowski found in (8.52), with a free and fully decoupled massless spin-2 field.

In the presence of a cosmological constant for both metrics (and thus a tadpole in this framework), we can also take the limit Mf → ∞ and recover straight away the decoupling limit of massive gravity on (A)dS, as obtained in (8.66).

This illustrates the strength of this generic decoupling limit for bi-gravity (8.78). In principle we could even go further and derive the decoupling limit of massive gravity on an arbitrary reference metric as performed in [224]. To obtain a general reference metric we first need to add an external source for \({v_{\mu \nu}}\) that generates a background \({\bar V_{\mu \nu}} = ({M_f}/{M_{{\rm{Pl}}}}){\bar U_{\mu \nu}}\). The reference metric is thus expressed in the local inertial frame as

$${f_{\mu \nu}} = {\eta _{\mu \nu}} + {1 \over {{M_f}}}{\bar V_{\mu \nu}} + {1 \over {4M_f^2}}{\bar V_{\mu \alpha}}{\bar V_{\beta \nu}}{\eta ^{\alpha \beta}} + {1 \over {{M_f}}}{v_{\mu \nu}} + {\mathcal O}(M_f^{- 2})$$
(8.79)
$$= {\eta _{\mu \nu}} + {1 \over {{M_{{\rm{Pl}}}}}}{\bar U_{\mu \nu}} + {1 \over {{M_f}}}{v_{\mu \nu}} + {\mathcal O}{({M_{{\rm{Pl}}}},{M_f})^{- 2}}\,.$$
(8.80)
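The step from (8.79) to (8.80) only uses the relation between the background tensors \(\bar V\) and \(\bar U\) stated above. As a short check,

$${1 \over {{M_f}}}{\bar V_{\mu \nu}} = {1 \over {{M_f}}}{{{M_f}} \over {{M_{{\rm{Pl}}}}}}{\bar U_{\mu \nu}} = {1 \over {{M_{{\rm{Pl}}}}}}{\bar U_{\mu \nu}}\,,\qquad {1 \over {4M_f^2}}{\bar V_{\mu \alpha}}{\bar V_{\beta \nu}}{\eta ^{\alpha \beta}} = {1 \over {4M_{{\rm{Pl}}}^2}}{\bar U_{\mu \alpha}}{\bar U_{\beta \nu}}{\eta ^{\alpha \beta}} = {\mathcal O}(M_{{\rm{Pl}}}^{- 2})\,,$$

so the term linear in \(\bar V\) becomes the \(\bar U/{M_{{\rm{Pl}}}}\) term of (8.80), while the quadratic term is absorbed into the remainder.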

The fact that the metric looks like a perturbation away from Minkowski is related to the fact that the curvature needs to scale as m2 in the decoupling limit in order to avoid the issues previously mentioned in the discussion of Section 8.2.3.

We can then perform the scaling limit Mf → ∞, while keeping the β’s and the scale \({\Lambda _3} = {({M_{{\rm{Pl}}}}{m^2})^{1/3}}\) fixed, as well as the field υμν and the fixed tensor \({\bar U_{\mu \nu}}\). The decoupling limit is then simply given by

$$\begin{array}{*{20}c} {{\mathcal L}_{{\Lambda _3}}^{({\rm{\bar U}})} = {\mathcal L}_{{\Lambda _3}}^{(0)} - {1 \over 2}{{\bar U}^{\mu \beta}}[\tilde x]\left({\delta _\beta ^\nu + {{\Pi _\beta ^\nu} \over {\Lambda _3^3}}} \right)\sum\limits_{n = 0}^3 {{{{{\tilde \beta}_{n + 1}}} \over {\Lambda _3^{3(n - 1)}}}} X_{\mu \nu}^{(n)}[\Pi ]} \\ {- {1 \over 4}{v^{\mu \nu}}\hat{\mathcal E}_{\mu \nu}^{\alpha \beta}{v_{\alpha \beta}}\,,\quad \quad \quad \quad \quad \quad \quad \quad \quad} \\ \end{array}$$
(8.81)

where the helicity-2 field υ fully decouples from the rest of the massive gravity sector on the first line, which carries the other helicity-2 field as well as the helicity-1 and -0 modes. Notice that the general metric \(\bar U\) has only an effect on the helicity-0 self-interactions, through the second term on the first line of (8.81) (just as observed for the decoupling limit on AdS). These new interactions are ghost-free and look like Galileons for conformally flat \({\bar U_{\mu \nu}} = \lambda {\eta _{\mu \nu}}\), with λ constant, but not in general. In particular, the interactions found in (8.81) would not be the covariant Galileons found in [166, 161, 157] (nor the ones found in [237]) for a generic metric.

11 Extensions of Ghost-free Massive Gravity

Massive gravity can be seen as a theory of a spin-2 field with the following free parameters in addition to the standard parameters of GR (e.g., the cosmological constant, etc…),

  • Reference metric fab,

  • Graviton mass m,

  • (d − 2) dimensionless parameters αn (or the β’s).

As natural extensions of massive gravity one can make any of these parameters dynamical. As already seen, the reference metric can be made dynamical, leading to bi-gravity which, in addition to the massive spin-2 field, carries a massless one as well.

Another natural extension is to promote the graviton mass m, or any of the free parameters αn (or βn), to a function of a new dynamical variable, say an additional scalar field. In principle the mass and the parameters α can be thought of as potentials for an arbitrary number of scalar fields, m = m(ψj), αn = αn(ψj), and not necessarily the same fields for each one of them [320]. So long as these functions are pure potentials and hide no kinetic terms for any new degree of freedom, the constraint analysis performed in Section 7 will go relatively unaffected, and the theory remains free from the BD ghost. This was shown explicitly for the mass-varying theory [319, 315] (where the mass is promoted to a scalar function of a single new scalar field, m = m(ϕ), while the parameters α remain constant; see Footnote 23), as well as for a general massive scalar-tensor theory [320], and for the quasi-dilaton, which allows for different couplings between the spin-2 and the scalar field, motivated by scale invariance. We review these models below in Sections 9.1 and 9.2.

Alternatively, rather than considering the parameters m and αn as arbitrary, one may set them to special values of particular interest depending on the reference metric fμν. Rather than ‘extensions’ per se, these are special cases in the parameter space. The first obvious one is m = 0 (for arbitrary reference metric and parameters), for which one recovers the theory of GR (so long as the spin-2 field couples to matter in a covariant way to start with). Alternatively, one may also sit on the Higuchi bound (see Section 8.3.6), with the parameters \({m^2} = 2{H^2}\), α3 = −1/3 and α4 = 1/12 in four dimensions. This corresponds to the partially massless theory of gravity, which at the moment is pathological in its simplest realization and will be reviewed below in Section 9.3.

The coupling of massive gravity to a DBI Galileon [157] was considered in [237, 461, 261], leading to a generalized Galileon theory which maintains a Galileon symmetry on curved backgrounds. This theory was shown to be free of any Ostrogradsky ghost in [19]; its cosmology was recently studied in [315] and its perturbations in [20].

Finally, as other extensions to massive gravity, one can also consider all the extensions applicable to GR. This includes the higher order Lovelock invariants in dimensions greater than four, as well as promoting the Einstein-Hilbert kinetic term to a function f (R), which is equivalent to gravity with a scalar field. In the case of massive gravity this has been performed in [89] (see also [46, 354]), where the absence of BD ghost was proven via a constraint analysis, and the cosmology was explored (this was also discussed in Section 5.6 and see also Section 12.5). f (R) extensions to bi-gravity were also derived in [416, 415].

Trace-anomaly driven inflation in bi-gravity was also explored in Ref. [47]. Massless quantum effects can be taken into account by including the trace anomaly \({{\mathcal T}_A}\) given as [203]

$${\mathcal{T}_A} = {c_1}({1 \over 3}{R^2} - 2R_{\mu \nu}^2 + R_{\mu \nu \alpha \beta}^2 + {2 \over 3}\square R) + {c_2}({R^2} - 4R_{\mu \nu}^2 + R_{\mu \nu \alpha \beta}^2) + {c_3}\square R,$$
(9.1)

where c1,2,3 are three constants depending on the field content (for instance the number of scalars, spinors, vectors, gravitons, etc.). Including this trace anomaly in bi-gravity, de Sitter-like solutions were found which could represent a good model for anomaly-driven inflation.

11.1 Mass-varying

The idea behind mass-varying gravity is to promote the graviton mass to a potential for an external scalar field \(\psi\), \(m \to m(\psi)\), which has its own dynamics [319], so that in four dimensions, the dRGT action for massive gravity gets promoted to

$$\begin{array}{*{20}c} {{\mathcal{L}_{{\rm{Mass - Varying}}}} = {{M_{{\rm{Pl}}}^2} \over 2}\int {{{\rm{d}}^4}} x\sqrt {- g} \left({R + {{{m^2}(\psi)} \over 2}\sum\limits_{n = 0}^4 {{\alpha _n}} {\mathcal{L}_n}[\mathcal{K}]\quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad} \right.} \\ {\left. {- {1 \over 2}{g^{\mu \nu}}{\partial _\mu}\psi {\partial _\nu}\psi - W(\psi)} \right),} \end{array}$$
(9.2)

and the tensors \({\mathcal K}\) are given in (6.7). This could also be performed for bi-gravity, where we would simply include the Einstein-Hilbert term for the metric \(f_{\mu\nu}\). This formulation was then promoted not only to varying parameters \(\alpha_n \to \alpha_n(\psi)\) but also to multiple fields \(\psi_A\), with \(A = 1, \cdots, {\mathcal N}\) in [320],

$$\begin{array}{*{20}c} {{\mathcal{L}_{{\rm{Generalized}}\;{\rm{MG}}}} = {{M_{{\rm{Pl}}}^2} \over 2}\int {{{\rm{d}}^4}} x\sqrt {- g} \left[ {\Omega ({\psi _A})R + {1 \over 2}\sum\limits_{n = 0}^4 {{\alpha _n}} ({\psi _A}){\mathcal{L}_n}[\mathcal{K}]} \right.} \\ {\left. {- {1 \over 2}{g^{\mu \nu}}{\partial _\mu}{\psi _A}{\partial _\nu}{\psi ^A} - W({\psi _A})} \right].} \end{array}$$
(9.3)

The proof of the absence of the BD ghost in these theories was performed in [319] and [320] in unitary gauge, in the ADM language, by means of a constraint analysis as formulated in Section 7.1. We recall that in the absence of the scalar field \(\psi\), the primary second-class (Hamiltonian) constraint is given by

$${\mathcal{C}_0} = {\mathcal{R}_0}(\gamma ,p) + {D^i}_j{n^j}{\mathcal{R}_i}(\gamma ,p) + {m^2}{\mathcal{U}_0}(\gamma ,n(\gamma ,p)) \approx 0.$$
(9.4)

In the case of a mass-varying theory of gravity, the entire argument remains the same, with the simple addition of the scalar field contribution,

$$\begin{array}{*{20}c} {\mathcal{C}_0^{{\rm{mass - varying}}} = {{\tilde{\mathcal{R}}}_0}(\gamma ,p,\psi ,{p_\psi}) + {D^i}_j{n^j}{{\tilde{\mathcal{R}}}_i}(\gamma ,p,\psi ,{p_\psi}) + {m^2}(\psi){\mathcal{U}_0}(\gamma ,n(\gamma ,p))} \\ {\approx 0,\quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad} \end{array}$$
(9.5)

where \(p_\psi\) is the conjugate momentum associated with the scalar field \(\psi\) and

$${\tilde{\mathcal{R}}_0}(\gamma ,p,\psi ,{p_\psi}) = {\mathcal{R}_0}(\gamma ,p) + {1 \over 2}\sqrt \gamma {\partial _i}\psi {\partial ^i}\psi + {1 \over {2\sqrt \gamma}}p_\psi ^2$$
(9.6)
$${\tilde{\mathcal{R}}_i}(\gamma ,p,\psi ,{p_\psi}) = {\mathcal{R}_i}(\gamma ,p) + {p_\psi}{\partial _i}\psi .$$
(9.7)

Then the time-evolution of this primary constraint leads to a secondary constraint, as in Section 7.1. The expression for this secondary constraint is the same as in (7.33) with a benign new contribution from the scalar field [319]

$$\begin{array}{*{20}c} {{{\tilde{\mathcal{C}}}_2} = {\mathcal{C}_2} + {{\partial {m^2}(\psi)} \over {\partial \psi}}\left[ {{\mathcal{U}_0}{\partial _i}\psi (\bar{\mathcal{N}} {n^i} + {{\bar{\mathcal{N}}}^i}) + {{\bar{\mathcal{N}}} \over {\sqrt \gamma}}{\mathcal{U}_1}{p_\psi} + \bar{\mathcal{N}} {\partial _i}\psi {D^i}_k{n^k}} \right]} \\ {\approx 0.\quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad} \end{array}$$
(9.8)

Then, as in the normal fixed-mass case, the tertiary constraint is a constraint for the lapse and the system of constraints truncates, leading to 5+1 physical degrees of freedom in four dimensions. The same logic goes through for generalized massive gravity as explained in [320].
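
The counting behind the "5+1" can be sketched in a few lines. This is only an illustrative bookkeeping exercise, assuming the standard rule that a pair of second-class constraints removes one phase-space pair and that the spatial metric and its conjugate momentum are the only gravitational phase-space variables in unitary gauge. The function names are hypothetical and are not taken from [319, 320].

```python
def massive_graviton_dof(d: int = 4) -> int:
    """Degrees of freedom of a ghost-free massive graviton in d spacetime dimensions.

    Assumption: the phase space is spanned by the spatial metric gamma_ij and
    its conjugate momentum p^ij, and the primary constraint C_0 together with
    its secondary constraint removes one phase-space pair (the BD ghost).
    """
    spatial_metric_components = (d - 1) * d // 2   # symmetric (d-1) x (d-1) matrix
    phase_space_dim = 2 * spatial_metric_components
    second_class_constraints = 2                   # C_0 and its secondary constraint
    return (phase_space_dim - second_class_constraints) // 2


def mass_varying_dof(d: int = 4, n_scalars: int = 1) -> int:
    """Each scalar field (psi, p_psi) adds one unconstrained phase-space pair."""
    return massive_graviton_dof(d) + n_scalars
```

In four dimensions this gives five graviton modes plus one scalar mode, and for general `d` the first function agrees with the usual count \((d+1)(d-2)/2\) for a massive spin-2 field.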

One of the important aspects of a mass-varying theory of massive gravity is that it allows more flexibility for the graviton mass. In the past the mass could have been much larger and could have led to potentially interesting features, be it for inflation (see for instance Refs. [315, 378] and [282]), the Hartle-Hawking no-boundary proposal [498, 439, 499], or to avoid the Higuchi bound [307], and yet be compatible with current bounds on the graviton mass. If the graviton mass is an effective description from higher dimensions it is also quite natural to imagine that the graviton mass would depend on some moduli.

11.2 Quasi-dilaton

The Planck scale \(M_{\rm Pl}\), or Newton constant, explicitly breaks scale invariance, but one can easily extend the theory of GR to a scale-invariant one, \(M_{\rm Pl} \to M_{\rm Pl}\, e^{\lambda(x)}\), by including a dilaton scalar field \(\lambda\) which naturally arises from string theory or from extra dimension compactification (see for instance [122] and see Refs. [429, 120, 248] for the role of a dilaton scalar field on cosmology).

When dealing with multi-gravity, one can extend the notion of conformal transformation to the global rescaling of the coordinate system of one metric with respect to that of another metric. In the case of massive gravity this amounts to considering the global rescaling of the reference coordinates with respect to the physical one. As already seen, the reference metric can be promoted to a tensor with respect to transformations of the physical metric coordinates, by introducing four Stückelberg fields \(\phi^a\), \(f_{\mu\nu} \to f_{ab}\,\partial_\mu\phi^a\,\partial_\nu\phi^b\). Thus the theory can be made invariant under global rescaling of the reference metric if the reference metric is promoted to a function of the quasi-dilaton scalar field \(\sigma\),

$${f_{ab}}{\partial _\mu}{\phi ^a}{\partial _\nu}{\phi ^b} \rightarrow {e^{2\sigma /{M_{{\rm{Pl}}}}}}{f_{ab}}{\partial _\mu}{\phi ^a}{\partial _\nu}{\phi ^b}.$$
(9.9)

This is the idea behind the quasi-dilaton theory of massive gravity proposed in Ref. [119]. The theoretical consistency of this model was explored in [119] and is reviewed below. The Vainshtein mechanism and the cosmology were also explored in [119, 118] as well as in Refs. [288, 243, 127] and we review the cosmology in Section 12.5. As we shall see in that section, one of the interests of quasi-dilaton massive gravity is the existence of spatially flat FLRW solutions, and particularly of self-accelerating solutions. Nevertheless, such solutions have been shown to be strongly coupled within the region of interest [118], but an extension of that model was proposed in [127] and shown to be free from such issues.

Recently, the decoupling limit of the original quasi-dilaton model was derived in [239]. Interestingly, a new self-accelerating solution was found in this model which admits no instability and all the modes are (sub)luminal for a given realistic set of parameters. The extension of this solution to the full theory (beyond the decoupling limit) should provide for a consistent self-accelerating solution which is guaranteed to be stable (or with a harmless instability time scale of the order of the age of the Universe at least).

11.2.1 Theory

As already mentioned, the idea behind quasi-dilaton massive gravity (QMG) is to extend massive gravity to a theory which admits a new global symmetry. This is possible via the introduction of a quasi-dilaton scalar field σ (x). The action for QMG is thus given by

$$\begin{array}{*{20}c} {{S_{{\rm{QMG}}}} = {{M_{{\rm{Pl}}}^2} \over 2}\int {{{\rm{d}}^4}} x\sqrt {- g} \left[ {R - {\omega \over {2M_{{\rm{Pl}}}^2}}{{(\partial \sigma)}^2} + {{{m^2}} \over 2}\sum\limits_{n = 0}^4 {{\alpha _n}} {\mathcal{L}_n}[\tilde{\mathcal{K}}[g,\eta ]]} \right]} \\ {+ \int {{{\rm{d}}^4}} x\sqrt {- g} {\mathcal{L}_{{\rm{matter}}}}(g,\psi),} \end{array}$$
(9.10)

where \(\psi\) represents the matter fields, \(g\) is the dynamical metric, and unless specified otherwise all indices are raised and lowered with respect to \(g\), and \(R\) represents the scalar curvature with respect to \(g\). The Lagrangians \(\mathcal{L}_n\) were expressed in (6.9)–(6.13) or (6.14)–(6.18) and the tensor \(\tilde{\mathcal K}\) is given in terms of the Stückelberg fields as

$${\tilde{\mathcal{K}}^\mu}_\nu [g,\eta ] = {\delta ^\mu}_\nu - {e^{\sigma /{M_{{\rm{Pl}}}}}}\sqrt {{g^{\mu \alpha}}{\partial _\alpha}{\phi ^a}{\partial _\nu}{\phi ^b}{\eta _{ab}}} .$$
(9.11)

In the case of the QMG presented in [119], there is no cosmological constant nor tadpole (α0 = α1 = 0) and α2 = 1. This is a very special case of the generalized theory of massive gravity presented in [320], and the proof for the absence of BD ghost thus goes through in the same way. Here again the presence of the scalar field brings only minor modifications to the Hamiltonian analysis in the ADM language as presented in Section 9.1, and so we do not reproduce the proof here. We simply note that the theory propagates six degrees of freedom in four dimensions and is manifestly free of any ghost on flat space time provided that ω > 1/6. The key ingredient compared to mass-varying gravity or generalized massive gravity is the presence of a global rescaling symmetry which is both a space-time and internal transformation [119],

$${x^\mu} \rightarrow {e^\xi}{x^\mu},\quad {g_{\mu \nu}} \rightarrow {e^{- 2\xi}}{g_{\mu \nu}},\quad \sigma \rightarrow \sigma - {M_{{\rm{Pl}}}}\xi ,\quad {\rm{and}}\quad {\phi ^a} \rightarrow {e^\xi}{\phi ^a}.$$
(9.12)

Notice that the matter action \({{\rm{d}}^{\rm{4}}}x\sqrt {- g} {\mathcal L}(g,\psi)\) breaks this symmetry, which is why it is called a ‘quasi-dilaton’.

An interesting feature of QMG is the fact that the decoupling limit leads to a bi-Galileon theory, one Galileon being the helicity-0 mode presented in Section 8.3, and the other Galileon being the quasi-dilaton σ. Just as in massive gravity, there are no irrelevant operators arising at energy scale below Λ3, and at that scale the theory is given by

$$\mathcal{L}_{{\Lambda _3}}^{({\rm{QMG}})} = \mathcal{L}_{{\Lambda _3}}^{(0)} - {\omega \over 2}{(\partial \sigma)^2} + {1 \over 2}\sigma \sum\limits_{n = 1}^4 {{{(4 - n){\alpha _n} - (n + 1){\alpha _{n + 1}}} \over {\Lambda _3^{3(n - 1)}}}} {\mathcal{L}_n}[\Pi ],$$
(9.13)

where the decoupling limit Lagrangian \({\mathcal L}_{{\Lambda _3}}^{(0)}\) in the absence of the quasi-dilaton is given in (8.52) and we recall that \(\alpha_2 = 1\), \(\alpha_1 = 0\), \(\Pi^\mu_{\ \nu} = \partial^\mu\partial_\nu\pi\) and the Lagrangians \(\mathcal{L}_n\) are expressed in (6.10)–(6.13) or (6.15)–(6.18). We see emerging a bi-Galileon theory for \(\pi\) and \(\sigma\), and thus the decoupling limit is manifestly ghost-free. We could then apply a similar argument as in Section 7.2.4 to infer the absence of BD ghost for the full theory based on this decoupling limit. Up to integration by parts, the Lagrangian (9.13) is invariant under the two independent Galilean transformations \(\pi \to \pi + c + \upsilon_\mu x^\mu\) and \(\sigma \to \sigma + \tilde c + {\tilde \upsilon _\mu}{x^\mu}\).

One relevant aspect of this decoupling limit is that it makes the study of the Vainshtein mechanism more explicit. As we shall see in what follows (see Section 10.1), the Galileon interactions are crucial for the Vainshtein mechanism to work.

Note that in (9.13), the interactions with the quasi-dilaton come in the combination \((4-n)\alpha_n - (n+1)\alpha_{n+1}\), while in \({\mathcal L}_{{\Lambda _3}}^{(0)}\), the interactions between the helicity-0 and -2 modes come in the combination \((4-n)\alpha_n + (n+1)\alpha_{n+1}\). This implies that in massive gravity, the interactions between the helicity-2 and -0 modes disappear in the special case where \(\alpha_n = -\frac{n+1}{4-n}\,\alpha_{n+1}\) (this corresponds to the minimal model), and the Vainshtein mechanism is no longer active for spherically symmetric sources (see Refs. [99, 56, 58, 57, 435]). In the case of QMG, the interactions with the quasi-dilaton survive in that specific case \(\alpha_3 = -4\alpha_4\), and a Vainshtein mechanism could still be feasible, although one might still need to consider non-asymptotically Minkowski configurations.

The cosmology of QMG was first discussed in [119] where the existence of self-accelerating solutions was pointed out. This will be reviewed in the section on cosmology, see Section 12.5. We now turn to the extended version of QMG recently proposed in Ref. [127].

11.2.2 Extended quasi-dilaton

Keeping the same philosophy as the quasi-dilaton in mind, a simple yet powerful extension was proposed in Ref. [127] and then further extended in [126], leading to interesting phenomenology and stable self-accelerating solutions. The phenomenology of this model was then further explored in [45]. The stability of the extended quasi-dilaton theory of massive gravity was explored in [353], and the theory was proven to be ghost-free in [406].

The key ingredient behind the extended quasi-dilaton theory of massive gravity (EMG) is to notice that the two most important properties of QMG, namely the absence of BD ghost and the existence of a global scaling symmetry, are preserved if the covariantized reference metric is further generalized to include a disformal contribution of the form \(\partial_\mu\sigma\,\partial_\nu\sigma\) (such a contribution to the reference metric can arise naturally from the brane-bending mode in higher dimensional braneworld models, see for instance [157]).

The action for EMG then takes the same form as in (9.10) with the tensor \(\tilde {\mathcal K}\) promoted to

$$\tilde{\mathcal{K}} \rightarrow \bar{\mathcal{K}} = \mathbb{I}- {e^{\sigma /{M_{{\rm{Pl}}}}}}\sqrt {{g^{- 1}}\bar f} ,$$
(9.14)

with the tensor \(\bar f_{\mu\nu}\) defined as

$${\bar f_{\mu \nu}} = {\partial _\mu}{\phi ^a}{\partial _\nu}{\phi ^b}{\eta _{ab}} - {{{\alpha _\sigma}} \over {{M_{{\rm{Pl}}}}\Lambda _3^3}}{e^{- 2\sigma /{M_{{\rm{Pl}}}}}}{\partial _\mu}\sigma {\partial _\nu}\sigma ,$$
(9.15)

where \(\alpha_\sigma\) is a new dimensionless coupling constant (as mentioned in [127], this coupling constant is expected to enjoy a non-renormalization theorem in the decoupling limit, and thus to receive quantum corrections which are always suppressed by at least \({m^2}/\Lambda _3^2\)). Furthermore, this action can be generalized further by

  • Considering different coupling constants for the \(\tilde {\mathcal K}\)’s entering in \({{\mathcal L}_2}[\tilde {\mathcal K}]\), \({{\mathcal L}_3}[\tilde {\mathcal K}]\) and \({{\mathcal L}_4}[\tilde {\mathcal K}]\).

  • Introducing what would be a cosmological constant for the metric \(\bar f\), namely a new term of the form \(\sqrt {- \bar f} {e^{4\sigma/{M_{{\rm{Pl}}}}}}\).

  • Including general shift-symmetric Horndeski Lagrangians for the quasi-dilaton.

With these further generalizations, one can obtain self-accelerating solutions similar to those of the original QMG. For these self-accelerating solutions, the coupling constant \(\alpha_\sigma\) does not enter the background equations of motion but plays a crucial role for the stability of the scalar perturbations on top of these solutions. This is one of the benefits of this extended quasi-dilaton theory of massive gravity.

11.3 Partially massless

11.3.1 Motivations behind PM gravity

The multiple proofs for the absence of BD ghost presented in Section 7 ensure that the ghost-free theory of massive gravity (or dRGT) does not propagate more than five physical degrees of freedom in the graviton. For a generic finite mass \(m\) the theory propagates exactly five degrees of freedom, as can be shown from a linear analysis about a generic background. Yet, one can ask whether there exist special points in parameter space where some of the degrees of freedom decouple. General relativity, for which \(m = 0\) (and the other parameters \(\alpha_n\) are finite), is one such example. In the massless limit of massive gravity the two helicity-1 modes and the helicity-0 mode decouple from the helicity-2 mode and we thus recover the theory of a massless spin-2 field corresponding to GR, and three decoupled degrees of freedom. The decoupling of the helicity-0 mode occurs via the Vainshtein mechanism as we shall see in Section 10.1.

As seen in Section 8.3.6, when considering massive gravity on de Sitter as a reference metric, if the graviton mass is precisely \(m^2 = 2H^2\), the helicity-0 mode disappears linearly as can be seen from the linearized Lagrangian (8.62). The same occurs in any dimension when the graviton mass is tied to the de Sitter curvature by the relation \(m^2 = (d-2)H^2\). This special case is another point in parameter space where the helicity-0 mode could be decoupled, corresponding to a partially massless (PM) theory of gravity as first pointed out by Deser and Waldron [190, 189, 188] (see also [500] for partially massless higher spin, and [450] for related studies).

The absence of helicity-0 mode at the linearized level in PM is tied to the existence of a new scalar gauge symmetry at the linearized level when \(m^2 = 2H^2\) (or \((d-2)H^2\) in arbitrary dimensions), which is responsible for making the helicity-0 mode unphysical. Indeed the action (8.62) is invariant under a special combination of a linearized diffeomorphism and a conformal transformation [190, 189, 188],

$${h_{\mu \nu}} \rightarrow {h_{\mu \nu}} + {\nabla _\mu}{\nabla _\nu}\xi - (d - 2){H^2}\xi {\gamma _{\mu \nu}}.$$
(9.16)

If a non-linear completion of PM gravity exists, then there must exist a non-linear completion of this symmetry which eliminates the helicity-0 mode to all orders. The existence of such a symmetry would lead to several outstanding features:

  • It would protect the structure of the potential.

  • In the PM limit of massive gravity, the helicity-0 mode fully decouples from the helicity-2 mode and hence from external matter. As a consequence, there is no Vainshtein mechanism that decouples the helicity-0 mode in the PM limit of massive gravity, unlike in the massless limit. Rather, the helicity-0 mode simply decouples without invoking any strong coupling effects and the theoretical and observational baggage that goes with it.

  • Last but not least, in PM gravity the symmetry underlying the theory is not diffeomorphism invariance but rather the one pointed out in (9.16). This means that in PM gravity, an arbitrary cosmological constant does not satisfy the symmetry (unlike in GR). Rather, the value of the cosmological constant is fixed by the gauge symmetry and is proportional to the graviton mass. As we shall see in Section 10.3 the graviton mass does not receive large quantum corrections (it is technically natural to set it to small values). So, if a PM theory of gravity existed it would have the potential to tackle the cosmological constant problem.

Crucially, breaking of covariance implies that matter is no longer covariantly conserved. Instead the failure of energy conservation is proportional to the graviton mass,

$${\nabla _\mu}{\nabla _\nu}{T^{\mu \nu}} = - {{{m^2}} \over {d - 2}}T,$$
(9.17)

which in practice is extremely small.

It is worth emphasizing that if a PM theory of gravity existed, it would be distinct from the minimal model of massive gravity where the non-linear interactions between the helicity-0 and -2 modes vanish in the decoupling limit but the helicity-0 mode is still fully present. PM gravity is also distinct from some specific branches of solutions found in cosmology (see Section 12) on top of which the helicity-0 mode disappears. If a PM theory of gravity exists, the helicity-0 mode would be fully absent from the whole theory and not only for some specific branches of solutions.

11.3.2 The search for a PM theory of gravity

11.3.2.1 A candidate for PM gravity:

The previous considerations represent some strong motivations for finding a fully fledged theory of PM gravity (i.e., beyond the linearized theory) and there have been many studies to find a non-linear realization of the PM symmetry. So far, all these studies have in common that they keep the kinetic term for gravity unchanged (i.e., keeping the standard Einstein-Hilbert action, with a potential generalization to the Lovelock invariants [298]).

Under this assumption, it was shown in [501, 330] that while the linear level theory admits a symmetry in any dimension, at the cubic level the PM symmetry only exists in \(d = 4\) spacetime dimensions, which could make the theory even more attractive. It was also pointed out in [191] that in four dimensions the theory is conformally invariant. Interestingly, the restriction to four dimensions can be lifted in bi-gravity by including the Lovelock invariants [298].

From the analysis in Section 8.3.6 (see Ref. [154]) one can see that the helicity-0 mode entirely disappears from the decoupling limit of ghost-free massive gravity, if one ignores the vectors and sets the parameters of the theory to \(m^2 = 2H^2\), \(\alpha_3 = -1\) and \(\alpha_4 = 1/4\) in four dimensions. The ghost-free theory of massive gravity with these parameters is thus a natural candidate for the PM theory of gravity. Following this analysis, it was also shown that bi-gravity with the same parameters for the interactions between the two metrics satisfies similar properties [301]. Furthermore, it was also shown in [147] that the potential has to follow the same structure as that of ghost-free massive gravity to have a chance of being an acceptable candidate for PM gravity. In bi-gravity the same parameters as for massive gravity were considered as also being the natural candidate [301], in addition of course to other parameters that vanish in the massive gravity limit (to make a fair comparison one needs to take the massive gravity limit of bi-gravity with care, as was shown in [301]).

11.3.2.2 Re-appearance of the Helicity-0 mode:

Unfortunately, when analyzing the interactions with the vector fields, it is clear from the decoupling limit (8.52) that the helicity-0 mode reappears non-linearly through its couplings with the vector fields. These never cancel, not even in four dimensions, and for no choice of parameters of the theory. So rather than being free from the helicity-0 mode, massive gravity with \(m^2 = (d-2)H^2\) has an infinitely strongly coupled helicity-0 mode and is thus a sick theory. The absence of the helicity-0 mode is simply an artefact of the linear theory.

As a result we can thus deduce that there is no theory of PM gravity. This result is consistent with many independent studies performed in the literature (see Refs. [185, 147, 181, 194]).

11.3.2.3 Relaxing the assumptions:
  • One assumption behind this result is the form of the kinetic term for the helicity-2 mode, which is kept to be the Einstein-Hilbert term as in GR. A few studies have considered a generalization of that kinetic term to diffeomorphism-breaking ones [231, 310]; however, further analyses [339, 153] have shown that such interactions always lead to ghosts non-perturbatively. See Section 5.6 for further details.

  • Another potential way out is to consider the embedding of PM within bi-gravity or multi-gravity. Since bi-gravity is massive gravity and a decoupled massless spin-2 field in some limit, it is unclear how bi-gravity could evade the results obtained in massive gravity, but this approach has been explored in [301, 298, 299, 184]. A perturbative relation between bi-gravity and conformal gravity was derived at the level of the equations of motion in Ref. [299] (unlike what is claimed in [184]).

  • The other assumptions are locality and Lorentz-invariance. It is well known that Lorentz-breaking theories of massive gravity can excite fewer than five degrees of freedom. This avenue is explored in Section 14.

To summarize, there is to date no known non-linear PM symmetry which could project out the helicity-0 mode of the graviton while keeping the helicity-2 mode massive in a local and Lorentz-invariant way.

12 Massive Gravity Field Theory

12.1 Vainshtein mechanism

As seen earlier, in four dimensions a massive spin-2 field has five degrees of freedom, and there is no special PM case of gravity where the helicity-0 mode is unphysical while the graviton remains massive (or at least there is to date no known such theory). The helicity-0 mode couples to matter already at the linear level and this additional coupling leads to an extra force which is at the origin of the vDVZ discontinuity seen in Section 2.2.3. In this section, we shall see how the non-linearities of the helicity-0 mode are responsible for a Vainshtein mechanism that screens the effect of this field in the vicinity of matter.

Since the Vainshtein mechanism relies strongly on non-linearities, explicit solutions are very hard to find. In most of the cases where the Vainshtein mechanism has been shown to work successfully, one assumes a static and spherically symmetric background source. Already in that case, consistent solutions which extrapolate from a well-behaved asymptotic behavior at infinity to a screened solution close to the source are difficult to obtain numerically [121] and were only recently unveiled [37, 39] in the case of non-linear Fierz-Pauli gravity.

This review on massive gravity cannot do justice to all the ongoing work dedicated to the study of the Vainshtein mechanism (also sometimes called ‘kinetic chameleon’ as it relies on the kinetic interactions for the helicity-0 mode). In what follows, we will give the general idea behind the Vainshtein mechanism starting from the decoupling limit of massive gravity and then show explicit solutions in the decoupling limit for static and spherically symmetric sources. Such an analysis is relevant for observational tests in the solar system as well as for other astrophysical tests (such as binary pulsar timing), which we shall explore in Section 11. For further details, we refer to the review [35] on the Vainshtein mechanism, as well as to the following works [160, 38, 99, 332, 36, 244, 40, 338, 321, 440, 316, 53, 376, 366, 407]. Recently, it was also shown that the Vainshtein mechanism works for bi-gravity, see Ref. [34].

We focus the rest of this section on the case of four space-time dimensions, although many of the results presented in what follows are well understood in arbitrary dimensions.

12.1.1 Effective coupling to matter

As already mentioned, the key ingredient behind the Vainshtein mechanism is the importance of interactions for the helicity-0 mode which we denote as π. From the decoupling limit analysis performed for massive gravity (see (8.52)) and bi-gravity (see (8.78)), we see that in some limit the helicity-0 mode π behaves as a scalar field, which enjoys a special global symmetry

$$\pi \rightarrow \pi + c + {v_\mu}{x^\mu},$$
(10.1)

and yet only carries two derivatives at the level of the equations of motion (which, as we have seen, is another way to see the absence of BD ghost).

These types of interactions are very similar to the Galileon-type of interactions introduced by Nicolis, Rattazzi and Trincherini in Ref. [412] as a generalization of the decoupling limit of DGP. For simplicity we shall focus most of the discussion on the Vainshtein mechanism with Galileons as a special example, and then mention in Section 10.1.3 peculiarities that arise in the special case of massive gravity (see for instance Refs. [58, 57]).

We thus start with a cubic Galileon theory

$$\mathcal{L} = - {1 \over 2}{(\partial \pi)^2} - {1 \over {{\Lambda ^3}}}{(\partial \pi)^2}\square \pi + {1 \over {{M_{{\rm{Pl}}}}}}\pi T,$$
(10.2)

where \(T = T_\mu ^\mu\) is the trace of the stress-energy tensor of external sources, and \(\Lambda\) is the strong coupling scale of the theory. As seen earlier, in the case of massive gravity, \(\Lambda = \Lambda_3 = (m^2 M_{\rm Pl})^{1/3}\). This is actually precisely the way the helicity-0 mode enters in the decoupling limit of DGP [389] as seen in Section 4.2. It is in that very context that the Vainshtein mechanism was first shown to work explicitly [165].

The essence of the Vainshtein mechanism is that close to a source, the Galileon interactions dominate over the linear piece. We make use of this fact by splitting the source into a background contribution \(T_0\) and a perturbation \(\delta T\). The background source \(T_0\) leads to a background profile \(\pi_0\) for the field, and the response to the fluctuation \(\delta T\) on top of this background is given by \(\phi\), so that the total field is expressed as

$$\pi = {\pi _0} + \phi .$$
(10.3)

For a sufficiently large source (or as we shall see below if \(T_0\) represents a static point-like source, then sufficiently close to the source), the non-linearities dominate and symbolically \(\partial^2\pi_0 \gg \Lambda^3\).

We now follow the perturbations in the action (10.2) and notice that the background configuration π0 leads to a modified effective metric for the perturbations,

$${\mathcal{L}^{(2)}} = - {1 \over 2}{Z^{\mu \nu}}({\pi _0}){\partial _\mu}\phi {\partial _\nu}\phi + {1 \over {{M_{{\rm{Pl}}}}}}\phi \delta T,$$
(10.4)

up to second order in perturbations, with the new effective metric

$${Z^{\mu \nu}} = {\eta ^{\mu \nu}} + {2 \over {{\Lambda ^3}}}{X^{(1)\mu \nu}}({\Pi _0}),$$
(10.5)

where the tensor \(X^{(1)}\) is the same as that defined for massive gravity in (8.29) or in (8.34), so symbolically \(Z\) is of the form \(Z \sim 1 + {{{\partial ^2}{\pi _0}} \over {{\Lambda ^3}}}\). One can generalize the initial action (10.2) to an arbitrary set of Galileon interactions

$$\mathcal{L} = \pi \sum\limits_{n = 1}^4 {{{{c_{n + 1}}} \over {{\Lambda ^{3(n - 1)}}}}} {\mathcal{L}_n}[\Pi ],$$
(10.6)

with again \(\Pi_{\mu\nu} = \partial_\mu\partial_\nu\pi\) and where the scalars \(\mathcal{L}_n\) have been defined in (6.10)–(6.13). The effective metric would then be of the form

$${Z^{\mu \nu}}({\pi _0}) = \sum\limits_{n = 1}^4 {{{n(n + 1){c_n}} \over {{\Lambda ^{3(n - 1)}}}}} {X^{(n - 1)\mu \nu}}({\Pi _0}),$$
(10.7)

where all the tensors \(X_{\mu\nu}^{(n)}\) are defined in (8.28)–(8.32). Notice that \(\partial_\mu Z^{\mu\nu} = 0\) identically. For sufficiently large sources, the components of \(Z\) are large, symbolically, \(Z \sim (\partial^2\pi_0/\Lambda^3)^n \gg 1\) for \(n \geq 1\).

Canonically normalizing the fluctuations in (10.4), we have symbolically,

$$\hat \phi = \sqrt Z \phi ,$$
(10.8)

assuming \(Z^{\mu\nu} \sim Z\eta^{\mu\nu}\), which is not generally the case. Nevertheless, this symbolic scaling is sufficient to get the essence of the idea. For a more explicit canonical normalization in specific configurations see Ref. [412]. As nicely explained in that reference, if \(Z^{\mu\nu}\) is conformally flat, one should not only scale the field \(\phi \to \hat \phi\) but also the space-like coordinates \(x \to \tilde x\) so as to obtain a standard canonically normalized field in the new system, \(\int {{{\rm{d}}^{\rm{4}}}\tilde x - {1 \over 2}{{({\partial _{\tilde x}}\hat \phi)}^2}}\). For now we stick to the simple normalization (10.8) as it is sufficient to see the essence of the Vainshtein mechanism. In terms of the canonically normalized field \(\hat \phi\), the perturbed action (10.4) is then

$${\mathcal{L}^{(2)}} = - {1 \over 2}{(\partial \hat \phi)^2} + {1 \over {{M_{{\rm{Pl}}}}\sqrt {Z({\pi _0})}}}\hat \phi \delta T,$$
(10.9)

which means that the coupling of the fluctuations to matter is medium dependent and can arise at a scale very different from the Planck scale. In particular, for a large background configuration, \(\partial^2\pi_0 \gg \Lambda^3\) and \(Z(\pi_0) \gg 1\), so the effective coupling scale to external matter is

$${M_{{\rm{eff}}}} = {M_{{\rm{Pl}}}}\sqrt Z \gg {M_{{\rm{Pl}}}},$$
(10.10)

and the coupling to matter is thus very suppressed. In massive gravity \(\Lambda\) is related to the graviton mass, \(\Lambda \sim m^{2/3}\), and so the effective coupling scale \(M_{\rm eff} \to \infty\) as \(m \to 0\), which shows how the helicity-0 mode characterized by \(\pi\) decouples in the massless limit.

We now first review how the Vainshtein mechanism works more explicitly in a static and spherically symmetric configuration before applying it to other systems. Note that the Vainshtein mechanism relies on irrelevant operators. In a standard EFT this cannot be performed without going beyond the regime of validity of the EFT. In the context of Galileons and other very specific derivative theories, one can reorganize the EFT so that the operators considered can be large and yet remain within the regime of validity of the reorganized EFT. This will be discussed in more depth in what follows.

12.1.2 Static and spherically symmetric configurations in Galileons

12.1.2.1 Suppression of the force

We now consider a point-like source

$${T_0} = - M{\delta ^{(3)}}(r) = - M{{\delta (r)} \over {4\pi {r^2}}},$$
(10.11)

where M is the mass of the source localized at r = 0. Since the source is static and spherically symmetric, we can focus on configurations which respect the same symmetry, π0 = π0(r). The background configuration for the field π0(r) in the case of the cubic Galileon (10.2) satisfies the equation of motion [411]

$${1 \over {{r^2}}}{\partial _r}\left[ {{r^3}\left({{{\pi _0{\prime} (r)} \over r} + {1 \over {{\Lambda ^3}}}{{\left({{{\pi _0{\prime} (r)} \over r}} \right)}^2}} \right)} \right] = {M \over {4\pi {M_{{\rm{Pl}}}}}}{{\delta (r)} \over {{r^2}}},$$
(10.12)

and so integrating both sides of the equation, we obtain an algebraic equation for π0 (r),

$${{\pi _0{\prime} (r)} \over r} + {1 \over {{\Lambda ^3}}}{\left({{{\pi _0{\prime} (r)} \over r}} \right)^2} = {M \over {{M_{{\rm{Pl}}}}}}{1 \over {4\pi {r^3}}}.$$
(10.13)

We can define the Vainshtein or strong coupling radius r* as

$${r_{\ast}} = {1 \over \Lambda}{\left({{M \over {4\pi {M_{{\rm{Pl}}}}}}} \right)^{1/3}},$$
(10.14)

so that at large distances compared to that Vainshtein radius the linear term in (10.12) dominates while the interactions dominate at distances shorter than r*,

$$\begin{array}{*{20}c} {{\rm{for}}\ r \gg {r_{\ast}},\quad \pi _0{\prime} (r)\sim{M \over {4\pi {M_{{\rm{Pl}}}}}}{1 \over {{r^2}}}\quad \quad} \\ {{\rm{for}}\ r \ll {r_{\ast}},\quad \pi _0{\prime} (r)\sim{M \over {4\pi {M_{{\rm{Pl}}}}}}{1 \over {r_{\ast}^{3/2}{r^{1/2}}}}.} \end{array}$$
(10.15)

So, at large distances r ≫ r*, one recovers a Newtonian inverse-square law for the force mediated by π, and that field mediates a force which is just as strong as standard gravity (i.e., as the force mediated by the usual helicity-2 modes of the graviton). On shorter distance scales, i.e., close to the localized source, the force mediated by the new field π is much smaller than the standard gravitational one,

$${{F_{r \ll {r_{\ast}}}^{(\pi)}} \over {{F_{{\rm{Newt}}}}}} \sim {\left({{r \over {{r_{\ast}}}}} \right)^{3/2}} \ll 1\quad {\rm{for}}\quad r \ll {r_ \star}.$$
(10.16)
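
The two regimes in (10.15) and the suppression (10.16) can be checked numerically. The following is a minimal sketch, assuming units in which Λ = 1 and r* = 1 (so that M/(4πMPl) = 1) and selecting the branch of (10.13) that decays at infinity; the function names are illustrative only.

```python
import math

def pi0_prime(r):
    """Background gradient pi_0'(r) solving (10.13), in units Lambda = r_* = 1.

    With y = pi_0'/r the equation reads y + y**2 = r**-3. We keep the
    branch that decays at large r (an assumption of this sketch).
    """
    s = r ** -3
    # Numerically stable form of (-1 + sqrt(1 + 4 s)) / 2.
    y = 2.0 * s / (1.0 + math.sqrt(1.0 + 4.0 * s))
    return r * y

def force_ratio(r):
    """F_pi / F_Newt ~ pi_0'(r) / (1 / r**2) in the same units."""
    return pi0_prime(r) * r ** 2

for r in (1e-4, 1e-2, 1.0, 1e2, 1e4):
    print(f"r/r_* = {r:8.0e}  F_pi/F_Newt ~ {force_ratio(r):.3e}  (r/r_*)^1.5 = {r ** 1.5:.3e}")
```

Far outside r* the ratio tends to one, while deep inside it follows the (r/r*)^{3/2} scaling.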

In the case of the quartic Galileon (which typically arises in massive gravity), the force is even more suppressed and goes as

$${{F_{r \ll {r_{\ast}}}^{({\rm{quartic}}\;\;\pi)}} \over {{F_{{\rm{Newt}}}}}} \sim {\left({{r \over {{r_{\ast}}}}} \right)^2} \ll 1\quad {\rm{for}}\quad r \ll {r_ \star}.$$
(10.17)

For a graviton mass of the order of the Hubble parameter today, i.e., \(\Lambda \sim {(1000\;{\rm{km}})^{- 1}}\), and taking into account the mass of the Sun, the force at the position of the Earth is suppressed by 12 orders of magnitude compared to the standard Newtonian force in the case of the cubic Galileon and by 16 orders of magnitude in the quartic Galileon. This means that the extra force mediated by π is utterly negligible compared to the standard force of gravity and deviations from GR are extremely small.

Considering the Earth-Moon system, the force mediated by π at the surface of the Moon is suppressed by 13 orders of magnitude compared to the Newtonian one in the cubic Galileon. While small, this is still not far off from the possible detectability from the lunar laser ranging space experiment [488], as will be discussed further in what follows. Note that in the quartic Galileon, that force is suppressed instead by 17 orders of magnitude and is there again entirely negligible.
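
These order-of-magnitude figures can be reproduced from (10.14), (10.16) and (10.17). The sketch below is only illustrative: the solar and terrestrial masses, the orbital distances and the use of the reduced Planck mass are assumptions supplied here, not values taken from the text, and O(1) factors are dropped, so agreement is expected only to within about an order of magnitude.

```python
import math

M_PL_KG = 4.341e-9        # reduced Planck mass in kg (assumed convention)
LAMBDA_INV_KM = 1000.0    # 1/Lambda ~ 1000 km, as quoted in the text

def vainshtein_radius_km(mass_kg):
    """r_* = (1/Lambda) (M / (4 pi M_Pl))**(1/3), following (10.14)."""
    return LAMBDA_INV_KM * (mass_kg / (4.0 * math.pi * M_PL_KG)) ** (1.0 / 3.0)

def suppression(mass_kg, r_km, power):
    """(r / r_*)**power, with power = 3/2 (cubic) or 2 (quartic)."""
    return (r_km / vainshtein_radius_km(mass_kg)) ** power

# Assumed standard values: solar mass and 1 AU, Earth mass and Earth-Moon distance.
systems = {
    "Sun-Earth": (1.989e30, 1.496e8),
    "Earth-Moon": (5.972e24, 3.844e5),
}
for name, (mass, dist) in systems.items():
    cubic = suppression(mass, dist, 1.5)
    quartic = suppression(mass, dist, 2.0)
    print(f"{name}: cubic ~ {cubic:.1e}, quartic ~ {quartic:.1e}")
```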

When applying this naive estimate (10.16) to the Hulse-Taylor system for instance, we would infer a suppression of 15 orders of magnitude compared to the standard GR results. As we shall see in what follows this estimate breaks down when the time evolution is not negligible. These points will be discussed in the phenomenology Section 11, but before considering these aspects we review in what follows different aspects of massive gravity from a field theory perspective, emphasizing the regime of validity of the theory as well as the quantum corrections that arise in such a theory and the emergence of superluminal propagation.

12.1.2.2 Perturbations

We now consider perturbations riding on top of this background configuration for the Galileon field, \(\pi = {\pi _0}(r) + \phi ({x^\mu})\). As already derived in Section 10.1.1, the perturbations ϕ see the effective space-dependent metric given in (10.7). Focusing on the cubic Galileon for concreteness, the background solution for π0 is given by (10.13). In that case the effective metric is

$${Z^{\mu \nu}} = {\eta ^{\mu \nu}} + {4 \over {{\Lambda ^3}}}\left(\square{{\pi _0}{\eta ^{\mu \nu}} - {\partial ^\mu}{\partial ^\nu}{\pi _0}} \right)$$
(10.18)
$$\begin{array}{*{20}c} {{Z_{\mu \nu}}\;{\rm{d}}{x^\mu}\;{\rm{d}}{x^\nu} = - \left({1 + {4 \over {{\Lambda ^3}}}\left({{{2\pi _0^{\prime}(r)} \over r} + \pi _0^{\prime \prime}(r)} \right)} \right)\;{\rm{d}}{t^2}\quad \quad \quad \quad \quad \quad \quad \quad} \\ {+ \left({1 + {{8\pi _0^{\prime}(r)} \over {r{\Lambda ^3}}}} \right)\;{\rm{d}}{r^2} + \left({1 + {4 \over {{\Lambda ^3}}}\left({{{\pi _0^{\prime}(r)} \over r} + \pi _0^{\prime \prime}(r)} \right)} \right){r^2}\;{\rm{d}}\Omega _2^2,} \\ \end{array}$$
(10.19)

so that close to the source, for r ≪ r*,

$${Z_{\mu \nu}}\;\;{\rm{d}}{x^\mu}\;\;{\rm{d}}{x^\nu} = 6{\left({{{{r_{\ast}}} \over r}} \right)^{1/2}}\left({- {\rm{d}}{t^2} + {4 \over 3}\;\;{\rm{d}}{r^2} + {1 \over 3}{r^2}\;\;{\rm{d}}\Omega _2^2} \right) + O{({r_{\ast}}/r)^0}.$$
(10.20)

A few comments are in order:

  • First, we recover \(Z \sim \sqrt {{r_*}/r} \gg 1\) for r ≪ r*, which is responsible for the redressing of the strong coupling scale as we shall see in (10.24). On the non-trivial background the new strong coupling scale is \({\Lambda _*} \sim \sqrt Z \Lambda \gg \Lambda\) for r ≪ r*. Similarly, on top of this background the coupling to external matter no longer occurs at the Planck scale but rather at the scale \(\sqrt Z {M_{{\rm{Pl}}}} \sim {10^7}{M_{{\rm{Pl}}}}\).

  • Second, we see that within the regime of validity of the classical calculation, the modes propagating along the radial direction do so with a superluminal phase and group velocity \(c_r^2 = 4/3 > 1\) and the modes propagating in the orthoradial direction do so with a subluminal phase and group velocity \(c_\Omega ^2 = 1/3\). This result occurs in any Galileon and multi-Galileon theory which exhibits the Vainshtein mechanism [412, 129, 246]. The subluminal velocity is not of great concern, not even for Cerenkov radiation since the coupling to other fields is so strongly suppressed, but the superluminal velocity has been the source of many questions [1]. It is definitely one of the biggest issues arising in these kinds of theories; see Section 10.6.
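
The relative coefficients 1 : 4/3 : 1/3 in (10.20), and hence the two propagation speeds, follow from inserting the short-distance profile of (10.15) into (10.19). A minimal sketch of that arithmetic, assuming a pure power law π0′ ∝ r^p (so that π0″ = p π0′/r) and neglecting the constant 1 in each metric component:

```python
from fractions import Fraction

def z_coefficients(p):
    """Leading coefficients of the dt^2, dr^2 and angular terms of (10.19),
    in units of pi_0'/(r Lambda^3), for a power-law profile pi_0' ~ r**p."""
    z_tt = 4 * (2 + p)      # from 2 pi_0'/r + pi_0''
    z_rr = Fraction(8)      # from 8 pi_0'/(r Lambda^3)
    z_ang = 4 * (1 + p)     # from pi_0'/r + pi_0''
    return z_tt, z_rr, z_ang

p = Fraction(-1, 2)         # cubic Galileon for r << r_*, see (10.15)
z_tt, z_rr, z_ang = z_coefficients(p)
c_r_sq = z_rr / z_tt        # radial speed squared
c_omega_sq = z_ang / z_tt   # orthoradial speed squared
print(z_tt, z_rr, z_ang)
print(c_r_sq, c_omega_sq)
```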

Before discussing the biggest concerns of the theory, namely the superluminalities and the low strong-coupling scale, we briefly present some subtleties that arise when considering static and spherically symmetric solutions in massive gravity as opposed to a generic Galileon theory.

12.1.3 Static and spherically symmetric configurations in massive gravity

The Vainshtein mechanism was discussed directly in the context of massive gravity (rather than the larger Galileon family) in Refs. [363, 365, 99, 440] and more recently in [58, 455, 57]. See also Refs. [478, 105, 61, 413, 277, 160, 38, 37, 39] for other spherically symmetric solutions in massive gravity.

While the decoupling limit of massive gravity resembles that of a Galileon, it presents a few particularities which affect the precise realization of the Vainshtein mechanism:

  • First, if the parameters of the ghost-free theory of massive gravity are such that α3 + 4α4 ≠ 0, there is a mixing \({h^{\mu \nu}}X_{\mu \nu}^{(3)}\) between the helicity-0 and -2 modes of the graviton that cannot be removed by a local field redefinition (unless we work on special types of backgrounds). The effects of this coupling were explored in [99, 57] and it was shown that the theory does not exhibit any stable static and spherically symmetric configuration in the presence of a localized point-like matter source. So in order to be phenomenologically viable, the theory of massive gravity needs to be tuned with α3 + 4α4 = 0. Since these parameters do not get renormalized this is a tuning and not a fine-tuning.

  • When α3 + 4α4 = 0 and the previous mixing \({h^{\mu \nu}}X_{\mu \nu}^{(3)}\) is absent, the decoupling limit of massive gravity resembles a specific quartic Galileon, where the coefficient of the cubic Galileon is related to the quartic coefficient (and if one vanishes so does the other one),

    $$\begin{array}{*{20}c} {{\mathcal{L}_{{\rm{Helicity - 0}}}} = - {3 \over 4}{{(\partial \pi)}^2} + {{3\alpha} \over {4\Lambda _3^3}}\mathcal{L}_{({\rm{Gal}})}^{(3)}[\pi ] - {1 \over 4}{{\left({{\alpha \over {\Lambda _3^3}}} \right)}^2}\mathcal{L}_{({\rm{Gal}})}^{(4)}[\pi ]} \\ {+ {1 \over {{M_{{\rm{Pl}}}}}}\left({\pi T + {\alpha \over {\Lambda _3^3}}{\partial _\mu}\pi {\partial _\nu}\pi {T^{\mu \nu}}} \right),\quad \quad} \\ \end{array}$$
    (10.21)

    where we have set α2 = 1 and the Galileon Lagrangians \({\mathcal L}_{{\rm{Gal}}}^{(3,4)}[\pi ]\) are given in (8.44) and (8.45). Note that in this decoupling limit the graviton mass always enters in the combination \(\alpha/\Lambda _3^3\), with \(\alpha = (1 + {3 \over 2}{\alpha _3})\). As a result this decoupling limit can never be used to probe the graviton mass itself directly, but only the combination \(\alpha/\Lambda _3^3\) [57]. Beyond the decoupling limit, however, the theory breaks the degeneracy between α and m.

    Not only is the cubic Galileon always present when the quartic Galileon is there, but one cannot prevent the new coupling to matter \({\partial _\mu}\pi {\partial _\nu}\pi {T^{\mu \nu}}\) which is typically absent in other Galileon theories.

The effect of the coupling \({\partial _\mu}\pi {\partial _\nu}\pi {T^{\mu \nu}}\) was explored in [58]. First it was shown that this coupling contributes to the definition of the kinetic term of π and can lead to a ghost unless α > 0, so this further restricts the allowed region of parameter space for massive gravity. Furthermore, even when α > 0, none of the static spherically symmetric solutions which asymptote to π → 0 at infinity (asymptotically flat solutions) extrapolate to a Vainshtein solution close to the source. Instead the Vainshtein solution near the source extrapolates to a cosmological solution at infinity which is independent of the source,

$${\pi _0}(r) \rightarrow {{3 + \sqrt 3} \over 4}{{\Lambda _3^3} \over \alpha}{r^2}\quad {\rm{for}}\quad r \gg {r_{\ast}}$$
(10.22)
$${\pi _0}(r) \rightarrow {\left({{{\Lambda _3^3} \over \alpha}} \right)^{2/3}}{\left({{M \over {4\pi {M_{{\rm{Pl}}}}}}} \right)^{1/3}}r\quad {\rm{for}}\quad r \ll {r_{\ast}}.$$
(10.23)

If π were a scalar field in its own right such an asymptotic condition would not be acceptable. However, in massive gravity π is the helicity-0 mode of the graviton and its effect always enters through the Stückelberg combination \({\partial _\mu}{\partial _\nu}\pi\), which goes to a constant at infinity. Furthermore, this result is only derived in the decoupling limit, but in the fully fledged theory of massive gravity, the graviton mass kicks in at the distance scale \({m^{- 1}}\) and suppresses any effect at these scales.

Interestingly, when performing the perturbation analysis on this solution, the modes along all directions are subluminal, unlike what was found for the Galileon in (10.20). It is as yet unclear whether this is an accident of this specific solution or something generic in consistent solutions of massive gravity.

12.2 Validity of the EFT

The Vainshtein mechanism presented previously relies crucially on interactions which are important at a low energy scale Λ ≪ MPl. These interactions are operators of dimension larger than four, for instance the cubic Galileon \({(\partial \pi)^2}\square \pi\) is a dimension-7 operator and the quartic Galileon is a dimension-10 operator. The same can be seen directly within massive gravity. In the decoupling limit (8.38), the terms \({h^{\mu \nu}}X_{\mu \nu}^{(2,3)}\) are respectively dimension-7 and dimension-10 operators. These operators are thus irrelevant from a traditional EFT viewpoint and the theory is hence not renormalizable.

This comes as no surprise, since gravity itself is not renormalizable and there is thus no reason to expect massive gravity nor its decoupling limit to be renormalizable. However, for the Vainshtein mechanism to be successful in massive gravity, we are required to work within a regime where these operators dominate over the marginal ones (i.e., over the standard kinetic term \({(\partial \pi)^2}\) in the strongly coupled region where \({\partial ^2}\pi \gg {\Lambda ^3}\)). It is, therefore, natural to wonder whether or not one can ever use the effective field description within the strong coupling region without going outside the regime of validity of the theory.

The answer to this question relies on two essential features:

  1. First, as we shall see in what follows, the Galileon interactions, or the interactions that arise in the decoupling limit of massive gravity and which are essential for the Vainshtein mechanism, do not get renormalized within the decoupling limit (they enjoy a non-renormalization theorem which we review below).

  2. The non-renormalization theorem together with the shift and Galileon symmetry implies that only higher operators of the form \({({\partial ^\ell}\pi)^m}\), with ℓ, m ≥ 2, are generated by quantum corrections. These operators differ from the Galileon operators in that they always generate terms that have more than two derivatives on the field at the level of the equation of motion (or they always have two or more derivatives per field at the level of the action).

This means that there exists a regime of interest for the theory, for which the operators generated by quantum corrections are irrelevant (unimportant compared to the Galileon interactions). Within the strong coupling region, the field itself can take large values, \(\pi \sim \Lambda\), \(\partial \pi \sim {\Lambda ^2}\), \({\partial ^2}\pi \sim {\Lambda ^3}\), and one can still rely on the Galileon interactions and take no other operator into account so long as any further derivative of the field is suppressed, \({\partial ^n}\pi \ll {\Lambda ^{n + 1}}\) for any n ≥ 3.

This is similar to the situation in DBI scalar field models, where the field operator itself and its velocity are considered to be large, \(\pi \sim \Lambda\) and \(\partial \pi \sim {\Lambda ^2}\), but the field acceleration and any higher derivatives are suppressed, \({\partial ^n}\pi \ll {\Lambda ^{n + 1}}\) for n ≥ 2 (see [157]). In other words, the effective field expansion should be reorganized so that operators which do not give equations of motion with more than two derivatives (i.e., Galileon interactions) are considered to be large and ought to be treated as the relevant operators, while all other interactions (which lead to terms in the equations of motion with more than two derivatives) are treated as irrelevant corrections in the effective field theory language.

Finally, as mentioned previously, the Vainshtein mechanism itself changes the canonical scale and thus the scale at which the fluctuations become strongly coupled. On top of a background configuration, interactions do not arise at the scale Λ but rather at the rescaled strong coupling scale \({\Lambda _*} = \sqrt Z \Lambda\), where Z is expressed in (10.7). In the strong coupling region, Z ≫ 1 and so Λ* ≫ Λ. The higher interactions for fluctuations on top of the background configuration are hence much smaller than expected and their quantum corrections are therefore suppressed.

When taking the cubic Galileon and considering the strong coupling effect from a static and spherically symmetric source then

$${\Lambda _{\ast}} \sim \sqrt Z \Lambda \sim \sqrt {{{\pi _0{\prime} (r)} \over {r{\Lambda ^3}}}} \Lambda ,$$
(10.24)

where the profile for the cubic Galileon in the strong coupling region is given in (10.15). If the source is considered to be the Earth, then at the surface of the Earth this gives

$${\Lambda _{\ast}}\sim{\left({{M \over {{M_{{\rm{Pl}}}}}}{1 \over {{{(r\Lambda)}^3}}}} \right)^{1/4}}\Lambda \sim{10^7}\Lambda \sim{\rm{c}}{{\rm{m}}^{- 1}},$$
(10.25)

taking \(\Lambda \sim {(1000\;{\rm{km}})^{- 1}}\), which would be the scale Λ3 in massive gravity for a graviton mass of the order of the Hubble parameter today. In the quartic Galileon this enhancement in the strong coupling scale does not work as well in the purely static and spherically symmetric case [88]; however, considering a more realistic scenario and taking the smallest breaking of the spherical symmetry into account (for instance the Earth dipole) leads to a comparable result of a few cm [57]. Notice that this is the redressed strong coupling scale when taking into consideration only the effect of the Earth. When getting to these smaller distance scales, all the other matter sources surrounding whichever experiment or scattering process is considered need to be accounted for, and this pushes the redressed strong coupling scale even higher [57].
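
As a rough numerical check of this estimate, one can combine (10.24) with the short-distance profile in (10.15), which gives Λ*/Λ ∼ (M/(MPl (rΛ)³))^{1/4} up to O(1) factors. The sketch below assumes the reduced Planck mass and standard values for the Earth's mass and radius, none of which are quoted in the text.

```python
M_PL_KG = 4.341e-9        # reduced Planck mass in kg (assumed convention)
M_EARTH_KG = 5.972e24     # assumed Earth mass
R_EARTH_KM = 6371.0       # assumed Earth radius
LAMBDA_INV_KM = 1000.0    # 1/Lambda ~ 1000 km, as quoted in the text

def redressed_scale_enhancement(mass_kg, r_km):
    """Lambda_*/Lambda ~ sqrt(Z) ~ (M / (M_Pl (r Lambda)^3))**(1/4).

    O(1) factors (such as 4 pi) are dropped in this estimate.
    """
    r_lambda = r_km / LAMBDA_INV_KM
    return (mass_kg / (M_PL_KG * r_lambda ** 3)) ** 0.25

enhancement = redressed_scale_enhancement(M_EARTH_KG, R_EARTH_KM)
length_cm = LAMBDA_INV_KM * 1e5 / enhancement   # 1 km = 1e5 cm
print(f"Lambda_*/Lambda ~ {enhancement:.1e}, 1/Lambda_* ~ {length_cm:.1f} cm")
```

With these inputs the enhancement comes out at a few times 10^7, i.e., a redressed strong coupling length of order a centimeter.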

12.3 Non-renormalization

The non-renormalization theorem mentioned above states that within a Galileon theory the Galileon operators themselves do not get renormalized. This was originally understood within the context of the cubic Galileon in the procedure established in [411] and is easily generalizable to all the Galileons [412]. In what follows, we review the essence of the non-renormalization theorem within the context of massive gravity as derived in [140].

Let us start with the decoupling limit of massive gravity (8.38) in the absence of vector modes (the Vainshtein mechanism presented previously does not rely on these modes and it is thus consistent for the purpose of this discussion to ignore them). This decoupling limit is a very special scalar-tensor theory on flat spacetime

$${\mathcal{L}_{{\Lambda _3}}} = - {1 \over 4}{h^{\mu \nu}}\hat{\mathcal{E}}_{\mu \nu}^{\alpha \beta}{h_{\alpha \beta}} - {1 \over 4}{h^{\mu \nu}}\sum\limits_{n = 1}^3 {{{{c_n}} \over {\Lambda _3^{3(n - 1)}}}} X_{\mu \nu}^{(n)},$$
(10.26)

where the coefficients cn are given in (8.47) and the tensors \(X_{\mu \nu}^{(n)}\) are given in (8.29)–(8.31) or (8.33)–(8.36). The theory described by (10.26) (including the two interactions \(h{X^{(2,3)}}\)) enjoys two kinds of symmetries: a gauge symmetry for \({h_{\mu \nu}}\) (linearized diffeomorphism), \({h_{\mu \nu}} \to {h_{\mu \nu}} + {\partial _{(\mu}}{\xi _{\nu)}}\), and a global shift and Galilean symmetry for π, \(\pi \to \pi + c + {v_\mu}{x^\mu}\). Notice that unlike in a pure Galileon theory, here the global symmetry for π is an exact symmetry of the Lagrangian (not a symmetry up to boundary terms). This means that the quantum corrections generated by this theory ought to preserve the same kinds of symmetries.

The non-renormalization theorem follows simply from the antisymmetric structure of the interactions (8.30) and (8.31). Let us consider the contributions of the vertices

$${V_2} = {h^{\mu \nu}}X_{\mu \nu}^{(2)} = {h^{\mu \nu}}{\varepsilon ^{\mu \alpha \beta \gamma}}{\varepsilon ^{\nu {\alpha {\prime}}{\beta {\prime}}}}_\gamma {\partial _\alpha}{\partial _{{\alpha {\prime}}}}\pi {\partial _\beta}{\partial _{{\beta {\prime}}}}\pi$$
(10.27)
$${V_3} = {h^{\mu \nu}}X_{\mu \nu}^{(3)} = {h^{\mu \nu}}{\varepsilon ^{\mu \alpha \beta \gamma}}{\varepsilon ^{\nu {\alpha {\prime}}{\beta {\prime}}{\gamma {\prime}}}}{\partial _\alpha}{\partial _{{\alpha {\prime}}}}\pi {\partial _\beta}{\partial _{{\beta {\prime}}}}\pi {\partial _\gamma}{\partial _{{\gamma {\prime}}}}\pi$$
(10.28)

to an arbitrary diagram. If all the external legs of this diagram are π fields then it follows immediately that the contribution of the process goes as \({({\partial ^2}\pi)^n}\) or with more derivatives and is thus not an operator which was originally present in (10.26). So let us consider the case where a vertex (say V3) contributes to the diagram with a spin-2 external leg of momentum pμ. The contribution from that vertex to the whole diagram is given by

$$\begin{array}{*{20}c} {i{\mathcal{M}_{{V_3}}} \propto i\int {{{{{\rm{d}}^4}k} \over {{{(2\pi)}^4}}}} {{{{\rm{d}}^4}q} \over {{{(2\pi)}^4}}}{\mathcal{G}_k}{\mathcal{G}_q}{\mathcal{G}_{p - k - q}}\quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad} \\ \times \left[ {{\epsilon ^{{\ast}\mu \nu}}{\varepsilon _\mu}^{\alpha \beta \gamma}{\varepsilon _\nu}^{{\alpha {\prime}}{\beta {\prime}}{\gamma {\prime}}}{k_\alpha}{k_{{\alpha {\prime}}}}{q_\beta}{q_{{\beta {\prime}}}}{{(p - k - q)}_\gamma}{{(p - k - q)}_{{\gamma {\prime}}}}} \right]\quad \\ \propto i{\epsilon ^{{\ast}\mu \nu}}{\varepsilon _\mu}^{\alpha \beta \gamma}{\varepsilon _\nu}^{{\alpha {\prime}}{\beta {\prime}}{\gamma {\prime}}}{p_\gamma}{p_{{\gamma {\prime}}}}\int {{{{{\rm{d}}^4}k} \over {{{(2\pi)}^4}}}} {{{{\rm{d}}^4}q} \over {{{(2\pi)}^4}}}{\mathcal{G}_k}{\mathcal{G}_q}{\mathcal{G}_{p - k - q}}{k_\alpha}{k_{{\alpha {\prime}}}}{q_\beta}{q_{{\beta {\prime}}}}, \\ \end{array}$$
(10.29)

where \({\epsilon ^{{\ast}\mu \nu}}\) is the polarization of the spin-2 external leg and \({{\mathcal G}_k}\) is the Feynman propagator for the π-particle, \({{\mathcal G}_k} = i{({k^2} - i\varepsilon)^{- 1}}\). This contribution is quadratic in the momentum of the external spin-2 field, \({p_\gamma}{p_{\gamma {\prime}}}\), which means that in position space it has to involve at least two derivatives on \({h_{\mu \nu}}\) (there could be more derivatives arising from the integral over the propagator \({{\mathcal G}_{p - k - q}}\) inside the loops). The same result holds when inserting a V2 vertex as explained in [140]. As a result any diagram in this theory can only generate terms of the form \(({\partial ^2}h){({\partial ^2}\pi)^m}\), or terms with even more derivatives. The operators presented in (10.26), or in the decoupling limit of massive gravity, are therefore not renormalized. This means that within the decoupling limit the scale Λ does not get renormalized, and it can be set to an arbitrarily small value (compared to the Planck scale) without running issues. The same holds for the other parameters c2 and c3.
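
The step from the second to the third line of (10.29) rests only on the antisymmetry of the Levi-Civita symbol: any k or q appearing in the third slot is contracted against another k or q and drops out, leaving only the external momentum p. The following sketch checks this identity with integer test vectors; it ignores the metric signature and the loop integration, neither of which affects the antisymmetry argument, and the vectors are arbitrary illustrative choices.

```python
from itertools import permutations

def levi_civita_4d():
    """Map each permutation of (0, 1, 2, 3) to its sign."""
    eps = {}
    for perm in permutations(range(4)):
        inversions = sum(
            1 for i in range(4) for j in range(i + 1, 4) if perm[i] > perm[j]
        )
        eps[perm] = -1 if inversions % 2 else 1
    return eps

EPS = levi_civita_4d()

def contract(a, b, c):
    """E_mu = eps_{mu alpha beta gamma} a^alpha b^beta c^gamma."""
    out = [0, 0, 0, 0]
    for (mu, al, be, ga), sign in EPS.items():
        out[mu] += sign * a[al] * b[be] * c[ga]
    return out

# Arbitrary integer test momenta (hypothetical values).
k = [3, -1, 4, 1]
q = [-2, 7, 1, 8]
p = [5, 9, -2, 6]
p_minus_k_minus_q = [p[i] - k[i] - q[i] for i in range(4)]

full = contract(k, q, p_minus_k_minus_q)   # structure in the second line of (10.29)
only_p = contract(k, q, p)                 # structure in the third line of (10.29)
print(full == only_p)
```

Since the bracket in (10.29) is the product of two such contractions, one for each ε, the equality of `full` and `only_p` implies that the whole structure is proportional to two powers of p.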

When working beyond the decoupling limit, we expect operators of the form \({h^2}{({\partial ^2}\pi)^n}\) to spoil this non-renormalization theorem. However, these operators are MPl suppressed, and so they lead to quantum corrections which are themselves MPl suppressed. This means that the quantum corrections to the graviton mass are suppressed as well [140]

$$\delta {m^2} \lesssim {m^2}{\left({{m \over {{M_{{\rm{Pl}}}}}}} \right)^{2/3}}.$$
(10.30)

This result is crucial for the theory. It implies that a small graviton mass is technically natural.
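
To give a sense of the size of the bound (10.30), the sketch below evaluates (m/MPl)^{2/3} for a graviton mass of the order of the Hubble parameter today. The numerical values of H0 and of the reduced Planck mass in eV are assumptions supplied here for illustration.

```python
H0_EV = 1.4e-33       # Hubble parameter today in eV (assumed value)
M_PL_EV = 2.435e27    # reduced Planck mass in eV (assumed convention)

def relative_mass_correction(m_ev):
    """Upper bound on delta m^2 / m^2 from (10.30): (m / M_Pl)**(2/3)."""
    return (m_ev / M_PL_EV) ** (2.0 / 3.0)

bound = relative_mass_correction(H0_EV)
print(f"delta m^2 / m^2 <~ {bound:.1e}")
```

The relative correction is of order 10^-40, which is the quantitative content of the statement that a small graviton mass is technically natural.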

12.4 Quantum corrections beyond the decoupling limit

As already emphasized, the consistency of massive gravity relies crucially on a very specific set of allowed interactions summarized in Section 6. Unlike for GR, these interactions are not protected by any (known) symmetry and we thus expect quantum corrections to destabilize this structure. Depending on the scale at which these quantum corrections kick in, this could lead to a ghost at an unacceptably low scale.

Furthermore, as discussed previously, the mass of the graviton itself is subject to quantum corrections, and for the theory to be viable the graviton mass ought to be tuned to extremely small values. This tuning would be technically unnatural if the graviton mass received large quantum corrections.

We first summarize the results found so far in the literature before providing further details:

  1. Destabilization of the potential:

    At one-loop, matter fields do not destabilize the structure of the potential. Graviton loops on the other hand do lead to new operators which do not belong to the ghost-free family of interactions presented in (6.9)–(6.13); however, they are irrelevant below the Planck scale.

  2. Technically natural graviton mass:

    As already seen in (10.30), the quantum corrections for the graviton mass are suppressed by the graviton mass itself, \(\delta {m^2} \lesssim {m^2}{(m/{M_{{\rm{Pl}}}})^{2/3}}\). This result is confirmed at one-loop beyond the decoupling limit and as a result a small graviton mass is technically natural.

12.4.1 Matter loops

The essence of these arguments goes as follows: Consider a ‘covariant’ coupling to matter, \({{\mathcal L}_{{\rm{matter}}}}({g_{\mu \nu}},{\psi _i})\), for any species ψi be it a scalar, a vector, or a fermion (in which case the coupling has to be performed in the vielbein formulation of gravity, see (5.6)).

At one loop, virtual matter fields do not mix with the virtual graviton. As a result, matter loops are ‘unaware’ of the graviton mass, and only lead to quantum corrections which are already present in GR and respect diffeomorphism invariance. So the only potential term (i.e., operator with no derivatives on the metric fluctuation) they can lead to is the cosmological constant.

This result was confirmed at the level of the one-loop effective action in [146], where it was shown that a field of mass M leads to a running of the cosmological constant \(\delta {\Lambda _{{\rm{CC}}}} \sim {M^4}\). This result is of course well-known and is at the origin of the old cosmological constant problem [484]. The key element in the context of massive gravity is that this cosmological constant does not lead to any ghost and no new operators are generated from matter loops at the one-loop level (and this independently of the regularization scheme used, be it dimensional regularization, cutoff regularization, or other). At higher loops we expect virtual matter fields and gravitons to mix, and the effect on the structure of the potential still remains to be explored.

12.4.2 Graviton loops

When considering virtual gravitons running in the loops, the theory does receive quantum corrections which do not respect the ghost-free structure of the potential. These are of course suppressed by the Planck scale and the graviton mass and so in dimensional regularization, we generate new operators of the form

$$\mathcal{L}_{{\rm{QC}}}^{({\rm{potential}})}\sim{{{m^4}} \over {M_{{\rm{Pl}}}^n}}\;{h^n},$$
(10.31)

with n ≥ 2, and where m is the graviton mass, and the contractions of h do not obey the structure presented in (6.9)–(6.13). In a normal effective field theory this is not an issue as such operators are clearly irrelevant below the Planck scale. However, for massive gravity, the situation is more subtle.

As seen in Section 10.1 (see also Section 10.2), massive gravity is phenomenologically viable only if it has an active Vainshtein mechanism which screens the effect of the helicity-0 mode in the vicinity of dense environments. This Vainshtein mechanism relies on having a large background for the helicity-0 mode, π = π0 + δπ with \({\partial ^2}{\pi _0} \gg \Lambda _3^3 = {m^2}{M_{{\rm{Pl}}}}\), which in unitary gauge implies h = h0 + δh, with h0 ≳ MPl.

To mimic this effect, we consider a given background for h = h0 ≳ MPl. Perturbing the new operators (10.31) about this background leads to a contribution at quadratic order for the perturbations δh which does not satisfy the Fierz-Pauli structure,

$$\mathcal{L}_{{\rm{QC}}}^{(2)}\sim{{{m^4}h_0^{n - 2}} \over {M_{{\rm{Pl}}}^n}}\;\delta {h^2}.$$
(10.32)

In terms of the helicity-0 mode π, considering \(\delta h \sim {\partial ^2}\pi/{m^2}\), this leads to higher derivative interactions

$$\mathcal{L}_{{\rm{QC}}}^{(2)}\sim{{h_0^{n - 2}} \over {M_{{\rm{Pl}}}^n}}\;\;{\left({{\partial ^2}\pi} \right)^2},$$
(10.33)

which revive the BD ghost at the scale \(m_{{\rm{ghost}}}^2 \sim h_0^2{({M_{{\rm{Pl}}}}/{h_0})^n}\). The mass of the ghost can be made arbitrarily small (smaller than Λ3) by taking n ≫ 1 and h0 ≳ MPl as is needed for the Vainshtein mechanism. In itself this would be a disaster for the theory, as it means that precisely in the regime where we need the Vainshtein mechanism to work, a ghost appears at an arbitrarily small scale and we can no longer trust the theory.

The resolution to this issue lies within the Vainshtein mechanism itself and its implementation not only at the classical level as was done to estimate the mass of the ghost in (10.33) but also within the calculation of the quantum corrections themselves. To take the Vainshtein mechanism consistently into account one needs to consider the effective action redressed by the interactions themselves (as was performed at the classical level for instance in (10.9)).

This redressing was taken into account at the level of the one-loop effective action in Ref. [146] and it was shown that when resummed, the large background configuration has the effect of further suppressing the quantum corrections so that the mass of the ghost never reaches below the Planck scale even when h0 ≳ MPl. To be more precise, (10.33) is only one term in an infinite order expansion in h0. Resumming these terms leads rather to a contribution of the form (symbolically)

$$\mathcal{L}_{{\rm{QC}}}^{(2)}\sim{1 \over {1 + {{{h_0}} \over {{M_{{\rm{Pl}}}}}}}}{1 \over {M_{{\rm{Pl}}}^2}}\;\;{\left({{\partial ^2}\pi} \right)^2},$$
(10.34)

so that the effective scale at which this operator is relevant is well above the Planck scale when h0 ≫ MPl and is at the Planck scale when working in the weak-field regime h0 ≪ MPl. Notice that h0 ∼ −MPl corresponds to a physical singularity in massive gravity (see [56]), and the theory would break down at that point anyway, irrespective of the ghost.
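
The contrast between the naive estimate based on (10.33) and the resummed result (10.34) can be made concrete with a few numbers. This is a purely symbolic sketch in units MPl = 1, using the scalings quoted in the text and ignoring all O(1) factors; the function names are illustrative.

```python
def naive_ghost_mass_sq(h0, n):
    """m_ghost^2 ~ h0^2 (M_Pl / h0)^n read off from (10.33), units M_Pl = 1."""
    return h0 ** 2 * (1.0 / h0) ** n

def resummed_scale_sq(h0):
    """Scale^2 suppressing (d^2 pi)^2 in (10.34): M_Pl^2 (1 + h0 / M_Pl)."""
    return 1.0 + h0

h0 = 10.0  # background value in Planck units (illustrative)
for n in (2, 4, 8, 12):
    print(f"n = {n:2d}: naive m_ghost^2 ~ {naive_ghost_mass_sq(h0, n):.1e}")
print(f"resummed scale^2 ~ {resummed_scale_sq(h0):.1f}")
```

Term by term the naive scale drops rapidly with n once h0 exceeds MPl, whereas the resummed scale stays at or above the Planck scale for any non-negative h0.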

As a result, at the one-loop level the quantum corrections destabilize the structure of the potential but in a way which is irrelevant below the Planck scale.

12.5 Strong coupling scale vs cutoff

Whether it is to compute the Vainshtein mechanism or quantum corrections to massive gravity, it is crucial to realize that the scale \({\Lambda _3} = {({m^2}{M_{{\rm{Pl}}}})^{1/3}}\) (denoted as Λ in what follows) is not necessarily the cutoff of the theory.

The cutoff of a theory corresponds to the scale at which the given theory breaks down and new physics is required to describe nature. For GR the cutoff is the Planck scale. For massive gravity the cutoff could potentially be below the Planck scale, but is likely well above the scale Λ, and the redressed scale Λ* computed in (10.24). Instead Λ (or Λ* on some backgrounds) is the strong-coupling scale of the theory.

When hitting the scale Λ or Λ* perturbativity breaks down (in the standard field representation of the theory), which means that in that representation loops ought to be taken into account to derive the correct physical results at these scales. However, it does not necessarily mean that new physics should be taken into account. The fact that tree-level calculations do not account for the full results in no way implies that the theory itself breaks down at these scales, only that perturbation theory breaks down.

Massive gravity is of course not the only theory whose strong coupling scale departs from its cutoff. See, for instance, Ref. [31] for other examples in chiral theory, or in gravity coupled to many species. To get more intuition on these types of theories and on the distinction between strong coupling scale and cutoff, consider a large number N ≫ 1 of scalar fields coupled to gravity. In that case the effective strong coupling scale seen by these scalars is \({M_{{\rm{eff}}}} = {M_{{\rm{Pl}}}}/\sqrt N \ll {M_{{\rm{Pl}}}}\), while the cutoff of the theory is still MPl (the scale at which new physics enters in GR is independent of the number of species living in GR).

The philosophy behind [31] is precisely analogous to the distinction between the strong coupling scale and the cutoff (onset of new physics) that arises in massive gravity. Since summarizing the results of [31] would not do justice to their work, we instead quote the abstract and encourage the reader to refer to that article for further details:

“In effective field theories it is common to identify the onset of new physics with the violation of tree-level unitarity. However, we show that this is parametrically incorrect in the case of chiral perturbation theory, and is probably theoretically incorrect in general. In the chiral theory, we explore perturbative unitarity violation as a function of the number of colors and the number of flavors, holding the scale of the “new physics” (i.e., QCD) fixed. This demonstrates that the onset of new physics is parametrically uncorrelated with tree-unitarity violation. When the latter scale is lower than that of new physics, the effective theory must heal its unitarity violation itself, which is expected because the field theory satisfies the requirements of unitarity. (…) A similar example can be seen in the case of general relativity coupled to multiple matter fields, where iteration of the vacuum polarization diagram restores unitarity. We present arguments that suggest the correct identification should be connected to the onset of inelasticity rather than unitarity violation.” [31].

12.6 Superluminalities and (a)causality

Besides the presence of a low strong coupling scale in massive gravity (which is a requirement for the Vainshtein mechanism, and is thus not a feature that one should necessarily try to avoid), another point of concern is the possibility of superluminal propagation. This statement requires a qualification, and to avoid any confusion we shall first review the distinction between phase velocity, group velocity, signal velocity and front velocity and their different implications. We follow the same description as in [399] and [77] and refer to these books and references therein for further details.

  1. Phase Velocity: For a wave of constant frequency, the phase velocity is the speed at which the peaks of the oscillations propagate. For a wave [77]

    $$f(t,x) = A\sin (\omega t - kx) = A\sin \left({\omega \left({t - {x \over {{v_{{\rm{phase}}}}}}} \right)} \right),$$
    (10.35)

    the phase velocity vphase is given by

    $${v_{{\rm{phase}}}} = {\omega \over k}.$$
    (10.36)
  2. Group Velocity: If the amplitude of the signal varies, then the group velocity represents the speed at which the modulation or envelope of the signal propagates. In a medium where the phase velocity is constant and does not depend on frequency, the phase and the group velocity are the same. More generally, in a medium with dispersion relation ω(k), the group velocity is

    $${v_{{\rm{group}}}} = {{\partial \omega (k)} \over {\partial k}}.$$
    (10.37)

    We are familiar with the notion that the phase velocity can be larger than the speed of light c (in this review we use units where c = 1). Similarly, it has been known for almost a century now that

    “(…) the group velocity could exceed c in a spectral region of an anomalous dispersion” [399].

    While a source of concern at first, this is now well understood not to be in any conflict with the theory of general (or special) relativity and not to be the source of any acausality. The resolution lies in the fact that the group velocity does not represent the speed at which new information is transmitted. That speed is instead referred to as the front velocity, as we shall see below.

  3. Signal Velocity: The signal velocity “yields the arrival of the main signal, with intensities of the order of magnitude of the input signal” [77]. Nowadays it is common to define the signal velocity as the velocity of the part of the pulse which has reached at least half the maximum intensity. However, as mentioned in [399], this notion of speed is rather arbitrary and some known physical systems can exhibit a signal velocity larger than c.

  4. Front Velocity: Physically, the front velocity represents the speed of the front of a disturbance, or in other words “Front velocity (…) correspond[s] to the speed at which the very first, extremely small (perhaps invisible) vibrations will occur.” [77]. The front velocity is thus the speed at which the very first piece of information of the first “forerunner” propagates once a front or a “sudden discontinuous turn-on of a field” is turned on [399].

“The front is defined as a surface beyond which, at a given instant in time the medium is completely at rest” [77],

$$f(t,x) = \theta (t)\sin (\omega t - kx),$$
(10.38)

where θ (t) is the Heaviside step function.

In practice, the front velocity is the large-k (high-frequency) limit of the phase velocity.
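As a rough numerical illustration of the definitions above, the following sketch evaluates the phase and group velocities for the dispersion relation ω(k) = sqrt(k² + m²). This dispersion relation, the value m = 1 and the function names are assumptions chosen purely for illustration; they are not taken from [77] or [399]. In this example the phase velocity exceeds c = 1 at every finite k, the group velocity stays below 1, and the phase velocity approaches 1 at large k, which is the limit relevant for the front velocity.

```python
import math


def omega(k, m=1.0):
    """Assumed dispersion relation omega(k) = sqrt(k^2 + m^2), in units c = 1."""
    return math.sqrt(k * k + m * m)


def phase_velocity(k, m=1.0):
    """v_phase = omega / k, as in (10.36)."""
    return omega(k, m) / k


def group_velocity(k, m=1.0, h=1e-6):
    """v_group = d omega / d k, as in (10.37), via a central finite difference."""
    return (omega(k + h, m) - omega(k - h, m)) / (2.0 * h)


if __name__ == "__main__":
    for k in (0.5, 1.0, 10.0, 1.0e3):
        print(k, phase_velocity(k), group_velocity(k))
```

For this particular dispersion relation the product of the two velocities equals 1, so a superluminal phase velocity goes hand in hand with a subluminal group velocity.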

The distinction between these four types of velocities is presented in Figure 5. It is important to keep this distinction in mind, especially when it comes to superluminal propagation. Superluminal phase, group and signal velocities have been observed and measured experimentally in different physical systems and yet cause no contradiction with special relativity, nor do they signal acausalities. See Ref. [318] for an enlightening discussion of the case of QED in curved spacetime.

Figure 5: Difference between phase, group, signal and front velocities. At t = δt, the phase and group velocities are represented on the left and given respectively by \(v_{\rm phase} = \delta x_P/\delta t\) and \(v_{\rm group} = \delta x_G/\delta t\) (in the limit δt → 0). The signal and front velocities are represented on the right: \(v_{\rm signal} = \delta x_S/\delta t\) (where \(\delta x_S\) is the point where at least half the intensity of the original signal is reached) and \(v_{\rm front} = \delta x_F/\delta t\).

The front velocity, on the other hand, is the real ‘measure’ of the speed of propagation of new information, and the front velocity is always (and should always be) (sub)luminal. As shown in [445], “the ‘speed of light’ relevant for causality is vph(∞), i.e., the high-frequency limit of the phase velocity. Determining this requires a knowledge of the UV completion of the quantum field theory.” In other words, there is no sense in computing a classical version of the front velocity since quantum corrections always dominate.

When it comes to the presence of superluminalities in massive gravity and theories of Galileons this distinction is crucial. We first summarize the current state of the situation in the context of both Galileons and massive gravity and then give further details and examples in what follows:

  • In Galileon theories the presence of superluminal group velocity has been established for all the parameters which exhibit an active Vainshtein mechanism. These are present in spherically symmetric configurations near massive sources as well as in self-sourced plane waves and other configurations for which no special kind of matter is required.

  • Since massive gravity reduces to a specific Galileon theory in some limit, we expect the same result to be true there as well and to yield solutions with superluminal group velocity. However, to date no fully consistent solution has been found in massive gravity which exhibits superluminal group velocity (let alone superluminal front velocity, which would be the real signal of acausality). Only local configurations have been found with superluminal group velocity or finite-frequency phase velocity, but it has not been proven that these are stable global solutions. Actually, in all the cases where this has been checked explicitly so far, these local configurations have been shown not to be part of global stable solutions.

It is also worth noting that the potential existence of superluminal propagation is not restricted to theories which break the gauge symmetry. For instance, massless spin-3/2 fields are also known to propagate superluminal modes on some non-trivial backgrounds [306].

12.6.1 Superluminalities in Galileons

Superluminalities in Galileon and other closely related theories have been pointed out in several studies for a while now [412, 1, 262, 220, 115, 129, 246]. Note also that Ref. [313] was the first work to point out the existence of superluminal propagation in the higher-dimensional picture of DGP rather than in its purely four-dimensional decoupling limit. See also Refs. [112, 110, 311, 312, 218, 219] for related discussions on super- versus sub-luminal propagation in conformal Galileon and other DBI-related models. The physical interpretation of these superluminal propagations was studied in other non-Galileon models in [199, 43]; see [206, 469] for their potential connection with classicalization [214, 213, 205, 11].

In all the examples found so far, what has been pointed out is the existence of a superluminal group velocity, which in the regime inspected is the same as the phase velocity. As we will see below (see Section 10.7), in the one example where we can compute the phase velocity for momenta at which loops ought to be taken into account, we find (thanks to a dual description) that the corresponding front velocity is exactly luminal even though the low-energy group velocity is superluminal. This is no indication that all Galileon theories are causal, but it goes to show how a specific Galileon theory which exhibits superluminal group velocity in some regime is dual to a causal theory.

In most of the cases considered, superluminal propagation was identified in a spherically symmetric setting in the vicinity of a localized mass, as was presented in Section 10.1.2. To convince the reader that these superluminalities are independent of the coupling to matter, we show here how superluminal propagation can already occur in the vacuum in any Galileon theory without the need for any external matter.

Consider an arbitrary quintic Galileon

$$\mathcal{L} = \pi \sum\limits_{n = 1}^4 {{{{c_{n + 1}}} \over {{\Lambda ^{3(n - 1)}}}}} {\mathcal{L}_n}(\Pi),$$
(10.39)

where the \(\mathcal{L}_n\) are given in (6.10)-(6.13) and we choose the canonical normalization \(c_2 = 1/12\). One can check that any plane-wave configuration of the form

$${\pi _0}({x^\mu}) = F({x^1} - t),$$
(10.40)

is a solution of the vacuum equations of motion for any arbitrary function F,

$$\sum\limits_{n = 1}^4 {{{(n + 1){c_{n + 1}}} \over {{\Lambda ^{3(n - 1)}}}}} {\mathcal{L}_n}({\Pi _0}) = 0,$$
(10.41)

with \(\Pi_{0\,\mu\nu} = \partial_\mu \partial_\nu \pi_0\), since \(\mathcal{L}_n(\partial_\mu \partial_\nu \pi_0) = 0\) for any n ≥ 1 for a plane wave of the form (10.40).

Now, considering perturbations riding on top of the plane-wave, π (xμ) = π0(t, x1) + δπ (xμ), these perturbations see an effective background-dependent metric similarly as in Section 10.1.1 and have the linearized equation of motion

$${Z^{\mu \nu}}({\pi _0}){\partial _\mu}{\partial _\nu}\delta \pi = 0,$$
(10.42)

with \(Z^{\mu\nu}\) given in (10.7)

$${Z^{\mu \nu}}({\pi _0}) = \sum\limits_{n = 0}^3 {{{(n + 1)(n + 2){c_{n + 2}}} \over {{\Lambda ^{3n}}}}} {X^{(n)\mu \nu}}({\Pi _0})$$
(10.43)
$$= \left[ {{\eta ^{\mu \nu}} - {{12{c_3}} \over {{\Lambda ^3}}}{F{\prime}}{\prime} ({x^1} - t)(\delta _0^\mu + \delta _1^\mu)(\delta _0^\nu + \delta _1^\nu)} \right].$$
(10.44)

A perturbation traveling along the direction \(x^1\) has a velocity \(v\) which satisfies

$${Z^{00}}{v^2} + 2{Z^{01}}v + {Z^{11}} = 0.$$
(10.45)

So, depending on whether the perturbation travels with or against the flow of the plane wave, it will have a velocity \(v\) given by

$$v = - 1\quad {\rm{or}}\quad v = {{1 - {{12{c_3}} \over {{\Lambda ^3}}}{F{\prime}}{\prime} ({x^1} - t)} \over {1 + {{12{c_3}} \over {{\Lambda ^3}}}{F{\prime}}{\prime} ({x^1} - t)}}.$$
(10.46)

So, for a plane wave which satisfies (see Footnote 26)

$$12{c_3}{F{\prime}}{\prime} < - {\Lambda ^3},$$
(10.47)

the perturbation propagates with a superluminal velocity. However, this velocity corresponds to the group velocity, and in order to infer whether or not there is any acausality we need to derive the front velocity, which is the large-momentum limit of the phase velocity. The derivation presented here is a tree-level calculation, and to compute the large-momentum limit one would need to include loop corrections. This is especially important as \(12 c_3 F'' \to -\Lambda^3\), since the theory becomes (infinitely) strongly coupled at that point [87]. So far, no computation has properly taken these quantum effects into account, and the (a)causality of Galileon theories is yet to be determined.
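The step from (10.44) and (10.45) to (10.46) can be checked numerically with the short sketch below. It assumes the mostly-plus signature η = diag(−1, 1, 1, 1) and introduces the shorthand a = 12 c₃ F″/Λ³; the function names are hypothetical and only serve this illustration. For a = −2, which satisfies (10.47), the two roots are v = −1 and v = (1 − a)/(1 + a) = −3, so one mode has speed |v| = 3.

```python
import math


def z_components(a):
    """Components of Z^{mu nu} in (10.44), with a = 12 c3 F'' / Lambda^3.

    Assumes eta^{mu nu} = diag(-1, 1, 1, 1) and n^mu = (1, 1, 0, 0).
    """
    z00 = -1.0 - a
    z01 = -a
    z11 = 1.0 - a
    return z00, z01, z11


def velocities(a):
    """Solve Z^{00} v^2 + 2 Z^{01} v + Z^{11} = 0, i.e. (10.45), for v."""
    z00, z01, z11 = z_components(a)
    if z00 == 0.0:
        # a = -1: the time-time component vanishes (strong coupling point).
        raise ValueError("Z^{00} = 0: the quadratic equation degenerates")
    root = math.sqrt(z01 * z01 - z00 * z11)  # the discriminant equals 1 for any a
    return sorted([(-z01 - root) / z00, (-z01 + root) / z00])


def closed_form(a):
    """The non-trivial root quoted in (10.46)."""
    return (1.0 - a) / (1.0 + a)


if __name__ == "__main__":
    for a in (0.0, -0.5, -2.0):
        print(a, velocities(a), closed_form(a))
```

This is only a tree-level check of the algebra; it says nothing about the front velocity, for which loop corrections would be needed.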

12.6.2 Superluminalities in massive gravity

The existence of superluminal propagation directly in massive gravity has been pointed out in many references in the literature [87, 276, 192, 177] (see also [496] for another nice discussion). Unfortunately none of these studies have qualified the type of velocity which exhibits superluminal propagation. On closer inspection it appears that there again for all the cases cited the superluminal propagation has so far always been computed classically without taking into account quantum corrections. These results are thus always valid for the low frequency group velocity but never for the front velocity which requires a fully fledged calculation beyond the tree-level classical approximation [445].

Furthermore, while it is very likely that massive gravity admits superluminal propagation, to date there is no known consistent solution of massive gravity which has been shown to admit a superluminal velocity (even a superluminal group velocity). We review the arguments in favor of superluminal propagation in what follows together with their limitations. Notice as well that while a Galileon theory typically admits superluminal propagation on top of static and spherically symmetric Vainshtein solutions as presented in Section 10.1.2, this is not the case for massive gravity; see Section 10.1.3 and [58].

  1. Argument: Some background solutions of massive gravity admit superluminal propagation.

    Limitation of the argument: the solutions inspected were not physical.

    Ref. [276] was the first work to point out the presence of superluminal group velocity in the full theory of massive gravity rather than in its Galileon decoupling limit. These superluminal modes ride on top of a solution which is unfortunately unrealistic for different reasons. First, the solution itself is unstable. Second, the solution has no rest frame (if seen as a perfect fluid) or one would need to perform a superluminal boost to bring the solution to its rest frame. Finally, to exist, such a solution should be sourced by a matter source with complex eigenvalues [142]. As a result the solution cannot be trusted in the first place, and so neither can the superluminal propagation of fluctuations about it.

  2. Argument: Some background solutions of the decoupling limit of massive gravity admit superluminal propagation.

    Limitation of the argument: the solutions were only found in a finite region of space and time.

    In Ref. [87] superluminal propagation was found in the decoupling limit of massive gravity. These solutions do not require any special kind of matter. However, the background has only been solved locally and it has not (yet) been shown whether or not these solutions could extrapolate to sensible and stable asymptotic solutions.

  3. Argument: There are some exact solutions of massive gravity for which the determinant of the kinetic matrix vanishes, thus massive gravity is acausal.

    Limitation of the argument: misuse of the characteristics analysis — what has really been identified is the absence of BD ghost.

    Ref. [192] presented some solutions which appeared to admit some instantaneous modes in the full theory of massive gravity. Unfortunately the results presented in [192] were due to a misuse of the characteristics analysis.

    The confusion in the characteristics analysis arises from the very constraint that eliminates the BD ghost. The existence of such a constraint was discussed at length in many different formulations in Section 7, and it is precisely what makes ghost-free (or dRGT) massive gravity special and theoretically viable. Due to the presence of this constraint, the characteristics analysis should be performed after solving for the constraints and not before [326].

    In [192] it was pointed out that the determinant of the time kinetic matrix vanished in ghost-free massive gravity before solving for the constraint. This result was then interpreted as the propagation of instantaneous modes and it was further argued that the theory was then acausal. This result is simply an artefact of not properly taking into account the constraint and performing a characteristics analysis on a set of modes which are not all dynamical (since two phase space variables are constrained by the primary and secondary constraints [295, 294]). In other words, it is precisely what would have been the BD ghost which is responsible for canceling the determinant of the time kinetic matrix. This does not mean that the BD ghost propagates instantaneously but rather that the BD ghost is not present in that theory, which is the very point of the theory.

    One can show that the determinant of the time kinetic matrix in general does not vanish when computing it after solving for the constraints. In summary the results presented in [192] cannot be used to deduce the causality of the theory or absence thereof.

  4. Argument: Massive gravity admits shock wave solutions which admit superluminal and instantaneous modes.

    Limitation of the argument: These configurations lie beyond the regime of validity of the classical theory.

    Shock wave local solutions on top of which the fluctuations are superluminal were found in [177]. Furthermore, a characteristic analysis reveals the possibility for spacelike hypersurfaces to be characteristic. While interesting, such configurations lie beyond the regime of validity of the classical theory and quantum corrections ought to be included.

    Having said that, it is likely that the characteristic analysis performed in [177] and then in [178] would give the same results had it been performed on regular solutions (see Footnote 27). This point is discussed below.

  5. Argument: The characteristic analysis shows that some field configurations of massive gravity admit superluminal propagation and the possibility for spacelike hypersurfaces to be characteristic.

    Limitation of the argument: Same as point 2. Putting this limitation aside this result is certainly correct classically and in complete agreement with previous results presented in the literature (see point 2 where local solutions were given).

    Even though the characteristic analysis presented in [177] used shock wave local configurations, it is also valid for smooth wave solutions which would be within the regime of validity of the theory. In [178] the characteristic analysis for a shock wave was presented again and it was argued that CTCs were likely to exist.

    To better see the essence behind the general characteristic analysis argument, let us look at the (simpler yet representative) case of a Proca field with an additional quartic interaction as explored in [420, 467],

    $$\mathcal{L} = - {1 \over 4}F_{\mu \nu}^2 - {1 \over 2}{m^2}{A^\mu}{A_\mu} - {1 \over 4}\lambda {({A^\mu}{A_\mu})^2}.$$
    (10.48)

    The idea behind the characteristic analysis is to “replace the highest derivative terms \(\partial^N A\) by \(k^N \tilde A\)” [420] so that one of the equations of motion is

    $$\left[ {({m^2} + \lambda {A^\nu}{A_\nu}){k^\alpha}{k_\alpha} + 2\lambda {{({A^\nu}{k_\nu})}^2}} \right]{k^\mu}{\tilde A_\mu} = 0.$$
    (10.49)

    When λ ≠ 0, one can solve this equation maintaining \(k^\mu \tilde A_\mu \ne 0\). Then there are certainly field configurations for which the normal to the characteristic surface is timelike, and thus the mode with \(k^\mu \tilde A_\mu \ne 0\) can propagate superluminally in this Proca field theory. However, as we shall see below, this very combination \({\mathcal Z} = [({m^2} + \lambda {A^\nu}{A_\nu}){k^\alpha}{k_\alpha} + 2\lambda {({A^\nu}{k_\nu})^2}]\) with \(k_\mu\) timelike (say \(k_\mu = (1,0,0,0)\)) is the coefficient of the time-like kinetic term of the helicity-0 mode. So one can never have \({\mathcal Z} = 0\) with \(k_\mu = (1, 0, 0, 0)\) (or any timelike direction) without automatically having an infinitely strongly coupled helicity-0 mode and thus automatically going beyond the regime of validity of the theory (see Ref. [87] for more details).

    To see this more precisely, let us perform the characteristic analysis in the Stückelberg language. An analysis performed in unitary gauge is of course perfectly acceptable, but to connect with previous work in Galileons and in massive gravity the Stückelberg formalism is useful.

    In the Stückelberg language, \(A_\mu \to A_\mu + m^{-1}\partial_\mu \pi\), keeping track of the terms quadratic in π, we have

    $$\mathcal{L}_\pi ^{(2)} = - {1 \over 2}{Z^{\mu \nu}}{\partial _\mu}\pi {\partial _\nu}\pi ,$$
    (10.50)

    $${\rm{with}}\qquad {Z^{\mu \nu}}[{A_\mu}] = {\eta ^{\mu \nu}} + {\lambda \over {{m^2}}}{A^2}{\eta ^{\mu \nu}} + 2{\lambda \over {{m^2}}}{A^\mu}{A^\nu}.$$
    (10.51)

    It is now clear that the combination found in the characteristic analysis \({\mathcal Z}\) is nothing other than

    $$\mathcal{Z} \equiv {Z^{\mu \nu}}{k_\mu}{k_\nu},$$
    (10.52)

    where \(Z^{\mu\nu}\) is the kinetic matrix of the helicity-0 mode. Thus, a configuration with \({\mathcal Z} = 0\) with \(k_\mu = (1, 0, 0, 0)\) implies that the \(Z^{00}\) component of the helicity-0 mode kinetic matrix vanishes. This means that the conjugate momentum associated to \(\pi\) cannot be solved for in this time-slicing, or that the helicity-0 mode is infinitely strongly coupled.

    This result should sound familiar as it echoes what has already been shown to happen in the decoupling limit of massive gravity, or here of the Proca field theory (see [43, 468] for related discussions in that case). Considering the decoupling limit of (10.48) with m → 0 and \(\hat \lambda = \lambda/{m^4} \to\) const, we obtain a decoupled massless gauge field and a scalar field,

    $${\mathcal{L}_{{\rm{DL}}}} = - {1 \over 4}F_{\mu \nu}^2 - {1 \over 2}{(\partial \pi)^2} - {{\hat \lambda} \over 4}{(\partial \pi)^4}.$$
    (10.53)

    For fluctuations about a given background configuration π = π0(x) + δπ, the fluctuations see an effective metric \({\tilde Z^{\mu \nu}}({\pi _0})\) given by

    $${\tilde Z^{\mu \nu}}({\pi _0}) = \left({1 + \hat \lambda {{(\partial {\pi _0})}^2}} \right){\eta ^{\mu \nu}} + 2\hat \lambda {\partial ^\mu}{\pi _0}{\partial ^\nu}{\pi _0}.$$
    (10.54)

    Unsurprisingly, we find \({\tilde Z^{\mu \nu}}({\pi _0}) \equiv {m^{- 2}}{Z^{\mu \nu}}[{m^{- 1}}{\partial _\mu}{\pi _0}]\). The fact that we can find superluminal or instantaneous propagation in the characteristic analysis is equivalent to the statement that in the decoupling limit there exist classical field configurations for π0 for which the fluctuations propagate superluminally (or even instantaneously). Thus, the results of the characteristic analysis are in agreement with previous results in the decoupling limit, as was pointed out for instance in [1, 412, 87].

    Once again, if one starts with a field configuration where the kinetic matrix is well defined, one cannot reach a region where one of its eigenvalues crosses zero without going beyond the regime of validity of the theory, as described in [87]. See also Refs. [318, 445] for the use of the characteristic analysis and its relation to (micro-)causality.
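The relation between the characteristic combination in (10.49) and the helicity-0 kinetic matrix (10.51) in the Proca example can be checked numerically. The sketch below assumes the mostly-plus signature η = diag(−1, 1, 1, 1), takes A with upper indices and k with lower indices, and uses hypothetical function names and arbitrary sample values. With the normalizations as printed in (10.49) and (10.51), the two quantities agree up to an overall factor of m², which does not affect where they vanish. The last lines exhibit a purely timelike background with A₀² = m²/(3λ), for which Z⁰⁰ = 0.

```python
import math

# Diagonal of the Minkowski metric; mostly-plus signature is an assumption.
ETA = (-1.0, 1.0, 1.0, 1.0)


def a_squared(a_up):
    """A^mu A_mu for contravariant components a_up."""
    return sum(ETA[i] * a_up[i] ** 2 for i in range(4))


def char_combination(a_up, k_down, m, lam):
    """Bracket of (10.49): (m^2 + lam A^2) k^2 + 2 lam (A.k)^2."""
    k2 = sum(ETA[i] * k_down[i] ** 2 for i in range(4))
    a_dot_k = sum(a_up[i] * k_down[i] for i in range(4))
    return (m * m + lam * a_squared(a_up)) * k2 + 2.0 * lam * a_dot_k ** 2


def kinetic_matrix(a_up, m, lam):
    """Z^{mu nu}[A] of (10.51) as a 4x4 nested list."""
    a2 = a_squared(a_up)
    z = [[0.0] * 4 for _ in range(4)]
    for mu in range(4):
        for nu in range(4):
            eta = ETA[mu] if mu == nu else 0.0
            z[mu][nu] = eta * (1.0 + lam * a2 / m**2) + 2.0 * lam * a_up[mu] * a_up[nu] / m**2
    return z


def contract(z, k_down):
    """Z^{mu nu} k_mu k_nu, as in (10.52)."""
    return sum(z[mu][nu] * k_down[mu] * k_down[nu] for mu in range(4) for nu in range(4))


if __name__ == "__main__":
    m, lam = 1.5, 0.7
    a_up, k_down = (0.3, 0.1, -0.2, 0.05), (1.0, 0.4, 0.0, 0.2)
    print(char_combination(a_up, k_down, m, lam), m**2 * contract(kinetic_matrix(a_up, m, lam), k_down))
    a0 = m / math.sqrt(3.0 * lam)  # timelike background where Z^{00} vanishes
    print(kinetic_matrix((a0, 0.0, 0.0, 0.0), m, lam)[0][0])
```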

The presence of instantaneous modes in some (self-accelerating) solutions of massive gravity was actually pointed out from the very beginning. See Refs. [139] and [364] for an analysis of self-accelerating solutions in the decoupling limit, and [125] for self-accelerating solutions in the full theory (see also [264] for a complementary analysis of self-accelerating solutions). All these analyses had already found instantaneous modes on some self-accelerating branches of massive gravity. However, as pointed out in all these analyses, the real question is to establish whether or not these solutions lie within the regime of validity of the EFT, and whether one could reach such solutions with a finite amount of energy while remaining within the regime of validity of the EFT.

This aspect connects with Hawking’s chronology protection argument, which is already in effect in GR [302, 303] (see also [472] and [473] for a comprehensive review). This argument can be extended to Galileon theories and to massive gravity, as was shown in Ref. [87].

It was pointed out in [87] and in many other preceding works that there exist local backgrounds in Galileon theories and in massive gravity which admit superluminal and instantaneous propagation. (As already mentioned in point 2 above, in massive gravity it is however unclear whether these localized backgrounds admit stable and consistent global realizations.) The worry with superluminal propagation is that it could imply the presence of CTCs (closed timelike curves). However, when ‘cranking up’ the background sufficiently so as to reach a solution which would admit CTCs, the Galileon or the helicity-0 mode of the graviton inevitably becomes infinitely strongly coupled. This means that the effective field theory used breaks down and the background becomes unstable with arbitrarily fast decay time before any CTC can ever be formed.

Summary: Several analyses have confirmed the existence of local configurations admitting superluminalities in massive gravity. At this point, we leave it to the reader’s discretion to decide whether the existence of local classical configurations which admit superluminalities and sometimes even instantaneous propagation means that the theory should be discarded. We bear in mind the following considerations:

  • No stable global solutions have been found with the same properties.

  • No CTCs can be constructed within the regime of validity of the theory. As shown in Ref. [87], CTCs constructed with these configurations always lie beyond the regime of validity of the theory. Indeed, in order to create a CTC, a mode needs to become instantaneous. As soon as a mode becomes instantaneous, the regime of validity of the classical theory is null and classical considerations are thus obsolete.

  • Finally, and most importantly, all the results presented so far for Galileons and massive gravity (including the ones summarized here), rely on classical configurations. As was explained at the beginning of this section causality is determined by the front velocity for which classical considerations break down. Therefore, no classical calculations can ever prove or disprove the (a)causality of a theory.

12.6.3 Superluminalities vs Boulware-Deser ghost vs Vainshtein

We finish by addressing what would be an interesting connection between the presence of superluminalities and the very constraint of massive gravity which removes the BD ghost, which was pointed out in [192, 177, 178]. Actually, one can show that the presence of local configurations which admit superluminalities is generic to any theory of massive gravity, including DGP, cascading gravity, non-Fierz-Pauli massive gravity and even other braneworld models, and is not specific to the presence of a constraint which removes the BD ghost. For instance, consider a theory of massive gravity for which the cubic interactions about flat spacetime differ from those of the ghost-free model of massive gravity. Then as shown in Section 2.5 (for instance, Eqs. (2.86) or (2.89); see also [111, 173]) the decoupling limit analysis leads to terms of the form

$${\mathcal{L}_{{\rm{FP}},\pi}} = - {1 \over 2}{(\partial \pi)^2} + {4 \over {{M_{{\rm{Pl}}}}{m^4}}}\left({[\Pi ][{\Pi ^2}] - [{\Pi ^3}]} \right).$$
(10.55)

As we have shown earlier, results from this decoupling limit are in full agreement with a characteristic analysis.

The plane wave solution provided in (10.40) is still a vacuum solution in this case. Following the same analysis as that provided in Section 10.6.1, one can easily find modes propagating with superluminal group and phase velocity for appropriate choices of functions \(F(x^1 - t)\) (while keeping within the regime of validity of the theory).

Alternatively, let us look at a background configuration \(\bar \pi\) with \(\bar \Pi = {\partial ^2}\bar \pi\). Without loss of generality, at any point x one can diagonalize the matrix \(\bar \Pi\). Focusing on a mode traveling along the \(x^1\) direction with momentum \(k_\mu = (k_0, k_1, 0, 0)\), we find the dispersion relation

$$\left({k_0^2 - k_1^2} \right) + {{16} \over {{M_{{\rm{Pl}}}}{m^4}}}{\left({{k_0} - {k_1}} \right)^2}\left[ {k_1^2\left({\bar \Pi _{\;\;0}^0 + \bar \Pi _{\;\;2}^2 + \bar \Pi _{\;\;3}^3} \right) + k_0^2\left({\bar \Pi _{\;\;1}^1 + \bar \Pi _{\;\;2}^2 + \bar \Pi _{\;\;3}^3} \right) + 2{k_0}{k_1}\bar \Pi _{\;\;\mu}^\mu} \right] = 0.$$
(10.56)

The presence of higher powers in k is nothing else but the signal of the BD ghost about generic backgrounds where \(\bar \Pi \ne 0\). Performing a characteristic analysis at this point would focus on the higher powers in k which are intrinsic to the ghost. One can follow instead the non-ghost mode which is already present even when \(\bar \Pi = 0\). To follow this mode, it is therefore sufficient to perform a perturbative analysis in k. Equation (10.56) is always solved by \(k_0 = k_1\) as well as by

$${k_0} = - {k_1} + {{32} \over {{M_{{\rm{Pl}}}}{m^4}}}k_1^3\left({\bar \Pi _{\;0}^0 - \bar \Pi _{\;1}^1} \right) + \mathcal{O}\left({{{k_1^5{{\bar \Pi}^2}} \over {{m^8}{M_{{\rm{Pl}}}}}}} \right).$$
(10.57)

We can, therefore, always find a configuration for which \(k_0^2 > k_1^2\), at least perturbatively, which is sufficient to imply the existence of superluminalities. Even though this calculation was performed perturbatively, it still implies the presence of classical superluminalities, as in the previous analysis of Galileon theories or ghost-free massive gravity.

As a result, the presence of local solutions in massive gravity which admit superluminalities is not connected to the constraint that removes the BD ghost. Rather, it is likely that the presence of superluminalities could be tied to the Vainshtein mechanism (with flat asymptotic boundary conditions), which as we have seen is crucial for these types of theories (see Refs. [1, 313] and [129] for a possible connection). More recently, the presence of superluminalities has also been connected to the idea of classicalization, which is tied to the Vainshtein mechanism [206, 469]. It is possible that the only way these superluminalities could make sense is through this idea of classicalization. Needless to say, this is very much speculative at the moment. Perhaps the Galileon dualities presented below could help in understanding these open questions.

12.7 Galileon duality

The low strong coupling scale and the presence of superluminalities raise the question of how to understand the theory beyond the redressed strong coupling scale, and whether or not the superluminalities are present in the front velocity.

A non-trivial map between the conformal Galileon and the DBI conformal Galileon was recently presented in [113] (see also [55]). The conformal Galileon side admits superluminal propagation while the DBI side of the map is luminal. Since both sides are related by a ‘simple’ field redefinition which does not change the physics, and cannot change the causality of the theory, this suggests that the superluminalities encountered in that example must be in the group velocity rather than the front velocity.

Recently, another Galileon duality was proposed in [115] and [136] by use of a simple Legendre transform. First encountered within the decoupling limit of bi-gravity [224], the duality can be seen as being related to the freedom in how to introduce the Stückelberg fields. However, the duality survives independently of bi-gravity and could be significant in the context of massive gravity.

To illustrate this duality, we start with a full Galileon in \(d\) dimensions as in (10.6)

$$S = \int {{{\rm{d}}^d}} x\left({\pi \sum\limits_{n = 1}^d {{{{c_{n + 1}}} \over {{\Lambda ^{3(n - 1)}}}}} {\mathcal{L}_n}[\Pi ]} \right),$$
(10.58)

and perform the field redefinition

$$\pi (x)\quad \rightarrow \quad \rho (\tilde x) = - \pi (x) - {1 \over {2{\Lambda ^3}}}{\left({{{\partial \pi (x)} \over {\partial {x^\mu}}}} \right)^2}$$
(10.59)
$${x^\mu}\quad \rightarrow \quad {\tilde x^\mu} = {x^\mu} + {\eta ^{\mu \nu}}{1 \over {{\Lambda ^3}}}{{\partial \pi (x)} \over {\partial {x^\nu}}},$$
(10.60)

This transformation is fully invertible without requiring any inverse of derivatives,

$$\rho (\tilde x)\quad \rightarrow \quad \pi (x) = - \rho (\tilde x) - {1 \over {2{\Lambda ^3}}}{\left({{{\partial \rho (\tilde x)} \over {\partial {{\tilde x}^\mu}}}} \right)^2}$$
(10.61)
$${\tilde x^\mu}\quad \rightarrow \quad {x^\mu} = {\tilde x^\mu} + {\eta ^{\mu \nu}}{1 \over {{\Lambda ^3}}}{{\partial \rho (\tilde x)} \over {\partial {{\tilde x}^\nu}}},$$
(10.62)

so the field transformation is not non-local (at least not in the traditional sense) and does not hide degrees of freedom.
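The invertibility of (10.59)-(10.60) through (10.61)-(10.62) can be illustrated numerically in a toy setting. The sketch below restricts to a single spatial coordinate (so η¹¹ = +1), picks the profile π(x) = sin x and the value Λ³ = 5 so that 1 + π″/Λ³ stays positive; all of these choices and the function names are assumptions made for illustration only. The derivative ∂ρ/∂x̃ is obtained from the parametric representation (x̃(x), ρ(x)) by a finite difference, and the inverse map is then seen to return the original point and field value.

```python
import math

LAMBDA3 = 5.0  # assumed value of Lambda^3, large enough that dx~/dx > 0


def pi_field(x):
    """Assumed one-dimensional profile pi(x) = sin(x)."""
    return math.sin(x)


def dpi_field(x):
    """Derivative of the assumed profile."""
    return math.cos(x)


def forward(x):
    """Map x -> (x~, rho(x~)) following (10.59)-(10.60)."""
    x_tilde = x + dpi_field(x) / LAMBDA3
    rho = -pi_field(x) - dpi_field(x) ** 2 / (2.0 * LAMBDA3)
    return x_tilde, rho


def drho_dxtilde(x, h=1e-5):
    """d rho / d x~ along the parametric curve, by a central finite difference."""
    xt_p, rho_p = forward(x + h)
    xt_m, rho_m = forward(x - h)
    return (rho_p - rho_m) / (xt_p - xt_m)


def round_trip(x):
    """Apply (10.59)-(10.60), then the inverse (10.61)-(10.62)."""
    x_tilde, rho = forward(x)
    grad = drho_dxtilde(x)
    x_back = x_tilde + grad / LAMBDA3
    pi_back = -rho - grad**2 / (2.0 * LAMBDA3)
    return x_back, pi_back


if __name__ == "__main__":
    for x in (0.3, 1.0, 2.5):
        print(x, pi_field(x), round_trip(x))
```

The check works because, along the parametric curve, ∂ρ/∂x̃ = −∂π/∂x, which is what makes the inverse map take the same form as the forward one.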

In terms of the dual field \(\rho (\tilde x)\), the Galileon theory (10.58) is nothing other than another Galileon with different coefficients,

$$S = \int {\;{{\rm{d}}^d}} \tilde x\left({\rho (\tilde x)\sum\limits_{n = 1}^d {{{{p_{n + 1}}} \over {{\Lambda ^{3(n - 1)}}}}} {\mathcal{L}_n}[\Sigma ]} \right),$$
(10.63)

with \(\Sigma_{\mu \nu} = {\partial ^2}\rho (\tilde x)/\partial {{\tilde x}^\mu}\partial {{\tilde x}^\nu}\) and the new coefficients are given by [136]

$${p_n} = {1 \over n}\sum\limits_{k = 2}^{d + 1} {{{(- 1)}^k}} {c_k}{{k(d - k + 1)!} \over {(n - k)!(d - n + 1)!}}.$$
(10.64)

This duality thus maps a Galileon to another Galileon theory with different coefficients. In particular this means that the free theory \(c_{n>2} = 0\) maps to another non-trivial (d + 1)th order Galileon theory with \(p_n \ne 0\) for any 2 ≤ n ≤ d + 1. This dual Galileon theory admits superluminal propagation precisely in the same way as was pointed out on the spherically symmetric configurations of Section 10.1.2 or on the plane wave solutions of Section 10.6.1. Yet, this non-trivial Galileon is dual to a free theory which is causal and luminal by definition.
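The coefficient map (10.64) is simple enough to evaluate exactly. The sketch below does so with rational arithmetic for d = 4, under the assumption that terms with k > n are dropped (since 1/(n − k)! is taken to vanish for negative arguments); the function name and the dictionary layout are hypothetical. Starting from the free theory with c₂ = 1/12 it returns non-zero p₂, …, p₅, and applying the map a second time returns the free theory, as one would expect from the symmetric form of (10.59)-(10.62).

```python
from fractions import Fraction
from math import factorial


def dual_coefficients(c, d=4):
    """Evaluate p_n of (10.64) for n = 2..d+1.

    c is a dict {k: c_k}; missing entries are treated as zero.
    Terms with k > n are omitted, assuming 1/(n-k)! vanishes there.
    """
    p = {}
    for n in range(2, d + 2):
        total = Fraction(0)
        for k in range(2, n + 1):
            c_k = Fraction(c.get(k, 0))
            weight = Fraction(factorial(d - k + 1), factorial(n - k) * factorial(d - n + 1))
            total += (-1) ** k * c_k * k * weight
        p[n] = total / n
    return p


if __name__ == "__main__":
    free_theory = {2: Fraction(1, 12)}  # canonical normalization, c_{n>2} = 0
    dual = dual_coefficients(free_theory)
    print(dual)
    print(dual_coefficients(dual))
```

Exact fractions are used rather than floats so that the vanishing of the higher coefficients after the second application is not obscured by rounding.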

What was computed in these examples for a non-trivial Galileon theory (and in all the examples known so far in the literature) is only the tree-level group velocity, valid up to the (redressed) strong coupling scale of the theory. Once the (redressed) strong coupling scale is reached, loops need to be included. In the dual free theory, however, there are no loops to account for, and thus the result of luminal velocity in that free theory is valid at all scales and has to match the front velocity. This strongly suggests that the front velocity in that example of non-trivial Galileon theory is luminal and the theory is causal even though it exhibits a superluminal group velocity.

It is clear at this point that a deeper understanding of this class of theories is required. We expect this will be the subject of further studies. In the rest of this review, we focus on some phenomenological aspects of massive gravity before presenting other theories of massive gravity.

13 Part III Phenomenological Aspects of Ghost-free Massive Gravity

14 Phenomenology

Below, we summarize some of the phenomenology of massive gravity and DGP. Many other interesting results have been derived in the literature, including the implications for the very Early Universe. For instance, false vacuum decay and the Hartle-Hawking no-boundary proposal were studied in the context of massive gravity in [498, 439, 499], where it was shown that the graviton mass could increase the rate. The implications of massive gravity for the cyclic Universe were also studied in Ref. [91] with a regular bounce. We emphasize that in massive gravity the reference metric has to be chosen once and for all and cannot be modified, no matter what the background configuration is (no matter whether we are interested in cosmology, in spherically symmetric configurations or in anything else). This is a consistent procedure since massive gravity has been shown to be free of the BD ghost for any choice of reference metric independently of the background configuration. Theories with different reference metrics represent different independent theories.

14.1 Gravitational waves

14.1.1 Speed of propagation

If the photon had a mass it would no longer propagate at ‘the speed of light’, but at a lower speed. For the photon its speed of propagation is known with such an accuracy in so many different media that it can be used to put the most stringent constraints on the photon mass [68], \(m_\gamma < 10^{-18}\) eV. In the rest of this review we will adopt the viewpoint that the photon is massless and that light does indeed propagate at the ‘speed of light’.

The earliest bounds on the graviton mass were based on the same idea. As described in [487] (see also [394]), if the graviton had a mass, gravitational waves would propagate at a speed different from that of light, \(\upsilon _g^2 = 1 - {m^2}/{E^2}\) (assuming a speed of light c = 1). This difference in velocity between light and gravitational waves would manifest itself in observations of supernovae. Assuming the emission of a gravitational wave with frequency larger than the graviton mass, this could lead to a bound on the graviton mass of \(m < 10^{-23}\) eV considering a frequency of 100 Hz and a supernova located 200 Mpc away [487] (assuming that the photon propagates at the speed of light).
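
To get a feeling for the size of the effect, the dispersion relation \(\upsilon _g^2 = 1 - {m^2}/{E^2}\) can be turned into an arrival-time delay relative to a luminal signal. The sketch below is only a rough estimate under stated assumptions: the graviton energy is taken to be \(E = hf\), cosmological expansion is ignored, and the function name is hypothetical. It is not the analysis of [487] and is not meant to reproduce the quoted bound.

```python
H_EV_S = 4.135667696e-15         # Planck constant in eV s
C_M_S = 299_792_458.0            # speed of light in m/s
MPC_M = 3.0856775814913673e22    # one megaparsec in metres


def arrival_delay_s(m_ev, freq_hz, dist_mpc):
    """Delay (in seconds) of a massive graviton relative to a luminal signal.

    Uses v_g^2 = 1 - m^2/E^2 with E = h f (assumption) and a static,
    non-expanding Universe (assumption).
    """
    energy_ev = H_EV_S * freq_hz
    x = (m_ev / energy_ev) ** 2
    if x >= 1.0:
        raise ValueError("frequency must exceed the graviton mass")
    # For x << 1, 1/v_g - 1 ~ x/2. The expansion avoids the catastrophic
    # cancellation of evaluating 1 - sqrt(1 - x) in floating point.
    light_travel_time_s = dist_mpc * MPC_M / C_M_S
    return light_travel_time_s * x / 2.0


delay = arrival_delay_s(m_ev=1e-23, freq_hz=100.0, dist_mpc=200.0)
```

Since the delay scales as \(m^2\), halving the graviton mass reduces it by a factor of four, which is why timing-based bounds weaken quickly at small masses.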

Alternatively, another way to test the speed of gravitational waves and bound the graviton mass without relying on any assumptions on the photon is through the observation of inspiralling compact objects, which allows one to derive the frequency dependence of GWs. The detection of GWs in Advanced LIGO could then bound the graviton mass potentially all the way down to \(m < 10^{-29}\) eV [487, 486, 71].

The graviton mass is also relevant for the production of primordial gravitational waves during inflation. Following the analysis of [282] it was shown that the graviton mass opens up the production of gravitational waves during inflation with a sharp peak with a height and position which depend on the graviton mass. See also [403] for the study of exact plane wave solutions in massive gravity.

Nevertheless, these bounds on the graviton mass are relatively weak compared to the typical value of \(m \sim 10^{-30} - 10^{-33}\) eV considered so far in this review. The reason is that these bounds do not take into account the effects arising from the additional polarizations in the gravitational waves which would be present if the graviton had a mass in a Lorentz-invariant theory. For the photon, if it had a mass, the additional polarization would decouple and would therefore be irrelevant (this is related to the absence of a vDVZ discontinuity at the classical level for a Proca theory). In massive gravity, however, the helicity-0 mode of the graviton couples to matter. As we shall see below, the bounds on the graviton mass inferred from the absence of fifth forces are typically much more stringent.

14.1.2 Additional polarizations

One of the predictions of GR is the existence of gravitational waves (GW) with two transverse independent polarizations.

While GWs have not been directly detected via interferometer yet, they have been detected through the spin-down of binary pulsar systems [322, 457, 485]. This detection via binary pulsars does not count as a direct detection, but it matches expectations from GWs with such accuracy, and for so many different systems of different relative masses, that it seems unlikely that the spin-down could be due to anything other than the emission of GWs.

In a modified theory of gravity, one could expect a total of up to six polarizations for the GWs as seen in Figure 6.

Figure 6: Polarizations of gravitational waves in general relativity and potential additional polarizations in modified gravity.

As emphasized in the first part of this review, and particularly in Section 2.5, the sixth excitation, namely the longitudinal one, represents a ghost degree of freedom. Thus, if that mode is observed, it cannot be arising from a Lorentz-invariant massive graviton. Its presence could be linked for instance to new scalar degrees of freedom which are independent from the graviton itself. In massive gravity, only five polarizations are expected. Notice however that the helicity-1 mode does not couple directly to matter or external sources, so it is unlikely that GWs with polarizations which mix the transverse and longitudinal directions would be produced in a natural process.

Furthermore, any physical process which is expected to produce GWs would include very dense sources where the Vainshtein mechanism will thus be expected to be active and screen the effect of the helicity-0 mode. As a result the excitation of the breathing mode is expected to be suppressed in any theory of massive gravity which includes an active Vainshtein mechanism.

So, while one could in principle expect up to six polarizations for GWs in a modified theory of gravity, in massive gravity only the two helicity-2 polarizations are expected to be produced in a potentially observable amount by interferometers like advanced-LIGO [289]. To summarize, in ghost-free massive gravity or DGP we expect the following:

  • The helicity-2 modes are produced in the same way as in GR and would be indistinguishable if they travel distances smaller than the graviton Compton wavelength

  • The helicity-1 modes are not produced

  • The breathing or conformal mode is produced but suppressed by the Vainshtein mechanism, so the magnitude of this mode is suppressed compared to the helicity-2 polarization by many orders of magnitude.

  • The longitudinal mode does not exist in a ghost-free theory of massive gravity. If such a mode is observed it must arise from another field independent from the graviton.

We will also discuss the implications for indirect detection of GWs via binary pulsar spin-down in Section 11.4. We will see that already in these setups the radiation in the breathing mode is suppressed by 8 orders of magnitude compared to that in the helicity-2 mode. In more relativistic systems such as black-hole mergers, this suppression will be even bigger as the Vainshtein mechanism is stronger in these cases, and so we do not expect to see the helicity-0 mode component of a GW emitted by such systems.

To summarize, while additional polarizations are present in massive gravity, we do not expect to be able to observe them in current interferometers. However, these additional polarizations, and in particular the breathing mode can have larger effects on solar-system tests of gravity (see Section 11.2) as well as for weak lensing (see Section 11.3), as we review in what follows. They also have important implications for black holes as we discuss in Section 11.5 and in cosmology in Section 12.

14.2 Solar system

A lot of the phenomenology of massive gravity can be derived from its decoupling limit where it resembles a Galileon theory. Since the Galileon was first encountered in DGP most of the phenomenology was first derived for that model. The extension to massive gravity is usually relatively straightforward with a few subtleties which we mention at the end. We start by reviewing the phenomenology assuming a cubic Galileon decoupling limit, which is directly applicable for DGP and then extend to the quartic Galileon and ghost-free massive gravity.

Within the context of DGP, a lot of its phenomenology within the solar system was derived in [388, 386] using the full higher-dimensional picture as well as in [215]. In these works the effect of the helicity-0 mode on the advance of the perihelion was computed explicitly. In particular in [215] it was shown how an infrared modification of gravity could have an effect on small solar system scales and in particular on the Moon. In what follows we review their approach.

Consider a point source of mass M localized at r = 0. In GR (or rather Newtonian gravity as it is a sufficient approximation), the gravitational potential mediated by the point source is

$$\Psi (r) = - {h_{00}} = - {M \over {4\pi {M_{{\rm{Pl}}}}}}{1 \over r} = - {M_{{\rm{Pl}}}}{{{r_S}} \over r},$$
(11.1)

where \(r_S\) is the Schwarzschild radius associated with the source. Now, in a theory of massive gravity, the helicity-0 mode of the graviton also contributes to the gravitational potential with an additional amount δ Ψ. As seen in Section 10.1, when the Vainshtein mechanism is active the contribution from the helicity-0 mode is very much suppressed, δ Ψ ≪ Ψ, but measurements in the Solar system are reaching such a level of accuracy that even a small deviation δ Ψ could in principle be observable [488].

In the decoupling limit of DGP, matter fields couple to the following perturbed metric

$$h_{\mu \nu}^{{\rm{DGP}}} = h_{\mu \nu}^{{\rm{Einstein}}} + {\pi _0}{\eta _{\mu \nu}},$$
(11.2)

where π is the helicity-0 mode of the graviton (up to some dimensionless numerical factors which we have set to unity). In massive gravity, matter couples to the following metric (see the discussion in Section 10.1.3 and (10.21)),

$$h_{\mu \nu}^{{\rm{massive}}\;\;{\rm{gravity}}} = h_{\mu \nu}^{{\rm{Einstein}}} + {\pi _0}{\eta _{\mu \nu}} + {\alpha \over {\Lambda _3^3}}{\partial _\mu}{\pi _0}{\partial _\nu}{\pi _0}.$$
(11.3)

The deviation δ Ψ to the gravitational potential is thus given by

$$\delta \Psi = - {\pi _0},$$
(11.4)

(notice that in the static and spherically symmetric case \({\partial _\mu}\pi {\partial _\nu}\pi\) leads to no correction to the gravitational potential).

Following [215] we define as ϵ the fractional change in the gravitational potential

$$\epsilon (r) = {{\delta \Psi} \over \Psi} = {{{\pi _0}(r)} \over {{M_{{\rm{Pl}}}}}}{r \over {{r_S}}}.$$
(11.5)

This change in the Newtonian force implies a change in the motion of a test particle (for instance the Moon) within the gravitational field of the localized mass M (for instance the Earth) as compared to GR. For elliptical orbits this leads to an additional angular precession of the perihelion due to the force mediated by the helicity-0 mode on top of that of GR. The additional advance of the perihelion per orbit is given in terms of ϵ as

$$\delta \phi = \pi {R_0}{{\rm{d}} \over {{\rm{d}}r}}\left({{r^2}{{\rm{d}} \over {{\rm{d}}r}}({r^{- 1}}\epsilon)} \right){\vert_{{R_0}}},$$
(11.6)

where R0 is the mean orbit radius, (notice the π in that expression is the standard value π = 3.14 … nothing to do with the helicity-0 mode).

14.2.1 DGP and cubic Galileon

In the decoupling limit of DGP (cubic Galileon) π was given in (10.15) and \({r^{- 1}}\epsilon \sim {(r/r_{\,\,*}^3)^{1/2}}\), where r* is the strong coupling radius derived in (10.14), \({r_*} = \Lambda _3^{- 1}{(M/4\pi {M_{{\rm{Pl}}}})^{1/3}}\) leading to an anomalous advance of the perihelion

$$\delta \phi \sim{{3\pi} \over 4}{\left({{r \over {{r_{\ast}}}}} \right)^{3/2}}.$$
(11.7)

When the graviton mass goes to zero, \(\Lambda_3 \to 0\) so that \(r_* \to \infty\) and the departure from GR goes to zero; this is another example of how the Vainshtein mechanism arises. Interestingly, it was pointed out in [388, 386] that in DGP the sign of this anomalous angle depends on the branch studied (self-accelerating branch or normal branch).
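
The numerical prefactor in (11.7) follows from inserting a power law into (11.6): for \(\epsilon = (r/r_*)^p\) one finds \(\delta \phi = \pi\, p(p - 1)(R_0/r_*)^p\), which gives \(3\pi/4\) for \(p = 3/2\). The sketch below checks this by evaluating (11.6) with nested central finite differences; the function names, the step size and the test values of \(R_0\) and \(r_*\) are assumptions chosen for illustration.

```python
import math


def perihelion_advance(eps, r0, rel_step=1e-3):
    """Evaluate (11.6), pi * R0 * d/dr( r^2 d/dr( eps / r ) ) at r = R0,
    using nested central finite differences (a numerical sketch)."""
    h = rel_step * r0

    def inner(r):
        # r^2 * d/dr (eps(r) / r)
        slope = (eps(r + h) / (r + h) - eps(r - h) / (r - h)) / (2.0 * h)
        return r * r * slope

    return math.pi * r0 * (inner(r0 + h) - inner(r0 - h)) / (2.0 * h)


def power_law(p, r_star):
    """Fractional correction eps(r) = (r / r_star)^p."""
    return lambda r: (r / r_star) ** p


R0, R_STAR = 1.0, 50.0   # arbitrary test values with R0 << r_star
numeric = perihelion_advance(power_law(1.5, R_STAR), R0)
analytic = 0.75 * math.pi * (R0 / R_STAR) ** 1.5
```

The same routine can be applied to any other profile \(\epsilon(r)\), for instance a different power law, without redoing the algebra.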

For the Earth-Moon system, taking \(\Lambda _3^3 = {m^2}{M_{{\rm{Pl}}}}\) with \(m \sim {H_0} \sim {10^{- 33}}\) eV, this leads to an anomalous precession of the order of [215]

$$\delta \phi \sim{10^{- 12}}\;{\rm{rad/orbit}},$$
(11.8)

which is just on the edge of the level of accuracy currently reached by the lunar laser ranging experiment [488] (for instance the accuracy quoted for the effective variation of the Gravitational constant is \((4 \pm 9) \times 10^{-13}\)/year \(\sim (0.5 \pm 1) \times 10^{-11}\)/orbit).

As pointed out in [215] and [388, 386], the effect could be bigger for the advance of the perihelion of Mars around the Sun, but at the moment the accuracy is slightly less.
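
The order of magnitude in (11.8) can be reproduced by inserting numbers into \({r_*} = \Lambda _3^{- 1}{(M/4\pi {M_{{\rm{Pl}}}})^{1/3}}\) and (11.7). The following sketch assumes that \(M_{\rm Pl}\) denotes the reduced Planck mass, uses approximate values for the Earth mass and the Earth-Moon distance that are not taken from the text, and treats the \(\sim\) in (11.7) as an equality; it is an order-of-magnitude estimate only.

```python
import math

HBARC_EV_M = 1.973269804e-7   # hbar * c in eV m (converts 1/eV to metres)
KG_TO_EV = 5.609588603e35     # 1 kg expressed in eV
M_PL_EV = 2.435e27            # reduced Planck mass in eV (assumed convention)


def strong_coupling_radius_m(mass_kg, m_graviton_ev):
    """r_* = Lambda_3^{-1} (M / 4 pi M_Pl)^{1/3}, with Lambda_3^3 = m^2 M_Pl."""
    lambda3_cubed = m_graviton_ev ** 2 * M_PL_EV            # in eV^3
    source = mass_kg * KG_TO_EV / (4.0 * math.pi * M_PL_EV)  # dimensionless
    r_star_inv_ev = (source / lambda3_cubed) ** (1.0 / 3.0)  # in 1/eV
    return r_star_inv_ev * HBARC_EV_M


def cubic_galileon_advance(r_m, mass_kg, m_graviton_ev):
    """Perihelion advance per orbit from (11.7), in radians."""
    r_star = strong_coupling_radius_m(mass_kg, m_graviton_ev)
    return 0.75 * math.pi * (r_m / r_star) ** 1.5


EARTH_MASS_KG = 5.972e24      # approximate, assumed input
EARTH_MOON_M = 3.844e8        # approximate mean distance, assumed input

r_star_earth = strong_coupling_radius_m(EARTH_MASS_KG, 1e-33)
advance = cubic_galileon_advance(EARTH_MOON_M, EARTH_MASS_KG, 1e-33)
```

With these inputs the strong coupling radius of the Earth comes out many orders of magnitude larger than the lunar orbit, and the advance lands at the \(10^{-12}\) rad/orbit level.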

14.2.2 Massive gravity and quartic Galileon:

As already mentioned in Section 10.1.2, the Vainshtein mechanism is typically much strongerFootnote 28 in the spherically symmetric configuration of the quartic Galileon and thus in massive gravity (see for instance the suppression of the force given in (10.17)). Using the same values as before for a quartic Galileon we obtain

$$\delta \phi \sim2\pi {\left({{r \over {{r_{\ast}}}}} \right)^2}\sim{10^{- 16}}/{\rm{orbit}}.$$
(11.9)

Furthermore, in massive gravity the parameter that enters this relation is not directly the graviton mass but rather the graviton mass weighted with the coefficient \(\alpha = -(1 + 3/2\,\alpha_3)\), which depends on the cubic potential term \(\mathcal{L}_3\), assuming that \(\alpha_4 = -\alpha_3/4\) (see Section 10.1.3 for more details),

$$\delta \phi \sim{10^{- 16}}{\left({{1 \over \alpha}{{{m^2}} \over {{{({{10}^{- 33}}\;{\rm{eV}})}^2}}}} \right)^{2/3}}/{\rm{orbit}},$$
(11.10)

This is typically very far from observations unless we are very close to the minimal model.Footnote 29

14.3 Lensing

As mentioned previously, one peculiarity of massive gravity not found in DGP nor in a typical Galileon theory (unless we derive the Galileons from a higher-dimensional brane picture [157]) is the new disformal coupling to matter of the form \({\partial _\mu}\pi {\partial _\nu}\pi {T^{\mu \nu}}\), which means that the helicity-0 mode also couples to conformal matter.

In the vacuum, for a static and spherically symmetric configuration the coupling \({\partial _\mu}\pi {\partial _\nu}\pi {T^{\mu \nu}}\) plays no role. So to the level at which we are working when deriving the Vainshtein mechanism about a point-like mass, this additional coupling to matter does not affect the background configuration of the field (see [140] for a discussion outside the vacuum, taking into account for instance the effect of the Earth's atmosphere). However, this disformal coupling does affect the effective metric seen by perturbed sources on top of this configuration. This could have implications for structure formation which, to the best of our knowledge, have not been fully explored yet, and it does affect the bending of light. This effect was pointed out in [490], where the effects on gravitational lensing were explored. We review the key results in what follows and refer to [490] for further discussions (see also [448]).

In GR, the relevant potential for lensing is \({\Phi _L} = {1 \over 2}(\Phi - \Psi) \sim 2({r_S}/r)\), where we use the same notation as before, \(h_{00} = \Psi\) and \(h_{ij} = \Phi \delta_{ij}\). A conformal coupling of the form \(\pi \eta_{\mu \nu}\) does not affect this lensing potential but the disformal coupling \(\alpha/(\Lambda _3^3{M_{{\rm{Pl}}}})\,{\partial _\mu}\pi {\partial _\nu}\pi {T^{\mu \nu}}\) leads to a new contribution δ Φ given by

$$\delta \Phi = {\alpha \over {\Lambda _3^3{M_{{\rm{Pl}}}}}}\pi _0{\prime} {(r)^2}.$$
(11.11)

[Note that we use a different notation than in [490]; here \(\alpha = 1 + 3{\alpha _3}\).] This new contribution to the lensing potential leads to an anomalous fractional lensing of

$$\mathcal{R} = {{{1 \over 2}\delta \Phi} \over {{\Phi _L}}}\sim{r \over {4{r_S}}}\left({{{{\Lambda _3}} \over {{\alpha ^{1/3}}{M_{{\rm{Pl}}}}}}} \right){\left({{M \over {4\pi {M_{{\rm{Pl}}}}}}} \right)^{2/3}}.$$
(11.12)

For the bending of light about the Sun, this leads to an effect of the order of

$$\mathcal{R}\sim{10^{- 11}}{\left({{1 \over \alpha}{{{m^2}} \over {{{({{10}^{- 33}}{\rm{eV}})}^2}}}} \right)^{1/3}},$$
(11.13)

which is utterly negligible. Note that this is a tree-level calculation. When getting at these distances loops ought to be taken into account as well.
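
The order of magnitude quoted in (11.13) can be checked by evaluating (11.12) at the solar limb. The sketch below is an estimate under stated assumptions: \(M_{\rm Pl}\) is taken to be the reduced Planck mass, \(\Lambda _3^3 = {m^2}{M_{{\rm{Pl}}}}\), the Schwarzschild radius is \(r_S = M/(4\pi M_{\rm Pl}^2)\) as read off from the metric function (11.25), and the solar mass and radius are approximate inputs not taken from the text. The function name is hypothetical.

```python
import math

HBARC_EV_M = 1.973269804e-7   # hbar * c in eV m (converts 1/eV to metres)
KG_TO_EV = 5.609588603e35     # 1 kg expressed in eV
M_PL_EV = 2.435e27            # reduced Planck mass in eV (assumed convention)


def anomalous_lensing(r_m, mass_kg, m_graviton_ev, alpha=1.0):
    """Fractional anomalous lensing from (11.12), treating ~ as equality."""
    mass_ev = mass_kg * KG_TO_EV
    lambda3_ev = (m_graviton_ev ** 2 * M_PL_EV) ** (1.0 / 3.0)
    # Schwarzschild radius r_S = M / (4 pi M_Pl^2), converted to metres
    r_s_m = mass_ev / (4.0 * math.pi * M_PL_EV ** 2) * HBARC_EV_M
    geometric = r_m / (4.0 * r_s_m)
    coupling = lambda3_ev / (alpha ** (1.0 / 3.0) * M_PL_EV)
    source = (mass_ev / (4.0 * math.pi * M_PL_EV)) ** (2.0 / 3.0)
    return geometric * coupling * source


SUN_MASS_KG = 1.989e30        # approximate, assumed input
SUN_RADIUS_M = 6.957e8        # approximate, assumed input

ratio_sun = anomalous_lensing(SUN_RADIUS_M, SUN_MASS_KG, 1e-33)
```

Because \(\Lambda_3 \propto m^{2/3}\), the result scales as \((m^2/\alpha)^{1/3}\), matching the parametric dependence displayed in (11.13).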

At the level of galaxies or clusters of galaxies, the effect might be more tangible. The reason is that for the mass of a galaxy, the associated strong coupling radius is not much larger than the galaxy itself and thus at the edge of a galaxy these effects could be stronger. These effects were investigated in [490], where a few-percent effect on the tangential shear caused by the helicity-0 mode of the graviton or of a disformal Galileon was found for a Navarro-Frenk-White halo profile and for some parameters of the theory. Interestingly, the effect peaks at some specific radius which is the same for any halo when measured in units of the virial radius. Even though the effect is small, this peak could provide a smoking gun for such modifications of gravity.

Recently, another analysis was performed in Ref. [407], where the possibility of testing theories of modified gravity exhibiting the Vainshtein mechanism against observations of cluster lensing was explored. In such theories, like in massive gravity, the second derivative of the field can be large at the transition between the screened and unscreened region, leading to observational signatures in cluster lensing.

14.4 Pulsars

One of the main predictions of massive gravity is the presence of new polarizations for GWs. While these new polarizations might not be detectable in GW interferometers as explained in Section 11.1.2, we could still expect them to lead to detectable effects in the binary pulsar systems whose spin-down is in extremely good agreement with GR. In this section, we thus consider the power emitted in the helicity-0 mode of the graviton in a binary-pulsar system. We use the effective action approach derived by Goldberger and Rothstein in [254] and start with the decoupling limit of DGP before exploring that of ghost-free massive gravity and discussing the subtleties that arise in that case. We mainly focus on the monopole and quadrupole radiation although the whole formalism can be derived for any multipoles. We follow the derivation of Refs. [158, 151]; see also Refs. [100, 18] for related studies.

In order to take the Vainshtein mechanism into account we perform a similar background-perturbation split as was performed in Section 10.1. The source is thus split as \(T = T_0 + \delta T\), where \(T_0\) is a static and spherically symmetric source representing the total mass localized at the center of mass and \(\delta T\) captures the motion of the companions with respect to the center of mass.

This matter profile leads to a profile for the helicity-0 mode (here mimicked as a cubic Galileon which is the case for DGP) as in (10.3), \(\pi = \pi_0(r) + \phi\), where the background \(\pi_0(r)\) has the same static and spherical symmetry as \(T_0\) and so has the same profile as in Section 10.1.2.

The background configuration \(\pi_0(r)\) of the field was derived in (10.13), where M accounts in this case for the total mass of both companions and r is the distance to the center of mass. Following the same procedure, the fluctuation ϕ then follows a modified Klein-Gordon equation

$$Z({\pi _0})\partial _x^2\phi (x) = 0,$$
(11.14)

where the Vainshtein mechanism is fully encoded in the background dependent prefactor \(Z(\pi_0) \sim 1 + \partial^2 \pi_0/\Lambda^3\) and \(Z(\pi_0) \gg 1\) in the vicinity of the binary pulsar system (well within the strong coupling radius defined in (10.14)).

Expanding the field in spherical harmonics the mode functions satisfy

$$Z({\pi _0})\partial _x^2\left[ {{u_\ell}(r){Y_{\ell m}}(\Omega){e^{- i\omega t}}} \right] = 0,$$
(11.15)

where the modes are normalized so as to satisfy the standard normalization in the WKB region, for \(r \gg \omega^{-1}\).

The total power emitted via the field π is given by the sum over these mode functions,

$${P^{(\pi)}} = \sum\limits_{\ell = 0}^\infty {P_\ell ^{(\pi)}} = \sum\limits_{\ell = 0}^\infty {\sum\limits_{m = - \ell}^\ell {\sum\limits_{n \geq 0} {(n{\Omega _P})}}} \vert {1 \over {{M_{{\rm{Pl}}}}{T_P}}}\int\nolimits_0^{{T_P}} {{{\rm{d}}^4}} x{u_\ell}(r){Y_{\ell ,m}}{e^{- in{\Omega _P}t}}\delta T{\vert ^2},$$
(11.16)

where \(T_P\) is the orbital period of the binary system and \(\Omega_P = 2\pi/T_P\) is the corresponding angular velocity. \(P_0^{(\pi)}\) is the power emitted in the monopole, \(P_1^{(\pi)}\) in the dipole, \(P_2^{(\pi)}\) in the quadrupole of the field π uniquely, etc., in addition to the standard power emitted in the helicity-2 quadrupole channel of GR.

Without the Vainshtein mechanism, the mode functions would be the same as for a standard free field in flat space-time, \({u_\ell} \sim {1 \over {r\sqrt {\pi \omega}}}\cos (\omega r)\), and the power emitted in the monopole would be larger than that emitted in GR, which would be clearly ruled out by observations. The Vainshtein mechanism is thus crucial here as well for the viability of DGP or ghost-free massive gravity.

14.4.1 Monopole

Taking the prefactor Z (π0) into account, the zero mode for the monopole is given instead by

$${u_0}(r)\sim{1 \over {{{(\omega r_{\ast}^3)}^{1/4}}}}\left({1 - {{{{(\omega r)}^2}} \over 4} + \cdots} \right),$$
(11.17)

in the strong coupling regime \(r \ll \omega^{-1} \ll r_*\), which is the region where the radiation would be emitted. As a result, the power emitted in the monopole channel through the field π is given by [158]

$$P_0^{(\pi)} = \kappa {{{{({\Omega _P}\bar r)}^4}} \over {{{({\Omega _P}{r_{\ast}})}^{3/2}}}}{{{\mathcal{M}^2}} \over {M_{{\rm{Pl}}}^2}}\Omega _P^2,$$
(11.18)

where \({\mathcal M}\) is the reduced mass and \(\bar r\) is the semi-major axis of the orbit and κ is a numerical prefactor of order 1 which depends on the eccentricity of the orbit.

This is to be compared with the Peters-Mathews formula for the power emitted in GR (in the helicity-2 modes) in the quadrupole [428],

$$P_2^{({\rm{Peters - Mathews}})} = \tilde \kappa {({\Omega _P}\bar r)^4}{{{{\tilde{\mathcal{M}}}^2}} \over {M_{{\rm{Pl}}}^2}}\Omega _P^2,$$
(11.19)

where \(\tilde \kappa\) is again a different numerical prefactor which depends on the eccentricity of the orbit, and \(\tilde {\mathcal M}\) is a different combination of the companion masses. When both masses are the same (as is almost the case for the Hulse-Taylor pulsar), \({\mathcal M} = \tilde {\mathcal M}\).

We see that the radiation in the monopole is suppressed by a factor of \((\Omega_P r_*)^{-3/2}\) compared with the GR result. For the Hulse-Taylor pulsar this is a suppression of 10 orders of magnitude, which is completely unobservable (at best the precision of the GR result is of 3 orders of magnitude).

Notice, however, that the suppression is far less than what was naively anticipated from the static approximation in Section 10.1.2.

The same analysis can be performed for the dipole emission with an even larger suppression of about 19 orders of magnitude compared with the Peters-Mathews formula.

14.4.2 Quadrupole

The quadrupole emission in the field π is slightly larger than the monopole. The reason is that energy conservation makes the non-relativistic limit of the monopole radiation irrelevant and one needs to take the first relativistic correction into account to emit in that channel. This is not so for the quadrupole as it does not correspond to the charge associated with any Noether current even in the non-relativistic limit.

In the non-relativistic limit, the mode function for the quadrupole is simply \({u_2}(r) \sim {(\omega r)^{3/2}}/{(\omega r_{\,\,*}^3)^{1/4}}\) yielding a quadrupole emission

$$P_2^{(\pi)} = \bar{\kappa} {{{{({\Omega _P}\bar r)}^3}} \over {{{({\Omega _P}{r_{\ast}})}^{3/2}}}}{{{{\bar{\mathcal{M}}}^2}} \over {M_{{\rm{Pl}}}^2}}\Omega _P^2,$$
(11.20)

where \(\bar \kappa\) is another numerical factor which depends on the eccentricity of the orbit and \(\bar {\mathcal M}\) another reduced mass. The Vainshtein suppression in the quadrupole is \({({\Omega _P}{r_*})^{- 3/2}}{({\Omega _P}\bar r)^{- 1}} \sim {10^{- 8}}\) for the Hulse-Taylor pulsar, and is thus well below the limit of being detectable.
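
The two suppression factors, \((\Omega_P r_*)^{-3/2}\) for the monopole and \((\Omega_P r_*)^{-3/2}(\Omega_P \bar r)^{-1}\) for the quadrupole, can be estimated with a few lines of arithmetic. The sketch below is an order-of-magnitude check only. It assumes the reduced Planck mass, \(\Lambda _3^3 = {m^2}{M_{{\rm{Pl}}}}\) with \(m = 10^{-33}\) eV, and approximate Hulse-Taylor parameters (orbital period, total mass, semi-major axis) that are not quoted in the text; the O(1) eccentricity-dependent prefactors are ignored.

```python
import math

HBARC_EV_M = 1.973269804e-7   # hbar * c in eV m (converts 1/eV to metres)
KG_TO_EV = 5.609588603e35     # 1 kg expressed in eV
M_PL_EV = 2.435e27            # reduced Planck mass in eV (assumed convention)
C_M_S = 299_792_458.0         # speed of light in m/s
SUN_MASS_KG = 1.989e30        # approximate solar mass


def strong_coupling_radius_m(mass_kg, m_graviton_ev):
    """r_* = Lambda_3^{-1} (M / 4 pi M_Pl)^{1/3}, with Lambda_3^3 = m^2 M_Pl."""
    lambda3_cubed = m_graviton_ev ** 2 * M_PL_EV
    source = mass_kg * KG_TO_EV / (4.0 * math.pi * M_PL_EV)
    return (source / lambda3_cubed) ** (1.0 / 3.0) * HBARC_EV_M


def suppression_factors(period_s, total_mass_kg, semi_major_m, m_graviton_ev):
    """Return (monopole, quadrupole) suppression relative to Peters-Mathews."""
    omega_per_m = 2.0 * math.pi / (period_s * C_M_S)   # Omega_P / c in 1/m
    r_star = strong_coupling_radius_m(total_mass_kg, m_graviton_ev)
    monopole = (omega_per_m * r_star) ** -1.5
    quadrupole = monopole / (omega_per_m * semi_major_m)
    return monopole, quadrupole


# Approximate Hulse-Taylor inputs (assumptions, not taken from the text)
monopole, quadrupole = suppression_factors(
    period_s=27907.0,
    total_mass_kg=2.83 * SUN_MASS_KG,
    semi_major_m=1.95e9,
    m_graviton_ev=1e-33,
)
```

With these inputs the monopole factor lands roughly ten orders of magnitude below unity and the quadrupole factor a few orders of magnitude above that, consistent with the hierarchy described above up to the neglected O(1) prefactors.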

14.4.3 Quartic Galileon

When extending the analysis to more general Galileons or to massive gravity, which includes a quartic Galileon, we expect a priori, by following the analysis of Section 10.1.2, to find a stronger Vainshtein suppression. This result is indeed correct when considering the power radiated in only one multipole. For instance in a quartic Galileon, the power emitted in the field π via the quadrupole channel is suppressed by 12 orders of magnitude compared with the GR emission.

However, this estimation does not account for the fact that there could be many multipoles contributing with the same strength in a quartic Galileon theory [151].

In a quartic Galileon theory, the effective metric within the strong coupling radius for a static and spherically symmetric background is

$${Z_{\mu \nu}}\;\;{\rm{d}}{x^\mu}\;\;{\rm{d}}{x^\nu}\sim{\left({{{\pi _0{\prime}} \over {{\Lambda ^3}r}}} \right)^2}\left({- {\rm{d}}{t^2} + {\rm{d}}{r^2} + r_{\ast}^2\;\;{\rm{d}}{\Omega ^2}} \right),$$
(11.21)

The fact that the angular direction is not suppressed by \(r^2\) but rather by a constant \(r_{\,\,*}^2\) implies that the multipoles are no longer suppressed by additional powers of velocity as is the case in GR or in the cubic Galileon. This implies that many multipoles contribute with the same strength, yielding a potentially large result. This is a sign that perturbation theory is not under control on top of this static and spherically symmetric background and one should really consider a more realistic background which will resum some of these contributions.

In situations where there is a large hierarchy between the mass of the two objects (which is the case for instance within the solar system), perturbation theory can be seen to remain under control and the power emitted in the quartic Galileon is completely negligible.

14.5 Black holes

As in any gravitational theory, the existence and properties of black holes are crucially important for probing the non-perturbative aspects of gravity. The celebrated black-hole theorems of GR play a significant role in guiding understanding of non-perturbative aspects of quantum gravity. Furthermore, the phenomenology of black holes is becoming increasingly important as understanding of astrophysical black holes increases.

Massive gravity and its extensions certainly exhibit black-hole solutions and if the Vainshtein mechanism is successful then we would expect solutions which look arbitrarily close to the Schwarzschild and Kerr solutions of GR. However, as in the case of cosmological solutions, the situation is more complicated due to the absence of a unique static spherically symmetric solution that arises from the existence of additional degrees of freedom, and also the existence of other branches of solutions which may or may not be physical. There are a handful of known exact solutions in massive gravity [413, 363, 365, 277, 105, 56, 477, 90, 478, 455, 30, 357], but the most interesting and physically relevant solutions probably correspond to the generic case where exact analytic solutions cannot be obtained. A recent review of black-hole solutions in bi-gravity and massive gravity is given in [478].

An interesting effect was recently found in the context of bi-gravity in Ref. [41]. In that case, the Schwarzschild solutions were shown to be unstable (with a Gregory-Laflamme type of instability [268, 269]) at a scale dictated by the graviton mass, i.e., the instability timescale is of the order of the age of the Universe. See also Ref. [42] where the analysis was generalized to the non-bidiagonal case. In this more general situation, spherically symmetric perturbations were also found but generically no instabilities. Black-hole disappearance in massive gravity was explored in Ref. [401].

Since all black-hole solutions of massive gravity arise as decoupling limits \(M_f \to \infty\) of solutions in bi-gravity,Footnote 30 we can consider from the outset the bi-gravity solutions and consider the massive gravity limit after the fact. Let us consider then the bi-gravity action expressed as

$$\begin{array}{*{20}c} {S = {{M_{{\rm{Pl}}}^2} \over 2}\int {{{\rm{d}}^4}} x\sqrt {- g} R[g] + {{M_f^2} \over 2}\int {{{\rm{d}}^4}} x\sqrt {- f} R[f]\quad \quad} \\ {+ {{{m^2}M_{{\rm{eff}}}^2} \over 4}\int {{{\rm{d}}^4}} x\sqrt {- g} \sum\limits_n {{{{\beta _n}} \over {n!}}} {\mathcal{L}_n}(\sqrt {\mathbb{X}}) + {\rm{Matter}},} \\ \end{array}$$
(11.22)

where \(M_{{\rm{eff}}}^{- 2} = M_{{\rm{Pl}}}^{- 2} + M_f^{- 2}\). Here, the definition is such that in the limit \(M_f \to \infty\) the \(\beta_n\)'s correspond to the usual expressions in massive gravity. We may imagine matter coupled to both metrics although to take the massive gravity limit we should imagine black holes formed from matter which exclusively couples to the g metric.

One immediate consequence of working with bi-gravity is that the g metric is sourced by polynomials of \(\sqrt {\mathbb X} = \sqrt {{g^{- 1}}f}\) whereas the f metric is sourced by polynomials of \(\sqrt {{f^{- 1}}g}\). We thus require that \({\mathbb X}\) be invertible away from curvature singularities. This is equivalent to saying that the eigenvalues of \({g^{- 1}}f\) and \({f^{- 1}}g\) should not pass through zero away from a curvature singularity. This in turn means that if one metric is diagonal and admits a horizon, the second metric, if it is diagonal, must admit a horizon at the same place, i.e., two diagonal metrics have common horizons. This is a generic observation that is valid for any theory with more than one metric [167] regardless of the field equations. Equivalently, this implies that if f is a diagonal metric without horizons, e.g., Minkowski spacetime, then the metric for a black hole must be non-diagonal when working in unitary gauge. This is consistent with the known exact solutions. For certain solutions it may be possible by means of introducing Stückelberg fields to put both metrics in diagonal form, due to the Stückelberg fields absorbing the off-diagonal terms. However, for the generic solution we would expect at least one metric to be non-diagonal even with Stückelberg fields present.
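
The common-horizon argument can be made concrete with a toy example. The sketch below takes a diagonal Schwarzschild g metric and a diagonal Minkowski f metric written in the same coordinates (an assumed setup corresponding to unitary gauge) and evaluates the eigenvalues of \({g^{- 1}}f\) in the (t, r) block, which for diagonal metrics are simply ratios of components. The function name is hypothetical.

```python
def tr_block_eigenvalues(r, r_s):
    """Eigenvalues of g^{-1} f in the (t, r) block for diagonal
    Schwarzschild g and Minkowski f in the same coordinates (toy setup)."""
    d = 1.0 - r_s / r
    g_diag = (-d, 1.0 / d)    # g_tt, g_rr for Schwarzschild
    f_diag = (-1.0, 1.0)      # f_tt, f_rr for Minkowski
    # For diagonal matrices the eigenvalues of g^{-1} f are f_ii / g_ii.
    return tuple(f_ii / g_ii for f_ii, g_ii in zip(f_diag, g_diag))


outside = tr_block_eigenvalues(r=2.0, r_s=1.0)   # (1/D, D) with D = 0.5
inside = tr_block_eigenvalues(r=0.5, r_s=1.0)    # (1/D, D) with D = -1
```

The radial eigenvalue equals \(D(r) = 1 - r_S/r\), which changes sign at \(r = r_S\) even though the curvature singularity sits at \(r = 0\). In this toy setup \({g^{- 1}}f\) therefore fails to be invertible at the horizon, which illustrates why a diagonal black-hole metric cannot be paired with a diagonal horizon-free reference metric.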

Working with a static spherically symmetric ansatz for both metrics, we find in general that bi-gravity admits Schwarzschild-(anti) de Sitter-type metrics of the form (see [478] for a review)

$$\;\;{\rm{d}}s_g^2 = - D(r)\;\;{\rm{d}}{t^2} + {1 \over {D(r)}}\;\;{\rm{d}}{r^2} + {r^2}\;{\rm{d}}{\Omega ^2},$$
(11.23)
$$\;\;{\rm{d}}s_f^2 = - \Delta (U)\;{\rm{d}}{T^2} + {1 \over {\Delta (U)}}\;{\rm{d}}{U^2} + {U^2}\;{\rm{d}}{\Omega ^2},$$
(11.24)

where

$$D(r) = 1 - {{2M} \over {8\pi M_{{\rm{Pl}}}^2r}} - {1 \over 3}{\Lambda _g}{r^2},$$
(11.25)
$$\Delta (U) = 1 - {1 \over 3}{\Lambda _f}{U^2},$$
(11.26)

are the familiar metric functions for Schwarzschild-de Sitter and de Sitter, respectively.

The f-metric coordinates are related to those of the g metric (in other words, the profiles of the Stückelberg fields are given) by

$$U = ur,\quad T = ut - u\;\int {{{D(r) - \Delta (U)} \over {D(r)\Delta (U)}}} \;{\rm{d}}r,$$
(11.27)

where the constant u is given by

$$u = - {{2{\beta _2}} \over {{\beta _3}}} \pm {1 \over {{\beta _3}}}\sqrt {4\beta _2^2 - 6{\beta _1}{\beta _3}} .$$
(11.28)

Finally, the two effective cosmological constants that arise from the mass terms are

$${\Lambda _g} = - {{{m^2}M_{{\rm{eff}}}^2} \over {M_{{\rm{Pl}}}^2}}\left({6{\beta _0} + 2{\beta _1}u + {1 \over 2}{\beta _2}{u^2}} \right),$$
(11.29)
$${\Lambda _f} = - {{{m^2}M_{{\rm{eff}}}^2} \over {M_f^2{u^2}}}\left({{1 \over 2}{\beta _2} + {1 \over 2}{\beta _3}u + {1 \over 4}{\beta _4}{u^2}} \right).$$
(11.30)

In this form, we see that in the limit Mf → ∞ we have Λf → 0 and Meff → MPl, and then these solutions match onto the known exact black hole solutions in massive gravity in the absence of charge [413, 363, 365, 277, 105, 56, 477, 478, 455, 30, 357]. Note in particular that for every set of βn’s there are two branches of solutions determined by the two possible values of u.
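The dependence of the two effective cosmological constants on Mf can be checked numerically. The following is a minimal sketch that evaluates (11.28)–(11.30) directly; the function names and the parameter values are hypothetical choices made for illustration (units MPl = m = 1), not values taken from the literature.

```python
import math


def u_branches(beta1, beta2, beta3):
    """Both values of the constant u allowed by Eq. (11.28)."""
    disc = 4.0 * beta2**2 - 6.0 * beta1 * beta3
    if beta3 == 0 or disc < 0:
        raise ValueError("Eq. (11.28) has no real solution for these beta_n")
    root = math.sqrt(disc)
    return ((-2.0 * beta2 + root) / beta3, (-2.0 * beta2 - root) / beta3)


def effective_constants(betas, u, m, m_pl, m_f):
    """Return (M_eff^2, Lambda_g, Lambda_f) from Eqs. (11.29) and (11.30)."""
    b0, b1, b2, b3, b4 = betas
    m_eff2 = 1.0 / (m_pl**-2 + m_f**-2)
    lam_g = -(m**2 * m_eff2 / m_pl**2) * (6.0 * b0 + 2.0 * b1 * u + 0.5 * b2 * u**2)
    lam_f = -(m**2 * m_eff2 / (m_f**2 * u**2)) * (
        0.5 * b2 + 0.5 * b3 * u + 0.25 * b4 * u**2
    )
    return m_eff2, lam_g, lam_f


# Illustrative (hypothetical) beta_n values, in units M_Pl = m = 1.
betas = (0.0, -1.0, 1.0, 1.0, 0.0)
u_plus, u_minus = u_branches(betas[1], betas[2], betas[3])
for m_f in (1.0e1, 1.0e3, 1.0e6):
    m_eff2, lam_g, lam_f = effective_constants(betas, u_plus, 1.0, 1.0, m_f)
    print(f"M_f={m_f:.0e}  M_eff^2={m_eff2:.6f}  "
          f"Lambda_g={lam_g:.6f}  Lambda_f={lam_f:.3e}")
```

With the βn held fixed, Λf falls off as Mf grows while Meff² approaches MPl², which is the massive gravity limit described above. The same call with `u_minus` gives the second branch.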

These solutions describe black holes sourced by matter of mass M minimally coupled to the g metric. An obvious generalization is to assume that the matter couples to both metrics, with effective masses M1 and M2 so that

$$D(r) = 1 - {{2{M_1}} \over {8\pi M_{{\rm{Pl}}}^2r}} - {1 \over 3}{\Lambda _g}{r^2},$$
(11.31)
$$\Delta (U) = 1 - {{2{M_2}} \over {8\pi M_f^2U}} - {1 \over 3}{\Lambda _f}{U^2}.$$
(11.32)

Although these are exact solutions, not all of them are stable for all values of the parameters. In certain cases the quadratic kinetic term for various fluctuations vanishes, indicating a linearization instability, which means these are not good vacuum solutions. On the other hand, neither are these the most general black-hole-like solutions; the general case requires solving the equations numerically, which is a subject of ongoing work (see, e.g., [477]). We note only that in [477] a distinct class of solutions is obtained numerically in bi-gravity for which the two metrics take the diagonal form

$$\;\;{\rm{d}}s_g^2 = - Q{(r)^2}\;{\rm{d}}{t^2} + {1 \over {N{{(r)}^2}}}\;{\rm{d}}{r^2} + {r^2}\;{\rm{d}}{\Omega ^2},$$
(11.33)
$$\;\;{\rm{d}}s_f^2 = - A{(r)^2}\;{\rm{d}}{T^2} + {{{U{\prime}}{{(r)}^2}} \over {Y{{(r)}^2}}}\;{\rm{d}}{r^2} + U{(r)^2}\;{\rm{d}}{\Omega ^2},$$
(11.34)

where Q, N, A, Y, U are five functions of radius that are numerically obtained solutions of five differential equations. According to the previous arguments about diagonal metrics [167], these solutions do not correspond to black holes in the Mf → ∞ limit of massive gravity on Minkowski. However, the limit Mf → ∞ can be taken, and they then correspond to black-hole solutions in a theory of massive gravity in which the reference metric is Schwarzschild(-de Sitter or anti-de Sitter). The arguments of [167] are then evaded since the reference metric itself admits a horizon.

15 Cosmology

One of the principal motivations for considering massive theories of gravity is their potential to address, or at least provide a new perspective on, the issue of cosmic acceleration as already discussed in Section 3. Adding a mass for the graviton keeps physics at small scales largely equivalent to GR because of the Vainshtein mechanism. However, it inevitably modifies gravity at large distances, i.e., in the infrared. This modification of gravity is thus most significant for sources which are long wavelength. The cosmological constant is the most infrared source possible since it is built entirely out of zero momentum modes, and for this reason we may hope that the nature of a cosmological constant in a theory of massive gravity or similar infrared modification is changed.

There have been two principal ideas for how massive theories of gravity could be useful for addressing the cosmological constant. On the one hand, by weakening gravity in the infrared, they may weaken the sensitivity of the dynamics to an already existing large cosmological constant. This is the idea behind screening or degravitating solutions [211, 212, 26, 216] (see Section 4.5). The second idea is that a condensate of massive gravitons could form which acts as a source for self-acceleration, potentially explaining the current cosmic acceleration without the need to introduce a non-zero cosmological constant (as in the case of the DGP model [159, 163], see Section 4.4). This idea does not address the ‘old cosmological constant problem’ [484] but rather assumes that some other symmetry or mechanism exists which ensures the vacuum energy vanishes. Given this, massive theories of gravity could potentially provide an explanation for the currently small, and hence technically unnatural, value of the cosmological constant, by tying it to the small, technically natural, value of the graviton mass.

Thus, the ideas of screening/degravitation and self-acceleration are logical opposites of each other, but there is some evidence that both can be achieved in massive theories of gravity. This evidence is provided by the decoupling limit of massive gravity, which we review first. We then go on to discuss attempts to find exact solutions in massive gravity and its various extensions.

15.1 Cosmology in the decoupling limit

A great deal of understanding about the cosmological solutions in massive gravity theories can be learned from considering the ‘decoupling limit’ of massive gravity discussed in Section 8.3. The idea here is to recognize that locally, i.e., in the vicinity of a point, any FLRW geometry can be expressed as a small perturbation about Minkowski spacetime (about \(\vec x = 0\)) with the perturbation expansion being good for distances small relative to the curvature radius of the geometry:

$${\rm{d}}{s^2} = - \left[ {1 - (\dot H + {H^2}){{\vec{x}}^2}} \right]{\rm{d}}{t^2} + \left[ {1 - {1 \over 2}{H^2}{{\vec{x}}^2}} \right]{\rm{d}}{\vec{x}^2}$$
(12.1)
$$= \left({{\eta _{\mu \nu}} + {1 \over {{M_{{\rm{Pl}}}}}}h_{\mu \nu}^{{\rm{FLRW}}}} \right){\rm{d}}{x^\mu}{\rm{d}}{x^\nu}.$$
(12.2)
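The components of the perturbation can be read off by comparing (12.1) with (12.2). This is only a sketch in the conventions of these two equations, valid to the same order in \(\vec x\):

$$h_{00}^{{\rm{FLRW}}} = {M_{{\rm{Pl}}}}(\dot H + {H^2}){\vec x^2},\quad h_{0i}^{{\rm{FLRW}}} = 0,\quad h_{ij}^{{\rm{FLRW}}} = - {1 \over 2}{M_{{\rm{Pl}}}}{H^2}{\vec x^2}{\delta _{ij}}.$$

Each nonzero component is proportional to either \({H^2}{M_{{\rm{Pl}}}}\) or \(\dot H{M_{{\rm{Pl}}}}\).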

In the decoupling limit MPl → ∞, m → 0 we keep the canonically normalized metric perturbation fixed. Thus, the decoupling limit corresponds to keeping \({H^2}{M_{{\rm{Pl}}}}\) and \(\dot H{M_{{\rm{Pl}}}}\) fixed, or equivalently \({H^2}/{m^2}\) and \(\dot H/{m^2}\) fixed. Despite the fact that H → 0 in this limit, the analogue of the Friedmann equation remains nontrivial if we also scale the energy density such that ρ/MPl remains finite. Because of this fact, it is possible to analyze the modification to the Friedmann equation in the decoupling limit (see Footnote 31).

The generic form for the helicity-0 mode which preserves isotropy near \(\vec x = 0\) is

$$\pi = A(t) + B(t){\vec{x}^2} + \ldots.$$
(12.3)

In the specific case where \(\dot B(t) = 0\), this also preserves homogeneity in a theory in which the Galileon symmetry is exact, as in massive gravity, since a translation in \(\vec x\) corresponds to a Galileon transformation of π which leaves invariant the combination \({\partial _\mu}{\partial _\nu}\pi\). In Ref. [139] this ansatz was used to derive the existence of both self-accelerating and screening solutions.

15.1.1 Friedmann equation in the decoupling limit

We start with the decoupling limit Lagrangian given in (8.38). Following the same notation as in Ref. [139] we set an = −cn/2, where the coefficients cn are given in terms of the αn’s in (8.47). The self-accelerating branch of solutions then corresponds to the ansatz

$$\pi = {1 \over 2}{q_0}\Lambda _3^3{x_\mu}{x^\mu} + \phi$$
(12.4)
$${h_{\mu \nu}} = - {1 \over 2}H_{{\rm{dS}}}^2{x_\alpha}{x^\alpha}{M_{{\rm{Pl}}}}{\eta _{\mu \nu}} + {\chi _{\mu \nu}}$$
(12.5)
$${T_{\mu \nu}} = - \lambda {\eta _{\mu \nu}} + {\tau _{\mu \nu}},$$
(12.6)

where \(\phi\), \({\chi _{\mu \nu}}\) and \({\tau _{\mu \nu}}\) correspond to the fluctuations about the background solution.

For this ansatz, the background equations of motion reduce to

$${H_{{\rm{dS}}}}\left({{a_1} + 2{a_2}{q_0} + 3{a_3}q_0^2} \right) = 0$$
(12.7)
$$H_{{\rm{dS}}}^2 = {\lambda \over {3M_{{\rm{Pl}}}^2}} + {{2\Lambda _3^3} \over {{M_{{\rm{Pl}}}}}}({a_1}{q_0} + {a_2}q_0^2 + {a_3}q_0^3).$$
(12.8)

In the ‘self-accelerating branch’, where HdS ≠ 0, the first constraint can be used to infer q0 and the second one corresponds to the effective Friedmann equation. We see that even in the absence of a cosmological constant, λ = 0, for generic coefficients we have a constant HdS solution which corresponds to a self-accelerating de Sitter solution.

The stability of these solutions can be analyzed by looking at the Lagrangian for the quadratic fluctuations

$${\mathcal{L}^{(2)}} = - {1 \over 4}{\chi ^{\mu \nu}}\hat{\mathcal{E}}_{\mu \nu}^{\alpha \beta}{\chi _{\alpha \beta}} + {{6H_{{\rm{dS}}}^2{M_{{\rm{Pl}}}}} \over {\Lambda _3^3}}({a_2} + 3{a_3}{q_0})\phi\square \phi + {1 \over {{M_{{\rm{Pl}}}}}}{\chi ^{\mu \nu}}{\tau _{\mu \nu}}.$$
(12.9)

Thus, we see that the helicity zero mode is stable provided that

$$(a_2+ 3 a_3 q_0) >0.$$
(12.10)

However, these solutions exhibit a peculiarity. To this order, the helicity-0 mode fluctuations do not couple to the matter perturbations (there is no kinetic mixing between π and χμν). This means that there is no Vainshtein effect, but at the same time there is no vDVZ discontinuity for the Vainshtein effect to resolve!
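The self-accelerating branch can be explored numerically by solving (12.7) for q0, inserting the result in (12.8) with λ = 0, and testing (12.10). The sketch below does this under stated assumptions: the function name is hypothetical, \(H_{{\rm{dS}}}^2\) is returned in units of \(\Lambda _3^3/{M_{{\rm{Pl}}}}\), and the coefficient values are illustrative only. On a root of (12.7) one has \({a_2} + 3{a_3}{q_0} = \pm \sqrt {a_2^2 - 3{a_1}{a_3}}\), so at most one of the two roots can satisfy (12.10).

```python
import math


def self_accelerating_branches(a1, a2, a3):
    """Real roots q0 of Eq. (12.7) with H_dS != 0, for lambda = 0.

    Returns a list of (q0, H2, stable), where H2 is H_dS^2 from Eq. (12.8)
    in units of Lambda_3^3 / M_Pl and stable is the condition of Eq. (12.10).
    """
    if a3 == 0:
        if a2 == 0:
            return []
        roots = [-a1 / (2.0 * a2)]
    else:
        disc = a2**2 - 3.0 * a1 * a3
        if disc < 0:
            return []
        s = math.sqrt(disc)
        roots = [(-a2 + s) / (3.0 * a3), (-a2 - s) / (3.0 * a3)]
    results = []
    for q0 in roots:
        h2 = 2.0 * (a1 * q0 + a2 * q0**2 + a3 * q0**3)
        stable = (a2 + 3.0 * a3 * q0) > 0
        results.append((q0, h2, stable))
    return results


# Illustrative (hypothetical) coefficients.
for q0, h2, stable in self_accelerating_branches(-0.5, -1.0, -0.64):
    print(f"q0={q0:+.4f}  H_dS^2={h2:+.5f}  stable={stable}")
```

A root is a candidate self-accelerating de Sitter background only if the returned `H2` is positive and `stable` is true. For a3 = 0 these two requirements cannot both hold, since then \(H_{{\rm{dS}}}^2 \propto - a_1^2/(2{a_2})\) while (12.10) requires a2 > 0.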

15.1.2 Screening solution

Another way to solve the system of Eqs. (12.7) and (12.8) is to consider instead flat solutions HdS = 0. Then (12.7) is trivially satisfied and we see the existence of a ‘screening solution’ in the Friedmann equation (12.8), which can accommodate a cosmological constant without any acceleration. This occurs when the helicity-0 mode ‘absorbs’ the contribution from the cosmological constant λ, and the background configuration for π parametrized by q0 satisfies

$$({a_1}{q_0} + {a_2}q_0^2 + {a_3}q_0^3) = - {\lambda \over {6{M_{{\rm{Pl}}}}\Lambda _3^3}}.$$
(12.11)

Perturbations about this screened configuration then behave as

$${\mathcal{L}^{(2)}} = - {1 \over 2}{\chi ^{\mu \nu}}\hat{\mathcal{E}}_{\mu \nu}^{\alpha \beta}{\chi _{\alpha \beta}} + {3 \over 2}\phi\square \phi + {1 \over {{M_{{\rm{Pl}}}}}}({\chi ^{\mu \nu}} + {\eta ^{\mu \nu}}\phi){\tau _{\mu \nu}}.$$
(12.12)

In this case the perturbations are stable, and the Vainshtein mechanism is present which is necessary to resolve the vDVZ discontinuity. Furthermore, since the background contribution to the metric perturbation vanishes hμν = 0, they correspond to Minkowski solutions which are sourced by a nonzero cosmological constant. In the case where a3 = 0, these solutions only exist if \(\lambda < {M_{{\rm{Pl}}}}\Lambda _3^3{{3a_1^2} \over {2{a_2}}}\). In the case where a3 ≠ 0, there is no upper bound on the cosmological constant which can be screened via this mechanism.
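The bound quoted for a3 = 0 follows from requiring a real background value of q0. As a short sketch, assuming a2 > 0 (an assumption made here so that the inequality takes the stated direction), setting a3 = 0 in (12.11) gives the quadratic

$${a_2}q_0^2 + {a_1}{q_0} + {\lambda \over {6{M_{{\rm{Pl}}}}\Lambda _3^3}} = 0,\quad {q_0} = {1 \over {2{a_2}}}\left({- {a_1} \pm \sqrt {a_1^2 - {{2{a_2}\lambda} \over {3{M_{{\rm{Pl}}}}\Lambda _3^3}}}} \right),$$

which has real roots only if \(\lambda \leq {M_{{\rm{Pl}}}}\Lambda _3^3{{3a_1^2} \over {2{a_2}}}\), with equality at the double root. For a3 ≠ 0, (12.11) is a cubic in q0 with real coefficients and therefore always has at least one real root, so no such bound arises.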

In this branch of solutions, the strong coupling scale for fluctuations on top of this configuration becomes of the same order of magnitude as that of the screened cosmological constant. For a large cosmological constant the strong coupling scale becomes too large and the helicity-0 mode would thus not be sufficiently Vainshtein screened.

Thus, while these solutions seem to indicate positively that there are self-screening solutions which can accommodate a continuous range of values for the cosmological constant and still remain flat, the range is too small to significantly change the old cosmological constant problem. Nevertheless, the considerable difficulty in attacking the old cosmological constant problem means that these solutions deserve further attention, as they also provide a proof of principle on how Weinberg’s no-go theorem could be evaded [484]. We emphasize that what prevents a large cosmological constant from being screened is not an issue of theoretical tuning but rather an observational bound, so this is already a step forward.

These two classes of solutions are both maximally symmetric. However, the general cosmological solution is isotropic but inhomogeneous. This is due to the fact that a nontrivial time dependence for the matter source will inevitably source B (t), and as soon as \(\dot B \ne 0\) the solutions are inhomogeneous. In fact, as we now explain in general, the full nonlinear solution is inevitably inhomogeneous due to the existence of a no-go theorem against spatially flat and closed FLRW solutions.

15.2 FLRW solutions in the full theory

15.2.1 Absence of flat/closed FLRW solutions

A nontrivial consequence of the fact that diffeomorphism invariance is broken in massive gravity is that there are no spatially flat or closed FLRW solutions [117]. This result follows from the different nature of the Hamiltonian constraint. For instance, choosing a spatially flat form for the metric \({\rm{d}}{s^2} = - {\rm{d}}{t^2} + a{(t)^2}{\rm{d}}{\vec x^2}\), the mini-superspace Lagrangian takes the schematic form

$$\mathcal{L} = - 3M_{{\rm{Pl}}}^2{{a{{\dot a}^2}} \over N} + {F_1}(a) + {F_2}(a)N.$$
(12.13)

Consistency of the constraint equation obtained from varying with respect to N and the acceleration equation for ä implies

$${{\partial {F_1}(a)} \over {\partial t}} = \dot a{{\partial {F_1}(a)} \over {\partial a}} = 0.$$
(12.14)
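The step leading to (12.14) can be made explicit. The following sketch ignores matter for simplicity and sets N = 1 after varying, as in the metric ansatz above; primes denote derivatives with respect to a. Varying (12.13) with respect to N and with respect to a gives, respectively,

$$3M_{{\rm{Pl}}}^2a{\dot a^2} + {F_2}(a) = 0,\quad \quad 3M_{{\rm{Pl}}}^2({\dot a^2} + 2a\ddot a) + F_1{\prime}(a) + F_2{\prime}(a) = 0.$$

The time derivative of the first equation is

$$\dot a\left[ {3M_{{\rm{Pl}}}^2({{\dot a}^2} + 2a\ddot a) + F_2{\prime}(a)} \right] = 0,$$

and subtracting it from \(\dot a\) times the second equation leaves \(\dot aF_1{\prime}(a) = 0\), which is (12.14).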

In GR, since F1(a) = 0, there is no analogue of this equation. In the present case, this equation can be solved either by imposing \(\dot a = 0\), which implies the absence of any dynamic FLRW solutions, or by solving \(\partial {F_1}(a)/\partial a = 0\) for fixed a, which implies the same thing. Thus, there are no nontrivial spatially flat FLRW solutions in massive gravity in which the reference metric is Minkowski. The result extends also to spatially closed cosmological solutions. As a result, different alternatives have been explored in the literature to study the cosmology of massive gravity. See Figure 7 for a summary of these different approaches.

Figure 7: Alternative ways of deriving the cosmology in massive gravity.

15.2.2 Open FLRW solutions

While the previous argument rules out the possibility of spatially flat and closed FLRW solutions, open ones are allowed [283]. To see this we make the ansatz \({\rm{d}}{s^2} = - {\rm{d}}{t^2} + a{(t)^2}{\rm{d}}\Omega _{{H^3}}^2\), where \({\rm{d}}\Omega _{{H^3}}^2\) expressed in the form

$${\rm{d}}\Omega _{{H^3}}^2 = {\rm{d}}{\vec{x}^2} - \vert k\vert {{{{(\vec{x}.\;{\rm{d}}\vec{x})}^2}} \over {(1 + \vert k\vert {{\vec{x}}^2})}} = {{{\rm{d}}{r^2}} \over {1 + \vert k\vert {r^2}}} + {r^2}\,{\rm{d}}\Omega _{{S^2}}^2,$$
(12.15)

is the metric on a hyperbolic space, and express the reference metric in terms of Stückelberg fields \({\tilde f_{\mu \nu}}\,{\rm{d}}{x^\mu}\,{\rm{d}}{x^\nu} = {\eta _{ab}}{\partial _\mu}{\phi ^a}{\partial _\nu}{\phi ^b}\) with

$${\phi ^0} = f(t)\sqrt {1 + \vert k\vert {{\vec{x}}^2}} ,$$
(12.16)
$${\phi ^i} = \sqrt {\vert k\vert} f(t){x^i}.$$
(12.17)

Then the mini-superspace Lagrangian of (6.3) takes the form

$$\begin{array}{*{20}c} {{\mathcal{L}_{{\rm{mGR}}}} = - 3M_{{\rm{Pl}}}^2\vert k\vert Na - 3M_{{\rm{Pl}}}^2{{a{{\dot a}^2}} \over N} + 3{m^2}M_{{\rm{Pl}}}^2\left[ {2{a^2}\mathcal{X}(2Na - \dot fa - N\sqrt {\vert k\vert} f)} \right.} \\ {\left. {+ {\alpha _3}{a^2}{\mathcal{X}^2}(4Na - 3\dot fa - N\sqrt {\vert k\vert} f) + 4{\alpha _4}{a^3}{\mathcal{X}^3}(N - \dot f)} \right],\quad \quad} \\ \end{array}$$

with \({\mathcal X} = 1 - {{\sqrt {\vert k\vert} f} \over a}\). In this case, the analogous additional constraint imposed by consistency of the Friedmann and acceleration (Raychaudhuri) equations is

$$a\mathcal{X}\left({\left({3 - {{2\sqrt {\vert k\vert} f} \over a}} \right) + {3 \over 2}{\alpha _3}\left({3 - {{\sqrt {\vert k\vert} f} \over a}} \right)\mathcal{X} + 6{\alpha _4}{\mathcal{X}^2}} \right) = 0.$$
(12.18)

The solution for which \({\mathcal X} = 0\) is essentially Minkowski spacetime in the open slicing, and is thus uninteresting as a cosmology.

Focusing on the other branch and assuming \({\mathcal X} \ne 0\), the general solution determines f (t) in terms of a (t) and takes the form \(f(t) = {1 \over {\sqrt {\vert k\vert}}}u\,a(t)\), where u is a constant determined by the quadratic equation

$$3 - 2u + {3 \over 2}{\alpha _3}\left({3 - u} \right)(1 - u) + 6{\alpha _4}{(1 - u)^2} = 0.$$
(12.19)

The resulting Friedmann equation is then

$$3M_{{\rm{Pl}}}^2{H^2} - {{3M_{{\rm{Pl}}}^2\vert k\vert} \over {{a^2}}} = \rho + 2{m^2}{\rho _m},$$
(12.20)

where

$${\rho _m} = - (1 - u)\left({3\left({2 - u} \right) + {3 \over 2}{\alpha _3}\left({4 - u} \right)(1 - u) + 6{\alpha _4}{{(1 - u)}^2}} \right).$$
(12.21)
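Equations (12.19) and (12.21) are simple enough to evaluate directly. The sketch below expands (12.19) as a quadratic in u, solves it, and evaluates ρm on each real root; the function names are hypothetical and the values of α3 and α4 are illustrative.

```python
import math


def constraint(u, alpha3, alpha4):
    """Left-hand side of Eq. (12.19)."""
    return (3.0 - 2.0 * u
            + 1.5 * alpha3 * (3.0 - u) * (1.0 - u)
            + 6.0 * alpha4 * (1.0 - u) ** 2)


def rho_m(u, alpha3, alpha4):
    """Effective mass-term contribution of Eq. (12.21)."""
    return -(1.0 - u) * (3.0 * (2.0 - u)
                         + 1.5 * alpha3 * (4.0 - u) * (1.0 - u)
                         + 6.0 * alpha4 * (1.0 - u) ** 2)


def open_flrw_branches(alpha3, alpha4):
    """Real roots u of Eq. (12.19), each paired with rho_m(u)."""
    # Coefficients of A u^2 + B u + C = 0 obtained by expanding Eq. (12.19).
    A = 1.5 * alpha3 + 6.0 * alpha4
    B = -2.0 - 6.0 * alpha3 - 12.0 * alpha4
    C = 3.0 + 4.5 * alpha3 + 6.0 * alpha4
    if A == 0:
        roots = [-C / B] if B != 0 else []
    else:
        disc = B * B - 4.0 * A * C
        if disc < 0:
            roots = []
        else:
            s = math.sqrt(disc)
            roots = [(-B + s) / (2.0 * A), (-B - s) / (2.0 * A)]
    return [(u, rho_m(u, alpha3, alpha4)) for u in roots]


# Illustrative (hypothetical) parameter choices.
for alphas in ((0.0, 0.0), (1.0, 0.0)):
    for u, rho in open_flrw_branches(*alphas):
        print(f"alpha3={alphas[0]}, alpha4={alphas[1]}:  u={u:.4f}  rho_m={rho:.4f}")
```

The value u = 1, which would correspond to the discarded branch \({\mathcal X} = 0\), never solves (12.19), since the left-hand side equals 1 there.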

Despite the positive existence of open FLRW solutions in massive gravity, there remain problems of either strong coupling (due to absence of quadratic kinetic terms for physical degrees of freedom) or other instabilities which essentially rule out the physical relevance of these FLRW solutions [285, 125, 464].

15.3 Inhomogeneous/anisotropic cosmological solutions

As pointed out in [117], the absence of FLRW solutions in massive gravity should not be viewed as an observational flaw of the theory. On the contrary, the Vainshtein mechanism guarantees that there exist inhomogeneous cosmological solutions which approximate the normal FLRW solutions of GR as closely as desired in the limit m → 0. Rather, it is the existence of a new physical length scale 1/m in massive gravity which causes the dynamics to be inhomogeneous at cosmological scales. If this scale 1/m is comparable to or larger than the current Hubble radius, then the effects of these inhomogeneities would only become apparent today, with the universe locally appearing as homogeneous for most of its history in the local patch that we observe.

One way to understand how the Vainshtein mechanism recovers the prediction of homogeneity and isotropy is to work in the formulation of massive gravity in which the Stückelberg fields are turned on. In this formulation, the Stückelberg fields can exhibit order unity inhomogeneities with the metric remaining approximately homogeneous. Matter that couples only to the metric will perceive an effectively homogeneous and isotropic universe, and only through interaction with the Vainshtein suppressed additional scalar and vector degrees of freedom would it be possible to perceive the inhomogeneities. This is achieved because the metric is sourced by the Stückelberg fields through terms in the equations of motion which are suppressed by \({m^2}\). Thus, as long as \(R \gg {m^2}\), the metric remains effectively homogeneous and isotropic despite the existence of no-go theorems against exact homogeneity and isotropy.

In this regard, a whole range of exact solutions have been studied exhibiting these properties [364, 474, 363, 97, 264, 356, 456, 491, 476, 334, 478, 265, 124, 123, 125, 455, 198]. A generalization of some of these solutions was presented in Ref. [404] and Ref. [266]. In particular, we note that in [475, 476] the most general exact solution of massive gravity is obtained in which the metric is homogeneous and isotropic with the Stückelberg fields inhomogeneous. These solutions exist because the effective contribution to the stress energy tensor from the mass term (i.e., viewing the mass term corrections as a modification to the energy density) remains homogeneous and isotropic despite the fact that it is built out of Stückelberg fields which are themselves inhomogeneous.

Let us briefly discuss how these solutions are obtained (see Footnote 32). As we have already discussed, all solutions of massive gravity can be seen as Mf → ∞ decoupling limits of bi-gravity. Therefore, we may consider the case of inhomogeneous solutions in bi-gravity and the solutions of massive gravity can always be derived as a limit of these bi-gravity solutions. We thus begin with the action

$$\begin{array}{*{20}c} {S = {{M_{{\rm{Pl}}}^2} \over 2}\int {{{\rm{d}}^4}} x\sqrt {- g} R[g] + {{M_f^2} \over 2}\int {{{\rm{d}}^4}} x\sqrt {- f} R[f]\quad \,\,} \\ {+ {{{m^2}M_{{\rm{eff}}}^2} \over 4}\int {{{\rm{d}}^4}} x\sqrt {- g} \sum\limits_n {{{{\beta _n}} \over {n!}}} {\mathcal{L}_n}(\sqrt{\mathbb{X}}) + {\rm{Matter}},} \\ \end{array}$$
(12.22)

where \(M_{{\rm{eff}}}^{- 2} = M_{{\rm{Pl}}}^{- 2} + M_f^{- 2}\) and \(\sqrt {\mathbb X} = \sqrt {{g^{- 1}}f}\). We may imagine matter coupled to both f and g, but for simplicity let us imagine matter is either minimally coupled to g or minimally coupled to f.

15.3.1 Special isotropic and inhomogeneous solutions

Although it is possible to find solutions in which the two metrics are proportional to each other, \({f_{\mu \nu}} = {C^2}{g_{\mu \nu}}\) [478], these solutions require in addition that the stress energies of matter sourcing f and g are proportional to one another. This is clearly too restrictive a condition to be phenomenologically interesting. A more general and physically realistic assumption is to suppose that both metrics are isotropic but not necessarily homogeneous. This is covered by the ansatz

$${\rm{d}}s_g^2 = - Q{(t,r)^2}\,{\rm{d}}{t^2} + N{(t,r)^2}{\rm{d}}{r^2} + {R^2}{{\rm{d}}^2}{\Omega _{{S^2}}},$$
(12.23)
$$\begin{array}{*{20}c} {{\rm{d}}s_f^2 = - {{(a(t,r)Q{{(t,r)}^2}\,{\rm{d}}t + c(t,r)N(t,r){\rm{d}}r)}^2}\quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad} \\ {+ {{(c(t,r)Q(t,r){\rm{d}}t - b(t,r)n(t,r)N(t,r){\rm{d}}r)}^2} + u{{(t,r)}^2}{R^2}\,{{\rm{d}}^2}{\Omega _{{S^2}}},} \\ \end{array}$$
(12.24)

where \({{\rm{d}}^{\rm{2}}}{\Omega _{{S^2}}}\) is the metric on a unit 2-sphere. To put the g metric in diagonal form we have made use of the one copy of overall diff invariance present in bi-gravity. To distinguish from the bi-diagonal case we shall assume that c (t, r) ≠ 0. The bi-diagonal case allows for homogeneous and isotropic solutions for both metrics, which will be dealt with in Section 12.4.2. The square root may be easily taken to give

$$\sqrt{\mathbb{X}}= \left({\begin{array}{*{20}c} a & {cN/Q} & 0 & 0 \\ {- cN/Q} & b & 0 & 0 \\ 0 & 0 & u & 0 \\ 0 & 0 & 0 & u \\ \end{array}} \right),$$
(12.25)

which can easily be used to determine the contribution of the mass terms to the equations of motion for f and g. This leads to a set of partial differential equations for Q, R, N, n, c, b which in general require numerical analysis. As in GR, due to the presence of constraints associated with diffeomorphism invariance, and the Hamiltonian constraint for the massive graviton, several of these equations will be first order in time-derivatives. This simplifies matters somewhat but not sufficiently to make analytic progress. Analytic progress can be made however by making additional more restrictive assumptions, at the cost of potentially losing the most physically interesting solutions.

15.3.1.1 Effective cosmological constant

For instance, from the above form we may determine that the effective contribution to the stress energy tensor sourcing g arising from the mass term is of the form

$${T_{{\rm{mass}}}}_{\;r}^0 = - {m^2}{{M_{{\rm{eff}}}^2} \over {M_{{\rm{Pl}}}^2}}{{cN} \over Q}\left({{3 \over 2}{\beta _1} + {\beta _2}u + {1 \over 4}{\beta _3}{u^2}} \right).$$
(12.26)

If we make the admittedly restrictive assumption that the metric is of the FLRW form or is static, then this requires that \(T_{\,\,\,\,r}^0 = 0\), which for c ≠ 0 implies

$${3 \over 2}{\beta _1} + {\beta _2}u + {1 \over 4}{\beta _3}{u^2} = 0.$$
(12.27)

This should be viewed as an equation for u (t, r) whose solution is

$$u(t,r) = u = - {{2{\beta _2}} \over {{\beta _3}}} \pm {1 \over {{\beta _3}}}\sqrt {4\beta _2^2 - 6{\beta _1}{\beta _3}} .$$
(12.28)
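For completeness, (12.28) is the quadratic formula applied to (12.27). Multiplying (12.27) by 4 gives

$${\beta _3}{u^2} + 4{\beta _2}u + 6{\beta _1} = 0\quad \Rightarrow \quad u = {{- 4{\beta _2} \pm \sqrt {16\beta _2^2 - 24{\beta _1}{\beta _3}}} \over {2{\beta _3}}} = - {{2{\beta _2}} \over {{\beta _3}}} \pm {1 \over {{\beta _3}}}\sqrt {4\beta _2^2 - 6{\beta _1}{\beta _3}} ,$$

so real solutions require β3 ≠ 0 and \(4\beta _2^2 \geq 6{\beta _1}{\beta _3}\). In the degenerate case β3 = 0 (with β2 ≠ 0) the equation is linear and \(u = - 3{\beta _1}/(2{\beta _2})\).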

Conservation of energy then further imposes

$$\begin{array}{*{20}c} {{T_{{\rm{mass}}}}_{\;0}^0 - {T_{{\rm{mass}}}}_{\;\theta}^\theta = - {m^2}{{M_{{\rm{eff}}}^2} \over {M_{{\rm{Pl}}}^2}}\left({{1 \over 2}{\beta _2} + {1 \over 4}{\beta _3}u} \right)\left({(u - a)(u - b) + {c^2}} \right)} \\ {= 0,\quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad} \\ \end{array}$$
(12.29)

Since u is already fixed, we should view this generically as an equation for c (t, r) in terms of a (t, r) and b (t, r):

$$(u - a)(u - b) + {c^2} = 0.$$
(12.30)

With these assumptions the contribution of the mass term to the effective stress energy tensor sourcing each metric becomes equivalent to a cosmological constant for each metric, \({T_{{\rm{mass}}}}_{\,\,\,\nu}^\mu (g) = - {\Lambda _g}{\delta ^\mu}_\nu\) and \({T_{{\rm{mass}}}}_{\,\,\,\nu}^\mu (f) = - {\Lambda _f}{\delta ^\mu}_\nu\), with

$${\Lambda _g} = - {{{m^2}M_{{\rm{eff}}}^2} \over {M_{{\rm{Pl}}}^2}}\left({6{\beta _0} + 2{\beta _1}u + {1 \over 2}{\beta _2}{u^2}} \right),$$
(12.31)
$${\Lambda _f} = - {{{m^2}M_{{\rm{eff}}}^2} \over {M_f^2{u^2}}}\left({{1 \over 2}{\beta _2} + {1 \over 2}{\beta _3}u + {1 \over 4}{\beta _4}{u^2}} \right).$$
(12.32)

Thus, all of the potential dynamics of the mass term is reduced to an effective cosmological constant. Let us stress again that this rather special fact is dependent on the rather restrictive assumptions imposed on the metric g and that we certainly do not expect this to be the case for the most general time-dependent, isotropic, inhomogeneous solution.

15.3.1.2 Massive gravity limit

As usual, we can take the Mf → ∞ limit to recover solutions for massive gravity on Minkowski (if Λf → 0). More generally, if the scaling of the parameters βn is chosen so that Λf and \(6{\beta _0} + 2{\beta _1}u + {1 \over 2}{\beta _2}{u^2}\), and hence Λg, remain finite in the limit, then these will give rise to solutions for massive gravity for which the reference metric is any Einstein space for which

$${G_{\mu \nu}}(f) = - {\Lambda _f}{f_{\mu \nu}}.$$
(12.33)

For example, this includes the interesting cases of de Sitter and anti-de Sitter reference metrics.

Thus, for example, assuming no additional matter couples to the f metric, both bi-gravity and massive gravity on a fixed reference metric admit exact cosmological solutions for which the f metric is de Sitter or anti-de Sitter

$${\rm{d}}s_g^2 = - {\rm{d}}{t^2} + a{(t)^2}\left({{{{\rm{d}}{r^2}} \over {1 - k{r^2}}} + {r^2}\,{\rm{d}}{\Omega ^2}} \right)$$
(12.34)
$${\rm{d}}s_f^2 = - \Delta (U)\,{\rm{d}}{T^2} + \left({{{{\rm{d}}{U^2}} \over {\Delta (U)}} + {U^2}\,{\rm{d}}{\Omega ^2}} \right),$$
(12.35)

where \(\Delta (U) = 1 - {{{\Lambda _f}} \over 3}{U^2}\), and the scale factor a (t) satisfies

$$3M_{{\rm{Pl}}}^2\left({{H^2} + {k \over {{a^2}}}} \right) = {\Lambda _g} + {\rho _M}(t),$$
(12.36)

where ρM (t) is the energy density of matter minimally coupled to g, \(H = \dot a/a\), and U, T can be expressed as functions of r and t; comparing with the previous representation, U = ur. The one remaining undetermined function is T (t, r), and this is determined by the constraint \((u - a)(u - b) + {c^2} = 0\) and the conversion relations

$$\sqrt {\Delta (U)} \,{\rm{d}}T = a(t,r)\,\,{\rm{d}}t + c(t,r){1 \over {\sqrt {1 - k{r^2}}}}\,\,{\rm{d}}r,$$
(12.37)
$${1 \over {\sqrt {\Delta (U)}}}\,{\rm{d}}U = c(t,r)\,{\rm{d}}t - b(t,r)n(t,r){1 \over {\sqrt {1 - k{r^2}}}}\,{\rm{d}}r,$$
(12.38)
$$U(t,r) = ur,$$
(12.39)

which determine b (t, r) and c (t, r) in terms of \(\dot T\) and T′. These relations are difficult to solve exactly, but if we consider the special case Λf = 0, which corresponds in particular to massive gravity on Minkowski, then the solution is

$$T(t,r) = q\int\nolimits^t {\,{\rm{d}}t{1 \over {\dot a}} + \left({{{{u^2}} \over {4q}} + q{r^2}} \right)a,}$$
(12.40)

where q is an integration constant.

In particular, in the open universe case \({\Lambda _f} = 0,\,k = 0,\,q = u,\,T = ua\sqrt {1 + \vert k\vert{r^2}}\), we recover the open universe solution of massive gravity considered in Section 12.2.2, where for comparison \(f(t) = {1 \over {\sqrt {\vert k\vert}}}ua(t),\,{\phi ^0}(t,r) = T(t,r)\) and ϕr = U (t, r).

15.3.2 General anisotropic and inhomogeneous solutions

Let us reiterate that there is a large class of inhomogeneous but isotropic cosmological solutions for which the effective Friedmann equation for the g metric is the same as in GR with just the addition of a cosmological constant which depends on the graviton mass parameters. However, these are not the most general solutions, and as we have already discussed, many of the exact solutions of this form considered so far have been found to be unstable, in particular through the absence of kinetic terms for degrees of freedom, which implies infinite strong coupling. However, all the exact solutions arise from making a strong restriction on one or the other of the metrics, which is not expected to be the case in general. Thus, the search for the ‘correct’ cosmological solution of massive gravity and bi-gravity will almost certainly require a numerical solution of the general equations for Q, R, N, n, c, b, and an analysis of their stability.

Closely related to this, we may consider solutions which maintain homogeneity, but are anisotropic [284, 393, 123]. In [393] the general Bianchi class A cosmological solutions in bi-gravity are studied. There it is shown that the generic anisotropic cosmological solution in bi-gravity asymptotes to a self-accelerating solution, with an acceleration determined by the mass terms, but with an anisotropy that falls off less rapidly than in GR. In particular the anisotropic contribution to the effective energy density redshifts like non-relativistic matter. In [284, 123] it is found that if the reference metric is made to be of an anisotropic FLRW form, then for a range of parameters and initial conditions stable ghost free cosmological solutions can be found.

These analyses are ongoing and it has been uncovered that certain classes of exact solutions exhibit strong coupling instabilities due to vanishing kinetic terms and related pathologies. However, this simply indicates that these solutions are not good semi-classical backgrounds. The general inhomogeneous cosmological solution (for which the metric is also inhomogeneous) is not known at present, and it is unlikely that it will be possible to obtain it exactly. It is therefore unclear what the precise nonlinear completions are of the stable inhomogeneous cosmological solutions that can be found in the decoupling limit. The understanding of the cosmology of massive gravity should thus be regarded as very much work in progress: at present it is unclear which semi-classical solutions of massive gravity are the most relevant for connecting with our observed cosmological evolution.

15.4 Massive gravity on FLRW and bi-gravity

15.4.1 FLRW reference metric

One straightforward extension of the massive gravity framework is to allow for modifications to the reference metric, either by making it cosmological or by extending to bi-gravity (or multi-gravity). In the former case, the no-go theorem is immediately avoided since if the reference metric is itself an FLRW geometry, there can no longer be any obstruction to finding FLRW geometries.

The case of massive gravity with a spatially flat FLRW reference metric was worked out in [223], where it was found that, using the convention for which the massive gravity Lagrangian is (6.5) with the potential given in terms of the coefficients βn as in (6.23), the Friedmann equation takes the form

$${H^2} = - \sum\limits_{n = 0}^3 {{{3(4 - n){m^2}{\beta _n}} \over {2n!}}} {\left({{b \over a}} \right)^n} + {1 \over {3M_{{\rm{Pl}}}^2}}\rho .$$
(12.41)

Here the dynamical and reference metrics are taken in the form

$${g_{\mu \nu}}\,{\rm{d}}{x^\mu}\,{\rm{d}}{x^\nu} = - {\rm{d}}{t^2} + a{(t)^2}\,{\rm{d}}{\vec{x}^2},$$
(12.42)
$${f_{\mu \nu}}\,{\rm{d}}{x^\mu}{\rm{d}}{x^\nu} = - {M^2}\,{\rm{d}}{t^2} + b{(t)^2}\,{\rm{d}}{\vec{x}^2}$$
(12.43)

and the Hubble parameters are related by

$$\sum\limits_{n = 0}^2 {{{(3 - n){\beta _{n + 1}}} \over {2n!}}} {\left({{b \over a}} \right)^{n + 1}}\left({{H \over b} - {{{H_f}} \over a}} \right) = 0.$$
(12.44)

Ensuring a nonzero ghost-free kinetic term in the vector sector requires us to always solve this equation with

$${H \over {{H_f}}} = {b \over a},$$
(12.45)

so that the Friedmann equation takes the form

$${H^2} = - \sum\limits_{n = 0}^3 {{{3(4 - n){m^2}{\beta _n}} \over {2n!}}} {\left({{H \over {{H_f}}}} \right)^n} + {1 \over {3M_{{\rm{Pl}}}^2}}\rho ,$$
(12.46)

where Hf is the Hubble parameter for the reference metric. By itself, this Friedmann equation looks healthy in the sense that it admits FLRW solutions that can be made as close as desired to the usual solutions of GR.

However, in practice, the generalization of the Higuchi consideration [307] to this case leads to an unacceptable bound (see Section 8.3.6).

It is a straightforward consequence of the representation theory for the de Sitter group that a unitary massive spin-2 representation only exists in four dimensions for \({m^2} \geq 2{H^2}\) as was the case in de Sitter. Although this result only holds for linearized fluctuations around de Sitter, its origin as a bound comes from the requirement that the kinetic term for the helicity zero mode is positive, i.e., the absence of ghosts in the scalar perturbations sector. In particular, the kinetic term for the helicity-0 mode π takes the form

$${\mathcal{L}_{{\rm{helicity - 0}}}} \propto - {m^2}({m^2} - 2{H^2}){(\partial \pi)^2}.$$
(12.47)

Thus, there should exist an appropriate generalization of this bound for any cosmological solution of nonlinear massive gravity for which there is an FLRW reference metric.

This generalized bound was worked out in [223] and takes the form

$$- {{{m^2}} \over {4M_{{\rm{Pl}}}^2}}{H \over {{H_f}}}\left[ {3{\beta _1} + 4{\beta _2}{H \over {{H_f}}} + {\beta _3}{{{H^2}} \over {H_f^2}}} \right] \geq 2{H^2}.$$
(12.48)

Again, by itself this equation is easy to satisfy. However, combined with the Friedmann equation, we see that the two equations are generically in conflict if in addition we require that the massive gravity corrections to the Friedmann equation are small for most of the history of the Universe, i.e., during radiation and matter domination

$$- \sum\limits_{n = 0}^3 {{{3(4 - n){m^2}{\beta _n}} \over {2n!}}} {\left({{H \over {{H_f}}}} \right)^n} \leq {H^2}.$$
(12.49)

This phenomenological requirement essentially rules out the applicability of FLRW cosmological solutions in massive gravity with an FLRW reference metric.
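The conflict can be illustrated numerically. The following Python sketch evaluates the left-hand sides of (12.48) and (12.49) exactly as written above and scans a few trial values of \({H^2}\). The function names, the units \({M_{{\rm{Pl}}}} = m = 1\), and the sample parameter choice (only β1 nonzero) are assumptions made purely for illustration and are not taken from [223].

```python
from math import factorial


def mass_term(betas, ratio, m2=1.0):
    """Left-hand side of (12.49): the massive gravity correction to H^2.

    betas = (beta_0, beta_1, beta_2, beta_3); ratio = H / H_f.
    """
    return -sum(
        3 * (4 - n) * m2 * betas[n] / (2 * factorial(n)) * ratio**n
        for n in range(4)
    )


def higuchi_lhs(betas, ratio, m2=1.0, mpl2=1.0):
    """Left-hand side of the generalized Higuchi bound (12.48)."""
    _, b1, b2, b3 = betas
    return -(m2 / (4 * mpl2)) * ratio * (3 * b1 + 4 * b2 * ratio + b3 * ratio**2)


def both_satisfied(betas, ratio, hubble2, m2=1.0, mpl2=1.0):
    """True if (12.48) and (12.49) hold simultaneously for a given H^2."""
    stable = higuchi_lhs(betas, ratio, m2, mpl2) >= 2 * hubble2
    small = mass_term(betas, ratio, m2) <= hubble2
    return stable and small


# Hypothetical example: only beta_1 is nonzero (and negative).
betas = (0.0, -1.0, 0.0, 0.0)
trial_h2 = [10.0**k for k in range(-3, 4)]
print([both_satisfied(betas, 1.0, h2) for h2 in trial_h2])
```

In this β1-only example the correction term in (12.49) is six times the combination entering (12.48), independently of H/Hf, so demanding that the latter exceeds \(2{H^2}\) forces the former to exceed \(12{H^2}\). The scan therefore finds no value of \({H^2}\) for which both conditions hold.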

This latter problem, which is severe for massive gravity with dS or FLRW reference metrics (see Footnote 33), gets resolved in bi-gravity extensions, at least for a finite regime of parameters.

15.4.2 Bi-gravity

Cosmological solutions in bi-gravity have been considered in [474, 479, 104, 106, 8, 475, 478, 9, 7, 62]. We keep the same notation as previously and consider the action for bi-gravity as in (5.43) (in terms of the β’s, where the conversion between the β’s and the α’s is given in (6.28))

$$\begin{array}{*{20}c} {{\mathcal{L}_{{\rm{bi - gravity}}}} = {{M_{{\rm{Pl}}}^2} \over 2}\sqrt {- g} R[g] + {{M_f^2} \over 2}\sqrt {- f} R[f]\quad} \\ + {{{m^2}M_{{\rm{Pl}}}^2} \over 4}\sum\limits_{n = 0}^4 {{{{\beta _n}} \over {n!}}} {\mathcal{L}_n}[\sqrt{\mathbb{X}}] + {\mathcal{L}_{{\rm{matter}}}}[g,{\psi _i}], \end{array}$$
(12.50)

assuming that matter only couples to the metric gμν. Then the two Friedmann equations for each Hubble parameter take the respective form

$${H^2} = - \sum\limits_{n = 0}^3 {{{3(4 - n){m^2}{\beta _n}} \over {2n!}}} {\left({{H \over {{H_f}}}} \right)^n} + {1 \over {3M_{{\rm{Pl}}}^2}}\rho$$
(12.51)
$$H_f^2 = - {{M_{{\rm{Pl}}}^2} \over {M_f^2}}\left[ {\sum\limits_{n = 0}^3 {{{3{m^2}{\beta _{n + 1}}} \over {2n!}}} {{\left({{H \over {{H_f}}}} \right)}^{n - 3}}} \right].$$
(12.52)

Crucially, the generalization of the Higuchi bound now becomes

$$- {{{m^2}} \over 4}{H \over {{H_f}}}\left[ {3{\beta _1} + 4{\beta _2}{H \over {{H_f}}} + {\beta _3}{{{H^2}} \over {H_f^2}}} \right]\left[ {1 + {{\left({{{{H_f}{{\rm{M}}_{{\rm{Pl}}}}} \over {H{M_f}}}} \right)}^2}} \right] \geq 2{H^2}.$$
(12.53)

The important new feature is the last term in square brackets. Although this tends to unity in the limit Mf → ∞, which is consistent with the massive gravity result, for finite Mf it opens a new regime where the bound is satisfied by having \({\left({{{{H_f}{M_{{\rm{Pl}}}}} \over {H{M_f}}}} \right)^2} \gg 1\) (notice that in our convention the β’s are typically negative). One may show [224] that it is straightforward to find solutions of both Friedmann equations which are consistent with the Higuchi bound over the entire history of the universe. For example, choosing the parameters β2 = β3 = 0 and solving for Hf, the effective Friedmann equation for the metric to which matter couples is

$${H^2} = {1 \over {6M_{{\rm{Pl}}}^2}}\left({\rho (a) + \sqrt {\rho {{(a)}^2} + {{12{m^4}M_{{\rm{Pl}}}^6} \over {M_f^2}}}} \right)$$
(12.54)

and the generalization of the Higuchi bound is

$$\left({1 + {{16M_f^2} \over {3M_{{\rm{Pl}}}^2\beta _1^2}}{{{H^4}} \over {{m^4}}}} \right) > 0.$$
(12.55)

which is trivially satisfied at all times. More generally, there is an open set of such solutions. The observational viability of the self-accelerating branch of these models has been considered in [8, 9] with generally positive results. Growth histories of the bi-gravity cosmological solutions have been considered in [62]. However, while avoiding the Higuchi bound indicates the absence of ghosts, it has been argued that these solutions may admit gradient instabilities in their cosmological perturbations [106].
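The two limiting regimes of the effective Friedmann equation (12.54) can be checked with a short numerical sketch. The function name and the choice of units \(m = {M_{{\rm{Pl}}}} = {M_f} = 1\) are assumptions made for illustration; only the formula itself is taken from (12.54).

```python
from math import sqrt


def hubble_squared(rho, m=1.0, m_pl=1.0, m_f=1.0):
    """Effective Friedmann equation (12.54) for beta_2 = beta_3 = 0."""
    mass_scale = 12 * m**4 * m_pl**6 / m_f**2
    return (rho + sqrt(rho**2 + mass_scale)) / (6 * m_pl**2)


# Early times: rho dominates and H^2 approaches the GR value rho / (3 M_Pl^2).
rho_early = 1.0e8
gr_value = rho_early / 3.0
print(hubble_squared(rho_early) / gr_value)

# Late times: rho -> 0 leaves a constant H^2 = m^2 M_Pl / (sqrt(3) M_f).
print(hubble_squared(0.0), 1.0 / sqrt(3.0))
```

At large ρ the ratio to the GR value tends to one, while at ρ = 0 the square root alone yields a constant Hubble rate, which is the self-accelerating behaviour referred to above.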

We should stress again that, just as in massive gravity, the absence of FLRW solutions should not be viewed as an inconsistency of the theory with observations. In bi-gravity too, these solutions may not necessarily be the ones of most relevance for connecting with observations; they are simply the most straightforward to obtain analytically. Thus, cosmological solutions in bi-gravity, just as in massive gravity, should very much be viewed as a work in progress.

15.5 Other proposals for cosmological solutions

Finally, we may note that more serious modifications of the massive gravity framework have been considered in order to allow for FLRW solutions. These include mass-varying gravity and the quasi-dilaton models [119, 118]. In [281] it was shown that mass-varying gravity and the quasi-dilaton model could allow for stable cosmological solutions, but for the original quasi-dilaton theory the self-accelerating solutions are always unstable. On the other hand, the generalizations of the quasi-dilaton [126, 127] appear to allow stable cosmological solutions.

In addition, one can find cosmological solutions in non-Lorentz invariant versions of massive gravity [109] (and [103, 107, 108]). One can also allow the mass to become dependent on a field [489, 375], extend to multiple metrics/vierbeins [454], or include f (R) terms either in massive gravity [89] or in bi-gravity [416, 415], which leads to interesting self-accelerating solutions. Alternatively, one can consider other extensions to the form of the mass terms by coupling massive gravity to the DBI Galileons [237, 19, 20, 315].

As an example, we present here the cosmology of the extension of the quasi-dilaton model considered in [127], where the reference metric \({\bar f_{\mu \nu}}\) is given in (9.15) and depends explicitly on the dynamical quasi-dilaton field σ.

The action takes the familiar form with an additional kinetic term introduced for the quasi-dilaton which respects the global symmetry

$$\begin{array}{*{20}c} {S = M_{{\rm{Pl}}}^2\int {{{\rm{d}}^4}} x\sqrt {- g} \left[ {{1 \over 2}R - \Lambda - {\omega \over {2M_{{\rm{Pl}}}^2}}{{(\partial \sigma)}^2}} \right.\quad} \\ {\left. {+ {{{m^2}} \over 4}\left({{\mathcal{L}_2}[\tilde{\mathcal{K}}] + {\alpha _3}{\mathcal{L}_3}[\tilde{\mathcal{K}}] + {\alpha _4}{\mathcal{L}_4}[\tilde{\mathcal{K}}]} \right)} \right],} \\ \end{array}$$
(12.56)

where the tensor \(\tilde {\mathcal K}\) is given in (9.14).

The background ansatz is taken as

$${\rm{d}}{s^2} = - {N^2}(t)\;{\rm{d}}{t^2} + a{(t)^2}\;{\rm{d}}{\vec{x}^2},$$
(12.57)
$${\phi ^0} = {\phi ^0}(t),$$
(12.58)
$${\phi ^i} = {x^i},$$
(12.59)
$$\sigma = \sigma (t),$$
(12.60)

so that

$${f_{\mu \nu}}\;{\rm{d}}{x^\mu}\;{\rm{d}}{x^\nu} = - n{(t)^2}\;{\rm{d}}{t^2} + \;{\rm{d}}{\vec{x}^2},$$
(12.61)
$$n{(t)^2} = {({\dot \phi ^0}(t))^2} + {{{\alpha _\sigma}} \over {M_{{\rm{Pl}}}^2{m^2}}}{\dot \sigma ^2}.$$
(12.62)

The equation that for normal massive gravity forbids FLRW solutions follows from varying with respect to ϕ0 and takes the form

$${\partial _t}\left({{a^4}X(1 - X)J} \right) = 0,$$
(12.63)

where

$$X = {{{e^{\sigma /{M_{{\rm{Pl}}}}}}} \over a}\quad {\rm{and}}\quad J = 3 + {9 \over 2}(1 - X){\alpha _3} + 6{(1 - X)^2}{\alpha _4}.$$
(12.64)

As the universe expands, \(X(1 - X)J \sim 1/{a^4}\), which for one branch of solutions implies J → 0; this determines a fixed constant asymptotic value of X from J = 0. In this asymptotic limit the effective Friedmann equation becomes

$$\left({{3 \over 2} - \omega} \right){H^2} = \Lambda + {\Lambda _X},$$
(12.65)

where

$${\Lambda _X} = {m^2}(X - 1)\left[ {6 - 3X + {3 \over 2}(X - 4)(X - 1){\alpha _3} + 6{{(X - 1)}^2}{\alpha _4}} \right],$$
(12.66)

defines an effective cosmological constant which gives rise to self-acceleration even when Λ = 0 (for ω < 6).

The analysis of [127] shows that these self-accelerating cosmological solutions are ghost free provided that

$$0 < \omega < 6,\qquad {X^2} < {{{\alpha _\sigma}{H^2}} \over {m_g^2}} < {r^2}{X^2}$$
(12.67)

where

$$r = 1 + {{\omega {H^2}} \over {{m^2}{X^2}\left({{3 \over 2}{\alpha _3}\left({X - 1} \right) - 2} \right)}}.$$
(12.68)

In particular, this implies that ασ > 0 which demonstrates that the original quasi-dilaton model [119, 116] has a scalar (Higuchi type) ghost. The analysis of [126] confirms these properties in a more general extension of this model.

16 Part IV Other Theories of Massive Gravity

17 New Massive Gravity

17.1 Formulation

Independently of the formal development of massive gravity in four dimensions described above, there has been interest in constructing a purely three dimensional theory of massive gravity. Three dimensions are special for the following reason: for a massless graviton in three dimensions there are no propagating degrees of freedom. This follows simply by counting: a symmetric tensor in three dimensions has six components. A massless graviton must admit a diffeomorphism symmetry which renders three of the degrees of freedom pure gauge, and the remaining three are non-dynamical due to the associated first class constraints. By contrast, a massive graviton in three dimensions has the same number of degrees of freedom as a massless graviton in four dimensions, namely two. Combining these two facts, in three dimensions it should be possible to construct a diffeomorphism invariant theory of massive gravity in which the usual massless graviton implied by diffeomorphism invariance is absent and only the massive degree of freedom remains.
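The counting in the paragraph above can be written out for general spacetime dimension d. The following Python sketch uses the standard linearized counting (a symmetric tensor minus gauge and constraint conditions); the function names are chosen here for illustration only.

```python
def symmetric_components(d):
    """Independent components of a symmetric rank-2 tensor in d dimensions."""
    return d * (d + 1) // 2


def massless_graviton_dof(d):
    """Components minus d gauge parameters minus d first class constraints."""
    return symmetric_components(d) - 2 * d


def massive_graviton_dof(d):
    """Components minus the d transversality conditions and the one trace
    condition implied on shell by the Fierz-Pauli equations."""
    return symmetric_components(d) - d - 1


for d in (3, 4):
    print(d, symmetric_components(d), massless_graviton_dof(d), massive_graviton_dof(d))
```

For d = 3 this gives six components, no massless degrees of freedom and two massive ones, while for d = 4 it gives ten components, two massless and five massive degrees of freedom, in agreement with the statements above.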

A diffeomorphism and parity invariant theory in three dimensions was given in [66] and referred to as ‘new massive gravity’ (NMG). In its original formulation the action is taken to be

$${S_{{\rm{NMG}}}} = {1 \over {{\kappa ^2}}}\int {{{\rm{d}}^3}} x\sqrt {- g} \left[ {\sigma R + {1 \over {{m^2}}}\left({{R_{\mu \nu}}{R^{\mu \nu}} - {3 \over 8}{R^2}} \right)} \right],$$
(13.1)

where \({\kappa ^2} = 1/{M_3}\) defines the three dimensional Planck mass, σ = ±1 and m is the mass of the graviton. In this form the action is manifestly diffeomorphism invariant and constructed entirely out of the metric gμν. However, to see that it really describes a massive graviton, it is helpful to introduce an auxiliary field which, as we will see below, also admits an interpretation as a metric, to give a quasi-bi-gravity formulation

$${S_{{\rm{NMG}}}} = {M_3}\int {{{\rm{d}}^3}x\sqrt {- g} \left[ {\sigma R - {q^{\mu \nu}}{G_{\mu \nu}} - {1 \over 4}{m^2}({q_{\mu \nu}}{q^{\mu \nu}} - {q^2})} \right].}$$
(13.2)

The kinetic term for qμν appears from the mixing with Gμν. Although this is not a true bi-gravity theory, since there is no direct Einstein-Hilbert term for qμν, we shall see below that it is a well-defined decoupling limit of a bi-gravity theory, and for this reason it makes sense to think of qμν as effectively a metric degree of freedom. In this form, we see that the special form of \(R_{\mu \nu}^2 - 3/8{R^2}\) was designed so that qμν has the Fierz-Pauli mass term. It is now straightforward to see that this corresponds to a theory of massive gravity by perturbing around Minkowski spacetime. Defining

$${g_{\mu \nu}} = {\eta _{\mu \nu}} + {1 \over {\sqrt {{M_3}}}}{h_{\mu \nu}},$$
(13.3)

and perturbing to quadratic order in hμν and qμν we have

$${S_2} = {M_3}\int {{{\rm{d}}^3}} x\left[ {- {\sigma \over 2}{h^{\mu \nu}}\hat{\mathcal{E}}_{\mu \nu}^{\alpha \beta}{h_{\alpha \beta}} - {q^{\mu \nu}}\hat{\mathcal{E}}_{\mu \nu}^{\alpha \beta}{h_{\alpha \beta}} - {1 \over 4}{m^2}({q_{\mu \nu}}{q^{\mu \nu}} - {q^2})} \right].$$
(13.4)

Finally, diagonalizing as \({h_{\mu \nu}} = {\tilde h_{\mu \nu}} - \sigma {q_{\mu \nu}}\) we obtain

$${S_2} = {M_3}\int {{{\rm{d}}^3}} x\left[ {- {\sigma \over 2}{{\tilde{h}}^{\mu \nu}}\hat{\mathcal{E}}_{\mu \nu}^{\alpha \beta}{{\tilde{h}}_{\alpha \beta}} + {\sigma \over 2}{q^{\mu \nu}}\hat{\mathcal{E}}_{\mu \nu}^{\alpha \beta}{q_{\alpha \beta}} - {1 \over 4}{m^2}({q_{\mu \nu}}{q^{\mu \nu}} - {q^2})} \right].$$
(13.5)

which is manifestly a decoupled massless graviton and massive graviton. Crucially, however, we see that the kinetic terms of each have the opposite sign. Since only the degrees of freedom of the massive graviton qμν are propagating, unitarity when coupled to other sources forces us to choose σ = −1. The apparently ghostly massless graviton does not lead to any unitarity violation, at least in perturbation theory, as there is no massless pole in the propagator. The stability of the vacua was further shown in different gauges in Ref. [252].
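The diagonalization leading to (13.5) can be checked term by term. As a sketch, write \(a\hat{\mathcal{E}}b\) as shorthand for \({a^{\mu \nu}}\hat{\mathcal{E}}_{\mu \nu}^{\alpha \beta}{b_{\alpha \beta}}\) (a notation introduced here only for this check), use the fact that \(a\hat{\mathcal{E}}b = b\hat{\mathcal{E}}a\) up to total derivatives, and recall that \({\sigma ^2} = 1\). Substituting \(h = \tilde h - \sigma q\) into the kinetic terms of (13.4) then gives

$$\begin{array}{*{20}c} {- {\sigma \over 2}(\tilde{h} - \sigma q)\hat{\mathcal{E}}(\tilde{h} - \sigma q) - q\hat{\mathcal{E}}(\tilde{h} - \sigma q)} \\ {= - {\sigma \over 2}\tilde{h}\hat{\mathcal{E}}\tilde{h} + ({\sigma ^2} - 1)\,q\hat{\mathcal{E}}\tilde{h} + \left({\sigma - {{{\sigma ^3}} \over 2}} \right)q\hat{\mathcal{E}}q} \\ {= - {\sigma \over 2}\tilde{h}\hat{\mathcal{E}}\tilde{h} + {\sigma \over 2}q\hat{\mathcal{E}}q,} \\ \end{array}$$

so the mixing term cancels and the two kinetic terms indeed appear with opposite signs.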

17.2 Absence of Boulware-Deser ghost

The auxiliary field formulation of new massive gravity is also useful for understanding the absence of the BD ghost [141]. Setting σ = −1 as imposed previously and working with the formulation (13.2), we can introduce new vector and scalar degrees of freedom as follows

$${q_{\mu \nu}} = {1 \over {\sqrt {{M_3}}}}{\bar{q}_{\mu \nu}} + {\nabla _\mu}{V_\nu} + {\nabla _\nu}{V_\mu},$$
(13.6)

with

$${V_\mu} = {1 \over {\sqrt {{M_3}} \,m}}{A_\mu} + {{{\nabla _\mu}\pi} \over {\sqrt {{M_3}} {m^2}}},$$
(13.7)

where the factors of \(\sqrt {{M_3}}\) and m are chosen for canonical normalization. Aμ represents the helicity-1 mode, which carries one degree of freedom, and π the helicity-0 mode, which also carries one degree of freedom. Together these two modes account for all the dynamical degrees of freedom.

Introducing new fields in this way also introduces new symmetries. Specifically there is a U (1) symmetry

$$\pi \rightarrow \pi + m\chi ,\quad {A_\mu} \rightarrow {A_\mu} - {\partial _\mu}\chi ,$$
(13.8)

and a linear diffeomorphism symmetry

$${\bar{q}_{\mu \nu}} \rightarrow {\bar{q}_{\mu \nu}} + {\nabla _\mu}{\chi _\nu} + {\nabla _\nu}{\chi _\mu},\quad {A_\mu} \rightarrow {A_\mu} - m{\chi _\mu}.$$
(13.9)

Substituting in the action, integrating by parts and using the Bianchi identity \({\nabla _\mu}G_\nu ^\mu = 0\) we obtain

$$\begin{array}{*{20}c} {{S_{{\rm{NMG}}}} = \int {{{\rm{d}}^3}} x\sqrt {- g} \left[ {- {M_3}R - \sqrt {{M_3}} {{\bar{q}}^{\mu \nu}}{G_{\mu \nu}}} \right.\quad \quad \quad \quad \;} \\ {- {1 \over 4}({{(m{{\bar{q}}_{\mu \nu}} + {\nabla _\mu}{A_\nu} + {\nabla _\nu}{A_\mu} + {2 \over m}{\nabla _\mu}{\nabla _\nu}\pi)}^2}} \\ {\left. {- {{(m\bar{q} + 2\nabla A + {2 \over m}\square\pi)}^2})} \right].\quad \quad \quad \quad \quad \quad \;\;} \\ \end{array}$$
(13.10)

Although this action contains apparently higher order terms due to its dependence on \({\nabla _\mu}{\nabla _\nu}\pi\), this dependence is Galileon-like in that the equations of motion for all fields are second order. For instance the naively dangerous combination

$${({\nabla _\mu}{\nabla _\nu}\pi)^2} - {(\square \pi)^2}$$
(13.11)

is up to a boundary term equivalent to \(- {R^{\mu \nu}}{\nabla _\mu}\pi {\nabla _\nu}\pi\). In [141] it is shown that the resulting equations of motion of all fields are second order due to these special Fierz-Pauli combinations.

As a result of the introduction of the new gauge symmetries, we can straightforwardly count the number of non-perturbative degrees of freedom. The total number of fields is 16: six from gμν, six from qμν, three from Aμ and one from π. The total number of gauge symmetries is 7: three from diffeomorphisms, three from linear diffeomorphisms and one from the U (1). Thus, the total number of degrees of freedom is 16 − 7 (gauge) − 7 (constraint) = 2, which agrees with the linearized analysis. An independent argument leading to the same result is given in [317], where NMG including its topologically massive extension (see below) is presented in Hamiltonian form using Einstein-Cartan language (see also [176]).

17.3 Decoupling limit of new massive gravity

The formalism of Section 13.2 is also useful for deriving the decoupling limit of NMG which as in the higher dimensional case, determines the leading interactions for the helicity-0 mode. The decoupling limit [141] is defined as the limit

$${M_3} \rightarrow \infty ,\quad m \rightarrow 0,\quad {\Lambda _{5/2}} = {(\sqrt {{M_3}} {m^2})^{2/5}} = {\rm{fixed}}.$$
(13.12)

As usual the metric is scaled as

$${g_{\mu \nu}} = {\eta _{\mu \nu}} + {1 \over {\sqrt {{M_3}}}}{h_{\mu \nu}},$$
(13.13)

and in the action

$$\begin{array}{*{20}c} {{S_{{\rm{NMG}}}} = \int {{{\rm{d}}^3}} x\sqrt {- g} \left[ {- {M_3}R - \sqrt {{M_3}} {{\bar{q}}^{\mu \nu}}{G_{\mu \nu}}} \right.\quad \quad \quad \quad \;\;} \\ {- {1 \over 4}\left({{{(m{{\bar{q}}_{\mu \nu}} + {\nabla _\mu}{A_\nu} + {\nabla _\nu}{A_\mu} + {2 \over m}{\nabla _\mu}{\nabla _\nu}\pi)}^2}} \right.} \\ {\left. {\left. {- {{(m\bar{q} + 2\nabla A + {2 \over m}\square\pi)}^2}} \right)} \right],\quad \quad \quad \quad \quad \quad \;\;} \\ \end{array}$$
(13.14)

the normalizations have been chosen so that we keep Aμ and π fixed in the limit. We readily find

$$\begin{array}{*{20}c} {{S_{{\rm{dec}}}} = \int {{{\rm{d}}^3}} x\left[ {+ {1 \over 2}{h^{\mu \nu}}\hat{\mathcal{E}}_{\mu \nu}^{\alpha \beta}{h_{\alpha \beta}} - {{\bar{q}}^{\mu \nu}}\hat{\mathcal{E}}_{\mu \nu}^{\alpha \beta}{h_{\alpha \beta}} - {{\bar{q}}^{\mu \nu}}({\partial _\mu}{\partial _\nu}\pi - {\eta _{\mu \nu}}\square\pi)} \right.} \\ {\left. {- {1 \over 4}{F_{\mu \nu}}{F^{\mu \nu}} + {1 \over {\Lambda _{5/2}^{5/2}}}\hat{\mathcal{E}}_{\alpha \beta}^{\mu \nu}{h^{\alpha \beta}}({\partial _\mu}\pi {\partial _\nu}\pi - {\eta _{\mu \nu}}{{(\partial \pi)}^2})} \right],} \\ \end{array}$$
(13.15)

where all raising and lowering is understood with respect to the 3 dimensional Minkowski metric. Performing the field redefinition \({h_{\mu \nu}} = 2\pi {\eta _{\mu \nu}} + {\tilde h_{\mu \nu}} + {\bar q_{\mu \nu}}\) we finally obtain

$$\begin{array}{*{20}c} {{S_{{\rm{dec}}}} = \int {{{\rm{d}}^3}} x\left[ {+ {1 \over 2}{{\tilde{h}}^{\mu \nu}}\hat{\mathcal{E}}_{\mu \nu}^{\alpha \beta}{{\tilde{h}}_{\alpha \beta}} - {1 \over 2}{{\bar{q}}^{\mu \nu}}\hat{\mathcal{E}}_{\mu \nu}^{\alpha \beta}{{\bar{q}}_{\alpha \beta}}\quad \quad \quad \quad \quad} \right.} \\ \left. {- {1 \over 4}{F_{\mu \nu}}{F^{\mu \nu}} - {1 \over 2}{{(\partial \pi)}^2} + {1 \over {2\Lambda _{5/2}^{5/2}}}{{(\partial \pi)}^2}\square\pi} \right]. \end{array}$$
(13.16)

Thus, we see that in the decoupling limit, NMG becomes equivalent to two massless gravitons which have no degrees of freedom, one massless spin-1 particle which has one degree of freedom, and one scalar π which has a cubic Galileon interaction. This confirms that the strong coupling scale for NMG is Λ5/2.
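To make the scaling in (13.12) concrete, the following Python sketch computes the strong coupling scale and the value of M3 needed to hold it fixed as the graviton mass is lowered. The function names and the sample numbers are assumptions chosen for illustration only.

```python
def strong_coupling_scale(m3, m):
    """Lambda_{5/2} = (sqrt(M_3) m^2)^(2/5), as defined in (13.12)."""
    return (m3**0.5 * m**2) ** 0.4


def planck_mass_at_fixed_scale(lam, m):
    """Value of M_3 that keeps Lambda_{5/2} equal to lam for graviton mass m."""
    return lam**5 / m**4


lam = 1.0
for m in (1.0e-1, 1.0e-2, 1.0e-3):
    m3 = planck_mass_at_fixed_scale(lam, m)
    print(m, m3, strong_coupling_scale(m3, m))
```

Holding \({\Lambda _{5/2}}\) fixed therefore requires M3 to grow as \(1/{m^4}\) when m → 0, which is why both limits in (13.12) are taken together.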

The decoupling limit clarifies one crucial aspect of NMG. It has been suggested that NMG could be power counting renormalizable, following previous arguments for topological massive gravity [196], due to the softer nature of divergences in three dimensions and the existence of a dimensionless combination of the Planck mass and the graviton mass. This is in fact clearly not the case, since the above cubic interaction is a non-renormalizable operator and dominates the Feynman diagrams, leading to perturbative unitarity violation at the strong coupling scale Λ5/2 (see Section 10.5 for further discussion on the distinction between the breakdown of perturbative unitarity and the breakdown of the theory).

17.4 Connection with bi-gravity

The existence of the NMG theory at first sight appears to be something of an anomaly that cannot be reproduced in higher dimensions. There also does not at first sight seem to be any obvious connection with the diffeomorphism breaking ghost-free massive gravity model (or dRGT) and multi-gravity extensions. However, in [425] it was shown that NMG, and certain extensions to it, could all be obtained as scaling limits of the same 3-dimensional bi-gravity models that are consistent with ghost-free massive gravity in a different decoupling limit. As we already mentioned, the key to seeing this is the auxiliary formulation where the tensor fμν is related to the missing extra metric of the bi-gravity theory.

Starting with the 3-dimensional version of bi-gravity [293] in the form

$$S = \int {{{\rm{d}}^3}} x\left[ {{{{M_g}} \over 2}\sqrt {- g} R[g] + {{{M_f}} \over 2}\sqrt {- f} R[f] - {m^2}\mathcal{U}[g,f]} \right],$$
(13.17)

where the bi-gravity potential takes the standard form in terms of characteristic polynomials similarly as in (6.4)

$$\mathcal{U}[g,f] = - {{{M_{{\rm{eff}}}}} \over 4}\sum\limits_{n = 0}^3 {{\alpha _n}} {\mathcal{L}_n}(\mathcal{K}),$$
(13.18)

and \({\mathcal K}\) is given in (6.7) in terms of the two dynamical metrics g and f. The scale Meff is defined as \(M_{{\rm{eff}}}^{- 1} = M_g^{- 1} + M_f^{- 1}\). The idea is to define a scaling limit [425] as follows

$${M_f} \rightarrow + \infty$$
(13.19)

keeping M3 = −(Mg + Mf) fixed and keeping \({q_{\mu \nu}}\) fixed in the definition

$${f_{\mu \nu}} = {g_{\mu \nu}} - {{{M_3}} \over {{M_f}}}{q_{\mu \nu}}.$$
(13.20)

Since \({{\mathcal K}_{\mu \nu}} \to {{{M_3}} \over {2{M_f}}}{q_{\mu \nu}}\), then we have in the limit

$$S = \int {{{\rm{d}}^3}} x\left[ {- {{{M_3}} \over 2}\sqrt {- g} R[g] - {{{M_3}} \over 2}{q^{\mu \nu}}{G_{\mu \nu}}(g) + {{{m^2}{M_{{\rm{eff}}}}} \over 4}\sum\limits_{n = 0}^3 {{\alpha _n}} {{(- 1)}^n}{{\left({{{{M_3}} \over {{M_f}}}} \right)}^n}{\mathcal{L}_n}(q)} \right]$$

which prompts the definition of a new set of coefficients

$${c_n} = - {{{{(- 1)}^n}} \over {2{M_3}}}\bar{M}{\alpha _n}{\left({{{{M_3}} \over {{M_f}}}} \right)^n},$$
(13.21)

so that

$$S = {M_3}\int {{{\rm{d}}^3}} x\left[ {- \sqrt {- g} R[g] - {q^{\mu \nu}}{G_{\mu \nu}}(g) - {m^2}\sum\limits_{n = 0}^3 {{c_n}} {\mathcal{L}_n}(q)} \right].$$
(13.22)

Since this theory is obtained as a scaling limit of the ghost-free bi-gravity action, it is guaranteed to be free from the BD ghost. We see that in the case c2 = 1/4, c3 = c4 = 0 we obtain the auxiliary field formulation of NMG, justifying the connection between the auxiliary field qμν and the bi-gravity metric fμν.

17.5 3D massive gravity extensions

The generic form of the auxiliary field formulation of NMG derived above [425]

$$S = {M_3}\int {{{\rm{d}}^3}} x\left[ {- \sqrt {- g} R[g] - {q^{\mu \nu}}{G_{\mu \nu}}(g) - {m^2}\sum\limits_{n = 0}^3 {{c_n}} {\mathcal{L}_n}(q)} \right],$$
(13.23)

demonstrates that there exists a two-parameter family of extensions of NMG determined by nonzero coefficients for c3 and c4. The purely metric formulation for the generic case can be determined by integrating out the auxiliary field qμν. The equation of motion for qμν is given symbolically by

$$- G - {m^2}\sum\limits_{n = 1}^3 n {c_n}{\epsilon\;\epsilon\;q^{n - 1}}{g^{3 - n}} = 0.$$
(13.24)

This is a quadratic equation for the tensor qμν. Together, these two additional free parameters give the cubic curvature [447] and Born-Infeld [279] extensions of NMG. Although additional higher derivative corrections have been proposed based on consistency with the holographic c-theorem [424], the above connection suggests that Eq. (13.23) is the most general set of interactions allowed in NMG which are free from the BD ghost.

In the specific case of the Born-Infeld extension [279] the action is

$${S_{{\rm{B}}.{\rm{I}}}} = 4{m^2}{M_3}\int {{{\rm{d}}^3}} x\left[ {\sqrt {- g} - \sqrt {- \det [{g_{\mu \nu}} - {1 \over {{m^2}}}{G_{\mu \nu}}]}} \right].$$
(13.25)

It is straightforward to show that on expanding the square root to second order in \(1/{m^2}\) we recover the original NMG action. The Born-Infeld extension of NMG also has a surprising role as a counterterm in the AdS4 holographic renormalization group [329]. The significance of this relation is unclear at present.
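The expansion of (13.25) can be sketched explicitly. Assume indices are raised with \({g^{\mu \nu}}\) and treat the matrix \(A_{\;\nu}^\mu = - G_{\;\nu}^\mu /{m^2}\) as small (the symbol A is introduced here only for this check). Using \(\sqrt {\det (1 + A)} = 1 + {1 \over 2}{\rm{tr}}A + {1 \over 8}{({\rm{tr}}A)^2} - {1 \over 4}{\rm{tr}}{A^2} + \mathcal{O}({A^3})\) together with the three-dimensional identities \(G_{\;\mu}^\mu = - R/2\) and \({G_{\mu \nu}}{G^{\mu \nu}} = {R_{\mu \nu}}{R^{\mu \nu}} - {R^2}/4\), one finds

$$\sqrt {- \det [{g_{\mu \nu}} - {1 \over {{m^2}}}{G_{\mu \nu}}]} = \sqrt {- g} \left[ {1 + {R \over {4{m^2}}} - {1 \over {4{m^4}}}\left({{R_{\mu \nu}}{R^{\mu \nu}} - {3 \over 8}{R^2}} \right) + \mathcal{O}({m^{- 6}})} \right],$$

so that

$${S_{{\rm{B}}.{\rm{I}}}} = {M_3}\int {{{\rm{d}}^3}} x\sqrt {- g} \left[ {- R + {1 \over {{m^2}}}\left({{R_{\mu \nu}}{R^{\mu \nu}} - {3 \over 8}{R^2}} \right)} \right] + \mathcal{O}({m^{- 4}}),$$

which is the NMG action (13.1) with σ = −1 and \({\kappa ^2} = 1/{M_3}\).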

17.6 Other 3D theories

17.6.1 Topological massive gravity

In four dimensions, the massive spin-2 representations of the Poincaré group must come in positive and negative helicity pairs. By contrast, in three dimensions the positive and negative helicity states are completely independent. Thus, while a parity preserving theory of massive gravity in three dimensions will contain two propagating degrees of freedom, it seems possible in principle for there to exist an interacting theory for one of the helicity modes alone. What is certainly possible is that one can give different interactions to the two helicity modes. Such a theory necessarily breaks parity, and was found in [180, 179]. This theory is known as ‘topologically massive gravity’ (TMG) and is described by the Einstein-Hilbert action, with cosmological constant, supplemented by a term constructed entirely out of the connection (hence the name topological)

$$S = {{{M_3}} \over 2}\int {{{\rm{d}}^3}} x\sqrt {- g} (R - 2\Lambda) + {1 \over {4\mu}}{\epsilon^{\lambda \mu \nu}}\Gamma _{\lambda \sigma}^\rho \left[ {{\partial _\mu}\Gamma _{\rho \nu}^\sigma + {2 \over 3}\Gamma _{\mu \tau}^\sigma \Gamma _{\nu \rho}^\tau} \right].$$
(13.26)

The new interaction is a gravitational Chern-Simons term and is responsible for the parity breaking. More generally, this term may be added to the NMG Lagrangian interactions, so that TMG can be viewed as a special case of the full extended parity-violating NMG.

The equations of motion for topologically massive gravity take the form

$${G_{\mu \nu}} + \Lambda {g_{\mu \nu}} + {1 \over \mu}{C_{\mu \nu}} = 0,$$
(13.27)

where Cμν is the Cotton tensor which is given by

$${C_{\mu \nu}} = {\epsilon_\mu}^{\alpha \beta}{\nabla _\alpha}({R_{\beta \nu}} - {1 \over 4}{g_{\beta \nu}}R).$$
(13.28)

Einstein metrics for which Gμν = −Λgμν remain as a subspace of the general set of vacuum solutions. In the case where the cosmological constant is negative, \(\Lambda = - 1/{\ell ^2}\), we can use the correspondence of Brown and Henneaux [78] to map the theory of gravity on an asymptotically AdS3 space to a 2D CFT living at the boundary.

The AdS/CFT in the context of topological massive gravity was also studied in Ref. [449].

17.6.2 Supergravity extensions

As with any gravitational theory, it is natural to ask whether extensions exist which exhibit local supersymmetry, i.e., supergravity. A supersymmetric extension to topologically massive gravity was given in [182]. An N = 1 supergravity extension of NMG including the topologically massive gravity terms was given in [21] and further generalized in [67]. The construction requires the introduction of an ‘auxiliary’ bosonic scalar field S so that the form of the action is

$$\begin{array}{*{20}c} {S = {1 \over {{\kappa ^2}}}\int {{{\rm{d}}^3}} x\sqrt {- g} \left[ {M{\mathcal{L}_C} + \sigma {\mathcal{L}_{E.H.}} + {1 \over {{m^2}}}{\mathcal{L}_K} + {1 \over {8{{\tilde{m}}^2}}}{\mathcal{L}_{{R^2}}} + {1 \over {{{\hat m}^2}}}{\mathcal{L}_{{S^4}}} + {1 \over {\hat \mu}}{\mathcal{L}_{{S^3}}}} \right]} \\ {+ \int {{{\rm{d}}^3}} x{1 \over \mu}{\mathcal{L}_{{\rm{top}}}},\quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad} \\ \end{array}$$
(13.29)

where

$${\mathcal{L}_C} = S + {\rm{fermions}}$$
(13.30)
$${\mathcal{L}_{\rm{E}.\rm{H}.}} = R - 2{S^2} + {\rm{fermions}}$$
(13.31)
$${\mathcal{L}_K} = K - {1 \over 2}{S^2}R - {3 \over 2}{S^4} + {\rm{fermions}}$$
(13.32)
$${\mathcal{L}_{{R^2}}} = - 16\left[ {{{(\partial S)}^2} - {9 \over 4}{{({S^2} + {1 \over 6}R)}^2}} \right] + {\rm{fermions}}$$
(13.33)
$${\mathcal{L}_{{S^4}}} = {S^4} + {3 \over {10}}R{S^2} + {\rm{fermions}}$$
(13.34)
$${\mathcal{L}_{{S^3}}} = {S^3} + {1 \over 2}RS + {\rm{fermions}}$$
(13.35)
$${\mathcal{L}_{{\rm{top}}}} = {1 \over 4}{\epsilon^{\lambda \mu \nu}}\Gamma _{\lambda \sigma}^\rho \left[ {{\partial _\mu}\Gamma _{\rho \nu}^\sigma + {2 \over 3}\Gamma _{\mu \tau}^\sigma \Gamma _{\nu \rho}^\tau} \right] + {\rm{fermions}}.$$
(13.36)

The fermion terms complete each term in the Lagrangian into an independent supersymmetric invariant. In other words, supersymmetry alone places no further restrictions on the parameters in the theory. It can be shown that the theory admits supersymmetric AdS vacua [21, 67]. The extensions of this supergravity theory to larger numbers of supersymmetries are considered at the linearized level in [64].

Moreover, N = 2 supergravity extensions of TMG were recently constructed in Ref. [370] and its N = 3 and N = 4 supergravity extensions in Ref. [371].

17.6.3 Critical gravity

Finally, let us comment on a special case of three dimensional gravity known as log gravity [65] or critical gravity in analogy with the general dimension case [384, 183, 16]. For a special choice of parameters of the theory, there is a degeneracy in the equations of motion for the two degrees of freedom leading to the fact that one of the modes of the theory becomes a ‘logarithmic’ mode.

Indeed, at the special point μℓ = 1 (where ℓ is the AdS length scale, \(\Lambda = - 1/{\ell ^2}\)), known as the ‘chiral point’, the left-moving (in the language of the boundary CFT) excitations of the theory become pure gauge and it has been argued that the theory then becomes purely an interacting theory for the right moving graviton [93]. In Ref. [377] it was earlier argued that there were no massive graviton excitations at the critical point μℓ = 1; however, Ref. [93] found one massive graviton excitation for every finite and non-zero value of μℓ, including at the critical point μℓ = 1.

This case was further analyzed in [273], see also Ref. [274] for a recent review. It was shown that the degeneration of the massive graviton mode with the left moving boundary graviton leads to logarithmic excitations.

To be more precise, starting with the auxiliary formulation of NMG with a cosmological constant \(\lambda {m^2}\)

$${S_{{\rm{NMG}}}} = {M_3}\int {{{\rm{d}}^3}} x\sqrt {- g} \left[ {\sigma R - 2\lambda {m^2} - {q^{\mu \nu}}{G_{\mu \nu}} - {1 \over 4}{m^2}({q_{\mu \nu}}{q^{\mu \nu}} - {q^2})} \right],$$
(13.37)

we can look for AdS vacuum solutions for which the associated cosmological constant \(\Lambda = - 1/{\ell ^2}\) in Gμν = −Λgμν is not the same as \(\lambda {m^2}\). The relation between the two is set by the vacuum equations to be

$$- {1 \over {4{m^2}}}{\Lambda ^2} - \Lambda \sigma + \lambda {m^2} = 0,$$
(13.38)

which generically has two solutions. Perturbing the action to quadratic order around this vacuum solution we have

$${S_2} = {M_3}\int {{{\rm{d}}^3}} x\left[ {- {{\bar{\sigma}} \over 2}{h^{\mu \nu}}{\mathcal{G}_{\mu \nu}} - {q^{\mu \nu}}{\mathcal{G}_{\mu \nu}} - {1 \over 4}{m^2}({q_{\mu \nu}}{q^{\mu \nu}} - {q^2})} \right].$$
(13.39)

where

$${\mathcal{G}_{\mu \nu}}(h) = \hat{\mathcal{E}}_{\mu \nu}^{\alpha \beta}{h_{\alpha \beta}} - 2\Lambda {h_{\mu \nu}} + \Lambda {\bar{g}_{\mu \nu}}h$$
(13.40)

and

$$\bar{\sigma} = \sigma - {\Lambda \over {3{m^2}}}$$
(13.41)

where we raise and lower the indices with respect to the background AdS metric \({\bar g_{\mu \nu}}\).
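Returning to the vacuum condition (13.38), the two branches can be exhibited with a short numerical sketch. The code below simply solves the quadratic equation (13.38) for Λ; the function names, the units \({m^2} = 1\), and the sample values σ = −1, λ = 3 are assumptions chosen for illustration.

```python
from math import sqrt


def vacua(lam, m2=1.0, sigma=-1.0):
    """Solve (13.38), -Lambda^2/(4 m^2) - Lambda*sigma + lam*m^2 = 0, for Lambda."""
    disc = sigma**2 + lam
    if disc < 0:
        return []  # no real vacuum for this choice of parameters
    root = sqrt(disc)
    return [-2 * m2 * (sigma - root), -2 * m2 * (sigma + root)]


def residual(big_lambda, lam, m2=1.0, sigma=-1.0):
    """Left-hand side of (13.38); vanishes on a vacuum solution."""
    return -big_lambda**2 / (4 * m2) - big_lambda * sigma + lam * m2


for big_lambda in vacua(3.0):
    kind = "AdS" if big_lambda < 0 else "dS or flat"
    print(big_lambda, residual(big_lambda, 3.0), kind)
```

For these sample values the two roots have opposite signs, so only one of them is an AdS vacuum. The two branches merge when \({\sigma ^2} + \lambda = 0\), which is the non-generic case excluded by the word "generically" in the text.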

As usual, it is apparent that this theory describes one massless graviton (with no propagating degrees of freedom) and one massive one whose mass is given by \({M^2} = - {m^2}\bar \sigma\). However, by choosing \(\bar \sigma = 0\) the massive mode becomes degenerate with the existing massless one.

In this case, the action is

$${S_2} = {M_3}\int {{{\rm{d}}^3}} x\left[ {- {q^{\mu \nu}}{\mathcal{G}_{\mu \nu}} - {1 \over 4}{m^2}({q_{\mu \nu}}{q^{\mu \nu}} - {q^2})} \right],$$
(13.42)

and varying with respect to \(h_{\mu\nu}\) and \(q_{\mu\nu}\) we obtain the equations of motion

$${\mathcal{G}_{\mu \nu}}(q) = 0,$$
(13.43)
$${\mathcal{G}_{\mu \nu}}(h) + {1 \over 2}({q_{\mu \nu}} - {\bar{g}_{\mu \nu}}q) = 0.$$
(13.44)

Choosing the gauge \(\nabla_\mu h^{\mu\nu} - \nabla^\nu h = 0\), the equations of motion imply h = 0 and the resulting equation of motion for \(h_{\mu\nu}\) takes the form

$${\left[ {\square - 2\Lambda} \right]^2}{h_{\mu \nu}} = 0.$$
(13.45)

It is this factorization of the equations of motion into a square of an operator that is characteristic of the critical/log gravity theories. Although the equation of motion is solved by the usual massless modes for which \([\square - 2\Lambda]h_{\mu\nu} = 0\), there are additional logarithmic modes which do not solve this equation but do solve Eq. (13.45). These are so-called because they behave logarithmically in ρ asymptotically when the AdS metric is put in the form \({\rm d}s^2 = \ell^2(-\cosh^2\rho\,{\rm d}\tau^2 + {\rm d}\rho^2 + \sinh^2\rho\,{\rm d}\theta^2)\). The presence of these log modes was shown to remain beyond the linear regime, see Ref. [271].

Based on this result as well as on the finiteness and conservation of the stress tensor and on the emergence of a Jordan cell structure in the Hamiltonian, the correspondence to a logarithmic CFT was conjectured in Ref. [273], where the representations of the would-be dual log CFTs have degeneracies in the spectrum of scaling dimensions.

Strong indications for this correspondence appeared in many different ways. First, consistent boundary conditions which allow the log modes were provided in Ref. [272], where it was shown that in addition to the Brown-Henneaux boundary conditions one could also consider more general ones. These boundary conditions were further explored in [305, 397], where it was shown that the stress-energy tensor for these boundary conditions is finite and not chiral, giving another indication that the theory could be dual to a logarithmic CFT.

Then specific correlator functions were computed and compared. Ref. [449] checked the 2-point correlators and Ref. [275] the 3-point ones. A similar analysis was also performed within the context of NMG in Ref. [270] where the 2-point correlators were computed at the chiral point and shown to behave as those of a logarithmic CFT.

Further checks for this AdS/log CFT include the 1-loop partition function as computed in Ref. [241]. See also Ref. [274] for a review of other checks.

It has been shown, however, that ultimately these theories are non-unitary, due to the non-zero inner product between the log modes and the normal modes and the inability to construct a positive definite norm on the Hilbert space [432].

17.7 Black holes and other exact solutions

A great deal of physics can be learned from studying exact solutions, in particular those corresponding to black hole geometries. Black holes are also important probes of the non-perturbative aspects of gravitational theories. We briefly review here the types of exact solutions obtained in the literature.

In the case of topologically massive gravity, a one-parameter family of extensions to the BTZ black hole has been obtained in [245]. In the case of NMG, as well as the usual BTZ black holes obtained in the presence of a negative cosmological constant, there is in addition a class of warped AdS\(_3\) black holes [102] whose metric takes the form

$${\rm{d}}{s^2} = - {\beta ^2}{{{\rho ^2} - \rho _0^2} \over {{r^2}}}\;{\rm{d}}{t^2} + {r^2}{\left({{\rm{d}}\phi - {{\rho + (1 - {\beta ^2})\omega} \over {{r^2}}}{\rm{d}}t} \right)^2} + {1 \over {{\beta ^2}{\zeta ^2}}}{{{\rm{d}}{\rho ^2}} \over {{\rho ^2} - \rho _0^2}},$$
(13.46)

where the radial coordinate is given by

$${r^2} = {\rho ^2} + 2\omega \rho + {\omega ^2}(1 - {\beta ^2}) + {{{\beta ^2}\rho _0^2} \over {1 - {\beta ^2}}}.$$
(13.47)

and the parameters β and ζ are determined in terms of the graviton mass and the cosmological constant Λ by

$${\beta ^2} = {{9 - 21\Lambda /{m^2} \mp 2\sqrt {3(5 + 7\Lambda /{m^2})}} \over {4(1 - \Lambda /{m^2})}},\quad {\zeta ^{- 2}} = {{21 - 4{\beta ^2}} \over {8{m^2}}}.$$
(13.48)

This metric exhibits two horizons at \(\rho = \pm\rho_0\), if \(\beta^2 \ge 0\) and \(\rho_0\) is real. Absence of closed timelike curves requires that \(\beta^2 \le 1\). This puts the allowed range on the values of Λ to be

$$- {{35{m^2}} \over {289}} \leq \Lambda \leq {{{m^2}} \over {21}}.$$
(13.49)
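The endpoints of the range (13.49) can be cross-checked against (13.48). The short Python sketch below evaluates \(\beta^2\) as a function of \(x = \Lambda/m^2\); the function name `beta_squared` and the choice of the upper sign in (13.48) are assumptions made for this illustration, not part of the original analysis. With that branch, \(\beta^2\) reaches 1 at the lower end of the range and 0 at the upper end.

```python
import math


def beta_squared(x, sign=-1):
    """beta^2 from Eq. (13.48) as a function of x = Lambda / m^2.

    sign=-1 selects the upper sign of the -/+ in (13.48), sign=+1 the lower.
    (Hypothetical helper written for this check only.)
    """
    root = math.sqrt(3.0 * (5.0 + 7.0 * x))
    return (9.0 - 21.0 * x + sign * 2.0 * root) / (4.0 * (1.0 - x))


X_MIN = -35.0 / 289.0  # lower end of the range (13.49), in units of m^2
X_MAX = 1.0 / 21.0     # upper end of the range (13.49), in units of m^2

for x in (X_MIN, 0.0, X_MAX):
    print(f"Lambda/m^2 = {x:+.4f}  ->  beta^2 = {beta_squared(x):.4f}")
```

On this branch the conditions \(0 \le \beta^2 \le 1\) are saturated exactly at the two ends of (13.49).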

AdS waves, extensions of plane (pp) waves to anti-de Sitter spacetime, have been considered in [33]. Further work on extensions to black hole solutions, including charged black hole solutions, can be found in [418, 101, 253, 5, 6, 372, 427, 250]. We note in particular the existence of a class of Lifshitz black holes [32] that exhibit the Lifshitz anisotropic scale symmetry

$$t \rightarrow {\lambda ^z}t,\quad \vec{x} \rightarrow \lambda \vec{x},$$
(13.50)

where z is the dynamical critical exponent. As an example for z = 3 the following Lifshitz black hole can be found [32]

$${\rm{d}}{s^2} = - {{{r^6}} \over {{\ell ^6}}}\left({1 - {{M{\ell ^2}} \over {{r^2}}}} \right)\;{\rm{d}}{t^2} + {{{\rm{d}}{r^2}} \over {\left({{{{r^2}} \over {{\ell ^2}}} - M} \right)}} + {r^2}{\rm{d}}{\phi ^2}.$$
(13.51)

This metric has a curvature singularity at r = 0 and a horizon at \({r_ +} = \ell \sqrt M\). The Lifshitz symmetry is preserved if we scale \(t \to \lambda^3 t\), \(x \to \lambda x\), \(r \to \lambda^{-1} r\) and in addition we scale the black hole mass as \(M \to \lambda^{-2} M\). The metric should be contrasted with the normal BTZ black hole which corresponds to z = 1

$${\rm{d}}{s^2} = - {{{r^2}} \over {{\ell ^2}}}\left({1 - {{M{\ell ^2}} \over {{r^2}}}} \right)\;{\rm{d}}{t^2} + {{{\rm{d}}{r^2}} \over {\left({{{{r^2}} \over {{\ell ^2}}} - M} \right)}} + {r^2}{\rm{d}}{\phi ^2}.$$
(13.52)
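The scaling symmetry of the z = 3 metric (13.51) can be verified term by term. The following is a sketch that assumes the angular coordinate ϕ plays the role of the spatial coordinate x in (13.50), so that \(t \to \lambda^3 t\), \(\phi \to \lambda\phi\), \(r \to \lambda^{-1} r\) and \(M \to \lambda^{-2} M\):

$$\begin{array}{*{20}c} {{{M{\ell ^2}} \over {{r^2}}} \rightarrow {{M{\ell ^2}} \over {{r^2}}},\qquad {{{r^6}} \over {{\ell ^6}}}\,{\rm{d}}{t^2} \rightarrow {\lambda ^{- 6}}{\lambda ^6}\,{{{r^6}} \over {{\ell ^6}}}\,{\rm{d}}{t^2},} \\ {{{{\rm{d}}{r^2}} \over {{r^2}/{\ell ^2} - M}} \rightarrow {{{\lambda ^{- 2}}\,{\rm{d}}{r^2}} \over {{\lambda ^{- 2}}({r^2}/{\ell ^2} - M)}},\qquad {r^2}{\rm{d}}{\phi ^2} \rightarrow {\lambda ^{- 2}}{\lambda ^2}\,{r^2}{\rm{d}}{\phi ^2},} \\ \end{array}$$

so every term is separately invariant. Repeating the same steps for the BTZ metric (13.52) with \(t \to \lambda t\) gives \((r^2/\ell^2)\,{\rm d}t^2 \to \lambda^{-2}\lambda^{2}(r^2/\ell^2)\,{\rm d}t^2\), consistent with z = 1.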

Exact solutions for charged black holes were also derived in Ref. [249] and an exact, non-stationary solution of TMG and NMG with the asymptotic charges of a BTZ black hole was found in [227]. This exact solution was shown to admit a timelike singularity. Other exact asymptotically AdS-like solutions were found in Ref. [251].

17.8 New massive gravity holography

One of the most interesting avenues of exploration for NMG has been in the context of Maldacena’s AdS/CFT correspondence [396]. According to this correspondence, NMG with a cosmological constant chosen so that there are asymptotically anti-de Sitter solutions is dual to a conformal field theory (CFT). This has been considered in [67, 381, 380] where it was found that the requirements of bulk unitarity actually lead to a negative central charge.

The argument for this proceeds from the identification of the central charge of the dual two dimensional field theory with the entropy of a black hole in the bulk using Cardy’s formula. The entropy of the black hole is given by [368]

$$S = {{{A_{{\rm{BTZ}}}}} \over {4{G_3}}}\Omega ,$$
(13.53)

where \(G_3\) is the 3-dimensional Newton constant and \(\Omega = {{2{G_3}} \over {3\ell}}c\), where ℓ is the AdS radius and c is the central charge. This formula is such that Ω = 1, i.e., \(c = 3\ell/(2G_3)\), for pure Einstein-Hilbert gravity with a negative cosmological constant.

A universal formula for this central charge has been obtained and is given by

$$c = {\ell \over {2{G_3}}}{g_{\mu \nu}}{{\partial \mathcal{L}} \over {\partial {R_{\mu \nu}}}}.$$
(13.54)

This result essentially follows from using the Wald entropy formula [480] for a higher derivative gravity theory and identifying this with the central charge through the Cardy formula. Applying this argument to new massive gravity we obtain [67]

$$c = {{3\ell} \over {2{G_3}}}\left({\sigma + {1 \over {2{m^2}{\ell ^2}}}} \right).$$
(13.55)

Since σ = −1 is required for bulk unitarity, we must choose \(m^2 > 0\) to have a chance of getting c positive. Then we are led to conclude that the central charge is only positive if

$$\Lambda = - {1 \over {{\ell ^2}}} < - 2{m^2}.$$
(13.56)

However, unitarity in the bulk requires \(m^2 > -\Lambda/2\) and this excludes this possibility. We are thus led to conclude that NMG cannot be unitary both in the bulk and in the dual CFT. This failure to maintain both bulk and boundary unitarity can be resolved by a modification of NMG to a full bi-gravity model, namely Zwei-Dreibein gravity, to which we turn next.
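The incompatibility can be made explicit by placing the two conditions side by side. As a sketch, taking σ = −1 and \(m^2 > 0\) in (13.55) and using \(\Lambda = -1/\ell^2\):

$$c > 0\;\; \Longleftrightarrow \;\;{1 \over {2{m^2}{\ell ^2}}} > 1\;\; \Longleftrightarrow \;\; - \Lambda > 2{m^2},\qquad {\rm{whereas}}\qquad {m^2} > - {\Lambda \over 2}\;\; \Longleftrightarrow \;\; - \Lambda < 2{m^2}.$$

The two inequalities on \(-\Lambda\) cannot hold simultaneously.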

17.9 Zwei-dreibein gravity

As we have seen, there is a conflict in NMG between unitarity in the bulk, i.e., the requirement that the massive gravitons are not ghosts, and unitarity in the dual CFT as required by the positivity of the central charge. This conflict may be resolved, however, by replacing NMG with the 3-dimensional bi-gravity extension of ghost-free massive gravity that we have already discussed. In particular, if we work in the Einstein-Cartan formulation in three dimensions, then the metric is replaced by a ‘dreibein’ and since this is a bi-gravity model, we need two ‘dreibeins’. This gives us the Zwei-dreibein gravity [63].

In the notation of [63] the Lagrangian is given by

$$\begin{array}{*{20}c} {\mathcal{L} = - \sigma {M_1}{e_a}{R^a}(e) - {M_2}{f_a}{R^a}(f) - {1 \over 6}{m^2}{M_1}{\alpha _1}{\epsilon_{abc}}{e^a}{e^b}{e^c}\quad \quad \quad \quad \quad \quad} \\ {- {1 \over 6}{m^2}{M_2}{\alpha _2}{\epsilon_{abc}}{f^a}{f^b}{f^c} + {1 \over 2}{m^2}{M_{12}}{\epsilon_{abc}}({\beta _1}{e^a}{e^b}{f^c} + {\beta _2}{e^a}{f^b}{f^c})} \\ \end{array}$$
(13.57)

where we have suppressed the wedge products, \(e^3 = e \wedge e \wedge e\), \(R^a(e)\) is the Lorentz-vector-valued curvature two-form for the spin-connection associated with the dreibein e, and \(R^a(f)\) that associated with the dreibein f. Since we are in three dimensions, the spin-connection can be written as a Lorentz vector by dualizing with the Levi-Civita symbol, \(\omega^a = \epsilon^{abc}\omega_{bc}\). This is nothing other than the vierbein representation of bi-gravity with the usual ghost-free (dRGT) mass terms. As we have already discussed, NMG and its various extensions arise in appropriate scaling limits.

A computation of the central charge following the same procedure was given in [63] with the result that

$$c = 12\pi \ell (\sigma {M_1} + \gamma {M_2})$$
(13.58)

Defining the parameter γ via the relation

$$\begin{array}{*{20}c} {({\alpha _2}(\sigma {M_1} + {M_2}) + {\beta _2}{M_2}){\gamma ^2} + 2({M_2}{\beta _1} - \sigma {M_1}{\beta _2})\gamma \quad \quad \quad \quad \quad \quad} \\ {- \sigma ({\alpha _1}(\sigma {M_1} + {M_2}) + {\beta _1}{M_1}) = 0,} \\ \end{array}$$
(13.59)

then bulk unitarity requires \(\gamma/(\sigma M_1 + \sigma M_2) < 0\). In order to have c > 0 we thus need γ < 0 which in turn implies σ > 1 (since \(M_1\) and \(M_2\) are defined as positive). The absence of tachyons in the AdS vacuum requires \(\beta_1 + \gamma\beta_2 > 0\), and this assumes a real solution for γ for a negative Λ. There is an open set of solutions to these conditions, which shows that the conditions for unitarity are not finely tuned. For example, in [63] it is shown that there is an open set of solutions which are close to the special case \(M_1 = M_2\), \(\beta_1 = \beta_2 = 1\), γ = 1 and \({\alpha _1} = {\alpha _2} = 3/2 + {1 \over {{\ell ^2}{m^2}}}\). This result is not in contradiction with the scaling limit that reproduces NMG, because this scaling limit requires the choice σ = −1 which is in contradiction with positive central charge.

These results potentially have an impact on the higher dimensional case. We see that in three dimensions we potentially have a diffeomorphism invariant theory of massive gravity (i.e., bi-gravity) which at least for AdS solutions exhibits unitarity both in the bulk and in the boundary CFT for a finite range of parameters in the theory. However, these bi-gravity models are easily extended into all dimensions as we have already discussed and it is similarly easy to find AdS solutions which exhibit bulk unitarity. It would be extremely interesting to see if the associated dual CFTs are also unitary thus providing a potential holographic description of generalized theories of massive gravity.

18 Lorentz-Violating Massive Gravity

18.1 SO(3)-invariant mass terms

The entire analysis performed so far is based on assuming Lorentz invariance. In what follows we briefly review a few other potentially viable theories of massive gravity where Lorentz invariance is broken and their respective cosmology.

Prior to the formulation of the ghost-free theory of massive gravity, it was believed that no Lorentz invariant theories of massive gravity could evade the BD ghost and Lorentz-violating theories were thus the best hope; we refer the reader to [438] for a thorough review on the field. A thorough analysis of Lorentz-violating theories of massive gravity was performed in [200] and more recently in [107]. See also Refs. [236, 73, 114] for other complementary studies. Since this field has been reviewed in [438] we only summarize the key results in this section (see also [74] for a more recent review on many developments in Lorentz violating theories). See also Ref. [378] for an interesting spontaneous breaking of Lorentz invariance in ghost-free massive gravity using three scalar fields, and Ref. [379] for an SO(3)-invariant ghost-free theory of massive gravity which can be formulated with three Stückelberg scalar fields and propagates five degrees of freedom.

In most theories of Lorentz-violating massive gravity, the SO(3, 1) Lorentz group is broken down to an SO(3) rotation group. This implies the presence of a preferred time. Preferred-frame effects are, however, strongly constrained by solar system tests [487] as well as pulsar tests [54], see also [494, 493] for more recent and even tighter constraints.

At the linearized level the general mass term that satisfies this rotation symmetry is

$${{\mathcal L}_{SO(3)\;{\rm{mass}}}} = {1 \over 8}\left({m_0^2h_{00}^2 + 2m_1^2h_{0i}^2 - m_2^2h_{ij}^2 + m_3^2h_i^ih_j^j - 2m_4^2{h_{00}}h_i^i} \right)\,,$$
(14.1)

where space indices are raised and lowered with respect to the flat spatial metric \(\delta_{ij}\). This extends the Lorentz invariant mass term presented in (2.39). In the rest of this section, we will establish the analogue of the Fierz-Pauli mass term (2.44) in this Lorentz violating case and establish the conditions on the different mass parameters \(m_{0,1,2,3,4}\).

We note that Lorentz invariance is restored when m1 = m2, m3 = m4 and \(m_0^2 = - m_1^2 + m_3^2\). The Fierz-Pauli structure then further fixes m1 = m3 implying m0 = 0, which is precisely what ensures the presence of a constraint and the absence of BD ghost (at least at the linearized level).
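The statement that these relations restore Lorentz invariance can be checked numerically. The Python sketch below compares the bracket of (14.1), evaluated with \(m_1 = m_2\), \(m_3 = m_4\) and \(m_0^2 = -m_1^2 + m_3^2\), against the covariant combination \(-m_1^2 h_{\mu\nu}h^{\mu\nu} + m_3^2 h^2\) for a random symmetric perturbation. The mostly-plus signature and all function names are assumptions introduced for this illustration.

```python
import random

# Diagonal of the Minkowski metric; mostly-plus signature is assumed here.
ETA = [-1.0, 1.0, 1.0, 1.0]


def so3_mass_term(h, m0sq, m1sq, m2sq, m3sq, m4sq):
    """Bracket of Eq. (14.1), without the overall 1/8, for a symmetric 4x4 h."""
    h00 = h[0][0]
    h0i_sq = sum(h[0][i] ** 2 for i in range(1, 4))
    hij_sq = sum(h[i][j] ** 2 for i in range(1, 4) for j in range(1, 4))
    trace3 = sum(h[i][i] for i in range(1, 4))
    return (m0sq * h00 ** 2 + 2.0 * m1sq * h0i_sq - m2sq * hij_sq
            + m3sq * trace3 ** 2 - 2.0 * m4sq * h00 * trace3)


def covariant_mass_term(h, m1sq, m3sq):
    """-m1^2 h_{mu nu} h^{mu nu} + m3^2 h^2, indices raised with ETA."""
    h_sq = sum(ETA[a] * ETA[b] * h[a][b] ** 2
               for a in range(4) for b in range(4))
    trace = sum(ETA[a] * h[a][a] for a in range(4))
    return -m1sq * h_sq + m3sq * trace ** 2


def random_symmetric(rng):
    """Random symmetric 4x4 matrix with entries in [-1, 1]."""
    h = [[0.0] * 4 for _ in range(4)]
    for a in range(4):
        for b in range(a, 4):
            h[a][b] = h[b][a] = rng.uniform(-1.0, 1.0)
    return h


rng = random.Random(0)
h = random_symmetric(rng)
m1sq, m3sq = 0.7, 1.3  # arbitrary illustrative values

lhs = so3_mass_term(h, m3sq - m1sq, m1sq, m1sq, m3sq, m3sq)
rhs = covariant_mass_term(h, m1sq, m3sq)
print("difference:", abs(lhs - rhs))
```

Setting in addition \(m_1 = m_3\) reduces the covariant combination to the Fierz-Pauli structure \(-m^2(h_{\mu\nu}h^{\mu\nu} - h^2)\), for which \(m_0 = 0\).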

Out of these five mass parameters some of them have a direct physical meaning [200, 437, 438]

  • The parameter \(m_2\) is the one that represents the mass of the helicity-2 mode. As a result we should impose \(m_2^2 \ge 0\) to avoid tachyon-like instabilities, although we should bear in mind that if that mass parameter is of the order of the Hubble parameter today, \(m_2 \simeq 10^{-33}\) eV, then such an instability would not be problematic.

  • The parameter \(m_1\) is the one responsible for turning on a kinetic term for the two helicity-1 modes. Since \(m_1 = m_2\) in a Lorentz-invariant theory of massive gravity, the helicity-1 mode cannot be turned off (\(m_1 = 0\)) while maintaining the graviton massive (\(m_2 \ne 0\)). This is a standard result of the Lorentz invariant massive gravity seen so far, where the helicity-1 mode is always present. For Lorentz-breaking theories the situation is quite different and one can easily switch off the helicity-1 modes at the linearized level. The absence of a ghost in the helicity-1 mode requires \(m_1^2 \ge 0\).

  • If m0 ≠ 0 and m1 ≠ 0 and m4 ≠ 0 then two scalar degrees of freedom are present already at the linear level about flat space-time and one of these is always a ghost. The absence of ghost requires either m0 = 0 or m1 = 0 or finally m4 = 0 and m2 = m3.

    In the last scenario, where \(m_4 = 0\) and \(m_2 = m_3\), the scalar degree of freedom loses its gradient terms at the linear level, which means that this mode is infinitely strongly coupled unless no gradient appears fully non-linearly either.

    The case \(m_0 = 0\) has an interesting phenomenology as will be described below. While it propagates five degrees of freedom about Minkowski it avoids the vDVZ discontinuity in an interesting way.

    Finally, the case m1 = 0 (including when m0 = 0) will be discussed in more detail in what follows. It is free of both scalar (and vector) degrees of freedom at the linear level about Minkowski and thus evades the vDVZ discontinuity in a straightforward way.

  • The analogue of the Higuchi bound was investigated in [72]. In de Sitter with constant curvature H, the generalized Higuchi bound is

    $$m_4^4 + 2{H^2}\left({3(m_3^2 - m_4^2) - m_2^2} \right) > m_4^2(m_1^2 - m_4^2)\quad {\rm{if}}\quad {m_0} = 0\,,$$
    (14.2)

    while if instead \(m_1 = 0\) then no scalar degrees of freedom propagate on de Sitter either, so there is no analogue of the Higuchi bound (a scalar starts propagating on FLRW solutions but it does not lead to an equivalent Higuchi bound either. However, the absence of tachyon and gradient instabilities does impose some conditions between the different mass parameters).

As shown in the case of the Fierz-Pauli mass term and its non-linear extension, one of the most natural ways to follow the physical degrees of freedom and their health is to restore the broken symmetry with the appropriate number of Stückelberg fields.

In Section 2.4 we reviewed how to restore the broken diffeomorphism invariance using four Stückelberg fields \(\phi^a\) using the relation (2.75). When Lorentz invariance is broken, the Stückelberg trick has to be performed slightly differently. Performing an ADM decomposition, which is appropriate for the type of Lorentz breaking we are considering, we can use four Stückelberg scalar fields \(\Phi = \Phi^0\) and \(\Phi^i\), i = 1, ⋯, 3, to define the following four-dimensional scalar, vector and tensors [107]

$$n = {(- {g^{\mu \nu}}{\partial _\mu}\Phi {\partial _\nu}\Phi)^{- 1/2}}$$
(14.3)
$${n_\mu} = n\,{\partial _\mu}\Phi$$
(14.4)
$$Y_\nu ^\mu = {g^{\mu \alpha}}{\partial _\alpha}{\Phi ^i}\;{\partial _\nu}{\Phi ^j}{\delta _{ij}}$$
(14.5)
$$\Gamma _\nu ^\mu = Y_\nu ^\mu + {n^\mu}{n^\alpha}\;{\partial _\alpha}{\Phi ^i}\;{\partial _\nu}{\Phi ^j}{\delta _{ij}}\,.$$
(14.6)

n can be thought of as the ‘Stückelbergized’ version of the lapse and \(\Gamma^\mu_{\ \nu}\) as that of the spatial metric.
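This identification can be checked in unitary gauge, \(\Phi = t\) and \(\Phi^i = x^i\). The following sketch assumes the standard ADM form of the inverse metric, \(g^{00} = -N^{-2}\), \(g^{0i} = N^{-2}N^i\) and \(g^{ij} = \gamma^{ij} - N^{-2}N^iN^j\), with N the lapse, \(N^i\) the shift and \(\gamma^{ij}\) the inverse spatial metric:

$$n = {(- {g^{00}})^{- 1/2}} = N,\qquad {n_\mu} = N\,\delta _\mu ^0,\qquad {n^i} = {g^{i0}}{n_0} = {{{N^i}} \over N},$$

$$\Gamma _{\;j}^i = {g^{ik}}{\delta _{kj}} + {n^i}{n^k}{\delta _{kj}} = \left({{\gamma ^{ik}} - {{{N^i}{N^k}} \over {{N^2}}} + {{{N^i}{N^k}} \over {{N^2}}}} \right){\delta _{kj}} = {\gamma ^{ik}}{\delta _{kj}}.$$

In this gauge n reduces to the lapse and the spatial components of Γ depend only on the spatial metric, the shift dependence having cancelled.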

In the Lorentz-invariant case we are stuck with the combination \(X^\mu_{\ \nu} = Y^\mu_{\ \nu} - {n^{- 2}}{g^{\mu \alpha}}{n_\alpha}{n_\nu}\), but this combination can be broken here and the mass term can depend separately on n, \(n_\mu\) and Y. This allows for new mass terms. In [107] this framework was derived and used to find new mass terms that exhibit five degrees of freedom. This formalism was also developed in [200] and used to derive new mass terms that also have fewer degrees of freedom. We review both cases in what follows.

18.2 Phase m1 = 0

18.2.1 Degrees of freedom on Minkowski

As already mentioned, the helicity-1 modes have no kinetic term at the linear level on Minkowski if \(m_1 = 0\). Furthermore, it turns out that the field υ in (14.17) is the Lagrange multiplier which removes the BD ghost (as opposed to the field ψ in the case \(m_0 = 0\) presented previously). It imposes the constraint \(\dot \tau = 0\) which in turn implies τ = 0. Using this constraint back in the action, one can check that there remain no time derivatives on any of the scalar fields, which means that there is no propagating helicity-0 mode on Minkowski either [200, 437, 438]. So in the case where \(m_1 = 0\) there are only 2 modes propagating in the graviton on Minkowski, the 2 helicity-2 modes as in GR.

In this case, the absence of the ghost can be seen to follow from the presence of a residual symmetry on flat space [200, 438]

$${x^i} \rightarrow {x^i} + {\xi ^i}(t)\,,$$
(14.7)

for three arbitrary functions \(\xi^i(t)\). In the Stückelberg language this implies the following internal symmetry

$${\Phi ^i} \rightarrow {\Phi ^i} + {\xi ^i}(\Phi)\,.$$
(14.8)

To maintain this symmetry non-linearly the mass term should be a function of n and \(\Gamma^\mu_{\ \nu}\) [438],

$${{\mathcal L}_{{\rm{mass}}}} = - {{{m^2}M_{{\rm{Pl}}}^2} \over 8}\sqrt {- g} \,F\left({n,\Gamma _\nu ^\mu} \right)\,.$$
(14.9)

The absence of helicity-1 and -0 modes while keeping the helicity-2 mode massive makes this Lorentz violating theory of gravity especially attractive. Its cosmology was explored in [201] and it turns out that this theory of massive gravity could be a candidate for cold dark matter as shown in [202].

Moreover, explicit black hole solutions were presented in [438] where it was shown that in this theory of massive gravity black holes have hair and the Stückelberg fields (in the Stückelberg formulation of the theory) do affect the solution. This result is tightly linked to the fact that this theory of massive gravity admits instantaneous interactions which is generic to any action of the form (14.9).

18.2.2 Non-perturbative degrees of freedom

Perturbations on more general FLRW backgrounds were considered more recently in [72]. Unlike in Minkowski, scalar perturbations on curved backgrounds are shown to behave in a similar way for the cases \(m_1 = 0\) and \(m_0 = 0\). Since, as we shall see below, the case \(m_0 = 0\) propagates five degrees of freedom including a helicity-0 mode that behaves as a scalar, it follows that on generic backgrounds the theory with \(m_1 = 0\) also propagates a helicity-0 mode. The helicity-0 mode is thus infinitely strongly coupled when considered perturbatively about Minkowski.

18.3 General massive gravity (m0 = 0)

In [107] the most general mass term that extends (14.1) non-linearly was considered. It can be written using the Stückelberg variables defined in (14.3), (14.4) and (14.5),

$${{\mathcal L}_{{\rm{SO}}(3)\;{\rm{mass}}}} = - {{{m^2}M_{{\rm{Pl}}}^2} \over 8}\sqrt {- g} \,V\left({n,{n_\mu},Y_\nu ^\mu} \right)\,.$$
(14.10)

Generalizing the Hamiltonian analysis for this mass term and requiring the propagation of five degrees of freedom about any background led to the Lorentz-invariant ghost-free theory of massive gravity presented in Part II as well as two new theories of Lorentz breaking massive gravity.

All of these cases ensure the absence of the BD ghost by having \(m_0 = 0\). The case where the BD ghost is projected out thanks to the requirement \(m_1 = 0\) is discussed in Section 14.2.

18.3.1 First explicit Lorentz-breaking example with five dofs

The first explicit realization of a consistent nonlinear Lorentz breaking model is as follows [107]

$${V_1}\left({n,{n_\mu},Y_\nu ^\mu} \right) = {n^{- 1}}[n + \zeta (\Gamma)]U(\tilde{\mathcal K}) + {n^{- 1}}{\mathcal C}(\Gamma)\,,$$
(14.11)

with

$${{\tilde {\mathcal K}}^\mu}_\nu = \left({{\Gamma ^{\mu \alpha}} - {{{n^2}{n^\mu}{n^\alpha}} \over {{{[n + \zeta (\Gamma)]}^2}}}} \right){\partial _\alpha}{\Phi ^i}{\partial _\nu}{\Phi ^j}{\delta _{ij}}\,,$$
(14.12)

and where U, \({\mathcal C}\) and ζ are scalar functions.

The fact that several independent functions enter the mass term will be of great interest for cosmology as one of these functions (namely \({\mathcal C}\)) can be used to satisfy the Bianchi identity while the other function can be used for an appropriate cosmological history.

The special case ζ = 0 is what is referred to as the ‘minimal model’ and was investigated in [108]. In unitary gauge, this minimal model is simply

$${\mathcal L}_{{\rm{SO}}(3)\;{\rm{mass}}}^{({\rm{minimal}})} = - {{{m^2}M_{{\rm{Pl}}}^2} \over 8}\sqrt {- g} \left({U({g^{ik}}{\delta _{kj}}) + {N^{- 1}}{\mathcal C}({\gamma ^{ik}}{\delta _{ij}})} \right)\,,$$
(14.13)

where \(\gamma_{ij}\) is the spatial part of the metric and \(g^{ij} = \gamma^{ij} - N^{-2}N^iN^j\), where N is the lapse and \(N^i\) the shift.

This minimal model is of special interest as both the primary and secondary second-class constraints that remove the sixth degree of freedom can be found explicitly and on the constraint surface the contribution of the mass term to the Hamiltonian is

$$H \propto M_{{\rm{Pl}}}^2{m^2}\int {{{\rm{d}}^3}} x\sqrt \gamma {\mathcal C}({\gamma ^{ik}}{\delta _{ij}})\,,$$
(14.14)

where the overall factor is positive so the Hamiltonian is positive definite as long as the function \({\mathcal C}\) is positive.

At the linearized level about Minkowski (which is a vacuum solution) this theory can be parameterized in terms of the mass scales introduced in (14.1) with \(m_0 = 0\), so the BD ghost is projected out in a way similar to ghost-free massive gravity.

Interestingly, if \({\mathcal C} = 0\), this theory corresponds to \(m_1 = 0\) (in addition to \(m_0 = 0\)), for which, as seen earlier, the helicity-1 modes are absent at the linearized level. However, they survive non-linearly and so the case \({\mathcal C} = 0\) is infinitely strongly coupled.

18.3.2 Second example of Lorentz-breaking with five dofs

Another example of a Lorentz-breaking SO(3)-invariant theory of massive gravity was provided in [107]. In that case the Stückelberg language is not particularly illuminating and we simply give the form of the mass term in unitary gauge (\(\Phi = t\) and \(\Phi^i = x^i\))

$$\begin{array}{*{20}c} {{V_2} = {{{c_1}} \over 2}\left[ {{{\vec N}^T}{{\left({N\,{\mathbb I} + {\mathbb M}} \right)}^{- 1}}\left({{\mathbb F} + {N^{- 1}}{\mathbb M}{\mathbb F}} \right){{\left({N\,{\mathbb I} + {\mathbb M}} \right)}^{- 1}}\vec N} \right]} \\ {+ {\mathcal C} + {N^{- 1}}\tilde {\mathcal C}\,,} \\ \end{array}$$
(14.15)

where \(\mathbb{F}=\{f_{ij}\}\) is the spatial part of the reference metric (for a Minkowski reference metric fij = δij), c1 is a constant, and \({\mathcal C}\) and \({\tilde {\mathcal C}}\) are functions of the spatial metric γij, while \({\mathbb M}\) is a rank-3 matrix which depends on γikfkj.

Interestingly, \({\tilde {\mathcal C}}\) does not enter the Hamiltonian on the constraint surface. The contribution of this mass term to the on-shell Hamiltonian is [107]

$$H \propto M_{{\rm{Pl}}}^2{m^2}\int {{{\rm{d}}^3}} x\sqrt \gamma \left[ {- {{{c_1}} \over 2}{{\vec N}^T}{{\left({N\,{\mathbb I} + {\mathbb M}} \right)}^{- 1}}{\mathbb F}{\mathbb M}{{\left({N\,{\mathbb I} + {\mathbb M}} \right)}^{- 1}}\vec N + {\mathcal C}} \right]\,,$$
(14.16)

with a positive coefficient, which implies that \({\mathcal C}\) should be bounded from below.

18.3.3 Absence of vDVZ and strong coupling scale

Unlike in the Lorentz-invariant case, the kinetic term for the Stückelberg fields does not only arise from the mixing with the helicity-2 mode.

When looking at perturbations about Minkowski and focusing on the scalar modes we can follow the analysis of [437],

$${\rm{d}}{s^2} = - (1 - \psi)\,{\rm{d}}{t^2} + 2{\partial _i}\upsilon \,\,{\rm{d}}{x^i}{\rm{d}}t + ({\delta _{ij}} + \tau {\delta _{ij}} + {\partial _i}{\partial _j}\sigma)\,{\rm{d}}{x^i}\,{\rm{d}}{x^j}\,,$$
(14.17)

When \(m_0 = 0\), ψ plays the role of the Lagrange multiplier for the primary constraint imposing

$$\sigma = \left({{2 \over {m_4^2}} - {3 \over \nabla}} \right)\tau \,,$$
(14.18)

where ∇ is the three-dimensional Laplacian. The secondary constraint then imposes the relation

$$\upsilon = {2 \over {m_1^2}}\dot \tau \,,$$
(14.19)

where dots represent derivatives with respect to the time. Using these relations for υ and σ, we obtain the Lagrangian for the remaining scalar mode (the helicity-0 mode) τ [437, 200],

$${{\mathcal L}_\tau} = {{M_{{\rm{Pl}}}^2} \over 4}(\left[ {\left({{4 \over {m_4^2}} - {4 \over {m_1^2}}} \right)\nabla \tau - 3\tau} \right]\ddot \tau - 2{{m_2^2 - m_3^2} \over {m_4^4}}{(\nabla \tau)^2}$$
(14.20)
$$+ \left({4{{m_2^2} \over {m_4^2}} - 1} \right)\tau \nabla \tau - 3m_2^2{\tau ^2})\,$$
(14.21)

In terms of power counting this means that the Lagrangian includes terms of the form \(M_{{\rm{Pl}}}^2{m^2}\nabla \phi \ddot \phi\) arising from the term going as \((m_4^{- 2} - m_1^{- 2})\nabla \tau \ddot \tau\) (where ϕ designates the helicity-0 mode which includes a combination of σ and υ). Such terms are not present in the Lorentz-invariant Fierz-Pauli case and its non-linear ghost-free extension since \(m_4 = m_1\) in that case, and they play a crucial role in this Lorentz violating setting.

Indeed, in the small mass limit these terms \(M_{{\rm{Pl}}}^2{m^2}\nabla \phi \ddot \phi\) dominate over the ones that go as \(M_{{\rm{Pl}}}^2{m^4}\nabla \phi \ddot \phi\) (i.e., the ones present in the Lorentz invariant case). This means that in the small mass limit, the correct canonical normalization of the helicity-0 mode ϕ is not of the form \(\hat \phi = \phi/{M_{{\rm{Pl}}}}{m^2}\) but rather \(\hat \phi = \phi/{M_{{\rm{Pl}}}}m\sqrt \nabla\), which is crucial in determining the strong coupling scale and the absence of vDVZ discontinuity:

  • The new canonical normalization implies a much larger strong coupling scale that goes as \(\Lambda_2 = (M_{\rm Pl}m)^{1/2}\) rather than \(\Lambda_3 = (M_{\rm Pl}m^2)^{1/3}\) as is the case in DGP and ghost-free massive gravity.

  • Furthermore, in the massless limit the coupling of the helicity-0 mode to the tensor vanishes faster than some of the Lorentz-violating kinetic interactions in (14.20) (which scales as \(m\hat h{\partial ^2}\hat \phi\)). This means that one can take the massless limit m → 0 in such a way that the coupling to the helicity-2 mode disappears and so does the coupling of the helicity-0 mode to matter (since this coupling arises after de-mixing of the helicity-0 and -2 modes). This implies the absence of vDVZ discontinuity in this Lorentz-violating theory despite the presence of five degrees of freedom.

The absence of vDVZ discontinuity and the larger strong coupling scale \(\Lambda_2\) make this theory more tractable at small mass scales. We emphasize, however, that the absence of vDVZ discontinuity does not prevent some sort of Vainshtein mechanism from still coming into play since the theory is still strongly coupled at the scale \(\Lambda_2 \ll M_{\rm Pl}\). This is similar to what happens for the Lorentz-invariant ghost-free theory of massive gravity on AdS (see Section 8.3.6 and [154]). Interestingly, however, the same redressing of the strong coupling scale as in DGP or ghost-free massive gravity was explored in [109] where it was shown that in the vicinity of a localized mass, the strong coupling scale gets redressed in such a way that the weak field approximation remains valid till the Schwarzschild radius of the mass, i.e., exactly as in GR.

In these theories, bounds on the graviton mass come from the exponential decay in the Yukawa potential which switches gravity off at the graviton’s Compton wavelength, so the Compton wavelength ought to be larger than the largest gravitational bound states, which are of about 5 Mpc, putting a bound on the graviton mass of \(m \lesssim 10^{-30}\) eV in which case \(\Lambda_2 \sim (10^{-4}\,{\rm mm})^{-1}\) [108, 257].
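The order of magnitude of the mass bound can be reproduced with a few lines of Python. This is a rough sketch only: it assumes the reduced Compton wavelength \(\hbar/(mc)\), the reduced Planck mass, and drops all order-one factors, so the numbers are indicative rather than a reproduction of the analysis in [108, 257]. The constant and function names are introduced here for illustration.

```python
import math

HBAR_C_EV_M = 1.973269804e-7      # hbar * c in eV * m
MPC_IN_M = 3.0856775814913673e22  # one megaparsec in metres
M_PL_EV = 2.435e27                # reduced Planck mass in eV (assumed convention)


def mass_from_compton_length(length_m):
    """Mass in eV whose reduced Compton wavelength equals length_m (metres)."""
    return HBAR_C_EV_M / length_m


# Compton wavelength of order the largest bound structures, about 5 Mpc.
m_bound = mass_from_compton_length(5.0 * MPC_IN_M)

# Strong coupling scales quoted in the text, evaluated at the mass bound.
lambda_2 = math.sqrt(M_PL_EV * m_bound)              # (M_Pl m)^(1/2)
lambda_3 = (M_PL_EV * m_bound ** 2) ** (1.0 / 3.0)   # (M_Pl m^2)^(1/3)

print(f"m        ~ {m_bound:.2e} eV")
print(f"Lambda_2 ~ {lambda_2:.2e} eV")
print(f"Lambda_3 ~ {lambda_3:.2e} eV")
```

The mass comes out slightly above \(10^{-30}\) eV, and \(\Lambda_2\) lies many orders of magnitude above \(\Lambda_3\) for the same mass, which illustrates why the Lorentz-violating theory remains weakly coupled over a much wider range of scales.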

18.3.4 Cosmology of general massive gravity

The cosmology of general massive gravity was recently studied in [109] and we summarize their results in what follows.

In Section 12, we showed how the Bianchi identity in ghost-free massive gravity prevents the existence of spatially flat FLRW solutions. The situation is similar in general Lorentz violating theories of massive gravity unless the function \({\mathcal C}\) in (14.11) is chosen so as to satisfy the following relation when the shift and \(n^i\) vanish [109]

$$H\left({{\mathcal C}\prime - {1 \over 2}{\mathcal C}} \right) = 0\,.$$
(14.22)

Choosing a function \({\mathcal C}\) which satisfies the appropriate condition to allow for FLRW solutions, the Friedmann equation then depends entirely on the function \(U(\tilde {\mathcal K})\) also defined in (14.11). In this case the graviton potential (14.10) acts as an effective ‘dark fluid’ with respective energy density and pressure dictated by the function U [109]

$${\rho _{{\rm{eff}}}} = {{{m^2}} \over 4}U(\tilde {\mathcal K})\qquad {p_{{\rm{eff}}}} = {{{m^2}} \over 4}(2U\prime(\tilde {\mathcal K}) - U(\tilde {\mathcal K}))\,,$$
(14.23)

leading to an effective phantom-like behavior when \(2U'/U < 0\).
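The phantom condition follows directly from (14.23). Writing the effective equation of state parameter as the ratio of pressure to energy density (the symbol \(w_{\rm eff}\) is introduced here for convenience),

$${w_{{\rm{eff}}}} = {{{p_{{\rm{eff}}}}} \over {{\rho _{{\rm{eff}}}}}} = {{2U\prime(\tilde {\mathcal K}) - U(\tilde {\mathcal K})} \over {U(\tilde {\mathcal K})}} = - 1 + {{2U\prime} \over U},$$

so that \(w_{\rm eff} < -1\) precisely when \(2U'/U < 0\).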

This solution is stable and healthy as long as the second derivative of \({\mathcal C}\) satisfies some conditions which can easily be accommodated for appropriate functions \({\mathcal C}\) and U.

Expanding U in terms of the scale factor at late times, \(U = \sum\nolimits_{n \ge 0} {{{\bar U}_n}{{(a - 1)}^n}}\), one can use CMB and BAO data from [2] to put constraints on the first terms of that series [109]

$${{{{\bar U}_1}} \over {{{\bar U}_0}}} = 0.12 \pm 2.1\quad {\rm{and}}\quad {{{{\bar U}_2}} \over {{{\bar U}_0}}} < 2 \pm 3\quad {\rm{at}}\quad 95\% \;{\rm{C.L.}}$$
(14.24)

Focusing instead on early-time cosmology, BBN data can similarly be used to constrain the function U; see [109] for more details.

19 Non-local massive gravity

The ghost-free theory of massive gravity proposed in Part II as well as the Lorentz-violating theories of Section 14 require an auxiliary metric. New massive gravity, on the other hand, can be formulated in a way that requires no mention of an auxiliary metric. Note however that all of these theories do break one copy of diffeomorphism invariance, and this occurs in bi-gravity as well and in the zwei dreibein extension of new massive gravity.

One of the motivations of non-local theories of massive gravity is to formulate the theory without any reference metric (see footnote 34). This is the main idea behind the non-local theory of massive gravity introduced in [328] (see footnote 35).

Starting with the linearized equation about flat space-time of the Fierz-Pauli theory

$$\delta {G_{\mu \nu}} - {1 \over 2}{m^2}({h_{\mu \nu}} - h{\eta _{\mu \nu}}) = 8\pi G{T_{\mu \nu}}\,,$$
(15.1)

where \(\delta {G_{\mu \nu}} = \hat \varepsilon _{\mu \nu}^{\alpha \beta}{h_{\alpha \beta}}\) is the linearized Einstein tensor, this modified Einstein equation can be ‘covariantized’ so as to be valid about any background metric. The linearized Einstein tensor \(\delta G_{\mu\nu}\) gets immediately covariantized to the full Einstein tensor \(G_{\mu\nu}\). The mass term, on the other hand, is more subtle and involves non-local operators. Its covariantization can take different forms, and the ones considered in the literature that do not involve a reference metric are

$${1 \over 2}({h_{\mu \nu}} - h{\eta _{\mu \nu}}) \rightarrow \left\{{\begin{array}{*{20}c} {{{(\square_g^{- 1}{G_{\mu \nu}})}^T}} & {{\rm{Ref}}{\rm{.[328]}}} \\ {{3 \over 8}{{({g_{\mu \nu}}\square_g^{- 1}R)}^T}} & {{\rm{Refs}}{\rm{.[395,229,228]}}} \\ \end{array} ,} \right.$$
(15.2)

where \(\square_g\) is the covariant d’Alembertian \(\square_g = \nabla^\mu \nabla_\mu\) and \(\square_g^{-1}\) represents the retarded propagator. One could also consider a linear combination of both possibilities. Furthermore, any of these terms could also be supplemented by additional terms that vanish on flat space, but one should take great care in ensuring that they do not propagate additional degrees of freedom (and ghosts).

Following [328], we use the notation where T designates the transverse part of a tensor. For any tensor \(S_{\mu\nu}\),

$${S_{\mu \nu}} = S_{\mu \nu}^T + {\nabla _{(\mu}}{S_{\nu)}}\,,$$
(15.3)

with \({\nabla ^\mu}S_{\mu \nu}^T = 0\). In flat space we can infer the relation [328]

$$S_{\mu \nu}^T = {S_{\mu \nu}} - {2 \over \square}{\partial _{(\mu}}{\partial ^\alpha}{S_{\nu)\alpha}} + {1 \over {{\square^2}}}{\partial _\mu}{\partial _\nu}{\partial ^\alpha}{\partial ^\beta}{S_{\alpha \beta}}\,.$$
(15.4)
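As a quick sanity check of (15.4), one can verify numerically that the projected tensor is transverse. The sketch below works in momentum space, where \(\partial_\mu \to i k_\mu\) and \(\square \to -k^2\), so that (15.4) becomes \(S^T_{\mu\nu} = S_{\mu\nu} - (k_\mu k^\alpha S_{\nu\alpha} + k_\nu k^\alpha S_{\mu\alpha})/k^2 + k_\mu k_\nu k^\alpha k^\beta S_{\alpha\beta}/k^4\). It assumes a symmetric \(S_{\mu\nu}\), a mostly-plus signature, a non-null momentum, and the convention that symmetrization carries a factor 1/2; the function names are purely illustrative.

```python
import random

# Diagonal of the flat metric eta_{mu nu}; the mostly-plus signature is an assumption.
ETA = [-1.0, 1.0, 1.0, 1.0]


def raise_index(k):
    """k^mu = eta^{mu nu} k_nu (eta is diagonal and equal to its own inverse)."""
    return [ETA[a] * k[a] for a in range(4)]


def transverse_part(S, k):
    """Momentum-space form of (15.4) for a symmetric S_{mu nu} with lower indices."""
    k_up = raise_index(k)
    k2 = sum(k_up[a] * k[a] for a in range(4))  # k^2, must be non-zero
    # V_nu = k^alpha S_{nu alpha}
    V = [sum(k_up[a] * S[n][a] for a in range(4)) for n in range(4)]
    # k^alpha k^beta S_{alpha beta}
    kkS = sum(k_up[n] * V[n] for n in range(4))
    return [[S[m][n] - (k[m] * V[n] + k[n] * V[m]) / k2
             + k[m] * k[n] * kkS / k2 ** 2
             for n in range(4)] for m in range(4)]


def divergence(T, k):
    """k^mu T_{mu nu}, the momentum-space divergence up to a factor of i."""
    k_up = raise_index(k)
    return [sum(k_up[m] * T[m][n] for m in range(4)) for n in range(4)]


random.seed(0)
A = [[random.uniform(-1.0, 1.0) for _ in range(4)] for _ in range(4)]
S = [[0.5 * (A[m][n] + A[n][m]) for n in range(4)] for m in range(4)]  # symmetric test tensor
k = [2.0, 1.0, 0.5, -0.3]  # arbitrary non-null momentum k_mu

S_T = transverse_part(S, k)
residual = max(abs(x) for x in divergence(S_T, k))
print(residual)  # vanishes up to floating-point rounding
```

The same cancellation can be seen analytically by contracting (15.4) with \(\partial^\mu\): the first two terms cancel against each other, as do the last two.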

The theory propagates what looks like a ghost-like instability irrespective of the exact formulation chosen in (15.2). However, it was recently argued that the would-be ghost is not a radiative degree of freedom and therefore does not lead to any vacuum decay. It remains an open question whether the would-be ghost can be avoided in the full nonlinear theory.

The cosmology of this model was studied in [395, 228]. The new contribution (15.2) in the Einstein equation can play the role of dark energy. Taking the second formulation of (15.2) and setting the graviton mass to \(m \simeq 0.67 H_0\), where \(H_0\) is the Hubble parameter today, reproduces the observed amount of dark energy. The mass term acts as a dark fluid with effective time-dependent equation of state \(\omega_{\rm eff}(a) \simeq -1.04 - 0.02(1 - a)\), where a is the scale factor, and is thus phantom-like.
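To give a feel for the numbers quoted above, the following sketch evaluates the effective equation of state at a few values of the scale factor and converts \(m \simeq 0.67 H_0\) into electronvolts. The value \(H_0 = 70\) km/s/Mpc is an assumption made here for illustration only (the text does not specify one), and the function names are hypothetical.

```python
HBAR_EV_S = 6.582119569e-16          # reduced Planck constant in eV * s
MPC_IN_KM = 3.0856775814913673e19    # one megaparsec in kilometres


def w_eff(a):
    """Effective equation of state quoted in the text, as a function of the scale factor a."""
    return -1.04 - 0.02 * (1.0 - a)


def graviton_mass_ev(h0_km_s_mpc=70.0, ratio=0.67):
    """Graviton mass m = ratio * H0 in eV; the default H0 is an assumed value."""
    h0_per_s = h0_km_s_mpc / MPC_IN_KM   # H0 in 1/s
    return ratio * HBAR_EV_S * h0_per_s


for a in (0.5, 0.8, 1.0):
    print(f"a = {a:.1f}: w_eff = {w_eff(a):.3f}")
print(f"m ~ {graviton_mass_ev():.1e} eV")
```

With these inputs the equation of state stays below −1 for all \(a \le 1\), and the mass comes out of order \(10^{-33}\) eV.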

Since this theory is formulated at the level of the equations of motion and not at the level of the action, and since it includes non-local operators, it ought to be thought of as an effective classical theory. These equations of motion should not be used to gain insight into the quantum nature of the theory nor into its quantum stability. New physics would kick in when quantum corrections ought to be taken into account. It remains an open question at the moment how to embed non-local massive gravity into a consistent quantum effective field theory.

Notice, however, that an action principle was proposed in Ref. [402] (focusing on four dimensions),

$$S = \int {{{\rm{d}}^4}} x\sqrt {- g} \left[ {{{M_{{\rm{Pl}}}^2} \over 2}R + \bar \lambda + {{M_{{\rm{Pl}}}^2} \over {2{M^2}}}{R_{\mu \nu}}h\left({- {\square \over {{M^2}}}} \right){G^{\mu \nu}}} \right]\,,$$
(15.5)

where the function h is defined as

$$h(z) = {{\square + {m^2}} \over \square}{1 \over z}\vert {p_s}(z)\vert {e^{{1 \over 2}\Gamma (0,p_s^2(z)) + {1 \over 2}{\gamma _E}}}\,,$$
(15.6)

where \(\gamma_E \simeq 0.577216\) is Euler’s constant, \(\Gamma (b,z) = \int\nolimits_z^\infty {{t^{b - 1}}} {e^{- t}}\,{\rm{d}}t\) is the incomplete gamma function, s is an integer with s > 3, and \(p_s(z)\) is a real polynomial of degree s. Upon deriving the equations of motion we recover the non-local massive gravity Einstein equation presented above [402],

$${G_{\mu \nu}} + {{{m^2}} \over \square}{G_{\mu \nu}} = {{M_{{\rm{Pl}}}^2} \over 2}{T_{\mu \nu}}\,,$$
(15.7)

up to corrections of order \(R^2\). We point out, however, that in this derivation from an action principle the operator \(\square^{-1}\) likely corresponds to a symmetrized Green’s function, while in (15.2) causality requires \(\square^{-1}\) to represent the retarded one.

We stress, however, that this theory should be considered only as a classical theory and should not be quantized. It is an interesting question whether or not the ghost reappears when considering quantum fluctuations like the ones that seed cosmological perturbations. We emphasize, for instance, that cosmological perturbations have a quantum origin, so it is important to rely on a theory that can be quantized to describe them.

20 Outlook

The past decade has witnessed a revival of interest in massive gravity as a potential alternative to GR. The original theoretical obstacles that came in the way of deriving a consistent theory of massive gravity have now been overcome, but with them comes a new set of challenges that will be decisive in establishing the viability of such theories. The presence of a low strong coupling scale on which the Vainshtein mechanism relies has opened the door to a new way of thinking about these types of effective field theories. At the moment, it is as yet unclear whether these types of theories could lead to an alternative to UV completion. The superluminalities that also arise in many cases with the Vainshtein mechanism should also be understood in more depth. At the moment, their real implications are not well understood and no case of true acausality has been shown to be present within the regime of validity of the theory. Finally, the difficulty in finding fully-fledged cosmological and black-hole solutions in many of these theories (both in ghost-free massive gravity and bi-gravity, and in other extensions or related models such as cascading gravity) makes their full phenomenology still elusive. Nevertheless, the well-understood decoupling limits of these models can be used to say a great deal about phenomenology without going into the complications of the full theories. These represent many open questions in massive gravity, which reflect the fact that the field is still extremely young and many developments are still in progress.