1 Introduction

General Relativity (GR) [225, 226] is widely accepted as a fundamental theory to describe the geometric properties of spacetime. In a homogeneous and isotropic spacetime the Einstein field equations give rise to the Friedmann equations that describe the evolution of the universe. In fact, the standard big-bang cosmology based on radiation and matter dominated epochs can be well described within the framework of General Relativity.

However, the rapid development of observational cosmology since the 1990s shows that the universe has undergone two phases of cosmic acceleration. The first one is called inflation [564, 339, 291, 524], which is believed to have occurred prior to the radiation domination (see [402, 391, 71] for reviews). This phase is required not only to solve the flatness and horizon problems that plague big-bang cosmology, but also to explain a nearly flat spectrum of temperature anisotropies observed in the Cosmic Microwave Background (CMB) [541]. The second accelerating phase started after the matter domination. The unknown component giving rise to this late-time cosmic acceleration is called dark energy [310] (see [517, 141, 480, 485, 171, 32] for reviews). The existence of dark energy has been confirmed by a number of observations, such as supernovae Ia (SN Ia) [490, 506, 507], large-scale structure (LSS) [577, 578], baryon acoustic oscillations (BAO) [227, 487], and CMB [560, 561, 367].

These two phases of cosmic acceleration cannot be explained by the presence of standard matter whose equation of state w = P/ρ satisfies the condition w ≥ 0 (here P and ρ are the pressure and the energy density of matter, respectively). In fact, we further require some component of negative pressure, with w < −1/3, to realize the acceleration of the universe. The cosmological constant Λ is the simplest candidate of dark energy, which corresponds to w = −1. However, if the cosmological constant originates from a vacuum energy of particle physics, its energy scale is too large to be compatible with the dark energy density [614]. Hence we need to find some mechanism to obtain a small value of Λ consistent with observations. Since the accelerated expansion in the very early universe needs to end to connect to the radiation-dominated universe, the pure cosmological constant is not responsible for inflation. A scalar field ϕ with a slowly varying potential can be a candidate for inflation as well as for dark energy.

Although many scalar-field potentials for inflation have been constructed in the framework of string theory and supergravity, the CMB observations still do not show particular evidence to favor any one of these models. The situation is similar in the context of dark energy: there is a degeneracy in the potential of the scalar field (“quintessence” [111, 634, 267, 263, 615, 503, 257, 155]) owing to the observational degeneracy of the dark energy equation of state around w = −1. Moreover it is generally difficult to construct viable quintessence potentials motivated from particle physics because the field mass responsible for cosmic acceleration today is very small (\(m_\phi \simeq 10^{-33}\) eV) [140, 365].

While scalar-field models of inflation and dark energy correspond to a modification of the energy-momentum tensor in the Einstein equations, there is another approach to explain the acceleration of the universe. This is modified gravity, in which the gravitational theory is modified compared to GR. The Lagrangian density for GR is given by f(R) = R − 2Λ, where R is the Ricci scalar and Λ is the cosmological constant (corresponding to the equation of state w = −1). The presence of Λ gives rise to an exponential expansion of the universe, but we cannot use it for inflation because the inflationary period needs to connect to the radiation era. It is possible to use the cosmological constant for dark energy since the acceleration today does not need to end. However, if the cosmological constant originates from a vacuum energy of particle physics, its energy density would be enormously larger than today’s dark energy density. While the Λ-Cold Dark Matter (ΛCDM) model (f(R) = R − 2Λ) fits a number of observational data well [367, 368], there is also a possibility of a time-varying equation of state of dark energy [10, 11, 450, 451, 630].

One of the simplest modifications to GR is f(R) gravity, in which the Lagrangian density f is an arbitrary function of R [77, 512, 102, 106]. There are two formalisms for deriving field equations from the action in f(R) gravity. The first is the standard metric formalism in which the field equations are derived by the variation of the action with respect to the metric tensor \(g_{\mu\nu}\). In this formalism the affine connection \(\Gamma _{\beta \gamma}^\alpha\) depends on \(g_{\mu\nu}\). Note that we will consider here and in the remaining sections only torsion-free theories. The second is the Palatini formalism [481] in which \(g_{\mu\nu}\) and \(\Gamma _{\beta \gamma}^\alpha\) are treated as independent variables when we vary the action. These two approaches give rise to different field equations for a non-linear Lagrangian density in R, while for the GR action they are identical with each other. In this article we mainly review the former approach unless otherwise stated. In Section 9 we discuss the Palatini formalism in detail.

The model with \(f(R) = R + \alpha R^2\) (α > 0) can lead to the accelerated expansion of the Universe because of the presence of the \(\alpha R^2\) term. In fact, this is the first model of inflation proposed by Starobinsky in 1980 [564]. As we will see in Section 7, this model is consistent with the temperature anisotropies observed in CMB and thus it can be a viable alternative to the scalar-field models of inflation. Reheating after inflation proceeds by gravitational particle production during the oscillating phase of the Ricci scalar [565, 606, 426].

The discovery of dark energy in 1998 also stimulated the idea that cosmic acceleration today may originate from some modification of gravity relative to GR. Dark energy models based on f(R) theories have been extensively studied as the simplest modified gravity scenario to realize the late-time acceleration. The model with a Lagrangian density \(f(R) = R - \alpha/R^n\) (α > 0, n > 0) was proposed for dark energy in the metric formalism [113, 120, 114, 143, 456]. However it was shown that this model is plagued by a matter instability [215, 244] as well as by a difficulty in satisfying local gravity constraints [469, 470, 245, 233, 154, 448, 134]. Moreover it does not possess a standard matter-dominated epoch because of a large coupling between dark energy and dark matter [28, 29]. These results show how non-trivial it is to obtain a viable f(R) model. Amendola et al. [26] derived conditions for the cosmological viability of f(R) dark energy models. In local regions whose densities are much larger than the homogeneous cosmological density, the models need to be close to GR for consistency with local gravity constraints. A number of viable f(R) models that can satisfy both cosmological and local gravity constraints have been proposed in [26, 382, 31, 306, 568, 35, 587, 206, 164, 396]. Since the law of gravity gets modified on large distances in f(R) models, this leaves several interesting observational signatures such as the modification to the spectra of galaxy clustering [146, 74, 544, 526, 251, 597, 493], CMB [627, 544, 382, 545], and weak lensing [595, 528]. In this review we will discuss these topics in detail, paying particular attention to the construction of viable f(R) models and resulting observational consequences.

The f(R) gravity in the metric formalism corresponds to generalized Brans-Dicke (BD) theory [100] with a BD parameter \(\omega_{\rm BD} = 0\) [467, 579, 152]. Unlike original BD theory [100], there exists a potential for a scalar-field degree of freedom (called “scalaron” [564]) with a gravitational origin. If the mass of the scalaron always remains as light as the present Hubble parameter \(H_0\), it is not possible to satisfy local gravity constraints due to the appearance of a long-range fifth force with a coupling of the order of unity. One can design the field potential of f(R) gravity such that the mass of the field is heavy in the region of high density. The viable f(R) models mentioned above have been constructed to satisfy such a condition. Then the interaction range of the fifth force becomes short in the region of high density, which allows the possibility that the models are compatible with local gravity tests. More precisely the existence of a matter coupling, in the Einstein frame, gives rise to an extremum of the effective field potential around which the field can be stabilized. As long as a spherically symmetric body has a “thin-shell” around its surface, the field is nearly frozen in most regions inside the body. Then the effective coupling between the field and non-relativistic matter outside the body can be strongly suppressed through the chameleon mechanism [344, 343]. The experiments for the violation of the equivalence principle as well as a number of solar system experiments place tight constraints on dark energy models based on f(R) theories [306, 251, 587, 134, 101].

The spherically symmetric solutions mentioned above have been derived on weak gravity backgrounds where the background metric is described by a Minkowski space-time. In strong gravitational backgrounds such as neutron stars and white dwarfs, we need to take into account the backreaction of gravitational potentials on the field equation. The structure of relativistic stars in f(R) gravity has been studied by a number of authors [349, 350, 594, 43, 600, 466, 42, 167]. Originally the difficulty of obtaining relativistic stars was pointed out in [349] in connection with the singularity problem of f(R) dark energy models in the high-curvature regime [266]. For constant density stars, however, a thin-shell field profile has been analytically derived in [594] for chameleon models in the Einstein frame. The existence of relativistic stars in f(R) gravity has also been confirmed numerically for stars with constant [43, 600] and varying [42] densities. In this review we shall also discuss this issue.

It is possible to extend f(R) gravity to generalized BD theory with a field potential and an arbitrary BD parameter \(\omega_{\rm BD}\). If we make a conformal transformation to the Einstein frame [213, 609, 408, 611, 249, 268], we can show that BD theory with a field potential corresponds to the coupled quintessence scenario [23] with a coupling Q between the field and non-relativistic matter. This coupling is related to the BD parameter via the relation \(1/(2Q^2) = 3 + 2\omega_{\rm BD}\) [343, 596]. One can recover GR by taking the limit \(Q \to 0\), i.e., \(\omega_{\rm BD} \to \infty\). The f(R) gravity in the metric formalism corresponds to \(Q = - 1/\sqrt 6\) [28], i.e., \(\omega_{\rm BD} = 0\). For large coupling models with \(\left\vert Q \right\vert = \mathcal{O}\left(1 \right)\) it is possible to design scalar-field potentials such that the chameleon mechanism works to reduce the effective matter coupling, while at the same time the field is sufficiently light to be responsible for the late-time cosmic acceleration. This generalized BD theory also leaves a number of interesting observational and experimental signatures [596].

In addition to the Ricci scalar R, one can construct other scalar quantities such as \(R_{\mu\nu}R^{\mu\nu}\) and \(R_{\mu\nu\rho\sigma}R^{\mu\nu\rho\sigma}\) from the Ricci tensor \(R_{\mu\nu}\) and Riemann tensor \(R_{\mu\nu\rho\sigma}\) [142]. For the Gauss-Bonnet (GB) curvature invariant defined by \(\mathcal{G} \equiv {R^2} - 4{R_{\alpha \beta}}{R^{\alpha \beta}} + {R_{\alpha \beta \gamma \delta}}{R^{\alpha \beta \gamma \delta}}\), it is known that one can avoid the appearance of spurious spin-2 ghosts [572, 67, 302] (see also [98, 465, 153, 447, 110, 181, 109]). In order to give rise to some contribution of the GB term to the Friedmann equation, we require that (i) the GB term couples to a scalar field ϕ, i.e., \(F\left(\phi \right)\mathcal{G}\) or (ii) the Lagrangian density f is a function of \(\mathcal{G}\), i.e., \(f\left({\mathcal G} \right)\). The GB coupling in the case (i) appears in low-energy string effective action [275] and cosmological solutions in such a theory have been studied extensively (see [34, 273, 105, 147, 588, 409, 468] for the construction of nonsingular cosmological solutions and [463, 360, 361, 593, 523, 452, 453, 381, 25] for the application to dark energy). In the case (ii) it is possible to construct viable models that are consistent with both the background cosmological evolution and local gravity constraints [458, 188, 189] (see also [165, 180, 178, 383, 633, 599]). However density perturbations in perfect fluids exhibit negative instabilities during both the radiation and the matter domination, irrespective of the form of \(f\left(\mathcal{G} \right)\) [383, 182]. This growth of perturbations gets stronger on smaller scales, which is difficult to reconcile with the observed galaxy spectrum unless the deviation from GR is very small. We shall review such theories as well as other modified gravity theories.

This review is organized as follows. In Section 2 we present the field equations of f(R) gravity in the metric formalism. In Section 3 we apply f(R) theories to the inflationary universe. Section 4 is devoted to the construction of cosmologically viable f(R) dark energy models. In Section 5 local gravity constraints on viable f(R) dark energy models will be discussed. In Section 6 we provide the equations of linear cosmological perturbations for general modified gravity theories including metric f(R) gravity as a special case. In Section 7 we study the spectra of scalar and tensor metric perturbations generated during inflation based on f(R) theories. In Section 8 we discuss the evolution of matter density perturbations in f(R) dark energy models and place constraints on model parameters from the observations of large-scale structure and CMB. Section 9 is devoted to the viability of the Palatini variational approach in f(R) gravity. In Section 10 we construct viable dark energy models based on BD theory with a potential as an extension of f(R) theories. In Section 11 the structure of relativistic stars in f(R) theories will be discussed in detail. In Section 12 we provide a brief review of Gauss-Bonnet gravity and resulting observational and experimental consequences. In Section 13 we discuss a number of other aspects of f(R) gravity and modified gravity. Section 14 is devoted to conclusions.

There are other review articles on f(R) gravity [556, 555, 618] and modified gravity [171, 459, 126, 397, 217]. Compared to those articles, we put more weight on observational and experimental aspects of f(R) theories. This is particularly useful to place constraints on inflation and dark energy models based on f(R) theories. Readers who are interested in the more detailed history of f(R) theories and fourth-order gravity may consult the review articles by Schmidt [531] and Sotiriou and Faraoni [556].

In this review we use units such that \(c = \hbar = k_B = 1\), where c is the speed of light, ħ is the reduced Planck constant, and \(k_B\) is Boltzmann’s constant. We define \({\kappa ^2} = 8\pi G = 8\pi/m_{{\rm{pl}}}^2 = 1/M_{{\rm{pl}}}^2\), where G is the gravitational constant, \(m_{\rm pl} = 1.22 \times 10^{19}\) GeV is the Planck mass with a reduced value \({M_{{\rm{pl}}}} = {m_{{\rm{pl}}}}/\sqrt {8\pi} = 2.44 \times {10^{18}}\,{\rm{GeV}}\). Throughout this review, we use a dot for the derivative with respect to cosmic time t and “,X” for the partial derivative with respect to the variable X, e.g., \(f_{,R} \equiv \partial f/\partial R\) and \(f_{,RR} \equiv \partial^2 f/\partial R^2\). We use the metric signature (−, +, +, +). The Greek indices μ and ν run from 0 to 3, whereas the Latin indices i and j run from 1 to 3 (spatial components).
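The relation between the Planck mass and its reduced value can be checked numerically. The following is a minimal sketch, assuming only the three-digit value of the Planck mass quoted above; the function names are illustrative and not part of the original text.

```python
import math

# Planck mass in GeV as quoted in the text (three significant digits).
M_PLANCK_GEV = 1.22e19

def reduced_planck_mass(m_pl: float) -> float:
    """Return the reduced Planck mass M_pl = m_pl / sqrt(8*pi)."""
    return m_pl / math.sqrt(8.0 * math.pi)

def kappa_squared(m_pl: float) -> float:
    """Return kappa^2 = 8*pi*G = 8*pi / m_pl^2 (in GeV^-2, with c = hbar = 1)."""
    return 8.0 * math.pi / m_pl**2

M_pl = reduced_planck_mass(M_PLANCK_GEV)
kappa2 = kappa_squared(M_PLANCK_GEV)

print(f"M_pl = {M_pl:.3e} GeV")
# kappa^2 * M_pl^2 should equal 1 by construction.
print(f"kappa^2 * M_pl^2 = {kappa2 * M_pl**2:.6f}")
```

The result lies within one percent of the quoted reduced value; the last digit depends on how many digits of the Planck mass are kept.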

2 Field Equations in the Metric Formalism

We start with the 4-dimensional action in f(R) gravity:

$$S = {1 \over {2{\kappa ^2}}}\int {{{\rm{d}}^4}x\sqrt {- g} f(R) +} \;\int {{{\rm{d}}^4}x} {{\mathcal L}_M}({g_{\mu \nu}},{\Psi _M}),$$
(2.1)

where \(\kappa^2 = 8\pi G\), g is the determinant of the metric \(g_{\mu\nu}\), and \({{\mathcal L}_M}\) is a matter Lagrangian that depends on \(g_{\mu\nu}\) and matter fields \(\Psi_M\). The Ricci scalar R is defined by \(R = g^{\mu\nu}R_{\mu\nu}\), where the Ricci tensor \(R_{\mu\nu}\) is

$${R_{\mu \nu}} = {R^\alpha}_{\mu \alpha \nu} = {\partial _\lambda}\Gamma _{\mu \nu}^\lambda - {\partial _\mu}\Gamma _{\lambda \nu}^\lambda + \Gamma _{\mu \nu}^\lambda \Gamma _{\rho \lambda}^\rho - \Gamma _{\nu \rho}^\lambda \Gamma _{\mu \lambda}^\rho.$$
(2.2)

In the case of the torsion-less metric formalism, the connections \(\Gamma _{\beta \gamma}^\alpha\) are the usual metric connections defined in terms of the metric tensor gμν, as

$$\Gamma _{\beta \gamma}^\alpha = {1 \over 2}{g^{\alpha \lambda}}\left({{{\partial {g_{\gamma \lambda}}} \over {\partial {x^\beta}}} + {{\partial {g_{\lambda \beta}}} \over {\partial {x^\gamma}}} - {{\partial {g_{\beta \gamma}}} \over {\partial {x^\lambda}}}} \right).$$
(2.3)

This follows from the metricity relation, \({\nabla _\lambda}{g_{\mu \nu}} = \partial {g_{\mu \nu}}/\partial {x^\lambda} - {g_{\rho \nu}}\Gamma _{\mu \lambda}^\rho - {g_{\mu \rho}}\Gamma _{\nu \lambda}^\rho = 0\).

2.1 Equations of motion

The field equation can be derived by varying the action (2.1) with respect to gμν:

$${\Sigma _{\mu \nu}} \equiv F(R){R_{\mu \nu}}(g) - {1 \over 2}f(R){g_{\mu \nu}} - {\nabla _\mu}{\nabla _\nu}F(R) + {g_{\mu \nu}}\square F(R) = {\kappa ^2}T_{\mu \nu}^{(M)},$$
(2.4)

where F(R) = ∂f/∂R. \(T_{\mu \nu}^{\left(M \right)}\) is the energy-momentum tensor of the matter fields defined by the variational derivative of \({{\mathcal L}_M}\) in terms of gμν:

$$T_{\mu \nu}^{(M)} = - {2 \over {\sqrt {- g}}}{{\delta {{\mathcal L}_M}} \over {\delta {g^{\mu \nu}}}}.$$
(2.5)

This satisfies the continuity equation

$${\nabla ^\mu}T_{\mu \nu}^{(M)} = 0,$$
(2.6)

as does \(\Sigma_{\mu\nu}\), i.e., \(\nabla^\mu \Sigma_{\mu\nu} = 0\). The trace of Eq. (2.4) gives

$$3\square F(R) + F(R)R - 2f(R) = {\kappa ^2}T,$$
(2.7)

where \(T = {g^{\mu \nu}}T_{\mu \nu}^{\left(M \right)}\) and \(\Box F = \left({1/\sqrt {- g}} \right){\partial _\mu}\left({\sqrt {- g} {g^{\mu \nu}}{\partial _\nu}F} \right)\).

Einstein gravity, without the cosmological constant, corresponds to f(R) = R and F(R) = 1, so that the term □F(R) in Eq. (2.7) vanishes. In this case we have \(R = -\kappa^2 T\) and hence the Ricci scalar R is directly determined by the matter (the trace T). In modified gravity the term □F(R) does not vanish in Eq. (2.7), which means that there is a propagating scalar degree of freedom, \(\varphi \equiv F(R)\). The trace equation (2.7) determines the dynamics of the scalar field φ (dubbed “scalaron” [564]).
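As a check on Eq. (2.7), the trace of Eq. (2.4) can be taken term by term. This sketch uses only \(g^{\mu\nu}g_{\mu\nu} = 4\) and \(g^{\mu\nu}\nabla_\mu\nabla_\nu F = \square F\):

$$g^{\mu\nu}\Sigma_{\mu\nu} = FR - 2f - \square F + 4\square F = 3\square F + FR - 2f = \kappa^2 T.$$

Setting f = R and F = 1 reduces this to \(-R = \kappa^2 T\), which is the GR relation.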

The field equation (2.4) can be written in the following form [568]

$${G_{\mu \nu}} = {\kappa ^2}\left({T_{\mu \nu}^{(M)} + T_{\mu \nu}^{(D)}} \right),$$
(2.8)

where \(G_{\mu\nu} \equiv R_{\mu\nu} - (1/2)g_{\mu\nu}R\) and

$${\kappa ^2}T_{\mu \nu}^{(D)} \equiv {g_{\mu \nu}}(f - R)/2 + {\nabla _\mu}{\nabla _\nu}F - {g_{\mu \nu}}\square F + (1 - F){R_{\mu \nu}}.$$
(2.9)

Since \(\nabla^\mu G_{\mu\nu} = 0\) and \({\nabla ^\mu}T_{\mu \nu}^{\left(M \right)} = 0\), it follows that

$${\nabla ^\mu}T_{\mu \nu}^{(D)} = 0.$$
(2.10)

Hence the continuity equation holds, not only for \(\Sigma_{\mu\nu}\), but also for the effective energy-momentum tensor \(T_{\mu \nu}^{\left(D \right)}\) defined in Eq. (2.9). This is sometimes convenient when we study the dark energy equation of state [306, 568] as well as the equilibrium description of thermodynamics for the horizon entropy [53].

There exists a de Sitter point that corresponds to a vacuum solution (T = 0) at which the Ricci scalar is constant. Since □F(R) = 0 at this point, we obtain

$$F(R)R - 2f(R) = 0.$$
(2.11)

The model \(f(R) = \alpha R^2\) satisfies this condition, so that it gives rise to the exact de Sitter solution [564]. In the model \(f(R) = R + \alpha R^2\), because of the linear term in R, the inflationary expansion ends when the term \(\alpha R^2\) becomes smaller than the linear term R (as we will see in Section 3). This is followed by a reheating stage in which the oscillation of R leads to gravitational particle production. It is also possible to use the de Sitter point given by Eq. (2.11) for dark energy.
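The de Sitter condition (2.11) can be verified directly for the two models just mentioned. A short worked check, using F = ∂f/∂R:

$$f = \alpha R^2:\quad FR - 2f = 2\alpha R^2 - 2\alpha R^2 = 0\quad {\rm for\ any\ constant}\ R,$$

$$f = R + \alpha R^2:\quad FR - 2f = (1 + 2\alpha R)R - 2(R + \alpha R^2) = - R.$$

In the second model the condition holds exactly only at R = 0, so the de Sitter expansion is approximate and is realized only while the \(\alpha R^2\) term dominates over the linear term.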

We consider the spatially flat Friedmann-Lemaître-Robertson-Walker (FLRW) spacetime with a time-dependent scale factor a(t) and a metric

$${\rm{d}}{s^2} = {g_{\mu \nu}}{\rm{d}}{x^\mu}{\rm{d}}{x^\nu} = - {\rm{d}}{t^2} + {a^2}(t){\rm{d}}{x^2},$$
(2.12)

where t is cosmic time. For this metric the Ricci scalar R is given by

$$R = 6(2{H^2} + \dot H),$$
(2.13)

where \(H \equiv \dot a/a\) is the Hubble parameter and a dot stands for a derivative with respect to t. The present value of H is given by

$${H_0} = 100\;h\;{\rm{km}}\;{\sec ^{- 1}}{\rm{Mp}}{{\rm{c}}^{- 1}} = 2.1332\;h \times {10^{- 42}}{\rm{GeV,}}$$
(2.14)

where h = 0.72 ± 0.08 describes the uncertainty of \(H_0\) [264].
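The conversion in Eq. (2.14) can be reproduced with a short unit calculation. This is a sketch under stated assumptions: the megaparsec length and the value of ħ in GeV s are standard constants inserted here for illustration, and the function name is hypothetical.

```python
# Convert H0 = 100 h km/s/Mpc into natural units (GeV) using hbar = 1.
# The two constants below are standard values, not taken from the text.
KM_PER_MPC = 3.0856775814913673e19   # kilometres in one megaparsec
HBAR_GEV_S = 6.582119569e-25         # reduced Planck constant in GeV * s

def hubble_in_gev(h: float) -> float:
    """Return H0 in GeV for H0 = 100 * h km/s/Mpc."""
    h0_per_second = 100.0 * h / KM_PER_MPC   # H0 expressed in s^-1
    return h0_per_second * HBAR_GEV_S        # multiply by hbar to get an energy

coefficient = hubble_in_gev(1.0)
print(f"H0 = {coefficient:.4e} h GeV")
```

The coefficient agrees with the value in Eq. (2.14) to well within 0.1%; the small residual reflects the precision of the constants used.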

The energy-momentum tensor of matter is given by \({T^\mu}_\nu ^{\left(M \right)} = {\rm{diag}}\left({- {\rho _M},\,{P_M},\,{P_M},\,{P_M}} \right)\), where \(\rho_M\) is the energy density and \(P_M\) is the pressure. The field equations (2.4) in the flat FLRW background give

$$3F{H^2} = (FR - f)/2 - 3H\dot F + {\kappa ^2}{\rho _M},$$
(2.15)
$$- 2F\dot H = \ddot F - H\dot F + {\kappa ^2}({\rho _M} + {P_M}),$$
(2.16)

where the perfect fluid satisfies the continuity equation

$${\dot \rho _M} + 3H({\rho _M} + {P_M}) = 0{.}$$
(2.17)

We also introduce the equation of state of matter, \(w_M \equiv P_M/\rho_M\). As long as \(w_M\) is constant, the integration of Eq. (2.17) gives \({\rho _M} \propto {a^{- 3\left({1 + {w_M}} \right)}}\). In Section 4 we shall take into account both non-relativistic matter (\(w_M = 0\)) and radiation (\(w_r = 1/3\)) to discuss cosmological dynamics of f(R) dark energy models.
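As a consistency check, Eqs. (2.15) and (2.16) can be evaluated in the GR limit (f = R, F = 1) on a matter-dominated background with \(a \propto t^{2/3}\), where both should be satisfied identically. The sketch below assumes units with κ² = 1 and evaluates everything at t = 1; the helper names `ricci_scalar` and `friedmann_residuals` are hypothetical.

```python
from fractions import Fraction

def ricci_scalar(H, H_dot):
    """Ricci scalar of the flat FLRW metric, Eq. (2.13)."""
    return 6 * (2 * H**2 + H_dot)

def friedmann_residuals(f, F, F_dot, F_ddot, H, H_dot, rho, P, kappa2=1):
    """Return (lhs - rhs) of Eqs. (2.15) and (2.16); both vanish on a solution."""
    R = ricci_scalar(H, H_dot)
    res_215 = 3 * F * H**2 - ((F * R - f) / 2 - 3 * H * F_dot + kappa2 * rho)
    res_216 = -2 * F * H_dot - (F_ddot - H * F_dot + kappa2 * (rho + P))
    return res_215, res_216

# Matter era in GR: a ~ t^(2/3), so H = 2/(3t) and dH/dt = -2/(3t^2).
t = Fraction(1)
H = Fraction(2, 3) / t
H_dot = -Fraction(2, 3) / t**2
rho = 3 * H**2            # GR Friedmann equation with kappa^2 = 1
R = ricci_scalar(H, H_dot)

# GR limit: f = R, F = 1, and all time derivatives of F vanish.
residuals = friedmann_residuals(f=R, F=1, F_dot=0, F_ddot=0,
                                H=H, H_dot=H_dot, rho=rho, P=0)
print(residuals, R)
```

Exact rational arithmetic is used so that the residuals vanish identically rather than to rounding error. The same function can be fed a non-trivial F(t) to test candidate solutions of a given f(R) model.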

Note that there are some works about the Einstein static universes in f(R) gravity [91, 532]. Although Einstein static solutions exist for a wide variety of f(R) models in the presence of a barotropic perfect fluid, these solutions have been shown to be unstable against either homogeneous or inhomogeneous perturbations [532].

2.2 Equivalence with Brans-Dicke theory

The f(R) theory in the metric formalism can be cast in the form of Brans-Dicke (BD) theory [100] with a potential for the effective scalar-field degree of freedom (scalaron). Let us consider the following action with a new field χ,

$$S = {1 \over {2{\kappa ^2}}}\int {{{\rm{d}}^4}} x\sqrt {- g} [f(\chi) + {f_{,\chi}}(\chi)(R - \chi)] + \int {{{\rm{d}}^4}} x{{\mathcal L}_M}({g_{\mu \nu}},{\Psi _M}){.}$$
(2.18)

Varying this action with respect to χ, we obtain

$${f_{,\chi \chi}}(\chi)(R - \chi) = 0{.}$$
(2.19)

Provided \(f_{,\chi\chi}(\chi) \neq 0\) it follows that χ = R. Hence the action (2.18) recovers the action (2.1) in f(R) gravity. If we define

$$\varphi \equiv {f_{,\chi}}(\chi),$$
(2.20)

the action (2.18) can be expressed as

$$S = \int {{{\rm{d}}^4}} x\sqrt {- g} \left[ {{1 \over {2{\kappa ^2}}}\varphi R - U(\varphi)} \right] + \int {{{\rm{d}}^4}} x{{\mathcal L}_M}({g_{\mu \nu}},{\Psi _M}),$$
(2.21)

where U(φ) is a field potential given by

$$U(\varphi) = {{\chi (\varphi)\varphi - f(\chi (\varphi))} \over {2{\kappa ^2}}}.$$
(2.22)
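For illustration, consider the model \(f(R) = R + \alpha R^2\) discussed in the Introduction. A short worked example of Eqs. (2.20) and (2.22) under that choice gives

$$\varphi = 1 + 2\alpha \chi, \quad \chi (\varphi) = {{\varphi - 1} \over {2\alpha}}, \quad U(\varphi) = {{\chi \varphi - \chi - \alpha {\chi ^2}} \over {2{\kappa ^2}}} = {{\alpha {\chi ^2}} \over {2{\kappa ^2}}} = {{{{(\varphi - 1)}^2}} \over {8\alpha {\kappa ^2}}}.$$

In this case the potential is quadratic around φ = 1, the value at which the theory reduces to GR.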

Meanwhile the action in BD theory [100] with a potential U(φ) is given by

$$S = \int {{{\rm{d}}^4}} x\sqrt {- g} \left[ {{1 \over 2}\varphi R - {{{\omega _{{\rm{BD}}}}} \over {2\varphi}}{{(\nabla \varphi)}^2} - U(\varphi)} \right] + \int {{{\rm{d}}^4}} x{{\mathcal L}_M}({g_{\mu \nu}},{\Psi _M}),$$
(2.23)

where \(\omega_{\rm BD}\) is the BD parameter and \((\nabla\varphi)^2 \equiv g^{\mu\nu}\partial_\mu\varphi\,\partial_\nu\varphi\). Comparing Eq. (2.21) with Eq. (2.23), it follows that f(R) theory in the metric formalism is equivalent to BD theory with the parameter \(\omega_{\rm BD} = 0\) [467, 579, 152] (in units where \(\kappa^2 = 1\)). In Palatini f(R) theory where the metric \(g_{\mu\nu}\) and the connection \(\Gamma _{\beta \gamma}^\alpha\) are treated as independent variables, the Ricci scalar is different from that in metric f(R) theory. As we will see in Sections 9.1 and 10.1, f(R) theory in the Palatini formalism is equivalent to BD theory with the parameter \(\omega_{\rm BD} = -3/2\).

2.3 Conformal transformation

The action (2.1) in f(R) gravity corresponds to a non-linear function f in terms of R. It is possible to derive an action in the Einstein frame under the conformal transformation [213, 609, 408, 611, 249, 268, 410]:

$${\tilde g_{\mu \nu}} = {\Omega ^2}{g_{\mu \nu}},$$
(2.24)

where \(\Omega^2\) is the conformal factor and a tilde represents quantities in the Einstein frame. The Ricci scalars R and \(\tilde R\) in the two frames have the following relation

$$R = {\Omega ^2}(\tilde R + 6\tilde \square\omega - 6{\tilde g^{\mu \nu}}{\partial _\mu}\omega {\partial _\nu}\omega),$$
(2.25)

where

$$\omega \equiv \ln \Omega, \quad \quad {\partial _\mu}\omega \equiv {{\partial \omega} \over {\partial {{\tilde x}^\mu}}},\quad \quad \tilde\square \omega \equiv {1 \over {\sqrt {- \tilde g}}}{\partial _\mu}(\sqrt {- \tilde g} {\tilde g^{\mu \nu}}{\partial _\nu}\omega){.}$$
(2.26)

We rewrite the action (2.1) in the form

$$S = \int {{{\rm{d}}^4}} x\sqrt {- g} \left({{1 \over {2{\kappa ^2}}}FR - U} \right) + \int {{{\rm{d}}^4}} x{{\mathcal L}_M}({g_{\mu \nu}},{\Psi _M}),$$
(2.27)

where

$$U = {{FR - f} \over {2{\kappa ^2}}}.$$
(2.28)

Using Eq. (2.25) and the relation \(\sqrt {- g} = {\Omega ^{- 4}}\sqrt {- \tilde g}\), the action (2.27) is transformed as

$$S = \int {{{\rm{d}}^4}} x\sqrt {- \tilde g} \left[ {{1 \over {2{\kappa ^2}}}F{\Omega ^{- 2}}(\tilde R + 6\tilde \square\omega - 6{{\tilde g}^{\mu \nu}}{\partial _\mu}\omega {\partial _\nu}\omega) - {\Omega ^{- 4}}U} \right] + \int {{{\rm{d}}^4}} x{{\mathcal L}_M}({\Omega ^{- 2}}{\tilde g_{\mu \nu}},{\Psi _M}){.}$$
(2.29)

We obtain the Einstein frame action (linear action in \(\tilde R\)) for the choice

$${\Omega ^2} = F.$$
(2.30)

This choice is consistent if F > 0. We introduce a new scalar field ϕ defined by

$$\kappa \phi \equiv \sqrt {3/2} \ln F.$$
(2.31)

From the definition of ω in Eq. (2.26) we have \(\omega = \kappa \phi/\sqrt 6\). Using Eq. (2.26), the integral \(\int {{{\rm{d}}^4}x} \sqrt {- \tilde g} \tilde \Box \omega\) vanishes on account of Gauss’s theorem. Then the action in the Einstein frame is

$${S_E} = \int {{{\rm{d}}^4}} x\sqrt {- \tilde g} \left[ {{1 \over {2{\kappa ^2}}}\tilde R - {1 \over 2}{{\tilde g}^{\mu \nu}}{\partial _\mu}\phi {\partial _\nu}\phi - V(\phi)} \right] + \int {{{\rm{d}}^4}} x{{\mathcal L}_M}({F^{- 1}}(\phi){\tilde g_{\mu \nu}},{\Psi _M}),$$
(2.32)

where

$$V(\phi) = {U \over {{F^2}}} = {{FR - f} \over {2{\kappa ^2}{F^2}}}.$$
(2.33)
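To make Eqs. (2.31) and (2.33) concrete, the sketch below maps Jordan-frame curvatures to Einstein-frame values (ϕ, V) for the model \(f(R) = R + \alpha R^2\) and compares them with the closed form \(V = (1 - e^{-\sqrt{2/3}\kappa\phi})^2/(8\alpha\kappa^2)\) that follows from eliminating R. The value of α, the choice κ = 1, and the function names are illustrative assumptions.

```python
import math

def einstein_frame_point(R, alpha, kappa=1.0):
    """Map a Jordan-frame curvature R to (phi, V) for f(R) = R + alpha * R**2."""
    f = R + alpha * R**2
    F = 1.0 + 2.0 * alpha * R                      # F = df/dR, must be positive
    phi = math.sqrt(1.5) * math.log(F) / kappa     # Eq. (2.31)
    V = (F * R - f) / (2.0 * kappa**2 * F**2)      # Eq. (2.33)
    return phi, V

def closed_form_potential(phi, alpha, kappa=1.0):
    """V(phi) after eliminating R; it approaches 1 / (8 * alpha * kappa**2)."""
    decay = math.exp(-math.sqrt(2.0 / 3.0) * kappa * phi)
    return (1.0 - decay) ** 2 / (8.0 * alpha * kappa**2)

alpha = 0.5  # illustrative value, not taken from the text
samples = [einstein_frame_point(R, alpha) for R in (0.1, 1.0, 10.0, 1000.0)]
for phi, V in samples:
    print(f"phi = {phi:.4f}, V = {V:.6f}, closed form = {closed_form_potential(phi, alpha):.6f}")
```

At large R the potential flattens toward the plateau \(1/(8\alpha\kappa^2)\), which is the feature that allows slow-roll inflation in the Einstein-frame description.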

Hence the Lagrangian density of the field ϕ is given by \({{\mathcal L}_\phi} = - {1 \over 2}{\tilde g^{\mu \nu}}{\partial _\mu}\phi {\partial _\nu}\phi - V\left(\phi \right)\) with the energy-momentum tensor

$$\tilde T_{\mu \nu}^{(\phi)} = - {2 \over {\sqrt {- \tilde g}}}{{\delta (\sqrt {- \tilde g} {{\mathcal L}_\phi})} \over {\delta {{\tilde g}^{\mu \nu}}}} = {\partial _\mu}\phi {\partial _\nu}\phi - {\tilde g_{\mu \nu}}\left[ {{1 \over 2}{{\tilde g}^{\alpha \beta}}{\partial _\alpha}\phi {\partial _\beta}\phi + V(\phi)} \right].$$
(2.34)

The conformal factor \({\Omega ^2} = F = \exp \left({\sqrt {2/3} \kappa \phi} \right)\) is field-dependent. From the matter action (2.32) the scalar field ϕ is directly coupled to matter in the Einstein frame. In order to see this more explicitly, we take the variation of the action (2.32) with respect to the field ϕ:

$$- {\partial _\mu}\left({{{\partial (\sqrt {- \tilde g} {{\mathcal L}_\phi})} \over {\partial ({\partial _\mu}\phi)}}} \right) + {{\partial (\sqrt {- \tilde g} {{\mathcal L}_\phi})} \over {\partial \phi}} + {{\partial {{\mathcal L}_M}} \over {\partial \phi}} = 0,$$
(2.35)

that is

$$\tilde \square\phi - {V_{,\phi}} + {1 \over {\sqrt {- \tilde g}}}{{\partial {{\mathcal L}_M}} \over {\partial \phi}} = 0,\quad {\rm{where}}\quad \tilde \square\phi \equiv {1 \over {\sqrt {- \tilde g}}}{\partial _\mu}(\sqrt {- \tilde g} {\tilde g^{\mu \nu}}{\partial _\nu}\phi).$$
(2.36)

Using Eq. (2.24) and the relations \(\sqrt {- \tilde g} = {F^2}\sqrt {- g}\) and \({\tilde g^{\mu \nu}} = {F^{- 1}}{g^{\mu \nu}}\), the energy-momentum tensor of matter is transformed as

$$\tilde T_{\mu \nu}^{(M)} = - {2 \over {\sqrt {- \tilde g}}}{{\delta {{\mathcal L}_M}} \over {\delta {{\tilde g}^{\mu \nu}}}} = {{T_{\mu \nu}^{(M)}} \over F}.$$
(2.37)

The energy-momentum tensor of perfect fluids in the Einstein frame is given by

$$\tilde T_{\;\;\;\nu}^{\mu (M)} = {\rm diag} (-{\tilde \rho _M},\tilde P_{M},\tilde P_{M},\tilde P_{M}) = {\rm diag}(- \rho _{M}/{F^2},{P_M}/{F^2},{P_M}/{F^2},{P_M}/{F^2}){.}$$
(2.38)

The derivative of the Lagrangian density \({{\mathcal L}_M} = {{\mathcal L}_M}\left({{g_{\mu \nu}}} \right) = {{\mathcal L}_M}\left({{F^{- 1}}\left(\phi \right){{\tilde g}_{\mu \nu}}} \right)\) with respect to ϕ is

$${{\partial {{\mathcal L}_M}} \over {\partial \phi}} = {{\delta {{\mathcal L}_M}} \over {\delta {g^{\mu \nu}}}}{{\partial {g^{\mu \nu}}} \over {\partial \phi}} = {1 \over {F(\phi)}}{{\delta {{\mathcal L}_M}} \over {\delta {{\tilde g}^{\mu \nu}}}}{{\partial (F(\phi){{\tilde g}^{\mu \nu}})} \over {\partial \phi}} = - \sqrt {- \tilde g} {{{F_{,\phi}}} \over {2F}}\tilde T_{\mu \nu}^{(M)}{\tilde g^{\mu \nu}}.$$
(2.39)

The strength of the coupling between the field and matter can be quantified by the following quantity

$$Q \equiv - {{{F_{,\phi}}} \over {2\kappa F}} = - {1 \over {\sqrt 6}},$$
(2.40)

which is constant in f(R) gravity [28]. It then follows that

$${{\partial {{\mathcal L}_M}} \over {\partial \phi}} = \sqrt {- \tilde g} \kappa Q\tilde T,$$
(2.41)

where \(\tilde T = {\tilde g_{\mu \nu}}{\tilde T^{\mu \nu \left(M \right)}} = - {\tilde \rho _M} + 3{\tilde P_M}\). Substituting Eq. (2.41) into Eq. (2.36), we obtain the field equation in the Einstein frame:

$$\tilde \square\phi - {V_{,\phi}} + \kappa Q\tilde T = 0.$$
(2.42)

This shows that the field ϕ is directly coupled to matter apart from radiation \(\left({\tilde T = 0} \right)\).
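The constant value of Q in Eq. (2.40) follows directly from the definition (2.31), which gives \(F = \exp (\sqrt {2/3} \kappa \phi)\):

$${F_{,\phi}} = \sqrt {2/3} \,\kappa F\quad \Rightarrow \quad Q = - {{{F_{,\phi}}} \over {2\kappa F}} = - {1 \over 2}\sqrt {{2 \over 3}} = - {1 \over {\sqrt 6}} \simeq - 0.41.$$

Inserting this into the relation \(1/(2Q^2) = 3 + 2\omega_{\rm BD}\) quoted in the Introduction gives \(3 = 3 + 2\omega_{\rm BD}\), i.e., \(\omega_{\rm BD} = 0\), in agreement with Section 2.2.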

Let us consider the flat FLRW spacetime with the metric (2.12) in the Jordan frame. The metric in the Einstein frame is given by

$$\begin{array}{*{20}c} {{\rm{d}}{{\tilde s}^2} = {\Omega ^2}{\rm{d}}{s^2} = F(- {\rm{d}}{t^2} + {a^2}(t){\rm{d}}{x^2}),\quad \quad \quad \;\;\;}\\ {= - {\rm{d}}{{\tilde t}^2} + {{\tilde a}^2}(\tilde t){\rm{d}}{x^2},}\\ \end{array}$$
(2.43)

which leads to the following relations (for F > 0)

$${\rm{d}}\tilde t = \sqrt F {\rm{d}}t,\quad \tilde a = \sqrt F a,$$
(2.44)

where

$$F = {e^{- 2Q\kappa \phi}}.$$
(2.45)

Note that Eq. (2.45) comes from the integration of Eq. (2.40) for constant Q. The field equation (2.42) can be expressed as

$${{{{\rm{d}}^2}\phi} \over {{\rm{d}}{{\tilde t}^2}}} + 3\tilde H{{{\rm{d}}\phi} \over {{\rm{d}}\tilde t}} + {V_{,\phi}} = - \kappa Q({\tilde \rho _M} - 3{\tilde P_M}),$$
(2.46)

where

$$\tilde H \equiv {1 \over {\tilde a}}{{{\rm{d}}\tilde a} \over {{\rm{d}}\tilde t}} = {1 \over {\sqrt F}}\left({H + {{\dot F} \over {2F}}} \right).$$
(2.47)
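The second equality in Eq. (2.47) follows from the relations (2.44) by the chain rule:

$$\tilde H = {1 \over {\tilde a}}{{{\rm{d}}\tilde a} \over {{\rm{d}}\tilde t}} = {1 \over {\sqrt F a}}{1 \over {\sqrt F}}{{{\rm{d}}(\sqrt F a)} \over {{\rm{d}}t}} = {1 \over {Fa}}\left({\sqrt F \dot a + {{a\dot F} \over {2\sqrt F}}} \right) = {1 \over {\sqrt F}}\left({H + {{\dot F} \over {2F}}} \right).$$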

Defining the energy density \({\tilde \rho _\phi} = {1 \over 2}{\left({{\rm{d}}\phi/{\rm{d}}\tilde t} \right)^2} + V\left(\phi \right)\) and the pressure \({\tilde P_\phi} = {1 \over 2}{\left({{\rm{d}}\phi/{\rm{d}}\tilde t} \right)^2} - V\left(\phi \right)\), Eq. (2.46) can be written as

$${{{\rm{d}}{{\tilde \rho}_\phi}} \over {{\rm{d}}\tilde t}} + 3\tilde H({\tilde \rho _\phi} + {\tilde P_\phi}) = - \kappa Q({\tilde \rho _M} - 3{\tilde P_M}){{{\rm{d}}\phi} \over {{\rm{d}}\tilde t}}.$$
(2.48)

Under the transformation (2.44) together with \({\rho _M} = {F^2}{\tilde \rho _M},\,{P_M} = {F^2}{\tilde P_M}\), and \(H = {F^{1/2}}[\tilde H - ({\rm{d}}F/{\rm{d}}\tilde t)/2F]\), the continuity equation (2.17) is transformed as

$${{{\rm{d}}{{\tilde \rho}_M}} \over {{\rm{d}}\tilde t}} + 3\tilde H({\tilde \rho _M} + {\tilde P_M}) = \kappa Q({\tilde \rho _M} - 3{\tilde P_M}){{{\rm{d}}\phi} \over {{\rm{d}}\tilde t}}.$$
(2.49)

Equations (2.48) and (2.49) show that the field and matter interact with each other, while the total energy density \({\tilde \rho _T} = {\tilde \rho _\phi} + {\tilde \rho _M}\) and the pressure \({\tilde P_T} = {\tilde P_\phi} + {\tilde P_M}\) satisfy the continuity equation \({\rm{d}}{\tilde \rho _T}/{\rm{d}}\tilde t + 3\tilde H\left({{{\tilde \rho}_T} + {{\tilde P}_T}} \right) = 0\). More generally, Eqs. (2.48) and (2.49) can be expressed in terms of the energy-momentum tensors defined in Eqs. (2.34) and (2.37):

$${\tilde \nabla _\mu}\tilde T_\nu ^{\mu (\phi)} = - Q\tilde T{\tilde \nabla _\nu}\phi, \quad {\tilde \nabla _\mu}\tilde T_\nu ^{\mu (M)} = Q\tilde T{\tilde \nabla _\nu}\phi,$$
(2.50)

which correspond to the same equations in coupled quintessence studied in [23] (see also [22]).

In the absence of a field potential V(ϕ) (i.e., for a massless field) the field mediates a long-range fifth force with a large coupling (∣Q∣ ≃ 0.4), which contradicts experimental tests in the solar system. In f(R) gravity a field potential of gravitational origin is present, which allows compatibility with local gravity tests through the chameleon mechanism [344, 343].

In f(R) gravity the field ϕ is coupled to non-relativistic matter (dark matter, baryons) with a universal coupling \(Q = - 1/\sqrt 6\). We consider the frame in which the baryons obey the standard continuity equation \({\rho _m} \propto {a^{- 3}}\), i.e., the Jordan frame, as the “physical” frame in which physical quantities are compared with observations and experiments. It is sometimes convenient to refer to the Einstein frame, in which a canonical scalar field is coupled to non-relativistic matter. In both frames we are treating the same physics, but the use of different time and length scales gives rise to an apparent difference between the observables in the two frames. Our attitude throughout the review is to discuss observables in the Jordan frame. When we transform to the Einstein frame for convenience, we go back to the Jordan frame to discuss physical quantities.

3 Inflation in f(R) Theories

Most models of inflation in the early universe are based on scalar fields appearing in superstring and supergravity theories. Meanwhile, the first inflation model proposed by Starobinsky [564] is related to the conformal anomaly in quantum gravity (see Footnote 3). Unlike models such as “old inflation” [339, 291, 524], this scenario is not plagued by the graceful exit problem: the period of cosmic acceleration is followed by the radiation-dominated epoch with a transient matter-dominated phase [565, 606, 426]. Moreover, it predicts nearly scale-invariant spectra of gravitational waves and temperature anisotropies consistent with CMB observations [563, 436, 566, 355, 315]. In this section we review the dynamics of inflation and reheating. In Section 7 we will discuss the power spectra of scalar and tensor perturbations generated in f(R) inflation models.

3.1 Inflationary dynamics

We consider the models of the form

$$f(R) = R + \alpha {R^n},\quad (\alpha > 0,n > 0),$$
(3.1)

which include Starobinsky’s model [564] as a specific case (n = 2). In the absence of the matter fluid (ρM = 0), Eq. (2.15) gives

$$3(1 + n\alpha {R^{n - 1}}){H^2} = {1 \over 2}(n - 1)\alpha {R^n} - 3n(n - 1)\alpha H{R^{n - 2}}\dot R.$$
(3.2)

The cosmic acceleration can be realized in the regime \(F = 1 + n\alpha {R^{n - 1}} \gg 1\). Under the approximation \(F \simeq n\alpha {R^{n - 1}}\), we divide Eq. (3.2) by \(3n\alpha {R^{n - 1}}\) to give

$${H^2} \simeq {{n - 1} \over {6n}}\left({R - 6nH{{\dot R} \over R}} \right).$$
(3.3)

During inflation the Hubble parameter H evolves slowly so that one can use the approximations \(\vert \dot H/{H^2}\vert \ll 1\) and \(\vert \ddot H/(H\dot H)\vert \ll 1\). Then Eq. (3.3) reduces to

$${{\dot H} \over {{H^2}}} \simeq - {\epsilon _1},\quad \;\;{\epsilon _1} = {{2 - n} \over {(n - 1)(2n - 1)}}.$$
(3.4)

Integrating this equation for ϵ1 > 0, we obtain the solution

$$H \simeq {1 \over {{\epsilon _1}t}},\quad \;\;a \propto {t^{1/{\epsilon _1}}}.$$
(3.5)

The cosmic acceleration occurs for ϵ1 < 1, i.e., \(n > \left({1 + \sqrt 3} \right)/2\). When n = 2 one has ϵ1 = 0, so that H is constant in the regime F ≫ 1. The models with n > 2 lead to super inflation characterized by \(\dot H > 0\) and \(a \propto {\left\vert {{t_0} - t} \right\vert^{- 1/\left\vert {{\epsilon_1}} \right\vert}}\) (t0 is a constant). Hence the standard inflation with decreasing H occurs for \(\left({1 + \sqrt 3} \right)/2 < n < 2\).
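The classification above can be checked numerically. The following is a minimal sketch that simply evaluates Eq. (3.4); the function name `epsilon1` and the sample values of n are chosen here for illustration only.

```python
import math

def epsilon1(n: float) -> float:
    """Slow-roll parameter of Eq. (3.4) for f(R) = R + alpha R^n in the regime F >> 1."""
    return (2.0 - n) / ((n - 1.0) * (2.0 * n - 1.0))

# Lower edge of the accelerating range quoted in the text.
n_crit = (1.0 + math.sqrt(3.0)) / 2.0
print(f"n_crit = {n_crit:.4f}, eps1(n_crit) = {epsilon1(n_crit):.6f}")

for n in (1.5, 1.9, 2.0, 2.5):
    eps = epsilon1(n)
    if eps > 0.0:
        regime = "power-law inflation with decreasing H"
    elif eps == 0.0:
        regime = "H constant"
    else:
        regime = "super inflation (H increasing)"
    print(f"n = {n:.2f}  eps1 = {eps:+.4f}  {regime}")
```

At n = n_crit the parameter equals unity up to rounding, which marks the boundary of accelerated expansion, and it changes sign at n = 2.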

In the following let us focus on Starobinsky’s model given by

$$f(R) = R + {R^2}/(6{M^2}),$$
(3.6)

where the constant M has a dimension of mass. The presence of the linear term in R eventually causes inflation to end. Without neglecting this linear term, the combination of Eqs. (2.15) and (2.16) gives

$$\ddot H - {{{{\dot H}^2}} \over {2H}} + {1 \over 2}{M^2}H = - 3H\dot H,$$
(3.7)
$$\ddot R + 3H\dot R + {M^2}R = 0.$$
(3.8)

During inflation the first two terms in Eq. (3.7) can be neglected relative to the others, which gives \(\dot H \simeq - {M^2}/6\). We then obtain the solution

$$H \simeq {H_i} - ({M^2}/6)(t - {t_i}),$$
(3.9)
$$a \simeq {a_i}\exp [{H_i}(t - {t_i}) - ({M^2}/12){(t - {t_i})^2}],$$
(3.10)
$$R \simeq 12{H^2} - {M^2},$$
(3.11)

where Hi and ai are the Hubble parameter and the scale factor at the onset of inflation (t = ti), respectively. This inflationary solution is a transient attractor of the dynamical system [407]. The accelerated expansion continues as long as the slow-roll parameter

$${\epsilon _1} = - {{\dot H} \over {{H^2}}} \simeq {{{M^2}} \over {6{H^2}}},$$
(3.12)

is smaller than the order of unity, i.e., \({H^2} \gtrsim {M^2}\). One can also check that the approximate relation \(3H\dot R + {M^2}R \simeq 0\) holds in Eq. (3.8) by using \(R \simeq 12{H^2}\). The end of inflation (at time t = tf) is characterized by the condition ϵf ≃ 1, i.e., \({H_f} \simeq M/\sqrt 6\). From Eq. (3.11) this corresponds to the epoch at which the Ricci scalar decreases to \(R \simeq {M^2}\). As we will see later, the WMAP normalization of the CMB temperature anisotropies constrains the mass scale to be \(M \simeq {10^{13}}\) GeV. Note that the phase space analysis for the model (3.6) was carried out in [407, 24, 131].

We define the number of e-foldings from t = ti to t = tf:

$$N \equiv \int\nolimits_{{t_i}}^{{t_f}} H \;{\rm{d}}t \simeq {H_i}({t_f} - {t_i}) - {{{M^2}} \over {12}}{({t_f} - {t_i})^2}.$$
(3.13)

Since inflation ends at \({t_f} \simeq {t_i} + 6{H_i}/{M^2}\), it follows that

$$N \simeq {{3H_i^2} \over {{M^2}}} \simeq {1 \over {2{\epsilon _1}({t_i})}},$$
(3.14)

where we used Eq. (3.12) in the last approximate equality. In order to solve the horizon and flatness problems of big-bang cosmology we require that N ≳ 70 [391], i.e., \({\epsilon _1}({t_i}) \lesssim 7 \times {10^{- 3}}\). The CMB temperature anisotropies correspond to the perturbations whose wavelengths crossed the Hubble radius around N = 55–60 before the end of inflation.
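The step from Eq. (3.13) to Eq. (3.14) can be written out as follows. Since \({H_f} \simeq M/\sqrt 6 \ll {H_i}\), Eq. (3.9) gives \({t_f} - {t_i} \simeq 6{H_i}/{M^2}\), and substituting this into Eq. (3.13) yields

$$N \simeq {H_i}{{6{H_i}} \over {{M^2}}} - {{{M^2}} \over {12}}{\left({{{6{H_i}} \over {{M^2}}}} \right)^2} = {{6H_i^2} \over {{M^2}}} - {{3H_i^2} \over {{M^2}}} = {{3H_i^2} \over {{M^2}}}.$$

With \({\epsilon _1}({t_i}) \simeq {M^2}/(6H_i^2)\) from Eq. (3.12) this is \(N \simeq 1/[2{\epsilon _1}({t_i})]\), so that N ≳ 70 corresponds to \({\epsilon _1}({t_i}) \lesssim 1/140 \approx 7 \times {10^{- 3}}\).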

3.2 Dynamics in the Einstein frame

Let us consider inflationary dynamics in the Einstein frame for the model (3.6) in the absence of matter fluids \(\left({{\mathcal{L}_M} = 0} \right)\). The action in the Einstein frame corresponds to (2.32) with a field ϕ defined by

$$\phi = \sqrt {{3 \over 2}} {1 \over \kappa}\ln F = \sqrt {{3 \over 2}} {1 \over \kappa}\ln \left({1 + {R \over {3{M^2}}}} \right).$$
(3.15)

Using this relation, the field potential (2.33) reads [408, 61, 63]

$$V(\phi) = {{3{M^2}} \over {4{\kappa ^2}}}{\left({1 - {e^{- \sqrt {2/3} \kappa \phi}}} \right)^2}.$$
(3.16)

In Figure 1 we illustrate the potential (3.16) as a function of ϕ. In the regime κϕ ≫ 1 the potential is nearly constant (\(V(\phi) \simeq 3{M^2}/(4{\kappa ^2})\)), which leads to slow-roll inflation. The potential in the regime κϕ ≪ 1 is given by \(V(\phi) \simeq (1/2){M^2}{\phi ^2}\), so that the field oscillates around ϕ = 0 with a Hubble damping. The second derivative of V with respect to ϕ is

$${V_{,\phi \phi}} = - {M^2}{e^{- \sqrt {2/3} \kappa \phi}}\left({1 - 2{e^{- \sqrt {2/3} \kappa \phi}}} \right),$$
(3.17)

which changes from negative to positive at \(\phi = {\phi _1} \equiv \sqrt {3/2} \left({\ln \,2} \right)/\kappa \simeq 0.169{m_{{\rm{pl}}}}\).
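The limiting behaviour of the potential and the location of the inflection point can be checked with a short script. This is a sketch under stated assumptions: units with κ = 1 and M = 1 are chosen purely for convenience, the conversion to Planck units uses κ² = 8π/m_pl², and the names `V`, `V_phiphi`, and `phi1_in_mpl` are illustrative.

```python
import math

KAPPA = 1.0                  # assumption: units with kappa = 1
M = 1.0                      # assumption: M only sets the overall scale of V
C = math.sqrt(2.0 / 3.0)

def V(phi):
    """Einstein-frame potential of Eq. (3.16)."""
    return 3.0 * M**2 / (4.0 * KAPPA**2) * (1.0 - math.exp(-C * KAPPA * phi)) ** 2

def V_phiphi(phi):
    """Second derivative of V, Eq. (3.17)."""
    e = math.exp(-C * KAPPA * phi)
    return -M**2 * e * (1.0 - 2.0 * e)

plateau = 3.0 * M**2 / (4.0 * KAPPA**2)

# Inflection point phi_1 = sqrt(3/2) ln(2) / kappa, converted to units of m_pl
# with kappa = sqrt(8 pi) / m_pl.
phi1 = math.sqrt(1.5) * math.log(2.0) / KAPPA
phi1_in_mpl = phi1 / math.sqrt(8.0 * math.pi)

print(f"V(10)/plateau          = {V(10.0) / plateau:.5f}")
print(f"V(0.01)/(M^2 phi^2/2)  = {V(0.01) / (0.5 * M**2 * 0.01**2):.4f}")
print(f"phi_1                  = {phi1_in_mpl:.3f} m_pl")
```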

Figure 1: The field potential (3.16) in the Einstein frame corresponding to the model (3.6). Inflation is realized in the regime κϕ ≫ 1.

Since \(F \simeq 4{H^2}/{M^2}\) during inflation, the transformation (2.44) gives a relation between the cosmic time \(\tilde t\) in the Einstein frame and that in the Jordan frame:

$$\tilde t = \int\nolimits_{{t_i}}^t {\sqrt F} \;{\rm{d}}t \simeq {2 \over M}\left[ {{H_i}(t - {t_i}) - {{{M^2}} \over {12}}{{(t - {t_i})}^2}} \right],$$
(3.18)

where t = ti corresponds to \(\tilde t = 0\). The end of inflation (\({t_f} \simeq {t_i} + 6{H_i}/{M^2}\)) corresponds to \({\tilde t_f} = \left({2/M} \right)N\) in the Einstein frame, where N is given in Eq. (3.13). On using Eqs. (3.10) and (3.18), the scale factor \(\tilde a = \sqrt F a\) in the Einstein frame evolves as

$$\tilde a(\tilde t) \simeq \left({1 - {{{M^2}} \over {12H_i^2}}M\tilde t} \right){\tilde a_i}{e^{M\tilde t/2}},$$
(3.19)

where \({\tilde a_i} = 2{H_i}{a_i}/M\). Similarly the evolution of the Hubble parameter \(\tilde H = \left({H/\sqrt F} \right)\left[ {1 + \dot F/\left({2HF} \right)} \right]\) is given by

$$\tilde H(\tilde t) \simeq {M \over 2}\left[ {1 - {{{M^2}} \over {6H_i^2}}{{\left({1 - {{{M^2}} \over {12H_i^2}}M\tilde t} \right)}^{- 2}}} \right],$$
(3.20)

which decreases with time. Equations (3.19) and (3.20) show that the universe expands quasi-exponentially in the Einstein frame as well.

The field equations for the action (2.32) are given by

$$3{\tilde H^2} = {\kappa ^2}\left[ {{1 \over 2}{{\left({{{{\rm{d}}\phi} \over {{\rm{d}}\tilde t}}} \right)}^2} + V(\phi)} \right],$$
(3.21)
$${{{{\rm{d}}^2}\phi} \over {{\rm{d}}{{\tilde t}^2}}} + 3\tilde H{{{\rm{d}}\phi} \over {{\rm{d}}\tilde t}} + {V_{,\phi}} = 0.$$
(3.22)

Using the slow-roll approximations \({\left({{\rm{d}}\phi/{\rm{d}}\tilde t} \right)^2} \ll V\left(\phi \right)\) and \(\left\vert {{{\rm{d}}^2}\phi/{\rm{d}}{{\tilde t}^2}} \right\vert \ll \left\vert {\tilde H{\rm{d}}\phi/{\rm{d}}\tilde t} \right\vert\) during inflation, one has \(3{\tilde H^2} \simeq {\kappa ^2}V\left(\phi \right)\) and \(3\tilde H\left({{\rm{d}}\phi/{\rm{d}}\tilde t} \right) + {V_{,\phi}} \simeq 0\). We define the slow-roll parameters

$${\tilde \epsilon _1} \equiv - {{{\rm{d}}\tilde H/{\rm{d}}\tilde t} \over {{{\tilde H}^2}}} \simeq {1 \over {2{\kappa ^2}}}{\left({{{{V_{,\phi}}} \over V}} \right)^2},\quad {\tilde \epsilon _2} \equiv {{{{\rm{d}}^2}\phi/{\rm{d}}{{\tilde t}^2}} \over {\tilde H({\rm{d}}\phi/{\rm{d}}\tilde t)}} \simeq {\tilde \epsilon _1} - {{{V_{,\phi \phi}}} \over {3{{\tilde H}^2}}}.$$
(3.23)

For the potential (3.16) it follows that

$${\tilde \epsilon_1} \simeq {4 \over 3}{({e^{\sqrt {2/3} \kappa \phi}} - 1)^{- 2}},\quad \;{\tilde \epsilon_2} \simeq {\tilde \epsilon_1} + {{{M^2}} \over {3{{\tilde H}^2}}}{e^{- \sqrt {2/3} \kappa \phi}}(1 - 2{e^{- \sqrt {2/3} \kappa \phi}}),$$
(3.24)

which are much smaller than 1 during inflation (κϕ ≫ 1). The end of inflation is characterized by the condition \(\left\{{{{\tilde \epsilon}_1},\,\left\vert {{{\tilde \epsilon}_2}} \right\vert} \right\} = \mathcal{O}\left(1 \right)\). Solving \({\tilde \epsilon_1} = 1\), we obtain the field value ϕf ≃ 0.19mpl.

We define the number of e-foldings in the Einstein frame,

$$\tilde N = \int\nolimits_{{{\tilde t}_i}}^{{{\tilde t}_f}} {\tilde H} {\rm{d}}\tilde t \simeq {\kappa ^2}\int\nolimits_{{\phi _f}}^{{\phi _i}} {{V \over {{V_{,\phi}}}}{\rm{d}}\phi,}$$
(3.25)

where ϕi is the field value at the onset of inflation. Since \(\tilde H{\rm{d}}\tilde t = H{\rm{d}}t\left[ {1 + \dot F/\left({2HF} \right)} \right]\), it follows that \(\tilde N\) is identical to N in the slow-roll limit: \(\vert \dot F/(2HF)\vert \simeq \vert \dot H/{H^2}\vert \ll 1\). Under the condition κϕi ≫ 1 we have

$$\tilde N \simeq {3 \over 4}{e^{\sqrt {2/3} \kappa {\phi _i}}}.$$
(3.26)

This shows that ϕi ≃ 1.11mpl for \(\tilde N = 70\). From Eqs. (3.24) and (3.26) together with the approximate relation \(\tilde H \simeq M/2\), we obtain

$${\tilde \epsilon _1} \simeq {3 \over {4{{\tilde N}^2}}},\quad {\tilde \epsilon _2} \simeq {1 \over {\tilde N}},$$
(3.27)

where, in the expression of \({\tilde \epsilon _2}\), we have dropped the terms of the order of \(1/{\tilde N^2}\). The results (3.27) will be used to estimate the spectra of density perturbations in Section 7.
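The field values and slow-roll parameters quoted in this subsection can be reproduced from Eqs. (3.24) and (3.26). The sketch below works with the dimensionless variable x = κϕ and converts to Planck units using κ² = 8π/m_pl²; the function names and the choice Ñ = 70 are illustrative, and \(\tilde H \simeq M/2\) is used in the second slow-roll parameter as in the text.

```python
import math

SQRT_8PI = math.sqrt(8.0 * math.pi)   # kappa * m_pl, from kappa^2 = 8 pi / m_pl^2
C = math.sqrt(2.0 / 3.0)

def eps1(x):
    """First slow-roll parameter of Eq. (3.24); x = kappa * phi."""
    return (4.0 / 3.0) / (math.exp(C * x) - 1.0) ** 2

def eps2(x):
    """Second slow-roll parameter of Eq. (3.24) with H~ = M/2, i.e. M^2/(3 H~^2) = 4/3."""
    e = math.exp(-C * x)
    return eps1(x) + (4.0 / 3.0) * e * (1.0 - 2.0 * e)

# End of inflation: eps1 = 1  <=>  exp(C x_f) = 1 + 2/sqrt(3).
x_f = math.log(1.0 + 2.0 / math.sqrt(3.0)) / C

# Onset of inflation from Eq. (3.26): exp(C x_i) = 4 N / 3.
N = 70.0
x_i = math.log(4.0 * N / 3.0) / C

print(f"phi_f = {x_f / SQRT_8PI:.3f} m_pl, phi_i = {x_i / SQRT_8PI:.3f} m_pl")
print(f"eps1(phi_i) = {eps1(x_i):.3e}  vs  3/(4 N^2) = {3.0 / (4.0 * N**2):.3e}")
print(f"eps2(phi_i) = {eps2(x_i):.3e}  vs  1/N       = {1.0 / N:.3e}")
```

The small residual differences between the exact expressions and Eq. (3.27) are of relative order 1/Ñ, consistent with the terms dropped in the text.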

3.3 Reheating after inflation

We discuss the dynamics of reheating and the resulting particle production in the Jordan frame for the model (3.6). The inflationary period is followed by a reheating phase in which the second derivative \(\ddot R\) can no longer be neglected in Eq. (3.8). Introducing \(\hat R = {a^{3/2}}R\), we have

$$\ddot {\hat R} + \left({{M^2} - {9 \over 4}{H^2} - {3 \over 2}\dot H} \right)\hat R = 0.$$
(3.28)

Since \({M^2} \gg \{{H^2},\vert \dot H\vert \}\) during reheating, the solution to Eq. (3.28) is given by that of the harmonic oscillator with a frequency M. Hence the Ricci scalar exhibits a damped oscillation around R = 0:

$$R \propto {a^{- 3/2}}\sin (Mt).$$
(3.29)

Let us estimate the evolution of the Hubble parameter and the scale factor during reheating in more detail. If we neglect the r.h.s. of Eq. (3.7), we get the solution \(H(t) = {\rm{const}} \times {\cos ^2}(Mt/2)\). Setting \(H(t) = f(t){\cos ^2}(Mt/2)\) to derive the solution of Eq. (3.7), we obtain [426]

$$f(t) = {1 \over {C + (3/4)(t - {t_{{\rm{os}}}}) + 3/(4M)\sin [M(t - {t_{{\rm{os}}}})]}},$$
(3.30)

where tos is the time at the onset of reheating. The constant C is determined by matching Eq. (3.30) with the slow-roll inflationary solution \(\dot H = - {M^2}/6\) at t = tos. Then we get C = 3/M and

$$H(t) = {\left[ {{3 \over M} + {3 \over 4}(t - {t_{{\rm{os}}}}) + {3 \over {4M}}\sin \;M(t - {t_{{\rm{os}}}})} \right]^{- 1}}{\cos ^2}\left[ {{M \over 2}(t - {t_{{\rm{os}}}})} \right].$$
(3.31)

Taking the time average of oscillations in the regime \(M(t - {t_{{\rm{os}}}}) \gg 1\), it follows that \(\langle H\rangle \simeq (2/3){(t - {t_{{\rm{os}}}})^{- 1}}\). This corresponds to the cosmic evolution during the matter-dominated epoch, i.e., \(\langle a\rangle \propto {(t - {t_{{\rm{os}}}})^{2/3}}\). The gravitational effect of coherent oscillations of scalarons with mass M is similar to that of a pressureless perfect fluid. During reheating the Ricci scalar is approximately given by \(R \simeq 6\dot H\), i.e.

$$R \simeq - 3{\left[ {{3 \over M} + {3 \over 4}(t - {t_{{\rm{os}}}}) + {3 \over {4M}}\sin \;M(t - {t_{{\rm{os}}}})} \right]^{- 1}}M\sin [M(t - {t_{{\rm{os}}}})].$$
(3.32)

In the regime \(M(t - {t_{{\rm{os}}}}) \gg 1\) this behaves as

$$R \simeq - {{4M} \over {t - {t_{{\rm{os}}}}}}\sin [M(t - {t_{{\rm{os}}}})].$$
(3.33)
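The time averages used for the oscillating phase can be verified numerically from Eqs. (3.31) and (3.33). The following sketch sets M = 1 (so that times are measured in units of 1/M) and averages over a single oscillation period with a midpoint rule; the helper `period_average`, the starting time, and the sample count are illustrative choices.

```python
import math

M = 1.0  # assumption: scalaron mass sets the time unit

def hubble(tau):
    """H(t) of Eq. (3.31), with tau = t - t_os."""
    denom = 3.0 / M + 0.75 * tau + 0.75 / M * math.sin(M * tau)
    return math.cos(0.5 * M * tau) ** 2 / denom

def ricci(tau):
    """Late-time Ricci scalar of Eq. (3.33)."""
    return -4.0 * M * math.sin(M * tau) / tau

def period_average(func, tau_start, samples=2000):
    """Midpoint-rule average of func over one period 2*pi/M starting at tau_start."""
    step = 2.0 * math.pi / M / samples
    return sum(func(tau_start + (i + 0.5) * step) for i in range(samples)) / samples

tau_start = 2000.0 / M               # deep in the regime M * tau >> 1
tau_mid = tau_start + math.pi / M    # centre of the averaging window

H_ratio = period_average(hubble, tau_start) / (2.0 / (3.0 * tau_mid))
R2_ratio = period_average(lambda t: ricci(t) ** 2, tau_start) / (8.0 * M**2 / tau_mid**2)

print(f"<H> / [2/(3 tau)]       = {H_ratio:.4f}")
print(f"<R^2> / [8 M^2 / tau^2] = {R2_ratio:.4f}")
```

Both ratios approach unity as M(t − t_os) grows, which is the matter-like behaviour described above.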

In order to study particle production during reheating, we consider a scalar field χ with mass mχ. We also introduce a nonminimal coupling (1/2)ξRχ2 between the field χ and the Ricci scalar R [88]. Then the action is given by

$$S = \int {{{\rm{d}}^4}x\sqrt {- g} \left[ {{{f(R)} \over {2{\kappa ^2}}} - {1 \over 2}{g^{\mu \nu}}{\partial _\mu}\chi {\partial _\nu}\chi - {1 \over 2}m_\chi ^2{\chi ^2} - {1 \over 2}\xi R{\chi ^2}} \right],}$$
(3.34)

where \(f(R) = R + {R^2}/(6{M^2})\). Taking the variation of this action with respect to χ gives

$$\square\chi - m_\chi ^2\chi - \xi R\chi = 0.$$
(3.35)

We decompose the quantum field χ in terms of the Heisenberg representation:

$$\chi (t,x) = {1 \over {{{(2\pi)}^{3/2}}}}\int {{{\rm{d}}^{\rm{3}}}} k\left({{{\hat a}_k}{\chi _k}(t){e^{- ik \cdot x}} + \hat a_k^\dagger \chi _k^{\ast}(t){e^{ik \cdot x}}} \right),$$
(3.36)

where \({{\hat a}_k}\) and \(\hat a_{_k}^\dag\) are annihilation and creation operators, respectively. The field χ can be quantized in curved spacetime by generalizing the basic formalism of quantum field theory in flat spacetime. See the book [88] for details of quantum field theory in curved spacetime. Then each Fourier mode χk(t) obeys the following equation of motion

$${\ddot \chi _k} + 3H{\dot \chi _k} + \left({{{{k^2}} \over {{a^2}}} + m_\chi ^2 + \xi R} \right){\chi _k} = 0,$$
(3.37)

where k = ∣k∣ is a comoving wavenumber. Introducing a new field \({u_k} = a{\chi _k}\) and conformal time \(\eta = \int {{a^{- 1}}{\rm{d}}t}\), we obtain

$${{{{\rm{d}}^2}{u_k}} \over {{\rm{d}}{\eta ^2}}} + \left[ {{k^2} + m_\chi ^2{a^2} + \left({\xi - {1 \over 6}} \right){a^2}R} \right]{u_k} = 0,$$
(3.38)

where the conformal coupling corresponds to ξ = 1/6. This result shows that, even for ξ = 0 (that is, when the field is minimally coupled to gravity), R still contributes to the effective mass of uk. In the following we first review the reheating scenario based on a minimally coupled massless field (ξ = 0 and mχ = 0). This corresponds to the gravitational particle production in the perturbative regime [565, 606, 426]. We then study the case in which the nonminimal coupling ∣ξ∣ is larger than the order of 1. In this case non-adiabatic particle production, known as preheating [584, 353, 538, 354], can occur via parametric resonance.
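The passage from Eq. (3.37) to Eq. (3.38) can be checked explicitly. The sketch below assumes a spatially flat FLRW background, for which \(R = 6(2{H^2} + \dot H) = 6a^{\prime \prime}/{a^3}\); the prime, denoting a derivative with respect to η, is notation introduced here only for this check. Writing \({\chi _k} = {u_k}/a\) and using \({\rm{d}}/{\rm{d}}t = {a^{- 1}}{\rm{d}}/{\rm{d}}\eta\), the terms proportional to \(u_k^\prime\) and \({(a^\prime)^2}\) cancel, and one finds

$${\ddot \chi _k} + 3H{\dot \chi _k} = {1 \over {{a^3}}}\left({u_k^{\prime \prime} - {{a^{\prime \prime}} \over a}{u_k}} \right),\quad \;{{a^{\prime \prime}} \over a} = {{{a^2}R} \over 6}.$$

Multiplying Eq. (3.37) by \({a^3}\) then gives \(u_k^{\prime \prime} + [{k^2} + m_\chi ^2{a^2} + \xi {a^2}R - {a^2}R/6]{u_k} = 0\), which is Eq. (3.38). The term \(- {a^2}R/6\) is the curvature contribution that survives when ξ = 0.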

3.3.1 Case: ξ = 0 and mχ = 0

In this case there is no explicit coupling between the field χ and R. Hence the χ particles are produced only gravitationally. In fact, Eq. (3.38) reduces to

$${{{{\rm{d}}^2}{u_k}} \over {{\rm{d}}{\eta ^2}}} + {k^2}{u_k} = U{u_k},$$
(3.39)

where \(U = {a^2}R/6\). Since U is of the order of \({(aH)^2}\), one has \({k^2} \gg U\) for the mode deep inside the Hubble radius. Initially we choose the field in the vacuum state with the positive-frequency solution [88]: \(u_k^{(i)} = {e^{- ik\eta}}/\sqrt {2k}\). The presence of the time-dependent term U(η) leads to the creation of the particle χ. We can write the solution of Eq. (3.39) iteratively, as [626]

$${u_k}(\eta) = u_k^{(i)} + {1 \over k}\int\nolimits_0^\eta U (\eta ^{\prime})\sin [k(\eta - \eta ^{\prime})]{u_k}(\eta ^{\prime}){\rm{d}}\eta ^{\prime}.$$
(3.40)

After the universe enters the radiation-dominated epoch, the term U becomes small so that the flat-space solution is recovered. The choice of decomposition of χ into âk and \(\hat a_{_k}^\dag\) is not unique. In curved spacetime it is possible to choose another decomposition in terms of new ladder operators \({{\hat {\mathcal A}}_k}\) and \(\hat {\mathcal A}_k^\dag\), which can be written in terms of âk and \(\hat a_{_k}^\dag\), such as \({\hat {\mathcal{A}}_k} = {\alpha _k}{{\hat a}_k} + \beta _k^ \ast \hat a_{- k}^\dagger\). Provided that \(\beta _k^ {\ast} \neq 0\), even though âk∣0〉 = 0, we have \({{\hat {\mathcal A}}_k}\left\vert 0 \right\rangle \neq 0\). Hence the vacuum in one basis is not the vacuum in the new basis, and according to the new basis, particles are created. The Bogoliubov coefficient describing the particle production is

$${\beta _k} = - {i \over {2k}}\int\nolimits_0^\infty U (\eta ^{\prime}){e^{- 2ik\eta ^{\prime}}}{\rm{d}}\eta ^{\prime}.$$
(3.41)

The typical wavenumber in the η-coordinate is given by k, whereas in the t-coordinate it is k/a. Then the energy density per unit comoving volume in the η-coordinate is [426]

$$\begin{array}{*{20}c} {{\rho _\eta} = {1 \over {{{(2\pi)}^3}}}\int\nolimits_0^\infty {4\pi {k^2}{\rm{d}}k \cdot k\vert {\beta _k}{\vert ^2}\quad \quad \quad \quad \quad \quad \quad \quad \quad}}\\ {= {1 \over {8{\pi ^2}}}\int\nolimits_0^\infty {{\rm{d}}\eta} \;U(\eta)\int\nolimits_0^\infty {{\rm{d}}\eta} ^{\prime}U(\eta ^{\prime})\int\nolimits_0^\infty {{\rm{d}}k \cdot k{e^{2ik(\eta^{\prime} - \eta)}}}}\\ {= {1 \over {32{\pi ^2}}}\int\nolimits_0^\infty {{\rm{d}}\eta {{{\rm{d}}U} \over {{\rm{d}}\eta}}} \int\nolimits_0^\infty {{\rm{d}}\eta^{\prime}{{U(\eta^{\prime})} \over {\eta^{\prime}- \eta}},\quad \quad \quad \quad \quad \quad}}\\ \end{array}$$
(3.42)

where in the last equality we have used the fact that the term U approaches 0 in the early and late times.

During the oscillating phase of the Ricci scalar the time-dependence of U is given by \(U = I(\eta)\sin (\int\nolimits_0^\eta {\omega {\rm{d}}\bar \eta})\), where \(I(\eta) = c\,a{(\eta)^{1/2}}\) and ω = Ma (c is a constant). When we evaluate the term dU/dη in Eq. (3.42), the time-dependence of I(η) can be neglected. Differentiating Eq. (3.42) in terms of η and taking the limit \(\int\nolimits_0^\eta {\omega {\rm{d}}\bar \eta} \gg 1\), it follows that

$${{{\rm{d}}{\rho _\eta}} \over {{\rm{d}}\eta}} \simeq {\omega \over {32\pi}}{I^2}(\eta){\cos ^2}\left({\int\nolimits_0^\eta {\omega {\rm{d}}\bar \eta}} \right),$$
(3.43)

where we used the relation \({\lim _{k \to \infty}}\sin (kx)/x = \pi \delta (x)\). Shifting the phase of the oscillating factor by π/2, we obtain

$${{{\rm{d}}{\rho _\eta}} \over {{\rm{d}}\eta}} \simeq {{M{U^2}} \over {32\pi}} = {{M{a^4}{R^2}} \over {1152\pi}}.$$
(3.44)

The proper energy density of the field χ is given by \({\rho _\chi} = ({\rho _\eta}/a)/{a^3} = {\rho _\eta}/{a^4}\). Taking into account \({g_\ast}\) relativistic degrees of freedom, the total radiation density is

$${\rho _M} = {{{g_\ast}} \over {{a^4}}}{\rho _\eta} = {{{g_\ast}} \over {{a^4}}}\int\nolimits_{{t_{{\rm{os}}}}}^t {{{M{a^4}{R^2}} \over {1152\pi}}} {\rm{d}}t,$$
(3.45)

which obeys the following equation

$${\dot \rho _M} + 4H{\rho _M} = {{{g_\ast}M{R^2}} \over {1152\pi}}.$$
(3.46)

Comparing this with the continuity equation (2.17) we obtain the pressure of the created particles, as

$${P_M} = {1 \over 3}{\rho _M} - {{{g_\ast}M{R^2}} \over {3456\pi H}}.$$
(3.47)
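The pressure (3.47) follows in one step. As an assumption made here, since Eq. (2.17) lies outside this section, the continuity equation is taken in its standard form \({\dot \rho _M} + 3H({\rho _M} + {P_M}) = 0\). Subtracting it from Eq. (3.46) gives

$$H{\rho _M} - 3H{P_M} = {{{g_\ast}M{R^2}} \over {1152\pi}}\quad \Rightarrow \quad {P_M} = {1 \over 3}{\rho _M} - {{{g_\ast}M{R^2}} \over {3456\pi H}},$$

so the created particles behave as radiation with a correction from the source term, which vanishes once R stops oscillating.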

Now the dynamical equations are given by Eqs. (2.15) and (2.16) with the energy density (3.45) and the pressure (3.47).

In the regime \(M(t - {t_{{\rm{os}}}}) \gg 1\) the evolution of the scale factor is given by \(a \simeq {a_0}{(t - {t_{{\rm{os}}}})^{2/3}}\), and hence

$${H^2} \simeq {4 \over {9{{(t - {t_{{\rm{os}}}})}^2}}},$$
(3.48)

where we have neglected the backreaction of created particles. Meanwhile the integration of Eq. (3.45) gives

$${\rho _M} \simeq {{{g_\ast}{M^3}} \over {240\pi}}{1 \over {t - {t_{{\rm{os}}}}}},$$
(3.49)

where we have used the averaged relation \(\langle {R^2}\rangle \simeq 8{M^2}/{(t - {t_{{\rm{os}}}})^2}\) [which comes from Eq. (3.33)]. The energy density ρM evolves slowly compared to \({H^2}\) and finally it becomes the dominant contribution to the total energy density \((3{H^2} \simeq 8\pi {\rho _M}/m_{{\rm{pl}}}^2)\) at the time \({t_f} \simeq {t_{{\rm{os}}}} + 40m_{{\rm{pl}}}^2/({g_{\ast}}{M^3})\). In [426] it was found that the transition from the oscillating phase to the radiation-dominated epoch occurs more slowly than the estimate given above. Since the transient matter-dominated era is about one order of magnitude longer than the analytic estimate [426], we take the value \({t_f} \simeq {t_{{\rm{os}}}} + 400m_{{\rm{pl}}}^2/({g_{\ast}}{M^3})\) to estimate the reheating temperature Tr. Since the particle energy density ρM(tf) is converted to the radiation energy density \({\rho _r} = {g_\ast}{\pi ^2}T_r^4/30\), the reheating temperature can be estimated as (see Footnote 4)

$${T_r} \underset{\sim}{<} 3 \times {10^{17}}g_\ast ^{1/4}{\left({{M \over {{m_{{\rm{pl}}}}}}} \right)^{3/2}}{\rm{GeV}}.$$
(3.50)

As we will see in Section 7, the WMAP normalization of the CMB temperature anisotropies determines the mass scale to be \(M \simeq 3 \times {10^{- 6}}{m_{{\rm{pl}}}}\). Taking the value \({g_\ast} = 100\), we have \({T_r} \lesssim 5 \times {10^9}\) GeV. For t > tf the universe enters the radiation-dominated epoch characterized by \(a \propto {t^{1/2}}\), R = 0, and \({\rho _r} \propto {t^{- 2}}\).
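The numerical factors in this estimate can be retraced with exact fractions. The sketch below reproduces the prefactor of Eq. (3.49) from the integral (3.45) with a⁴ ∝ (t − t_os)^{8/3} and 〈R²〉 ≃ 8M²/(t − t_os)², the coefficient of the equality time, and the bound (3.50) for the quoted values of M and g*. The variable and function names are illustrative.

```python
from fractions import Fraction

# Eq. (3.45) with <R^2> = 8 M^2 / tau^2 and a^4 ~ tau^(8/3):
#   rho_M = g M^3 (8/1152 pi) tau^(-8/3) * int tau'^(2/3) dtau'
#         = g M^3 (8/1152 pi) (3/5) / tau
rho_coeff = Fraction(8, 1152) * Fraction(3, 5)   # multiplies g M^3 / (pi tau)

# Equality 3 H^2 = 8 pi rho_M / m_pl^2 with H = 2/(3 tau):
#   4/(3 tau^2) = 8 rho_coeff g M^3 / (tau m_pl^2)  ->  tau = tf_coeff m_pl^2 / (g M^3)
tf_coeff = Fraction(4, 3) / (8 * rho_coeff)

def reheating_temperature_bound(M_over_mpl, g_star):
    """Upper bound on T_r in GeV, evaluated directly from Eq. (3.50)."""
    return 3.0e17 * g_star ** 0.25 * M_over_mpl ** 1.5

T_r = reheating_temperature_bound(3.0e-6, 100.0)

print(f"rho_M prefactor: 1/{rho_coeff.denominator} (times g M^3 / (pi tau))")
print(f"equality time:   {tf_coeff} m_pl^2 / (g M^3)")
print(f"T_r bound:       {T_r:.2e} GeV")
```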

3.3.2 Case: ∣ξ∣ ≳ 1

If ∣ξ∣ is larger than the order of unity, one can expect the explosive particle production called preheating prior to the perturbative regime discussed above. Originally the dynamics of such gravitational preheating was studied in [70, 592] for a massive chaotic inflation model in Einstein gravity. Later this was extended to the f(R) model (3.6) [591].

Introducing a new field \({X_k} = {a^{3/2}}{\chi _k}\), Eq. (3.37) reads

$${\ddot X_k} + \left({{{{k^2}} \over {{a^2}}} + m_\chi ^2 + \xi R - {9 \over 4}{H^2} - {3 \over 2}\dot H} \right){X_k} = 0.$$
(3.51)

As long as ∣ξ∣ is larger than the order of unity, the last two terms in the bracket of Eq. (3.51) can be neglected relative to ξR. Since the Ricci scalar is given by Eq. (3.33) in the regime \(M(t - {t_{{\rm{os}}}}) \gg 1\), it follows that

$${\ddot X_k} + \left[ {{{{k^2}} \over {{a^2}}} + m_\chi ^2 - {{4M\xi} \over {t - {t_{{\rm{os}}}}}}\sin \{M(t - {t_{{\rm{os}}}})\}} \right]{X_k} \simeq 0.$$
(3.52)

The oscillating term gives rise to parametric amplification of the particle χk. In order to see this we introduce the variable z defined by \(M(t - {t_{{\rm{os}}}}) = 2z \pm \pi /2\), where the plus and minus signs correspond to the cases ξ > 0 and ξ < 0 respectively. Then Eq. (3.52) reduces to the Mathieu equation

$${{{{\rm{d}}^2}} \over {{\rm{d}}{z^2}}}{X_k} + [{A_k} - 2q\cos (2z)]{X_k} \simeq 0,$$
(3.53)

where

$${A_k} = {{4{k^2}} \over {{a^2}{M^2}}} + {{4m_\chi ^2} \over {{M^2}}},\quad \;q = {{8\vert \xi \vert} \over {M(t - {t_{{\rm{os}}}})}}.$$
(3.54)

The strength of parametric resonance depends on the parameters Ak and q. This can be described by a stability-instability chart of the Mathieu equation [419, 353, 591]. In Minkowski spacetime the parameters Ak and q are constant. If Ak and q are in an instability band, then the perturbation Xk grows exponentially with a growth index μk, i.e., \({X_k} \propto {e^{{\mu _k}z}}\). In the regime q ≪ 1 the resonance occurs only in narrow bands around \({A_k} = {\ell ^2}\), where ℓ = 1, 2, …, with the maximum growth index μk = q/2 [353]. Meanwhile, for large q (≫ 1), a broad resonance can occur for a wide range of parameter space and momentum modes [354].
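The narrow-resonance statement can be illustrated by integrating the Mathieu equation (3.53) directly for constant Ak and q, i.e., in the Minkowski limit. The following is a minimal sketch rather than a preheating code: it uses a hand-written fourth-order Runge-Kutta step, the initial data X = 1, X′ = 0, and a growth index estimated from the peak amplitude in two windows separated by ten periods of cos(2z). All function names, the step size, and the window positions are illustrative choices.

```python
import math

def mathieu_rhs(z, X, dX, A, q):
    """Eq. (3.53) as a first-order system: returns (X', X'')."""
    return dX, -(A - 2.0 * q * math.cos(2.0 * z)) * X

def integrate(A, q, z_end, dz=0.005):
    """Classical RK4 from z = 0 with X = 1, X' = 0; returns the sampled z and X."""
    z, X, dX = 0.0, 1.0, 0.0
    zs, Xs = [z], [X]
    for _ in range(int(round(z_end / dz))):
        k1x, k1v = mathieu_rhs(z, X, dX, A, q)
        k2x, k2v = mathieu_rhs(z + dz / 2, X + dz / 2 * k1x, dX + dz / 2 * k1v, A, q)
        k3x, k3v = mathieu_rhs(z + dz / 2, X + dz / 2 * k2x, dX + dz / 2 * k2v, A, q)
        k4x, k4v = mathieu_rhs(z + dz, X + dz * k3x, dX + dz * k3v, A, q)
        X += dz / 6 * (k1x + 2 * k2x + 2 * k3x + k4x)
        dX += dz / 6 * (k1v + 2 * k2v + 2 * k3v + k4v)
        z += dz
        zs.append(z)
        Xs.append(X)
    return zs, Xs

def peak(zs, Xs, z_lo, z_hi):
    """Largest |X| in the window [z_lo, z_hi]."""
    return max(abs(x) for z, x in zip(zs, Xs) if z_lo <= z <= z_hi)

def growth_index(A, q, z1=20.0 * math.pi, z2=30.0 * math.pi):
    """Estimate mu from the peak amplitudes in two windows of length pi."""
    zs, Xs = integrate(A, q, z2 + math.pi)
    p1 = peak(zs, Xs, z1, z1 + math.pi)
    p2 = peak(zs, Xs, z2, z2 + math.pi)
    return math.log(p2 / p1) / (z2 - z1)

q = 0.1
mu_resonant = growth_index(1.0, q)   # centre of the first narrow band, A_k = 1
mu_stable = growth_index(1.5, q)     # between the first and second bands

print(f"A = 1.0: mu = {mu_resonant:.4f}  (q/2 = {q / 2:.4f})")
print(f"A = 1.5: mu = {mu_stable:.4f}")
```

In the expanding background both Ak and q drift in time, so a constant-parameter integration of this kind only indicates the local growth rate at a given moment.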

In the expanding cosmological background both Ak and q vary in time. Initially the field Xk is in the broad resonance regime (q ≫ 1) for ∣ξ∣ ≫ 1, but it gradually enters the narrow resonance regime (q ≲ 1). Since the field passes many instability and stability bands, the growth index μk stochastically changes with the cosmic expansion. The non-adiabaticity of the change of the frequency \(\omega _k^2 = {k^2}/{a^2} + m_\chi ^2 - 4M\xi \sin \{M(t - {t_{{\rm{os}}}})\}/(t - {t_{{\rm{os}}}})\) can be estimated by the quantity

$${r_{{\rm{na}}}} \equiv \left\vert {{{{{\dot \omega}_k}} \over {\omega _k^2}}} \right\vert = M{{\vert {k^2}/{a^2} + 2M\xi \cos \{M(t - {t_{{\rm{os}}}})\}/(t - {t_{{\rm{os}}}})\vert} \over {\vert {k^2}/{a^2} + m_\chi ^2 - 4M\xi \sin \{M(t - {t_{{\rm{os}}}})\}/(t - {t_{{\rm{os}}}}){\vert ^{3/2}}}},$$
(3.55)

where the non-adiabatic regime corresponds to rna ≳ 1. For small k and mχ we have rna ≫ 1 around \(M(t - {t_{{\rm{os}}}}) = n\pi\), where n are positive integers. This corresponds to the time at which the Ricci scalar vanishes. Hence, each time R crosses 0 during its oscillation, the non-adiabatic particle production occurs most efficiently. The presence of the mass term mχ tends to suppress the non-adiabaticity parameter rna, but still it is possible to satisfy the condition rna ≳ 1 around R = 0.

For the model (3.6) it was shown in [591] that massless χ particles are resonantly amplified for ∣ξ∣ ≳ 3. Massive particles with mχ of the order of M can be created for ∣ξ∣ ≳ 10. Note that in the preheating scenario based on the model \(V(\phi, \chi) = (1/2)m_\phi ^2{\phi ^2} + (1/2){g^2}{\phi ^2}{\chi ^2}\) the parameter q decreases more rapidly (\(q \propto 1/{t^2}\)) than that in the model (3.6) [354]. Hence, in our geometric preheating scenario, we do not require very large initial values of q [such as \(q > {\mathcal O}({10^3})\)] to lead to the efficient parametric resonance.

While the above discussion is based on the linear analysis, non-linear effects (such as the mode-mode coupling of perturbations) can be important at the late stage of preheating (see, e.g., [354, 342]). Also the energy density of created particles affects the background cosmological dynamics, which works as a backreaction to the Ricci scalar. The process of the subsequent perturbative reheating stage can be affected by the explosive particle production during preheating. It will be of interest to take into account all these effects and study how the thermalization is reached at the end of reheating. This certainly requires the detailed numerical investigation of lattice simulations, as developed in [255, 254].

At the end of this section we should mention a number of interesting works about gravitational baryogenesis based on the interaction \((1/M_{\ast}^2)\int {{{\rm{d}}^4}x\sqrt {- g} {J^\mu}{\partial _\mu}R}\) between the baryon number current Jμ and the Ricci scalar R (M* is the cut-off scale characterizing the effective theory) [179, 376, 514]. This interaction can give rise to an equilibrium baryon asymmetry which is observationally acceptable, even for the gravitational Lagrangian \(f(R) = {R^n}\) with n close to 1. It will be of interest to extend the analysis to more general f(R) gravity models.

4 Dark Energy in f(R) Theories

In this section we apply f(R) theories to dark energy. Our interest is to construct viable f(R) models that can realize the sequence of radiation, matter, and accelerated epochs. In this section we do not attempt to find unified models of inflation and dark energy based on f(R) theories.

Originally the model \(f(R) = R - \alpha /{R^n}\) (α > 0, n > 0) was proposed to explain the late-time cosmic acceleration [113, 120, 114, 143] (see also [456, 559, 17, 223, 212, 16, 137, 62] for related works). However, this model suffers from a number of problems such as matter instability [215, 244], the instability of cosmological perturbations [146, 74, 544, 526, 251], the absence of the matter era [28, 29, 239], and the inability to satisfy local gravity constraints [469, 470, 245, 233, 154, 448, 134]. The main reason why this model does not work is that the quantity \({f_{,RR}} \equiv {\partial ^2}f/\partial {R^2}\) is negative. As we will see later, the violation of the condition f,RR > 0 gives rise to the negative mass squared \({M^2}\) for the scalaron field. Hence we require that f,RR > 0 to avoid a tachyonic instability. The condition \({f_{,R}} \equiv \partial f/\partial R > 0\) is also required to avoid the appearance of ghosts (see Section 7.4). Thus viable f(R) dark energy models need to satisfy [568]

$${f_{,R}} > 0,\quad \;{f_{,RR}} > 0,\quad \;{\rm{for}}\quad R \geq {R_0}(> 0),$$
(4.56)

where R0 is the Ricci scalar today.

In the following we shall derive other conditions for the cosmological viability of f(R) models. This is based on the analysis of [26]. For the matter Lagrangian \({{\mathcal L}_M}\) in Eq. (2.1) we take into account non-relativistic matter and radiation, whose energy densities ρm and ρr satisfy

$${\dot \rho _m} + 3H{\rho _m} = 0,$$
(4.57)
$${\dot \rho _r} + 4H{\rho _r} = 0,$$
(4.58)

respectively. From Eqs. (2.15) and (2.16) it follows that

$$3F{H^2} = (FR - f)/2 - 3H\dot F + {\kappa ^2}({\rho _m} + {\rho _r}),$$
(4.59)
$$- 2F\dot H = \ddot F - H\dot F + {\kappa ^2}[{\rho _m} + (4/3){\rho _r}].$$
(4.60)

4.1 Dynamical equations

We introduce the following variables

$${x_1} \equiv - {{\dot F} \over {HF}},\quad {x_2} \equiv - {f \over {6F{H^2}}},\quad {x_3} \equiv {R \over {6{H^2}}},\quad {x_4} \equiv {{{\kappa ^2}{\rho _r}} \over {3F{H^2}}},$$
(4.61)

together with the density parameters

$${\Omega _m} \equiv {{{\kappa ^2}{\rho _m}} \over {3F{H^2}}} = 1 - {x_1} - {x_2} - {x_3} - {x_4},\quad {\Omega _r} \equiv {x_4},\quad {\Omega _{{\rm{DE}}}} \equiv {x_1} + {x_2} + {x_3}.$$
(4.62)

It is straightforward to derive the following equations

$${{{\rm{d}}{x_1}} \over {{\rm{d}}N}} = - 1 - {x_3} - 3{x_2} + x_1^2 - {x_1}{x_3} + {x_4},$$
(4.63)
$${{{\rm{d}}{x_2}} \over {{\rm{d}}N}} = {{{x_1}{x_3}} \over m} - {x_2}(2{x_3} - 4 - {x_1}),$$
(4.64)
$${{{\rm{d}}{x_3}} \over {{\rm{d}}N}} = - {{{x_1}{x_3}} \over m} - 2{x_3}({x_3} - 2),$$
(4.65)
$${{{\rm{d}}{x_4}} \over {{\rm{d}}N}} = - 2{x_3}{x_4} + {x_1}{x_4},$$
(4.66)

where N = ln a is the number of e-foldings, and

$$m \equiv {{{\rm{d}}\ln F} \over {{\rm{d}}\ln R}} = {{R{f_{,RR}}} \over {{f_{,R}}}},$$
(4.67)
$$r \equiv - {{{\rm{d}}\ln f} \over {{\rm{d}}\ln R}} = - {{R{f_{,R}}} \over f} = {{{x_3}} \over {{x_2}}}.$$
(4.68)

From Eq. (4.68) the Ricci scalar can be expressed in terms of x3/x2. Since m depends on R, this means that m is a function of r, that is, m = m(r). The ΛCDM model, f(R) = R − 2Λ, corresponds to m = 0. Hence the quantity m characterizes the deviation of the background dynamics from the ΛCDM model. A number of authors studied cosmological dynamics for specific f(R) models [160, 382, 488, 252, 31, 198, 280, 72, 41, 159, 235, 1, 279, 483, 321, 432].

The effective equation of state of the system is defined by

$${w_{{\rm{eff}}}} \equiv - 1 - 2\dot H/(3{H^2}),$$
(4.69)

which is equivalent to weff = − (2x3 − 1)/3. In the absence of radiation (x4 = 0) the fixed points for the above dynamical system are

$${P_1}:({x_1},{x_2},{x_3}) = (0, - 1,2),\quad \quad {\Omega _m} = 0,\quad \quad \quad \quad {w_{{\rm{eff}}}} = - 1,$$
(4.70)
$${P_2}:({x_1},{x_2},{x_3}) = (- 1,0,0),\quad \quad {\Omega _m} = 2,\quad \quad \quad \quad {w_{{\rm{eff}}}} = 1/3,$$
(4.71)
$${P_3}:({x_1},{x_2},{x_3}) = (1,0,0),\quad \quad {\Omega _m} = 0,\quad \quad \quad \quad {w_{{\rm{eff}}}} = 1/3,$$
(4.72)
$${P_4}:({x_1},{x_2},{x_3}) = (- 4,5,0),\quad \quad {\Omega _m} = 0,\quad \quad \quad \quad {w_{{\rm{eff}}}} = 1/3,$$
(4.73)
$${{P_5}:({x_1},{x_2},{x_3}) = \left({{{3m} \over {1 + m}}, - {{1 + 4m} \over {2{{(1 + m)}^2}}},{{1 + 4m} \over {2(1 + m)}}} \right),}$$
(4.74)
$${{\Omega _m} = 1 - {{m(7 + 10m)} \over {2{{(1 + m)}^2}}},\quad \quad {w_{{\rm{eff}}}} = - {m \over {1 + m}},}$$
(4.75)
$$\begin{array}{*{20}c} {{P_6}:({x_1},{x_2},{x_3}) = \left({{{2(1 - m)} \over {1 + 2m}},{{1 - 4m} \over {m(1 + 2m)}}, - {{(1 - 4m)(1 + m)} \over {m(1 + 2m)}}} \right),\quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad}\\ {\quad {\Omega _m} = 0,\quad \quad \quad \quad {w_{{\rm{eff}}}} = {{2 - 5m - 6{m^2}} \over {3m(1 + 2m)}}.}\\ \end{array}$$
(4.76)

The points P5 and P6 are on the line m(r) = − r − 1 in the (r, m) plane.
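The fixed points listed above can be checked numerically. The following is a minimal sketch, not part of the original analysis: it evaluates the right-hand sides of Eqs. (4.63)–(4.65) with x4 = 0 at P5 and P6 and confirms that both points lie on the line m = −r − 1 with r = x3/x2. The function names are illustrative, and the relation Ωm = 1 − x1 − x2 − x3 − x4 is assumed from the definition of the variables given before this passage.

```python
def rhs(x1, x2, x3, m, x4=0.0):
    """Right-hand sides of Eqs. (4.63)-(4.65) for a fixed value of m."""
    dx1 = -1 - x3 - 3 * x2 + x1**2 - x1 * x3 + x4
    dx2 = x1 * x3 / m - x2 * (2 * x3 - 4 - x1)
    dx3 = -x1 * x3 / m - 2 * x3 * (x3 - 2)
    return dx1, dx2, dx3


def p5(m):
    """Coordinates (x1, x2, x3) of the point P5, Eq. (4.74)."""
    return (3 * m / (1 + m),
            -(1 + 4 * m) / (2 * (1 + m) ** 2),
            (1 + 4 * m) / (2 * (1 + m)))


def p6(m):
    """Coordinates (x1, x2, x3) of the point P6, Eq. (4.76)."""
    return (2 * (1 - m) / (1 + 2 * m),
            (1 - 4 * m) / (m * (1 + 2 * m)),
            -(1 - 4 * m) * (1 + m) / (m * (1 + 2 * m)))


def omega_m(x1, x2, x3, x4=0.0):
    """Matter density parameter (assumed relation, see the lead-in)."""
    return 1 - x1 - x2 - x3 - x4


def w_eff(x3):
    """Effective equation of state, w_eff = -(2*x3 - 1)/3."""
    return -(2 * x3 - 1) / 3


if __name__ == "__main__":
    for m in (0.1, 0.5):
        for name, pt in (("P5", p5(m)), ("P6", p6(m))):
            r = pt[2] / pt[1]
            print(name, m, rhs(*pt, m), -r - 1, omega_m(*pt), w_eff(pt[2]))
```

For each m the three derivatives vanish up to rounding, and −r − 1 reproduces the input m.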

The matter-dominated epoch (Ωm ≃ 1 and weff ≃ 0) can be realized only by the point P5 for m close to 0. In the (r, m) plane this point exists around (r, m) = (−1, 0). Either the point P1 or P6 can be responsible for the late-time cosmic acceleration. The former is a de Sitter point (weff = −1) with r = −2, in which case the condition (2.11) is satisfied. The point P6 can give rise to the accelerated expansion (weff < −1/3) provided that \(m > (\sqrt 3 - 1)/2\), or −1/2 < m < 0, or \(m < - (1 + \sqrt 3)/2\).

In order to analyze the stability of the above fixed points it is sufficient to consider only time-dependent linear perturbations δxi(t) (i = 1, 2, 3) around them (see [170, 171] for the details of such an analysis). For the point P5 the eigenvalues of the 3 × 3 Jacobian matrix of perturbations are

$$3(1 + m_5^\prime),\;\;\;{{- 3{m_5} \pm \sqrt {{m_5}(256m_5^3 + 160m_5^2 - 31{m_5} - 16)}} \over {4{m_5}({m_5} + 1)}},$$
(4.77)

where m5 ≡ m(r5) and \(m_5^\prime \equiv {{{\rm{d}}m} \over {{\rm{d}}r}}({r_5})\) with r5 ≈ −1. In the limit ∣m5∣ ≪ 1 the latter two eigenvalues reduce to \(- 3/4 \pm \sqrt {- 1/{m_5}}\). For the models with m5 < 0, the solutions cannot remain around the point P5 for a long time because of the divergent behavior of the eigenvalues as m5 → −0. The model \(f(R) = R - \alpha/R^n\) (α > 0, n > 0) falls into this category. On the other hand, if 0 < m5 < 0.327, the latter two eigenvalues in Eq. (4.77) are complex with negative real parts. Then, provided that \(m_5^\prime > - 1\), the point P5 corresponds to a saddle point with a damped oscillation. Hence the solutions can stay around this point for some time and finally leave for the late-time acceleration. The condition for the existence of the saddle matter era is therefore

$$m(r) \simeq + 0,\quad {{{\rm{d}}m} \over {{\rm{d}}r}} > - 1,\quad {\rm{at}}\quad r = - 1.$$
(4.78)

The first condition implies that viable f(R) models need to be close to the ΛCDM model during the matter domination. This is also required for consistency with local gravity constraints, as we will see in Section 5.
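The statements about the eigenvalues (4.77) can be reproduced with a short numerical sketch. The helper names below are illustrative. The code evaluates the three eigenvalues for given m5 and m5′, and locates by bisection the positive root of the cubic under the square root, which is where the value 0.327 quoted above comes from.

```python
import cmath


def p5_eigenvalues(m5, dm5):
    """Eigenvalues of Eq. (4.77) for given m5 and m5' = dm/dr at r5."""
    disc = m5 * (256 * m5**3 + 160 * m5**2 - 31 * m5 - 16)
    root = cmath.sqrt(disc)
    den = 4 * m5 * (m5 + 1)
    return 3 * (1 + dm5), (-3 * m5 + root) / den, (-3 * m5 - root) / den


def cubic(m):
    """Cubic factor under the square root in Eq. (4.77)."""
    return 256 * m**3 + 160 * m**2 - 31 * m - 16


def positive_root(lo=0.0, hi=1.0, tol=1e-12):
    """Bisection for the single positive root of the cubic on [lo, hi]."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if cubic(mid) < 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    print(positive_root())              # upper edge of the oscillating regime
    print(p5_eigenvalues(0.1, 0.0))     # saddle with damped oscillation
    print(p5_eigenvalues(1e-4, 0.0))    # close to -3/4 +/- sqrt(-1/m5)
```

For 0 < m5 below the root the last two eigenvalues form a complex pair with negative real part, while the first one is positive whenever m5′ > −1, which is the saddle behavior described in the text.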

The eigenvalues for the Jacobian matrix of perturbations about the point P1 are

$$- 3,\quad - {3 \over 2} \pm {{\sqrt {25 - 16/{m_1}}} \over 2},$$
(4.79)

where m1 = m(r = −2). This shows that the condition for the stability of the de Sitter point P1 is [440, 243, 250, 26]

$$0 < m(r = - 2) \leq 1.$$
(4.80)
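As a quick check of the condition (4.80), the eigenvalues (4.79) can be evaluated for a few values of m1. This is only an illustrative sketch with hypothetical function names. The point is counted as stable when no eigenvalue has a positive real part, so that the marginal case m1 = 1 is included, in line with the inequality above.

```python
import cmath


def p1_eigenvalues(m1):
    """Eigenvalues of Eq. (4.79) at the de Sitter point P1."""
    s = cmath.sqrt(25 - 16 / m1)
    return (complex(-3), -1.5 + s / 2, -1.5 - s / 2)


def p1_is_stable(m1, eps=1e-12):
    """True if no eigenvalue has a positive real part."""
    return all(e.real <= eps for e in p1_eigenvalues(m1))


if __name__ == "__main__":
    for m1 in (-0.5, 0.3, 1.0, 1.5):
        print(m1, p1_is_stable(m1), p1_eigenvalues(m1))
```

For 0 < m1 < 16/25 the pair is complex (a stable spiral), for 16/25 ≤ m1 ≤ 1 it is real and non-positive, and outside 0 < m1 ≤ 1 one eigenvalue becomes positive.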

The trajectories that start from the saddle matter point P5 satisfying the condition (4.78) and then approach the stable de Sitter point P1 satisfying the condition (4.80) are, in general, cosmologically viable.

One can also show that P6 is stable and accelerated for (a) \(m_6^\prime < - 1,\,(\sqrt 3 - 1)/2 < {m_6} < 1\), (b) \(m_6^\prime > - 1,\,{m_6} < - (1 + \sqrt 3)/2\), (c) \(m_6^\prime > - 1,\, - 1/2 < {m_6} < 0\), (d) \(m_6^\prime > - 1,\,{m_6} \geq 1\). Since both P5 and P6 are on the line m = −r − 1, only the trajectories from \(m_5^\prime > - 1\) to \(m_6^\prime < - 1\) are allowed (see Figure 2). This means that only the case (a) is viable as a stable and accelerated fixed point P6. In this case the effective equation of state satisfies the condition weff > −1.

Figure 2: Four trajectories in the (r, m) plane. Each trajectory corresponds to one of the models: (i) ΛCDM, (ii) \(f(R) = (R^b - \Lambda)^c\), (iii) \(f(R) = R - \alpha R^n\) with α > 0, 0 < n < 1, and (iv) \(m(r) = -C(r + 1)(r^2 + ar + b)\). From [31].

From the above discussion the following two classes of models are cosmologically viable.

  • Class A: Models that connect P5 (r ≃ −1, m ≃ +0) to P1 (r = −2, 0 < m ≤ 1)

  • Class B: Models that connect P5 (r ≃ −1, m ≃ +0) to \({P_6}\left({m = - r - 1,\,\left({\sqrt 3 - 1} \right)/2 < m < 1} \right)\)

From Eq. (4.56) the viable f(R) dark energy models need to satisfy the condition m > 0, which is consistent with the above argument.

4.2 Viable f(R) dark energy models

We present a number of viable f(R) models in the (r, m) plane. First we note that the ΛCDM model corresponds to m = 0, in which case the trajectory is the straight line (i) in Figure 2. The trajectory (ii) in Figure 2 represents the model \(f(R) = (R^b - \Lambda)^c\) [31], which corresponds to the straight line m(r) = [(1 − c)/c]r + b − 1 in the (r, m) plane. The existence of a saddle matter epoch demands the condition c ≥ 1 and bc ≃ 1. The trajectory (iii) represents the model [26, 382]

$$f(R) = R - \alpha {R^n}\quad (\alpha > 0,0 < n < 1),$$
(4.81)

which corresponds to the curve m = n(1 + r)/r. The trajectory (iv) represents the model \(m(r) = -C(r + 1)(r^2 + ar + b)\), in which case the late-time accelerated attractor is the point P6 with \({\left({\sqrt 3 - 1} \right)/2 < m < 1}\).
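The curve m = n(1 + r)/r for the model (4.81) follows directly from the definitions (4.67) and (4.68). A minimal numerical sketch (the function name and the sample values of α, n, and R are illustrative choices) computes m and r from f, f,R, and f,RR and compares them with the curve:

```python
def m_and_r_power_law(R, alpha, n):
    """Return (m, r) of Eqs. (4.67)-(4.68) for f(R) = R - alpha*R**n."""
    f = R - alpha * R**n
    F = 1 - n * alpha * R ** (n - 1)               # f_{,R}
    f_RR = -n * (n - 1) * alpha * R ** (n - 2)     # f_{,RR}
    return R * f_RR / F, -R * F / f


if __name__ == "__main__":
    alpha, n = 1.0, 0.5
    for R in (10.0, 100.0, 1000.0):
        m, r = m_and_r_power_law(R, alpha, n)
        print(R, m, n * (1 + r) / r)   # the last two columns agree
```

As R grows the point (r, m) moves toward (−1, +0), where m ≃ n(−r − 1).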

In [26] it was shown that m needs to be close to 0 during the radiation domination as well as the matter domination. Hence the viable f(R) models are close to the ΛCDM model in the region R ≫ R0. The Ricci scalar remains positive from the radiation era up to the present epoch, as long as it does not oscillate around R = 0. The model \(f(R) = R - \alpha/R^n\) (α > 0, n > 0) is not viable because the condition f,RR > 0 is violated.

As we will see in Section 5, the local gravity constraints provide tight bounds on the deviation parameter m in the region of high density (R ≫ R0), e.g., \(m(R) \lesssim 10^{-15}\) for \(R = 10^5 R_0\) [134, 596]. In order to realize a large deviation from the ΛCDM model such as \(m(R) > {\mathcal O}(0.1)\) today (R = R0) we require that the variable m changes rapidly from the past to the present. The f(R) model given in Eq. (4.81), for example, does not allow such a rapid variation, because m evolves as m ≃ n(−r − 1) in the region R ≫ R0. Instead, if the deviation parameter has the dependence

$$m = C{(- r - 1)^p},\quad p > 1,\;\;C > 0,$$
(4.82)

it is possible to realize a rapid decrease of m as we go back to the past. The models that behave as Eq. (4.82) in the regime R ≫ R0 are

$$({\rm{A}})\;f(R) = R - \mu {R_c}{{{{(R/{R_c})}^{2n}}} \over {{{(R/{R_c})}^{2n}} + 1}}\quad {\rm{with}}\;\;n,\mu, {R_c} > 0,$$
(4.83)
$$({\rm{B}})\;f(R) = R - \mu {R_c}\left[ {1 - {{(1 + {R^2}/R_c^2)}^{- n}}} \right]\quad {\rm{with}}\;\;n,\mu, {R_c} > 0.$$
(4.84)

The models (A) and (B) have been proposed by Hu and Sawicki [306] and Starobinsky [568], respectively. Note that Rc roughly corresponds to the order of R0 for \(\mu = O(1)\). This means that p = 2n + 1 for R ≫ R0. In the next section we will show that both the models (A) and (B) are consistent with local gravity constraints for n ≳ 1.
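The exponent p = 2n + 1 can be illustrated numerically for the model (A). The sketch below is not taken from the original references: it sets Rc = 1, uses hand-derived first and second derivatives of the correction term, and estimates p as the logarithmic slope of m against (−r − 1) between two large values of x = R/Rc. The function names and sample points are assumptions made for illustration.

```python
import math


def model_a_m_r(x, n, mu):
    """Return (m, r) for model (A), Eq. (4.83), with R = x and Rc = 1."""
    u = x ** (2 * n)
    g = u / (u + 1)                                    # correction / (mu*Rc)
    g1 = 2 * n * x ** (2 * n - 1) / (u + 1) ** 2       # dg/dx
    g2 = (2 * n * x ** (2 * n - 2)
          * ((2 * n - 1) * (u + 1) - 4 * n * u) / (u + 1) ** 3)  # d2g/dx2
    f, F, f_RR = x - mu * g, 1 - mu * g1, -mu * g2
    return x * f_RR / F, -x * F / f


def slope_p(n, mu, x_lo=1.0e3, x_hi=2.0e3):
    """Estimate p in m = C(-r-1)^p from two points in the regime x >> 1."""
    m_lo, r_lo = model_a_m_r(x_lo, n, mu)
    m_hi, r_hi = model_a_m_r(x_hi, n, mu)
    return math.log(m_lo / m_hi) / math.log((-r_lo - 1) / (-r_hi - 1))


if __name__ == "__main__":
    for n in (1, 2):
        print(n, slope_p(n, mu=1.0))   # close to 2n + 1
```

The estimated slope approaches 2n + 1 as the sample points are pushed to larger x, consistent with Eq. (4.82).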

In the model (A) the following relation holds at the de Sitter point:

$$\mu = {{{{(1 + x_d^{2n})}^2}} \over {x_d^{2n - 1}(2 + 2x_d^{2n} - 2n)}},$$
(4.85)

where xd ≡ R1/Rc and R1 is the Ricci scalar at the de Sitter point. The stability condition (4.80) gives [587]

$$2x_d^{4n} - (2n - 1)(2n + 4)x_d^{2n} + (2n - 1)(2n - 2) \geq 0.$$
(4.86)

The parameter μ has a lower bound determined by the condition (4.86). When n = 1, for example, one has \({x_d} \geq \sqrt 3\) and \(\mu \geq 8\sqrt 3/9\). Under Eq. (4.86) one can show that the conditions (4.56) are also satisfied.

Similarly the model (B) satisfies [568]

$${(1 + x_d^2)^{n + 2}} \geq 1 + (n + 2)x_d^2 + (n + 1)(2n + 1)x_d^4,$$
(4.87)

with

$$\mu = {{{x_d}{{(1 + x_d^2)}^{n + 1}}} \over {2[{{(1 + x_d^2)}^{n + 1}} - 1 - (n + 1)x_d^2]}}.$$
(4.88)

When n = 1 we have \({x_d} \geq \sqrt 3\) and \(\mu \geq 8\sqrt 3/9\), which is the same as in the model (A). For general n, however, the bounds on μ in the model (B) are not identical to those in the model (A).
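The n = 1 bounds quoted for the models (A) and (B) can be verified by direct evaluation of Eqs. (4.85)–(4.88). The following sketch uses illustrative function names. The stability functions are written so that a non-negative value means the corresponding inequality, (4.86) or (4.87), holds.

```python
import math


def mu_model_a(xd, n):
    """mu at the de Sitter point of model (A), Eq. (4.85)."""
    u = xd ** (2 * n)
    return (1 + u) ** 2 / (xd ** (2 * n - 1) * (2 + 2 * u - 2 * n))


def stability_model_a(xd, n):
    """Left-hand side of Eq. (4.86); stable when the value is >= 0."""
    u = xd ** (2 * n)
    return 2 * u**2 - (2 * n - 1) * (2 * n + 4) * u + (2 * n - 1) * (2 * n - 2)


def mu_model_b(xd, n):
    """mu at the de Sitter point of model (B), Eq. (4.88)."""
    a = (1 + xd**2) ** (n + 1)
    return xd * a / (2 * (a - 1 - (n + 1) * xd**2))


def stability_model_b(xd, n):
    """Left minus right side of Eq. (4.87); stable when the value is >= 0."""
    return ((1 + xd**2) ** (n + 2)
            - (1 + (n + 2) * xd**2 + (n + 1) * (2 * n + 1) * xd**4))


if __name__ == "__main__":
    xd = math.sqrt(3)
    print(mu_model_a(xd, 1), mu_model_b(xd, 1), 8 * math.sqrt(3) / 9)
    print(stability_model_a(xd, 1), stability_model_b(xd, 1))
```

At xd = √3 both stability functions vanish and both expressions for μ give 8√3/9 ≈ 1.54, which is the marginal case for n = 1.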

Another model that leads to an even faster evolution of m is given by [587]

$$({\rm{C}})\;f(R) = R - \mu {R_c}\tanh \;(R/{R_c})\quad {\rm{with}}\;\;\mu, {R_c} > 0.$$
(4.89)

A similar model was proposed by Appleby and Battye [35]. In the region R ≫ Rc the model (4.89) behaves as f(R) ≃ R − μRc [1 − exp(−2R/Rc)], which may be regarded as a special case of (4.82) in the limit that p ≫ 1 (footnote 5). The Ricci scalar at the de Sitter point is determined by μ, as

$$\mu = {{{x_d}{{\cosh}^2}({x_d})} \over {2\sinh ({x_d})\cosh ({x_d}) - {x_d}}}.$$
(4.90)

From the stability condition (4.80) we obtain

$$\mu > 0.905,\quad \;{x_d} > 0.920{.}$$
(4.91)
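The numbers in Eq. (4.91) can be recovered with a short sketch. It assumes, from the definition (4.67) applied to the model (4.89), that \(F = 1 - \mu\,{\rm sech}^2 x\) and \(R_c f_{,RR} = 2\mu\,{\rm sech}^2 x\tanh x\) with x = R/Rc, inserts μ from Eq. (4.90), and finds by bisection the value of xd at which m = 1, the upper edge of the condition (4.80). The function names and the bisection bracket are illustrative choices.

```python
import math


def mu_of_xd(xd):
    """mu as a function of x_d for model (C), Eq. (4.90)."""
    c, s = math.cosh(xd), math.sinh(xd)
    return xd * c * c / (2 * s * c - xd)


def m_at_de_sitter(xd):
    """m = R f_RR / f_R at the de Sitter point of model (C)."""
    mu = mu_of_xd(xd)
    sech2 = 1 / math.cosh(xd) ** 2
    return 2 * mu * xd * sech2 * math.tanh(xd) / (1 - mu * sech2)


def marginal_xd(lo=0.5, hi=2.0, tol=1e-10):
    """Bisection for m(x_d) = 1; m > 1 at lo and m < 1 at hi."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if m_at_de_sitter(mid) > 1:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    xd = marginal_xd()
    print(xd, mu_of_xd(xd))   # marginal values of x_d and mu
```

The marginal point sits at xd ≈ 0.92, where μ ≈ 0.905; for larger xd one finds 0 < m < 1, in agreement with (4.91).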

The models (A), (B) and (C) are close to the ΛCDM model for R ≫ Rc, but the deviation from it appears when R decreases to the order of Rc. This leaves a number of observational signatures such as the phantom-like equation of state of dark energy and the modified evolution of matter density perturbations. In the following we discuss the dark energy equation of state in f(R) models. In Section 8 we study the evolution of density perturbations and resulting observational consequences in detail.

4.3 Equation of state of dark energy

In order to confront viable f(R) models with SN Ia observations, we rewrite Eqs. (4.59) and (4.60) as follows:

$$3A{H^2} = {\kappa ^2}({\rho _m} + {\rho _r} + {\rho _{{\rm{DE}}}}),$$
(4.92)
$$- 2A\dot H = {\kappa ^2}[{\rho _m} + (4/3){\rho _r} + {\rho _{{\rm{DE}}}} + {P_{{\rm{DE}}}}],$$
(4.93)

where A is some constant and

$${\kappa ^2}{\rho _{{\rm{DE}}}} \equiv (1/2)(FR - f) - 3H\dot F + 3{H^2}(A - F),$$
(4.94)
$${\kappa ^2}{P_{{\rm{DE}}}} \equiv \ddot F + 2H\dot F - (1/2)(FR - f) - (3{H^2} + 2\dot H)(A - F).$$
(4.95)

Defining ρDE and PDE in the above way, we find that these satisfy the usual continuity equation

$${\dot \rho _{{\rm{DE}}}} + 3H({\rho _{{\rm{DE}}}} + {P_{{\rm{DE}}}}) = 0{.}$$
(4.96)

Note that this holds as a consequence of the Bianchi identities, as we have already mentioned in the discussion from Eq. (2.8) to Eq. (2.10).

The dark energy equation of state, wDE ≡ PDE/ρDE, is directly related to the one used in SN Ia observations. From Eqs. (4.92) and (4.93) it is given by

$${w_{{\rm{DE}}}} = - {{2A\dot H + 3A{H^2} + {\kappa ^2}{\rho _r}/3} \over {3A{H^2} - {\kappa ^2}({\rho _m} + {\rho _r})}} \simeq {{{w_{{\rm{eff}}}}} \over {1 - (F/A){\Omega _m}}},$$
(4.97)

where the last approximate equality is valid in the regime where the radiation density ρr is negligible relative to the matter density ρm. The viable f(R) models approach the ΛCDM model in the past, i.e., F → 1 as R → ∞. In order to reproduce the standard matter era (\(3H^2 \simeq \kappa^2\rho_m\)) for z ≫ 1, we can choose A = 1 in Eqs. (4.92) and (4.93). Another possible choice is A = F0, where F0 is the present value of F. This choice may be suitable if the deviation of F0 from 1 is small (as in scalar-tensor theory with a nearly massless scalar field [583, 93]). In both cases the equation of state wDE can be smaller than −1 before reaching the de Sitter attractor [306, 31, 587, 435], while the effective equation of state weff is larger than −1. This comes from the fact that the denominator in Eq. (4.97) becomes smaller than 1 in the presence of the matter fluid. Thus f(R) gravity models give rise to the phantom equation of state of dark energy without violating any stability conditions of the system. See [210, 417, 136, 13] for observational constraints on the models (4.83) and (4.84) by using the background expansion history of the universe. Note that as long as the late-time attractor is the de Sitter point the cosmological constant boundary crossing of weff reported in [52, 50] does not typically occur, apart from small oscillations of weff around the de Sitter point.
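The way the denominator of Eq. (4.97) pushes wDE below −1 can be made concrete with a small sketch. The input values below are purely illustrative and are not fitted to any model or data set; the function name is hypothetical.

```python
def w_dark_energy(w_eff, omega_m, f_over_a=1.0):
    """Approximate w_DE of Eq. (4.97), valid when radiation is negligible."""
    return w_eff / (1 - f_over_a * omega_m)


if __name__ == "__main__":
    # Illustrative numbers only: w_eff > -1 while w_DE < -1.
    print(w_dark_energy(-0.75, 0.28))
    # At the de Sitter point (Omega_m = 0) the two quantities coincide.
    print(w_dark_energy(-1.0, 0.0))
```

With weff = −0.75 and Ωm = 0.28 the sketch gives wDE ≈ −1.04, so a phantom-like wDE arises even though weff stays above −1.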

There are some works that try to reconstruct the forms of f(R) by using some desired form for the evolution of the scale factor a(t) or the observational data of SN Ia [117, 130, 442, 191, 621, 252]. We need to caution that the procedure of reconstruction does not in general guarantee the stability of solutions. In scalar-tensor dark energy models, for example, it is known that a singular behavior sometimes arises at low-redshifts in such a procedure [234, 271]. In addition to the fact that the reconstruction method does not uniquely determine the forms of f(R), the observational data of the background expansion history alone is not yet sufficient to reconstruct f(R) models in high precision.

Finally we mention a number of works [115, 118, 119, 265, 319, 515, 542, 90] about the use of metric f(R) gravity as dark matter instead of dark energy. In most of past works the power-law f(R) model \(f = R^n\) has been used to obtain spherically symmetric solutions for galaxy clustering. In [118] it was shown that the theoretical rotation curves of spiral galaxies show good agreement with observational data for n = 1.7, while for broader samples the best-fit value of the power was found to be n = 2.2 [265]. However, these values are not compatible with the bound \(\vert n - 1\vert < 7.2 \times 10^{-19}\) derived in [62, 160] from a number of other observational constraints. Hence, it is unlikely that f(R) gravity works as the main source for dark matter.

5 Local Gravity Constraints

In this section we discuss the compatibility of f(R) models with local gravity constraints (see [469, 470, 245, 233, 154, 448, 251] for early works, and [31, 306, 134] for experimental constraints on viable f(R) dark energy models, and [101, 210, 330, 332, 471, 628, 149, 625, 329, 45, 511, 277, 534, 133, 445, 309, 89] for other related works). In an environment of high density such as Earth or Sun, the Ricci scalar R is much larger than the background cosmological value R0. If the outside of a spherically symmetric body is a vacuum, the metric can be described by a Schwarzschild exterior solution with R = 0. In the presence of non-relativistic matter with an energy density ρm, this gives rise to a contribution to the Ricci scalar R of the order \(\kappa^2\rho_m\).

If we consider local perturbations δR on a background characterized by the curvature R0, the validity of the linear approximation demands the condition δR ≪ R0. We first derive the solutions of linear perturbations under the approximation that the background metric \(g_{\mu \nu}^{(0)}\) is described by the Minkowski metric ημν. In the case of Earth and Sun the perturbation δR is much larger than R0, which means that the linear theory is no longer valid. In such a non-linear regime the effect of the chameleon mechanism [344, 343] becomes important, so that f(R) models can be consistent with local gravity tests.

5.1 Linear expansions of perturbations in the spherically symmetric background

First we decompose the quantities R, F(R), and Tμν into the background part and the perturbed part: R = R0 + δR, F = F0(1 + δF), and \(T_{\mu\nu} = {}^{(0)}T_{\mu\nu} + \delta T_{\mu\nu}\) about the approximate Minkowski background \((g_{\mu \nu}^{(0)} \approx {\eta _{\mu \nu}})\). In other words, although we consider R close to a mean-field value R0, the metric is still very close to the Minkowski case. The linear expansion of Eq. (2.7) in a time-independent background gives [470, 250, 154, 448]

$${\nabla ^2}{\delta _F} - {M^2}{\delta _F} = {{{\kappa ^2}} \over {3{F_0}}}\delta T,$$
(5.1)

where \(\delta T \equiv \eta^{\mu\nu}\delta T_{\mu\nu}\) and

$${M^2} \equiv {1 \over 3}\left[ {{{{f_{,R}}({R_0})} \over {{f_{,RR}}({R_0})}} - {R_0}} \right] = {{{R_0}} \over 3}\left[ {{1 \over {m({R_0})}} - 1} \right].$$
(5.2)

The variable m is defined in Eq. (4.67). Since 0 < m(R0) < 1 for viable f(R) models, it follows that \(M^2 > 0\) (recall that R0 > 0).

We consider a spherically symmetric body with mass Mc, constant density ρ (= −δT), radius rc, and vanishing density outside the body. Since δF is a function of the distance r from the center of the body, Eq. (5.1) reduces to the following form inside the body (r < rc):

$${{{{\rm{d}}^2}} \over {{\rm{d}}{r^2}}}{\delta _F} + {2 \over r}{{\rm{d}} \over {{\rm{d}}r}}{\delta _F} - {M^2}{\delta _F} = - {{{\kappa ^2}} \over {3{F_0}}}\rho,$$
(5.3)

whereas the r.h.s. vanishes outside the body (r > rc). The solution of the perturbation δF for positive M2 is given by

$${({\delta _F})_{r < {r_c}}} = {c_1}{{{e^{- Mr}}} \over r} + {c_2}{{{e^{Mr}}} \over r} + {{8\pi G\rho} \over {3{F_0}{M^2}}},$$
(5.4)
$${({\delta _F})_{r > {r_c}}} = {c_3}{{{e^{- Mr}}} \over r} + {c_4}{{{e^{Mr}}} \over r},$$
(5.5)

where ci (i = 1, 2, 3, 4) are integration constants. The requirement that \({({\delta _F})_{r > {r_c}}} \rightarrow 0\) as r → ∞ gives c4 = 0. The regularity condition at r = 0 requires that c2 = −c1. We match the two solutions (5.4) and (5.5) at r = rc by demanding the continuity of δF(r) and \(\delta _F^{\prime}(r)\). Since δF ∝ δR, this implies that R is also continuous. If the mass M satisfies the condition Mrc ≪ 1, we obtain the following solutions

$${({\delta _F})_{r < {r_c}}} \simeq {{4\pi G\rho} \over {3{F_0}}}\left({r_c^2 - {{{r^2}} \over 3}} \right),$$
(5.6)
$${({\delta _F})_{r > {r_c}}} \simeq {{2G{M_c}} \over {3{F_0}r}}{e^{- Mr}}.$$
(5.7)

As we have seen in Section 2.3, the action (2.1) in f(R) gravity can be transformed to the Einstein frame action by a transformation of the metric. The Einstein frame action is given by a linear action in \(\tilde R\), where \(\tilde R\) is a Ricci scalar in the new frame. The first-order solution for the perturbation hμν of the metric \({\tilde g_{\mu \nu}} = {F_0}({\eta _{\mu \nu}} + {h_{\mu \nu}})\) follows from the first-order linearized Einstein equations in the Einstein frame. This leads to the solutions h00 = 2 GMc/(F0r) and hij = 2GMc/(F0r) δij. Including the perturbation δF to the quantity F, the actual metric gμν is given by [448]

$${g_{\mu \nu}} = {{{{\tilde g}_{\mu \nu}}} \over F} \simeq {\eta _{\mu \nu}} + {h_{\mu \nu}} - {\delta _F}{\eta _{\mu \nu}}.$$
(5.8)

Using the solution (5.7) outside the body, the (00) and (ii) components of the metric gμν are

$${g_{00}} \simeq - 1 + {{2G_{{\rm{eff}}}^{(N)}{M_c}} \over r},\quad \;{g_{ii}} \simeq 1 + {{2G_{{\rm{eff}}}^{(N)}{M_c}} \over r}\gamma,$$
(5.9)

where \(G_{{\rm{eff}}}^{(N)}\) and γ are the effective gravitational coupling and the post-Newtonian parameter, respectively, defined by

$$G_{{\rm{eff}}}^{(N)} \equiv {G \over {{F_0}}}\left({1 + {1 \over 3}{e^{- Mr}}} \right),\quad \quad \gamma \equiv {{3 - {e^{- Mr}}} \over {3 + {e^{- Mr}}}}.$$
(5.10)

For the f(R) models whose deviation from the ΛCDM model is small (m ≪ 1), we have \(M^2 \simeq R_0/[3m(R_0)]\) and R ≃ 8πGρ. This gives the following estimate

$${(M{r_c})^2} \simeq 2{{{\Phi _c}} \over {m({R_0})}},$$
(5.11)

where \({\Phi _c} = G{M_c}/({F_0}{r_c}) = 4\pi G\rho r_c^2/(3{F_0})\) is the gravitational potential at the surface of the body. The approximation Mrc ≪ 1 used to derive Eqs. (5.6) and (5.7) corresponds to the condition

$$m({R_0}) \gg {\Phi _c}.$$
(5.12)

Since F0δF = f,RR(R0)δR, it follows that

$$\delta R = {{{f_{,R}}({R_0})} \over {{f_{,RR}}({R_0})}}{\delta _F}.$$
(5.13)

The validity of the linear expansion requires that δR ≪ R0, which translates into δF ≪ m(R0). Since δF ≃ 2GMc/(3F0rc) = 2Φc/3 at r = rc, one has δF ≪ m(R0) ≪ 1 under the condition (5.12). Hence the linear analysis given above is valid for m(R0) ≫ Φc.

For the distance r close to rc the post-Newtonian parameter in Eq. (5.10) is given by γ ≃ 1/2 (because Mr ≪ 1). The tightest experimental bound on γ is given by [616, 83, 617]:

$$\vert \gamma - 1\vert \; < 2.3 \times {10^{- 5}},$$
(5.14)

which comes from the time-delay effect of the Cassini tracking for the Sun. This means that f(R) gravity models with a light scalaron mass (Mrc ≪ 1) do not satisfy local gravity constraints [469, 470, 245, 233, 154, 448, 330, 332]. The mean density of Earth or Sun is of the order of \(\rho \simeq 1\text{–}10\,{\rm g/cm^3}\), which is much larger than the present cosmological density \(\rho _c^{(0)} \simeq {10^{- 29}}\,{\rm{g/cm^3}}\). In such an environment the condition δR ≪ R0 is violated and the field mass M becomes large such that Mrc ≫ 1. The effect of the chameleon mechanism [344, 343] becomes important in this nonlinear regime (δR ≫ R0) [251, 306, 134, 101]. In Section 5.2 we will show that the f(R) models can be consistent with local gravity constraints provided that the chameleon mechanism is at work.
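The tension between the light-scalaron limit and the bound (5.14) can be quantified with a small sketch based on Eq. (5.10). The function names are illustrative. The last function simply inverts |γ − 1| = 2e^{−Mr}/(3 + e^{−Mr}) to find how large Mr must be for a given bound, which is a rearrangement of Eq. (5.10) rather than a result quoted in the text.

```python
import math


def gamma_ppn(Mr):
    """Post-Newtonian parameter gamma of Eq. (5.10) as a function of M*r."""
    e = math.exp(-Mr)
    return (3 - e) / (3 + e)


def g_eff_over_g(Mr, F0=1.0):
    """Effective coupling G_eff^(N)/G of Eq. (5.10)."""
    return (1 + math.exp(-Mr) / 3) / F0


def min_Mr_for_bound(bound=2.3e-5):
    """Smallest M*r for which |gamma - 1| < bound."""
    return -math.log(3 * bound / (2 - bound))


if __name__ == "__main__":
    print(gamma_ppn(0.0), g_eff_over_g(0.0))   # light scalaron: 1/2 and 4/3
    print(min_Mr_for_bound())                  # M*r needed for the Cassini bound
```

In the limit Mr → 0 one recovers γ = 1/2 and an effective coupling enhanced by 4/3, whereas the bound (5.14) is met only once Mr exceeds roughly 10.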

5.2 Chameleon mechanism in f(R) gravity

Let us discuss the chameleon mechanism [344, 343] in metric f(R) gravity. Unlike the linear expansion approach given in Section 5.1, this corresponds to a non-linear effect arising from a large departure of the Ricci scalar from its background value R0. The mass of an effective scalar field degree of freedom depends on the density of its environment. If the matter density is sufficiently high, the field acquires a heavy mass about the potential minimum. Meanwhile the field has a lighter mass in a low-density cosmological environment relevant to dark energy so that it can propagate freely. As long as the spherically symmetric body has a thin-shell around its surface, the effective coupling between the field and matter becomes much smaller than the bare coupling ∣Q∣. In the following we shall review the chameleon mechanism for general couplings Q and then proceed to constrain f(R) dark energy models from local gravity tests.

5.2.1 Field profile of the chameleon field

The action (2.1) in f(R) gravity can be transformed to the Einstein frame action (2.32) with the coupling \(Q = - 1/\sqrt 6\) between the scalaron field \(\phi = \sqrt {3/(2{\kappa ^2})}\) ln F and non-relativistic matter. Let us consider a spherically symmetric body with radius \({\tilde r_c}\) in the Einstein frame. We approximate that the background geometry is described by the Minkowski space-time. Varying the action (2.32) with respect to the field ϕ, we obtain

$${{{{\rm{d}}^2}\phi} \over {{\rm{d}}{{\tilde r}^2}}} + {2 \over {\tilde r}}{{{\rm{d}}\phi} \over {{\rm{d}}\tilde r}} - {{{\rm{d}}{V_{{\rm{eff}}}}} \over {{\rm{d}}\phi}} = 0,$$
(5.15)

where \(\tilde r\) is a distance from the center of symmetry that is related to the distance r in the Jordan frame via \(\tilde r = \sqrt F r = {e^{- Q\kappa \phi}}r\). The effective potential Veff is defined by

$${V_{{\rm{eff}}}}(\phi) = V(\phi) + {e^{Q\kappa \phi}}{\rho ^{\ast}},$$
(5.16)

where ρ* is a conserved quantity in the Einstein frame [343]. Recall that the field potential V(ϕ) is given in Eq. (2.33). The energy density \(\tilde \rho\) in the Einstein frame is related with the energy density ρ in the Jordan frame via the relation \(\tilde \rho = \rho/{F^2} = {e^{4Q\kappa \phi}}\rho\). Since the conformal transformation gives rise to a coupling Q between matter and the field, \(\tilde \rho\) is not a conserved quantity. Instead the quantity \({\rho ^{\ast}} = {e^{3Q\kappa \phi}}\rho = {e^{- Q\kappa \phi}}\tilde \rho\) corresponds to a conserved quantity, which satisfies \({\tilde r^3}{\rho ^{\ast}} = {r^3}\rho\). Note that Eq. (5.15) is consistent with Eq. (2.42).

In the following we assume that a spherically symmetric body has a constant density ρ* = ρA inside the body \((\tilde r < {\tilde r_c})\) and that the energy density outside the body \((\tilde r > {\tilde r_c})\) is ρ* = ρB (≪ρA). The mass Mc of the body and the gravitational potential Φc at the radius \({\tilde r_c}\) are given by \({M_c} = (4\pi/3)\tilde r_c^3{\rho _A}\) and \({\Phi _c} = G{M_c}/{\tilde r_c}\), respectively. The effective potential has minima at the field values ϕA and ϕB:

$${V_{,\phi}}({\phi _A}) + \kappa Q{e^{Q\kappa {\phi _A}}}{\rho _A} = 0,$$
(5.17)
$${V_{,\phi}}({\phi _B}) + \kappa Q{e^{Q\kappa {\phi _B}}}{\rho _B} = 0.$$
(5.18)

The former corresponds to the region of high density with a heavy mass squared \(m_A^2 \equiv {V_{{\rm{eff,}}\phi \phi}}({\phi _A})\), whereas the latter to a lower density region with a lighter mass squared \(m_B^2 \equiv {V_{{\rm{eff,}}\phi \phi}}({\phi _B})\). In the case of Sun, for example, the field value ϕB is determined by the homogeneous dark matter/baryon density in our galaxy, i.e., \(\rho_B \simeq 10^{-24}\,{\rm g/cm^3}\).

When Q > 0 the effective potential has a minimum for the models with V,ϕ < 0, which occurs, e.g., for the inverse power-law potential \(V(\phi) = M^{4+n}\phi^{-n}\). The f(R) gravity corresponds to a negative coupling \((Q = - 1/\sqrt 6)\), in which case the effective potential has a minimum for V,ϕ > 0. As an example, let us consider the shape of the effective potential for the models (4.83) and (4.84). In the region R ≫ Rc both models behave as

$$f(R) \simeq R - \mu {R_c}\left[ {1 - {{({R_c}/R)}^{2n}}} \right].$$
(5.19)

For this functional form it follows that

$$F = {e^{{2 \over {\sqrt 6}}\kappa \phi}} = 1 - 2n\mu {(R/{R_c})^{- (2n + 1)}},$$
(5.20)
$$V(\phi) = {{\mu {R_c}} \over {2{\kappa ^2}}}{e^{- {4 \over {\sqrt 6}}\kappa \phi}}\left[ {1 - (2n + 1){{\left({{{- \kappa \phi} \over {\sqrt 6 n\mu}}} \right)}^{{{2n} \over {2n + 1}}}}} \right].$$
(5.21)

The r.h.s. of Eq. (5.20) is smaller than 1, so that ϕ < 0. The limit R → ∞ corresponds to ϕ → −0. In the limit ϕ → −0 one has \(V \rightarrow \mu R_c/(2\kappa^2)\) and V,ϕ → ∞. This property can be seen in the upper panel of Figure 3, which shows the potential V(ϕ) for the model (4.84) with parameters n = 1 and μ = 2. Because of the existence of the coupling term \({e^{- \kappa \phi/\sqrt 6}}{\rho ^{\ast}}\), the effective potential Veff(ϕ) has a minimum at

$$\kappa {\phi _M} = - \sqrt 6 n\mu {\left({{{{R_c}} \over {{\kappa ^2}{\rho ^{\ast}}}}} \right)^{2n + 1}}.$$
(5.22)

Since \(R \sim \kappa^2\rho^{\ast} \gg R_c\) in the region of high density, the condition ∣κϕM∣ ≪ 1 is in fact justified (for n and μ of the order of unity). The field mass mϕ about the minimum of the effective potential is given by

$$m_\phi ^2 = {1 \over {6n(n + 1)\mu}}{R_c}{\left({{{{\kappa ^2}{\rho ^{\ast}}} \over {{R_c}}}} \right)^{2(n + 1)}}.$$
(5.23)

This shows that, in the regime \(R \sim \kappa^2\rho^{\ast} \gg R_c\), mϕ is much larger than the present Hubble parameter \({H_0}(\sim \sqrt {{R_c}})\). Cosmologically the field evolves along the instantaneous minima characterized by Eq. (5.22) and then it approaches a de Sitter point which appears as a minimum of the potential in the upper panel of Figure 3.
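To get a feeling for the orders of magnitude in Eqs. (5.22) and (5.23), the sketch below evaluates both expressions as functions of the ratio κ²ρ*/Rc. The ratio 10^5 used in the example is an assumption for illustration only: it corresponds to comparing a density of order 10^−24 g/cm³ with the cosmological density of order 10^−29 g/cm³ and identifying Rc with κ² times the latter, up to factors of order unity. The function names are hypothetical.

```python
import math


def kappa_phi_min(n, mu, ratio):
    """kappa*phi_M of Eq. (5.22); ratio = kappa^2 rho* / Rc."""
    return -math.sqrt(6) * n * mu * ratio ** (-(2 * n + 1))


def mass_squared_over_rc(n, mu, ratio):
    """m_phi^2 / Rc of Eq. (5.23); ratio = kappa^2 rho* / Rc."""
    return ratio ** (2 * (n + 1)) / (6 * n * (n + 1) * mu)


if __name__ == "__main__":
    n, mu, ratio = 1, 2.0, 1.0e5   # illustrative values, see the lead-in
    print(kappa_phi_min(n, mu, ratio))         # tiny and negative
    print(mass_squared_over_rc(n, mu, ratio))  # enormous compared with unity
```

Even for this modest density contrast the field sits extremely close to ϕ = 0 and its mass squared exceeds Rc by many orders of magnitude, which is the behavior the chameleon mechanism relies on.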

Figure 3: (Top) The potential \(V(\phi) = (FR - f)/(2\kappa^2F^2)\) versus the field \(\phi = \sqrt {3/(16\pi)}\,{m_{{\rm{pl}}}}\ln F\) for Starobinsky's dark energy model (4.84) with n = 1 and μ = 2. (Bottom) The inverted effective potential −Veff for the same model parameters as the top with \({\rho ^{\ast}} = 10{R_c}m_{{\rm{pl}}}^2\). The field value at which the inverted effective potential has a maximum depends on the density ρ*, see Eq. (5.22). In the upper panel “de Sitter” corresponds to the minimum of the potential, whereas “singular” means that the curvature diverges at ϕ = 0.

In order to solve the “dynamics” of the field ϕ in Eq. (5.15), we need to consider the inverted effective potential (−Veff). See the lower panel of Figure 3 for illustration [which corresponds to the model (4.84)]. We impose the following boundary conditions:

$${{{\rm{d}}\phi} \over {{\rm{d}}\tilde r}}(\tilde r = 0) = 0,$$
(5.24)
$$\phi (\tilde r \rightarrow \infty) = {\phi _B}.$$
(5.25)

The boundary condition (5.25) can be also understood as \({\lim\nolimits_{\tilde r \rightarrow \infty}}{\rm{d}}\phi {\rm{/d}}\tilde r = 0\). The field ϕ is at rest at \(\tilde r = 0\) and starts to roll down the potential when the matter-coupling term \(\kappa Q\rho_A e^{Q\kappa\phi}\) in Eq. (5.15) becomes important at a radius \({\tilde r_1}\). If the field value at \(\tilde r = 0\) is close to ϕA, the field stays around ϕA in the region \(0 < \tilde r < {\tilde r_1}\). The body has a thin-shell if \({\tilde r_1}\) is close to the radius \({\tilde r_c}\) of the body.

In the region \(0 < \tilde r < {{\tilde r}_1}\) one can approximate the r.h.s. of Eq. (5.15) as \({\rm{d}}{V_{{\rm{eff}}}}/{\rm{d}}\phi \simeq m_A^2(\phi - {\phi _A})\) around ϕ = ϕA, where \(m_A^2 = {R_c}{({\kappa ^2}{\rho _A}/{R_c})^{2(n + 1)}}/[6n(n + 1)]\). Hence the solution to Eq. (5.15) is given by \(\phi (\tilde r) = {\phi _A} + A{e^{- {m_A}\tilde r}}/\tilde r + B{e^{{m_A}\tilde r}}/\tilde r\), where A and B are constants. In order to avoid the divergence of ϕ at \(\tilde r = 0\) we demand the condition B = −A, in which case the solution is

$$\phi (\tilde r) = {\phi _A} + {{A({e^{- {m_A}\tilde r}} - {e^{{m_A}\tilde r}})} \over {\tilde r}}\quad \quad (0 < \tilde r < {\tilde r_1}){.}$$
(5.26)

In fact, this satisfies the boundary condition (5.24).

In the region \({{\tilde r}_1} < \tilde r < {{\tilde r}_c}\) the field \(\vert\phi (\tilde r)\vert\) evolves toward larger values with the increase of \({\tilde r}\). In the lower panel of Figure 3 the field stays around the potential maximum for \(0 < \tilde r < {{\tilde r}_1}\), but in the regime \({{\tilde r}_1} < \tilde r < {{\tilde r}_c}\) it moves toward the left (largely negative ϕ region). Since \(\vert V_{,\phi}\vert \ll \vert\kappa Q\rho_A e^{Q\kappa\phi}\vert\) in this regime we have that \({\rm d}V_{\rm eff}/{\rm d}\phi \simeq \kappa Q\rho_A\) in Eq. (5.15), where we used the condition Qκϕ ≪ 1. Hence we obtain the following solution

$$\phi (\tilde r) = {1 \over 6}\kappa Q{\rho _A}{\tilde r^2} - {C \over {\tilde r}} + D\quad \quad ({\tilde r_1} < \tilde r < {\tilde r_c}),$$
(5.27)

where C and D are constants.

Since the field acquires a sufficient kinetic energy in the region \({{\tilde r}_1} < \tilde r < {{\tilde r}_c}\), the field climbs up the potential hill toward the largely negative ϕ region outside the body \((\tilde r > {{\tilde r}_c})\). The shape of the effective potential changes relative to that inside the body because the density drops from ρA to ρB. The kinetic energy of the field dominates over the potential energy, which means that the term dVeff/dϕ in Eq. (5.15) can be neglected. Recall that one has ∣ϕB∣ ≫ ∣ϕA∣ under the condition ρA ≫ ρB [see Eq. (5.22)]. Taking into account the mass term \(m_B^2 = {R_c}{({\kappa^2}{\rho _B}/{R_c})^{2(n + 1)}}/[6n(n + 1)]\), we have \({\rm{d}}{V_{{\rm{eff}}}}/{\rm{d}}\phi \simeq m_B^2(\phi - {\phi _B})\) on the r.h.s. of Eq. (5.15). Hence we obtain the solution \(\phi (\tilde r) = {\phi _B} + E{e^{- {m_B}(\tilde r - {{\tilde r}_c})}}/\tilde r + F{e^{{m_B}(\tilde r - {{\tilde r}_c})}}/\tilde r\) with constants E and F. Using the boundary condition (5.25), it follows that F = 0 and hence

$$\phi (\tilde r) = {\phi _B} + E{{{e^{- {m_B}(\tilde r - {{\tilde r}_c})}}} \over {\tilde r}}\quad (\tilde r > {\tilde r_c}){.}$$
(5.28)

Three solutions (5.26), (5.27) and (5.28) should be matched at \(\tilde r = {{\tilde r}_1}\) and \(\tilde r = {{\tilde r}_c}\) by imposing continuous conditions for ϕ and \({\rm{d}}\phi {\rm{/d}}\tilde r\). The coefficients A, C, D and E are determined accordingly [575]:

$$C = {{{s_1}{s_2}[({\phi _B} - {\phi _A}) + (\tilde r_1^2 - \tilde r_c^2)\kappa Q{\rho _A}/6] + [{s_2}\tilde r_1^2({e^{- {m_A}{{\tilde r}_1}}} - {e^{{m_A}{{\tilde r}_1}}}) - {s_1}\tilde r_c^2]\kappa Q{\rho _A}/3} \over {{m_A}({e^{- {m_A}{{\tilde r}_1}}} + {e^{{m_A}{{\tilde r}_1}}}){s_2} - {m_B}{s_1}}},$$
(5.29)
$$A = - {1 \over {{s_1}}}(C + \kappa Q{\rho _A}\tilde r_1^3/3),$$
(5.30)
$$E = - {1 \over {{s_2}}}(C + \kappa Q{\rho _A}\tilde r_c^3/3),$$
(5.31)
$$D = {\phi _B} - {1 \over 6}\kappa Q{\rho _A}\tilde r_c^2 + {1 \over {{{\tilde r}_c}}}(C + E),$$
(5.32)

where

$${s_1} \equiv {m_A}{\tilde r_1}({e^{- {m_A}{{\tilde r}_1}}} + {e^{{m_A}{{\tilde r}_1}}}) + {e^{- {m_A}{{\tilde r}_1}}} - {e^{{m_A}{{\tilde r}_1}}},$$
(5.33)
$${s_2} \equiv 1 + {m_B}{\tilde r_c}.$$
(5.34)
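The coefficients (5.29)–(5.34) can be checked numerically against the matching conditions they are meant to satisfy. The sketch below is only a consistency check with arbitrary illustrative parameter values (in units where κ = 1 and with the product κQρA passed as a single number `kq_rho`). All function names are hypothetical. It builds the three pieces (5.26), (5.27), and (5.28) and prints the jumps of ϕ and of dϕ/dr̃ at r̃1 and r̃c, with the derivatives estimated by central differences.

```python
import math


def coefficients(phi_a, phi_b, m_a, m_b, r1, rc, kq_rho):
    """Coefficients A, C, D, E of Eqs. (5.29)-(5.32)."""
    em, ep = math.exp(-m_a * r1), math.exp(m_a * r1)
    s1 = m_a * r1 * (em + ep) + em - ep                      # Eq. (5.33)
    s2 = 1 + m_b * rc                                        # Eq. (5.34)
    num = (s1 * s2 * ((phi_b - phi_a) + (r1**2 - rc**2) * kq_rho / 6)
           + (s2 * r1**2 * (em - ep) - s1 * rc**2) * kq_rho / 3)
    c = num / (m_a * (em + ep) * s2 - m_b * s1)
    a = -(c + kq_rho * r1**3 / 3) / s1
    e = -(c + kq_rho * rc**3 / 3) / s2
    d = phi_b - kq_rho * rc**2 / 6 + (c + e) / rc
    return a, c, d, e


def profile_pieces(phi_a, phi_b, m_a, m_b, r1, rc, kq_rho):
    """Return the three branches (5.26), (5.27), (5.28) as functions of r."""
    a, c, d, e = coefficients(phi_a, phi_b, m_a, m_b, r1, rc, kq_rho)
    inner = lambda r: phi_a + a * (math.exp(-m_a * r) - math.exp(m_a * r)) / r
    shell = lambda r: kq_rho * r**2 / 6 - c / r + d
    outer = lambda r: phi_b + e * math.exp(-m_b * (r - rc)) / r
    return inner, shell, outer


def derivative(func, r, h=1e-6):
    """Central finite-difference estimate of d(func)/dr."""
    return (func(r + h) - func(r - h)) / (2 * h)


if __name__ == "__main__":
    args = dict(phi_a=-0.01, phi_b=-1.0, m_a=5.0, m_b=0.1,
                r1=0.8, rc=1.0, kq_rho=-0.3)
    inner, shell, outer = profile_pieces(**args)
    r1, rc = args["r1"], args["rc"]
    print(inner(r1) - shell(r1), shell(rc) - outer(rc))
    print(derivative(inner, r1) - derivative(shell, r1),
          derivative(shell, rc) - derivative(outer, rc))
```

All four printed differences vanish up to rounding and finite-difference error, for any admissible choice of the input parameters.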

If the mass mB outside the body is small enough to satisfy the condition \({m_B}{{\tilde r}_c} \ll 1\) and mA ≫ mB, we can neglect the contribution of the mB-dependent terms in Eqs. (5.29)–(5.32). Then the field profile is given by [575]

$$\begin{array}{*{20}c} {\phi (\tilde r) = {\phi _A} - {1 \over {{m_A}({e^{- {m_A}{{\tilde r}_1}}} + {e^{{m_A}{{\tilde r}_1}}})}}\left[ {{\phi _B} - {\phi _A} + {1 \over 2}\kappa Q{\rho _A}(\tilde r_1^2 - \tilde r_c^2)} \right]{{{e^{- {m_A}\tilde r}} - {e^{{m_A}\tilde r}}} \over {\tilde r}},}\\ {(0 < \tilde r < {{\tilde r}_1}),\quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad}\\ \end{array}$$
(5.35)
$$\begin{array}{*{20}c} {\phi (\tilde r) = {\phi _B} + {1 \over 2}\kappa Q{\rho _A}({{\tilde r}^2} - 3\tilde r_c^2) + {{\kappa Q{\rho _A}\tilde r_1^3} \over {3\tilde r}}\quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad}\\ {- \left[ {1 + {{{e^{- {m_A}{{\tilde r}_1}}} - {e^{{m_A}{{\tilde r}_1}}}} \over {{m_A}{{\tilde r}_1}({e^{- {m_A}{{\tilde r}_1}}} + {e^{{m_A}{{\tilde r}_1}}})}}} \right]\left[ {{\phi _B} - {\phi _A} + {1 \over 2}\kappa Q{\rho _A}(\tilde r_1^2 - \tilde r_c^2)} \right]{{{{\tilde r}_1}} \over {\tilde r}},}\\ {({{\tilde r}_1} < \tilde r < {{\tilde r}_c}),\quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad}\\ \end{array}$$
(5.36)
$$\begin{array}{*{20}c} {\phi (\tilde r) = {\phi _B} - \left[ {{{\tilde r}_1}({\phi _B} - {\phi _A}) + {1 \over 6}\kappa Q{\rho _A}\tilde r_c^3\left({2 + {{{{\tilde r}_1}} \over {{{\tilde r}_c}}}} \right){{\left({1 - {{{{\tilde r}_1}} \over {{{\tilde r}_c}}}} \right)}^2}} \right.\quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad}\\ {+ \left. {{{{e^{- {m_A}{{\tilde r}_1}}} - {e^{{m_A}{{\tilde r}_1}}}} \over {{m_A}({e^{- {m_A}{{\tilde r}_1}}} + {e^{{m_A}{{\tilde r}_1}}})}}\left\{{{\phi _B} - {\phi _A} + {1 \over 2}\kappa Q{\rho _A}(\tilde r_1^2 - \tilde r_c^2)} \right\}} \right]{{{e^{- {m_B}(\tilde r - {{\tilde r}_c})}}} \over {\tilde r}},\quad}\\ {(\tilde r > {{\tilde r}_c}){.}\quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad}\\ \end{array}$$
(5.37)

Originally a similar field profile was derived in [344, 343] by assuming that the field is frozen at ϕ = ϕA in the region \(0 < \tilde r < {{\tilde r}_1}\).

The radius r1 is determined by the following condition

$$m_A^2[\phi ({\tilde r_1}) - {\phi _A}] = \kappa Q{\rho _A}.$$
(5.38)

This translates into

$${\phi _B} - {\phi _A} + {1 \over 2}\kappa Q{\rho _A}(\tilde r_1^2 - \tilde r_c^2) = {{6Q{\Phi _c}} \over {\kappa {{({m_A}{{\tilde r}_c})}^2}}}{{{m_A}{{\tilde r}_1}({e^{{m_A}{{\tilde r}_1}}} + {e^{- {m_A}{{\tilde r}_1}}})} \over {{e^{{m_A}{{\tilde r}_1}}} - {e^{- {m_A}{{\tilde r}_1}}}}},$$
(5.39)

where \({\Phi _c} = {\kappa ^2}{M_c}/(8\pi {{\tilde r}_c}) = {\kappa ^2}{\rho _A}\tilde r_c^2/6\) is the gravitational potential at the surface of the body. Using this relation, the field profile (5.37) outside the body reduces to

$$\begin{array}{*{20}c} {\phi (\tilde r) = {\phi _B} - {{2Q{\Phi _c}} \over \kappa}\left[ {1 - {{\tilde r_1^3} \over {\tilde r_c^3}} + 3{{{{\tilde r}_1}} \over {{{\tilde r}_c}}}{1 \over {{{({m_A}{{\tilde r}_c})}^2}}}\left\{{{{{m_A}{{\tilde r}_1}({e^{{m_A}{{\tilde r}_1}}} + {e^{- {m_A}{{\tilde r}_1}}})} \over {{e^{{m_A}{{\tilde r}_1}}} - {e^{- {m_A}{{\tilde r}_1}}}}} - 1} \right\}} \right]{{{{\tilde r}_c}{e^{- {m_B}(\tilde r - {{\tilde r}_c})}}} \over {\tilde r}},}\\ {(\tilde r > {{\tilde r}_c}){.}\quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad}\\ \end{array}$$
(5.40)

If the field value at \(\tilde r = 0\) is away from ϕA, the field rolls down the potential for \(\tilde r > 0\). This corresponds to taking the limit \({{\tilde r}_1} \to 0\) in Eq. (5.40), in which case the field profile outside the body is given by

$$\phi (\tilde r) = {\phi _B} - {{2Q} \over \kappa}{{G{M_c}} \over {\tilde r}}{e^{- {m_B}(\tilde r - {{\tilde r}_c})}}.$$
(5.41)

This shows that the effective coupling is of the order of Q and hence for \(\vert Q\vert = {\mathcal O}(1)\) local gravity constraints are not satisfied.

5.2.2 Thin-shell solutions

Let us consider the case in which \({{\tilde r}_1}\) is close to \({{\tilde r}_c}\), i.e.

$$\Delta {\tilde r_c} \equiv {\tilde r_c} - {\tilde r_1} \ll {\tilde r_c}.$$
(5.42)

This corresponds to the thin-shell regime in which the field is stuck inside the star except around its surface. If the field is sufficiently massive inside the star to satisfy the condition \({m_A}{{\tilde r}_c} \gg 1\), Eq. (5.39) gives the following relation

$${\epsilon _{{\rm{th}}}} \equiv {{\kappa ({\phi _B} - {\phi _A})} \over {6Q{\Phi _c}}} \simeq {{\Delta {{\tilde r}_c}} \over {{{\tilde r}_c}}} + {1 \over {{m_A}{{\tilde r}_c}}},$$
(5.43)

where ϵth is called the thin-shell parameter [344, 343]. Neglecting second-order terms with respect to \(\Delta {{\tilde r}_c}/{{\tilde r}_c}\) and \(1/({m_A}{{\tilde r}_c})\) in Eq. (5.40), it follows that

$$\phi (\tilde r) \simeq {\phi _B} - {{2{Q_{{\rm{eff}}}}} \over \kappa}{{G{M_c}} \over {\tilde r}}{e^{- {m_B}(\tilde r - {{\tilde r}_c})}},$$
(5.44)

where Qeff is the effective coupling given by

$${Q_{{\rm{eff}}}} = 3Q{\epsilon _{{\rm{th}}}}.$$
(5.45)

Since ϵth ≪ 1 under the conditions \(\Delta {{\tilde r}_c}/{{\tilde r}_c} \ll 1\) and \(1/({m_A}{{\tilde r}_c}) \ll 1\), the amplitude of the effective coupling Qeff becomes much smaller than 1. In the original papers of Khoury and Weltman [344, 343] the thin-shell solution was derived by assuming that the field is frozen with the value ϕ = ϕA in the region \(0 < \tilde r < {{\tilde r}_1}\). In this case the thin-shell parameter is given by \({\epsilon _{{\rm{th}}}} \simeq \Delta {{\tilde r}_c}/{{\tilde r}_c}\), which is different from Eq. (5.43). However, this difference is not important because the condition \(\Delta {{\tilde r}_c}/{{\tilde r}_c} \gg 1/({m_A}{{\tilde r}_c})\) is satisfied for most viable models [575].
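The step from Eq. (5.40) to Eq. (5.44) can be checked numerically. The following is a minimal sketch, not part of the original derivation: it evaluates the square bracket of Eq. (5.40) and compares it with 3ϵth from Eq. (5.43). The function names and the sample values of r̃1/r̃c and mA r̃c are illustrative assumptions.

```python
import math

def x_coth(x):
    """Return x*coth(x), using the small-x series 1 + x^2/3 near zero."""
    if abs(x) < 1e-6:
        return 1.0 + x * x / 3.0
    return x / math.tanh(x)

def exterior_amplitude(x1, m_rc):
    """Square bracket of Eq. (5.40); x1 = r1/rc and m_rc = m_A * rc."""
    return 1.0 - x1**3 + 3.0 * x1 / m_rc**2 * (x_coth(m_rc * x1) - 1.0)

def thin_shell_parameter(x1, m_rc):
    """Leading-order epsilon_th of Eq. (5.43): (rc - r1)/rc + 1/(m_A rc)."""
    return (1.0 - x1) + 1.0 / m_rc

# Illustrative (hypothetical) thin-shell values: shell thickness 0.1% of rc.
x1, m_rc = 0.999, 1.0e4
exact = exterior_amplitude(x1, m_rc)
approx = 3.0 * thin_shell_parameter(x1, m_rc)
print(exact, approx)
```

In the opposite limit, `exterior_amplitude(0.0, m_rc)` returns 1, which corresponds to the unsuppressed profile (5.41).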

5.2.3 Post Newtonian parameter

We derive the bound on the thin-shell parameter from experimental tests of the post-Newtonian parameter in the solar system. The spherically symmetric metric in the Einstein frame is described by [251]

$${\rm{d}}{\tilde s^2} = {\tilde g_{\mu \nu}}{\rm{d}}{\tilde x^\mu}{\rm{d}}{\tilde x^\nu} = - [1 - 2\tilde {\mathcal A}(\tilde r)]{\rm{d}}{t^2} + [1 + 2\tilde {\mathcal B}(\tilde r)]{\rm{d}}{\tilde r^2} + {\tilde r^2}{\rm{d}}{\Omega ^2},$$
(5.46)

where \(\tilde {\mathcal A}(\tilde r)\) and \(\tilde {\mathcal B}(\tilde r)\) are functions of \({\tilde r}\) and \({\rm{d}}{\Omega ^2} = {\rm{d}}{\theta ^2} + ({\sin ^2}\theta){\rm{d}}{\phi ^2}\). In the weak gravitational background \((\tilde {\mathcal A}(\tilde r) \ll 1\) and \(\tilde {\mathcal B}(\tilde r) \ll 1)\) the metric outside the spherically symmetric body with mass Mc is given by \(\tilde {\mathcal A}(\tilde r) \simeq \tilde {\mathcal B}(\tilde r) \simeq G{M_c}/\tilde r\).

Let us transform the metric (5.46) back to that in the Jordan frame under the inverse of the conformal transformation, \({g_{\mu \nu}} = {e^{2Q\kappa \phi}}{{\tilde g}_{\mu \nu}}\). Then the metric in the Jordan frame, \({\rm{d}}{s^2} = {e^{2Q\kappa \phi}}{\rm{d}}{{\tilde s}^2} = {g_{\mu \nu}}{\rm{d}}{x^\mu}{\rm{d}}{x^\nu}\), is given by

$${\rm{d}}{s^2} = - [1 - 2{\mathcal A}(r)]{\rm{d}}{t^2} + [1 + 2{\mathcal B}(r)]{\rm{d}}{r^2} + {r^2}{\rm{d}}{\Omega ^2}.$$
(5.47)

Under the condition ∣Qκϕ∣ ≪ 1 we obtain the following relations

$$\tilde r = {e^{Q\kappa \phi}}r,\quad {\mathcal A}(r) \simeq \tilde {\mathcal A}(\tilde r) - Q\kappa \phi (\tilde r),\quad {\mathcal B}(r) \simeq \tilde {\mathcal B}(\tilde r) - Q\kappa \tilde r{{{\rm{d}}\phi (\tilde r)} \over {{\rm{d}}\tilde r}}.$$
(5.48)

In the following we use the approximation \(r \simeq \tilde r\), which is valid for ∣Qκϕ∣ ≪ 1. Using the thin-shell solution (5.44), it follows that

$${\mathcal A}(r) = {{G{M_c}} \over r}[1 + 6{Q^2}{\epsilon _{{\rm{th}}}}(1 - r/{r_c})],\quad {\mathcal B}(r) = {{G{M_c}} \over r}(1 - 6{Q^2}{\epsilon _{{\rm{th}}}}),$$
(5.49)

where we have used the approximation ∣ϕB∣ ≫ ∣ϕA∣ and hence ϕB ≃ 6QΦcϵth/κ.

The term QκϕB in Eq. (5.48) is smaller than \({\mathcal A}(r) = G{M_c}/r\) under the condition \(r/{r_c} < {(6{Q^2}{\epsilon _{{\rm{th}}}})^{- 1}}\). Provided that the field ϕ reaches the value ϕB at a distance rB satisfying the condition \({r_B}/{r_c} < {(6{Q^2}{\epsilon _{{\rm{th}}}})^{- 1}}\), the metric function \({\mathcal A}(r)\) does not change its sign for r < rB. The post-Newtonian parameter γ is given by

$$\gamma \equiv {{{\mathcal B}(r)} \over {{\mathcal A}(r)}} \simeq {{1 - 6{Q^2}{\epsilon_{{\rm{th}}}}} \over {1 + 6{Q^2}{\epsilon_{{\rm{th}}}}(1 - r/{r_c})}}.$$
(5.50)

The experimental bound (5.14) can be satisfied as long as the thin-shell parameter ϵth is much smaller than 1. If we take the distance r = rc, the constraint (5.14) translates into

$${\epsilon_{{\rm{th}}, \odot}} < 3.8 \times {10^{- 6}}/{Q^2},$$
(5.51)

where ϵth,⊙ is the thin-shell parameter for the Sun. In f(R) gravity \((Q = - 1/\sqrt 6)\) this corresponds to \({\epsilon _{{\rm{th}}, \odot}} < 2.3 \times {10^{- 5}}\).
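As a rough numerical illustration of Eqs. (5.50) and (5.51), the sketch below evaluates γ and the resulting bound on ϵth. The bound (5.14) is not restated in this section, so the tolerance ∣γ − 1∣ < 2.3 × 10^-5 is an assumption here, chosen because it reproduces the coefficient in Eq. (5.51). The function names are illustrative.

```python
def gamma_ppn(Q, eps_th, r_over_rc=1.0):
    """Post-Newtonian parameter gamma of Eq. (5.50)."""
    x = 6.0 * Q**2 * eps_th
    return (1.0 - x) / (1.0 + x * (1.0 - r_over_rc))

def eps_th_bound(Q, gamma_tol=2.3e-5):
    """Largest eps_th with |gamma - 1| < gamma_tol at r = rc.

    gamma_tol is an assumed value for the bound (5.14).
    """
    return gamma_tol / (6.0 * Q**2)

Q_fR = -1.0 / 6.0**0.5           # coupling for f(R) gravity
bound_generic = eps_th_bound(1.0)  # coefficient of 1/Q^2 in Eq. (5.51)
bound_fR = eps_th_bound(Q_fR)
print(bound_generic, bound_fR)
```

At r = rc the denominator of Eq. (5.50) is unity, so the bound is linear in the tolerance and scales as 1/Q².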

5.2.4 Experimental bounds from the violation of equivalence principle

Let us next discuss constraints on the thin-shell parameter from the possible violation of the equivalence principle (EP). The tightest bound comes from the solar system tests of the weak EP using the free-fall accelerations of Earth \(({a_ \oplus})\) and Moon \(({a_{{\rm{Moon}}}})\) toward the Sun [343]. The experimental bound on the difference of the two accelerations is given by [616, 83, 617]

$${{\vert {a_ \oplus} - {a_{{\rm{Moon}}}}\vert} \over {\vert {a_ \oplus} + {a_{{\rm{Moon}}}}\vert/2}} < {10^{- 13}}.$$
(5.52)

Provided that Earth, Sun, and Moon have thin shells, the field profiles outside the bodies are given by Eq. (5.44) with the replacement of corresponding quantities. The presence of the field ϕ(r) with an effective coupling Qeff gives rise to an extra acceleration, \({a^{{\rm{fifth}}}} = \vert {Q_{{\rm{eff}}}}\nabla \phi (r)\vert\). Then the accelerations \({a_ \oplus}\) and \({a_{{\rm{Moon}}}}\) toward the Sun (mass \({M_ \odot}\)) are [343]

$${a_\oplus} \simeq {{G{M_ \odot}} \over {{r^2}}}\left[ {1 + 18{Q^2}\epsilon _{{\rm{th}}, \oplus}^2{{{\Phi _ \oplus}} \over {{\Phi _ \odot}}}} \right],$$
(5.53)
$${a_{{\rm{Moon}}}} \simeq {{G{M_ \odot}} \over {{r^2}}}\left[ {1 + 18{Q^2}\epsilon _{{\rm{th}}, \oplus}^2{{\Phi _ \oplus ^2} \over {{\Phi _ \odot}{\Phi _{{\rm{Moon}}}}}}} \right],$$
(5.54)

where ϵth,⊕ is the thin-shell parameter of Earth, and \({\Phi _ \odot} \simeq 2.1 \times {10^{- 6}}\), \({\Phi _ \oplus} \simeq 7.0 \times {10^{- 10}}\), \({\Phi _{{\rm{Moon}}}} \simeq 3.1 \times {10^{- 11}}\) are the gravitational potentials of Sun, Earth and Moon, respectively. Hence the condition (5.52) translates into [134, 596]

$${\epsilon_{{\rm{th}}, \oplus}} < 8.8 \times {10^{- 7}}/\vert Q\vert,$$
(5.55)

which corresponds to \({\epsilon _{{\rm{th}}, \oplus}} < 2.2 \times {10^{- 6}}\) in f(R) gravity. This provides a tighter bound on model parameters than (5.51).

Since the condition ∣ϕB∣ ≫ ∣ϕA∣ is satisfied for ρA ≫ ρB, one has \({\epsilon _{{\rm{th}}, \oplus}} \simeq \kappa {\phi _{B, \oplus}}/(6Q{\Phi _ \oplus})\) from Eq. (5.43). Then the bound (5.55) translates into

$$\vert \kappa {\phi _{B, \oplus}}\vert < 3.7 \times {10^{- 15}}.$$
(5.56)
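The numbers in Eqs. (5.55) and (5.56) can be reproduced with a few lines of arithmetic. The sketch below is only a consistency check under stated assumptions: it takes the bound in Eq. (5.52) to be 10^-13 (the value that reproduces the quoted coefficient 8.8 × 10^-7), uses the gravitational potentials quoted above, and the variable names are illustrative.

```python
import math

# Gravitational potentials quoted in the text (dimensionless).
PHI_SUN, PHI_EARTH, PHI_MOON = 2.1e-6, 7.0e-10, 3.1e-11
ETA_MAX = 1.0e-13  # assumed bound on the fractional acceleration difference

# From Eqs. (5.53)-(5.54):
# |a_Moon - a_Earth| / a_N = 18 Q^2 eps_th^2 * spread
spread = PHI_EARTH**2 / (PHI_SUN * PHI_MOON) - PHI_EARTH / PHI_SUN

# Bound of the form eps_th,Earth < eps_times_absQ / |Q|, cf. Eq. (5.55).
eps_times_absQ = math.sqrt(ETA_MAX / (18.0 * spread))

# f(R) gravity has |Q| = 1/sqrt(6).
eps_fR = eps_times_absQ * math.sqrt(6.0)

# |kappa phi_B| = 6 |Q| Phi_Earth eps_th, so |Q| cancels, cf. Eq. (5.56).
kappa_phi_B_max = 6.0 * PHI_EARTH * eps_times_absQ

print(eps_times_absQ, eps_fR, kappa_phi_B_max)
```

Because ∣Q∣ cancels in the last step, the bound on ∣κϕB,⊕∣ does not depend on the coupling.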

5.2.5 Constraints on model parameters in f(R) gravity

We place constraints on the f(R) models given in Eqs. (4.83) and (4.84) by using the experimental bounds discussed above. In the region of high density where R is much larger than Rc, one can use the asymptotic form (5.19) to discuss local gravity constraints. Inside and outside the spherically symmetric body the effective potential Veff for the model (5.19) has two minima at

$$\kappa {\phi _A} \simeq - \sqrt 6 n\mu {\left({{{{R_c}} \over {{\kappa ^2}{\rho _A}}}} \right)^{2n + 1}},\quad \;\;\kappa {\phi _B} \simeq - \sqrt 6 n\mu {\left({{{{R_c}} \over {{\kappa ^2}{\rho _B}}}} \right)^{2n + 1}}.$$
(5.57)

The bound (5.56) translates into

$${{n\mu} \over {x_d^{2n + 1}}}{\left({{{{R_1}} \over {{\kappa ^2}{\rho _B}}}} \right)^{2n + 1}} < 1.5 \times {10^{- 15}},$$
(5.58)

where \({x_d} \equiv {R_1}/{R_c}\) and R1 is the Ricci scalar at the late-time de Sitter point. In the following we consider the case in which the Lagrangian density is given by (5.19) for R ≥ R1. If we use the models (4.83) and (4.84), then there are some modifications to the estimation of R1. However, this change should be insignificant when we place constraints on model parameters.

At the de Sitter point the model (5.19) satisfies the condition \(\mu = x_d^{2n + 1}/[2(x_d^{2n} - n - 1)]\). Substituting this relation into Eq. (5.58), we find

$${n \over {2(x_d^{2n} - n - 1)}}{\left({{{{R_1}} \over {{\kappa ^2}{\rho _B}}}} \right)^{2n + 1}} < 1.5 \times {10^{- 15}}.$$
(5.59)

For the stability of the de Sitter point we require that m(R1) < 1, which translates into the condition \(x_d^{2n} > 2{n^2} + 3n + 1\). Hence the term \(n/[2(x_d^{2n} - n - 1)]\) in Eq. (5.59) is smaller than 0.25 for n > 0.

We use the approximation that R1 and ρB are of the orders of the present cosmological density \({10^{- 29}}\,{\rm{g/c}}{{\rm{m}}^3}\) and the baryonic/dark matter density \({10^{- 24}}\,{\rm{g/c}}{{\rm{m}}^3}\) in our galaxy, respectively. From Eq. (5.59) we obtain the bound [134]

$$n > 0{.}9{.}$$
(5.60)
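An order-of-magnitude version of the bound (5.60) can be recovered by solving Eq. (5.59) for n. The sketch below is an estimate under explicit assumptions, not the procedure of [134]: the ratio R1/(κ²ρB) is set to 10^-29/10^-24, and the prefactor n/[2(x_d^{2n} − n − 1)] is replaced by its stability limit 1/[4(n + 1)], which is the largest value it can take. The function names are illustrative.

```python
import math

RATIO = 1.0e-29 / 1.0e-24   # assumed R_1 / (kappa^2 rho_B), order of magnitude
BOUND = 1.5e-15             # right-hand side of Eq. (5.59)

def log10_lhs(n):
    """log10 of the l.h.s. of Eq. (5.59) with prefactor 1/[4(n+1)] (assumed)."""
    return (2.0 * n + 1.0) * math.log10(RATIO) - math.log10(4.0 * (n + 1.0))

def minimal_n(lo=0.1, hi=3.0, tol=1e-10):
    """Bisection for the n at which the l.h.s. equals the bound.

    log10_lhs decreases monotonically with n, so the bound holds for n above the root.
    """
    target = math.log10(BOUND)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if log10_lhs(mid) > target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

n_min = minimal_n()
print(n_min)
```

Since the true prefactor is smaller than 1/[4(n + 1)], the threshold found this way is slightly conservative; it should be read only as a check on the order of the quoted bound.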

Under this condition one can see an appreciable deviation from the ΛCDM model cosmologically as R decreases to the order of Rc.

If we consider the model (4.81), it was shown in [134] that the bound (5.56) gives the constraint \(n < 3 \times {10^{- 10}}\). This means that the deviation from the ΛCDM model is very small. Meanwhile, for the models (4.83) and (4.84), the deviation from the ΛCDM model can be large even for \(n = {\mathcal O}(1)\), while satisfying local gravity constraints. We note that the model (4.89) is also consistent with local gravity constraints.

6 Cosmological Perturbations

The f(R) theories have one extra scalar degree of freedom compared to the ΛCDM model. This feature results in more freedom for the background. As we have seen previously, a viable cosmological sequence of radiation, matter, and accelerated epochs is possible provided some conditions hold for f(R). In principle, however, one can specify any given H = H(a) and solve Eqs. (2.15) and (2.16) for those f(R(a)) compatible with the given H(a).

Therefore the background cosmological evolution is not in general enough to distinguish f(R) theories from other theories. Even worse, for the same H(a), there may be several different forms of f(R) which fulfill the Friedmann equations. Hence other observables are needed in order to distinguish between different theories. To achieve this goal, perturbation theory turns out to be of fundamental importance. Moreover, perturbation theory in cosmology has become as important as in particle physics, since it gives deep insight into these theories by providing information on the number of independent degrees of freedom, their speed of propagation, and their time evolution: all observables to be confronted with different data sets.

The main result of the perturbation analysis in f(R) gravity can be understood in the following way. Since it is possible to express this theory into a form of scalar-tensor theory, this should correspond to having a scalar-field degree of freedom which propagates with the speed of light. Therefore no extra vector or tensor modes come from the f(R) gravitational sector. Introducing matter fields will in general increase the number of degrees of freedom, e.g., a perfect fluid will only add another propagating scalar mode and a vector mode as well. In this section we shall provide perturbation equations for the general Lagrangian density f(R, ϕ) including metric f(R) gravity as a special case.

6.1 Perturbation equations

We start with a general perturbed metric about the flat FLRW background [57, 352, 231, 232, 437]

$${\rm{d}}{s^2} = - (1 + 2\alpha)\,{\rm{d}}{t^2} - 2a(t)\,({\partial _i}\beta - {S_i}){\rm{d}}t\,{\rm{d}}{x^i} + {a^2}(t)({\delta _{ij}} + 2\psi {\delta _{ij}} + 2{\partial _i}{\partial _j}\gamma + 2{\partial _j}{F_i} + {h_{ij}})\,{\rm{d}}{x^i}\,{\rm{d}}{x^j},$$
(6.1)

where α, β, ψ, γ are scalar perturbations, Si, Fi are vector perturbations, and hij are tensor perturbations. In this review we focus on scalar and tensor perturbations, because vector perturbations are generally unimportant in cosmology [71].

For generality we consider the following action

$$S = \int {{{\rm{d}}^4}} x\sqrt {- g} \,\left[ {{1 \over {2{\kappa ^2}}}f(R,\phi) - {1 \over 2}\omega (\phi){g^{\mu \nu}}{\partial _\mu}\phi {\partial _\nu}\phi - V(\phi)} \right] + {S_M}({g_{\mu \nu}},{\Psi _M})\,,$$
(6.2)

where f(R, ϕ) is a function of the Ricci scalar R and the scalar field ϕ, ω(ϕ) and V(ϕ) are functions of ϕ, and SM is a matter action. We do not take into account an explicit coupling between the field ϕ and matter. The action (6.2) covers not only f(R) gravity but also other modified gravity theories such as Brans-Dicke theory, scalar-tensor theories, and dilaton gravity. We define the quantity F(R, ϕ) ≡ ∂f/∂R. Varying the action (6.2) with respect to gμν and ϕ, we obtain the following field equations

$$\begin{array}{*{20}c} {F{R_{\mu \nu}} - {1 \over 2}f{g_{\mu \nu}} - {\nabla _\mu}{\nabla _\nu}F + {g_{\mu \nu}}\square F\quad \quad \quad \quad \quad \quad \;}\\ {= {\kappa ^2}\left[ {\omega \left({{\nabla _\mu}\phi {\nabla _\nu}\phi - {1 \over 2}{g_{\mu \nu}}{\nabla ^\lambda}\phi {\nabla _\lambda}\phi} \right) - V{g_{\mu \nu}} + T_{\mu \nu}^{(M)}} \right],}\\ \end{array}$$
(6.3)
$$\square\phi + {1 \over {2\omega}}\left({{\omega _{,\phi}}{\nabla ^\lambda}\phi {\nabla _\lambda}\phi - 2{V_{,\phi}} + {{{f_{,\phi}}} \over {{\kappa ^2}}}} \right) = 0\,,$$
(6.4)

where \(T_{\mu \nu}^{(M)}\) is the energy-momentum tensor of matter.

We decompose ϕ and F into homogeneous and perturbed parts, \(\phi = \bar \phi + \delta \phi\) and \(F = \bar F + \delta F\), respectively. In the following we omit the bar for simplicity. The energy-momentum tensor of an ideal fluid with perturbations is

$$T_0^0 = - ({\rho _M} + \delta {\rho _M})\,,\quad T_i^0 = - ({\rho _M} + {P_M}){\partial _i}v\,,\quad T_j^i = ({P_M} + \delta {P_M})\delta _j^i\,,$$
(6.5)

where v characterizes the velocity potential of the fluid. The conservation of the energy-momentum tensor \(({\nabla ^\mu}{T_{\mu \nu}} = 0)\) holds for the theories with the action (6.2) [357].

For the action (6.2) the background equations (without metric perturbations) are given by

$$3F{H^2} = {1 \over 2}(RF - f) - 3H\dot F + {\kappa ^2}\left[ {{1 \over 2}\omega {{\dot \phi}^2} + V(\phi) + {\rho _M}} \right]\,,$$
(6.6)
$$- 2F\dot H = \ddot F - H\dot F + {\kappa ^2}\omega {\dot \phi ^2} + {\kappa ^2}({\rho _M} + {P_M})\,,$$
(6.7)
$$\ddot \phi + 3H\dot \phi + {1 \over {2\omega}}\left({{\omega _{,\phi}}{{\dot \phi}^2} + 2{V_{,\phi}} - {{{f_{,\phi}}} \over {{\kappa ^2}}}} \right) = 0\,,$$
(6.8)
$${\dot \rho _M} + 3H({\rho _M} + {P_M}) = 0\,,$$
(6.9)

where R is given in Eq. (2.13).

For later convenience, we define the following perturbed quantities

$$\chi \equiv a(\beta + a\dot \gamma)\,,\qquad A \equiv 3(H\alpha - \dot \psi) - {\Delta \over {{a^2}}}\chi \,.$$
(6.10)

Perturbing Einstein equations at linear order, we obtain the following equations [316, 317] (see also [436, 566, 355, 438, 312, 313, 314, 492, 138, 33, 441, 328])

$$\begin{array}{*{20}c} {{\Delta \over {{a^2}}}\psi + HA = - {1 \over {2F}}\left[ {\left({3{H^2} + 3\dot H + {\Delta \over {{a^2}}}} \right)\delta F - 3H\dot{\delta F} + {1 \over 2}\left({{\kappa ^2}{\omega _{,\phi}}{{\dot \phi}^2} + 2{\kappa ^2}{V_{,\phi}} - {f_{,\phi}}} \right)\delta \phi} \right.\;\;\;\;\;\;\;\;}\\ {\left. {+ {\kappa ^2}\omega \dot \phi \dot \delta \phi + (3H\dot F - {\kappa ^2}\omega {{\dot \phi}^2})\alpha + \dot FA + {\kappa ^2}\delta {\rho _M}} \right]\,,}\\ \end{array}$$
(6.11)
$$H\alpha - \dot \psi = {1 \over {2F}}\left[ {{\kappa ^2}\omega \dot \phi \delta \phi + \dot{\delta F}- H\delta F - \dot F\alpha + {\kappa ^2}({\rho _M} + {P_M})v} \right]\,,$$
(6.12)
$$\dot \chi + H\chi - \alpha - \psi = {1 \over F}(\delta F - \dot F\chi)\,,$$
(6.13)
$$\begin{array}{*{20}c} {\dot A + 2HA + \left({3\dot H + {\Delta \over {{a^2}}}} \right)\alpha = {1 \over {2F}}\left[ {3\ddot{\delta F} + 3H\dot{\delta F} - \left({6{H^2} + {\Delta \over {{a^2}}}} \right)\delta F + 4{\kappa ^2}\omega \dot \phi \dot{\delta \phi}} \right.\quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad}\\ {+ (2{\kappa ^2}{\omega _{,\phi}}{{\dot \phi}^2} - 2{\kappa ^2}{V_{,\phi}} + {f_{,\phi}})\delta \phi - 3\dot F\dot \alpha - \dot FA\quad \quad}\\ {\left. {- (4{\kappa ^2}\omega {{\dot \phi}^2} + 3H\dot F + 6\ddot F)\alpha + {\kappa ^2}(\delta {\rho _M} + \delta {P_M})} \right],}\\ \end{array}$$
(6.14)
$$\begin{array}{*{20}c} {\ddot{\delta F}+ 3H\dot{\delta F} - \left({{\Delta \over {{a^2}}} + {R \over 3}} \right)\delta F + {2 \over 3}{\kappa ^2}\dot \phi \dot{\delta} \phi + {1 \over 3}({\kappa ^2}{\omega _{,\phi}}{{\dot \phi}^2} - 4{\kappa ^2}{V_{,\phi}} + 2{f_{,\phi}})\delta \phi}\\ {= {1 \over 3}{\kappa ^2}(\delta {\rho _M} - 3\delta {P_M}) + \dot F(A + \dot \alpha) + \left({2\ddot F + 3H\dot F + {2 \over 3}{\kappa ^2}\omega {{\dot \phi}^2}} \right)\alpha - {1 \over 3}F\delta R\,,}\\ \end{array}$$
(6.15)
$$\begin{array}{*{20}c} {\delta \ddot \phi + \left({3H + {{{\omega _{,\phi}}} \over \omega}\dot \phi} \right)\delta \dot \phi + \left[ {- {\Delta \over {{a^2}}} + {{\left({{{{\omega _{,\phi}}} \over \omega}} \right)}_{,\phi}}{{{{\dot \phi}^2}} \over 2} + {{\left({{{2{V_{,\phi}} - {f_{,\phi}}} \over {2\omega}}} \right)}_{,\phi}}} \right]\delta \phi}\\ {= \dot \phi \dot \alpha + \left({2\ddot \phi + 3H\dot \phi + {{{\omega _{,\phi}}} \over \omega}{{\dot \phi}^2}} \right)\alpha + \dot \phi A + {1 \over {2\omega}}{F_{,\phi}}\delta R\,,\quad \quad \quad \quad \quad}\\ \end{array}$$
(6.16)
$$\delta {\dot \rho _M} + 3H(\delta {\rho _M} + \delta {P_M}) = ({\rho _M} + {P_M})\left({A - 3H\alpha + {\Delta \over {{a^2}}}v} \right)\,,$$
(6.17)
$${1 \over {{a^3}({\rho _M} + {P_M})}}\,{{\rm{d}} \over {{\rm{d}}t}}[{a^3}({\rho _M} + {P_M})v] = \alpha + {{\delta {P_M}} \over {{\rho _M} + {P_M}}}\,,$$
(6.18)

where δR is given by

$$\delta R = - 2\left[ {\dot A + 4HA + \left({{\Delta \over {{a^2}}} + 3\dot H} \right)\alpha + 2{\Delta \over {{a^2}}}\psi} \right]\,.$$
(6.19)

We shall solve the above equations in two different contexts: (i) inflation (Section 7), and (ii) the matter dominated epoch followed by the late-time cosmic acceleration (Section 8).

6.2 Gauge-invariant quantities

Before discussing the details of the evolution of cosmological perturbations, we construct a number of gauge-invariant quantities. This is required to avoid the appearance of unphysical modes. Let us consider the gauge transformation

$$\hat t = t + \delta t\,,\qquad {\hat x^i} = {x^i} + {\delta ^{ij}}{\partial _j}\delta x\,,$$
(6.20)

where δt and δx characterize the time slicing and the spatial threading, respectively. Then the scalar metric perturbations α, β, ψ and γ transform as [57, 71, 412]

$$\hat \alpha = \alpha - \dot \delta t\,,$$
(6.21)
$$\hat \beta = \beta - {a^{- 1}}\delta t + a\dot \delta x\,,$$
(6.22)
$$\hat \psi = \psi - H\delta t\,,$$
(6.23)
$$\hat \gamma = \gamma - \delta x\,.$$
(6.24)

Matter perturbations such as δϕ and δρ obey the following transformation rule

$$\hat{\delta \phi} = \delta \phi - \dot \phi \,\delta t\,,$$
(6.25)
$$\hat{\delta \rho} = \delta \rho - \dot \rho \,\delta t\,.$$
(6.26)

Note that the quantity δF is also subject to the same transformation: \(\overset \wedge {\delta F} = \delta F - \dot F\delta t\). We express the scalar part of the 3-momentum component of the energy-momentum tensor, \(\delta T_i^0\), as

$$\delta T_i^0 = {\partial _i}\delta q\,.$$
(6.27)

For the scalar field and the perfect fluid one has \(\delta q = - \dot \phi \delta \phi\) and δq = −(ρM + PM)v, respectively. This quantity transforms as

$$\hat{\delta q} = \delta q + (\rho + P)\delta t\,.$$
(6.28)

One can construct a number of gauge-invariant quantities unchanged under the transformation (6.20):

$$\Phi = \alpha - {{\rm{d}} \over {{\rm{d}}t}}[{a^2}(\dot \gamma + \beta/a)]\,,\qquad \Psi = - \psi + {a^2}H(\dot \gamma + \beta/a)\,,$$
(6.29)
$${\mathcal R} = \psi + {H \over {\rho + P}}\delta q\,,\qquad {{\mathcal R}_{\delta \phi}} = \psi - {H \over {\dot \phi}}\delta \phi \,,\qquad {{\mathcal R}_{\delta F}} = \psi - {H \over {\dot F}}\delta F\,,$$
(6.30)
$$\delta {\rho _q} = \delta \rho - 3H\delta q\,.$$
(6.31)

Since \(\delta q = - \dot \phi \delta \phi\) for single-field inflation with a potential V(ϕ), \({\mathcal R}\) is identical to \({{\mathcal R}_{\delta \phi}}\) [where we used \(\rho = {{\dot \phi}^2}/2 + V(\phi)\) and \(P = {{\dot \phi}^2}/2 - V(\phi)]\). In f(R) gravity one can introduce a scalar field ϕ as in Eq. (2.31), so that \({{\mathcal R}_{\delta F}} = {{\mathcal R}_{\delta \phi}}\). From the gauge-invariant quantity (6.31) it is also possible to construct another gauge-invariant quantity for the matter perturbation of perfect fluids:

$${\delta _M} = {{\delta {\rho _M}} \over {{\rho _M}}} + 3H(1 + {w_M})v\,,$$
(6.32)

where wM = PM/ρM.

We note that the tensor perturbation hij is invariant under the gauge transformation [412].
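The gauge invariance of \({\mathcal R}\) in Eq. (6.30) and of δρq in Eq. (6.31) follows algebraically from the transformation rules (6.23), (6.26) and (6.28) together with the background continuity equation. The following sketch checks this with random numerical values; it is an illustration only, and the function names are assumptions.

```python
import random

def transform(psi, dq, drho, H, rho, P, dt):
    """Apply Eqs. (6.23), (6.28), (6.26) for a time shift dt."""
    rho_dot = -3.0 * H * (rho + P)  # background continuity equation
    return psi - H * dt, dq + (rho + P) * dt, drho - rho_dot * dt

def curvature(psi, dq, H, rho, P):
    """Comoving curvature perturbation R of Eq. (6.30)."""
    return psi + H * dq / (rho + P)

def drho_q(drho, dq, H):
    """Gauge-invariant density perturbation of Eq. (6.31)."""
    return drho - 3.0 * H * dq

random.seed(0)
max_shift = 0.0
for _ in range(1000):
    psi, dq, drho, dt = (random.uniform(-1, 1) for _ in range(4))
    H, rho = random.uniform(0.1, 2), random.uniform(0.5, 2)
    P = random.uniform(-0.3, 0.3) * rho
    psi2, dq2, drho2 = transform(psi, dq, drho, H, rho, P, dt)
    max_shift = max(
        max_shift,
        abs(curvature(psi2, dq2, H, rho, P) - curvature(psi, dq, H, rho, P)),
        abs(drho_q(drho2, dq2, H) - drho_q(drho, dq, H)),
    )
print(max_shift)  # expected to be at the level of floating-point rounding
```

The check for Φ and Ψ in Eq. (6.29) involves time derivatives of δt and δx and is not covered by this purely algebraic test.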

We can choose specific gauge conditions to fix the gauge degree of freedom. After fixing a gauge, the two scalar variables δt and δx are determined accordingly. The longitudinal gauge corresponds to the gauge choice \(\hat \beta = 0\) and \(\hat \gamma = 0\), under which \(\delta t = a(\beta + a\dot \gamma)\) and δx = γ. In this gauge one has \(\hat \Phi = \hat \alpha\) and \(\hat \Psi = - \hat \psi\), so that the line element (without vector and tensor perturbations) is given by

$${\rm{d}}{s^2} = - (1 + 2\Phi){\rm{d}}{t^2} + {a^2}(t)(1 - 2\Psi){\delta _{ij}}{\rm{d}}{x^i}{\rm{d}}{x^j}\,,$$
(6.33)

where we omitted the hat for perturbed quantities.

The uniform-field gauge corresponds to \(\overset \wedge {\delta \phi} = 0\) which fixes \(\delta t = \delta \phi/\dot \phi\). The spatial threading δx is fixed by choosing either \(\hat \beta = 0\) or \(\hat \gamma = 0\) (up to an integration constant in the former case). For this gauge choice one has \({{\hat {\mathcal R}}_{\delta \phi}} = \hat \psi\). Since the spatial curvature \(^{(3)}{\mathcal R}\) on the constant-time hypersurface is related to ψ via the relation \(^{(3)}{\mathcal R} = - 4{\nabla ^2}\psi/{a^2}\), the quantity \({\mathcal R}\) is often called the curvature perturbation on the uniform-field hypersurface. We can also choose the gauge condition \(\overset \wedge {\delta q} = 0\) or \(\overset \wedge {\delta F} = 0\).

7 Perturbations Generated During Inflation

Let us consider scalar and tensor perturbations generated during inflation for the theories (6.2) without taking into account the perfect fluid (SM = 0). In f(R) gravity the contribution of the field ϕ such as δϕ is absent in the perturbation equations (6.11)–(6.16). One can choose the gauge condition δF = 0, so that \({{\mathcal R}_{\delta F}} = \psi\). In scalar-tensor theory in which F is a function of ϕ alone (i.e., the coupling of the form F(ϕ)R without a non-linear term in R), the gauge choice δϕ = 0 leads to \({{\mathcal R}_{\delta \phi}} = \psi\). Since δF = F,ϕδϕ = 0 in this case, we have \({{\mathcal R}_{\delta F}} = {{\mathcal R}_{\delta \phi}} = \psi\).

We focus on effective single-field theories such as f(R) gravity and scalar-tensor theory with the coupling F(ϕ)R, by choosing the gauge condition δϕ = 0 and δF = 0. We caution that this analysis does not cover theories such as \({\mathcal L} = \xi (\phi)R + \alpha {R^2}\) [500], because the quantity F depends on both ϕ and R (in other words, δF = F,ϕδϕ + F,RδR). In the following we write the curvature perturbations \({{\mathcal R}_{\delta F}}\) and \({{\mathcal R}_{\delta \phi}}\) as \({\mathcal R}\).

7.1 Curvature perturbations

Since δϕ = 0 and δF = 0 in Eq. (6.12) we obtain

$$\alpha = {\dot {\mathcal R} \over {H + \dot F/(2F)}}\,.$$
(7.34)

Plugging Eq. (7.34) into Eq. (6.11), we have

$$A = - {1 \over {H + \dot F/(2F)}}\left[ {{\Delta \over {{a^2}}}{\mathcal R} + {{3H\dot F - {\kappa ^2}\omega {{\dot \phi}^2}} \over {2F\{H + \dot F/(2F)\}}}\dot {\mathcal R}} \right]\,.$$
(7.35)

Equation (6.14) gives

$$\dot A + \left({2H + {{\dot F} \over {2F}}} \right)A + {{3\dot F} \over {2F}}\dot \alpha + \left[ {{{3\ddot F + 6H\dot F + {\kappa ^2}\omega {{\dot \phi}^2}} \over {2F}} + {\Delta \over {{a^2}}}} \right]\alpha = 0\,,$$
(7.36)

where we have used the background equation (6.7). Plugging Eqs. (7.34) and (7.35) into Eq. (7.36), we find that the curvature perturbation satisfies the following simple equation in Fourier space

$$\ddot {\mathcal R}+ {{{{({a^3}{Q_s})}^ \cdot}} \over {{a^3}{Q_s}}}\dot {\mathcal R}+ {{{k^2}} \over {{a^2}}}{\mathcal R} = 0\,,$$
(7.37)

where k is a comoving wavenumber and

$${Q_s} \equiv {{\omega {{\dot \phi}^2} + 3{{\dot F}^2}/(2{\kappa ^2}F)} \over {{{[H + \dot F/(2F)]}^2}}}\,.$$
(7.38)

Introducing the variables \({z_s} = a\sqrt {{Q_s}}\) and \(u = {z_s}{\mathcal R}\), Eq. (7.37) reduces to

$$u^{\prime\prime} + \left({{k^2} - {{z_s^{\prime\prime}} \over {{z_s}}}} \right)u = 0\,,$$
(7.39)

where a prime represents a derivative with respect to the conformal time η = ∫ a−1dt.

In General Relativity with a canonical scalar field ϕ one has ω = 1 and F = 1, which corresponds to \({Q_s} = {{\dot \phi}^2}/{H^2}\). Then the perturbation u corresponds to \(u = a[ - \delta \phi + (\dot \phi/H)\psi ]\). In the spatially flat gauge (ψ = 0) this reduces to u = − aδϕ, which implies that the perturbation u corresponds to a canonical scalar field δχ = aδϕ. In modified gravity theories it is not clear at this stage that the perturbation \(u = a\sqrt {{Q_s}} {\mathcal R}\) corresponds to a canonical field that should be quantized, because Eq. (7.37) is unchanged when the quantity Qs defined in Eq. (7.38) is multiplied by a constant. As we will see in Section 7.4, this problem is overcome by considering a second-order perturbed action for the theory (6.2) from the beginning.

In order to derive the spectrum of curvature perturbations generated during inflation, we introduce the following variables [315]

$${\epsilon _1} \equiv - {{\dot H} \over {{H^2}}}\,,\quad {\epsilon _2} \equiv {{\ddot \phi} \over {H\dot \phi}}\,,\quad {\epsilon _3} \equiv {{\dot F} \over {2HF}}\,,\quad {\epsilon _4} \equiv {{\dot E} \over {2HE}}\,,$$
(7.40)

where \(E \equiv F[\omega + 3{{\dot F}^2}/(2{\kappa ^2}{{\dot \phi}^2}F)]\). Then the quantity Qs can be expressed as

$${Q_s} = {\dot \phi ^2}{E \over {F{H^2}{{(1 + {\epsilon _3})}^2}}}\,.$$
(7.41)
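The equivalence of the two expressions for Qs, Eqs. (7.38) and (7.41), and the General Relativity limit \({Q_s} = {{\dot \phi}^2}/{H^2}\) can be verified numerically. The sketch below uses arbitrary test values and units with κ² = 1 by default; the function names and the sample point are illustrative assumptions.

```python
def Q_s(omega, phi_dot, F, F_dot, H, kappa2=1.0):
    """Q_s as defined in Eq. (7.38)."""
    numerator = omega * phi_dot**2 + 3.0 * F_dot**2 / (2.0 * kappa2 * F)
    return numerator / (H + F_dot / (2.0 * F)) ** 2

def Q_s_slow_roll_form(omega, phi_dot, F, F_dot, H, kappa2=1.0):
    """Q_s written with E and epsilon_3, Eq. (7.41)."""
    E = F * (omega + 3.0 * F_dot**2 / (2.0 * kappa2 * phi_dot**2 * F))
    eps3 = F_dot / (2.0 * H * F)
    return phi_dot**2 * E / (F * H**2 * (1.0 + eps3) ** 2)

# Arbitrary (hypothetical) test point.
args = dict(omega=1.3, phi_dot=0.2, F=0.9, F_dot=0.05, H=1.1)
qs_a = Q_s(**args)
qs_b = Q_s_slow_roll_form(**args)

# General Relativity with a canonical field: omega = 1, F = 1, F_dot = 0.
qs_gr = Q_s(omega=1.0, phi_dot=0.2, F=1.0, F_dot=0.0, H=1.1)
print(qs_a, qs_b, qs_gr)
```

Note that the form (7.41) divides by \({{\dot \phi}^2}\) inside E, so it is not suitable for pure f(R) gravity, where Eq. (7.38) with ω = 0 should be used directly.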

If the parameter ϵ1 is constant, it follows that η = −1/[(1 − ϵ1)aH] [573]. If \({{\dot \epsilon}_i} = 0\) (i = 1, 2, 3, 4) one has

$${{z_s^{\prime\prime}} \over {{z_s}}} = {{\nu _{\mathcal R}^2 - 1/4} \over {{\eta ^2}}}\,,\qquad {\rm{with}}\qquad \nu _{\mathcal R}^2 = {1 \over 4} + {{(1 + {\epsilon _1} + {\epsilon _2} - {\epsilon _3} + {\epsilon _4})(2 + {\epsilon _2} - {\epsilon _3} + {\epsilon _4})} \over {{{(1 - {\epsilon _1})}^2}}}\,.$$
(7.42)

Then the solution to Eq. (7.39) can be expressed as a linear combination of Hankel functions,

$$u = {{\sqrt {\pi \vert \eta \vert}} \over 2}{e^{i(1 + 2{\nu _{\mathcal R}})\pi/4}}\left[ {{c_1}H_{{\nu_{\mathcal R}}}^{(1)}(k\vert \eta \vert) + {c_2}H_{{\nu _{\mathcal R}}}^{(2)}(k\vert \eta \vert)} \right]\,,$$
(7.43)

where c1 and c2 are integration constants.

During inflation one has ∣ϵi∣ ≪ 1, so that \(z_s^{^{\prime\prime}}/{z_s} \approx {(aH)^2}\). For the modes deep inside the Hubble radius (k ≫ aH, i.e., ∣kη∣ ≫ 1) the perturbation u satisfies the standard equation of a canonical field in the Minkowski spacetime: u″ + k²u ≃ 0. After the Hubble radius crossing (k = aH) during inflation, the effect of the gravitational term \(z_s^{^{\prime\prime}}/{z_s}\) becomes important. In the super-Hubble limit (k ≪ aH, i.e., ∣kη∣ ≪ 1) the last term on the l.h.s. of Eq. (7.37) can be neglected, giving the following solution

$${\mathcal R} = {c_1} + {c_2}\int {{{{\rm{d}}t} \over {{a^3}{Q_s}}}} \,,$$
(7.44)

where c1 and c2 are integration constants. The second term can be identified as a decaying mode, which rapidly decays during inflation (unless the field potential has abrupt features). Hence the curvature perturbation approaches a constant value c1 after the Hubble radius crossing (k < aH).

In the asymptotic past (kη → −∞) the solution to Eq. (7.39) is determined by a vacuum state in quantum field theory [88], as \(u \rightarrow {e^{- ik\eta}}/\sqrt {2k}\). This fixes the coefficients to be c1 = 1 and c2 = 0, giving the following solution

$$u = {{\sqrt {\pi \vert \eta \vert}} \over 2}{e^{i(1 + 2{\nu _{\mathcal R}})\pi/4}}H_{{\nu _{\mathcal R}}}^{(1)}(k\vert \eta \vert)\,.$$
(7.45)
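In the exact de Sitter limit (\({\nu _{\mathcal R}} = 3/2\), so that \(z_s^{\prime\prime}/{z_s} = 2/{\eta ^2}\)) the Hankel function in Eq. (7.45) reduces to an elementary function, \(u = {e^{- ik\eta}}[1 - i/(k\eta)]/\sqrt {2k}\). The sketch below checks by finite differences that this closed form satisfies Eq. (7.39) and approaches the vacuum normalization at early times. It covers only this special case; the step size and sample points are illustrative choices.

```python
import cmath

def u_de_sitter(eta, k):
    """Mode function for z_s''/z_s = 2/eta^2 (nu_R = 3/2), eta < 0."""
    return cmath.exp(-1j * k * eta) * (1.0 - 1j / (k * eta)) / (2.0 * k) ** 0.5

def residual(eta, k, h=1e-3):
    """Finite-difference estimate of u'' + (k^2 - 2/eta^2) u, cf. Eq. (7.39)."""
    u_pp = (u_de_sitter(eta + h, k) - 2.0 * u_de_sitter(eta, k)
            + u_de_sitter(eta - h, k)) / h**2
    return u_pp + (k**2 - 2.0 / eta**2) * u_de_sitter(eta, k)

k = 1.0
res = max(abs(residual(eta, k)) for eta in (-20.0, -5.0, -1.0))

# Deep inside the Hubble radius, |u| sqrt(2k) should approach 1.
early = abs(u_de_sitter(-1.0e4, k)) * (2.0 * k) ** 0.5
print(res, early)
```

For general \({\nu _{\mathcal R}}\) the Hankel function itself would be needed, which is outside the Python standard library.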

We define the power spectrum of curvature perturbations,

$${{\mathcal P}_{\mathcal R}} \equiv {{4\pi {k^3}} \over {{{(2\pi)}^3}}}{\left\vert {\mathcal R} \right\vert ^2}\,.$$
(7.46)

Using the solution (7.45), we obtain the power spectrum [317]

$${{\mathcal P}_{\mathcal R}} = {1 \over {{Q_s}}}{\left({(1 - {\epsilon _1}){{\Gamma ({\nu _{\mathcal R}})} \over {\Gamma (3/2)}}{H \over {2\pi}}} \right)^2}{\left({{{\vert k\eta \vert} \over 2}} \right)^{3 - 2{\nu _{\mathcal R}}}}\,,$$
(7.47)

where we have used the relations \(H_\nu ^{(1)}(k\vert \eta \vert) \rightarrow - (i/\pi)\Gamma (\nu){(k\vert \eta \vert/2)^{- \nu}}\) for \(k\vert \eta \vert \rightarrow 0\) and \(\Gamma (3/2) = \sqrt \pi/2\). Since the curvature perturbation is frozen after the Hubble radius crossing, the spectrum (7.47) should be evaluated at k = aH. The spectral index of \({\mathcal R}\), which is defined by \({n_{\mathcal R}} - 1 = {\rm{d}}\,{\rm{ln}}\,{{\mathcal P}_{\mathcal R}}/{\rm{d}}\,{\rm{ln}}\,k{\vert _{k = aH}}\), is

$${n_{\mathcal R}} - 1 = 3 - 2{\nu _{\mathcal R}}\,,$$
(7.48)

where \({\nu _{\mathcal R}}\) is given in Eq. (7.42). As long as ∣ϵi∣(i = 1, 2, 3, 4) are much smaller than 1 during inflation, the spectral index reduces to

$${n_{\mathcal R}} - 1 \simeq - 4{\epsilon _1} - 2{\epsilon _2} + 2{\epsilon _3} - 2{\epsilon _4}\,,$$
(7.49)

where we have ignored terms of higher than first order in the ϵi. Provided that ∣ϵi∣ ≪ 1 the spectrum is close to scale-invariant \(({n_{\mathcal R}} \simeq 1)\). From Eq. (7.47) the power spectrum of curvature perturbations can be estimated as

$${{\mathcal P}_{\mathcal R}} \simeq {1 \over {{Q_s}}}{\left({{H \over {2\pi}}} \right)^2}\,.$$
(7.50)

A minimally coupled scalar field ϕ in Einstein gravity corresponds to ϵ3 = 0, ϵ4 = 0 and \({Q_s} = {{\dot \phi}^2}/{H^2}\), in which case we obtain the standard results \({n_{\mathcal R}} - 1 \simeq - 4{\epsilon _1} - 2{\epsilon _2}\) and \({{\mathcal P}_{\mathcal R}} \simeq {H^4}/(4{\pi ^2}{{\dot \phi}^2})\) in slow-roll inflation [573, 390].
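The accuracy of the first-order result (7.49) can be checked against the exact expression (7.48) with \({\nu _{\mathcal R}}\) from Eq. (7.42). The following is a minimal sketch; the function names and the numerical values of the ϵi are hypothetical and chosen only for illustration.

```python
import math

def nu_R_squared(e1, e2, e3, e4):
    """nu_R^2 from Eq. (7.42), valid for constant epsilon_i."""
    return 0.25 + (1 + e1 + e2 - e3 + e4) * (2 + e2 - e3 + e4) / (1 - e1) ** 2

def tilt_exact(e1, e2, e3, e4):
    """n_R - 1 = 3 - 2 nu_R, Eq. (7.48)."""
    return 3.0 - 2.0 * math.sqrt(nu_R_squared(e1, e2, e3, e4))

def tilt_slow_roll(e1, e2, e3, e4):
    """First-order expression for n_R - 1, Eq. (7.49)."""
    return -4 * e1 - 2 * e2 + 2 * e3 - 2 * e4

# Illustrative (hypothetical) slow-roll parameters, all much smaller than 1.
eps = (0.01, 0.005, -0.002, 0.003)
exact = tilt_exact(*eps)
approx = tilt_slow_roll(*eps)
print(f"exact: {exact:.5f}, first order: {approx:.5f}")
```

For these values the two expressions differ only at second order in the ϵi, as expected.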

7.2 Tensor perturbations

Tensor perturbations \({h_{ij}}\) have two polarization states, which are generally written as λ = +, × [391]. In terms of polarization tensors \(e_{ij}^ +\) and \(e_{ij}^ \times\), they are given by

$${h_{ij}} = {h_ +}e_{ij}^ + + {h_ \times}e_{ij}^ \times \,.$$
(7.51)

If the direction of a momentum k is along the z-axis, the non-zero components of polarization tensors are given by \(e_{xx}^ + = - e_{yy}^ + = 1\) and \(e_{xy}^ \times = e_{yx}^ \times = 1\).

For the action (6.2) the Fourier components hλ (λ = +, ×) obey the following equation [314]

$$\ddot {{h_\lambda}}+ {{{{({a^3}F)}^ \cdot}} \over {{a^3}F}}\dot{{h_\lambda}} + {{{k^2}} \over {{a^2}}}{h_\lambda} = 0\,.$$
(7.52)

This is similar to Eq. (7.37) of curvature perturbations, apart from the difference of the factor F instead of Qs. Defining new variables \({z_t} = a\sqrt F\) and \({u_\lambda} = {z_t}{h_\lambda}/\sqrt {16\pi G}\), it follows that

$$u_\lambda ^{\prime\prime} + \left({{k^2} - {{z_t^{\prime\prime}} \over {{z_t}}}} \right){u_\lambda} = 0\,.$$
(7.53)

We have introduced the factor 16πG to relate a dimensionless massless field hλ with a massless scalar field uλ having a unit of mass.

If \({{\dot \epsilon}_i} = 0\), we obtain

$${{z_t^{\prime\prime}} \over {{z_t}}} = {{\nu _t^2 - 1/4} \over {{\eta ^2}}}\,,\qquad {\rm{with}}\qquad \nu _t^2 = {1 \over 4} + {{(1 + {\epsilon _3})(2 - {\epsilon _1} + {\epsilon _3})} \over {{{(1 - {\epsilon _1})}^2}}}\,.$$
(7.54)

We follow a procedure similar to the one given in Section 7.1. Taking into account polarization states, the spectrum of tensor perturbations after the Hubble radius crossing is given by

$${{\mathcal P}_T} = 4 \times {{16\pi G} \over {{a^2}F}}{{4\pi {k^3}} \over {{{(2\pi)}^3}}}\vert {u_\lambda}{\vert ^2} \simeq {{16} \over \pi}{\left({{H \over {{m_{{\rm{pl}}}}}}} \right)^2}{1 \over F}{\left({(1 - {\epsilon _1}){{\Gamma ({\nu _t})} \over {\Gamma (3/2)}}} \right)^2}{\left({{{\vert k\eta \vert} \over 2}} \right)^{3 - 2{\nu _t}}}\,,$$
(7.55)

which should be evaluated at the Hubble radius crossing (k = aH). The spectral index of \({{\mathcal P}_T}\) is

$${n_T} = 3 - 2{\nu _t}\,,$$
(7.56)

where νt is given in Eq. (7.54). If ∣ϵi∣ ≪ 1, this reduces to

$${n_T} \simeq - 2{\epsilon _1} - 2{\epsilon _3}\,.$$
(7.57)

Then the amplitude of tensor perturbations is given by

$${{\mathcal P}_T} \simeq {{16} \over \pi}{\left({{H \over {{m_{{\rm{pl}}}}}}} \right)^2}{1 \over F}\,.$$
(7.58)

We define the tensor-to-scalar ratio

$$r \equiv {{{{\mathcal P}_T}} \over {{{\mathcal P}_{\mathcal R}}}} \simeq {{64\pi} \over {m_{{\rm{pl}}}^2}}{{{Q_s}} \over F}\,.$$
(7.59)

For a minimally coupled scalar field ϕ in Einstein gravity, it follows that nT ≃ −2ϵ1, \({{\mathcal P}_T} \simeq 16{H^2}/(\pi m_{{\rm{pl}}}^2)\), and r ≃ 16ϵ1.
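As a numerical illustration of Eqs. (7.54), (7.56), (7.57), and (7.59), the sketch below evaluates the tensor spectral index and the tensor-to-scalar ratio. The function names and the sample value of ϵ1 are hypothetical. For the Einstein-gravity check it assumes the standard relation \({\epsilon _1} = 4\pi {{\dot \phi}^2}/(m_{{\rm{pl}}}^2{H^2})\), which is not written out in this section.

```python
import math

def nu_t_squared(eps1, eps3):
    """nu_t^2 from Eq. (7.54), valid for constant epsilon_i."""
    return 0.25 + (1.0 + eps3) * (2.0 - eps1 + eps3) / (1.0 - eps1) ** 2

def n_T_exact(eps1, eps3):
    """Tensor spectral index n_T = 3 - 2 nu_t, Eq. (7.56)."""
    return 3.0 - 2.0 * math.sqrt(nu_t_squared(eps1, eps3))

def n_T_slow_roll(eps1, eps3):
    """First-order result, Eq. (7.57)."""
    return -2.0 * eps1 - 2.0 * eps3

def tensor_to_scalar_ratio(Q_s, F, m_pl=1.0):
    """r from Eq. (7.59)."""
    return 64.0 * math.pi * Q_s / (m_pl ** 2 * F)

# Einstein gravity with a minimally coupled field: F = 1, eps3 = 0.
eps1 = 0.01                      # illustrative value
n_T = n_T_exact(eps1, 0.0)
# Assumed relation: Q_s = phi_dot^2 / H^2 = eps1 * m_pl^2 / (4 pi).
r = tensor_to_scalar_ratio(eps1 / (4.0 * math.pi), F=1.0)
print(n_T, n_T_slow_roll(eps1, 0.0), r)
```

With ϵ3 = 0 the exact index reduces to −2ϵ1/(1 − ϵ1), so the first-order value −2ϵ1 is recovered for small ϵ1, and r equals 16ϵ1.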

7.3 The spectra of perturbations in inflation based on f(R) gravity

Let us study the spectra of scalar and tensor perturbations generated during inflation in metric f(R) gravity. Introducing the quantity \(E = 3{{\dot F}^2}/(2{\kappa ^2})\), we have \({\epsilon _4} = \ddot F/(H\dot F)\) and

$${Q_s} = {{6F\epsilon _3^2} \over {{\kappa ^2}{{(1 + {\epsilon _3})}^2}}} = {E \over {F{H^2}{{(1 + {\epsilon _3})}^2}}}\,.$$
(7.60)

Since the field kinetic term \({{\dot \phi}^2}\) is absent, one has ϵ2 = 0 in Eqs. (7.42) and (7.49). Under the conditions ∣ϵi∣ ≪ 1 (i = 1, 3, 4), the spectral index of curvature perturbations is given by \({n_{\mathcal R}} - 1 \simeq - 4{\epsilon _1} + 2{\epsilon _3} - 2{\epsilon _4}\).

In the absence of the matter fluid, Eq. (2.16) translates into

$${\epsilon _1} = - {\epsilon _3}(1 - {\epsilon _4})\,,$$
(7.61)

which gives ϵ1≃ −ϵ3 for ∣ϵ4∣ ≪ 1. Hence we obtain [315]

$${n_{\mathcal R}} - 1 \simeq - 6{\epsilon _1} - 2{\epsilon _4}\,.$$
(7.62)

From Eqs. (7.50) and (7.60), the amplitude of \({\mathcal R}\) is estimated as

$${{\mathcal P}_{\mathcal R}} \simeq {1 \over {3\pi F}}{\left({{H \over {{m_{{\rm{pl}}}}}}} \right)^2}{1 \over {\epsilon _3^2}}\,.$$
(7.63)

Using the relation ϵ1 ≃ −ϵ3, the spectral index (7.57) of tensor perturbations is given by

$${n_T} \simeq 0\,,$$
(7.64)

which vanishes at first-order of slow-roll approximations. From Eqs. (7.58) and (7.63) we obtain the tensor-to-scalar ratio

$$r \simeq 48\epsilon _3^2 \simeq 48\epsilon _1^2\,.$$
(7.65)

7.3.1 The model f(R) = αR^n (n > 0)

Let us consider the inflation model f(R) = αR^n (n > 0). From the discussion given in Section 3.1 the slow-roll parameters ϵi (i = 1, 3, 4) are constants:

$${\epsilon _1} = {{2 - n} \over {(n - 1)(2n - 1)}}\,,\qquad {\epsilon _3} = - (n - 1){\epsilon _1}\,,\qquad {\epsilon _4} = {{n - 2} \over {n - 1}}\,.$$
(7.66)

In this case one can use the exact results (7.48) and (7.56) with \({\nu _{\mathcal R}}\) and νt given in Eqs. (7.42) and (7.54) (with ϵ2 = 0). Then the spectral indices are

$${n_{\mathcal R}} - 1 = {n_T} = - {{2{{(n - 2)}^2}} \over {2{n^2} - 2n - 1}}\,.$$
(7.67)

If n = 2 we obtain the scale-invariant spectra with \({n_{\mathcal R}} = 1\) and nT = 0. Even the slight deviation from n = 2 leads to a rather large deviation from the scale-invariance. If n = 1.7, for example, one has \({n_{\mathcal R}} - 1 = {n_T} = - 0.13\), which does not match with the WMAP 5-year constraint: \({n_{\mathcal R}} = 0.960 \pm 0.013\) [367].
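The closed form (7.67) can be cross-checked against the exact index built from Eqs. (7.42), (7.48), and (7.66). The sketch below does this for the example n = 1.7 quoted above; the function names are hypothetical.

```python
import math

def slow_roll_power_law(n):
    """Constant slow-roll parameters of f(R) = alpha R^n, Eq. (7.66)."""
    e1 = (2.0 - n) / ((n - 1.0) * (2.0 * n - 1.0))
    e3 = -(n - 1.0) * e1
    e4 = (n - 2.0) / (n - 1.0)
    return e1, e3, e4

def tilt_from_nu(n):
    """n_R - 1 = 3 - 2 nu_R with nu_R from Eq. (7.42) and epsilon_2 = 0."""
    e1, e3, e4 = slow_roll_power_law(n)
    nu_sq = 0.25 + (1 + e1 - e3 + e4) * (2 - e3 + e4) / (1 - e1) ** 2
    return 3.0 - 2.0 * math.sqrt(nu_sq)

def tilt_closed_form(n):
    """Closed-form spectral tilt, Eq. (7.67)."""
    return -2.0 * (n - 2.0) ** 2 / (2.0 * n ** 2 - 2.0 * n - 1.0)

for n in (1.7, 1.9, 2.0):
    print(n, tilt_from_nu(n), tilt_closed_form(n))
```

The two evaluations agree, and for n = 1.7 both give a tilt of about −0.13.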

7.3.2 The model f(R) = R + R²/(6M²)

For the model f(R) = R + R²/(6M²), the spectrum of the curvature perturbation \({\mathcal R}\) shows some deviation from the scale-invariance. Since inflation occurs in the regime \(R \gg {M^2}\) and \(\vert \dot H\vert \ll {H^2}\), one can approximate \(F \simeq R/(3{M^2}) \simeq 4{H^2}/{M^2}\). Then the power spectra (7.63) and (7.58) yield

$${{\mathcal P}_{\mathcal R}} \simeq {1 \over {12\pi}}{\left({{M \over {{m_{{\rm{pl}}}}}}} \right)^2}{1 \over {\epsilon _1^2}}\,,\qquad {{\mathcal P}_T} \simeq {4 \over \pi}{\left({{M \over {{m_{{\rm{pl}}}}}}} \right)^2}\,,$$
(7.68)

where we have employed the relation ϵ3 ≃ − ϵ1.

Recall that the evolution of the Hubble parameter during inflation is given by Eq. (3.9). As long as the time tk at the Hubble radius crossing (k = aH) satisfies the condition \(({M^2}/6)({t_k} - {t_i}) \ll {H_i}\), one can approximate \(H({t_k}) \simeq {H_i}\). Using Eq. (3.9), the number of e-foldings from t = tk to the end of inflation can be estimated as

$${N_k} \simeq {1 \over {2{\epsilon _1}({t_k})}}\,.$$
(7.69)

Then, from Eq. (7.68), the amplitude of the curvature perturbation is given by

$${{\mathcal P}_{\mathcal R}} \simeq {{N_k^2} \over {3\pi}}{\left({{M \over {{m_{{\rm{pl}}}}}}} \right)^2}\,.$$
(7.70)

The WMAP 5-year normalization corresponds to \({{\mathcal P}_{\mathcal R}} = (2.445 \pm 0.096) \times {10^{- 9}}\) at the scale \(k = 0.002\;{\rm{Mp}}{{\rm{c}}^{- 1}}\) [367]. Taking the typical value Nk = 55, the mass M is constrained to be

$$M \simeq 3 \times {10^{- 6}}{m_{{\rm{pl}}}}\,.$$
(7.71)

Using the relation \(F \simeq 4{H^2}/{M^2}\), it follows that ϵ4 ≃ − ϵ1. Hence the spectral index (7.62) reduces to

$${n_{\mathcal R}} - 1 \simeq - 4{\epsilon _1} \simeq - {2 \over {{N_k}}} = - 3.6 \times {10^{- 2}}{\left({{{{N_k}} \over {55}}} \right)^{- 1}}\,.$$
(7.72)

For Nk = 55 we have \({n_{\mathcal R}} \simeq 0.964\), which is in the allowed region of the WMAP 5-year constraint (\({n_{\mathcal R}} = 0.960 \pm 0.013\) at the 68% confidence level [367]). The tensor-to-scalar ratio (7.65) can be estimated as

$$r \simeq {{12} \over {N_k^2}} \simeq 4.0 \times {10^{- 3}}{\left({{{{N_k}} \over {55}}} \right)^{- 2}}\,,$$
(7.73)

which satisfies the current observational bound r < 0.22 [367]. We note that a minimally coupled field with the potential \(V(\phi) = {m^2}{\phi ^2}/2\) in Einstein gravity (chaotic inflation model [393]) gives rise to a larger tensor-to-scalar ratio of the order of 0.1. Since future observations such as the Planck satellite are expected to reach the level of \(r = {\mathcal O}({10^{- 2}})\), they will be able to discriminate between the chaotic inflation model and Starobinsky’s f(R) model.
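The numbers quoted in Eqs. (7.71) to (7.73) follow directly from Eqs. (7.65), (7.69), (7.70), and (7.72). The sketch below collects them in one place; the function name is hypothetical, and the default amplitude is the central WMAP 5-year value quoted above.

```python
import math

def starobinsky_predictions(N_k, P_R=2.445e-9):
    """Return (M/m_pl, n_R, r) for f(R) = R + R^2/(6 M^2) at N_k e-foldings."""
    eps1 = 1.0 / (2.0 * N_k)                             # Eq. (7.69)
    M_over_mpl = math.sqrt(3.0 * math.pi * P_R) / N_k    # Eq. (7.70) solved for M
    n_R = 1.0 - 4.0 * eps1                               # Eq. (7.72)
    r = 48.0 * eps1 ** 2                                 # Eq. (7.65), i.e. 12 / N_k^2
    return M_over_mpl, n_R, r

M_over_mpl, n_R, r = starobinsky_predictions(55)
print(f"M/m_pl = {M_over_mpl:.2e}, n_R = {n_R:.4f}, r = {r:.2e}")
```

For Nk = 55 this gives M of roughly 3 × 10⁻⁶ mpl, a spectral index close to 0.964, and r of about 4 × 10⁻³, in line with the estimates above.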

7.3.3 The power spectra in the Einstein frame

Let us consider the power spectra in the Einstein frame. Under the conformal transformation \({{\tilde g}_{\mu \nu}} = F{g_{\mu \nu}}\), the perturbed metric (6.1) is transformed as

$$\begin{array}{*{20}c} {{\rm{d}}{{\tilde s}^2} = F{\rm{d}}{s^2}\quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \;\;\quad \quad \quad \quad \quad}\\ {= - (1 + 2\tilde \alpha)\,{\rm{d}}{{\tilde t}^2} - 2\tilde a(\tilde t)\,({\partial _i}\tilde \beta - {{\tilde S}_i}){\rm{d}}\tilde t\;{\rm{d}}{{\tilde x}^i}\quad \quad \quad \quad \quad}\\ {+ {{\tilde a}^2}(\tilde t)({\delta _{ij}} + 2\tilde \psi {\delta _{ij}} + 2{\partial _i}{\partial _j}\tilde \gamma + 2{\partial _j}{{\tilde F}_i} + {{\tilde h}_{ij}}){\rm{d}}{{\tilde x}^i}{\rm{d}}{{\tilde x}^j}\,.}\\ \end{array}$$
(7.74)

We decompose the conformal factor into the background and perturbed parts, as

$$F(t,x) = \bar F(t)\left({1 + {{\delta F(t,x)} \over {\bar F(t)}}} \right)\,.$$
(7.75)

In what follows we omit a bar from F. We recall that the background quantities are transformed as Eqs. (2.44) and (2.47). The transformation of scalar metric perturbations is given by

$$\tilde \alpha = \alpha + {{\delta F} \over {2F}}\,,\qquad \tilde \beta = \beta \,,\qquad \tilde \psi = \psi + {{\delta F} \over {2F}}\,,\qquad \tilde \gamma = \gamma \,.$$
(7.76)

Meanwhile vector and tensor perturbations are invariant under the conformal transformation \(({{\tilde S}_i} = {S_i},{{\tilde F}_i} = {F_i},{{\tilde h}_{ij}} = {h_{ij}})\).

Using the above transformation law, one can easily show that the curvature perturbation \({\mathcal R} = \psi - H\delta F/\dot F\) in f(R) gravity is invariant under the conformal transformation:

$$\tilde {\mathcal R} = {\mathcal R}\,.$$
(7.77)

Since the tensor perturbation is also invariant, the tensor-to-scalar ratio in the Einstein frame is identical to that in the Jordan frame. For example, let us consider the model f(R) = R + R²/(6M²). Since the action in the Einstein frame is given by Eq. (2.32), the slow-roll parameters \({{\tilde \epsilon}_3}\) and \({{\tilde \epsilon}_4}\) vanish in this frame. Using Eqs. (7.49) and (3.27), the spectral index of curvature perturbations is given by

$${\tilde n_{\mathcal R}} - 1 \simeq - 4{\tilde{\epsilon}_1} - 2{\tilde{\epsilon}_2} \simeq - {2 \over {{{\tilde N}_k}}}\,,$$
(7.78)

where we have ignored the term of the order of \(1/\tilde N_k^2\). Since \({{\tilde N}_k} \simeq {N_k}\) in the slow-roll limit (\(\vert \dot F/(2HF)\vert \ll 1\)), Eq. (7.78) agrees with the result (7.72) in the Jordan frame. Since \({Q_s} = {({\rm{d}}\phi {\rm{/d}}\tilde t)^2}/{{\tilde H}^2}\) in the Einstein frame, Eq. (7.59) gives the tensor-to-scalar ratio

$$\tilde r = {{64\pi} \over {m_{{\rm{pl}}}^2}}{\left({{{{\rm{d}}\phi} \over {{\rm{d}}\tilde t}}} \right)^2}{1 \over {{{\tilde H}^2}}} \simeq 16{\tilde \epsilon _1} \simeq {{12} \over {\tilde N_k^2}}\,,$$
(7.79)

where the background equations (3.21) and (3.22) are used with slow-roll approximations. Equation (7.79) is consistent with the result (7.73) in the Jordan frame.

The equivalence of the curvature perturbation between the Jordan and Einstein frames also holds for scalar-tensor theory with the Lagrangian \({\mathcal L} = F(\phi)R/(2{\kappa ^2}) - (1/2)\omega (\phi){g^{\mu \nu}}{\partial _\mu}\phi {\partial _\nu}\phi - V(\phi)\) [411, 240]. For the non-minimally coupled scalar field with \(F(\phi) = 1 - \xi {\kappa ^2}{\phi ^2}\) [269, 241] the spectral indices of scalar and tensor perturbations have been derived by using such equivalence [366, 590].

7.4 The Lagrangian for cosmological perturbations

In Section 7.1 we used the fact that the field which should be quantized corresponds to \(u = a\sqrt {{Q_s}} {\mathcal R}\). This can be justified by writing down the action (6.2) expanded at second order in the perturbations [437]. We recall again that we are considering an effective single-field theory such as f(R) gravity and scalar-tensor theory with the coupling F(ϕ)R. Carrying out the expansion of the action (6.2) to second order, we find that the action for the curvature perturbation \({\mathcal R}\) (either \({{\mathcal R}_{\delta F}}\) or \({{\mathcal R}_{\delta \phi}}\)) is given by [311]

$$\delta {S^{(2)}} = \int {\rm{d}} t\;{{\rm{d}}^3}x\,{a^3}\,{Q_s}\left[ {{1 \over 2}\,\dot{\mathcal R}^2 - {1 \over 2}\,{1 \over {{a^2}}}{{(\nabla {\mathcal R})}^2}} \right]\,,$$
(7.80)

where Qs is given in Eq. (7.38). In fact, the variation of this action in terms of the field \({\mathcal R}\) gives rise to Eq. (7.37) in Fourier space. We note that there is another approach called the Hamiltonian formalism which is also useful for the quantization of cosmological perturbations. See [237, 209, 208, 127] for this approach in the context of f(R) gravity and modified gravitational theories.

Introducing the quantities \(u = {z_s}{\mathcal R}\) and \({z_s} = a\sqrt {{Q_s}}\), the action (7.80) can be written as

$$\delta {S^{(2)}} = \int {\rm{d}} \eta \,{{\rm{d}}^3}x\left[ {{1 \over 2}\,{{u}^{\prime 2}} - {1 \over 2}{{(\nabla u)}^2} + {1 \over 2}{{z_s^{\prime \prime}} \over {{z_s}}}{u^2}} \right]\,,$$
(7.81)

where a prime represents a derivative with respect to the conformal time \(\eta = \int {{a^{- 1}}{\rm{d}}t}\). The action (7.81) leads to Eq. (7.39) in Fourier space. The transformation of the action (7.80) to (7.81) gives rise to the effective mass (see Footnote 6)

$$M_s^2 \equiv - {1 \over {{a^2}}}{{z_s^{\prime\prime}} \over {{z_s}}} = {{\dot Q_s^2} \over {4Q_s^2}} - {{{{\ddot Q}_s}} \over {2{Q_s}}} - {{3H{{\dot Q}_s}} \over {2{Q_s}}}.$$
(7.82)

We have seen in Eq. (7.42) that during inflation the quantity \(z_s^{\prime\prime}/{z_s}\) can be estimated as \(z_s^{\prime\prime}/{z_s} \simeq 2{(aH)^2}\) in the slow-roll limit, so that \(M_s^2 \simeq - 2{H^2}\). For the modes deep inside the Hubble radius (k ≫ aH) the action (7.81) reduces to the one for a canonical scalar field u in the flat spacetime. Hence the quantization should be done for the field \(u = a\sqrt {{Q_s}} {\mathcal R}\), as we have done in Section 7.1.

From the action (7.81) we understand a number of physical properties in f(R) theories and scalar-tensor theories with the coupling F(ϕ)R listed below.

  1. Having a standard d’Alembertian operator, the mode has a speed of propagation equal to the speed of light. This leads to a standard dispersion relation ω = k/a for the high-k modes in Fourier space.

  2. The sign of Qs corresponds to the sign of the kinetic energy of \({\mathcal R}\). The negative sign corresponds to a ghost (phantom) scalar field. In f(R) gravity (with \(\dot \phi = 0\)) the ghost appears for F < 0. In Brans-Dicke theory with F(ϕ) = κ²ϕ and ω(ϕ) = ωBD/ϕ [100] (where ϕ > 0) the condition for the appearance of the ghost, \(\omega {{\dot \phi}^2} + 3{{\dot F}^2}/(2{\kappa ^2}F) < 0\), translates into ωBD < −3/2. In these cases one would encounter serious problems related to vacuum instability [145, 161].

  3. The field u has the effective mass squared given in Eq. (7.82). In f(R) gravity it can be written as

    $$M_s^2 = - {{72{F^2}{H^4}} \over {{{(2FH + {f_{,RR}}\dot R)}^2}}} + {1 \over 3}F\left({{{288{H^3} - 12HR} \over {2FH + {f_{,RR}}\dot R}} + {1 \over {{f_{,RR}}}}} \right) + {{f_{,RR}^2{{\dot R}^2}} \over {4{F^2}}} - 24{H^2} + {7 \over 6}R\,,$$
    (7.83)

    where we used the background equation (2.16) to write \(\dot H\) in terms of R and H². In Fourier space the perturbation u obeys the equation of motion

    $$u^{\prime\prime} +({{k^2} + M_s^2{a^2}})\;u = 0\,.$$
    (7.84)

    For \({k^2}/{a^2} \gg M_s^2\), the field u propagates with the speed of light. For small k satisfying \({k^2}/{a^2} \ll M_s^2\), we require a positive \(M_s^2\) to avoid the tachyonic instability of perturbations. Recall that the viable dark energy models based on f(R) theories need to satisfy \(R{f_{,RR}} \ll F\) (i.e., \(m = R{f_{,RR}}/{f_{,R}} \ll 1\)) at early times, in order to have successful cosmological evolution from radiation domination till matter domination. At these epochs the mass squared is approximately given by

    $$M_s^2 \simeq {F \over {3{f_{,RR}}}}\,,$$
    (7.85)

    which is consistent with the result (5.2) derived by the linear analysis about the Minkowski background. Together with the ghost condition F > 0, this leads to f,RR > 0. Recall that these correspond to the conditions presented in Eq. (4.56).

8 Observational Signatures of Dark Energy Models in f(R) Theories

In this section we discuss a number of observational signatures of dark energy models based on metric f(R) gravity. Our main interest is to distinguish these models from the ΛCDM model observationally. In particular we study the evolution of matter density perturbations as well as the gravitational potential to confront f(R) models with the observations of large-scale structure (LSS) and Cosmic Microwave Background (CMB). The effect on weak lensing will be discussed in Section 13.1 in more general modified gravity theories including f(R) gravity.

8.1 Matter density perturbations

Let us consider the perturbations of non-relativistic matter with the background energy density ρm and the negligible pressure (Pm = 0). In Fourier space Eqs. (6.17) and (6.18) give

$${\dot{\delta} {\rho _m}} + 3H\delta {\rho _m} = {\rho _m}\left({A - 3H\alpha - {{{k^2}} \over {{a^2}}}v} \right)\,,$$
(8.86)
$$\dot v = \alpha \,,$$
(8.87)

where in the second line we have used the continuity equation, \({{\dot \rho}_m} + 3H{\rho _m} = 0\). The density contrast defined in Eq. (6.32), i.e.

$${\delta _m} = {{\delta {\rho _m}} \over {{\rho _m}}} + 3Hv\,,$$
(8.88)

obeys the following equation from Eqs. (8.86) and (8.87):

$${\ddot \delta _m} + 2H{\dot \delta _m} + {{{k^2}} \over {{a^2}}}(\alpha - \dot \chi) = 3\ddot B + 6H\dot B\,,$$
(8.89)

where \(B \equiv Hv - \psi\) and we used the relation \(A = 3(H\alpha - \dot \psi) + ({k^2}/{a^2})\chi\).

In the following we consider the evolution of perturbations in f(R) gravity in the longitudinal gauge (6.33). Since χ = 0, α = Φ, ψ = −Ψ, and \(A = 3(H\Phi + \dot \Psi)\) in this case, Eqs. (6.11), (6.13), (6.15), and (8.89) give

$$\begin{array}{*{20}c} {{{{k^2}} \over {{a^2}}}\Psi + 3H(H\Phi + \dot \Psi) = - {1 \over {2F}}\left[ {\left({3{H^2} + 3\dot H - {{{k^2}} \over {{a^2}}}} \right)\delta F - 3H\dot{\delta F}} \right.}\quad\quad\quad\quad\quad\quad\quad\quad\quad\quad\quad\quad\quad\quad\quad\\ {\left. {+ 3H\dot F\Phi + 3\dot F(H\Phi + \dot \Psi) + {\kappa ^2}\delta {\rho _m}} \right]\,,}\\ \end{array}$$
(8.90)
$$\Psi - \Phi = {{\delta F} \over F}\,,$$
(8.91)
$$\ddot{\delta F} + 3H\dot{\delta F} + \left({{{{k^2}} \over {{a^2}}} + {M^2}} \right)\delta F = {{{\kappa ^2}} \over 3}\delta {\rho _m} + \dot F(3H\Phi + 3\dot \Psi + \dot \Phi) + (2\ddot F + 3H\dot F)\Phi \,,$$
(8.92)
$${\ddot \delta _m} + 2H{\dot \delta _m} + {{{k^2}} \over {{a^2}}}\Phi = 3\ddot B + 6H\dot B\,,$$
(8.93)

where \(B = Hv + \Psi\). In order to derive Eq. (8.92), we have used the mass squared \({M^2} = (F/{F_{,R}} - R)/3\) introduced in Eq. (5.2) together with the relation \(\delta R = \delta F/{F_{,R}}\).

Let us consider the wavenumber k deep inside the Hubble radius (k ≫ aH). In order to derive the equation of matter perturbations approximately, we use the quasi-static approximation under which the dominant terms in Eqs. (8.90)-(8.93) correspond to those including k²/a², δρm (or δm) and M². In General Relativity this approximation was first used by Starobinsky in the presence of a minimally coupled scalar field [567], which was numerically confirmed in [403]. This was further extended to scalar-tensor theories [93, 171, 586] and f(R) gravity [586, 597]. Precisely speaking, in f(R) gravity, this approximation corresponds to

$$\left\{{{{{k^2}} \over {{a^2}}}\vert \Phi \vert, {{{k^2}} \over {{a^2}}}\vert \Psi \vert, {{{k^2}} \over {{a^2}}}\vert \delta F\vert, {M^2}\vert \delta F\vert} \right\} \gg \{{H^2}\vert \Phi \vert, {H^2}\vert \Psi \vert, {H^2}\vert B\vert, {H^2}\vert \delta F\vert \} \,,$$
(8.94)

and

$$\vert \dot X\vert \underset{\sim}{<} \vert HX\vert \,,\quad {\rm{where}}\quad X = \Phi, \Psi, F,\dot F,\delta F,\dot{\delta F}\,.$$
(8.95)

From Eqs. (8.90) and (8.91) it then follows that

$$\Psi \simeq {1 \over {2F}}\left({\delta F - {{{a^2}} \over {{k^2}}}{\kappa ^2}\delta {\rho _m}} \right)\,,\qquad \Phi \simeq - {1 \over {2F}}\left({\delta F + {{{a^2}} \over {{k^2}}}{\kappa ^2}\delta {\rho _m}} \right)\,.$$
(8.96)

Since \(({k^2}/{a^2} + {M^2})\delta F \simeq {\kappa ^2}\delta {\rho _m}/3\) from Eq. (8.92), we obtain

$${{{k^2}} \over {{a^2}}}\Psi \simeq - {{{\kappa ^2}\delta {\rho _m}} \over {2F}}{{2 + 3{M^2}{a^2}/{k^2}} \over {3(1 + {M^2}{a^2}/{k^2})}}\,,\qquad {{{k^2}} \over {{a^2}}}\Phi \simeq - {{{\kappa ^2}\delta {\rho _m}} \over {2F}}{{4 + 3{M^2}{a^2}/{k^2}} \over {3(1 + {M^2}{a^2}/{k^2})}}\,.$$
(8.97)

We also define the effective gravitational potential

$${\Phi _{{\rm{eff}}}} \equiv (\Phi + \Psi)/2\,.$$
(8.98)

This quantity characterizes the deviation of light rays, which is linked with the Integrated Sachs-Wolfe (ISW) effect in CMB [544] and weak lensing observations [27]. From Eq. (8.97) we have

$${\Phi _{{\rm{eff}}}} \simeq - {{{\kappa ^2}} \over {2F}}{{{a^2}} \over {{k^2}}}\delta {\rho _m}\,.$$
(8.99)
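The algebra leading from Eqs. (8.96) and (8.92) to Eqs. (8.97) and (8.99) can be verified numerically. The sketch below is only a consistency check in the quasi-static limit; the function name and the input values are hypothetical, and the argument `kappa2_drho` stands for the combination κ²δρm.

```python
def quasi_static_potentials(k_over_a, M, F, kappa2_drho):
    """Return (Psi, Phi, Phi_eff) from Eqs. (8.96) and (8.98)."""
    k2 = k_over_a ** 2
    # Quasi-static solution of Eq. (8.92): (k^2/a^2 + M^2) dF = kappa^2 drho / 3
    dF = kappa2_drho / (3.0 * (k2 + M ** 2))
    Psi = (dF - kappa2_drho / k2) / (2.0 * F)
    Phi = -(dF + kappa2_drho / k2) / (2.0 * F)
    return Psi, Phi, 0.5 * (Phi + Psi)

# Light scalar degree of freedom (M = 0): Phi = 2 Psi.
Psi_st, Phi_st, Phi_eff_st = quasi_static_potentials(10.0, 0.0, 1.0, 1.0)
# Heavy scalar degree of freedom (M^2 >> k^2/a^2): Phi approaches Psi.
Psi_gr, Phi_gr, Phi_eff_gr = quasi_static_potentials(10.0, 1.0e4, 1.0, 1.0)
print(Phi_st / Psi_st, Phi_gr / Psi_gr, Phi_eff_st, Phi_eff_gr)
```

In both limits the effective potential Φeff is the same, because the δF contributions cancel in the sum Φ + Ψ, which is the content of Eq. (8.99).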

From Eq. (6.12) the term Hv is of the order of \({H^2}\Phi/({\kappa ^2}{\rho _m})\) provided that the deviation from the ΛCDM model is not significant. Using Eq. (8.97) we find that the ratio \(3Hv/(\delta {\rho _m}/{\rho _m})\) is of the order of (aH/k)², which is much smaller than unity for sub-horizon modes. Then the gauge-invariant perturbation δm given in Eq. (8.88) can be approximated as \({\delta _m} \simeq \delta {\rho _m}/{\rho _m}\). Neglecting the r.h.s. of Eq. (8.93) relative to the l.h.s. and using Eq. (8.97) with \(\delta {\rho _m} \simeq {\rho _m}{\delta _m}\), we get the equation for matter perturbations:

$${\ddot \delta _m} + 2H{\dot \delta _m} - 4\pi {G_{{\rm{eff}}}}{\rho _m}{\delta _m} \simeq 0\,,$$
(8.100)

where Geff is the effective (cosmological) gravitational coupling defined by [586, 597]

$${G_{{\rm{eff}}}} \equiv {G \over F}{{4 + 3{M^2}{a^2}/{k^2}} \over {3(1 + {M^2}{a^2}/{k^2})}}\,.$$
(8.101)

We recall that viable f(R) dark energy models are constructed to have a large mass M in the region of high density (\(R \gg {R_0}\)). During the radiation and deep matter eras the deviation parameter \(m = R{f_{,RR}}/{f_{,R}}\) is much smaller than 1, so that the mass squared satisfies

$${M^2} = {R \over 3}\left({{1 \over m} - 1} \right) \gg R\,.$$
(8.102)

If m grows to the order of 0.1 by the present epoch, then the mass M today can be of the order of H0. In the regimes \({M^2} \gg {k^2}/{a^2}\) and \({M^2} \ll {k^2}/{a^2}\) the effective gravitational coupling has the asymptotic forms \({G_{{\rm{eff}}}} \simeq G/F\) and \({G_{{\rm{eff}}}} \simeq 4G/(3F)\), respectively. The former corresponds to the “General Relativistic (GR) regime” in which the evolution of δm mimics that in GR, whereas the latter corresponds to the “scalar-tensor regime” in which the evolution of δm is non-standard. For the f(R) models (4.83) and (4.84) the transition from the former regime to the latter regime, which is characterized by the condition \({M^2} = {k^2}/{a^2}\), can occur during the matter domination for the wavenumbers relevant to the matter power spectrum [306, 568, 587, 270, 589].
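The scale dependence of Eq. (8.101) and its two asymptotic regimes can be made explicit with a short sketch. The function name and the sample values are hypothetical; k/a and M only need to be expressed in the same units.

```python
def G_eff_over_G(k_over_a, M, F=1.0):
    """Effective gravitational coupling of Eq. (8.101) in units of G."""
    x = (M / k_over_a) ** 2          # M^2 a^2 / k^2
    return (4.0 + 3.0 * x) / (3.0 * (1.0 + x)) / F

gr_regime = G_eff_over_G(1.0, 1.0e4)      # M^2 >> k^2/a^2
st_regime = G_eff_over_G(1.0, 1.0e-4)     # M^2 << k^2/a^2
transition = G_eff_over_G(1.0, 1.0)       # M^2 = k^2/a^2
print(gr_regime, st_regime, transition)
```

The coupling interpolates monotonically between G/F and 4G/(3F), and takes the intermediate value 7G/(6F) at the transition point M² = k²/a².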

In order to derive Eq. (8.100) we used the approximation that the time-derivative terms of δF on the l.h.s. of Eq. (8.92) are neglected. In the regime \({M^2} \gg {k^2}/{a^2}\), however, the large mass M can induce rapid oscillations of δF. In the following we shall study the evolution of the oscillating mode [568]. For sub-horizon perturbations Eq. (8.92) is approximately given by

$$\ddot{\delta F}+ 3H\dot{\delta F}+ \left({{{{k^2}} \over {{a^2}}} + {M^2}} \right)\delta F \simeq {{{\kappa ^2}} \over 3}\delta {\rho _m}\,.$$
(8.103)

The solution of this equation is the sum of the matter-induced mode \(\delta {F_{{\rm{ind}}}} \simeq ({\kappa ^2}/3)\delta {\rho _m}/({k^2}/{a^2} + {M^2})\) and the oscillating mode δFosc satisfying

$${\ddot{\delta F}_{{\rm{osc}}}} + 3H{\dot{\delta F} _{{\rm{osc}}}} + \left({{{{k^2}} \over {{a^2}}} + {M^2}} \right)\delta {F_{{\rm{osc}}}} = 0\,.$$
(8.104)

As long as the frequency \(\omega = \sqrt {{k^2}/{a^2} + {M^2}}\) satisfies the adiabatic condition \(\vert \dot \omega \vert \, \ll {\omega ^2}\), we obtain the solution of Eq. (8.104) under the WKB approximation:

$$\delta {F_{{\rm{osc}}}} \simeq c{a^{- 3/2}}{1 \over {\sqrt {2\omega}}}\cos \left({\int \omega {\rm{d}}t} \right)\,,$$
(8.105)

where c is a constant. Hence the solution of the perturbation δR is expressed by [568, 587]

$$\delta R \simeq {1 \over {3{f_{,RR}}}}{{{\kappa ^2}\delta {\rho _m}} \over {{k^2}/{a^2} + {M^2}}} + c{a^{- 3/2}}{1 \over {{f_{,RR}}\sqrt {2\omega}}}\cos \left({\int \omega {\rm{d}}t} \right)\,.$$
(8.106)

For viable f(R) models, the scale factor a and the background Ricci scalar \({R^{(0)}}\) evolve as \(a \propto {t^{2/3}}\) and \({R^{(0)}} \simeq 4/(3{t^2})\) during the matter era. Then the amplitude of δRosc relative to \({R^{(0)}}\) has the time-dependence

$${{\vert \delta {R_{{\rm{osc}}}}\vert} \over {{R^{(0)}}}} \propto {{{M^2}t} \over {{{({k^2}/{a^2} + {M^2})}^{1/4}}}}\,.$$
(8.107)

The f(R) models (4.83) and (4.84) behave as \(m(r) = C{(- r - 1)^p}\) with p = 2n + 1 in the regime \(R \gg {R_c}\). During the matter-dominated epoch the mass M evolves as \(M \propto {t^{- (p + 1)}}\). In the regime \({M^2} \gg {k^2}/{a^2}\) one has \(\vert \delta {R_{{\rm{osc}}}}\vert/{R^{(0)}} \propto {t^{- (3p + 1)/2}}\) and hence the amplitude of the oscillating mode decreases faster than \({R^{(0)}}\). However the contribution of the oscillating mode tends to be more important as we go back to the past. In fact, this behavior was confirmed in the numerical simulations of [587, 36]. This property persists in the radiation-dominated epoch as well. If the condition \(\vert \delta R\vert < {R^{(0)}}\) is violated, then R can be negative such that the condition \({f_{,R}} > 0\) or \({f_{,RR}} > 0\) is violated for the models (4.83) and (4.84). Thus we require that ∣δR∣ is smaller than \({R^{(0)}}\) at the beginning of the radiation era. This can be achieved by choosing the constant c in Eq. (8.106) to be sufficiently small, which amounts to a fine tuning for these models.

For the models (4.83) and (4.84) one has \(F = 1 - 2n\mu {(R/{R_c})^{- 2n - 1}}\) in the regime \(R \gg {R_c}\). Then the field ϕ defined in Eq. (2.31) rapidly approaches 0 as we go back to the past. Recall that in the Einstein frame the effective potential of the field has a potential minimum around ϕ = 0 because of the presence of the matter coupling. Unless the oscillating mode of the field perturbation δϕ is strongly suppressed relative to the background field \({\phi ^{(0)}}\), the system can access the curvature singularity at ϕ = 0 [266]. This is associated with the condition \(\vert \delta R\vert < {R^{(0)}}\) discussed above. This curvature singularity appears in the past, which is not related to the future singularities studied in [461, 54]. The past singularity can be cured by taking into account the R² term [37], as we will see in Section 13.3. We note that the f(R) models proposed in [427] [e.g., \(f(R) = R - \alpha {R_c}\ln (1 + R/{R_c})\)] to cure the singularity problem satisfy neither the local gravity constraints [580] nor observational constraints of large-scale structure [194].

As long as the oscillating mode δRosc is negligible relative to the matter-induced mode δRind, we can estimate the evolution of matter perturbations δm as well as the effective gravitational potential Φeff. Note that in [192, 434] the perturbation equations have been derived without neglecting the oscillating mode. As long as the condition \(\vert \delta {R_{{\rm{osc}}}}\vert < \vert \delta {R_{{\rm{ind}}}}\vert\) is satisfied initially, the approximate equation (8.100) is accurate enough to reproduce the numerical solutions [192, 589]. Equation (8.100) can be written as

$${{{{\rm{d}}^2}{\delta _m}} \over {{\rm{d}}{N^2}}} + \left({{1 \over 2} - {3 \over 2}{w_{{\rm{eff}}}}} \right){{{\rm{d}}{\delta _m}} \over {{\rm{d}}N}} - {3 \over 2}{\Omega _m}{{4 + 3{M^2}{a^2}/{k^2}} \over {3(1 + {M^2}{a^2}/{k^2})}}{\delta _m} = 0\,,$$
(8.108)

where N = ln a, \({w_{{\rm{eff}}}} = - 1 - 2\dot H/(3{H^2})\), and \({\Omega _m} = 8\pi G{\rho _m}/(3F{H^2})\). The matter-dominated epoch corresponds to weff = 0 and Ωm = 1. In the regime \({M^2} \gg {k^2}/{a^2}\) the evolution of δm and Φeff during the matter dominance is given by

$${\delta _m} \propto {t^{2/3}},\qquad {\Phi _{{\rm{eff}}}} = {\rm{constant}}\,,$$
(8.109)

where we used Eq. (8.99). The matter-induced mode δRind relative to the background Ricci scalar \({R^{(0)}}\) evolves as \(\vert \delta {R_{{\rm{ind}}}}\vert/{R^{(0)}} \propto {t^{2/3}} \propto {\delta _m}\). At late times the perturbations can enter the regime \({M^2} \ll {k^2}/{a^2}\), depending on the wavenumber k and the mass M. When \({M^2} \ll {k^2}/{a^2}\), the evolution of δm and Φeff during the matter era is [568]

$${\delta _m} \propto {t^{(\sqrt {33} - 1)/6}}\,,\qquad {\Phi _{{\rm{eff}}}} \propto {t^{(\sqrt {33} - 5)/6}}\,.$$
(8.110)

For the model \(m(r) = C{(- r - 1)^p}\), the evolution of the matter-induced mode in the region \({M^2} \ll {k^2}/{a^2}\) is given by \(\vert \delta {R_{{\rm{ind}}}}\vert/{R^{(0)}} \propto {t^{- 2p + (\sqrt {33} - 5)/6}}\). This decreases more slowly than the ratio \(\vert \delta {R_{{\rm{osc}}}}\vert/{R^{(0)}}\) [587], so the oscillating mode tends to become unimportant with time.
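The growth exponents in Eqs. (8.109) and (8.110) follow from inserting the ansatz δm ∝ t^p into Eq. (8.100) during the matter era, where H = 2/(3t) and 4πGρm/F = 3H²/2. This gives the indicial equation p² + p/3 − (2/3)g = 0 with g ≡ Geff F/G. The sketch below solves it for the two regimes; the function names are hypothetical. The exponent used for the oscillating mode in the scalar-tensor regime is not quoted in the text: it is an assumption obtained here from Eq. (8.107) with a ∝ t^(2/3) and M ∝ t^(−(p+1)).

```python
import math

def growth_exponent(g_ratio):
    """Growing-mode exponent p of delta_m ∝ t^p for g_ratio = G_eff F / G."""
    return (-1.0 + math.sqrt(1.0 + 24.0 * g_ratio)) / 6.0

p_gr = growth_exponent(1.0)            # GR regime, G_eff = G/F
p_st = growth_exponent(4.0 / 3.0)      # scalar-tensor regime, G_eff = 4G/(3F)
phi_eff_exponent_st = p_st - 2.0 / 3.0  # Phi_eff ∝ a^2 rho_m delta_m ∝ t^(p - 2/3)

def induced_exponent_st(p_model):
    """Exponent of |dR_ind|/R0 for M^2 << k^2/a^2 (as quoted in the text)."""
    return -2.0 * p_model + (math.sqrt(33.0) - 5.0) / 6.0

def oscillating_exponent_st(p_model):
    """Exponent of |dR_osc|/R0 for M^2 << k^2/a^2 (assumption, from Eq. (8.107))."""
    return -2.0 * p_model - 2.0 / 3.0

print(p_gr, p_st, phi_eff_exponent_st)
print(induced_exponent_st(3), oscillating_exponent_st(3))  # p = 2n + 1 with n = 1
```

Under these assumptions the induced mode always decays more slowly than the oscillating mode in the scalar-tensor regime, independently of p.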

8.2 The impact on large-scale structure

We have shown that the evolution of matter perturbations during the matter dominance is given by \({\delta _m} \propto {t^{2/3}}\) for \({M^2} \gg {k^2}/{a^2}\) (GR regime) and \({\delta _m} \propto {t^{(\sqrt {33} - 1)/6}}\) for \({M^2} \ll {k^2}/{a^2}\) (scalar-tensor regime), respectively. The existence of the latter phase gives rise to the modification to the matter power spectrum [146, 74, 544, 526, 251] (see also [597, 493, 494, 94, 446, 278, 435] for related works). The transition from the GR regime to the scalar-tensor regime occurs at \({M^2} = {k^2}/{a^2}\). If it occurs during the matter dominance (\(R \simeq 3{H^2}\)), the condition \({M^2} = {k^2}/{a^2}\) translates into [589]

$$m \simeq {(aH/k)^2}\,,$$
(8.111)

where we have used the relation \({M^2} \simeq R/(3m)\) (valid for m ≪ 1).

We are interested in the wavenumbers k relevant to the linear regime of the galaxy power spectrum [577, 578]:

$$0.01\,h\;{\rm{Mp}}{{\rm{c}}^{- 1}}\underset{\sim}{<} k \underset{\sim}{<} 0.2\,h\;{\rm{Mp}}{{\rm{c}}^{- 1}}\,,$$
(8.112)

where h = 0.72 ± 0.08 parametrizes the uncertainty in the Hubble parameter today. Non-linear effects are important for \(k \gtrsim 0.2\,h\;{\rm{Mp}}{{\rm{c}}^{- 1}}\). The current observations on large scales around \(k \sim 0.01\,h\;{\rm{Mp}}{{\rm{c}}^{- 1}}\) are not so accurate but can be improved in future. The upper bound \(k = 0.2\,h\;{\rm{Mp}}{{\rm{c}}^{- 1}}\) corresponds to k ≃ 600a0H0, where the subscript “0” represents quantities today. If the transition from the GR regime to the scalar-tensor regime occurred by the present epoch (the redshift z = 0) for the mode k = 600a0H0, then the parameter m today is constrained to be

$$m(z = 0)\underset{\sim}{>} 3 \times {10^{- 6}}\,.$$
(8.113)

When \(m(z = 0) \lesssim 3 \times {10^{- 6}}\) the linear perturbations have always been in the GR regime by today, in which case the models are not distinguished from the ΛCDM model. The bound (8.113) is relaxed for non-linear perturbations with \(k \gtrsim 0.2\,h\;{\rm{Mp}}{{\rm{c}}^{- 1}}\), but the linear analysis is not valid in such cases.
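The conversion of the upper bound in Eq. (8.112) into units of a0H0, and the resulting bound (8.113) from the transition condition (8.111), can be reproduced as follows. This is a minimal sketch with hypothetical function names; it uses H0 = 100 h km s⁻¹ Mpc⁻¹, so that h cancels when k is given in h Mpc⁻¹.

```python
C_KM_PER_S = 299792.458   # speed of light in km/s

def k_over_a0H0(k_in_h_per_mpc):
    """Wavenumber in units of a0 H0, for k given in h/Mpc."""
    hubble_radius_mpc_over_h = C_KM_PER_S / 100.0   # c / H0 in Mpc/h
    return k_in_h_per_mpc * hubble_radius_mpc_over_h

def m_at_transition(k_over_aH):
    """Deviation parameter at the transition, m = (aH/k)^2, Eq. (8.111)."""
    return 1.0 / k_over_aH ** 2

k_max = k_over_a0H0(0.2)
m_today = m_at_transition(k_max)
print(k_max, m_today)
```

The upper end of the linear range corresponds to k of about 600 a0H0, which gives the threshold m(z = 0) of about 3 × 10⁻⁶ used in Eq. (8.113).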

If the transition characterized by the condition (8.111) occurs during the deep matter era (z ≫ 1), we can estimate the critical redshift zk at the transition point. In the following let us consider the models (4.83) and (4.84). In addition to the approximations \({H^2} \simeq H_0^2\Omega _m^{(0)}{(1 + z)^3}\) and R ≃ 3H² during the matter dominance, we use the asymptotic forms \(m \simeq C{(- r - 1)^{2n + 1}}\) and \(r \simeq - 1 - \mu {R_c}/R\) with \(C = 2n(2n + 1)/{\mu ^{2n}}\). Since the dark energy density today can be approximated as \(\rho _{{\rm{DE}}}^{(0)} \approx \mu {R_c}/2\), it follows that \(\mu {R_c} \approx 6H_0^2\Omega _{{\rm{DE}}}^{(0)}\). Then the condition (8.111) translates into the critical redshift [589]

$${z_k} = {\left[ {{{\left({{k \over {{a_0}{H_0}}}} \right)}^2}{{2n(2n + 1)} \over {{\mu ^{2n}}}}{{{{(2\Omega _{{\rm{DE}}}^{(0)})}^{2n + 1}}} \over {{{(\Omega _m^{(0)})}^{2(n + 1)}}}}} \right]^{1/(6n + 4)}} - 1\,.$$
(8.114)

For n = 1, μ = 3, \(\Omega _m^{(0)} = 0.28\), and k = 300a0H0 the numerical value of the critical redshift is zk = 4.5, which is in good agreement with the analytic value estimated by (8.114).
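The analytic estimate (8.114) is simple to evaluate directly. The following minimal sketch assumes a flat universe with negligible radiation today, so that \(\Omega _{{\rm{DE}}}^{(0)} = 1 - \Omega _m^{(0)}\); this closure relation and the function name are assumptions made for the illustration.

```python
def critical_redshift(k_over_a0H0, n, mu, omega_m0):
    """Critical redshift z_k from Eq. (8.114).

    Assumption (not part of the formula itself): a flat universe with
    negligible radiation today, so Omega_DE^(0) = 1 - Omega_m^(0).
    """
    omega_de0 = 1.0 - omega_m0
    c_coeff = 2 * n * (2 * n + 1) / mu ** (2 * n)
    bracket = (k_over_a0H0 ** 2 * c_coeff
               * (2.0 * omega_de0) ** (2 * n + 1)
               / omega_m0 ** (2 * (n + 1)))
    return bracket ** (1.0 / (6 * n + 4)) - 1.0

# Parameters quoted in the text: n = 1, mu = 3, Omega_m^(0) = 0.28, k = 300 a0 H0.
z_k = critical_redshift(300.0, n=1, mu=3.0, omega_m0=0.28)
```

For these parameters the estimate gives a value of roughly 4.6, close to the numerical value zk = 4.5 quoted above. Doubling k raises zk, which illustrates that smaller-scale modes cross the transition earlier.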

The estimation (8.114) shows that, for larger k, the transition occurs earlier. The time tk at the transition has a k-dependence: \({t_k} \propto {k^{- 3/(6n + 4)}}\). For t > tk the matter perturbation evolves as \({\delta _m} \propto {t^{(\sqrt {33} - 1)/6}}\) until the time t = tΛ corresponding to the onset of cosmic acceleration (ä = 0). The matter power spectrum \({P_{{\delta _m}}} = \vert {\delta _m}{\vert ^2}\) at the time tΛ shows a difference compared to the case of the ΛCDM model [568]:

$${{{P_{{\delta _m}}}({t_\Lambda})} \over {{P_{{\delta _m}}}^{\Lambda {\rm{CDM}}}({t_\Lambda})}} = {\left({{{{t_\Lambda}} \over {{t_k}}}} \right)^{2\left({{{\sqrt {33} - 1} \over 6} - {2 \over 3}} \right)}} \propto {k^{{{\sqrt {33} - 5} \over {6n + 4}}}}\,.$$
(8.115)

We caution that, when zk is close to zΛ (the redshift at t = tΛ), the estimation (8.115) begins to lose its accuracy. The ratio of the two power spectra today, i.e., \({P_{{\delta _m}}}({t_0})/{P_{{\delta _m}}}^{\Lambda {\rm{CDM}}}({t_0})\) is in general different from Eq. (8.115). However, numerical simulations in [587] show that the difference is small for n of the order of unity.

The modified evolution (8.110) of the effective gravitational potential for z < zk leads to the integrated Sachs-Wolfe (ISW) effect in CMB anisotropies [544, 382, 545]. However this is limited to very large scales (low multipoles) in the CMB spectrum. Meanwhile the galaxy power spectrum is directly affected by the non-standard evolution of matter perturbations. From Eq. (8.115) there should be a difference between the spectral indices of the CMB spectrum and the galaxy power spectrum on the scale (8.112) [568]:

$$\Delta {n_s} = {{\sqrt {33} - 5} \over {6n + 4}}\,.$$
(8.116)

Observationally we do not find any strong signature for the difference of slopes of the two spectra. If we take the mild bound Δns < 0.05, we obtain the constraint n > 2. Note that in this case the local gravity constraint (5.60) is also satisfied.
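The constraint on n follows from inverting Eq. (8.116). A minimal sketch, treating n as a continuous parameter (the function names are illustrative):

```python
import math

def delta_ns(n):
    """Spectral-index difference of Eq. (8.116)."""
    return (math.sqrt(33.0) - 5.0) / (6.0 * n + 4.0)

def n_min(bound):
    """Smallest n (treated as continuous) for which delta_ns(n) < bound."""
    return ((math.sqrt(33.0) - 5.0) / bound - 4.0) / 6.0

threshold = n_min(0.05)
```

With the bound 0.05 the threshold lies slightly below 2, which is consistent with the constraint n > 2 quoted above when n is rounded up.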

In order to estimate the growth rate of matter perturbations, we introduce the growth index γ defined by [484]

$${f_\delta} \equiv {{{{\dot \delta}_m}} \over {H{\delta _m}}} = {({\tilde \Omega _m})^\gamma}\,,$$
(8.117)

where \({{\tilde \Omega}_m} = {\kappa ^2}{\rho _m}/(3{H^2}) = F{\Omega _m}\). This choice of \({{\tilde \Omega}_m}\) comes from writing Eq. (4.59) in the form \(3{H^2} = {\rho _{{\rm{DE}}}} + {\kappa ^2}{\rho _m}\), where \({\rho _{{\rm{DE}}}} \equiv (FR - f)/2 - 3H\dot F + 3{H^2}(1 - F)\) and we have ignored the contribution of radiation. Since the viable f(R) models are close to the ΛCDM model in the region of high density, the quantity F approaches 1 in the asymptotic past. Defining ρDE and \({{\tilde \Omega}_m}\) in the above way, the Friedmann equation can be cast in the usual GR form with non-relativistic matter and dark energy [568, 270, 589].

The growth index in the ΛCDM model corresponds to γ ≃ 0.55 [612, 395], which is nearly constant for 0 < z < 1. In f(R) gravity, if the perturbations are in the GR regime (M² ≫ k²/a²) today, γ is close to the GR value. Meanwhile, if the transition to the scalar-tensor regime occurred at the redshift zk larger than 1, the growth index becomes smaller than 0.55 [270]. Since \(0 < {{\tilde \Omega}_m} < 1\), the smaller γ implies a larger growth rate.
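The last statement can be illustrated numerically from the definition (8.117). In the sketch below the value \({{\tilde \Omega}_m} = 0.3\) and the smaller growth index 0.41 are illustrative inputs chosen for the example; only γ = 0.55 is the ΛCDM value quoted above.

```python
def growth_rate(omega_m_tilde, gamma):
    """Growth rate f_delta = (Omega_m tilde)^gamma, Eq. (8.117)."""
    return omega_m_tilde ** gamma

omega = 0.3                         # illustrative value only
f_lcdm = growth_rate(omega, 0.55)   # LCDM-like growth index
f_small = growth_rate(omega, 0.41)  # smaller growth index (example value)
```

Because the base lies between 0 and 1, lowering the exponent increases the result, so `f_small` exceeds `f_lcdm`.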

In Figure 4 we plot the evolution of the growth index γ in the model (4.83) with n = 1 and μ = 1.55 for a number of different wavenumbers. In this case the present value of γ is degenerate around γ0 ≃ 0.41 independent of the scales of our interest. For the wavenumbers k = 0.1 h Mpc−1 and k = 0.01 h Mpc−1 the transition redshifts correspond to zk = 5.2 and zk = 2.7, respectively. Hence these modes have already entered the scalar-tensor regime by today.

Figure 4

Evolution of γ versus the redshift z in the model (4.83) with n = 1 and μ = 1.55 for four different values of k. For these model parameters the dispersion of γ with respect to k is very small. All the perturbation modes shown in the figure have reached the scalar-tensor regime (M² ≪ k²/a²) by today. From [589].

From Eq. (8.114) we find that zk gets smaller for larger n and μ. If the mode k = 0.2 h Mpc⁻¹ crossed the transition point at \({z_k} > {\mathcal O}(1)\) and the mode k = 0.01 h Mpc⁻¹ has marginally entered (or has not entered) the scalar-tensor regime by today, then the growth indices should be strongly dispersed. For sufficiently large values of n and μ one can expect that the transition to the regime M² ≪ k²/a² has not occurred by today. The following three cases appear depending on the values of n and μ [589]:

  (i) All modes have values of γ0 close to the ΛCDM value γ0 = 0.55, i.e., 0.53 ≲ γ0 ≲ 0.55.

  (ii) All modes have values of γ0 in the range 0.40 ≲ γ0 ≲ 0.43.

  (iii) The values of γ0 are dispersed in the range 0.40 ≲ γ0 ≲ 0.55.

The region (i) corresponds to the opposite of the inequality (8.113), i.e., m(z = 0) ≲ 3 × 10⁻⁶, in which case n and μ take large values. The border between (i) and (iii) is characterized by the condition m(z = 0) ≈ 3 × 10⁻⁶. The region (ii) corresponds to small values of n and μ (as in the numerical simulation of Figure 4), in which case the mode k = 0.01 h Mpc⁻¹ entered the scalar-tensor regime for \({z_k} > {\mathcal O}(1)\).

The regions (i), (ii), (iii) can be found numerically by solving the perturbation equations. In Figure 5 we plot those regions for the model (4.84) together with the bounds coming from the local gravity constraints as well as the stability of the late-time de Sitter point. Note that the result in the model (4.83) is also similar to that in the model (4.84). The parameter space for n ≲ 3 and \(\mu = {\mathcal O}(1)\) is dominated by either the region (ii) or the region (iii). While the present observational constraint on γ is quite weak, the unusual converged or dispersed spectra found above can be useful to distinguish metric f(R) gravity from the ΛCDM model in future observations. We also note that for other viable f(R) models such as (4.89) the growth index today can be as small as γ0 ≃ 0.4 [589]. If future observations detect such unusually small values of γ0, this can be a smoking gun for f(R) models.

Figure 5

The regions (i), (ii) and (iii) for the model (4.84). We also show the bound n > 0.9 coming from the local gravity constraints as well as the condition (4.87) coming from the stability of the de Sitter point. From [589].

8.3 Non-linear matter perturbations

So far we have discussed the evolution of linear perturbations relevant to the matter spectrum for the scale k ≲ 0.01–0.2 h Mpc−1. For smaller scale perturbations the effect of non-linearity becomes important. In GR there are some mapping formulas from the linear power spectrum to the non-linear power spectrum such as the halo fitting by Smith et al. [540]. In the halo model the non-linear power spectrum P(k) is defined by the sum of two pieces [169]:

$$P(k) = {I_1}(k) + {I_2}{(k)^2}{P_L}(k)\,,$$
(8.118)

where PL(k) is a linear power spectrum and

$${I_1}(k) = \int {{{{\rm{d}}M} \over M}} {\left({{M \over {\rho _m^{(0)}}}} \right)^2}{{{\rm{d}}n} \over {{\rm{d}}\ln M}}\,{y^2}(M,k)\,,\qquad {I_2}(k) = \int {{{{\rm{d}}M} \over M}} \left({{M \over {\rho _m^{(0)}}}} \right){{{\rm{d}}n} \over {{\rm{d}}\ln M}}\,b(M)y(M,k)\,.$$
(8.119)

Here M is the mass of dark matter halos, \(\rho _m^{(0)}\) is the dark matter density today, dn/dln M is the mass function describing the comoving number density of halos, y(M, k) is the Fourier transform of the halo density profile, and b(M) is the halo bias.

In modified gravity theories, Hu and Sawicki (HS) [307] provided a fitting formula to describe a non-linear power spectrum based on the halo model. The mass function dn/d ln M and the halo profile ρ depend on the root-mean-square σ(M) of a linear density field. The Sheth-Tormen mass function [535] and the Navarro-Frenk-White halo profile [449] are usually employed in GR. Replacing σ with σGR, obtained in the GR dark energy model that follows the same expansion history as the modified gravity model, we obtain a non-linear power spectrum P(k) according to Eq. (8.118). In [307] this non-linear spectrum is called \({P_\infty}(k)\). It is also possible to obtain a non-linear spectrum P0(k) by applying a usual (halo) mapping formula in GR to modified gravity. This approach is based on the assumption that the growth rate in the linear regime determines the non-linear spectrum. Hu and Sawicki proposed a parametrized non-linear spectrum that interpolates between the two spectra \({P_\infty}(k)\) and P0(k) [307]:

$$P(k) = {{{P_0}(k) + {c_{{\rm{nl}}}}{\Sigma ^2}(k){P_\infty}(k)} \over {1 + {c_{{\rm{nl}}}}{\Sigma ^2}(k)}}\,,$$
(8.120)

where cnl is a parameter which controls whether P(k) is close to P0(k) or \({P_\infty}(k)\). In [307] they have taken the form \({\Sigma ^2}(k) = {k^3}{P_L}(k)/(2{\pi ^2})\).

The validity of the HS fitting formula (8.120) should be checked with N-body simulations in modified gravity models. In [478, 479, 529] N-body simulations were carried out for the f(R) model (4.83) with n = 1/2 (see also [562, 379] for N-body simulations in other modified gravity models). The chameleon mechanism should be at work on small scales (solar-system scales) for the consistency with local gravity constraints. In [479] it was found that the chameleon mechanism tends to suppress the enhancement of the power spectrum in the non-linear regime that corresponds to the recovery of GR. On the other hand, in the post-Newtonian intermediate regime, the power spectrum is enhanced compared to the GR case at a measurable level.

Koyama et al. [371] studied the validity of the HS fitting formula by comparing it with the results of N-body simulations. Note that in this paper the parametrization (8.120) was used as a fitting formula without employing the halo model explicitly. In their notation P0 corresponds to “Pnon-GR” derived without non-linear interactions responsible for the recovery of GR (i.e., gravity is modified down to small scales in the same manner as in the linear regime), whereas \({P_\infty}\) corresponds to “PGR” obtained in the GR dark energy model following the same expansion history as that in the modified gravity model. Note that cnl characterizes how the theory approaches GR by the chameleon mechanism. Choosing Σ as

$${\Sigma ^2}(k,z) = {\left({{{{k^3}} \over {2{\pi ^2}}}{P_L}(k,z)} \right)^{1/3}}\,,$$
(8.121)

where PL is the linear power spectrum in the modified gravity model, they showed that, in the f(R) model (4.83) with n = 1/2, the formula (8.120) can fit the solutions in perturbation theory very well by allowing the time-dependence of the parameter cnl in terms of the redshift z. In the regime 0 < z < 1 the parameter cnl is approximately given by cnl(z = 0) = 0.085.
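The interpolation (8.120) and the two choices of Σ² discussed above can be written compactly. The sketch below is only an illustration of the formulas: the spectra are passed in as plain numbers at a single wavenumber, and the function names are not taken from [307] or [371].

```python
import math

def sigma2_hs(k, p_lin):
    """Sigma^2(k) = k^3 P_L(k) / (2 pi^2), the form taken in [307]."""
    return k ** 3 * p_lin / (2.0 * math.pi ** 2)

def sigma2_koyama(k, p_lin):
    """Sigma^2(k, z) of Eq. (8.121): the cube root of the form above."""
    return sigma2_hs(k, p_lin) ** (1.0 / 3.0)

def hs_power(p0, p_inf, sigma2, c_nl):
    """Interpolated non-linear spectrum of Eq. (8.120)."""
    weight = c_nl * sigma2
    return (p0 + weight * p_inf) / (1.0 + weight)
```

For c_nl = 0 the function returns P0 (no recovery of GR), while for c_nl Σ² ≫ 1 it approaches \({P_\infty}\); intermediate values give a weighted average of the two spectra.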

In the left panel of Figure 6 the relative difference of the non-linear power spectrum P(k) from the GR spectrum PGR(k) is plotted as a dashed curve (“no chameleon” case with cnl = 0) and as a solid curve (“chameleon” case with non-zero cnl derived in the perturbative regime). Note that in this simulation the fitting formula by Smith et al. [540] is used to obtain the non-linear power spectrum from the linear one. The agreement with N-body simulations is not very good in the non-linear regime (k > 0.1 h Mpc⁻¹). In [371] the power spectrum Pnon-GR in the no chameleon case (i.e., cnl = 0) was derived by interpolating the N-body results in [479]. This is plotted as the dashed line in the right panel of Figure 6. Using this spectrum Pnon-GR for cnl ≠ 0, the power spectrum in N-body simulations in the chameleon case can be well reproduced by the fitting formula (8.120) for the scale k < 0.5 h Mpc⁻¹ (see the solid line in Figure 6). Although there is some deviation in the regime k > 0.5 h Mpc⁻¹, we caution that N-body simulations have large errors in this regime. See [530] for cluster abundance constraints on the f(R) model (4.83) derived by the calibration of N-body simulations.

Figure 6

Comparison between N-body simulations and the two fitting formulas in the f(R) model (4.83) with n = 1/2. The circles and triangles show the results of N-body simulations with and without the chameleon mechanism, respectively. The arrow represents the maximum value of k(= 0.08h Mpc−1) by which the perturbation theory is valid. (Left) The fitting formula by Smith et al. [540] is used to predict Pnon-GR and PGR. The solid and dashed lines correspond to the power spectra with and without the chameleon mechanism, respectively. For the chameleon case cnl(z) is determined by the perturbation theory with cnl(z = 0) = 0.085. (Right) The N-body results in [479] are interpolated to derive Pnon-GR without the chameleon mechanism. The obtained Pnon-GR is used for the HS fitting formula to derive the power spectrum P in the chameleon case. From [371].

In the quasi non-linear regime a normalized skewness, \({S_3} = \langle \delta _m^3\rangle/{\langle \delta _m^2\rangle ^2}\), of matter perturbations can provide a good test for the picture of gravitational instability from Gaussian initial conditions [79]. If large-scale structure grows via gravitational instability from Gaussian initial perturbations, the skewness in a universe dominated by pressureless matter is known to be S3 = 34/7 in GR [484]. In the ΛCDM model the skewness depends weakly on the expansion history of the universe (less than a few percent) [335]. In f(R) dark energy models the difference of the skewness from the ΛCDM model is less than a few percent [576], even if the growth rate of matter perturbations is significantly different. This is related to the fact that in the Einstein frame dark energy has a universal coupling \(Q = - 1/\sqrt 6\) with all non-relativistic matter, unlike the coupled quintessence scenario with different couplings between dark energy and matter species (dark matter, baryons) [30].

8.4 Cosmic Microwave Background

The effective gravitational potential (8.98) is directly related to the ISW effect in CMB anisotropies. This contributes to the temperature anisotropies today as an integral [308, 214]

$${\Theta _{{\rm{ISW}}}} \equiv \int\nolimits_0^{{\eta _0}} {\rm{d}} \eta {e^{- \tau}}{{{\rm{d}}{\Phi _{{\rm{eff}}}}} \over {{\rm{d}}\eta}}{j_\ell}[k({\eta _0} - \eta)]\,,$$
(8.122)

where τ is the optical depth, \(\eta = \int {{a^{- 1}}{\rm{d}}t}\) is the conformal time with the present value η0, and \({j_\ell}[k({\eta _0} - \eta)]\) is the spherical Bessel function for the CMB multipole ℓ and the wavenumber k. In the limit ℓ ≫ 1 (i.e., small-scale limit) the spherical Bessel function has a dependence \({j_\ell}(x) \simeq {(x/\ell)^{\ell - 1/2}}\), which is suppressed for large ℓ. Hence the dominant contribution to the ISW effect comes from the low ℓ modes (\(\ell = \mathcal O(1)\)).

In the ΛCDM model the effective gravitational potential is constant during the matter dominance, but it begins to decay after the Universe enters the epoch of cosmic acceleration (see the left panel of Figure 7). This late-time variation of Φeff leads to the contribution to ΘISW, which works as the ISW effect.

Figure 7

(Left) Evolution of the effective gravitational potential Φeff (denoted as Φ in the figure) versus the scale factor a (with the present value a = 1) on the scale k⁻¹ = 10³ Mpc for the ΛCDM model and f(R) models with B0 = 0.5, 1.5, 3.0, 5.0. As the parameter B0 increases, the decay of Φeff decreases and then turns into growth for B0 ≳ 1.5. (Right) The CMB power spectrum \(\ell (\ell + 1){C_\ell}/(2\pi)\) for the ΛCDM model and f(R) models with B0 = 0.5, 1.5, 3.0, 5.0. As B0 increases, the ISW contributions to low multipoles decrease, reach the minimum around B0 = 1.5, and then increase. The black points correspond to the WMAP 3-year data [561]. From [545].

For viable f(R) dark energy models the evolution of Φeff during the early stage of the matter era is constant as in the ΛCDM model. After the transition to the scalar-tensor regime, the effective gravitational potential evolves as \({\Phi _{{\rm{eff}}}} \propto {t^{(\sqrt {33} - 5)/6}}\) during the matter dominance [as we have shown in Eq. (8.110)]. The evolution of Φeff during the accelerated epoch is also subject to change compared to the ΛCDM model. In the left panel of Figure 7 we show the evolution of Φeff versus the scale factor a for the wavenumber k = 10⁻³ Mpc⁻¹ in several different cases. In this simulation the background cosmological evolution is fixed to be the same as that in the ΛCDM model. In order to quantify the difference from the ΛCDM model at the level of perturbations, [628, 544, 545] defined the following quantity

$$B \equiv m\,{{\dot R} \over R}\,{H \over {\dot H}}\,,$$
(8.123)

where m = Rf,RR/f,R. If the effective equation of state weff defined in Eq. (4.69) is constant, it then follows that R = 3H²(1 − 3weff) and hence B = 2m. The stability of cosmological perturbations requires the condition B > 0 [544, 526]. The left panel of Figure 7 shows that, as we increase the values of B today (= B0), the evolution of Φeff at late times tends to be significantly different from that in the ΛCDM model. This comes from the fact that, for increasing B, the transition to the scalar-tensor regime occurs earlier.
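The step from a constant weff to B = 2m can be spelled out as follows. This is a short sketch that assumes the flat FLRW expression \(R = 6(2{H^2} + \dot H)\) and the definition \({w_{{\rm{eff}}}} = - 1 - 2\dot H/(3{H^2})\):

$$\dot H = - {3 \over 2}(1 + {w_{{\rm{eff}}}}){H^2}\quad \Rightarrow \quad R = 6(2{H^2} + \dot H) = 3{H^2}(1 - 3{w_{{\rm{eff}}}})\,,$$

$${w_{{\rm{eff}}}} = {\rm{const}}\quad \Rightarrow \quad {{\dot R} \over R} = {{2\dot H} \over H}\quad \Rightarrow \quad B = m\,{{\dot R} \over R}\,{H \over {\dot H}} = 2m\,.$$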

From the right panel of Figure 7 we find that, as B0 increases, the CMB spectrum for low multipoles first decreases and then reaches the minimum around B0 = 1.5. This comes from the reduction in the decay rate of Φeff relative to the ΛCDM model, see the left panel of Figure 7. Around B0 = 1.5 the effective gravitational potential is nearly constant, so that the ISW effect is almost absent (i.e., ΘISW ≈ 0). For B0 ≳ 1.5 the evolution of Φeff turns into growth. This leads to the increase of the large-scale CMB spectrum, as B0 increases. The spectrum in the case B0 = 3.0 is similar to that in the ΛCDM model. The WMAP 3-year data rule out B0 > 4.3 at the 95% confidence level because of the excessive ISW effect [545].

There is another observational constraint coming from the angular correlation between the CMB temperature field and the galaxy number density field induced by the ISW effect [544]. The f(R) models predict that, for B0 ≳ 1, the galaxies are anticorrelated with the CMB because of the sign change of the ISW effect. Since the anticorrelation has not been observed in the observational data of CMB and LSS, this places an upper bound of B0 ≲ 1 [545]. This is tighter than the bound B0 < 4.3 coming from the CMB angular spectrum discussed above.

Finally we briefly mention stochastic gravitational waves produced in the early universe [421, 172, 122, 123, 174, 173, 196, 20]. For the inflation model \(f(R) = R + {R^2}/(6{M^2})\) the primordial gravitational waves are generated with the tensor-to-scalar ratio of the order of 10⁻³, see Eq. (7.73). It is also possible to generate stochastic gravitational waves after inflation under the modification of gravity. Capozziello et al. [122, 123] studied the evolution of tensor perturbations for a toy model \(f = {R^{1 + \epsilon}}\) in the FLRW universe with the power-law evolution of the scale factor. Since the parameter ϵ is constrained to be very small (∣ϵ∣ < 7.2 × 10⁻¹⁹) [62, 160], it is very difficult to detect the signature of f(R) gravity in the stochastic gravitational wave background. This property should hold for viable f(R) dark energy models in general, because the deviation from GR during the radiation and the deep matter era is very small.

9 Palatini Formalism

In this section we discuss f(R) theory in the Palatini formalism [481]. In this approach the action (2.1) is varied with respect to both the metric gμν and the connection \(\Gamma _{\beta \gamma}^\alpha\). Unlike the metric approach, gμν and \(\Gamma _{\beta \gamma}^\alpha\) are treated as independent variables. Variations using the Palatini approach [256, 607, 608, 261, 262, 260] lead to second-order field equations which are free from the instability associated with negative signs of f,RR [422, 423]. We note that even in the 1930s Lanczos [378] proposed a specific combination of curvature-squared terms that lead to a second-order and divergence-free modified Einstein equation.

The background cosmological dynamics of Palatini f(R) gravity has been investigated in [550, 553, 21, 253, 495], which shows that the sequence of radiation, matter, and accelerated epochs can be realized even for the model \(f(R) = R - a/{R^n}\) with n > 0 (see also [424, 457, 495]). The equations for matter density perturbations were derived in [359]. Because of a large coupling Q between dark energy and non-relativistic matter, dark energy models based on Palatini f(R) gravity are not compatible with the observations of large-scale structure, unless the deviation from the ΛCDM model is very small [356, 386, 385, 597]. Such a large coupling also gives rise to non-perturbative corrections to the matter action, which leads to a conflict with the Standard Model of particle physics [261, 262, 260] (see also [318, 472, 473, 475, 55]).

There are also a number of works [470, 471, 216, 552] about the Newtonian limit in the Palatini formalism (see also [18, 19, 107, 331, 511, 510]). In particular it was shown in [55, 56] that the non-dynamical nature of the scalar-field degree of freedom can lead to a divergence of non-vacuum static spherically symmetric solutions at the surface of a compact object for commonly-used polytropic equations of state. Hence it is difficult to make Palatini f(R) theory compatible with a number of observations and experiments, as long as the models are constructed to explain the late-time cosmic acceleration. Moreover it is also known that in Palatini gravity the Cauchy problem [609] is not well-formulated due to the presence of higher derivatives of matter fields in field equations [377] (see also [520, 135] for related works). We also note that the matter Lagrangian (such as the Lagrangian of Dirac particles) cannot be simply assumed to be independent of connections. Despite the above-mentioned problems, it is useful to review this theory because it shows how gravity can be modified from GR in a way that remains consistent with observations and experiments.

9.1 Field equations

Let us derive field equations by treating gμν and \(\Gamma _{\beta \gamma}^\alpha\) as independent variables. Varying the action (2.1) with respect to gμν, we obtain

$$F(R){R_{\mu \nu}}(\Gamma) - {1 \over 2}f(R){g_{\mu \nu}} = {\kappa ^2}T_{\mu \nu}^{(M)},$$
(9.1)

where F(R) = ∂f/∂R, Rμν(Γ) is the Ricci tensor corresponding to the connections \(\Gamma _{\beta \gamma}^\alpha\), and \(T_{\mu \nu}^{(M)}\) is defined in Eq. (2.5). Note that Rμν(Γ) is in general different from the Ricci tensor calculated in terms of metric connections Rμν(g). The trace of Eq. (9.1) gives

$$F(R)R - 2f(R) = {\kappa ^2}T,$$
(9.2)

where \(T = {g^{\mu \nu}}T_{\mu \nu}^{(M)}\). Here the Ricci scalar R(T) is directly related to T and it is different from the Ricci scalar R(g) = gμνRμν(g) in the metric formalism. More explicitly we have the following relation [556]

$$R(T) = R(g) + {3 \over {2{{({f^\prime}(R(T)))}^2}}}({\nabla _\mu}{f^\prime}(R(T)))({\nabla ^\mu}{f^\prime}(R(T))) + {3 \over {{f^\prime}(R(T))}}\square {f^\prime}(R(T)),$$
(9.3)

where a prime represents a derivative in terms of R(T). The variation of the action (2.1) with respect to the connection leads to the following equation

$$\begin{array}{*{20}c} {{R_{\mu \nu}}(g) - {1 \over 2}{g_{\mu \nu}}R(g) = {{{\kappa ^2}{T_{\mu \nu}}} \over F} - {{FR(T) - f} \over {2F}}{g_{\mu \nu}} + {1 \over F}({\nabla _\mu}{\nabla _\nu}F - {g_{\mu \nu}}\square F)\quad \quad \quad \;} \\ {- {3 \over {2{F^2}}}\;\left[ {{\partial _\mu}F{\partial _\nu}F - {1 \over 2}{g_{\mu \nu}}{{(\nabla F)}^2}} \right].} \\ \end{array}$$
(9.4)

In Einstein gravity (f(R) = R − 2Λ and F(R) = 1) the field equations (9.2) and (9.4) are identical to the equations (2.7) and (2.4), respectively. However, the difference appears for the f(R) models which include non-linear terms in R. While the kinetic term □F is present in Eq. (2.7), such a term is absent in Palatini f(R) gravity. This has the important consequence that the oscillatory mode, which appears in the metric formalism, does not exist in the Palatini formalism. As we will see later on, Palatini f(R) theory corresponds to Brans-Dicke (BD) theory [100] with a parameter ωBD = −3/2 in the presence of a field potential. Such a theory should be treated separately, compared to BD theory with ωBD ≠ −3/2 in which the field kinetic term is present.

As we have derived the action (2.21) from (2.18), the action in Palatini f(R) gravity is equivalent to

$$S = \int {{{\rm{d}}^4}} x\sqrt {- g} \;\left[ {{1 \over {2{\kappa ^2}}}\varphi R(T) - U(\varphi)} \right] + \int {{{\rm{d}}^4}} x{{\mathcal L}_M}({g_{\mu \nu}},{\Psi _M}),$$
(9.5)

where

$$\varphi = {f^\prime}(R(T)),\qquad U = {{R(T){f^\prime}(R(T)) - f(R(T))} \over {2{\kappa ^2}}}.$$
(9.6)

Since the derivative of U in terms of φ is U,φ = R/(2κ2), we obtain the following relation from Eq. (9.2):

$$4U - 2\varphi {U_{,\varphi}} = T.$$
(9.7)
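For completeness, the two steps leading to Eq. (9.7) can be written out explicitly. The sketch assumes \({f^{\prime \prime}}(R) \ne 0\), so that φ = f′(R) can be inverted, and writes R for R(T):

$${U_{,\varphi}} = {{{\rm{d}}U/{\rm{d}}R} \over {{\rm{d}}\varphi/{\rm{d}}R}} = {1 \over {2{\kappa ^2}}}{{{f^\prime} + R{f^{\prime \prime}} - {f^\prime}} \over {{f^{\prime \prime}}}} = {R \over {2{\kappa ^2}}}\,,$$

$$4U - 2\varphi {U_{,\varphi}} = {{2(R\varphi - f) - \varphi R} \over {{\kappa ^2}}} = {{F(R)R - 2f(R)} \over {{\kappa ^2}}} = T\,,$$

where the last equality uses the trace equation (9.2) with φ = F(R).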

Using the relation (9.3), the action (9.5) can be written as

$$S = \int {{{\rm{d}}^4}} x\sqrt {- g} \left[ {{1 \over {2{\kappa ^2}}}\varphi R(g) + {3 \over {4{\kappa ^2}}}{1 \over \varphi}{{(\nabla \varphi)}^2} - U(\varphi)} \right] + \int {{{\rm{d}}^4}} x{{\mathcal L}_M}({g_{\mu \nu}},{\Psi _M}).$$
(9.8)

Comparing this with Eq. (2.23) in the unit κ2 = 1, we find that Palatini f(R) gravity is equivalent to BD theory with the parameter ωBD = −3/2 [262, 470, 551]. As we will see in Section 10.1, this equivalence can be also seen by comparing Eqs. (9.1) and (9.4) with those obtained by varying the action (2.23) in BD theory. In the above discussion we have implicitly assumed that \({\mathcal L_M}\) does not explicitly depend on the Christoffel connections \(\Gamma _{\mu \nu}^\lambda\). This is true for a scalar field or a perfect fluid, but it is not necessarily so for other matter Lagrangians such as those describing vector fields.

There is another way for taking the variation of the action, known as the metric-affine formalism [299, 558, 557, 121]. In this formalism the matter action SM depends not only on the metric gμν but also on the connection \(\Gamma _{\mu \nu}^\lambda\). Since the connection is independent of the metric in this approach, one can define the quantity called hypermomentum [299], as \(\Delta _\lambda ^{\mu \nu} \equiv (- 2/\sqrt {- g})\delta {\mathcal L_M}/\delta \Gamma _{\mu \nu}^\lambda\). The usual assumption that the connection is symmetric is also dropped, so that the antisymmetric quantity called the Cartan torsion tensor, \(S_{\mu \nu}^\lambda \equiv \Gamma _{[\mu \nu ]}^\lambda\), is defined. The non-vanishing property of \(S_{\mu \nu}^\lambda\) allows the presence of torsion in this theory. If the condition \(\Delta _\lambda ^{[\mu \nu ]} = 0\) holds, it follows that the Cartan torsion tensor vanishes \((S_{\mu \nu}^\lambda = 0)\) [558]. Hence the torsion is induced by matter fields with the anti-symmetric hypermomentum. The f(R) Palatini gravity belongs to f(R) theories in the metric-affine formalism with \(\Delta _\lambda ^{\mu \nu} = 0\). In the following we do not discuss further f(R) theory in the metric-affine formalism. Readers who are interested in those theories may refer to the papers [557, 556].

9.2 Background cosmological dynamics

We discuss the background cosmological evolution of dark energy models based on Palatini f(R) gravity. We shall carry out general analysis without specifying the forms of f(R). We take into account non-relativistic matter and radiation whose energy densities are ρm and ρr, respectively. In the flat FLRW background (2.12) we obtain the following equations

$$FR - 2f = - {\kappa ^2}{\rho _m},$$
(9.9)
$$6F{\left({H + {{\dot F} \over {2F}}} \right)^2} - f = {\kappa ^2}({\rho _m} + 2{\rho _r}),$$
(9.10)

together with the continuity equations, \({\dot \rho _m} + 3H{\rho _m} = 0\) and \({\dot \rho _r} + 4H{\rho _r} = 0\). Combining Eqs. (9.9) and (9.10) together with continuity equations, it follows that

$$\dot R = {{3{\kappa ^2}H{\rho _m}} \over {{F_{,R}}R - F}} = - 3H{{F\,R - 2f} \over {{F_{,R}}R - F}},$$
(9.11)
$${H^2} = {{2{\kappa ^2}({\rho _m} + {\rho _r}) + F\,R - f} \over {6F\xi}},$$
(9.12)

where

$$\xi \equiv {\left[ {1 - {3 \over 2}{{{F_{,R}}(F\,R - 2f)} \over {F({F_{,R}}R - F)}}} \right]^2}.$$
(9.13)

In order to discuss cosmological dynamics it is convenient to introduce the dimensionless variables:

$${y_1} \equiv {{F\,R - f} \over {6F\xi {H^2}}},\qquad {y_2} \equiv {{{\kappa ^2}{\rho _r}} \over {3F\xi {H^2}}},$$
(9.14)

by which Eq. (9.12) can be written as

$${{{\kappa ^2}{\rho _m}} \over {3F\xi {H^2}}} = 1 - {y_1} - {y_2}.$$
(9.15)

Differentiating y1 and y2 with respect to N = ln a, we obtain [253]

$${{{\rm{d}}{y_1}} \over {{\rm{d}}N}} = {y_1}\left[ {3 - 3{y_1} + {y_2} + C(R)(1 - {y_1})} \right],$$
(9.16)
$${{{\rm{d}}{y_2}} \over {{\rm{d}}N}} = {y_2}\left[ {- 1 - 3{y_1} + {y_2} - C(R){y_1}} \right],$$
(9.17)

where

$$C(R) \equiv {{R\dot F} \over {H(F\,R - f)}} = - 3{{(F\,R - 2f){F_{,R}}R} \over {(F\,R - f)({F_{,R}}R - F)}}.$$
(9.18)

The following constraint equation also holds

$${{1 - {y_1} - {y_2}} \over {2{y_1}}} = - {{F\,R - 2f} \over {F\,R - f}}.$$
(9.19)

Hence the Ricci scalar R can be expressed in terms of y1 and y2.

Differentiating Eq. (9.12) with respect to t, it follows that

$${{\dot H} \over {{H^2}}} = - {3 \over 2} + {3 \over 2}{y_1} - {1 \over 2}{y_2} - {{\dot F} \over {2H\,F}} - {{\dot \xi} \over {2H\xi}} + {{\dot F\,R} \over {12F\xi {H^3}}},$$
(9.20)

from which we get the effective equation of state:

$${w_{{\rm{eff}}}} = - 1 - {2 \over 3}{{\dot H} \over {{H^2}}} = - {y_1} + {1 \over 3}{y_2} + {{\dot F} \over {3H\,F}} + {{\dot \xi} \over {3H\xi}} - {{\dot F\,R} \over {18F\xi {H^3}}}.$$
(9.21)

The cosmological dynamics is obtained by solving Eqs. (9.16) and (9.17) with Eq. (9.18). If C(R) is not constant, then one can use Eq. (9.19) to express R and C(R) in terms of y1 and y2.

The fixed points of Eqs. (9.16) and (9.17) can be found by setting dy1/dN = 0 and dy2/dN = 0. Even when C(R) is not constant, except for the cases C(R) = −3 and C(R) = −4, we obtain the following fixed points [253]:

  1. Pr: (y1, y2) = (0, 1),

  2. Pm: (y1, y2) = (0, 0),

  3. Pd: (y1, y2) = (1, 0).

The stability of the fixed points can be analyzed by considering linear perturbations about them. As long as dC/dy1 and dC/dy2 are bounded, the eigenvalues λ1 and λ2 of the Jacobian matrix of linear perturbations are given by

  1. Pr: (λ1, λ2) = (4 + C(R), 1),

  2. Pm: (λ1, λ2) = (3 + C(R), −1),

  3. Pd: (λ1, λ2) = (−3 − C(R), −4 − C(R)).

In the ΛCDM model (f(R) = R − 2Λ) one has weff = −y1 + y2/3 and C(R) = 0. Then the points Pr, Pm, and Pd correspond to weff = 1/3, (λ1, λ2) = (4, 1) (radiation domination, unstable), weff = 0, (λ1, λ2) = (3, −1) (matter domination, saddle), and weff = −1, (λ1, λ2) = (−3, −4) (de Sitter epoch, stable), respectively. Hence the sequence of radiation, matter, and de Sitter epochs is in fact realized.
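The classification of the three fixed points follows mechanically from the signs of the eigenvalues listed above. A minimal sketch (the function names are illustrative, and the value of C(R) passed in is assumed to be the one evaluated at the fixed point under consideration):

```python
def eigenvalues(c_value):
    """Eigenvalues (lambda_1, lambda_2) at P_r, P_m and P_d for a given C(R)."""
    return {
        "P_r": (4.0 + c_value, 1.0),
        "P_m": (3.0 + c_value, -1.0),
        "P_d": (-3.0 - c_value, -4.0 - c_value),
    }

def classify(pair):
    """Label a fixed point from the signs of its two real eigenvalues."""
    if any(lam == 0.0 for lam in pair):
        return "marginal"  # linear analysis is inconclusive
    if all(lam < 0.0 for lam in pair):
        return "stable"
    if all(lam > 0.0 for lam in pair):
        return "unstable"
    return "saddle"

# LCDM case: C(R) = 0 at all three points.
lcdm = {name: classify(pair) for name, pair in eigenvalues(0.0).items()}
```

For C(R) = 0 this reproduces the sequence stated above: an unstable radiation point, a saddle matter point, and a stable de Sitter point.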

Let us next consider the model \(f(R) = R - \beta/R^n\) with β > 0 and n > −1. In this case the quantity C(R) is

$$C(R) = 3n{{{R^{1 + n}} - (2 + n)\beta} \over {{R^{1 + n}} + n(2 + n)\beta}}.$$
(9.22)

The constraint equation (9.19) gives

$${\beta \over {{R^{1 + n}}}} = {{2{y_1}} \over {3{y_1} + n({y_1} - {y_2} + 1) - {y_2} + 1}}.$$
(9.23)

The late-time de Sitter point corresponds to \(R^{1+n} = (2 + n)\beta\), which exists for n > −2. Since C(R) = 0 in this case, the de Sitter point Pd is stable with the eigenvalues (λ1, λ2) = (−3, −4). During the radiation and matter domination we have \(\beta/R^{1+n} \ll 1\) (i.e., f(R) ≃ R) and hence C(R) ≃ 3n. Pr corresponds to the radiation point (weff = 1/3) with the eigenvalues (λ1, λ2) = (4 + 3n, 1), whereas Pm corresponds to the matter point (weff = 0) with the eigenvalues (λ1, λ2) = (3 + 3n, −1). Provided that n > −1, Pr and Pm are unstable and saddle points respectively, in which case the sequence of radiation, matter, and de Sitter eras can be realized. For the models \(f(R) = R + \alpha R^m - \beta/R^n\), it was shown in [253] that unified models of inflation and dark energy with radiation and matter eras are difficult to realize.
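The limits quoted for C(R) can be checked numerically. The sketch below evaluates the general definition (9.18) for f(R) = R − β/R^n, compares it with the closed form (9.22), and evaluates the constraint (9.23) at the de Sitter point. The function names and the sample values of n, β and R are illustrative choices, not taken from [253].

```python
def C_general(R, n, beta):
    """C(R) from the second expression of Eq. (9.18), with f = R - beta/R^n."""
    f = R - beta / R**n
    F = 1.0 + n * beta / R**(n + 1)             # F = df/dR
    F_R = -n * (n + 1) * beta / R**(n + 2)      # F_{,R} = d^2 f/dR^2
    return -3.0 * (F * R - 2.0 * f) * F_R * R / ((F * R - f) * (F_R * R - F))


def C_closed(R, n, beta):
    """Closed form of C(R), Eq. (9.22)."""
    x = R**(1 + n)
    return 3.0 * n * (x - (2 + n) * beta) / (x + n * (2 + n) * beta)


def beta_over_R(y1, y2, n):
    """beta / R^(1+n) expressed through y1 and y2, Eq. (9.23)."""
    return 2.0 * y1 / (3.0 * y1 + n * (y1 - y2 + 1.0) - y2 + 1.0)


if __name__ == "__main__":
    n, beta = 0.02, 5.0                          # sample values (assumed)
    R_dS = ((2 + n) * beta) ** (1.0 / (1 + n))   # de Sitter curvature
    print(C_general(10.0, n, beta), C_closed(10.0, n, beta))
    print(C_closed(R_dS, n, beta))               # vanishes at the de Sitter point
    print(C_closed(1.0e8, n, beta), 3 * n)       # approaches 3n when beta/R^(1+n) << 1
    print(beta_over_R(1.0, 0.0, n), 1.0 / (2 + n))
```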

In Figure 8 we plot the evolution of weff as well as y1 and y2 for the model \(f(R) = R - \beta/R^n\) with n = 0.02. This shows that the sequence of (Pr) radiation domination (weff = 1/3), (Pm) matter domination (weff = 0), and (Pd) de Sitter acceleration (weff = −1) is realized. Recall that in metric f(R) gravity the model \(f(R) = R - \beta/R^n\) (β > 0, n > 0) is not viable because f,RR is negative. In Palatini f(R) gravity the sign of f,RR does not matter because there is no propagating degree of freedom with a mass M associated with the second derivative f,RR [554].

Figure 8

The evolution of the variables y1 and y2 for the model \(f(R) = R - \beta/R^n\) with n = 0.02, together with the effective equation of state weff. Initial conditions are chosen to be \(y_1 = 10^{-40}\) and \(y_2 = 1.0 - 10^{-5}\). From [253].

In [21, 253] the dark energy model \(f(R) = R - \beta/R^n\) was constrained by the combined analysis of independent observational data. From the joint analysis of the Supernova Legacy Survey [39], BAO [227] and the CMB shift parameter [561], the constraints on the two parameters n and β are n ∈ [−0.23, 0.42] and β ∈ [2.73, 10.6] at the 95% confidence level (in units of H0 = 1) [253]. Since the allowed values of n are close to 0, the above model is not particularly favored over the ΛCDM model. See also [116, 148, 522, 46, 47] for observational constraints on f(R) dark energy models based on the Palatini formalism.

9.3 Matter perturbations

We have shown that f(R) theory in the Palatini formalism can give rise to the late-time cosmic acceleration preceded by radiation and matter eras. In this section we study the evolution of matter density perturbations to confront Palatini f(R) gravity with the observations of large-scale structure [359, 356, 357, 598, 380, 597]. Let us consider the perturbation δρm of non-relativistic matter with a homogeneous energy density ρm. Koivisto and Kurki-Suonio [359] derived perturbation equations in Palatini f(R) gravity. Using the perturbed metric (6.1) with the same variables as those introduced in Section 6, the perturbation equations are given by

$$\begin{array}{*{20}c} {{\Delta \over {{a^2}}}\psi + \left({H + {{\dot F} \over {2F}}} \right)A + {1 \over {2F}}\left({{{3{{\dot F}^2}} \over {2F}} + 3H\,\dot F} \right)\alpha \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad} \\ {= {1 \over {2F}}\;\left[ {\left({3{H^2} - {{3{{\dot F}^2}} \over {4{F^2}}} - {R \over 2} - {\Delta \over {{a^2}}}} \right)\;\delta F + \left({{{3\dot F} \over {2F}} + 3H} \right)\; \dot{\delta F} - {\kappa ^2}\delta {\rho _m}} \right],} \\ \end{array}$$
(9.24)
$$H\alpha - \dot \psi = {1 \over {2F}}\;\left[ {\dot {\delta F} - \left({H + {{3\dot F} \over {2F}}} \right)\delta F - \dot F\alpha + {\kappa ^2}{\rho _m}v} \right],$$
(9.25)
$$\dot \chi + H\chi - \alpha - \psi = {1 \over F}(\delta F - \dot F\chi),$$
(9.26)
$$\begin{array}{*{20}c} {\dot A + \left({2H + {{\dot F} \over {2F}}} \right)A + \left({3\dot H + {{3\ddot F} \over F} + {{3H\,\dot F} \over {2F}} - {{3{{\dot F}^2}} \over {{F^2}}} + {\Delta \over {{a^2}}}} \right)\alpha + {3 \over 2}{{\dot F} \over F}\dot \alpha \quad \quad \quad \quad \quad \quad \;\;} \\ {= {1 \over {2F}}\left[ {{\kappa ^2}\delta {\rho _m} + \left({6{H^2} + 6\dot H + {{3{{\dot F}^2}} \over {{F^2}}} - R - {\Delta \over {{a^2}}}} \right)\delta F + \left({3H - {{6\dot F} \over F}} \right)\dot {\delta F} + 3\ddot {\delta F}} \right],} \\ \end{array}$$
(9.27)
$$R\delta F - F\delta R = - {\kappa ^2}\delta {\rho _m},$$
(9.28)

where the Ricci scalar R can be understood as R(T).

From Eq. (9.28) the perturbation δF can be expressed by the matter perturbation δρm, as

$$\delta F = {{{F_{,R}}} \over F}{{{\kappa ^2}\delta {\rho _m}} \over {1 - m}},$$
(9.29)

where m = RF,R/F. This equation clearly shows that the perturbation δF is sourced by the matter perturbation only, unlike metric f(R) gravity in which the oscillating mode of δF is present. The matter perturbation δρm and the velocity potential v obey the same equations as given in Eqs. (8.86) and (8.87), which results in Eq. (8.89) in Fourier space.

Let us consider the perturbation equations in Fourier space. We choose the longitudinal gauge (χ = 0) with α = Φ and ψ = Ψ. In this case Eq. (9.26) gives

$$\Psi - \Phi = {{\delta F} \over F}.$$
(9.30)

Under the quasi-static approximation on sub-horizon scales used in Section 8.1, Eqs. (9.24) and (8.89) reduce to

$${{{k^2}} \over {{a^2}}}\Psi \simeq {1 \over {2F}}\left({{{{k^2}} \over {{a^2}}}\delta F - {\kappa ^2}\delta {\rho _m}} \right)\,,$$
(9.31)
$${\ddot \delta _m} + 2H{\dot \delta _m} + {{{k^2}} \over {{a^2}}}\Phi \simeq 0\,.$$
(9.32)

Combining Eq. (9.30) with Eq. (9.31), we obtain

$${{{k^2}} \over {{a^2}}}\Psi = - {{{\kappa ^2}\delta {\rho _m}} \over {2F}}\left({1 - {\zeta \over {1 - m}}} \right)\,,\qquad {{{k^2}} \over {{a^2}}}\Phi = - {{{\kappa ^2}\delta {\rho _m}} \over {2F}}\left({1 + {\zeta \over {1 - m}}} \right)\,,$$
(9.33)

where

$$\zeta \equiv {{{k^2}} \over {{a^2}}}{{{F_{,R}}} \over F} = {{{k^2}} \over {{a^2}R}}m\,.$$
(9.34)

Then the matter perturbation satisfies the following equation [597]

$${\ddot \delta _m} + 2H{\dot \delta _m} - {{{\kappa ^2}{\rho _m}} \over {2F}}\left({1 + {\zeta \over {1 - m}}} \right){\delta _m} \simeq 0\,.$$
(9.35)

The effective gravitational potential defined in Eq. (8.98) obeys

$${\Phi _{{\rm{eff}}}} \simeq - {{{\kappa ^2}{\rho _m}} \over {2F}}{{{a^2}} \over {{k^2}}}{\delta _m}\,.$$
(9.36)

In the above approximation we do not need to worry about the dominance of the oscillating mode of perturbations in the past. Note also that the same approximate equation of δm as Eq. (9.35) can be derived for different gauge choices [597].

The parameter ζ is a crucial quantity to characterize the evolution of perturbations. This quantity can be estimated as \(\zeta \approx (k/aH)^2 m\), which is much larger than m for sub-horizon modes (k ≫ aH). In the regime ζ ≪ 1 the matter perturbation evolves as \(\delta_m \propto t^{2/3}\). Meanwhile the evolution of δm in the regime ζ ≫ 1 is completely different from that in GR. If the transition characterized by ζ = 1 occurs before today, this gives rise to a modification of the matter spectrum compared to the GR case.

In the regime ζ ≫ 1, let us study the evolution of matter perturbations during the matter dominance. We shall consider the case in which the parameter m (with ∣m∣ ≪ 1) evolves as

$$m \propto {t^{2p}}\,,$$
(9.37)

where p is a constant. For the model \(f(R) = R - \mu R_c (R/R_c)^n\) (n < 1) the power p corresponds to p = 1 + n, whereas for the models (4.83) and (4.84) with n > 0 one has p = 1 + 2n. During the matter dominance the parameter ζ evolves as \(\zeta = \pm (t/t_k)^{2p + 2/3}\), where the subscript “k” denotes the value at which the perturbation crosses ζ = ±1. Here + and − signs correspond to the cases m > 0 and m < 0, respectively. Then the matter perturbation equation (9.35) reduces to

$${{{{\rm{d}}^2}{\delta _m}} \over {{\rm{d}}{N^2}}} + {1 \over 2}{{{\rm{d}}{\delta _m}} \over {{\rm{d}}N}} - {3 \over 2}\left[ {1 \pm {e^{(3p + 1)(N - {N_k})}}} \right]{\delta _m} = 0\,.$$
(9.38)

When m > 0, the growing mode solution to Eq. (9.38) is given by

$${\delta _m} \propto \exp \left({{{\sqrt 6 {e^{(3p + 1)(N - {N_k})/2}}} \over {3p + 1}}} \right)\,,\qquad {f_\delta} \equiv {{{{\dot \delta}_m}} \over {H{\delta _m}}} = {{\sqrt 6} \over 2}{e^{(3p + 1)(N - {N_k})/2}}\,.$$
(9.39)

This shows that the perturbations exhibit violent growth for p > −1/3, which is not compatible with observations of large-scale structure. In metric f(R) gravity the growth of matter perturbations is much milder.
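The asymptotic growth rate in Eq. (9.39) can be compared with a direct numerical integration of Eq. (9.38). Writing fδ = d ln δm/dN turns Eq. (9.38) (with the + sign, m > 0) into the first-order Riccati equation dfδ/dN = (3/2)[1 + e^{(3p+1)(N−Nk)}] − fδ² − fδ/2. The sketch below integrates it with a hand-written fourth-order Runge-Kutta step. The starting point N − Nk = −3 with the GR value fδ = 1, the step size, and the function names are assumptions made for illustration.

```python
import math


def riccati_rhs(x, f, p):
    """d f_delta / dN from Eq. (9.38) for m > 0, with x = N - N_k."""
    return 1.5 * (1.0 + math.exp((3 * p + 1) * x)) - f * f - 0.5 * f


def growth_rate(p, x_end, x_start=-3.0, step=1.0e-3):
    """Integrate f_delta from x_start (where zeta << 1 and f_delta = 1) to x_end."""
    n_steps = int(round((x_end - x_start) / step))
    x, f = x_start, 1.0
    for _ in range(n_steps):
        k1 = riccati_rhs(x, f, p)
        k2 = riccati_rhs(x + 0.5 * step, f + 0.5 * step * k1, p)
        k3 = riccati_rhs(x + 0.5 * step, f + 0.5 * step * k2, p)
        k4 = riccati_rhs(x + step, f + step * k3, p)
        f += step * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
        x += step
    return f


def growth_rate_asymptotic(p, x):
    """Leading-order growth rate of Eq. (9.39)."""
    return 0.5 * math.sqrt(6.0) * math.exp((3 * p + 1) * x / 2.0)


if __name__ == "__main__":
    p, x_end = 1.0, 2.0
    print(growth_rate(p, x_end), growth_rate_asymptotic(p, x_end))
```

The numerical value sits slightly below the asymptotic expression because the latter keeps only the leading exponential term. The relative difference shrinks as N − Nk grows.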

When m < 0, the perturbations show a damped oscillation:

$${\delta _m} \propto {e^{- (3p + 2)(N - {N_k})/4}}\,\cos (x + \theta)\,,\qquad {f_\delta} = - {1 \over 4}(3p + 2) - {{3p + 1} \over 2}x\tan (x + \theta)\,,$$
(9.40)

where \(x = \sqrt 6 {e^{(3p + 1)(N - {N_k})/2}}/(3p + 1)\), and θ is a constant. The averaged value of the growth rate fδ is given by \({\bar f_\delta} = - (3p + 2)/4\), but it shows a divergence every time x changes by π. These negative values of fδ are also difficult to reconcile with observations.

The f(R) models can be consistent with observations of large-scale structure if the universe does not enter the regime ∣ζ∣ > 1 by today. This translates into the condition [597]

$$\left\vert {m(z = 0)} \right\vert \underset{\sim}{<} {({a_0}{H_0}/k)^2}\,.$$
(9.41)

Let us consider the wavenumbers \(0.01\,h\,{\rm Mpc}^{-1} \lesssim k \lesssim 0.2\,h\,{\rm Mpc}^{-1}\) that correspond to the linear regime of the matter power spectrum. Since the wavenumber \(k = 0.2\,h\,{\rm Mpc}^{-1}\) corresponds to k ≈ 600a0H0 (where “0” represents present quantities), the condition (9.41) gives the bound \(|m(z = 0)| \lesssim 3 \times 10^{-6}\).
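The conversion behind this bound is short enough to spell out. With H0 = 100 h km s⁻¹ Mpc⁻¹ and the speed of light restored, a0H0 = H0/c in units of h Mpc⁻¹ (taking a0 = 1). The sketch below is an illustrative calculation whose function names are not from [597].

```python
C_KM_PER_S = 299792.458          # speed of light in km/s
H0_OVER_H = 100.0                # H0 / h in km/s/Mpc


def k_in_horizon_units(k_h_per_mpc):
    """Return k / (a0 H0) for a wavenumber given in h/Mpc (a0 = 1 assumed)."""
    return k_h_per_mpc * C_KM_PER_S / H0_OVER_H


def m_bound(k_h_per_mpc):
    """Upper bound on |m(z = 0)| from Eq. (9.41): (a0 H0 / k)^2."""
    return 1.0 / k_in_horizon_units(k_h_per_mpc) ** 2


if __name__ == "__main__":
    print(k_in_horizon_units(0.2))   # close to 600
    print(m_bound(0.2))              # a few times 1e-6
```

Because the bound scales as k⁻², the largest wavenumber still in the linear regime gives the tightest constraint.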

If we use the observational constraint of the growth rate, fδ ≲ 1.5 [418, 605, 211], then the deviation parameter m today is constrained to be \(|m(z = 0)| \lesssim 10^{-5}\)–\(10^{-4}\) for the model \(f(R) = R - \lambda R_c (R/R_c)^n\) (n < 1) as well as for the models (4.83) and (4.84) [597]. Recall that, in metric f(R) gravity, the deviation parameter can grow to the order of 0.1 by today. Meanwhile f(R) dark energy models based on the Palatini formalism are hardly distinguishable from the ΛCDM model [356, 386, 385, 597]. Note that the bound on m(z = 0) becomes even more severe when perturbations in the non-linear regime are considered. The above peculiar evolution of matter perturbations is associated with the fact that the coupling between non-relativistic matter and a scalar-field degree of freedom is very strong (as we will see in Section 10.1).

The above results are based on the fact that dark matter is described by a cold and perfect fluid with no pressure. In [358] it was suggested that the tight bound on the parameter m can be relaxed by considering imperfect dark matter with a shear stress. Although the approach taken in [358] did not aim to explain the origin of a dark matter stress Π that cancels the k-dependent term in Eq. (9.35), it will be of interest to further study whether some theoretically motivated choice of Π really allows the possibility that Palatini f(R) dark energy models can be distinguished from the ΛCDM model.

9.4 Shortcomings of Palatini f(R) gravity

In addition to the fact that Palatini f(R) dark energy models are hardly distinguishable from the ΛCDM model in observations of large-scale structure, there are a number of problems in Palatini f(R) gravity associated with the non-dynamical nature of the scalar-field degree of freedom.

The dark energy model \(f = R - \mu^4/R\) based on the Palatini formalism was shown to be in conflict with the Standard Model of particle physics [261, 262, 260, 318, 55] because of large non-perturbative corrections to the matter Lagrangian [here R denotes R(T)]. Let us consider this issue for a more general model \(f = R - \mu^{2(n+1)}/R^n\). From the definition of φ in Eq. (9.6) the field potential U(φ) is given by

$$U(\varphi) = {{n + 1} \over {2{n^{n/(n + 1)}}}}{{{\mu ^2}} \over {{\kappa ^2}}}{(\varphi - 1)^{n/(n + 1)}}\,,$$
(9.42)

where \(\varphi = 1 + n\mu^{2(n+1)}R^{-n-1}\). Using Eq. (9.7) for the vacuum (T = 0), we obtain the solution

$$\varphi (T = 0) = {{2(n + 1)} \over {n + 2}}\,.$$
(9.43)

In the presence of matter we expand the field φ as φ = φ(T = 0) + δφ. Substituting this into Eq. (9.7), we obtain

$$\delta \varphi \simeq {n \over {{{(n + 2)}^{{{n + 2} \over {n + 1}}}}}}{{{\kappa ^2}T} \over {{\mu ^2}}}\,.$$
(9.44)

For \(n = \mathcal O(1)\) we have \(\delta \varphi \approx {\kappa ^2}T/{\mu ^2} = T/({\mu ^2}M_{{\rm{pl}}}^2)\) with φ(T = 0) ≈ 1. Let us consider a matter action of a Higgs scalar field ϕ with mass mϕ:

$${S_M} = \int {{{\rm{d}}^4}} x\sqrt {- g} \left[ {- {1 \over 2}{g^{\mu \nu}}{\partial _\mu}\phi {\partial _\nu}\phi - {1 \over 2}m_\phi ^2{\phi ^2}} \right]\,.$$
(9.45)

Since \(T \approx m_\phi ^2\delta {\phi ^2}\) it follows that \(\delta \varphi \approx m_\phi ^2\delta {\phi ^2}/({\mu ^2}M_{{\rm{pl}}}^2)\). Perturbing the Jordan-frame action (9.8) [which is equivalent to the action in Palatini f(R) gravity] to second-order and using the solution \(\varphi \approx 1 + m_\phi ^2\delta {\phi ^2}/({\mu ^2}M_{{\rm{pl}}}^2)\), we find that the effective action of the Higgs field ϕ for an energy scale E much lower than mϕ (= 100–1000 GeV) is given by [55]

$$\delta {S_M} \simeq \int {{{\rm{d}}^4}} x\sqrt {- g} \left[ {- {1 \over 2}{g^{\mu \nu}}{\partial _\mu}\delta \phi {\partial _\nu}\delta \phi - {1 \over 2}m_\phi ^2\delta {\phi ^2}} \right]\left({1 + {{m_\phi ^2\delta {\phi ^2}} \over {{\mu ^2}M_{{\rm{pl}}}^2}} + \cdots} \right)\,.$$
(9.46)

Since δϕ ≈ mϕ for E ≪ mϕ, the correction term can be estimated as

$$\delta \varphi \approx {{m_\phi ^2\delta {\phi ^2}} \over {{\mu ^2}M_{{\rm{pl}}}^2}} \approx {\left({{{{m_\phi}} \over \mu}} \right)^2}{\left({{{{m_\phi}} \over {{M_{{\rm{pl}}}}}}} \right)^2}\,.$$
(9.47)

In order to give rise to the late-time acceleration we require that \(\mu \approx H_0 \approx 10^{-42}\) GeV. For the Higgs mass mϕ = 100 GeV it follows that \(\delta\varphi \approx 10^{56} \gg 1\). This correction is too large to be compatible with the Standard Model of particle physics.
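The size of this correction follows from plain arithmetic on Eq. (9.47). The sketch below uses the values quoted in the text for μ and mϕ. The reduced Planck mass of about 2.4 × 10¹⁸ GeV is an assumed input, since the text does not quote a number for it, so the result should be read only as an order-of-magnitude estimate.

```python
import math

M_PHI_GEV = 100.0        # Higgs mass used in the text
MU_GEV = 1.0e-42         # mu ~ H0, as quoted in the text
M_PL_GEV = 2.4e18        # reduced Planck mass (assumed value)


def delta_varphi(m_phi, mu, m_pl):
    """Correction term of Eq. (9.47): (m_phi/mu)^2 (m_phi/M_pl)^2."""
    return (m_phi / mu) ** 2 * (m_phi / m_pl) ** 2


if __name__ == "__main__":
    value = delta_varphi(M_PHI_GEV, MU_GEV, M_PL_GEV)
    print(f"delta varphi ~ 10^{math.log10(value):.1f}")
```

With these inputs the estimate comes out a little above 10⁵⁵, within an order of magnitude of the figure quoted above, and in any case vastly larger than unity.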

The above result is based on the models \(f(R) = R - \mu^{2(n+1)}/R^n\) with \(n = \mathcal O(1)\). Having a look at Eq. (9.44), the only way to make the perturbation δφ small is to choose n very close to 0. This means that the deviation from the ΛCDM model is extremely small (see [388] for a related work). In fact, this property was already found by the analysis of matter density perturbations in Section 9.3. While the above analysis is based on the calculation in the Jordan frame in which test particles follow geodesics [55], the same result was also obtained by the analysis in the Einstein frame [261, 262, 260, 318].

Another unusual property of Palatini f(R) gravity is that a singularity with the divergent Ricci scalar can appear at the surface of a static spherically symmetric star with a polytropic equation of state \(P = c\rho _0^\Gamma\) with 3/2 < Γ < 2 (where P is the pressure and ρ0 is the rest-mass density) [56, 55] (see also [107, 331]). Again this problem is intimately related to the particular algebraic dependence (9.2) in Palatini f(R) gravity. In [56] it was claimed that the appearance of the singularity does not depend strongly on the functional form of f(R) and that the result is not specific to the choice of the polytropic equation of state.

The Palatini gravity has a close relation with an effective action which reproduces the dynamics of loop quantum cosmology [477]. [474] showed that the model f(R) = R + R2/(6M2), where M is of the order of the Planck mass, is not plagued by a singularity problem mentioned above, while the singularity typically arises for the f(R) models constructed to explain the late-time cosmic acceleration (see also [504] for a related work). Since Planck-scale corrected Palatini f(R) models may cure the singularity problem, it will be of interest to understand the connection with quantum gravity around the cosmological singularity (or the black hole singularity). In fact, it was shown in [60] that non-singular bouncing solutions can be obtained for power-law f(R) Lagrangians with a finite number of terms.

Finally we note that the extension of Palatini f(R) gravity to more general theories including Ricci and Riemann tensors was carried out in [384, 387, 95, 236, 388, 509, 476]. While such theories are more involved than Palatini f(R) gravity, it may be possible to construct viable modified gravity models of inflation or dark energy.

10 Extension to Brans-Dicke Theory

So far we have discussed f(R) gravity theories in the metric and Palatini formalisms. In this section we will see that these theories are equivalent to Brans-Dicke (BD) theory [100] in the presence of a scalar-field potential, by comparing field equations in f(R) theories with those in BD theory. It is possible to construct viable dark energy models based on BD theory with a constant parameter ωBD. We will discuss cosmological dynamics, local gravity constraints, and observational signatures of such generalized theory.

10.1 Brans-Dicke theory and the equivalence with f(R) theories

Let us start with the following 4-dimensional action in BD theory

$$S = \int {{{\rm{d}}^4}} x\sqrt {- g} \left[ {{1 \over 2}\varphi R - {{{\omega _{{\rm{BD}}}}} \over {2\varphi}}{{(\nabla \varphi)}^2} - U(\varphi)} \right] + {S_M}({g_{\mu \nu}},{\Psi _M}),$$
(10.1)

where ωBD is the BD parameter, U(φ) is a potential of the scalar field φ, and SM is a matter action that depends on the metric gμν and matter fields ΨM. In this section we use the unit \({\kappa ^2} = 8\pi G = 1/M_{{\rm{pl}}}^2 = 1\), but we restore the gravitational constant G and the reduced Planck mass Mpl where doing so makes the discussion more transparent. The original BD theory [100] does not possess the field potential U(φ).

Taking the variation of the action (10.1) with respect to gμν and φ, we obtain the following field equations

$$\begin{array}{*{20}c} {{R_{\mu \nu}}(g) - {1 \over 2}{g_{\mu \nu}}R(g) = {1 \over \varphi}{T_{\mu \nu}} - {1 \over \varphi}{g_{\mu \nu}}U(\varphi) + {1 \over \varphi}({\nabla _\mu}{\nabla _\nu}\varphi - {g_{\mu \nu}}\square \varphi)\quad \quad \quad \quad \;} \\ {+ {{{\omega _{{\rm{BD}}}}} \over {{\varphi ^2}}}\left[ {{\partial _\mu}\varphi {\partial _\nu}\varphi - {1 \over 2}{g_{\mu \nu}}{{(\nabla \varphi)}^2}} \right],} \\ \end{array}$$
(10.2)
$$(3 + 2{\omega _{{\rm{BD}}}})\square \varphi + 4U(\varphi) - 2\varphi {U_{,\varphi}} = T,$$
(10.3)

where R(g) is the Ricci scalar in metric f(R) gravity, and Tμν is the energy-momentum tensor of matter. In order to find the relation with f(R) theories in the metric and Palatini formalisms, we consider the following correspondence

$$\varphi = F(R),\qquad U(\varphi) = {{RF - f} \over 2}.$$
(10.4)

Recall that this potential (which is of gravitational origin) already appeared in Eq. (2.28). We then find that Eqs. (2.4) and (2.7) in metric f(R) gravity are equivalent to Eqs. (10.2) and (10.3) with the BD parameter ωBD = 0. Hence f(R) theory in the metric formalism corresponds to BD theory with ωBD = 0 [467, 579, 152, 246, 112]. In fact we already showed this by rewriting the action (2.1) in the form (2.21). We also notice that Eqs. (9.4) and (9.2) in Palatini f(R) gravity are equivalent to Eqs. (10.2) and (10.3) with the BD parameter ωBD = −3/2. Then f(R) theory in the Palatini formalism corresponds to BD theory with ωBD = −3/2 [262, 470, 551]. Recall that we also showed this by rewriting the action (2.1) in the form (9.8).

One can consider more general theories called scalar-tensor theories [268] in which the Ricci scalar R is coupled to a scalar field φ. The general 4-dimensional action for scalar-tensor theories can be written as

$$S = \int {{{\rm{d}}^4}} x\sqrt {- g} \;\left[ {{1 \over 2}F(\varphi)R - {1 \over 2}\omega (\varphi){{(\nabla \varphi)}^2} - U(\varphi)} \right] + {S_M}({g_{\mu \nu}},{\Psi _M}),$$
(10.5)

where F(φ) and U(φ) are functions of φ. Under the conformal transformation \({\tilde g_{\mu \nu}} = F{g_{\mu \nu}}\), we obtain the action in the Einstein frame [408, 611]

$${S_E} = \int {{{\rm{d}}^4}} x\sqrt {- \tilde g} \;\left[ {{1 \over 2}\tilde R - {1 \over 2}{{(\tilde \nabla \phi)}^2} - V(\phi)} \right] + {S_M}({F^{- 1}}{\tilde g_{\mu \nu}},{\Psi _M}),$$
(10.6)

where V = U/F2. We have introduced a new scalar field ϕ to make the kinetic term canonical:

$$\phi \equiv \int {\rm{d}} \varphi \,\sqrt {{3 \over 2}{{\left({{{{F_{,\varphi}}} \over F}} \right)}^2} + {\omega \over F}}.$$
(10.7)

We define a quantity Q that characterizes the coupling between the field ϕ and non-relativistic matter in the Einstein frame:

$$Q \equiv - {{{F_{,\phi}}} \over {2F}} = - {{{F_{,\varphi}}} \over {2F}}\;{\left[ {{3 \over 2}\;{{\left({{{{F_{,\varphi}}} \over F}} \right)}^2} + {\omega \over F}} \right]^{- 1/2}}.$$
(10.8)

Recall that, in metric f(R) gravity, we introduced the same quantity Q in Eq. (2.40), which is constant \((Q = -1/\sqrt 6)\). For theories with Q = constant, we obtain the following relations from Eqs. (10.7) and (10.8):

$$F = {e^{- 2Q\phi}},\qquad \omega = (1 - 6{Q^2})F\;{\left({{{{\rm{d}}\phi} \over {{\rm{d}}\varphi}}} \right)^2}.$$
(10.9)

In this case the action (10.5) in the Jordan frame reduces to [596]

$$S = \int {{{\rm{d}}^4}} x\sqrt {- g} \;\left[ {{1 \over 2}F(\phi)R - {1 \over 2}(1 - 6{Q^2})F(\phi){{(\nabla \phi)}^2} - U(\phi)} \right] + {S_M}({g_{\mu \nu}},{\Psi _M}),\quad {\rm{with}}\quad F(\phi) = {e^{- 2Q\phi}}.$$
(10.10)

In the limit that Q → 0 we have F(ϕ) → 1, so that Eq. (10.10) recovers the action of a minimally coupled scalar field in GR.

Let us compare the action (10.10) with the action (10.1) in BD theory. Setting \(\varphi = F = e^{-2Q\phi}\), the former is equivalent to the latter if the parameter ωBD is related to Q via the relation [343, 596]

$$3 + 2{\omega _{{\rm{BD}}}} = {1 \over {2{Q^2}}}.$$
(10.11)

This shows that the GR limit (ωBD → ∞) corresponds to the vanishing coupling (Q → 0). Since \(Q = -1/\sqrt 6\) in metric f(R) gravity one has ωBD = 0, as expected. The Palatini f(R) gravity corresponds to ωBD = −3/2, which corresponds to the infinite coupling (\(Q^2 \to \infty\)). In fact, Palatini gravity can be regarded as an isolated “fixed point” of a transformation involving a special conformal rescaling of the metric [247]. In the Einstein frame of the Palatini formalism, the scalar field ϕ does not have a kinetic term and it can be integrated out. In general, this leads to a matter action which is non-linear, depending on the potential U(ϕ). This large coupling poses a number of problems such as the strong amplification of matter density perturbations and the conflict with the Standard Model of particle physics, as we have discussed in Section 9.
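The correspondence (10.11) between the BD parameter and the coupling can be made concrete with a few lines of Python. This is a minimal sketch whose function names are illustrative only.

```python
import math


def omega_bd_from_Q(Q):
    """Brans-Dicke parameter from the coupling Q, Eq. (10.11): 3 + 2 omega_BD = 1/(2 Q^2)."""
    return 0.5 * (1.0 / (2.0 * Q * Q) - 3.0)


def Q_squared_from_omega_bd(omega_bd):
    """Inverse relation; diverges as omega_BD -> -3/2 (Palatini limit)."""
    return 1.0 / (2.0 * (3.0 + 2.0 * omega_bd))


if __name__ == "__main__":
    print(omega_bd_from_Q(-1.0 / math.sqrt(6.0)))   # metric f(R): omega_BD = 0 up to rounding
    print(omega_bd_from_Q(1.0e-3))                  # weak coupling: large omega_BD
    print(omega_bd_from_Q(1.0e6))                   # strong coupling: approaches -3/2
```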

Note that BD theory is one of the examples in scalar-tensor theories and there are some theories that give rise to non-constant values of Q. For example, the action of a nonminimally coupled scalar field with a coupling ξ corresponds to \(F(\varphi) = 1 - \xi\varphi^2\) and ω(φ) = 1, which gives the field-dependent coupling \(Q(\varphi) = \xi\varphi/[1 - \xi\varphi^2(1 - 6\xi)]^{1/2}\). In fact the dynamics of dark energy in such a theory has been studied by a number of authors [22, 601, 151, 68, 491, 44, 505]. In the following we shall focus on the constant coupling models with the action (10.10). We stress that this is equivalent to the action (10.1) in BD theory.
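The field-dependent coupling quoted for the nonminimally coupled field can be cross-checked against the definition Q ≡ −F,ϕ/(2F), using the chain rule F,ϕ = F,φ/(dϕ/dφ) with dϕ/dφ read off from Eq. (10.7). The sketch below performs this check numerically. The function names and the sample values of ξ and φ are arbitrary illustrative choices.

```python
import math


def Q_from_definition(F, F_varphi, omega):
    """Q = -F_{,phi}/(2F), with d(phi)/d(varphi) taken from Eq. (10.7)."""
    dphi_dvarphi = math.sqrt(1.5 * (F_varphi / F) ** 2 + omega / F)
    return -F_varphi / (2.0 * F * dphi_dvarphi)


def Q_nonminimal(xi, varphi):
    """Closed form for F = 1 - xi varphi^2 and omega = 1."""
    return xi * varphi / math.sqrt(1.0 - xi * varphi ** 2 * (1.0 - 6.0 * xi))


if __name__ == "__main__":
    xi, varphi = 0.1, 0.5                       # sample values (assumed)
    F = 1.0 - xi * varphi ** 2
    F_varphi = -2.0 * xi * varphi
    print(Q_from_definition(F, F_varphi, 1.0), Q_nonminimal(xi, varphi))
    # Metric f(R) gravity: F = varphi, omega = 0 gives Q = -1/sqrt(6)
    print(Q_from_definition(2.0, 1.0, 0.0), -1.0 / math.sqrt(6.0))
```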

10.2 Cosmological dynamics of dark energy models based on Brans-Dicke theory

The first attempt to apply BD theory to cosmic acceleration is the extended inflation scenario in which the BD field φ is identified as an inflaton field [374, 571]. The first version of the inflation model, which considered a first-order phase transition in BD theory, resulted in failure due to the graceful exit problem [375, 613, 65]. This triggered further study of the possibility of realizing inflation in the presence of another scalar field [394, 78]. In general the dynamics of such a multi-field system is more involved than that in the single-field case [71]. The resulting power spectrum of density perturbations generated during multi-field inflation in BD theory was studied by a number of authors [570, 272, 156, 569].

In the context of dark energy it is possible to construct viable single-field models based on BD theory. In what follows we discuss cosmological dynamics of dark energy models based on the action (10.10) in the flat FLRW background given by (2.12) (see, e.g., [596, 22, 85, 289, 5, 327, 139, 168] for dynamical analysis in scalar-tensor theories). Our interest is to find conditions under which a sequence of radiation, matter, and accelerated epochs can be realized. This depends upon the form of the field potential U(ϕ). We first carry out general analysis without specifying the forms of the potential. We take into account non-relativistic matter with energy density ρm and radiation with energy density ρr. The Jordan frame is regarded as a physical frame due to the usual conservation of non-relativistic matter (\(\rho_m \propto a^{-3}\)). Varying the action (10.10) with respect to gμν and ϕ, we obtain the following equations

$$3F\,{H^2} = (1 - 6{Q^2})F{\dot \phi ^2}/2 + U - 3H\,\dot F + {\rho _m} + {\rho _r},$$
(10.12)
$$2F\,\dot H = - (1 - 6{Q^2})F{\dot \phi ^2} - \ddot F + H\,\dot F - {\rho _m} - 4{\rho _r}/3,$$
(10.13)
$$(1 - 6{Q^2})\;F\;\left[ {\ddot \phi + 3H\,\dot \phi + \dot F/(2F)\dot \phi} \right] + {U_{,\phi}} + Q\,F\,R = 0,$$
(10.14)

where \(F = e^{-2Q\phi}\).

We introduce the following dimensionless variables

$${x_1} \equiv {{\dot \phi} \over {\sqrt 6 H}},\qquad {x_2} \equiv {1 \over H}\sqrt {{U \over {3F}}}, \qquad {x_3} \equiv {1 \over H}\sqrt {{{{\rho _r}} \over {3F}}},$$
(10.15)

and also the density parameters

$${\Omega _m} \equiv {{{\rho _m}} \over {3F\,{H^2}}},\qquad {\Omega _r} \equiv x_3^2,\qquad {\Omega _{{\rm{DE}}}} \equiv (1 - 6{Q^2})x_1^2 + x_2^2 + 2\sqrt 6 Q{x_1}.$$
(10.16)

These satisfy the relation Ωm + Ωr + ΩDE = 1 from Eq. (10.12). From Eq. (10.13) it follows that

$${{\dot H} \over {{H^2}}} = - {{1 - 6{Q^2}} \over 2}\;\left({3 + 3x_1^2 - 3x_2^2 + x_3^2 - 6{Q^2}x_1^2 + 2\sqrt 6 Q{x_1}} \right) + 3Q(\lambda x_2^2 - 4Q).$$
(10.17)

Taking the derivatives of x1, x2 and x3 with respect to N = ln a, we find

$$\begin{array}{*{20}c} {{{{\rm{d}}{x_1}} \over {{\rm{d}}N}} = {{\sqrt 6} \over 2}(\lambda x_2^2 - \sqrt 6 {x_1})\quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \;} \\ {+ {{\sqrt 6 Q} \over 2}\;\left[ {(5 - 6{Q^2})x_1^2 + 2\sqrt 6 Q{x_1} - 3x_2^2 + x_3^2 - 1} \right] - {x_1}{{\dot H} \over {{H^2}}},} \\ \end{array}$$
(10.18)
$${{{\rm{d}}{x_2}} \over {{\rm{d}}N}} = {{\sqrt 6} \over 2}(2Q - \lambda){x_1}{x_2} - {x_2}{{\dot H} \over {{H^2}}},$$
(10.19)
$${{{\rm{d}}{x_3}} \over {{\rm{d}}N}} = \sqrt 6 Q{x_1}{x_3} - 2{x_3} - {x_3}{{\dot H} \over {{H^2}}},$$
(10.20)

where λ ≡ − U,ϕ/U.
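The autonomous system (10.17)–(10.20) translates directly into code. The following sketch implements the right-hand sides and checks two fixed points for constant λ: the radiation point (x1, x2, x3) = (0, 0, 1), and the point (0, 1, 0) with λ = 4Q, which is taken here as the de Sitter point (an assumption, verified by the vanishing derivatives rather than read from a table). Function names and the sample value of Q are illustrative.

```python
import math

SQRT6 = math.sqrt(6.0)


def hdot_over_h2(x1, x2, x3, Q, lam):
    """Right-hand side of Eq. (10.17)."""
    bracket = (3.0 + 3.0 * x1**2 - 3.0 * x2**2 + x3**2
               - 6.0 * Q**2 * x1**2 + 2.0 * SQRT6 * Q * x1)
    return -0.5 * (1.0 - 6.0 * Q**2) * bracket + 3.0 * Q * (lam * x2**2 - 4.0 * Q)


def rhs(x1, x2, x3, Q, lam):
    """(dx1/dN, dx2/dN, dx3/dN) from Eqs. (10.18)-(10.20)."""
    h = hdot_over_h2(x1, x2, x3, Q, lam)
    dx1 = (0.5 * SQRT6 * (lam * x2**2 - SQRT6 * x1)
           + 0.5 * SQRT6 * Q * ((5.0 - 6.0 * Q**2) * x1**2 + 2.0 * SQRT6 * Q * x1
                                - 3.0 * x2**2 + x3**2 - 1.0)
           - x1 * h)
    dx2 = 0.5 * SQRT6 * (2.0 * Q - lam) * x1 * x2 - x2 * h
    dx3 = SQRT6 * Q * x1 * x3 - 2.0 * x3 - x3 * h
    return dx1, dx2, dx3


def w_eff(x1, x2, x3, Q, lam):
    """Effective equation of state, w_eff = -1 - 2 Hdot / (3 H^2)."""
    return -1.0 - 2.0 * hdot_over_h2(x1, x2, x3, Q, lam) / 3.0


if __name__ == "__main__":
    Q = 0.1                                   # sample coupling (assumed)
    print(rhs(0.0, 0.0, 1.0, Q, 1.0), w_eff(0.0, 0.0, 1.0, Q, 1.0))          # radiation point
    print(rhs(0.0, 1.0, 0.0, Q, 4.0 * Q), w_eff(0.0, 1.0, 0.0, Q, 4.0 * Q))  # de Sitter point
```

Feeding `rhs` to any ODE integrator with a chosen λ(ϕ) would give the full background evolution. Here only the algebraic fixed-point conditions are checked.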

If λ is a constant, i.e., for the exponential potential \(U = U_0 e^{-\lambda\phi}\), one can derive fixed points for Eqs. (10.18)–(10.20) by setting dxi/dN = 0 (i = 1, 2, 3). In Table 1 we list the fixed points of the system in the absence of radiation (x3 = 0). Note that the radiation point corresponds to (x1, x2, x3) = (0, 0, 1). The point (a) is the ϕ-matter-dominated epoch (ϕMDE) during which the density of non-relativistic matter is a non-zero constant. Provided that \(Q^2 \ll 1\) this can be used for the matter-dominated epoch. The kinetic points (b1) and (b2) are responsible neither for the matter era nor for the accelerated epoch (for ∣Q∣ ≲ 1). The point (c) is the scalar-field dominated solution, which can be used for the late-time acceleration for weff < −1/3. When \(Q^2 \ll 1\) this point yields the cosmic acceleration for \(- \sqrt 2 + 4Q < \lambda < \sqrt 2 + 4Q\). The scaling solution (d) can be responsible for the matter era for ∣Q∣ ≪ ∣λ∣, but in this case the condition weff < −1/3 for the point (c) leads to \(\lambda^2 \lesssim 2\). Then the energy fraction of the pressureless matter for the point (d) does not satisfy the condition Ωm ≃ 1. The point (e) gives rise to the de Sitter expansion, which exists for the special case with λ = 4Q [which can also be regarded as the special case of the point (c)]. From the above discussion the viable cosmological trajectory for constant λ is the sequence from the point (a) to the scalar-field dominated point (c) under the conditions \(Q^2 \ll 1\) and \(- \sqrt 2 + 4Q < \lambda < \sqrt 2 + 4Q\). The analysis based on the Einstein frame action (10.6) also gives rise to the ϕMDE followed by the scalar-field dominated solution [23, 22].

Table 1 The critical points of dark energy models based on the action (10.10) in BD theory with constant λ = −U,ϕ/U in the absence of radiation (x3 = 0). The effective equation of state \({w_{{\rm{eff}}}} = - 1 - 2\dot H/(3{H^2})\) is known from Eq. (10.17).

Let us consider the case of non-constant λ. The fixed points derived above may be regarded as “instantaneous” points [195, 454] varying with a time-scale smaller than \(H^{-1}\). As in metric f(R) gravity \((Q = -1/\sqrt 6)\) we are interested in large coupling models with ∣Q∣ of the order of unity. In order for the potential U(ϕ) to satisfy local gravity constraints, the field needs to be heavy in the region \(R \gg {R_0} \sim H_0^2\) such that ∣λ∣ ≫ 1. Then it is possible to realize the matter era by the point (d) with ∣Q∣ ≪ ∣λ∣. Moreover the solutions can finally approach the de Sitter solution (e) with λ = 4Q or the field-dominated solution (c). The stability of the point (e) was analyzed in [596, 250, 242] by considering linear perturbations δx1, δx2 and δF. One can easily show that the point (e) is stable for

$$Q{{{\rm{d}}\lambda} \over {{\rm{d}}F}}({F_1}) > 0\quad \rightarrow \quad {{{\rm{d}}\lambda} \over {{\rm{d}}\phi}}({\phi _1}) < 0,$$
(10.21)

where \(F_1 = e^{-2Q\phi_1}\) with ϕ1 being the field value at the de Sitter point. In metric f(R) gravity \((Q = -1/\sqrt 6)\) this condition is equivalent to m = Rf,RR/f,R < 1.

For the f(R) model (5.19) the field ϕ is related to the Ricci scalar R via the relation \({e^{2\phi/\sqrt 6}} = 1 - 2n\mu {(R/{R_c})^{- (2n + 1)}}\). Then the potential U = (FR − f)/2 in the Jordan frame can be expressed as

$$U(\phi) = {{\mu {R_c}} \over 2}\;\left[ {1 - {{2n + 1} \over {{{(2n\mu)}^{2n/(2n + 1)}}}}\;{{\left({1 - {e^{2\phi/\sqrt 6}}} \right)}^{2n/(2n + 1)}}} \right].$$
(10.22)

For theories with general couplings Q we consider the following potential [596]

$$U(\phi) = {U_0}\;\left[ {1 - C{{(1 - {e^{- 2Q\phi}})}^p}} \right]\qquad ({U_0} > 0,\;C > 0,\;0 < p < 1),$$
(10.23)

which includes the potential (10.22) in f(R) gravity as a specific case with the correspondence U0 = μRc/2, \(C = (2n + 1)/(2n\mu)^{2n/(2n+1)}\), \(Q = - 1/\sqrt 6\), and p = 2n/(2n + 1). The potential behaves as U(ϕ) → U0 for ϕ → 0 and U(ϕ) → U0(1 − C) in the limits ϕ → ∞ (for Q > 0) and ϕ → −∞ (for Q < 0). This potential has a curvature singularity at ϕ = 0 as in the models (4.83) and (4.84) of f(R) gravity, but the appearance of the singularity can be avoided by extending the potential to the regions ϕ > 0 (Q < 0) or ϕ < 0 (Q > 0) with a field mass bounded from above. The slope λ = −U,ϕ/U is given by

$$\lambda = {{2Cp\,Q{e^{- 2Q\phi}}{{(1 - {e^{- 2Q\phi}})}^{p - 1}}} \over {1 - C{{(1 - {e^{- 2Q\phi}})}^p}}}.$$
(10.24)

During the radiation and deep matter eras one has \(R = 6(2H^2 + \dot H) \simeq \rho_m/F\) from Eqs. (10.12)–(10.13) by noting that U0 is negligibly small relative to the background fluid density. From Eq. (10.14) the field is nearly frozen at a value satisfying the condition \(U_{,\phi} + Q\rho_m \simeq 0\). Then the field ϕ evolves along the instantaneous minima given by

$${\phi _m} \simeq {1 \over {2Q}}\;{\left({{{2{U_0}pC} \over {{\rho _m}}}} \right)^{1/(1 - p)}}.$$
(10.25)

As long as ρm ≫ 2U0pC we have that ∣ϕm∣ ≪ 1. In this regime the slope λ in Eq. (10.24) is much larger than 1. The field value ∣ϕm∣ increases for decreasing ρm and hence the slope λ decreases with time.
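The approximate minimum (10.25) and the resulting large slope can be checked numerically for the potential (10.23). In the sketch below the condition being tested is U,ϕ + Qρm ≃ 0, obtained from Eq. (10.14) for a nearly frozen field with R ≃ ρm/F. The parameter values (U0 = 1, C = 0.9, p = 0.5, Q = 0.5, ρm = 10³) are arbitrary illustrative choices and the function names are not from [596].

```python
import math


def phi_min_approx(rho_m, U0, C, p, Q):
    """Instantaneous minimum of Eq. (10.25), valid for rho_m >> 2 U0 p C."""
    return (2.0 * U0 * p * C / rho_m) ** (1.0 / (1.0 - p)) / (2.0 * Q)


def potential(phi, U0, C, p, Q):
    """Potential of Eq. (10.23)."""
    one_minus_F = -math.expm1(-2.0 * Q * phi)      # 1 - exp(-2 Q phi), accurate for small phi
    return U0 * (1.0 - C * one_minus_F ** p)


def dU_dphi(phi, U0, C, p, Q):
    """Derivative of the potential (10.23) with respect to phi."""
    one_minus_F = -math.expm1(-2.0 * Q * phi)
    F = 1.0 - one_minus_F
    return -2.0 * Q * U0 * C * p * F * one_minus_F ** (p - 1.0)


def slope(phi, U0, C, p, Q):
    """lambda = -U_{,phi}/U, equivalent to Eq. (10.24)."""
    return -dU_dphi(phi, U0, C, p, Q) / potential(phi, U0, C, p, Q)


if __name__ == "__main__":
    U0, C, p, Q, rho_m = 1.0, 0.9, 0.5, 0.5, 1.0e3
    phi_m = phi_min_approx(rho_m, U0, C, p, Q)
    residual = dU_dphi(phi_m, U0, C, p, Q) + Q * rho_m
    print(phi_m, residual / (Q * rho_m), slope(phi_m, U0, C, p, Q))
```

For these inputs the field sits very close to ϕ = 0, the relative residual of the minimum condition is tiny, and the slope is of order Qρm/U0, far above unity.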

Since λ ≫ 1 around ϕ = 0, the instantaneous fixed point (d) can be responsible for the matter-dominated epoch provided that ∣Q∣ ≪ λ. The variable \(F = e^{-2Q\phi}\) decreases in time irrespective of the sign of the coupling Q and hence 0 < F < 1. The de Sitter point is characterized by λ = 4Q, i.e.,

$$C = {{2{{(1 - {F_1})}^{1 - p}}} \over {2 + (p - 2){F_1}}}.$$
(10.26)

The de Sitter solution is present as long as the solution of this equation exists in the region 0 < F1 < 1. From Eq. (10.24) the derivative of λ in terms of ϕ is given by

$${{{\rm{d}}\lambda} \over {{\rm{d}}\phi}} = - {{4Cp{Q^2}F{{(1 - F)}^{p - 2}}[1 - pF - C{{(1 - F)}^p}]} \over {{{[1 - C{{(1 - F)}^p}]}^2}}}.$$
(10.27)

When 0 < C < 1, we can show that the function \(g(F) \equiv 1 - pF - C{(1 - F)^p}\) is positive and hence the condition dλ/dϕ < 0 is satisfied. This means that the de Sitter point (e) is a stable attractor. When C > 1, the function g(F) can be negative. Plugging Eq. (10.26) into Eq. (10.27), we find that the de Sitter point is stable for

$${F_1} > {1 \over {2 - p}}.$$
(10.28)

If this condition is violated, the solutions choose another stable fixed point [such as the point (c)] as an attractor.
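The de Sitter condition and its stability criterion can be checked numerically. The sketch below (helper names are hypothetical, parameter values are arbitrary) evaluates the slope (10.24) in terms of F = e^{−2Qϕ}, confirms that the value of C in Eq. (10.26) gives λ = 4Q at F = F1, and compares the sign of g(F1) = 1 − pF1 − C(1 − F1)^p with the criterion (10.28).

```python
def slope(F, Q, p, C):
    """Slope lambda = -U_phi/U of Eq. (10.24), written with F = exp(-2*Q*phi)."""
    return 2.0 * C * p * Q * F * (1.0 - F) ** (p - 1.0) / (1.0 - C * (1.0 - F) ** p)

def C_de_sitter(F1, p):
    """Eq. (10.26): value of C for which the de Sitter point sits at F = F1."""
    return 2.0 * (1.0 - F1) ** (1.0 - p) / (2.0 + (p - 2.0) * F1)

def g(F, p, C):
    """g(F) = 1 - p*F - C*(1 - F)**p; by Eq. (10.27), g > 0 gives dlambda/dphi < 0."""
    return 1.0 - p * F - C * (1.0 - F) ** p

def is_stable(F1, p):
    """Stability criterion (10.28) for the de Sitter point."""
    return F1 > 1.0 / (2.0 - p)

if __name__ == "__main__":
    Q, p = 0.7, 0.8  # arbitrary illustrative values
    for F1 in (0.7, 0.9):
        C = C_de_sitter(F1, p)
        print(F1, C, slope(F1, Q, p, C) / Q, g(F1, p, C), is_stable(F1, p))
```

For these sample values C exceeds 1 in both cases, so the sign of g(F1) is not fixed in advance and the criterion (10.28) decides the stability.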

The above discussion shows that for the model (10.23) the matter point (d) can be followed by the stable de Sitter solution (e) for 0 < C < 1. In fact, numerical simulations in [596] show that the sequence of radiation, matter and de Sitter epochs can be realized. Introducing the energy density ρDE and the pressure PDE of dark energy as we have done for metric f(R) gravity, the dark energy equation of state wDE = PDE/ρDE is given by the same form as Eq. (4.97). Since for the model (10.23) F increases toward the past, the phantom equation of state (wDE < −1) as well as the cosmological constant boundary crossing (wDE = −1) occur as in the case of metric f(R) gravity [596].

As we will see in Section 10.3, for a light scalar field, it is possible to satisfy local gravity constraints for \(\vert Q\vert \lesssim {10^{- 3}}\). In those cases the potential does not need to be steep such that λ ≫ 1 in the region R ≫ R0. The cosmological dynamics for such nearly flat potentials have been discussed by a number of authors in several classes of scalar-tensor theories [489, 451, 416, 271]. It is also possible to realize the condition wDE < −1, while avoiding the appearance of a ghost [416, 271].

10.3 Local gravity constraints

We study local gravity constraints (LGC) for BD theory given by the action (10.10). In the absence of the potential U(ϕ) the BD parameter ωBD is constrained to be \({\omega _{{\rm{BD}}}} > 4 \times {10^4}\) from solar-system experiments [616, 83, 617]. This bound also applies to the case of a nearly massless field with the potential U(ϕ) in which the Yukawa correction \({e^{- Mr}}\) is close to unity (where M is a scalar-field mass and r is an interaction length). Using the bound \({\omega _{{\rm{BD}}}} > 4 \times {10^4}\) in Eq. (10.11), we find that

$$\vert Q\vert \; < 2.5 \times {10^{- 3}}.$$
(10.29)

This is a strong constraint under which the cosmological evolution for such theories is difficult to distinguish from the ΛCDM model (Q = 0).
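The bound (10.29) is a one-line computation. The sketch below assumes that Eq. (10.11) takes the form 3 + 2ωBD = 1/(2Q²), which is the form implied by the second equality in Eq. (10.42); the function name is hypothetical.

```python
import math

def q_max_from_omega_bd(omega_bd):
    """Largest |Q| allowed by a lower bound on omega_BD.

    Assumes the relation 3 + 2*omega_BD = 1/(2*Q**2) between the coupling Q
    and the BD parameter, so that |Q| = 1/sqrt(2*(3 + 2*omega_BD)).
    """
    return 1.0 / math.sqrt(2.0 * (3.0 + 2.0 * omega_bd))

if __name__ == "__main__":
    print(q_max_from_omega_bd(4.0e4))  # close to 2.5e-3, cf. Eq. (10.29)
```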

If the field potential is present, the models with large couplings \((\vert Q\vert = \mathcal O(1))\) can be consistent with local gravity constraints as long as the mass M of the field ϕ is sufficiently large in the region of high density. For example, the potential (10.23) is designed to have a large mass in the high-density region so that it can be compatible with experimental tests for the violation of the equivalence principle through the chameleon mechanism [596]. In the following we study conditions under which local gravity constraints can be satisfied for the model (10.23).

As in the case of metric f(R) gravity, let us consider a configuration in which a spherically symmetric body has a constant density ρA inside the body with a constant density ρ = ρB (≪ ρA) outside the body. For the potential \(V = U/{F^2}\) in the Einstein frame one has \({V_{,\phi}} \simeq - 2{U_0}QpC{(2Q\phi)^{p - 1}}\) under the condition ∣Qϕ∣ ≪ 1. Then the field values at the potential minima inside and outside the body are

$${\phi _i} \simeq {1 \over {2Q}}\;{\left({{{2{U_0}\,p\,C} \over {{\rho _i}}}} \right)^{1/(1 - p)}},\qquad i = A,B.$$
(10.30)

The field mass squared \(m_i^2 \equiv {V_{,\phi \phi}}\) at ϕ = ϕi (i = A, B) is approximately given by

$$m_i^2 \simeq {{1 - p} \over {{{({2^p}\,pC)}^{1/(1 - p)}}}}{Q^2}\;{\left({{{{\rho _i}} \over {{U_0}}}} \right)^{(2 - p)/(1 - p)}}{U_0}.$$
(10.31)

Recall that U0 is roughly the same order as the present cosmological density \({\rho _0} \simeq {10^{- 29}}\;{\rm{g}}/{\rm{c}}{{\rm{m}}^3}\). The baryonic/dark matter density in our galaxy corresponds to \({\rho _B} \simeq {10^{- 24}}\;{\rm{g}}/{\rm{c}}{{\rm{m}}^3}\). The mean density of the Sun or Earth is about \({\rho _A} = \mathcal O(1)\;{\rm{g}}/{\rm{c}}{{\rm{m}}^3}\). Hence mA and mB are in general much larger than H0 for local gravity experiments in our environment. For \({m_A}{{\tilde r}_c} \gg 1\) the chameleon mechanism we discussed in Section 5.2 can be directly applied to BD theory whose Einstein frame action is given by Eq. (10.6) with \(F = {e^{- 2Q\phi}}\).

The bound (5.56) coming from the possible violation of the equivalence principle in the solar system translates into

$${\left({2{U_0}pC/{\rho _B}} \right)^{1/(1 - p)}} < 7.4 \times {10^{- 15}}\,\vert Q\vert.$$
(10.32)

Let us consider the case in which the solutions finally approach the de Sitter point (e) in Table 1. At this de Sitter point we have \(3{F_1}H_1^2 = {U_0}[1 - C{(1 - {F_1})^p}]\) with C given in Eq. (10.26). Then the following relation holds

$${U_0} = 3H_1^2\left[ {2 + (p - 2){F_1}} \right]/p.$$
(10.33)

Substituting this into Eq. (10.32) we obtain

$${\left({{R_1}/{\rho _B}} \right)^{1/(1 - p)}}(1 - {F_1}) < 7.4 \times {10^{- 15}}\vert Q\vert,$$
(10.34)

where \({R_1} = 12H_1^2\) is the Ricci scalar at the de Sitter point. Since (1 − F1) is smaller than 1/2 from Eq. (10.28), it follows that \({({R_1}/{\rho _B})^{1/(1 - p)}} < 1.5 \times {10^{- 14}}\vert Q\vert\). Using the values \({R_1} = {10^{- 29}}\;{\rm{g}}/{\rm{c}}{{\rm{m}}^3}\) and \({\rho _B} = {10^{- 24}}\;{\rm{g}}/{\rm{c}}{{\rm{m}}^3}\), we get the bound for p [596]:

$$p > 1 - {5 \over {13.8 - {{\log}_{10}}\,\vert Q\vert}}.$$
(10.35)

When \(\vert Q\vert = {10^{- 1}}\) and ∣Q∣ = 1 we have p > 0.66 and p > 0.64, respectively. Hence the model can be compatible with local gravity experiments even for \(\vert Q\vert = \mathcal O(1)\).
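The bound (10.35) follows from taking the base-10 logarithm of (R1/ρB)^{1/(1−p)} < 1.5 × 10^{−14}∣Q∣. A minimal sketch (the function name and default arguments are illustrative; the densities are the values quoted above):

```python
import math

def p_lower_bound(abs_Q, R1=1e-29, rho_B=1e-24, coeff=1.5e-14):
    """Solve (R1/rho_B)**(1/(1-p)) < coeff*|Q| for p.

    Both logarithms are negative, so the inequality becomes
    p > 1 - log10(R1/rho_B) / log10(coeff*|Q|), i.e. Eq. (10.35).
    """
    return 1.0 - math.log10(R1 / rho_B) / math.log10(coeff * abs_Q)

if __name__ == "__main__":
    for abs_Q in (0.1, 1.0):
        print(abs_Q, round(p_lower_bound(abs_Q), 2))
```

Because the dependence on ∣Q∣ is only logarithmic, the bound on p changes little across couplings of order 0.1 to 1.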

10.4 Evolution of matter density perturbations

Let us next study the evolution of perturbations in non-relativistic matter for the action (10.10) with the potential U(ϕ) and the coupling \(F(\phi) = {e^{- 2Q\phi}}\). As in metric f(R) gravity, the matter perturbation δm satisfies Eq. (8.93) in the longitudinal gauge. We define the field mass squared as \({M^2} \equiv {U_{,\phi \phi}}\). For the potential consistent with local gravity constraints [such as (10.23)], the mass M is much larger than the present Hubble parameter H0 during the radiation and deep matter eras. Note that the condition \({M^2} \gg R\) is satisfied in most of the cosmological epoch as in the case of metric f(R) gravity.

The perturbation equations for the action (10.10) are given in Eqs. (6.11)–(6.18) with f = F(ϕ)R, ω = (1 − 6Q2)F, and V = U. We use the unit κ2 = 1, but we restore κ2 when it is necessary. In the longitudinal gauge one has χ = 0, α = Φ, ψ = −Ψ, and \(A = 3(H\Phi + \dot \Psi)\) in these equations. Since we are interested in sub-horizon modes, we use the approximation that the terms containing k2/a2, δρm, δR, and M2 are the dominant contributions in Eqs. (6.11)–(6.19). We shall neglect the contribution of the time-derivative terms of δϕ in Eq. (6.16). As we have discussed for metric f(R) gravity in Section 8.1, this amounts to neglecting the oscillating mode of perturbations. The initial conditions of the field perturbation in the radiation era need to be chosen so that the oscillating mode δϕosc is smaller than the matter-induced mode δϕind. In Fourier space Eq. (6.16) gives

$$\left({{{{k^2}} \over {{a^2}}} + {{{M^2}} \over \omega}} \right)\;\delta {\phi _{{\rm{ind}}}} \simeq {1 \over {2\omega}}{F_{,\phi}}\delta R.$$
(10.36)

Using this relation together with Eqs. (6.13) and (6.18), it follows that

$$\delta {\phi _{{\rm{ind}}}} \simeq {{2QF} \over {({k^2}/{a^2})(1 - 2{Q^2})F + {M^2}}}{{{k^2}} \over {{a^2}}}\Psi.$$
(10.37)

Combining this equation with Eqs. (6.11) and (6.13), we obtain [596, 547] (see also [84, 632, 631])

$${{{k^2}} \over {{a^2}}}\Psi \simeq - {{{\kappa ^2}\delta {\rho _m}} \over {2F}}{{({k^2}/{a^2})(1 - 2{Q^2})F + {M^2}} \over {({k^2}/{a^2})F + {M^2}}},$$
(10.38)
$${{{k^2}} \over {{a^2}}}\Phi \simeq - {{{\kappa ^2}\delta {\rho _m}} \over {2F}}{{({k^2}/{a^2})(1 + 2{Q^2})F + {M^2}} \over {({k^2}/{a^2})F + {M^2}}},$$
(10.39)

where we have recovered κ2. Defining the effective gravitational potential Φeff = (Φ + Ψ)/2, we find that Φeff satisfies the same form of equation as (8.99).

Substituting Eq. (10.39) into Eq. (8.93), we obtain the equation of matter perturbations on sub-horizon scales [with the neglect of the r.h.s. of Eq. (8.93)]

$${\ddot \delta _m} + 2H{\dot \delta _m} - 4\pi {G_{{\rm{eff}}}}{\rho _m}{\delta _m} \simeq 0,$$
(10.40)

where the effective gravitational coupling is

$${G_{{\rm{eff}}}} = {G \over F}{{({k^2}/{a^2})(1 + 2{Q^2})F + {M^2}} \over {({k^2}/{a^2})F + {M^2}}}.$$
(10.41)

In the regime \({M^2}/F \gg {k^2}/{a^2}\) (“GR regime”) this reduces to Geff = G/F, so that the evolution of δm and Φeff during the matter domination (\({\Omega _m} = {\rho _m}/(3F{H^2}) \simeq 1\)) is standard: \({\delta _m} \propto {t^{2/3}}\) and Φeff ∝ constant.

In the regime \({M^2}/F \ll {k^2}/{a^2}\) (“scalar-tensor regime”) we have

$${G_{{\rm{eff}}}} \simeq {G \over F}(1 + 2{Q^2}) = {G \over F}{{4 + 2{\omega _{{\rm{BD}}}}} \over {3 + 2{\omega _{{\rm{BD}}}}}},$$
(10.42)

where we used the relation (10.11) between the coupling Q and the BD parameter ωBD. Since ωBD = 0 in f(R) gravity, it follows that Geff = 4G/(3F). Note that the result (10.42) agrees with the effective Newtonian gravitational coupling between two test masses [93, 175]. The evolution of δm and Φeff during the matter dominance in the regime \({M^2}/F \ll {k^2}/{a^2}\) is

$${\delta _m} \propto {t^{(\sqrt {25 + 48{Q^2}} - 1)/6}},\qquad {\Phi _{{\rm{eff}}}} \propto {t^{(\sqrt {25 + 48{Q^2}} - 5)/6}}.$$
(10.43)

Hence the growth rate of δm for Q ≠ 0 is larger than that for Q = 0.
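The two regimes of Eq. (10.41) and the growth exponent in Eq. (10.43) can be checked with a short script. This is a sketch with hypothetical function names. The exponent n in δm ∝ t^n follows from inserting a power law into Eq. (10.40) with H = 2/(3t) and 4πGeffρm = (3/2)(1 + 2Q²)H², which gives 3n² + n − 2(1 + 2Q²) = 0.

```python
import math

def g_eff_over_G(k2_over_a2, M2, F, Q):
    """Effective gravitational coupling of Eq. (10.41) in units of G."""
    num = k2_over_a2 * (1.0 + 2.0 * Q**2) * F + M2
    den = k2_over_a2 * F + M2
    return num / (den * F)

def growth_exponent(Q):
    """Exponent n of delta_m ~ t**n in the scalar-tensor regime, Eq. (10.43)."""
    return (math.sqrt(25.0 + 48.0 * Q**2) - 1.0) / 6.0

if __name__ == "__main__":
    Q_fR = -1.0 / math.sqrt(6.0)  # coupling corresponding to f(R) gravity
    print(g_eff_over_G(1.0, 0.0, 1.0, Q_fR))   # massless limit
    print(g_eff_over_G(1.0, 1e12, 1.0, Q_fR))  # heavy-field ("GR") limit
    print(growth_exponent(0.0), growth_exponent(Q_fR))
```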

As an example, let us consider the potential (10.23). During the matter era the field mass squared around the potential minimum (induced by the matter coupling) is approximately given by

$${M^2} \simeq {{1 - p} \over {{{({2^p}pC)}^{1/(1 - p)}}}}{Q^2}\;{\left({{{{\rho _m}} \over {{U_0}}}} \right)^{(2 - p)/(1 - p)}}{U_0},$$
(10.44)

which decreases with time. The perturbations cross the point M2/F = k2/a2 at time t = tk, which depends on the wavenumber k. Since the evolution of the mass during the matter domination is given by \(M \propto {t^{- {{2 - p} \over {1 - p}}}}\), the time tk has a scale-dependence: \({t_k} \propto {k^{- {{3(1 - p)} \over {4 - p}}}}\). More precisely the critical redshift zk at time tk can be estimated as [596]

$${z_k} \simeq {\left[ {{{\left({{k \over {{a_0}{H_0}}}{1 \over {\vert Q\vert}}} \right)}^{2(1 - p)}}{{{2^p}pC} \over {{{(1 - p)}^{1 - p}}}}{1 \over {{{(3{F_0}\Omega _m^{(0)})}^{2 - p}}}}{{{U_0}} \over {H_0^2}}} \right]^{{1 \over {4 - p}}}} - 1,$$
(10.45)

where the subscript “0” represents present quantities. For the scales \(30{a_0}{H_0} \lesssim k \lesssim 600{a_0}{H_0}\), which correspond to the linear regime of the matter power spectrum, the critical redshift can be in the region zk > 1. Note that, for larger p, zk decreases.

When t < tk and t > tk the matter perturbation evolves as \({\delta _m} \propto {t^{2/3}}\) and \({\delta _m} \propto {t^{(\sqrt {25 + 48{Q^2}} - 1)/6}}\), respectively (apart from the epoch of the late-time cosmic acceleration). The matter power spectrum \({P_{{\delta _m}}}\) at time t = tΛ (at which ä = 0) shows a difference compared to the ΛCDM model, which is given by

$${{{P_{{\delta _m}}}({t_\Lambda})} \over {P_{{\delta _m}}^{\Lambda \,{\rm{CDM}}}({t_\Lambda})}} = {\left({{{{t_\Lambda}} \over {{t_k}}}} \right)^{2\left({{{\sqrt {25 + 48{Q^2}} - 1} \over 6} - {2 \over 3}} \right)}} \propto {k^{{{(1 - p)(\sqrt {25 + 48{Q^2}} - 5)} \over {4 - p}}}}.$$
(10.46)

The CMB power spectrum is also modified by the non-standard evolution of the effective gravitational potential Φeff for t > tk. This mainly affects the low multipoles of CMB anisotropies through the ISW effect. Hence there is a difference between the spectral indices of the matter power spectrum and of the CMB spectrum on the scales \(0.01\,h\;{\rm{Mp}}{{\rm{c}}^{- 1}} \lesssim k \lesssim 0.2\,h\;{\rm{Mp}}{{\rm{c}}^{- 1}}\) [596]:

$$\Delta {n_s}({t_\Lambda}) = {{(1 - p)(\sqrt {25 + 48{Q^2}} - 5)} \over {4 - p}}.$$
(10.47)

Note that this covers the result (8.116) in f(R) gravity (\(Q = - 1/\sqrt 6\) and p = 2n/(2n + 1)) as a special case. Under the criterion Δns(tΛ) < 0.05 we obtain the bounds p > 0.957 for Q = 1 and p > 0.855 for Q = 0.5. As long as p is close to 1, the model can be consistent with both cosmological and local gravity constraints. The allowed region coming from the bounds Δns(tΛ) < 0.05 and (10.35) is illustrated in Figure 9.
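The quoted bounds on p can be reproduced by solving Δns < 0.05 for p in Eq. (10.47). The sketch below uses hypothetical function names. Writing s = √(25 + 48Q²) − 5, the inequality (1 − p)s < b(4 − p) gives p > (s − 4b)/(s − b) for s > 4b.

```python
import math

def delta_ns(Q, p):
    """Difference of spectral indices, Eq. (10.47)."""
    return (1.0 - p) * (math.sqrt(25.0 + 48.0 * Q**2) - 5.0) / (4.0 - p)

def p_min_for_delta_ns(Q, bound=0.05):
    """Smallest p (within 0 < p < 1) satisfying delta_ns(Q, p) < bound."""
    s = math.sqrt(25.0 + 48.0 * Q**2) - 5.0
    if s <= 4.0 * bound:
        return 0.0  # criterion already satisfied for every p > 0
    return (s - 4.0 * bound) / (s - bound)

if __name__ == "__main__":
    for Q in (1.0, 0.5):
        print(Q, round(p_min_for_delta_ns(Q), 3))
```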

Figure 9

The allowed region of the parameter space in the (Q, p) plane for BD theory with the potential (10.23). We show the allowed region coming from the bounds Δns(tΛ) < 0.05 and fδ < 2 as well as the equivalence principle (EP) constraint (10.35).

The growth rate of δm for t > tk is given by \({f_\delta} = {{\dot \delta}_m}/(H{\delta _m}) = (\sqrt {25 + 48{Q^2}} - 1)/4\). As we mentioned in Section 8, the observational bound on fδ is still weak in current observations. If we use the criterion fδ < 2 for the analytic estimation \({f_\delta} = (\sqrt {25 + 48{Q^2}} - 1)/4\), we obtain the bound ∣Q∣ < 1.08 (see Figure 9). The current observational data on the growth rate fδ as well as its growth index γ are not enough to place tight bounds on Q and p, but this will be improved in future observations.
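The bound on the coupling from fδ < 2 follows by inverting the analytic estimate of the growth rate. A minimal sketch with hypothetical function names:

```python
import math

def growth_rate(Q):
    """Analytic estimate f_delta = (sqrt(25 + 48*Q**2) - 1)/4 for t > t_k."""
    return (math.sqrt(25.0 + 48.0 * Q**2) - 1.0) / 4.0

def q_max_for_growth_rate(f_max=2.0):
    """Invert f_delta < f_max: |Q| < sqrt(((4*f_max + 1)**2 - 25)/48)."""
    return math.sqrt(((4.0 * f_max + 1.0) ** 2 - 25.0) / 48.0)

if __name__ == "__main__":
    print(round(q_max_for_growth_rate(2.0), 2))
```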

11 Relativistic Stars in f(R) Gravity and Chameleon Theories

In Section 5 we discussed the existence of thin-shell solutions in metric f(R) gravity in the Minkowski background, i.e., without the backreaction of metric perturbations. For the f(R) dark energy models (4.83) and (4.84), Frolov [266] anticipated that the curvature singularity at ϕ = 0 (shown in Figure 3) can be accessed in a strong gravitational background such as neutron stars. Kobayashi and Maeda [349, 350] studied spherically symmetric solutions for a constant density star with a vacuum exterior and claimed the difficulty of obtaining thin-shell solutions in the presence of the backreaction of metric perturbations. In [594] thin-shell solutions were derived analytically in the Einstein frame of BD theory (including f(R) gravity) under the linear expansion of the gravitational potential Φc at the surface of the body (valid for Φc < 0.3). In fact, the existence of such solutions was numerically confirmed for the inverse power-law potential \(V(\phi) = {M^{4 + n}}{\phi ^{- n}}\) [594].

For the f(R) models (4.83) and (4.84), it was numerically shown that thin-shell solutions exist for Φc ≲ 0.3 by the analysis in the Jordan frame [43, 600, 42] (see also [167]). In particular Babichev and Langlois [43, 42] constructed static relativistic stars both for constant energy density configurations and for a polytropic equation of state, provided that the pressure does not exceed one third of the energy density. Since the relativistic pressure tends to be stronger around the center of the spherically symmetric body for larger Φc, the boundary conditions at the center of the body need to be carefully chosen to obtain thin-shell solutions numerically. In this sense the analytic estimation of thin-shell solutions carried out in [594] can be useful to show the existence of static star configurations, although such analytic solutions have been so far derived only for a constant density star.

In the following we shall discuss spherically symmetric solutions in a strong gravitational background with Φc ≲ 0.3 for BD theory with the action (10.10). This analysis covers metric f(R) gravity as a special case (the scalar-field degree of freedom ϕ defined in Eq. (2.31) with \(Q = - 1/\sqrt 6\)). While field equations will be derived in the Einstein frame, we can transform back to the Jordan frame to find the corresponding equations (as in the analysis of Babichev and Langlois [42]). In addition to the papers mentioned above, there are also a number of works about spherically symmetric solutions for some equation of state of matter [330, 332, 443, 444, 300, 533].

11.1 Field equations

We already showed that under the conformal transformation \({{\tilde g}_{\mu \nu}} = {e^{- 2Q\kappa \phi}}{g_{\mu \nu}}\) the action (10.10) is transformed to the Einstein frame action:

$${S_E} = \int {{{\rm{d}}^4}} x\sqrt {- \tilde g} \left[ {{1 \over {2{\kappa ^2}}}\tilde R - {1 \over 2}{{(\tilde \nabla \phi)}^2} - V(\phi)} \right] + \int {{{\rm{d}}^4}} x\,{{\mathcal L}_M}({e^{2Q\kappa \phi}}{\tilde g_{\mu \nu}},{\Psi _M}).$$
(11.1)

Recall that in the Einstein frame this gives rise to a constant coupling Q between non-relativistic matter and the field ϕ. We use the unit κ2 = 8πG = 1, but we restore the gravitational constant when it is required.

Let us consider a spherically symmetric static metric in the Einstein frame:

$${\rm{d}}{\tilde s^2} = - {e^{2\Psi (\tilde r)}}{\rm{d}}{t^2} + {e^{2\Phi (\tilde r)}}{\rm{d}}{\tilde r^2} + {\tilde r^2}\left({{\rm{d}}{\theta ^2} + {{\sin}^2}\theta {\rm{d}}{\phi ^2}} \right),$$
(11.2)

where \(\Psi (\tilde r)\) and \(\Phi (\tilde r)\) are functions of the distance \({\tilde r}\) from the center of symmetry. For the action (11.1) the energy-momentum tensors for the scalar field ϕ and the matter are given, respectively, by

$$\tilde T_{\mu \nu}^{(\phi)} = {\partial _\mu}\phi {\partial _\nu}\phi - {\tilde g_{\mu \nu}}\left[ {{1 \over 2}{{\tilde g}^{\alpha \beta}}{\partial _\alpha}\phi {\partial _\beta}\phi + V(\phi)} \right],$$
(11.3)
$$\tilde T_{\mu \nu}^{(M)} = - {2 \over {\sqrt {- \tilde g}}}{{\delta {{\mathcal L}_M}} \over {\delta {{\tilde g}^{\mu \nu}}}}.$$
(11.4)

For the metric (11.2) the (00) and (11) components for the energy-momentum tensor of the field are

$$\tilde T_0^{0(\phi)} = - {1 \over 2}{e^{- 2\Phi}}{\phi ^{{\prime}2}} - V(\phi),\qquad \tilde T_1^{1(\phi)} = {1 \over 2}{e^{- 2\Phi}}{\phi ^{{\prime}2}} - V(\phi),$$
(11.5)

where a prime represents a derivative with respect to \({\tilde r}\). The energy-momentum tensor of matter in the Einstein frame is given by \(\tilde T_\nu ^{\mu (M)} = {\rm{diag}}\;(- {{\tilde \rho}_M},{{\tilde P}_M},{{\tilde P}_M},{{\tilde P}_M})\), which is related to \(T_\nu ^{\mu (M)}\) in the Jordan frame via \(\tilde T_\nu ^{\mu (M)} = {e^{4Q\phi}}T_\nu ^{\mu (M)}\). Hence it follows that \({{\tilde \rho}_M} = {e^{4Q\phi}}{\rho _M}\) and \({{\tilde P}_M} = {e^{4Q\phi}}{P_M}\).

Variation of the action (11.1) with respect to ϕ gives

$$- {\partial _\mu}\left({{{\partial (\sqrt {- \tilde g} {{\mathcal L}_\phi})} \over {\partial ({\partial _\mu}\phi)}}} \right) + {{\partial (\sqrt {- \tilde g} {{\mathcal L}_\phi})} \over {\partial \phi}} + {{\partial {{\mathcal L}_M}} \over {\partial \phi}} = 0,$$
(11.6)

where \({\mathcal L_\phi} = - {(\tilde \nabla \phi)^2}/2 - V(\phi)\) is the field Lagrangian density. Since the derivative of \({{\mathcal L}_M}\) in terms of ϕ is given by Eq. (2.41), i.e., \(\partial {\mathcal L_M}/\partial \phi = \sqrt {- \tilde g} Q(- {{\tilde \rho}_M} + 3{{\tilde P}_M})\), we obtain the equation of the field ϕ [594, 42]:

$$\phi ^{{\prime}{\prime}} + \left({{2 \over {\tilde r}} + \Psi ^{\prime} - \Phi ^{\prime}} \right)\;\phi ^{\prime} = {e^{2\Phi}}\left[ {{V_{,\phi}} + Q({{\tilde \rho}_M} - 3{{\tilde P}_M})} \right],$$
(11.7)

where a prime represents a derivative with respect to \({\tilde r}\). From the Einstein equations it follows that

$${\Phi ^\prime} = {{1 - {e^{2\Phi}}} \over {2\tilde r}} + 4\pi G\tilde r\left[ {{1 \over 2}{\phi ^{\prime 2}} + {e^{2\Phi}}V(\phi) + {e^{2\Phi}}{{\tilde \rho}_M}} \right],$$
(11.8)
$${\Psi ^\prime} = {{{e^{2\Phi}} - 1} \over {2\tilde r}} + 4\pi G\tilde r\left[ {{1 \over 2}{\phi ^{\prime 2}} - {e^{2\Phi}}V(\phi) + {e^{2\Phi}}{{\tilde P}_M}} \right],$$
(11.9)
$${\Psi ^{\prime \prime}} + {\Psi ^{\prime 2}} - {\Psi ^\prime}{\Phi ^\prime} + {{{\Psi ^\prime} - {\Phi ^\prime}} \over {\tilde r}} = - 8\pi G\left[ {{1 \over 2}{\phi ^{\prime 2}} + {e^{2\Phi}}V(\phi) - {e^{2\Phi}}{{\tilde P}_M}} \right].$$
(11.10)

Using the continuity equation \({\nabla _\mu}T_1^\mu = 0\) in the Jordan frame, we obtain

$$\tilde P_M^\prime + ({\tilde \rho _M} + {\tilde P_M}){\Psi ^\prime} + Q{\phi ^\prime}({\tilde \rho _M} - 3{\tilde P_M}) = 0.$$
(11.11)

In the absence of the coupling Q this reduces to the Tolman-Oppenheimer-Volkoff equation, \(\tilde P_M^{\prime} + ({{\tilde \rho}_M} + {{\tilde P}_M})\Psi ^{\prime} = 0\).

If the field potential V(ϕ) is responsible for dark energy, we can neglect both V(ϕ) and \({\phi ^{\prime 2}}\) relative to \({{\tilde \rho}_M}\) in the local region whose density is much larger than the cosmological density (\({\rho _0} \sim {10^{- 29}}\;{\rm{g}}/{\rm{c}}{{\rm{m}}^3}\)). In this case Eq. (11.8) is integrated to give

$${e^{2\Phi (\tilde r)}} = {\left[ {1 - {{2Gm(\tilde r)} \over {\tilde r}}} \right]^{- 1}},\quad m(\tilde r) = \int\nolimits_0^{\tilde r} 4 \pi {\bar r^2}{\tilde \rho _M}\,{\rm{d}}\bar r.$$
(11.12)

Substituting Eqs. (11.8) and (11.9) into Eq. (11.7), it follows that

$${\phi ^{\prime \prime}} + \left[ {{{1 + {e^{2\Phi}}} \over {\tilde r}} - 4\pi G\tilde r{e^{2\Phi}}({{\tilde \rho}_M} - {{\tilde P}_M})} \right]{\phi ^\prime} = {e^{2\Phi}}\left[ {{V_{,\phi}} + Q({{\tilde \rho}_M} - 3{{\tilde P}_M})} \right].$$
(11.13)

The gravitational potential Φ around the surface of a compact object can be estimated as \(\Phi \approx G{{\tilde \rho}_M}\tilde r_c^2\), where \({{\tilde \rho}_M}\) is the mean density of the star and \({{\tilde r}_c}\) is its radius. Provided that Φ ≪ 1, Eq. (11.13) reduces to Eq. (5.15) in the Minkowski background (note that the pressure \({{\tilde P}_M}\) is also much smaller than the density \({{\tilde \rho}_M}\) for non-relativistic matter).

11.2 Constant density star

Let us consider a constant density star with \({{\tilde \rho}_M} = {{\tilde \rho}_A}\). We also assume that the density outside the star is constant, \({{\tilde \rho}_M} = {{\tilde \rho}_B}\). We caution that the conserved density \(\tilde \rho _M^{(c)}\) in the Einstein frame is given by \(\tilde \rho _M^{(c)} = {e^{- Q\phi}}{{\tilde \rho}_M}\) [343]. However, since the condition Qϕ ≪ 1 holds in most cases of our interest, we do not distinguish between \(\tilde \rho _M^{(c)}\) and \({{\tilde \rho}_M}\) in the following discussion.

Inside the spherically symmetric body \((0 < \tilde r < {{\tilde r}_c})\), Eq. (11.12) gives

$${e^{2\Phi (\tilde r)}} = {\left({1 - {{8\pi G} \over 3}{{\tilde \rho}_A}{{\tilde r}^2}} \right)^{- 1}}.$$
(11.14)

Neglecting the field contributions in Eqs. (11.8)–(11.11), the gravitational background for \(0 < \tilde r < {{\tilde r}_c}\) is characterized by the Schwarzschild interior solution. Then the pressure \({{\tilde P}_M}(\tilde r)\) inside the body relative to the density \({{\tilde \rho}_A}\) can be analytically expressed as

$${{{{\tilde P}_M}(\tilde r)} \over {{{\tilde \rho}_A}}} = {{\sqrt {1 - 2({{\tilde r}^2}/\tilde r_c^2){\Phi _c}} - \sqrt {1 - 2{\Phi _c}}} \over {3\sqrt {1 - 2{\Phi _c}} - \sqrt {1 - 2({{\tilde r}^2}/\tilde r_c^2){\Phi _c}}}}\quad \quad (0 < \tilde r < {\tilde r_c}),$$
(11.15)

where Φc is the gravitational potential at the surface of the body:

$${\Phi _c} \equiv {{G{M_c}} \over {{{\tilde r}_c}}} = {1 \over 6}{\tilde \rho _A}\tilde r_c^2.$$
(11.16)

Here \({M_c} = 4\pi \tilde r_c^3{{\tilde \rho}_A}/3\) is the mass of the spherically symmetric body. The density \({{\tilde \rho}_B}\) is much smaller than \({{\tilde \rho}_A}\), so that the metric outside the body can be approximated by the Schwarzschild exterior solution

$$\Phi (\tilde r) \simeq {{G{M_c}} \over {\tilde r}} = {\Phi _c}{{{{\tilde r}_c}} \over {\tilde r}},\qquad {\tilde P_M}(\tilde r) \simeq 0\qquad (\tilde r > {\tilde r_c}).$$
(11.17)

In the following we shall derive the analytic field profile by using the linear expansion in terms of the gravitational potential Φc. This approximation is expected to be reliable for \({\Phi _c} < {\mathcal O}(0.1)\). From Eqs. (11.14)–(11.16) it follows that

$$\Phi (\tilde r) \simeq {\Phi _c}{{{{\tilde r}^2}} \over {\tilde r_c^2}},\qquad {{{{\tilde P}_M}(\tilde r)} \over {{{\tilde \rho}_A}}} \simeq {{{\Phi _c}} \over 2}\;\left({1 - {{{{\tilde r}^2}} \over {\tilde r_c^2}}} \right)\qquad (0 < \tilde r < {\tilde r_c}).$$
(11.18)
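To get a feeling for the accuracy of the linear expansion, the exact pressure profile (11.15) can be compared with its linearized form in Eq. (11.18). The following is an illustrative sketch (function names and the sampled values of Φc are assumptions), with x = r̃/r̃c.

```python
import math

def pressure_ratio_exact(x, phi_c):
    """P_M/rho_A from Eq. (11.15), with x = r/r_c and 0 <= x <= 1."""
    a = math.sqrt(1.0 - 2.0 * x**2 * phi_c)
    b = math.sqrt(1.0 - 2.0 * phi_c)
    return (a - b) / (3.0 * b - a)

def pressure_ratio_linear(x, phi_c):
    """Linearized P_M/rho_A from Eq. (11.18)."""
    return 0.5 * phi_c * (1.0 - x**2)

if __name__ == "__main__":
    # Central pressure (x = 0) for a few sample surface potentials.
    for phi_c in (1e-3, 1e-2, 1e-1):
        exact = pressure_ratio_exact(0.0, phi_c)
        linear = pressure_ratio_linear(0.0, phi_c)
        print(phi_c, exact, linear, exact / linear)
```

The ratio of the exact to the linearized central pressure approaches unity as Φc → 0 and grows with Φc, since the linear form underestimates the pressure at the center.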

Substituting these relations into Eq. (11.13), the field equation inside the body is approximately given by

$${\phi ^{\prime \prime}} + {2 \over {\tilde r}}\;\left({1 - {{{{\tilde r}^2}} \over {2\tilde r_c^2}}{\Phi _c}} \right){\phi ^\prime} - ({V_{,\phi}} + Q{\tilde \rho _A})\;\left({1 + 2{\Phi _c}{{{{\tilde r}^2}} \over {\tilde r_c^2}}} \right) + {3 \over 2}Q{\tilde \rho _A}{\Phi _c}\left({1 - {{{{\tilde r}^2}} \over {\tilde r_c^2}}} \right) = 0.$$
(11.19)

If ϕ is close to ϕA at \(\tilde r = 0\), the field stays around ϕA in the region \(0 < \tilde r < {{\tilde r}_1}\). The body has a thin-shell if \({{\tilde r}_1}\) is close to the radius \({{\tilde r}_c}\) of the body.

In the region \(0 < \tilde r < {{\tilde r}_1}\) the field derivative of the effective potential around ϕ = ϕA can be approximated by \({\rm{d}}{V_{{\rm{eff}}}}/{\rm{d}}\phi \, = {V_{,\phi}} + Q{{\tilde \rho}_A} \simeq m_A^2(\phi - {\phi _A})\). The solution to Eq. (11.19) can be obtained by writing the field as ϕ = ϕ0 + δϕ, where ϕ0 is the solution in the Minkowski background and δϕ is the perturbation induced by Φc. At linear order in δϕ and Φc we obtain

$$\delta {\phi ^{\prime \prime}} + {2 \over {\tilde r}}\delta {\phi ^\prime} - m_A^2\delta \phi = {\Phi _c}\;\left[ {{{2m_A^2{{\tilde r}^2}} \over {\tilde r_c^2}}({\phi _0} - {\phi _A}) + {{\tilde r} \over {\tilde r_c^2}}\phi _0^\prime - {3 \over 2}Q{{\tilde \rho}_A}\;\left({1 - {{{{\tilde r}^2}} \over {\tilde r_c^2}}} \right)} \right],$$
(11.20)

where ϕ0 satisfies the equation \(\phi _0^{\prime \prime} + (2/\tilde r)\phi _0^{\prime} - m_A^2({\phi _0} - {\phi _A}) = 0\). The solution for ϕ0 with the boundary condition \({\rm{d}}{\phi _0}/{\rm{d}}\tilde r = 0\) at \(\tilde r = 0\) is given by \({\phi _0}(\tilde r) = {\phi _A} + A({e^{- {m_A}\tilde r}} - {e^{{m_A}\tilde r}})/\tilde r\), where A is a constant. Plugging this into Eq. (11.20), we get the following solution for \(\phi (\tilde r)\) [594]:

$$\begin{array}{*{20}c} {\phi (\tilde r) = {\phi _A} + {{A({e^{- {m_A}\tilde r}} - {e^{{m_A}\tilde r}})} \over {\tilde r}}\quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \;\;} \\ {- {{A{\Phi _c}} \over {{m_A}\tilde r_c^2}}\;\left[ {\left({{1 \over 3}m_A^2{{\tilde r}^2} - {1 \over 4}{m_A}\tilde r - {1 \over 4} + {1 \over {8{m_A}\tilde r}}} \right)\;{e^{{m_A}\tilde r}} + \left({{1 \over 3}m_A^2{{\tilde r}^2} + {1 \over 4}{m_A}\tilde r - {1 \over 4} - {1 \over {8{m_A}\tilde r}}} \right)\;{e^{- {m_A}\tilde r}}} \right]} \\ {- {{3Q{{\tilde \rho}_A}{\Phi _c}} \over {2m_A^4\tilde r_c^2}}\;\left[ {m_A^2({{\tilde r}^2} - \tilde r_c^2) + 6} \right].\quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad} \\ \end{array}$$
(11.21)

In the region \({{\tilde r}_1} < \tilde r < {{\tilde r}_c}\) the field \(\vert \phi (\tilde r)\vert\) evolves towards larger values with increasing \({\tilde r}\). Since the matter coupling term \(Q{{\tilde \rho}_A}\) dominates over V,ϕ in this regime, it follows that \({\rm{d}}{V_{{\rm{eff}}}}/{\rm{d}}\phi \simeq Q{{\tilde \rho}_A}\). Hence the field perturbation δϕ satisfies

$$\delta {\phi ^{\prime \prime}} + {2 \over {\tilde r}}\delta {\phi ^\prime} = {\Phi _c}\left[ {{{\tilde r} \over {\tilde r_c^2}}\phi _0^\prime - {1 \over 2}Q{{\tilde \rho}_A}\;\left({3 - 7\,{{{{\tilde r}^2}} \over {\tilde r_c^2}}} \right)} \right],$$
(11.22)

where ϕ0 obeys the equation \(\phi _0^{\prime \prime} + (2/\tilde r)\phi _0^{\prime} - Q{{\tilde \rho}_A} = 0\). Hence we obtain the solution

$$\phi (\tilde r) = - {B \over {\tilde r}}\;\left({1 - {\Phi _c}{{{{\tilde r}^2}} \over {2\tilde r_c^2}}} \right) + C + {1 \over 6}Q{\tilde \rho _A}{\tilde r^2}\left({1 - {3 \over 2}{\Phi _c} + {{23} \over {20}}{\Phi _c}{{{{\tilde r}^2}} \over {\tilde r_c^2}}} \right),$$
(11.23)

where B and C are constants.
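One way to check the solution (11.23) is to insert it into Eq. (11.19) with V,ϕ neglected and confirm that the residual is of second order in Φc. The sketch below is illustrative: S stands for the product Qρ̃A and is treated as an independent source, and the numerical values of B, S, r̃ and r̃c are arbitrary assumptions. The constant C drops out because only derivatives of ϕ enter.

```python
def dphi(r, B, S, phi_c, r_c=1.0):
    """First derivative of Eq. (11.23) with respect to r; S = Q*rho_A."""
    return (B / r**2 + 0.5 * B * phi_c / r_c**2
            + S * (1.0 - 1.5 * phi_c) * r / 3.0
            + 23.0 / 30.0 * S * phi_c * r**3 / r_c**2)

def d2phi(r, B, S, phi_c, r_c=1.0):
    """Second derivative of Eq. (11.23) with respect to r."""
    return (-2.0 * B / r**3
            + S * (1.0 - 1.5 * phi_c) / 3.0
            + 23.0 / 10.0 * S * phi_c * r**2 / r_c**2)

def residual(r, B, S, phi_c, r_c=1.0):
    """Left-hand side of Eq. (11.19) with V_phi dropped, evaluated on (11.23)."""
    x2 = (r / r_c) ** 2
    return (d2phi(r, B, S, phi_c, r_c)
            + 2.0 / r * (1.0 - 0.5 * phi_c * x2) * dphi(r, B, S, phi_c, r_c)
            - S * (1.0 + 2.0 * phi_c * x2)
            + 1.5 * S * phi_c * (1.0 - x2))

if __name__ == "__main__":
    r, B, S = 0.9, 0.3, 1.0  # arbitrary illustrative values
    for phi_c in (0.1, 0.05, 0.025):
        print(phi_c, residual(r, B, S, phi_c) / phi_c**2)
```

The printed ratio is independent of Φc, which shows that the zeroth-order and first-order terms cancel and that only a second-order remainder is left.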

In the region outside the body \((\tilde r > {{\tilde r}_c})\) the field ϕ climbs up the potential hill after it acquires sufficient kinetic energy in the regime \({{\tilde r}_1} < \tilde r < {{\tilde r}_c}\). Provided that the field kinetic energy dominates over its potential energy, the r.h.s. of Eq. (11.13) can be neglected relative to its l.h.s. Moreover the terms that include \({{\tilde \rho}_M}\) and \({{\tilde P}_M}\) in the square bracket on the l.h.s. of Eq. (11.13) are much smaller than the term \((1 + {e^{2\Phi}})/\tilde r\). Using Eq. (11.17), it follows that

$${\phi ^{\prime \prime}} + {2 \over {\tilde r}}\;\left({1 + {{G{M_c}} \over {\tilde r}}} \right)\;{\phi ^\prime} \simeq 0,$$
(11.24)

whose solution satisfying the boundary condition \(\phi (\tilde r \rightarrow \infty) = {\phi _B}\) is

$$\phi (\tilde r) = {\phi _B} + {D \over {\tilde r}}\;\left({1 + {{G{M_c}} \over {\tilde r}}} \right),$$
(11.25)

where D is a constant.

The coefficients A, B, C, D are determined by matching the solutions (11.21), (11.23), (11.25) and their derivatives at \(\tilde r = {{\tilde r}_1}\) and \(\tilde r = {{\tilde r}_c}\). If the body has a thin-shell, then the condition \(\Delta {{\tilde r}_c} \equiv {{\tilde r}_c} - {{\tilde r}_1} \ll {{\tilde r}_c}\) is satisfied. Under the linear expansion in terms of the three parameters \(\Delta {{\tilde r}_c}/{{\tilde r}_c},\,{\Phi _c}\), and \(1/({m_A}{{\tilde r}_c})\) we obtain the following field profile [594]:

$$\begin{array}{*{20}c} {\phi (\tilde r) = {\phi _A} + {{Q{{\tilde \rho}_A}} \over {m_A^2{e^{{m_A}{{\tilde r}_1}}}}}{{{{\tilde r}_1}} \over {\tilde r}}\;{{\left({1 + {{{m_A}\tilde r_1^3{\Phi _c}} \over {3\tilde r_c^2}} - {{{\Phi _c}\tilde r_1^2} \over {4\tilde r_c^2}}} \right)}^{- 1}}({e^{{m_A}\tilde r}} - {e^{- {m_A}\tilde r}})\quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad} \\ {+ {{3Q{{\tilde \rho}_A}{\Phi _c}} \over {2m_A^2}}\;\left[ {1 - {{{{\tilde r}^2}} \over {\tilde r_c^2}} - {6 \over {{{({m_A}{{\tilde r}_c})}^2}}}} \right] + {{{\Phi _c}{{\tilde r}_1}} \over {{m_A}\tilde r_c^2}}{{Q{{\tilde \rho}_A}} \over {m_A^2{e^{{m_A}{{\tilde r}_1}}}}}\;{{\left({1 + {{{m_A}\tilde r_1^3{\Phi _c}} \over {3\tilde r_c^2}} - {{{\Phi _c}\tilde r_1^2} \over {4\tilde r_c^2}}} \right)}^{- 1}}\quad \quad \;\,} \\ {\times \left[ {\left({{1 \over 3}m_A^2{{\tilde r}^2} - {1 \over 4}{m_A}\tilde r - {1 \over 4} + {1 \over {8{m_A}\tilde r}}} \right)\;{e^{{m_A}\tilde r}} + \left({{1 \over 3}m_A^2{{\tilde r}^2} + {1 \over 4}{m_A}\tilde r - {1 \over 4} - {1 \over {8{m_A}\tilde r}}} \right)\;{e^{- {m_A}\tilde r}}} \right]} \\ {(0 < \tilde r < {{\tilde r}_1}),\quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \;\,} \\ \end{array}$$
(11.26)
$$\begin{array}{*{20}c} {\phi (\tilde r) = {\phi _A} + {{Q{{\tilde \rho}_A}\tilde r_c^2} \over 6}\;\left[ {6{\epsilon _{{\rm{th}}}} + 6{C_1}{{{{\tilde r}_1}} \over {\tilde r}}\left({1 - {{{\Phi _c}{{\tilde r}^2}} \over {2\tilde r_c^2}}} \right) - 3\;\left({1 - {{{\Phi _c}} \over 4}} \right) + {{\left({{{\tilde r} \over {{{\tilde r}_c}}}} \right)}^2}\left({1 - {3 \over 2}{\Phi _c} + {{23{\Phi _c}{{\tilde r}^2}} \over {20\tilde r_c^2}}} \right)} \right]} \\ {({{\tilde r}_1} < \tilde r < {{\tilde r}_c}),\quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad} \\ \end{array}$$
(11.27)
$$\phi (\tilde r) = {\phi _A} + Q{\tilde \rho _A}\tilde r_c^2\;\left[ {{\epsilon _{{\rm{th}}}} - {C_2}{{{{\tilde r}_c}} \over {\tilde r}}\;\left({1 + {\Phi _c}{{{{\tilde r}_c}} \over {\tilde r}}} \right)} \right]\qquad (\tilde r > {\tilde r_c}),$$
(11.28)

where \({\epsilon _{{\rm{th}}}} = ({\phi _B} - {\phi _A})/(6Q{\Phi _c})\) is the thin-shell parameter, and

$$\begin{array}{*{20}c} {{C_1} \equiv (1 - \alpha)\;\left[ {- {\epsilon _{{\rm{th}}}}\left({1 + {{{\Phi _c}\tilde r_1^2} \over {2\tilde r_c^2}}} \right) + {1 \over 2}\;\left({1 - {{{\Phi _c}} \over 4} + {{{\Phi _c}\tilde r_1^2} \over {2\tilde r_c^2}}} \right) - {{\tilde r_1^2} \over {2\tilde r_c^2}}\;\left({1 - {3 \over 2}{\Phi _c} + {{7{\Phi _c}\tilde r_1^2} \over {4\tilde r_c^2}}} \right)} \right]} \\ {+ {{\tilde r_1^2} \over {3\tilde r_c^2}}\;\left({1 - {3 \over 2}{\Phi _c} + {{9{\Phi _c}\tilde r_1^2} \over {5\tilde r_c^2}}} \right),\quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \;\;} \\ \end{array}$$
(11.29)
$$\begin{array}{*{20}c} {{C_2} \equiv (1 - \alpha)\;\left[ {{\epsilon_{{\rm{th}}}}{{{{\tilde r}_1}} \over {{{\tilde r}_c}}}\;\left({1 + {{{\Phi _c}\tilde r_1^2} \over {2\tilde r_c^2}} - {{3{\Phi _c}} \over 2}} \right) - {{{{\tilde r}_1}} \over {2{{\tilde r}_c}}}\;\left({1 - {7 \over 4}{\Phi _c} + {{{\Phi _c}\tilde r_1^2} \over {2\tilde r_c^2}}} \right) + {{\tilde r_1^3} \over {2\tilde r_c^3}}\;\left({1 - 3{\Phi _c} + {{7{\Phi _c}\tilde r_1^2} \over {4\tilde r_c^2}}} \right)} \right]} \\ {+ {1 \over 3}\;\left({1 - {6 \over 5}{\Phi _c}} \right) - {{\tilde r_1^3} \over {3\tilde r_c^3}}\;\left({1 - 3{\Phi _c} + {{9{\Phi _c}\tilde r_1^2} \over {5\tilde r_c^2}}} \right),\quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \;} \\ \end{array}$$
(11.30)

where

$$\alpha \equiv {{(\tilde r_1^2/3\tilde r_c^2){\Phi _c} + 1/({m_A}{{\tilde r}_1})} \over {1 + (\tilde r_1^2/4\tilde r_c^2){\Phi _c} + ({m_A}\tilde r_1^3{\Phi _c}/3\tilde r_c^2)[1 - (\tilde r_1^2/2\tilde r_c^2){\Phi _c}]}}.$$
(11.31)

As long as \({m_A}{{\tilde r}_1}{\Phi _c} \gg 1\), the parameter α is much smaller than 1.

In order to derive the above field profile we have used the fact that the radius \({{\tilde r}_1}\) is determined by the condition \(m_A^2[\phi ({{\tilde r}_1}) - {\phi _A}] = Q{{\tilde \rho}_A}\), and hence

$${\phi _A} - {\phi _B} = - Q{\tilde \rho _A}\tilde r_c^2\left[ {{{\Delta {{\tilde r}_c}} \over {{{\tilde r}_c}}}\;\left({1 + {\Phi _c} - {1 \over 2}{{\Delta {{\tilde r}_c}} \over {{{\tilde r}_c}}}} \right) + {1 \over {{m_A}{{\tilde r}_c}}}\left({1 - {{\Delta {{\tilde r}_c}} \over {{{\tilde r}_c}}}} \right)\;(1 - \beta)} \right],$$
(11.32)

where β is defined by

$$\beta \equiv {{({m_A}\tilde r_1^3{\Phi _c}/3\tilde r_c^2)(\tilde r_1^2/\tilde r_c^2){\Phi _c}} \over {1 + ({m_A}\tilde r_1^3{\Phi _c}/3\tilde r_c^2) - (\tilde r_1^2/4\tilde r_c^2){\Phi _c}}},$$
(11.33)

which is much smaller than 1. Using Eq. (11.32) we obtain the thin-shell parameter

$${\epsilon_{{\rm{th}}}} = {{\Delta {{\tilde r}_c}} \over {{{\tilde r}_c}}}\;\left({1 + {\Phi _c} - {1 \over 2}{{\Delta {{\tilde r}_c}} \over {{{\tilde r}_c}}}} \right) + {1 \over {{m_A}{{\tilde r}_c}}}\;\left({1 - {{\Delta {{\tilde r}_c}} \over {{{\tilde r}_c}}}} \right)\;(1 - \beta).$$
(11.34)

In terms of a linear expansion in α, β, \(\Delta {{\tilde r}_c}/{{\tilde r}_c}\), and Φc, the field profile (11.28) outside the body is

$$\phi (\tilde r) \simeq {\phi _B} - 2{Q_{{\rm{eff}}}}{{G{M_c}} \over {\tilde r}}\;\left({1 + {{G{M_c}} \over {\tilde r}}} \right),$$
(11.35)

where the effective coupling is

$${Q_{{\rm{eff}}}} = 3Q\;\left[ {{{\Delta {{\tilde r}_c}} \over {{{\tilde r}_c}}}\;\left({1 - {{\Delta {{\tilde r}_c}} \over {{{\tilde r}_c}}}} \right) + {1 \over {{m_A}{{\tilde r}_c}}}\;\left({1 - 2{{\Delta {{\tilde r}_c}} \over {{{\tilde r}_c}}} - {\Phi _c} - \alpha - \beta} \right)} \right].$$
(11.36)

To leading order this gives \({Q_{{\rm{eff}}}} = 3Q\,[\Delta {{\tilde r}_c}/{{\tilde r}_c} + 1/({m_A}{{\tilde r}_c})] = 3Q{\epsilon _{{\rm{th}}}}\), which agrees with the result (5.45) in the Minkowski background. As long as \(\Delta {{\tilde r}_c}/{{\tilde r}_c} \ll 1\) and \(1/({m_A}{{\tilde r}_c}) \ll 1\), the effective coupling \(\vert {Q_{{\rm{eff}}}}\vert\) can be much smaller than the bare coupling ∣Q∣, even in a strong gravitational background.
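The size of the corrections in Eqs. (11.31), (11.33), (11.34), and (11.36) can be checked numerically. The following is a minimal sketch, assuming that lengths are measured in units of \({{\tilde r}_c}\) and that the shell thickness is \(\Delta {{\tilde r}_c} = {{\tilde r}_c} - {{\tilde r}_1}\); the function name `thin_shell` and its argument names are introduced here for illustration only.

```python
# Sketch: evaluate alpha (11.31), beta (11.33), eps_th (11.34) and Q_eff (11.36).
# Assumptions: lengths are in units of r_c, and r_1 = r_c - Delta r_c.

def thin_shell(delta, m_rc, phi_c, Q=1.0):
    """delta = Delta r_c / r_c, m_rc = m_A * r_c, phi_c = Phi_c."""
    x1 = 1.0 - delta                    # r_1 / r_c (assumed definition of the shell)
    s = m_rc * x1**3 * phi_c / 3.0      # combination m_A r_1^3 Phi_c / (3 r_c^2)
    alpha = (x1**2 * phi_c / 3.0 + 1.0 / (m_rc * x1)) / (
        1.0 + x1**2 * phi_c / 4.0 + s * (1.0 - x1**2 * phi_c / 2.0))
    beta = s * x1**2 * phi_c / (1.0 + s - x1**2 * phi_c / 4.0)
    eps_th = delta * (1.0 + phi_c - 0.5 * delta) + (1.0 - delta) * (1.0 - beta) / m_rc
    q_eff = 3.0 * Q * (delta * (1.0 - delta)
                       + (1.0 - 2.0 * delta - phi_c - alpha - beta) / m_rc)
    return alpha, beta, eps_th, q_eff

# Parameters quoted for Figure 10: Phi_c = 0.2, Delta r_c / r_c = 0.1, m_A r_c = 20, Q = 1.
alpha, beta, eps_th, q_eff = thin_shell(delta=0.1, m_rc=20.0, phi_c=0.2)
print(f"alpha={alpha:.4f} beta={beta:.4f} eps_th={eps_th:.4f} Q_eff={q_eff:.4f}")
```

With these inputs the sketch gives a thin-shell parameter close to 0.156, in line with the value quoted in the caption of Figure 10, while α and β both come out below 0.1.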

From Eq. (11.26) the field value and its derivative around the center of the body with radius \(\tilde r \ll 1/{m_A}\) are given by

$$\begin{array}{*{20}c} {\phi (\tilde r) \simeq {\phi _A} + {{2Q{{\tilde \rho}_A}{{\tilde r}_1}} \over {{m_A}{e^{{m_A}{{\tilde r}_1}}}}}\;{{\left({1 + {{{m_A}\tilde r_1^3{\Phi _c}} \over {3\tilde r_c^2}} - {{{\Phi _c}\tilde r_1^2} \over {4\tilde r_c^2}}} \right)}^{- 1}}\left[ {1 + {1 \over 6}{{({m_A}\tilde r)}^2} + {{{\Phi _c}} \over {2{{({m_A}{{\tilde r}_c})}^2}}}} \right]} \\ {+ {{3Q{{\tilde \rho}_A}{\Phi _c}} \over {2m_A^2}}\;\left[ {1 - {{{{\tilde r}^2}} \over {\tilde r_c^2}} - {6 \over {{{({m_A}{{\tilde r}_c})}^2}}}} \right],\quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \;\;} \\ \end{array}$$
(11.37)
$${\phi ^\prime}(\tilde r) \simeq Q{\tilde \rho _A}\tilde r_c^2\left[ {{{2{m_A}{{\tilde r}_1}} \over {3{e^{{m_A}{{\tilde r}_1}}}}}\;{{\left({1 + {{{m_A}\tilde r_1^3{\Phi _c}} \over {3\tilde r_c^2}} - {{{\Phi _c}\tilde r_1^2} \over {4\tilde r_c^2}}} \right)}^{- 1}} - {{3{\Phi _c}} \over {{{({m_A}{{\tilde r}_c})}^2}}}} \right]\;{{\tilde r} \over {\tilde r_c^2}}.$$
(11.38)

In the Minkowski background (Φc = 0), Eq. (11.38) gives \({\phi ^{\prime}}(\tilde r) > 0\) for Q > 0 (or \({\phi ^{\prime}}(\tilde r) < 0\) for Q < 0). In the strong gravitational background (Φc ≠ 0) the second term in the square bracket of Eq. (11.38) can lead to negative \({\phi ^{\prime}}(\tilde r)\) for Q > 0 (or positive \({\phi ^{\prime}}(\tilde r)\) for Q < 0), which leads to the evolution of \(\vert \phi (\tilde r)\vert\) toward 0. This effect comes from the presence of the strong relativistic pressure around the center of the body. Unless the boundary conditions at \(\tilde r = 0\) are appropriately chosen, the field tends to evolve toward \(\vert\phi (\tilde r)\vert = 0\), as seen in numerical simulations in [349, 350] for the f(R) model (4.84). However, there exists a thin-shell field profile even for \({\phi ^{\prime}}(\tilde r) > 0\) (and \(Q = - 1/\sqrt 6\)) around the center of the body. In fact, the derivative \({\phi ^{\prime}}(\tilde r)\) can change its sign in the regime \(1/{m_A} < \tilde r < {{\tilde r}_1}\) for thin-shell solutions, so that the field does not reach the curvature singularity at ϕ = 0 [594].

For the inverse power-law potential \(V(\phi) = {M^{4 + n}}{\phi ^{- n}}\), the existence of thin-shell solutions was numerically confirmed in [594] for Φc < 0.3. Note that the analytic field profile (11.26) was used to set boundary conditions around the center of the body. In Figure 10 we show the normalized field φ = ϕ/ϕA versus \(\tilde r/{{\tilde r}_c}\) for the model \(V(\phi) = {M^6}{\phi ^{- 2}}\) with Φc = 0.2, \(\Delta {{\tilde r}_c}/{{\tilde r}_c} = 0.1,\,{m_A}{{\tilde r}_c} = 20\), and Q = 1. While we have neglected the term \({V_{,\phi}}\) relative to \(Q{{\tilde \rho}_A}\) to estimate the solution in the region \({{\tilde r}_1} < \tilde r < {{\tilde r}_c}\) analytically, we find that this leads to some overestimation of the field value outside the body \((\tilde r > {{\tilde r}_c})\). In order to obtain a numerical field profile similar to the analytic one in the region \(\tilde r > {{\tilde r}_c}\), we need to choose a field value slightly larger than the analytic value around the center of the body. The numerical simulation in Figure 10 corresponds to the choice of such a boundary condition, which explicitly shows the presence of thin-shell solutions even for a strong gravitational background.

Figure 10

The thin-shell field profile for the model \(V = {M^6}{\phi ^{- 2}}\) with Φc = 0.2, \(\Delta {{\tilde r}_c}/{{\tilde r}_c} = 0.1,\,\,{m_A}{{\tilde r}_c} = 20\), and Q = 1. This case corresponds to \({{\tilde \rho}_A}/{{\tilde \rho}_B} = 1.04 \times {10^4}\), \({\phi _A} = 8.99 \times {10^{- 3}}\), \({\phi _B} = 1.97 \times {10^{- 1}}\), and \({\epsilon _{{\rm{th}}}} = 1.56 \times {10^{- 1}}\). The boundary condition of φ = ϕ/ϕA at \({x_i} = {{\tilde r}_i}/{{\tilde r}_c} = {10^{- 5}}\) is φ(xi) = 1.2539010, which is larger than the analytic value φ(xi) = 1.09850009. The derivative φ′(xi) is the same as the analytic value. The left and right panels show \(\varphi (\tilde r)\) for \(0 < \tilde r/{{\tilde r}_c} < 10\) and \(0 < \tilde r/{{\tilde r}_c} < 2\), respectively. The black and dotted curves correspond to the numerically integrated solution and the analytic field profile (11.26)–(11.28), respectively. From [594].

11.3 Relativistic stars in metric f(R) gravity

The results presented above are valid for BD theory including metric f(R) gravity with the coupling \(Q = - 1/\sqrt 6\). While the analysis was carried out in the Einstein frame, thin-shell solutions were numerically found in the Jordan frame of metric f(R) gravity for the models (4.83) and (4.84) [43, 600, 42]. In these models the field \(\phi = \sqrt {3/2} \ln \;F\) in the region of high density (R ≫ Rc) is very close to the curvature singularity at ϕ = 0. Originally it was claimed in [266, 349] that relativistic stars are absent because of the presence of this accessible singularity. However, as we have discussed in Section 11.2, the crucial point for obtaining thin-shell solutions is not the existence of the curvature singularity but the choice of appropriate boundary conditions around the center of the star. For the correct choice of boundary conditions the field does not reach the singularity, and thin-shell field profiles can be realized instead. In Starobinsky's model (4.84), static configurations of a constant density star have been found for the gravitational potential Φc smaller than 0.345 [600].

For the star with an equation of state \({{\tilde \rho}_M} < 3{{\tilde P}_M}\), the effective potential of the field ϕ (in the presence of a matter coupling) does not have an extremum, see Eq. (11.7). In those cases the analytic results in Section 11.2 are no longer valid. For the equation of state \({{\tilde \rho}_M} < 3{{\tilde P}_M}\) there is a tachyonic instability that tends to prevent the existence of a static star configuration [42]. For realistic neutron stars, however, the equation of state proposed in the literature satisfies the condition \({{\tilde \rho}_M} > 3{{\tilde P}_M}\) throughout the star.

Babichev and Langlois [43, 42] chose a polytropic equation of state for the energy density \({\rho _M}\) and the pressure \({P_M}\) in the Jordan frame:

$${\rho _M}(n) = {m_B}\left({n + K{{{n^2}} \over {{n_0}}}} \right),\qquad {P_M}(n) = K{m_B}{{{n^2}} \over {{n_0}}},$$
(11.39)

where \({m_B} = 1.66 \times {10^{- 27}}\) kg, \({n_0} = 0.1\;{\rm{f}}{{\rm{m}}^{- 3}}\), and K = 0.1. Solving the continuity equation \({\nabla _\mu}T_\nu ^\mu = 0\) coupled with the Einstein equations, [43, 42] showed that \(3{{\tilde P}_M}\) can remain smaller than \({{\tilde \rho}_M}\) for realistic neutron stars. Note that the energy density is a decreasing function of the distance from the center of the star. Even for such a varying energy density, static star configurations have been shown to exist [43, 42].
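The condition that the energy density exceeds three times the pressure can be read off directly from the polytropic form (11.39). The sketch below works in units chosen for illustration (number density in units of n0, energy density and pressure in units of mB n0) and assumes that the ratio of pressure to energy density is the same in both frames, since both quantities are rescaled by the same conformal factor.

```python
# Polytropic equation of state (11.39), with x = n / n0 and
# rho, P in units of m_B * n0 (a choice of units made for this sketch).
K = 0.1

def rho_M(x):
    """Energy density rho_M / (m_B n0) as a function of x = n / n0."""
    return x + K * x**2

def P_M(x):
    """Pressure P_M / (m_B n0) as a function of x = n / n0."""
    return K * x**2

# rho_M > 3 P_M  <=>  x + K x^2 > 3 K x^2  <=>  x < 1 / (2 K)
x_crit = 1.0 / (2.0 * K)

for x in (1.0, 4.0, 6.0):
    print(f"n/n0 = {x}: rho_M > 3 P_M is {rho_M(x) > 3.0 * P_M(x)}")
```

For K = 0.1 the inequality holds as long as the number density stays below 5 n0, so within this parametrization the tachyonic regime is reached only at higher densities.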

The ratio between the central density ρcenter and the cosmological density at infinity is parameterized by the quantity \({\upsilon _0} = M_{{\rm{pl}}}^2{R_c}/{\rho _{{\rm{center}}}}\). Realistic values of υ0 are extremely small, and it is challenging to perform precise numerical simulations in such cases. We also note that the field mass mA in the relativistic star is very much larger than its cosmological mass, and hence a very high accuracy is required for solving the field equation numerically [600, 581]. The authors in [43, 600, 42] carried out numerical simulations for values of υ0 of the order of \({10^{- 3}}\)–\({10^{- 4}}\). Figure 11 illustrates an example of the thin-shell field profile for the polytropic equation of state (11.39) in the model (4.84) with n = 1 and \({\upsilon _0} = {10^{- 4}}\) [43]. In the regime \(0 < \tilde r < 1.5\) the field is nearly frozen around the extremum of the effective potential, but it starts to evolve toward its asymptotic value ϕ = ϕB for \(\tilde r > 1.5\).

Figure 11

The profile of the field \(\phi = \sqrt {3/2} \ln \;F\) (in units of Mpl) versus the radius \({\tilde r}\) (denoted as r in the figure, in units of \({M_{{\rm{Pl}}}}\rho _{{\rm{center}}}^{- 1/2}\)) for the model (4.84) with n = 1, R/Rc = 3.6, and \({\upsilon _0} = {10^{- 4}}\) (shown as a solid line). The dashed line corresponds to the value ϕmin for the minimum of the effective potential. (Inset) The enlarged figure in the region \(0 < \tilde r < 2.5\). From [43].

Although the above analysis is based on the f(R) models (4.83) and (4.84) having a curvature singularity at ϕ = 0, such a singularity can be cured by adding the \({R^2}\) term [350]. The presence of the \({R^2}\) term has the advantage of realizing inflation in the early universe. However, the f(R) models (4.83) and (4.84) plus the \({R^2}\) term cannot connect the two epochs of acceleration smoothly [37]. An example of viable models that can allow a smooth transition without a curvature singularity is [37]

$$f(R) = (1 - c)R + c\epsilon \ln \;\left[ {{{\cosh (R/\epsilon - b)} \over {\cosh b}}} \right] + {{{R^2}} \over {6{M^2}}},\qquad \epsilon \equiv {{{R_c}} \over {b + \ln (2\cosh b)}},$$
(11.40)

where b, c (0 < c < 1/2), Rc, and M are constants. In [42] a static field profile was numerically obtained even for the model (11.40).
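As a quick consistency check on the structure of (11.40), one can take the high-curvature limit by hand. The following is a sketch that uses only the approximation \(\cosh x \simeq {e^x}/2\) for \(x \gg 1\) together with the definition of ϵ given in (11.40):

$$c\epsilon \ln \left[ {{{\cosh (R/\epsilon - b)} \over {\cosh b}}} \right] \simeq c\epsilon \left[ {{R \over \epsilon} - b - \ln (2\cosh b)} \right] = cR - c{R_c}\qquad (R/\epsilon - b \gg 1),$$

so that \(f(R) \simeq R - c{R_c} + {R^2}/(6{M^2})\) in this regime. Under this approximation the model behaves like GR with an effective cosmological term set by \(c{R_c}\), supplemented by the \({R^2}\) correction. At R = 0 the logarithm vanishes because cosh is even, which gives f(0) = 0.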

Although we have focused on the stellar configuration with Φc ≲ 0.3, there are also works on finding static or rotating black hole solutions in f(R) gravity [193, 497]. Cruz-Dombriz et al. [193] derived static and spherically symmetric solutions by imposing that the curvature is constant. They also used a perturbative approach around the Einstein-Hilbert action and found that only solutions of the Schwarzschild-Anti de Sitter type are present up to second order in perturbations. The existence of general black hole solutions in f(R) gravity certainly deserves further detailed study. It will also be of interest to study the transition from neutron stars to a strong-scalar-field state in f(R) gravity [464]. While such an analysis was carried out for a massless field in scalar-tensor theory, we need to take into account the field mass in the region of high density for realistic models of f(R) gravity.

Pun et al. [498] studied physical properties of matter forming an accretion disk in the spherically symmetric metric in f(R) models and found that specific signatures of modified gravity can appear in the electromagnetic spectrum. In [92] the virial theorem for galaxy clustering in metric f(R) gravity was derived by using the collisionless Boltzmann equation. In [398] the construction of traversable wormhole geometries was discussed in metric f(R) gravity. It was found that the choice of specific shape functions and several equations of state gives rise to some exact solutions for f(R).

12 Gauss-Bonnet Gravity

So far we have studied modifications to the Einstein-Hilbert action via the introduction of a general function of the Ricci scalar. Among the possible modifications of gravity this may indeed be a very special case. One could think of a Lagrangian built from all the infinitely many possible scalars made out of the Riemann tensor and its derivatives. If one considers such a Lagrangian as a fundamental action for gravity, one usually encounters serious problems in the particle representations of such theories. It is well known that such a modification would introduce extra tensor degrees of freedom [635, 283, 284]. In fact, it is possible to show that these theories in general introduce other particles and that some of them may lead to problems.

For example, besides the graviton, another spin-2 particle typically appears, which, however, has a kinetic term opposite in sign with respect to the standard one [572, 67, 302, 465, 153, 303, 99]. The graviton does interact with this new particle, and with all the other standard particles too. The presence of ghosts implies the existence of particles propagating with negative energy. This, in turn, implies that out of the vacuum a particle (or more than one) and a ghost (or more than one) can appear at the same time without violating energy conservation. This sort of vacuum decay makes every background unstable, unless one considers some explicit Lorentz-violating cutoff in order to set a typical energy/time scale at which this phenomenon occurs [145, 161].

However, one can treat these higher-order gravity Lagrangians only as effective theories, and consider the free propagating mode only coming from the strongest contribution in the action, the Einstein-Hilbert one, for which all the modes are well behaved. The remaining higher-derivative parts of the Lagrangian can be regarded as corrections at energies below a certain fundamental scale. This scale is usually set to be equal to the Planck scale, but it can be lower, for example, in some models of extra dimensions. This scale cannot be nonetheless equal to the dark energy density today, as otherwise, one would need to consider all these corrections for energies above this scale. This means that one needs to re-sum all these contributions at all times before the dark energy dominance. Another possible approach to dealing with the ghost degrees of freedom consists of using the Euclidean-action path formalism, for which, one can indeed introduce a notion of probability amplitude for these spurious degrees of freedom [294, 162].

The late-time modifications of gravity considered in this review correspond to those in low energy scales. Therefore we have a correction which begins to be important at very low energy scales compared to the Planck mass. In general this means that somehow these corrections cannot be treated any longer as corrections to the background, but they become the dominant contribution. In this case the theory cannot be treated as an effective one, but we need to assume that the form of the Lagrangian is exact, and the theory becomes a fundamental theory for gravity. In this sense these theories are similar to quintessence, that is, a minimally coupled scalar field with a suitable potential. The potential is usually chosen such that its energy scale matches with the dark energy density today. However, for this theory as well, one needs to consider this potential as fundamental, i.e., it does not get quantum corrections that can spoil the form of the potential itself. Still it may not be renormalizable, but so far we do not know any 4-dimensional renormalizable theory of gravity. In this case then, if we introduce a general modification of gravity responsible for the late-time cosmic acceleration, we should prevent this theory from introducing spurious ghost degrees of freedom.

12.1 Lovelock scalar invariants

One may wonder whether it is possible to remove these spin-2 ghosts. To answer this point, one should first introduce the Lovelock scalars [399]. These scalars are particular combinations/contractions of the Riemann tensor which have a fundamental property: if present in the Lagrangian, they only introduce second-order derivative contributions to the equations of motion. Let us give an example of this property [399]. Soon after Einstein proposed General Relativity [226] and Hilbert found the Lagrangian to describe it [301], Kretschmann [372] pointed out that general covariance alone cannot explain the form of the Lagrangian for gravity. In the action he introduced, instead of the Ricci scalar, the scalar which now has been named after him, the Kretschmann scalar:

$$S = \int {{{\rm{d}}^4}x\sqrt {- g} \,{R_{\alpha \beta \gamma \delta}}\,{R^{\alpha \beta \gamma \delta}}.}$$
(12.1)

At first glance this action looks well motivated. The Riemann tensor \({R_{\alpha \beta \gamma \delta}}\) is a fundamental tensor for gravitation, and the scalar quantity \({P_1} \equiv {R_{\alpha \beta \gamma \delta}}{R^{\alpha \beta \gamma \delta}}\) can be constructed by just squaring it. Furthermore, it is a theory for which Bianchi identities hold, as the equations of motion have both sides covariantly conserved. However, in the equations of motion, there are terms proportional to \({\nabla _\mu}{\nabla _\nu}{R^\mu}{}_{\alpha \beta}{}^\nu\) together with its symmetric partner (α ↔ β). This forces us in general to specify, on a particular slice of spacetime, the metric elements \({g_{\mu \nu}}\) together with their first, second, and third derivatives. Hence the theory has many more degrees of freedom than GR.

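In addition to the Kretschmann scalar there is another scalar \({P_2} \equiv {R_{\alpha \beta}}{R^{\alpha \beta}}\), which is quadratic in the Ricci tensor \({R_{\alpha \beta}}\). One can avoid the appearance of terms proportional to \({\nabla _\mu}{\nabla _\nu}{R^\mu}{}_{(\alpha \beta)}{}^\nu\) for the scalar quantity,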

$${\mathcal G} \equiv {R^2} - 4{R_{\alpha \beta}}\,{R^{\alpha \beta}} + {R_{\alpha \beta \gamma \delta}}\,{R^{\alpha \beta \gamma \delta}},$$
(12.2)

which is called the Gauss-Bonnet (GB) term [572, 67]. If one uses this invariant in the action of D dimensions, as

$$S = \int {{{\rm{d}}^D}} x\sqrt {- g} \,{\mathcal G},$$
(12.3)

then the equations of motion coming from this Lagrangian include only terms up to second derivatives of the metric. The difference between this scalar and the Einstein-Hilbert term is that the GB term is not linear in the second derivatives of the metric itself. It then seems an interesting theory to study in detail. Nonetheless, it is a topological property of four-dimensional manifolds that \(\sqrt {- g} \;{\mathcal G}\) can be expressed in terms of a total derivative [150], as

$$\sqrt {- g} \,{\mathcal G} = {\partial _\alpha}{{\mathcal D}^\alpha},$$
(12.4)

where

$${{\mathcal D}^\alpha} = \sqrt {- g} \,{\epsilon ^{\alpha \beta \gamma \delta}}{\epsilon _{\rho \sigma}}^{\mu \nu}{\Gamma ^\rho}_{\mu \beta}\left[ {{R^\sigma}_{\nu \gamma \delta}/2 + {\Gamma ^\sigma}_{\lambda \gamma}\,{\Gamma ^\lambda}_{\nu \delta}/3} \right].$$
(12.5)

Hence the contribution of the GB term to the equations of motion disappears for any boundaryless manifold in four dimensions.

In order to see the contribution of the GB term to the equations of motion one way is to couple it with a scalar field ϕ, i.e., \(f(\phi)\;{\mathcal G}\), where f(ϕ) is a function of ϕ. More explicitly the action of such theories is in general given by

$$S = \int {{{\rm{d}}^4}} x\sqrt {- g} \left[ {{1 \over 2}F(\phi)R - {1 \over 2}\omega (\phi){{(\nabla \phi)}^2} - V(\phi) - f(\phi){\mathcal G}} \right],$$
(12.6)

where F(ϕ), ω(ϕ), and V(ϕ) are functions of ϕ. The GB coupling of this form appears in the low-energy effective action of string theory [275, 273], due to the presence of dilaton-graviton mixing terms.

There is another class of general GB theories with a self-coupling of the form [458],

$$S = \int {{{\rm{d}}^4}} x\sqrt {- g} \,\left[ {{1 \over 2}R + f({\mathcal G})} \right]\,,$$
(12.7)

where \(f({\mathcal G})\) is a function of the GB term (here we used the unit \({\kappa ^2} = 1\)). The equations of motion, besides the standard GR contribution, will get contributions proportional to \({\nabla _\mu}{\nabla _\nu}f_{,\mathcal G}\) [188, 189]. This theory possesses more degrees of freedom than GR, but the extra information appears only in a scalar quantity \(f_{,\mathcal G}\) and its derivative. Hence it has fewer degrees of freedom than Kretschmann gravity, and in particular these extra degrees of freedom are not tensor-like. This property comes from the fact that the GB term is a Lovelock scalar. Theories with the more general Lagrangian density R/2 + f(R, P1, P2) have been studied by many authors in connection with the dark energy problem [142, 110, 521, 420, 585, 64, 166, 543, 180]. These theories are plagued by the appearance of spurious spin-2 ghosts, unless the Gauss-Bonnet (GB) combination is chosen as in the action (12.7) [465, 153, 447] (see also [110, 181, 109]).

Let us go back to the Lovelock scalars. How many are there? The answer is infinitely many (each of them consists of linear combinations of equal powers of the Riemann tensor). However, because of topological reasons, the only non-zero Lovelock scalars in four dimensions are the Ricci scalar R and the GB term \({\mathcal G}\). Therefore, for the same reasons as for the GB term, a general function f(R) will only introduce terms in the equations of motion of the form \({\nabla _\mu}{\nabla _\nu}F\), where \(F \equiv \partial f/\partial R\). Once more, the new extra degrees of freedom introduced into the theory come from a scalar quantity, F.

In summary, the Lovelock scalars in the Lagrangian prevent the equations of motion from getting extra tensor degrees of freedom. A more detailed analysis of perturbations on maximally symmetric spacetimes shows that, if non-Lovelock scalars are used in the action, then new extra tensor-like degrees of freedom begin to propagate [572, 67, 302, 465, 153, 303, 99]. Effectively these theories, such as Kretschmann gravity, introduce two gravitons, which have kinetic operators with opposite sign. Hence one of the two gravitons is a ghost. In order to get rid of this ghost we need to use the Lovelock scalars. Therefore, in four dimensions, one can in principle study the following action

$$S = \int {{{\rm{d}}^4}} x\sqrt {- g} \,f(R,{\mathcal G})\,.$$
(12.8)

This theory will not introduce spin-2 ghosts. Even so, the scalar modes need to be considered in more detail: they may still become ghosts. Let us discuss in more detail what a ghost is and why we need to avoid it in a sensible theory of gravity.

12.2 Ghosts

What is a ghost for these theories? A ghost mode is a propagating degree of freedom whose kinetic term in the action has the opposite sign to the standard one. In order to see if a ghost is propagating on a given background, one needs to expand the action at second order around the background in terms of the perturbation fields. After integrating out all auxiliary fields, one is left with a minimal number of gauge-invariant fields \({\vec \phi}\). These are not unique, as we can always perform a field redefinition (e.g., a field rotation). However, no matter which fields are used, we typically need, for non-singular Lagrangians, to define the kinetic operator, that is, the operator which appears in the Lagrangian as \({\mathcal L} = {\dot {\vec \phi}^{\,t}}A\,\dot {\vec \phi} + \ldots\) [186, 185]. Then the signs of the eigenvalues of the matrix A determine whether a mode is a ghost or not. A negative eigenvalue corresponds to a ghost particle. On a FLRW background the matrix A is in general time-dependent, and so are the signs of its eigenvalues. Therefore one should make sure that the extra scalar modes introduced in these theories do not possess wrong signs in the kinetic term at any time during the evolution of the Universe, at least up to today.
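The eigenvalue criterion can be made concrete for two fields. The following is a minimal sketch for a symmetric 2×2 kinetic matrix, using the closed-form eigenvalues; the function names and the numerical entries are hypothetical and serve only as an illustration.

```python
import math

def kinetic_eigenvalues(a, b, c):
    """Eigenvalues of the symmetric 2x2 kinetic matrix A = [[a, b], [b, c]]."""
    mean = 0.5 * (a + c)
    radius = math.hypot(0.5 * (a - c), b)
    return mean - radius, mean + radius

def has_ghost(a, b, c):
    """True if at least one eigenvalue of A is negative (wrong-sign kinetic term)."""
    return kinetic_eigenvalues(a, b, c)[0] < 0.0

# Hypothetical matrices, chosen only for illustration.
print(has_ghost(2.0, 0.5, 1.0))   # both eigenvalues are positive
print(has_ghost(1.0, 2.0, 1.0))   # eigenvalues are -1 and 3
```

The second matrix has positive diagonal entries but still contains a ghost, which is why the test is phrased in terms of eigenvalues: their signs are unchanged by a field rotation, whereas the individual matrix entries are not.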

An overall sign in the Lagrangian does not affect the classical equations of motion. However, at the quantum level, if we want to preserve causality by keeping the optical theorem to be valid, then the ghost can be interpreted as a particle which propagates with negative energy, as already stated above. In other words, in special relativity, the ghost would have a four-momentum (Eg, \({{\vec p}_g}\)) with Eg < 0. However it would still be a timelike particle as \(E_g^2 - \vec p_g^2 > 0\), whether Eg is negative or not. The problem arises when this particle is coupled to some other normal particle, because in this case the process 0 = Eg + E1 + E2 + … with Eg < 0 can be allowed. This means in general that for such a theory one would expect the pair creation of ghost and normal particles out of the vacuum. Notice that the energy is still conserved, but the energy is pumped out of the ghost particle.

Since all the particles are coupled at least to gravity, one would expect that particles could be created out of the vacuum via the decay of a pair of gravitons emitted in the vacuum into ghost and non-ghost particles. This process does lead to an infinite contribution unless one introduces a cutoff for the theory [145, 161], for which one can set observational constraints.

We have already seen that, for metric f(R) gravity, the kinetic operator in the FLRW background reduces to Qs given in Eq. (7.60) with the perturbed action (7.80). Since the sign of Qs is determined by F, one needs to impose F > 0 in order to avoid the propagation of a ghost mode.

12.3 \(f({\mathcal G})\) gravity

Let us consider the theory (12.7) in the presence of matter, i.e.

$$S = {1 \over {{\kappa ^2}}}\int {{{\rm{d}}^4}} x\sqrt {- g} \left[ {{1 \over 2}R + f({\mathcal G})} \right] + {S_M},$$
(12.9)

where we have recovered \({\kappa ^2}\). For the matter action SM we consider perfect fluids with an equation of state w. The variation of the action (12.9) leads to the following field equations [178, 383]

$$\begin{array}{*{20}c} {{G_{\mu \nu}} + 8\left[ {{R_{\mu \rho \nu \sigma}} + {R_{\rho \nu}}{g_{\sigma \mu}} - {R_{\rho \sigma}}{g_{\nu \mu}} - {R_{\mu \nu}}{g_{\sigma \rho}} + {R_{\mu \sigma}}{g_{\nu \rho}} + (R/2)\,({g_{\mu \nu}}{g_{\sigma \rho}} - {g_{\mu \sigma}}{g_{\nu \rho}})} \right]{\nabla ^\rho}{\nabla ^\sigma}{f_{,{\mathcal G}}}} \\ {+ ({\mathcal G}{f_{,{\mathcal G}}} - f)\,{g_{\mu \nu}} = {\kappa ^2}\,{T_{\mu \nu}}\,,\quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \,} \\ \end{array}$$
(12.10)

where Tμν is the energy-momentum tensor of matter. If \(f \propto {\mathcal G}\), then it is clear that the theory reduces to GR.

12.3.1 Cosmology at the background level and viable \(f({\mathcal G})\) models

In the flat FLRW background the (00) component of Eq. (12.10) leads to

$$3{H^2} = {\mathcal G}{f_{,{\mathcal G}}} - f - 24{H^3}{\dot f_{,{\mathcal G}}} + {\kappa ^2}\left({{\rho _m} + {\rho _r}} \right)\,,$$
(12.11)

where ρm and ρr are the energy densities of non-relativistic matter and radiation, respectively. The cosmological dynamics in \(f({\mathcal G})\) dark energy models have been discussed in [458, 165, 383, 188, 633, 430]. We can realize the late-time cosmic acceleration by the existence of a de Sitter point satisfying the condition [458]

$$3H_1^2 = {{\mathcal G}_1}{f_{,{\mathcal G}}}({{\mathcal G}_1}) - f({{\mathcal G}_1})\,,$$
(12.12)

where H1 and \({{\mathcal G}_1}\) are the Hubble parameter and the GB term at the de Sitter point, respectively. From the stability of the de Sitter point we require the following condition [188]

$$0 < H_1^6{f_{,{\mathcal G}{\mathcal G}}}({H_1}) < 1/384\,.$$
(12.13)

The GB term is given by

$${\mathcal G} = 24{H^2}({H^2} + \dot H) = - 12{H^4}(1 + 3{w_{{\rm{eff}}}})\,,$$
(12.14)

where \({w_{{\rm{eff}}}} = - 1 - 2\dot H/(3{H^2})\) is the effective equation of state. We have \({\mathcal G} < 0\) and \(\dot {\mathcal {G}} > 0\) during both radiation and matter domination. The GB term changes its sign from negative to positive during the transition from the matter era \((\mathcal G = - 12{H^4})\) to the de Sitter epoch \(({\mathcal G} = 24{H^4})\). Perturbing Eq. (12.11) about the background radiation and matter dominated solutions, the perturbations in the Hubble parameter involve the mass squared given by \({M^2} \equiv 1/(96{H^4}f_{,\mathcal G\mathcal G})\) [188]. For the stability of background solutions we require that \({M^2} > 0\), i.e., \(f_{,{{\mathcal G}{\mathcal G}}} > 0\). Since the term \(24{H^3}{{\dot f}_{,\mathcal G}}\) in Eq. (12.11) is of the order of \({H^8}f_{,{{\mathcal G}{\mathcal G}}}\), this is suppressed relative to \(3{H^2}\) for \({H^6}f_{,{{\mathcal G}{\mathcal G}}} \ll 1\) during the radiation and matter dominated epochs. In order to satisfy this condition we require that \(f_{,{{\mathcal G}{\mathcal G}}}\) approaches 0 in the limit \(\vert {\mathcal G}\vert \; \rightarrow \infty\). The deviation from the ΛCDM model can be quantified by the following quantity [182]

$$\mu \equiv H{\dot f_{,{\mathcal G}}} = H\dot {\mathcal G}{f_{,{\mathcal G}{\mathcal G}}} = 72{H^6}{f_{,{\mathcal G}{\mathcal G}}}\left[ {(1 + {w_{{\rm{eff}}}})\,(1 + 3{w_{{\rm{eff}}}}) - w_{{\rm{eff}}}^{\prime}/2} \right],$$
(12.15)

where a prime represents a derivative with respect to N = ln a. During the radiation and matter eras we have \(\mu = 192{H^6}f_{,{{\mathcal G}{\mathcal G}}}\) and \(\mu = 72{H^6}f_{,{{\mathcal G}{\mathcal G}}}\), respectively, whereas at the de Sitter attractor μ = 0.
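The values quoted for the GB term and for μ in each era follow from Eqs. (12.14) and (12.15) once weff is fixed. As a sketch, the snippet below evaluates both expressions in dimensionless form, taking weff constant within each era (so that its derivative with respect to N vanishes); the function names are introduced here for illustration.

```python
def gb_over_H4(w_eff):
    """G / H^4 from Eq. (12.14)."""
    return -12.0 * (1.0 + 3.0 * w_eff)

def mu_over_H6fGG(w_eff, dw_dN=0.0):
    """mu / (H^6 f_GG) from Eq. (12.15); dw_dN is the derivative of w_eff with respect to N."""
    return 72.0 * ((1.0 + w_eff) * (1.0 + 3.0 * w_eff) - 0.5 * dw_dN)

for name, w in (("radiation", 1.0 / 3.0), ("matter", 0.0), ("de Sitter", -1.0)):
    print(f"{name}: G/H^4 = {gb_over_H4(w):.1f}, mu/(H^6 f_GG) = {mu_over_H6fGG(w):.1f}")
```

This reproduces the coefficients −12 and 24 of the GB term in the matter and de Sitter epochs, and the coefficients 192, 72, and 0 of μ in the radiation, matter, and de Sitter regimes.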

The GB term inside and outside a spherically symmetric body (mass \({M_ \odot}\) and radius \({r_ \odot}\)) with a homogeneous density is given by \({\mathcal G} = - 48{(G{M_ \odot})^2}/r_{\odot}^6\) and \({\mathcal G} = 48{(G{M_ \odot})^2}/{r^6}\), respectively (r is the distance from the center of symmetry). In the vicinity of the Sun or the Earth, \(\vert {\mathcal G}\vert\) is much larger than the present cosmological GB term, \({{\mathcal G}_0}\). As we move from the interior to the exterior of the star, the GB term crosses 0 from negative to positive. This means that \(f({\mathcal G})\) and its derivatives with respect to \({\mathcal G}\) need to be regular for both negative and positive values of \({\mathcal G}\) whose amplitudes are much larger than \({{\mathcal G}_0}\).
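To get a feeling for the hierarchy between the local and cosmological GB terms, one can insert rough numbers for the Sun. The sketch below works in units with c = 1 and lengths in metres; the numerical constants are standard approximate values supplied here as assumptions, and the cosmological term is estimated only at the order-of-magnitude level as the fourth power of the present Hubble parameter.

```python
# Rough order-of-magnitude comparison; all constants below are standard
# approximate values supplied for this sketch, not taken from the text.
GM_SUN_OVER_C2 = 1.477e3      # G M_sun / c^2 in metres
R_SUN = 6.96e8                # solar radius in metres
H0_OVER_C = 7.6e-27           # H_0 / c in 1/metre for H_0 of about 70 km/s/Mpc

# Exterior formula G = 48 (G M)^2 / r^6 evaluated at the solar surface, in m^-4.
gb_surface = 48.0 * GM_SUN_OVER_C2**2 / R_SUN**6
# Cosmological GB term G_0 ~ H_0^4, up to a numerical factor of order 10.
gb_cosmo = H0_OVER_C**4
ratio = gb_surface / gb_cosmo

print(f"|G| at solar surface ~ {gb_surface:.1e} m^-4")
print(f"G_0 ~ {gb_cosmo:.1e} m^-4, ratio ~ {ratio:.1e}")
```

Under these assumptions the local GB term exceeds the cosmological one by more than fifty orders of magnitude, which illustrates why regularity of f and its derivatives is needed over a very wide range of both signs.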

The above discussions show that viable \(f(\mathcal G)\) models need to obey the following conditions:

  1. \(f({\mathcal G})\) and its derivatives \({f_{,\mathcal G}},{f_{,\mathcal G\mathcal G}}, \ldots\) are regular.

  2. \({f_{,{\mathcal G}{\mathcal G}}} > 0\) for all \({\mathcal G}\), and \({f_{,{\mathcal G}{\mathcal G}}}\) approaches +0 in the limit \(\vert {\mathcal G}\vert \; \rightarrow \infty\).

  3. \(0 < H_1^6{f_{,\mathcal G\mathcal G}}({H_1}) < 1/384\) at the de Sitter point.

A couple of representative models that can satisfy these conditions are [188]

$$({\rm{A}})\qquad f({\mathcal G}) = \lambda {{\mathcal G} \over {\sqrt {{{\mathcal G}_{\ast}}}}}\,\arctan \left({{{\mathcal G} \over {{{\mathcal G}_{\ast}}}}} \right) - {1 \over 2}\lambda \sqrt {{{\mathcal G}_{\ast}}} \,\ln \left({1 + {{{{\mathcal G}^2}} \over {{\mathcal G}_{\ast}^2}}} \right) - \alpha \lambda \sqrt {{{\mathcal G}_{\ast}}} \,,$$
(12.16)
$$({\rm{B}})\qquad f({\mathcal G}) = \lambda {{\mathcal G} \over {\sqrt {{{\mathcal G}_{\ast}}}}}\,\arctan \left({{{\mathcal G} \over {{{\mathcal G}_{\ast}}}}} \right) - \alpha \lambda \sqrt {{{\mathcal G}_{\ast}}},$$
(12.17)

where α, λ, and \({\mathcal G_{\ast}} \sim H_0^4\) are positive constants. The second derivatives of f in terms of \({\mathcal G}\) for the models (A) and (B) are \({f_{,{\mathcal G}{\mathcal G}}} = \lambda/[\mathcal G_{\ast}^{3/2}(1 + {\mathcal G^2}/\mathcal G_{\ast}^2)]\) and \({f_{,{\mathcal G}{\mathcal G}}} = 2\lambda/[{\mathcal G}_{\ast}^{3/2}{(1 + {{\mathcal G}^2}/{\mathcal G}_{\ast}^2)^2}]\), respectively. They are constructed to give rise to the positive \({f_{,{\mathcal G}{\mathcal G}}}\) for all \({\mathcal G}\). Of course other models can be introduced by following the same prescription. These models can pass the constraint of successful expansion history that allows the smooth transition from radiation and matter eras to the accelerated epoch [188, 633]. Although it is possible to have a viable expansion history at the background level, the study of matter density perturbations places tight constraints on these models. We shall address this issue in Section 12.3.4.
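The closed-form second derivatives quoted above can be cross-checked numerically. The following is a minimal sketch in Python, assuming units in which \({\mathcal G}_\ast = 1\) (so the argument is \(x = {\mathcal G}/{\mathcal G}_\ast\)) and illustrative values λ = α = 1; the function names are ours and do not come from [188].

```python
import math

def f_A(x, lam=1.0, alpha=1.0):
    """Model (A), Eq. (12.16), in units G_* = 1, with x = G/G_*."""
    return lam * (x * math.atan(x) - 0.5 * math.log1p(x * x) - alpha)

def f_B(x, lam=1.0, alpha=1.0):
    """Model (B), Eq. (12.17), in the same units."""
    return lam * (x * math.atan(x) - alpha)

def fGG_A(x, lam=1.0):
    """Closed form quoted in the text for model (A)."""
    return lam / (1.0 + x * x)

def fGG_B(x, lam=1.0):
    """Closed form quoted in the text for model (B)."""
    return 2.0 * lam / (1.0 + x * x) ** 2

def second_derivative(f, x, h=1e-3):
    """Central finite-difference estimate of f''(x)."""
    return (f(x + h) - 2.0 * f(x) + f(x - h)) / (h * h)

# Compare numerical and analytic second derivatives on both signs of G.
for x in (-5.0, -0.5, 0.0, 2.0, 10.0):
    print(x, second_derivative(f_A, x), fGG_A(x),
          second_derivative(f_B, x), fGG_B(x))
```

Both closed forms are positive for either sign of \({\mathcal G}\) and decay as \(\vert{\mathcal G}\vert \rightarrow \infty\), in line with condition 2.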

12.3.2 Numerical analysis

In order to discuss cosmological solutions in the low-redshift regime numerically for the models (12.16) and (12.17), it is convenient to introduce the following dimensionless quantities

$$x \equiv {{\dot H} \over {{H^2}}},\quad y \equiv {H \over {{H_{\ast}}}},\quad {\Omega _m} \equiv {{{\kappa ^2}{\rho _m}} \over {3{H^2}}},\quad {\Omega _r} \equiv {{{\kappa ^2}{\rho _r}} \over {3{H^2}}},$$
(12.18)

where \({H_\ast} = {\mathcal G}_\ast^{1/4}\). We then obtain the following equations of motion [188]

$$x^{\prime} = - 4{x^2} - 4x + {1 \over {{{24}^2}{H^6}{f_{,{\mathcal G}{\mathcal G}}}}}\left[ {{{{\mathcal G}{f_{,{\mathcal G}}} - f} \over {{H^2}}} - 3(1 - {\Omega _m} - {\Omega _r})} \right],$$
(12.19)
$$y^{\prime} = xy\,,$$
(12.20)
$$\Omega _m^{\prime} = - (3 + 2x){\Omega _m}\,,$$
(12.21)
$$\Omega _r^{\prime} = - (4 + 2x){\Omega _r}\,,$$
(12.22)

where a prime represents a derivative with respect to N = ln a. The quantities \({H^6}f_{,{\mathcal G}{\mathcal G}}\) and \(({\mathcal G}f_{,\mathcal G} - f)/{H^2}\) can be expressed in terms of x and y once the model is specified.
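As an illustration of how the model-dependent quantities enter, the sketch below writes the right-hand sides of Eqs. (12.19)-(12.22) for model (A). It assumes Eq. (12.19) exactly as written (so that \(f/H^2\) is dimensionless) together with \({\mathcal G} = 24H^4(1 + x)\) and \({\mathcal G}_\ast = H_\ast^4\), which give \({\mathcal G}/{\mathcal G}_\ast = 24y^4(1 + x)\), \(H^6 f_{,{\mathcal G}{\mathcal G}} = \lambda y^6/(1 + {\mathcal G}^2/{\mathcal G}_\ast^2)\) and \(({\mathcal G}f_{,\mathcal G} - f)/H^2 = \lambda[\frac{1}{2}\ln (1 + {\mathcal G}^2/{\mathcal G}_\ast^2) + \alpha]/y^2\). The function names and the bisection helper are ours; this is not the code used in [188].

```python
import math

def model_A_terms(x, y, lam, alpha):
    """Return (H^6 f_GG, (G f_G - f)/H^2) for model (A) with G_* = H_*^4."""
    g = 24.0 * y**4 * (1.0 + x)                      # G/G_*
    h6_fgg = lam * y**6 / (1.0 + g * g)
    source = lam * (0.5 * math.log1p(g * g) + alpha) / y**2
    return h6_fgg, source

def rhs(state, lam, alpha):
    """Right-hand sides of Eqs. (12.19)-(12.22); state = (x, y, Omega_m, Omega_r)."""
    x, y, om, orad = state
    h6_fgg, source = model_A_terms(x, y, lam, alpha)
    dx = (-4.0 * x * x - 4.0 * x
          + (source - 3.0 * (1.0 - om - orad)) / (576.0 * h6_fgg))
    return dx, x * y, -(3.0 + 2.0 * x) * om, -(4.0 + 2.0 * x) * orad

def de_sitter_y(lam, alpha, lo=1e-3, hi=10.0):
    """Bisect for y at the de Sitter point (x = 0, no matter), where x' = 0."""
    def h(y):
        return model_A_terms(0.0, y, lam, alpha)[1] - 3.0
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if h(lo) * h(mid) <= 0.0:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)

# Parameters of Figure 12; the bracket [lo, hi] is an assumption that
# happens to contain a sign change for these values.
y_ds = de_sitter_y(lam=3e-4, alpha=100.0)
print(y_ds, rhs((0.0, y_ds, 0.0, 0.0), 3e-4, 100.0))
```

The tuple returned by `rhs` can be passed to any stiff ODE integrator. The de Sitter check only confirms that the fixed point of the system coincides with the solution of \(3H^2 = {\mathcal G}f_{,\mathcal G} - f\).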

Figure 12 shows the evolution of μ and \(w_{\rm eff}\) without radiation for the model (12.16) with parameters α = 100 and \(\lambda = 3 \times 10^{-4}\). The quantity μ is much smaller than unity in the deep matter era (\(w_{\rm eff} \simeq 0\)) and it reaches a maximum value prior to the accelerated epoch. This is followed by the decrease of μ toward 0, as the solution approaches the de Sitter attractor with \(w_{\rm eff} = -1\). While the maximum value of μ in this case is of the order of \(10^{-4}\), it is also possible to realize larger maximum values of μ such as \(\mu_{\rm max} \gtrsim 0.1\).

Figure 12: The evolution of μ (multiplied by \(10^4\)) and \(w_{\rm eff}\) versus the redshift \(z = a_0/a - 1\) for the model (12.16) with parameters α = 100 and \(\lambda = 3 \times 10^{-4}\). The initial conditions are chosen to be x = −1.499985, y = 20, and \(\Omega_m = 0.99999\). We do not take into account radiation in this simulation. From [182].

For high redshifts the equations become too stiff to be integrated directly. This comes from the fact that, as we go back to the past, the quantity \(f_{,{\mathcal G}{\mathcal G}}\) (or μ) becomes smaller and smaller. In fact, this also occurs for viable f(R) dark energy models in which \(f_{,RR}\) decreases rapidly for higher z. Here we show an iterative method (known as the “fixed-point” method) [420, 188] that can be used in these cases, provided no singularity is present in the high-redshift regime [188]. We define \({{\bar H}^2}\) and \({\bar {\mathcal G}}\) to be \({{\bar H}^2} \equiv {H^2}/H_0^2\) and \(\bar {\mathcal G} \equiv {\mathcal G}/H_0^4\), where the subscript “0” represents present values. The models (A) and (B) can be written in the form

$$f({\mathcal G}) = \bar f({\mathcal G})H_0^2 - \bar \Lambda \,H_0^2\,,$$
(12.23)

where \(\bar \Lambda = \alpha \lambda \sqrt {{{\mathcal G}_\ast}}/H_0^2\) and \(\bar f({\mathcal G})\) is a function of \({\mathcal G}\). The modified Friedmann equation reduces to

$${\bar H^2} - \bar H_\Lambda ^2 = {1 \over 3}\,({\bar f_{,\overline {\mathcal G}}}\overline {\mathcal G} - \bar f) - 8{{{\rm{d}}{{\bar f}_{,\overline {\mathcal G}}}} \over {{\rm{d}}N}}\,{\bar H^4}\,,$$
(12.24)

where \(\bar H_\Lambda ^2 = \Omega _m^{(0)}/{a^3} + \Omega _r^{(0)}/{a^4} + \bar \Lambda/3\) (which represents the Hubble parameter in the ΛCDM model). In the following we omit the bar for simplicity.

Eq. (12.24) contains derivatives of H with respect to N up to second order. We therefore write Eq. (12.24) in the form

$${H^2} - H_\Lambda ^2 = C\left({{H^2},{H^2}^\prime, {H^2}^{\prime \prime}} \right)\,,$$
(12.25)

where \(C = (f_{,\mathcal G}{\mathcal G} - f)/3 - 8{H^4}({\rm{d}}f_{,\mathcal G}/{\rm{d}}N)\). At high redshifts (a ≲ 0.01) the models (A) and (B) are close to the ΛCDM model, i.e., \({H^2} \simeq H_\Lambda ^2\). As a starting guess we set the solution to be \(H_{(0)}^2 = H_\Lambda ^2\). The first iteration is then \(H_{(1)}^2 = H_\Lambda ^2 + {C_{(0)}}\), where \({C_{(0)}} \equiv C\left( {H_{(0)}^2,H{{_{(0)}^2}^\prime },H{{_{(0)}^2}^{\prime \prime }}} \right)\). The second iteration is \(H_{(2)}^2 = H_\Lambda ^2 + {C_{(1)}}\), where \({C_{(1)}} \equiv C\left( {H_{(1)}^2,H{{_{(1)}^2}^\prime },H{{_{(1)}^2}^{\prime \prime }}} \right)\).

If the starting guess is in the basin of a fixed point, \(H_{(i)}^2\) converges to the solution of the equation as the iteration number i increases. For the convergence we need the following condition

$${{H_{i + 1}^2 - H_i^2} \over {H_{i + 1}^2 + H_i^2}} < {{H_i^2 - H_{i - 1}^2} \over {H_i^2 + H_{i - 1}^2}}\,,$$
(12.26)

which means that each correction decreases for larger i. The following relation is also required to be satisfied:

$${{H_{i + 1}^2 - H_\Lambda ^2 - {C_{i + 1}}} \over {H_{i + 1}^2 - H_\Lambda ^2 + {C_{i + 1}}}} < {{H_i^2 - H_\Lambda ^2 - {C_i}} \over {H_i^2 - H_\Lambda ^2 + {C_i}}}.$$
(12.27)

Once the solution begins to converge, one can stop the iteration when the required/available level of precision is reached. In Figure 13 we plot absolute errors for the model (12.16), which shows that the iterative method can produce solutions accurately in the high-redshift regime. Typically this method stops working when the initial guess is outside the basin of convergence. This happens at low redshifts, where the modifications of gravity come into play. In this regime we just need to integrate Eqs. (12.19)-(12.22) directly.
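The first step of the iteration can be evaluated in closed form, because the derivatives of \(H_\Lambda^2\) with respect to N are known analytically. The sketch below does this for model (A), for which \((f_{,\mathcal G}{\mathcal G} - f)/3 = (\lambda s/6)\ln (1 + g^2)\) and \(f_{,\mathcal G} = \lambda \arctan (g)/s\) in barred units, with \(g = {\mathcal G}/{\mathcal G}_\ast\) and \(s \equiv \sqrt{{\mathcal G}_\ast}/H_0^2\). Several inputs are assumptions made only for illustration: the density parameters, and the normalization \(\bar\Lambda = 3(1 - \Omega_m^{(0)} - \Omega_r^{(0)})\), which fixes s through \(\bar\Lambda = \alpha\lambda s\). Later iterates would need numerical derivatives on a grid in N and are not attempted here.

```python
import math

def h2_lcdm(N, om0, or0, lam_bar):
    """Return H_Lambda^2 and its first two derivatives with respect to N = ln a."""
    m, r = om0 * math.exp(-3.0 * N), or0 * math.exp(-4.0 * N)
    return m + r + lam_bar / 3.0, -3.0 * m - 4.0 * r, 9.0 * m + 16.0 * r

def correction_C(u, du, d2u, lam, s):
    """C(H^2, H^2', H^2'') of Eq. (12.25) for model (A); u = H^2/H_0^2."""
    G = 12.0 * u * (2.0 * u + du)                  # G/H_0^4 = 24 H^2 (H^2 + H H')
    dG = 12.0 * (du * (2.0 * u + du) + u * (2.0 * du + d2u))
    g, dg = G / s**2, dG / s**2                    # g = G/G_* and dg/dN
    static = lam * s * math.log1p(g * g) / 6.0     # (f_G G - f)/3
    dfG_dN = lam * dg / (s * (1.0 + g * g))        # d(f_G)/dN
    return static - 8.0 * u * u * dfG_dN

def first_iterate(N, om0=0.3, or0=8.5e-5, lam=0.075, alpha=10.0):
    """Return (H_(0)^2, H_(1)^2) at a given N for illustrative parameters."""
    lam_bar = 3.0 * (1.0 - om0 - or0)    # assumption: LCDM-like normalization
    s = lam_bar / (alpha * lam)          # from Lambda_bar = alpha * lam * s
    u, du, d2u = h2_lcdm(N, om0, or0, lam_bar)
    return u, u + correction_C(u, du, d2u, lam, s)

for N in (-6.0, -5.0, -4.0):
    h2_0, h2_1 = first_iterate(N)
    print(N, h2_0, (h2_1 - h2_0) / h2_0)   # relative size of the first correction
```

Under these assumptions the relative correction \(C_{(0)}/H_\Lambda^2\) is already very small for N ≲ −4 and shrinks further toward the past, which is why \(H_\Lambda^2\) is a good starting guess there.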

Figure 13: Plot of the absolute errors \({\log _{10}}(\vert H_i^2 - H_\Lambda ^2 - {C_i}\vert)\) (left) and \({\log _{10}}\left[ {{{\vert H_i^2 - H_\Lambda ^2 - {C_i}\vert} \over {\vert H_i^2 - H_\Lambda ^2 + {C_i}\vert}}} \right]\) (right) versus N = ln a for the model (12.16) with i = 0, 1, …, 6. The model parameters are α = 10 and λ = 0.075. The iterative method provides the solutions with high accuracy in the regime N ≲ −4. From [188].

12.3.3 Solar system constraints

We study local gravity constraints on cosmologically viable \(f({\mathcal G})\) models. First of all there is a big difference between \(f({\mathcal G})\) and f(R) theories. The vacuum GR solution of a spherically symmetric manifold, the Schwarzschild metric, corresponds to a vanishing Ricci scalar (R = 0) outside the star. In the presence of non-relativistic matter, R is approximately equal to the matter density \(\kappa^2\rho_m\) for viable f(R) models.

On the other hand, even for the vacuum exterior of the Schwarzschild metric, the GB term has a non-vanishing value \({\mathcal G} = {R_{\alpha \beta \gamma \delta}}{R^{\alpha \beta \gamma \delta}} = 12r_s^2/{r^6}\) [178, 185], where \(r_s = 2GM_\odot\) is the Schwarzschild radius of the object. In the regime \(\vert {\mathcal G}\vert \, \gg {{\mathcal G}_\ast}\) the models (A) and (B) have a correction term of the order \(\lambda \sqrt {{{\mathcal G}_\ast}} {\mathcal G}_\ast^2/{{\mathcal G}^2}\) plus a cosmological constant term \(- (\alpha + 1)\lambda \sqrt {{{\mathcal G}_\ast}}\). Since \({\mathcal G}\) does not vanish even in the vacuum, the correction term \({\mathcal G}_\ast^2/{{\mathcal G}^2}\) can be much smaller than 1 even in the absence of non-relativistic matter. If matter is present, this gives rise to a contribution of the order of \(R^2 \approx (\kappa^2\rho_m)^2\) to the GB term. The ratio of the matter contribution to the vacuum value \({{\mathcal G}^{(0)}} = 12r_s^2/{r^6}\) is estimated as

$${R_m} \equiv {{{R^2}} \over {{{\mathcal G}^{(0)}}}} \approx {{{{(8\pi)}^2}} \over {48}}{{\rho _m^2{r^6}} \over {M_ \odot ^2}}\,.$$
(12.28)

At the surface of the Sun (radius \(r_\odot = 6.96 \times 10^{10}\ {\rm cm} = 3.53 \times 10^{24}\ {\rm GeV}^{-1}\) and mass \(M_\odot = 1.99 \times 10^{33}\ {\rm g} = 1.12 \times 10^{57}\ {\rm GeV}\)), the density \(\rho_m\) drops down rapidly from the order \(\rho_m \approx 10^{-2}\ {\rm g/cm^3}\) to the order \(\rho_m \approx 10^{-16}\ {\rm g/cm^3}\). If we take the value \(\rho_m = 10^{-2}\ {\rm g/cm^3}\) we have \(R_m \approx 4 \times 10^{-5}\) (where we have used \(1\ {\rm g/cm^3} = 4.31 \times 10^{-18}\ {\rm GeV}^4\)). Taking the value \(\rho_m = 10^{-16}\ {\rm g/cm^3}\) leads to a much smaller ratio: \(R_m \approx 4 \times 10^{-33}\). The matter density approaches a constant value \(\rho_m \approx 10^{-24}\ {\rm g/cm^3}\) around the distance \(r = 10^3 r_\odot\) from the center of the Sun. Even at this distance we have \(R_m \approx 4 \times 10^{-31}\), which means that the matter contribution to the GB term can be neglected in the solar system we are interested in.
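The three ratios quoted above follow directly from Eq. (12.28) with the unit conversions given in the text. A short arithmetic check in Python (the function and constant names are ours):

```python
import math

G_CM3_TO_GEV4 = 4.31e-18   # 1 g/cm^3 in GeV^4, as quoted in the text
R_SUN = 3.53e24            # solar radius in GeV^-1
M_SUN = 1.12e57            # solar mass in GeV

def matter_to_vacuum_ratio(rho_g_cm3, r=R_SUN, mass=M_SUN):
    """R_m of Eq. (12.28): (8 pi)^2 / 48 * rho^2 r^6 / M^2 in natural units."""
    rho = rho_g_cm3 * G_CM3_TO_GEV4
    return (8.0 * math.pi) ** 2 / 48.0 * rho**2 * r**6 / mass**2

print(matter_to_vacuum_ratio(1e-2))                   # dense side of the solar surface
print(matter_to_vacuum_ratio(1e-16))                  # dilute side of the solar surface
print(matter_to_vacuum_ratio(1e-24, r=1e3 * R_SUN))   # at 10^3 solar radii
```

Newton's constant cancels in the ratio, since both \(R^2 \approx (8\pi G\rho_m)^2\) and \({\mathcal G}^{(0)} = 48(GM_\odot)^2/r^6\) scale as \(G^2\).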

In order to discuss the effect of the correction term \({\mathcal G}_\ast^2/{{\mathcal G}^2}\) on the Schwarzschild metric, we introduce a dimensionless parameter

$$\varepsilon = \sqrt {{{\mathcal G}_{\ast}}/{{\mathcal G}_s}},$$
(12.29)

where \({{\mathcal G}_s} = 12/r_s^4\) is the scale of the GB term in the solar system. Since \(\sqrt {{{\mathcal G}_\ast}}\) is of the order of \(H_0^2\), where \(H_0 \approx 70\ {\rm km\,s^{-1}\,Mpc^{-1}}\) is the present Hubble parameter, the parameter for the Sun is approximately given by \(\varepsilon \approx 10^{-46}\). We can then decompose the vacuum equations in the form

$${G_{\mu \nu}} + \varepsilon {\Sigma _{\mu \nu}} = 0\,,$$
(12.30)

where Gμν is the Einstein tensor and

$$\begin{array}{*{20}c} {{\Sigma _{\mu \nu}} = 8\left[ {{R_{\mu \rho \nu \sigma}} + {R_{\rho \nu}}{g_{\sigma \mu}} - {R_{\rho \sigma}}{g_{\nu \mu}} - {R_{\mu \nu}}{g_{\sigma \rho}} + {R_{\mu \sigma}}{g_{\nu \rho}} + R({g_{\mu \nu}}{g_{\sigma \rho}} - {g_{\mu \sigma}}{g_{\nu \rho}})/2} \right]{\nabla ^\rho}{\nabla ^\sigma}{{\tilde f}_{,{\mathcal G}}}} \\ {+ ({\mathcal G}{{\tilde f}_{,{\mathcal G}}} - \tilde f){g_{\mu \nu}}.\quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad} \\ \end{array}$$
(12.31)

Here \({\tilde f}\) is defined by \(f = \varepsilon \tilde f\).
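The order of magnitude of ε for the Sun can be reproduced with a few lines of arithmetic. The sketch below assumes \(\sqrt{{\mathcal G}_\ast} = H_0^2\) exactly (the text only fixes this up to a factor of order unity) and uses standard cgs values for the constants; the names are ours.

```python
import math

G_NEWTON = 6.674e-8    # cm^3 g^-1 s^-2
C_LIGHT = 2.998e10     # cm s^-1
M_SUN_G = 1.99e33      # g
MPC_CM = 3.086e24      # cm

def epsilon_estimate(mass_g=M_SUN_G, h0_km_s_mpc=70.0):
    """epsilon = sqrt(G_*/G_s) of Eq. (12.29), assuming sqrt(G_*) = H_0^2."""
    r_s = 2.0 * G_NEWTON * mass_g / C_LIGHT**2       # Schwarzschild radius in cm
    h0 = h0_km_s_mpc * 1e5 / MPC_CM / C_LIGHT        # H_0/c in cm^-1
    sqrt_G_s = math.sqrt(12.0) / r_s**2              # sqrt(12 / r_s^4) in cm^-2
    return h0**2 / sqrt_G_s

eps = epsilon_estimate()
print(eps, math.log(eps))   # epsilon and ln(epsilon)
```

The logarithm is of order \(-10^2\), which is the size of the ln ε terms that appear in the first-order metric corrections.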

We introduce the following ansatz for the metric

$${\rm{d}}{s^2} = - A(r,\varepsilon)\,{\rm{d}}{t^2} + {B^{- 1}}(r,\varepsilon){\rm{d}}{r^2} + {r^2}({\rm{d}}{\theta ^2} + {\sin ^2}\theta {\rm{d}}{\phi ^2})\,,$$
(12.32)

where the functions A and B are expanded as power series in ϵ, as

$$A = {A_0}(r) + {A_1}(r)\varepsilon + O({\varepsilon ^2})\,,\qquad B = {B_0}(r) + {B_1}(r)\varepsilon + O({\varepsilon ^2})\,.$$
(12.33)

Then we can solve Eq. (12.30) as follows. At zero-th order the equations read

$${G^\mu}{_\nu ^{(0)}}({A_0},\,{B_0}) = 0\,,$$
(12.34)

which leads to the usual Schwarzschild solution, \(A_0 = B_0 = 1 - r_s/r\). At linear order one has

$$\varepsilon \,[{G^\mu}_\nu {}^{(1)}({A_1},\,{B_1},\,{A_0},\,{B_0}) + {\Sigma ^\mu}_\nu {}^{(0)}({A_0},\,{B_0})] = 0\,.$$
(12.35)

Since \(A_0\) and \(B_0\) are known, one can solve the differential equations for \(A_1\) and \(B_1\). This process can be iterated order by order in ε.

For the model (A) introduced in (12.16), we obtain the following differential equations for \(A_1\) and \(B_1\) [185]:

$$\rho {{{\rm{d}}{B_1}} \over {{\rm{d}}\rho}} + {B_1} = 32\sqrt 3 \lambda {\rho ^3} + 12\sqrt 3 \lambda {\rho ^2}\ln \,(\rho) + (4\ln \varepsilon - 2\alpha - 28)\sqrt 3 \lambda {\rho ^2}\,,$$
(12.36)
$$\begin{array}{*{20}c} {(\rho - {\rho ^2}){{{\rm{d}}{A_1}} \over {{\rm{d}}\rho}} + {A_1} = 8\sqrt 3 \lambda {\rho ^4} - 2\sqrt 3 (10 + 6\ln \rho + 2\ln \varepsilon - \alpha)\lambda {\rho ^3}} \\ {\quad \quad \quad \quad \quad \quad \quad - 2\sqrt 3 (\alpha - 6\ln \rho - 2\ln \varepsilon - 6)\lambda {\rho ^2} + \rho {B_1},} \\ \end{array}$$
(12.37)

where \(\rho \equiv r/r_s\). The solutions to these equations are

$${B_1} = 8\sqrt 3 \lambda {\rho ^3} + 4\sqrt 3 \lambda {\rho ^2}\ln \rho + {2 \over 3}\sqrt 3 \left({2\ln \varepsilon - \alpha - 16} \right)\lambda {\rho ^2}\,,$$
(12.38)
$${A_1} = - {{16} \over 3}\sqrt 3 \lambda {\rho ^3} + {2 \over 3}\sqrt 3 \left({4 - \alpha + 6\ln \rho + 2\ln \varepsilon} \right)\lambda {\rho ^2}\,.$$
(12.39)

Here we have neglected the contribution coming from the homogeneous solution, as this would correspond to an order ϵ renormalization contribution to the mass of the system. Although ϵ ≪ 1, the term in ln ϵ only contributes by a factor of order \(10^2\). Since ρ ≫ 1 the largest contributions to \(B_1\) and \(A_1\) correspond to those proportional to \(\rho^3\), which are different from the Schwarzschild-de Sitter contribution (which grows as \(\rho^2\)). Hence the model (12.16) gives rise to corrections larger than those in the cosmological constant case by a factor of ρ. Since ϵ is very small, the contributions to the solar-system experiments due to this modification are too weak to be detected. The strongest bound comes from the shift of the perihelion of Mercury, which gives the very mild bound \(\lambda < 2 \times 10^5\) [185]. For the model (12.17) the constraint is even weaker, \(\lambda(1 + \alpha) < 10^{14}\). In other words, the corrections look similar to the Schwarzschild-de Sitter metric on which only very weak bounds can be placed.
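That Eqs. (12.38) and (12.39) solve Eqs. (12.36) and (12.37) can be verified by direct substitution. A minimal numerical check in Python, using central differences for the derivatives; the parameter values passed at the end are arbitrary illustrations and the function names are ours:

```python
import math

SQRT3 = math.sqrt(3.0)

def B1(rho, lam, alpha, ln_eps):
    """Particular solution (12.38)."""
    return SQRT3 * lam * rho**2 * (
        8.0 * rho + 4.0 * math.log(rho)
        + (2.0 / 3.0) * (2.0 * ln_eps - alpha - 16.0))

def A1(rho, lam, alpha, ln_eps):
    """Particular solution (12.39)."""
    return SQRT3 * lam * rho**2 * (
        -(16.0 / 3.0) * rho
        + (2.0 / 3.0) * (4.0 - alpha + 6.0 * math.log(rho) + 2.0 * ln_eps))

def ddrho(func, rho, *args):
    """Central finite-difference derivative with respect to rho."""
    h = 1e-5 * rho
    return (func(rho + h, *args) - func(rho - h, *args)) / (2.0 * h)

def residuals(rho, lam, alpha, ln_eps):
    """Relative residuals (left minus right side) of Eqs. (12.36) and (12.37)."""
    L = math.log(rho)
    b1 = B1(rho, lam, alpha, ln_eps)
    rhs_B = SQRT3 * lam * (32.0 * rho**3 + 12.0 * rho**2 * L
                           + (4.0 * ln_eps - 2.0 * alpha - 28.0) * rho**2)
    rhs_A = (SQRT3 * lam * (
        8.0 * rho**4
        - 2.0 * (10.0 + 6.0 * L + 2.0 * ln_eps - alpha) * rho**3
        - 2.0 * (alpha - 6.0 * L - 2.0 * ln_eps - 6.0) * rho**2)
        + rho * b1)
    res_B = rho * ddrho(B1, rho, lam, alpha, ln_eps) + b1 - rhs_B
    res_A = ((rho - rho**2) * ddrho(A1, rho, lam, alpha, ln_eps)
             + A1(rho, lam, alpha, ln_eps) - rhs_A)
    return res_B / abs(rhs_B), res_A / abs(rhs_A)

print(residuals(50.0, 1.0, 100.0, -106.0))
print(residuals(1.0e3, 1.0, 100.0, -106.0))
```

Both residuals vanish up to finite-difference error, so the quoted particular solutions are consistent with the differential equations as written.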

12.3.4 Ghost conditions in the FLRW background

In the following we shall discuss ghost conditions for the action (12.9). For simplicity let us consider the vacuum case (\(S_M = 0\)) in the FLRW background. The action (12.9) can be expanded at second order in perturbations for the perturbed metric (6.1), as we have done for the action (6.2) in Section 7.4. Before doing so, we introduce the gauge-invariant perturbed quantity

$${\mathcal R} = \psi - {H \over {\dot \xi}}\delta \xi \,,\quad {\rm{where}}\quad \xi \equiv {f_{,{\mathcal G}}}\,.$$
(12.40)

This quantity completely describes the dynamics of all the scalar perturbations. Note that for the gauge choice δξ = 0 one has \({\mathcal R} = \psi\). Integrating out all the auxiliary fields, we obtain the second-order perturbed action [186]

$$\delta {S^{(2)}} = \int {\rm{d}} t\,{{\rm{d}}^3}x\,{a^3}\,{Q_s}\left[ {{1 \over 2}{{\dot {\mathcal R}}^2} - {1 \over 2}\,{{c_s^2} \over {{a^2}}}{{(\nabla {\mathcal R})}^2}} \right]\,,$$
(12.41)

where we have defined

$${Q_s} \equiv {{24(1 + 4\mu)\,{\mu ^2}} \over {{\kappa ^2}{{(1 + 6\mu)}^2}}},$$
(12.42)
$$c_s^2 \equiv 1 + {{2\dot H} \over {{H^2}}} = - 2 - 3{w_{{\rm{eff}}}}.$$
(12.43)

Recall that μ has been introduced in Eq. (12.15).

In order to prevent the scalar mode from becoming a ghost, one requires that \(Q_s > 0\), i.e.

$$\mu > - 1/4.$$
(12.44)

This relation is dynamical, as one needs to know how H and its derivatives change in time. Therefore whatever \(f({\mathcal G})\) is, the propagating scalar mode can still become a ghost. If \({{\dot f}_{,{\mathcal G}}} > 0\) and H > 0, then μ > 0 and hence the ghost does not appear. The quantity \(c_s\) characterizes the speed of propagation for the scalar mode, which is again dependent on the dynamics. For any GB theory, one can give initial conditions of H and \(\dot H\) such that \(c_s^2\) becomes negative. This instability, if present, governs the high momentum modes in Fourier space, which corresponds to an Ultra-Violet (UV) instability. In order to avoid this UV instability in the vacuum, we require that the effective equation of state satisfies \(w_{\rm eff} < -2/3\). At the de Sitter point (\(w_{\rm eff} = -1\)) the speed \(c_s\) is time-independent and reduces to the speed of light (\(c_s = 1\)).

Suppose that the scalar mode is not a ghost, i.e., \(Q_s > 0\). Making the field redefinition \(u = {z_s}{\mathcal R}\) and \({z_s} = a\sqrt {{Q_s}}\), the action (12.41) can be written as

$$\delta {S^{(2)}} = \int {{\rm{d}}\eta \,{{\rm{d}}^3}x\left[ {{1 \over 2}\,{{u^{\prime}}^2} - {1 \over 2}\,c_s^2{{(\nabla u)}^2} - {1 \over 2}\,{a^2}\,M_s^2\,{u^2}} \right]\,,}$$
(12.45)

where a prime represents a derivative with respect to \(\eta = \int a^{-1}{\rm d}t\) and \(M_s^2 \equiv - z_s^{\prime\prime}/({a^2}{z_s})\). In order to realize the positive mass squared \((M_s^2 > 0)\), the condition \(f_{,{\mathcal G}{\mathcal G}} > 0\) needs to be satisfied in the regime μ ≪ 1 (analogous to the condition \(f_{,RR} > 0\) in metric f(R) gravity).
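The two vacuum conditions of this subsection, \(Q_s > 0\) from Eq. (12.42) and \(c_s^2 > 0\) from Eq. (12.43), are simple enough to tabulate. A small sketch, with κ² set to 1 as an assumption and helper names of our own choosing:

```python
def q_s(mu, kappa2=1.0):
    """Kinetic coefficient Q_s of Eq. (12.42)."""
    return 24.0 * (1.0 + 4.0 * mu) * mu**2 / (kappa2 * (1.0 + 6.0 * mu) ** 2)

def cs2_vacuum(w_eff):
    """Squared propagation speed of Eq. (12.43)."""
    return -2.0 - 3.0 * w_eff

def classify(mu, w_eff):
    """Return (ghost_free, uv_stable) for a vacuum FLRW background."""
    return q_s(mu) > 0.0, cs2_vacuum(w_eff) > 0.0

print(classify(1e-4, -1.0))   # de Sitter point: no ghost, c_s^2 = 1
print(classify(1e-4, 0.0))    # w_eff = 0: no ghost, but c_s^2 = -2
print(classify(-0.3, -1.0))   # mu < -1/4: ghost
```

Note that \(Q_s\) vanishes at μ = 0 and is singular at μ = −1/6, so those two values are excluded from the classification.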

12.3.5 Viability of \(f({\mathcal G})\) gravity in the presence of matter

In the presence of matter, other degrees of freedom appear in the action. Let us take into account a perfect fluid with the barotropic equation of state \(w_M = P_M/\rho_M\). It can be proved that, for small scales (i.e., for large momenta k) in Fourier space, there are two different propagation speeds given by [182]

$$c_1^2 = {w_M}\,,$$
(12.46)
$$c_2^2 = 1 + {{2\dot H} \over {{H^2}}} + {{1 + {w_M}} \over {1 + 4\mu}}{{{\kappa ^2}{\rho _M}} \over {3{H^2}}}.$$
(12.47)

The first result is expected, as it corresponds to the matter propagation speed. Meanwhile the presence of matter gives rise to a correction term in \(c_2^2\) relative to \(c_s^2\) in Eq. (12.43). This latter result is due to the fact that the background equations of motion are different between the two cases. Recall that for viable \(f({\mathcal G})\) models one has ∣μ∣ ≪ 1 at high redshifts. Since the background evolution is approximately given by \(3H^2 \simeq 8\pi G\rho_M\) and \(\dot H/H^2 \simeq -(3/2)(1 + w_M)\), it follows that

$$c_2^2 \simeq - 1 - 2{w_M}\,.$$
(12.48)

Hence the UV instability can be avoided for \(w_M < -1/2\). During the radiation era (\(w_M = 1/3\)) and the matter era (\(w_M = 0\)), the large momentum modes are unstable. In particular this leads to the violent growth of matter density perturbations incompatible with the observations of large-scale structure [383, 182]. The onset of the negative instability can be characterized by the condition [182]

$$\mu \approx {(aH/k)^2}\,.$$
(12.49)

As long as μ ≠ 0 we can always find a wavenumber k (≫ aH) satisfying the condition (12.49). For those scales linear perturbation theory breaks down, and in principle one should look for all higher-order contributions. Hence the background solutions cannot be trusted any longer, at least for small scales, which makes the theory unpredictable. In the same regime, one can easily see that the scalar mode is not a ghost, as Eq. (12.44) is satisfied (see Figure 12). Therefore the instability is purely classical. This kind of UV instability poses serious problems for any theory, including \(f({\mathcal G})\) gravity.
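The step from Eq. (12.47) to Eq. (12.48), and the scale set by Eq. (12.49), can be made concrete with a short calculation. In the sketch below `omega_m` stands for \(\kappa^2\rho_M/(3H^2)\); the function names and the sample value μ = 10⁻⁶ are ours, chosen only for illustration.

```python
def c2_squared(w_m, mu, hdot_over_h2, omega_m):
    """c_2^2 of Eq. (12.47), with omega_m = kappa^2 rho_M / (3 H^2)."""
    return 1.0 + 2.0 * hdot_over_h2 + (1.0 + w_m) / (1.0 + 4.0 * mu) * omega_m

def c2_squared_high_z(w_m, mu=0.0):
    """High-redshift limit: omega_m ~ 1 and Hdot/H^2 ~ -(3/2)(1 + w_M)."""
    return c2_squared(w_m, mu, -1.5 * (1.0 + w_m), 1.0)

def onset_wavenumber(mu, a_h):
    """Wavenumber at which Eq. (12.49), mu ~ (aH/k)^2, is met (order of magnitude)."""
    return a_h / mu**0.5

for w_m in (1.0 / 3.0, 0.0, -0.6):
    print(w_m, c2_squared_high_z(w_m), -1.0 - 2.0 * w_m)   # Eq. (12.47) vs (12.48)

print(onset_wavenumber(1e-6, 1.0))   # k/(aH) of order 10^3 for mu of order 10^-6
```

For radiation and dust the result is negative, while \(w_M = -0.6 < -1/2\) gives a positive value. The smaller μ is, the shorter the wavelength at which the instability sets in, but for μ ≠ 0 it is never pushed to infinity.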

12.3.6 The speed of propagation in more general modifications of gravity

We shall also discuss more general theories given by Eq. (12.8), i.e.

$$S = \int {{{\rm{d}}^4}} x\sqrt {- g} \,f(R,{\mathcal G})\,,$$
(12.50)

where we do not take into account the matter term here. It is clear that this function allows more freedom with respect to the background cosmological evolution, as one now has a function of two variables to choose. However, once more the behavior of perturbations proves to be a strong tool for gaining deep insight into the theory.

The second-order action for perturbations is given by

$$S = \int {\rm{d}} t\,{{\rm{d}}^3}x\,{a^3}\,{Q_s}\left[ {{1 \over 2}\,{{\dot {\mathcal R}}^2} - {1 \over 2}\,{{{B_1}} \over {{a^2}}}{{(\nabla {\mathcal R})}^2} - {1 \over 2}\,{{{B_2}} \over {{a^4}}}{{({\nabla ^2}{\mathcal R})}^2}} \right]\,,$$
(12.51)

where we have introduced the gauge-invariant field

$${\mathcal R} = \psi - {{H(\delta F + 4{H^2}\delta \xi)} \over {\dot F + 4{H^2}\dot \xi}}\,,$$
(12.52)

with \(F \equiv f_{,R}\) and \(\xi \equiv f_{,\mathcal G}\). The forms of \(Q_s(t)\), \(B_1(t)\) and \(B_2(t)\) are given explicitly in [186].

The quantity \(B_2\) vanishes either on the de Sitter solution or for those theories satisfying

$$\Delta \equiv {{{\partial ^2}f} \over {\partial {R^2}}}{{{\partial ^2}f} \over {\partial {{\mathcal G}^2}}} - {\left({{{{\partial ^2}f} \over {\partial R\partial {\mathcal G}}}} \right)^2} = 0\,.$$
(12.53)

If Δ ≠ 0, then the modes with high momenta k have a very different propagation. Indeed the frequency ω becomes k-dependent, that is [186]

$${\omega ^2} = {B_2}\,{{{k^4}} \over {{a^4}}}.$$
(12.54)

If \(B_2 < 0\), then a violent instability arises. If \(B_2 > 0\), then these modes propagate with a group velocity

$${\upsilon _g} = 2\sqrt {{B_2}} \,{k \over a}\,.$$
(12.55)

This result implies that superluminal propagation is always present in these theories, and the speed is scale-dependent. On the other hand, when Δ = 0, this behavior is not present at all. Therefore, there is a physical property by which different modifications of gravity can be distinguished. The presence of an extra matter scalar field does not change this regime at high k [185], because the Laplacian of the gravitational field is not modified by the field coupled to gravity in the form \(f(\phi, R,{\mathcal G})\).

12.4 Gauss-Bonnet gravity coupled to a scalar field

At the end of this section we shall briefly discuss theories with a GB term coupled to a scalar field with the action given in Eq. (12.6). The scalar coupling with the GB term often appears as higher-order corrections to low-energy, tree-level effective string theory based on toroidal compactifications [275, 276]. More explicitly the low-energy string effective action in four dimensions is given by

$$S = \int {{{\rm{d}}^4}} x\sqrt {- g} {e^{- \phi}}\left[ {{1 \over 2}R + {1 \over 2}{{(\nabla \phi)}^2} + {{\mathcal L}_M} + {{\mathcal L}_c} \cdots} \right]\,,$$
(12.56)

where ϕ is a dilaton field that controls the string coupling parameter, \(g_s^2 = {e^\phi}\). The above action is the string frame action in which the dilaton is directly coupled to a scalar curvature, R. The Lagrangian \({{\mathcal L}_M}\) is that of additional matter fields (fluids, axion, modulus etc.). The Lagrangian \({{\mathcal L}_c}\) corresponds to higher-order string corrections including the coupling between the GB term and the dilaton. A possible set of corrections include terms of the form [273, 105, 147]

$${{\mathcal L}_c} = - {1 \over 2}\alpha^{\prime}\lambda \zeta (\phi)\left[ {c\,{\mathcal G} + d{{(\nabla \phi)}^4}} \right]\,,$$
(12.57)

where α′ is a string expansion parameter and ζ(ϕ) is a general function of ϕ. The constant λ is an additional parameter which depends on the type of string theory: λ = −1/4, −1/8, and 0 correspond to bosonic, heterotic, and superstrings, respectively. If we require that the full action agrees with the three-graviton scattering amplitude, the coefficients c and d are fixed to be c = −1, d = 1, and \(\zeta(\phi) = -e^{-\phi}\) [425].

In the Pre-Big-Bang (PBB) scenario [275] the dilaton evolves from a weakly coupled regime (\(g_s \ll 1\)) toward a strongly coupled region (\(g_s \gtrsim 1\)) during which the Hubble parameter grows in the string frame (superinflation). This superinflation is driven by a kinetic energy of the dilaton field and it is called a PBB branch. There exists another Friedmann branch with a decreasing curvature. If \({{\mathcal L}_c} = 0\) these branches are disconnected from each other with the appearance of a curvature singularity. However the presence of the correction \({{\mathcal L}_c}\) allows the existence of nonsingular solutions that connect the two branches [273, 105, 147].

The corrections \({{\mathcal L}_c}\) are the sum of the tree-level α′ corrections and the quantum n-loop corrections (n = 1, 2, 3, …) with the function ζ(ϕ) given by \(\zeta (\phi) = - \sum\nolimits_{n = 0} {C_n}{e^{(n - 1)\phi}}\), where \(C_n\) (n ≥ 1) are coefficients of n-loop corrections (with \(C_0 = 1\)). In the context of the PBB cosmology it was shown in [105] that there exist regular cosmological solutions in the presence of tree-level and one-loop corrections, but this is not realistic in that the Hubble rate in the Einstein frame continues to increase after the bounce. Nonsingular solutions that connect to a Friedmann branch can be obtained by accounting for the corrections up to two-loop with a negative coefficient (\(C_2 < 0\)) [105, 147]. In the context of Ekpyrotic cosmology where a negative potential V(ϕ) is present in the Einstein frame, it is possible to realize nonsingular solutions by taking into account corrections similar to \({{\mathcal L}_c}\) given above [588]. For a system in which a modulus field is coupled to the GB term, one can also realize regular solutions even without the higher-derivative term \((\nabla \phi)^4\) in Eq. (12.57) [34, 224, 336, 337, 338, 623, 12, 582]. These results show that the GB term can play a crucial role in eliminating the curvature singularity.

In the context of dark energy there are some works which studied the effect of the GB term on the late-time cosmic acceleration. A simple model that can give rise to cosmic acceleration is provided by the action [463]

$$S = \int {{{\rm{d}}^4}} x\sqrt {- g} \left[ {{1 \over 2}R - {1 \over 2}{{(\nabla \phi)}^2} - V(\phi) - f(\phi)\,{\mathcal G}} \right] + {S_M}\,,$$
(12.58)

where V(ϕ) and f(ϕ) are functions of a scalar field ϕ. This can be viewed as the action in the Einstein frame corresponding to the Jordan frame action (12.56). We note that the conformal transformation gives rise to a coupling between the field ϕ and non-relativistic matter in the Einstein frame. Such a coupling is assumed to be negligibly small at low energy scales, as in the case of the runaway dilaton scenario [274, 176]. For the exponential potential V(ϕ) = V0eλϕ and the coupling f(ϕ) = (f0/μ)eμϕ, cosmological dynamics has been extensively studied in [463, 360, 361, 593] (see also [523, 452, 453, 381]). In particular it was found in [360, 593] that a scaling matter era can be followed by a late-time de Sitter solution which appears due to the presence of the GB term.

Koivisto and Mota [360] placed observational constraints on the above model using the Gold data set of Supernovae Ia together with the CMB shift parameter data of WMAP. The parameter λ is constrained to be 3.5 < λ < 4.5 at the 95% confidence level. In the second paper [361], they included the constraints coming from the BBN, LSS, BAO and solar system data and showed that these data strongly disfavor the GB model discussed above. Moreover, it was shown in [593] that tensor perturbations are subject to negative instabilities in the above model when the GB term dominates the dynamics (see also [290]). Amendola et al. [25] studied local gravity constraints on the model (12.58) and showed that the energy contribution coming from the GB term needs to be strongly suppressed for consistency with solar-system experiments. This is typically of the order of \(\Omega_{\rm GB} \lesssim 10^{-30}\) and hence the GB term of the coupling \(f(\phi){\mathcal G}\) cannot be responsible for the current accelerated expansion of the universe.

In summary the GB gravity with a scalar field coupling allows nonsingular solutions in the high curvature regime, but such a coupling is difficult to reconcile with the cosmic acceleration at low energy scales. Recall that dark energy models based on \(f({\mathcal G})\) gravity also suffer from the UV instability problem. This shows how the presence of the GB term makes it difficult to satisfy all experimental and observational constraints if such a modification is responsible for the late-time acceleration. This property is different from metric f(R) gravity in which viable dark energy models can be constructed.

13 Other Aspects of f(R) Theories and Modified Gravity

In this section we discuss a number of topics related to f(R) theories and modified gravity. These include weak lensing, thermodynamics and horizon entropy, unified models of inflation and dark energy, f(R) theories in extra dimensions, the Vainshtein mechanism, the DGP model, and Noether and Galileon symmetries.

13.1 Weak lensing

Weak gravitational lensing is sensitive to the growth of large scale structure as well as the relation between matter and gravitational potentials. Since the evolution of matter perturbations and gravitational potentials in modified gravity is different from that in GR, the observations of weak lensing can provide an important test for probing modified gravity on galactic scales (see [2, 527, 27, 595, 528, 548] for theoretical aspects and [546, 348, 322, 3, 629, 373, 326, 177, 73] for observational aspects). In particular a number of wide-field galaxy surveys are planned to measure galaxy counts and weak lensing shear with high accuracy, so these will be useful to distinguish between modified gravity and the ΛCDM model in future observations.

Let us consider BD theory with the action (10.10), which includes f(R) gravity as a specific case. Note that the method explained below can be applied to other modified gravity models as well. The equations of matter perturbations \(\delta_m\) and gravitational potentials Φ, Ψ in BD theory have been already derived under the quasi-static approximation on sub-horizon scales (k ≫ aH), see Eqs. (10.38), (10.39), and (10.40). In order to discuss weak lensing observables, we define the lensing deflecting potential

$${\Phi _{{\rm{wl}}}} \equiv \Phi + \Psi \,,$$
(13.1)

and the effective density field

$${\delta _{{\rm{eff}}}} \equiv - {a \over {3H_0^2\Omega _m^{(0)}}}{k^2}{\Phi _{{\rm{wl}}}},$$
(13.2)

where the subscript “0” represents present values with a0 = 1. Using the relation \({\rho _m} = 3{F_0}H_0^2\Omega _m^{(0)}/{a^3}\) with Eqs. (13.1) and (13.2), it follows that

$${\Phi _{{\rm{wl}}}} = - {{{a^2}} \over {{k^2}}}{{{\rho _m}} \over F}{\delta _m}\,,\qquad {\delta _{{\rm{eff}}}} = {{{F_0}} \over F}{\delta _m}\,.$$
(13.3)

Writing the angular position of a source and the direction of weak lensing observation to be \({{\vec \theta}_S}\) and \({{\vec \theta}_I}\), respectively, the deformation of the shape of galaxies can be quantified by the amplification matrix \({\mathcal A} = {\rm{d}}{{\vec \theta}_S}/{\rm{d}}{{\vec \theta}_I}\). The components of the matrix \({\mathcal A}\) are given by [66]

$${{\mathcal A}_{\mu \nu}} = {I_{\mu \nu}} - \int\nolimits_0^\chi {{{\chi ^{\prime}(\chi - \chi ^{\prime})} \over \chi}} {\partial _{\mu \nu}}{\Phi _{{\rm{wl}}}}[\chi ^{\prime}\vec \theta, \chi ^{\prime}]{\rm{d}}\chi ^{\prime}\,,$$
(13.4)

where \(\chi = \int\nolimits_0^z {{\rm{d}}{z^{\prime}}/H({z^{\prime}})}\) is a comoving radial distance (z is a redshift). The convergence \(\kappa_{\rm wl}\) and the shear \(\vec \gamma = ({\gamma _1},{\gamma _2})\) can be derived from the components of the 2 × 2 matrix \({\mathcal A}\), as \({\kappa _{{\rm{wl}}}} = 1 - (1/2){\rm{Tr}}{\mathcal A}\) and \(\vec \gamma = ([{{\mathcal A}_{22}} - {{\mathcal A}_{11}}]/2,{{\mathcal A}_{12}})\). For a redshift distribution p(χ)dχ of the source, the convergence can be expressed as \({\kappa _{{\rm{wl}}}}(\vec \theta) = \int {p(\chi){\kappa _{{\rm{wl}}}}(\vec \theta, \chi){\rm{d}}\chi}\). Using Eqs. (13.2) and (13.4) it follows that

$${\kappa _{{\rm{wl}}}}(\vec \theta) = {3 \over 2}H_0^2\Omega _m^{(0)}\int\nolimits_0^{{\chi _H}} g (\chi)\chi {{{\delta _{{\rm{eff}}}}[\chi \,\vec \theta, \chi ]} \over a}{\rm{d}}\chi \,,$$
(13.5)

where χH is the maximum distance to the source and \(g(\chi) \equiv \int\nolimits_\chi ^{{\chi _H}} p ({\chi ^{\prime}})({\chi ^{\prime}} - \chi)/{\chi ^{\prime}}{\rm{d}}{\chi ^{\prime}}\).

The convergence is a function on the 2-sphere and hence it can be expanded in the form \({\kappa _{{\rm{wl}}}}(\vec \theta) = \int {{{\hat \kappa}_{{\rm{wl}}}}} (\vec \ell){e^{i\vec \ell \cdot \vec \theta}}{{{{\rm{d}}^2}\vec \ell} \over {2\pi}}\), where \(\vec \ell = ({\ell _1},{\ell _2})\) with \({\ell _1}\) and \({\ell _2}\) integers. We define the power spectrum of the shear to be \(\langle {{\hat \kappa}_{{\rm{wl}}}}(\vec \ell)\hat \kappa _{{\rm{wl}}}^{\ast}({{\vec \ell}^{\prime}})\rangle = {P_\kappa}(\ell){\delta ^{(2)}}(\vec \ell - {{\vec \ell}^{\prime}})\). Then the convergence has the same power spectrum as Pκ, which is given by [66, 601]

$${P_\kappa}(\ell) = {{9H_0^4{{(\Omega _m^{(0)})}^2}} \over 4}\int\nolimits_0^{{\chi _H}} {{{\left[ {{{g(\chi)} \over {a(\chi)}}} \right]}^2}} {P_{{\delta _{{\rm{eff}}}}}}\left[ {{\ell \over \chi},\chi} \right]{\rm{d}}\chi \,.$$
(13.6)

We assume that the sources are located at the distance χs (corresponding to the redshift zs), giving the relations p(χ) = δ(χ − χs) and g(χ) = (χs − χ)/χs. From Eq. (13.3) \({P_{{\delta _{{\rm{eff}}}}}}\) can be expressed as \({P_{{\delta _{{\rm{eff}}}}}} = {({F_0}/F)^2}{P_{{\delta _m}}}\), where \({P_{{\delta _m}}}\) is the matter power spectrum. Hence the convergence spectrum (13.6) reads

$${P_\kappa}(\ell) = {{9H_0^4{{(\Omega _m^{(0)})}^2}} \over 4}\int\nolimits_0^{{\chi _s}} {{{\left({{{{\chi _s} - \chi} \over {{\chi _s}a}}{{{F_0}} \over F}} \right)}^2}} {P_{{\delta _m}}}\left[ {{\ell \over \chi},\chi} \right]{\rm{d}}\chi.$$
(13.7)

We recall that, during the matter era, the transition from the GR regime (\({\delta _m} \propto {t^{2/3}}\) and Φwl = constant) to the scalar-tensor regime (\({\delta _m} \propto {t^{(\sqrt {25 + 48{Q^2}} - 1)/6}}\) and \({\Phi _{{\rm{wl}}}} \propto {t^{(\sqrt {25 + 48{Q^2}} - 5)/6}}\)) occurs at the redshift zk characterized by the condition (10.45). Since the early evolution of perturbations is similar to that in the ΛCDM model, the weak lensing potential at late times is given by the formula [214]

$${\Phi _{{\rm{wl}}}}(k,a) = {9 \over {10}}{\Phi _{{\rm{wl}}}}(k,{a_i})T(k){{D(k,a)} \over a}\,,$$
(13.8)

where Φwl(k, ai) ≃ 2Φ(k, ai) is the initial potential generated during inflation, T(k) is a transfer function describing the epochs of horizon crossing and radiation/matter transition (\(50 \lesssim z \lesssim {10^6}\)), and D(k, a) is the growth function at late times defined by D(k, a)/a = Φwl(a)/Φwl(aI) (aI corresponds to the scale factor at a redshift 1 ≪ zI < 50). Our interest is the case in which the transition redshift zk is smaller than 50, so that we can use the standard transfer function of Bardeen-Bond-Kaiser-Szalay [58]:

$$T(x) = {{\ln (1 + 0.171x)} \over {0.171x}}{\left[ {1.0 + 0.284x + {{(1.18x)}^2} + {{(0.399x)}^3} + {{(0.490x)}^4}} \right]^{- 0.25}},$$
(13.9)

where x = k/kEQ and \({k_{{\rm{EQ}}}} = 0.073\,\Omega _m^{(0)}{h^2}\,{\rm{Mp}}{{\rm{c}}^{- 1}}\). In the ΛCDM model the growth function D(k, a) during the matter dominance is scale-independent (D(a) = a), but in BD theory with the action (10.10) the growth of perturbations is generally scale-dependent.
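
The fitting formula (13.9) is straightforward to evaluate numerically. The following is a minimal sketch; the function names `bbks_transfer` and `k_eq` are illustrative choices, and the value of h passed to `k_eq` is an input the text does not specify.

```python
import math


def bbks_transfer(x: float) -> float:
    """BBKS transfer function T(x) of Eq. (13.9), with x = k / k_EQ."""
    if x == 0.0:
        # Limit x -> 0: ln(1 + 0.171 x) / (0.171 x) -> 1 and the bracket -> 1.
        return 1.0
    q = 0.171 * x
    log_term = math.log1p(q) / q
    bracket = (1.0 + 0.284 * x + (1.18 * x) ** 2
               + (0.399 * x) ** 3 + (0.490 * x) ** 4)
    return log_term * bracket ** (-0.25)


def k_eq(omega_m0: float, h: float) -> float:
    """k_EQ = 0.073 * Omega_m^(0) * h^2, in units of Mpc^-1."""
    return 0.073 * omega_m0 * h ** 2
```

On large scales (x ≪ 1) the function tends to unity, and it decreases monotonically toward small scales, as expected for modes that entered the horizon before matter-radiation equality.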

From Eqs. (13.2) and (13.8) we obtain the matter perturbation δm for z < zI:

$${\delta _m}(k,a) = - {3 \over {10}}{F \over {{F_0}}}{{{k^2}} \over {\Omega _m^{(0)}H_0^2}}{\Phi _{{\rm{wl}}}}(k,{a_i})T(k)D(k,a)\,.$$
(13.10)

The initial power spectrum generated during inflation is \({P_{{\Phi _{{\rm{wl}}}}}} \equiv 4\vert \Phi {\vert ^2} = (200{\pi ^2}/9{k^3}){(k/{H_0})^{{n_\Phi} - 1}}\delta _H^2\), where nΦ is the spectral index and \(\delta _H^2\) is the amplitude of Φwl [71, 214]. Therefore we obtain the power spectrum of matter perturbations, as

$${P_{{\delta _m}}}(k,a) \equiv \vert {\delta _m}{\vert ^2} = 2{\pi ^2}{\left({{F \over {{F_0}}}} \right)^2}{{{k^{{n_\Phi}}}} \over {{{(\Omega _m^{(0)})}^2}H_0^{{n_\Phi} + 3}}}\delta _H^2{T^2}(k){D^2}(k,a).$$
(13.11)
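
As a consistency check on the prefactor (a sketch of the intermediate algebra, which the text does not spell out), squaring Eq. (13.10) and identifying \(\vert {\Phi _{{\rm{wl}}}}(k,{a_i}){\vert ^2}\) with the initial spectrum \({P_{{\Phi _{{\rm{wl}}}}}}\) gives

$$\vert {\delta _m}{\vert ^2} = {9 \over {100}}{\left({{F \over {{F_0}}}} \right)^2}{{{k^4}} \over {{{(\Omega _m^{(0)})}^2}H_0^4}} \cdot {{200{\pi ^2}} \over {9{k^3}}}{\left({{k \over {{H_0}}}} \right)^{{n_\Phi} - 1}}\delta _H^2{T^2}(k){D^2}(k,a)\,.$$

The numerical factors combine as (9/100)(200π²/9) = 2π², the powers of k as \({k^{4 - 3 + {n_\Phi} - 1}} = {k^{{n_\Phi}}}\), and the powers of H0 as \(H_0^{- 4 - ({n_\Phi} - 1)} = H_0^{- ({n_\Phi} + 3)}\), which reproduces Eq. (13.11).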

Plugging Eq. (13.11) into Eq. (13.7), we find that the convergence spectrum is given by

$${P_\kappa}(\ell) = {{9{\pi ^2}} \over 2}\int\nolimits_0^{{z_s}} {{{\left({1 - {X \over {{X_s}}}} \right)}^2}} {1 \over {E(z)}}\delta _H^2{\left({{\ell \over X}} \right)^{{n_\Phi}}}{T^2}(x){\left({{{{\Phi _{{\rm{wl}}}}(z)} \over {{\Phi _{{\rm{wl}}}}({z_I})}}} \right)^2}{\rm{d}}z\,,$$
(13.12)

where

$$E(z) = {{H(z)} \over {{H_0}}}\,,\qquad X = {H_0}\chi \,,\qquad x = {{{H_0}} \over {{k_{{\rm{EQ}}}}}}{\ell \over X}\,.$$
(13.13)

Note that X satisfies the differential equation dX/dz = 1/E(z).
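
A minimal numerical sketch of this relation is given below. It assumes a flat ΛCDM background for E(z) purely for illustration (in the modified gravity models considered here E(z) would instead follow from the background equations), and the function names are hypothetical.

```python
import math


def E_lcdm(z: float, omega_m0: float = 0.28) -> float:
    """E(z) = H(z)/H_0 for a flat LambdaCDM background (illustrative assumption)."""
    return math.sqrt(omega_m0 * (1.0 + z) ** 3 + 1.0 - omega_m0)


def dimensionless_distance(z_max: float, E=E_lcdm, steps: int = 2000) -> float:
    """X(z_max) = H_0 * chi, from dX/dz = 1/E(z) with X(0) = 0.

    Uses the composite Simpson rule on a uniform grid in z.
    """
    if steps % 2:
        steps += 1  # Simpson's rule needs an even number of intervals
    h = z_max / steps
    total = 1.0 / E(0.0) + 1.0 / E(z_max)
    for i in range(1, steps):
        weight = 4.0 if i % 2 else 2.0
        total += weight / E(i * h)
    return total * h / 3.0
```

For an Einstein-de Sitter background, E(z) = (1 + z)^{3/2}, the integral has the closed form X = 2[1 − (1 + z)^{−1/2}], which provides a simple check of the integrator.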

In Figure 14 we plot the convergence spectrum in f(R) gravity with the potential (10.23) for two different values of p together with the ΛCDM spectrum. Recall that this model corresponds to the f(R) model \(f(R) = R - \mu {R_c}[1 - {(R/{R_c})^{- 2n}}]\) with the correspondence p = 2n/(2n + 1). Figure 14 shows the convergence spectrum in the linear regime characterized by ℓ ≲ 200. The ΛCDM model corresponds to the limit n → ∞, i.e., p → 1. The deviation from the ΛCDM model becomes more significant for smaller p away from 1. Since the evolution of Φwl changes from Φwl = constant to \({\Phi _{{\rm{wl}}}} \propto {t^{(\sqrt {25 + 48{Q^2}} - 5)/6}}\) at the transition time \({t_\ell}\) characterized by the condition \({M^2}/F = {(\ell /\chi)^2}/{a^2}\), this leads to a difference of the spectral index of the convergence spectrum compared to that of the ΛCDM model [595]:

$${{{P_\kappa}(\ell)} \over {P_\kappa ^{\Lambda {\rm{CDM}}}(\ell)}} \propto {\ell ^{\Delta {n_s}}}\,,\quad {\rm{where}}\quad \Delta {n_s} = {{(1 - p)(\sqrt {25 + 48{Q^2}} - 5)} \over {4 - p}}\,.$$
(13.14)

This estimation is reliable for the transition redshift \({z_\ell}\) much larger than 1. In the simulation of Figure 14 the numerical value of Δns for p = 0.7 at ℓ = 200 is 0.056 (with \({z_\ell} = 3.26\)), which is slightly smaller than the analytic value Δns = 0.068 estimated by Eq. (13.14). The deviation of the spectral index of Pκ from the ΛCDM model will be useful to probe modified gravity in future high-precision observations. Note that the galaxy-shear correlation spectrum will be also useful to constrain modified gravity models [528].
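
The analytic estimate quoted above follows directly from Eq. (13.14). A short sketch (the function name `delta_ns` is an illustrative choice):

```python
import math


def delta_ns(p: float, Q: float) -> float:
    """Shift of the convergence spectral index, Eq. (13.14)."""
    root = math.sqrt(25.0 + 48.0 * Q ** 2)
    return (1.0 - p) * (root - 5.0) / (4.0 - p)


Q_FR = -1.0 / math.sqrt(6.0)  # coupling for metric f(R) gravity

print(round(delta_ns(0.7, Q_FR), 3))  # 0.068
```

The shift vanishes in the ΛCDM limit p → 1 and grows as p decreases, in line with the trend seen between cases (a) and (b) of Figure 14.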

Figure 14: The convergence power spectrum \({P_\kappa}(\ell)\) in f(R) gravity \((Q = - 1/\sqrt 6)\) for the model (5.19). This model corresponds to the field potential (10.23). Each case corresponds to (a) p = 0.5, C = 0.9, (b) p = 0.7, C = 0.9, and (c) the ΛCDM model. The model parameters are chosen to be \(\Omega _m^{(0)} = 0.28\), nΦ = 1, and \(\delta _H^2 = 3.2 \times {10^{- 10}}\). From [595].

Recent data analysis of the weak lensing shear field from the Hubble Space Telescope’s COSMOS survey along with the ISW effect of CMB and the cross-correlation between the ISW and galaxy distributions from 2MASS and SDSS surveys shows that the anisotropic parameter η = Ψ/Φ is constrained to be η < 1 at the 98% confidence level [73]. For BD theory with the action (10.10) the quasi-static results (10.38) and (10.39) of the gravitational potentials give

$$\eta \simeq {{({k^2}/{a^2})(1 - 2{Q^2})F + {M^2}} \over {({k^2}/{a^2})(1 + 2{Q^2})F + {M^2}}}\,.$$
(13.15)

Since \(\eta \simeq (1 - 2{Q^2})/(1 + 2{Q^2})\) in the scalar-tensor regime (\({k^2}/{a^2} \gg {M^2}/F\)), one can realize η < 1 in BD theory. Of course we need to wait for further observational data to reach the conclusion that modified gravity is favored over the ΛCDM model.
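
The two limits of Eq. (13.15) can be illustrated with a few lines of code. This is only a sketch of the quasi-static expression; the function name and argument names are illustrative.

```python
def eta_quasi_static(k2_over_a2: float, Q: float, F: float, M2: float) -> float:
    """Anisotropic parameter eta = Psi/Phi in the quasi-static limit, Eq. (13.15).

    k2_over_a2 : physical wavenumber squared, k^2/a^2
    Q          : coupling of the BD theory (Q^2 = 1/6 for metric f(R) gravity)
    F          : value of F
    M2         : squared mass M^2 of the scalar degree of freedom
    """
    numerator = k2_over_a2 * (1.0 - 2.0 * Q ** 2) * F + M2
    denominator = k2_over_a2 * (1.0 + 2.0 * Q ** 2) * F + M2
    return numerator / denominator
```

For Q² = 1/6 the function tends to 1 when M² dominates (the GR regime) and to (1 − 1/3)/(1 + 1/3) = 1/2 when k²/a² ≫ M²/F (the scalar-tensor regime).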

To conclude this section we would like to point out the possibility of using the method of gravitational lensing tomography [574]. This method consists of considering lensing on different redshift data-bins. In order to use this method, one needs to know the evolution of both the linear growth rate and the non-linear one (typically found through a standard linear-to-non-linear mapping). Afterward, from observational data, one can separate different bins in order to make fits to the models. Having good data sets, this procedure is strong enough to further constrain the models, especially together with other probes such as CMB [322, 320, 632, 292].

13.2 Thermodynamics and horizon entropy

It is known that in Einstein gravity the gravitational entropy S of stationary black holes is proportional to the horizon area A, such that S = A/(4G), where G is the gravitational constant [75]. A black hole with mass M obeys the first law of thermodynamics, TdS = dM [59], where T = κs/(2π) is the Hawking temperature determined by the surface gravity κs [293]. This shows a deep physical connection between gravity and thermodynamics. In fact, Jacobson [324] showed that the Einstein equations can be derived by using the Clausius relation TdS = dQ on local horizons in the Rindler spacetime together with the relation S ∝ A, where dQ and T are the energy flux across the horizon and the Unruh temperature seen by an accelerating observer just inside the horizon, respectively.

Unlike stationary black holes, the expanding universe has a dynamically changing apparent horizon with the radius \({{\bar r}_A} = {({H^2} + K/{a^2})^{- 1/2}}\), where K is the cosmic curvature [108] (see also [296]). Even in the FLRW spacetime, however, the Friedmann equation can be written in the thermodynamical form TdS = −dE + WdV, where W is the work density present in the dynamical background [8]. For matter contents of the universe with energy density ρ and pressure P, the work density is given by W = (ρ − P)/2 [297, 298]. This method is identical to the one established by Jacobson [324], that is, dQ = −dE + WdV.

In metric f(R) gravity Eling et al. [228] showed that a non-equilibrium treatment is required such that the Clausius relation is modified to \({\rm{d}}S = {\rm{d}}Q/T + {{\rm{d}}_i}S\), where S = FA/(4G) is the Wald horizon entropy [610] and \({{\rm{d}}_i}S\) is the bulk viscosity entropy production term. Note that S corresponds to a Noether charge entropy. Motivated by this work, the connections between thermodynamics and modified gravity have been extensively discussed, including metric f(R) gravity [6, 7, 281, 431, 619, 620, 230, 103, 51, 50, 157] and scalar-tensor theory [281, 619, 620, 108].

Let us discuss the relation between thermodynamics and modified gravity for the following general action [53]

$$I = \int {{\rm{d}}^4} x\sqrt {- g} \left[ {{{f(R,\phi, X)} \over {16\pi G}} + {{\mathcal L}_M}} \right]\,,$$
(13.16)

where \(X \equiv - (1/2){g^{\mu \nu}}{\nabla _\mu}\phi {\nabla _\nu}\phi\) is a kinetic term of a scalar field ϕ. For the matter Lagrangian \({{\mathcal L}_M}\) we take into account perfect fluids (radiation and non-relativistic matter) with energy density ρM and pressure PM. In the FLRW background with the metric \({\rm{d}}{s^2} = {h_{\alpha \beta}}{\rm{d}}{x^\alpha}{\rm{d}}{x^\beta} + {{\bar r}^2}{\rm{d}}{\Omega ^2}\), where \(\bar r = a(t)r\) and \({x^0} = t\), \({x^1} = r\) with the two dimensional metric \({h_{\alpha \beta}} = {\rm{diag}}( - 1,{a^2}(t)/[1 - K{r^2}])\), the Friedmann equations are given by

$${H^2} + {K \over {{a^2}}} = {{8\pi G} \over {3F}}\left({{\rho _d} + {\rho _M}} \right)\,,$$
(13.17)
$$\dot H - {K \over {{a^2}}} = - {{4\pi G} \over F}\left({{\rho _d} + {P_d} + {\rho _M} + {P_M}} \right)\,,$$
(13.18)
$${\dot \rho _M} + 3H({\rho _M} + {P_M}) = 0\,,$$
(13.19)

where F ≡ ∂f/∂R and

$${\rho _d} \equiv {1 \over {8\pi G}}\left[ {{f_{,X}}X + {1 \over 2}(FR - f) - 3H\dot F} \right]\,,$$
(13.20)
$${P_d} \equiv {1 \over {8\pi G}}\left[ {\ddot F + 2H\dot F - {1 \over 2}(FR - f)} \right]\,.$$
(13.21)

Note that ρd and Pd originate from the energy-momentum tensor \(T_{\mu \nu}^{(d)}\) defined by

$$T_{\mu \nu}^{(d)} \equiv {1 \over {8\pi G}}\left[ {{1 \over 2}{g_{\mu \nu}}(f - RF) + {\nabla _\mu}{\nabla _\nu}F - {g_{\mu \nu}} \square F + {1 \over 2}{f_{,X}}{\nabla _\mu}\phi {\nabla _\nu}\phi} \right]\,,$$
(13.22)

where the Einstein equation is given by

$${G_{\mu \nu}} = {{8\pi G} \over F}\left({T_{\mu \nu}^{(d)} + T_{\mu \nu}^{(M)}} \right)\,.$$
(13.23)

Defining the density ρd and the pressure Pd of “dark” components in this way, they obey the following equation

$${\dot \rho _d} + 3H({\rho _d} + {P_d}) = {3 \over {8\pi G}}({H^2} + K/{a^2})\dot F\,.$$
(13.24)

For the theories with \(\dot F \ne 0\) (including f(R) gravity and scalar-tensor theory) the standard continuity equation does not hold because of the presence of the last term in Eq. (13.24).
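
Equation (13.24) can be checked directly from Eqs. (13.17)–(13.19). The following is a sketch of the intermediate steps, written with the shorthand \({\rho _T} = {\rho _d} + {\rho _M}\) and \({P_T} = {P_d} + {P_M}\). Differentiating Eq. (13.17) with respect to t, and using \({\rm{d}}({H^2} + K/{a^2})/{\rm{d}}t = 2H(\dot H - K/{a^2})\) together with Eq. (13.18), one finds

$${\dot \rho _T} = {{3\dot F} \over {8\pi G}}\left({{H^2} + {K \over {{a^2}}}} \right) + {{3F} \over {8\pi G}}\,2H\left({\dot H - {K \over {{a^2}}}} \right) = {{3\dot F} \over {8\pi G}}\left({{H^2} + {K \over {{a^2}}}} \right) - 3H({\rho _T} + {P_T})\,.$$

Subtracting the matter continuity equation (13.19) then leaves Eq. (13.24).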

In the following we discuss the thermodynamical property of the theories given above. The apparent horizon is determined by the condition \({h^{\alpha \beta}}{\partial _\alpha}\bar r{\partial _\beta}\bar r = 0\), which gives \({{\bar r}_A} = {({H^2} + K/{a^2})^{- 1/2}}\) in the FLRW spacetime. Taking the differentiation of this relation with respect to t and using Eq. (13.18), we obtain

$${{F{\rm{d}}{{\bar r}_A}} \over {4\pi G}} = \bar r_A^3H\left({{\rho _d} + {P_d} + {\rho _M} + {P_M}} \right){\rm{d}}t\,.$$
(13.25)
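
The step leading to Eq. (13.25) can be sketched as follows. Writing \(\bar r_A^{- 2} = {H^2} + K/{a^2}\) and differentiating with respect to t gives

$$- {{2{{\dot {\bar r}}_A}} \over {\bar r_A^3}} = 2H\left({\dot H - {K \over {{a^2}}}} \right) = - {{8\pi G} \over F}H\left({{\rho _d} + {P_d} + {\rho _M} + {P_M}} \right)\,,$$

where the second equality uses Eq. (13.18). Multiplying both sides by \(- F\bar r_A^3\,{\rm{d}}t/(8\pi G)\) yields Eq. (13.25).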

In Einstein gravity the horizon entropy is given by the Bekenstein-Hawking entropy S = A/(4G), where \(A = 4\pi \bar r_A^2\) is the area of the apparent horizon [59, 75, 293]. In modified gravity theories one can introduce the Wald entropy associated with the Noether charge [610]:

$$S = {{AF} \over {4G}}\,.$$
(13.26)

Then, from Eqs. (13.25) and (13.26), it follows that

$${1 \over {2\pi {{\bar r}_A}}}{\rm{d}}S = 4\pi \bar r_A^3H\left({{\rho _d} + {P_d} + {\rho _M} + {P_M}} \right){\rm{d}}t + {{{{\bar r}_A}} \over {2G}}{\rm{d}}F\,.$$
(13.27)

The apparent horizon has the Hawking temperature T = ∣κs∣/(2π), where κs is the surface gravity given by

$${\kappa _s} = - {1 \over {{{\bar r}_A}}}\left({1 - {{{{\dot \bar r}_A}} \over {2H{{\bar r}_A}}}} \right) = - {{{{\bar r}_A}} \over 2}\left({\dot H + 2{H^2} + {K \over {{a^2}}}} \right) = - {{2\pi G} \over {3F}}{\bar r_A}\left({{\rho _T} - 3{P_T}} \right)\,.$$
(13.28)

Here we have defined \({\rho _T} \equiv {\rho _d} + {\rho _M}\) and \({P_T} \equiv {P_d} + {P_M}\). For the total equation of state wT = PT/ρT no larger than 1/3, as is the case for standard cosmology, one has κs ≤ 0 so that the horizon temperature is given by

$$T = {1 \over {2\pi {{\bar r}_A}}}\left({1 - {{{{\dot \bar r}_A}} \over {2H{{\bar r}_A}}}} \right)\,.$$
(13.29)

Multiplying Eq. (13.27) by the term \(1 - {{\dot \bar r}_A}/(2H{{\bar r}_A})\), we obtain

$$T{\rm{d}}S = 4\pi \bar r_A^3H({\rho _d} + {P_d} + {\rho _M} + {P_M}){\rm{d}}t - 2\pi \bar r_A^2({\rho _d} + {P_d} + {\rho _M} + {P_M}){\rm{d}}{\bar r_A} + {T \over G}\pi \bar r_A^2{\rm{d}}F.$$
(13.30)

In Einstein gravity the Misner-Sharp energy [428] is defined by \(E = {{\bar r}_A}/(2G)\). In f(R) gravity and scalar-tensor theory this can be extended to \(E = {{\bar r}_A}F/(2G)\) [281]. Using this expression for f(R, ϕ, X) theory, we have

$$E = {{{{\bar r}_A}F} \over {2G}} = V{{3F({H^2} + K/{a^2})} \over {8\pi G}} = V({\rho _d} + {\rho _M})\,,$$
(13.31)

where \(V = 4\pi \bar r_A^3/3\) is the volume inside the apparent horizon. Using Eqs. (13.19) and (13.24), we find

$${\rm{d}}E = - 4\pi \bar r_A^3H({\rho _d} + {P_d} + {\rho _M} + {P_M}){\rm{d}}t + 4\pi \bar r_A^2({\rho _d} + {\rho _M}){\rm{d}}{\bar r_A} + {{{{\bar r}_A}} \over {2G}}{\rm{d}}F\,.$$
(13.32)

From Eqs. (13.30) and (13.32) it follows that

$$T{\rm{d}}S = - {\rm{d}}E + 2\pi \bar r_A^2({\rho _d} + {\rho _M} - {P_d} - {P_M}){\rm{d}}{\bar r_A} + {{{{\bar r}_A}} \over {2G}}\left({1 + 2\pi {{\bar r}_A}T} \right){\rm{d}}F\,.$$
(13.33)

Following [297, 298, 108] we introduce the work density \(W = ({\rho _d} + {\rho _M} - {P_d} - {P_M})/2\). Then Eq. (13.33) reduces to

$$T{\rm{d}}S = - {\rm{d}}E + W{\rm{d}}V + {{{{\bar r}_A}} \over {2G}}\left({1 + 2\pi {{\bar r}_A}T} \right){\rm{d}}F\,,$$
(13.34)

which can be written in the form [53]

$$T{\rm{d}}S + T{{\rm{d}}_i}S = - {\rm{d}}E + W{\rm{d}}V,$$
(13.35)

where

$${{\rm{d}}_i}S = - {1 \over T}{{{{\bar r}_A}} \over {2G}}\left({1 + 2\pi {{\bar r}_A}T} \right){\rm{d}}F = - \left({{E \over T} + S} \right){{{\rm{d}}F} \over F}\,.$$
(13.36)

The modified first law of thermodynamics (13.35) suggests a deep connection between the horizon thermodynamics and Friedmann equations in modified gravity. The term \({{\rm{d}}_i}S\) can be interpreted as a term of entropy production in the non-equilibrium thermodynamics [228]. The theories with F = constant lead to \({{\rm{d}}_i}S = 0\), which means that the first law of equilibrium thermodynamics holds. The theories with dF ≠ 0, including f(R) gravity and scalar-tensor theory, give rise to the additional non-equilibrium term (13.36) [6, 7, 281, 619, 620, 108, 50, 53].
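
The chain of relations (13.25)–(13.36) can be verified numerically by treating the differentials as small independent increments. The sketch below is only a consistency check under the stated equations; the function name, argument names, and the sample numbers one might feed it (in arbitrary units) are illustrative and not taken from the literature.

```python
import math


def first_law_terms(G, F, r_A, H, P_T, dF, dt):
    """Return (T dS + T d_iS, -dE + W dV, d_iS) for given background values.

    rho_T is fixed by the Friedmann equation (13.17) written as
    r_A^-2 = 8 pi G rho_T / (3 F); dr_A follows from Eq. (13.25).
    """
    rho_T = 3.0 * F / (8.0 * math.pi * G * r_A ** 2)                 # Eq. (13.17)
    dr_A = 4.0 * math.pi * G * r_A ** 3 * H * (rho_T + P_T) * dt / F  # Eq. (13.25)
    T = (1.0 - (dr_A / dt) / (2.0 * H * r_A)) / (2.0 * math.pi * r_A)  # Eq. (13.29)
    S = math.pi * r_A ** 2 * F / G                                   # Eq. (13.26)
    E = r_A * F / (2.0 * G)                                          # Eq. (13.31)
    # Differentials of S and E with respect to the increments dr_A and dF
    dS = (2.0 * math.pi * r_A * F * dr_A + math.pi * r_A ** 2 * dF) / G
    dE = (F * dr_A + r_A * dF) / (2.0 * G)
    W = 0.5 * (rho_T - P_T)                  # work density
    dV = 4.0 * math.pi * r_A ** 2 * dr_A     # change of the volume inside the horizon
    d_iS = -(E / T + S) * dF / F             # Eq. (13.36)
    return T * dS + T * d_iS, -dE + W * dV, d_iS
```

The first two returned values agree for any input, which is the content of Eq. (13.35), and the third vanishes when dF = 0, recovering the equilibrium first law.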

The main reason why the non-equilibrium term \({{\rm{d}}_i}S\) appears is that the energy density ρd and the pressure Pd defined in Eqs. (13.20) and (13.21) do not satisfy the standard continuity equation for \(\dot F \ne 0\). On the other hand, if we define the effective energy-momentum tensor \(T_{\mu \nu}^{(D)}\) as Eq. (2.9) in Section 2, it satisfies the continuity equation (2.10). This corresponds to rewriting the Einstein equation in the form (2.8) instead of (13.23). Using this property, [53] showed that an equilibrium description of thermodynamics is possible by introducing the Bekenstein-Hawking entropy Ŝ = A/(4G). In this case the horizon entropy Ŝ takes into account the contribution of both the Wald entropy S in the non-equilibrium thermodynamics and the entropy production term.

13.3 Curing the curvature singularity in f(R) dark energy models, unified models of inflation and dark energy

In Sections 5.2 and 8.1 we showed that there is a curvature singularity for viable f(R) models such as (4.83) and (4.84). More precisely this singularity appears for the models having the asymptotic behavior (5.19) in the region of high density (R ≫ Rc). As we see in Figure 3, the field potential \(V(\phi) = (FR - f)/(2{\kappa ^2}F)\) has a finite value \(\mu {R_c}/(2{\kappa ^2})\) in the limit \(\phi = \sqrt {3/(16\pi)} {m_{{\rm{pl}}}}\) ln F → 0. In this limit one has f,RR → 0, so that the squared scalaron mass, approximately 1/(3f,RR), goes to infinity.

This problem of the past singularity can be cured by adding the term R2/(6M2) to the Lagrangian in f(R) dark energy models [37]. Let us then consider the modified version of the model (4.83):

$$f(R) = R - \mu {R_c}{{{{(R/{R_c})}^{2n}}} \over {{{(R/{R_c})}^{2n}} + 1}} + {{{R^2}} \over {6{M^2}}}\,.$$
(13.37)

For this model one can easily show that the potential \(V(\phi) = (FR - f)/(2{\kappa ^2}F)\) extends to the region ϕ > 0 and that the curvature singularity disappears accordingly. Also the scalaron mass approaches the finite value M in the limit ϕ → ∞. The perturbation δR is bounded from above, which can evade the problem of the dominance of the oscillation mode in the past.
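
The effect of the R²/(6M²) term on f,RR can be illustrated numerically. The sketch below evaluates the second derivative of the model (13.37) in closed form and uses the high-curvature approximation in which the squared scalaron mass is about 1/(3f,RR); the function names and any parameter values supplied to them are illustrative assumptions.

```python
def f_model(R, mu, R_c, n, M2=None):
    """f(R) of Eq. (13.37); M2=None drops the R^2/(6 M^2) term."""
    x2n = (R / R_c) ** (2 * n)
    value = R - mu * R_c * x2n / (x2n + 1.0)
    if M2 is not None:
        value += R ** 2 / (6.0 * M2)
    return value


def f_RR(R, mu, R_c, n, M2=None):
    """Second derivative of f_model with respect to R, in closed form."""
    x = R / R_c
    x2n = x ** (2 * n)
    # second derivative of g(x) = x^(2n) / (x^(2n) + 1)
    g2 = (2 * n * x ** (2 * n - 2)
          * ((2 * n - 1) - (2 * n + 1) * x2n) / (1.0 + x2n) ** 3)
    value = -(mu / R_c) * g2
    if M2 is not None:
        value += 1.0 / (3.0 * M2)
    return value
```

Without the extra term f,RR decays as (R/Rc)^(−2n−2) for R ≫ Rc, so 1/(3f,RR) grows without bound, whereas with the term f,RR tends to 1/(3M²) and the squared mass saturates at M².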

Since the presence of the term R2/(6M2) can drive inflation in the early universe, one may anticipate that both inflation and the late-time acceleration can be realized for the model of the type (13.37). This is like a modified gravity version of quintessential inflation based on a single scalar field [486, 183, 187, 392]. However, we have to caution that the transition between two accelerated epochs needs to occur smoothly for successful cosmology. In other words, after inflation, we require a mechanism in which the universe is reheated and then the radiation/matter dominated epochs follow. However, for the model (13.37), the Ricci scalar R evolves to the point f,RR = 0 and it enters the region f,RR < 0. Crossing the point f,RR = 0 implies the divergence of the scalaron mass. Moreover, in the region f,RR < 0, the Minkowski space is not a stable vacuum state. This is problematic for the particle creation from the vacuum during reheating. A similar problem arises for the models (4.84) and (4.89) in addition to the model proposed by Appleby and Battye [35]. Thus unified f(R) models of inflation and dark energy cannot be constructed easily in general (unlike a number of related works [456, 460, 462]). Brookfield et al. [104] studied the viability of the model \(f(R) = R - \alpha /{R^n} + \beta {R^m}\) (n, m > 0) by using the constraints coming from Big Bang Nucleosynthesis and fifth-force experiments and showed that it is difficult to find a unique parameter range for consistency of this model.

In order to cure the above mentioned problem, Appleby et al. [37] proposed the f(R) model (11.40). Note that the case c = 0 corresponds to the Starobinsky inflationary model f(R) = R + R2/(6M2) [564] and the case c = 1/2 corresponds to the model of Appleby and Battye [35] plus the R2/(6M2) term. Although the above mentioned problem can be evaded in this model, the reheating proceeds in a different way compared to that in the model f(R) = R + R2/(6M2) [which we discussed in Section 3.3]. Since the Hubble parameter periodically evolves between H = 1/(2t) and H = ϵ/M, the reheating mechanism does not occur very efficiently [37]. The reheating temperature can be significantly lower than that in the model f(R) = R + R2/(6M2). It will be of interest to study observational signatures in such unified models of inflation and dark energy.

13.4 f(R) theories in extra dimensions

Although f(R) theories have been introduced mainly in four dimensions, the same models may appear in the context of braneworld [502, 501] in which our universe is described by a brane embedded in extra dimensions (see [404] for a review). This scenario implies a careful use of f(R) theories, because a boundary (brane) appears. Before looking at the real working scenario in braneworld, it is necessary to focus on the mathematical description of f(R) models through a sensible definition of boundary conditions for the metric elements on the surface of the brane.

Some works appeared regarding the possibility of introducing f(R) theories in the context of braneworld scenarios [499, 40, 96, 513, 97]. In doing so one requires a surface term [222, 482, 69, 48, 49, 286], which is known as the Hawking-Luttrell term [295] (analogous to the York-Gibbons-Hawking one for General Relativity). The action we consider is given by

$$S = \int\nolimits_\Omega {{{\rm{d}}^n}} x\sqrt {- g} f(R) + 2\int\nolimits_{\partial \Omega} {{{\rm{d}}^{n - 1}}} x\sqrt {\vert \gamma \vert} \,FK,$$
(13.38)

where F ≡ ∂f/∂R, γ is the determinant of the induced metric on the n − 1 dimensional boundary, and K is the trace of the extrinsic curvature tensor.

In this case particular attention should be paid to boundary conditions on the brane, that is, the Israel junction conditions [323]. In order to have a well-defined geometry in five dimensions, we require that the metric is continuous across the brane located at y = 0. However, its derivatives with respect to y can be discontinuous at y = 0. The Ricci tensor Rμν in Eq. (2.4) is made of the metric up to the second derivatives g″ with respect to y. This means that g″ has a delta-function dependence proportional to the energy-momentum tensor at a distributional source (i.e., with a Dirac’s delta function centered on the brane) [87, 86, 536]. In general this also leads to the discontinuity of the Ricci scalar R across the brane. Since the discontinuity of R can lead to inconsistencies in f(R) gravity, one should add this extra constraint as a junction condition. In other words, one needs to impose that, although the metric derivative is discontinuous, the Ricci scalar should still remain continuous on the brane.

This is tantamount to imposing that the extra scalar degree of freedom introduced is continuous on the brane. We use Gaussian normal coordinates with the metric

$${\rm{d}}{s^2} = {\rm{d}}{y^2} + {\gamma _{\mu \nu}}\,{\rm{d}}{x^\mu}{\rm{d}}{x^\nu}.$$
(13.39)

In terms of the extrinsic curvature tensor \({K_{\mu \nu}} = - {\partial _y}{\gamma _{\mu \nu}}/2\) for a brane, the l.h.s. of the equations of motion tensor [which is analogous to the l.h.s. of Eq. (2.4) in 4 dimensions] is defined by

$${\Sigma _{AB}} \equiv F{R_{AB}} - {1 \over 2}f{g_{AB}} - {\nabla _A}{\nabla _B}F(R) + {g_{AB}} \square F(R)\,.$$
(13.40)

This has a delta function behavior for the μ-ν components, leading to [207]

$${D_{\mu \nu}} \equiv [F\,({K_{\mu \nu}} - K\,{\gamma _{\mu \nu}}) + {\gamma _{\mu \nu}}\,{F_{,R}}\,{\partial _y}R]_ - ^ + = {T_{\mu \nu}},$$
(13.41)

where Tμν is the matter stress-energy tensor on the brane. Hence R is continuous, whereas its first derivative is not, in general. This imposes an extra condition on the metric crossing the brane at y = 0, compared to General Relativity in which the condition for the continuity of R is not present. However, it is not easy to find a solution for which the metric derivative is discontinuous but R is not. Therefore some authors considered matter on the brane which is not universally coupled with the induced metric. This approach leads to the relaxation of the condition that R is continuous. Such a matter action can be found by analyzing the action in the Einstein frame and introducing a scalar field ψ coupled to the scalaron ϕ on the brane as follows [207]

$${S_M} = \int {{{\rm{d}}^{n - 1}}} x\sqrt {- \gamma} \exp [(n - 1)\,C(\phi)]\left[ {- {1 \over 2}\exp [ - 2C(\phi)]{\gamma ^{\mu \nu}}{\nabla _\mu}\psi {\nabla _\nu}\psi - V(\psi)} \right]\,.$$
(13.42)

The presence of the coupling C(ϕ) with the field ϕ modifies the Israel junction conditions. Indeed, if C = 0, then R must be continuous, but if C ≠ 0, R can have a delta function profile. This method may help for finding a solution for the bulk that satisfies boundary conditions on the brane.

13.5 Vainshtein mechanism

Modifications of gravity in recent works have been introduced mostly in order to explain the late-time cosmic acceleration. This corresponds to the large-distance modification of gravity, but gravity at small distances is subject to change as well. Modified gravity models of dark energy must pass local gravity tests in the solar system. The f(R) models discussed in Section 4 are designed to satisfy local gravity constraints by having a large scalar-field mass, while at the same time they are responsible for dark energy with a small mass compatible with the Hubble parameter today.

It is interesting to see which modified gravity theories have successful Newton limits. There are two known mechanisms for satisfying local gravity constraints, (i) the Vainshtein mechanism [602], and (ii) the chameleon mechanism [344, 343] (already discussed in Section 5.2). Both consist of using non-linearities in order to prevent any other fifth force from propagating freely. The chameleon mechanism uses the non-linearities coming from matter couplings, whereas the Vainshtein mechanism uses the self-coupling of a scalar-field degree of freedom as a source for the non-linear effect.

There are several examples where the Vainshtein mechanism plays an important role. One is the massive gravity in which a consistent free massive graviton is uniquely defined by Fierz-Pauli theory [258, 259]. The massive gravity described by the Fierz-Pauli action cannot be studied through the linearization close to a point-like mass source, because of the crossing of the Vainshtein radius, that is, the distance under which the linearization fails to describe the metric properly [602]. Then the theory is in the strong-coupling regime, and things become obscure as the theory cannot be understood well mathematically. A similar behavior also appears for the Dvali-Gabadadze-Porrati (DGP) model (which we discuss in the next section), in which the Vainshtein mechanism plays a key role for the small-scale behavior of this model.

Besides a standard massive term, other possible operators which could give rise to the Vainshtein mechanism come from non-linear self-interactions in the kinetic term of a matter field ϕ. One such term is given by

$${\nabla _\mu}\phi \,{\nabla ^\mu}\phi \square \phi,$$
(13.43)

which respects the Galilean invariance under which ϕ’s gradient shifts by a constant [455] (treated in section 13.7.2). This allows a robust implementation of the Vainshtein mechanism in that the nonlinear self-interaction term can allow the decoupling of the field ϕ from matter in a gravitationally bound system without introducing ghosts.

Another example of the Vainshtein mechanism may be seen in \(f({\mathcal G})\) gravity. Recall that in this theory the contribution to the GB term from matter can be neglected relative to the vacuum value \({{\mathcal G}^{(0)}} = 12{(2GM)^2}/{r^6}\). In Section 12.3.3 we showed that on the Schwarzschild geometry the modification of gravity is very small for the models (12.16) and (12.17), because the GB term has a value much larger than its cosmological value today. The scalar-field degree of freedom acquires a large mass in the region of high density, so that it does not propagate freely. For the model (12.16) we already showed that at the linear level the coefficients A and B of the spherically symmetric metric (12.32) are

$$A = 1 - {1 \over \rho} + {A_1}(\rho)\,\varepsilon + O({\varepsilon ^2}),\qquad B = 1 - {1 \over \rho} + {B_1}(\rho)\,\varepsilon + O({\varepsilon ^2}),$$
(13.44)

where ρ ≡ r/(2GM), A1(ρ) and B1(ρ) are given by Eqs. (12.38) and (12.39), and \(\varepsilon \approx {10^{- 46}}\) for our solar system. Of course this result is trustworthy only in the region for which \({A_1}\varepsilon \ll 1/\rho\). Outside this region non-linearities are important and one cannot rely on approximate methods any longer. Therefore, for this model, we can define the Vainshtein radius rV as

$$\lambda \varepsilon \rho _V^3 \sim {1 \over {{\rho _V}}}\quad \rightarrow \quad {r_V} \sim 2GM{(\lambda \varepsilon)^{- 1/4}}.$$
(13.45)

For λ ∼ 1, this value is well outside the region in which solar-system experiments are carried out. This example shows that the Vainshtein radius is generally model-dependent.
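
To get a feeling for the scale implied by Eq. (13.45), the following sketch evaluates rV for the Sun. The solar Schwarzschild radius (about 2.95 km) and the astronomical unit are standard values supplied here as inputs rather than taken from the text, and the function name is an illustrative choice.

```python
def vainshtein_radius_fG(r_s: float, lam: float, eps: float) -> float:
    """r_V ~ 2GM (lambda * epsilon)^(-1/4), Eq. (13.45).

    r_s is the Schwarzschild radius 2GM (units c = 1); r_V is returned
    in the same length unit as r_s.
    """
    return r_s * (lam * eps) ** (-0.25)


R_S_SUN_KM = 2.95   # Schwarzschild radius of the Sun in km (input assumption)
AU_KM = 1.496e8     # one astronomical unit in km

r_V_km = vainshtein_radius_fG(R_S_SUN_KM, lam=1.0, eps=1e-46)
r_V_au = r_V_km / AU_KM  # of order several thousand AU
```

With λ = 1 and ε = 10⁻⁴⁶ the result is of order 10¹² km, i.e., thousands of astronomical units, far beyond the planetary region of the solar system.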

In metric f(R) gravity a non-linear effect coming from the coupling to matter fields (in the Einstein frame) is crucially important, because R vanishes in the vacuum Schwarzschild background. The local gravity constraints can be satisfied under the chameleon mechanism rather than the non-linear self coupling of the Vainshtein mechanism.

13.6 DGP model

The Dvali-Gabadadze-Porrati (DGP) [220] braneworld model has been considered as a model which could modify gravity because of the existence of the extra dimension. In the DGP model the 3-brane is embedded in a Minkowski bulk spacetime with an infinitely large fifth dimension. Newton’s law can be recovered by adding a 4-dimensional (4D) Einstein-Hilbert action sourced by the brane curvature to the 5D action [219]. While the DGP model recovers the standard 4D gravity for small distances, the effect from the 5D gravity manifests itself for large distances. Remarkably it is possible to realize the late-time cosmic acceleration without introducing an exotic matter source [201, 203].

The DGP model is given by the action

$$S = {1 \over {2\kappa _{(5)}^2}}\int {{{\rm{d}}^5}} X\sqrt {- \tilde g} \,\tilde R + {1 \over {2\kappa _{(4)}^2}}\int {{{\rm{d}}^4}} x\sqrt {- g} \,R + \int {{{\rm{d}}^4}} x\sqrt {- g} \,{\mathcal L}_M^{{\rm{brane}}},$$
(13.46)

where \({{\tilde g}_{AB}}\) is the metric in the 5D bulk and \({g_{\mu \nu}} = {\partial _\mu}{X^A}{\partial _\nu}{X^B}{{\tilde g}_{AB}}\) is the induced metric on the brane with \({X^A}({x^c})\) being the coordinates of an event on the brane labeled by \({x^c}\). The first and second terms in Eq. (13.46) correspond to Einstein-Hilbert actions in the 5D bulk and on the brane, respectively. Note that \(\kappa _{(5)}^2\) and \(\kappa _{(4)}^2\) are 5D and 4D gravitational constants, respectively, which are related with 5D and 4D Planck masses, \({M_{(5)}}\) and \({M_{(4)}}\), via \(\kappa _{(5)}^2 = 1/M_{(5)}^3\) and \(\kappa _{(4)}^2 = 1/M_{(4)}^2\). The Lagrangian \({\mathcal L}_M^{{\rm{brane}}}\) describes matter localized on the 3-brane.

The equations of motion read

$$G_{AB}^{(5)} = 0,$$
(13.47)

where \(G_{AB}^{(5)}\) is the 5D Einstein tensor. The Israel junction conditions on the brane, under which a Z2 symmetry is imposed, read [304]

$${G_{\mu \nu}} - {1 \over {{r_c}}}({K_{\mu \nu}} - {g_{\mu \nu}}K) = \kappa _{(4)}^2{T_{\mu \nu}},$$
(13.48)

where Kμν is the extrinsic curvature [609] calculated on the brane and Tμν is the energy-momentum tensor of localized matter. Since \({\nabla ^\mu}({K_{\mu \nu}} - {g_{\mu \nu}}K) = 0\), the continuity equation \({\nabla ^\mu}{T_{\mu \nu}} = 0\) follows from Eq. (13.48). The length scale rc is defined by

$${r_c} \equiv {{\kappa _{(5)}^2} \over {2\kappa _{(4)}^2}} = {{M_{(4)}^2} \over {2M_{(5)}^3}}.$$
(13.49)

If we consider the flat FLRW brane (K = 0), we obtain the modified Friedmann equation [201, 203]

$${H^2} - {\epsilon \over {{r_c}}}H = {{\kappa _{(4)}^2} \over 3}{\rho _M},$$
(13.50)

where ϵ = ±1, and H and \(\rho_M\) are the Hubble parameter and the matter energy density on the brane, respectively. In the regime \({r_c} \gg {H^{- 1}}\) the first term in Eq. (13.50) dominates over the second one and hence the standard Friedmann equation is recovered. Meanwhile, in the regime \({r_c} \lesssim {H^{- 1}}\), the second term in Eq. (13.50) leads to a modification to the standard Friedmann equation. If ϵ = 1, there is a de Sitter solution characterized by

$${H_{{\rm{dS}}}} = 1/{r_c}.$$
(13.51)

One can realize the cosmic acceleration today if \(r_c\) is of the order of the present Hubble radius \(H_0^{- 1}\). This self-acceleration is the result of gravitational leakage into the extra dimension at large distances. In the other branch (ϵ = −1) such cosmic acceleration is not realized.
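The two branches of Eq. (13.50) can be compared numerically. The following Python sketch solves the quadratic for its expanding root; the function name `hubble_dgp` and the shorthand `rho_term` for \(\kappa _{(4)}^2{\rho _M}/3\) are illustrative choices rather than notation from the literature, and the units are arbitrary.

```python
import math

def hubble_dgp(rho_term, r_c, eps):
    """Expanding root H of H^2 - eps*H/r_c = rho_term (Eq. 13.50).

    rho_term : kappa_(4)^2 * rho_M / 3 (hypothetical shorthand, >= 0)
    r_c      : crossover length scale of Eq. (13.49)
    eps      : +1 (self-accelerating branch) or -1 (normal branch)
    """
    if eps not in (1, -1):
        raise ValueError("eps must be +1 or -1")
    if rho_term < 0 or r_c <= 0:
        raise ValueError("need rho_term >= 0 and r_c > 0")
    # Quadratic formula, keeping the root with H >= 0.
    return 0.5 * (eps / r_c + math.sqrt(1.0 / r_c**2 + 4.0 * rho_term))
```

As the matter density dilutes away, the ϵ = +1 branch tends to \(1/{r_c}\) as in Eq. (13.51), whereas the ϵ = −1 branch tends to zero; at high density both branches approach the standard relation \({H^2} \simeq \kappa _{(4)}^2{\rho _M}/3\).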

In the DGP model the modification of gravity comes from a scalar-field degree of freedom, usually called π, which is identified as a brane bending mode in the bulk. One may then wonder whether such a field mediates a fifth force incompatible with local gravity constraints. However, this is not the case, as the Vainshtein mechanism is at work in the DGP model on length scales smaller than the Vainshtein radius \({r_\ast} = {({r_g}r_c^2)^{1/3}}\), where \(r_g\) is the Schwarzschild radius of a source. The model can evade solar system constraints at least under some range of conditions on the energy-momentum tensor [204, 285, 496]. The Vainshtein mechanism in the DGP model originates from a non-linear self-interaction of the scalar-field degree of freedom.
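To get a feeling for the size of the Vainshtein radius, the sketch below evaluates \({r_\ast} = {({r_g}r_c^2)^{1/3}}\) for the Sun. The numerical inputs are assumptions made only for illustration: a solar Schwarzschild radius of about 2.95 km and \(r_c\) set to a Hubble radius of roughly \(1.3 \times 10^{26}\) m.

```python
def vainshtein_radius(r_g, r_c):
    """Vainshtein radius r_* = (r_g * r_c^2)^(1/3), all lengths in the same unit."""
    return (r_g * r_c**2) ** (1.0 / 3.0)

# Illustrative inputs (assumed values, in metres).
R_G_SUN = 2.95e3   # Schwarzschild radius of the Sun
R_C = 1.3e26       # crossover scale taken equal to a Hubble radius c/H_0
AU = 1.496e11      # astronomical unit

r_star_sun = vainshtein_radius(R_G_SUN, R_C)
print(f"r_* for the Sun: {r_star_sun:.2e} m = {r_star_sun / AU:.2e} AU")
```

With these inputs \(r_\ast\) comes out at the order of \(10^{18}\) m, roughly a hundred parsecs, so the whole solar system lies deep inside the region where the non-linear self-interaction of π is expected to screen the fifth force.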

Although the DGP model is appealing and elegant, it is also plagued by some problems. The first is that, although the model does not possess ghosts on asymptotically flat manifolds, at the quantum level it suffers from strong coupling at typical distances smaller than 1000 km, so that the theory is not easily under control [401]. Besides, the model typically possesses superluminal modes. This may not directly violate causality, but it implies a non-trivial non-Lorentzian UV completion of the theory [304]. Also, on scales relevant for structure formation (between cluster scales and the Hubble radius), a quasi-static approximation to linear cosmological perturbations shows that the DGP model contains a ghost mode [369]. This linear analysis is valid as long as the Vainshtein radius \(r_\ast\) is smaller than the cluster scales.

The original DGP model has been tested by using a number of observational data at the background level [525, 238, 405, 9, 549]. The joint constraints from the data of SN Ia, BAO, and the CMB shift parameter show that the flat DGP model is under strong observational pressure, while the open DGP model gives a slightly better fit [405, 549]. Xia [622] showed that the parameter α in the modified Friedmann equation \({H^2} - {H^\alpha}/r_c^{2 - \alpha} = \kappa _{(4)}^2{\rho _M}/3\) [221] is constrained to be α = 0.254 ± 0.153 (68% confidence level) by using the data of SN Ia, BAO, CMB, gamma ray bursts, and the linear growth factor of matter perturbations. Hence the flat DGP model (α = 1) is not compatible with current observations.

On sub-horizon scales larger than the Vainshtein radius, the equation for linear matter perturbations \(\delta_m\) in the DGP model was derived in [400, 369] under a quasi-static approximation:

$${\ddot \delta _m} + 2H{\dot \delta _m} - 4\pi {G_{{\rm{eff}}}}{\rho _m}{\delta _m} \simeq 0,$$
(13.52)

where \(\rho_m\) is the non-relativistic matter density on the brane and

$${G_{{\rm{eff}}}} = \left({1 + {1 \over {3\beta}}} \right)\,G\,,\qquad \beta (t) \equiv 1 - 2H{r_c}\left({1 + {{\dot H} \over {3{H^2}}}} \right).\,$$
(13.53)

In the deep matter era one has \(H{r_c} \gg 1\) and hence \(\beta \simeq - H{r_c}\), so that β is large and negative (\(\vert \beta \vert \gg 1\)). In this regime the evolution of \(\delta_m\) is similar to that in GR (\({\delta _m} \propto {t^{2/3}}\)). Since the background solution finally approaches the de Sitter solution characterized by Eq. (13.51), it follows that \(\beta \simeq 1 - 2H{r_c} \simeq - 1\) asymptotically. Since \(1 + 1/(3\beta) \simeq 2/3\), the growth rate in this regime is smaller than that in GR.
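The two limits quoted above can be reproduced with a short sketch. It assumes the ϵ = +1 branch of Eq. (13.50) filled with pressureless matter only, so that \({\dot \rho _M} = - 3H{\rho _M}\); differentiating Eq. (13.50) then gives \(\dot H/{H^2} = - 3(x - 1)/(2x - 1)\) with \(x \equiv H{r_c}\). The variable `x` and the function names are introduced here for illustration.

```python
def beta_dgp(x):
    """beta of Eq. (13.53) as a function of x = H * r_c (x >= 1).

    Assumes the eps = +1 branch with pressureless matter, for which
    Hdot / H^2 = -3 (x - 1) / (2 x - 1).
    """
    if x < 1.0:
        raise ValueError("the eps = +1 branch requires H * r_c >= 1")
    hdot_over_h2 = -3.0 * (x - 1.0) / (2.0 * x - 1.0)
    return 1.0 - 2.0 * x * (1.0 + hdot_over_h2 / 3.0)

def geff_over_g(x):
    """Ratio G_eff / G = 1 + 1 / (3 beta) from Eq. (13.53)."""
    return 1.0 + 1.0 / (3.0 * beta_dgp(x))

for x in (1000.0, 10.0, 2.0, 1.0):
    print(f"H r_c = {x:7.1f}  beta = {beta_dgp(x):10.3f}  G_eff/G = {geff_over_g(x):.4f}")
```

For \(x \gg 1\) the sketch returns \(\beta \simeq - H{r_c}\) and \({G_{{\rm{eff}}}}/G \to 1\), while at the de Sitter point x = 1 it returns β = −1 and \({G_{{\rm{eff}}}}/G = 2/3\).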

The index γ of the growth rate \({f_\delta} = {({\Omega _m})^\gamma}\) is approximated by γ ≈ 0.68 [395]. This is quite different from the value γ ≃ 0.55 for the ΛCDM model. If the future imaging survey of galaxies can constrain γ within 20%, it will be possible to distinguish the ΛCDM model from the DGP model observationally [624]. We recall that in metric f(R) gravity the growth index today can be as small as γ = 0.4 because of the enhanced growth rate, which is very different from the value in the DGP model.
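A rough numerical estimate of the growth index can be obtained by integrating Eq. (13.52) along the self-accelerating background. The sketch below is not the calculation of [395]; it assumes pressureless matter on the ϵ = +1 branch, defines \({\Omega _m} \equiv \kappa _{(4)}^2{\rho _m}/(3{H^2})\) with \(\kappa _{(4)}^2 = 8\pi G\), and uses the relations that then follow from Eqs. (13.50) and (13.53): \(\dot H/{H^2} = - 3{\Omega _m}/(1 + {\Omega _m})\), \({\rm{d}}{\Omega _m}/{\rm{d}}\ln a = - 3{\Omega _m}(1 - {\Omega _m})/(1 + {\Omega _m})\), and \({G_{{\rm{eff}}}}/G = 1 - (1 - \Omega _m^2)/[3(1 + \Omega _m^2)]\). The growth rate \(f = {\rm{d}}\ln {\delta _m}/{\rm{d}}\ln a\) is evolved with a fixed-step Runge-Kutta scheme using \(\Omega_m\) as the independent variable.

```python
import math

def geff_over_g(om):
    """G_eff / G on the eps = +1 DGP branch (matter only), as a function of Omega_m."""
    return 1.0 - (1.0 - om**2) / (3.0 * (1.0 + om**2))

def df_dom(om, f):
    """d f / d Omega_m for the growth rate f = dln(delta_m)/dln(a)."""
    dlnh_dlna = -3.0 * om / (1.0 + om)                 # Hdot / H^2
    dom_dlna = -3.0 * om * (1.0 - om) / (1.0 + om)     # dOmega_m / dln(a)
    # Eq. (13.52) rewritten for f, with ln(a) as the time variable.
    df_dlna = -f * f - (2.0 + dlnh_dlna) * f + 1.5 * geff_over_g(om) * om
    return df_dlna / dom_dlna

def growth_index(om_final, om_init=0.999, steps=20000):
    """Integrate from the matter era (f = 1) down to om_final with RK4.

    Returns gamma defined through f = Omega_m ** gamma at om_final.
    """
    h = (om_final - om_init) / steps   # negative step: Omega_m decreases in time
    om, f = om_init, 1.0
    for _ in range(steps):
        k1 = df_dom(om, f)
        k2 = df_dom(om + 0.5 * h, f + 0.5 * h * k1)
        k3 = df_dom(om + 0.5 * h, f + 0.5 * h * k2)
        k4 = df_dom(om + h, f + h * k3)
        f += h * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
        om += h
    return math.log(f) / math.log(om_final)

print("gamma at Omega_m = 0.3:", growth_index(0.3))
```

Expanding the same equations around \({\Omega _m} \to 1\) gives γ → 11/16 ≃ 0.69 analytically, which is in line with the approximate value quoted above and clearly separated from the ΛCDM value.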

Comparing Eq. (13.53) with the effective gravitational constant (10.42) in BD theory with a massless limit (or the absence of the field potential), we find that the parameter \(\omega_{\rm BD}\) has the following relation with β:

$${\omega _{{\rm{BD}}}} = {3 \over 2}(\beta - 1).$$
(13.54)

Since β < 0 for the self-accelerating DGP solution, it follows that \({\omega _{{\rm{BD}}}} < - 3/2\). This corresponds to the theory with ghosts, because the kinetic energy of a scalar-field degree of freedom becomes negative in the Einstein frame [175]. There is a claim that the ghost may disappear for the Vainshtein radius \(r_\ast\) of the order of \(H_0^{- 1}\), because the linear perturbation theory is no longer applicable [218]. In fact, a ghost does not appear in a Minkowski brane in the DGP model. In [370] it was shown that the Vainshtein radius in the early universe is much smaller than the one in the Minkowski background, while in the self-accelerating universe they agree with each other. Hence the perturbative approach seems to be still possible for the weak gravity regime beyond the Vainshtein radius.
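The matching behind Eq. (13.54) can be written out explicitly. Here it is assumed that, in the massless limit, the BD effective gravitational constant takes the standard form proportional to \((4 + 2{\omega _{{\rm{BD}}}})/(3 + 2{\omega _{{\rm{BD}}}})\), with the overall normalization absorbed into G:

$$1 + {1 \over {3\beta}} = {{4 + 2{\omega _{{\rm{BD}}}}} \over {3 + 2{\omega _{{\rm{BD}}}}}} = 1 + {1 \over {3 + 2{\omega _{{\rm{BD}}}}}}\quad \Rightarrow \quad 3\beta = 3 + 2{\omega _{{\rm{BD}}}}\quad \Rightarrow \quad {\omega _{{\rm{BD}}}} = {3 \over 2}(\beta - 1).$$

Any β < 0 therefore gives \({\omega _{{\rm{BD}}}} < - 3/2\); at the de Sitter point, where β ≃ −1, one finds \({\omega _{{\rm{BD}}}} \simeq - 3\).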

There have been studies regarding a possible regularization in order to avoid the ghost/strong coupling limit. Some of these studies have focused on smoothing out the delta profile of the Ricci scalar on the brane, by coupling the Ricci scalar to some other scalar field with a given profile [363, 362]. In [516] the authors included the brane and bulk cosmological constants in addition to the scalar curvature in the action for the brane and showed that the effective equation of state of dark energy can be smaller than −1. A monopole in seven dimensions generated by a SO(3) invariant matter Lagrangian is able to change the gravitational law at its core, leading to a lower dimensional gravitational law. This is a first approach to an explanation of trapping of gravitons, due to topological defects in classical field theory [508, 184]. Other studies have focused on re-using the delta function profile but in a higher-dimensional brane [334, 333, 197]. There is also an interesting work about the possibility of self-acceleration in the normal DGP branch [ϵ = −1 in Eq. (13.50)] by considering an f(R) term on the brane action [97] (see also [4]). All these attempts indeed point to the direction that some mechanism, if not exactly DGP rather similar to it, may avoid a number of problems associated with the original DGP model.

13.7 Special symmetries

Since general covariance alone does not restrict the choice of the Lagrangian function, e.g., for f(R) theory, one can try to shrink the set of allowed functions by imposing some extra symmetry. In particular one can assume that the theory possesses a symmetry on some special background. However, allowing some theories to be symmetrical on some backgrounds does not imply these theories are viable by default. Nevertheless, this symmetry helps to give stronger constraints on them, as the allowed parameter space drastically reduces. We will discuss here two of the symmetries studied in the literature: Noether symmetries on a FLRW background and the Galileon symmetry on a Minkowski background.

13.7.1 Noether symmetry on FLRW

The action for metric f(R) gravity can be evaluated on a FLRW background, in terms of the fields a(t) and R(t), see [125, 124] (and also [129, 128, 415, 433, 429, 604, 603, 199, 200, 132]). Then the Lagrangian turns out to be non-singular, \({\mathcal L}({q_i},{{\dot q}_i})\), where \(q_1 = a\) and \(q_2 = R\). Its Euler-Lagrange equation is given by \({(\partial {\mathcal L}/\partial {{\dot q}^i})^ \cdot} - \partial {\mathcal L}/\partial {q^i} = 0\). Contracting these equations with a vector function \({\alpha ^j}({q_i})\) (where \({\alpha ^1} = \alpha\) and \({\alpha ^2} = \beta\) are two unknown functions of the \(q_i\)), we obtain

$${\alpha ^i}\left({{{\rm{d}} \over {{\rm{d}}t}}{{\partial {\mathcal L}} \over {\partial {{\dot q}^i}}} - {{\partial {\mathcal L}} \over {\partial {q^i}}}} \right) = 0 \quad \rightarrow \quad {{\rm{d}} \over {{\rm{d}}t}}\left({{\alpha ^i}{{\partial {\mathcal L}} \over {\partial {{\dot q}^i}}}} \right) = {L_X}{\mathcal L}.$$
(13.55)

Here \({L_X}{\mathcal L}\) is the Lie derivative of \({\mathcal L}\) with respect to the vector field

$$X = {\alpha ^i}(q){\partial \over {\partial {q^i}}} + \left({{{\rm{d}} \over {{\rm{d}}t}}{\alpha ^i}(q)} \right){\partial \over {\partial {{\dot q}^i}}}.$$
(13.56)

If \({L_X}{\mathcal L} = 0\), Noether's theorem states that the function \({\Sigma _0} = {\alpha ^i}(\partial {\mathcal L}/\partial {{\dot q}^i})\) is a constant of motion. The generator of the Noether symmetry in metric f(R) gravity on the flat FLRW background is

$$X = \alpha {\partial \over {\partial a}} + \beta {\partial \over {\partial R}} + \dot \alpha {\partial \over {\partial \dot a}} + \dot \beta {\partial \over {\partial \dot R}}.$$
(13.57)

A symmetry exists if the equation \({L_X}{\mathcal L} = 0\) has non-trivial solutions. As a byproduct, the form of f(R), not specified in the point-like Lagrangian \({\mathcal L}\), is determined in correspondence to such a symmetry.
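The step from the contracted Euler-Lagrange equations to the right-hand side of Eq. (13.55) follows from the product rule and the definition (13.56) of X:

$${{\rm{d}} \over {{\rm{d}}t}}\left({{\alpha ^i}{{\partial {\mathcal L}} \over {\partial {{\dot q}^i}}}} \right) = {\dot \alpha ^i}{{\partial {\mathcal L}} \over {\partial {{\dot q}^i}}} + {\alpha ^i}{{\rm{d}} \over {{\rm{d}}t}}{{\partial {\mathcal L}} \over {\partial {{\dot q}^i}}} = {\dot \alpha ^i}{{\partial {\mathcal L}} \over {\partial {{\dot q}^i}}} + {\alpha ^i}{{\partial {\mathcal L}} \over {\partial {q^i}}} = {L_X}{\mathcal L},$$

where the Euler-Lagrange equations were used in the second equality. The quantity \({\alpha ^i}(\partial {\mathcal L}/\partial {{\dot q}^i})\) is therefore conserved along solutions exactly when \({L_X}{\mathcal L} = 0\).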

It can be proved that such αi do exist [124], and they correspond to

$$\alpha = {c_1}\,a + {{{c_2}} \over a}\,,\qquad \beta = - \left[ {3\,{c_1} + {{{c_2}} \over {{a^2}}}} \right]{{{f_{,R}}} \over {{f_{,RR}}}} + {{{c_3}} \over {a\,{f_{,RR}}}},$$
(13.58)

where \(c_1\), \(c_2\), \(c_3\) are constants. However, in order that \({L_X}{\mathcal L}\) vanishes, one also needs to set the constraint (provided that \({c_2}R \ne 0\))

$${f_{,R}} = {{3({c_1}\,{a^2} + {c_2})\,f - {c_3}aR} \over {2{c_2}R}} + {{({c_1}{a^2} + {c_2}){\kappa ^2}\rho _r^{(0)}} \over {{a^4}{c_2}R}},$$
(13.59)

where \(\rho _r^{(0)}\) is the radiation density today. Since now \({L_X}{\mathcal L} = 0\), it follows that \({\alpha ^i}(\partial {\mathcal L}/\partial {{\dot q}^i}) = {\rm{constant}}\). This constant of motion corresponds to

$$\alpha \,(6\,{f_{,RR}}\,{a^2}\,\dot R + 12\,{f_{,R}}\,a\,\dot a) + \beta \,(6\,{f_{,RR}}\,{a^2}\,\dot a) = 6\,\mu _0^3 = {\rm{constant}},$$
(13.60)

where \(\mu_0\) has the dimension of mass.

For a general f it is not possible to solve the Euler-Lagrange equation and the constraint equation (13.59) at the same time. Hence, we have to use the Noether constraint in order to find the subset of those f which make this possible. Some partial solutions (only when \(\mu_0 = 0\)) were found, but whether this symmetry helps in finding viable models of f(R) is still not certain. However, the f(R) theories which possess Noether currents can be more easily constrained, as the original freedom for the function f in the Lagrangian is now reduced to the choice of the parameters \(c_i\) and \(\mu_0\).

13.7.2 Galileon symmetry

Recently another symmetry, the Galileon symmetry, for a scalar field Lagrangian was imposed on the Minkowski background [455]. This idea is interesting as it tries to decouple light scalar fields from matter making use of non-linearities, but without introducing new ghost degrees of freedom [455]. This symmetry was chosen so that the theory could naturally implement the Vainshtein mechanism. However, the same mechanism, at least in cosmology, seems to appear also in the FLRW background for scalar fields which do not possess such a symmetry (see [539, 351, 190]).

Keeping a universal coupling with matter (achieved through a pure nonminimal coupling with R), Nicolis et al. [455] imposed a symmetry called the Galilean invariance on a scalar field π in the Minkowski background. If the equations of motion are invariant under a constant gradient-shift on Minkowski spacetime, that is

$$\pi \rightarrow \pi + c + {b_\mu}{x^\mu},$$
(13.61)

where both c and \(b_\mu\) are constants, we call π a Galileon field. This implies that the equations of motion fix the field up to such a transformation. The point is that the Lagrangian must implement the Vainshtein mechanism in order to pass solar-system constraints. This is achieved by introducing self-interacting non-linear terms in the Lagrangian. It should be noted that the Lagrangian is studied only at second order in the fields (having a nonminimal coupling with R) and the metric itself, whereas the non-linearities are fully kept by neglecting their backreaction on the metric (as the biggest contribution should come only from standard matter). The equations of motion respecting the Galileon symmetry contain terms such as a constant, \(\square \pi\) (up to the fourth power), and other power contractions of the tensor \({\nabla _{\mu \nu}}\pi\). It is through these non-linear derivative terms that the Vainshtein mechanism can be implemented, as happens in the DGP model [401].

Nicolis et al. [455] found that there are only five terms \({{\mathcal L}_i}\) with i = 1,…, 5 which can be inserted into a Lagrangian, such that the equations of motion respect the Galileon symmetry in 4-dimensional Minkowski spacetime. The first three terms are given by

$${{\mathcal L}_1} = \pi,$$
(13.62)
$${{\mathcal L}_2} = {\nabla _\mu}\pi {\nabla ^\mu}\pi,$$
(13.63)
$${{\mathcal L}_3} = \square \pi {\nabla _\mu}\pi {\nabla ^\mu}\pi.$$
(13.64)
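As an illustration, the variation of \({{\mathcal L}_3}\) on Minkowski spacetime can be carried out explicitly. Writing \(\delta {{\mathcal L}_3} = (\square \delta \pi)\,{\partial _\mu}\pi {\partial ^\mu}\pi + 2\,\square \pi \,{\partial ^\mu}\pi {\partial _\mu}\delta \pi\) and integrating by parts, the third-derivative terms cancel and one is left with

$${{\delta {{\mathcal L}_3}} \over {\delta \pi}} = \square ({\partial _\mu}\pi {\partial ^\mu}\pi) - 2{\partial _\mu}(\square \pi \,{\partial ^\mu}\pi) = - 2\left[ {{{(\square \pi)}^2} - ({\partial _\mu}{\partial _\nu}\pi)({\partial ^\mu}{\partial ^\nu}\pi)} \right].$$

This expression contains only second derivatives of π. Since \({\partial _\mu}{\partial _\nu}(c + {b_\rho}{x^\rho}) = 0\), it is unchanged under the transformation (13.61).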

All these terms generate only second-order derivative terms in the equations of motion. The approach in Minkowski spacetime has motivated attempts to find a fully covariant framework in curved spacetime. In particular, Deffayet et al. [205] found that all five terms can be written in a fully covariant way. However, if we want to write down \({{\mathcal L}_4}\) and \({{\mathcal L}_5}\) covariantly in curved spacetime and keep the equations of motion free from higher-derivative terms, we need to introduce couplings between the field π and the Riemann tensor [205]. The following two terms keep the field equations at second order,

$${{\mathcal L}_4} = ({\nabla _\mu}\pi {\nabla ^\mu}\pi)\,\left[ {2{{(\square \pi)}^2} - 2({\nabla _{\alpha \beta}}\pi)\,({\nabla ^{\alpha \beta}}\pi) - (1/2)\,R\,{\nabla _\mu}\pi {\nabla ^\mu}\pi} \right],$$
(13.65)
$$\begin{array}{*{20}c} {{{\mathcal L}_5} = ({\nabla _\lambda}\pi {\nabla ^\lambda}\pi)\,\left[ {{{(\square \pi)}^3} - 3 \square \pi \,({\nabla _{\alpha \beta}}\pi)\,({\nabla ^{\alpha \beta}}\pi) + 2({\nabla _\mu}{\nabla ^\nu}\pi)\,({\nabla _\nu}{\nabla ^\rho}\pi)\,({\nabla _\rho}{\nabla ^\mu}\pi)} \right.} \\ {\left. {- 6({\nabla _\mu}\pi)\,({\nabla ^\mu}{\nabla ^\nu}\pi)\,({\nabla ^\rho}\pi)\,{G_{\nu \rho}}} \right],} \\ \end{array}$$
(13.66)

where the last terms in Eqs. (13.65) and (13.66) are newly introduced in the curved spacetime. These terms possess the required symmetry in Minkowski spacetime and, more importantly, they do not introduce derivatives higher than two into the equations of motion. In this sense, although it originated from an implementation of the DGP idea, the covariant Galileon field is closer to the approach of the modifications of gravity in \(f(R,{\mathcal G})\), that is, a formalism which would introduce only second-order equations of motion.

This result can be extended to arbitrary D dimensions [202]. One can find, analogously to the Lovelock action-terms, an infinite tower of terms that can be introduced with the same property of keeping the equations of motion at second order. In particular, let us consider the action

$$S = \int {{{\rm{d}}^D}} x\sqrt {- g} \sum\limits_{p = 0}^{{p_{{\rm{max}}}}} {{{\mathcal C}_{(n + 1,p)}}} {{\mathcal L}_{(n + 1,p)}},$$
(13.67)

where \(p_{\rm max}\) is the integer part of (n − 1)/2 (\(n \le D\)),

$${{\mathcal C}_{(n + 1,p)}} = {\left({- {1 \over 8}} \right)^p}{{(n - 1)!} \over {(n - 1 - 2p)!{{(p!)}^2}}},$$
(13.68)

and

$$\begin{array}{*{20}c} {{{\mathcal L}_{(n + 1,p)}} = - {1 \over {(D - n)!}}{\varepsilon ^{{\mu _1}{\mu _3} \ldots {\mu _{2n - 1}}{\nu _1} \ldots {\nu _{D - n}}}}{\varepsilon ^{{\mu _2}{\mu _4} \ldots {\mu _{2n}}}}_{{\nu _1} \ldots {\nu _{D - n}}}\,{\pi _{;{\mu _1}}}{\pi _{;{\mu _2}}}\,{{({\pi ^{;\lambda}}{\pi _{;\lambda}})}^p}} \\ {\times \prod\limits_{i = 1}^p {{R_{{\mu _{4i - 1}}{\mu _{4i + 1}}{\mu _{4i}}{\mu _{4i + 2}}}}} \prod\limits_{j = 0}^{n - 2 - 2p} {{\pi _{;{\mu _{2n - 1 - 2j}}{\mu _{2n - 2j}}}}}. \quad \quad \quad \quad \quad} \\ \end{array}$$
(13.69)

Here \({\varepsilon ^{1 \cdots n}}\) is the Levi-Civita tensor. The first product in Eq. (13.69) is defined to be one when p = 0 and 0 for p < 0, and the second product is one when n = 1 + 2p and 0 when n < 1 + 2p. In \({{\mathcal L}_{(n + 1,p)}}\) there will be n + 1 powers of π, and p powers of the Riemann tensor. In four dimensions, for example, \({{\mathcal L}_{(1,0)}}\) and \({{\mathcal L}_{(2,0)}}\) are identical to \({{\mathcal L}_1}\) and \({{\mathcal L}_2}\) introduced before, respectively. Instead, \({{\mathcal L}_{(3,0)}}\), \({{\mathcal L}_{(4,0)}} - (1/4){{\mathcal L}_{(4,1)}}\), and \({{\mathcal L}_{(5,0)}} - (3/4){{\mathcal L}_{(5,1)}}\) reduce to \({{\mathcal L}_3},\,{{\mathcal L}_4}\), and \({{\mathcal L}_5}\), up to total derivatives, respectively.
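The relative weights −1/4 and −3/4 quoted above follow directly from Eq. (13.68). The short sketch below evaluates \({{\mathcal C}_{(n + 1,p)}}\) with exact rational arithmetic; the function names `p_max` and `coeff` are illustrative, and the sketch is restricted to n ≥ 1 so that all factorials are defined.

```python
from fractions import Fraction
from math import factorial

def p_max(n):
    """Integer part of (n - 1) / 2, the upper limit of the sum in Eq. (13.67)."""
    if n < 1:
        raise ValueError("this sketch assumes n >= 1")
    return (n - 1) // 2

def coeff(n, p):
    """Coefficient C_(n+1, p) of Eq. (13.68) as an exact fraction."""
    if p < 0 or p > p_max(n):
        raise ValueError("p must satisfy 0 <= p <= p_max(n)")
    return Fraction(-1, 8) ** p * Fraction(
        factorial(n - 1), factorial(n - 1 - 2 * p) * factorial(p) ** 2
    )

for n in (2, 3, 4):
    terms = {p: coeff(n, p) for p in range(p_max(n) + 1)}
    print(f"n + 1 = {n + 1}: {terms}")
```

For n + 1 = 4 and n + 1 = 5 the p = 1 coefficients are −1/4 and −3/4 relative to a unit p = 0 coefficient, matching the combinations \({{\mathcal L}_{(4,0)}} - (1/4){{\mathcal L}_{(4,1)}}\) and \({{\mathcal L}_{(5,0)}} - (3/4){{\mathcal L}_{(5,1)}}\).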

In general, the non-linear terms discussed above may introduce the Vainshtein mechanism to decouple the scalar field from matter around a star, so that solar-system constraints can be satisfied. However, the modes can have superluminal propagation, which is not surprising as the kinetic terms get heavily modified in the covariant formalism. Some studies have focused on the \({{\mathcal L}_3}\) term only, as this corresponds to the simplest case. For some models the background cosmological evolution is similar to that in the DGP model, although there are ghostlike modes depending on the sign of the time-velocity of the field π [158]. There are some works on cosmological dynamics in Brans-Dicke theory in the presence of the non-linear term \({{\mathcal L}_3}\) [539, 351, 190] (although the original Galileon symmetry is not preserved in this scenario). Interestingly, the ghost can disappear even for the case in which the Brans-Dicke parameter \(\omega_{\rm BD}\) is smaller than −2. Moreover, this theory leaves a number of distinct observational signatures such as the enhanced growth rate of matter perturbations and the significant ISW effect in CMB anisotropies.

At the end of this section we should mention conformal gravity in which the conformal invariance forces the gravitational action to be uniquely given by a Weyl action [414, 340]. Interestingly the conformal symmetry also forces the cosmological constant to be zero at the level of the action [413]. It will be of interest to study the cosmological aspects of such theory, together with the possibility for the avoidance of ghosts and instabilities.

14 Conclusions

We have reviewed many aspects of f(R) theories studied extensively over the past decade. This burst of activities is strongly motivated by the observational discovery of dark energy. The idea is that the gravitational law may be modified on cosmological scales to give rise to the late-time acceleration, while Newton’s gravity needs to be recovered on solar-system scales. In fact, f(R) theories can be regarded as the simplest extension of General Relativity.

The possibility of the late-time cosmic acceleration in metric f(R) gravity was first suggested by Capozziello in 2002 [113]. Even if f(R) gravity looks like a simple theory, successful f(R) dark energy models need to satisfy a number of conditions for consistency with successful cosmological evolution (a late-time accelerated epoch preceded by a matter era) and with local gravity tests on solar-system scales. We summarize the conditions under which metric f(R) dark energy models are viable:

  1. \({f_{,R}} > 0\) for \(R \ge {R_0}\), where \(R_0\) is the Ricci scalar today. This is required to avoid a ghost state.

  2. \({f_{,RR}} > 0\) for \(R \ge {R_0}\). This is required to avoid the negative mass squared of a scalar-field degree of freedom (tachyon).

  3. \(f(R) \to R - 2\Lambda\) for \(R \gg {R_0}\). This is required for the presence of the matter era and for consistency with local gravity constraints.

  4. \(0 < {{R{f_{,RR}}} \over {{f_{,R}}}}(r = - 2) < 1\) at \(r = - {{R{f_{,R}}} \over f} = - 2\). This is required for the stability and the presence of a late-time de Sitter solution. Note that there is another fixed point that can be responsible for the cosmic acceleration (with an effective equation of state \({w_{{\rm{eff}}}} > - 1\)).

We clarified why the above conditions are required by providing detailed explanations of the background cosmological dynamics (Section 4), local gravity constraints (Section 5), and cosmological perturbations (Sections 6–8).

After the first suggestion of dark energy scenarios based on metric f(R) gravity, it took almost five years to construct viable models that satisfy all the above conditions [26, 382, 31, 306, 568, 35, 587]. In particular, the models (4.83), (4.84), and (4.89) allow appreciable deviation from the ΛCDM model during the late cosmological evolution, while the early cosmological dynamics is similar to that of the ΛCDM. The modification of gravity manifests itself in the evolution of cosmological perturbations through the change of the effective gravitational coupling. As we discussed in Sections 8 and 13, this leaves a number of interesting observational signatures such as the modification to the galaxy and CMB power spectra and the effect on weak lensing. This is very important to distinguish f(R) dark energy models from the ΛCDM model in future high-precision observations.

As we showed in Section 2, the action in metric f(R) gravity can be transformed to that in the Einstein frame. In the Einstein frame, non-relativistic matter couples to a scalar-field degree of freedom (scalaron) with a coupling Q of the order of unity (\(Q = - 1/\sqrt 6\)). For the consistency of metric f(R) gravity with local gravity constraints, we require that the chameleon mechanism [344, 343] is at work to suppress such a large coupling. This is a non-linear regime in which the linear expansion of the Ricci scalar R into the (cosmological) background value \(R_0\) and the perturbation δR is no longer valid, that is, the condition \(\delta R \gg {R_0}\) holds in the region of high density. As long as a spherically symmetric body has a thin shell, the effective matter coupling \(Q_{\rm eff}\) is suppressed to avoid the propagation of the fifth force. In Section 5 we provided a detailed explanation of the chameleon mechanism in f(R) gravity and showed that the models (4.83) and (4.84) are consistent with present experimental bounds of local gravity tests for n > 0.9.

The construction of successful f(R) dark energy models triggered the study of spherically symmetric solutions in those models. Originally it was claimed that a curvature singularity present in the models (4.83) and (4.84) may be accessed in the strong gravitational background like neutron stars [266, 349]. Meanwhile, for the Schwarzschild interior and exterior background with a constant density star, one can approximately derive analytic thin-shell solutions in metric f(R) and Brans-Dicke theory by taking into account the backreaction of gravitational potentials [594]. In fact, as we discussed in Section 11, a static star configuration in the f(R) model (4.84) was numerically found both for the constant density star and the star with a polytropic equation of state [43, 600, 42]. Since the relativistic pressure is strong around the center of the star, the choice of correct boundary conditions along the line of [594] is important to obtain static solutions numerically.

The model \(f(R) = R + {R^2}/(6{M^2})\) proposed by Starobinsky in 1980 is the first model of inflation in the early universe. Inflation occurs in the regime \(R \gg {M^2}\), which is followed by the reheating phase with an oscillating Ricci scalar. In Section 3 we studied the dynamics of inflation and (p)reheating (with and without nonminimal couplings between a field χ and R) in detail. As we showed in Section 7, this model is well consistent with the WMAP 5-year bounds of the spectral index \(n_s\) of curvature perturbations and of the tensor-to-scalar ratio r. It predicts the values of r smaller than the order of 0.01, unlike the chaotic inflation model with \(r = {\mathcal O}(0.1)\). It will be of interest to see whether this model continues to be favored in future observations.
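The smallness of r in this model can be illustrated with the standard leading-order estimates in the number of e-foldings N between horizon crossing and the end of inflation, \({n_s} \simeq 1 - 2/N\) and \(r \simeq 12/{N^2}\). The sketch below simply tabulates these two expressions; the range N = 50 to 60 is an assumed, commonly used interval, and the function name is illustrative.

```python
def starobinsky_predictions(n_efolds):
    """Leading-order (in 1/N) spectral index and tensor-to-scalar ratio
    for f(R) = R + R^2 / (6 M^2) inflation."""
    if n_efolds <= 0:
        raise ValueError("the number of e-foldings must be positive")
    n_s = 1.0 - 2.0 / n_efolds
    r = 12.0 / n_efolds**2
    return n_s, r

for n_efolds in (50, 55, 60):
    n_s, r = starobinsky_predictions(n_efolds)
    print(f"N = {n_efolds}: n_s = {n_s:.4f}, r = {r:.4f}")
```

Over this range r stays between roughly 0.003 and 0.005, well below the \({\mathcal O}(0.1)\) level of chaotic inflation.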

Besides metric f(R) gravity, there is another formalism dubbed the Palatini formalism in which the metric \(g_{\mu \nu}\) and the connection \(\Gamma _{\beta \gamma}^\alpha\) are treated as independent variables when we vary the action (see Section 9). The Palatini f(R) gravity gives rise to the specific trace equation (9.2) that does not have a propagating degree of freedom. Cosmologically we showed that even for the model \(f(R) = R - \beta /{R^n}\) (β > 0, n > −1) it is possible to realize a sequence of radiation, matter, and de Sitter epochs (unlike the same model in metric f(R) gravity). However, the Palatini f(R) gravity is plagued by a number of shortcomings such as the inconsistency with observations of large-scale structure, the conflict with the Standard Model of particle physics, and the divergent behavior of the Ricci scalar at the surface of a static spherically symmetric star with a polytropic equation of state \(P = c\rho _0^\Gamma\) with 3/2 < Γ < 2. The only way to avoid these problems is that the f(R) models need to be extremely close to the ΛCDM model. This property is different from metric f(R) gravity in which the deviation from the ΛCDM model can be significant for R of the order of the Ricci scalar today.

In Brans-Dicke (BD) theories with the action (10.1), expressed in the Einstein frame, non-relativistic matter is coupled to a scalar field with a constant coupling Q. As we showed in Section 10.1, this coupling Q is related to the BD parameter \(\omega_{\rm BD}\) through the relation \(1/(2{Q^2}) = 3 + 2{\omega _{{\rm{BD}}}}\). These theories include metric and Palatini f(R) gravity theories as special cases where the coupling is given by \(Q = - 1/\sqrt 6\) (i.e., \(\omega_{\rm BD} = 0\)) and Q = 0 (i.e., \(\omega_{\rm BD} = -3/2\)), respectively. In BD theories with the coupling Q of the order of unity we constructed a scalar-field potential responsible for the late-time cosmic acceleration, while satisfying local gravity constraints through the chameleon mechanism. This corresponds to the generalization of metric f(R) gravity, which covers the models (4.83) and (4.84) as specific cases. We discussed a number of observational signatures in those models such as the effects on the matter power spectrum and weak lensing.

Besides the Ricci scalar R, there are other scalar quantities such as \({R_{\mu \nu}}{R^{\mu \nu}}\) and \({R_{\mu \nu \rho \sigma}}{R^{\mu \nu \rho \sigma}}\) constructed from the Ricci tensor \(R_{\mu \nu}\) and the Riemann tensor \(R_{\mu \nu \rho \sigma}\). For the Gauss-Bonnet (GB) curvature invariant \({\mathcal G} \equiv {R^2} - 4{R_{\mu \nu}}{R^{\mu \nu}} + {R_{\mu \nu \rho \sigma}}{R^{\mu \nu \rho \sigma}}\) one can avoid the appearance of spurious spin-2 ghosts. There are dark energy models in which the Lagrangian density is given by \({\mathcal L} = R + f({\mathcal G})\), where \(f({\mathcal G})\) is an arbitrary function in terms of \({\mathcal G}\). In fact, it is possible to explain the late-time cosmic acceleration for the models such as (12.16) and (12.17), while at the same time local gravity constraints are satisfied. However, density perturbations in perfect fluids exhibit violent negative instabilities during both the radiation and the matter domination, irrespective of the form of \(f({\mathcal G})\). The growth of perturbations gets stronger on smaller scales, which is incompatible with the observed galaxy spectrum unless the deviation from GR is very small. Hence these models are effectively ruled out by this ultraviolet instability. This implies that metric f(R) gravity may correspond to the marginal theory that can avoid such instability problems.

In Section 13 we discussed other aspects of f(R) gravity and modified gravity theories, such as weak lensing, thermodynamics and horizon entropy, Noether symmetry in f(R) gravity, unified f(R) models of inflation and dark energy, f(R) theories in extra dimensions, Vainshtein mechanism, DGP model, and Galileon field. Up to early 2010 the number of papers that include the word “f(R)” in the title is over 460, and more than 1050 papers including the words “f(R)” or “modified gravity” or “Gauss-Bonnet” have been written so far. This shows how rich and fruitful this field is in its application to many aspects of gravity and cosmology.

Although in this review we have focused on f(R) gravity and some extended theories such as BD theory and Gauss-Bonnet gravity, there are other classes of modified gravity theories, e.g., Einstein-Aether theory [325], tensor-vector-scalar theory of gravity [76], ghost condensation [38], Lorentz violating theories [144, 282, 389], and Hořava-Lifshitz gravity [305]. There are also attempts to study f(R) gravity in the context of Hořava-Lifshitz gravity [346, 347]. We hope that future high-precision observations can distinguish between these modified gravity theories, in connection to solving the fundamental problems for the origin of inflation, dark matter, and dark energy.