1 Introduction

Over the last 35 years, one of the greatest achievements in classical general relativity has certainly been the proof of the positivity of the total gravitational energy, both at spatial and null infinity. It is precisely its positivity that makes this notion not only important (because of its theoretical significance), but also a useful tool in the everyday practice of working relativists. This success inspired the more ambitious claim to associate energy (or rather energy-momentum and, ultimately, angular momentum as well) to extended, but finite, spacetime domains, i.e., at the quasi-local level. Obviously, the quasi-local quantities could provide a more detailed characterization of the states of the gravitational ‘field’ than the global ones, so they (together with more general quasi-local observables) would be interesting in their own right.

Moreover, finding an appropriate notion of energy-momentum and angular momentum would be important from the point of view of applications as well. For example, they may play a central role in the proof of the full Penrose inequality (as they have already played in the proof of the Riemannian version of this inequality). The correct, ultimate formulation of black hole thermodynamics should probably be based on quasi-locally defined internal energy, entropy, angular momentum, etc. In numerical calculations, conserved quantities (or at least those for which balance equations can be derived) are used to control the errors. However, in such calculations all the domains are finite, i.e., quasi-local. Therefore, a solid theoretical foundation of the quasi-local conserved quantities is needed.

However, contrary to the high expectations of the 1980s, finding an appropriate quasi-local notion of energy-momentum has proven to be surprisingly difficult. Nowadays, the state of the art is typically postmodern: although there are several promising and useful suggestions, we not only have no ultimate, generally accepted expression for the energy-momentum and especially for the angular momentum, but there is not even a consensus in the relativity community on general questions (for example, what do we mean by energy-momentum? just a general expression containing arbitrary functions, or rather a definite one, free of any ambiguities, even of additive constants), or on the list of the criteria of reasonableness of such expressions. The various suggestions are based on different philosophies/approaches and give different results in the same situation. Apparently, the ideas and successes of one construction have very little influence on other constructions.

The aim of the present paper is, therefore, twofold. First, to collect and review the various specific suggestions, and, second, to stimulate the interaction between the different approaches by clarifying the general, potentially-common points, issues and questions. Thus, we wanted not only to write a ‘who-did-what’ review, but to concentrate on the understanding of the basic questions (such as why should the gravitational energy-momentum and angular momentum, or, more generally, any observable of the gravitational ‘field’, be necessarily quasi-local) and ideas behind the various specific constructions. Consequently, one third of the present review is devoted to these general questions. We review the specific constructions and their properties only in the second part, and in the third part we discuss very briefly some (potential) applications of the quasi-local quantities. Although this paper is at heart a review of known and published results, we believe that it contains several new elements, observations, suggestions etc.

Surprisingly enough, most of the ideas and concepts that appear in connection with the gravitational energy-momentum and angular momentum can be introduced in (and hence can be understood from) the theory of matter fields in Minkowski spacetime. Thus, in Section 2.1, we review the Belinfante-Rosenfeld procedure that we will apply to gravity in Section 3, introduce the notion of quasi-local energy-momentum and angular momentum of the matter fields and discuss their properties. The philosophy of quasi-locality in general relativity will be demonstrated in Minkowski spacetime where the energy-momentum and angular momentum of the matter fields are treated quasi-locally. Then we turn to the difficulties of gravitational energy-momentum and angular momentum, and we clarify why the gravitational observables should necessarily be quasi-local. The tools needed to construct and analyze the quasi-local quantities are reviewed in the fourth section. This closes the first (general) part of the review (Sections 2–4).

The second part is devoted to the discussion of the specific constructions (Sections 5–12). Since most of the suggestions are constructions, they cannot be given as a short mathematical definition. Moreover, there are important physical ideas behind them, without which the constructions may appear ad hoc. Thus, we always try to explain these physical pictures, the motivations and interpretations. Although the present paper is intended to be a nontechnical review, the explicit mathematical definitions of the various specific constructions will always be given, while the properties and applications are usually summarized only. Sometimes we give a review of technical aspects as well, without which it would be difficult to understand even some of the conceptual issues. The list of references connected with this second part is intended to be complete. We apologize to all those whose results were accidentally left out.

The list of the (actual and potential) applications of the quasi-local quantities, discussed in Section 13, is far from being complete, and might be a bit subjective. Here we consider the calculation of gravitational energy transfer, applications to black hole physics and cosmology, and a quasi-local characterization of the pp-wave metrics. We close this paper with a discussion of the successes and deficiencies of the general and (potentially) viable constructions. In contrast to the positivistic style of Sections 5–12, Section 14 (as well as the choice of subject matter of Sections 2–4) reflects our own personal interest and view of the subject.

The theory of quasi-local observables in general relativity is far from being complete. The most important open problem is still the trivial one: ‘Find quasi-local energy-momentum and angular momentum expressions satisfying the points of the lists of Section 4.3’. Several specific open questions in connection with the specific definitions are raised both in the corresponding sections and in Section 14; these are simple enough to be worked out by graduate students. On the other hand, applying them to solve physical/geometrical problems (e.g., to some mentioned in Section 13) would be a real achievement.

In the present paper we adopt the abstract index formalism. The signature of the spacetime metric \(g_{ab}\) is −2, and the curvature and Ricci tensors and the curvature scalar of the covariant derivative \(\nabla_a\) are defined by \(({\nabla _c}{\nabla _d} - {\nabla _d}{\nabla _c}){X^a}: = - {R^a}_{bcd}{X^b}\), \({R_{bd}}: = {R^a}_{bad}\) and \(R: = {R_{bd}}{g^{bd}}\), respectively. Hence, Einstein’s equations take the form \({G_{ab}} + \lambda {g_{ab}}: = {R_{ab}} - {1 \over 2}R{g_{ab}} + \lambda {g_{ab}} = - 8\pi G{T_{ab}}\), where G is Newton’s gravitational constant and λ is the cosmological constant (and the speed of light is c = 1). However, apart from special cases stated explicitly, the cosmological constant will be assumed to be vanishing, and in Sections 3.1.1, 13.3 and 13.4 we use the traditional cgs system.

2 Energy-Momentum and Angular Momentum of Matter Fields

2.1 Energy-momentum and angular-momentum density of matter fields

2.1.1 The symmetric energy-momentum tensor

It is a widely accepted view that the canonical energy-momentum and spin tensors are well defined and have relevance only in flat spacetime, and, hence, are usually underestimated and abandoned. However, it is only the analog of these canonical quantities that can be associated with gravity itself. Thus, we first introduce these quantities for the matter fields in a general curved spacetime.

To specify the state of the matter fields operationally, two kinds of devices are needed: the first measures the value of the fields, while the other measures the spatio-temporal location of the first. Correspondingly, the fields on the manifold M of events can be grouped into two sharply-distinguished classes. The first contains the matter field variables, e.g., finitely many (r, s)-type tensor fields \({\Phi _N}_{{b_1} \ldots {b_s}}^{{a_1} \ldots {a_r}}\), whilst the second contains the fields specifying the spacetime geometry, i.e., the metric \(g_{ab}\) in Einstein’s theory. Suppose that the dynamics of the matter fields is governed by Hamilton’s principle specified by a Lagrangian \({L_{\rm{m}}} = {L_{\rm{m}}}({g^{ab}},{\Phi _N},{\nabla _e}{\Phi _N}, \ldots, {\nabla _{{e_1}}} \ldots {\nabla _{{e_k}}}{\Phi _N})\). If \({I_{\rm{m}}}[{g^{ab}},{\Phi _N}]\) is the action functional, i.e., the volume integral of \(L_{\rm{m}}\) on some open domain D with compact closure, then the equations of motion are

$$E_{\;\;\;a \ldots}^{Nb \ldots}: = {1 \over {\sqrt {\vert g \vert}}}{{\delta {I_{\rm{m}}}} \over {\delta {\Phi _{N_{b \ldots}^{a \ldots}}}}} = \sum\limits_{n = 0}^k {{{(-)}^n}{\nabla _{{e_n}}} \ldots {\nabla _{{e_1}}}\left({{{\partial {L_{\rm{m}}}} \over {\partial \left({{\nabla _{{e_1}}} \ldots {\nabla _{{e_n}}}{\Phi _{N_{b \ldots}^{a \ldots}}}} \right)}}} \right) =} 0,$$

the Euler-Lagrange equations. (Here, of course, \(\delta {I_{\rm{m}}}/\delta {\Phi _N}_{b \ldots}^{a \ldots}\) denotes the formal variational derivative of Im with respect to the field variable \({\Phi _N}_{b \ldots}^{a \ldots}\).) The symmetric (or dynamical) energy-momentum tensor is defined (and is given explicitly) by

$${T_{ab}}: = {2 \over {\sqrt {\vert g\vert}}}{{\delta {I_{\rm{m}}}} \over {\delta {g^{ab}}}} = 2{{\partial {L_{\rm{m}}}} \over {\partial {g^{ab}}}} - {L_{\rm{m}}}{g_{ab}} + {1 \over 2}{\nabla ^e}({\sigma _{abe}} + {\sigma _{bae}} - {\sigma _{aeb}} - {\sigma _{bea}} - {\sigma _{eab}} - {\sigma _{eba}}),$$
(2.1)

where we introduced the canonical spin tensor

$${\sigma ^{ea}}_b: = \sum\limits_{n = 1}^k {\sum\limits_{i = 1}^n {{{( - )}^i}\delta _{{e_i}}^e{\nabla _{{e_{i - 1}}}} \ldots {\nabla _{{e_1}}}\left( {\frac{{\partial {L_{\rm{m}}}}}{{\partial ({\nabla _{{e_1}}} \ldots {\nabla _{{e_n}}}{\Phi _N}_{d \ldots}^{c \ldots})}}} \right)}} \;\Delta^{ac \ldots}_{b{e_{i+1}} \ldots {e_n}d \ldots}{}^{{f_{i+1}} \ldots {f_n}g \ldots}_{h \ldots}\;{\nabla _{{f_{i+1}}}} \ldots {\nabla _{{f_n}}}{\Phi _N}_{g \ldots}^{h \ldots}.$$
(2.2)

(The terminology will be justified in Section 2.2.) Here \(\Delta _{b{d_1} \ldots {d_q}{h_1} \ldots {h_p}}^{a{c_1} \ldots {c_p}{g_1} \ldots {g_q}}\) is the (p + q + 1, p + q + 1)-type invariant tensor, built from the Kronecker deltas, appearing naturally in the expression of the Lie derivative of the (p, q)-type tensor fields in terms of the torsion free covariant derivatives: \({{-\!\!\!\! L}}_{\rm{K}}\Phi _{d \ldots}^{c \ldots} = {\nabla _{\rm{K}}}\Phi _{d \ldots}^{c \ldots} - {\nabla _a}{K^b}\Delta _{bd \ldots h \ldots}^{ac \ldots g \ldots}\Phi _{g \ldots}^{h \ldots}\). (For the general idea behind the derivation of \(T_{ab}\) and Eq. (2.2), see, e.g., Section 3 of [240].)

2.1.2 The canonical Noether current

Suppose that the Lagrangian is weakly diffeomorphism invariant in the sense that, for any vector field Ka and the corresponding local one-parameter family of diffeomorphisms ϕt, one has

$$(\phi _t^{\ast} {L_{\rm{m}}})({g^{ab}},{\Phi _N},{\nabla _e}{\Phi _N}, \ldots) - {L_{\rm{m}}}(\phi _t^{\ast} {g^{ab}},\phi _t^{\ast} {\Phi _N},\phi _t^{\ast} {\nabla _e}{\Phi _N}, \ldots) = {\nabla _e}B_t^e,$$

for some one-parameter family of vector fields \(B_t^e = B_t^e({g^{ab}},{\Phi _N}, \ldots)\). (\(L_{\rm{m}}\) is called diffeomorphism invariant if \({\nabla _e}B_t^e = 0\), e.g., when \(L_{\rm{m}}\) is a scalar.) Let \(K^a\) be any smooth vector field on M. Then, calculating the divergence \({\nabla _a}({L_{\rm{m}}}{K^a})\) to determine the rate of change of the action functional \(I_{\rm{m}}\) along the integral curves of \(K^a\), by a tedious but straightforward computation, one can derive the Noether identity: \({E^N}_{a \ldots}^{b \ldots}{{-\!\!\!\! L}_{\rm{K}}}{\Phi _N}_{b \ldots}^{a \ldots} + {1 \over 2}{T_{ab}}{{-\!\!\!\! L}_{\rm{K}}}{g^{ab}} + {\nabla _e}{C^e}[{\bf{K}}] = 0\), where \({{-\!\!\!\! L}_{\rm{K}}}\) denotes the Lie derivative along \(K^a\), and \({C^e}[{\bf{K}}]\), the Noether current, is given explicitly by

$${C^e}[{\bf{K}}] = {\dot B^e} + {\theta ^{ea}}{K_a} + \left({{\sigma ^{e[ab]}} + {\sigma ^{a[be]}} + {\sigma ^{b[ae]}}} \right){\nabla _a}{K_b}.$$
(2.3)

Here \({\dot B^e}\) is the derivative of \(B_t^e\) with respect to t at t = 0, which may depend on \(K^a\) and its derivatives, and \({\theta ^a}_b\), the canonical energy-momentum tensor, is defined by

$${\theta ^a}_b: = - {L_{\rm{m}}}\delta _b^a - \sum\limits_{n = 1}^k {\sum\limits_{i = 1}^n {{{(-)}^i}\delta _{{e_i}}^a{\nabla _{{e_{i - 1}}}} \ldots {\nabla _{{e_1}}}\left({{{\partial {L_{\rm{m}}}} \over {\partial ({\nabla _{{e_1}}} \ldots {\nabla _{{e_n}}}{\Phi _{N_{d \ldots}^{c \ldots}}})}}} \right)}} {\nabla _b}{\nabla _{{e_{i + 1}}}} \ldots {\nabla _{{e_n}}}{\Phi _N}_{d \ldots}^{c \ldots}.$$
(2.4)

Note that, apart from the term \({\dot B^e}\), the current \({C^e}[{\bf{K}}]\) does not depend on higher than the first derivative of \(K^a\), and the canonical energy-momentum and spin tensors could be introduced as the coefficients of \(K_a\) and its first derivative, respectively, in \({C^e}[{\bf{K}}]\). (For the original introduction of these concepts, see [73, 74, 438]. If the torsion \({\Theta ^c}_{ab}\) is not vanishing, then in the Noether identity there is a further term, \({1 \over 2}{S^{ab}}_c{{-\!\!\!\! L}_{\rm{K}}}{\Theta ^c}_{ab}\), where the dynamic spin tensor \({S^{ab}}_c\) is defined by \(\sqrt {\vert g\vert} {S^{ab}}_c: = 2\delta {I_{\rm{m}}}/\delta {\Theta ^c}_{ab}\), and the Noether current has a slightly different structure [259, 260].) Obviously, \({C^e}[{\bf{K}}]\) is not uniquely determined by the Noether identity, because that contains only its divergence, and any identically-conserved current may be added to it. In fact, \(B_t^e\) may be chosen to be an arbitrary nonzero (but divergence free) vector field, even for diffeomorphism-invariant Lagrangians. Thus, to be more precise, if \({\dot B^e} = 0\), then we call the specific combination (2.3) the canonical Noether current. Other choices for the Noether current may contain higher derivatives of \(K^a\), as well (see, e.g., [304]), but there is a specific one containing \(K^a\) algebraically (see points 3 and 4 below).

However, Ca[K] is sensitive to total divergences added to the Lagrangian, and, if the matter fields have gauge freedom (e.g., if the matter is a Maxwell or Yang-Mills field), then in general it is not gauge invariant, even if the Lagrangian is. On the other hand, Tab is gauge invariant and is independent of total divergences added to Lm because it is the variational derivative of the gauge invariant action with respect to the metric. Provided the field equations are satisfied, the Noether identity implies [73, 74, 438, 259, 260] that

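  1. \({\nabla _a}{T^{ab}} = 0\),

  2. \({T^{ab}} = {\theta ^{ab}} + {\nabla _c}({\sigma ^{c[ab]}} + {\sigma ^{a[bc]}} + {\sigma ^{b[ac]}})\),

  3. \({C^a}[{\bf{K}}] = {T^{ab}}{K_b} + {\nabla _c}(({\sigma ^{a[cb]}} - {\sigma ^{c[ab]}} - {\sigma ^{b[ac]}}){K_b})\), where the second term on the right is an identically-conserved (i.e., divergence-free) current, and

  4. \({C^a}[{\bf{K}}]\) is conserved if \(K^a\) is a Killing vector.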

Hence, TabKb is also conserved and can equally be considered as a Noether current. (For a formally different, but essentially equivalent, introduction of the Noether current and identity, see [536, 287, 191].)
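As a minimal illustration of these statements (a sketch under assumptions that are not part of the text above), consider a single real scalar field Φ with the first-order Lagrangian \({L_{\rm{m}}} = {1 \over 2}{g^{ab}}({\nabla _a}\Phi)({\nabla _b}\Phi) - {1 \over 2}{m^2}{\Phi ^2}\), where the mass parameter m is hypothetical. The Lie derivative of a scalar contains no derivative of \(K^a\), so the tensor Δ, and with it the canonical spin tensor (2.2), vanishes. Equations (2.1) and (2.4) then reduce to

$${T_{ab}} = 2{{\partial {L_{\rm{m}}}} \over {\partial {g^{ab}}}} - {L_{\rm{m}}}{g_{ab}} = ({\nabla _a}\Phi)({\nabla _b}\Phi) - {L_{\rm{m}}}{g_{ab}},\qquad {\theta ^a}_b = - {L_{\rm{m}}}\delta _b^a + ({\nabla ^a}\Phi)({\nabla _b}\Phi).$$

Thus, in this example the canonical and the symmetric energy-momentum tensors coincide, in agreement with point 2 for vanishing spin tensor, and for \({\dot B^e} = 0\) the canonical Noether current is simply \({C^a}[{\bf{K}}] = {T^{ab}}{K_b}\). With the signature −2 the component \({T_{ab}}{t^a}{t^b}\) along a unit timelike \(t^a\) is a sum of squares, hence non-negative.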

The interpretation of the conserved currents, \({C^a}[{\bf{K}}]\) and \({T^{ab}}{K_b}\), depends on the nature of the Killing vector, \(K^a\). In Minkowski spacetime the ten-dimensional Lie algebra K of the Killing vectors is well known to split into the semidirect sum of a four-dimensional commutative ideal, T, and the quotient K/T, where the latter is isomorphic to so(1, 3). The ideal T is spanned by the constant Killing vectors, in which a constant orthonormal frame field \(\{E_{\underline a}^a\}\) on M, \(\underline a = 0, \ldots, 3\), forms a basis. (Thus, the underlined Roman indices \(\underline a, \underline b\), … are concrete, name indices.) By \({g_{ab}}E_{\underline a}^aE_{\underline b}^b: = {\eta _{\underline a \underline b}}: = {\rm{diag}}(1, - 1, - 1, - 1)\) the ideal T inherits a natural Lorentzian vector space structure. Having chosen an origin o ∈ M, the quotient K/T can be identified as the Lie algebra \({{\rm{R}}_o}\) of the boost-rotation Killing vectors that vanish at o. Thus, K has a ‘4 + 6’ decomposition into translations and boost-rotations, where the translations are canonically defined but the boost-rotations depend on the choice of the origin o ∈ M. In the coordinate system \(\{{x^{\underline a}}\}\) adapted to \(\{E_{\underline a}^a\}\) (i.e., for which the one-form basis dual to \(\{E_{\underline a}^a\}\) has the form \(\vartheta _a^{\underline a} = {\nabla _a}{x^{\underline a}}\)), the general form of the Killing vectors (or rather one-forms) is \({K_a} = {T_{\underline a}}\vartheta _a^{\underline a} + {M_{\underline a \underline b}}({x^{\underline a}}\vartheta _a^{\underline b} - {x^{\underline b}}\vartheta _a^{\underline a})\) for some constants \({T_{\underline a}}\) and \({M_{\underline a \underline b}} = - {M_{\underline b \underline a}}\). Then, the corresponding canonical Noether current is \({C^e}[{\bf{K}}] = E_{\underline e}^e({\theta ^{\underline e \underline a}}{T_{\underline a}} - ({\theta ^{\underline e \underline a}}{x^{\underline b}} - {\theta ^{\underline e \underline b}}{x^{\underline a}} - 2{\sigma ^{\underline e [\underline a \underline b ]}}){M_{\underline a \underline b}})\), and the coefficients of the translation and the boost-rotation parameters \({T_{\underline a}}\) and \({M_{\underline a \underline b}}\) are interpreted as the density of the energy-momentum and of the sum of the orbital and spin angular momenta, respectively. Since, however, the difference \({C^a}[{\bf{K}}] - {T^{ab}}{K_b}\) is identically conserved and \({T^{ab}}{K_b}\) has more advantageous properties, it is \({T^{ab}}{K_b}\) that is used to represent the energy-momentum and angular-momentum density of the matter fields.
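The step from the general Killing one-form to this current can be checked directly; the following short computation is only a sketch using the definitions above. Since \({\nabla _e}{x^{\underline a}} = \vartheta _e^{\underline a}\) and the basis one-forms are constant, the antisymmetry of \({M_{\underline a \underline b}}\) gives

$${\nabla _e}{K_a} = {M_{\underline a \underline b}}(\vartheta _e^{\underline a}\vartheta _a^{\underline b} - \vartheta _e^{\underline b}\vartheta _a^{\underline a}) = 2{M_{\underline a \underline b}}\vartheta _e^{\underline a}\vartheta _a^{\underline b},\qquad {\nabla _{(e}}{K_{a)}} = 0.$$

Hence \(K_a\) satisfies the Killing equation, and in (2.3) only the term \({\sigma ^{e[ab]}}{\nabla _a}{K_b} = 2{\sigma ^{e[\underline a \underline b ]}}{M_{\underline a \underline b}}\) of the spin part survives: the other two terms are symmetric in the pair of indices contracted with the antisymmetric \({\nabla _a}{K_b}\) and cancel each other.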

Since in de Sitter and anti-de Sitter spacetimes the (ten-dimensional) Lie algebra of the Killing vector fields, so(1, 4) and so(2, 3), respectively, are semisimple, there is no such natural notion of translations, and hence no natural ‘4 + 6’ decomposition of the ten conserved currents into energy-momentum and (relativistic) angular momentum density.

2.2 Quasi-local energy-momentum and angular momentum of the matter fields

In Section 3 we will see that well-defined (i.e., gauge-invariant) energy-momentum and angular-momentum density cannot be associated with the gravitational ‘field’, and if we do not want to talk only about global gravitational energy-momentum and angular momentum, then these quantities must be assigned to extended, but finite, spacetime domains.

In the light of modern quantum-field-theory investigations, it has become clear that all physical observables should be associated with extended but finite spacetime domains [232, 231]. Thus, observables are always associated with open subsets of spacetime, whose closure is compact, i.e., they are quasi-local. Quantities associated with spacetime points or with the whole spacetime are not observable in this sense. In particular, global quantities, such as the total energy or electric charge, should be considered as the limit of quasi-locally-defined quantities. Thus, the idea of quasi-locality is not new in physics. Although in classical nongravitational physics this is not obligatory, we adopt this view in talking about energy-momentum and angular momentum even of classical matter fields in Minkowski spacetime. Originally, the introduction of these quasi-local quantities was motivated by the analogous gravitational quasi-local quantities [488, 492]. Since, however, many of the basic concepts and ideas behind the various gravitational quasi-local energy-momentum and angular momentum definitions can be understood from the analogous nongravitational quantities in Minkowski spacetime, we devote Section 2.2 to the discussion of them and their properties.

2.2.1 The definition of quasi-local quantities

To define the quasi-local conserved quantities in Minkowski spacetime, first observe that, for any Killing vector \({K^a} \in {\rm{K}}\), the 3-form \({\omega _{abc}}: = {K_e}{T^{ef}}{\varepsilon _{fabc}}\) is closed, and hence, by the triviality of the third de Rham cohomology group, \({H^3}({{\mathbb R}^4}) = 0\), it is exact: for some 2-form \(\cup {[{\bf{K}}]_{ab}}\) we have \({K_e}{T^{ef}}{\varepsilon _{fabc}} = 3{\nabla _{[a}} \cup {[{\bf{K}}]_{bc]}}\). Its dual, \({\vee ^{cd}}: = - {1 \over 2} \cup {[{\bf{K}}]_{ab}}{\varepsilon ^{abcd}}\), may be called a ‘superpotential’ for the conserved current 3-form \({\omega _{abc}}\). (However, note that while the superpotential for the gravitational energy-momentum expressions of Section 3 is a local function of the general field variables, the existence of this ‘superpotential’ is a consequence of the field equations and the Killing nature of the vector field \(K^a\). The existence of globally-defined superpotentials that are local functions of the field variables can be proven even without using the Poincaré lemma [535].) If \(\tilde \cup {[{\bf{K}}]_{ab}}\) is (the dual of) another superpotential for the same current \({\omega _{abc}}\), then by \({\nabla _{[a}}(\cup {[{\bf{K}}]_{bc]}} - \tilde \cup {[{\bf{K}}]_{bc]}}) = 0\) and \({H^2}({{\mathbb R}^4}) = 0\) the dual superpotential is unique up to the addition of an exact 2-form. If, therefore, \({\mathcal S}\) is any closed orientable spacelike two-surface in the Minkowski spacetime then the integral of \(\cup {[{\bf{K}}]_{ab}}\) on \({\mathcal S}\) is free from this ambiguity. Thus, if Σ is any smooth compact spacelike hypersurface with smooth two-boundary \({\mathcal S}\), then

$${Q_{\mathcal S}}[{\bf{K}}]: = {\textstyle{1 \over 2}}\oint\nolimits_{\mathcal S} {\cup {{[{\bf{K}}]}_{ab}}} = \int\nolimits_\Sigma {{K_e}{T^{ef}}{\textstyle{1 \over {3!}}}{\varepsilon _{f\;abc}}}$$
(2.5)

depends only on \({\mathcal S}\). Hence, it is independent of the actual Cauchy surface Σ of the domain of dependence D(Σ) because all the spacelike Cauchy surfaces for D(Σ) have the same common boundary \({\mathcal S}\). Thus, \({Q_{\mathcal S}}[{\bf{K}}]\) can equivalently be interpreted as being associated with the whole domain of dependence D(Σ), and, hence, it is quasi-local in the sense of [232, 231] above. It defines the linear maps \({P_{\mathcal S}}:{\rm{T}} \rightarrow {\mathbb R}\) and \({J_{\mathcal S}}:{{\rm{R}}_o} \rightarrow {\mathbb R}\) by \({Q_{\mathcal S}}[{\bf{K}}] =: {T_{\underline a}}P_{\mathcal S}^{\underline a} + {M_{\underline a \underline b}}J_{\mathcal S}^{\underline a \underline b}\), i.e., they are elements of the corresponding dual spaces. Under Lorentz rotations of the Cartesian coordinates \(P_{\mathcal S}^{\underline a}\) and \(J_{\mathcal S}^{\underline a \underline b}\) transform as a Lorentz vector and anti-symmetric tensor, respectively. Under the translation \({x^{\underline a}} \mapsto {x^{\underline a}} + {\eta ^{\underline a}}\) of the origin \(P_{\mathcal S}^{\underline a}\) is unchanged, but \(J_{\mathcal S}^{\underline a \underline b}\) transforms as \(J_{\mathcal S}^{\underline a \underline b} \mapsto J_{\mathcal S}^{\underline a \underline b} + 2{\eta ^{[\underline a}}P_{\mathcal S}^{\underline b ]}\). Thus, \(P_{\mathcal S}^{\underline a}\) and \(J_{\mathcal S}^{\underline a \underline b}\) may be interpreted as the quasi-local energy-momentum and angular momentum of the matter fields associated with the spacelike two-surface \({\mathcal S}\), or, equivalently, with D(Σ). Then the quasi-local mass and Pauli-Lubanski spin are defined, respectively, by the usual formulae \(m_{\mathcal S}^2: = {\eta _{\underline a \underline b}}P_{\mathcal S}^{\underline a}P_{\mathcal S}^{\underline b}\) and \(S_{\mathcal S}^{\underline a}: = {1 \over 2}{\varepsilon ^{\underline a}}_{\underline b \underline c \underline d}P_{\mathcal S}^{\underline b}J_{\mathcal S}^{\underline c \underline d}\). (If \(m_{\mathcal S}^2 \ne 0\), then the dimensionally-correct definition of the Pauli-Lubanski spin is \({1 \over {{m_{\mathcal S}}}}S_{\mathcal S}^{\underline a}\).) As a consequence of the definitions, \({\eta _{\underline a \underline b}}P_{\mathcal S}^{\underline a}S_{\mathcal S}^{\underline b} = 0\) holds, i.e., if \(P_{\mathcal S}^{\underline a}\) is timelike then \(S_{\mathcal S}^{\underline a}\) is spacelike or zero, but if \(P_{\mathcal S}^{\underline a}\) is null (i.e., \(m_{\mathcal S}^2 = 0\)) then \(S_{\mathcal S}^{\underline a}\) is spacelike or proportional to \(P_{\mathcal S}^{\underline a}\).
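The algebraic statements above can be checked numerically. The following is a minimal sketch in plain Python, not taken from the review: the component values of \(P^{\underline a}\), \(J^{\underline a \underline b}\) and of the shift \(\eta^{\underline a}\) are arbitrary test numbers, and the orientation \(\varepsilon_{0123} = +1\) is an assumption (the two checks do not depend on it). It computes the Pauli-Lubanski spin, verifies its orthogonality to the energy-momentum, and verifies that it is unchanged when the origin is translated.

```python
from itertools import permutations

# Minkowski metric eta = diag(1, -1, -1, -1) in Cartesian components.
ETA = [1.0, -1.0, -1.0, -1.0]


def levi_civita(indices):
    """Sign of the permutation `indices` of (0, 1, 2, 3); eps_{0123} = +1 assumed."""
    perm = list(indices)
    sign = 1
    for i in range(4):
        while perm[i] != i:
            j = perm[i]
            perm[i], perm[j] = perm[j], perm[i]
            sign = -sign
    return sign


def minkowski_dot(u, v):
    """eta_{ab} u^a v^b."""
    return sum(ETA[a] * u[a] * v[a] for a in range(4))


def pauli_lubanski(P, J):
    """S^a = (1/2) eps^a_{bcd} P^b J^{cd}; the first index is raised with eta."""
    S = [0.0] * 4
    for a, b, c, d in permutations(range(4)):
        S[a] += 0.5 * ETA[a] * levi_civita((a, b, c, d)) * P[b] * J[c][d]
    return S


def shift_origin(P, J, eta):
    """J^{ab} -> J^{ab} + eta^a P^b - eta^b P^a under a translation of the origin."""
    return [[J[a][b] + eta[a] * P[b] - eta[b] * P[a] for b in range(4)]
            for a in range(4)]


def antisymmetric(upper):
    """Build J^{ab} from its six independent components {(a, b): value}, a < b."""
    J = [[0.0] * 4 for _ in range(4)]
    for (a, b), value in upper.items():
        J[a][b] = value
        J[b][a] = -value
    return J


# Arbitrary illustrative components (hypothetical values, timelike P).
P = [5.0, 1.0, 2.0, 3.0]
J = antisymmetric({(0, 1): 1.0, (0, 2): -2.0, (0, 3): 0.5,
                   (1, 2): 3.0, (1, 3): -1.0, (2, 3): 4.0})
S = pauli_lubanski(P, J)
S_shifted = pauli_lubanski(P, shift_origin(P, J, [0.5, -1.0, 2.0, 7.0]))

print("eta_ab P^a S^b =", minkowski_dot(P, S))
print("S =", S, "| S after shifting the origin =", S_shifted)
```

Both checks follow from the total antisymmetry of the alternating tensor: the extra term generated by the translation is contracted with two copies of \(P^{\underline a}\) and therefore drops out.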

Obviously we can form the flux integral of the current Tabξb on the hypersurface even if ξa is not a Killing vector, even in general curved spacetime:

$${E_\Sigma}[{\xi ^a}]: = \int\nolimits_\Sigma {{\xi _e}{T^{e\;f}}{\textstyle{1 \over {3!}}}{\varepsilon _{f\;abc}}}.$$
(2.6)

Then, however, the integral \({E_\Sigma}[{\xi ^a}]\) does depend on the hypersurface, because it is not connected with the spacetime symmetries. In particular, the vector field \(\xi^a\) can be chosen to be the unit timelike normal \(t^a\) of Σ. Since the component \(\mu: = {T_{ab}}{t^a}{t^b}\) of the energy-momentum tensor is interpreted as the energy-density of the matter fields seen by the local observer \(t^a\), it would be legitimate to interpret the corresponding integral \({E_\Sigma}[{t^a}]\) as ‘the quasi-local energy of the matter fields seen by the fleet of observers being at rest with respect to Σ’. Thus, \({E_\Sigma}[{t^a}]\) defines a different concept of the quasi-local energy: While that based on \({Q_{\mathcal S}}[{\bf{K}}]\) is linked to some absolute element, namely to the translational Killing symmetries of the spacetime, and the constant timelike vector fields can be interpreted as the observers ‘measuring’ this energy, \({E_\Sigma}[{t^a}]\) is completely independent of any absolute element of the spacetime and is based exclusively on the arbitrarily chosen fleet of observers. Thus, while \(P_{\mathcal S}^{\underline a}\) is independent of the actual normal \(t^a\) of \({\mathcal S}\), \({E_\Sigma}[{\xi ^a}]\) (for non-Killing \(\xi^a\)) depends on \(t^a\) intrinsically and is a genuine three-hypersurface rather than a two-surface integral.

If \(P_b^a: = \delta _b^a - {t^a}{t_b}\), the orthogonal projection to Σ, then the part \({j^a}: = P_b^a{T^{bc}}{t_c}\) of the energy-momentum tensor is interpreted as the momentum density seen by the observer \(t^a\). Hence,

$$({t_a}{T^{ab}})({t_c}{T^{cd}}){g_{bd}} = {\mu ^2} + {h_{ab}}{j^a}{j^b} = {\mu ^2} - \vert {j^a}{\vert ^2}$$

is the square of the mass density of the matter fields, where hab is the spatial metric in the plane orthogonal to ta. If Tab satisfies the dominant energy condition (i.e., TabVb is a future directed nonspacelike vector for any future directed nonspacelike vector Va, see, e.g., [240]), then this is non-negative, and hence,

$${M_\Sigma}: = \int\nolimits_\Sigma {\sqrt {{\mu ^2} - \vert {j^e}{\vert ^2}} {\textstyle{1 \over {3!}}}{t^f}{\varepsilon _{f\;abc}}}$$
(2.7)

can also be interpreted as the quasi-local mass of the matter fields seen by the fleet of observers being at rest with respect to Σ, even in general curved spacetime. However, although in Minkowski spacetime EΣ[K] for the four translational Killing vectors gives the four components of the energy-momentum \(P_{\mathcal S}^{\underline a}\), the mass MΣ is different from \({m_{\mathcal S}}\). In fact, while \({m_{\mathcal S}}\) is defined as the Lorentzian norm of \(P_{\mathcal S}^{\underline a}\) with respect to the metric on the space of the translations, in the definition of MΣ the norm of the current Tabtb is first taken with respect to the pointwise physical metric of the space-time, and then its integral is taken. Nevertheless, because of more advantageous properties (see Section 2.2.3), we prefer to represent the quasi-local energy(-momentum and angular momentum) of the matter fields in the form \({Q_{\mathcal S}}[{\bf{K}}]\) instead of EΣ[ξa].
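The difference between \({m_{\mathcal S}}\) and \({M_\Sigma}\) can be made concrete with a toy configuration. The sketch below is an illustration under assumptions that are not in the text: Σ is taken to be a piece of a t = const hyperplane of Minkowski spacetime, it is discretized into two cells with hypothetical volumes, and each cell carries null dust \({T^{ab}} = \tau {L^a}{L^b}\) moving in opposite directions. The energy-momentum is approximated by the sum of \({T^{ab}}{t_b}\) over the cells, and \({M_\Sigma}\) by the sum of the pointwise Lorentzian norm of the same current.

```python
import math

# Minkowski metric eta = diag(1, -1, -1, -1); t_a is the lowered unit normal of t = const.
ETA = [1.0, -1.0, -1.0, -1.0]
T_LOWER = [1.0, 0.0, 0.0, 0.0]


def null_dust(tau, L):
    """Energy-momentum tensor T^{ab} = tau L^a L^b of pure radiation along null L^a."""
    return [[tau * L[a] * L[b] for b in range(4)] for a in range(4)]


def flux(T):
    """The current T^{ab} t_b seen by the observers at rest with respect to Sigma."""
    return [sum(T[a][b] * T_LOWER[b] for b in range(4)) for a in range(4)]


def energy_momentum(cells):
    """P^a: integrate the current first (here: sum over cells of T^{ab} t_b dV)."""
    P = [0.0] * 4
    for T, volume in cells:
        current = flux(T)
        for a in range(4):
            P[a] += current[a] * volume
    return P


def mass_sq(P):
    """m^2 = eta_ab P^a P^b, the Lorentzian norm taken after integration."""
    return sum(ETA[a] * P[a] * P[a] for a in range(4))


def integrated_mass_density(cells):
    """M_Sigma: take the pointwise norm sqrt(mu^2 - |j|^2) first, then integrate."""
    total = 0.0
    for T, volume in cells:
        current = flux(T)
        norm_sq = sum(ETA[a] * current[a] ** 2 for a in range(4))
        total += math.sqrt(max(norm_sq, 0.0)) * volume
    return total


# Two cells (hypothetical densities and volumes) with oppositely directed null dust.
L_RIGHT = [1.0, 1.0, 0.0, 0.0]
L_LEFT = [1.0, -1.0, 0.0, 0.0]
cells = [(null_dust(2.0, L_RIGHT), 0.5), (null_dust(3.0, L_LEFT), 0.25)]

P = energy_momentum(cells)
print("P^a =", P)
print("m_S^2 =", mass_sq(P))
print("M_Sigma =", integrated_mass_density(cells))
```

In this configuration the current is null in every cell, so the integrated mass density vanishes, while the total energy-momentum is the sum of two non-parallel future-directed null vectors and is therefore timelike. The order of taking the norm and integrating is precisely what distinguishes the two notions of mass.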

Thus, even if there is a gauge-invariant and unambiguously-defined energy-momentum density of the matter fields, it is not a priori clear how the various quasi-local quantities should be introduced. We will see in the second part of this review that there are specific suggestions for the gravitational quasi-local energy that are analogous to \(P_{\mathcal S}^0\), others to EΣ[ta], and some to MΣ.

2.2.2 Hamiltonian introduction of the quasi-local quantities

In the standard Hamiltonian formulation of the dynamics of the classical matter fields on a given (not necessarily flat) spacetime (see, e.g., [283, 558] and references therein) the configuration and momentum variables, ϕA and πA, respectively, are fields on a connected three-manifold Σ, which is interpreted as the typical leaf of a foliation Σt of the spacetime. The foliation can be characterized on Σ by a function N, called the lapse. The evolution of the states in the spacetime is described with respect to a vector field Ka = Nta + Na (‘evolution vector field’ or ‘general time axis’), where ta is the future-directed unit normal to the leaves of the foliation and Na is some vector field, called the shift, being tangent to the leaves. If the matter fields have gauge freedom, then the dynamics of the system is constrained: Physical states can be only those that are on the constraint surface, specified by the vanishing of certain functions Ci = Ci(ϕA, DeϕA,…, πA, DeπA,…), i = 1,…, n, of the canonical variables and their derivatives up to some finite order, where De is the covariant derivative operator in Σ. Then the time evolution of the states in the phase space is governed by the Hamiltonian, which has the form

$$H\;[{\bf{K}}] = \int\nolimits_\Sigma {(\mu N + {j_a}{N^a} + {C_i}{N^i} + {D_a}{Z^a})} \;d\Sigma.$$
(2.8)

Here dΣ is the induced volume element, the coefficients μ and \(j_a\) are local functions of the canonical variables and their derivatives up to some finite order, the \(N^i\) are functions on Σ, and \(Z^a\) is a local function of the canonical variables and is a linear function of the lapse, the shift, the functions \(N^i\), and their derivatives up to some finite order. The part \({C_i}{N^i}\) of the Hamiltonian generates gauge motions in the phase space, and the functions \(N^i\) are interpreted as the freely specifiable ‘gauge generators’.
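For orientation, the structure of (2.8) can be sketched for the simplest case; this example is an assumption-laden illustration rather than a result quoted from the literature cited above. Take a single Klein-Gordon field Φ with \({L_{\rm{m}}} = {1 \over 2}{g^{ab}}({\nabla _a}\Phi)({\nabla _b}\Phi) - {1 \over 2}{m^2}{\Phi ^2}\), for which there is no gauge freedom (hence no constraints \(C_i\)), choose \({Z^a} = 0\), and write \(\pi: = {t^a}{\nabla _a}\Phi\) and ϕ for the restriction of Φ to Σ, suppressing density weights. Contracting the energy-momentum tensor \({T_{ab}} = ({\nabla _a}\Phi)({\nabla _b}\Phi) - {L_{\rm{m}}}{g_{ab}}\) with \({K^a}{t^b}\) and using \({g^{ab}} = {t^a}{t^b} + {h^{ab}}\) gives

$${T_{ab}}{K^a}{t^b} = \mu N + {j_a}{N^a},\qquad \mu = {1 \over 2}\left({{\pi ^2} - {h^{ab}}({D_a}\phi)({D_b}\phi) + {m^2}{\phi ^2}} \right),\qquad {j_a} = \pi {D_a}\phi.$$

Since \(h^{ab}\) is negative definite in the signature −2, μ is a sum of non-negative terms, and the lapse and the shift multiply the energy density and the momentum density, respectively, as in (2.8).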

However, if we want to recover the field equations for \(\phi^A\) (which are partial differential equations on the spacetime with smooth coefficients for the smooth field \(\phi^A\)) on the phase space as the Hamilton equations and not some of their distributional generalizations, then the functional differentiability of H[K] must be required in the strong sense of [534] (see Footnote 1). Nevertheless, the functional differentiability (and, in the asymptotically flat case, also the existence) of H[K] requires some boundary conditions on the field variables, and may yield restrictions on the form of \(Z^a\). It may happen that, for a given \(Z^a\), only too restrictive boundary conditions would be able to ensure the functional differentiability of the Hamiltonian, and, hence, the ‘quasi-local phase space’ defined with these boundary conditions would contain only very few (or no) solutions of the field equations. In this case, \(Z^a\) should be modified. In fact, the boundary conditions are connected to the nature of the physical situations considered. For example, in electrodynamics different boundary conditions must be imposed if the boundary is to represent a conducting or an insulating surface. Unfortunately, no universal principle or ‘canonical’ way of finding the ‘correct’ boundary term and the boundary conditions is known.

In the asymptotically flat case, the value of the Hamiltonian on the constraint surface defines the total energy-momentum and angular momentum, depending on the nature of Ka, in which the total divergence DaZa corresponds to the ambiguity of the superpotential 2-form ⋃[K]ab: An identically-conserved quantity can always be added to the Hamiltonian (provided its functional differentiability is preserved). The energy density and the momentum density of the matter fields can be recovered as the functional derivative of H[K] with respect to the lapse N and the shift Na, respectively. In principle, the whole analysis can be repeated quasi-locally too. However, apart from the promising achievements of [13, 14, 442] for the Klein-Gordon, Maxwell, and the Yang-Mills-Higgs fields, as far as we know, such a systematic quasi-local Hamiltonian analysis of the matter fields is still lacking.

2.2.3 Properties of the quasi-local quantities

Suppose that the matter fields satisfy the dominant energy condition. Then EΣ[ξa] is also non-negative for any nonspacelike ξa, and, obviously, EΣ[ta] is zero precisely when Tab = 0 on Σ, and hence, by the conservation laws (see, e.g., page 94 of [240]), on the whole domain of dependence D(Σ). Obviously, MΣ = 0 if and only if \({L^a}: = {T^{ab}}{t_b}\) is null on Σ. Then, by the dominant energy condition it is a future-pointing vector field on Σ, and LaTab = 0 holds. Therefore, Tab on Σ has a null eigenvector with zero eigenvalue, i.e., its algebraic type on Σ is pure radiation.

The properties of the quasi-local quantities based on \({Q_{\mathcal S}}[{\bf{K}}]\) in Minkowski spacetime are, however, more interesting. Namely, assuming that the dominant energy condition is satisfied, one can prove [488, 492] that

  1. \(P_{\mathcal S}^{\underline a}\) is a future directed nonspacelike vector, \(m_{\mathcal S}^2 \geq 0\);

  2. \(P_{\mathcal S}^{\underline a} = 0\) if and only if Tab = 0 on D(Σ);

  3. \(m_{\mathcal S}^2 = 0\) if and only if the algebraic type of the matter on D(Σ) is pure radiation, i.e., TabLb = 0 holds for some constant null vector La. Then Tab = τLaLb for some non-negative function τ. In this case \(P_{\mathcal S}^{\underline a} = e{L^{\underline a}}\), where \({L^{\underline a}}: = {L^a}\vartheta _a^{\underline a}\);

  4. For \(m_{\mathcal S}^2 = 0\) the angular momentum has the form \(J_{\mathcal S}^{\underline a \underline b} = {e^{\underline a}}{L^{\underline b}} - {e^{\underline b}}{L^{\underline a}}\), where \({e^{\underline a}}: = \int\nolimits_\Sigma {{x^{\underline a}}} \tau {L^a}{1 \over {3!}}{\varepsilon _{abcd}}\). Thus, in particular, the Pauli-Lubanski spin is zero.

Therefore, the vanishing of the quasi-local energy-momentum characterizes the ‘vacuum state’ of the classical matter fields completely, and the vanishing of the quasi-local mass is equivalent to special configurations representing pure radiation.
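Properties 1 and 3 can be illustrated with a small numerical sketch. It is not part of the cited proofs; it only assumes Minkowski spacetime in Cartesian coordinates with signature (+, −, −, −), ta = (1, 0, 0, 0), and a hypothetical discretization of Σ into volume cells carrying Tab = τLaLb. With a single constant null La the resulting energy-momentum is proportional to La and hence null, while mixing two different null directions gives a strictly positive mass squared.

```python
def minkowski_norm_sq(p):
    """p_a p^a with signature (+, -, -, -)."""
    return p[0] ** 2 - p[1] ** 2 - p[2] ** 2 - p[3] ** 2


def energy_momentum(cells):
    """Discrete stand-in for P^a = integral over Sigma of T^{ab} t_b.

    Each cell is (tau, L, volume) with T^{ab} = tau L^a L^b in that cell
    and t = (1, 0, 0, 0), so that L.t = L^0. The cell data are
    illustrative assumptions, not taken from the text.
    """
    P = [0.0, 0.0, 0.0, 0.0]
    for tau, L, volume in cells:
        weight = tau * L[0] * volume
        for a in range(4):
            P[a] += weight * L[a]
    return P


L1 = (1.0, 0.0, 0.0, 1.0)  # null vector along z
L2 = (1.0, 1.0, 0.0, 0.0)  # null vector along x

# Pure radiation: the same constant null direction in every cell.
pure = [(0.5, L1, 1.0), (2.0, L1, 0.25), (1.5, L1, 2.0)]
# Not pure radiation: a second null direction is present.
mixed = pure + [(1.0, L2, 1.0)]

P_pure = energy_momentum(pure)
P_mixed = energy_momentum(mixed)
m2_pure = minkowski_norm_sq(P_pure)
m2_mixed = minkowski_norm_sq(P_mixed)
print(P_pure, m2_pure)
print(P_mixed, m2_mixed)
```

In both cases the energy component is positive, in line with property 1; only the single-direction configuration has vanishing mass.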

Since EΣ[ta] and MΣ are integrals of functions on a hypersurface, they are obviously additive, e.g., for any two hypersurfaces Σ1 and Σ2 (having common points at most on their boundaries \({{\mathcal S}_1}\) and \({{\mathcal S}_2}\)) one has \({E_{{\Sigma _1} \cup {\Sigma _2}}}[{t^a}] = {E_{{\Sigma _1}}}[{t^a}] + {E_{{\Sigma _2}}}[{t^a}]\). On the other hand, the additivity of \(P_{\mathcal S}^{\underline a}\) is a slightly more delicate problem. Namely, \(P_{{{\mathcal S}_1}}^{\underline a}\) and \(P_{{{\mathcal S}_2}}^{\underline a}\) are elements of the dual space of the translations, and hence, we can add them and, as in the previous case, we obtain additivity. However, this additivity comes from the absolute parallelism of the Minkowski spacetime: The quasi-local energy-momenta of the different two-surfaces belong to one and the same vector space. If there were no natural connection between the Killing vectors on different two-surfaces, then the energy-momenta would belong to different vector spaces, and they could not be added. We will see that the quasi-local quantities discussed in Sections 7, 8, and 9 belong to vector spaces dual to their own ‘quasi-Killing vectors’, and there is no natural way of adding the energy-momenta of different surfaces.

2.2.4 Global energy-momenta and angular momenta

If Σ extends either to spatial or future null infinity, then, as is well known, the existence of the limit of the quasi-local energy-momentum can be ensured by slightly faster than \({\mathcal O}({r^{- 3}})\) (for example by \({\mathcal O}({r^{- 4}})\)) falloff of the energy-momentum tensor, where r is any spatial radial distance. However, the finiteness of the angular momentum and center-of-mass is not ensured by the \({\mathcal O}({r^{- 4}})\) falloff. Since the typical falloff of Tab (for the electromagnetic field, for example) is \({\mathcal O}({r^{- 4}})\), we may not impose faster than this, because otherwise we would exclude the electromagnetic field from our investigations. Thus, in addition to the \({\mathcal O}({r^{- 4}})\) falloff, six global integral conditions for the leading terms of Tab must be imposed. At spatial infinity these integral conditions can be ensured by explicit parity conditions, and one can show that the ‘conservation equations’ \({T^{ab}}_{;b} = 0\) (as evolution equations for the energy density and momentum density) preserve these falloff and parity conditions [497].
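The power counting behind these falloff conditions can be made explicit with a crude radial estimate. The sketch below assumes a pure power-law density r⁻ⁿ outside r = 1 and ignores all angular dependence, so it only tracks the radial integrands: r²⁻ⁿ for the energy-momentum and r³⁻ⁿ for the angular momentum and center-of-mass, which carry one extra factor of the Cartesian coordinate. The function names are illustrative.

```python
import math


def radial_integral(power, r_max, r_min=1.0):
    """Closed form of the integral of r**power dr from r_min to r_max."""
    if power == -1:
        return math.log(r_max / r_min)
    return (r_max ** (power + 1) - r_min ** (power + 1)) / (power + 1)


def energy_like(n, r_max):
    """Volume integral of a density ~ r**(-n): radial integrand r**(2 - n)."""
    return radial_integral(2 - n, r_max)


def angular_momentum_like(n, r_max):
    """Same density weighted by one extra factor of r (the moment arm)."""
    return radial_integral(3 - n, r_max)


for r_max in (1e2, 1e4, 1e6):
    print(r_max, energy_like(4, r_max), angular_momentum_like(4, r_max))
```

For n = 4 the first integral saturates while the second grows like ln r. This logarithmic growth of the leading term is what the additional integral (or parity) conditions are there to remove. For n = 3 the energy-like integral itself grows logarithmically, which is why a falloff slightly faster than r⁻³ is needed.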

Although quasi-locally the vanishing of the mass does not imply the vanishing of the matter fields themselves (the matter fields must be pure radiative field configurations with plane wave fronts), the vanishing of the total mass alone does imply the vanishing of the fields. In fact, by the vanishing of the mass, the fields must be plane waves; furthermore, by \({T_{ab}} = {\mathcal O}({r^{- 4}})\), they must be asymptotically vanishing at the same time. However, a plane-wave configuration can be asymptotically vanishing only if it is vanishing.

2.2.5 Quasi-local radiative modes and a classical version of the holography for matter fields

By the results of Section 2.2.3, the vanishing of the quasi-local mass, associated with a closed spacelike two-surface \({\mathcal S}\), implies that the matter must be pure radiation on a four-dimensional globally hyperbolic domain D(Σ). Thus, \({m_{\mathcal S}} = 0\) characterizes ‘simple’, ‘elementary’ states of the matter fields. In the present section we review how these states on D(Σ) can be characterized completely by data on the two-surface \({\mathcal S}\), and how these states can be used to formulate a classical version of the holographic principle.

For the (real or complex) linear massless scalar field ϕ and the Yang-Mills fields, represented by the symmetric spinor fields \(\phi _{AB}^\alpha, \alpha = 1, \ldots, N\), where N is the dimension of the gauge group, the vanishing of the quasi-local mass is equivalent [498] to plane waves and the pp-wave solutions of Coleman [152], respectively. Then, the condition TabLb = 0 implies that these fields are completely determined on the whole D(Σ) by their value on \({\mathcal S}\) (in which case the spinor fields \(\phi _{AB}^\alpha\) are necessarily null: \(\phi _{AB}^\alpha = {\phi ^\alpha}{O_A}{O_B}\), where \({\phi ^\alpha}\) are complex functions and \({O_A}\) is a constant spinor field such that \({L_a} = {O_A}{\bar O_{A{\prime}}}\)). Similarly, the null linear zero-rest-mass fields \({\phi _{AB \ldots E}} = \phi {O_A}{O_B} \cdots {O_E}\) on D(Σ) with any spin and constant spinor \({O_A}\) are completely determined by their value on \({\mathcal S}\). Technically, these results are based on the unique complex analytic structure of the u = const. two-surfaces foliating Σ, where \({L_a} = {\nabla _a}u\), and, by the field equations, the complex functions ϕ and \({\phi ^\alpha}\) turn out to be antiholomorphic [492]. Assuming, for the sake of simplicity, that \({\mathcal S}\) is future and past convex in the sense of Section 4.1.3 below, the independent boundary data for such a pure radiative solution consist of a constant spinor field on \({\mathcal S}\) and a real function with one, and another with two, variables. Therefore, the pure radiative modes on D(Σ) can be characterized completely by appropriate data (the holographic data) on the ‘screen’ \({\mathcal S}\).

These ‘quasi-local radiative modes’ can be used to map any continuous spinor field on D(Σ) to a collection of holographic data. Indeed, the special radiative solutions of the form ϕOA (with fixed constant-spinor field OA), together with their complex conjugate, define a dense subspace in the space of all continuous spinor fields on Σ. Thus, every such spinor field can be expanded by the special radiative solutions, and hence, can also be represented by the corresponding family of holographic data. Therefore, if we fix a foliation of D(Σ) by spacelike Cauchy surfaces Σt, then every spinor field on D(Σ) can also be represented on \({\mathcal S}\) by a time-dependent family of holographic data, as well [498]. This fact may be a specific manifestation in classical nongravitational physics of the holographic principle (see Section 13.4.2).

3 On the Energy-Momentum and Angular Momentum of Gravitating Systems

3.1 On the gravitational energy-momentum and angular momentum density: The difficulties

3.1.1 The root of the difficulties: Gravitational energy in Newton’s theory

In Newton’s theory the gravitational field is represented by a single scalar field ϕ on the flat 3-space Σ ≈ ℝ³ satisfying the Poisson equation −habDaDbϕ = 4πGρ. (Here hab is the flat (negative definite) metric, Da is the corresponding Levi-Civita covariant derivative operator and ρ is the (non-negative) mass density of the matter source.) Hence, the mass of the source contained in some finite three-volume D ⊂ Σ can be expressed as the flux integral of the gravitational field strength on the boundary \({\mathcal S}: = \partial D\):

$${m_D} = {1 \over {4\pi G}}\oint\nolimits_{\mathcal S} {{\upsilon ^a}({D_a}\phi)\;d{\mathcal S}},$$
(3.1)

where va is the outward-directed unit normal to \({\mathcal S}\). If \({\mathcal S}\) is deformed in Σ through a source-free region, then the mass does not change. Thus, the rest mass of the source is analogous to charge in electrostatics. Following the analogy with electrostatics, we can introduce the energy density and the spatial stress of the gravitational field, respectively, by

$$\begin{array}{*{20}c}{U: = {1 \over {8\pi G}}{h^{cd}}({D_c}\phi)\,({D_d}\phi),} & {{\Sigma _{ab}}: = {1 \over {4\pi G}}\left({({D_a}\phi)\,({D_b}\phi) - {1 \over 2}{h_{ab}}{h^{cd}}({D_c}\phi)\,({D_d}\phi)} \right).} \\ \end{array}$$
(3.2)

Note that since gravitation is always attractive, U is a binding energy, and hence it is negative definite. However, by the Galileo-Eötvös experiment, i.e., the principle of equivalence, there is an ambiguity in the gravitational force: It is determined only up to an additive constant covector field ae, and hence by an appropriate transformation \({D_e}\phi \mapsto {D_e}\phi + {a_e}\) the gravitational force Deϕ at a given point p ∈ Σ can be made zero. Thus, at this point both the gravitational energy density and the spatial stress have been made vanishing. On the other hand, they can be made vanishing on an open subset U ⊂ Σ only if the tidal force, DaDbϕ, is vanishing on U. Therefore, the gravitational energy and the spatial stress cannot be localized to a point, i.e., they suffer from the ambiguity in the gravitational force above.
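Both statements, the invariance of the flux integral (3.1) under deformations of the surface through source-free regions and the non-localizability of U, can be checked numerically in the simplest case. The sketch below assumes a point mass M at the origin with ϕ = −GM/r, units with G = 1, and a plain midpoint quadrature on coordinate spheres; the function names and the sample points are illustrative choices only.

```python
import math

G = 1.0  # units with G = 1 (illustrative choice)
M = 2.0  # point mass at the origin


def grad_phi(p):
    """Gradient D_a(phi) of phi = -G*M/r at the point p = (x, y, z)."""
    r = math.sqrt(sum(c * c for c in p))
    return tuple(G * M * c / r ** 3 for c in p)


def mass_from_flux(center, radius, n_theta=120, n_phi=240):
    """(1 / 4 pi G) times the flux of grad(phi) through a sphere, Eq. (3.1)."""
    d_th, d_ph = math.pi / n_theta, 2.0 * math.pi / n_phi
    total = 0.0
    for i in range(n_theta):
        th = (i + 0.5) * d_th
        for j in range(n_phi):
            ph = (j + 0.5) * d_ph
            n = (math.sin(th) * math.cos(ph),
                 math.sin(th) * math.sin(ph),
                 math.cos(th))
            p = tuple(center[a] + radius * n[a] for a in range(3))
            g = grad_phi(p)
            flux_density = sum(g[a] * n[a] for a in range(3))
            total += flux_density * radius ** 2 * math.sin(th) * d_th * d_ph
    return total / (4.0 * math.pi * G)


def energy_density(p, shift=(0.0, 0.0, 0.0)):
    """U of Eq. (3.2) for the shifted force D(phi) + a.

    The overall minus sign comes from the negative definite metric h_ab.
    """
    g = grad_phi(p)
    return -sum((g[a] + shift[a]) ** 2 for a in range(3)) / (8.0 * math.pi * G)


# Two different surfaces enclosing the source give the same mass.
m_small = mass_from_flux((0.3, 0.0, 0.0), 1.0)
m_large = mass_from_flux((0.3, -0.2, 0.1), 3.0)

# A constant covector a_e cancels the force at p, but not at a nearby point.
p = (1.0, 0.0, 0.0)
a = tuple(-c for c in grad_phi(p))
U_before = energy_density(p)
U_at_p = energy_density(p, a)
U_nearby = energy_density((1.1, 0.0, 0.0), a)
print(m_small, m_large, U_before, U_at_p, U_nearby)
```

The residual U_nearby is controlled by the tidal term DaDbϕ, which the constant shift cannot remove.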

In a relativistically corrected Newtonian theory both the internal energy density u of the (matter) source and the energy density U of the gravitational field itself contribute to the source of gravity. Thus (in the traditional units, when c is the speed of light) the corrected field equation could be expected to be the genuinely non-linear equation

$$- {h^{ab}}{D_a}{D_b}\phi = 4\pi G\left({\rho + {1 \over {{c^2}}}\left({u + U} \right)} \right).$$
(3.3)

(Note that, together with additional corrections, this equation with the correct sign of U can be recovered from Einstein’s equations applied to static configurations [199] in the first post-Newtonian approximation. Note, however, that the theory defined by (3.3) and the usual formula for the force density, is internally inconsistent [221]. A thorough analysis of this theory, and in particular its inconsistency, is given by Giulini [221].) Therefore, by (3.3)

$${E_D}: = \int\nolimits_D {({c^2}\rho + u + U){\rm{d}}\Sigma} = {{{c^2}} \over {4\pi G}}\oint\nolimits_{\mathcal S} {{\upsilon ^a}({D_a}\phi)\;\,d{\mathcal S}},$$
(3.4)

i.e., now it is the energy of the source plus gravity system in the domain D that can be rewritten into the form of a two-surface integral on the boundary of the domain D. Note that the gravitational energy reduces the source term in (3.3) (and hence the energy ED also), and, more importantly, the quasi-local energy ED of the source + gravity system is free of the ambiguity that is present in the gravitational energy density. This in itself already justifies the introduction and use of the quasi-local concept of energy in the study of gravitating systems.

By the negative definiteness of U, outside the source the quasi-local energy ED is a decreasing set function, i.e., if \({D_1} \subset {D_2}\) and \({D_2} - {D_1}\) is source free, then \({E_{{D_2}}} \leq {E_{{D_1}}}\). In particular, for a 2-sphere of radius r surrounding a localized spherically symmetric homogeneous source with negligible internal energy, the quasi-local energy is \({E_{{D_r}}} = {{{c^4}} \over G}{\rm{m}}(1 + {1 \over 2}{{\rm{m}} \over r}) + O({r^{- 2}})\), where the mass parameter is \({\rm{m: =}}{{GM} \over {{c^2}}}(1 - {3 \over 5}{{GM} \over {{c^2}R}}) + O({c^{- 6}})\) and M is the rest mass and R is the radius of the source. For a more detailed discussion of the energy in the (relativistically corrected) Newtonian theory, see [199].
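The monotonicity and the quoted expansion can be explored numerically. The sketch below works in units G = c = 1 and drops the stated remainder terms. It also uses a closed-form exterior expression, E = m/(1 − m/2r), which is not given in the text. It is an assumption of this sketch, obtained by noting that outside the source Eq. (3.3) makes exp(ϕ/2c²) harmonic, and its large-r expansion agrees with the quoted formula up to O(r⁻²).

```python
def mass_parameter(M, R):
    """m = M (1 - 3M/(5R)) in units G = c = 1, without the O(c^-6) remainder."""
    return M * (1.0 - 0.6 * M / R)


def energy_series(m, r):
    """Quoted expansion E = m (1 + m/(2r)), without the O(r^-2) remainder."""
    return m * (1.0 + 0.5 * m / r)


def energy_closed_form(m, r):
    """Exterior closed form m / (1 - m/(2r)); an assumption of this sketch."""
    return m / (1.0 - 0.5 * m / r)


m = mass_parameter(M=1.0, R=10.0)
radii = [10.0, 20.0, 50.0, 100.0, 1000.0]
series = [energy_series(m, r) for r in radii]
closed = [energy_closed_form(m, r) for r in radii]

for r, e_s, e_c in zip(radii, series, closed):
    # (e_c - e_s) * r**2 should approach m**3 / 4 for large r.
    print(r, e_s, e_c, (e_c - e_s) * r ** 2)
```

Both sequences decrease towards m as r grows, as expected for a decreasing set function, and their difference scales as r⁻².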

3.1.2 The root of the difficulties: Gravitational energy-momentum in Einstein’s theory

The action Im for the matter fields is a functional of both kinds of fields, thus one can take the variational derivatives both with respect to \({\Phi _N}_{b \ldots}^{a \ldots}\) and \({g^{ab}}\). The former give the field equations, while the latter define the symmetric energy-momentum tensor. Moreover, gab provides a metrical geometric background, in particular a covariant derivative, for carrying out the analysis of the matter fields. The gravitational action Ig is, on the other hand, a functional of the metric alone, and its variational derivative with respect to gab yields the gravitational field equations. The lack of any further geometric background for describing the dynamics of gab can be traced back to the principle of equivalence [36] (i.e., the Galileo-Eötvös experiment), and introduces a huge gauge freedom in the dynamics of gab because that should be formulated on a bare manifold: The physical spacetime is not simply a manifold M endowed with a Lorentzian metric gab, but the isomorphism class of such pairs, where (M, gab) and (M, ϕ*gab) are considered to be equivalent for any diffeomorphism ϕ of M onto itself.Footnote 2 Thus, we do not have, even in principle, any gravitational analog of the symmetric energy-momentum tensor of the matter fields. In fact, by its very definition, Tab is the source density for gravity, like the current \(J_A^a: = \delta {I_p}/\delta A_a^A\) in Yang-Mills theories (defined by the variational derivative of the action functional of the particles, e.g., of the fermions, interacting with a Yang-Mills field \(A_a^A\)), rather than energy-momentum. The latter is represented by the Noether currents associated with special spacetime displacements. Thus, in spite of the intimate relation between Tab and the Noether currents, the proper interpretation of Tab is only the source density for gravity, and hence it is not the symmetric energy-momentum tensor whose gravitational counterpart must be searched for. 
In particular, the Bel-Robinson tensor \({T_{abcd}}: = {\psi _{ABCD}}{{\bar \psi}_{{A{\prime}}{B{\prime}}{C{\prime}}{D{\prime}}}}\), given in terms of the Weyl spinor, (and its generalizations introduced by Senovilla [449, 448]), being a quadratic expression of the curvature (and its derivatives), is (are) expected to represent only ‘higher-order’ gravitational energy-momentum. (Note that according to the original tensorial definition the Bel-Robinson tensor is one-fourth of the expression above. Our convention follows that of Penrose and Rindler [425].) In fact, the physical dimension of the Bel-Robinson ‘energy-density’ \({T_{abcd}}{t^a}{t^b}{t^c}{t^d}\) is \({\rm{cm}^{- 4}}\), and hence (in the traditional units) there are no powers A and B such that \({c^A}{G^B}{T_{abcd}}{t^a}{t^b}{t^c}{t^d}\) would have energy-density dimension. As we will see, the Bel-Robinson ‘energy-momentum density’ \({T_{abcd}}{t^b}{t^c}{t^d}\) appears naturally in connection with the quasi-local energy-momentum and spin angular momentum expressions for small spheres only in higher-order terms. Therefore, if we want to associate energy-momentum and angular momentum with the gravity itself in a Lagrangian framework, then it is the gravitational counterpart of the canonical energy-momentum and spin tensors and the canonical Noether current built from them that should be introduced. Hence it seems natural to apply the Lagrange-Belinfante-Rosenfeld procedure, sketched in the previous Section 2.1, to gravity too [73, 74, 438, 259, 260, 486].
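The dimensional argument for the Bel-Robinson ‘energy-density’ can be spelled out as a small linear problem for the exponents of length, mass and time in CGS units. In the sketch below the mass and time exponents fix A and B uniquely, and the leftover length exponent shows whether the match is possible. The tuple encoding and the function name are illustrative choices.

```python
from fractions import Fraction

# Exponents of (length, mass, time) in CGS units.
DIM_C = (1, 0, -1)                # speed of light: cm s^-1
DIM_G = (3, -1, -2)               # Newton's constant: cm^3 g^-1 s^-2
DIM_BEL_ROBINSON = (-4, 0, 0)     # T_abcd t^a t^b t^c t^d: cm^-4
DIM_CURVATURE = (-2, 0, 0)        # curvature itself: cm^-2
DIM_ENERGY_DENSITY = (-1, 1, -2)  # erg cm^-3 = g cm^-1 s^-2


def solve_powers(source, target):
    """Try to solve c^A G^B * source = target dimensionally.

    B follows from the mass exponent (only G carries mass), A from the
    time exponent. The third returned value is the length exponent
    that is still missing; it vanishes only if a solution exists.
    """
    B = Fraction(target[1] - source[1], DIM_G[1])
    A = Fraction(target[2] - source[2] - B * DIM_G[2], DIM_C[2])
    leftover = target[0] - (source[0] + A * DIM_C[0] + B * DIM_G[0])
    return A, B, leftover


print(solve_powers(DIM_BEL_ROBINSON, DIM_ENERGY_DENSITY))
print(solve_powers(DIM_CURVATURE, DIM_ENERGY_DENSITY))
```

For the Bel-Robinson expression the mismatch is two powers of length, whereas a quantity linear in the curvature can be converted with c⁴/G, the familiar factor in Einstein's equations.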

3.1.3 Pseudotensors

The lack of any background geometric structure in the gravitational action yields, first, that any vector field Ka generates a symmetry of the matter-plus-gravity system. Its second consequence is the need for an auxiliary derivative operator, e.g., the Levi-Civita covariant derivative coming from an auxiliary, nondynamic background metric (see, e.g., [307, 430]), or a background (usually torsion free, but not necessarily flat) connection (see, e.g., [287]), or the partial derivative coming from a local coordinate system (see, e.g., [525]). Though the natural expectation would be that the final results be independent of these background structures, as is well known, the results do depend on them.

In particular [486], for Hilbert’s second-order Lagrangian \({L_H}: = R/16\pi G\) in a fixed local coordinate system {xα} and derivative operator ∂μ instead of ∇e, Eq. (2.4) gives precisely Møller’s energy-momentum pseudotensor \({}_{\rm{M}}{\theta ^\alpha}_\beta\), which was defined originally through the superpotential equation \(\sqrt {\vert g\vert} (8\pi G\,{}_{\rm{M}}{\theta ^\alpha}_\beta - {G^\alpha}_\beta): = {\partial _\mu}\,{}_{\rm{M}}{\cup _\beta}^{\alpha \mu}\), where \({}_{\rm{M}}{\cup _\beta}^{\alpha \mu}: = \sqrt {\vert g\vert} {g^{\alpha \rho}}{g^{\mu \omega}}({\partial _{[\omega}}{g_{\rho ]\beta}})\) is the Møller superpotential [367]. (For another simple and natural introduction of Møller’s energy-momentum pseudotensor, see [131].) For the spin pseudotensor, Eq. (2.2) gives

$$8\pi G\sqrt {\vert g\vert} \,{}_{\rm{M}}{\sigma ^{\mu \alpha}}_\beta = - {}_{\rm{M}}{\cup _\beta}^{\alpha \mu} + {\partial _\nu}\left({\sqrt {\vert g\vert} \delta _\beta ^{[\mu}{g^{\nu ]\alpha}}} \right),$$

which is, in fact, only pseudotensorial. Similarly, the contravariant form of these pseudotensors and the corresponding canonical Noether current are also pseudotensorial. We saw in Section 2.1.2 that a specific combination of the canonical energy-momentum and spin tensors gave the symmetric energy-momentum tensor, which is gauge invariant even if the matter fields have gauge freedom, and one might hope that the analogous combination of the energy-momentum and spin pseudotensors gives a reasonable tensorial energy-momentum density for the gravitational field. The analogous expression is, in fact, tensorial, but unfortunately it is just the negative of the Einstein tensor [486, 487].Footnote 3 Therefore, to use the pseudotensors, a ‘natural’ choice for a ‘preferred’ coordinate system would be needed. This could be interpreted as a gauge choice, or a choice for the reference configuration.

A further difficulty is that the different pseudotensors may have different (potential) significance. For example, for any fixed \(k \in {\mathbb R}\) Goldberg’s 2kth symmetric pseudotensor \(t_{(2k)}^{\alpha \beta}\) is defined by \(2\vert g{\vert ^{k + 1}}(8\pi Gt_{(2k)}^{\alpha \beta} - {G^{\alpha \beta}}): = {\partial _\mu}{\partial _\nu}[\vert g{\vert ^{k + 1}}({g^{\alpha \beta}}{g^{\mu \nu}} - {g^{\alpha \nu}}{g^{\beta \mu}})]\) (which, for k = 0, reduces to the Landau-Lifshitz pseudotensor, the only symmetric pseudotensor which is a quadratic expression of the first derivatives of the metric) [222]. However, by Einstein’s equations, this definition implies that \({\partial _\alpha}[\vert g{\vert ^{k + 1}}(t_{(2k)}^{\alpha \beta} + {T^{\alpha \beta}})] = 0\). Hence what is (coordinate-)divergence-free (i.e., ‘pseudo-conserved’) cannot be interpreted as the sum of the gravitational and matter energy-momentum densities. Indeed, the latter is \(\vert g{\vert ^{1/2}}{T^{\alpha \beta}}\), while the second term in the divergence equation has an extra weight \(\vert g{\vert ^{k + 1/2}}\). Thus, there is only one pseudotensor in this series, which satisfies the ‘conservation law’ with the correct weight. In particular, the Landau-Lifshitz pseudotensor also has this defect. On the other hand, the pseudotensors coming from some action (the ‘canonical pseudotensors’) appear to be free of this kind of difficulty (see also [486, 487]). Excellent classical reviews on these (and several other) pseudotensors are [525, 77, 15, 223], and for some recent ones (using background geometric structures) see, e.g., [186, 187, 102, 211, 212, 304, 430].

A particularly useful and comprehensive recent review with many applications and an extended bibliography is that of Petrov [428]. We return to the discussion of pseudotensors in Sections 3.3.1, 4.2.2 and 11.3.5.

3.1.4 Strategies to avoid pseudotensors I: Background metrics/connections

One way of avoiding the use of pseudotensorial quantities is to introduce an explicit background connection [287] or background metric [437, 305, 310, 307, 306, 429, 184]. (The superpotential of Katz, Bičák, and Lynden-Bell [306] has been rediscovered recently by Chen and Nester [137] in a completely different way. We return to a discussion of the approach of Chen and Nester in Section 11.3.2.) The advantage of this approach would be that we could use the background not only to derive the canonical energy-momentum and spin tensors, but to define the vector fields Ka as the symmetry generators of the background. Then, the resulting Noether currents are, without doubt, tensorial. However, they depend explicitly on the choice of the background connection or metric not only through Ka: The canonical energy-momentum and spin tensors themselves are explicitly background-dependent. Thus, again, the resulting expressions would have to be supplemented by a ‘natural’ choice for the background, and the main question is how to find such a ‘natural’ reference configuration from the infinitely many possibilities. A particularly interesting special bimetric approach was suggested in [407] (see also [408]), in which the background (flat) metric is also fixed by using Synge’s world function.

3.1.5 Strategies to avoid pseudotensors II: The tetrad formalism

In the tetrad formulation of general relativity, the gab-orthonormal frame fields \(\{E_{\underline a}^a\}, \underline a = 0, \ldots, 3\), are chosen to be the gravitational field variables [533, 314]. Re-expressing the Hilbert Lagrangian (i.e., the curvature scalar) in terms of the tetrad field and its partial derivatives in some local coordinate system, one can calculate the canonical energy-momentum and spin by Eqs. (2.4) and (2.2), respectively. Not surprisingly at all, we recover the pseudotensorial quantities that we obtained in the metric formulation above. However, as realized by Møller [368], the use of the tetrad fields as the field variables instead of the metric makes it possible to introduce a first-order, scalar Lagrangian for Einstein’s field equations: If \(\gamma _{\underline e \underline b}^{\underline a}: = E_{\underline e}^e\gamma _{e\underline b}^{\underline a}: = E_{\underline e}^e\vartheta _a^{\underline a}{\nabla _e}E_{\underline b}^a\), the Ricci rotation coefficients, then Møller’s tetrad Lagrangian is

$$L: = {1 \over {16\pi G}}\left[ {R - 2{\nabla _a}(E_{\underline a}^a{\eta ^{\underline a \underline b}}\gamma _{\underline c \underline b}^{\underline c})} \right] = {1 \over {16\pi G}}\left({E_{\underline a}^aE_{\underline b}^b - E_{\underline a}^bE_{\underline b}^a} \right)\gamma _{a\underline c}^{\underline a}\gamma _b^{\underline c \underline b}.$$
(3.5)

(Here \(\left\{{\vartheta _a^{\underline a}} \right\}\) is the one-form basis dual to \(\left\{{E_{\underline a}^a} \right\}\).) Although L depends on the actual tetrad field \(\left\{{E_{\underline a}^a} \right\}\), it is weakly O(1, 3)-invariant. Møller’s Lagrangian has a nice uniqueness property [412]: Any first-order scalar Lagrangian built from the tetrad fields, whose Euler-Lagrange equations are the Einstein equations, is Møller’s Lagrangian. (Using Dirac spinor variables Nester and Tung found a first-order spinor Lagrangian [392], which turned out to be equivalent to Møller’s Lagrangian [530]. Another first-order spinor Lagrangian, based on the use of the two-component spinors and the anti-self-dual connection, was suggested by Tung and Jacobson [529]. Both Lagrangians yield a well-defined Hamiltonian, reproducing the standard ADM energy-momentum in asymptotically flat spacetimes.) The canonical energy-momentum θ derived from Eq. (3.5) using the components of the tetrad fields in some coordinate system as the field variables is still pseudotensorial, but, as Møller realized, it has a tensorial superpotential:

$${\vee _b}^{ae}: = 2\left({- \gamma _{\underline b \underline c}^{\underline a}{\eta ^{\underline c \underline e}} + \gamma _{\underline d \underline c}^{\underline d}{\eta ^{\underline c \underline s}}\left({\delta _{\underline b}^{\underline a}\delta _{\underline s}^{\underline e} - \delta _{\underline s}^{\underline a}\delta _{\underline b}^{\underline e}} \right)} \right)\;\vartheta _b^{\underline b}E_{\underline a}^aE_{\underline e}^e = {\vee _b}^{[ae]}.$$
(3.6)

The canonical spin turns out to be essentially \({\vee _b}^{ae}\), i.e., a tensor. The tensorial nature of the superpotential makes it possible to introduce a canonical energy-momentum tensor for the gravitational ‘field’. Then, the corresponding canonical Noether current Ca[K] will also be tensorial and satisfies

$$8\pi G{C^a}[{\bf{K}}] = {G^{ab}}{K_b} + {\textstyle{1 \over 2}}{\nabla _c}({K^b}{\vee _b}^{ac}).$$
(3.7)

Therefore, the canonical Noether current derived from Møller’s tetrad Lagrangian is independent of the background structure (i.e., the coordinate system) that we used to do the calculations (see also [486]). However, Ca[K] depends on the actual tetrad field, and hence, a preferred class of frame fields, i.e., an O(1, 3)-gauge reduction, is needed. Thus, the explicit background dependence of the final result of other approaches has been transformed into an internal O(1, 3)-gauge dependence. It is important to realize that this difficulty always appears in connection with the gravitational energy-momentum and angular momentum, at least in disguise. In particular, the Hamiltonian approach in itself does not yield a well defined energy-momentum density for the gravitational ‘field’ (see, e.g., [379, 353]). Thus in the tetrad approach the canonical Noether current should be supplemented by a gauge condition for the tetrad field. Such a gauge condition could be some spacetime version of Nester’s gauge conditions (in the form of certain partial differential equations) for the orthonormal frames of Riemannian manifolds [378, 381]. (For the existence and the potential obstruction to the existence of the solutions to this gauge condition on spacelike hypersurfaces, see [384, 196].) Furthermore, since Ca[K] + TabKb is conserved for any vector field Ka, in the absence of the familiar Killing symmetries of the Minkowski spacetime it is not trivial to define the ‘translations’ and ‘rotations’, and hence the energy-momentum and angular momentum. To make them well defined, additional ideas would be needed. For recent reviews of the tetrad formalism of general relativity, including an extended bibliography, see, e.g., [486, 487, 403, 286].

In general, the frame field \(\{E_{\underline a}^a\}\) is defined only on an open subset U ⊂ M. If the domain of the frame field can be extended to the whole M, then M is called parallelizable. For time and space-orientable spacetimes this is equivalent to the existence of a spinor structure [206], which is known to be equivalent to the vanishing of the second Stiefel-Whitney class of M [364], a global topological condition on M.

The relation of Møller’s superpotential \({\vee _e}^{ab}\) to the Nester-Witten 2-form, by means of which an alternative form of the ADM energy-momentum is given and several quasi-local energy-momentum expressions are defined, is discussed in Section 3.2.1 and in the first paragraphs of Section 8.

3.1.6 Strategies to avoid pseudotensors III: Higher derivative currents

Giving up the paradigm that the Noether current should depend only on the vector field Ka and its first derivative — i.e., if we allow a term a to be present in the Noether current (2.3), even if the Lagrangian is diffeomorphism invariant — one naturally arrives at Komar’s tensorial superpotential K∨ [K]ab:= ∇[aKb] and the corresponding Noether current \({C^a}[{\bf{K}}]: = {G^a}_b{K^b} + {\nabla _b}{\nabla ^{[a}}{K^{b]}}\) [322] (see also [77]). Although its independence of any background structure (viz. its tensorial nature) and its uniqueness property (see Komar [322] quoting Sachs) is especially attractive, the vector field Ka is still to be determined. A new suggestion for the approximate spacetime symmetries that can, in principle, be used in Komar’s expression, both near a point and a world line, is given in [235]. This is a generalization of the affine collineations (including the homotheties and the Killing symmetries). We continue the discussion of the Komar expression in Sections 3.2.2, 3.2.3, 4.3.1 and 12.1, and of the approximate spacetime symmetries in Section 11.1.

3.2 On the global energy-momentum and angular momentum of gravitating systems: The successes

As is well known, in spite of the difficulties with the notion of the gravitational energy-momentum density discussed above, reasonable total energy-momentum and angular momentum can be associated with the whole spacetime, provided it is asymptotically flat. In the present section we recall the various forms of them. As we will see, most of the quasi-local constructions are simply ‘quasi-localizations’ of the total quantities. Obviously, the technique used in the ‘quasi-localization’ does depend on the actual form of the total quantities, yielding mathematically-inequivalent definitions for the quasi-local quantities. We return to the discussion of the tools needed in the quasi-localization procedures in Sections 4.2 and 4.3. Classical, excellent reviews of global energy-momentum and angular momentum are [208, 223, 28, 393, 553, 426], and a recent review of conformal infinity (with special emphasis on its applicability in numerical relativity) is [195]. Reviews of the positive energy proofs from the early 1980s are [273, 427].

3.2.1 Spatial infinity: Energy-momentum

There are several mathematically-inequivalent definitions of asymptotic flatness at spatial infinity [208, 475, 37, 65, 200]. The traditional definition is based on the existence of a certain asymptotically flat spacelike hypersurface. Here we adopt this definition, which is probably the weakest one in the sense that the spacetimes that are asymptotically flat in the sense of any reasonable definition are asymptotically flat in the traditional sense as well. A spacelike hypersurface Σ will be called k-asymptotically flat if for some compact set K ⊂ Σ the complement Σ − K is diffeomorphic to ℝ³ minus a solid ball, and there exists a (negative definite) metric \({}_0{h_{ab}}\) on Σ, which is flat on Σ − K, such that the components of the difference of the physical and the background metrics, \({h_{ij}} - {}_0{h_{ij}}\), and of the extrinsic curvature χij in the \({}_0{h_{ij}}\)-Cartesian coordinate system {xk} fall off as \({r^{- k}}\) and \({r^{- k - 1}}\), respectively, for some k > 0 and \({r^2}: = {\delta _{ij}}{x^i}{x^j}\) [433, 64]. These conditions make it possible to introduce the notion of asymptotic spacetime Killing vectors, and to speak about asymptotic translations and asymptotic boost rotations. Σ − K together with the metric and extrinsic curvature is called the asymptotic end of Σ. In a more general definition of asymptotic flatness Σ is allowed to have finitely many such ends.

As is well known, finite and well-defined ADM energy-momentum [23, 25, 24, 26] can be associated with any k-asymptotically flat spacelike hypersurface, if \(k > {1 \over 2}\), by taking the value on the constraint surface of the Hamiltonian H[Ka], given, for example, in [433, 64], with the asymptotic translations Ka (see [144, 52, 399, 145]). In its standard form, this is the r → ∞ limit of a two-surface integral of the first derivatives of the induced three-metric hab and of the extrinsic curvature χab for spheres \({\mathcal S_r}\) of large coordinate radius r. Explicitly:

$$E = {1 \over {16\pi G}}\;\underset {r \rightarrow \infty}{\lim} \oint\nolimits_{{{\mathcal S}_r}} {\upsilon ^a}\left({}_0D_c h_{da} - {}_0D_a h_{cd}\right){}_0h^{cd}\,{\rm{d}}{{\mathcal S}_r},$$
(3.8)
$${P^{\bf{i}}} = - {1 \over {8\pi G}} \underset {r \rightarrow \infty}{\lim} \oint\nolimits_{{{\mathcal S}_r}} {\upsilon ^a}\left({\chi _a}^b - \chi \delta _a^b\right){}_0D_b{x^{\bf{i}}}\,{\rm{d}}{{\mathcal S}_r},$$
(3.9)

where \({}_0D_e\) is the Levi-Civita derivative operator determined by \({}_0h_{ab}\), and \(\upsilon^a\) is the outward pointing unit normal to \({{\mathcal S}_r}\) and tangent to Σ. The ADM energy-momentum, \({P^{\underline a}} = (E,{P^{\bf{i}}})\), is an element of the space dual to the space of the asymptotic translations, and transforms as a Lorentzian four-vector with respect to asymptotic Lorentz transformations of the asymptotic Cartesian coordinates.
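As an illustration of how the limit in Eq. (3.8) behaves, the following minimal sketch evaluates the surface integral numerically on spheres of finite coordinate radius. Everything specific in it is an assumption made for the example and not part of the text above: units with G = c = 1, the positive-definite convention \(g_{ij} = -h_{ij}\) (in which the integrand of Eq. (3.8) becomes \(n^i(\partial_j g_{ij} - \partial_i g_{jj})\)), the Schwarzschild three-metric in isotropic coordinates, \(g_{ij} = \psi^4\delta_{ij}\) with \(\psi = 1 + Gm/(2r)\), and the hypothetical function names `metric`, `d_metric` and `adm_energy`.

```python
import numpy as np

G = 1.0  # assumption: units with G = c = 1


def metric(x, m):
    """Positive-definite isotropic Schwarzschild 3-metric g_ij = psi^4 delta_ij."""
    r = np.linalg.norm(x)
    psi = 1.0 + G * m / (2.0 * r)
    return psi**4 * np.eye(3)


def d_metric(x, m, eps):
    """Central differences in Cartesian coordinates: dg[k, i, j] = d_k g_ij."""
    dg = np.zeros((3, 3, 3))
    for k in range(3):
        dx = np.zeros(3)
        dx[k] = eps
        dg[k] = (metric(x + dx, m) - metric(x - dx, m)) / (2.0 * eps)
    return dg


def adm_energy(r, m, n_theta=16, n_phi=32):
    """Surface integral of Eq. (3.8) on the coordinate sphere of radius r."""
    mu, w = np.polynomial.legendre.leggauss(n_theta)  # nodes in cos(theta)
    phis = np.linspace(0.0, 2.0 * np.pi, n_phi, endpoint=False)
    total = 0.0
    for c, wt in zip(mu, w):
        s = np.sqrt(1.0 - c * c)
        for phi in phis:
            n = np.array([s * np.cos(phi), s * np.sin(phi), c])  # unit normal
            dg = d_metric(r * n, m, eps=1.0e-4 * r)
            # n^i (d_j g_ij - d_i g_jj), with the flat background metric delta_ij
            integrand = (np.einsum("i,jij->", n, dg)
                         - np.einsum("i,ijj->", n, dg))
            total += wt * (2.0 * np.pi / n_phi) * r**2 * integrand
    return total / (16.0 * np.pi * G)


if __name__ == "__main__":
    for radius in (1.0e1, 1.0e2, 1.0e3, 1.0e4):
        print(radius, adm_energy(radius, m=1.0))
```

For this particular metric the integral can be done by hand and gives \(m\,\psi^3\) at finite r, so the printed values should approach the mass parameter m from above as the radius grows, which is the sense in which the limit in Eq. (3.8) recovers the total energy.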

The traditional ADM approach to the introduction of the conserved quantities and the Hamiltonian analysis of general relativity is based on the 3+1 decomposition of the fields and the spacetime. Thus, it is not a priori clear that the energy and spatial momentum form a Lorentz vector (and the spatial angular momentum and center-of-mass, discussed below, form an antisymmetric tensor). One has to check a posteriori that the conserved quantities obtained in the 3 + 1 form are, in fact, Lorentz-covariant. To obtain manifestly Lorentz-covariant quantities one should not do the 3 + 1 decomposition. Such a manifestly Lorentz-covariant Hamiltonian analysis was suggested first by Nester [377], and he was able to recover the ADM energy-momentum in a natural way (see Section 11.3).

Another form of the ADM energy-momentum is based on Møller’s tetrad superpotential [223]: Taking the flux integral of the current \(C^a[{\bf{K}}] + T^{ab}K_b\) on the spacelike hypersurface Σ, by Eq. (3.7) the flux can be rewritten as the r → ∞ limit of the two-surface integral of Møller’s superpotential on spheres of large r with the asymptotic translations \(K^a\). Choosing the tetrad field \(E_{\underline a}^a\) to be adapted to the spacelike hypersurface and assuming that the frame \(E_{\underline a}^a\) tends to a constant Cartesian one as \(r^{-k}\), the integral reproduces the ADM energy-momentum. The same expression can be obtained by following the familiar Hamiltonian analysis using the tetrad variables too: By the standard scenario one can construct the basic Hamiltonian [379]. This Hamiltonian, evaluated on the constraints, turns out to be precisely the flux integral of \(C^a[{\bf{K}}] + T^{ab}K_b\) on Σ.

A particularly interesting and useful expression for the ADM energy-momentum is possible if the tetrad field is considered to be a frame field built from a normalized spinor dyad \(\{\lambda _A^{\underline A}\}, \underline A = 0,1\), on Σ, which is asymptotically constant (see Section 4.2.3). (Thus, underlined capital Roman indices are concrete name spinor indices.) Then, for the components of the ADM energy-momentum in the constant spinor basis at infinity, Møller’s expression yields the limit of

$${P^{\underline A \underline {B{\prime}}}} = {1 \over {4\pi G}}\oint\nolimits_{\mathcal S} {{{\rm{i}} \over 2}} \left({\overline \lambda _{A{\prime}}^{\underline {B{\prime}}}{\nabla _{BB{\prime}}}\lambda _A^{\underline A} - \overline \lambda _{B{\prime}}^{\underline {B{\prime}}}{\nabla _{AA{\prime}}}\lambda _B^{\underline A}} \right),$$
(3.10)

as the two-surface \({\mathcal S}\) is blown up to approach infinity. In fact, to recover the ADM energy-momentum in the form (3.10), the spinor fields \(\lambda _A^{\underline A}\) need not be required to form a normalized spinor dyad; it is enough that they form an asymptotically constant normalized dyad, and we have to use the fact that the generator vector field \(K^a\) has asymptotically constant components \({K^{\underline A {{\underline A}{\prime}}}}\) in the asymptotically constant frame field \(\lambda _{\underline A}^A\bar \lambda _{\underline {{A{\prime}}}}^{{A{\prime}}}\). Thus \({K^a} = {K^{\underline A {{\underline A}{\prime}}}}\lambda _{\underline A}^A\bar \lambda _{\underline {{A{\prime}}}}^{{A{\prime}}}\) can be interpreted as an asymptotic translation. The complex-valued 2-form in the integrand of Eq. (3.10) will be denoted by \(u{({\lambda ^{\underline A}},{{\bar \lambda}^{\underline {{B{\prime}}}}})_{ab}}\), and is called the Nester-Witten 2-form. This is ‘essentially Hermitian’ and connected with Komar’s superpotential, too. In fact, for any two spinor fields \(\alpha_A\) and \(\beta_A\) one has

$$u{\left({\alpha, \overline \beta} \right)_{ab}} - \overline {u{{(\beta, \overline \alpha)}_{ab}}} = - {\rm{i}}{\nabla _{\left[ a \right.}}{X_{\left. b \right]}},$$
(3.11)
$$u{\left({\alpha, \bar \beta} \right)_{ab}} + \overline {u{{(\beta, \bar \alpha)}_{ab}}} = {\textstyle{1 \over 2}}{\nabla _c}{X_d}{\varepsilon ^{cd}}_{ab} + {\rm{i}}\left({{\varepsilon _{A{\prime}B{\prime}}}{\alpha _{\left(A \right.}}{\nabla _{\left. B \right)C{\prime}}}{{\overline \beta}^{C{\prime}}} - {\varepsilon _{A\,B}}{{\overline \beta}_{{{\left(A{\prime}\right.}}}}{\nabla _{\left. {B{\prime}} \right)C}}{\alpha ^C}} \right),$$
(3.12)

where \({X_a}: = {\alpha _A}{{\bar \beta}_{{A{\prime}}}}\) and the overline denotes complex conjugation. Thus, apart from the terms in Eq. (3.12) involving \(\nabla_{A{\prime}A}\alpha^A\) and \({\nabla _{A{A{\prime}}}}{{\bar \beta}^{{A{\prime}}}}\), the Nester-Witten 2-form \(u{(\alpha, \bar \beta)_{ab}}\) is just \(- {{\rm{i}} \over 2}({\nabla _{[a}}{X_{b]}} + {\rm{i}}{\nabla _{[c}}{X_{d]}}{1 \over 2}{\varepsilon ^{cd}}_{ab})\), i.e., the anti-self-dual part of the curl of \(- {{\rm{i}} \over 2}{X_a}\). (The original expressions by Witten and Nester were given using Dirac, rather than two-component Weyl, spinors [559, 376]. The 2-form \(u{(\alpha, \bar \beta)_{ab}}\) in the present form using the two-component spinors probably appeared first in [276].) Although many interesting and original proofs of the positivity of the ADM energy are known even in the presence of black holes [444, 445, 559, 376, 273, 427, 300], the simplest and most transparent ones are probably those based on the use of two-component spinors: If the dominant energy condition is satisfied on the k-asymptotically flat spacelike hypersurface Σ, where \(k > {1 \over 2}\), then the ADM energy-momentum is future pointing and nonspacelike (i.e., the Lorentzian length of the energy-momentum vector, the ADM mass, is non-negative), and is null if and only if the domain of dependence D(Σ) of Σ is flat [276, 434, 217, 436, 88]. The proof may be based on the Sparling equation [476, 175, 426, 358]:

$${\nabla _{\left[ a \right.}}u{(\lambda, \overline \mu)_{\left. {bc} \right]}} = - {1 \over 2}{\lambda _E}{\overline \mu _{E{\prime}}}{G^{e\;f}}{1 \over {3!}}{\varepsilon _{f\;abc}} + \Gamma {(\lambda, \overline \mu)_{abc}}.$$
(3.13)

The significance of this equation is that, in the exterior derivative of the Nester-Witten 2-form, the second derivatives of the metric appear only through the Einstein tensor. Thus its structure is similar to that of the superpotential equations in Lagrangian field theory, and \(\Gamma {(\lambda, \bar \mu)_{abc}}\), known as the Sparling 3-form, is a homogeneous quadratic expression of the first derivatives of the spinor fields. If the spinor fields \(\lambda_A\) and \(\mu_A\) solve the Witten equation on a spacelike hypersurface Σ, then the pullback of \(\Gamma {(\lambda, \bar \mu)_{abc}}\) to Σ is positive definite. This theorem has been extended and refined in various ways, in particular by allowing inner boundaries of Σ that represent future marginally trapped surfaces in black holes [217, 273, 427, 268].
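The way Eq. (3.13) enters the positivity argument can be sketched schematically as follows. The sketch assumes μ = λ, a hypersurface Σ whose only boundary is the two-sphere at infinity, and it hides all convention-dependent numerical factors of Stokes’ theorem in the symbol ∝:

$$\oint\nolimits_{\partial \Sigma} u{(\lambda, \bar \lambda)_{ab}} \;\propto\; \int\nolimits_\Sigma {\nabla _{[a}}u{(\lambda, \bar \lambda)_{bc]}} = \int\nolimits_\Sigma \left(- {1 \over 2}{\lambda _E}{\bar \lambda _{E{\prime}}}{G^{ef}}{1 \over {3!}}{\varepsilon _{fabc}} + \Gamma {(\lambda, \bar \lambda)_{abc}}\right).$$

By Eq. (3.10) the left-hand side is, in the limit, a component of the ADM energy-momentum along the null vector built from the asymptotic values of the spinor field. On the right-hand side, Einstein’s equations turn the first term into a flux of the matter energy-momentum, whose sign is controlled by the dominant energy condition, while the second term has a definite sign when λ solves the Witten equation.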

The ADM energy-momentum can also be written as the two-sphere integral of certain parts of the conformally rescaled spacetime curvature [28, 29, 43]. This expression is a special case of the more general ‘Riemann tensor conserved quantities’ (see [223]): If \({\mathcal S}\) is any closed spacelike two-surface with area element \(d{\mathcal S}\), then for any tensor fields \(\omega^{ab} = \omega^{[ab]}\) and \(\mu^{ab} = \mu^{[ab]}\) one can form the integral

$${I_{\mathcal S}}[\omega, \mu ]: = \oint\nolimits_{\mathcal S} {{\omega ^{ab}}{R_{abcd}}{\mu ^{cd}}d{\mathcal S}}.$$
(3.14)

Since the falloff of the curvature tensor near spatial infinity is \(r^{-k-2}\), the integral \({I_{\mathcal S}}[\omega, \mu ]\) at spatial infinity gives a finite value when \(\omega^{ab}\mu^{cd}\) blows up like \(r^k\) as r → ∞. In particular, for the 1/r falloff, this condition can be satisfied by \({\omega ^{ab}}{\mu ^{cd}} = \sqrt {{\rm{Area(}}{\mathcal S}{\rm{)}}} {{\hat \omega}^{ab}}{{\hat \mu}^{cd}}\), where Area(\({\mathcal S}\)) is the area of \({\mathcal S}\) and the hatted tensor fields are \({\mathcal O}(1)\).
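The power counting behind this statement is elementary. On a sphere of large coordinate radius r the three factors in Eq. (3.14) scale as

$$\left\vert {I_{\mathcal S}}[\omega, \mu ] \right\vert \lesssim \underbrace{{\rm{Area}}({\mathcal S})}_{\sim\, r^2} \cdot \underbrace{\left\vert {\omega ^{ab}}{\mu ^{cd}} \right\vert}_{\sim\, r^k} \cdot \underbrace{\left\vert {R_{abcd}} \right\vert}_{\sim\, r^{-k-2}} = {\mathcal O}(1),$$

so the growth of the weight functions exactly compensates the decay of the curvature. For k = 1 the required growth is linear in r, which is what the factor \(\sqrt {{\rm{Area}}({\mathcal S})} \sim r\) provides.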

If the spacetime is stationary, then the ADM energy can also be recovered as the r → ∞ limit of the two-sphere integral of (twice) Komar’s superpotential with the Killing vector \(K^a\) of stationarity [223] (see also [60]). (See also the remark following Eq. (3.15) below.) On the other hand, if the spacetime is not stationary then, without additional restriction on the asymptotic time translation, the Komar expression does not reproduce the ADM energy. However, by Eqs. (3.11) and (3.12) such an additional restriction might be that \(K^a\) should be a constant combination of four future-pointing null vector fields of the form \({\alpha ^A}{{\bar \alpha}^{{A{\prime}}}}\), where the spinor fields \(\alpha^A\) are required to satisfy the Weyl neutrino equation \(\nabla_{A{\prime}A}\alpha^A = 0\). This expression for the ADM energy-momentum has been used to give an alternative, ‘four-dimensional’ proof of the positivity of the ADM energy [276]. (For a more detailed recent review of the various forms of the ADM energy and linear momentum, see, e.g., [293].)

In stationary spacetime the notion of the mechanical energy with respect to the world lines of stationary observers (i.e., the integral curves of the timelike Killing field) can be introduced in a natural way, and then (by definition) the total (ADM) energy is written as the sum of the mechanical energy and the gravitational energy. Then the latter is shown to be negative for certain classes of systems [308, 348].

The notion of asymptotic flatness at spatial infinity is generalized in [398]; here the background flat metric \({}_0h_{ab}\) on Σ − K is allowed to have a nonzero deficit angle α at infinity, i.e., the corresponding line element in spherical polar coordinates takes the form \(-{\rm{d}}r^2 - r^2(1 - \alpha)({\rm{d}}\theta^2 + \sin^2(\theta)\,{\rm{d}}\phi^2)\). Then, a canonical analysis of the minimally-coupled Einstein-Higgs field is carried out on such a background, and, following a Regge-Teitelboim-type argumentation, an ADM-type total energy is introduced. It is shown that for appropriately chosen α this energy is finite for the global monopole solution, though the standard ADM energy is infinite.

3.2.2 Spatial infinity: Angular momentum

The value of the Hamiltonian of Beig and Ó Murchadha [64], together with the appropriately-defined asymptotic rotation-boost Killing vectors [497], define the spatial angular momentum and center-of-mass, provided k ≥ 1 and, in addition to the familiar falloff conditions, certain global integral conditions are also satisfied. These integral conditions can be ensured by the explicit parity conditions of Regge and Teitelboim [433] on the leading nontrivial parts of the metric hab and extrinsic curvature χab: The components in the Cartesian coordinates {xi} of the former must be even and the components of the latter must be odd parity functions of xi/r (see also [64]). Thus, in what follows we assume that k = 1. Then the value of the Beig-Ó Murchadha Hamiltonian parametrized by the asymptotic rotation Killing vectors is the spatial angular momentum of Regge and Teitelboim [433], while that parametrized by the asymptotic boost Killing vectors deviates from the center-of-mass of Beig and Ó Murchadha [64] by a term, which is the spatial momentum times the coordinate time. (As Beig and Ó Murchadha pointed out [64], the center-of-mass term of the Hamiltonian of Regge and Teitelboim is not finite on the whole phase space.) The spatial angular momentum and the new center-of-mass form an anti-symmetric Lorentz four-tensor, which transforms in the correct way under the four-translation of the origin of the asymptotically Cartesian coordinate system, and is conserved by the evolution equations [497].

The center-of-mass of Beig and Ó Murchadha was re-expressed recently [57] as the r → ∞ limit of two-surface integrals of the curvature in the form (3.14) with \(\omega^{ab}\mu^{cd}\) proportional to the lapse N times \(q^{ac}q^{bd} - q^{ad}q^{bc}\), where \(q_{ab}\) is the induced two-metric on \({\mathcal S}\) (see Section 4.1.1). The geometric notion of center-of-mass introduced by Huisken and Yau [280] is another form of the Beig-Ó Murchadha center-of-mass [156].

The Ashtekar-Hansen definition for the angular momentum is introduced in their specific conformal model of spatial infinity as a certain two-surface integral near infinity. However, their angular momentum expression is finite and unambiguously defined only if the magnetic part of the spacetime curvature tensor (with respect to the Ω = const. timelike level hypersurfaces of the conformal factor) falls off faster than it would fall off in metrics with 1/r falloff (but no global integral condition, e.g., a parity condition, had to be imposed) [37, 28].

If the spacetime admits a Killing vector of axisymmetry, then the usual interpretation of the corresponding Komar integral is the appropriate component of the angular momentum (see, e.g., [534]). However, the value of the Komar integral (with the usual normalization) is twice the expected angular momentum. In particular, if the Komar integral is normalized such that for the Killing field of stationarity in the Kerr solution the integral is m/G, for the Killing vector of axisymmetry it is 2ma/G instead of the expected ma/G (‘factor-of-two anomaly’) [305]. We return to the discussion of the Komar integral in Sections 4.3.1 and 12.1.

3.2.3 Null infinity: Energy-momentum

The study of the gravitational radiation of isolated sources led Bondi to the observation that the two-sphere integral of a certain expansion coefficient m(u, θ, ϕ) of the line element of a radiative spacetime in an asymptotically-retarded spherical coordinate system (u, r, θ, ϕ) behaves as the energy of the system at the retarded time u. Indeed, this energy is not constant in time, but decreases with u, showing that gravitational radiation carries away positive energy (‘Bondi’s mass-loss’) [91, 92]. The set of transformations leaving the asymptotic form of the metric invariant was identified as a group, currently known as the Bondi-Metzner-Sachs (or BMS) group, having a structure very similar to that of the Poincaré group [440]. The only difference is that while the Poincaré group is a semidirect product of the Lorentz group and a four dimensional commutative group (of translations), the BMS group is the semidirect product of the Lorentz group and an infinite-dimensional commutative group, called the group of the supertranslations. A four-parameter subgroup in the latter can be identified in a natural way as the group of the translations. This makes it possible to compare the Bondi-Sachs four-momenta defined on different cuts of scri, and to calculate the energy-momentum carried away by the gravitational radiation in an unambiguous way. (For further discussion of the flux, see the fourth paragraph of Section 3.2.4.) At the same time the study of asymptotic solutions of the field equations led Newman and Unti to another concept of energy at null infinity [394]. However, this energy (currently known as the Newman-Unti energy) does not seem to have the same significance as the Bondi (or Bondi-Sachs [426] or Trautman-Bondi [147, 148, 146]) energy, because its monotonicity can be proven only between special, e.g., stationary, states. 
The Bondi energy, which is the time component of a Lorentz vector, the Bondi-Sachs energy-momentum, has a remarkable uniqueness property [147, 148].
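The split of the supertranslations into a four-parameter translation subgroup and the remaining ‘proper’ supertranslations can be illustrated numerically. The sketch below relies on an identification that is standard but is not spelled out in the text above, and should be read as an assumption: in a fixed Bondi frame a supertranslation is generated by a real function on the unit sphere, and the translations are those functions spanned by the ℓ = 0 and ℓ = 1 spherical harmonics, i.e., by 1 and the three components of the unit normal. The function names `sphere_quadrature` and `split_supertranslation` are hypothetical.

```python
import numpy as np


def sphere_quadrature(n_theta=8, n_phi=16):
    """Unit normals and weights of a product quadrature on the unit sphere."""
    mu, w = np.polynomial.legendre.leggauss(n_theta)  # mu = cos(theta)
    phi = np.linspace(0.0, 2.0 * np.pi, n_phi, endpoint=False)
    MU, PHI = np.meshgrid(mu, phi, indexing="ij")
    S = np.sqrt(1.0 - MU**2)
    normals = np.stack([S * np.cos(PHI), S * np.sin(PHI), MU], axis=-1)
    weights = np.repeat(w[:, None], n_phi, axis=1) * (2.0 * np.pi / n_phi)
    return normals, weights


def split_supertranslation(f, n_theta=8, n_phi=16):
    """Project f: S^2 -> R onto span{1, n_x, n_y, n_z} (the translation part).

    Returns the four translation parameters (a0, a) and a callable giving the
    remainder, which contains only l >= 2 harmonics (a proper supertranslation).
    """
    n, w = sphere_quadrature(n_theta, n_phi)
    vals = f(n)
    a0 = np.sum(w * vals) / (4.0 * np.pi)
    # The integral of n_i n_j over the sphere is (4 pi / 3) delta_ij.
    a = 3.0 * np.einsum("ij,ij,ijk->k", w, vals, n) / (4.0 * np.pi)

    def proper(nn):
        return f(nn) - a0 - nn @ a

    return a0, a, proper


if __name__ == "__main__":
    # Example: a time shift, a spatial shift, and one l = 2 contribution.
    def f(n):
        return 2.0 - n[..., 0] + 3.0 * n[..., 2] + 0.5 * (3.0 * n[..., 2]**2 - 1.0)

    a0, a, proper = split_supertranslation(f)
    print(a0, a, proper(np.array([0.0, 0.0, 1.0])))
```

Only the four numbers (a0, a) survive in the translation subgroup; the remainder is an arbitrary function with no ℓ ≤ 1 part, which is why the supertranslation group is infinite dimensional.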

Without additional conditions on \(K^a\), Komar’s expression does not reproduce the Bondi-Sachs energy-momentum in nonstationary spacetimes either [557, 223]: For the ‘obvious’ choice for \(K^a\), (twice) Komar’s expression yields the Newman-Unti energy. This anomalous behavior in the radiative regime could be corrected in at least two ways. The first is by modifying the Komar integral according to

$${L_{\mathcal S}}[{\bf{K}}]: = {1 \over {8\pi G}}\oint\nolimits_{\mathcal S} {\left({{\nabla ^{\left[ c \right.}}{K^{\left. d \right]}} + \alpha {\nabla _e}{K^e}\,{}^ \bot {\varepsilon ^{cd}}} \right){1 \over 2}{\varepsilon _{cdab}}},$$
(3.15)

where \({}^ \bot \varepsilon^{cd}\) is the area 2-form on the Lorentzian two-planes orthogonal to \({\mathcal S}\) (see Section 4.1.1) and α is some real constant. For α = 1 the integral \({L_{\mathcal S}}[{\bf{K}}]\), suggested by Winicour and Tamburino, is called the linkage [557]. (N.B.: The flux integral of the sum \({C^a}[{\bf{K}}] + {T^a}_b{K^b}\) of Komar’s gravitational and the matter’s currents on some compact spacelike hypersurface Σ with boundary \({\mathcal S}\) is \({1 \over {16\pi G}}\oint\nolimits_{\mathcal S} {\nabla ^{[a}}{K^{b]}}{1 \over 2}{\varepsilon _{abcd}}\), which, for α = 0, is half of the linkage.) In addition, to define physical quantities by linkages associated to a cut of the null infinity one should prescribe how the two-surface \({\mathcal S}\) tends to the cut and how the vector field \(K^a\) should be propagated from the spacetime to null infinity into a BMS generator [557, 553]. The other way is to consider the original Komar integral (i.e., α = 0) on the cut of infinity in the conformally-rescaled spacetime, while requiring that \(K^a\) be divergence-free [210]. For such asymptotic BMS translations both prescriptions give the correct expression for the Bondi-Sachs energy-momentum.

The Bondi-Sachs energy-momentum can also be expressed by the integral of the Nester-Witten 2-form [285, 342, 343, 276]. However, in nonstationary spacetimes the spinor fields that are asymptotically constant at null infinity are vanishing [106]. Thus, the spinor fields in the Nester-Witten 2-form must satisfy a weaker boundary condition at infinity such that the spinor fields themselves are the spinor constituents of the BMS translations. The first such condition, suggested by Bramson [106], was to require the spinor fields to be the solutions of the asymptotic twistor equation (see Section 4.2.4). One can impose several such inequivalent conditions, and all of these, based only on the linear first-order differential operators coming from the two natural connections on the cuts (see Section 4.1.2), are determined in [496].

The Bondi-Sachs energy-momentum has a Hamiltonian interpretation as well. Although the fields on a spacelike hypersurface extending to null rather than spatial infinity do not form a closed system, a suitable generalization of the standard Hamiltonian analysis could be developed [146] and used to recover the Bondi-Sachs energy-momentum.

Similar to the ADM case, the simplest proofs of the positivity of the Bondi energy [446] are probably those that are based on the Nester-Witten 2-form [285] and, in particular, the use of two-component spinors [342, 343, 276, 274, 436]: The Bondi-Sachs mass (i.e., the Lorentzian length of the Bondi-Sachs energy-momentum) of a cut of future null infinity is non-negative if there is a spacelike hypersurface Σ intersecting null infinity in the given cut such that the dominant energy condition is satisfied on Σ, and the mass is zero iff the domain of dependence D(Σ) of Σ is flat.

Converting the integral of the Nester-Witten 2-form into a (positive definite) 3-dimensional integral on Σ, a strictly positive lower bound can be given both for the ADM and Bondi-Sachs masses. Although total energy-momentum (or mass) in the form of a two-surface integral cannot be introduced in closed universes (i.e., when Σ is compact with no boundary), a non-negative quantity m, based on this positive definite expression, can be associated with Σ. If the matter fields satisfy the dominant energy condition, then m = 0 if and only if the spacetime is flat and topologically Σ is a 3-torus; moreover its vanishing is equivalent to the existence of non-trivial solutions of Witten’s gauge condition. This m turned out to be recoverable as the first eigenvalue of the square of the Sen-Witten operator. It is the usefulness and the applicability of this m in practice which tell us if this is a reasonable notion of total mass of closed universes or not [503].

3.2.4 Null infinity: Angular momentum

At null infinity we have a generally accepted definition for angular momentum only in stationary or axi-symmetric, but not in general, radiative spacetime, where there are various, mathematically inequivalent suggestions for it (see Section 4.2.4). Here we review only some of those total angular momentum definitions that can be ‘quasi-localized’ or connected somehow to quasi-local expressions, i.e., those that can be considered as the null-infinity limit of some quasi-local expression. We will continue their discussion in the main part of the review, namely in Sections 7.2, 11.1 and 9.

In their classic paper Bergmann and Thomson [78] raise the idea that while the gravitational energy-momentum is connected with the spacetime diffeomorphisms, the angular momentum should be connected with its intrinsic O(1, 3) symmetry. Thus, the angular momentum should be analogous with the spin. Based on the tetrad formalism of general relativity and following the prescription of constructing the Noether currents in Yang-Mills theories, Bramson suggested a superpotential for the six conserved currents corresponding to the internal Lorentz-symmetry [107, 108, 109]. (For another derivation of this superpotential from Møller’s Lagrangian (3.5) see [496].) If \(\{\lambda _A^{\underline A}\}, \underline A = 0,1\), is a normalized spinor dyad corresponding to the orthonormal frame in Eq. (3.5), then the integral of the spinor form of the anti-self-dual part of this superpotential on a closed orientable two-surface \({\mathcal S}\) is

$$J_{\mathcal S}^{\underline A \underline B}: = {1 \over {8\pi G}}\oint\nolimits_{\mathcal S} {- {\rm{i}}\lambda _{\left(A \right.}^{\underline A}\lambda _{\left. B \right)}^{\underline B}{\varepsilon _{A{\prime}B{\prime}}}},$$
(3.16)

where \({\varepsilon _{A{\prime}B{\prime}}}\) is the symplectic metric on the bundle of primed spinors. We will denote its integrand by \(w{({\lambda ^{\underline A}},{\lambda ^{\underline B}})_{ab}}\), and we call it the Bramson superpotential. To define angular momentum on a given cut of the null infinity by the formula (3.16), we should consider its limit when \({\mathcal S}\) tends to the cut in question and we should specify the spinor dyad, at least asymptotically. Bramson’s suggestion for the spinor fields was to take the solutions of the asymptotic twistor equation [106]. He showed that this definition yields a well-defined expression. For stationary spacetimes this reduces to the generally accepted formula (4.15), and the corresponding Pauli-Lubanski spin, constructed from \({\varepsilon ^{\underline {{A{\prime}}} \underline {{B{\prime}}}}}{J^{\underline A \underline B}} + {\varepsilon ^{\underline A \underline B}}{{\bar J}^{\underline {{A{\prime}}} \underline {{B{\prime}}}}}\) and the Bondi-Sachs energy-momentum \({P^{\underline A \underline {{A{\prime}}}}}\) (given, for example, in the Newman-Penrose formalism by Eq. (4.14)), is invariant with respect to supertranslations of the cut (‘active supertranslations’). Note that since Bramson’s expression is based on the solutions of a system of partial differential equations on the cut in question, it is independent of the parametrization of the BMS vector fields. Hence, in particular, it is invariant with respect to the supertranslations of the origin cut (‘passive supertranslations’). Therefore, Bramson’s global angular momentum behaves like the spin part of the total angular momentum. For a suggestion based on Bramson’s superpotential at the quasi-local level, but using a different prescription for the spinor dyad, see Section 9.

The construction based on the Winicour-Tamburino linkage (3.15) can be associated with any BMS vector field [557, 337, 45]. In the special case of translations it reproduces the Bondi-Sachs energy-momentum. The quantities that it defines for the proper supertranslations are called the super-momenta. For the boost-rotation vector fields they can be interpreted as angular momentum. However, in addition to the factor-of-two anomaly, this notion of angular momentum contains a huge ambiguity (‘supertranslation ambiguity’): The actual form of both the boost-rotation Killing vector fields of Minkowski spacetime and the boost-rotation BMS vector fields at future null infinity depend on the choice of origin, a point in Minkowski spacetime and a cut of null infinity, respectively. However, while the set of the origins of Minkowski spacetime is parametrized by four numbers, the set of the origins at null infinity requires a smooth function of the form \(u:{S^2} \rightarrow {\rm{\mathbb R}}\). Consequently, while the corresponding angular momentum in the Minkowski spacetime has the familiar origin-dependence (containing four parameters), the analogous transformation of the angular momentum defined by using the boost-rotation BMS vector fields depends on an arbitrary smooth real valued function on the two-sphere. This makes the angular momentum defined at null infinity by the boost-rotation BMS vector fields ambiguous unless a natural selection rule for the origins, making them form a four parameter family of cuts, is found.
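The ‘familiar origin-dependence’ in Minkowski spacetime, against which the supertranslation ambiguity is measured, can be checked in a few lines. The sketch below is a flat-space toy model only: a hypothetical collection of point particles with arbitrary positions and four-momenta, the convention \(J^{ab} = \sum (x^a p^b - x^b p^a)\), and the Pauli-Lubanski vector \(W_a = {1 \over 2}\varepsilon_{abcd}J^{bc}P^d\) with the flat-space Levi-Civita symbol. The function names are illustrative and none of this is taken from the cited constructions.

```python
import itertools

import numpy as np


def levi_civita4():
    """Totally antisymmetric symbol with eps[0, 1, 2, 3] = +1."""
    eps = np.zeros((4, 4, 4, 4))
    for perm in itertools.permutations(range(4)):
        # Parity of the permutation from its number of inversions.
        inv = sum(perm[i] > perm[j] for i in range(4) for j in range(i + 1, 4))
        eps[perm] = -1.0 if inv % 2 else 1.0
    return eps


EPS = levi_civita4()


def total_momenta(xs, ps, origin):
    """Total P^a and J^{ab} = sum (x - o)^a p^b - (x - o)^b p^a."""
    P = ps.sum(axis=0)
    rel = xs - origin
    J = np.einsum("na,nb->ab", rel, ps) - np.einsum("nb,na->ab", rel, ps)
    return P, J


def pauli_lubanski(P, J):
    """W_a = (1/2) eps_{abcd} J^{bc} P^d."""
    return 0.5 * np.einsum("abcd,bc,d->a", EPS, J, P)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    xs = rng.normal(size=(5, 4))  # arbitrary events, one per particle
    ps = rng.normal(size=(5, 4))  # arbitrary four-vectors standing in for momenta
    shift = np.array([0.3, -1.2, 2.0, 0.7])  # four-parameter change of origin

    P0, J0 = total_momenta(xs, ps, np.zeros(4))
    P1, J1 = total_momenta(xs, ps, shift)
    print(np.max(np.abs(J1 - J0)))  # J depends on the origin
    print(np.max(np.abs(pauli_lubanski(P1, J1) - pauli_lubanski(P0, J0))))
```

Under a shift of the origin by a four-vector the angular momentum changes by a term linear in the shift and the total momentum, while the Pauli-Lubanski vector does not change at all. At null infinity the analogous shift is parametrized by a function on the two-sphere rather than by four numbers, which is the content of the supertranslation ambiguity described above.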

Motivated by Penrose’s idea that the ‘conserved’ quantities at null infinity should be searched for in the form of a charge integral of the curvature (which will be discussed in detail in Section 7), a general expression \({Q_{\mathcal S}}[{K^a}]\), associated with any BMS generator Ka and any cut \({\mathcal S}\) of scri, was introduced [174]. For real Ka this is real; it is vanishing in Minkowski spacetime; it reproduces the Bondi-Sachs energy-momentum for BMS translations; it yields nontrivial results for proper supertranslations; and for BMS rotations the resulting expressions can be interpreted as angular momentum. It was shown in [453, 173] that the difference \({Q_{{{\mathcal S}{\prime}}}}[{K^a}] - {Q_{{{\mathcal S}{{\prime\prime}}}}}[{K^a}]\) for any two cuts \({{\mathcal S}{\prime}}\) and \({{\mathcal S}{{\prime\prime}}}\) can be written as the integral of some local function on the subset of scri bounded by the cuts \({{\mathcal S}{\prime}}\) and \({{\mathcal S}{{\prime\prime}}}\), and this is precisely the flux integral of [44]. Unfortunately, however, the angular momentum introduced in this way still suffers from the same supertranslation ambiguity. A possible resolution of this difficulty could be the suggestion by Dain and Moreschi [169] in the charge integral approach to angular momentum of Moreschi [369, 370]. Their basic idea is that the requirement of the vanishing of the supermomenta (i.e., the quantities corresponding to the proper supertranslations) singles out a four-real-parameter family of cuts, called nice cuts, by means of which the BMS group can be reduced to a Poincaré subgroup that yields a well-defined notion of angular momentum. For further discussion of certain other angular momentum expressions, especially from the points of view of numerical calculations, see also [204].

Another promising approach might be that of Chruściel, Jezierski, and Kijowski [146], which is based on a Hamiltonian analysis of general relativity on asymptotically hyperboloidal spacelike hypersurfaces. They chose the six BMS vector fields tangent to the intersection of the spacelike hypersurface and null infinity as the generators of their angular momentum. Since the motions that their angular momentum generators define leave the domain of integration fixed, and apparently there is no Lorentzian four-space of origins, they appear to be the generators with respect to some fixed ‘center-of-the-cut’, and the corresponding angular momentum appears to be the intrinsic angular momentum.

In addition to the supertranslation ambiguity in the definition of angular momentum, there could be another potential ambiguity, even if the angular momentum is well defined on every cut of future null infinity. In fact, if, for example, the definition of the angular momentum is based on the solutions of some linear partial differential equation on the cut (such as Bramson’s definition, or the ones discussed in Sections 7 and 9), then in general there is no canonical isomorphism between the spaces of the solutions on different cuts, even if the solution spaces, as abstract vector spaces, are isomorphic. Therefore, the angular momenta on two different cuts belong to different vector spaces, and, without any natural correspondence between the solution spaces on the different cuts, it is meaningless to speak about the difference of the angular momenta. Thus, we cannot say anything about, e.g., the angular momentum carried away by gravitational radiation between two retarded time instants represented by two different cuts.

One possible resolution of this difficulty was suggested by Helfer [264]. He followed the twistorial approach presented in Section 7 and used a special bijective map between the two-surface twistor spaces on different cuts. His map is based on the special structures available only at null infinity. Though this map is nonlinear, it is shown that the angular momenta on the different cuts can indeed be compared. Another suggestion for (only) the spatial angular momentum was given in [501]. This is based on the quasi-local Hamiltonian analysis that is discussed in Section 11.1, and the use of the divergence-free vector fields built from the eigenspinors with the smallest eigenvalue of the two-surface Dirac operators. The angular momenta, defined in these ways on different cuts, can also be compared. We give a slightly more detailed discussion of them in Sections 7.2 and 11.1, respectively.

The main idea behind the recent definition of the total angular momentum at future null infinity of Kozameh, Newman and Silva-Ortigoza, suggested in [325, 326], is analogous to finding the center-of-charge (i.e., the time-dependent position vector with respect to which the electric dipole moment is vanishing) in flat-space electromagnetism: By requiring that the dipole part of an appropriate null rotated Weyl tensor component \(\psi _1^0\) be vanishing, a preferred set of origins, namely a (complex) center-of-mass line can be found in the four-complex-dimensional solution space of the good-cut equation (the H-space). Then the asymptotic Bianchi identities take the form of conservation equations, and certain terms in these can (in the given approximation) be identified with angular momentum. The resulting expression is just Eq. (4.15), to which all the other reasonable angular momentum expressions are expected to reduce in stationary spacetimes. A slightly more detailed discussion of the necessary technical background is given in Section 4.2.4.

3.3 The necessity of quasi-locality for observables in general relativity

3.3.1 Nonlocality of the gravitational energy-momentum and angular momentum

One reaction to the nontensorial nature of the gravitational energy-momentum density expressions was to consider the whole problem ill defined and the gravitational energy-momentum meaningless. However, the successes discussed in Section 3.2 show that the global gravitational energy-momenta and angular momenta are useful notions, and hence, it could also be useful to introduce them even if the spacetime is not asymptotically flat. Furthermore, the nontensorial nature of an object does not imply that it is meaningless. For example, the Christoffel symbols are not tensorial, but they do have geometric, and hence physical content, namely the linear connection. Indeed, the connection is a nonlocal geometric object, connecting the fibers of the vector bundle over different points of the base manifold. Hence, any expression of the connection coefficients, in particular the gravitational energy-momentum or angular momentum, must also be nonlocal. In fact, although the connection coefficients at a given point can be taken to zero by an appropriate coordinate/gauge transformation, they cannot be transformed to zero on an open domain unless the connection is flat.

Furthermore, the superpotentials of many of the classical pseudotensors (e.g., of the Einstein, Bergmann, Møller’s tetrad, and Landau-Lifshitz pseudotensors), being linear in the connection coefficients, can be recovered as the pullbacks to the spacetime manifold of various forms of a single geometric object on the linear frame bundle, namely of the Nester-Witten 2-form, along various local cross sections [192, 358, 486, 487], and the expressions of the pseudotensors by their superpotentials are the pullbacks of the Sparling equation [476, 175, 358]. In addition, Chang, Nester, and Chen [131] found a natural quasi-local Hamiltonian interpretation of each of the pseudotensorial expressions in the metric formulation of the theory (see Section 11.3.5). Therefore, the pseudotensors appear to have been ‘rehabilitated’, and the gravitational energy-momentum and angular momentum are necessarily associated with extended subsets of the spacetime.

This fact is a particular consequence of a more general phenomenon [76, 439, 284]: Since (in the absence of any non-dynamical geometric background) the physical spacetime is the isomorphism class of the pairs \((M,{g_{ab}})\) (instead of a single such pair), it is meaningless to speak about the ‘value of a scalar or vector field at a point \(p \in M\)’. What could have meaning are the quantities associated with curves (the length of a curve, or the holonomy along a closed curve), two-surfaces (e.g., the area of a closed two-surface) etc. determined by some body or physical fields. In addition, as Torre showed [523] (see also [524]), in spatially-closed vacuum spacetimes there can be no nontrivial observable, built as spatial integrals of local functions of the canonical variables and their finitely many derivatives. Thus, if we want to associate energy-momentum and angular momentum not only to the whole (necessarily asymptotically flat) spacetime, then these quantities must be associated with extended but finite subsets of the spacetime, i.e., must be quasi-local.

The results of Friedrich and Nagy [202] show that under appropriate boundary conditions the initial boundary value problem for the vacuum Einstein equations, written into a first-order symmetric hyperbolic form, has a unique solution. Thus, there is a solid mathematical basis for the investigations of the evolution of subsystems of the universe, and hence, it is natural to ask about the observables, and in particular the conserved quantities, of their dynamics.

3.3.2 Domains for quasi-local quantities

The quasi-local quantities (usually the integral of some local expression of the field variables) are associated with a certain type of subset of spacetime. In four dimensions there are three natural candidates:

  1. the globally hyperbolic domains \(D \subset M\) with compact closure,

  2. the compact spacelike (in fact, acausal) hypersurfaces Σ with boundary (interpreted as Cauchy surfaces for globally hyperbolic domains D), and

  3. the closed, orientable spacelike two-surfaces \({\mathcal S}\) (interpreted as the boundary ∂Σ of Cauchy surfaces for globally hyperbolic domains).

A typical example of type 3 is any charge integral expression: The quasi-local quantity is the integral of some superpotential 2-form built from the data given on the two-surface, as in Eq. (3.10), or the expression \({Q_{\mathcal S}}[{\bf{K}}]\) for the matter fields given by (2.5). An example of type 2 might be the integral of the Bel-Robinson ‘momentum’ on the hypersurface Σ:

$${E_\Sigma}[{\xi ^a}]: = \int\nolimits_\Sigma {{\xi ^d}{T_{de\,f\,g}}{t^e}{t^f}{\textstyle{1 \over {3!}}}{\varepsilon ^g}_{abc}}.$$
(3.17)

This quantity is analogous to the integral \({E_\Sigma}[{\xi ^a}]\) for the matter fields given by Eq. (2.6) (though, by the remarks on the Bel-Robinson ‘energy’ in Section 3.1.2, its physical dimension cannot be of energy). If \({\xi ^a}\) is a future-pointing nonspacelike vector then \({E_\Sigma}[{\xi ^a}] \geq 0\). Obviously, if such a quantity were independent of the actual hypersurface Σ, then it could also be rewritten as a charge integral on the boundary ∂Σ. The gravitational Hamiltonian provides an interesting example for the mixture of type 2 and 3 expressions, because the form of the Hamiltonian is the three-surface integral of the constraints on Σ and a charge integral on its boundary ∂Σ, and thus, if the constraints are satisfied then the Hamiltonian reduces to a charge integral. Finally, an example of type 1 might be

$${E_D}: = \inf \;\{{E_\Sigma}[{\bf{t}}]\vert \Sigma \;{\rm{is}}\;{\rm{a}}\;{\rm{Cauchy}}\;{\rm{surface}}\;{\rm{for}}\;D\},$$
(3.18)

the infimum of the ‘quasi-local Bel-Robinson energies’, where the infimum is taken on the set of all the Cauchy surfaces Σ for D with given boundary ∂Σ. (The infimum always exists because the Bel-Robinson ‘energy density’ \({T_{abcd}}{t^a}{t^b}{t^c}{t^d}\) is non-negative.) Quasi-locality in any of these three senses is compatible with the quasi-locality of Haag and Kastler [231, 232]. The specific quasi-local energy-momentum constructions provide further examples both for charge-integral-type expressions and for those based on spacelike hypersurfaces.

3.3.3 Strategies to construct quasi-local quantities

There are two natural ways of finding the quasi-local energy-momentum and angular momentum. The first is to follow some systematic procedure, while the second is the ‘quasi-localization’ of the global energy-momentum and angular momentum expressions. One of the two systematic procedures could be called the Lagrangian approach: The quasi-local quantities are integrals of some superpotential derived from the Lagrangian via a Noether-type analysis. The advantage of this approach could be its manifest Lorentz-covariance. On the other hand, since the Noether current is determined only through the Noether identity, which contains only the divergence of the current itself, the Noether current and its superpotential are not uniquely determined. In addition (as in any approach), a gauge reduction (for example in the form of a background metric or reference configuration) and a choice for the ‘translations’ and ‘boost-rotations’ should be made.

The other systematic procedure might be called the Hamiltonian approach: At the end of a fully quasi-local (covariant or not) Hamiltonian analysis we would have a Hamiltonian, and its value on the constraint surface in the phase space yields the expected quantities. Here one of the main ideas is that of Regge and Teitelboim [433], that the Hamiltonian must reproduce the correct field equations as the flows of the Hamiltonian vector fields, and hence, in particular, the correct Hamiltonian must be functionally differentiable with respect to the canonical variables. This differentiability may restrict the possible ‘translations’ and ‘boost-rotations’ too. Another idea is the expectation, based on the study of the quasi-local Hamiltonian dynamics of a single scalar field, that the boundary terms appearing in the calculation of the Poisson brackets of two Hamiltonians (the ‘Poisson boundary terms’), represent the infinitesimal flow of energy-momentum and angular momentum between the physical system and the rest of the universe [502]. Therefore, these boundary terms must be gauge invariant in every sense. This requirement restricts the potential boundary terms in the Hamiltonian as well as the boundary conditions for the canonical variables and the lapse and shift. However, if we are not interested in the structure of the quasi-local phase space, then, as a short cut, we can use the Hamilton-Jacobi method to define the quasi-local quantities. The resulting expression is a two-surface integral. Nevertheless, just as in the Lagrangian approach, this general expression is not uniquely determined, because the action can be modified by adding an (almost freely chosen) boundary term to it. Furthermore, the ‘translations’ and ‘boost-rotations’ are still to be specified.

On the other hand, at least from a pragmatic point of view, the most natural strategy to introduce the quasi-local quantities would be some ‘quasi-localization’ of those expressions that gave the global energy-momentum and angular momentum of asymptotically flat spacetimes. Therefore, respecting both strategies, it is also legitimate to consider the Winicour-Tamburino-type (linkage) integrals and the charge integrals of the curvature.

The global energy-momentum and angular momentum of asymptotically flat spacetimes can be written as two-surface integrals at infinity. Moreover, as we saw in Section 3.1.1, the mass of the source in Newtonian theory, and, as we will see in Section 7.1.1, both the energy-momentum and angular momentum of the source in the linearized Einstein theory can also be written as two-surface integrals. Hence, the two-surface observables can be expected to have special significance. Thus, to summarize, if we want to define reasonable quasi-local energy-momentum and angular momentum as two-surface observables, then three things must be specified:

  1. an appropriate general two-surface integral (e.g., in the Lagrangian approaches the integral of a superpotential 2-form, or in the Hamiltonian approaches a boundary term together with the boundary conditions for the canonical variables),

  2. a gauge choice (in the form of a distinguished coordinate system in the pseudotensorial approaches, or a background metric/connection in the background field approaches or a distinguished tetrad field in the tetrad approach), and

  3. a definition for the ‘quasi-symmetries’ of the two-surface (i.e., the ‘generator vector fields’ of the quasi-local quantities in the Lagrangian, and the lapse and the shift in the Hamiltonian approaches, respectively, which, in the case of timelike ‘generator vector fields’, can also be interpreted as a fleet of observers on the two-surface).

In certain approaches the definition of the ‘quasi-symmetries’ is linked to the gauge choice, for example by using the Killing symmetries of the flat background metric.

4 Tools to Construct and Analyze Quasi-Local Quantities

Having accepted that the gravitational energy-momentum and angular momentum should be introduced at the quasi-local level, we next need to discuss the special tools and concepts that are needed in practice to construct (or even to understand) the various special quasi-local expressions. Thus, first, in Section 4.1 we review the geometry of closed spacelike two-surfaces, with special emphasis on two-surface data. Then, in Sections 4.2 and 4.3, we discuss the special situations where there is a more-or-less generally accepted ‘standard’ definition for the energy-momentum (or at least for the mass) and angular momentum. In these situations any reasonable quasi-local quantity should reduce to them.

4.1 The geometry of spacelike two-surfaces

The first systematic study of the geometry of spacelike two-surfaces from the point of view of quasi-local quantities is probably due to Tod [514, 519]. Essentially, his approach is based on the Geroch-Held-Penrose (GHP) formalism [209]. Although this is a very effective and flexible formalism [209, 425, 426, 277, 479], its form is not spacetime covariant. Since in many cases the covariance of a formalism itself already gives some hint as to how to treat and solve the problem at hand, we concentrate here mainly on a spacetime-covariant description of the geometry of the spacelike two-surfaces, developed gradually in [489, 491, 492, 493, 198, 500]. The emphasis will be on the geometric structures rather than the technicalities. In the last paragraph, we comment on certain objects appearing in connection with families of spacelike two-surfaces. Our standard differential geometric reference is [318, 319].

4.1.1 The Lorentzian vector bundle

The restriction \({{\rm{V}}^a}({\mathcal S})\) to the closed, orientable spacelike two-surface \({\mathcal S}\) of the tangent bundle TM of the spacetime has a unique decomposition to the \({g_{ab}}\)-orthogonal sum of the tangent bundle \(T{\mathcal S}\) of \({\mathcal S}\) and the bundle of the normals, denoted by \(N{\mathcal S}\). Then, all the geometric structures of the spacetime (metric, connection, curvature) can be decomposed in this way. If \({t^a}\) and \({\upsilon ^a}\) are timelike and spacelike unit normals, respectively, being orthogonal to each other, then the projections to \(T{\mathcal S}\) and \(N{\mathcal S}\) are \(\Pi _b^a: = \delta _b^a - {t^a}{t_b} + {\upsilon ^a}{\upsilon _b}\) and \(O_b^a: = \delta _b^a - \Pi _b^a\), respectively. The induced two-metric and the corresponding area 2-form on \({\mathcal S}\) will be denoted by \({q_{ab}} = {g_{ab}} - {t_a}{t_b} + {\upsilon _a}{\upsilon _b}\) and \({\varepsilon _{ab}} = {t^c}{\upsilon ^d}{\varepsilon _{cdab}}\), respectively, while the area 2-form on the normal bundle will be \({}^ \bot {\varepsilon _{ab}} = {t_a}{\upsilon _b} - {t_b}{\upsilon _a}\). The bundle \({{\rm{V}}^a}({\mathcal S})\) together with the fiber metric \({g_{ab}}\) and the projection \(\Pi _b^a\) will be called the Lorentzian vector bundle over \({\mathcal S}\). For the discussion of the global topological properties of the closed orientable two-manifolds, see, e.g., [10, 500].
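The algebra of the two projections can be checked on the simplest possible example. The following sketch assumes Minkowski spacetime with signature (+, −, −, −) and a two-plane spanned by the y and z directions, so that the normals point along the t and x axes; the helper names are illustrative only. It builds the projections to the tangent and normal bundles and the induced two-metric from the definitions above.

```python
# Minkowski metric, signature (+,-,-,-); an assumption made for illustration
ETA = [[1.0, 0.0, 0.0, 0.0],
       [0.0, -1.0, 0.0, 0.0],
       [0.0, 0.0, -1.0, 0.0],
       [0.0, 0.0, 0.0, -1.0]]


def lower(vec):
    """Lower the index of a vector with the Minkowski metric."""
    return [sum(ETA[a][b] * vec[b] for b in range(4)) for a in range(4)]


def projectors(t, u):
    """Pi^a_b = delta^a_b - t^a t_b + u^a u_b and O^a_b = delta^a_b - Pi^a_b."""
    t_low, u_low = lower(t), lower(u)
    pi = [[(1.0 if a == b else 0.0) - t[a] * t_low[b] + u[a] * u_low[b]
           for b in range(4)] for a in range(4)]
    o = [[(1.0 if a == b else 0.0) - pi[a][b] for b in range(4)]
         for a in range(4)]
    return pi, o


def induced_metric(t, u):
    """q_ab = g_ab - t_a t_b + u_a u_b."""
    t_low, u_low = lower(t), lower(u)
    return [[ETA[a][b] - t_low[a] * t_low[b] + u_low[a] * u_low[b]
             for b in range(4)] for a in range(4)]


def matmul(m1, m2):
    return [[sum(m1[a][c] * m2[c][b] for c in range(4)) for b in range(4)]
            for a in range(4)]


def apply(m, vec):
    return [sum(m[a][b] * vec[b] for b in range(4)) for a in range(4)]


T = [1.0, 0.0, 0.0, 0.0]   # timelike unit normal
U = [0.0, 1.0, 0.0, 0.0]   # spacelike unit normal

if __name__ == "__main__":
    pi, o = projectors(T, U)
    print("Pi is idempotent:", matmul(pi, pi) == pi)
    print("Pi annihilates the normals:", apply(pi, T), apply(pi, U))
    print("trace of Pi:", sum(pi[a][a] for a in range(4)))
    print("q_ab:", induced_metric(T, U))
```

With this signature the induced two-metric is negative definite on the tangent directions, and the trace of the tangential projection is the dimension of the surface.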

4.1.2 Connections

The spacetime covariant derivative operator \({\nabla _e}\) defines two connections on \({{\rm{V}}^a}({\mathcal S})\). The first covariant derivative, denoted by \({\delta _e}\), is analogous to the induced (intrinsic) covariant derivative on (one-codimensional) hypersurfaces: \({\delta _e}{X^a}: = \Pi _b^a\Pi _e^f{\nabla _f}(\Pi _c^b{X^c}) + O_b^a\Pi _e^f{\nabla _f}(O_c^b{X^c})\) for any section \({X^a}\) of \({{\rm{V}}^a}({\mathcal S})\). Obviously, \({\delta _e}\) annihilates both the fiber metric \({g_{ab}}\) and the projection \(\Pi _b^a\). However, since for two-surfaces in four dimensions the normal is not uniquely determined, we have the ‘boost gauge freedom’ \({t^a} \mapsto {t^a}\cosh u + {\upsilon ^a}\sinh u\), \({\upsilon ^a} \mapsto {t^a}\sinh u + {\upsilon ^a}\cosh u\). The induced connection will have a nontrivial part on the normal bundle, too. The corresponding (normal part of the) connection one-form on \({\mathcal S}\) can be characterized, for example, by \({A_e}: = \Pi _e^f({\nabla _f}{t_a}){\upsilon ^a}\). Therefore, the connection \({\delta _e}\) can be considered as a connection on \({{\rm{V}}^a}({\mathcal S})\) coming from a connection on the O(2) ⊗ O(1, 1)-principal bundle of the \({g_{ab}}\)-orthonormal frames adapted to \({\mathcal S}\).
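The boost gauge freedom can be illustrated numerically. The sketch below assumes Minkowski spacetime with signature (+, −, −, −) and normals in the t-x plane, and uses illustrative helper names. It applies a boost of the normals with an arbitrarily chosen rapidity and compares the tangential projection and the area 2-form of the normal bundle before and after; both are built from the normals, but neither depends on the boost gauge.

```python
import math

ETA_DIAG = [1.0, -1.0, -1.0, -1.0]   # signature (+,-,-,-), assumed


def lower(vec):
    return [ETA_DIAG[a] * vec[a] for a in range(4)]


def inner(v, w):
    return sum(ETA_DIAG[a] * v[a] * w[a] for a in range(4))


def boost_normals(t, u, rapidity):
    """t -> t cosh(u) + v sinh(u), v -> t sinh(u) + v cosh(u)."""
    ch, sh = math.cosh(rapidity), math.sinh(rapidity)
    t_new = [ch * t[a] + sh * u[a] for a in range(4)]
    u_new = [sh * t[a] + ch * u[a] for a in range(4)]
    return t_new, u_new


def tangential_projector(t, u):
    """Pi^a_b = delta^a_b - t^a t_b + u^a u_b."""
    t_low, u_low = lower(t), lower(u)
    return [[(1.0 if a == b else 0.0) - t[a] * t_low[b] + u[a] * u_low[b]
             for b in range(4)] for a in range(4)]


def normal_area_form(t, u):
    """Area 2-form of the normal bundle: t_a u_b - t_b u_a."""
    t_low, u_low = lower(t), lower(u)
    return [[t_low[a] * u_low[b] - t_low[b] * u_low[a] for b in range(4)]
            for a in range(4)]


def max_abs_difference(m1, m2):
    return max(abs(m1[a][b] - m2[a][b]) for a in range(4) for b in range(4))


T = [1.0, 0.0, 0.0, 0.0]
U = [0.0, 1.0, 0.0, 0.0]

if __name__ == "__main__":
    t2, u2 = boost_normals(T, U, 0.7)
    print("norms after boost:", inner(t2, t2), inner(u2, u2), inner(t2, u2))
    print("change of Pi:",
          max_abs_difference(tangential_projector(T, U),
                             tangential_projector(t2, u2)))
    print("change of normal area form:",
          max_abs_difference(normal_area_form(T, U), normal_area_form(t2, u2)))
```

The individual normals, and hence any quantity built from only one of them, do change under the boost; it is only the combinations above that are invariant.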

The other connection, \({\Delta _e}\), is analogous to the Sen connection [447], and is defined simply by \({\Delta _e}{X^a}: = \Pi _e^f{\nabla _f}{X^a}\). This annihilates only the fiber metric, but not the projection. The difference of the connections \({\Delta _e}\) and \({\delta _e}\) turns out to be just the extrinsic curvature tensor: \({\Delta _e}{X^a} = {\delta _e}{X^a} + {Q^a}_{eb}{X^b} - {X^b}{Q_{be}}^a\). Here \({Q^a}_{eb}: = - \Pi _c^a{\Delta _e}\Pi _b^c = {\tau ^a}_e{t_b} - {\nu ^a}_e{\upsilon _b}\), and \({\tau _{ab}}: = \Pi _a^c\Pi _b^d{\nabla _c}{t_d}\) and \({\nu _{ab}}: = \Pi _a^c\Pi _b^d{\nabla _c}{\upsilon _d}\) are the standard (symmetric) extrinsic curvatures corresponding to the individual normals \({t^a}\) and \({\upsilon ^a}\), respectively. The familiar expansion tensors of the future-pointing outgoing and ingoing null normals, \({l^a}: = {t^a} + {\upsilon ^a}\) and \({n^a}: = {1 \over 2}({t^a} - {\upsilon ^a})\), respectively, are \({\theta _{ab}} = {Q_{abc}}{l^c}\) and \(\theta _{ab}^\prime = {Q_{abc}}{n^c}\), and the corresponding shear tensors \({\sigma _{ab}}\) and \(\sigma _{ab}^\prime\) are defined by their trace-free parts. Obviously, \({\tau _{ab}}\) and \({\nu _{ab}}\) (and hence the expansion and shear tensors \({\theta _{ab}}\), \(\theta _{ab}^\prime\), \({\sigma _{ab}}\), and \(\sigma _{ab}^\prime\)) are boost-gauge-dependent quantities (and it is straightforward to derive their transformation from the definitions), but their combination \({Q^a}_{eb}\) is boost-gauge invariant. In particular, it defines a natural normal vector field to \({\mathcal S}\) as \({Q_b}: = {Q^a}_{ab} = \tau {t_b} - \nu {\upsilon _b} = {\theta ^\prime}{l_b} + \theta {n_b}\), where τ, ν, θ, and θ′ are the relevant traces. \({Q_a}\) is called the mean extrinsic curvature vector of \({\mathcal S}\). If \({{\tilde Q}_a}: = {}^ \bot {\varepsilon _{ab}}{Q^b} = \nu {t_a} - \tau {\upsilon _a} = - {\theta ^\prime}{l_a} + \theta {n_a}\), called the dual mean curvature vector, then the norm of \({Q_a}\) and \({{\tilde Q}_a}\) is \({Q_a}{Q_b}{g^{ab}} = - {{\tilde Q}_a}{{\tilde Q}_b}{g^{ab}} = {\tau ^2} - {\nu ^2} = 2\theta {\theta ^\prime}\), and they are orthogonal to each other: \({Q_a}{{\tilde Q}_b}{g^{ab}} = 0\).
It is easy to show that \({\Delta _a}{{\tilde Q}^a} = 0\), i.e., \({{\tilde Q}^a}\) is the uniquely pointwise-determined direction orthogonal to the two-surface in which the expansion of the surface is vanishing. If \({Q_a}\) is not null, then \(\{{Q_a},{{\tilde Q}_a}\}\) defines an orthonormal frame in the normal bundle (see, e.g., [14]). If \({Q_a}\) is nonzero, but (e.g., future-pointing) null, then there is a uniquely determined null normal \({S_a}\) to \({\mathcal S}\), such that \({Q_a}{S^a} = 1\), and hence, \(\{{Q_a},{S_a}\}\) is a uniquely determined null frame. Therefore, the two-surface admits a natural gauge choice in the normal bundle, unless \({Q_a}\) is vanishing. Geometrically, \({\Delta _e}\) is a connection coming from a connection on the O(1, 3)-principal fiber bundle of the \({g_{ab}}\)-orthonormal frames. The curvatures of the connections \({\delta _e}\) and \({\Delta _e}\), respectively, are

$${f^a}_{bcd} = {- ^ \bot}{\varepsilon ^a}_b({\delta _c}{A_d} - {\delta _d}{A_c}) + {\textstyle{1 \over 2}}{}^{\mathcal S}R(\Pi _c^a{q_{bd}} - \Pi _d^a{q_{bc}}),$$
(4.1)
$$\begin{array}{*{20}c} {{F^a}_{bcd} = {f^a}_{bcd} - {\delta _c}({Q^a}_{db} - {Q_{bd}}^a) + {\delta _d}({Q^a}_{cb} - {Q_{bc}}^a) +} \\ {\;\;\;\;\;\;\;\;\;\;\;\;\;\;\; + {Q^a}_{ce}{Q_{bd}}^e + {Q_{ec}}^a{Q^e}_{db} - {Q^a}_{de}{Q_{bc}}^e - {Q_{ed}}^a{Q^e}_{cb},} \\ \end{array}$$
(4.2)

where \(^{\mathcal S}R\) is the curvature scalar of the familiar intrinsic Levi-Civita connection of \(({\mathcal S},{q_{ab}})\). The curvature of \({\Delta _e}\) is just the pullback to \({\mathcal S}\) of the spacetime curvature 2-form: \({F^a}_{bcd} = {R^a}_{bef}\Pi _c^e\Pi _d^f\). Therefore, the well-known Gauss, Codazzi-Mainardi, and Ricci equations for the embedding of \({\mathcal S}\) in M are just the various projections of Eq. (4.2).
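The algebraic relations between the mean curvature vector, its dual, and the null expansions in Section 4.1.2 can be verified with a few lines of arithmetic. The sketch below works only in the two-dimensional normal plane, with vector components in the basis of the unit normals and the metric diag(+1, −1). The relations θ = τ + ν and θ′ = (τ − ν)/2 used in it are our own reading of the definitions of the expansion tensors with the normalization of the null normals given there, and the function names are illustrative.

```python
def dot(a, b):
    """Inner product in the normal plane; basis {t, v}, metric diag(+1, -1)."""
    return a[0] * b[0] - a[1] * b[1]


def mean_curvature_vectors(tau, nu):
    """Components of Q = tau t - nu v and of its dual, nu t - tau v."""
    q = (tau, -nu)
    q_dual = (nu, -tau)
    return q, q_dual


def null_expansions(tau, nu):
    """theta and theta' for l = t + v, n = (t - v)/2 (assumed normalization)."""
    return tau + nu, 0.5 * (tau - nu)


def from_null_frame(theta, theta_prime):
    """Rebuild Q = theta' l + theta n and its dual, -theta' l + theta n."""
    l = (1.0, 1.0)
    n = (0.5, -0.5)
    q = tuple(theta_prime * l[i] + theta * n[i] for i in range(2))
    q_dual = tuple(-theta_prime * l[i] + theta * n[i] for i in range(2))
    return q, q_dual


if __name__ == "__main__":
    tau, nu = 3.0, 1.0
    q, q_dual = mean_curvature_vectors(tau, nu)
    theta, theta_prime = null_expansions(tau, nu)
    print("Q.Q =", dot(q, q), " 2 theta theta' =", 2 * theta * theta_prime)
    print("dual norm =", dot(q_dual, q_dual), " Q.Q_dual =", dot(q, q_dual))
    print("null-frame form:", from_null_frame(theta, theta_prime))
```

The two vectors have norms of equal magnitude and opposite sign, so exactly one of them is timelike whenever the product of the two expansions is nonzero.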

4.1.3 Embeddings and convexity conditions

To prove certain statements about quasi-local quantities, various forms of the convexity of \({\mathcal S}\) must be assumed. The convexity of \({\mathcal S}\) in a three-geometry is defined by the positive definiteness of its extrinsic curvature tensor. If, in addition, the three-geometry is flat, then by the Gauss equation this is equivalent to the positivity of the scalar curvature of the intrinsic metric of \({\mathcal S}\). It is this convexity condition that appears in the solution of the Weyl problem of differential geometry [397]: if \(({S^2},{q_{ab}})\) is a \({C^4}\) Riemannian two-manifold with positive scalar curvature, then this can be isometrically embedded (i.e., realized as a closed convex two-surface) in the Euclidean three-space \({{\mathbb R}^3}\), and this embedding is unique up to rigid motions [477]. However, there are counterexamples even to local isometric embeddability, when the convexity condition, i.e., the positivity of the scalar curvature, is violated [373]. We continue the discussion of this embedding problem in Section 10.1.6.

In the context of general relativity the isometric embedding of a closed orientable two-surface into the Minkowski spacetime \({{\mathbb R}^{1,3}}\) is perhaps more interesting. However, even a naïve function counting shows that if such an embedding exists then it is not unique. An existence theorem for such an embedding, \(i:{\mathcal S} \rightarrow {{\mathbb R}^{1,3}}\), (with \({S^2}\) topology) was given by Wang and Yau [543], and they controlled these isometric embeddings in terms of a single function τ on the two-surface. This function is just \({x^{\underline a}}{T_{\underline a}}\), the ‘time function’ of the surface in the Cartesian coordinates of the Minkowski space in the direction of a constant unit timelike vector field \({T_{\underline a}}\). Interestingly enough, \(({\mathcal S},{q_{ab}})\) need not have positive scalar curvature; only the sum of the scalar curvature and a positive definite expression of the derivative \({\delta _e}\tau\) is required to be positive. This condition is just the requirement that the surface must have a convex ‘shadow’ in the direction \({T^{\underline a}}\), i.e., the scalar curvature of the projection of the two-surface \(i({\mathcal S}) \subset {{\mathbb R}^{1,3}}\) to the spacelike hyperplane orthogonal to \({T^{\underline a}}\) is positive. The Laplacian \({\delta _e}{\delta ^e}\tau\) of the ‘time function’ gives the mean curvature vector of \(i({\mathcal S})\) in \({{\mathbb R}^{1,3}}\) in the direction \({T^{\underline a}}\).

If \({\mathcal S}\) is in a Lorentzian spacetime, then the weakest convexity conditions are conditions only on the mean null curvatures: \({\mathcal S}\) will be called weakly future convex if the outgoing null normals \({l^a}\) are expanding on \({\mathcal S}\), i.e., \(\theta : = {q^{ab}}{\theta _{ab}} > 0\), and weakly past convex if \({\theta ^\prime}: = {q^{ab}}\theta _{ab}^\prime < 0\) [519]. \({\mathcal S}\) is called mean convex [247] if θθ′ < 0 on \({\mathcal S}\), or, equivalently, if \({{\tilde Q}_a}\) is timelike. To formulate stronger convexity conditions we must consider the determinant of the null expansions \(D: = \det \Vert {\theta ^a}_b\Vert \, = \,{1 \over 2}({\theta _{ab}}{\theta _{cd}} - {\theta _{ac}}{\theta _{bd}}){q^{ab}}{q^{cd}}\) and \({D^\prime}: = \det \Vert {\theta ^{\prime a}}_b\Vert \, = \,{1 \over 2}(\theta _{ab}^\prime \theta _{cd}^\prime - \theta _{ac}^\prime \theta _{bd}^\prime){q^{ab}}{q^{cd}}\). Note that, although the expansion tensors, and in particular the functions θ, θ′, D, and D′ are boost-gauge-dependent, their sign is gauge invariant. Then \({\mathcal S}\) will be called future convex if θ > 0 and D > 0, and past convex if θ′ < 0 and D′ > 0 [519, 492]. These are equivalent to the requirement that the two eigenvalues of \({\theta ^a}_b\) be positive and those of \({\theta ^{\prime a}}_b\) be negative everywhere on \({\mathcal S}\), respectively. A different kind of convexity condition, based on global concepts, will be used in Section 6.1.3.
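The equivalence between the conditions θ > 0, D > 0 and the positivity of both eigenvalues can be seen in a pointwise sketch. It assumes that the mixed components of the outgoing expansion tensor at a point are given as a symmetric 2×2 matrix with respect to some orthonormal frame of the surface; the sample matrices and function names are hypothetical.

```python
import math


def trace_and_determinant(theta):
    """Return (theta, D) with D = (1/2)(trace^2 - trace of the square)."""
    trace = theta[0][0] + theta[1][1]
    trace_of_square = sum(theta[a][b] * theta[b][a]
                          for a in range(2) for b in range(2))
    return trace, 0.5 * (trace * trace - trace_of_square)


def eigenvalues(theta):
    """Eigenvalues of a symmetric 2x2 matrix from its trace and determinant."""
    trace, det = trace_and_determinant(theta)
    root = math.sqrt(max(trace * trace - 4.0 * det, 0.0))
    return 0.5 * (trace - root), 0.5 * (trace + root)


def is_weakly_future_convex(theta):
    return trace_and_determinant(theta)[0] > 0.0


def is_future_convex(theta):
    trace, det = trace_and_determinant(theta)
    return trace > 0.0 and det > 0.0


if __name__ == "__main__":
    samples = {
        "both directions expanding": [[2.0, 0.5], [0.5, 1.0]],
        "one direction contracting": [[2.0, 0.0], [0.0, -1.0]],
    }
    for name, theta in samples.items():
        print(name, trace_and_determinant(theta), eigenvalues(theta),
              is_weakly_future_convex(theta), is_future_convex(theta))
```

The second sample is weakly future convex (its mean expansion is positive) but not future convex, which shows that the condition on D is a genuinely stronger requirement.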

4.1.4 The spinor bundle

The connections \({\delta _e}\) and \({\Delta _e}\) determine connections on the pullback \({{\rm{S}}^A}({\mathcal S})\) to \({\mathcal S}\) of the bundle of unprimed spinors. The natural decomposition \({{\rm{V}}^a}({\mathcal S}) = T{\mathcal S} \oplus N{\mathcal S}\) defines a chirality on the spinor bundle \({{\rm{S}}^A}({\mathcal S})\) in the form of the spinor \({\gamma ^A}_B: = 2{t^{A{A{\prime}}}}{\upsilon _{B{A{\prime}}}}\), which is analogous to the \({\gamma _5}\) matrix in the theory of Dirac spinors. Then, the extrinsic curvature tensor above is a simple expression of \({Q^A}_{eB}: = {1 \over 2}({\Delta _e}{\gamma ^A}_C){\gamma ^C}_B\) and \({\gamma ^A}_B\) (and their complex conjugate), and the two covariant derivatives on \({{\rm{S}}^A}({\mathcal S})\) are related to each other by \({\Delta _e}{\lambda ^A} = {\delta _e}{\lambda ^A} + {Q^A}_{eB}{\lambda ^B}\). The curvature \({F^A}_{Bcd}\) of \({\Delta _e}\) can be expressed by the curvature \({f^A}_{Bcd}\) of \({\delta _e}\), the spinor \({Q^A}_{eB}\), and its \({\delta _e}\)-derivative. We can form the scalar invariants of the curvatures according to

$$f: = {f_{abcd}}{1 \over 2}({\varepsilon ^{ab}} - {{\rm{i}}^ \bot}{\varepsilon ^{ab}})\;{\varepsilon ^{cd}} = {\rm{i}}{\gamma ^A}_B{f^B}_{Acd}{\varepsilon ^{cd}}{= ^{\mathcal S}}R - 2{\rm{i}}{\delta _c}({\varepsilon ^{cd}}{A_d}),$$
(4.3)
$$F: = {F_{abcd}}{1 \over 2}({\varepsilon ^{ab}} - {{\rm{i}}^ \bot}{\varepsilon ^{ab}}){\varepsilon ^{cd}} = {\rm{i}}{\gamma ^A}_B{F^B}_{Acd}{\varepsilon ^{cd}} = f + \theta \theta {\prime}- 2{\sigma {\prime}_{ea}}{\sigma ^e}_b({q^{ab}} + {\rm{i}}{\varepsilon ^{ab}}).$$
(4.4)

f is four times the complex Gauss curvature [425] of \({\mathcal S}\), by means of which the whole curvature \({f^A}_{Bcd}\) can be characterized: \({f^A}_{Bcd} = - {{\rm{i}} \over 4}f{\gamma ^A}_B{\varepsilon _{cd}}\). If the spacetime is space and time orientable, at least on an open neighborhood of \({\mathcal S}\), then the normals \({t_a}\) and \({\upsilon _a}\) can be chosen to be globally well defined, and hence, \(N{\mathcal S}\) is globally trivializable and the imaginary part of f is a total divergence of a globally well-defined vector field.

An interesting decomposition of the SO(1, 1) connection one-form \({A_e}\), i.e., the vertical part of the connection \({\delta _e}\), was given by Liu and Yau [338]: There are real functions α and γ, unique up to additive constants, such that \({A_e} = {\varepsilon _e}^f{\delta _f}\alpha + {\delta _e}\gamma\). α is globally defined on \({\mathcal S}\), but in general γ is defined only on the local trivialization domains of \(N{\mathcal S}\) that are homeomorphic to \({{\mathbb R}^2}\). It is globally defined if \({H^1}({\mathcal S}) = 0\). In this decomposition α is the boost-gauge-invariant part of \({A_e}\), while γ represents its gauge content. Since \({\delta _e}{A^e} = {\delta _e}{\delta ^e}\gamma\), the ‘Coulomb-gauge condition’ \({\delta _e}{A^e} = 0\) uniquely fixes \({A_e}\) (see also Section 10.4.1).

By the Gauss-Bonnet theorem one has \(\oint\nolimits_{\mathcal S} {f\,d{\mathcal S}} = \oint\nolimits_{\mathcal S} {{}^{\mathcal S}R\,d{\mathcal S}} = 8\pi (1 - g)\), where g is the genus of \({\mathcal S}\). Thus, geometrically the connection \({\delta _e}\) is rather poor, and can be considered as a part of the ‘universal structure of \({\mathcal S}\)’. On the other hand, the connection \({\Delta _e}\) is much richer, and, in particular, the invariant F carries information on the mass aspect of the gravitational ‘field’. The two-surface data for charge-type quasi-local quantities (i.e., for two-surface observables) are the universal structure (i.e., the intrinsic metric \({q_{ab}}\), the projection \(\Pi _b^a\) and the connection \({\delta _e}\)) and the extrinsic curvature tensor \({Q^a}_{eb}\).
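The Gauss-Bonnet statement can be checked numerically on the two simplest topologies. The sketch below assumes the standard round sphere and the standard torus of revolution embedded in Euclidean three-space, takes the scalar curvature to be twice the Gaussian curvature, and integrates with a plain midpoint rule; all function names and the chosen radii are illustrative.

```python
import math


def integrate_2d(f, u0, u1, v0, v1, n=200):
    """Midpoint-rule integral of f(u, v) over the rectangle [u0,u1] x [v0,v1]."""
    du = (u1 - u0) / n
    dv = (v1 - v0) / n
    total = 0.0
    for i in range(n):
        u = u0 + (i + 0.5) * du
        for j in range(n):
            v = v0 + (j + 0.5) * dv
            total += f(u, v)
    return total * du * dv


def total_curvature_sphere(radius, n=200):
    """Integral of R = 2/r^2 with area element r^2 sin(theta) dtheta dphi."""
    def integrand(theta, phi):
        return (2.0 / radius ** 2) * radius ** 2 * math.sin(theta)
    return integrate_2d(integrand, 0.0, math.pi, 0.0, 2.0 * math.pi, n)


def total_curvature_torus(a, b, n=200):
    """Torus of revolution with radii a > b > 0:
    K = cos(v) / (b (a + b cos v)), area element b (a + b cos v) du dv."""
    def integrand(u, v):
        gauss = math.cos(v) / (b * (a + b * math.cos(v)))
        area_density = b * (a + b * math.cos(v))
        return 2.0 * gauss * area_density
    return integrate_2d(integrand, 0.0, 2.0 * math.pi, 0.0, 2.0 * math.pi, n)


if __name__ == "__main__":
    print("sphere (genus 0):", total_curvature_sphere(3.0),
          "expected 8 pi =", 8.0 * math.pi)
    print("torus (genus 1):", total_curvature_torus(2.0, 0.5), "expected 0")
```

The result does not depend on the radii, in line with the remark that this integral belongs to the universal structure of the surface rather than to its embedding.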

4.1.5 Curvature identities

The complete decomposition of \({\Delta _{{A\prime}A}}{\lambda _B}\) into its irreducible parts gives \({\Delta _{{A\prime}A}}{\lambda ^A}\), the Dirac-Witten operator, and \({{\mathcal T}_{{E\prime}EA}}^B{\lambda _B}: = {\Delta _{{E\prime}(E}}{\lambda _{A)}} + {1 \over 2}{\gamma _{EA}}{\gamma ^{CD}}{\Delta _{{E\prime}C}}{\lambda _D}\), the two-surface twistor operator. The former is essentially the anti-symmetric part \({\Delta _{{A\prime}[A}}{\lambda _{B]}}\), the latter is the symmetric and (with respect to the complex metric \({\gamma _{AB}}\)) trace-free part of the derivative. (The trace \({\gamma ^{AB}}{\Delta _{{A\prime}A}}{\lambda _B}\) can be shown to be the Dirac-Witten operator, too.) A Sen-Witten-type identity for these irreducible parts can be derived. Taking its integral one has

$$\oint\nolimits_{\mathcal S} {{{\overline \gamma}^{A{\prime}B{\prime}}}[({\Delta _{A{\prime}A}}{\lambda ^A})({\Delta _{B{\prime}B}}{\mu ^B}) + ({{\mathcal T}_{A{\prime}CD}}^E{\lambda _E})({{\mathcal T}_{B{\prime}}}^{CDF}{\mu _F})]\;\;d{\mathcal S}} = - {\textstyle{{\rm{i}} \over 2}}\oint\nolimits_{\mathcal S} {{\lambda ^A}{\mu ^B}{F_{A\,Bcd}}},$$
(4.5)

where \({\lambda ^A}\) and \({\mu ^A}\) are two arbitrary spinor fields on \({\mathcal S}\), and the right-hand side is just the charge integral of the curvature \({F^A}_{Bcd}\) on \({\mathcal S}\).

4.1.6 The GHP formalism

A GHP spin frame on the two-surface \({\mathcal S}\) is a normalized spinor basis \(\varepsilon _{\rm{A}}^A: = \{{o^A},\,{\iota ^A}\}, \, {\bf{A}} = 0,1\), such that the complex null vectors \({m^a}: = {o^A}{{\bar \iota}^{{A\prime}}}\) and \({{\bar m}^a}: = {\iota ^A}{{\bar o}^{{A\prime}}}\) are tangent to \({\mathcal S}\) (or, equivalently, the future-pointing null vectors \({l^a}: = {o^A}{{\bar o}^{{A\prime}}}\) and \({n^a}: = {\iota ^A}{{\bar \iota}^{{A\prime}}}\) are orthogonal to \({\mathcal S}\)). Note, however, that in general a GHP spin frame can be specified only locally, but not globally on the whole \({\mathcal S}\). This fact is connected with the nontriviality of the tangent bundle \(T{\mathcal S}\) of the two-surface. For example, on the two-sphere every continuous tangent vector field must have a zero, and hence, in particular, the vectors \({m^a}\) and \({{\bar m}^a}\) cannot form a globally-defined basis on \({\mathcal S}\). Consequently, the GHP spin frame cannot be globally defined either. The only closed orientable two-surface with a globally-trivial tangent bundle is the torus.

Fixing a GHP spin frame \(\{\varepsilon _{\rm{A}}^A\}\) on some open \(U \subset {\mathcal S}\), the components of the spinor and tensor fields on U will be local representatives of cross sections of appropriate complex line bundles E(p, q) of scalars of type (p, q) [209, 425]: A scalar ϕ is said to be of type (p, q) if, under the rescaling \({o^A} \mapsto \lambda {o^A}\), \({\iota ^A} \mapsto {\lambda ^{- 1}}{\iota ^A}\) of the GHP spin frame with some nowhere-vanishing complex function \(\lambda :U \rightarrow {\mathbb C}\), the scalar transforms as \(\phi \mapsto {\lambda ^p}{{\bar \lambda}^q}\phi\). For example, \(\rho: = {\theta _{ab}}{m^a}{{\bar m}^b} = - {1 \over 2}\theta\), \({\rho ^\prime}: = \theta _{ab}^\prime{m^a}{{\bar m}^b} = - {1 \over 2}{\theta ^\prime}\), \(\sigma := {\theta _{ab}}{m^a}{m^b} = {\sigma _{ab}}{m^a}{m^b}\), and \({\sigma ^\prime}: = \theta _{ab}^\prime{{\bar m}^a}{{\bar m}^b} = \sigma _{ab}^\prime{{\bar m}^a}{{\bar m}^b}\) are of type (1,1), (−1, −1), (3, −1), and (−3, 1), respectively. The components of the Weyl and Ricci spinors, \({\psi _0}: = {\psi _{ABCD}}{o^A}{o^B}{o^C}{o^D},{\psi _1}: = {\psi _{ABCD}}{o^A}{o^B}{o^C}{\iota ^D},\,{\psi _2}: = {\psi _{ABCD}}{o^A}{o^B}{\iota ^C}{\iota ^D},\, \ldots, \,{\phi _{00}}: = {\phi _{A{B\prime}}}{o^A}{{\bar o}^{{B\prime}}},\,{\phi _{01}}: = {\phi _{A{B\prime}}}{o^A}{{\bar \iota}^{{B\prime}}},\, \ldots\), etc., also have definite (p, q)-type. In particular, Λ := R/24 has type (0, 0). A global section of E(p, q) is a collection of local cross sections {(U, ϕ), (U′, ϕ′), …} such that {U, U′, …} forms a covering of \({\mathcal S}\) and on the nonempty overlappings, e.g., on \(U \cap {U^\prime}\), the local sections are related to each other by \(\phi = {\psi ^p}{{\bar \psi}^q}{\phi \prime}\), where \(\psi :U \cap {U^\prime} \rightarrow {\mathbb C}\) is the transition function between the GHP spin frames: \({o^A} = \psi {o^{\prime A}}\) and \({\iota ^A} = {\psi ^{- 1}}{\iota ^{\prime A}}\).
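The bookkeeping of the (p, q)-types can be mimicked by a simple counting rule, sketched below under the assumption that the scalar in question is obtained by contracting a frame-independent spinor or tensor with the basis spinors: p is then the number of factors \(o\) minus the number of factors \(\iota\), and q is the analogous difference for the complex conjugate spinors. The function and constant names are illustrative and are not part of the GHP formalism itself.

```python
def ghp_type(n_o=0, n_iota=0, n_obar=0, n_iotabar=0):
    """(p, q) of a scalar built from the given numbers of basis spinors."""
    return (n_o - n_iota, n_obar - n_iotabar)


def add_types(*types):
    """Types add when the corresponding quantities are multiplied."""
    return (sum(p for p, _ in types), sum(q for _, q in types))


def rescaling_factor(ghp, lam):
    """Factor lambda^p conj(lambda)^q picked up under o -> lambda o."""
    p, q = ghp
    return lam ** p * lam.conjugate() ** q


# Types of the null tetrad vectors, read off from their spinor form
L_VEC = ghp_type(n_o=1, n_obar=1)          # l = o obar
N_VEC = ghp_type(n_iota=1, n_iotabar=1)    # n = iota iotabar
M_VEC = ghp_type(n_o=1, n_iotabar=1)       # m = o iotabar
MBAR_VEC = ghp_type(n_iota=1, n_obar=1)    # mbar = iota obar

# Expansion and shear scalars: one null normal and two tangent vectors each
RHO = add_types(L_VEC, M_VEC, MBAR_VEC)
RHO_PRIME = add_types(N_VEC, M_VEC, MBAR_VEC)
SIGMA = add_types(L_VEC, M_VEC, M_VEC)
SIGMA_PRIME = add_types(N_VEC, MBAR_VEC, MBAR_VEC)

# Some curvature components
PSI_0 = ghp_type(n_o=4)
PSI_1 = ghp_type(n_o=3, n_iota=1)
PSI_2 = ghp_type(n_o=2, n_iota=2)
PHI_00 = ghp_type(n_o=1, n_obar=1)
PHI_01 = ghp_type(n_o=1, n_iotabar=1)

if __name__ == "__main__":
    print("rho, rho', sigma, sigma':", RHO, RHO_PRIME, SIGMA, SIGMA_PRIME)
    print("psi_0, psi_1, psi_2:", PSI_0, PSI_1, PSI_2)
    print("factor for sigma under lambda = i:", rescaling_factor(SIGMA, 1j))
```

The counting reproduces the types quoted above for the expansion and shear scalars, and shows in particular that \({\psi _2}\) has type (0, 0).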

The connection \({\delta _e}\) defines a connection \({\eth_e}\) on the line bundles E(p, q) [209, 425]. The usual edth operators, ð and ð′, are just the directional derivatives \(\eth: = {m^a}{\eth_a}\) and \({\eth\prime}: = {{\bar m}^a}{\eth_a}\) on the domain \(U \subset {\mathcal S}\) of the GHP spin frame \(\{\varepsilon _{\bf{A}}^A\}\). These locally-defined operators yield globally-defined differential operators, denoted also by ð and ð′, on the global sections of E(p, q). It might be worth emphasizing that the GHP spin coefficients β and β′, which do not have definite (p, q)-type, play the role of the two components of the connection one-form, and are built both from the connection one-form for the intrinsic Riemannian geometry of \(({\mathcal S},\,{q_{ab}})\) and the connection one-form \({A_e}\) in the normal bundle. ð and ð′ are elliptic differential operators, thus, their global properties, e.g., the dimension of their kernel, are connected with the global topology of the line bundle they act on, and, in particular, with the global topology of \({\mathcal S}\). These properties are discussed in [198] in general, and in [177, 58, 490] for spherical topology.

4.1.7 Irreducible parts of the derivative operators

Using the projection operators \({\pi ^{\pm A}}_B: = {1 \over 2}(\delta _B^A \pm {\gamma ^A}_B)\), the irreducible parts \({\Delta _{{A\prime}A}}{\lambda ^A}\) and \({{\mathcal T}_{E \prime EA}}^B{\lambda _B}\) can be decomposed further into their right-handed and left-handed parts. In the GHP formalism these chiral irreducible parts are

$$\begin{array}{*{20}c} {- {\Delta ^ -}\lambda : = \;\eth{\lambda _1} + \rho {\prime}{\lambda _0},} & {{\Delta ^ +}\lambda : = \;\eth{\prime}{\lambda _0} + \rho {\lambda _1},} \\ {{\tau ^ -}\lambda : = \;\eth{\lambda _0} + \sigma {\lambda _1},} & {- {\tau ^ +}\lambda : = \;\eth{\prime}{\lambda _1} + \sigma {\prime}{\lambda _0},} \\ \end{array}$$
(4.6)

where \(\lambda : = ({\lambda _0},{\lambda _1})\) and the spinor components are defined by \({\lambda _A} = :{\lambda _1}{o_A} - {\lambda _0}{\iota _A}\). The various first-order linear differential operators acting on spinor fields, e.g., the two-surface twistor operator, the holomorphy/antiholomorphy operators or the operators whose kernel defines the asymptotic spinors of Bramson [106], are appropriate direct sums of these elementary operators. Their global properties under various circumstances are studied in [58, 490, 496].

4.1.8 SO(1, 1)-connection one-form versus anholonomicity

Obviously, all the structures we have considered can be introduced on the individual surfaces of one or two-parameter families of surfaces, as well. In particular [246], let the two-surface \({\mathcal S}\) be considered as the intersection \({{\mathcal N}^ +} \cap {{\mathcal N}^ -}\) of the null hypersurfaces formed, respectively, by the outgoing and the ingoing light rays orthogonal to \({\mathcal S}\), and let the spacetime (or at least a neighborhood of \({\mathcal S}\)) be foliated by two one-parameter families of smooth hypersurfaces {ν+ = const.} and {ν = const.}, where ν±: M → ℝ, such that \({{\mathcal N}^ +} = \{{v_ +} = 0\}\) and \({{\mathcal N}^ -} = \{{v_ -} = 0\}\). One can form the two normals, n±a:= ∇aν±, which are null on \({{\mathcal N}^ +}\) and \({{\mathcal N}^ -}\), respectively. Then we can define \({\beta _{\pm e}}: = ({\Delta _e}{n_{\pm a}})n_ \mp ^a\), for which β+e + βe = Δen2, where \({n^2}: = {g_{ab}}n_ + ^an_ - ^b\). (If n2 is chosen to be 1 on \({\mathcal S}\), then βe = −β+e is precisely the SO(1, 1)-connection one-form Ae above.) Then the anholonomicity is defined by \({\omega _e}: = {1 \over {2{n^2}}}{[{n_ -},\,{n_ +}]^f}{q_{fe}} = {1 \over {2{n^2}}}({\beta _{+ e}} - {\beta _{- e}})\). Since ωe is invariant with respect to the rescalings ν+ ↦ exp(A)ν+ and νexp(B)ν of the functions, defining the foliations by those functions A, B: M → ℝ, which preserve \({\nabla _{[a}}{n_{\pm b]}} = 0\), it was claimed in [246] that ωe depends only on \({\mathcal S}\). However, this implies only that ωe is invariant with respect to a restricted class of the change of the foliations, and that ωe is invariantly defined only by this class of the foliations rather than the two-surface. 
In fact, ωe does depend on the foliation: Starting with a different foliation defined by the functions \({{\bar v}_ +}: = \exp (\alpha){v_ +}\) and \({{\bar v}_ -}: = \exp (\beta){v_ -}\) for some α, β: M → ℝ, the corresponding anholonomicity \({{\bar \omega}_e}\) would also be invariant with respect to the restricted changes of the foliations above, but the two anholonomicities, ωe and \({{\bar \omega}_e}\), would be different: \({{\bar \omega}_e} - {\omega _e} = {1 \over 2}{\Delta _e}(\alpha - \beta)\). Therefore, the anholonomicity is a gauge-dependent quantity.
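The transformation law just quoted can be checked in a few lines. The following is only a sketch, assuming that \({\Delta _e}\) differentiates along \({\mathcal S}\) alone (so that terms proportional to \({v_ \pm}\), which vanish on \({\mathcal S}\), drop out) and that α and β are smooth; barred quantities are those built from \({{\bar v}_ \pm}\). On \({\mathcal S}\) the new normals are

$${\bar n_{+ a}} = {\nabla _a}({e^\alpha}{v_ +}) = {e^\alpha}({n_{+ a}} + {v_ +}{\nabla _a}\alpha), \qquad {\bar n_{- a}} = {\nabla _a}({e^\beta}{v_ -}) = {e^\beta}({n_{- a}} + {v_ -}{\nabla _a}\beta),$$

so that, evaluating at \({v_ \pm} = 0\),

$${\bar \beta _{+ e}} = {e^{\alpha + \beta}}({\beta _{+ e}} + {n^2}{\Delta _e}\alpha), \qquad {\bar \beta _{- e}} = {e^{\alpha + \beta}}({\beta _{- e}} + {n^2}{\Delta _e}\beta), \qquad {\bar n^2} = {e^{\alpha + \beta}}{n^2},$$

and therefore

$${\bar \omega _e} = {1 \over {2{{\bar n}^2}}}({\bar \beta _{+ e}} - {\bar \beta _{- e}}) = {1 \over {2{n^2}}}({\beta _{+ e}} - {\beta _{- e}}) + {1 \over 2}{\Delta _e}(\alpha - \beta) = {\omega _e} + {1 \over 2}{\Delta _e}(\alpha - \beta).$$

The difference vanishes only when α − β is constant on \({\mathcal S}\), which is the sense in which the anholonomicity is tied to the foliation.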

4.2 Standard situations to evaluate the quasi-local quantities

There are exact solutions to the Einstein equations and classes of special (e.g., asymptotically flat) spacetimes in which there is a commonly accepted definition of energy-momentum (or at least mass) and angular momentum. In this section we review these situations and recall the definition of these ‘standard’ expressions.

4.2.1 Round spheres

If the spacetime is spherically symmetric, then a two-sphere, which is a transitivity surface of the rotation group, is called a round sphere. Then in a spherical coordinate system (t, r, θ, ϕ) the spacetime metric takes the form \({g_{ab}} = {\rm{diag}}(\exp (2\gamma), - \exp (2\alpha), - {r^2}, - {r^2}{\sin ^2}\theta)\), where γ and α are functions of t and r. (Hence, r is called the area-coordinate.) Then, with the notation of Section 4.1, one obtains \({R_{abcd}}{\varepsilon ^{ab}}{\varepsilon ^{cd}} = {4 \over {{r^2}}}(1 - \exp (- 2\alpha))\). Based on the investigations of Misner, Sharp, and Hernandez [365, 267], Cahill and McVittie [122] found

$$E(t,r): = {1 \over {8G}}{r^3}{R_{abcd}}{\varepsilon ^{ab}}{\varepsilon ^{cd}} = {r \over {2G}}(1 - {e^{- 2\alpha}})$$
(4.7)

to be an appropriate (and hence, suggested to be the general) notion of energy, the Misner-Sharp energy, contained in the two-sphere \({\mathcal S}: = \{t = const.,\,r = const.\}\). (For another expression of E(t, r) in terms of the norm of the Killing fields and the metric, see [577].) In particular, for the Reissner-Nordström solution \(GE(t,r) = m - {e^2}/2r\), while for the isentropic fluid solutions \(E(t,\,r) = 4\pi \int\nolimits_0^r {{r\prime^{2}}\mu (t,\,{r\prime})d{r\prime}}\), where m and e are the usual parameters of the Reissner-Nordström solutions and μ is the energy density of the fluid [365, 267] (for the static solution, see, e.g., Appendix B of [240]). Using Einstein’s equations, simple equations can be derived for the derivatives \({\partial _t}E(t,r)\) and \({\partial _r}E(t,r)\), and if the energy-momentum tensor satisfies the dominant energy condition, then \({\partial _r}E(t,r) > 0\). Thus, E(t, r) is a monotonic function of r, provided r is the area-coordinate. Since, by spherical symmetry all the quantities with nonzero spin weight, in particular the shears σ and σ′, are vanishing and ψ2 is real, by the GHP form of Eqs. (4.3), (4.4) the energy function E(t, r) can also be written as

$$E({\mathcal S}) = {1 \over G}{r^3}\left({{1 \over 4}{}^{\mathcal S}R + \rho \rho {\prime}} \right) = {1 \over G}{r^3}(- {\psi _2} + {\phi _{11}} + \Lambda) = \sqrt {{{{\rm{Area}}({\mathcal S})} \over {16\pi {G^2}}}} \left({1 + {1 \over {2\pi}}\oint\nolimits_{\mathcal S} {\rho \rho {\prime}\;d{\mathcal S}}} \right).$$
(4.8)

Any of these expressions is considered to be the ‘standard’ definition of the energy for round spheres (see Footnote 4). The last of these expressions does not depend on whether r is an area-coordinate or not.
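As a quick numerical illustration of Eq. (4.7), the following sketch evaluates \(E = (r/2G)(1 - {e^{- 2\alpha}})\) for the Reissner-Nordström solution and compares it with the closed form \(GE = m - {e^2}/2r\). It assumes the standard area-coordinate form \({e^{- 2\alpha}} = 1 - 2m/r + {e^2}/{r^2}\) of the metric function (which is not written out in the text above) and units with G = 1; the function names are purely illustrative.

```python
def misner_sharp_energy(r, exp_minus_2alpha, G=1.0):
    """Round-sphere energy of Eq. (4.7): E = r / (2G) * (1 - exp(-2 alpha))."""
    return r / (2.0 * G) * (1.0 - exp_minus_2alpha)


def rn_exp_minus_2alpha(r, m, e):
    """ASSUMED Reissner-Nordstrom metric function:
    exp(-2 alpha) = 1 - 2m/r + e^2/r^2 (area coordinate r, geometric units)."""
    return 1.0 - 2.0 * m / r + e**2 / r**2


def rn_energy(r, m, e, G=1.0):
    """Eq. (4.7) evaluated on the assumed Reissner-Nordstrom metric function."""
    return misner_sharp_energy(r, rn_exp_minus_2alpha(r, m, e), G)


if __name__ == "__main__":
    m, e = 1.0, 0.5
    for r in (2.0, 10.0, 1000.0):
        # Second column: Eq. (4.7); third column: closed form m - e^2 / (2 r).
        print(r, rn_energy(r, m, e), m - e**2 / (2.0 * r))
```

For e = 0 the function returns m for every r, i.e., the round-sphere energy of a Schwarzschild sphere is independent of its radius.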

\(E({\mathcal S})\) contains a contribution from the gravitational ‘field’ too. For example, for fluids it is not simply the volume integral of the energy density μ of the fluid, because that would be \(4\pi \int\nolimits_0^r {{r\prime^{2}}\exp (\alpha)\mu \,d{r\prime}}\). This deviation can be interpreted as the contribution of the gravitational potential energy to the total energy. Consequently, \(E({\mathcal S})\) is not a globally monotonic function of r, even if μ ≥ 0. For example, in the closed Friedmann-Robertson-Walker spacetime (where, to cover the whole three-space, r cannot be chosen to be the area-radius and \(r \in [0,\pi ]\)), \(E({\mathcal S})\) is increasing for r ∈ [0, π/2), taking its maximal value at r = π/2, and decreasing for r ∈ [π/2, π].
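The behavior in the closed Friedmann-Robertson-Walker example can be made concrete with a small numerical sketch. It assumes the standard form of the closed FRW metric, in which the sphere r = const. has area-radius a(t) sin r, so that on a t = const. slice the round-sphere energy is proportional to sin³ r; the overall r-independent factor is absorbed into the hypothetical parameter `e_max`.

```python
import math


def frw_round_sphere_energy(r, e_max=1.0):
    """Round-sphere energy on a t = const. slice of closed FRW, up to an
    r-independent factor (ASSUMED profile: E = e_max * sin(r)**3)."""
    return e_max * math.sin(r) ** 3


def sample(n=200):
    """Sample E on a uniform grid of the coordinate r in [0, pi]."""
    rs = [math.pi * i / n for i in range(n + 1)]
    return rs, [frw_round_sphere_energy(r) for r in rs]


if __name__ == "__main__":
    rs, es = sample()
    i_max = max(range(len(es)), key=es.__getitem__)
    print("maximum at r =", rs[i_max], "; pi/2 =", math.pi / 2)
```

The sampled values rise up to the grid point r = π/2 and fall afterwards, even though the energy density of the fluid is nonnegative everywhere.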

This example suggests a slightly more exotic spherically-symmetric spacetime. Its spacelike slice Σ will be assumed to be extrinsically flat, and its intrinsic geometry is the matching of two conformally flat metrics. The first is a ‘large’ spherically-symmetric part of a t = const. hypersurface of the closed Friedmann-Robertson-Walker spacetime with the line element \(d{l^2} = \Omega _{{\rm{FRW}}}^2dl_0^2\), where \(dl_0^2\) is the line element for the flat three-space and \(\Omega _{{\rm{FRW}}}^2: = B{(1 + {{{r^2}} \over {4{T^2}}})^{- 2}}\) with positive constants B and \({T^2}\), and the range of the Euclidean radial coordinate r is [0, r0], where r0 ∈ (2T, ∞). It contains a maximal two-surface at r = 2T with round-sphere mass parameter \(M: = GE(2T) = {1 \over 2}T\sqrt B\). The scalar curvature is \(R = 6/(B{T^2})\), and hence, by the constraint parts of the Einstein equations and by the vanishing of the extrinsic curvature, the dominant energy condition is satisfied. The other metric is the metric of a piece of a t = const. hypersurface in the Schwarzschild solution with mass parameter m (see [213]): \(d{{\bar l}^2} = \Omega _S^2d\bar l_0^2\), where \(\Omega _S^2: = {(1 + {m \over {2\bar r}})^4}\) and the Euclidean radial coordinate \({\bar r}\) runs from \({{\bar r}_0}\) to ∞, where \({{\bar r}_0} \in (0,\,m/2)\). In this geometry there is a minimal surface at \(\bar r = m/2\), the scalar curvature is zero, and the round-sphere energy is \(E(\bar r) = m/G\). These two metrics can be matched to obtain a differentiable metric with a Lipschitz-continuous derivative at the two-surface of the matching (where the scalar curvature has a jump), with arbitrarily large ‘internal mass’ M/G and arbitrarily small ADM mass m/G. (Obviously, the two metrics can be joined smoothly, as well, by an ‘intermediate’ domain between them.) 
Since this space looks like a big spherical bubble on a nearly flat three-plane (like the capital Greek letter Ω), for later reference we will call it an ‘\({\Omega _{M,m}}\)-spacetime’.
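The two mass parameters quoted for this construction can be recovered from the last expression in Eq. (4.8). The following is a sketch, assuming (as is standard for an extrinsically flat slice) that ρρ′ vanishes on an extremal two-surface, so that \(E = {R_{\rm{area}}}/2G\) there, where \({R_{\rm{area}}}\) denotes the area-radius (a symbol introduced here only for this check). For the FRW part,

$${R_{\rm{area}}}(r) = {\Omega _{{\rm{FRW}}}}\,r = {{\sqrt B \,r} \over {1 + {r^2}/4{T^2}}}, \qquad {{d{R_{\rm{area}}}} \over {dr}} = \sqrt B \,{{1 - {r^2}/4{T^2}} \over {{{(1 + {r^2}/4{T^2})}^2}}},$$

which vanishes at r = 2T, where \({R_{\rm{area}}} = T\sqrt B\) and hence \(GE(2T) = {1 \over 2}T\sqrt B = M\). For the Schwarzschild part,

$${\bar R_{\rm{area}}}(\bar r) = \bar r{\left({1 + {m \over {2\bar r}}} \right)^2}, \qquad {{d{{\bar R}_{\rm{area}}}} \over {d\bar r}} = \left({1 + {m \over {2\bar r}}} \right)\left({1 - {m \over {2\bar r}}} \right),$$

which vanishes at \(\bar r = m/2\), where \({\bar R_{\rm{area}}} = 2m\) and hence \(E = m/G\), in agreement with the round-sphere energy stated above.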

Spherically-symmetric spacetimes admit a special vector field, called the Kodama vector field \({K^a}\), such that \({K_a}{G^{ab}}\) is divergence free [321]. In asymptotically flat spacetimes \({K^a}\) is timelike in the asymptotic region, in stationary spacetimes it reduces to the Killing symmetry of stationarity (in fact, this is hypersurface-orthogonal), but, in general, it is not a Killing vector. However, by \({\nabla _a}({G^{ab}}{K_b}) = 0\), the vector field \({S^a}: = {G^{ab}}{K_b}\) has a conserved flux on a spacelike hypersurface Σ. In particular, in the coordinate system (t, r, θ, ϕ) and in the line element given in the first paragraph above \({K^a} = \exp [- (\alpha + \gamma)]{(\partial /\partial t)^a}\). If Σ is a solid ball of radius r, then the flux of \({S^a}\) is precisely the standard round-sphere expression (4.7) for the two-sphere ∂Σ [375].

An interesting characterization of the dynamics of the spherically-symmetric gravitational fields can be given in terms of the energy function E(t, r) given by (4.7) (or by (4.8)) (see, e.g., [578, 352, 250]). In particular, criteria for the existence and formation of trapped surfaces and for the presence and nature of the central singularity can be given by E(t, r). Other interesting quasi-locally-defined quantities are introduced and used to study nonlinear perturbations and backreaction in a wide class of spherically-symmetric spacetimes in [483]. For other applications of E(t, r) in cosmology see, e.g., [484, 130].

4.2.2 Small surfaces

In the literature there are two kinds of small surfaces. The first is that of the small spheres (both in the light cone of a point and in a spacelike hypersurface), introduced first by Horowitz and Schmidt [275], and the other is the concept of small ellipsoids in a spacelike hypersurface, considered first by Woodhouse in [313]. A small sphere in the light cone is a cut of the future null cone in the spacetime by a spacelike hypersurface, and the geometry of the sphere is characterized by data at the vertex of the cone. The sphere in a hypersurface consists of those points of a given spacelike hypersurface, whose geodesic distance in the hypersurface from a given point p, the center, is a small given value, and the geometry of this sphere is characterized by data at this center. Small ellipsoids are two-surfaces in a spacelike hypersurface with a more general shape.

To define the first, let p ∈ M be a point, and \({t^a}\) a future-directed unit timelike vector at p. Let \({{\mathcal N}_p}: = \partial {I^ +}(p)\), the ‘future null cone of p in M’ (i.e., the boundary of the chronological future of p). Let \({l^a}\) be the future pointing null tangent to the null geodesic generators of \({{\mathcal N}_p}\), such that, at the vertex p, \({l^a}{t_a} = 1\). With this condition we fix the scale of the affine parameter r on the different generators, and hence, by requiring r(p) = 0, we fix the parametrization completely. Then, in an open neighborhood of the vertex p, \({{\mathcal N}_p} - \{p\}\) is a smooth null hypersurface, and hence, for sufficiently small r, the set \({\mathcal S_r}: = \{q \in M\vert r(q) = r\}\) is a smooth spacelike two-surface and is homeomorphic to \({S^2}\). \({{\mathcal S}_r}\) is called a small sphere of radius r with vertex p. Note that the condition \({l^a}{t_a} = 1\) fixes the boost gauge, too.

Completing \({l^a}\) to get a Newman-Penrose complex null tetrad \(\{{l^a},{n^a},{m^a},{{\bar m}^a}\}\) such that the complex null vectors \({m^a}\) and \({{\bar m}^a}\) are tangent to the two-surfaces \({{\mathcal S}_r}\), the components of the metric and the spin coefficients with respect to this basis can be expanded as a series in r. If, in addition, the spinor constituent \({o^A}\) of \({l^a} = {o^A}{{\bar o}^{{A\prime}}}\) is required to be parallelly propagated along \({l^a}\), then the tetrad becomes completely fixed, yielding the vanishing of several (combinations of the) spin coefficients. Then the GHP equations can be solved with any prescribed accuracy for the expansion coefficients of the metric \({q_{ab}}\) on \({{\mathcal S}_r}\), the GHP spin coefficients ρ, σ, τ, ρ′, σ′ and β, and the higher-order expansion coefficients of the curvature in terms of the lower-order curvature components at p. Hence, any quasi-local quantity \({Q_{{{\mathcal S}_r}}}\) for the small sphere \({{\mathcal S}_r}\) can be expanded as a series in r,

$${Q_{{{\mathcal S}_r}}} = \oint\nolimits_{\mathcal S} {\left({{Q^{\left(0 \right)}} + r{Q^{\left(1 \right)}} + {\textstyle{1 \over 2}}{r^2}{Q^{\left(2 \right)}} + \cdots} \right)\;\;d{\mathcal S}},$$

where the expansion coefficients \({Q^{(k)}}\) are still functions of the coordinates, \((\zeta, \,\bar \zeta)\) or (θ,ϕ), on the unit sphere \({\mathcal S}\). If the quasi-local quantity Q is spacetime-covariant, then the unit sphere integrals of the expansion coefficients \({Q^{(k)}}\) must be spacetime covariant expressions of the metric and its derivatives up to some finite order at p and the ‘time axis’ \({t^a}\). The necessary degree of the accuracy of the solution of the GHP equations depends on the nature of \({Q_{{{\mathcal S}_r}}}\) and on whether the spacetime is Ricci-flat in the neighborhood of p or not (see Footnote 5). These solutions of the GHP equations, with increasing accuracy, are given in [275, 313, 118, 494].

Obviously, we can calculate the small-sphere limit of various quasi-local quantities built from the matter fields in the Minkowski spacetime, as well. In particular [494], the small-sphere expressions for the quasi-local energy-momentum and the (anti-self-dual part of the) quasi-local angular momentum of the matter fields based on \({Q_{\mathcal S}}[{\bf{K}}]\), are, respectively,

$$P_{{{\mathcal S}_r}}^{\underline A \underline {B{\prime}}} = {{4\pi} \over 3}{r^3}{T^{AA{\prime}\,BB{\prime}}}{t_{AA{\prime}}}\varepsilon _B^{\underline A}\bar \varepsilon _{B{\prime}}^{\underline {B{\prime}}} + {\mathcal O}\left({{r^4}} \right),$$
(4.9)
$$J_{{{\mathcal S}_r}}^{\underline A \underline B} = {{4\pi} \over 3}{r^3}{T_{AA{\prime}BB{\prime}}}{t^{AA{\prime}}}\left({r{t^{B{\prime}E}}{{\textstyle\varepsilon} ^{BF}}\varepsilon _{\left(E \right.}^{\underline A}\varepsilon _{\left. F \right)}^{\underline B}} \right) + {\mathcal O}\,({r^5}),$$
(4.10)

where \(\{{\mathcal E}_{\underline A}^A\}, \,\underline A = 0,\,1\), is the ‘Cartesian spin frame’ at p and the origin of the Cartesian coordinate system is chosen to be the vertex p. Here \(K_a^{\underline A \,{{\underline B}\prime}} = {\mathcal E}_A^{\underline A}\bar {\mathcal E}_{{A\prime}}^{{{\underline B}\prime}}\) can be interpreted as the translation one-forms, while \(K_a^{\underline A \,\underline B} = r{t_{{A\prime}}}^E{\mathcal E}_{(E}^{\underline A}{\mathcal E}_{A)}^{\underline B}\) is an average on the unit sphere of the boost-rotation Killing one-forms that vanish at the vertex p. Thus, \(P_{{{\mathcal S}_r}}^{\underline A \,{{\underline B}\prime}}\) and \(J_{{{\mathcal S}_r}}^{\underline A \,\underline B}\) are the three-volume times the energy-momentum and angular momentum density with respect to p, respectively, that the observer with four-velocity \({t^a}\) sees at p.

Interestingly enough, a simple dimensional analysis already shows the structure of the leading terms in a large class of quasi-local spacetime covariant energy-momentum and angular momentum expressions. In fact, if \({Q_{\mathcal S}}\) is any coordinate-independent quasi-local quantity built from the first derivatives \({\partial _\mu}{g_{\alpha \beta}}\) of the spacetime metric, then in its expansion the difference of the power of r and the number of the derivatives in every term must be one, i.e., it must have the form

$$\begin{array}{*{20}c} {{Q_{{{\mathcal S}_r}}} = {Q_2}[\partial g]\;{r^2}+{Q_3}\left[ {{\partial ^2}g,{{(\partial g)}^2}} \right]\;{r^3} + {Q_4}\left[ {{\partial ^3}g,({\partial ^2}g)\;(\partial g),{{(\partial g)}^3}} \right]\;{r^4} +} \\ {+ {Q_5}\left[ {{\partial ^4}g,({\partial ^3}g)\;(\partial g),{{({\partial ^2}g)}^2},({\partial ^2}g)\;{{(\partial g)}^2},{{(\partial g)}^4}} \right]\;{r^5} + \ldots,} \\ \end{array}$$

where \({Q_i}[A,B, \ldots ]\), i = 2, 3, …, are scalars. They are polynomial expressions of \({t^a}\), \({g_{ab}}\) and \({\varepsilon _{abcd}}\) at the vertex p, and they depend linearly on the tensors that are constructed at p from \({g_{\alpha \beta}},\,{g^{\alpha \beta}}\) and linearly from the coordinate-dependent quantities A, B, …. Since there is no nontrivial tensor built from the first derivative \({\partial _\mu}{g_{\alpha \beta}}\) and \({g_{\alpha \beta}}\), the leading term is of order \({r^3}\). Its coefficient \({Q_3}[{\partial ^2}g,{(\partial g)^2}]\) must be a linear expression of \({R_{ab}}\) and \({C_{abcd}}\), and polynomial in \({t^a}\), \({g_{ab}}\) and \({\varepsilon _{abcd}}\). In particular, if \({Q_{\mathcal S}}\) is to represent energy-momentum with generator \({K^c}\) at p, then the leading term must be

$${Q_{{{\mathcal S}_r}}}[{\bf{K}}] = {r^3}\left[ {a\left({{G_{ab}}{t^a}{t^b}} \right){t_c} + bR{t_c} + c\left({{G_{ab}}{t^a}P_c^b} \right)} \right]{K^c} + {\mathcal O}\left({{r^4}} \right)$$
(4.11)

for some unspecified constants a, b, and c, where \(P_b^a: = \delta _b^a - {t^a}{t_b}\), the projection to the subspace orthogonal to \({t^a}\). If, in addition to the coordinate-independence of \({Q_{\mathcal S}}\), it is Lorentz-covariant, i.e., it does not, for example, depend on the choice for a normal to \({\mathcal S}\) (e.g., in the small-sphere approximation on \({t^a}\)) intrinsically, then the different terms in the above expression must depend on the boost gauge of the external observer \({t^a}\) in the same way. Therefore, a = c, in which case the first and the third terms can in fact be written as \({r^3}a{t_a}{G^{ab}}{K_b}\). Then, comparing Eq. (4.11) with Eq. (4.9), we see that a = −1/(6G), and hence the term \({r^3}bR{t_a}{K^a}\) would have to be interpreted as the contribution of the gravitational ‘field’ to the quasi-local energy-momentum of the matter + gravity system. However, this contributes only to energy, but not to linear momentum in any frame defined by the observer \({t^a}\), even in a general spacetime. This seems to be quite unacceptable. Furthermore, even if the matter fields satisfy the dominant energy condition, \({Q_{{{\mathcal S}_r}}}\) given by Eq. (4.11) can be negative, even for c = a, unless b = 0. Thus, in the leading \({r^3}\) order in nonvacuum, any coordinate and Lorentz-covariant quasi-local energy-momentum expression which is nonspacelike and future pointing, should be proportional to the energy-momentum density of the matter fields seen by the observer \({t^a}\) times the Euclidean volume of the three-ball of radius r. No contribution from the gravitational ‘field’ is expected at this order. In fact, this result is compatible with the principle of equivalence, and the particular results obtained in the relativistically corrected Newtonian theory (considered in Section 3.1.1) and in the weak field approximation (see Sections 4.2.5 and 7.1.1 below). Interestingly enough, even for a timelike Killing field \({K^e}\), the well known expression of Komar does not satisfy this criterion. 
(For further discussion of Komar’s expression see also Section 12.1.)
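The step from a = c to the manifestly covariant form of the leading term in Eq. (4.11) uses only the definition \(P_c^b = \delta _c^b - {t^b}{t_c}\) of the projection. As a short worked check:

$$a({G_{ab}}{t^a}{t^b}){t_c} + a\,{G_{ab}}{t^a}P_c^b = a\,{G_{ab}}{t^a}({t^b}{t_c} + P_c^b) = a\,{G_{ab}}{t^a}\delta _c^b = a\,{G_{ac}}{t^a}.$$

Contracting with \({K^c}\) gives \(a\,{t_a}{G^{ab}}{K_b}\), in which \({t^a}\) appears only once, linearly. The remaining term \(bR{t_c}{K^c}\) has no spatial part with respect to \({t^a}\), which is why it can contribute to the energy but not to the linear momentum.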

If the neighborhood of p is vacuum, then the \({r^3}\)-order term is vanishing, and the fourth-order term must be built from \({\nabla _e}{C_{abcd}}\). However, the only scalar polynomial expression of \({t^a}\), \({g_{ab}}\), \({\varepsilon _{abcd}}\), \({\nabla _e}{C_{abcd}}\) and the generator vector \({K^a}\), depending linearly on the latter two, is the zero tensor field. Thus, the \({r^4}\)-order term in vacuum is also vanishing. At the fifth order the only nonzero terms are quadratic in the various parts of the Weyl tensor, yielding

$${Q_{{{\mathcal S}_r}}}[{\bf{K}}] = {r^5}\;[(a{E_{ab}}{E^{ab}} + b{H_{ab}}{H^{ab}} + c{E_{ab}}{H^{ab}}){t_c} + d{E_{ae}}{H^e}_b{\varepsilon ^{ab}}_c]\;{K^c} + {\mathcal O}\;({r^6})$$
(4.12)

for constants a, b, c, and d, where \({E_{ab}}: = {C_{aebf}}{t^e}{t^f}\) is the electric part and \({H_{ab}}: = {\ast} {C_{aebf}}{t^e}{t^f}: = {1 \over 2}{\varepsilon _{ae}}^{cd}{C_{cdbf}}{t^e}{t^f}\) is the magnetic part of the Weyl curvature, and \({\varepsilon _{abc}}: = {\varepsilon _{abcd}}{t^d}\) is the induced volume 3-form. However, using the identities \({C_{abcd}}{C^{abcd}} = 8({E_{ab}}{E^{ab}} - {H_{ab}}{H^{ab}})\), \({C_{abcd}} {\ast} {C^{abcd}} = 16{E_{ab}}{H^{ab}}\), \(4{T_{abcd}}{t^a}{t^b}{t^c}{t^d} = {E_{ab}}{E^{ab}} + {H_{ab}}{H^{ab}}\) and \(2{T_{abcd}}{t^a}{t^b}{t^c}P_e^d = {E_{ab}}{H^a}_c{\varepsilon ^{bc}}_e\), we can rewrite the above formula to be

$$\begin{array}{*{20}c} {{Q_{{{\mathcal S}_r}}}[{\bf{K}}] = {r^5}\;\left[ {\left({2(a + b){T_{abcd}}{t^a}{t^b}{t^c}{t^d} + {\textstyle{1 \over {16}}}(a - b){C_{abcd}}{C^{abcd}} +} \right.} \right.\quad \quad \quad \quad \quad \quad \quad \quad} \\ {\left. {\left. {+ {\textstyle{1 \over {16}}}c{C_{abcd}} {\ast} {C^{abcd}}} \right){t_e} + 2d{T_{abcd}}{t^a}{t^b}{t^c}P_e^d} \right]\;{K^e} + {\mathcal O}\;({r^6}).} \\ \end{array}$$
(4.13)
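The passage from Eq. (4.12) to Eq. (4.13) is a short piece of algebra. As a sketch, abbreviate \({E^2}: = {E_{ab}}{E^{ab}}\), \({H^2}: = {H_{ab}}{H^{ab}}\) and \({T_{tttt}}: = {T_{abcd}}{t^a}{t^b}{t^c}{t^d}\) (shorthand introduced here only for this check), so that the first and third identities read \({C_{abcd}}{C^{abcd}} = 8({E^2} - {H^2})\) and \(4{T_{tttt}} = {E^2} + {H^2}\). Solving for \({E^2}\) and \({H^2}\) gives

$${E^2} = 2{T_{tttt}} + {\textstyle{1 \over {16}}}{C_{abcd}}{C^{abcd}}, \qquad {H^2} = 2{T_{tttt}} - {\textstyle{1 \over {16}}}{C_{abcd}}{C^{abcd}},$$

and hence

$$a{E^2} + b{H^2} + c{E_{ab}}{H^{ab}} = 2(a + b){T_{tttt}} + {\textstyle{1 \over {16}}}(a - b){C_{abcd}}{C^{abcd}} + {\textstyle{1 \over {16}}}c\,{C_{abcd}} {\ast} {C^{abcd}},$$

which is the coefficient of \({t_e}\) in Eq. (4.13). The d term follows directly from the fourth identity after relabeling indices, \(d{E_{ae}}{H^e}_b{\varepsilon ^{ab}}_c = 2d{T_{abed}}{t^a}{t^b}{t^e}P_c^d\).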

Again, if \({Q_{\mathcal S}}\) does not depend on \({t^a}\) intrinsically, then d = (a + b), in which case the first and the fourth terms together can be written into the Lorentz covariant form \(2{r^5}d{T_{abcd}}{t^a}{t^b}{t^c}{K^d}\). In a general expression the curvature invariants \({C_{abcd}}{C^{abcd}}\) and \({C_{abcd}} {\ast} {C^{abcd}}\) may be present. Since, however, \({E_{ab}}\) and \({H_{ab}}\) at a given point are independent, these invariants can be arbitrarily large positive or negative, and hence, for a ≠ b or c ≠ 0 the quasi-local energy-momentum could not be future pointing and nonspacelike. Therefore, in vacuum in the leading \({r^5}\) order any coordinate and Lorentz-covariant quasi-local energy-momentum expression, which is nonspacelike and future pointing must be proportional to the Bel-Robinson ‘momentum’ \({T_{abcd}}{t^a}{t^b}{t^c}\).

Obviously, the same analysis can be repeated for any other quasi-local quantity. For the energy-momentum, \({Q_{\mathcal S}}\) has the structure \(\oint\nolimits_{\mathcal S} {\mathcal Q} ({\partial _\mu}{g_{\alpha \beta}})\,d{\mathcal S}\), for angular momentum it is \(\oint\nolimits_{\mathcal S} {\mathcal Q} ({\partial _\mu}{g_{\alpha \beta}})r\, d{\mathcal S}\), while the area of \({\mathcal S}\) is \(\oint\nolimits_{\mathcal S} {d{\mathcal S}}\). Therefore, the leading term in the expansion of the angular momentum is of order \({r^4}\) and \({r^6}\) in nonvacuum and vacuum with the energy-momentum and the Bel-Robinson tensors, respectively, while the first nontrivial correction to the area \(4\pi {r^2}\) is of order \({r^4}\) and \({r^6}\) in nonvacuum and vacuum, respectively.

On the small geodesic sphere \({{\mathcal S}_r}\) of radius r in the given spacelike hypersurface Σ one can introduce the complex null tangents \({m^a}\) and \({{\bar m}^a}\) above, and if \({t^a}\) is the future-pointing unit normal of Σ and \({v^a}\) the outward directed unit normal of \({{\mathcal S}_r}\) in Σ, then we can define \({l^a}: = {t^a} + {v^a}\) and \(2{n^a}: = {t^a} - {v^a}\). Then \(\{{l^a},{n^a},{m^a},{{\bar m}^a}\}\) is a Newman-Penrose complex null tetrad, and the relevant GHP equations can be solved for the spin coefficients in terms of the curvature components at p.

The small ellipsoids are defined as follows [313]. If f is any smooth function on Σ with a nondegenerate minimum at p ∈ Σ with minimum value f(p) = 0, then, at least on an open neighborhood U of p in Σ, the level surfaces \({{\mathcal S}_r}: = \{q \in \Sigma |2f(q) = {r^2}\}\) are smooth compact two-surfaces homeomorphic to \({S^2}\). Then, in the r → 0 limit, the surfaces \({{\mathcal S}_r}\) look like small nested ellipsoids centered at p. The function f is usually ‘normalized’ so that \({h^{ab}}{D_a}{D_b}f{\vert _p} = - 3\).

A slightly different framework for calculations in small regions was used in [327, 170, 235]. Instead of the Newman-Penrose (or the GHP) formalism and the spin coefficient equations, holonomic (Riemann or Fermi type normal) coordinates on an open neighborhood U of a point p ∈ M or a timelike curve γ are used, in which the metric, as well as the Christoffel symbols on U, are expressed by the coordinates on U and the components of the Riemann tensor at p or on γ. In these coordinates and the corresponding frames, the various pseudotensorial and tetrad expressions for the energy-momentum have been investigated. It has been shown that a quadratic expression of these coordinates with the Bel-Robinson tensor as their coefficient appears naturally in the local conservation law for the matter energy-momentum tensor [327]; the Bel-Robinson tensor can be recovered as some ‘double gradient’ of a special combination of the Einstein and the Landau-Lifshitz pseudotensors [170]; Møller’s tetrad expression, as well as certain combinations of several other classical pseudotensors, yield the Bel-Robinson tensor [473, 470, 471]. In the presence of some non-dynamical (background) metric an 11-parameter family of combinations of the classical pseudotensors exists, which, in vacuum, yields the Bel-Robinson tensor [472, 474]. (For this kind of investigation see also [465, 468, 466, 467, 469].)

In [235] a new kind of approximate symmetries, namely approximate affine collineations, are introduced both near a point and a world line, and used to introduce Komar-type ‘conserved’ currents. (For a readable text on the non-Killing type symmetries see, e.g., [233].) These symmetries turn out to yield a nontrivial gravitational contribution to the matter energy-momentum, even in the leading \({r^3}\) order.

4.2.3 Large spheres near spatial infinity

Near spatial infinity we have the a priori \(1/r\) and \(1/{r^2}\) falloff for the three-metric \({h_{ab}}\) and extrinsic curvature \({\chi _{ab}}\), respectively, and both the evolution equations of general relativity and the conservation equation \({T^{ab}}_{;b} = 0\) for the matter fields preserve these conditions. The spheres \({{\mathcal S}_r}\) of coordinate radius r in Σ are called large spheres if the values of r are large enough, such that the asymptotic expansions of the metric and extrinsic curvature are legitimate (see Footnote 6). Introducing some coordinate system, e.g., the complex stereographic coordinates, on one sphere and then extending that to the whole Σ along the normals \({v^a}\) of the spheres, we obtain a coordinate system \((r,\zeta, \,\bar \zeta)\) on Σ. Let \(\varepsilon _{\bf{A}}^A = \{{o^A},{\iota ^A}\}, \, {\bf{A}} = 0,\, 1\), be a GHP spinor dyad on Σ adapted to the large spheres in such a way that \({m^a}: = {o^A}{{\bar \iota}^{{A\prime}}}\) and \({{\bar m}^a} = {\iota ^A}{{\bar o}^{{A\prime}}}\) are tangent to the spheres and \({t^a} = {1 \over 2}{o^A}{{\bar o}^{{A\prime}}} + {\iota ^A}{{\bar \iota}^{{A\prime}}}\) is the future directed unit normal of Σ. These conditions fix the spinor dyad completely, and, in particular, \({v^a} = {1 \over 2}{o^A}{{\bar o}^{{A\prime}}} - {\iota ^A}{{\bar \iota}^{{A\prime}}}\) is the outward directed unit normal to the spheres tangent to Σ.

The falloff conditions yield that the spin coefficients tend to their flat spacetime value as \(1/{r^2}\) and the curvature components to zero like \(1/{r^3}\). Expanding the spin coefficients and curvature components as a power series of 1/r, one can solve the field equations asymptotically (see [65, 61] for a different formalism). However, in most calculations of the large sphere limit of the quasi-local quantities, only the leading terms of the spin coefficients and curvature components appear. Thus, it is not necessary to solve the field equations for their second or higher-order nontrivial expansion coefficients.

Using the flat background metric \({}_0{h_{ab}}\) and the corresponding derivative operator \({}_0{D_e}\) we can define a spinor field \({}_0{\lambda _A}\) to be constant if \({}_0{D_e}\,{}_0{\lambda _A} = 0\). Obviously, the constant spinors form a two-complex-dimensional vector space. Then, by the falloff properties, \({D_e}\,{}_0{\lambda _A} = {\mathcal O}({r^{- 2}})\). Thus, we can define the asymptotically constant spinor fields to be those \({\lambda _A}\) that satisfy \({D_e}{\lambda _A} = {\mathcal O}({r^{- 2}})\), where \({D_e}\) is the intrinsic Levi-Civita derivative operator on Σ. Note that this implies that, with the notation of Eq. (4.6), all the chiral irreducible parts, \({\Delta ^ +}\lambda, \,{\Delta ^ -}\lambda, \,{{\mathcal T}^ +}\lambda\), and \({{\mathcal T}^ -}\lambda\) of the derivative of the asymptotically constant spinor field \({\lambda _A}\) are \({\mathcal O}({r^{- 2}})\).

4.2.4 Large spheres near null infinity

Let the spacetime be asymptotically flat at future null infinity in the sense of Penrose [413, 414, 415, 426] (see also [208]), i.e., the physical spacetime can be conformally compactified by an appropriate boundary ℐ+. Then future null infinity ℐ+ will be a null hypersurface in the conformally rescaled spacetime. Topologically it is \({\rm{\mathbb R}} \times {S^2}\), and the conformal factor can always be chosen such that the induced metric on the compact spacelike slices of ℐ+ is the metric of the unit sphere. Fixing such a slice \({{\mathcal S}_0}\) (called ‘the origin cut of ℐ+’) the points of ℐ+ can be labeled by a null coordinate, namely the affine parameter u ∈ ℝ along the null geodesic generators of ℐ+ measured from \({{\mathcal S}_0}\) and, for example, the familiar complex stereographic coordinates \((\zeta, \bar \zeta) \in {S^2}\), defined first on the origin cut \({{\mathcal S}_0}\) and then extended in a natural way along the null generators to the whole ℐ+. Then any other cut \({\mathcal S}\) of ℐ+ can be specified by a function \(u = f(\zeta, \bar \zeta)\). In particular, the cuts \({{\mathcal S}_u}: = \{u = {\rm{const}}.\}\) are obtained from \({{\mathcal S}_0}\) by a pure time translation.

The coordinates \((u,\zeta, \bar \zeta)\) can be extended to an open neighborhood of ℐ+ in the spacetime in the following way. Let \({{\mathcal N}_u}\) be the family of smooth outgoing null hypersurfaces in a neighborhood of ℐ+, such that they intersect the null infinity just in the cuts \({{\mathcal S}_u}\), i.e., \({{\mathcal N}_u} \cap {{\mathscr I}^ +} = {{\mathcal S}_u}\). Then let r be the affine parameter in the physical metric along the null geodesic generators of \({{\mathcal N}_u}\). Then \((u,r,\zeta, \bar \zeta)\) forms a coordinate system. The u = const., r = const. two-surfaces \({{\mathcal S}_{u,r}}\) (or simply \({{\mathcal S}_r}\) if no confusion can arise) are spacelike topological two-spheres, which are called large spheres of radius r near future null infinity. Obviously, the affine parameter r is not unique, its origin can be changed freely: \(\bar r: = r + g(u,\zeta, \bar \zeta)\) is an equally good affine parameter for any smooth g. Imposing certain additional conditions to rule out such coordinate ambiguities we arrive at a ‘Bondi-type coordinate system’ (see Footnote 7). In many of the large-sphere calculations of the quasi-local quantities the large spheres should be assumed to be large spheres not only in a general null, but in a Bondi-type coordinate system. For a detailed discussion of the coordinate freedom left at the various stages in the introduction of these coordinate systems, see, for example, [394, 393, 107].

In addition to the coordinate system, we need a Newman-Penrose null tetrad, or rather a GHP spinor dyad, \(\varepsilon _{\rm{A}}^A = \{{o^A},{\iota ^A}\}, \,{\rm{A = 0,1}}\), on the hypersurfaces \({{\mathcal N}_u}\). (Thus, boldface indices are referring to the GHP spin frame.) It is natural to choose \({o^A}\) such that \({l^a}: = {o^A}{{\bar o}^{{A\prime}}}\) be the tangent \({(\partial /\partial r)^a}\) of the null geodesic generators of \({{\mathcal N}_u}\), and \({o^A}\) itself be constant along \({l^a}\). Newman and Unti [394] chose \({\iota ^A}\) to be parallelly propagated along \({l^a}\). This choice yields the vanishing of a number of spin coefficients (see, for example, the review [393]). The asymptotic solution of the Einstein-Maxwell equations as a series of 1/r in this coordinate and tetrad system is given in [394, 179, 425], where all the nonvanishing spin coefficients and metric and curvature components are listed. In this formalism the gravitational waves are represented by the u-derivative \({{\dot \sigma}^0}\) of the asymptotic shear of the null geodesic generators of the outgoing null hypersurfaces \({{\mathcal N}_u}\).

From the point of view of the large sphere calculations of the quasi-local quantities, the choice of Newman and Unti for the spinor basis is not very convenient. It is more natural to adapt the GHP spin frame to the family of the large spheres of constant ‘radius’ r, i.e., to require \({m^a}: = {o^A}{{\bar \iota}^{{A{\prime}}}}\) and \({{\bar m}^a} = {\iota ^A}{{\bar o}^{{A{\prime}}}}\) to be tangents of the spheres. This can be achieved by an appropriate null rotation of the Newman-Unti basis about the spinor \({o^A}\). This rotation yields a change of the spin coefficients and the metric and curvature components. As far as the present author is aware, the rotation with the highest accuracy was done for the solutions of the Einstein-Maxwell system by Shaw [455].

In contrast to the spatial-infinity case, the ‘natural’ definition of the asymptotically constant spinor fields yields identically zero spinors in general [106]. Nontrivial constant spinors in this sense could exist only in the absence of the outgoing gravitational radiation, i.e., when \({{\dot \sigma}^0} = 0\). In the language of Section 4.1.7, this definition would be \({\lim\nolimits_{r \rightarrow \infty}}r{\Delta ^ +}\lambda = 0\), \({\lim\nolimits_{r \rightarrow \infty}}r{\Delta ^ -}\lambda = 0\), \({\lim\nolimits_{r \rightarrow \infty}}r{{\mathcal T}^ +}\lambda = 0\) and \({\lim\nolimits_{r \rightarrow \infty}}r{{\mathcal T}^ -}\lambda = 0\). However, as Bramson showed [106], half of these conditions can be imposed. Namely, at future null infinity \({{\mathcal C}^ +}\lambda : = ({\Delta ^ +} \oplus {{\mathcal T}^ -})\lambda = 0\) (and at past null infinity \({{\mathcal C}^ -}\lambda : = ({\Delta ^ -} \oplus {{\mathcal T}^ +})\lambda = 0\)) can always be imposed asymptotically, and has two linearly-independent solutions \(\lambda _A^{\underline A},\underline A = 0,1\), on \({{\mathscr I}^ +}\) (or on \({{\mathscr I}^ -}\), respectively). The space \({\bf{S}}_\infty ^{\underline A}\) of its solutions turns out to have a natural symplectic metric \({\varepsilon _{\underline A \underline B}}\), and we refer to \(({\bf{S}}_\infty ^{\underline A},{\varepsilon _{\underline A \underline B}})\) as future asymptotic spin space. Its elements are called asymptotic spinors, and the equations \({\lim\nolimits_{r \rightarrow \infty}}r{{\mathcal C}^ \pm}\lambda = 0\), the future/past asymptotic twistor equations. At \({{\mathscr I}^ +}\) asymptotic spinors are the spinor constituents of the BMS translations: Any such translation is of the form \({K^{\underline A {{\underline A}{\prime}}}}\lambda _{\underline A}^A\bar \lambda _{{{\underline A}{\prime}}}^{{A{\prime}}} = {K^{\underline A {{\underline A}{\prime}}}}\lambda _{\underline A}^1\bar \lambda _{{{\underline A}{\prime}}}^{{1{\prime}}}{\iota ^A}{{\bar \iota}^{{A{\prime}}}}\) for some constant Hermitian matrix \({K^{\underline A {{\underline A}{\prime}}}}\). 
Similarly, (apart from the proper supertranslation content) the components of the anti-self-dual part of the boost-rotation BMS vector fields are \(- \sigma _{\rm{i}}^{\underline A \underline B}\lambda _{\underline A}^1\lambda _{\underline B}^1\), where \(\sigma _{\rm{i}}^{\underline A \underline B}\) are the standard SU(2) Pauli matrices (divided by \(\sqrt 2\)) [496]. Asymptotic spinors can be recovered as the elements of the kernel of several other operators built from \({\Delta ^ +}\), \({\Delta ^ -}\), \({{\mathcal T}^ +}\), and \({{\mathcal T}^ -}\), too. In the present review we use only the fact that asymptotic spinors can be introduced as antiholomorphic spinors (see also Section 8.2.1), i.e., the solutions of \({{\mathcal H}^ -}\lambda : = ({\Delta ^ -} \oplus {{\mathcal T}^ -})\lambda = 0\) (and at past null infinity as holomorphic spinors), and as special solutions of the two-surface twistor equation \({\mathcal N}\lambda : = ({{\mathcal T}^ +} \oplus {{\mathcal T}^ -})\lambda = 0\) (see also Section 7.2.1). These operators, together with others reproducing the asymptotic spinors, are discussed in [496].

The Bondi-Sachs energy-momentum given in the Newman-Penrose formalism has already become its ‘standard’ form. It is the unit sphere integral on the cut \({\mathcal S}\) of a combination of the leading term \(\psi _2^0\) of the Weyl spinor component \({\psi _2}\), the asymptotic shear \({\sigma ^0}\) and its u-derivative, weighted by the first four spherical harmonics (see, for example, [393, 426]):

$$P_{{\rm{BS}}}^{\underline A \underline {B{\prime}}} = - {1 \over {4\pi G}}\oint {\left({\psi _2^0 + {\sigma ^0}{{\dot {\bar \sigma}}^0}} \right)\lambda _0^{\underline A}\bar \lambda _{0{\prime}}^{\underline {B{\prime}}}\;d{\mathcal S}},$$
(4.14)

where \(\lambda _0^{\underline A}: = \lambda _A^{\underline A}{o^A},\underline A = 0,1\), are the \({o^A}\)-components of the vectors of a spin frame in the space of the asymptotic spinors. (For the various realizations of these spinors see, e.g., [496].) The minimal assumptions on the physical Ricci tensor that already ensure that the Bondi-Sachs energy-momentum and Bondi’s mass-loss are well defined are determined by Tafel [505]. The expression of the Bondi-Sachs energy-momentum in terms of the conformal factor is also given there.

Similarly, the various definitions for angular momentum at null infinity could be rewritten in this formalism. Although there is no generally accepted definition for angular momentum at null infinity in general spacetimes, in stationary and in axi-symmetric spacetimes there is. The former is the unit sphere integral on the cut \({\mathcal S}\) of the leading term of the Weyl spinor component \({{\bar \psi}_{{1{\prime}}}}\), weighted by appropriate (spin-weighted) spherical harmonics:

$${J^{\underline A \underline B}} = {1 \over {8\pi G}}\oint {\bar \psi _{{1{\prime}}}^0\,\lambda _0^{\underline A}\lambda _0^{\underline B}\,d{\mathcal S}}.$$
(4.15)

In particular, Bramson’s expression also reduces to this ‘standard’ expression in the absence of the outgoing gravitational radiation [109]. If the spacetime is axi-symmetric, then the generally accepted definition of angular momentum is that of Komar with the numerical coefficient \({1 \over {16\pi G}}\) (rather than \({1 \over {8\pi G}}\)) and α = 0 in (3.15). This view is supported by the partial results of a quasi-local canonical analysis of general relativity given in [499], too.

Instead of the Bondi-type coordinates above, one can introduce other ‘natural’ coordinates in a neighborhood of ℐ+. Such is the one based on the outgoing asymptotically-shear-free null geodesics [27]. While the Bondi-type coordinate system is based on the null geodesic generators of the outgoing null hypersurfaces \({{\mathcal N}_u}\), and hence, in the rescaled metric these generators are orthogonal to the cuts \({{\mathcal S}_u}\), the new coordinate system is based on the use of outgoing null geodesic congruences that extend to ℐ+ but are not orthogonal to the cuts of ℐ+ (and hence, in general, they have twist). The definition of the new coordinates \((u,r,\zeta, \bar \zeta)\) is analogous to that of the Bondi-type coordinates: \((u, \zeta, \bar \zeta)\) labels the intersection point of the actual geodesic and ℐ+, while r is the affine parameter along the geodesic. The tangent \({{\tilde l}^a}\) of this null congruence is asymptotically null rotated about \({n^a}\): In the NP basis \(\{{l^a},{n^a},{m^a},{{\bar m}^a}\}\) above \({{\tilde l}^a} = {l^a} + b{{\bar m}^a} + \bar b{m^a} + b\bar b{n^a}\), where \(b = - L(u,\zeta, \bar \zeta)/r + {\mathcal O}({r^{- 2}})\) and \(L = L(u,\zeta, \bar \zeta)\) is a complex valued function (with spin weight one) on ℐ+. Then Aronson and Newman show in [27] that if L is chosen to satisfy \(\eth L + L\dot L = {\sigma ^0}\), then the asymptotic shear of the congruence is, in fact, of order \({r^{- 3}}\), and by an appropriate choice for the other vectors of the NP basis many spin coefficients can be made zero. In this framework it is the function L that plays a role analogous to that of \({\sigma ^0}\), and, indeed, the asymptotic solution of the field equations is given in terms of L in [27]. This L can be derived from the solution Z of the good-cut equation, which, however, is not uniquely determined, but depends on four complex parameters: \(Z = Z({Z^{\underline a}},\zeta, \bar \zeta)\). 
It is this freedom that is used in [325, 326] to introduce the angular momentum at future null infinity (see Section 3.2.4). Further discussion of these structures, in particular their connection with the solutions of the good-cut equation and the H-space, as well as their applications, is given in [324, 325, 326, 5].
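The null rotation about \({n^a}\) used above can be checked numerically. The following is only an illustrative sketch: it assumes a concrete NP tetrad in Minkowski spacetime with signature (+, −, −, −), normalized so that \({l^a}{n_a} = 1\) and \({m^a}{{\bar m}_a} = - 1\), and an arbitrary sample value of b (both are assumptions made for the example, not taken from [27]). It verifies that \({l^a} + b{{\bar m}^a} + \bar b{m^a} + b\bar b{n^a}\) is again a real null vector with unchanged contraction with \({n^a}\).

```python
import math

def dot(u, v):
    """Bilinear Minkowski product, signature (+, -, -, -), no complex conjugation."""
    return u[0] * v[0] - u[1] * v[1] - u[2] * v[2] - u[3] * v[3]

# Assumed sample NP tetrad in Minkowski spacetime: l.n = 1, m.mbar = -1.
s = 1.0 / math.sqrt(2.0)
l = (s, 0.0, 0.0, s)
n = (s, 0.0, 0.0, -s)
m = (0.0, s, 1j * s, 0.0)
mbar = (0.0, s, -1j * s, 0.0)

def null_rotate_about_n(b):
    """Return l + b*mbar + conj(b)*m + |b|^2 * n for a complex parameter b."""
    bb = (b * b.conjugate()).real
    return tuple(
        l[i] + b * mbar[i] + b.conjugate() * m[i] + bb * n[i] for i in range(4)
    )

b = complex(0.3, -0.7)           # arbitrary sample value
l_tilde = null_rotate_about_n(b)

norm = dot(l_tilde, l_tilde)     # vanishes (up to rounding) for a null vector
contraction = dot(l_tilde, n)    # stays equal to l.n = 1
```

The cancellation works because the cross term \(2b\bar b\,{l^a}{n_a}\) is exactly compensated by \(2b\bar b\,{m^a}{{\bar m}_a}\).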

4.2.5 Other special situations

In the weak field approximation of general relativity [525, 36, 534, 426, 303] the gravitational field is described by a symmetric tensor field \({h_{ab}}\) on Minkowski spacetime \(({{\rm{R}}^4},g_{ab}^0)\), and the dynamics of the field \({h_{ab}}\) is governed by the linearized Einstein equations, i.e., essentially the wave equation. Therefore, the tools and techniques of the Poincaré-invariant field theories, in particular the Noether-Belinfante-Rosenfeld procedure outlined in Section 2.1 and the ten Killing vectors of the background Minkowski spacetime, can be used to construct the conserved quantities. It turns out that the symmetric energy-momentum tensor of the field \({h_{ab}}\) is essentially the second-order term in the Einstein tensor of the metric \({g_{ab}}: = g_{ab}^0 + {h_{ab}}\). Thus, in the linear approximation the field \({h_{ab}}\) does not contribute to the global energy-momentum and angular momentum of the matter + gravity system, and hence these quantities have the form (2.5) with the linearized energy-momentum tensor of the matter fields. However, as we will see in Section 7.1.1, this energy-momentum and angular momentum can be re-expressed as a charge integral of the (linearized) curvature [481, 277, 426].

pp-wave spacetimes are defined to be those that admit a constant null vector field \({L^a}\), and they are interpreted as describing pure plane-fronted gravitational waves with parallel rays. If matter is present, then it is necessarily pure radiation with wave-vector \({L^a}\), i.e., \({T^{ab}}{L_b} = 0\) holds [478]. A remarkable feature of the pp-wave metrics is that, in the usual coordinate system, the Einstein equations become a two-dimensional linear equation for a single function. In contrast to the approach adopted almost exclusively, Aichelburg [8] considered this field equation as an equation for a boundary value problem. As we will see, from the point of view of the quasi-local observables this is a particularly useful and natural standpoint. If a pp-wave spacetime admits an additional spacelike Killing vector \({K^a}\) with closed \({S^1}\) orbits, i.e., it is cyclically symmetric too, then \({L^a}\) and \({K^a}\) are necessarily commuting and are orthogonal to each other, because otherwise an additional timelike Killing vector would also be admitted [485].

Since the final state of stellar evolution (the neutron star or black hole state) is expected to be described by an asymptotically flat, stationary, axisymmetric spacetime, the significance of these spacetimes is obvious. It is conjectured that this final state is described by the Kerr-Newman (either outer or black hole) solution with some well-defined mass, angular momentum and electric charge parameters [534]. Thus, axisymmetric two-surfaces in these solutions may provide domains, which are general enough but for which the quasi-local quantities are still computable. According to a conjecture by Penrose [418], the (square root of the) area of the event horizon provides a lower bound for the total ADM energy. For the Kerr-Newman black hole this area is \(4\pi (2{m^2} - {e^2} + 2m\sqrt {{m^2} - {e^2} - {a^2}})\). Thus, particularly interesting two-surfaces in these spacetimes are the spacelike cross sections of the event horizon [80].
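The horizon area formula quoted above can be evaluated directly. The sketch below assumes geometric units (G = c = 1) and sample parameter values chosen only for illustration; the function names are hypothetical helpers, not part of any library. It computes the area, cross-checks it against the equivalent form \(4\pi (r_ + ^2 + {a^2})\) with \({r_ +} = m + \sqrt {{m^2} - {e^2} - {a^2}}\), and compares \(\sqrt {{\rm{Area}}/16\pi}\) with m, which is the form of the Penrose bound relevant for these solutions.

```python
import math

def kerr_newman_horizon_area(m, a, e):
    """Event horizon area 4*pi*(2m^2 - e^2 + 2m*sqrt(m^2 - e^2 - a^2)), with G = c = 1."""
    disc = m * m - e * e - a * a
    if disc < 0:
        raise ValueError("no horizon: m^2 < a^2 + e^2")
    return 4.0 * math.pi * (2.0 * m * m - e * e + 2.0 * m * math.sqrt(disc))

def area_from_outer_radius(m, a, e):
    """Same area written as 4*pi*(r_plus^2 + a^2), used here as a cross-check."""
    r_plus = m + math.sqrt(m * m - e * e - a * a)
    return 4.0 * math.pi * (r_plus * r_plus + a * a)

def irreducible_mass(area):
    """sqrt(Area / 16 pi) in units with G = 1."""
    return math.sqrt(area / (16.0 * math.pi))

# Sample parameters (illustrative assumptions only).
m, a, e = 1.0, 0.5, 0.3
area = kerr_newman_horizon_area(m, a, e)
m_irr = irreducible_mass(area)

# Schwarzschild limit: the irreducible mass equals m.
m_irr_schwarzschild = irreducible_mass(kerr_newman_horizon_area(1.0, 0.0, 0.0))
```

For any admissible parameters the irreducible mass does not exceed m, with equality only in the Schwarzschild case a = e = 0.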

There is a well-defined notion of total energy-momentum not only in the asymptotically flat, but even in the asymptotically anti-de Sitter spacetimes as well. This is the Abbott-Deser energy [1], whose positivity has also been proven under similar conditions that we had to impose in the positivity proof of the ADM energy [220]. (In the presence of matter fields, e.g., a self-interacting scalar field, the falloff properties of the metric can be weakened such that the ‘charges’ defined at infinity and corresponding to the asymptotic symmetry generators remain finite [265].) The conformal technique, initiated by Penrose, is used to give a precise definition of the asymptotically anti-de Sitter spacetimes and to study their general, basic properties in [42]. A comparison and analysis of the various definitions of mass for asymptotically anti-de Sitter metrics is given in [150].

Extending the spinorial proof [349] of the positivity of the total energy in asymptotically anti-de Sitter spacetime, Chruściel, Maerten and Tod [149] give an upper bound for the angular momentum and center-of-mass in terms of the total mass and the cosmological constant. (Analogous investigations show that there is a similar bound at the future null infinity of asymptotically flat spacetimes with no outgoing energy flux, provided the spacetime contains a constant-mean-curvature, hyperboloidal, initial-data set on which the dominant energy condition is satisfied. In this bound the role of the cosmological constant is played by the (constant) mean curvature of the hyperboloidal spacelike hypersurface [151].) Thus, it is natural to ask whether or not a specific quasi-local energy-momentum or angular momentum expression has the correct limit for large spheres in asymptotically anti-de Sitter spacetimes.

4.3 On lists of criteria of reasonableness of the quasi-local quantities

In the literature there are various, more or less ad hoc, ‘lists of criteria of reasonableness’ of the quasi-local quantities (see, for example, [176, 143]). However, before discussing them, it seems useful to first formulate some general principles that any quasi-local quantity should satisfy.

4.3.1 General expectations

In nongravitational physics the notions of conserved quantities are connected with symmetries of the system, and they are introduced through some systematic procedure in the Lagrangian and/or Hamiltonian formalism. In general relativity the total energy-momentum and angular momentum are two-surface observables, thus, we concentrate on them even at the quasi-local level. These facts motivate our three a priori expectations:

  1. 1.

    The quasi-local quantities that are two-surface observables should depend only on the two-surface data, but they cannot depend, e.g., on the way that the various geometric structures on \({\mathcal S}\) are extended off the two-surface. There seems to be no a priori reason why the two-surface would have to be restricted to spherical topology. Thus, in the ideal case, the general construction of the quasi-local energy-momentum and angular momentum should work for any closed orientable spacelike two-surface.

  2. 2.

    It is desirable to derive the quasi-local energy-momentum and angular momentum as the charge integral (Lagrangian interpretation) and/or as the value of the Hamiltonian on the constraint surface in the phase space (Hamiltonian interpretation). If they are introduced in some other way, they should have a Lagrangian and/or Hamiltonian interpretation.

  3. 3.

    These quantities should correspond to the ‘quasi-symmetries’ of the two-surface, which quasi-symmetries are special spacetime vector fields on the two-surface. In particular, the quasi-local energy-momentum should be expected to be in the dual of the space of the ‘quasi-translations’, and the angular momentum in the dual of the space of the ‘quasi-rotations’.

To see that these conditions are nontrivial, let us consider the expressions based on the linkage integral (3.15). \({L_{\mathcal S}}[{\bf{K}}]\) does not satisfy the first part of our first requirement. In fact, it depends on the derivative of the normal components of \({K^a}\) in the direction orthogonal to \({\mathcal S}\) for any value of the parameter α. Thus, it depends not only on the geometry of \({\mathcal S}\) and the vector field \({K^a}\) given on the two-surface, but on the way in which \({K^a}\) is extended off the two-surface. Therefore, \({L_{\mathcal S}}[{\bf{K}}]\) is ‘less quasi-local’ than \({A_{\mathcal S}}[\omega ]\) or \({H_{\mathcal S}}[\lambda, \bar \mu ]\) that will be introduced in Sections 7.2.1 and 7.2.2, respectively.

We will see that the Hawking energy satisfies our first requirement, but not the second and the third ones. The Komar integral (i.e., half of the linkage for α = 0) has the form of the charge integral of a superpotential, \({1 \over {16\pi G}}\oint\nolimits_{\mathcal S} {{\nabla ^{[a}}{K^{b]}}{1 \over 2}{\varepsilon _{abcd}}}\), i.e., it has a Lagrangian interpretation. The corresponding conserved Komar-current was defined by \(8\pi G{C^a}[{\bf{K}}]: = {G^a}_b{K^b} + {\nabla _b}{\nabla ^{[a}}{K^{b]}}\). However, its flux integral on some compact spacelike hypersurface with boundary \({\mathcal S}: = \partial \Sigma\) cannot be a Hamiltonian on the ADM phase space in general. In fact, it is

$$\begin{array}{*{20}c} {{}_KH\;[{\bf{K}}]: = \int\nolimits_\Sigma {{C^a}[{\bf{K}}]\,{t_a}\;d\Sigma}} \\ {= \int\nolimits_\Sigma {(cN + {c_a}{N^a})\;d\Sigma + {1 \over {8\pi G}}\oint\nolimits_{\mathcal S} {{\upsilon _a}\left({{\chi ^a}_b{N^b} - {D^a}N + {1 \over {2N}}{{\dot N}^a}} \right)\;d{\mathcal S}}.}} \\ \end{array}$$
(4.16)

Here c and \({c_a}\) are, respectively, the Hamiltonian and momentum constraints of the vacuum theory, \({t_a}\) is the future-directed unit normal to Σ, \({\upsilon _a}\) is the outward-directed unit normal to \({\mathcal S}\) in Σ, and N and \({N^a}\) are the lapse and shift part of \({K^a}\), respectively, defined by \({K^a} = :N{t^a} + {N^a}\). Thus, \({}_KH[{\bf{K}}]\) is a well-defined function of the configuration and velocity variables \((N,{N^a},{h_{ab}})\) and \((\dot N,{{\dot N}^a},{{\dot h}_{ab}})\), respectively. However, since the velocity \({{\dot N}^a}\) cannot be expressed by the canonical variables (see e.g. [558, 63]), \({}_KH[{\bf{K}}]\) can be written as a function on the ADM phase space only if the boundary conditions at ∂Σ ensure the vanishing of the integral of \({\upsilon _a}{{\dot N}^a}/N\).

4.3.2 Pragmatic criteria

Since in certain special situations there are generally accepted definitions for the energy-momentum and angular momentum, it seems reasonable to expect that in these situations the quasi-local quantities reduce to them. One half of the pragmatic criteria is just this expectation, and the other is a list of some a priori requirements on the behavior of the quasi-local quantities.

One such list for the energy-momentum and mass, based mostly on [176, 143] and the properties of the quasi-local energy-momentum of the matter fields of Section 2.2, might be the following:

  1. 1.1

    The quasi-local energy-momentum \(P_{\mathcal S}^{\underline a}\) must be a future-pointing nonspacelike vector (assuming that the matter fields satisfy the dominant energy condition on some Σ for which \({\mathcal S} = \partial \Sigma\), and maybe some form of the convexity of \({\mathcal S}\) should be required) (‘positivity’).

  2. 1.2

    \(P_{\mathcal S}^{\underline a}\) must be zero iff D(Σ) is flat, and null iff D(Σ) has a pp-wave geometry with pure radiation (‘rigidity’).

  3. 1.3

    \(P_{\mathcal S}^{\underline a}\) must give the correct weak field limit.

  4. 1.4

    \(P_{\mathcal S}^{\underline a}\) must reproduce the ADM, Bondi-Sachs and Abbott-Deser energy-momenta in the appropriate limits (‘correct large-sphere behaviour’).

  5. 1.5

    For small spheres \(P_{\mathcal S}^{\underline a}\) must give the expected results (‘correct small sphere behaviour’):

    1. 1.

      \({4 \over 3}\pi {r^3}{T^{ab}}{t_b}\) in nonvacuum and

    2. 2.

      \(k{r^5}{T^{abcd}}{t_b}{t_c}{t_d}\) in vacuum for some positive constant k and the Bel-Robinson tensor \({T_{abcd}}\).

  6. 1.6

    For round spheres \(P_{\mathcal S}^{\underline a}\) must yield the ‘standard’ Misner-Sharp round-sphere expression.

  7. 1.7

    For marginally trapped surfaces the quasi-local mass \({m_{\mathcal S}}\) must be the irreducible mass \(\sqrt {{\rm{Area(}}{\mathcal S}{\rm{)/16}}\pi {G^2}}\).

For a different view on the positivity of the quasi-local energy see [391]. Item 1.7 is motivated by the expectation that the quasi-local mass associated with the apparent horizon of a black hole (i.e., the outermost marginally-trapped surface in a spacelike slice) be just the irreducible mass [176, 143].

Usually, \({m_{\mathcal S}}\) is expected to be monotonically increasing in some appropriate sense [143]. For example, if \({{\mathcal S}_1} = \partial \Sigma\) for some achronal (and hence spacelike or null) hypersurface Σ in which \({{\mathcal S}_2}\) is a spacelike closed two-surface and the dominant energy condition is satisfied on Σ, then \({m_{{{\mathcal S}_1}}} \geq {m_{{{\mathcal S}_2}}}\) seems to be a reasonable expectation [176]. (However, see also Section 4.3.3.) A further, and, in fact, a related issue is the (post) Newtonian limit of the quasi-local mass expressions. In item 1.4 we expected, in particular, that the quasi-local mass tends to the ADM mass at spatial infinity. However, near spatial infinity the radiation and the dynamics of the fields and the geometry die off rapidly. Hence, in vacuum asymptotically flat spacetimes in the asymptotic regime the gravitational ‘field’ approaches the Newtonian one, and hence its contribution to the total energy of the system is similar to that of the negative definite binding energy [400, 199]. Therefore, it seems natural to expect that the quasi-local mass tends to the ADM mass as a monotonically decreasing function (see also Sections 3.1.1 and 12.3.3).

In contrast to the energy-momentum and angular momentum of the matter fields on the Minkowski spacetime, the additivity of the energy-momentum (and angular momentum) is not expected. In fact, if \({{\mathcal S}_1}\) and \({{\mathcal S}_2}\) are two connected two-surfaces, then, for example, the corresponding quasi-local energy-momenta would belong to different vector spaces, namely to the dual of the space of the quasi-translations of the first and second two-surface, respectively. Thus, even if we consider the disjoint union \({{\mathcal S}_1} \cup {{\mathcal S}_2}\) to surround a single physical system, we can add the energy-momentum of the first to that of the second only if there is some physically/geometrically distinguished rule defining an isomorphism between the different vector spaces of the quasi-translations. Such an isomorphism would be provided for example by some naturally-chosen globally-defined flat background. However, as we discussed in Section 3.1.2, general relativity itself does not provide any background. The use of such a background would contradict the complete diffeomorphism invariance of the theory. Nevertheless, the quasi-local mass and the length of the quasi-local Pauli-Lubanski spin of different surfaces can be compared, because they are scalar quantities.

Similarly, any reasonable quasi-local angular momentum expression \(J_{\mathcal S}^{\underline a \underline b}\) may be expected to satisfy the following:

  1. 2.1

    \(J_{\mathcal S}^{\underline a \underline b}\) must give zero for round spheres.

  2. 2.2

    For two-surfaces with zero quasi-local mass, the Pauli-Lubanski spin should be proportional to the (null) energy-momentum four-vector \(P_{\mathcal S}^{\underline a}\).

  3. 2.3

    \(J_{\mathcal S}^{\underline a \underline b}\) must give the correct weak field limit.

  4. 2.4

    \(J_{\mathcal S}^{\underline a \underline b}\) must reproduce the generally-accepted spatial angular momentum at spatial infinity, and in stationary and in axi-symmetric spacetimes it should reduce to the ‘standard’ expressions at the null infinity as well (‘correct large-sphere behaviour’).

  5. 2.5

    For small spheres the anti-self-dual part of \(J_{\mathcal S}^{\underline a \underline b}\), defined with respect to the center of the small sphere (the ‘vertex’ in Section 4.2.2) is expected to give \({4 \over 3}\pi {r^3}{T_{cd}}{t^c}(r{\varepsilon ^{D(A}}{t^{B){D{\prime}}}})\) in nonvacuum and \(C{r^5}{T_{cdef}}{t^c}{t^d}{t^e}(r{\varepsilon ^{F(A}}{t^{B){F{\prime}}}})\) in vacuum for some constant C (‘correct small sphere behaviour’).

Since there is no generally accepted definition for the angular momentum at null infinity, we cannot expect anything definite there in nonstationary, non-axi-symmetric spacetimes. Similarly, there are inequivalent suggestions for the center-of-mass at spatial infinity (see Sections 3.2.2 and 3.2.4).

4.3.3 Incompatibility of certain ‘natural’ expectations

As Eardley noted in [176], probably no quasi-local energy definition exists, which would satisfy all of his criteria. In fact, it is easy to see that this is the case. Namely, any quasi-local energy definition, which reduces to the ‘standard’ expression for round spheres cannot be monotonic, as the closed Friedmann-Robertson-Walker or the \({\Omega _{M,m}}\) spacetimes show explicitly. The points where the monotonicity breaks down are the extremal (maximal or minimal) surfaces, which represent an event horizon in the spacetime. Thus, one may argue that since the event horizon hides a portion of spacetime, we cannot know the details of the physical state of the matter + gravity system behind the horizon. Hence, in particular, the monotonicity of the quasi-local mass may be expected to break down at the event horizon. However, although for stationary systems (or at the moment of time symmetry of a time-symmetric system) the event horizon corresponds to an apparent horizon (or to an extremal surface, respectively), for general nonstationary systems the concepts of the event and apparent horizons deviate. Thus, it does not seem possible to formulate the causal argument of Section 4.3.2 in the hypersurface Σ. Actually, the root of the nonmonotonicity is the fact that the quasi-local energy is a two-surface observable in the sense of requirement 1 in Section 4.3.1 above. This does not mean, of course, that in certain restricted situations the monotonicity (‘local monotonicity’) could not be proven. This local monotonicity may be based, for example, on Lie dragging of the two-surface along some special spacetime vector field.

If the quasi-local mass should, in fact, tend to the ADM mass as a monotonically decreasing function in the asymptotic region of asymptotically flat spacetimes, then neither item 1.6 nor 1.7 can be expected to hold. In fact, if the dominant energy condition is satisfied, then the standard round-sphere Misner-Sharp energy is a monotonically increasing or constant (rather than strictly decreasing) function of the area radius r. For example, the Misner-Sharp energy in the Schwarzschild spacetime is the constant function m/G. The Schwarzschild solution provides a counterexample to item 1.7, too: Since both its ADM mass and the irreducible mass of the black hole are m/G, any quasi-local mass function of the radius r which is strictly decreasing for large r and coincides with them at infinity and on the horizon, respectively, would have to take its maximal value on some two-surface outside the horizon. However, it is not clear why such a geometrically, and hence physically, distinguished two-surface should exist.
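The behaviour of the round-sphere energy described above can be illustrated with a short numerical sketch. It assumes a static, spherically symmetric metric written in terms of the area radius r with \({g_{tt}} = f(r)\) and \({g_{rr}} = - 1/f(r)\), for which the round-sphere Misner-Sharp energy takes the form \(E(r) = r(1 - f(r))/2G\); this specific form, the helper names and the sample parameters are assumptions of the example. The Reissner-Nordström case is included only as an illustration of a matter field satisfying the dominant energy condition.

```python
def misner_sharp_static(f, r, G=1.0):
    """Round-sphere energy r*(1 - f(r))/(2G) for a static metric with g_tt = f, g_rr = -1/f."""
    return r * (1.0 - f(r)) / (2.0 * G)

def schwarzschild_f(m):
    """f(r) = 1 - 2m/r."""
    return lambda r: 1.0 - 2.0 * m / r

def reissner_nordstrom_f(m, e):
    """f(r) = 1 - 2m/r + e^2/r^2."""
    return lambda r: 1.0 - 2.0 * m / r + e * e / (r * r)

# Sample radii outside the horizon (illustrative values, G = 1).
radii = [3.0, 10.0, 100.0, 1000.0]

# Schwarzschild: the same value m/G on every round sphere.
schw = [misner_sharp_static(schwarzschild_f(1.0), r) for r in radii]

# Reissner-Nordstrom: m - e^2/(2r), increasing towards m from below.
rn = [misner_sharp_static(reissner_nordstrom_f(1.0, 0.5), r) for r in radii]
```

In both cases the energy approaches its value at infinity from below or stays constant, never as a strictly decreasing function of r.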

In the literature the positivity and monotonicity requirements are sometimes confused, and there is an ‘argument’ that the quasi-local gravitational energy cannot be positive definite, because the total energy of the closed universes must be zero. However, this argument is based on the implicit assumption that the quasi-local energy is associated with a compact three-dimensional domain, which, together with the positive definiteness requirement would, in fact, imply the monotonicity and a positive total energy for the closed universe. If, on the other hand, the quasi-local energy-momentum is associated with two-surfaces, then the energy may be positive definite and not monotonic. The standard round sphere energy expression (4.7) in the closed Friedmann-Robertson-Walker spacetime, or, more generally, the Dougan-Mason energy-momentum (see Section 8.2.3) are such examples.
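A minimal numerical sketch of a positive but non-monotonic round-sphere energy follows. It assumes the moment of time symmetry of a closed Friedmann-Robertson-Walker spacetime, where the spatial geometry is a round three-sphere of radius a with line element \({a^2}(d{\chi ^2} + {\sin ^2}\chi \,d{\Omega ^2})\); the area radius of the sphere at χ is then \(r = a\sin \chi\), the squared gradient of r is \({\cos ^2}\chi\), and the round-sphere energy becomes \((a/2G){\sin ^3}\chi\). The restriction to this slice, the function name and the sample grid are assumptions of the example.

```python
import math

def round_sphere_energy_closed_frw(chi, a=1.0, G=1.0):
    """Round-sphere energy on the time-symmetric slice of closed FRW: (a/2G) sin^3(chi)."""
    r = a * math.sin(chi)              # area radius of the sphere at chi
    grad_r_sq = math.cos(chi) ** 2     # |D r|^2 on the round three-sphere
    return r * (1.0 - grad_r_sq) / (2.0 * G)

# Sample spheres between the two poles chi = 0 and chi = pi.
chis = [k * math.pi / 20.0 for k in range(1, 20)]
energies = [round_sphere_energy_closed_frw(chi) for chi in chis]

# The largest value is attained on the maximal surface chi = pi/2.
peak_chi = chis[energies.index(max(energies))]
```

The energy is positive on every sphere, grows up to the maximal surface at χ = π/2 and then decreases to zero as the sphere shrinks to the opposite pole, so the whole closed slice is assigned zero energy.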

5 The Bartnik Mass and its Modifications

5.1 The Bartnik mass

5.1.1 The main idea

One of the most natural ideas of quasi-localization of the familiar ADM mass is due to Bartnik [54, 53]. His idea is based on the positivity of the ADM energy, and, roughly, can be summarized as follows. Let Σ be a compact, connected three-manifold with connected boundary \({\mathcal S}\), and let \({h_{ab}}\) be a (negative definite) metric and \({\chi _{ab}}\) a symmetric tensor field on Σ, such that they, as an initial data set, satisfy the dominant energy condition: if \(16\pi G\mu : = R + {\chi ^2} - {\chi _{ab}}{\chi ^{ab}}\) and \(8\pi G{j^a}: = {D_b}({\chi ^{ab}} - \chi {h^{ab}})\), then \(\mu \geq {(- {j_a}{j^a})^{1/2}}\). For the sake of simplicity we denote the triple \((\Sigma, {h_{ab}},{\chi _{ab}})\) by Σ. Then let us consider all the possible asymptotically flat initial data sets (\(\hat \Sigma, {{\hat h}_{ab}},{{\hat \chi}_{ab}}\)) with a single asymptotic end, denoted simply by \({\hat \Sigma}\), which satisfy the dominant energy condition, have finite ADM energy and are extensions of Σ above through its boundary \({\mathcal S}\). The set of these extensions will be denoted by \({\mathcal E}(\Sigma)\). By the positive energy theorem, \({\hat \Sigma}\) has non-negative ADM energy \({E_{{\rm{ADM}}}}(\hat \Sigma)\), which is zero precisely when \({\hat \Sigma}\) is a data set for the flat spacetime. Then we can consider the infimum of the ADM energies, inf \(\{{E_{{\rm{ADM}}}}(\hat \Sigma)\vert \hat \Sigma \; \in \;{\mathcal E}(\Sigma)\}\), where the infimum is taken on \({\mathcal E}(\Sigma)\). Obviously, by the non-negativity of the ADM energies, this infimum exists and is non-negative, and it is tempting to define the quasi-local mass of Σ by this infimum (Footnote 8). However, it is easy to see that, without further conditions on the extensions of \((\Sigma, {h_{ab}},{\chi _{ab}})\), this infimum is zero. In fact, Σ can be extended to an asymptotically flat initial data set \({\hat \Sigma}\) with arbitrarily small ADM energy such that \({\hat \Sigma}\) contains a horizon (for example in the form of an apparent horizon) between the asymptotically flat end and Σ. 
In particular, in the ‘\({\Omega _{M,m}}\)-spacetime’ discussed in Section 4.2.1 on round spheres, the spherically symmetric domain bounded by the maximal surface (with arbitrarily-large round-sphere mass M/G) has an asymptotically flat extension, the complete spacelike hypersurface of the data set for the \({\Omega _{M,m}}\)-spacetime itself, with arbitrarily small ADM mass m/G.

Obviously, the fact that the ADM energies of the extensions can be arbitrarily small is a consequence of the presence of a horizon hiding Σ from the outside. This led Bartnik [54, 53] to formulate his suggestion for the quasi-local mass of Σ. He concentrated on time-symmetric data sets (i.e., those for which the extrinsic curvature \({\chi _{ab}}\) is vanishing), when the horizon appears to be a minimal surface of topology \({S^2}\) in \({\hat \Sigma}\) (see, e.g., [213]), and the dominant energy condition is just the requirement of the non-negativity of the scalar curvature of the spatial metric: R ≥ 0. Thus, if \({{\mathcal E}_0}(\Sigma)\) denotes the set of asymptotically flat Riemannian geometries \(\hat \Sigma = (\hat \Sigma, {{\hat h}_{ab}})\) with non-negative scalar curvature and finite ADM energy that contain no stable minimal surface, then Bartnik’s mass is

$${m_{\rm{B}}}(\Sigma): = \inf \left\{{{E_{{\rm{ADM}}}}(\hat \Sigma)\vert \hat \Sigma \in {{\mathcal E}_0}(\Sigma)} \right\}.$$
(5.1)

The ‘no-horizon’ condition on \({\hat \Sigma}\) implies that topologically Σ is a three-ball. Furthermore, the definition of \({{\mathcal E}_0}(\Sigma)\) in its present form does not allow one to associate the Bartnik mass to those three-geometries \((\Sigma, {h_{ab}})\) that contain minimal surfaces inside Σ. Although formally the maximal two-surfaces inside Σ are not excluded, any asymptotically flat extension of such a Σ would contain a minimal surface. In particular, the spherically-symmetric three-geometry, with line element \(d{l^2} = - d{r^2} - {\sin ^2}r(d{\theta ^2} + {\sin ^2}\theta \,d{\phi ^2})\) with \((\theta, \phi) \in {S^2}\) and \(r \in [0,{r_0}]\), \(\pi /2 < {r_0} < \pi\), has a maximal two-surface at r = π/2, and any of its asymptotically flat extensions necessarily contains a minimal surface of area not greater than \(4\pi {\sin ^2}{r_0}\). Thus, the Bartnik mass (according to the original definition given in [54, 53]) cannot be associated with every compact time-symmetric data set \((\Sigma, {h_{ab}})\), even if Σ is topologically trivial. Since for \(0 < {r_0} < \pi /2\) this data set can be extended without any difficulty, this example shows that \({m_{\rm{B}}}\) is associated with the three-dimensional data set Σ, and not only with the two-dimensional boundary ∂Σ.

Of course, to rule out this limitation, one can modify the original definition by considering the set \({{\tilde {\mathcal E}}_0}(\mathcal S)\) of asymptotically flat Riemannian geometries \(\hat \Sigma = (\hat \Sigma, {{\hat h}_{ab}})\) (with non-negative scalar curvature, finite ADM energy and with no stable minimal surface), which contain \(({\mathcal S},{q_{ab}})\) as an isometrically-embedded Riemannian submanifold, and define \({{\tilde m}_{\rm{B}}}({\mathcal S})\) by Eq. (5.1) with \({{\tilde {\mathcal E}}_0}({\mathcal S})\) instead of \({{\mathcal E}_0}(\Sigma)\). Obviously, this \({{\tilde m}_{\rm{B}}}({\mathcal S})\) could be associated with a larger class of two-surfaces than the original \({m_{\rm{B}}}(\Sigma)\) can be with compact three-manifolds, and \(0 \leq {{\tilde m}_{\rm{B}}}(\partial \Sigma) \leq {m_{\rm{B}}}(\Sigma)\) holds.

In [279, 56] the set \({{\mathcal E}_0}(\Sigma)\) was allowed to include extensions \({\hat \Sigma}\) of Σ having boundaries as compact outermost horizons, when the corresponding ADM energies are still non-negative [217], and hence mB(Σ) is still well defined and non-negative. (For another description of \({{\mathcal E}_0}(\Sigma)\) allowing horizons in the extensions but excluding them between Σ and the asymptotic end, see [110] and Section 5.2 of this paper.)

Bartnik suggests a definition for the quasi-local mass of a spacelike two-surface \({\mathcal S}\) (together with its induced metric and the two extrinsic curvatures), as well [54]. He considers those globally-hyperbolic spacetimes \(\hat M: = (\hat M,{{\hat g}_{ab}})\) that satisfy the dominant energy condition, admit an asymptotically flat (metrically-complete) Cauchy surface \({\hat \Sigma}\) with finite ADM energy, have no event horizon and in which \({\mathcal S}\) can be embedded with its first and second fundamental forms. Let \({{\mathcal E}_0}({\mathcal S})\) denote the set of these spacetimes. Since the ADM energy \({E_{{\rm{ADM}}}}(\hat M)\) is non-negative for any \(\hat M \in \;{{\mathcal E}_0}({\mathcal S})\) (and is zero precisely for flat \({\hat M}\)), the infimum

$${m_{\rm{B}}}({\mathcal S}): = \inf \left\{{{E_{{\rm{ADM}}}}(\hat M)\vert \hat M \in {{\mathcal E}_0}({\mathcal S})} \right\}$$
(5.2)

exists and is non-negative. Although it seems plausible that mB(∂Σ) is only the ‘spacetime version’ of mB(Σ), without the precise form of the no-horizon conditions in \({{\mathcal E}_0}(\Sigma)\) and that in \({{\mathcal E}_0}({\mathcal S})\) they cannot be compared, even if the extrinsic curvature were allowed in the extensions \({\hat \Sigma}\) of Σ.

5.1.2 The main properties of mB(Σ)

The first immediate consequence of Eq. (5.1) is the monotonicity of the Bartnik mass. If Σ1 ⊂ Σ2, then \({{\mathcal E}_0}({\Sigma _2}) \subset {{\mathcal E}_0}({\Sigma _1})\), and hence, \({m_{\rm{B}}}({\Sigma _1}) \leq {m_{\rm{B}}}({\Sigma _2})\). Obviously, by definition (5.1) one has \({m_{\rm{B}}}(\Sigma) \leq {m_{{\rm{ADM}}}}(\hat \Sigma)\) for any \(\hat \Sigma \in \;{{\mathcal E}_0}(\Sigma)\). Thus, if m is any quasi-local mass functional that is larger than mB (i.e., that assigns a non-negative real to any Σ such that m(Σ) ≥ mB(Σ) for any allowed Σ), and if furthermore \(m(\Sigma) \leq {m_{{\rm{ADM}}}}(\hat \Sigma)\) for any \(\hat \Sigma \in \;{{\mathcal E}_0}(\Sigma)\), then by the definition of the infimum in Eq. (5.1) one has \({m_{\rm{B}}}(\Sigma) \geq m(\Sigma) - \varepsilon \geq {m_{\rm{B}}}(\Sigma) - \varepsilon\) for any ε > 0. Therefore, mB is the largest mass functional satisfying \({m_{\rm{B}}}(\Sigma) \leq {m_{{\rm{ADM}}}}(\hat \Sigma)\) for any \(\hat \Sigma \in \;{{\mathcal E}_0}(\Sigma)\). Another interesting consequence of the definition of mB, due to Simon (see [56]), is that if \({\hat \Sigma}\) is any asymptotically flat, time-symmetric extension of Σ with non-negative scalar curvature satisfying \({m_{{\rm{ADM}}}}(\hat \Sigma) < {m_{\rm{B}}}(\Sigma)\), then there is a black hole in \({\hat \Sigma}\) in the form of a minimal surface between Σ and the infinity of \({\hat \Sigma}\). For further discussion of mB(Σ) from the point of view of black holes, as well as the relationship between the Bartnik mass and other expressions (e.g., the Hawking energy), see [460].

As we saw, the Bartnik mass is non-negative, and, obviously, if Σ is flat (and hence is a data set for flat spacetime), then mB(Σ) = 0. The converse of this statement is also true [279]: If mB(Σ) = 0, then Σ is locally flat. The Bartnik mass tends to the ADM mass [279]: If \((\hat \Sigma, {\hat h_{ab}})\) is an asymptotically flat Riemannian three-geometry with non-negative scalar curvature and finite ADM mass \({m_{{\rm{ADM}}}}(\hat \Sigma)\), and if {Σn}, n ∈ ℕ, is a sequence of solid balls of coordinate radius n in \({\hat \Sigma}\), then \({\lim\nolimits _{n \rightarrow \infty}}{m_{\rm{B}}}({\Sigma _n}) = {m_{{\rm{ADM}}}}(\hat \Sigma)\). The proof of these two results is based on the use of Hawking energy (see Section 6.1), by means of which a positive lower bound for mB(Σ) can be given near the nonflat points of Σ. In the proof of the second statement one must use the fact that Hawking energy tends to the ADM energy, which, in the time-symmetric case, is just the ADM mass.

The proof that the Bartnik mass reduces to the ‘standard expression’ for round spheres is a nice application of the Riemannian Penrose inequality [279]. Let Σ be a spherically-symmetric Riemannian three-geometry with spherically-symmetric boundary \({\mathcal S}: = \partial \Sigma\). One can form its ‘standard’ round-sphere energy \(E({\mathcal S})\) (see Section 4.2.1), and take its spherically-symmetric asymptotically flat vacuum extension \({{\hat \Sigma}_{{\rm{SS}}}}\) (see [54, 56]). By the Birkhoff theorem the exterior part of \({{\hat \Sigma}_{{\rm{SS}}}}\) is a part of a t = const. hypersurface of the vacuum Schwarzschild solution, and its ADM mass is just \(E({\mathcal S})\). Then, any asymptotically flat extension \({\hat \Sigma}\) of Σ can also be considered as (a part of) an asymptotically flat time-symmetric hypersurface with minimal surface, whose area is \(16\pi {G^2}E_{{\rm{ADM}}}^2({{\hat \Sigma}_{{\rm{SS}}}})\). Thus, by the Riemannian Penrose inequality [279] \({E_{{\rm{ADM}}}}(\hat \Sigma) \geq {E_{{\rm{ADM}}}}({{\hat \Sigma}_{{\rm{SS}}}}) = E({\mathcal S})\). Therefore, the Bartnik mass of Σ is just the ‘standard’ round-sphere expression \(E({\mathcal S})\).

5.1.3 The computability of the Bartnik mass

Since for any given Σ the set \({\mathcal E_0}(\Sigma)\) of its extensions is a huge set, it is almost hopeless to parametrize it. Thus, by its very definition, it seems very difficult to compute the Bartnik mass for a given, specific (Σ, hab). Without some computational method the potentially useful properties of mB(Σ) would be lost from the working relativist’s arsenal.

Such a computational method might be based on a conjecture of Bartnik [54, 56]: The infimum in definition (5.1) of the mass mB(Σ) is realized by an extension \((\hat \Sigma, {{\hat h}_{ab}})\) of (Σ, hab) such that the exterior region, \((\hat \Sigma - \Sigma, {{\hat h}_{ab}}{\vert _{\hat \Sigma - \Sigma}})\), is static, the metric is Lipschitz-continuous across the two-surface \(\partial \Sigma \subset \hat \Sigma\), and the mean curvatures of ∂Σ of the two sides are equal. Therefore, to compute mB for a given (Σ, hab), one should find an asymptotically flat, static vacuum metric ĥab satisfying the matching conditions on ∂Σ; the Bartnik mass is then the ADM mass of ĥab. As Corvino shows [154], if there is an allowed extension \({\hat \Sigma}\) of Σ for which \({m_{{\rm{ADM}}}}(\hat \Sigma) = {m_{\rm{B}}}(\Sigma)\), then the extension \(\hat \Sigma - \bar \Sigma\) is static; furthermore, if Σ1 ⊂ Σ2, \({m_{\rm{B}}}({\Sigma _1}) = {m_{\rm{B}}}({\Sigma _2})\) and Σ2 has an allowed extension \({\hat \Sigma}\) for which \({m_{\rm{B}}}({\Sigma _2}) = {m_{{\rm{ADM}}}}(\hat \Sigma)\), then \({\Sigma _2} - \overline {{\Sigma _1}}\) is static. Thus, the proof of Bartnik’s conjecture is equivalent to the proof of the existence of such an allowed extension. The existence of such an extension is proven in [360] for geometries (Σ, hab) close enough to the Euclidean one and satisfying a certain reflection symmetry, but the general existence proof is still lacking. (For further partial existence results see [17].) Bartnik’s conjecture is that (Σ, hab) determines this exterior metric uniquely [56]. He conjectures [54, 56] that a similar computation method can be found for the mass \({m_{\rm{B}}}({\mathcal S})\), defined in Eq. (5.2), as well, where the exterior metric should be stationary.
This second conjecture is also supported by partial results [155]: If (Σ, hab, χab) is any compact vacuum data set, then it has an asymptotically flat vacuum extension, which is a spacelike slice of a Kerr spacetime outside a large sphere near spatial infinity.

To estimate mB(Σ) one can construct admissible extensions of (Σ, hab) in the form of the metrics in quasi-spherical form [55]. If the boundary ∂Σ is a metric sphere of radius r with non-negative mean curvature k, then mB(Σ) can be estimated from above in terms of r and k.

5.2 Bray’s modifications

Another, slightly modified definition for the quasi-local mass is suggested by Bray [110, 113]. Here we summarize his ideas.

Let Σ = (Σ, hab, χab) be any asymptotically flat initial data set with finitely-many asymptotic ends and finite ADM masses, and suppose that the dominant energy condition is satisfied on Σ. Let \({\mathcal S}\) be any fixed two-surface in Σ, which encloses all the asymptotic ends except one, say the i-th (i.e., let \({\mathcal S}\) be homologous to a large sphere in the i-th asymptotic end). The outside region with respect to \({\mathcal S}\), denoted by \(O({\mathcal S})\), will be the subset of Σ containing the i-th asymptotic end and bounded by \({\mathcal S}\), while the inside region, \(I({\mathcal S})\), is the (closure of) \(\Sigma - O({\mathcal S})\). Next, Bray defines the ‘extension’ \({{\hat \Sigma}_{\rm{e}}}\) of \({\mathcal S}\) by replacing \(O({\mathcal S})\) by a smooth asymptotically flat end of any data set satisfying the dominant energy condition. Similarly, the ‘fill-in’ \({{\hat \Sigma}_{\rm{f}}}\) of \({\mathcal S}\) is obtained from Σ by replacing \(I({\mathcal S})\) by a smooth asymptotically flat end of any data set satisfying the dominant energy condition. Finally, the surface \({\mathcal S}\) will be called outer-minimizing if, for any closed two-surface \({\tilde {\mathcal S}}\) enclosing \({\mathcal S}\), one has \({\rm{Area}}({\mathcal S}) \leq {\rm{Area}}(\tilde {\mathcal S})\).

Let \({\mathcal S}\) be outer-minimizing, and let \({\mathcal E}({\mathcal S})\) denote the set of extensions of \({\mathcal S}\) in which \({\mathcal S}\) is still outer-minimizing, and \({\mathcal F}({\mathcal S})\) denote the set of fill-ins of \({\mathcal S}\). If \({{\hat \Sigma}_{\rm{f}}} \in {\mathcal F}({\mathcal S})\) and \({A_{{{\hat \Sigma}_{\rm{f}}}}}\) denotes the infimum of the area of the two-surfaces enclosing all the ends of \({{\hat \Sigma}_{\rm{f}}}\) except the outer one, then Bray defines the outer and inner mass, \({m_{{\rm{out}}}}({\mathcal S})\) and \({m_{{\rm{in}}}}({\mathcal S})\), respectively, by

$$\begin{array}{*{20}c} {{m_{{\rm{out}}}}({\mathcal S}): = \inf \left\{{{m_{{\rm{ADM}}}}({{\hat \Sigma}_{\rm{e}}})\vert {{\hat \Sigma}_{\rm{e}}} \in {\mathcal E} \,({\mathcal S})} \right\},} \\ {{m_{{\rm{in}}}}({\mathcal S}): = \sup \left\{{\sqrt {{{{A_{{{\hat \Sigma}_{\rm{f}}}}}} \over {16\pi {G^2}}}} \vert {{\hat \Sigma}_{\rm{f}}} \in {\mathcal F}\,({\mathcal S})} \right\}.} \\ \end{array}$$

\({m_{{\rm{out}}}}({\mathcal S})\) deviates slightly from Bartnik’s mass (5.1) even if the latter were defined for non-time-symmetric data sets, because Bartnik’s ‘no-horizon condition’ excludes apparent horizons from the extensions, while Bray’s condition is that \({\mathcal S}\) be outer-minimizing.

A simple consequence of the definitions is the monotonicity of these masses: If \({{\mathcal S}_2}\) and \({{\mathcal S}_1}\) are outer-minimizing two-surfaces such that \({{\mathcal S}_2}\) encloses \({{\mathcal S}_1}\), then \({m_{{\rm{in}}}}({{\mathcal S}_2}) \geq {m_{{\rm{in}}}}({{\mathcal S}_1})\) and \({m_{{\rm{out}}}}({{\mathcal S}_2}) \geq {m_{{\rm{out}}}}({{\mathcal S}_1})\). Furthermore, if the Penrose inequality holds (for example, in a time-symmetric data set, for which the inequality has been proven), then for outer-minimizing surfaces \({m_{{\rm{out}}}}({\mathcal S}) \geq {m_{{\rm{in}}}}({\mathcal S})\) [110, 113]. Moreover, if Σi is a sequence such that the boundaries ∂Σi shrink to a minimal surface \({\mathcal S}\), then the sequence mout(∂Σi) tends to the irreducible mass \(\sqrt {{\rm{Area}}({\mathcal S})/(16\pi {G^2})}\) [56]. Bray defines the quasi-local mass of a surface not simply to be a number, but the whole closed interval \([{m_{{\rm{in}}}}({\mathcal S}),{m_{{\rm{out}}}}({\mathcal S})]\). If \({\mathcal S}\) encloses the horizon in the Schwarzschild data set, then the inner and outer masses coincide, and Bray expects that the converse is also true: If \({m_{{\rm{in}}}}({\mathcal S}) = {m_{{\rm{out}}}}({\mathcal S})\), then \({\mathcal S}\) can be embedded into the Schwarzschild spacetime with the given two-surface data on \({\mathcal S}\) [113].

For further modification of Bartnik’s original ideas, see [311].

6 The Hawking Energy and its Modifications

6.1 The Hawking energy

6.1.1 The definition

Studying the perturbation of the dust-filled k = −1 Friedmann-Robertson-Walker spacetimes, Hawking found that

$$\begin{array}{*{20}c} {{E_{\rm{H}}}({\mathcal S}): = \sqrt {{{{\rm{Area}}({\mathcal S})} \over {16\pi {G^2}}}} \left({1 + {1 \over {2\pi}}\oint\nolimits_{\mathcal S} {\rho \rho {\prime}\;d{\mathcal S}}} \right) = \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad} \\ {= \sqrt {{{{\rm{Area}}({\mathcal S})} \over {16\pi {G^2}}}} {1 \over {4\pi}}\oint\nolimits_{\mathcal S} {(\sigma \sigma {\prime}+ \bar \sigma \bar \sigma {\prime}- {\psi _2} - {{\bar \psi}_{2{\prime}}} + 2{\phi _{11}} + 2\Lambda)\;d{\mathcal S}}} \\ \end{array}$$
(6.1)

behaves as an appropriate notion of energy surrounded by the spacelike topological two-sphere \({\mathcal S}\) [236]. Here we used the Gauss-Bonnet theorem and the GHP form of Eqs. (4.3) and (4.4) for F to express ρρ′ by the curvature components and the shears. Thus, Hawking energy is genuinely quasi-local.

Hawking energy has the following clear physical interpretation even in a general spacetime, and, in fact, EH can be introduced in this way. Starting with the rough idea that the mass-energy surrounded by a spacelike two-sphere \({\mathcal S}\) should be the measure of bending of the ingoing and outgoing light rays orthogonal to \({\mathcal S}\), and recalling that under a boost gauge transformation \({l^a} \mapsto \alpha {l^a}\), \({n^a} \mapsto {\alpha ^{- 1}}{n^a}\) the convergences ρ and ρ′ transform as \(\rho \mapsto \alpha \rho\) and \(\rho' \mapsto {\alpha ^{- 1}}\rho'\), respectively, the energy must have the form \(C + D\oint\nolimits_{\mathcal S} {\rho \rho {\prime}d{\mathcal S}}\), where the unspecified parameters C and D can be determined in some special situations. For metric two-spheres of radius r in the Minkowski spacetime, for which ρ = −1/r and ρ′ = 1/(2r), we expect zero energy, thus, D = C/(2π). For the event horizon of a Schwarzschild black hole with mass parameter m, for which ρ = 0 = ρ′, we expect m/G, which can be expressed by the area of \({\mathcal S}\). Thus, \({C^2} = {\rm{Area}}({\mathcal S})/(16\pi {G^2})\), and hence, we arrive at Eq. (6.1).
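The calibration above can be checked numerically on round spheres. The following is a minimal sketch, not part of the original construction: the function names are hypothetical, units with G = 1 are assumed by default, and the value ρρ′ = −(1 − 2m/r)/(2r²) used for the r = const. spheres of the Schwarzschild solution is the standard one, quoted here as an assumption rather than taken from the text.

```python
import math


def hawking_energy_round(r, rho_rho_prime, G=1.0):
    """First expression in Eq. (6.1) for a round sphere of area radius r
    on which the boost-invariant product rho*rho' is constant."""
    area = 4.0 * math.pi * r**2
    integral = rho_rho_prime * area  # \oint rho rho' dS
    prefactor = math.sqrt(area / (16.0 * math.pi * G**2))
    return prefactor * (1.0 + integral / (2.0 * math.pi))


def hawking_energy_minkowski(r, G=1.0):
    # rho = -1/r and rho' = 1/(2r), as quoted in the text.
    return hawking_energy_round(r, (-1.0 / r) * (1.0 / (2.0 * r)), G)


def hawking_energy_schwarzschild(r, m, G=1.0):
    # Assumption: rho*rho' = -(1 - 2m/r)/(2 r^2) on r = const. spheres,
    # with m the mass parameter in geometrized (length) units.
    return hawking_energy_round(r, -(1.0 - 2.0 * m / r) / (2.0 * r**2), G)


if __name__ == "__main__":
    print(hawking_energy_minkowski(3.0))           # vanishes up to rounding
    print(hawking_energy_schwarzschild(2.0, 1.0))  # horizon: m/G
    print(hawking_energy_schwarzschild(10.0, 1.0)) # outside the horizon: m/G
```

Under these assumptions the sketch returns zero for every Minkowski sphere and m/G for every Schwarzschild sphere, including the horizon r = 2m, where ρρ′ vanishes and only the area term contributes.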

6.1.2 Hawking energy for spheres

Obviously, for round spheres, EH reduces to the standard expression (4.7). This implies, in particular, that the Hawking energy is not monotonic in general. Since for a Killing horizon (e.g., for a stationary event horizon) ρ = 0, the Hawking energy of its spacelike spherical cross sections \({\mathcal S}\) is \(\sqrt {{\rm{Area}}({\mathcal S})/(16\pi {G^2})}\). In particular, for the event horizon of a Kerr-Newman black hole it is just the familiar irreducible mass \(\sqrt {2{m^2} - {e^2} + 2m\sqrt {{m^2} - {e^2} - {a^2}}}/(2G)\). For more general surfaces Hawking energy is calculated numerically in [272].
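The Kerr-Newman value can be cross-checked against the area formula. The sketch below assumes the standard horizon radius r₊ = m + √(m² − a² − e²) and horizon area 4π(r₊² + a²), with m, a and e all in geometrized (length) units; these expressions and the function names are assumptions introduced for illustration and are not stated in the text.

```python
import math


def irreducible_mass_from_area(m, a, e, G=1.0):
    """sqrt(Area/(16 pi G^2)) for the Kerr-Newman event horizon.

    Assumes r_plus = m + sqrt(m^2 - a^2 - e^2) and
    Area = 4 pi (r_plus^2 + a^2)."""
    r_plus = m + math.sqrt(m * m - a * a - e * e)
    area = 4.0 * math.pi * (r_plus**2 + a * a)
    return math.sqrt(area / (16.0 * math.pi * G**2))


def irreducible_mass_closed_form(m, a, e, G=1.0):
    """The closed-form expression quoted in the text."""
    inner = math.sqrt(m * m - e * e - a * a)
    return math.sqrt(2.0 * m * m - e * e + 2.0 * m * inner) / (2.0 * G)


if __name__ == "__main__":
    for params in [(1.0, 0.0, 0.0), (1.0, 0.5, 0.3), (2.0, 1.2, 0.7)]:
        print(params,
              irreducible_mass_from_area(*params),
              irreducible_mass_closed_form(*params))
```

The two functions agree because r₊² + a² = 2m² − e² + 2m√(m² − e² − a²); in the Schwarzschild limit a = e = 0 both reduce to m/G.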

For a small sphere of radius r with center p ∈ M in nonvacuum spacetimes it is \({{4\pi} \over 3}{r^3}{T_{ab}}{t^a}{t^b}\), while in vacuum it is \({2 \over {45G}}{r^5}{T_{abcd}}{t^a}{t^b}{t^c}{t^d}\), where \({T_{ab}}\) is the energy-momentum tensor and \({T_{abcd}}\) is the Bel-Robinson tensor at p [275]. The first result shows that in the lowest order the gravitational ‘field’ does not have a contribution to Hawking energy, which is due exclusively to the matter fields. Thus, in vacuum the leading order of EH must be higher than \({r^3}\). Then, even a simple dimensional analysis shows that the number of the derivatives of the metric in the coefficient of the \({r^k}\)-order term in the power series expansion of EH is (k − 1). However, there are no tensorial quantities built from the metric and its derivatives such that the total number of the derivatives involved would be three. Therefore, in vacuum, the leading term is necessarily of order \({r^5}\), and its coefficient must be a quadratic expression of the curvature tensor. It is remarkable that for small spheres EH is positive definite both in nonvacuum (provided the matter fields satisfy, for example, the dominant energy condition) and vacuum. This shows, in particular, that EH should be interpreted as energy rather than as mass: For small spheres in a pp-wave spacetime EH is positive, while, as we saw for matter fields in Section 2.2.3, a mass expression could be expected to be zero. (We will see in Sections 8.2.3 and 13.5 that, for the Dougan-Mason energy-momentum, the vanishing of the mass characterizes the pp-wave metrics completely.)

Using the second expression in Eq. (6.1) it is easy to see that at future null infinity EH tends to the Bondi-Sachs energy. A detailed discussion of the asymptotic properties of EH near null infinity both for radiative and stationary spacetimes is given in [455, 457]. Similarly, calculating EH for large spheres near spatial infinity in an asymptotically flat spacelike hypersurface, one can show that it tends to the ADM energy.

6.1.3 Positivity and monotonicity properties

In general, Hawking energy may be negative, even in Minkowski spacetime. Geometrically this should be clear, since for an appropriately general (e.g., concave) two-surface \({\mathcal S}\), the integral \(\oint\nolimits_{\mathcal S} {\rho \rho'\,d{\mathcal S}}\) could be less than −2π. Indeed, in flat spacetime EH is proportional to \(\oint\nolimits_{\mathcal S} {(\sigma \sigma' + \bar \sigma \bar \sigma')\,d{\mathcal S}}\) by the Gauss equation. For topologically-spherical two-surfaces in the t = const. spacelike hyperplane of Minkowski spacetime σσ′ is real and nonpositive, and it is zero precisely for metric spheres, while for two-surfaces in the r = const. timelike cylinder σσ′ is real and non-negative, and it is zero precisely for metric spheres (see Footnote 9). If, however, \({\mathcal S}\) is ‘round enough’ (not to be confused with the round spheres in Section 4.2.1), which is some form of a convexity condition, then EH behaves nicely [143]: \({\mathcal S}\) will be called round enough if it is a submanifold of a spacelike hypersurface Σ, and if among the two-dimensional surfaces in Σ, which enclose the same volume as \({\mathcal S}\) does, \({\mathcal S}\) has the smallest area. It is proven by Christodoulou and Yau [143] that if \({\mathcal S}\) is round enough in a maximal spacelike slice Σ on which the energy density of the matter fields is non-negative (for example, if the dominant energy condition is satisfied), then the Hawking energy is non-negative.

Although Hawking energy is not monotonic in general, it has interesting monotonicity properties for special families of two-surfaces. Hawking considered one-parameter families of spacelike two-surfaces foliating the outgoing and the ingoing null hypersurfaces, and calculated the change of EH [236]. These calculations were refined by Eardley [176]. Starting with a weakly future convex two-surface \({\mathcal S}\) and using the boost gauge freedom, he introduced a special family \({{\mathcal S}_r}\) of spacelike two-surfaces in the outgoing null hypersurface \({\mathcal N}\), where r will be the luminosity distance along the outgoing null generators. He showed that \({E_H}({{\mathcal S}_r})\) is nondecreasing with r, provided the dominant energy condition holds on \({\mathcal N}\). Similarly, for weakly past convex \({\mathcal S}\) and the analogous family of surfaces in the ingoing null hypersurface \({E_H}({{\mathcal S}_r})\) is nonincreasing. Eardley also considered a special spacelike hypersurface, filled by a family of two-surfaces, for which \({E_H}({{\mathcal S}_r})\) is nondecreasing. By relaxing the normalization condition \({l^a}{n_a} = 1\) for the two null normals to \({l^a}{n_a} = \exp (f)\) for some \(f:{\mathcal S} \rightarrow {\mathbb R}\), Hayward obtained a flexible enough formalism to introduce a double-null foliation (see Section 11.2 below) of a whole neighborhood of a mean convex two-surface by special mean convex two-surfaces [247]. (For the more general GHP formalism in which \({l^a}{n_a}\) is not fixed, see [425].) Assuming that the dominant energy condition holds, he showed that the Hawking energy of these two-surfaces is nondecreasing in the outgoing, and nonincreasing in the ingoing direction.

In contrast to the special foliations of the null hypersurfaces above, Frauendiener defined a special spacelike vector field, the inverse mean curvature vector in the spacetime [194]. If \({\mathcal S}\) is a weakly future and past convex two-surface, then \({q^a}: = 2{Q^a}/({Q_b}{Q^b}) = - [1/(2\rho)]{l^a} - [1/(2\rho')]{n^a}\) is an outward-directed spacelike normal to \({\mathcal S}\). Here \({Q_b}\) is the trace of the extrinsic curvature tensor: \({Q_b}: = {Q^a}_{ab}\) (see Section 4.1.2). Starting with a single weakly future and past convex two-surface, Frauendiener gives an argument for the construction of a one-parameter family \({{\mathcal S}_t}\) of two-surfaces being Lie-dragged along its own inverse mean curvature vector \({q^a}\). Assuming that such a family of surfaces (and hence, the vector field \({q^a}\) on the three-submanifold swept by \({{\mathcal S}_t}\)) exists, Frauendiener showed that the Hawking energy is nondecreasing along the vector field \({q^a}\) if the dominant energy condition is satisfied. This family of surfaces would be analogous to the solution of the geodesic equation, where the initial point and direction at that point specify the whole solution, at least locally. However, it is known (Frauendiener, private communication) that the corresponding flow is based on a system of parabolic equations such that it does not admit a well-posed initial value formulation (see Footnote 10). Motivated by this result, Malec, Mars, and Simon [351] considered the inverse mean curvature flow of Geroch on spacelike hypersurfaces (see Section 6.2.2). They showed that if the dominant energy condition and certain additional (essentially technical) assumptions hold, then the Hawking energy is monotonic. These two results are the natural adaptations for the Hawking energy of the corresponding results known for some time for the Geroch energy, aiming to prove the Penrose inequality. (We return to this latter issue in Section 13.2, only for a very brief summary.)
The necessary conditions on flows of two-surfaces on null, as well as spacelike, hypersurfaces ensuring the monotonicity of the Hawking energy are investigated in [114]. The monotonicity property of the Hawking energy under other geometric flows is discussed in [89].

For a discussion of the relationship between Hawking energy and other expressions (e.g., the Bartnik mass and the Brown-York energy), see [460]. For the first attempts to introduce quasi-local energy operators, in particular the Hawking energy operator, in loop quantum gravity, see [565].

6.1.4 Two generalizations

Hawking defined not only energy, but spatial momentum as well, completely analogously to how the spatial components of Bondi-Sachs energy-momentum are related to Bondi energy:

$$P_{\rm{H}}^{\underline a}({\mathcal S}) = \sqrt {{{{\rm{Area}}({\mathcal S})} \over {16\pi {G^2}}}} {1 \over {4\pi}}\oint\nolimits_{\mathcal S} {(\sigma \sigma {\prime}+ \bar \sigma \bar \sigma {\prime}- {\psi _2} - {{\bar \psi}_{2{\prime}}} + 2{\phi _{11}} + 2\Lambda)\,{W^{\underline a}}\,d{\mathcal S}},$$
(6.2)

where \({W^{\underline a}},\,a = 0,\, \ldots, \,3\), are essentially the first four spherical harmonics:

$$\begin{array}{*{20}c} {{W^0} = 1,} & {{W^1} = {{\zeta + \bar \zeta} \over {1 + \zeta \bar \zeta}},} & {{W^2} = {1 \over {\rm{i}}}{{\zeta - \bar \zeta} \over {1 + \zeta \bar \zeta}},} & {{W^3} = {{1 - \zeta \bar \zeta} \over {1 + \zeta \bar \zeta}}.} \\ \end{array}$$
(6.3)

Here ζ and \({\bar \zeta}\) are the standard complex stereographic coordinates on \({\mathcal S} \approx {S^2}\).
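As a quick illustration of Eq. (6.3), the functions W¹, W², W³ are the Cartesian components of a unit vector on the sphere, so their squares sum to one for every ζ. The sketch below checks this numerically. The function name is hypothetical, and the identification ζ = e^{iφ} cot(θ/2), under which (W¹, W², W³) = (sin θ cos φ, sin θ sin φ, −cos θ), is only one possible convention, assumed here for the example.

```python
import cmath
import math


def spherical_harmonics_W(zeta):
    """Return (W^0, W^1, W^2, W^3) of Eq. (6.3) for a complex
    stereographic coordinate zeta."""
    zeta = complex(zeta)
    zbar = zeta.conjugate()
    denom = 1.0 + (zeta * zbar).real
    w1 = ((zeta + zbar) / denom).real
    w2 = ((zeta - zbar) / (1j * denom)).real
    w3 = (1.0 - (zeta * zbar).real) / denom
    return (1.0, w1, w2, w3)


if __name__ == "__main__":
    theta, phi = math.pi / 3.0, math.pi / 4.0
    # Assumed convention: zeta = exp(i*phi) * cot(theta/2).
    zeta = cmath.rect(1.0 / math.tan(theta / 2.0), phi)
    w0, w1, w2, w3 = spherical_harmonics_W(zeta)
    print(w1, math.sin(theta) * math.cos(phi))
    print(w2, math.sin(theta) * math.sin(phi))
    print(w3, -math.cos(theta))
    print(w1**2 + w2**2 + w3**2)
```

The unit-norm identity holds independently of the convention chosen for ζ; only the sign of W³ depends on whether cot(θ/2) or tan(θ/2) is used.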

Hawking considered the extension of the definition of \({E_H}({\mathcal S})\) to higher genus two-surfaces as well by the second expression in Eq. (6.1). Then, in the expression analogous to the first one in Eq. (6.1), the genus of \({\mathcal S}\) appears. For recent generalizations of the Hawking energy for two-surfaces foliating the stationary and dynamical untrapped hypersurfaces, see [527, 528] and Section 11.3.4.

6.2 The Geroch energy

6.2.1 The definition

Suppose that the two-surface \({\mathcal S}\) for which EH is defined is embedded in the spacelike hypersurface Σ. Let \({\chi _{ab}}\) be the extrinsic curvature of Σ in M and \({k_{ab}}\) the extrinsic curvature of \({\mathcal S}\) in Σ. (In Section 4.1.2 we denote the latter by \({\nu _{ab}}\).) Then \(8\rho \rho' = {({\chi _{ab}}{q^{ab}})^2} - {({k_{ab}}{q^{ab}})^2}\), by means of which

$$\begin{array}{*{20}c} {{E_{\rm{H}}}({\mathcal S}) = \sqrt {{{{\rm{Area}}({\mathcal S})} \over {16\pi {G^2}}}} \left({1 - {1 \over {16\pi}}\oint\nolimits_{\mathcal S} {{{({k_{ab}}{q^{ab}})}^2}\;d{\mathcal S}} + {1 \over {16\pi}}\oint\nolimits_{\mathcal S} {{{({\chi _{ab}}{q^{ab}})}^2}\;d{\mathcal S}}} \right) \geq} \\ {\; \geq \sqrt {{{{\rm{Area}}({\mathcal S})} \over {16\pi {G^2}}}} \left({1 - {1 \over {16\pi}}\oint\nolimits_{\mathcal S} {{{({k_{ab}}{q^{ab}})}^2}\;d{\mathcal S}}} \right) = \quad \quad \quad \quad \quad \quad} \\ {\;\; = {1 \over {16\pi}}\sqrt {{{{\rm{Area}}({\mathcal S})} \over {16\pi {G^2}}}} \oint\nolimits_{\mathcal S} {\left({2\,{}^{\mathcal S}\!R - {{({k_{ab}}{q^{ab}})}^2}} \right)\;d{\mathcal S}} = :{E_{\rm{G}}}({\mathcal S}).\quad \quad} \\ \end{array}$$
(6.4)

In the last step we use the Gauss-Bonnet theorem for \({\mathcal S} \approx {S^2}\). \({E_G}({\mathcal S})\) is known as the Geroch energy [207]. Thus, it is not greater than the Hawking energy, and, in contrast to EH, it depends not only on the two-surface \({\mathcal S}\), but on the hypersurface Σ as well.
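For a flat t = const. hyperplane of Minkowski spacetime the extrinsic curvature χ_ab vanishes, so Eq. (6.4) gives E_H = E_G = √(Area/(16πG²)) (1 − (1/16π)∮k² dS), with k the mean curvature of the two-surface in the flat three-space. The sketch below evaluates this for a spheroid of revolution with semi-axes a, a, c. It is only an illustration: the function name, the midpoint-rule quadrature and the spheroid itself are choices made here, and the principal curvatures ac/s³ and c/(as), with s = √(a² cos²θ + c² sin²θ), are the standard ones for a surface of revolution, quoted as an assumption.

```python
import math


def spheroid_hawking_energy(a, c, n=20000, G=1.0):
    """E_H = E_G of the spheroid x^2/a^2 + y^2/a^2 + z^2/c^2 = 1 in a flat,
    time-symmetric slice (chi_ab = 0), via midpoint-rule quadrature."""
    dtheta = math.pi / n
    area = 0.0
    k_squared_integral = 0.0  # \oint k^2 dS
    for i in range(n):
        theta = (i + 0.5) * dtheta
        s = math.sqrt((a * math.cos(theta)) ** 2 + (c * math.sin(theta)) ** 2)
        # Mean curvature as the sum of the two principal curvatures (assumed).
        k = a * c / s**3 + c / (a * s)
        dS = 2.0 * math.pi * a * math.sin(theta) * s * dtheta
        area += dS
        k_squared_integral += k * k * dS
    prefactor = math.sqrt(area / (16.0 * math.pi * G**2))
    return prefactor * (1.0 - k_squared_integral / (16.0 * math.pi))


if __name__ == "__main__":
    print(spheroid_hawking_energy(1.0, 1.0))  # round sphere: zero up to quadrature error
    print(spheroid_hawking_energy(1.0, 2.0))  # prolate spheroid: negative
    print(spheroid_hawking_energy(2.0, 1.0))  # oblate spheroid: negative
```

In this setting the energy vanishes for the round sphere and is negative for any deformed one, which illustrates why both the Hawking and the Geroch energies can be negative even in flat spacetime.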

The small sphere limit of the Geroch energy need not be calculated separately [275]: by Eq. (6.4), the difference of the Hawking and the Geroch energies is proportional to \(\sqrt {{\rm{Area}}({\mathcal S})} \times \oint\nolimits_{\mathcal S} {{{({\chi _{ab}}{q^{ab}})}^2}d{\mathcal S}}\). Since, however, \({\chi _{ab}}{q^{ab}}\), for the family of small spheres \({{\mathcal S}_r}\), does not tend to zero in the r → 0 limit in general, this difference is \({\mathcal O}({r^3})\). It is zero if Σ is spanned by spacelike geodesics orthogonal to \({t^a}\) at p. Thus, for general Σ, the Geroch energy does not give the expected \({{4\pi} \over 3}{r^3}{T_{ab}}{t^a}{t^b}\) result. Similarly, in vacuum, the Geroch energy deviates from the Bel-Robinson energy in \({r^5}\) order even if Σ is geodesic at p.

Since \({E_H}({\mathcal S}) \geq {E_G}({\mathcal S})\) and since the Hawking energy tends to the ADM energy, the large sphere limit of \({E_G}({\mathcal S})\) in an asymptotically flat Σ cannot be greater than the ADM energy. In fact, it is also precisely the ADM energy [207].

For a definition of Geroch’s energy as a quasi-local energy operator in loop quantum gravity, see [565].

6.2.2 Monotonicity properties

The Geroch energy has interesting positivity and monotonicity properties along a special flow in Σ [207, 291]. This flow is the inverse mean curvature flow defined as follows. Let t: Σ → ℝ be a smooth function such that

  1. its level surfaces, \({{\mathcal S}_t}: = \{q \in \Sigma \left\vert {t(q) = t} \right.\}\), are homeomorphic to \({S^2}\),

  2. there is a point p ∈ Σ such that the surfaces \({{\mathcal S}_t}\) are shrinking to p in the limit t → −∞, and

  3. they form a foliation of Σ − {p}.

Let n be the lapse function of this foliation, i.e., if \({v^a}\) is the outward directed unit normal to \({{\mathcal S}_t}\) in Σ, then \(n{v^a}{D_a}t = 1\). Denoting the integral on the right-hand side in Eq. (6.4) by \({W_t}\), we can calculate its derivative with respect to t. In general this derivative does not seem to have any remarkable properties. If, however, the foliation is chosen in a special way, namely if the lapse is just the inverse mean curvature of the foliation, n = 1/k where \(k: = {k_{ab}}{q^{ab}}\), and furthermore Σ is maximal (i.e., χ = 0) and the energy density of the matter is non-negative, then, as shown by Geroch [207], \({W_t} \geq 0\) holds. Jang and Wald [291] modified the foliation slightly, such that t ∈ [0, ∞), and the surface \({{\mathcal S}_0}\) was assumed to be future marginally trapped (i.e., ρ = 0 and ρ′ ≥ 0). Then they showed that, under the conditions above, \(\sqrt {{\rm{Area}}({{\mathcal S}_0})} {W_0} \leq \sqrt {{\rm{Area}}({{\mathcal S}_t})} {W_t}\). Since \({E_G}({{\mathcal S}_t})\) tends to the ADM energy as t → ∞, these considerations were intended to argue that the ADM energy should be non-negative (at least for maximal Σ) and not less than \(\sqrt {{\rm{Area}}({{\mathcal S}_0})/(16\pi {G^2})}\) (at least for time-symmetric Σ), respectively. Later Jang [289] showed that, if a certain quasi-linear elliptic differential equation for a function w on a hypersurface Σ admits a solution (with given asymptotic behavior), then w defines a mapping between the data set (Σ, hab, χab) on Σ and a maximal data set \((\Sigma, \,{{\bar h}_{ab}},\,{{\bar \chi}_{ab}})\) (i.e., for which \({{\bar \chi}_{ab}}{{\bar h}^{ab}} = 0\)) such that the corresponding ADM energies coincide. Then Jang shows that a slightly modified version of the Geroch energy is monotonic (and tends to the ADM energy) with respect to a new, modified version of the inverse mean curvature foliation of \((\Sigma, \,{{\bar h}_{ab}})\).
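The simplest explicit instance of such a foliation is given by the round spheres in the time-symmetric t = const. slice of the Schwarzschild solution. The sketch below assumes G = 1 by default and the standard values k = (2/r)√(1 − 2m/r) for the mean curvature and 2/r² for the scalar curvature of the sphere of area radius r; these formulas and all function names are assumptions made for illustration. With lapse n = 1/k the area radius obeys dr/dt = r/2, so the area grows like e^t, and the Geroch energy of Eq. (6.4) stays equal to m/G along the flow, saturating the monotonicity statement.

```python
import math


def mean_curvature(r, m):
    # Assumed: mean curvature of the r = const. sphere in the t = const.
    # Schwarzschild slice, k = (2/r) sqrt(1 - 2m/r).
    return 2.0 * math.sqrt(1.0 - 2.0 * m / r) / r


def radial_speed(r, m):
    """dr/dt under inverse mean curvature flow: the proper normal speed is
    1/k, and proper radial distance is dr / sqrt(1 - 2m/r)."""
    return math.sqrt(1.0 - 2.0 * m / r) / mean_curvature(r, m)


def imcf_radius(r0, t):
    # Solution of dr/dt = r/2 with r(0) = r0.
    return r0 * math.exp(t / 2.0)


def geroch_energy(r, m, G=1.0):
    """Last expression in Eq. (6.4) for a round sphere of area radius r."""
    area = 4.0 * math.pi * r**2
    scalar_curvature = 2.0 / r**2
    k = mean_curvature(r, m)
    integral = (2.0 * scalar_curvature - k * k) * area
    return math.sqrt(area / (16.0 * math.pi * G**2)) * integral / (16.0 * math.pi)


if __name__ == "__main__":
    m, r0 = 1.0, 3.0
    for t in (0.0, 1.0, 5.0):
        r = imcf_radius(r0, t)
        print(t, r, radial_speed(r, m), geroch_energy(r, m))
```

For m = 0 the same functions describe round spheres in flat space, where the Geroch energy vanishes identically.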

The existence and the properties of the original inverse-mean-curvature foliation of (Σ, hab) above were proven and clarified by Huisken and Ilmanen [278, 279], giving the first complete proof of the Riemannian Penrose inequality, and, as proven by Schoen and Yau [444], Jang’s quasi-linear elliptic equation admits a global solution.

6.3 The Hayward energy

We saw that EH can be nonzero, even in the Minkowski spacetime. This may motivate us to consider the following expression

$$\begin{array}{*{20}c} {I({\mathcal S}): = \sqrt {{{{\rm{Area}}({\mathcal S})} \over {16\pi {G^2}}}} \left({1 + {1 \over {4\pi}}\oint\nolimits_{\mathcal S} {(2\rho \rho {\prime}- \sigma \sigma {\prime}- \bar \sigma \bar \sigma {\prime})\;d{\mathcal S}}} \right)} \\ {\quad \;\;\;= \sqrt {{{{\rm{Area}}({\mathcal S})} \over {16\pi {G^2}}}} {1 \over {4\pi}}\oint\nolimits_{\mathcal S} {(- {\psi _2} - {{\bar \psi}_{2{\prime}}} + 2{\phi _{11}} + 2\Lambda)\;d{\mathcal S}}.} \\ \end{array}$$
(6.5)

(Thus, the integrand is \({1 \over 4}(F + \bar F)\), where F is given by Eq. (4.4).) By the Gauss equation, this is zero in flat spacetime; furthermore, it is not difficult to see that its limit at spatial infinity is still the ADM energy. However, using the second expression of \(I({\mathcal S})\), one can see that its limit at the future null infinity is the Newman-Unti, rather than the Bondi-Sachs energy.

In the literature there is another modification of Hawking energy, due to Hayward [248]. His suggestion is essentially \(I({\mathcal S})\) with the only difference being that the integrands of Eq. (6.5) above contain an additional term, namely the square of the anholonomicity \(- {\omega _a}{\omega ^a}\) (see Sections 4.1.8 and 11.2.1). However, we saw that \({\omega _a}\) is a boost-gauge-dependent quantity, thus, the physical significance of this suggestion is questionable unless a natural boost gauge choice, e.g., in the form of a preferred foliation, is made. (Such a boost gauge might be that given by the mean extrinsic curvature vector \({Q_a}\) and \({{\bar Q}_a}\) discussed in Section 4.1.2.) Although the expression for the Hayward energy in terms of the GHP spin coefficients given in [81, 83] seems to be gauge invariant, this is due only to an implicit gauge choice. The correct, general GHP form of the extra term is \(- {\omega _a}{\omega ^a} = 2(\beta - \bar \beta')(\bar \beta - \beta')\). If, however, the GHP spinor dyad is fixed, as in the large or small sphere calculations, then \(\beta - \bar \beta' = \tau = - \bar \tau'\), and hence, the extra term is, in fact, the gauge invariant \(2\tau \bar \tau\).

Taking into account that \(\tau = {\mathcal O}({r^{- 2}})\) near the future null infinity (see, e.g., [455]), it is obvious from the remark on the asymptotic behavior of \(I({\mathcal S})\) above that the Hayward energy tends to the Newman-Unti, instead of the Bondi-Sachs, energy at the future null infinity. The Hayward energy has been calculated for small spheres both in nonvacuum and vacuum [81]. In nonvacuum it gives the expected value \({{4\pi} \over 3}{r^3}{T_{ab}}{t^a}{t^b}\). However, in vacuum it is \(- {8 \over {45G}}{r^5}{T_{abcd}}{t^a}{t^b}{t^c}{t^d}\), which is negative.

7 Penrose’s Quasi-Local Energy-Momentum and Angular Momentum

The construction of Penrose is based on twistor-theoretical ideas, and motivated by the linearized gravity integrals for energy-momentum and angular momentum. Since, however, twistor-theoretical ideas and basic notions are still considered ‘special knowledge’, the review here of the basic idea behind the Penrose construction is slightly more detailed than that of the others. The main introductory references of the field are the volumes [425, 426] by Penrose and Rindler on ‘Spinors and Spacetime’, especially volume 2, the very readable book by Huggett and Tod [277] and the comprehensive review article [516] by Tod.

7.1 Motivations

7.1.1 How do the twistors emerge?

We saw in Section 3.1.1 that in the Newtonian theory of gravity the mass of the source in D can be expressed as the flux integral of the gravitational field strength on the boundary \({\mathcal S}: = \partial D\). Similarly, in the weak field (linear) approximation of general relativity on Minkowski spacetime the source of the gravitational field (i.e., the linearized energy-momentum tensor) is still analogous to charge. In fact, the total energy-momentum and angular momentum of the source can be expressed as appropriate two-surface integrals of the curvature at infinity [481]. Thus, it is natural to expect that the energy-momentum and angular momentum of the source in a finite three-volume Σ, given by Eq. (2.5), can also be expressed as the charge integral of the curvature on the two-surface \({\mathcal S}\). However, the curvature tensor can be integrated on \({\mathcal S}\) only if at least one pair of its indices is annihilated by some tensor via contraction, i.e., according to Eq. (3.14) if some ωab = ω[ab] is chosen and μab = εab. To simplify the subsequent analysis, ωab will be chosen to be anti-self-dual: ωab = εA′B′ ωAB with ωAB = ω(AB). Thus, our goal is to find an appropriate spinor field ωAB on \({\mathcal S}\) such that

$${Q_{\mathcal S}}[{\bf{K}}]: = \int\nolimits_\Sigma {{K_a}{T^{ab}}{1 \over {3!}}{\varepsilon _{bcde}}} = {1 \over {8\pi G}}\oint\nolimits_{\mathcal S} {{\omega ^{A\,B}}{R_{A\,Bcd}}} = :{A_{\mathcal S}}[\omega ].$$
(7.1)

Since the dual of the exterior derivative of the integrand on the right and, by Einstein’s equations, the dual of 8πG times the integrand on the left are, respectively,

$${\varepsilon ^{ecdf}}{\nabla _e}({\omega ^{A\,B}}{R_{A\,Bcd}}) = - 2{\rm{i}}{\psi ^F}_{A\,BC}{\nabla ^{F{\prime}\left(A \right.}}{\omega ^{\left. {BC} \right)}} + 2{\phi _{A\,B\,E{\prime}}}^{F{\prime}}{\rm{i}}{\nabla ^{E{\prime}F}}{\omega ^{A\,B}} + 4\Lambda {\rm{i}}\nabla _A^{F{\prime}}{\omega ^{F\,A}},$$
(7.2)
$$- 8\pi G{K_a}{T^{a\,f}} = 2{\phi ^{FAF{\prime}A{\prime}}}{K_{AA{\prime}}} + 6\Lambda {K^{FF{\prime}}},$$
(7.3)

expressions (7.2) and (7.3) are equal if ωAB satisfies

$${\nabla ^{A{\prime}A}}{\omega ^{BC}} = - {\rm{i}}{\varepsilon ^{A\left(B \right.}}{K^{\left. C \right)A{\prime}}}.$$
(7.4)

This equation in its symmetrized form, \({\nabla ^{{A\prime}(A}}{\omega ^{BC)}} = 0\), is the valence 2 twistor equation, a specific example for the general twistor equation \({\nabla ^{{A\prime}(A}}{\omega ^{BC \ldots E)}} = 0\) for ωBC.…E = ω(BC.…E). Thus, as could be expected, ωAB depends on the Killing vector Ka, and, in fact, Ka can be recovered from ωAB as \({K^{{A\prime}A}} = {2 \over 3}{\rm{i}}\nabla _B^{{A\prime}}{\omega ^{AB}}\). Thus, ωAB plays the role of a potential for the Killing vector KA′A. However, as a consequence of Eq. (7.4), Ka is a self-dual Killing 1-form in the sense that its derivative is a self-dual (s.d.) 2-form: In fact, the general solution of Eq. (7.4) and the corresponding Killing vector are

$$\begin{array}{*{20}c} {{\omega ^{A\,B}} = - {\rm{i}}{x^{AA{\prime}}}{x^{BB{\prime}}}{{\bar M}_{A{\prime}B{\prime}}} + {\rm{i}}{x^{\left(A \right.}}_{A{\prime}}{T^{\left. B \right)A{\prime}}} + {\Omega ^{A\,B}},} \\ {{K^{AA{\prime}}} = {T^{AA{\prime}}} + 2{x^{A\,B{\prime}}}\bar M_{B{\prime}}^{A{\prime}},\quad \quad \quad \quad \quad \quad \;\;\;} \\ \end{array}$$
(7.5)

where \({{\bar M}_{{A\prime}{B\prime}}},\,{T^{A{A\prime}}}\), and ΩAB are constant spinors, and we use the notation \({x^{A{A\prime}}}: = {x^{\underline a}}\sigma _{\underline a}^{\underline A \,{{\underline A}\prime}}{\mathcal E}_{\underline A}^A\bar {\mathcal E}_{{{\underline A}\prime}}^{{A\prime}}\), where \(\{{\mathcal E}_{\underline {\rm{A}}}^{\rm{A}}\}\) is a constant spin frame (the ‘Cartesian spin frame’) and \(\sigma _{\underline a}^{\underline A \,{{\underline A}\prime}}\) are the standard SL(2, ℂ) Pauli matrices (divided by \(\sqrt 2\)). These yield that Ka is, in fact, self-dual, \({\nabla _{A{A\prime}}}{K_{B{B\prime}}} = {\varepsilon _{AB}}{{\bar M}_{{A\prime}{B\prime}}}\); \({T^{A{A\prime}}}\) is a translation and \({{\bar M}_{{A\prime}{B\prime}}}\) generates self-dual rotations. Then \({Q_{\mathcal S}}[{\bf{K}}] = {T_{A{A\prime}}}{P^{A{A\prime}}} + 2{{\bar M}_{{A\prime}{B\prime}}}{{\bar J}^{{A\prime}{B\prime}}}\), implying that the charges corresponding to ΩAB are vanishing, the four components of the quasi-local energy-momentum correspond to the real \({T^{A{A\prime}}}\)s, and the spatial angular momentum and center-of-mass are combined into the three complex components of the self-dual angular momentum \({{\bar J}^{{A\prime}{B\prime}}}\), generated by \({{\bar M}_{{A\prime}{B\prime}}}\).
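The statement that the x-dependent part of the Killing vector in Eq. (7.5) generates (complex) Lorentz rotations can be checked in components. The following is only a minimal numerical sketch, not part of the original construction: it represents \({x^{A{A\prime}}}\) as a 2×2 Hermitian matrix built from the Pauli matrices divided by \(\sqrt 2\), assumes the index-raising convention ε^{0′1′} = 1, and uses hypothetical helper names (`x_spinor`, `rotation_part`, `killing_defect`). Since the rotation part is linear in x, it is a Killing field precisely when \({x_a}{K^a} = 0\) for every x, and this holds when \({{\bar M}_{{A\prime}{B\prime}}}\) is symmetric.

```python
SQRT2 = 2 ** 0.5

def x_spinor(t, x, y, z):
    """x^{AA'} as a 2x2 Hermitian matrix (Pauli matrices divided by sqrt(2))."""
    return [[(t + z) / SQRT2, (x - 1j * y) / SQRT2],
            [(x + 1j * y) / SQRT2, (t - z) / SQRT2]]

def det2(m):
    """Determinant of a 2x2 matrix; 2*det2(x_spinor(...)) is the Minkowski norm."""
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]

def matmul2(a, b):
    """Product of two 2x2 matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def minkowski_dot(u, v):
    """g_ab u^a v^b, obtained by polarizing 2*det; signature (+,-,-,-)."""
    s = [[u[i][j] + v[i][j] for j in range(2)] for i in range(2)]
    return det2(s) - det2(u) - det2(v)

def rotation_part(x_mat, m_bar):
    """2 x^{AB'} Mbar_{B'}^{A'}; the index is raised with eps^{0'1'} = 1 (assumed)."""
    eps = [[0, 1], [-1, 0]]
    n = matmul2(m_bar, eps)          # traceless when m_bar is symmetric
    xn = matmul2(x_mat, n)
    return [[2 * xn[i][j] for j in range(2)] for i in range(2)]

# Arbitrary sample points (t, x, y, z); any other choice would do.
SAMPLE_POINTS = [(1.0, 0.0, 0.0, 0.0), (0.3, 1.2, -0.7, 2.1), (2.0, -1.0, 0.5, 0.25)]

def killing_defect(m_bar):
    """Largest |x_a K^a| over the sample points for the rotation part of K."""
    return max(
        abs(minkowski_dot(x_spinor(*p), rotation_part(x_spinor(*p), m_bar)))
        for p in SAMPLE_POINTS
    )

symmetric_m = [[1 + 2j, 0.5 - 1j], [0.5 - 1j, -3j]]
nonsymmetric_m = [[0, 1], [0, 0]]
print(killing_defect(symmetric_m))     # vanishes up to rounding
print(killing_defect(nonsymmetric_m))  # does not vanish
```

The check relies only on the tracelessness of the mixed-index form of a symmetric spinor, so it does not depend on the sign convention chosen for ε.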

7.1.2 Twistor space and the kinematical twistor

Recall that the space of the contravariant valence-one twistors of Minkowski spacetime is the set of the pairs Zα ≔ (λA, πA′) of spinor fields, which solve the valence-one-twistor equation ∇A′AλB = −iεABπA′. If Zα is a solution of this equation, then \({{\hat Z}^\alpha}: = ({\lambda ^A},\,{\pi _{{A\prime}}} + {\rm{i}}{\Upsilon _{{A\prime}A}}{\lambda ^A})\) is a solution of the corresponding equation in the conformally-rescaled spacetime, where \({\Upsilon _a}: = {\Omega ^{- 1}}{\nabla _a}\Omega\) and Ω is the conformal factor. In general, the twistor equation has only the trivial solution, but in the (conformal) Minkowski spacetime it has a four complex-parameter family of solutions. Its general solution in the Minkowski spacetime is λA = ΛA − ixAA′ πA′, where ΛA and πA′ are constant spinors. Thus, the space Tα of valence-one twistors, called the twistor space, is four-complex-dimensional, and hence, has the structure \({{\rm{T}}^\alpha} = {{\rm{S}}^A} \oplus {{{\rm{\bar S}}}_{{A\prime}}}\). Tα admits a natural Hermitian scalar product: if Wβ = (ωB, σB′) is another twistor, then \({H_{\alpha {\beta \prime}}}{Z^\alpha}{{\bar W}^{{\beta \prime}}}: = {\lambda ^A}{{\bar \sigma}_A} + {\pi _{{A\prime}}}{{\bar \omega}^{{A\prime}}}\). Its signature is (+, +, −, −), it is conformally invariant, \({H_{\alpha {\beta \prime}}}{{\hat Z}^\alpha}{{\bar {\hat W}}^{{\beta \prime}}} = {H_{\alpha {\beta \prime}}}{Z^\alpha}{{\bar W}^{{\beta \prime}}}\), and it is constant on Minkowski spacetime. The metric Hαβ′ defines a natural isomorphism between the complex conjugate twistor space, \({{{\rm{\bar T}}}^{{\alpha \prime}}}\), and the dual twistor space, \({{\rm{T}}_\beta}: = {{\rm{S}}_B} \oplus {{\rm{\bar S}}^{{B\prime}}}\), by \(({{\bar \lambda}^{{A\prime}}},\,{{\bar \pi}_A}) \mapsto ({{\bar \pi}_A},\,{{\bar \lambda}^{{A\prime}}})\). This makes it possible to use only twistors with unprimed indices. In particular, the complex conjugate Āα′β′ of the covariant valence-two twistor Aαβ can be represented by the conjugate twistor \({{\bar A}^{\alpha \beta}}: = {{\bar A}_{{\alpha \prime}{\beta \prime}}}{H^{{\alpha \prime}\alpha}}{H^{{\beta \prime}\beta}}\). We should mention two special, higher-valence twistors.
The first is the infinity twistor. This and its conjugate are given explicitly by

$$\begin{array}{*{20}c} {{I^{\alpha \beta}}: = \left({\begin{array}{*{20}c} {{\varepsilon ^{A\,B}}} & 0 \\ 0 & 0 \\ \end{array}} \right),} & {{I_{\alpha \beta}}: = {{\bar I}^{\alpha {\prime}\beta {\prime}}}{H_{\alpha {\prime}\alpha}}{H_{\beta {\prime}\beta}} = \left({\begin{array}{*{20}c} 0 & 0 \\ 0 & {{\varepsilon ^{A{\prime}\,B{\prime}}}} \\ \end{array}} \right)} \\ \end{array}.$$
(7.6)

The other is the completely anti-symmetric twistor \({\varepsilon _{\alpha \beta \gamma \delta}}\), whose component \({\varepsilon _{0123}}\) in an Hαβ′-orthonormal basis is required to be one. The only nonvanishing spinor parts of \({\varepsilon _{\alpha \beta \gamma \delta}}\) are those with two primed and two unprimed spinor indices: \({\varepsilon ^{A{\prime}B{\prime}}}_{CD} = {\varepsilon ^{A{\prime}B{\prime}}}{\varepsilon _{CD}},\,{\varepsilon ^{A{\prime}}}_B{\,^{C{\prime}}}_D = - {\varepsilon ^{A{\prime}C{\prime}}}{\varepsilon _{BD}},\,{\varepsilon _{AB}}^{C{\prime}D{\prime}} = {\varepsilon _{AB}}{\varepsilon ^{C{\prime}D{\prime}}}\). Thus, for any four twistors \(Z_i^\alpha = (\lambda _i^A,\,\pi _{{A\prime}}^i),\,i = 1,\, \ldots, \,4\), the determinant of the 4×4 matrix whose i-th column is \((\lambda _i^0,\,\lambda _i^1,\,\pi _0^i,\,\pi _1^i)\), where \(\lambda _i^0,\, \ldots, \,\pi _1^i\) are the components of the spinors \(\lambda _i^A\) and \(\pi _{{A\prime}}^i\) in some spin frame, is

$$\nu : = \det \left({\begin{array}{*{20}c} {\lambda _1^{\bf{0}}} & {\lambda _2^{\bf{0}}} & {\lambda _3^{\bf{0}}} & {\lambda _4^{\bf{0}}} \\ {\lambda _1^{\bf{1}}} & {\lambda _2^{\bf{1}}} & {\lambda _3^{\bf{1}}} & {\lambda _4^{\bf{1}}} \\ {\pi _{\bf{0\prime}}^1} & {\pi _{\bf{0\prime}}^2} & {\pi _{\bf{0\prime}}^3} & {\pi _{\bf{0\prime}}^4} \\ {\pi _{\bf{1\prime}}^1} & {\pi _{\bf{1\prime}}^2} & {\pi _{\bf{1\prime}}^3} & {\pi _{\bf{1\prime}}^4} \\ \end{array}} \right) = {\textstyle{1 \over 4}}{{\epsilon}^{ij}}_{kl}\lambda _i^A\lambda _j^B\pi _{A{\prime}}^k\pi _{B{\prime}}^l{\varepsilon _{A\,B}}{\varepsilon ^{A{\prime}B{\prime}}} = {\textstyle{1 \over 4}}{\varepsilon _{\alpha \beta \gamma \delta}}Z_1^\alpha Z_2^\beta Z_3^\gamma Z_4^\delta,$$
(7.7)

where \({\epsilon ^{ij}}_{kl}\) is the totally antisymmetric Levi-Civita symbol. Then \({I^{\alpha \beta}}\) and \({I_{\alpha \beta}}\) are dual to each other in the sense that \({I^{\alpha \beta}} = {1 \over 2}{\varepsilon ^{\alpha \beta \gamma \delta}}{I_{\gamma \delta}}\), and by the simplicity of \({I^{\alpha \beta}}\) one has \({\varepsilon _{\alpha \beta \gamma \delta}}{I^{\alpha \beta}}{I^{\gamma \delta}} = 0\).
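The duality and simplicity of the infinity twistor can be verified directly in components. The sketch below is illustrative only and rests on assumptions that are not fixed by the text: a spin-frame-adapted twistor basis ordered as (0, 1, 0′, 1′), the normalizations ε^{01} = ε^{0′1′} = 1, and an alternating symbol normalized to +1 on (0, 1, 2, 3). The names `I_UP`, `I_DOWN`, `dual` and `simplicity` are hypothetical.

```python
from itertools import permutations

def levi_civita4():
    """Alternating symbol on four indices, +1 on (0, 1, 2, 3)."""
    eps = {}
    for perm in permutations(range(4)):
        inversions = sum(
            1 for i in range(4) for j in range(i + 1, 4) if perm[i] > perm[j]
        )
        eps[perm] = -1 if inversions % 2 else 1
    return eps

EPS = levi_civita4()

# I^{alpha beta}: only the unprimed block eps^{AB} is nonzero (Eq. (7.6)).
I_UP = [[0, 1, 0, 0], [-1, 0, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0]]
# I_{alpha beta}: only the primed block eps^{A'B'} is nonzero (Eq. (7.6)).
I_DOWN = [[0, 0, 0, 0], [0, 0, 0, 0], [0, 0, 0, 1], [0, 0, -1, 0]]

def dual(lower):
    """(1/2) eps^{abcd} F_{cd} for an antisymmetric array F."""
    out = [[0] * 4 for _ in range(4)]
    for (a, b, c, d), sign in EPS.items():
        out[a][b] += 0.5 * sign * lower[c][d]
    return out

def simplicity(upper):
    """eps_{abcd} B^{ab} B^{cd}; this vanishes for a simple bivector."""
    return sum(
        sign * upper[a][b] * upper[c][d] for (a, b, c, d), sign in EPS.items()
    )

print(dual(I_DOWN) == I_UP)   # the two infinity twistors are dual to each other
print(simplicity(I_UP))       # zero, since I^{alpha beta} is simple
```

For comparison, a bivector with nonzero components in both the unprimed and the primed block, such as the sum of the two arrays above, gives a nonvanishing value of `simplicity` and is therefore not simple.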

The solution ωAB of the valence-two twistor equation, given by Eq. (7.5), can always be written as a linear combination of the symmetrized product \({\lambda ^{(A}}{\omega ^{B)}}\) of the solutions λA and ωA of the valence-one twistor equation. ωAB uniquely defines a symmetric twistor ωαβ (see, e.g., [426]). Its spinor parts are

$${\omega ^{\alpha \beta}} = \left({\begin{array}{*{20}c} {{\omega ^{A\,B}}} & {- {\textstyle{1 \over 2}}{K^A}_{B{\prime}}} \\ {- {\textstyle{1 \over 2}}{K_{A{\prime}}}^B} & {- {\rm{i}}{{\bar M}_{A{\prime}B{\prime}}}} \\ \end{array}} \right).$$

However, Eq. (7.1) can be interpreted as a ℂ-linear mapping of ωαβ into ℂ, i.e., Eq. (7.1) defines a dual twistor, the (symmetric) kinematical twistor Aαβ, which therefore has the structure

$${A_{\alpha \beta}} = \left({\begin{array}{*{20}c} 0 & {{P_A}^{B{\prime}}} \\ {{P^{A{\prime}}}_B} & {2{\rm{i}}{{\bar J}^{A{\prime}B{\prime}}}} \\ \end{array}} \right).$$
(7.8)

Thus, the quasi-local energy-momentum and self-dual angular momentum of the source are certain spinor parts of the kinematical twistor. In contrast to the ten complex components of a general symmetric twistor, it has only ten real components as a consequence of its structure (its spinor part AAB is identically zero) and the reality of PAA′. These properties can be reformulated by the infinity twistor and the Hermitian metric as conditions on Aαβ: the vanishing of the spinor part AAB is equivalent to \({A_{\alpha \beta}}{I^{\alpha \gamma}}{I^{\beta \delta}} = 0\) and the energy-momentum is the \({A_{\alpha \beta}}{Z^\alpha}{I^{\beta \gamma}}{H_{\gamma {\gamma \prime}}}{{\bar Z}^{{\gamma \prime}}}\) part of the kinematical twistor, while the whole reality condition (ensuring both AAB = 0 and the reality of the energy-momentum) is equivalent to

$${A_{\alpha \beta}}{I^{\beta \gamma}}{H_{\gamma \delta {\prime}}} = {\bar A_{\delta {\prime}\beta {\prime}}}{\bar I^{\beta {\prime}\gamma {\prime}}}{H_{\gamma {\prime}\alpha}}.$$
(7.9)

Using the conjugate twistors, this can be rewritten (and, in fact, usually is written) as \({A_{\alpha \beta}}{I^{\beta \gamma}} = ({H^{\gamma {\alpha \prime}}}\,{{\bar A}_{{\alpha \prime}{\beta \prime}}}{H^{{\beta \prime}\delta}})\,({H_{\delta {\delta \prime}}}{{\bar I}^{{\delta \prime}{\gamma \prime}}}{H_{{\gamma \prime}\alpha}}) = {{\bar A}^{\gamma \delta}}{I_{\delta \alpha}}\). The quasi-local mass can also be expressed by the kinematical twistor as its Hermitian norm [420] or as its determinant [510]:

$${m^2} = - {P_A}^{A{\prime}}{P^A}_{A{\prime}} = - {1 \over 2}{A_{\alpha \beta}}{\bar A_{\alpha {\prime}\beta {\prime}}}{H^{\alpha \alpha {\prime}}}{H^{\beta \beta {\prime}}} = - {1 \over 2}{A_{\alpha \beta}}{\bar A^{\alpha \beta}},$$
(7.10)
$${m^4} = 4\det {A_{\alpha \beta}} = {\textstyle{1 \over {3!}}}{\varepsilon ^{\alpha \beta \gamma \delta}}{\varepsilon ^{\mu \nu \,\rho \sigma}}{A_{\alpha \mu}}{A_{\beta \nu}}{A_{\gamma \rho}}{A_{\delta \sigma}}.$$
(7.11)
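Equation (7.11) can be checked numerically for a kinematical twistor of the block form (7.8). The following is a minimal sketch under stated assumptions: the energy-momentum is represented by the Hermitian matrix \({P^{A{A\prime}}}\) built with Pauli matrices divided by \(\sqrt 2\), the spinor index is lowered with ε_{01} = 1, the metric signature is (+, −, −, −), and all function names are hypothetical. In this setting the determinant does not depend on the angular momentum block, so 4 det Aαβ should reproduce \({m^4} = {({E^2} - {p^2})^2}\).

```python
SQRT2 = 2 ** 0.5

def det(m):
    """Determinant by Laplace expansion along the first row (adequate for 4x4)."""
    if len(m) == 1:
        return m[0][0]
    total = 0
    for j in range(len(m)):
        minor = [row[:j] + row[j + 1:] for row in m[1:]]
        total += (-1) ** j * m[0][j] * det(minor)
    return total

def momentum_spinor(energy, px, py, pz):
    """P^{AA'} as a Hermitian 2x2 matrix (Pauli matrices divided by sqrt(2))."""
    return [[(energy + pz) / SQRT2, (px - 1j * py) / SQRT2],
            [(px + 1j * py) / SQRT2, (energy - pz) / SQRT2]]

def lower_first_index(p_up):
    """P_A^{B'} = P^{CB'} eps_{CA}, with eps_{01} = 1 (assumed convention)."""
    eps = [[0, 1], [-1, 0]]
    return [[sum(p_up[c][b] * eps[c][a] for c in range(2)) for b in range(2)]
            for a in range(2)]

def kinematical_twistor(p_up, j_bar):
    """4x4 symmetric array with the block structure of Eq. (7.8)."""
    p = lower_first_index(p_up)
    return [
        [0, 0, p[0][0], p[0][1]],
        [0, 0, p[1][0], p[1][1]],
        [p[0][0], p[1][0], 2j * j_bar[0][0], 2j * j_bar[0][1]],
        [p[0][1], p[1][1], 2j * j_bar[1][0], 2j * j_bar[1][1]],
    ]

def mass_fourth_from_determinant(a):
    """Right-hand side of m^4 = 4 det A; the imaginary part is rounding noise."""
    return (4 * det(a)).real

p_up = momentum_spinor(5.0, 1.0, 2.0, 3.0)       # m^2 = 25 - 14 = 11
j_bar = [[0.3 + 1j, -2.0], [-2.0, 1.5j]]         # arbitrary symmetric block
a = kinematical_twistor(p_up, j_bar)
print(mass_fourth_from_determinant(a))            # close to 11**2 = 121
```

Changing `j_bar` leaves the result unchanged, which illustrates that the determinant-mass depends only on the energy-momentum part when the spinor part AAB vanishes.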

Similarly, as Helfer shows [264], the various components of the Pauli-Lubanski spin vector \({S_a}: = {1 \over 2}{\varepsilon _{abcd}}{P^b}{J^{cd}}\) can also be expressed by the kinematical and infinity twistors and by certain special null twistors: if Zα = (−ixAB′ πB′, πA′) and Wα = (−ixAB′ σB′, σA′) are two different (null) twistors such that AαβZαZβ = 0 and AαβWαWβ = 0, then

$${(2{P^e}{\pi _{E{\prime}}}{\bar \pi _E}{P^f}{\sigma _{F{\prime}}}{\bar \sigma _F})^{- 1}}{\bar \pi _A}{\pi _{A{\prime}}}{\bar \sigma _B}{\sigma _{B{\prime}}}({S^a}{P^b} - {S^b}{P^a}) = - \Re \left({{{{A_{\alpha \beta}}{Z^\alpha}{W^\beta}} \over {{I_{\gamma \delta}}{Z^\gamma}{W^\delta}}}} \right).$$
(7.12)

(ℜ on the right means ‘real part’.)

Thus, to summarize, the various spinor parts of the kinematical twistor Aαβ are the energy-momentum and s.d. angular momentum. However, additional structures, namely the infinity twistor and the Hermitian scalar product, are needed to be able to ‘isolate’ its energy-momentum and angular momentum parts, and, in particular, to define the mass and express the Pauli-Lubanski spin. Furthermore, the Hermiticity condition ensuring that Aαβ has the correct number of components (ten reals) is also formulated in terms of these additional structures.
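Two properties of the Hermitian scalar product used above, its (+, +, −, −) signature and its constancy on Minkowski spacetime, can be illustrated in components. This is only a sketch with hypothetical helper names; it assumes the Pauli-matrix representation of \({x^{A{A\prime}}}\) (divided by \(\sqrt 2\)) and the general solution λA = ΛA − ixAA′ πA′ of the valence-one twistor equation quoted in Section 7.1.2.

```python
SQRT2 = 2 ** 0.5

def x_spinor(t, x, y, z):
    """x^{AA'} as a 2x2 Hermitian matrix (Pauli matrices divided by sqrt(2))."""
    return [[(t + z) / SQRT2, (x - 1j * y) / SQRT2],
            [(x + 1j * y) / SQRT2, (t - z) / SQRT2]]

def twistor_at(point, big_lambda, pi):
    """Spinor parts (lambda^A(x), pi_{A'}) with lambda^A = Lambda^A - i x^{AA'} pi_{A'}."""
    xm = x_spinor(*point)
    lam = [big_lambda[a] - 1j * sum(xm[a][b] * pi[b] for b in range(2))
           for a in range(2)]
    return lam, list(pi)

def hermitian_product(z, w):
    """lambda^A conj(sigma)_A + pi_{A'} conj(omega)^{A'} for Z=(lambda, pi), W=(omega, sigma)."""
    lam, pi = z
    omega, sigma = w
    return (sum(lam[a] * complex(sigma[a]).conjugate() for a in range(2))
            + sum(pi[a] * complex(omega[a]).conjugate() for a in range(2)))

# A basis in which the Gram matrix is diagonal, evaluated at the origin.
BASIS = [
    ([1, 0], [1, 0]),
    ([0, 1], [0, 1]),
    ([1, 0], [-1, 0]),
    ([0, 1], [0, -1]),
]

def gram_matrix():
    """Gram matrix of the Hermitian product on BASIS."""
    return [[hermitian_product(zi, zj) for zj in BASIS] for zi in BASIS]

# Constancy: the same pair of twistors evaluated at two different points.
Z_DATA = ([1 + 1j, 2.0], [0.5j, -1.0])    # (Lambda, pi), arbitrary
W_DATA = ([0.3, -1j], [2.0, 1 - 1j])      # (Omega, sigma), arbitrary
h_origin = hermitian_product(twistor_at((0, 0, 0, 0), *Z_DATA),
                             twistor_at((0, 0, 0, 0), *W_DATA))
h_elsewhere = hermitian_product(twistor_at((1.5, -0.4, 2.0, 0.7), *Z_DATA),
                                twistor_at((1.5, -0.4, 2.0, 0.7), *W_DATA))
print(gram_matrix())                      # diagonal with entries 2, 2, -2, -2
print(abs(h_origin - h_elsewhere))        # vanishes up to rounding
```

The cancellation of the x-dependent terms uses only the Hermiticity of the matrix \({x^{A{A\prime}}}\), i.e., the reality of the point x.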

7.2 The original construction for curved spacetimes

7.2.1 Two-surface twistors and the kinematical twistor

In general spacetimes, the twistor equations have only the trivial solution. Thus, to be able to associate a kinematical twistor with a closed orientable spacelike two-surface \({\mathcal S}\) in general, the conditions on the spinor field ωAB have to be relaxed. Penrose’s suggestion [420, 421] is to consider ωAB in Eq. (7.1) to be the symmetrized product \({\lambda ^{(A}}{\omega ^{B)}}\) of spinor fields that are solutions of the ‘tangential projection to \({\mathcal S}\)’ of the valence-one twistor equation, the two-surface twistor equation. (The equation obtained as the ‘tangential projection to \({\mathcal S}\)’ of the valence-two twistor equation (7.4) would be under-determined [421].) Thus, the quasi-local quantities are searched for in the form of a charge integral of the curvature:

$$\begin{array}{*{20}c} {{A_{\mathcal S}}[\lambda, \omega ]: = {{- 1} \over {8\pi G}}\oint\nolimits_{\mathcal S} {{\lambda ^A}{\omega ^B}{R_{A\,Bcd}}}} \\ {= {{\rm{i}} \over {4\pi G}}\oint\nolimits_{\mathcal S} {[{\lambda ^0}{\omega ^0}({\phi _{01}} - {\psi _1}) + ({\lambda ^0}{\omega ^1} + {\lambda ^1}{\omega ^0})\,({\phi _{11}} + \Lambda - {\psi _2}) + {\lambda ^1}{\omega ^1}({\phi _{21}} - {\psi _3})]\;d{\mathcal S},}} \\ \end{array}$$
(7.13)

where the second expression is given in the GHP formalism with respect to some GHP spin frame adapted to the two-surface \({\mathcal S}\). Since the indices c and d on the right of the first expression are tangential to \({\mathcal S}\), this is just the charge integral of FABcd in the spinor identity (4.5) of Section 4.1.5.

The two-surface twistor equation that the spinor fields should satisfy is just the covariant spinor equation \({\mathcal{T}_{E'EA}}{{\mkern 1mu} ^B}{\lambda _B} = 0\). By Eq. (4.6) its GHP form is \({\mathcal T}\lambda : = ({{\mathcal T}^ +} \oplus {{\mathcal T}^ -})\lambda = 0\), which is a first-order elliptic system, and its index is 4(1 − g), where g is the genus of \({\mathcal S}\) [58]. Thus, there are at least four (and in the generic case precisely four) linearly-independent solutions to \({\mathcal T}\lambda = 0\) on topological two-spheres. However, there are ‘exceptional’ two-spheres for which there exist at least five linearly independent solutions [297]. For such ‘exceptional’ two-spheres (and for higher-genus two-surfaces for which the twistor equation has only the trivial solution in general) the subsequent construction does not work. (The concept of quasi-local charges in Yang-Mills theory can also be introduced in an analogous way [509, 183].) The space \({\rm{T}}_{\mathcal S}^\alpha\) of the solutions to \({\mathcal T}\lambda = 0\) is called the two-surface twistor space. In fact, in the generic case this space is four-complex-dimensional, and under conformal rescaling the pair Zα = (λA, iΔA′AλA) transforms like a valence one contravariant twistor. Zα is called a two-surface twistor determined by λA. If \({{\mathcal S}\prime}\) is another generic two-surface with the corresponding two-surface twistor space \({\rm{T}}_{{{\mathcal S}\prime}}^\alpha\), then although \({\rm{T}}_{\mathcal S}^\alpha\) and \({\rm{T}}_{{{\mathcal S}\prime}}^\alpha\) are isomorphic as vector spaces, there is no canonical isomorphism between them. The kinematical twistor Aαβ is defined to be the symmetric twistor determined by \({A_{\alpha \beta}}{Z^\alpha}{W^\beta}: = {A_{\mathcal S}}[\lambda, \,\omega ]\) for any Zα = (λA, iΔA′AλA) and Wα = (ωA, iΔA′AωA) from \({\rm{T}}_{\mathcal S}^\alpha\).
Note that \({A_{\mathcal S}}[\lambda, \,\omega ]\) is constructed only from the two-surface data on \({\mathcal S}\).

7.2.2 The Hamiltonian interpretation of the kinematical twistor

For the solutions λA and ωA of the two-surface twistor equation, the spinor identity (4.5) reduces to Tod’s expression [420, 426, 516] for the kinematical twistor, making it possible to re-express \({A_{\mathcal S}}[\lambda, \,\omega ]\) by the integral of the Nester-Witten 2-form [490]. Indeed, if

$${H_{\mathcal S}}[\lambda, \bar \mu ]: = {1 \over {4\pi G}}\oint\nolimits_{\mathcal S} {u{{(\lambda, \bar \mu)}_{ab}}} = - {1 \over {4\pi G}}\oint\nolimits_{\mathcal S} {{{\bar \gamma}^{A{\prime}B{\prime}}}{{\bar \mu}_{A{\prime}}}{\Delta _{B{\prime}B}}{\lambda ^B}\,d{\mathcal S}},$$
(7.14)

then, with the choice \({{\bar \mu}_{{A\prime}}}: = {\rm{i}}{\Delta _{{A\prime}}}^A{\omega _A}\), this gives Penrose’s charge integral by Eq. (4.5): \({A_{\mathcal S}}[\lambda, \,\omega ] = {H_{\mathcal S}}[\lambda, \,\bar \mu ]\). Then, extending the spinor fields λA and ωA from \({\mathcal S}\) to a spacelike hypersurface \(\Sigma\) with boundary \({\mathcal S}\) in an arbitrary way, by the Sparling equation it is straightforward to rewrite \({A_{\mathcal S}}[\lambda, \,\omega ]\) in the form of the integral of the energy-momentum tensor of the matter fields and the Sparling form on Σ. Since such an integral of the Sparling form can be interpreted as the Hamiltonian of general relativity, this is a quick re-derivation of Mason’s [357, 358] Hamiltonian interpretation of Penrose’s kinematical twistor: \({A_{\mathcal S}}[\lambda, \,\omega ]\) is just the boundary term in the total Hamiltonian of the matter + gravity system, and the spinor fields λA and ωA (together with their ‘projection parts’ iΔA′AλA and iΔA′AωA) on \({\mathcal S}\) are interpreted as the spinor constituents of the special lapse and shift, called the ‘quasi-translations’ and ‘quasi-rotations’ of the two-surface, on the two-surface itself.

7.2.3 The Hermitian scalar product and the infinity twistor

In general, the natural pointwise Hermitian scalar product, defined by \(\left\langle {Z,\,\bar W} \right\rangle : = - {\rm{i(}}{\lambda ^A}{\Delta _{A{A\prime}}}{{\bar \omega}^{{A\prime}}} - {{\bar \omega}^{{A\prime}}}{\Delta _{A{A\prime}}}{\lambda ^A})\), is not constant on \({\mathcal S}\), thus, it does not define a Hermitian scalar product on the two-surface twistor space. As is shown in [296, 299, 514], \(\left\langle {Z,\,\bar W} \right\rangle\) is constant on \({\mathcal S}\) for any two two-surface twistors if and only if \({\mathcal S}\) can be embedded, at least locally, into some conformal Minkowski spacetime with its intrinsic metric and extrinsic curvatures. Such two-surfaces are called noncontorted, while those that cannot be embedded are called contorted. One natural candidate for the Hermitian metric could be the average of \(\left\langle {Z,\,\bar W} \right\rangle\) on \({\mathcal S}\) [420]: \({H_{\alpha {\beta \prime}}}{Z^\alpha}{{\bar W}^{{\beta \prime}}}: = {[{\rm{Area}}({\mathcal S})]^{- {1 \over 2}}}\oint\nolimits_{\mathcal S} {\left\langle {Z,\,\bar W} \right\rangle \,d{\mathcal S}}\), which reduces to \(\left\langle {Z,\,\bar W} \right\rangle\) on noncontorted two-surfaces. Interestingly enough, \(\oint\nolimits_{\mathcal S} {\left\langle {Z,\,\bar W} \right\rangle \,d{\mathcal S}}\) can also be reexpressed by the integral (7.14) of the Nester-Witten 2-form [490]. Unfortunately, however, neither this metric nor the other suggestions appearing in the literature are conformally invariant. Thus, for contorted two-surfaces, the definition of the quasi-local mass as the norm of the kinematical twistor (cf. Eq. (7.10)) is ambiguous unless a natural Hαβ′ is found.

If \({\mathcal S}\) is noncontorted, then the scalar product \(\left\langle {Z,\,\bar W} \right\rangle\) defines the totally anti-symmetric twistor \({\varepsilon _{\alpha \beta \gamma \delta}}\), and for the four independent two-surface twistors \(Z_1^\alpha, \, \ldots, \,Z_4^\alpha\) the contraction \({\varepsilon _{\alpha \beta \gamma \delta}}Z_1^\alpha Z_2^\beta Z_3^\gamma Z_4^\delta\), and hence, by Eq. (7.7), the determinant ν, is constant on \({\mathcal S}\). Nevertheless, ν can be constant even for contorted two-surfaces for which \(\left\langle {Z,\,\bar W} \right\rangle\) is not. Thus, the totally anti-symmetric twistor \({\varepsilon _{\alpha \beta \gamma \delta}}\) can exist even for certain contorted two-surfaces. Therefore, an alternative definition of the quasi-local mass might be based on Eq. (7.11) [510]. However, although the two mass definitions are equivalent in the linearized theory, they are different invariants of the kinematical twistor even in de Sitter or anti-de Sitter spacetimes. Thus, if needed, the former notion of mass will be called the norm-mass, the latter the determinant-mass (denoted by mD).

If we want to have not only the notion of the mass but its reality as well, then we should ensure the Hermiticity of the kinematical twistor. But to formulate the Hermiticity condition (7.9), one also needs the infinity twistor. However, \(- {\varepsilon ^{{A\prime}{B\prime}}}{\Delta _{{A\prime}A}}{\lambda ^A}{\Delta _{{B\prime}B}}{\omega ^B}\) is not constant on \({\mathcal S}\) even if it is noncontorted. Thus, in general, it does not define any twistor on \({\rm{T}}_{\mathcal S}^\alpha\). One might take its average on \({\mathcal S}\) (which can also be re-expressed by the integral of the Nester-Witten 2-form [490]), but the resulting twistor would not be simple. In fact, even on two-surfaces in de Sitter and anti-de Sitter spacetimes with cosmological constant λ the natural definition for Iαβ is Iαβ ≔ diag(λεAB, εA′B′) [426, 424, 510], while on round spheres in spherically-symmetric spacetimes it is \({I_{\alpha \beta}}{Z^\alpha}{W^\beta}: = {1 \over {2{r^2}}}(1 + 2{r^2}\rho {\rho {\prime}}){\varepsilon _{AB}}{\lambda ^A}{\omega ^B} - {\varepsilon ^{{A{\prime}}{B{\prime}}}}{\Delta _{{A{\prime}}A}}{\lambda ^A}{\Delta _{{B{\prime}}B}}{\omega ^B}\) [496]. Thus, no natural simple infinity twistor has been found in curved spacetime. Indeed, Helfer claims that no such infinity twistor can exist [263]: even if the spacetime is conformally flat (in which case the Hermitian metric exists) the Hermiticity condition would be fifteen algebraic equations for the (at most) twelve real components of the ‘would be’ infinity twistor. Then, since the possible kinematical twistors form an open set in the space of symmetric twistors, the Hermiticity condition cannot be satisfied even for nonsimple \({I^{\alpha \beta}}\)s. However, in contrast to the linearized gravity case, the infinity twistor need not be given once and for all on some ‘universal’ twistor space; it may depend on the actual gravitational field. In fact, the two-surface twistor space itself depends on the geometry of \({\mathcal S}\), and hence all its structures also.

Since in the Hermiticity condition (7.9) only the special combination \({H^\alpha}_{{\beta {\prime}}}: = {I^{\alpha \beta}}{H_{\beta {\beta {\prime}}}}\) of the infinity and metric twistors (the ‘bar-hook’ combination) appears, it might still be hoped that an appropriate \({H^\alpha}_{{\beta {\prime}}}\) could be found for a class of two-surfaces in a natural way [516]. However, as far as the present author is aware, no real progress has been achieved in this way.

7.2.4 The various limits

Obviously, the kinematical twistor vanishes in flat spacetime and, since the basic idea comes from linearized gravity, the construction gives the correct results in the weak field approximation. The nonrelativistic weak field approximation, i.e., the Newtonian limit, was clarified by Jeffryes [298]. He considers a one-parameter family of spacetimes with perfect fluid source, such that in the λ → 0 limit of the parameter λ, one gets a Newtonian spacetime, and, in the same limit, the two-surface \({\mathcal S}\) lies in a t = const. hypersurface of the Newtonian time t. In this limit the pointwise Hermitian scalar product is constant, and the norm-mass can be calculated. As could be expected, for the leading \({\lambda ^2}\)-order term in the expansion of m as a series of λ he obtained the conserved Newtonian mass. The Newtonian energy, including the kinetic and the Newtonian potential energy, appears as a \({\lambda ^4}\)-order correction.

The Penrose definition for the energy-momentum and angular momentum can be applied to the cuts \({\mathcal S}\) of the future null infinity ℐ+ of an asymptotically flat spacetime [420, 426]. Then every element of the construction is built from conformally-rescaled quantities of the nonphysical spacetime. Since ℐ+ is shear-free, the two-surface twistor equations on \({\mathcal S}\) decouple, and hence, the solution space admits a natural infinity twistor Iαβ. It singles out precisely those solutions whose primary spinor parts span the asymptotic spin space of Bramson (see Section 4.2.4), and they will be the generators of the energy-momentum. Although \({\mathcal S}\) is contorted, and hence, there is no natural Hermitian scalar product, there is a twistor \({H^\alpha}_{{\beta \prime}}\) with respect to which Aαβ is Hermitian. Furthermore, the determinant ν is constant on \({\mathcal S}\), and hence it defines a volume 4-form on the two-surface twistor space [516]. The energy-momentum coming from Aαβ is just that of Bondi and Sachs. The angular momentum defined by Aαβ is, however, new. It has a number of attractive properties. First, in contrast to definitions based on the Komar expression, it does not have the ‘factor-of-two anomaly’ between the angular momentum and the energy-momentum. Since its definition is based on the solutions of the two-surface twistor equations (which can be interpreted as the spinor constituents of certain BMS vector fields generating boost-rotations) instead of the BMS vector fields themselves, it is free of supertranslation ambiguities. In fact, the two-surface twistor space on \({\mathcal S}\) reduces the BMS Lie algebra to one of its Poincaré subalgebras. 
Thus, the concept of the ‘translation of the origin’ is moved from null infinity to the twistor space (appearing in the form of a four-parameter family of ambiguities in the potential for the shear σ), and the angular momentum transforms just in the expected way under such a ‘translation of the origin’. It is shown in [174] that Penrose’s angular momentum can be considered as a supertranslation of previous definitions.

The other way of determining the null infinity limit of the energy-momentum and angular momentum is to calculate them for large spheres from the physical data, instead of for the spheres at null infinity from the conformally-rescaled data. These calculations were done by Shaw [455, 457]. At this point it should be noted that the r → ∞ limit of Aαβ vanishes, and it is \(\sqrt {{\rm{Area(}}{{\mathcal S}_r})} {A_{\alpha \beta}}\) that yields the energy-momentum and angular momentum at infinity (see the remarks following Eq. (3.14)). The specific radiative solution for which the Penrose mass has been calculated is that of Robinson and Trautman [510]. The two-surfaces for which the mass was calculated are the r = const. cuts of the geometrically-distinguished outgoing null hypersurfaces u = const. Tod found that, for given u, the mass m is independent of r, as could be expected because of the lack of incoming radiation.

In [264] Helfer suggested a bijective nonlinear map between the two-surface twistor spaces on the different cuts of ℐ+, by means of which he got something like a ‘universal twistor space’. Then he extends the kinematical twistor to this space, and in this extension the shear potential (i.e., the complex function S for which the asymptotic shear can be written as σ = ð²S) appears explicitly. Using Eq. (7.12) as the definition of the intrinsic-spin angular momentum at scri, Helfer derives an explicit formula for the spin. In addition to the expected Pauli-Lubanski type term, there is an extra term, which is proportional to the imaginary part of the shear potential. Since the twistor spaces on the different cuts of scri have been identified, the angular momentum flux can be, and has in fact been, calculated. (For an earlier attempt to calculate this flux, see [262].)

The large sphere limit of the two-surface twistor space and the Penrose construction were investigated by Shaw in the Sommers [475], Ashtekar-Hansen [37], and Beig-Schmidt [65] models of spatial infinity in [451, 452, 454]. Since no gravitational radiation is present near the spatial infinity, the large spheres are (asymptotically) noncontorted, and both the Hermitian scalar product and the infinity twistor are well defined. Thus, the energy-momentum and angular momentum (and, in particular, the mass) can be calculated. In vacuum he recovered the Ashtekar-Hansen expression for the energy-momentum and angular momentum, and proved their conservation if the Weyl curvature is asymptotically purely electric. In the presence of matter the conservation of the angular momentum was investigated in [456].

The Penrose mass in asymptotically anti-de Sitter spacetimes was studied by Kelly [312]. He calculated the kinematical twistor for spacelike cuts \({\mathcal S}\) of the infinity ℐ, which is now a timelike three-manifold in the nonphysical spacetime. Since ℐ admits global three-surface twistors (see Section 7.2.5), \({\mathcal S}\) is noncontorted. In addition to the Hermitian scalar product, there is a natural infinity twistor, and the kinematical twistor satisfies the corresponding Hermiticity condition. The energy-momentum four-vector coming from the Penrose definition is shown to coincide with that of Ashtekar and Magnon [42]. Therefore, the energy-momentum four-vector is future pointing and timelike if there is a spacelike hypersurface extending to ℐ on which the dominant energy condition is satisfied. Consequently, \({m^2} \ge 0\). Kelly shows that \(m_{\rm{D}}^2\) is also non-negative and in vacuum it coincides with \({m^2}\). In fact [516], \(m \ge {m_{\rm{D}}} \ge 0\) holds.

7.2.5 The quasi-local mass of specific two-surfaces

The Penrose mass has been calculated in a large number of specific situations. Round spheres are always noncontorted [514], thus, the norm-mass can be calculated. (In fact, axisymmetric two-surfaces in spacetimes with twist-free rotational Killing vectors are noncontorted [299].) The Penrose mass for round spheres reduces to the standard energy expression discussed in Section 4.2.1 [510]. Thus, every statement given in Section 4.2.1 for round spheres is valid for the Penrose mass, and we do not repeat them. In particular, for round spheres in a t = const. slice of the Kantowski-Sachs spacetime, this mass is independent of the two-surfaces [507]. Interestingly enough, although these spheres cannot be shrunk to a point (thus, the mass cannot be interpreted as ‘the three-volume integral of some mass density’), the time derivative of the Penrose mass looks like the mass conservation equation. It is minus the pressure times the rate of change of the three-volume of a sphere in flat space with the same area as \({\mathcal S}\) [515]. In conformally-flat spacetimes [510] the two-surface twistors are just the global twistors restricted to \({\mathcal S}\), and the Hermitian scalar product is constant on \({\mathcal S}\). Thus, the norm-mass is well defined.

The construction works nicely, even if global twistors exist only on a, e.g., spacelike hypersurface Σ containing \({\mathcal S}\). These are the three-surface twistors [510, 512], which are solutions of certain (overdetermined) elliptic partial-differential equations, called the three-surface twistor equations, on Σ. These equations are completely integrable (i.e., they admit the maximal number of linearly-independent solutions, namely four) if and only if Σ, with its intrinsic metric and extrinsic curvature, can be embedded, at least locally, into some conformally-flat spacetime [512]. Such hypersurfaces are called noncontorted. It might be interesting to note that the noncontorted hypersurfaces can also be characterized as the critical points of the Chern-Simons functional, built from the real Sen connection on the Lorentzian vector bundle or from the three-surface twistor connection on the twistor bundle over Σ [66, 495]. Returning to the quasi-local mass calculations, Tod showed that in vacuum the kinematical twistor for a two-surface \({\mathcal S}\) in a noncontorted Σ depends only on the homology class of \({\mathcal S}\). In particular, if \({\mathcal S}\) can be shrunk to a point, then the corresponding kinematical twistor is vanishing. Since Σ is noncontorted, \({\mathcal S}\) is also noncontorted, and hence the norm-mass is well defined. This implies that the Penrose mass in the Schwarzschild solution is the Schwarzschild mass for any noncontorted two-surface that can be deformed into a round sphere, and it is zero for those that do not go round the black hole [514]. Thus, in particular, the Penrose mass can be zero even in curved spacetimes.

A particularly interesting class of noncontorted hypersurfaces is that of the conformally-flat time-symmetric initial data sets. Tod considered Wheeler’s solution of the time-symmetric vacuum constraints describing n ‘points at infinity’ (or, in other words, n − 1 black holes) and two-surfaces in such a hypersurface [510]. He found that the mass is zero if \({\mathcal S}\) does not go around any black hole; it is the mass \({M_i}\) of the i-th black hole if \({\mathcal S}\) links precisely the i-th black hole; it is \({M_i} + {M_j} - {M_i}{M_j}/{d_{ij}} + {\mathcal O}(1/d_{ij}^2)\) if \({\mathcal S}\) links precisely the i-th and the j-th black holes, where \({d_{ij}}\) is some appropriate measure of the distance between the black holes; and so on. Thus, the mass of the i-th and j-th holes as a single object is less than the sum of the individual masses, in complete agreement with our physical intuition that the potential energy of the composite system should contribute to the total energy with negative sign.
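As a rough numerical illustration of the two-hole result, the sketch below evaluates the quoted expression with the \({\mathcal O}(1/d_{ij}^2)\) remainder dropped. The function names are hypothetical, and the truncation is our assumption; the numbers are only meaningful when \({d_{ij}}\) is large compared with the masses.

```python
def linked_pair_mass(m_i, m_j, d_ij):
    """Leading-order mass for a surface linking exactly two holes.

    Implements M_i + M_j - M_i*M_j/d_ij; the O(1/d_ij**2) terms of the
    full expression are neglected here (an assumption of this sketch).
    """
    if d_ij <= 0.0:
        raise ValueError("the separation d_ij must be positive")
    return m_i + m_j - m_i * m_j / d_ij


def interaction_energy(m_i, m_j, d_ij):
    """Difference between the pair mass and the sum of the individual masses."""
    return linked_pair_mass(m_i, m_j, d_ij) - (m_i + m_j)


if __name__ == "__main__":
    # Two holes of masses 1 and 2 at separation 20 (illustrative values only).
    print(linked_pair_mass(1.0, 2.0, 20.0))
    print(interaction_energy(1.0, 2.0, 20.0))
```

The interaction term is negative for positive masses, matching the interpretation as a potential energy.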

Beig studied the general conformally-flat time-symmetric initial data sets describing n ‘points at infinity’ [62]. He found a symmetric trace-free and divergence-free tensor field \({T^{ab}}\) and, for any conformal Killing vector \({\xi ^a}\) of the data set, defined the two-surface flux integral P(ξ) of \({T^{ab}}{\xi _b}\) on \({\mathcal S}\). He showed that P(ξ) is conformally invariant, depends only on the homology class of \({\mathcal S}\), and, apart from numerical coefficients, for the ten (locally-existing) conformal Killing vectors, these are just the components of the kinematical twistor derived by Tod in [510] (and discussed in the previous paragraph). In particular, Penrose’s mass in Beig’s approach is proportional to the length of the P’s with respect to the Cartan-Killing metric of the conformal group of the hypersurface.

Tod calculated the quasi-local mass for a large class of axisymmetric two-surfaces (cylinders) in various LRS Bianchi and Kantowski-Sachs cosmological models [515] and more general cylindrically-symmetric spacetimes [517]. In all these cases the two-surfaces are noncontorted, and the construction works. A technically interesting feature of these calculations is that the two-surfaces have edges, i.e., they are not smooth submanifolds. The twistor equation is solved on the three smooth pieces of the cylinder separately, and the resulting spinor fields are required to be continuous at the edges. This matching reduces the number of linearly-independent solutions to four. The projection parts of the resulting twistors, the \({\rm{i}}{\Delta _{{A\prime}A}}{\lambda ^A}{\rm{s}}\), are not continuous at the edges. It turns out that the cylinders can be classified invariantly to be hyperbolic, parabolic, or elliptic. Then the structure of the quasi-local mass expressions is not simply ‘density’ × ‘volume’, but is proportional to a ‘type factor’ f(L) as well, where L is the coordinate length of the cylinder. In the hyperbolic, parabolic, and elliptic cases this factor is \(\sinh (\omega L)/(\omega L)\), 1, and \(\sin (\omega L)/(\omega L)\), respectively, where ω is an invariant of the cylinder. The various types are interpreted as the presence of a positive, zero, or negative potential energy. In the elliptic case the mass may be zero for finite cylinders. On the other hand, for static perfect fluid spacetimes (hyperbolic case) the quasi-local mass is positive. A particularly interesting spacetime is that describing cylindrical gravitational waves, whose presence is detected by the Penrose mass. In all these cases the determinant-mass has also been calculated and found to coincide with the norm-mass. A numerical investigation of the axisymmetric Brill waves on the Schwarzschild background is presented in [87]. It was found that the quasi-local mass is positive, and it is very sensitive to the presence of the gravitational waves.
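The behaviour of the ‘type factor’ for the cylinders is easy to tabulate. The following sketch evaluates only the factor f(L) quoted above for the three classes; it does not compute the full quasi-local mass, and the function name and string labels are our own choices for illustration.

```python
import math


def type_factor(kind, omega, length):
    """Type factor f(L) multiplying 'density' x 'volume' for a cylinder.

    kind   : "hyperbolic", "parabolic" or "elliptic"
    omega  : the invariant of the cylinder
    length : the coordinate length L of the cylinder
    """
    if kind not in ("hyperbolic", "parabolic", "elliptic"):
        raise ValueError("unknown cylinder type: " + repr(kind))
    x = omega * length
    if kind == "parabolic" or x == 0.0:
        # All three expressions tend to 1 as omega * L tends to zero.
        return 1.0
    if kind == "hyperbolic":
        return math.sinh(x) / x
    return math.sin(x) / x


if __name__ == "__main__":
    for kind in ("hyperbolic", "parabolic", "elliptic"):
        print(kind, type_factor(kind, 1.0, 1.0))
    # The elliptic factor has its first zero at omega * L = pi.
    print(type_factor("elliptic", 2.0, math.pi / 2.0))
```

The hyperbolic factor exceeds one, the elliptic factor is smaller than one, and the latter vanishes (up to rounding) at ωL = π, which is how a finite elliptic cylinder can have zero mass.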

Another interesting issue is the Penrose inequality for black holes (see Section 13.2.1). Tod shows [513, 514] that for static black holes the Penrose inequality holds if the mass of the black hole is defined to be the Penrose quasi-local mass of the spacelike cross section \({\mathcal S}\) of the event horizon. The trick here is that \({\mathcal S}\) is totally geodesic and conformal to the unit sphere, and hence, it is noncontorted and the Penrose mass is well defined. Then, the Penrose inequality will be a Sobolev-type inequality for a non-negative function on the unit sphere. This inequality is tested numerically in [87].

Apart from the cuts of ℐ+ in radiative spacetimes, all the two-surfaces discussed so far were noncontorted. The spacelike cross section of the event horizon of the Kerr black hole provides a contorted two-surface [516]. Thus, although the kinematical twistor can be calculated for this, the construction in its original form cannot yield any mass expression. The original construction has to be modified.

7.2.6 Small surfaces

The properties of the Penrose construction that we have discussed are very remarkable and promising. However, the small surface calculations clearly show some unwanted features of the original construction [511, 313, 560], and force its modification.

First, although the small spheres are contorted in general, the leading term of the pointwise Hermitian scalar product is constant: \({\lambda ^A}{\Delta _{A{A\prime}}}{{\bar \omega}^{{A\prime}}} - {{\bar \omega}^{{A\prime}}}{\Delta _{{A\prime}A}}{\lambda ^A}\) for any two-surface twistors \({Z^\alpha} = ({\lambda ^A},{\rm{i}}{\Delta _{{A\prime}A}}{\lambda ^A})\) and \({W^\alpha} = ({\omega ^A},{\rm{i}}{\Delta _{{A\prime}A}}{\omega ^A})\) [511, 313]. Since in nonvacuum spacetimes the kinematical twistor has only the ‘four-momentum part’ in the leading \({\mathcal O}({r^3})\)-order with \({P_a} = {{4\pi} \over 3}{r^3}{T_{ab}}{t^b}\), the Penrose mass, calculated with the norm above, is just the expected mass in the leading \({\mathcal O}({r^3})\) order. Thus, it is positive if the dominant energy condition is satisfied. On the other hand, in vacuum the structure of the kinematical twistor is

$${A_{\alpha \beta}} = \left({\begin{array}{*{20}c} {2{\rm{i}}{\lambda _{AB}}} & {{P_A}^{B{\prime}}} \\ {{P^{A{\prime}}}_B} & 0 \\ \end{array}} \right) + {\mathcal O}\,({r^6}),$$
(7.15)

where \({\lambda _{AB}} = {\mathcal O}({r^5})\) and \({P_{A{A\prime}}} = {2 \over {45G}}{r^5}{\psi _{ABCD}}{\chi _{{A\prime}{B\prime}{C\prime}{D\prime}}}{t^{B{B\prime}}}{t^{C{C\prime}}}{t^{D{D\prime}}}\) with \({\chi _{ABCD}}: = {\psi _{ABCD}} - 4{{\bar \psi}_{{A\prime}{B\prime}{C\prime}{D\prime}}}{t^{{A\prime}}}{\,_A}{t^{{B\prime}}}_B{t^{{C\prime}}}_C{t^{{D\prime}}}_D\). In particular, in terms of the familiar conformal electric and magnetic parts of the curvature the leading term in the time component of the four-momentum is \({P_{A{A\prime}}}{t^{A{A\prime}}} = {1 \over {45G}}{H_{ab}}({H^{ab}} - {\rm{i}}{E^{ab}})\). Then, the corresponding norm-mass, in the leading order, can even be complex! For an \({{\mathcal S}_r}\) in the t = const. hypersurface of the Schwarzschild spacetime, this is zero (as it must be in light of the results of Section 7.2.5, because this is a noncontorted spacelike hypersurface), but for a general small two-sphere not lying in such a hypersurface, \({P_{A{A\prime}}}\) is real and spacelike, and hence, \({m^2} < 0\). In the Kerr spacetime, \({P_{A{A\prime}}}\) itself is complex [511, 313].

7.3 The modified constructions

Independently of the results of the small-sphere calculations, Penrose claims that in the Schwarzschild spacetime the quasi-local mass expression should yield the same zero value on two-surfaces, contorted or not, which do not surround the black hole. (For the motivations and the arguments, see [422].) Thus, the original construction should be modified, and the negative results for the small spheres above strengthened this need. A much more detailed review of the various modifications is given by Tod in [516].

7.3.1 The ‘improved’ construction with the determinant

A careful analysis of the roots of the difficulties led Penrose [422, 426] (see also [511, 313, 516]) to suggest the modified definition for the kinematical twistor

$${A{\prime}_{\alpha \beta}}{Z^\alpha}{W^\beta}: = {{\rm{i}} \over {8\pi G}}\oint\nolimits_{\mathcal S} {\eta \,{\lambda ^A}{\omega ^B}{R_{A\,Bcd}}},$$
(7.16)

where η is a constant multiple of the determinant ν in Eq. (7.7). Since on noncontorted two-surfaces the determinant ν is constant, for such surfaces \({A{\prime}_{\alpha \beta}}\) reduces to \({A_{\alpha \beta}}\), and hence, all the nice properties proven for the original construction on noncontorted two-surfaces are shared by \({A{\prime}_{\alpha \beta}}\). The quasi-local mass calculated from Eq. (7.16) for small spheres (in fact, for small ellipsoids [313]) in vacuum is vanishing in the fifth order. Thus, apparently, the difficulties have been resolved. However, as Woodhouse pointed out, there is an essential ambiguity in the (nonvanishing, sixth-order) quasi-local mass [560]. In fact, the structure of the modified kinematical twistor has the form (7.15) with vanishing \({P^{{A\prime}}}_B\) and \({P_A}^{{B\prime}}\) but with nonvanishing \({\lambda _{AB}}\) in the fifth order. Then, in the quasi-local mass (in the leading sixth order) there will be a term coming from the (presumably nonvanishing) sixth-order part of \({P^{{A\prime}}}_B\) and \({P_A}^{{B\prime}}\) and the constant part of the Hermitian scalar product, and another coming from the fifth-order \({\lambda _{AB}}\) and the still ambiguous \({\mathcal O}(r)\)-order part of the Hermitian metric.

7.3.2 Modification through Tod’s expression

These anomalies led Penrose to modify \({A{\prime}_{\alpha \beta}}\) slightly [423]. This modified form is based on Tod’s form of the kinematical twistor:

$${A^{{\prime}{\prime}}_{\alpha \beta}}{Z^\alpha}{W^\beta}: = {1 \over {4\pi G}}\oint\nolimits_{\mathcal S} {{{\bar \gamma}^{A{\prime}B{\prime}}}[{\rm{i}}{\Delta _{A{\prime}A}}(\sqrt \eta {\lambda ^A})]\;[{\rm{i}}{\Delta _{B{\prime}B}}(\sqrt \eta {\omega ^B})]\;d{\mathcal S}}.$$
(7.17)

The quasi-local mass on small spheres coming from \({A^{{\prime}{\prime}}_{\alpha \beta}}\) is positive [516].

7.3.3 Mason’s suggestions

A beautiful property of the original construction was its connection with the Hamiltonian formulation of the theory [357]. Unfortunately, such a simple Hamiltonian interpretation is lacking for the modified constructions. Although the form of Eq. (7.17) is that of the integral of the Nester-Witten 2-form, and the spinor fields \(\sqrt \eta {\lambda ^A}\) and \({\rm{i}}{\Delta _{{A\prime}A}}(\sqrt \eta {\lambda ^A})\) could still be considered as the spinor constituents of the ‘quasi-Killing vectors’ of the two-surface \({\mathcal S}\), their structure is not so simple, because the factor η itself depends on all four of the independent solutions of the two-surface twistor equation in a rather complicated way.

To have a simple Hamiltonian interpretation, Mason suggested further modifications [357, 358]. He considers the four solutions \(\lambda _i^A,i = 1, \ldots, 4\), of the two-surface twistor equations, and uses these solutions in the integral (7.14) of the Nester-Witten 2-form. Since \({H_{\mathcal S}}\) is a Hermitian bilinear form on the space of the spinor fields (see Section 8), he obtains 16 real quantities as the components of the 4 × 4 Hermitian matrix \({E_{ij}}: = {H_{\mathcal S}}[{\lambda _i},{{\bar \lambda}_j}]\). However, it is not clear how the four ‘quasi-translations’ of \({\mathcal S}\) should be found among the 16 vector fields \(\lambda _i^A\bar \lambda _j^{{A\prime}}\) (called ‘quasi-conformal Killing vectors’ of \({\mathcal S}\)) for which the corresponding quasi-local quantities could be considered as the components of the quasi-local energy-momentum. Nevertheless, this suggestion leads us to the next class of quasi-local quantities.

8 Approaches Based on the Nester-Witten 2-Form

We saw in Section 3.2 that

  • both the ADM and Bondi-Sachs energy-momenta can be re-expressed by the integral of the Nester-Witten 2-form \(u{(\lambda, \bar \mu)_{ab}}\),

  • the proof of the positivity of the ADM and Bondi-Sachs masses is relatively simple in terms of the two-component spinors.

Thus, from a pragmatic point of view, it seems natural to search for the quasi-local energy-momentum in the form of the integral of the Nester-Witten 2-form. Now we will show that

  • the integral of Møller’s tetrad superpotential for the energy-momentum, coming from his tetrad Lagrangian (3.5), is just the integral of \(u{({\lambda ^{\underline A}},{\bar \lambda ^{{{\underline B}{\prime}}}})_{ab}}\), where \(\{\lambda _A^{\underline A}\}\) is a normalized spinor dyad.

Hence, all the quasi-local energy-momenta based on the integral of the Nester-Witten 2-form have a natural Lagrangian interpretation in the sense that they are charge integrals of the canonical Noether current derived from Møller’s first-order tetrad Lagrangian.

If \({\mathcal S}\) is any closed, orientable spacelike two-surface and an open neighborhood of \({\mathcal S}\) is time and space orientable, then an open neighborhood of \({\mathcal S}\) is always a trivialization domain of both the orthonormal and the spin frame bundles [500]. Therefore, the orthonormal frame \(\{E_{\underline a}^a\}\) can be chosen to be globally defined on \({\mathcal S}\), and the integral of the dual of Møller’s superpotential, \({1 \over 2}{K^e}{\vee_e}^{ab}{1 \over 2}{\varepsilon _{abcd}}\), appearing on the right-hand side of the superpotential Eq. (3.7), is well defined. If \(({t^a},{\upsilon ^a})\) is a pair of globally-defined normals of \({\mathcal S}\) in the spacetime, then in terms of the geometric objects introduced in Section 4.1, this integral takes the form

$$\begin{array}{*{20}l} {Q\,[{\bf{K}}]: = {1 \over {8\pi G}}\oint\nolimits_{\mathcal S} {{1 \over 2}{K^e}{\vee _e}^{ab}{1 \over 2}{\varepsilon _{abcd}}}} \\ {\quad = {1 \over {8\pi G}}\oint\nolimits_{\mathcal S} {{K^e}\left({- {}^ \bot {\varepsilon _{ea}}{Q_b}^{ba} - {A_e} - {}^ \bot {\varepsilon _{ea}}({\delta _b}E_{\underline b}^b){\eta ^{\underline b \underline a}}E_{\underline a}^a + {\delta _e}({t_a}E_{\underline a}^a){\eta ^{\underline a \underline b}}E_{\underline b}^b{\upsilon _b}} \right)d{\mathcal S}.}} \\ \end{array}$$
(8.1)

The first term on the right is just the dual mean curvature vector of \({\mathcal S}\), the second is the connection one-form on the normal bundle, while the remaining terms are explicitly SO(1, 3) gauge dependent. On the other hand, this is boost gauge invariant (the boost gauge dependence of the second term is compensated by the last one), and depends on the tetrad field and the vector field \({K^a}\) given only on \({\mathcal S}\), but is independent of the way in which they are extended off the surface. As we will see, the general form of other quasi-local energy-momentum expressions shows some resemblance to Eq. (8.1).

Then, suppose that the orthonormal basis is built from a normalized spinor dyad, i.e., \(E_{\underline a}^a = \sigma _{\underline a}^{\underline A {{\underline B}{\prime}}}\varepsilon _{\underline A}^A\bar \varepsilon _{{{\underline B}{\prime}}}^{{A{\prime}}}\), where \(\sigma _{\underline a}^{\underline A {{\underline B}{\prime}}}\) are the SL(2, ℂ) Pauli matrices (divided by \(\sqrt 2\)) and \(\{\varepsilon _{\underline A}^A\}, \underline A = 0,1\), is a normalized spinor basis. A straightforward calculation yields the following remarkable expression for the dual of Møller’s superpotential:

$${1 \over 4}\sigma _{\underline A \,\underline {B{\prime}}}^{\underline a}E_{\underline a}^e{\vee _e}^{ab}{1 \over 2}{\varepsilon _{abcd}} = u\,{({\varepsilon _{\underline A}},{\bar \varepsilon _{\underline {B{\prime}}}})_{cd}} + \overline {u{{({\varepsilon _{\underline B}},{{\bar \varepsilon}_{\underline {A{\prime}}}})}_{cd}}},$$
(8.2)

where the overline denotes complex conjugation. Thus, the real part of the Nester-Witten 2-form, and hence, by Eq. (3.11), apart from an exact 2-form, the Nester-Witten 2-form itself, built from the spinors of a normalized spinor basis, is just the superpotential 2-form derived from Møller’s first-order tetrad Lagrangian [500].

Next we will discuss some general properties of the integral of \(u{(\lambda, \bar \mu)_{ab}}\), where \({\lambda _A}\) and \({\mu _A}\) are arbitrary spinor fields on \({\mathcal S}\). Then, in the integral \({H_{\mathcal S}}[\lambda, \bar \mu ]\), defined by Eq. (7.14), only the tangential derivative of \({\lambda _A}\) appears. (\({\mu _A}\) is involved in \({H_{\mathcal S}}[\lambda, \bar \mu ]\) algebraically.) Thus, by Eq. (3.11), \({H_{\mathcal S}}:{C^\infty}({\mathcal S},{{\rm{S}}_A}) \times {C^\infty}({\mathcal S},{{\rm{S}}_A}) \rightarrow {\rm{\mathbb C}}\) is a Hermitian scalar product on the (infinite-dimensional complex) vector space of smooth spinor fields on \({\mathcal S}\). Thus, in particular, the spinor fields in \({H_{\mathcal S}}[\lambda, \bar \mu ]\) need be defined only on \({\mathcal S}\), and \(\overline {{H_{\mathcal S}}[\lambda, \bar \mu ]} = {H_{\mathcal S}}[\mu, \bar \lambda ]\) holds. A remarkable property of \({{H_{\mathcal S}}}\) is that if \({\lambda _A}\) is a constant spinor field on \({\mathcal S}\) with respect to the covariant derivative \({\Delta _e}\), then \({H_{\mathcal S}}[\lambda, \bar \mu ] = 0\) for any smooth spinor field \({\mu _A}\) on \({\mathcal S}\). Furthermore, if \(\lambda _A^{\underline A} = (\lambda _A^0,\lambda _A^1)\) is any pair of smooth spinor fields on \({\mathcal S}\), then for any constant SL(2, ℂ) matrix \({\Lambda _{\underline A}}^{\underline B}\) one has \({H_{\mathcal S}}[{\lambda ^{\underline C}}{\Lambda _{\underline C}}^{\underline A},{{\bar \lambda}^{\underline {{D{\prime}}}}}{{\bar \Lambda}_{\underline {{D{\prime}}}}}^{{{\underline B}{\prime}}}] = {H_{\mathcal S}}[{\lambda ^{\underline C}},{{\bar \lambda}^{{{\underline D}{\prime}}}}]{\Lambda _{\underline C}}^{\underline A}{{\bar \Lambda}_{{{\underline D}{\prime}}}}^{{{\underline B}{\prime}}}\), i.e., the integrals \({H_{\mathcal S}}[{\lambda ^{\underline A}},{{\bar \lambda}^{{{\underline B}{\prime}}}}]\) transform as the spinor components of a real Lorentz vector over the two-complex-dimensional space spanned by \(\lambda _A^0\) and \(\lambda _A^1\). 
Therefore, to have a well-defined quasi-local energy-momentum vector we have to specify some two-dimensional subspace \({{\bf{S}}^{\underline A}}\) of the infinite-dimensional space \({C^\infty}({\mathcal S},{{\rm{S}}_A})\) and a symplectic metric \({\varepsilon _{\underline A \underline B}}\) thereon. Thus, underlined capital Roman indices will be referring to this space. The elements of this subspace would be interpreted as the spinor constituents of the ‘quasi-translations’ of the surface \({\mathcal S}\). Note, however, that in general the symplectic metric \({\varepsilon _{\underline A \underline B}}\) need not be related to the pointwise symplectic metric \({\varepsilon _{AB}}\) on the spinor spaces, i.e., the spinor fields \(\lambda _A^0\) and \(\lambda _A^1\) that span \({{\bf{S}}^{\underline A}}\) are not expected to form a normalized spin frame on \({\mathcal S}\). Since in Møller’s tetrad approach it is natural to choose the orthonormal vector basis to be a basis in which the translations have constant components (just like the constant orthonormal bases in Minkowski spacetime, which are bases in the space of translations), the spinor fields \(\lambda _A^{\underline A}\) could also be interpreted as the spinor basis that should be used to construct the orthonormal vector basis in Møller’s superpotential (3.6). In this sense the choice of the subspace \({{\bf{S}}^{\underline A}}\) and the metric \({\varepsilon _{\underline A \underline B}}\) is just a gauge reduction (see Section 3.3.3).

Once the spin space \(({{\rm{S}}^{\underline A}},{\varepsilon _{\underline A \underline B}})\) is chosen, the quasi-local energy-momentum is defined to be \(P_{\mathcal S}^{\underline A \underline {{B{\prime}}}}: = {H_{\mathcal S}}[{\lambda ^{\underline A}},{{\bar \lambda}^{\underline {{B{\prime}}}}}]\), and the corresponding quasi-local mass \({m_{\mathcal S}}\) is given by \(m_{\mathcal S}^2: = {\varepsilon _{\underline A \underline B}}{\varepsilon _{{{\underline A}{\prime}}{{\underline B}{\prime}}}}P_{\mathcal S}^{\underline A {{\underline A}{\prime}}}P_{\mathcal S}^{\underline B {{\underline B}{\prime}}}\). In particular, if one of the spinor fields \(\lambda _A^{\underline A}\), e.g., \(\lambda _A^0\), is constant on \({\mathcal S}\) (which means that the geometry of \({\mathcal S}\) is considerably restricted), then \(P_{\mathcal S}^{{{00}{\prime}}} = P_{\mathcal S}^{{{01}{\prime}}} = P_{\mathcal S}^{{{10}{\prime}}} = 0\), and hence, the corresponding mass \({m_{\mathcal S}}\) is zero. If both \(\lambda _A^0\) and \(\lambda _A^1\) are constant (in particular, when they are the restrictions to \({\mathcal S}\) of the two constant spinor fields in the Minkowski spacetime), then \(P_{\mathcal S}^{\underline A \underline {{B{\prime}}}}\) itself is vanishing.
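The algebra behind the mass formula can be checked with a short numerical sketch. It treats \(P_{\mathcal S}^{\underline A \underline {{B{\prime}}}}\) simply as a given 2 × 2 Hermitian matrix (no surface integral is evaluated), contracts it with the symplectic metrics, and verifies that the result is invariant under constant SL(2, ℂ) transformations. The normalization ε₀₁ = 1, the particular form of the Pauli matrices and all function names are assumptions made for this illustration only.

```python
import math

# Assumed normalization: epsilon_{01} = -epsilon_{10} = 1 on both the
# unprimed and the primed (complex conjugate) spin space.
EPS = [[0, 1], [-1, 0]]


def spinor_components(p):
    """Map a real four-vector (p0, p1, p2, p3) to a 2x2 Hermitian matrix,
    using Pauli matrices divided by sqrt(2) (one common convention)."""
    p0, p1, p2, p3 = p
    s = 1.0 / math.sqrt(2.0)
    return [[s * (p0 + p3), s * complex(p1, -p2)],
            [s * complex(p1, p2), s * (p0 - p3)]]


def mass_squared(P):
    """m^2 = eps_{AB} eps_{A'B'} P^{AA'} P^{BB'}, which equals 2 det(P)."""
    total = 0
    for a in range(2):
        for b in range(2):
            for ap in range(2):
                for bp in range(2):
                    total += EPS[a][b] * EPS[ap][bp] * P[a][ap] * P[b][bp]
    # For Hermitian P the imaginary part cancels, so the real part is returned.
    return complex(total).real


def sl2c_transform(P, L):
    """P'^{AB'} = P^{CD'} L_C^A conj(L)_{D'}^{B'}, with L[c][a] = L_C^A."""
    return [[sum(P[c][d] * L[c][a] * L[d][b].conjugate()
                 for c in range(2) for d in range(2))
             for b in range(2)] for a in range(2)]


if __name__ == "__main__":
    P = spinor_components((5.0, 0.0, 0.0, 3.0))
    L = [[2.0, 1j], [0.0, 0.5]]  # a constant matrix with unit determinant
    print(mass_squared(P))
    print(mass_squared(sl2c_transform(P, L)))
    # Only P^{11'} nonzero, as when one basis spinor is constant on S.
    print(mass_squared([[0.0, 0.0], [0.0, 3.0]]))
```

With these conventions the contraction reduces to twice the determinant of the matrix, which is why a single nonvanishing component \(P_{\mathcal S}^{{{11}{\prime}}}\) gives zero mass, and why the mass does not depend on the choice of normalized basis in the spin space.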

Therefore, to summarize, the only thing that needs to be specified is the spin space \(({{\rm{S}}^{\underline A}},{\varepsilon _{\underline A \underline B}})\), and the various suggestions for the quasi-local energy-momentum based on the integral of the Nester-Witten 2-form correspond to the various choices for this spin space.

8.1 The Ludvigsen-Vickers construction

8.1.1 The definition

Suppose that spacetime is asymptotically flat at future null infinity, and the closed spacelike two-surface \({\mathcal S}\) can be joined to future null infinity by a smooth null hypersurface \({\mathcal N}\). Let \({{\mathcal S}_\infty}: = {\mathcal N} \cap {{\mathscr I}^ +}\), the cut defined by the intersection of \({\mathcal N}\) with future null infinity. Then, the null geodesic generators of \({\mathcal N}\) define a smooth bijection between \({\mathcal S}\) and the cut \({{\mathcal S}_\infty}\) (and hence, in particular, \({\mathcal S} \approx {S^2}\)). We saw in Section 4.2.4 that on the cut \({{\mathcal S}_\infty}\) at the future null infinity we have the asymptotic spin space \((S_\infty ^{\underline A},{\varepsilon _{\underline A \underline B}})\). The suggestion of Ludvigsen and Vickers [346] for the spin space \(({{\rm{S}}^{\underline A}},{\varepsilon _{\underline A \underline B}})\) on \({\mathcal S}\) is to import the two independent solutions of the asymptotic twistor equations, i.e., the asymptotic spinors, from the future null infinity back to the two-surface along the null geodesic generators of the null hypersurface \({\mathcal N}\). Their propagation equations, given both in terms of spinors and in the GHP formalism, are

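$${o^A}{\bar o^{A{\prime}}}({\nabla _{AA{\prime}}}{\lambda _B})\,{o^B} = \text{þ}{\lambda _0} = 0,$$
(8.3)
$${\iota ^A}{\bar o^{A{\prime}}}({\nabla _{AA{\prime}}}{\lambda _B})\,{o^B} = {\eth ^{\prime}}{\lambda _0} + \rho {\lambda _1} = 0.$$
(8.4)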

Here \(\varepsilon _{\rm{A}}^A = \{{o^A},{\iota ^A}\}\) is the GHP spin frame introduced in Section 4.2.4, and by Eq. (4.6) the second half of these equations is just \({\Delta ^ +}\lambda = 0\). It should be noted that the choice of Eqs. (8.3) and (8.4) for the propagation law of the spinors is ‘natural’ in the sense that in flat spacetime they reduce to the condition of parallel propagation, and Eq. (8.4) is just the appropriate part of the asymptotic twistor equation of Bramson. We call the spinor fields obtained by using Eqs. (8.3) and (8.4) the Ludvigsen-Vickers spinors on \({\mathcal S}\). Thus, given an asymptotic spinor at infinity, we propagate its zeroth component (with respect to the basis \(\varepsilon _{\rm{A}}^A\)) to \({\mathcal S}\) by Eq. (8.3). This will be the zeroth component of the Ludvigsen-Vickers spinor. Then, its first component will be determined by Eq. (8.4), provided ρ is not vanishing on any open subset of \({\mathcal S}\). If \(\lambda _A^0\) and \(\lambda _A^1\) are Ludvigsen-Vickers spinors on \({\mathcal S}\) obtained by Eqs. (8.3) and (8.4) from two asymptotic spinors that formed a normalized spin frame, then, by considering \(\lambda _A^0\) and \(\lambda _A^1\) to be normalized in \({{\bf{S}}^{\underline A}}\), we define the symplectic metric \({\varepsilon _{\underline A \underline B}}\) on \({{\rm{S}}^{\underline A}}\) to be that with respect to which \(\lambda _A^0\) and \(\lambda _A^1\) form a normalized spin frame. Note, however, that this symplectic metric is not connected with the symplectic fiber metric \({\varepsilon _{AB}}\) of the spinor bundle \({{\bf{S}}^A}({\mathcal S})\) over \({\mathcal S}\). Indeed, in general, \(\lambda _A^{\underline A}\lambda _B^{\underline B}{\varepsilon ^{AB}}\) is not constant on \({\mathcal S}\), and hence, \({\varepsilon _{AB}}\) does not determine any symplectic metric on the space \({{\bf{S}}^{\underline A}}\) of the Ludvigsen-Vickers spinors. In Minkowski spacetime the two Ludvigsen-Vickers spinors are just the restriction to \({\mathcal S}\) of the two constant spinors.

8.1.2 Remarks on the validity of the construction

Before discussing the usual questions about the properties of the construction (positivity, monotonicity, the various limits, etc.), we should make some general remarks. First, it is obvious that the Ludvigsen-Vickers energy-momentum in its above form cannot be defined in a spacetime that is not asymptotically flat at null infinity. Thus, their construction is not genuinely quasi-local, because it depends not only on the (intrinsic and extrinsic) geometry of \({\mathcal S}\), but on the global structure of the spacetime as well. In addition, the requirement of the smoothness of the null hypersurface \({\mathcal N}\) connecting the two-surface to the null infinity is a very strong restriction. In fact, for general (even for convex) two-surfaces in a general asymptotically flat spacetime, conjugate points will develop along the (outgoing) null geodesics orthogonal to the two-surface [417, 240]. Thus, either the two-surface must be near enough to the future null infinity (in the conformal picture), or the spacetime and the two-surface must be nearly spherically symmetric (or the former cannot be ‘very much curved’ and the latter cannot be ‘very much bent’).

This limitation implies that, in general, the original construction above does not have a small sphere limit. However, using the same propagation equations (8.3) and (8.4) one could define a quasi-local energy-momentum for small spheres [346, 84]. The basic idea is that there is a spin space at the vertex p of the null cone in the spacetime whose spacelike cross section is the actual two-surface, and the Ludvigsen-Vickers spinors on \({\mathcal S}\) are defined by propagating these spinors from the vertex p to \({\mathcal S}\) via Eqs. (8.3) and (8.4). This definition works in arbitrary spacetimes, but the two-surface cannot be extended to a large sphere near the null infinity, and it is still not genuinely quasi-local.

8.1.3 Monotonicity, mass-positivity and the various limits

Once the Ludvigsen-Vickers spinors are given on a spacelike two-surface \({{\mathcal S}_r}\) of constant affine parameter r in the outgoing null hypersurface \({\mathcal N}\), then they are uniquely determined on any other spacelike two-surface \({{\mathcal S}_{{r{\prime}}}}\) in \({\mathcal N}\), as well, i.e., the propagation law, Eqs. (8.3) and (8.4), defines a natural isomorphism between the spaces of the Ludvigsen-Vickers spinors on different two-surfaces of constant affine parameter in the same \({\mathcal N}\). (r need not be a Bondi-type coordinate.) This makes it possible to compare the components of the Ludvigsen-Vickers energy-momenta on different surfaces. In fact [346], if the dominant energy condition is satisfied (at least on \({\mathcal N}\)), then for any Ludvigsen-Vickers spinor \({\lambda _A}\) and affine parameter values \({r_1} \leq {r_2}\), one has \({H_{{{\mathcal S}_{{r_1}}}}}[\lambda, \bar \lambda ] \leq {H_{{{\mathcal S}_{{r_2}}}}}[\lambda, \bar \lambda ]\), and the difference \({H_{{{\mathcal S}_{{r_2}}}}}[\lambda, \bar \lambda ] - {H_{{{\mathcal S}_{{r_1}}}}}[\lambda, \bar \lambda ] \geq 0\) can be interpreted as the energy flux of the matter and the gravitational radiation through \({\mathcal N}\) between \({{\mathcal S}_{{r_1}}}\) and \({{\mathcal S}_{{r_2}}}\). Thus, both \(P_{{{\mathcal S}_r}}^{{{00}{\prime}}}\) and \(P_{{{\mathcal S}_r}}^{{{11}{\prime}}}\) are increasing with r (‘mass-gain’). A similar monotonicity property (‘mass-loss’) can be proven on ingoing null hypersurfaces, but then the propagation equations (8.3) and (8.4) should be replaced by \({\text{þ}^{\prime}}{\lambda _1} = 0\) and \(- {\Delta ^ -}\lambda : = \eth {\lambda _1} + {\rho ^{\prime}}{\lambda _0} = 0\). Using these equations the positivity of the Ludvigsen-Vickers mass was proven in various special cases in [346].

Concerning the positivity properties of the Ludvigsen-Vickers mass and energy, first it is obvious by the remarks on the nature of the propagation equations (8.3) and (8.4) that in Minkowski spacetime the Ludvigsen-Vickers energy-momentum is vanishing. However, in the proof of the non-negativity of the Dougan-Mason energy (discussed in Section 8.2) only the \({\lambda _A} \in \ker {\Delta ^ +}\) part of the propagation equations is used. Therefore, as realized by Bergqvist [79], the Ludvigsen-Vickers energy-momenta (both based on the asymptotic and the point spinors) are also future directed and nonspacelike, if \({\mathcal S}\) is the boundary of some compact spacelike hypersurface Σ on which the dominant energy condition is satisfied and \({\mathcal S}\) is weakly future convex (or at least ρ ≤ 0). Similarly, the Ludvigsen-Vickers definitions share the rigidity properties proven for the Dougan-Mason energy-momentum [488]: under the same conditions the vanishing of the energy-momentum implies the flatness of the domain of dependence D(Σ) of Σ.

In the weak field approximation [346] the difference \({H_{{{\mathcal S}_{{r_2}}}}}[\lambda, \bar \lambda ] - {H_{{{\mathcal S}_{{r_1}}}}}[\lambda, \bar \lambda ]\) is just the integral of \(4\pi G{T_{ab}}{l^a}{\lambda ^B}{{\bar \lambda}^{{B{\prime}}}}\) on the portion of \({\mathcal N}\) between the two two-surfaces, where \({T_{ab}}\) is the linearized energy-momentum tensor. The increment of \({H_{{{\mathcal S}_r}}}[\lambda, \bar \lambda ]\) on \({\mathcal N}\) is due only to the flux of the matter energy-momentum.

Since the Bondi-Sachs energy-momentum can be written as the integral of the Nester-Witten 2-form on the cut in question at the null infinity with the asymptotic spinors, it is natural to expect that the first version of the Ludvigsen-Vickers energy-momentum tends to that of Bondi and Sachs. It was shown in [346, 457] that this expectation is, in fact, correct. The Ludvigsen-Vickers mass was calculated for large spheres both for radiative and stationary spacetimes with \({r^{- 2}}\) and \({r^{- 3}}\) accuracy, respectively, in [455, 457].

Finally, on a small sphere of radius r in nonvacuum the second definition gives [84] the expected result (4.9), while in vacuum [84, 494] it is

$$P_{{{\mathcal S}_r}}^{\underline A \underline {B{\prime}}} = {1 \over {10G}}{r^5}{T^a}_{bcd}{t^b}{t^c}{t^d}\varepsilon _A^{\underline A}\bar \varepsilon _{A{\prime}}^{\underline {B{\prime}}} + {4 \over {45G}}{r^6}{t^e}({\nabla _e}{T^a}_{bcd}){t^b}{t^c}{t^d}\varepsilon _A^{\underline A}\bar \varepsilon _{A{\prime}}^{\underline {B{\prime}}} + {\mathcal O}({r^7}).$$
(8.5)

Thus, its leading terms in nonvacuum and in vacuum are the energy-momentum of the matter fields and the Bel-Robinson momentum, respectively, seen by the observer \(t^a\) at the vertex p. Hence, assuming that the matter fields satisfy the dominant energy condition, for small spheres this is an explicit proof that the Ludvigsen-Vickers quasi-local energy-momentum is future pointing and nonspacelike.

8.2 The Dougan-Mason constructions

8.2.1 Holomorphic/antiholomorphic spinor fields

The original construction of Dougan and Mason [172] was introduced on the basis of sheaf-theoretical arguments. Here we follow a slightly different, more ‘pedestrian’ approach, based mostly on [488, 490].

Following Dougan and Mason we define the spinor field \(\lambda_A\) to be antiholomorphic when \({m^e}{\nabla _e}{\lambda _A} = {m^e}{\Delta _e}{\lambda _A} = 0\), or holomorphic if \({\bar m^e}{\nabla _e}{\lambda _A} = {\bar m^e}{\Delta _e}{\lambda _A} = 0\). Thus, this notion of holomorphicity/antiholomorphicity refers to the connection \(\Delta_e\) on \({\mathcal S}\). While the notion of the holomorphicity/antiholomorphicity of a function on \({\mathcal S}\) does not depend on whether the \(\Delta_e\) or \(\delta_e\) operator is used, for tensor or spinor fields it does. Although the vectors \(m^a\) and \({\bar m^a}\) are not uniquely determined (because their phase is not fixed), the notion of holomorphicity/antiholomorphicity is well defined, because the defining equations are homogeneous in \(m^a\) and \({{\bar m}^a}\). Next, suppose that there are at least two independent solutions of \({\bar m^e}{\Delta _e}{\lambda _A} = 0\). If \(\lambda_A\) and \(\mu_A\) are any two such solutions, then \({\bar m^e}{\Delta _e}({\lambda _A}{\mu _B}{\varepsilon ^{AB}}) = 0\), and hence by Liouville’s theorem \({\lambda _A}{\mu _B}{\varepsilon ^{AB}}\) is constant on \({\mathcal S}\). If this constant is not zero, then we call \({\mathcal S}\) generic; if it is zero then \({\mathcal S}\) will be called exceptional. Obviously, a holomorphic \(\lambda_A\) on a generic \({\mathcal S}\) cannot have any zero, and any two independent holomorphic spinor fields, e.g., \(\lambda_A^{\underline 0}\) and \(\lambda_A^{\underline 1}\), span the spin space at each point of \({\mathcal S}\) (and they can be chosen to form a normalized spinor dyad with respect to \(\varepsilon_{AB}\) on the whole of \({\mathcal S}\)). Expanding any holomorphic spinor field in this frame, the expansion coefficients turn out to be holomorphic functions, and hence, constant. Therefore, on generic two-surfaces there are precisely two independent holomorphic spinor fields. In the GHP formalism, the condition of the holomorphicity of the spinor field \(\lambda_A\) is that its components \((\lambda_0, \lambda_1)\) be in the kernel of \({{\mathcal H}^ +}: = {\Delta ^ +} \oplus {{\mathcal T}^ +}\). Thus, for generic two-surfaces ker \({{\mathcal H}^ +}\) with the constant \({\varepsilon _{\underline A \underline B}}\) would be a natural candidate for the spin space \(\left({{{\bf{S}}^{\underline A}},\,{\varepsilon _{\underline A \underline B}}} \right)\) above. For exceptional two-surfaces, the kernel space ker \({{\mathcal H}^ +}\) is either two-dimensional but does not inherit a natural spin space structure, or it is more than two-dimensional.
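
The constancy argument used above can be spelled out in one line. The following is only a sketch, assuming that \(\Delta_e\) obeys the Leibniz rule and annihilates \(\varepsilon^{AB}\) (as is the case for a connection obtained by projecting the spacetime Levi-Civita derivative to \({\mathcal S}\)), and that the dyad is normalized by \(\varepsilon^{AB}\lambda_A^{\underline 0}\lambda_B^{\underline 1} = 1\); the field \(\nu_A\) and the coefficients \(c_0\), \(c_1\) are names introduced here for illustration:

$$\bar m^e\Delta_e\left(\lambda_A\mu_B\varepsilon^{AB}\right) = \left(\bar m^e\Delta_e\lambda_A\right)\mu_B\varepsilon^{AB} + \lambda_A\left(\bar m^e\Delta_e\mu_B\right)\varepsilon^{AB} = 0,$$

$$\nu_A = c_0\lambda_A^{\underline 0} + c_1\lambda_A^{\underline 1}\quad \Rightarrow \quad c_0 = \varepsilon^{AB}\nu_A\lambda_B^{\underline 1},\qquad c_1 = \varepsilon^{AB}\lambda_A^{\underline 0}\nu_B.$$

Both coefficients are symplectic products of holomorphic spinor fields, so by the first relation they are holomorphic functions on the compact surface \({\mathcal S}\) and hence constant.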

Similarly, the symplectic inner product of any two antiholomorphic spinor fields is also constant, so one can define generic and exceptional two-surfaces as well, and on generic surfaces there are precisely two independent antiholomorphic spinor fields. The condition of the antiholomorphicity of \(\lambda_A\) is \(\lambda \in \ker \,{{\mathcal H}^ -}: = \ker ({\Delta ^ -} \oplus {{\mathcal T}^ -})\). Then \({{\bf{S}}^{\underline A}} = \ker \,{{\mathcal H}^ -}\) could also be a natural choice. Note that the spinor fields, whose holomorphicity/antiholomorphicity is defined, are unprimed, and these correspond to the antiholomorphicity/holomorphicity, respectively, of the primed spinor fields of Dougan and Mason. Thus, the main question is whether there exist generic two-surfaces, and if they do, whether they are ‘really generic’, i.e., whether most of the physically important surfaces are generic or not.

8.2.2 The genericity of the generic two-surfaces

\({{\mathcal H}^ \pm}\) are first-order elliptic differential operators on certain vector bundles over the compact two-surface \({\mathcal S}\), and their index can be calculated: \({\rm{index}}({{\mathcal H}^ \pm}) = 2(1 - g)\), where g is the genus of \({\mathcal S}\). Therefore, for \({\mathcal S} \approx {S^2}\) there are at least two linearly-independent holomorphic and at least two linearly-independent antiholomorphic spinor fields. The existence of the holomorphic/antiholomorphic spinor fields on higher-genus two-surfaces is not guaranteed by the index theorem. Similarly, the index theorem does not guarantee that \({\mathcal S} \approx {S^2}\) is generic either. If the geometry of \({\mathcal S}\) is very special, then the two holomorphic/antiholomorphic spinor fields (which are independent as solutions of \({{\mathcal H}^ \pm}\lambda = 0\)) might be proportional to each other. For example, future marginally-trapped surfaces (i.e., for which ρ = 0) are exceptional from the point of view of holomorphic spinors, and past marginally-trapped surfaces (ρ′ = 0) from the point of view of antiholomorphic spinors. Furthermore, there are surfaces with at least three linearly-independent holomorphic/antiholomorphic spinor fields. However, small generic perturbations of the geometry of an exceptional two-surface \({\mathcal S}\) with \(S^2\) topology make \({\mathcal S}\) generic.
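
As a worked illustration of how the index formula is used, recall the standard definition of the index of an elliptic operator on a compact manifold as the difference of the dimensions of its kernel and cokernel (dimensions are understood here as complex dimensions, which is an assumption about the convention):

$${\rm index}({\mathcal H}^\pm) = \dim\ker{\mathcal H}^\pm - \dim{\rm coker}\,{\mathcal H}^\pm = 2(1 - g)\quad \Rightarrow \quad \dim\ker{\mathcal H}^\pm \geq 2(1 - g).$$

$$g = 0:\ \dim\ker{\mathcal H}^\pm \geq 2,\qquad g = 1:\ \dim\ker{\mathcal H}^\pm \geq 0,\qquad g = 2:\ \dim\ker{\mathcal H}^\pm \geq -2.$$

Only for spherical topology is the lower bound nontrivial. For \(g \geq 1\) the bound is empty, which is why the index theorem alone guarantees nothing on higher-genus surfaces, and even for \(g = 0\) it bounds the kernel from below without excluding a larger kernel or proportional solutions.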

Finally, we note that several first-order differential operators can be constructed from the chiral irreducible parts \(\Delta^\pm\) and \({{\mathcal T}^ \pm}\) of \(\Delta_e\), given explicitly by Eq. (4.6). However, only four of them, the Dirac-Witten operator \(\Delta := \Delta^+ \oplus \Delta^-\), the twistor operator \({\mathcal T}: = {{\mathcal T}^ +} \oplus {{\mathcal T}^ -}\), and the holomorphy and antiholomorphy operators \({{\mathcal H}^ \pm}\), are elliptic (which ellipticity, together with the compactness of \({\mathcal S}\), would guarantee the finiteness of the dimension of their kernel), and it is only \({{\mathcal H}^ \pm}\) that have a two-complex-dimensional kernel in the generic case. This purely mathematical result gives some justification for the choices of Dougan and Mason. The spinor fields \(\lambda _A^{\underline A}\) that should be used in the Nester-Witten 2-form are either holomorphic or antiholomorphic. This construction does not work for exceptional two-surfaces.

8.2.3 Positivity properties

One of the most important properties of the Dougan-Mason energy-momenta is that they are future-pointing nonspacelike vectors, i.e., the corresponding masses and energies are non-negative. Explicitly [172], if \({\mathcal S}\) is the boundary of some compact spacelike hypersurface Σ on which the dominant energy condition holds, and if, furthermore, \({\mathcal S}\) is weakly future convex (in fact, ρ ≤ 0 is enough), then the holomorphic Dougan-Mason energy-momentum is a future-pointing nonspacelike vector, and, analogously, the antiholomorphic energy-momentum is future pointing and nonspacelike if ρ′ ≥ 0. (For the functional analytic techniques and tools to give a complete positivity proof, see, e.g., [182].) As Bergqvist [79] stressed (and we noted in Section 8.1.3), Dougan and Mason used only the \(\Delta^+\lambda = 0\) (and, in the antiholomorphic construction, the \(\Delta^-\lambda = 0\)) half of the ‘propagation law’ in their positivity proof. The other half is needed only to ensure the existence of two spinor fields. Thus, that might be Eq. (8.3) of the Ludvigsen-Vickers construction, or \({{\mathcal T}^ +}\lambda = 0\) in the holomorphic Dougan-Mason construction, or even \({{\mathcal T}^ +}\lambda = k\sigma {\prime}{\psi{\prime}_2}{\lambda _0}\) for some constant k, a ‘deformation’ of the holomorphicity considered by Bergqvist [79]. In fact, the propagation law may even be \({\bar m^a}{\Delta _a}{\lambda _B} = {\tilde f_B}{}^C{\lambda _C}\) for any spinor field \({\tilde f_B}{}^C\) satisfying \({\pi ^{- B}}_A{\tilde f_B}{}^C = {\tilde f_A}{}^B{\pi ^{+ C}}_B = 0\). This ensures the positivity of the energy under the same conditions and that \(\varepsilon^{AB}\lambda_A\mu_B\) is still constant on \({\mathcal S}\) for any two solutions \(\lambda_A\) and \(\mu_A\), making it possible to define the norm of the resulting energy-momentum, i.e., the mass.

In the asymptotically flat spacetimes the positive energy theorems have a rigidity part as well, namely the vanishing of the energy-momentum (and, in fact, even the vanishing of the mass) implies flatness. There are analogous theorems for the Dougan-Mason energy-momenta as well [488, 490]. Namely, under the conditions of the positivity proof

  1. \(P_{\mathcal S}^{{\underline A}{{\underline B}\prime}}\) is zero iff D(Σ) is flat, which is also equivalent to the vanishing of the quasi-local energy, \({E_{\mathcal S}}: = {1 \over {\sqrt 2}}(P_{\mathcal S}^{00{\prime}} + P_{\mathcal S}^{11{\prime}}) = 0\), and

  2. \(P_{\mathcal S}^{{\underline A}{{\underline B}\prime}}\) is null (i.e., the quasi-local mass is zero) iff D(Σ) is a pp-wave geometry and the matter is pure radiation.

In particular [498], for a coupled Einstein-Yang-Mills system (with compact, semisimple gauge groups) the zero quasi-local mass configurations are precisely the pp-wave solutions found by Güven [230]. Therefore, in contrast to the asymptotically flat cases, the vanishing of the mass does not imply the flatness of D(Σ). Since, as we will see below, the Dougan-Mason masses tend to the ADM mass at spatial infinity, there is a seeming contradiction between the rigidity part of the positive mass theorems and the result 2 above. However, this is only an apparent contradiction. In fact, according to one of the possible positive mass proofs [38], the vanishing of the ADM mass implies the existence of a constant null vector field on D(Σ), and then the flatness follows from the incompatibility of the conditions of the asymptotic flatness and the existence of a constant null vector field: The only asymptotically flat spacetime admitting a constant null vector field is flat spacetime.

These results show some sort of rigidity of the matter + gravity system (where the matter satisfies the dominant energy condition), even at the quasi-local level, which is much more manifest in the following equivalent form of results 1 and 2. Under the same conditions, D(Σ) is flat if and only if there exist two linearly-independent spinor fields on \({\mathcal S}\) that are constant with respect to \(\Delta_e\), and D(Σ) is a pp-wave geometry with pure radiation as matter if and only if there exists a \(\Delta_e\)-constant spinor field on \({\mathcal S}\) [490]. Thus, the full information that D(Σ) is flat/pp-wave is completely encoded, not only in the usual initial data on Σ, but in the geometry of the boundary of Σ, as well. In Section 13.5 we return to the discussion of this phenomenon, where we will see that, assuming \({\mathcal S}\) is future and past convex, the whole line element of D(Σ) (and not only the information that it is some pp-wave geometry) is determined by the two-surface data on \({\mathcal S}\).

Comparing results 1 and 2 above with the properties of the quasi-local energy-momentum (and angular momentum) listed in Section 2.2.3, the similarity is obvious: \(P_{\mathcal S}^{{\underline A}{{\underline B}\prime}} = 0\) characterizes the ‘quasi-local vacuum state’ of general relativity, while \({m_{\mathcal S}} = 0\) is equivalent to ‘pure radiative quasi-local states’. The equivalence of \({E_{\mathcal S}} = 0\) and the flatness of D(Σ) shows that curvature always yields positive energy, or, in other words, with this notion of energy no classical symmetry breaking can occur in general relativity. The ‘quasi-local ground states’ (defined by \({E_{\mathcal S}} = 0\)) are just the ‘quasi-local vacuum states’ (defined by the trivial value of the field variables on D(Σ)) [488], in contrast, for example, to the well known \(\phi^4\) theories.
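
To see why the vanishing of the quasi-local energy alone already forces the whole energy-momentum to vanish, it may help to write the spinor components in terms of vector components. The following is a sketch under assumptions that are not fixed by the text above: the standard correspondence between Hermitian spinor components and a real vector \((P^0, P^1, P^2, P^3)\), signature (+, −, −, −), and the mass defined as the Lorentzian norm of \(P_{\mathcal S}^{{\underline A}{{\underline B}\prime}}\):

$$P_{\mathcal S}^{00'} = {1 \over \sqrt 2}(P^0 + P^3),\quad P_{\mathcal S}^{11'} = {1 \over \sqrt 2}(P^0 - P^3),\quad P_{\mathcal S}^{01'} = {1 \over \sqrt 2}(P^1 + {\rm i}P^2),\quad P_{\mathcal S}^{10'} = {1 \over \sqrt 2}(P^1 - {\rm i}P^2),$$

$$E_{\mathcal S} = {1 \over \sqrt 2}\left(P_{\mathcal S}^{00'} + P_{\mathcal S}^{11'}\right) = P^0,\qquad m_{\mathcal S}^2 = 2\left(P_{\mathcal S}^{00'}P_{\mathcal S}^{11'} - P_{\mathcal S}^{01'}P_{\mathcal S}^{10'}\right) = (P^0)^2 - (P^1)^2 - (P^2)^2 - (P^3)^2.$$

For a future-pointing nonspacelike vector, \(P^0 \geq \sqrt{(P^1)^2 + (P^2)^2 + (P^3)^2}\), so \(E_{\mathcal S} = 0\) leaves no room for a nonzero spatial part, whereas \(m_{\mathcal S} = 0\) only requires the vector to be null.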

8.2.4 The various limits

Both definitions give the same standard expression for round spheres [171]. Although the limit of the Dougan-Mason masses for round spheres in Reissner-Nordström spacetime gives the correct irreducible mass of the Reissner-Nordström black hole on the horizon, the constructions do not work on the surface of bifurcation itself, because that is an exceptional two-surface. Unfortunately, without additional restrictions (e.g., the spherical symmetry of the two-surfaces in a spherically-symmetric spacetime) the mass of the exceptional two-surfaces cannot be defined in a limiting process, because, in general, the limit depends on the family of generic two-surfaces approaching the exceptional one [490].

Both definitions give the same, expected results in the weak field approximation and, for large spheres, at spatial infinity; both tend to the ADM energy-momentum [172]. (The Newtonian limit in the covariant Newtonian spacetime was studied in [564].) In nonvacuum both definitions give the same, expected expression (4.9) for small spheres. In vacuum they coincide in the \(r^5\) order with that of Ludvigsen and Vickers, but in the \(r^6\) order they differ from each other. The holomorphic definition gives Eq. (8.5), but in the analogous expression for the antiholomorphic energy-momentum, the numerical coefficient 4/(45G) is replaced by 1/(9G) [171]. The Dougan-Mason energy-momenta have also been calculated for large spheres of constant Bondi-type radial coordinate value r near future null infinity [171]. While the antiholomorphic construction tends to the Bondi-Sachs energy-momentum, the holomorphic one diverges in general. In stationary spacetimes they coincide and both give the Bondi-Sachs energy-momentum. At past null infinity it is the holomorphic construction that reproduces the Bondi-Sachs energy-momentum, and the antiholomorphic construction diverges.

We close this section with some caution and general comments on a potential gauge ambiguity in the calculation of the various limits. By the definition of the holomorphic and antiholomorphic spinor fields they are associated with the two-surface \({\mathcal S}\) only. Thus, if \({\mathcal S}{\prime}\) is another two-surface, then there is no natural isomorphism between the spaces of, for example, the antiholomorphic spinor fields ker \({{\mathcal H}^ -}({\mathcal S})\) on \({\mathcal S}\) and ker \({{\mathcal H}^ -}({\mathcal S}{\prime})\) on \({{\mathcal S}{\prime}}\), even if both surfaces are generic and hence there are isomorphisms between them (see Footnote 12). This (apparently ‘only theoretical’) fact has serious pragmatic consequences. In particular, in the small or large sphere calculations we compare the energy-momenta, and hence, the holomorphic or antiholomorphic spinor fields as well, on different surfaces. For example [494], in the small-sphere approximation every spin coefficient and spinor component in the GHP dyad and metric component in some fixed coordinate system \((\zeta, \,\bar \zeta)\) is expanded as a series in r, as \({\lambda _{\mathbf{A}}}(r,\,\zeta, \,\bar \zeta) = {\lambda _{\mathbf{A}}}^{(0)}(\zeta, \,\bar \zeta) + r{\lambda _{\mathbf{A}}}^{(1)}(\zeta, \,\bar \zeta) + \cdots + {r^k}{\lambda _{\bf{A}}}^{(k)}(\zeta, \,\bar \zeta) + {\mathcal O}({r^{k + 1}})\). Substituting all such expansions and the asymptotic solutions of the Bianchi identities for the spin coefficients and metric functions into the differential equations defining the holomorphic/antiholomorphic spinors, we obtain a hierarchical system of differential equations for the expansion coefficients \({\lambda _{\mathbf{A}}}^{(0)}\), \({\lambda _{\mathbf{A}}}^{(1)}\), …, etc. It turns out that the solutions of this system of equations with accuracy \(r^k\) form a 2k-complex-dimensional space, rather than the expected two-complex-dimensional one. 2(k − 1) of these 2k solutions are ‘gauge’ solutions, and they correspond in the approximation with given accuracy to the unspecified isomorphism between the space of the holomorphic/antiholomorphic spinor fields on surfaces of different radii. Obviously, similar ‘gauge’ solutions appear in the large sphere expansions, too. Therefore, without additional gauge fixing, in the expansion of a quasi-local quantity only the leading nontrivial term will be gauge-independent. In particular, the \(r^6\)-order correction in Eq. (8.5) for the Dougan-Mason energy-momenta is well defined only as a consequence of a natural gauge choice (see Footnote 13). Similarly, the higher-order corrections in the large sphere limit of the antiholomorphic Dougan-Mason energy-momentum are also ambiguous unless a ‘natural’ gauge choice is made. Such a choice is possible in stationary spacetimes.
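
The counting of the ‘gauge’ solutions quoted above is simple arithmetic, tabulated here for the first few orders purely as an illustration (the labels \(N_{\rm total}\), \(N_{\rm phys}\) and \(N_{\rm gauge}\) are introduced here and are not the notation of the cited papers):

$$N_{\rm total}(k) = 2k = \underbrace{2}_{N_{\rm phys}} + \underbrace{2(k - 1)}_{N_{\rm gauge}},\qquad k = 1:\ 2 = 2 + 0,\qquad k = 2:\ 4 = 2 + 2,\qquad k = 3:\ 6 = 2 + 4.$$

At the lowest accuracy no gauge solutions are present, and every additional order brings in two more. This is consistent with the statement that, without gauge fixing, only the leading nontrivial term of an expansion is unambiguous.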

8.3 A specific construction for the Kerr spacetime

Logically, this specific construction should be presented in Section 12, but the technique that it is based on justifies its placement here.

By investigating the propagation law, Eqs. (8.3) and (8.4), of Ludvigsen and Vickers for the Kerr spacetimes, Bergqvist and Ludvigsen constructed a natural flat (but nonsymmetric) metric connection [85]. Writing the new covariant derivative in the form \({\tilde \nabla _{AA{\prime}}}{\lambda _B} = {\nabla _{AA{\prime}}}{\lambda _B} + {\Gamma _{AA{\prime}B}}^C{\lambda _C}\), the ‘correction’ term \({\Gamma _{AA\prime B}}^C\) could be given explicitly in terms of the GHP spinor dyad (adapted to the two principal null directions), the spin coefficients ρ, τ and ρ′, and the curvature component \(\psi_2\). \({\Gamma _{AA\prime B}}^C\) admits a potential [86]: \({\Gamma _{AA\prime BC}} = - {\nabla _{(C}}^{B{\prime}}{H_{B)}}_{AA{\prime}B{\prime}}\), where \({H_{ABA{\prime}B{\prime}}}: = {1 \over 2}{\rho ^{- 3}}(\rho + \bar \rho){\psi _2}{o_A}{o_B}{\bar o_{A{\prime}}}{\bar o_{B{\prime}}}\). However, this potential has the structure \(H_{ab} = f{l_a}{l_b}\) appearing in the form of the metric \({g_{ab}} = g_{ab}^0 + f{l_a}{l_b}\) for the Kerr-Schild spacetimes, where \(g_{ab}^0\) is the flat metric. In fact, the flat connection \({\tilde \nabla _e}\) above could be introduced for general Kerr-Schild metrics [234], and the corresponding ‘correction term’ \({\Gamma _{AA\prime BC}}\) could be used to easily find the Lánczos potential for the Weyl curvature [18].

Since the connection \({\tilde \nabla _{AA{\prime}}}\) is flat and annihilates the spinor metric εAB, there are precisely two linearly-independent spinor fields, say \(\lambda _A^0\) and \(\lambda _A^1\), that are constant with respect to \({\tilde \nabla _{A{A\prime}}}\) and form a normalized spinor dyad. These spinor fields are asymptotically constant. Thus, it is natural to choose the spin space \(({{\mathbf{S}}^{\underline A}},\,{\varepsilon _{\underline A \underline B}})\) to be the space of the \({\tilde \nabla _a}\)-constant spinor fields, irrespectively of the two-surface \({\mathcal S}\).

A remarkable property of these spinor fields is that the Nester-Witten 2-form built from them is closed: \(du({\lambda ^{\underline A}},\,{\bar \lambda ^{{{\underline B}\prime}}}) = 0\). This implies that the quasi-local energy-momentum depends only on the homology class of \({\mathcal S}\), i.e., if \({{\mathcal S}_1}\) and \({{\mathcal S}_2}\) are two-surfaces, such that they form the boundary of some hypersurface in M, then \(P_{{{\mathcal S}_1}}^{\underline A {{\underline B}\prime}} = P_{{{\mathcal S}_2}}^{\underline A {{\underline B}\prime}}\), and if \({\mathcal S}\) is the boundary of some hypersurface, then \(P_{\mathcal S}^{\underline A {{\underline B}\prime}} = 0\). In particular, for two-spheres that can be shrunk to a point, the energy-momentum is zero, but for those that can be deformed to a cut of the future null infinity, the energy-momentum is that of Bondi and Sachs.
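
The homology statement is a direct application of Stokes’ theorem. As a sketch, let \(\kappa\) denote the constant normalization factor in front of the Nester-Witten integral (its value is not needed here), and let \({\mathcal V}\) be a hypersurface with \(\partial{\mathcal V} = {{\mathcal S}_1} - {{\mathcal S}_2}\), i.e., \({{\mathcal S}_2}\) taken with reversed orientation; both \(\kappa\) and \({\mathcal V}\) are labels introduced for this illustration:

$$P_{{{\mathcal S}_1}}^{\underline A {{\underline B}\prime}} - P_{{{\mathcal S}_2}}^{\underline A {{\underline B}\prime}} = \kappa\left(\oint_{{{\mathcal S}_1}} u({\lambda ^{\underline A}},{\bar \lambda ^{{{\underline B}\prime}}}) - \oint_{{{\mathcal S}_2}} u({\lambda ^{\underline A}},{\bar \lambda ^{{{\underline B}\prime}}})\right) = \kappa\int_{\mathcal V} du({\lambda ^{\underline A}},{\bar \lambda ^{{{\underline B}\prime}}}) = 0.$$

Taking \({{\mathcal S}_2}\) to be empty gives the second statement: a two-surface that bounds a hypersurface on its own carries zero energy-momentum.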

9 Quasi-Local Spin Angular Momentum

In this section we review three specific quasi-local spin-angular-momentum constructions that are (more or less) ‘quasi-localizations’ of Bramson’s expression at null infinity. Thus, the quasi-local spin angular momentum for the closed, orientable spacelike two-surface \({\mathcal S}\) will be sought in the form (3.16). Before considering the specific constructions themselves, we summarize the most important properties of the general expression of Eq. (3.16). Since the most detailed discussion of Eq. (3.16) is probably given in [494, 496], the subsequent discussions will be based on them.

First, observe that the integral depends on the spinor dyad algebraically, thus it is enough to specify the dyad only at the points of \({\mathcal S}\). Obviously, \(J_{\mathcal S}^{\underline A\underline B}\) transforms like a symmetric second-rank spinor under constant SL(2, ℂ) transformations of the dyad \(\{\lambda _A^{\underline A}\}\). Second, suppose that the spacetime is flat, and let \(\{\lambda _A^{\underline A}\}\) be constant. Then the corresponding one-form basis \(\{\vartheta _a^{\underline a}\}\) is the constant Cartesian one, which consists of exact one-forms. Then, since the Bramson superpotential \(w({\lambda ^{\underline A}},{\lambda ^{\underline B}})\) is the anti-self-dual part (in the name indices) of \(\vartheta _a^{\underline a}\vartheta _b^{\underline b} - \vartheta _b^{\underline a}\vartheta _a^{\underline b}\), which is also exact, for such spinor bases, Eq. (3.16) gives zero. Therefore, the integral of Bramson’s superpotential (3.16) measures the nonintegrability of the one-form basis \(\vartheta _a^{{\underline A}{\underline A'}} = \lambda _A^{\underline A}\bar \lambda _{A'}^{{\underline A'}}\), i.e., \(J_{\mathcal S}^{\underline A\underline B}\) is a measure of how much the actual one-form basis is ‘distorted’ by the curvature relative to the constant basis of Minkowski spacetime.
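
The vanishing in flat spacetime can be made explicit. As a sketch, assume Cartesian coordinates \(x^{\underline a}\) with \(\vartheta^{\underline a} = dx^{\underline a}\); the expression \(\vartheta _a^{\underline a}\vartheta _b^{\underline b} - \vartheta _b^{\underline a}\vartheta _a^{\underline b}\) is then, in index-free notation, the 2-form \(dx^{\underline a} \wedge dx^{\underline b}\):

$$dx^{\underline a} \wedge dx^{\underline b} = d\left(x^{\underline a}\,dx^{\underline b}\right)\quad \Rightarrow \quad \oint_{\mathcal S} dx^{\underline a} \wedge dx^{\underline b} = \oint_{\partial{\mathcal S}} x^{\underline a}\,dx^{\underline b} = 0,$$

because the closed surface \({\mathcal S}\) has no boundary. The anti-self-dual part in the name indices is a constant linear combination of these 2-forms, so its integral vanishes as well.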

Thus, the only question is how to specify a spin frame on \({\mathcal S}\) to be able to interpret \(J_{\mathcal S}^{\underline A\underline B}\) as angular momentum. It seems natural to choose those spinor fields that were used in the definition of the quasi-local energy-momenta in Section 8. At first sight this may appear to be only an ad hoc idea, but, recalling that in Section 8 we interpreted the elements of the spin spaces \(({\bf{S}}^{\underline A},{\varepsilon _{\underline A\underline B}})\) as the ‘spinor constituents of the quasi-translations of \({\mathcal S}\)’, we can justify such a choice. Based on our experience with the superpotentials for the various conserved quantities, the quasi-local angular momentum can be expected to be the integral of something like ‘superpotential’ × ‘quasi-rotation generator’, and the ‘superpotential’ is some expression in the first derivative of the basic variables, actually the tetrad or spinor basis. Since, however, Bramson’s superpotential is an algebraic expression of the basic variables, and the number of the derivatives in the expression for the angular momentum should be one, the angular momentum expressions based on Bramson’s superpotential must contain the derivative of the ‘quasi-rotations’, i.e., (possibly a combination of) the ‘quasi-translations’. Since, however, such an expression cannot be sensitive to the ‘change of the origin’, they can be expected to yield only the spin part of the angular momentum.

The following two specific constructions differ from each other only in the choice for the spin space \(({\bf{S}}^{\underline A},{\varepsilon _{\underline A\underline B}})\), and correspond to the energy-momentum constructions of the previous Section 8. The third construction (valid only in the Kerr spacetimes) is based on the sum of two terms, where one is Bramson’s expression, and uses the spinor fields of Section 8.3. Thus, the present section is not independent of Section 8, and, for the discussion of the choice of the spin spaces \(({\bf{S}}^{\underline A},{\varepsilon _{\underline A\underline B}})\), we refer to that.

Another suggestion for the quasi-local spatial angular momentum, proposed by Liu and Yau [338], will be introduced in Section 10.4.1.

9.1 The Ludvigsen-Vickers angular momentum

Under the conditions that ensured the Ludvigsen-Vickers construction for the energy-momentum would work in Section 8.1, the definition of their angular momentum is straightforward [346]. Since in Minkowski spacetime the Ludvigsen-Vickers spinors are just the restriction to \({\mathcal S}\) of the constant spinor fields, by the general remark above the Ludvigsen-Vickers spin angular momentum is zero in Minkowski spacetime.

Using the asymptotic solution of the Einstein-Maxwell equations in a Bondi-type coordinate system it has been shown in [346] that the Ludvigsen-Vickers spin angular momentum tends to that of Bramson at future null infinity. For small spheres [494] in nonvacuum it reproduces precisely the expected result (4.10), and in vacuum it is

$$J_{{{\mathcal S}_r}}^{\underline A \underline B} = {4 \over {45G}}{r^5}{T_{AA{\prime}BB{\prime}CC{\prime}DD{\prime}}}{t^{AA{\prime}}}{t^{BB{\prime}}}{t^{CC{\prime}}}\left({r{t^{D{\prime}E}}{\varepsilon ^{DF}}\varepsilon _{\left(E \right.}^{\underline A}\varepsilon _{\left. F \right)}^{\underline B}} \right) + {\mathcal O}({r^7}).$$
(9.1)

We stress that in both the vacuum and nonvacuum cases, the factor \(r{t^{D'E}}{\varepsilon ^{DF}}\;{\mathcal E}_{(E}^{\underline A}{\mathcal E}_{F)}^{\underline B}\), interpreted in Section 4.2.2 as an average of the boost-rotation Killing fields that vanish at p, emerges naturally. No (approximate) boost-rotation Killing field was put into the general formulae by hand.

9.2 Holomorphic/antiholomorphic spin angular momenta

Obviously, the spin-angular-momentum expressions based on the holomorphic and antiholomorphic spinor fields [492] on generic two-surfaces are genuinely quasi-local. Since in Minkowski spacetime the restriction of the two constant spinor fields to any two-surface is constant, and hence holomorphic and antiholomorphic at the same time, both the holomorphic and antiholomorphic spin angular momenta are vanishing. Similarly, for round spheres both definitions give zero [496], as would be expected in a spherically-symmetric system. The antiholomorphic spin angular momentum has already been calculated for axisymmetric two-surfaces \({\mathcal S}\), for which the antiholomorphic Dougan-Mason energy-momentum is null, i.e., for which the corresponding quasi-local mass is zero. (As we saw in Section 8.2.3, this corresponds to a pp-wave geometry and pure radiative matter fields on D(Σ) [488, 490].) This null energy-momentum vector turned out to be an eigenvector of the anti-symmetric spin-angular-momentum tensor \(J_{\mathcal S}^{\underline a\underline b}\), which, together with the vanishing of the quasi-local mass, is equivalent to the proportionality of the (null) energy-momentum vector and the Pauli-Lubanski spin [492], where the latter is defined by

$$S_{\mathcal S}^{\underline a}: = {\textstyle{1 \over 2}}{\varepsilon ^{\underline a}}_{\underline b \underline c \underline d}P_{\mathcal S}^{\underline b}J_{\mathcal S}^{\underline c \underline d}.$$
(9.2)

This is a known property of the zero-rest-mass fields in Poincaré invariant quantum field theories [231].
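
A quick consistency check on Eq. (9.2), using only the total antisymmetry of \(\varepsilon_{\underline a \underline b \underline c \underline d}\), shows why proportionality to the energy-momentum is possible only in the zero-mass case (the proportionality factor \(\chi\) is a symbol introduced here for illustration):

$$S_{\mathcal S}^{\underline a}P_{{\mathcal S}\underline a} = {\textstyle{1 \over 2}}\varepsilon_{\underline a \underline b \underline c \underline d}P_{\mathcal S}^{\underline a}P_{\mathcal S}^{\underline b}J_{\mathcal S}^{\underline c \underline d} = 0,$$

$$S_{\mathcal S}^{\underline a} = \chi P_{\mathcal S}^{\underline a}\quad \Rightarrow \quad 0 = S_{\mathcal S}^{\underline a}P_{{\mathcal S}\underline a} = \chi\,P_{\mathcal S}^{\underline a}P_{{\mathcal S}\underline a} = \chi\,m_{\mathcal S}^2.$$

Thus the Pauli-Lubanski spin is always orthogonal to the energy-momentum, and a nonzero \(\chi\) requires \(m_{\mathcal S} = 0\), in line with the null case discussed above.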

Both the holomorphic and antiholomorphic spin angular momenta were calculated for small spheres [494]. In nonvacuum the holomorphic spin angular momentum reproduces the expected result (4.10), and, apart from a minus sign, the antiholomorphic construction does also. In vacuum, both definitions give exactly Eq. (9.1).

In general the antiholomorphic and the holomorphic spin angular momenta are diverging near the future null infinity of Einstein-Maxwell spacetimes as r and \(r^2\), respectively. However, the coefficient of the diverging term in the antiholomorphic expression is just the spatial part of the Bondi-Sachs energy-momentum. Thus, the antiholomorphic spin angular momentum is finite in the center-of-mass frame, and hence it seems to describe only the spin part of the gravitational field. In fact, the Pauli-Lubanski spin (9.2) built from this spin angular momentum and the antiholomorphic Dougan-Mason energy-momentum is always finite, free of the ‘gauge’ ambiguities discussed in Section 8.2.4, and is built only from the gravitational data, even in the presence of electromagnetic fields. In stationary spacetimes both constructions are finite and coincide with the ‘standard’ expression (4.15). Thus, the antiholomorphic spin angular momentum defines an intrinsic angular momentum at the future null infinity. Note that this angular momentum is free of supertranslation ambiguities, because it is defined on the given cut in terms of the solutions of elliptic differential equations. These solutions can be interpreted as the spinor constituents of certain boost-rotation BMS vector fields, but the definition of this angular momentum is not based on them [496].

9.3 A specific construction for the Kerr spacetime

The angular momentum of Bergqvist and Ludvigsen [86] for the Kerr spacetime is based on their special flat, nonsymmetric but metric, connection explained briefly in Section 8.3. But their idea is not simply the use of the two \({{\tilde \nabla}_e}\)-constant spinor fields in Bramson’s superpotential. Rather, in the background of their approach there are twistor-theoretical ideas. (The twistor-theoretic aspects of the analogous flat connection for the general Kerr-Schild class are discussed in [234].)

The main idea is that, while the energy-momentum is a single four-vector in the dual of the Hermitian subspace of \({{\bf{S}}^{\underline A}} \otimes {{{\bf{\bar S}}}^{\underline B{\prime}}}\), the angular momentum is not only an anti-symmetric tensor over the same space, but should depend on the ‘origin’, a point in a four-dimensional affine space \(M_0\) as well, and should transform in a specific way under the translation of the ‘origin’. Bergqvist and Ludvigsen defined the affine space \(M_0\) to be the space of the solutions \(X_a\) of \({{\tilde \nabla}_a}{X_b} = {g_{ab}} - {H_{ab}}\), and showed that \(M_0\) is, in fact, a real, four-dimensional affine space. Then, for a given \(X_{AA'}\), to each \({{\tilde \nabla}_a}\)-constant spinor field \(\lambda^A\) they associate a primed spinor field by \(\mu_{A'} := X_{A'A}\lambda^A\). This \(\mu_{A'}\) turns out to satisfy the modified valence-one twistor equation \({{\tilde \nabla}_{A(A{\prime}}}{\mu _{B{\prime})}} = - {H_{AA{\prime}BB{\prime}}}{\lambda ^B}\). Finally, they form the 2-form

$$W\,{(X,{\lambda ^{\underline A}},{\lambda ^{\underline B}})_{ab}}: = {\rm{i}}\left[ {\lambda _A^{\underline A}{\nabla _{B\,B{\prime}}}\left({{X_{A{\prime}C}}{\varepsilon ^{CD}}\lambda _D^{\underline B}} \right) - \lambda _B^{\underline A}{\nabla _{A\,A{\prime}}}\left({{X_{B{\prime}C}}{\varepsilon ^{CD}}\lambda _D^{\underline B}} \right) + {\varepsilon _{A{\prime}B{\prime}}}\lambda _{\left(A \right.}^{\underline A}\lambda _{\left. B \right)}^{\underline B}} \right],$$
(9.3)

and define the angular momentum \(J_{\mathcal S}^{\underline A\underline B}(X)\) with respect to the origin \(X_a\) as 1/(8πG) times the integral of \(W{(X,{\lambda ^{\underline A}},{\lambda ^{\underline B}})_{ab}}\) on some closed, orientable spacelike two-surface \({\mathcal S}\). Since this \(W_{ab}\) is closed, \(\nabla_{[a}W_{bc]} = 0\) (similar to the Nester-Witten 2-form in Section 8.3), the integral \(J_{\mathcal S}^{\underline A\underline B}(X)\) depends only on the homology class of \({\mathcal S}\). Under the ‘translation’ \(X_e \mapsto \tilde X_e := X_e + a_e\) of the ‘origin’ by a \({{\tilde \nabla}_a}\)-constant one-form \(a_e\), it transforms as \(J_{\mathcal S}^{\underline A\underline B}(\tilde X) = J_{\mathcal S}^{\underline A\underline B}(X) + {a^{(\underline A}}_{\underline B{\prime}}P_{\mathcal S}^{\underline B)\underline B{\prime}}\), where the components \({a_{\underline A\underline B{\prime}}}\) are taken with respect to the basis \(\{\lambda _A^{\underline A}\}\) in the solution space. Unfortunately, no explicit expression for the angular momentum in terms of the Kerr parameters m and a is given.

10 The Hamilton-Jacobi Method

If one concentrates only on the introduction and study of the properties of quasi-local quantities, and is not interested in the detailed structure of the quasi-local (Hamiltonian) phase space, then perhaps the most natural way to derive the general formulae is to follow the Hamilton-Jacobi method. This was done by Brown and York in deriving their quasi-local energy expression [120, 121]. However, the Hamilton-Jacobi method in itself does not yield any specific construction. Rather, the resulting general expression is similar to a superpotential in the Lagrangian approaches, which should be completed by a choice for the reference configuration and for the generator vector field of the physical quantity (see Section 3.3.3). In fact, the ‘Brown-York quasi-local energy’ is not a single expression with a single well-defined prescription for the reference configuration. The same general formula, combined with several other, mathematically inequivalent definitions of the reference configuration, is still called the ‘Brown-York energy’. A slightly different general expression was used by Kijowski [315], Epp [178], Liu and Yau [338] and Wang and Yau [544]. Although Kijowski follows a different route to derive his expression and the other three are not connected directly to the canonical analysis (and, in particular, to the Hamilton-Jacobi method), the formalism and techniques that are used justify their presentation in this section.

The present section is mainly based on the original papers [120, 121] by Brown and York. Since, however, this is the most popular approach to finding quasi-local quantities and is the subject of very active investigations, especially from the point of view of the applications in black hole physics, this section is perhaps less complete than the previous ones. The expressions of Kijowski, Epp, Liu and Yau and Wang and Yau will be treated in the formalism of Brown and York.

10.1 The Brown-York expression

10.1.1 The main idea

To motivate the main idea behind the Brown-York definition [120, 121], let us first consider a classical mechanical system of n degrees of freedom with configuration manifold Q and Lagrangian L: TQ × ℝ → ℝ (i.e., the Lagrangian is assumed to be first order and may depend on time explicitly). For given initial and final configurations, \((q_1^a,{t_1})\) and \((q_2^a,{t_2})\), respectively, the corresponding action functional is \({I^1}[q(t)]\;: = \int\nolimits_{{t_1}}^{{t_2}} {L({q^a}(t),{{\dot q}^a}(t),t)\;dt}\), where qa(t) is a smooth curve in Q from \({q^a}({t_1}) = q_1^a\) to \({q^a}({t_2}) = q_2^a\) with tangent \({{\dot q}^a}(t)\) at t. (The pair (qa(t), t) may be called a history or world line in the ‘spacetime’ Q × ℝ.) Let (qa(u, t(u)), t(u)) be a smooth one-parameter deformation of this history, for which (qa(0, t(0)), t(0)) = (qa(t), t), and u ∈ (−ϵ, ϵ) for some ϵ > 0. Then, denoting the derivative with respect to the deformation parameter u at u = 0 by δ, one has the well known expression

$$\delta {I^1}[q(t)] = \int\nolimits_{{t_1}}^{{t_2}} {\left({{{\partial L} \over {\partial {q^a}}} - {d \over {dt}}{{\partial L} \over {\partial {{\dot q}^a}}}} \right)} \;(\delta {q^a} - {\dot q^a}\delta t)\;dt + {{\partial L} \over {\partial {{\dot q}^a}}}\delta {q^a}\vert _{{t_1}}^{{t_2}} - \left({{{\partial L} \over {\partial {{\dot q}^a}}}{{\dot q}^a} - L} \right)\;\delta t\vert _{{t_1}}^{{t_2}}.$$
(10.1)

Therefore, introducing the Hamilton-Jacobi principal function \({S^1}(q_1^a,{t_1};q_2^a,{t_2})\) as the value of the action on the solution qa(t) of the equations of motion from \((q_1^a,{t_1})\) to \((q_2^a,{t_2})\), the derivative of S1 with respect to \(q_2^a\) gives the canonical momenta \(p_a^1: = (\partial L/\partial {{\dot q}^a})\), while its derivative with respect to t2 gives minus the energy, \(- {E^1} = - (p_a^1{{\dot q}^a} - L)\), at t2. Obviously, neither the action I1 nor the principal function S1 is unique: I[q(t)] ≔ I1[q(t)] − I0[q(t)] for any I0[q(t)] of the form \(\int\nolimits_{{t_1}}^{{t_2}} {(dh/dt)\;dt}\) with arbitrary smooth function h = h(qa(t), t) is an equally good action for the same dynamics. Clearly, the subtraction term I0[q(t)] alters both the canonical momenta and the energy according to \(p_a^1 \mapsto {p_a} = p_a^1 - (\partial h/\partial {q^a})\) and \({E^1} \mapsto E = {E^1} + (\partial h/\partial t)\), respectively.
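The effect of the subtraction term can be checked numerically on the simplest possible example. The following is a minimal sketch, assuming a free particle of mass `mass`, for which \({S^1} = m{({q_2} - {q_1})^2}/2({t_2} - {t_1})\), and an arbitrarily chosen, purely illustrative function `h`; neither choice comes from the text, and all function names are hypothetical. The principal function is differentiated by central finite differences, so the results can be compared with \(p_a^1 - (\partial h/\partial {q^a})\) and with \({E^1} + (\partial h/\partial t)\).

```python
def principal_function(q1, t1, q2, t2, mass=1.0):
    """Free-particle principal function S^1: the action evaluated on the
    straight-line solution from (q1, t1) to (q2, t2)."""
    return 0.5 * mass * (q2 - q1) ** 2 / (t2 - t1)


def h(q, t):
    """Illustrative (hypothetical) smooth function; I^0 = h(q2, t2) - h(q1, t1)."""
    return 0.3 * q * q * t + 0.7 * t


def shifted_principal_function(q1, t1, q2, t2, mass=1.0):
    """S = S^1 - S^0, where S^0 is the value of the subtraction term I^0."""
    return principal_function(q1, t1, q2, t2, mass) - (h(q2, t2) - h(q1, t1))


def central_difference(f, x, step=1e-6):
    """Second-order accurate numerical derivative of f at x."""
    return (f(x + step) - f(x - step)) / (2.0 * step)


def momentum_and_energy(S, q1, t1, q2, t2, mass=1.0):
    """Return (dS/dq2, -dS/dt2): the canonical momentum and the energy at t2."""
    p = central_difference(lambda q: S(q1, t1, q, t2, mass), q2)
    minus_energy = central_difference(lambda t: S(q1, t1, q2, t, mass), t2)
    return p, -minus_energy


if __name__ == "__main__":
    q1, t1, q2, t2, mass = 0.0, 0.0, 3.0, 2.0, 1.5
    print(momentum_and_energy(principal_function, q1, t1, q2, t2, mass))
    print(momentum_and_energy(shifted_principal_function, q1, t1, q2, t2, mass))
```

For the sample values in the `__main__` block the unshifted momentum and energy are m(q2 − q1)/(t2 − t1) and m(q2 − q1)²/2(t2 − t1)², and the shifted ones differ from them by −∂h/∂q and +∂h/∂t evaluated at the end point.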

10.1.2 The variation of the action and the surface stress-energy tensor

The main idea of Brown and York [120, 121] is to calculate the analogous variation of an appropriate first-order action of general relativity (or of the coupled matter + gravity system) and isolate the boundary term that could be analogous to the energy above. To formulate this idea mathematically, Brown and York considered a compact spacetime domain D with topology Σ × [t1,t2] such that Σ × {t} correspond to compact spacelike hypersurfaces Σt; these form a smooth foliation of D and the two-surfaces \({{\mathcal S}_t}: = \partial {\Sigma _t}\) (corresponding to ∂Σ × {t}) form a foliation of the timelike three-boundary 3B of D. Note that this D is not a globally hyperbolic domain (see footnote 14). To ensure the compatibility of the dynamics with this boundary, the shift vector is usually chosen to be tangent to St on 3B. The orientation of 3B is chosen to be outward pointing, while the normals, both of \({\Sigma _1}: = {\Sigma _{{t_1}}}\) and of \({\Sigma _2}: = {\Sigma _{{t_2}}}\), are chosen to be future pointing. The metric and extrinsic curvature on Σt will be denoted, respectively, by hab and χab, and those on 3B by γab and Θab.

The primary requirement of Brown and York on the action is to provide a well-defined variational principle for the Einstein theory. This claim leads them to choose for I1 the ‘trace K action’ (or, in the present notation, the ‘trace χ action’) for general relativity [572, 573, 534], and the action for the matter fields may be included. (For minimal, nonderivative couplings, the presence of the matter fields does not alter the subsequent expressions.) However, as Geoff Hayward pointed out [243], to have a well-defined variational principle, the ‘trace χ action’ should in fact be completed by two two-surface integrals, one on \({{\mathcal S}_1}\) and the other on \({{\mathcal S}_2}\). Otherwise, as a consequence of the edges \({{\mathcal S}_1}\) and \({{\mathcal S}_2}\), called the ‘joints’ (i.e., the nonsmooth parts of the boundary ∂D), the variation of the metric at the points of the edges \({{\mathcal S}_1}\) and \({{\mathcal S}_2}\) could not be arbitrary. (See also [242, 315, 100, 119], where the ‘orthogonal boundaries assumption’ is also relaxed.) Let η1 and η2 be the scalar product of the outward-pointing normal of 3B and the future-pointing normal of Σ1 and of Σ2, respectively. Then, varying the spacetime metric (for the variation of the corresponding principal function S1) they obtained the following:

$$\begin{array}{*{20}c} {\delta {S^1} = \int\nolimits_{{\Sigma _2}} {{1 \over {16\pi G}}\sqrt {\vert h\vert} \,({\chi ^{ab}} - \chi {h^{ab}})\;\delta {h_{ab}}{d^3}x -} \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad} \\ {- \int\nolimits_{{\Sigma _1}} {{1 \over {16\pi G}}\sqrt {\vert h\vert} \,({\chi ^{ab}} - \chi {h^{ab}})\;\delta {h_{ab}}{d^3}x -} \quad \quad \quad \quad \quad \quad \quad} \\ {- \int\nolimits_{{}^3B} {{1 \over {16\pi G}}\sqrt {\vert \gamma \vert} \,({\Theta ^{ab}} - \Theta {\gamma ^{ab}})\;\delta {\gamma _{ab}}\,{d^3}x} - \quad \quad \quad \quad \quad \quad \;\;\;} \\ {\quad - {1 \over {8\pi G}}\oint\nolimits_{{{\mathcal S}_2}} {{{\tanh}^{- 1}}{\eta _2}\delta \sqrt {\vert q\vert} {d^2}x} + {1 \over {8\pi G}}\oint\nolimits_{{{\mathcal S}_1}} {{{\tanh}^{- 1}}{\eta _1}\delta \sqrt {\vert q\vert} {d^2}x}.} \\ \end{array}$$
(10.2)

The first two terms together correspond to the term \(p_a^1\delta {q^a}\vert _{{t_1}}^{{t_2}}\) of Eq. (10.1), and, in fact, the familiar ADM expression for the canonical momentum \({{\tilde p}^{ab}}\) is just \({1 \over {16\pi G}}\sqrt {\vert h\vert} ({\chi ^{ab}} - \chi {h^{ab}})\). The last two terms give the effect of the presence of the nondifferentiable ‘joints’. Therefore, it is the third term that should be analogous to the third term of Eq. (10.1). In fact, roughly, this is proportional to the proper time separation of the ‘instants’ Σ1 and Σ2, and it is reasonable to identify its coefficient as some (quasi-local) analog of the energy. However, just as in the case of the mechanical system, the action (and the corresponding principal function) is not unique, and the principal function should be written as S ≔ S1 − S0, where S0 is assumed to be an arbitrary function of the three-metric on the boundary ∂D = Σ2 ∪ 3B ∪ Σ1. Then

$${\tau ^{ab}}: = - {2 \over {\sqrt {\vert \gamma \vert}}}{{\delta S} \over {\delta {\gamma _{ab}}}} = {1 \over {8\pi G}}({\Theta ^{ab}} - \Theta {\gamma ^{ab}}) + {2 \over {\sqrt {\vert \gamma \vert}}}{{\delta {S^0}} \over {\delta {\gamma _{ab}}}}$$
(10.3)

defines a symmetric tensor field on the timelike boundary 3B, and is called the surface stress-energy tensor. (Since our signature for γab on 3B is (+, −, −) rather than (−, +, +), we should define τab with the extra minus sign, according to Eq. (2.1).) Its divergence with respect to the connection 3 De on 3B determined by γab is proportional to the part γabTbcυc of the energy-momentum tensor, and hence, in particular, τab is divergence-free in vacuum. Therefore, if (3B, γab) admits a Killing vector, say Ka, then, in vacuum

$${Q_{\mathcal S}}\,[{\bf{K}}]: = \oint\nolimits_{\mathcal S} {{K_a}{\tau ^{ab}}{{\bar t}_b}\,d{\mathcal S}},$$
(10.4)

the flux integral of τabKb on any spacelike cross section \({\mathcal S}\) of 3B, is independent of the cross section itself, and hence, defines a conserved charge. If Ka is timelike, then the corresponding charge is called a conserved mass, while for spacelike Ka with closed orbits in \({\mathcal S}\) the charge is called angular momentum. (Here \({\mathcal S}\) is not necessarily an element of the foliation \({{\mathcal S}_t}\) of 3B, and \({{\bar t}^a}\) is the unit normal to \({\mathcal S}\) tangent to 3B.)
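The independence of the cross section can be made explicit in one line. As a sketch, using only the symmetry of τab, its vanishing divergence in vacuum and the Killing equation on (3B, γab), and writing \({}^3{D_a}\) for the connection determined by γab:

$${}^3{D_a}\left({{\tau ^{ab}}{K_b}} \right) = \left({{}^3{D_a}{\tau ^{ab}}} \right){K_b} + {\tau ^{ab}}\,{}^3{D_{(a}}{K_{b)}} = 0.$$

The first term vanishes because τab is divergence-free in vacuum, the second because of the Killing equation. Applying Gauss’ theorem to the portion of 3B bounded by two cross sections then shows that the two flux integrals coincide.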

Clearly, the trace-χ action cannot be recovered as the volume integral of some scalar Lagrangian, because it is the Hilbert action plus a boundary integral of the trace χ, and the latter depends on the location of the boundary itself. Nevertheless, a (non-scalar) Lagrangian whose volume integral gives this action was found by Pons [431]. It depends on the coordinate system adapted to the boundary of the domain D of integration. An interesting feature of this Lagrangian is that it is second order in the derivatives of the metric, but it depends only on the first time derivative. A detailed analysis of the variational principle, the boundary conditions and the conserved charges is also given in [431]. In particular, the asymptotic properties of this Lagrangian are similar to those of the ΓΓ Lagrangian of Einstein, rather than to those of Hilbert.

10.1.3 The general form of the Brown-York quasi-local energy

The 3 + 1 decomposition of the spacetime metric yields a 2 + 1 decomposition of the metric γab, as well. Let N and Na be the lapse and the shift of this decomposition on 3B. Then the corresponding decomposition of τab defines the energy, momentum, and spatial-stress surface densities according to

$$\varepsilon : = {t_a}{t_b}{\tau ^{ab}} = - {1 \over {8\pi G}}k + {1 \over {\sqrt {\vert q\vert}}}{{\delta {S^0}} \over {\delta N}},$$
(10.5)
$${j_a}: = - {q_{ab}}{t_c}{\tau ^{bc}} = {1 \over {8\pi G}}{A_a} + {1 \over {\sqrt {\vert q\vert}}}{{\delta {S^0}} \over {\delta {N^a}}},$$
(10.6)
$${s}^{ab}: = \Pi _c^a\Pi _d^b{\tau ^{cd}} = {1 \over {8\pi G}}\left[ {{k^{ab}} - k{q^{ab}} + {q^{ab}}{t^e}({\nabla _e}{t_f})\;{\upsilon ^f}} \right] + {2 \over {\sqrt {\vert q\vert}}}{{\delta {S^0}} \over {\delta {q_{ab}}}},$$
(10.7)

where qab is the spacelike two-metric, Ae is the SO(1,1) vector potential on \({{\mathcal S}_t}\), \(\Pi _b^a\) is the projection to \({{\mathcal S}_t}\) introduced in Section 4.1.2, kab is the extrinsic curvature of \({{\mathcal S}_t}\) corresponding to the normal va orthogonal to 3B, and k is its trace. The timelike boundary 3B defines a boost-gauge on the two-surfaces \({{\mathcal S}_t}\) (which coincides with that determined by the foliation Σt in the ‘orthogonal boundaries’ case). The gauge potential Ae is taken in this gauge. Thus, although ε and ja on \({{\mathcal S}_t}\) are built from the two-surface data (in a particular boost-gauge), the spatial surface stress depends on the part ta(∇atb)vb of the acceleration of the foliation Σt as well. Let ξa be any vector field on 3B tangent to 3B, and ξa = nta + na its 2 + 1 decomposition. Then we can form the charge integral (10.4) for the leaves \({{\mathcal S}_t}\) of the foliation of 3B

$${E_t}[{\xi ^a},{t^a}]: = \oint\nolimits_{{{\mathcal S}_t}} {{\xi _a}{\tau ^{ab}}{t_b}\,d{{\mathcal S}_t}} = \oint\nolimits_{{{\mathcal S}_t}} {(n\varepsilon - {n^a}{j_a})\;d{{\mathcal S}_t}}.$$
(10.8)
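The second equality in Eq. (10.8) follows directly from the definitions (10.5) and (10.6). A short check, assuming (as the 2 + 1 decomposition implies) that na is tangent to \({{\mathcal S}_t}\), so that \({n_a} = {q_{ab}}{n^b}\), and using the symmetry of τab:

$${\xi _a}{\tau ^{ab}}{t_b} = n\,{t_a}{t_b}{\tau ^{ab}} + {n^a}{q_{ab}}{t_c}{\tau ^{bc}} = n\varepsilon - {n^a}{j_a}.$$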

Obviously, in general Et[ξa, ta] is not conserved, and depends not only on the vector field ξa and the two-surface data on the particular \({{\mathcal S}_t}\), but also on the boost-gauge that 3B defines on \({{\mathcal S}_t}\), i.e., on the timelike normal ta. Brown and York define the general form of their quasi-local energy on \({\mathcal S}: = {{\mathcal S}_t}\) by

$${E_{{\rm{BY}}}}({\mathcal S},{t^a}): = {E_t}\;[{t^a},{t^a}],$$
(10.9)

i.e., they link the ‘quasi-time-translation’ (i.e., the ‘generator of the energy’) to the preferred unit normal ta of \({{\mathcal S}_t}\). Since the preferred unit normals ta are usually interpreted as a fleet of observers who are at rest with respect to \({{\mathcal S}_t}\), in their spirit the Brown-York-type quasi-local energy expressions are similar to EΣ[ta] given by Eq. (2.6) for the matter fields or Eq. (3.17) for the gravitational ‘field’ rather than to the charges \({Q_{\mathcal S}}[{\bf{K}}]\). For vector fields ξa = na with closed integral curves in \({{\mathcal S}_t}\) the quantity Et[ξa, ta] might be interpreted as angular momentum corresponding to ξa.

The quasi-local energy is still not completely determined, because the ‘subtraction term’ S0 in the principal function has not been specified. This term is usually interpreted as our freedom to shift the zero point of the energy. Thus, the basic idea of fixing the subtraction term is to choose a ‘reference configuration’, i.e., a spacetime in which we want to obtain zero quasi-local quantities Et[ξa, ta] (in particular zero quasi-local energy), and identify S0 with the S1 of the reference spacetime. Thus, by Eq. (10.5) and (10.6) we obtain that

$$\begin{array}{*{20}c} {\varepsilon = - {1 \over {8\pi G}}(k - {k^0}),} & {{j_a} = {1 \over {8\pi G}}({A_a} - A_a^0),} \\ \end{array}$$
(10.10)

where k0 and \(A_a^0\) are the reference values of the trace of the extrinsic curvature and SO(1, 1)-gauge potential, respectively. Note that to ensure that k0 and \(A_a^0\) really be the trace of the extrinsic curvature and SO(1, 1)-gauge potential, respectively, in the reference spacetime, they cannot depend on the lapse N and the shift Na. This can be ensured by requiring that S0 be a linear functional of them. We return to the discussion of the reference term in the various specific constructions below.
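As an illustration of Eq. (10.10), the following minimal sketch evaluates the referenced energy for the simplest configuration: a round sphere of area radius r in a t = const slice of the Schwarzschild solution, with static observers (n = 1, na = 0) and the flat-space reference. The values k = (2/r)(1 − 2m/r)^{1/2} and k0 = 2/r are standard results assumed here rather than derived in the text, m is the mass parameter in geometrized units, the sign convention is chosen so that k0 > 0 for a round sphere, and the function name is hypothetical.

```python
import math


def brown_york_energy_schwarzschild(r, m, G=1.0):
    """Referenced quasi-local energy of the round sphere of area radius r in a
    t = const slice of Schwarzschild (mass parameter m, geometrized units).
    Assumes the convention k0 = 2/r > 0 for a round sphere in flat space."""
    if r < 2.0 * m:
        raise ValueError("the static slice only covers r >= 2m")
    k = (2.0 / r) * math.sqrt(1.0 - 2.0 * m / r)  # trace of extrinsic curvature in Schwarzschild
    k0 = 2.0 / r                                   # same sphere embedded in flat three-space
    epsilon = -(k - k0) / (8.0 * math.pi * G)      # energy surface density, Eq. (10.10)
    area = 4.0 * math.pi * r * r
    return epsilon * area                          # epsilon is constant on the sphere


if __name__ == "__main__":
    for radius in (2.0, 10.0, 1.0e3, 1.0e6):
        print(radius, brown_york_energy_schwarzschild(radius, 1.0))
```

Under these assumptions the integral reduces to (r/G)[1 − (1 − 2m/r)^{1/2}], which equals 2m/G on the horizon and tends to m/G for large r.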

For a definition of the Brown-York energy as a quasi-local energy operator in loop quantum gravity, see [565].

10.1.4 Further properties of the general expressions

As we noted, ε, ja, and sab depend on the boost-gauge that the timelike boundary defines on \({{\mathcal S}_t}\). Lau clarified how these quantities change under a boost gauge transformation, where the new boost-gauge is defined by the timelike boundary 3B′ of another domain D′ such that the particular two-surface St is a leaf of the foliation of 3B′ as well [333]. If \(\{{{\bar \Sigma}_t}\}\) is another foliation of D such that \(\partial {{\bar \Sigma}_t} = {{\mathcal S}_t}\) and \({{\bar \Sigma}_t}\) is orthogonal to 3B, then the new ε′, \(j_a{\prime}\), and \(s_{ab}{\prime}\) are built from the old ε, ja, and sab and the 2 + 1 pieces on \({{\mathcal S}_t}\) of the canonical momentum \({{\bar \tilde p}^{ab}}\), defined on \({{\bar \Sigma}_t}\). Apart from the contribution of S0, these latter quantities are

$${j_ \vdash}: = {2 \over {\sqrt {\vert h\vert}}}{\upsilon _a}{\upsilon _b}{\bar \tilde p^{ab}} = {1 \over {8\pi G}}l,$$
(10.11)
$${\hat j_a}: = {2 \over {\sqrt {\vert h\vert}}}{q_{ab}}{\upsilon _c}{\bar \tilde p^{bc}} = {1 \over {8\pi G}}{A_a},$$
(10.12)
$${t_{ab}}: = {2 \over {\sqrt {\vert h\vert}}}{q_{ac}}{q_{bd}}{\bar \tilde p^{cd}} = {1 \over {8\pi G}}\;[{l_{ab}} - {q_{ab}}\;(l + {\upsilon ^e}({\nabla _e}{\upsilon _f}){t^f})],$$
(10.13)

where lab is the extrinsic curvature of \({{\mathcal S}_t}\) corresponding to its normal ta (denoted by τab in Section 4.1.2), and l is its trace. (By Eq. (10.12), \({{\hat j}_a}\) is not an independent quantity: it is just ja. These quantities were originally introduced as the variational derivatives of the principal function with respect to the lapse, the shift and the two-metric of the radial foliation of Σt [333, 119], which are, in fact, essentially the components of the canonical momentum.) Thus, the required transformation formulae for ε, ja, and sab follow from the definitions and those for the extrinsic curvature and the SO(1, 1) gauge potential of Section 4.1.2. The various boost-gauge invariant quantities that can be built from ε, ja, sab, \({j_ \vdash}\), and tab are also discussed in [333, 119].

Lau repeated the general analysis above using the tetrad (in fact, triad) variables and the Ashtekar connection on the timelike boundary, instead of the traditional ADM-type variables [331]. Here the energy and momentum surface densities are re-expressed by the superpotential \({\vee _b}^{ae}\), given by Eq. (3.6), in a frame adapted to the two-surface. (Lau called the corresponding superpotential 2-form the ‘Sparling 2-form’.) However, in contrast to the usual Ashtekar variables on a spacelike hypersurface [30], the time gauge cannot be imposed globally on the boundary Ashtekar variables. In fact, while every orientable three-manifold Σ is parallelizable [410], and hence, a globally-defined orthonormal triad can be given on Σ, the only parallelizable, closed, orientable two-surface is the torus. Thus, on 3B, we cannot impose the global time gauge condition with respect to any spacelike two-surface \({\mathcal S}\) in 3B unless \({\mathcal S}\) is a torus. Similarly, the global radial gauge condition in the spacelike hypersurfaces Σt (even in a small open neighborhood of the whole two-surfaces \({{\mathcal S}_t}\) in Σt) can be imposed on a triad field only if the two-boundaries \({{\mathcal S}_t} = \partial {\Sigma _t}\) are all tori. Obviously, these gauge conditions can be imposed on every local trivialization domain of the tangent bundle \(T{{\mathcal S}_t}\) of \({{\mathcal S}_t}\). However, since in Lau’s local expressions only geometrical objects (like the extrinsic curvature of the two-surface) appear, they are valid even globally (see also [332]). On the other hand, further investigations are needed to clarify whether or not the quasi-local Hamiltonian, using the Ashtekar variables in the radial-time gauge [333], is globally well defined.

In general, the Brown-York quasi-local energy does not have any positivity property even if the matter fields satisfy the dominant energy conditions. However, as G. Hayward pointed out [244], for the variations of the metric around the vacuum solutions that extremalize the Hamiltonian, called the ‘ground states’, the quasi-local energy cannot decrease. On the other hand, the interpretation of this result as a ‘quasi-local dominant energy condition’ depends on the choice of the time gauge above, which does not exist globally on the whole two-surface \({\mathcal S}\).

Booth and Mann [100] shifted the emphasis from the foliation of the domain D to the foliation of the boundary 3B. (These investigations were extended to include charged black holes in [101], where the gauge dependence of the quasi-local quantities is also examined.) In fact, from the point of view of the quasi-local quantities defined with respect to the observers with world lines in 3B and orthogonal to \({\mathcal S}\), it is irrelevant how the spacetime domain D is foliated. In particular, the quasi-local quantities cannot depend on whether or not the leaves Σt of the foliation of D are orthogonal to 3B. As a result, Booth and Mann recovered the quasi-local charge and energy expressions of Brown and York derived in the ‘orthogonal boundary’ case. However, they suggested a new prescription for the definition of the reference configuration (see Section 10.1.8). Also, they calculated the quasi-local energy for round spheres in the spherically-symmetric spacetimes with respect to several moving observers, i.e., in contrast to Eq. (10.9), they did not link the generator vector field ξa to the normal ta of \({{\mathcal S}_t}\). In particular, the world lines of the observers are not integral curves of (∂/∂t) in the coordinate basis given in Section 4.2.1 on the round spheres.

Using an explicit, nondynamic background metric \(g_{ab}^0\), one can construct a covariant first-order Lagrangian \(L({g_{ab}},g_{ab}^0)\) for general relativity [306], and one can use the action \({I_D}[{g_{ab}},g_{ab}^0]\) based on this Lagrangian instead of the trace χ action. Fatibene, Ferraris, Francaviglia, and Raiteri [184] clarified the relationship between the two actions, \({I_D}[{g_{ab}}]\) and \({I_D}[{g_{ab}},g_{ab}^0]\), and the corresponding quasi-local quantities. Considering the reference term S0 in the Brown-York expression as the action of the background metric \(g_{ab}^0\) (which is assumed to be a solution of the field equations), they found that the two first-order actions coincide if the spacetime metrics gab and \(g_{ab}^0\) coincide on the boundary ∂D. Using \(L({g_{ab}},g_{ab}^0)\), they construct the conserved Noether current for any vector field ξa and, by taking its flux integral, define charge integrals \({Q_{\mathcal S}}[{\xi ^a},{g_{ab}},g_{ab}^0]\) on two-surfaces \({\mathcal S}\) (see footnote 15). Again, the Brown-York quasi-local quantity Et[ξa, ta] and \({Q_{{{\mathcal S}_t}}}[{\xi ^a},{g_{ab}},g_{ab}^0]\) coincide if the spacetime metrics coincide on the boundary ∂D and if ξa has some special form. Therefore, although the two approaches are basically equivalent under the boundary condition above, this boundary condition is too strong from both the point of view of the variational principle and that of the quasi-local quantities. We will see in Section 10.1.8 that even the weaker boundary condition, which requires only the induced three-metrics on 3B from gab and from \(g_{ab}^0\) to be the same, is still too strong.

10.1.5 The Hamiltonians

If we can write the action I[q(t)] of our mechanical system into the canonical form \(\int\nolimits_{{t_1}}^{{t_2}} {[{p_a}{{\dot q}^a} - H({q^a},{p_a},t)]\;dt}\), then it is straightforward to read off the Hamiltonian of the system. Thus, having accepted the trace χ action as the action for general relativity, it is natural to derive the corresponding Hamiltonian in the analogous way. Following this route Brown and York derived the Hamiltonian, corresponding to the ‘basic’ (or nonreferenced) action I1 as well [121]. They obtained the familiar integral of the sum of the Hamiltonian and the momentum constraints, weighted by the lapse N and the shift Na, respectively, plus Et[Nta + Na, ta], given by Eq. (10.8), as a boundary term. This result is in complete agreement with the expectations, as their general quasi-local quantities can also be recovered as the value of the Hamiltonian on the constraint surface (see also [100]). This Hamiltonian was investigated further in [119]. Here all the boundary terms that appear in the variation of their Hamiltonian are determined and decomposed with respect to the two-surface ∂Σ. It is shown that the change of the Hamiltonian under a boost of Σ yields precisely the boosts of the energy and momentum surface density discussed above.

Hawking, Horowitz, and Hunter also derived the Hamiltonian from the trace χ action \(I_D^1[{g_{ab}}]\) both with the orthogonal [241] and nonorthogonal boundary assumptions [242]. They allowed matter fields ΦN, whose dynamics is governed by a first-order action \(I_{{\rm{m}}D}^1[{g_{ab}},{\Phi _N}]\), to be present. However, they treated the reference configuration in a different way. In the traditional canonical analysis of the fields and the geometry based on a noncompact Σ (for example in the asymptotically flat case) one has to impose certain falloff conditions that ensure the finiteness of the action, the Hamiltonian, etc. This finiteness requirement excludes several potentially interesting field + gravity configurations from our investigations. In fact, in the asymptotically flat case we compare the actual matter + gravity configurations with the flat spacetime + vanishing matter fields configuration. Hawking and Horowitz generalized this picture by choosing a static, but otherwise arbitrary, solution \(g_{ab}^0\), \(\Phi _N^0\) of the field equations, considered the timelike boundary 3B of D to be a timelike cylinder ‘near the infinity’, and considered the action

$${I_D}\,[{g_{ab}},{\Phi _N}]: = I_D^1\,[{g_{ab}}] + I_{{\rm{m}}D}^1\,[{g_{ab}},{\Phi _N}] - I_D^1\left[ {g_{ab}^0} \right] - I_{{\rm{m}}D}^1\,[g_{ab}^0,\Phi _N^0]$$

and those matter + gravity configurations that induce the same value on 3B as \(g_{ab}^0\) and \(\Phi _N^0\). Its limit as 3B is ‘pushed out to infinity’ can be finite, even if the limit of the original (i.e., nonreferenced) action is infinite. Although in the nonorthogonal boundaries case the Hamiltonian derived from the nonreferenced action contains terms coming from the ‘joints’, by the boundary conditions at 3B they are canceled from the referenced Hamiltonian. This latter Hamiltonian coincides with that obtained in the orthogonal boundaries case. Both the ADM and the Abbott-Deser energy can be recovered from this Hamiltonian [241], and the quasi-local energy for spheres in domains with nonorthogonal boundaries in the Schwarzschild solution is also calculated [242]. A similar Hamiltonian, including the ‘joints’ or ‘corner’ terms, was obtained by Francaviglia and Raiteri [191] for the vacuum Einstein theory (and for Einstein-Maxwell systems in [9]), using a Noether charge approach. Their formalism, using the language of jet bundles, is, however, slightly more sophisticated than that common in general relativity.

Booth and Fairhurst [95] reexamined the general form of the Brown-York energy and angular momentum from a Hamiltonian point of view (see footnote 16). Their starting point is the observation that the domain D is not isolated from its environment, thus, the quasi-local Hamiltonian cannot be time independent. Therefore, instead of the standard Hamiltonian formalism for the autonomous systems, a more general formalism, based on the extended phase space, must be used. This phase space consists of the usual bulk configuration and momentum variables \(({h_{ab}},{{\tilde p}^{ab}})\) on the typical three-manifold Σ and the time coordinate t, the space coordinates xA on the two-boundary \({\mathcal S} = \partial \Sigma\), and their conjugate momenta π and πa.

The second important observation of Booth and Fairhurst is that the Brown-York boundary conditions are too restrictive. The two-metric, lapse, and shift need not be fixed, but their variations corresponding to diffeomorphisms on the boundary must be allowed. Otherwise diffeomorphisms that are not isometries of the three-metric γab on 3B cannot be generated by any Hamiltonian. Relaxing the boundary conditions appropriately, they show that there is a Hamiltonian on the extended phase space, which generates the correct equations of motion, and the quasi-local energy and angular momentum expression of Brown and York are just (minus) the momentum π conjugate to the time coordinate t. The only difference between the present and the original Brown-York expressions is the freedom in the functional form of the unspecified reference term. Because of the more restrictive boundary conditions of Brown and York, their reference term is less restricted. Choosing the same boundary conditions in both approaches, the resulting expressions coincide completely.

10.1.6 The flat space and light cone references

The quasi-local quantities introduced above become well defined only if the subtraction term S0 in the principal function is specified. The usual interpretation of a choice for S0 is the calibration of the quasi-local quantities, i.e., fixing where to take their zero value.

The only restriction on S0 that we had is that it must be a functional of the metric γab on the timelike boundary 3B. To specify S0, it seems natural to expect that the principal function S be zero in Minkowski spacetime [216, 120]. Then S0 would be the integral of the trace Θ0 of the extrinsic curvature of 3B, if it were embedded in Minkowski spacetime with the given intrinsic metric γab. However, a general Lorentzian three-manifold (3B, γab) cannot be isometrically embedded, even locally, into the Minkowski spacetime. (For a detailed discussion of this embeddability, see [120] and Section 10.1.8.)

Another assumption on S0 might be the requirement of the vanishing of the quasi-local quantities, or of the energy and momentum surface densities, or only of the energy surface density ε, in some reference spacetime, e.g., in Minkowski or anti-de Sitter spacetime. Assuming that S0 depends on the lapse N and shift Na linearly, the functional derivatives (∂S0/∂N) and (∂S0/∂Na) depend only on the two-metric qab and on the boost-gauge that 3B defined on \({{\mathcal S}_t}\). Therefore, ε and ja take the form (10.10), and, by the requirement of the vanishing of ε in the reference spacetime it follows that k0 should be the trace of the extrinsic curvature of \({{\mathcal S}_t}\) in the reference spacetime. Thus, it would be natural to fix k0 as the trace of the extrinsic curvature of \({{\mathcal S}_t}\), when (\({{\mathcal S}_t}\), qab) is embedded isometrically into the reference spacetime. However, this embedding is far from unique (since, in particular, there are two independent normals of \({{\mathcal S}_t}\) in the spacetime and it would not be fixed which normal should be used to calculate k0), and hence the construction would be ambiguous. On the other hand, one could require (\({{\mathcal S}_t}\), qab) to be embedded into flat Euclidean three-space, i.e., into a spacelike hyperplane of Minkowski spacetime. This is the choice of Brown and York [120, 121]. In fact, as we already noted in Section 4.1.3, for two-surfaces with everywhere positive scalar curvature, such an embedding exists and is unique. (The order of the differentiability of the metric is reduced in [261] to C2.) A particularly interesting two-surface that cannot be isometrically embedded into the flat three-space is the event horizon of the Kerr black hole, if the angular momentum parameter a exceeds the irreducible mass (but is still not greater than the mass parameter m), i.e., if \(\sqrt 3 m < 2\vert a\vert \; < 2m\) [463]. (On the other hand, for its global isometric embedding into ℝ4, see [203].) Thus, the construction works for a large class of two-surfaces, but certainly not for every potentially interesting two-surface. The convexity condition is essential.
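The Kerr threshold quoted above can be illustrated numerically. The following minimal sketch assumes the standard closed-form expression for the Gauss curvature of the Kerr horizon in Boyer-Lindquist coordinates, K(θ) = (r₊² + a²)(r₊² − 3a² cos²θ)/(r₊² + a² cos²θ)³, which is not taken from the text, and uses hypothetical function names. It only checks that the curvature at the poles changes sign exactly when |a| reaches the irreducible mass, i.e., at 2|a| = √3 m; it does not attempt to construct or rule out an embedding.

```python
import math


def kerr_horizon_radius(m, a):
    """Boyer-Lindquist radius r_+ of the Kerr event horizon (requires |a| <= m)."""
    if abs(a) > m:
        raise ValueError("no horizon for |a| > m")
    return m + math.sqrt(m * m - a * a)


def irreducible_mass(m, a):
    """Irreducible mass, defined by horizon area = 16 pi M_irr^2."""
    r_plus = kerr_horizon_radius(m, a)
    return 0.5 * math.sqrt(r_plus ** 2 + a ** 2)


def horizon_gauss_curvature(m, a, theta):
    """Gauss curvature of the horizon two-surface at polar angle theta
    (closed-form expression assumed here, not derived in the text)."""
    r_plus = kerr_horizon_radius(m, a)
    c2 = (a * math.cos(theta)) ** 2
    return (r_plus ** 2 + a ** 2) * (r_plus ** 2 - 3.0 * c2) / (r_plus ** 2 + c2) ** 3


if __name__ == "__main__":
    m = 1.0
    for a in (0.0, 0.8, math.sqrt(3.0) / 2.0, 0.9):
        print(a, irreducible_mass(m, a), horizon_gauss_curvature(m, a, 0.0))
```

For a = 0 the curvature reduces to 1/(2m)², the value for a round sphere of radius 2m; for 2|a| > √3 m it is negative in a neighborhood of the poles while remaining positive on the equator.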

It is known that the (local) isometric embeddability of (\({\mathcal S}\), qab) into flat three-space with extrinsic curvature \(k_{ab}^0\) is equivalent to the Gauss-Codazzi-Mainardi equations \({\delta _a}({k^{0a}}_b - \delta _b^a{k^0}) = 0\) and \(^{\mathcal S}R - {({k^0})^2} + k_{ab}^0{k^{0ab}} = 0\). Here δa is the intrinsic Levi-Civita covariant derivative and \(^{\mathcal S}R\) is the corresponding curvature scalar on \({\mathcal S}\) determined by qab. Thus, for given qab and (actually the flat) embedding geometry, these are three equations for the three components of \(k_{ab}^0\), and hence, if the embedding exists, qab determines k0. Therefore, the subtraction term k0 can also be interpreted as a solution of an under-determined elliptic system, which is constrained by a nonlinear algebraic equation. In this form the definition of the reference term is technically analogous to the definition of those in Sections 7, 8, and 9, but, by the nonlinearity of the equations, in practice it is much more difficult to find the reference term k0 than the spinor fields in the constructions of Sections 7, 8, and 9.
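The simplest solution of these equations is the round sphere of radius r in flat three-space. As an illustrative check, assuming the standard positive-definite conventions in which \(k_{ab}^0 = {q_{ab}}/r\), and hence \({k^0} = 2/r\), \(k_{ab}^0{k^{0ab}} = 2/{r^2}\) and \({}^{\mathcal S}R = 2/{r^2}\) (individual signs may differ in the signature conventions used elsewhere in this review):

$${\delta _a}\left({{k^{0a}}_b - \delta _b^a{k^0}} \right) = - {1 \over r}{\delta _a}\delta _b^a = 0,\quad \quad {}^{\mathcal S}R - {({k^0})^2} + k_{ab}^0{k^{0ab}} = {2 \over {{r^2}}} - {4 \over {{r^2}}} + {2 \over {{r^2}}} = 0.$$

Here the constraint equations fix k0 = 2/r in terms of the intrinsic geometry alone, as stated above for the general convex case.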

Accepting this choice for the reference configuration, the reference SO(1,1) gauge potential \(A_a^0\) will be zero in the boost-gauge in which the timelike normal of \({{\mathcal S}_t}\) in the reference Minkowski spacetime is orthogonal to the spacelike three-plane, because this normal is constant. Thus, to summarize, for convex two-surfaces, the flat space reference of Brown and York is uniquely determined, k0 is determined by this embedding, and \(A_a^0 = 0\). Then \(8\pi G{S^0} = - \int\nolimits_{{{\mathcal S}_t}} {N{k^0}} d{{\mathcal S}_t}\), from which sab can be calculated (if needed). The procedure is similar if, instead of a spacelike hyperplane of Minkowski spacetime, a spacelike hypersurface of constant curvature (for example in the de Sitter or anti-de Sitter spacetime) is used. The only difference is that extra (known) terms appear in the Gauss-Codazzi-Mainardi equations.

Brown, Lau, and York considered another prescription for the reference configuration as well [118, 334, 335]. In this approach the two-surface (\({{\mathcal S}_t}\), qab) is embedded into the light cone of a point of the Minkowski or anti-de Sitter spacetime instead of into a spacelike hypersurface of constant curvature. The essential difference between the new (‘light cone reference’) and the previous (‘flat space reference’) prescriptions is that the embedding into the light cone is not unique, but the reference term k0 may be given explicitly, in a closed form. The positivity of the Gauss curvature of the intrinsic geometry of (\({\mathcal S}\), qab) is not needed. In fact, by a result of Brinkmann [115], every locally-conformally-flat Riemannian n-geometry is locally isometric to an appropriate cut of a light cone of the n + 2 dimensional Minkowski spacetime (see, also, [178]). To achieve uniqueness some extra condition must be imposed. This may be the requirement of the vanishing of the ‘normal momentum density’ \(j_\vdash^0\) in the reference spacetime [334, 335], yielding \({k^0} = \sqrt {2\,{}^{\mathcal S}R + 4/{\lambda ^2}}\), where \(^{\mathcal S}R\) is the Ricci scalar of (\({\mathcal S}\), qab) and λ is the cosmological constant of the reference spacetime. The condition \(j_\vdash^0 = 0\) defines something like a ‘rest frame’ in the reference spacetime. Another, considerably more complicated, choice for the light cone reference term is used in [118].

10.1.7 Further properties and the various limits

Although the general, nonreferenced expressions are additive, the prescription for the reference term k0 destroys the additivity in general. In fact, if \({\mathcal S}'\) and \({\mathcal S}''\) are two-surfaces such that \({\mathcal S}' \cap {\mathcal S}''\) is connected and two-dimensional (more precisely, it has a nonempty open interior, for example, in \({\mathcal S}'\)), then in general \(\overline {{\mathcal S}' \cup {\mathcal S}'' - {\mathcal S}' \cap {\mathcal S}''}\) (overline means topological closure) is not guaranteed to be embeddable into the flat three-space, and even if it is embeddable, the resulting reference term k0 differs from the reference terms k′0 and k″0 determined from the individual embeddings of \({\mathcal S}'\) and \({\mathcal S}''\).

As noted in [100], the Brown-York energy with the flat space reference configuration is not zero in Minkowski spacetime in general. In fact, in the standard spherical polar coordinates let Σ1 be the spacelike hyperboloid \(t = - \sqrt {{\rho ^2} + {r^2}}\), Σ0 the hyperplane t = −T = const. < −ρ < 0, and \({\mathcal S}: = {\Sigma _0} \cap {\Sigma _1}\) the sphere of radius \(\sqrt {{T^2} - {\rho ^2}}\) in the t = −T hyperplane. Then the trace of the extrinsic curvature of \({\mathcal S}\) in Σ0 and in Σ1 is \(2/\sqrt {{T^2} - {\rho ^2}}\) and \(2T/(\rho \sqrt {{T^2} - {\rho ^2}})\), respectively. Therefore, the Brown-York quasi-local energy (with the flat three-space reference) associated with \({\mathcal S}\) and the normals of Σ1 on \({\mathcal S}\) is \(- \sqrt {(T + \rho){{(T - \rho)}^3}}/(\rho G)\). Similarly, the Brown-York quasi-local energy with the light cone references in [334] and in [118] is also negative for such surfaces with the boosted observers.
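The value quoted above can be checked with a few lines of arithmetic. The following is a minimal sketch, assuming that the Brown-York energy with unit lapse is the integral of (k0 − k)/(8πG) over the sphere; the function names are hypothetical and serve only as an illustration.

```python
import math

def brown_york_boosted_minkowski(T, rho, G=1.0):
    """Brown-York energy of S = Sigma_0 ∩ Sigma_1 with the normals of Sigma_1.

    Assumes E_BY = (1/(8 pi G)) * integral of (k0 - k) over S (unit lapse).
    """
    if not (T > rho > 0):
        raise ValueError("need T > rho > 0")
    R = math.sqrt(T**2 - rho**2)      # radius of S in the t = -T hyperplane
    k0 = 2.0 / R                      # trace of extrinsic curvature in Sigma_0
    k = 2.0 * T / (rho * R)           # trace of extrinsic curvature in Sigma_1
    area = 4.0 * math.pi * R**2
    return (k0 - k) * area / (8.0 * math.pi * G)

def closed_form(T, rho, G=1.0):
    """The closed-form expression quoted in the text."""
    return -math.sqrt((T + rho) * (T - rho)**3) / (rho * G)

if __name__ == "__main__":
    print(brown_york_boosted_minkowski(5.0, 3.0), closed_form(5.0, 3.0))
```

Since both k0 and k are constant on the round sphere, the integral reduces to a product with the area, and the two functions agree for any T > ρ > 0.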

Recently, Shi and Tam [458] have proven interesting theorems in Riemannian three-geometries, which can be used to prove positivity of the Brown-York energy if the two-surface \({\mathcal S}\) is a boundary of some time-symmetric spacelike hypersurface on which the dominant energy condition holds. In the time-symmetric case, this energy condition is just the condition that the scalar curvature be non-negative. The key theorem of Shi and Tam is the following: let Σ be a compact, smooth Riemannian three-manifold with non-negative scalar curvature and smooth two-boundary \({\mathcal S}\) such that each connected component \({{\mathcal S}_i}\) of \({\mathcal S}\) is homeomorphic to S2 and the scalar curvature of the induced two-metric on \({{\mathcal S}_i}\) is strictly positive. Then, for each component \(\oint\nolimits_{{{\mathcal S}_i}} {kd} {{\mathcal S}_i} \leq \oint\nolimits_{{{\mathcal S}_i}} {{k^0}} d{{\mathcal S}_i}\) holds, where k is the trace of the extrinsic curvature of \({\mathcal S}\) in Σ with respect to the outward-directed normal, and k0 is the trace of the extrinsic curvature of \({{\mathcal S}_i}\) in the flat Euclidean three-space when \({{\mathcal S}_i}\) is isometrically embedded. Furthermore, if in these inequalities the equality holds for at least one \({{\mathcal S}_i}\), then \({\mathcal S}\) itself is connected and Σ is flat. This result is generalized in [459] by weakening the energy condition, in which case lower estimates of the Brown-York energy can still be given. For some rigidity theorems connected with this positivity result, see [461]; and for their generalization for higher dimensional spin manifolds, see [329].

The energy expression for round spheres was calculated in [121, 100]. In the spherically-symmetric metric discussed in Section 4.2.1, on round spheres the Brown-York energy with the flat space reference and fleet of observers ∂/∂t on \({\mathcal S}\) is \(G{E_{{\rm{BY}}}}[{{\mathcal S}_r},{(\partial/\partial t)^a}] = r(1 - \exp (- \alpha))\). In particular, it is \(r[1 - \sqrt {1 - (2m/r)} ]\) for the Schwarzschild solution. This deviates from the standard round sphere expression, and, for the horizon of the Schwarzschild black hole, it is 2m (instead of the expected m). (The energy has also been calculated explicitly for boosted foliations of the Schwarzschild solution and for round spheres in isotropic cosmological models [119].) Still in the spherically-symmetric context the definition of the Brown-York energy is extended to spherical two-surfaces beyond the event horizon in [347] (see also [443]). A remarkable result is that while the total energy of the electrostatic field of a point charge in any finite three-volume surrounding the point charge in Minkowski spacetime is always infinite, the negative gravitational binding energy compensates the electrostatic energy so that the quasi-local energy is negative within a certain radius under the event horizon in the Reissner-Nordström spacetime and tends to −|e| as r → 0. The Brown-York energy is discussed from the point of view of observers in spherically-symmetric spacetimes (e.g., the connection between this energy and the effective energy in the geodesic equation for radial geodesics) in [90, 576]. The explicit calculation of the Brown-York energy with the (implicitly assumed) flat-space reference in Friedmann-Robertson-Walker spacetimes (as particular examples for the general round sphere case) is given in [6].

The Newtonian limit can be derived from the round sphere expression by assuming that m is the mass of a fluid ball of radius r and m/r is small: It is \(G{E_{{\rm{BY}}}} = m + ({m^2}/2r) + {\mathcal O}({r^{- 2}})\). The first term is simply the mass defined at infinity, and the second term is minus the Newtonian potential energy associated with building a spherical shell of mass m and radius r from individual particles, bringing them together from infinity. (For the calculation of the Newtonian limit in the covariant Newtonian spacetime, see [564].) However, taking into account that on the Schwarzschild horizon \(G{E_{{\rm{BY}}}} = 2m\), while at spatial infinity it is just m, the Brown-York energy is monotonically decreasing with r. Also, the first law of black hole mechanics for spherically-symmetric black holes can be recovered by identifying EBY with the internal energy [120, 121]. The thermodynamics of the Schwarzschild-anti-de Sitter black holes was investigated in terms of the quasi-local quantities in [116]. Still considering EBY to be the internal energy, the temperature, surface pressure, heat capacity, etc. are calculated (see Section 13.3.1). The energy has also been calculated for the Einstein-Rosen cylindrical waves [119].
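The round sphere values quoted for the Schwarzschild solution are easy to tabulate. The sketch below, with hypothetical function names, evaluates G·E_BY = r[1 − √(1 − 2m/r)] for r ≥ 2m and compares it with the two leading terms of the large-r expansion, m + m²/(2r).

```python
import math

def g_e_by_schwarzschild(r, m):
    """G * E_BY on a round sphere of areal radius r >= 2m (static observers)."""
    if r < 2.0 * m:
        raise ValueError("this expression is used here only for r >= 2m")
    return r * (1.0 - math.sqrt(1.0 - 2.0 * m / r))

def newtonian_approx(r, m):
    """Leading terms of the expansion for small m/r."""
    return m + m**2 / (2.0 * r)

if __name__ == "__main__":
    m = 1.0
    for r in (2.0, 3.0, 10.0, 100.0):
        print(r, g_e_by_schwarzschild(r, m), newtonian_approx(r, m))
```

The printed values start at 2m on the horizon and decrease toward m, and the difference from the two-term approximation falls off like m³/(2r²).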

The energy is explicitly calculated for three different kinds of two-spheres in the t = const. slices (in the Boyer-Lindquist coordinates) of the slow rotation limit of the Kerr black hole spacetime with the flat space reference [356]. These surfaces are the r = const. surfaces (such as the outer horizon), spheres whose intrinsic metric (in the given slow rotation approximation) is that of a metric sphere of radius R with surface area 4πR², and the ergosurface (i.e., the outer boundary of the ergosphere). The slow rotation approximation is defined such that |a|/R ≪ 1, where R is the typical spatial measure of the two-surface. In the first two cases the angular momentum parameter enters the energy expression only in the m²a²/R³ order. In particular, the energy for the outer horizon \({r_ +}: = m + \sqrt {{m^2} - {a^2}}\) is, in this approximation, twice the irreducible mass of the black hole. An interesting feature of this calculation is that the energy cannot be calculated for the horizon directly, because, as previously noted, the horizon itself cannot be isometrically embedded into a flat three-space if the angular momentum parameter exceeds the irreducible mass [463]. The energy for the ergosurface is positive, as for the other two kinds of surfaces.

The spacelike infinity limit of the charges interpreted as the energy, spatial momentum, and spatial angular momentum are calculated in [119] (see also [241]). Here the flat-space reference configuration and the asymptotic Killing vectors of the spacetime are used, and the limits coincide with the standard ADM energy, momentum, and spatial angular momentum. The analogous calculation for the center-of-mass is given in [57]. It is shown that the corresponding large sphere limit is just the center-of-mass expression of Beig and Ó Murchadha [64]. Here the center-of-mass integral is also given in terms of a charge integral of the curvature. The large sphere limit of the energy for metrics with the weakest possible falloff conditions is calculated in [181, 462]. A further demonstration that the spatial infinity limit of the Brown-York energy in an asymptotically Schwarzschild spacetime is the ADM energy is given in [180].

Although the prescription for the reference configuration by Hawking and Horowitz cannot be imposed for a general timelike three-boundary 3B (see Section 10.1.8), asymptotically, when 3B is pushed out to infinity, this prescription can be used, and coincides with the prescription of Brown and York. Choosing the background metric \(g_{ab}^0\) to be the anti-de Sitter one, Hawking and Horowitz [241] calculated the limit of the quasi-local energy, and they found it to tend to the Abbott-Deser energy. (For the spherically-symmetric Schwarzschild-anti-de Sitter case see also [116].) In [117] the null infinity limit of the integral of N(k0 − k)/(8πG) was calculated both for the lapses N, generating asymptotic time translations and supertranslations at the null infinity, and the fleet of observers was chosen to tend to the BMS translation. In the former case the Bondi-Sachs energy, in the latter case Geroch’s supermomenta are recovered. These calculations are based directly on the Bondi form of the spacetime metric, and do not use the asymptotic solution of the field equations. (The limit of the Brown-York energy on general asymptotically hyperboloidal hypersurfaces is calculated in [330].) In a slightly different formulation Booth and Creighton calculated the energy flux of outgoing gravitational radiation [94] (see also Section 13.1) and they recovered the Bondi-Sachs mass-loss.

However, the calculation of the small sphere limit based on the flat-space reference configuration gave strange results [335]. While in nonvacuum the quasi-local energy is the expected (4π/3)r3Tabtatb, in vacuum it is proportional to 4EabEab + HabHab, instead of the Bel-Robinson ‘energy’ Tabcdtatbtctd. (Here Eab and Hab are, respectively, the conformal electric and conformal magnetic curvatures, and ta plays a double role. It defines the two-sphere of radius r [as is usual in the small sphere calculations], and defines the fleet of observers on the two-sphere.) On the other hand, the special light cone reference used in [118, 335] reproduces the expected result in nonvacuum, and yields [1/(90G)]r5Tabcdtatbtctd in vacuum. The small sphere limit was also calculated in [181] for small geodesic spheres in a time symmetric spacelike hypersurface.

The light cone reference \({k^0} = \sqrt {2\,{}^{\mathcal S}R + 4/{\lambda ^2}}\) was shown to work in the large sphere limit near the null and spatial infinities of asymptotically flat spacetimes and near the infinity of asymptotically anti-de Sitter spacetimes [334]. Namely, the Brown-York quasi-local energy expression with this null-cone reference term tends to the Bondi-Sachs, the ADM, and Abbott-Deser energies. The supermomenta of Geroch at null infinity can also be recovered in this way. The proof is simply a demonstration of the fact that this light cone and the flat space prescriptions for the subtraction term have the same asymptotic structure up to order \({\mathcal O}({r^{- 3}})\). This choice seems to work properly only in the asymptotics, because for small ellipsoids in the Minkowski spacetime this definition yields nonzero energy and for small spheres in vacuum it does not yield the Bel-Robinson ‘energy’ (see Footnote 17).
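The remark about ellipsoids in Minkowski spacetime can be illustrated numerically. The sketch below rests on several assumptions that are not spelled out in the text: the ellipsoid is a spheroid lying in a flat spacelike hyperplane, the observers are at rest in that hyperplane (so k is the Euclidean mean curvature k1 + k2 and l = 0), and for the Minkowski reference the 4/λ² term is taken to be absent, so that k0 = √(2·R) = 2√(k1·k2). The function name and the midpoint-rule quadrature are illustrative choices only.

```python
import math

def light_cone_energy_spheroid(a, c, n=2000, G=1.0):
    """(1/(8 pi G)) * integral of (2*sqrt(K) - H) over the spheroid
    x^2/a^2 + y^2/a^2 + z^2/c^2 = 1 in Euclidean three-space."""
    total = 0.0
    dtheta = math.pi / n
    for i in range(n):
        theta = (i + 0.5) * dtheta                 # midpoint rule in the polar angle
        w = (a * math.cos(theta))**2 + (c * math.sin(theta))**2
        k1 = a * c / w**1.5                        # principal curvature along meridians
        k2 = c / (a * math.sqrt(w))                # principal curvature along parallels
        dS = 2.0 * math.pi * a * math.sin(theta) * math.sqrt(w) * dtheta
        total += (2.0 * math.sqrt(k1 * k2) - (k1 + k2)) * dS
    return total / (8.0 * math.pi * G)

if __name__ == "__main__":
    print(light_cone_energy_spheroid(1.0, 1.0))    # round sphere
    print(light_cone_energy_spheroid(1.0, 2.0))    # prolate spheroid
```

Under these assumptions the integrand is −(√k1 − √k2)², so the result vanishes for the round sphere and is strictly negative for any other spheroid. The result scales linearly with the size of the surface, so it does not vanish for small ellipsoids of fixed shape either.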

A formulation and a proof of a version of Thorne’s hoop conjecture for spherically symmetric configurations in terms of EBY are given in [402], and will be discussed in Section 13.2.2.

10.1.8 Other prescriptions for the reference configuration

As previously noted, Hawking, Horowitz, and Hunter [241, 242] defined their reference configuration by embedding the Lorentzian three-manifold (3B, γab) isometrically into some given Lorentzian spacetime, e.g., into the Minkowski spacetime (see also [216]). However, for the given intrinsic three-metric γab and the embedding four-geometry the corresponding Gauss and Codazzi-Mainardi equations form a system of 6 + 8 = 14 equations for the six components of the extrinsic curvature Θab [120]. Thus, in general, this is a highly overdetermined system, and hence it may be expected to have a solution only in exceptional cases. Moreover, even if such an embedding existed, even small perturbations of the intrinsic metric γab would break the conditions of embeddability. Therefore, in general, this prescription for the reference configuration can work only if the three-surface 3B is ‘pushed out to infinity’, but does not work for finite three-surfaces [120].

To rule out the possibility that the Brown-York energy can be nonzero even in Minkowski spacetime (on two-surfaces in the boosted flat data set), Booth and Mann [100] suggested that one embed \(({\mathcal S},{q_{ab}})\) isometrically into a reference spacetime \(({M^0},g_{ab}^0)\) (mostly into the Minkowski spacetime) instead of a spacelike slice of it, and map the evolution vector field ξa = Nta + Na of the dynamics, tangent to 3B, to a vector field ξ0a in M0 such that \({{\mathcal L}_\xi}{q_{ab}} = {\phi ^{\ast}}({{\mathcal L}_{{\xi ^0}}}q_{ab}^0)\) and \({\xi ^a}{\xi _a} = {\phi ^{\ast}}({\xi ^{0a}}\xi _a^0)\). Here ϕ is a diffeomorphism mapping an open neighborhood U of \({\mathcal S}\) in M into M0 such that \(\phi {\vert _{\mathcal S}}\), the restriction of ϕ to \({\mathcal S}\), is an isometry, and \({{\mathcal L}_\xi}{q_{ab}}\) denotes the Lie derivative of qab along ξa. This condition might be interpreted as some local version of that of Hawking, Horowitz, and Hunter. However, Booth and Mann did not investigate the existence or the uniqueness of this choice.

10.2 Kijowski’s approach

10.2.1 The role of the boundary conditions

In the Brown-York approach the leading principle was the claim to have a well-defined variational principle. This led them (i) to modify the Hilbert action to the trace-χ-action and (ii) to the boundary condition that the induced three-metric on the boundary of the domain D of the action is fixed.

However, as stressed by Kijowski [315, 317, 229], the boundary conditions have much deeper content. For example, in thermodynamics the different definitions of the energy (internal energy, enthalpy, free energy, etc.) are connected with different boundary conditions. Fixing the pressure corresponds to enthalpy, but fixing the temperature corresponds to free energy. Thus, the different boundary conditions correspond to different physical situations, and, mathematically, to different phase spaces (see Footnote 18). Therefore, to relax the a priori boundary conditions, Kijowski abandoned the variational principle and concentrated on the equations of motion. However, to treat all possible boundary conditions on an equal footing he used the enlarged phase space of Tulczyjew (see, for example, [317]; see also Footnote 19). The boundary condition of Brown and York is only one of the possible boundary conditions.

10.2.2 The analysis of the Hilbert action and the quasi-local internal and free energies

Starting with the variation of Hilbert’s Lagrangian (in fact, the corresponding Hamilton-Jacobi principal function on a domain D above), and defining the Hamiltonian by the standard Legendre transformation on the typical compact spacelike three-manifold Σ and its boundary \({\mathcal S} = \partial \Sigma\) as well, Kijowski arrived at a variation formula involving the value on \({\mathcal S}\) of the variation of the canonical momentum, \({{\tilde \pi}^{ab}}: = - {1 \over {16\pi G}}\sqrt {\vert \gamma \vert} ({\Theta ^{ab}} - \Theta {\gamma ^{ab}})\), conjugate to γab. (Apart from a numerical coefficient and the subtraction term, this is essentially the surface stress-energy tensor τab given by Eq. (10.3).) Since, however, it is not clear whether or not the initial + boundary value problem for the Einstein equations with fixed canonical momenta (i.e., extrinsic curvature) is well posed, he did not consider the resulting Hamiltonian as the appropriate one, and made further Legendre transformations on the boundary \({\mathcal S}\).

The first Legendre transformation that he considered gave a Hamiltonian whose variation involves the variation of the induced two-metric qab on \({\mathcal S}\) and the parts \({{\tilde \pi}^{ab}}{t_a}{t_b}\) and \({{\tilde \pi}^{ab}}{t_a}\Pi _b^c\) of the canonical momentum above. Explicitly, with the notation of Section 10.1, the latter two are πabtatb = k/(16πG) and πabtaqbc = Ac/(16πG), respectively. (πab is the de-densitized \({{\tilde \pi}^{ab}}\).) Then, however, the lapse and the shift on the boundary \({\mathcal S}\) will not be independent. As Kijowski shows, they are determined by the boundary conditions of the two-metric and the freely specifiable parts k and Ac of the canonical momentum πab. Then, to define the ‘quasi-symmetries’ of the two-surface, Kijowski suggests that one embed first the two-surface isometrically into an x0 = const. hyperplane of the Minkowski spacetime, and then define a world tube by dragging this two-surface along the integral curves of the Killing vectors of the Minkowski spacetime. For example, to define ‘quasi time translation’ of the two-surface in the physical spacetime we must consider the time translation in the Minkowski spacetime of the two-surface embedded in the x0 = const. hyperplane. This world tube gives an extrinsic curvature \(k_{ab}^0\) and vector potential \(A_c^0\). Finally, Kijowski’s choices for k and Ac are just k0 and \(A_c^0\), respectively. In particular, to define ‘quasi time translation’ he takes πabtatb = k0/(16πG) and \({\pi ^{ab}}{t_a}\Pi _b^c = 0\), because this choice yields zero shift and constant lapse with value one. The corresponding quasi-local quantity, the Kijowski energy, is

$${E_{\rm{K}}}({\mathcal S}): = {1 \over {16\pi G}}\oint\nolimits_{\mathcal S} {{{{{({k^0})}^2} - ({k^2} - {l^2})} \over {{k^0}}}\;d{\mathcal S}}.$$
(10.14)

Here, as above, k and l are the trace of the extrinsic curvatures of \({\mathcal S}\) in the physical spacetime corresponding to the outward-pointing spacelike and the future pointing timelike unit normals to \({\mathcal S}\), which are orthogonal to each other. Obviously, \({E_{\rm{K}}}({\mathcal S})\) is invariant with respect to the boost gauge transformations of the normals, because the ‘generator vector field’ of the energy is not linked to one of the normals of \({\mathcal S}\). A remarkable property of this procedure is that, for round spheres in the Schwarzschild solution, the choice πabtatb = k0/(16πG), πabtaqbc = 0 (i.e., the flat spacetime values) reproduces the lapse of the correct Schwarzschild time [315]. For round spheres (see Section 4.2.1) Eq. (10.14) gives \({r \over {2G}}[1 - \exp (- 2\alpha)]\), which is precisely the standard round sphere expression (4.8). In particular [315], for the event horizon of the Schwarzschild solution it gives the expected value m/G. However, there exist spacelike topological two-spheres \({\mathcal S}\) in the Minkowski spacetime for which \({E_{\rm{K}}}({\mathcal S})\) is positive [401].
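The round sphere statement can be verified directly from Eq. (10.14). The sketch below assumes, consistently with the Brown-York round sphere value r(1 − exp(−α))/G quoted in Section 10.1.7, that on a round sphere of areal radius r one has k0 = 2/r, k = 2exp(−α)/r and l = 0 for the static observers, and that exp(−2α) = 1 − 2m/r in the Schwarzschild solution. The function names are hypothetical.

```python
import math

def kijowski_energy_round_sphere(r, exp_minus_alpha, G=1.0):
    """Evaluate Eq. (10.14) on a round sphere, where the integrand is constant."""
    k0 = 2.0 / r                        # flat-space reference value
    k = 2.0 * exp_minus_alpha / r       # physical value for static observers (assumed)
    l = 0.0
    integrand = (k0**2 - (k**2 - l**2)) / k0
    area = 4.0 * math.pi * r**2
    return integrand * area / (16.0 * math.pi * G)

def schwarzschild_exp_minus_alpha(r, m):
    """exp(-alpha) for the Schwarzschild solution, r >= 2m."""
    return math.sqrt(1.0 - 2.0 * m / r)

if __name__ == "__main__":
    m = 1.0
    for r in (2.0, 10.0, 1000.0):
        print(r, kijowski_energy_round_sphere(r, schwarzschild_exp_minus_alpha(r, m)))
```

With these inputs the expression reduces to (r/2G)[1 − exp(−2α)], which equals m/G for every r ≥ 2m, including the horizon value quoted above.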

Kijowski considered another Legendre transformation on the two-surface as well, and in the variation of the resulting Hamiltonian only the value on \({\mathcal S}\) of the variation of the metric γab appears. Thus, in this phase space the components of γab can be specified freely on \({\mathcal S}\), and Kijowski calls the value of the resulting Hamiltonian the ‘free energy’. Its form is

$${F_{\rm{K}}}({\mathcal S}): = {1 \over {8\pi G}}\oint\nolimits_{\mathcal S} {\left({{k^0} - \sqrt {{k^2} - {l^2}}} \right)\;d{\mathcal S}}.$$
(10.15)

In the special boost-gauge when l = 0 the ‘free energy’ \({F_{\rm{K}}}({\mathcal S})\) reduces to the Brown-York expression \({E_{{\rm{BY}}}}({\mathcal S})\) given by Eq. (10.9). \({F_{\rm{K}}}({\mathcal S})\) appears to have been rediscovered recently by Liu and Yau [338], and we discuss the properties of \({F_{\rm{K}}}({\mathcal S})\) further in Section 10.4. A more detailed discussion of the possible quasi-local Hamiltonians and the strategies to define the appropriate ‘quasi-symmetries’ of \({\mathcal S}\) are given in [316].

10.3 Epp’s expression

10.3.1 The general form of Epp’s expression

The Brown-York energy expression, based on the original flat space reference, has the highly undesirable property that it gives nonzero energy even in the Minkowski spacetime if the fleet of observers on the spherical \({\mathcal S}\) is chosen to be radially accelerating (see the second paragraph of Section 10.1.7). Thus, it would be a legitimate aim to reduce this extreme dependence of the quasi-local energy on the choice of the observers. One way of doing this is to formulate the quasi-local quantities in terms of boost-gauge invariant objects. Such a boost-gauge invariant geometric object is the length of the mean extrinsic curvature vector Qa of Section 4.1.2, which, in the notation of this section, is \(\sqrt {{k^2} - {l^2}}\). If Qa is spacelike or null, then this square root is real, and (apart from the reference term k0 in equation (10.9)) in the special case l = 0 it reduces to −8πG times the surface energy density of Brown and York. This observation led Epp to suggest

$${E_{\rm{E}}}({\mathcal S}): = {1 \over {8\pi G}}\oint\nolimits_{\mathcal S} {\left({\sqrt {{{({k^0})}^2} - {{({l^0})}^2}} - \sqrt {{k^2} - {l^2}}} \right)\;d{\mathcal S}}$$
(10.16)

as the general definition of the ‘invariant quasi-local energy’ [178]. Here, as in the Brown-York definition, k0 and l0 give the ‘reference term’ that should be fixed in a separate procedure. Note that it is \({E_{\rm{E}}}({\mathcal S})\) that is referenced and not the mean curvatures k and l, i.e., \({E_{\rm{E}}}({\mathcal S})\) is not the integral of \(\sqrt {{\varepsilon ^2} - j_\vdash^2}\). Apart from the fact that MΣ of Eq. (2.7) is associated with a three-surface, Epp’s invariant quasi-local energy expression appears to be analogous to MΣ rather than to MΣ[ξa] of Eq. (2.6) or to \({Q_{\mathcal S}}[{\bf{K}}]\) of Eq. (2.5). However, although at first sight \({E_{\rm{E}}}({\mathcal S})\) appears to be a quasi-local mass, it turns out in special situations that it behaves as an energy expression. In the ‘quasi-local rest frame’, i.e., in which l = 0, it reduces to the Brown-York expression, provided k is positive. Note that Qa must be spacelike to have a quasi-local rest frame. This condition can be interpreted as a very weak convexity condition on \({\mathcal S}\). In particular, k is not needed to be positive, only k2 > l2 is required. While EBY is sensitive to the sign of k, EE is not. Hence, \({E_{\rm{E}}}({\mathcal S})\) is not simply the value of the Brown-York expression in the quasi-local rest frame.

10.3.2 The definition of the reference configuration

The subtraction term in Eq. (10.16) is defined through an isometric embedding of \(({\mathcal S},{q_{ab}})\) into some reference spacetime instead of a three-space. This spacetime is usually Minkowski or anti-de Sitter spacetime. Since the two-surface data consist of the metric, the two extrinsic curvatures and the SO(1,1)-gauge potential, for given \(({\mathcal S},{q_{ab}})\) and ambient spacetime \(({M^0},g_{ab}^0)\) the conditions of the isometric embedding form a system of six equations for eight quantities, namely for the two extrinsic curvatures and the gauge potential Ae (see Section 4.1.2, and especially Eqs. (4.1) and (4.2)). Therefore, even a naïve function counting argument suggests that the embedding exists, but is not unique. To have uniqueness, additional conditions must be imposed. However, since Ae is a gauge field, one condition might be a gauge fixing in the normal bundle, and Epp’s suggestion is to require that the curvature of the connection one-form Ae in the reference spacetime and in the physical spacetime be the same [178]. Or, in other words, not only the intrinsic metric qab of \({\mathcal S}\) is required to be preserved in the embedding, but the whole curvature \({f^a}_{bcd}\) of the connection δe as well. In fact, in the connection δe on the spinor bundle \({{\bf{S}}^A}({\mathcal S})\) both the Levi-Civita and the SO(1,1) connection coefficients appear on an equal footing. (Recall that we interpreted the connection δe to be a part of the universal structure of \({\mathcal S}\).) With this choice of reference configuration \({E_{\rm{E}}}({\mathcal S})\) depends not only on the intrinsic two-metric qab of \({\mathcal S}\), but on the connection δe on the normal bundle as well.

Suppose that \({\mathcal S}\) is a two-surface in M such that k2 > l2 with k > 0, and, in addition, \(({\mathcal S},{q_{ab}})\) can be embedded into the flat three-space with k0 ≥ 0. Then there is a boost gauge (the ‘quasi-local rest frame’) in which \({E_{\rm{E}}}({\mathcal S})\) coincides with the Brown-York energy \({E_{{\rm{BY}}}}({\mathcal S},{t^a})\) in the particular boost-gauge ta for which taQa = 0. Consequently, every statement stated for the latter is valid for \({E_{\rm{E}}}({\mathcal S})\), and every example calculated for \({E_{{\rm{BY}}}}({\mathcal S},{t^a})\) is an example for \({E_{\rm{E}}}({\mathcal S})\) as well [178]. A clear and careful discussion of the potential alternative choices for the reference term, especially their potential connection with the angular momentum, is also given there.

10.3.3 The various limits

First, it should be noted that Epp’s quasi-local energy is vanishing in Minkowski spacetime for any two-surface, independent of any fleet of observers. In fact, if \({\mathcal S}\) is a two-surface in Minkowski spacetime, then the same physical Minkowski spacetime defines the reference spacetime as well, and hence, \({E_{\rm{E}}}({\mathcal S}) = 0\). For round spheres in the Schwarzschild spacetime it yields the result that EBY gave. In particular, for the horizon, it is 2m/G (instead of m/G), and at infinity it is m/G [178]. Thus, in particular, EE is also monotonically decreasing with r in Schwarzschild spacetime. The explicit calculation of Epp’s energy in Friedmann-Robertson-Walker spacetimes is given in [6].

Epp calculated the various limits of his expression as well [178]. In the large sphere limit, near spatial infinity, he recovered the Ashtekar-Hansen form of the ADM energy, and at future null infinity, the Bondi-Sachs energy. The technique that is used in the latter calculation is similar to that of [117]. In nonvacuum, in the small sphere limit, \({E_{\rm{E}}}({\mathcal S})\) reproduces the standard \({{4\pi} \over 3}{r^3}{T_{ab}}{t^a}{t^b}\) result, but the calculations for the vacuum case are not completed. The leading term is still probably of order r5, but its coefficient has not been calculated. Although in these calculations ta plays only the role of fixing the two-surfaces, as a result we get the energy seen by the observer ta instead of mass. This is why \({E_{\rm{E}}}({\mathcal S})\) is considered to be energy rather than mass. In the asymptotically anti-de Sitter spacetime (with the anti-de Sitter spacetime as the reference spacetime) EE gives zero. This motivated Epp to modify his expression to recover the mass parameter of the Schwarzschild-anti-de Sitter spacetime at infinity. The modified expression is, however, not boost-gauge invariant. Here the potential connection with the AdS/CFT correspondence is also discussed (see also [48]).

10.4 The expression of Liu and Yau

10.4.1 The Liu-Yau definition

Let \(({\mathcal S},{q_{ab}})\) be a spacelike topological two-sphere in spacetime such that the metric has positive scalar curvature. Then, by the embedding theorem, there is an isometric embedding of \(({\mathcal S},{q_{ab}})\) into the flat three-space, and this embedding is unique. Let k0 be the trace of the extrinsic curvature of \({\mathcal S}\) in this embedding, which is completely determined by qab and is necessarily positive. Let k and l be the trace of the extrinsic curvatures of \({\mathcal S}\) in the physical spacetime corresponding to the outward-pointing unit spacelike and future-pointing timelike normals, respectively. Then Liu and Yau define their quasi-local energy in [338] by

$${E_{{\rm{LY}}}}({\mathcal S}): = {1 \over {8\pi G}}\oint\nolimits_{\mathcal S} {\left({{k^0} - \sqrt {{k^2} - {l^2}}} \right)\,d{\mathcal S}}.$$
(10.17)

However, this is precisely Kijowski’s ‘free energy’ given by Eq. (10.15), \({E_{{\rm{LY}}}}({\mathcal S}) = {F_{\rm{K}}}({\mathcal S})\), and hence, we denote this by \({E_{{\rm{KLY}}}}({\mathcal S})\). Obviously, this is well defined only if, in addition to the usual convexity condition R > 0 for the intrinsic metric, k² ≥ l² also holds, i.e., the mean curvature vector Qa is spacelike or null. If k ≥ 0 then \({E_{{\rm{KLY}}}}({\mathcal S}) \geq {E_{{\rm{BY}}}}({\mathcal S},{t^a})\), where the equality holds for ta corresponding to the quasi-local rest frame (in the sense that it is orthogonal to the mean curvature vector of the two-surface: taQa = 0). The mean curvature mass of [11, 12] is precisely \({E_{{\rm{LY}}}}({\mathcal S})\) (see also Section 11.3.4).
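The inequality between the two expressions, and the boost-gauge invariance of k² − l², can be seen pointwise on the integrands. The sketch below assumes that under a boost of the normals with rapidity u the pair (k, l) transforms like the components of a vector in the normal bundle, i.e. k → k cosh u + l sinh u and l → k sinh u + l cosh u; the sign convention and the function names are assumptions made for the illustration.

```python
import math

def boost(k, l, u):
    """Transform (k, l) under a boost of the normals by rapidity u (assumed convention)."""
    return (k * math.cosh(u) + l * math.sinh(u),
            k * math.sinh(u) + l * math.cosh(u))

def integrands(k0, k, l):
    """Return the (KLY, Brown-York) integrands without the 1/(8 pi G) factor."""
    return k0 - math.sqrt(k * k - l * l), k0 - k

if __name__ == "__main__":
    R = 4.0                        # round sphere in a flat hyperplane of Minkowski spacetime
    k0 = k_rest = 2.0 / R
    for u in (0.0, 0.7, 1.5):
        k, l = boost(k_rest, 0.0, u)
        print(u, integrands(k0, k, l))
```

For the round sphere in Minkowski spacetime the first integrand stays zero (up to rounding) for every rapidity, while the Brown-York integrand becomes negative as soon as u ≠ 0. The choice cosh u = T/ρ corresponds to the boosted observers of the hyperboloid example in Section 10.1.7.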

Isolating the gauge invariant part of the SO(1,1) connection one-form, Liu and Yau defined a quasi-local angular momentum as follows [338]. Let α be the solution of the Poisson equation \(2{q^{ab}}{\delta _a}{\delta _b}\alpha = {\rm{Im}}(f)\) on \({\mathcal S}\), whose source is just the field strength of \(A_a\) (see Eq. (4.3)). This α is globally well defined on \({\mathcal S}\) and is unique up to addition of a constant. Then, define \({\gamma _a}: = {A_a} - {\varepsilon _a}^b{\delta _b}\alpha\) on the domain of the connection one-form \(A_a\), which is easily seen to be closed. Assuming the space and time orientability of the spacetime, \(A_a\) is globally defined on \({\mathcal S} \approx {S^2}\), and hence, by \(H^1(S^2) = 0\) the one-form \(\gamma_a\) is exact: \(\gamma_a = \delta_a\gamma\) for some globally defined real function γ on \({\mathcal S}\). This function is unique up to an additive constant. Therefore, \({A_a} = {\varepsilon _a}^b{\delta _b}\alpha + {\delta _a}\gamma\), where the first term is gauge invariant, while the second represents the gauge content of \(A_a\). Then for any rotation Killing vector \(K^{0i}\) of the flat three-space Liu and Yau define the quasi-local angular momentum by

$${J_{{\rm{LY}}}}({\mathcal S},{K^{0i}}): = {1 \over {8\pi G}}\oint\nolimits_{\mathcal S} {\varphi _{\ast}^{- 1}({K^{0i}}\Pi _i^{0a})\;{\varepsilon _a}^b({\delta _b}\alpha)\,d{\mathcal S}}.$$
(10.18)

Here \(\varphi: {\mathcal S} \rightarrow {{\rm{{\mathbb R}}}^3}\) is the embedding and \(\Pi _i^{0a}\) is the projection to the tangent planes of \(\varphi ({\mathcal S})\) in \({\mathbb R}^3\). Thus, in contrast to the Brown-York definition for the angular momentum (see Eqs. (10.4), (10.5), (10.6), (10.7), and (10.8)), in \({J_{{\rm{LY}}}}({\mathcal S},{K^{0i}})\) only the gauge invariant part \({\varepsilon _a}^b{\delta _b}\alpha\) of the gauge potential \(A_a\) is used, and its generator vector field is the pullback to \({\mathcal S}\) of the rotation Killing vector of the flat three-space.
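
To see why only α carries gauge-invariant information, it may help to take the curl of the decomposition \({A_a} = {\varepsilon _a}^b{\delta _b}\alpha + {\delta _a}\gamma\) given above. The following is only a sketch: it assumes that the boost gauge transformation acts as \(A_a \mapsto A_a + \delta_a u\) for a smooth function u on \({\mathcal S}\) (the sign of u is a convention), and the overall sign in the last step depends on the signature and orientation conventions, which are not restated here.

$$\varepsilon^{ab}\delta_a A_b = \varepsilon^{ab}\varepsilon_b{}^c\,\delta_a\delta_c\alpha + \varepsilon^{ab}\delta_a\delta_b\gamma = \pm\, q^{ac}\delta_a\delta_c\alpha .$$

The γ term drops out because \(\delta_a\) is torsion free, so the field strength of \(A_a\) fixes α up to an additive constant, in line with the Poisson equation above. Under \(A_a \mapsto A_a + \delta_a u\) the left-hand side is unchanged, hence α is unchanged and only γ is shifted, \(\gamma \mapsto \gamma + u\).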

For a definition of the Kijowski-Liu-Yau energy as a quasi-local energy operator in loop quantum gravity, see [565].

10.4.2 The main properties of \({E_{{\rm{KLY}}}}({\mathcal S})\)

The most important property of the quasi-local energy (10.17) is its positivity. Namely [338], let Σ be a compact spacelike hypersurface with smooth boundary ∂Σ, consisting of finitely many connected components \({{\mathcal S}_1}, \ldots, {{\mathcal S}_k}\) such that each of them has positive intrinsic curvature. Suppose that the matter fields satisfy the dominant energy condition on Σ. Then \({E_{{\rm{KLY}}}}(\partial \Sigma): = \sum\nolimits_{i = 1}^k {{E_{{\rm{KLY}}}}({{\mathcal S}_i})}\) is strictly positive unless the spacetime is flat along Σ. In this case ∂Σ is necessarily connected. The proof is based on the use of Jang’s equation [289], by means of which the general case can be reduced to the results of Shi and Tam in the time-symmetric case [458], stated in Section 10.1.7 (see also [566]). This positivity result is generalized in [339]. Namely, \({{E_{{\rm{KLY}}}}({{\mathcal S}_i})}\) is shown to be non-negative for all i = 1, …, k, and if \({{E_{{\rm{KLY}}}}({{\mathcal S}_i}) = 0}\) for some i, then the spacetime is flat along Σ and ∂Σ is connected. (In fact, since \({E_{{\rm{KLY}}}}(\partial \Sigma)\) depends only on ∂Σ but is independent of the actual Σ, if the energy condition is satisfied on the domain of dependence D(Σ), then \({E_{{\rm{KLY}}}}(\partial \Sigma) = 0\) implies the flatness of the spacetime along every Cauchy surface for D(Σ), i.e., the flatness of the whole domain of dependence as well.) A potential spinorial proof of the positivity of \({{E_{{\rm{KLY}}}}({{\mathcal S}_i})}\) is suggested in [12]. This is based on the use of the Nester-Witten 2-form and a Witten type argumentation. However, the spinor field solving the Witten equation on the spacelike hypersurface Σ would have to satisfy a nonlinear boundary condition.

If \({\mathcal S}\) is an apparent horizon, i.e., l = ±k, then \({{E_{{\rm{KLY}}}}({\mathcal S})}\) is just the integral of \(k^0/(8\pi G)\). Then, by the Minkowski inequality for the convex surfaces in the flat three-space (see, e.g., [519]) one has

$$E _{{\rm{KLY}}}\;({\mathcal S}) = {1 \over {8\pi G}}\oint\nolimits_{\mathcal S} {{k^0}\,d{\mathcal S} \geq} {1 \over {8\pi G}}\sqrt {16\pi {\rm{Area}}({\mathcal S})} = 2\sqrt {{{{\rm{Area}}({\mathcal S})} \over {16\pi {G^2}}}},$$

i.e., it is not less than twice the irreducible mass of the horizon. For round spheres \({{E_{{\rm{KLY}}}}({\mathcal S})}\) coincides with \({{E_E}({\mathcal S})}\), and hence, it does not reduce to the standard round sphere expression (4.8). In particular, for the event horizon of the Schwarzschild black hole it is 2m/G. (For a more detailed discussion, and, in particular, the interpretation of \({{E_{{\rm{KLY}}}}({\mathcal S})}\) in the spherically-symmetric context, see [400].) \({{E_{{\rm{KLY}}}}({\mathcal S})}\) was calculated for small spheres both in nonvacuum and vacuum, and for large spheres near the future null infinity in [575]. In the leading order in nonvacuum we get the expected result \({{4\pi} \over 3}{r^3}{T_{ab}}{t^a}{t^b}\) (see Eq. (4.9)), but in vacuum, in addition to the expected Bel-Robinson ‘energy’, there are extra terms in the leading \(r^5\) order. As could be expected, at null infinity \({{E_{{\rm{KLY}}}}({\mathcal S})}\) reproduces the Bondi energy.
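
As a consistency check of the round-sphere statements above, Eq. (10.17) can be evaluated by hand for a round sphere \({\mathcal S}_r\) of area radius r in the t = const. hypersurface of the Schwarzschild spacetime with mass parameter m. The sketch assumes the standard values \(k^0 = 2/r\) for the round sphere in flat three-space, \(k = (2/r)\sqrt{1 - 2m/r}\) in the time-symmetric slice, and l = 0; these values are not spelled out in the text and are inserted here only for illustration.

$$E_{\rm KLY}({\mathcal S}_r) = {1 \over {8\pi G}}\,4\pi r^2\left({2 \over r} - {2 \over r}\sqrt{1 - {{2m} \over r}}\right) = {r \over G}\left(1 - \sqrt{1 - {{2m} \over r}}\right).$$

At r = 2m this gives 2m/G. Since \({\rm Area}({\mathcal S}) = 16\pi m^2\) there, the lower bound in the Minkowski inequality is \(2\sqrt{16\pi m^2/(16\pi G^2)} = 2m/G\) as well, so under these assumptions the round horizon saturates the bound. For large r the expansion \((r/G)\left(m/r + m^2/(2r^2) + \ldots\right)\) tends to m/G.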

However, \({{E_{{\rm{KLY}}}}({\mathcal S})}\) can be positive even if \({\mathcal S}\) is in the Minkowski spacetime. In fact, for a given intrinsic metric \(q_{ab}\) on \({\mathcal S}\) (with positive scalar curvature) \({\mathcal S}\) can be embedded into the flat \({\mathbb R}^3\); this embedding is unique, and the trace of the extrinsic curvature \(k^0\) is determined by \(q_{ab}\). On the other hand, the isometric embedding of \({\mathcal S}\) in the Minkowski spacetime is not unique. The equations of the embedding (i.e., the Gauss, Codazzi-Mainardi, and Ricci equations) form a system of six equations for the six components of the two extrinsic curvatures \(k_{ab}\) and \(l_{ab}\) and the two components of the SO(1,1) gauge potential \(A_e\). Thus, even if we impose a gauge condition for the connection one-form \(A_e\), we have only six equations for the seven unknown quantities, leaving enough freedom to deform \({\mathcal S}\) (with given, fixed intrinsic metric) in the Minkowski spacetime to get positive Kijowski-Liu-Yau energy. Indeed, specific two-surfaces in the Minkowski spacetime are given in [401], for which \({{E_{{\rm{KLY}}}}({\mathcal S}) > 0}\). Moreover, it is shown in [361] that the Kijowski-Liu-Yau energy for a closed two-surface \({\mathcal S}\) in Minkowski spacetime is strictly positive unless \({\mathcal S}\) lies in a spacelike hyperplane. On the applicability of \({{E_{{\rm{KLY}}}}({\mathcal S})}\) in the formulation and potential proof of Thorne’s hoop conjecture see Section 13.2.2.

10.4.3 Generalizations of the original construction

In the definition of \({E_{{\rm{KLY}}}}({\mathcal S})\) one of the assumptions is the positivity of the scalar curvature of the intrinsic metric on the two-surface \({\mathcal S}\). Thus, it is natural to ask if this condition can be relaxed and whether or not the quasi-local mass can be associated with a wider class of surfaces. Moreover, though in certain circumstances \({E_{{\rm{KLY}}}}({\mathcal S})\) behaves as energy (see [400, 575]), it is the (renormalized) integral of the length of the mean curvature vector, i.e., it is analogous to mass (compare with Eq. (2.7)). Hence, it is natural to ask if an energy-momentum four-vector can be introduced in this way. In addition, in the calculation of the large sphere limit of \({E_{{\rm{KLY}}}}({\mathcal S})\) in asymptotically anti-de Sitter spacetimes it seems natural to choose the reference configuration by embedding \({\mathcal S}\) into a hyperbolic rather than Euclidean three-space. These issues motivate the following generalization [542] of the Kijowski-Liu-Yau expression.

One of the key ideas is that two-surfaces with spherical topology and scalar curvature that is bounded from below by a negative constant, i.e., \(R > -2\kappa^2\), can be isometrically embedded in a unique way into the hyperbolic space \(\mathbb H_{- {\kappa ^2}}^3\) with constant sectional curvature \(-\kappa^2\), and hence, this embedding can be (and in fact is) used to define the reference configuration. Let \(k^0\) denote the mean curvature of \({\mathcal S}\) in this embedding, where the hyperbolic space \(\mathbb H_{- {\kappa ^2}}^3\) is thought of as a spacelike hypersurface with constant negative curvature in the Minkowski spacetime \({\mathbb R^{1,3}}\). Then the main result is that, assuming that the mean curvature vector \({Q^a}_{ab}\) of \({\mathcal S}\) in the spacetime is spacelike, there exists a function \({W^{\underline a}}:{\mathcal S} \rightarrow {{\mathbb R}^{1,3}}\), depending only on the length \(\vert {Q^a}_{ab}\vert = \sqrt {{k^2} - {l^2}}\) of the mean curvature vector and the embedding of \({\mathcal S}\) into \({\mathbb R^{1,3}}\), such that the four integrals

$$\int\nolimits_{\mathcal S} {\left({{k^0} - \sqrt {{k^2} - {l^2}}} \right){W^{\underline a}}\,d{\mathcal S}}$$
(10.19)

form a future-pointing nonspacelike vector in \({\mathbb R^{1,3}}\). The functions \({W^{\underline a}},{\underline a} = 0, \ldots, 3\), are solutions of a parabolic equation and are related to the norm of the Killing spinors on \(\mathbb H_{- {\kappa ^2}}^3\). If κ → 0 then \({W^{\underline a}}\) tend to the components of a constant vector field. Expression (10.19) can be interpreted as a comparison theorem for the total mean curvature of \({\mathcal S}\) in the physical spacetime and in the hyperboloid \(\mathbb H_{- {\kappa ^2}}^3 \subset {\mathbb R^{1,3}}\). A similar result is proven in the Riemannian case, i.e., when \({\mathcal S}\) is considered to be the boundary of a compact Riemannian three-manifold \((\Sigma, h_{ab})\), and in (10.19) the length of the mean curvature vector is replaced by the mean curvature k of \({\mathcal S}\) in Σ. Comparing (10.19) with the expression of the Bondi-Sachs energy-momentum (4.14) or with Eq. (6.2), the integrals can also be interpreted as the components of a quasi-local energy-momentum four-vector.

The proof of the nonspacelike nature of (10.19) is based on a Witten type argumentation, in which ‘the mass with respect to a Dirac spinor \(\phi_0\) on \({\mathcal S}\)’ takes the form of an integral of \(({k^0} - \sqrt {{k^2} - {l^2}})\) weighted by the norm of \(\phi_0\). Thus, the norm of \(\phi_0\) appears to be a nontrivial lapse function. The suggestion of [580] for a quasi-local mass-like quantity is based on an analogous expression. Let \({\mathcal S}\) be the boundary of some spacelike hypersurface Σ on which the intrinsic scalar curvature is positive, let us isometrically embed \({\mathcal S}\) into the Euclidean three-space, and let \(\phi_0\) be the pull back to \({\mathcal S}\) of a constant spinor field. Suppose that the dominant energy condition is satisfied on Σ, and consider the solution ϕ of the Witten equation on Σ with one of the chiral boundary conditions \(\Pi^\pm(\phi - \phi_0) = 0\), where \(\Pi^\pm\) are the projections to the space of the right/left handed Dirac spinors, built from the projections κ±AB of Section 4.1.7. Then, by the Sen-Witten identity, a positive definite boundary expression is introduced, and interpreted as the ‘quasi-local mass’ associated with \({\mathcal S}\). In contrast to Brown-York type expressions, this mass, associated with the two-spheres of radius r in the t = const. hypersurfaces in Schwarzschild spacetime, is an increasing function of the radial coordinate, and tends to the ADM mass. In general, however, this limit is \(E_{\rm{ADM}} - \vert P_{\rm{ADM}}\vert\), rather than the expected ADM mass. This construction is generalized in [581] by embedding \({\mathcal S}\) into some \(\mathbb H_{- {\kappa ^2}}^3\) instead of \({\mathbb R}^3\). A modified version of these constructions is given in [582], which tends to the ADM energy and mass at spatial infinity.

Suggestion (11.12), due to Anco [11], can also be considered as a generalization of the Kijowski-Liu-Yau mass.

10.5 The expression of Wang and Yau

The new quasi-local energy (in fact, energy-momentum) expression of Wang and Yau [544] (and for a review, see also [540]) is based on the ‘renormalized’ form of the ‘natural’ Hamiltonian

$$H\,[{\bf{K}}] = {1 \over {8\pi G}}\int\nolimits_\Sigma {{K^a}{G_{ab}}{1 \over {3!}}{\varepsilon ^b}_{cde} - {1 \over {8\pi G}}\oint\nolimits_{\partial \Sigma} {{K^a}({}^ \perp {\varepsilon _{ab}}{Q_c}^{cb} + {A_a})\;d{\mathcal S}}.}$$
(10.20)

(See also Eq. (11.11), and compare with Eq. (8.1): apart from the SO(1, 3) gauge-dependent terms, this boundary expression is just the two-surface integral of the Nester-Witten 2-form.) Thus, while the expressions based on Eq. (10.17) are analogous to Eq. (2.7), i.e., the two-surface integrals of locally-defined mass density, the expressions based on Eq. (10.20) are analogous to Eq. (2.5) (or rather Eq. (2.8)), i.e., the charge integrals ‘indexed’ by a vector field \(K^a\).

Since \(A_e\) is boost-gauge dependent and Eq. (10.20) in itself does not yield, e.g., the correct ADM energy in asymptotically flat spacetime, a boost gauge and a restriction on the vector field \(K^a\) and/or a ‘renormalization’ of Eq. (10.20) (in the form of an appropriate reference term) must be given. Wang and Yau suggest that one determine these by embedding the spacelike two-surface \({\mathcal S}\) isometrically into the Minkowski spacetime in an appropriate way.

Thus, suppose that there is an isometric embedding \(i:{\mathcal S} \rightarrow {\mathbb R^{1,3}}\), and let us fix a constant future-pointing unit timelike vector field \({T^{\underline a}}\) in \({\mathbb R^{1,3}}\). This \({T^{\underline a}}\) defines a global orthonormal frame field \(\{{}_0{t^{\underline a}},{}_0{v^{\underline a}}\}\) in the normal bundle of \(i({\mathcal S}) \subset {\mathbb R^{1,3}}\) by requiring \({}_0{v^{\underline a}}{T_{\underline a}} = 0\), and let us denote the mean extrinsic curvature vector of this embedding by \({}_0{Q^{\underline a}}_{{\underline a}{\underline b}}\). Then, supposing that the mean extrinsic curvature vector \({Q^a}_{ab}\) of \({\mathcal S}\) in the physical spacetime is spacelike, there is a uniquely-determined global orthonormal frame field \(\{{{\bar t}^a},{{\bar v}^a}\}\) in the normal bundle of \({\mathcal S} \subset M\) such that \({Q^a}_{ab}{\bar t^b} = {}_0{Q^{\underline a}}_{{\underline a}{\underline b}}\,{}_0{t^{\underline b}}\). This fixes the boost gauge in \(N{\mathcal S}\), and, in addition, makes it possible to identify the normal bundle of \({\mathcal S}\) in M and the normal bundle of \(i({\mathcal S})\) in \({\mathbb R^{1,3}}\) via the identification \({}_0{t^{\underline a}} \mapsto {{\bar t}^a}\), \({}_0{v^{\underline a}} \mapsto {{\bar v}^a}\). This, together with the natural identification of the tangent bundle \(T{\mathcal S}\) of \({\mathcal S}\) and the tangent bundle \(Ti({\mathcal S})\) of \(i({\mathcal S})\) yields a natural identification of the Lorentzian vector bundles over \({\mathcal S}\) in M and over \(i({\mathcal S})\) in \({\mathbb R^{1,3}}\). Therefore, any vector (and tensor) field on \(i({\mathcal S})\) yields a vector (tensor) field on \({\mathcal S}\).
In particular, if \({T^{\underline a}} = {}_0N\,{}_0{t^{\underline a}} + {}_0{N^{\underline a}}\), then \({}_0N^{\underline a}\) is a tangent of \(i({\mathcal S})\), and hence, there is a uniquely determined tangent \({}_0N^a\) of \({\mathcal S}\) such that \({}_0{N^{\underline a}} = {i_\ast}({}_0{N^a})\). Consequently \({T^{\underline a}}\) can be identified with the vector field \({}_0N\bar t^a+{}_0N^a\) on \({\mathcal S}\). Similarly, the connection one-form \({}_0{A_{\underline a}}\) on the normal bundle (in the boost gauge \(\{{}_0{t^{\underline a}},{}_0{v^{\underline a}}\}\)) can be pulled back along i to a one-form \({}_0A_a\) on \({\mathcal S}\). Then, denoting by \({}_0k\) and \({\bar k}\) the mean curvature of \(i({\mathcal S})\) and \({\mathcal S}\) in the direction \({}_0{v^{\underline a}}\) and \({{\bar v}^a}\), respectively, Wang and Yau [544] define the quasi-local energy with respect to the pair \((i,{T^{\underline a}})\) by

$${E_{{\rm{W}}\,{\rm{Y}}}}({\mathcal S};i,{T^{\underline a}}): = {1 \over {8\pi G}}\oint\nolimits_{\mathcal S} {\left({{{({}_0k - \bar k)}_0}N - {{({}_0{A_e} - {{\bar A}_e})}_0}{N^e}} \right)\;d{\mathcal S}}.$$
(10.21)

Here \({\mathcal S}\) is assumed only to be isometrically embeddable into \({\mathbb R^{1,3}}\) and to have a spacelike mean curvature vector in M. Note that this energy still depends on the pair \((i,{T^{\underline a}})\).

To prove, e.g., the positivity of this energy, or to ensure that in flat spacetime the energy be zero, further conditions must be satisfied. Wang and Yau formulate these conditions in the notion of admissible pairs \((i,{T^{\underline a}})\): \(i({\mathcal S})\) should have a convex shadow in the direction \({T^{\underline a}}\), \(i({\mathcal S})\) must be the boundary of some spacelike hypersurface in \({\mathbb R^{1,3}}\) on which the Dirichlet boundary value problem for the Jang equation can be solved with the time function τ discussed in Section 4.1.3, and the connection 1-form and the mean curvature in a certain gauge must satisfy an inequality. (For the precise definition of the admissible pairs see [544]; for the geometrical background see [543] and Section 4.1.3.) Then it is shown that if the dominant energy condition holds and \({\mathcal S}\) has a spacelike mean curvature vector, then for the admissible pairs the quasi-local energy (10.21) is non-negative. Therefore, if the set of the admissible pairs is not empty (e.g., when the scalar curvature of \(({\mathcal S},{q_{ab}})\) is positive), then the infimum \({m_{{\rm{WY}}}}({\mathcal S})\) of \({E_{{\rm{WY}}}}({\mathcal S};i,{T^{\underline a}})\) among all admissible pairs is non-negative, and is called the quasi-local mass. If this infimum is achieved by the pair \((i,{T^{\underline a}})\), i.e., by an embedding i and a timelike \({T^{\underline a}}\), then \({P^{\underline a}}: = {m_{{\rm{WY}}}}({\mathcal S}){T^{\underline a}}\) is called the quasi-local energy-momentum, which is then future pointing and timelike. It is still an open question whether, if the quasi-local mass \({m_{{\rm{WY}}}}({\mathcal S})\) is vanishing, the domain of dependence D(Σ) of the spacelike hypersurface Σ with boundary \({\mathcal S}\) can be curved (e.g., a pp-wave geometry with pure radiation) or not. If not, then the quasi-local energy-momentum would be expected to be null.

The quasi-local energy-momentum associated with any two-surface in Minkowski spacetime with a convex shadow in some direction is clearly zero. The mass has been calculated for round spheres in the Schwarzschild spacetime. It is \(r(1 - \sqrt {1 - (2m/r)})/G\), and hence, for the event horizon it gives 2m/G. \({m_{{\rm{WY}}}}\) has been calculated for large spheres and it has the expected limits at the spatial and null infinities [545, 142]. Also, it has the correct small sphere limit both in nonvacuum and vacuum [544]. Upper and lower estimates of the Wang-Yau energy are derived, and its critical points are investigated in [362] and [363], respectively. On the applicability of \({E_{{\rm{WY}}}}({\mathcal S})\) in the formulation and potential proof of Thorne’s hoop conjecture see Section 13.2.2. For a recent review of the results in connection with the Wang-Yau energy, see [541].
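
The round-sphere value quoted above is easy to explore numerically. The following is a minimal sketch, not part of any of the cited works: the function name `round_sphere_mass` and the choice G = 1 are assumptions made only for illustration, and the code simply evaluates \(r(1 - \sqrt {1 - (2m/r)})/G\) for r ≥ 2m.

```python
import math


def round_sphere_mass(r, m, G=1.0):
    """Evaluate r * (1 - sqrt(1 - 2m/r)) / G for a round sphere of area radius r.

    Hypothetical helper: m is the Schwarzschild mass parameter and the
    expression is only meaningful on or outside the horizon, r >= 2m.
    """
    if m < 0:
        raise ValueError("mass parameter m must be non-negative")
    if r < 2.0 * m or r <= 0.0:
        raise ValueError("formula is evaluated only for r >= 2m and r > 0")
    return r * (1.0 - math.sqrt(1.0 - 2.0 * m / r)) / G


if __name__ == "__main__":
    m = 1.0
    # Sample radii from the horizon outwards (in units of m, with G = 1).
    for r in (2.0, 3.0, 10.0, 100.0, 1.0e6):
        print(f"r = {r:>10.1f}   value = {round_sphere_mass(r, m):.9f}")
```

With these assumptions the value at r = 2m is 2m/G, it decreases monotonically with r, and it approaches m/G for large r, in line with the statement that the expected limit is recovered at infinity.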

11 Towards a Full Hamiltonian Approach

The Hamilton-Jacobi method is only one possible strategy for defining the quasi-local quantities in a large class of approaches, called the Hamiltonian or canonical approaches. Thus, there is a considerable overlap between the various canonical methods, and hence, the cutting of the material into two parts (Section 10 and Section 11) is, in some sense, artificial. In Section 10 we reviewed those approaches that are based on the analysis of the action, while in this section we discuss those that are based primarily on the analysis of the Hamiltonian in the spirit of Regge and Teitelboim [433] (see Footnote 20).

By a full Hamiltonian analysis we mean a detailed study of the structure of the quasi-local phase space, including the constraints, the smearing fields, the symplectic structure and the Hamiltonian itself, according to the standard, or some generalized, Hamiltonian scenarios, in the traditional 3 + 1 or in the fully Lorentz-covariant form, or even in the 2 + 2 form, using the metric or triad/tetrad variables (or even the Weyl or Dirac spinors). In the literature of canonical general relativity (at least in the asymptotically flat context) there are examples for all these possibilities, and we report on the quasi-local investigations on the basis of the decomposition they use. Since the 2 + 2 decomposition of the spacetime is less known, we also summarize its basic idea.

11.1 The 3 + 1 approaches

There is a lot of literature on the canonical formulation of general relativity both in the traditional ADM and the Møller tetrad (or, recently, the closely related complex Ashtekar) variables. Thus, it is quite surprising how little effort has been spent systematically quasi-localizing them. One motivation for the quasi-localization of the ADM-Regge-Teitelboim analysis came from the need for a deeper understanding of the dynamics of subsystems of the universe. In particular, such a systematic Hamiltonian formalism would shed new light on the basic results on the initial boundary value problem in general relativity, initiated by Friedrich and Nagy [202] (see also [201, 554, 555] and, for some recent reviews, see [435, 556] and references therein), and would yield the interpretation of their boundary conditions from a different perspective. Conversely, quasi-local Hamiltonian techniques could potentially be used to identify a large class of boundary conditions that are compatible with the evolution equation. (For a discussion of such a potential link between the two approaches, see, e.g., [502, 16].) Moreover, in the quasi-local Hamiltonian approach we might hope to be able to associate nontrivial observables (and, in particular, conserved quantities) with localized systems in a natural way.

Another motivation is to try to provide a solid classical basis for the microscopic understanding of black hole entropy [47, 46, 123]: What are the microscopic degrees of freedom behind the phenomenological notion of black hole entropy? Since the aim of the present paper is to review the construction of the quasi-local quantities in classical general relativity, we discuss only the classical two-surface observables by means of which the ‘quantum edge states’ on the black hole event horizons were intended to be constructed.

11.1.1 The quasi-local constraint algebra and the basic Hamiltonian

If Σ, the three-manifold on which the ADM canonical variables \({h_{ab}},{{\tilde p}^{ab}}\) are defined, has a smooth boundary \({\mathcal S}: = \partial \Sigma\), then the usual vacuum constraints

$$C\,[N,{N^a}]: = - \int\nolimits_\Sigma {\left\{{N\left[ {{{{1 \over {16\pi G}}}\,^3}R\sqrt {\vert h \vert} + {{16\pi G} \over {\sqrt {\vert h \vert}}}\,\left({{1 \over 2}{{({{\tilde p}^{ab}}{h_{ab}})}^2} - {{\tilde p}^{ab}}{{\tilde p}_{ab}}} \right)} \right] + 2{N_a}{D_b}{{\tilde p}^{ab}}} \right\}{d^3}x}$$
(11.1)

are differentiable with respect to the canonical variables if the fields N and \(N^a\) are vanishing on \({\mathcal S}\) and the area 2-form on \({\mathcal S}\), induced from the configuration variable \(h_{ab}\), is fixed (see Footnote 21). Under these conditions the constraint functions close to a Poisson algebra \({\mathcal C}\) (the ‘quasi-local constraint algebra’); moreover, the evolution equations preserve these boundary conditions [499]. However, the evolution in the spacetime corresponding to lapses and shifts that are vanishing on the two-boundary \({\mathcal S}\) yields new Cauchy surfaces in the same Cauchy development D(Σ) of Σ, and during such an evolution the boundary \({\mathcal S}\) remains pointwise fixed.

A similar analysis [499] shows that the basic Hamiltonian

$${H_0}[N,{N^a}]: = C\,[N,{N^a}] + \int\nolimits_\Sigma {2{D_a}({{\tilde p}^{ab}}{h_{bc}}{N^c})\,{d^3}x},$$
(11.2)

coming from the Lagrangian \({1 \over {16\pi G}}(R + {\chi ^{ab}}{\chi _{ab}} - {\chi ^2})\), is differentiable with respect to the canonical variables if N is vanishing on \({\mathcal S}\), \(N^a\) is tangent to \({\mathcal S}\) on \({\mathcal S}\), and the area 2-form on \({\mathcal S}\) is fixed. If, in addition, the shift is required to be divergence-free with respect to the connection \(\delta_e\) on \({\mathcal S}\), i.e., \(\delta_eN^e = 0\), then the evolution equations preserve these boundary conditions, the basic Hamiltonians form a closed Poisson algebra \({\mathcal H_0}\) in which \({\mathcal C}\) is an ideal, and the evaluation of the basic Hamiltonians on the constraint surface,

$$O\,[{N^a}] = - {1 \over {8\pi G}}\oint\nolimits_{\mathcal S} {{N^a}{A_a}\,d{\mathcal S}},$$
(11.3)

defines a Lie algebra homomorphism from the Lie algebra of the \(\delta_e\)-divergence-free vector fields on \({\mathcal S}\) to the quotient Lie algebra \({\mathcal H_0}/{\mathcal C}\). The evolution with such lapses and shifts in the spacetime is a mapping of the domain of dependence D(Σ) onto itself, keeping the boundary \({\mathcal S}\) as a submanifold fixed, but not pointwise.

The condition that the area 2-form \(\varepsilon_{ab}\) should be fixed appears to be a part of the ‘ultimate’ boundary condition for the canonical variables. In fact, in a systematic quasi-local Hamiltonian analysis boundary terms appear in the calculation of the Poisson bracket of two Hamiltonians also, which we called Poisson boundary terms in Section 3.3.3. Nevertheless, as we already mentioned there, the quasi-local Hamiltonian analysis of a single real scalar field in Minkowski space shows that these boundary terms represent the infinitesimal flow of energy-momentum and relativistic angular momentum. Thus, they must be gauge invariant [502]. Assuming that in general relativity the Poisson boundary terms should have a similar interpretation, their gauge invariance should be expected, and the condition of their gauge invariance can be determined. It is precisely the condition on the lapse and shift that the spacetime vector field \(K^a = Nt^a + N^a\) built from them on the 2-surface must be divergence free there with respect to the connection \(\Delta_a\) of Section 4.1.2, i.e., \(\Delta_aK^a = 0\). However, this is precisely the condition under which the evolution equations preserve the boundary condition \(\delta\varepsilon_{ab} = 0\). It might also be worth noting that this condition for the lapse and shift is just one of the ten components of the Killing equation: \(0 = 2\Delta_aK^a = q^{ab}(\nabla_aK_b + \nabla_bK_a)\). (For the details, see [502].)
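
The condition \(\Delta_aK^a = 0\) can be unpacked in terms of the lapse and the shift. The following is only a sketch: it assumes that \(t^a\) is the unit timelike normal of Σ at \({\mathcal S}\), so that \(q^{ab}t_b = 0\), and that \(N^a\) is tangent to \({\mathcal S}\) there; the sign relating \(q^{ab}\nabla_at_b\) to the mean curvature of \({\mathcal S}\) in the direction \(t^a\) depends on conventions that are not restated here.

$$\Delta_aK^a = q^{ab}\nabla_a(Nt_b + N_b) = N\,q^{ab}\nabla_at_b + \delta_aN^a .$$

The term \(q^{ab}t_b\nabla_aN\) drops out by \(q^{ab}t_b = 0\), and \(q^{ab}\nabla_aN_b = \delta_aN^a\) holds for \(N^a\) tangent to \({\mathcal S}\). Thus, the boundary conditions N = 0 on \({\mathcal S}\) and \(\delta_eN^e = 0\) used for the basic Hamiltonian form the special case in which both terms vanish separately.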

It should be noted that the area 2-form on the boundary 2-surface \({\mathcal S}\) appears naturally in connection with the general symplectic structure on the ADM variables on a compact spacelike hypersurface Σ with smooth boundary \({\mathcal S}\). In fact, in [229] an identity is derived for the variation of the ADM canonical variables on Σ and of various geometrical quantities on \({\mathcal S}\). Examples are also given to illustrate how the resulting ‘quasi-local energy’ depends on the choice of the boundary conditions.

For the earlier investigations see [47, 46, 123], where stronger boundary conditions, namely fixing the whole three-metric \(h_{ab}\) on \({\mathcal S}\) (but without the requirement \(\delta_eN^e = 0\)), were used to ensure the functional differentiability.

11.1.2 The two-surface observables

To understand the meaning of the observables (11.3), recall that any vector field \(N^a\) on Σ generates a diffeomorphism, which is an exact (gauge) symmetry of general relativity, and the role of the momentum constraint \(C[0,N^a]\) is just to generate this gauge symmetry in the phase space. However, the boundary \({\mathcal S}\) breaks the diffeomorphism invariance of the system, and hence, on the boundary the diffeomorphism gauge motions yield the observables \(O[N^a]\) and the gauge degrees of freedom give rise to physical degrees of freedom, making it possible to introduce edge states [47, 46, 123].

Analogous investigations were done by Husain and Major in [281]. Using Ashtekar’s complex variables [30] they determine all the local boundary conditions for the canonical variables \(A_a^{\rm{i}}\), \(\tilde E_{\rm{i}}^a\) and for the lapse N, the shift \(N^a\), and the internal gauge generator \(N^{\rm{i}}\) on \({\mathcal S}\) that ensure the functional differentiability of the Gauss, the diffeomorphism, and the Hamiltonian constraints. Although there are several possibilities, Husain and Major discuss the two most significant cases. In the first case the generators N, \(N^a\), and \(N^{\rm{i}}\) are vanishing on \({\mathcal S}\), and thus there are infinitely many two-surface observables, both from the diffeomorphism and the Gauss constraints, but no observables from the Hamiltonian constraint. The structure of these observables is similar to that of those coming from the ADM diffeomorphism constraint above. The other case considered is when the canonical momentum \(\tilde E_{\rm{i}}^a\) (and hence, in particular, the three-metric) is fixed on the two-boundary. Then the quasi-local energy could be an observable, as in the ADM analysis above.

All of the papers [47, 46, 123, 281] discuss the analogous phenomenon of how the gauge freedoms become true physical degrees of freedom in the presence of two-surfaces on the two-surfaces themselves in the Chern-Simons and BF theories. Weakening the boundary conditions further (allowing certain boundary terms in the variation of the constraints), a more general algebra of ‘observables’ can be obtained [125, 409]. They form the Virasoro algebra with a central charge. (In fact, Carlip’s analysis in [125] is based on the covariant Noether-charge formalism below.) Since this algebra is well known in conformal field theories, this approach might be a basis for understanding the microscopic origin of the black hole entropy [124, 125, 126, 409, 127]. However, this quantum issue is beyond the scope of the present review.

Returning to the discussion of \(O[N^a]\) above, note first that, though \(A_e\) is a gauge potential, by \(\delta_eN^e = 0\) the integral (11.3) is boost gauge invariant. Without this condition, Eq. (11.3) would give a potentially reasonable physical quantity only if the boost gauge on \({\mathcal S}\) were geometrically given, e.g., when \({\mathcal S}\) were a leaf of a physically-distinguished foliation of a physically-distinguished spacelike or timelike hypersurface [39]. In particular, the angular momentum of Brown and York [121] also takes the form (11.3), and is well defined (because \(N^a\) is assumed to be a Killing vector of the intrinsic geometry of \({\mathcal S}\)). (In the angular momentum of Liu and Yau [338] only the gauge invariant part of \(A_e\) is present in Eq. (11.3) instead of \(A_e\) itself.) Similarly, the expressions in [47, 571] can also be rewritten into the form (11.3), but they should be completed by the condition \(\delta_eN^e = 0\).

In general Eq. (11.3) is used as a definition of the \(N^a\)-component of the angular momentum of quasi-locally defined black holes [40, 97, 227]. This interpretation is supported by the following observations [499]. In axisymmetric spacetimes for axisymmetric surfaces \(O[N^a]\) can be rewritten into the Komar integral, the usual definition of angular momentum in axisymmetric spacetimes. Moreover, if Σ extends to spatial infinity, then \(\delta_eN^e = 0\) together with the requirement of the finiteness of the r → ∞ limit of the observable \(O[N^a]\) already fix the asymptotic form of \(N^a\), which is precisely the combination of the asymptotic spatial rotation Killing vectors, and \(O[N^a]\) reproduces the standard spatial ADM angular momentum. Similarly, at null infinity \(N^a\) must be a rotation BMS vector field. However, the null infinity limit of \(O[N^a]\) is sensitive to the first two terms (rather than only the leading term) in the asymptotic expansion of \(N^a\), and hence in a general radiative spacetime \(O[N^a]\) in itself does not yield an unambiguous definition for angular momentum. (But in stationary spacetimes the ambiguities disappear and \(O[N^a]\) reproduces the standard formula (4.15).) Thus, additional ideas are needed to restrict the BMS vector field \(N^a\).

Such an idea could be based on the observation that the eigenspinors of the δe-Dirac operators define δe-divergence-free vector fields on \({\mathcal S}\), and on metric spheres these vector fields built from the eigenspinors with the lowest eigenvalue are just the linear combinations of the three rotation Killing fields [501]. Solving the eigenvalue problem for the δe-Dirac operators on large spheres near scri in the first two leading orders, a well-defined (ambiguity-free) angular-momentum expression is suggested. The angular momenta associated with different cuts of scri can be compared, and the angular momentum flux can also be calculated.

It is tempting to interpret O[Na] as the Na-component of the quasi-local angular momentum of the gravity + matter system associated with \({\mathcal S}\). However, without additional conditions on Na the integral O[Na] could be nonzero even in Minkowski spacetime [501]. Hence, Na must satisfy additional conditions. Cook and Whiting [153] suggest that one derive Na from a variational principle on topological two-spheres. Here the action functional is the norm of the Killing operator. (For a viable, general notion of approximate Killing fields see [359].) Another realization of the approximate Killing fields is given by Beetle in [59], where the vector field Na is sought as the solution of an eigenvalue problem for an equation derived from the Killing equations. Both prescriptions have versions in which they give δe-divergence-free Na. The definition of Na suggested in [323] is based on the fact that six of the infinitely many conformal Killing fields on \({\mathcal S}\) with spherical topology are globally defined, and after an appropriate globally-defined conformal rescaling of the intrinsic metric they become the generators of the standard SO(1, 3) action on \({\mathcal S}\). Then the three of these fields that become Killing fields in the rescaled geometry are used to define the angular momentum. In general these vector fields are not δe-divergence-free. Thus, as in the Liu-Yau definition, to keep boost gauge invariance the gauge invariant piece of the connection one-form Ae can be used instead of Ae itself.

11.2 Approaches based on the double-null foliations

11.2.1 The 2 + 2 decomposition

The decomposition of the spacetime in a 2 + 2 way with respect to two families of null hypersurfaces is as old as the study of gravitational radiation and the concept of the characteristic initial value problem (see, e.g., [441, 419]). The basic idea is that we foliate an open subset U of the spacetime by a two-parameter family of (e.g., closed) spacelike two-surfaces. If \({\mathcal S}\) is the typical two-surface, then this foliation is defined by a smooth embedding \(\phi : {\mathcal S} \times (- \epsilon, \epsilon) \times (- \epsilon, \epsilon) \rightarrow U:(p,{\nu _ +},{\nu _ -}) \mapsto \phi (p,{\nu _ +},{\nu _ -})\). Then, keeping \({\nu _ +}\) fixed and varying \({\nu _ -}\), or keeping \({\nu _ -}\) fixed and varying \({\nu _ +}\), \({{\mathcal S}_{{\nu _ +},{\nu _ -}}}: = \phi ({\mathcal S},{\nu _ +},{\nu _ -})\) defines two one-parameter families of hypersurfaces \({\Sigma _{{\nu _ +}}}\) and \({\Sigma _{{\nu _ -}}}\), respectively. Requiring one (or both) of the hypersurfaces \({\Sigma _{{\nu _ \pm}}}\) to be null, we get a null (or double-null, respectively) foliation of U. (In Section 4.1.8 we require the hypersurfaces \({\Sigma _{{\nu _ \pm}}}\) to be null only for the special value \({\nu _ \pm} = 0\) of the parameters.) As is well known, because of the conjugate points, in the null or double null cases the foliation can be well defined only locally. For fixed \({\nu _ +}\) and \(p \in {\mathcal S}\) the prescription \({\nu _ -} \mapsto \phi (p,{\nu _ +},{\nu _ -})\) defines a curve through \(\phi (p,{\nu _ +},0) \in {\mathcal S_{{\nu _ +},0}}\) in \({\Sigma _{{\nu _ +}}}\), and hence a vector field \(\xi _ + ^a: = {(\partial/\partial {\nu _ -})^a}\) tangent everywhere to \({\Sigma _{{\nu _ +}}}\) on U. The Lie bracket of \(\xi _ + ^a\) and the analogously-defined \(\xi _ - ^a\) is zero. There are several inequivalent ways of introducing coordinates or rigid frame fields on U, which are fit naturally to the null or double null foliation \(\{{\mathcal S_{{\nu _ +},{\nu _ -}}}\}\), in which the (vacuum) Einstein equations and Bianchi identities take a relatively simple form [441, 209, 160, 480, 522, 245, 225, 105, 254].
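As a simple illustration (our own example, not taken from the cited references), consider Minkowski spacetime in spherical coordinates (t, r, θ, φ), and fix constants \(v_0 > u_0\) with \(r_0 := {1 \over 2}(v_0 - u_0)\). Assuming the identification \(\nu_+ := t - r - u_0\), \(\nu_- := t + r - v_0\) and the signature (+, −, −, −), the two-surfaces \({\mathcal S}_{\nu_+,\nu_-}\) are round spheres of radius \(r = r_0 + {1 \over 2}(\nu_- - \nu_+)\), and

$$ds^2 = d\nu_+\,d\nu_- - r^2\,(d\theta^2 + \sin^2\theta\,d\varphi^2), \qquad \xi_+^a = \Bigl({\partial \over \partial\nu_-}\Bigr)^a = {1 \over 2}\Bigl[\Bigl({\partial \over \partial t}\Bigr)^a + \Bigl({\partial \over \partial r}\Bigr)^a\Bigr], \qquad \xi_-^a = \Bigl({\partial \over \partial\nu_+}\Bigr)^a = {1 \over 2}\Bigl[\Bigl({\partial \over \partial t}\Bigr)^a - \Bigl({\partial \over \partial r}\Bigr)^a\Bigr].$$

Both families of hypersurfaces are null cones, \(\xi_\pm^a\) are null, and they commute because they are coordinate vector fields. The foliation is regular only for r > 0, which is guaranteed here if ε < r0; at r = 0 the cones focus, which is the simplest instance of the purely local nature of the double-null foliation mentioned above.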

Defining the ‘time derivative’ to be the Lie derivative, for example, along the vector field \(\xi _ + ^a\), the Hilbert action can be rewritten according to the 2 + 2 decomposition. Then the 2 + 2 form of the Einstein equations can be derived from the corresponding action as the Euler-Lagrange equations, provided the fact that the foliation is null is imposed only after the variation has been made. (Otherwise, the variation of the action with respect to the less-than-ten nontrivial components of the metric would not yield all ten Einstein equations.) One can form the corresponding Hamiltonian, in which the null character of the foliation should appear as a constraint. Then the formal Hamilton equations are just the Einstein equations in their 2 + 2 form [160, 522, 245, 254]. However, neither the boundary terms in this Hamiltonian nor the boundary conditions that could ensure its functional differentiability were considered. Therefore, this Hamiltonian can be ‘correct’ only up to boundary terms. Such a Hamiltonian was used by Hayward [245, 248] as the basis of his quasi-local energy expression discussed already in Section 6.3. (A similar energy expression was derived by Ikumi and Shiromizu [282], starting with the idea of the ‘freely falling two-surfaces’.)

11.2.2 The 2 + 2 quasi-localization of the Bondi-Sachs mass-loss

As we mentioned in Section 6.1.3, this double-null foliation was used by Hayward [247] to quasi-localize the Bondi-Sachs mass-loss (and mass-gain) by using the Hawking energy. Thus, we do not repeat the review of his results here.

Yoon investigated the vacuum field equations in a coordinate system based on a null 2 + 2 foliation. Thus, one family of hypersurfaces was (outgoing) null, e.g., \({{\mathcal N}_u}\), but the other was timelike, e.g., Bv. The former defined a foliation of the latter in terms of the spacelike two-surfaces \({{\mathcal S}_{u,\,\upsilon}}: = {{\mathcal N}_u} \cap {B_\upsilon}\). Yoon found [567, 568] a certain two-surface integral on \({{\mathcal S}_{u,\,\upsilon}}\), denoted by E(u, v), for which the difference E(u2, v) − E(u1, v), u1 < u2, could be expressed as a flux integral on the portion of the timelike hypersurface Bv between \({{\mathcal S}_{{u_1},\,\upsilon}}\) and \({{\mathcal S}_{{u_2},\,\upsilon}}\). In general this flux does not have a definite sign, but Yoon showed that asymptotically, when Bv is ‘pushed out to null infinity’ (i.e., in the v → ∞ limit in an asymptotically flat spacetime), it becomes negative definite. In fact, ‘renormalizing’ E(u, v) by a subtraction term, the resulting \(\tilde E(u,\,\upsilon)\) tends to the Bondi energy, and the flux integral tends to the Bondi mass-loss between the cuts u = u1 and u = u2 [567, 568]. These investigations were extended for other integrals in [569, 570, 571], which are analogous to spatial momentum and angular momentum. However, all these integrals, including E(u, v) above, depend not only on the geometry of the spacelike two-surface \({{\mathcal S}_{u,\,\upsilon}}\) but on the 2 + 2 foliation on an open neighborhood of \({{\mathcal S}_{u,\,\upsilon}}\) as well.

11.3 The covariant approach

11.3.1 The covariant phase space methods

The traditional ADM approach to conserved quantities and the Hamiltonian analysis of general relativity is based on the 3 + 1 decomposition of fields and geometry. Although the results and the content of a theory may be covariant even if their form is not, the manifest spacetime covariance of a formalism may help to find the (spacetime covariant) observables and conserved quantities, boundary conditions, etc. more easily. No a posteriori spacetime interpretation of the results is needed. Such a spacetime-covariant Hamiltonian formalism was initiated by Nester [377, 380].

His idea is to use (tensor or Dirac spinor valued) differential forms as the field variables on the spacetime manifold M. Thus, his phase space is the collection of fields on the four-manifold M, endowed with the (generalized) symplectic structure of Kijowski and Tulczyjew [317]. He derives the field equations from the Lagrangian 4-form, and for a fixed spacetime vector field Ka finds a Hamiltonian 3-form H(K)abc whose integral on a spacelike hypersurface takes the form

$$H\;[{\bf{K}}] = {1 \over {8\pi G}}\int\nolimits_\Sigma {{K^a}{G_{ab}}{1 \over {3!}}{\varepsilon ^b}_{cde}} + \oint\nolimits_{\partial \Sigma} {B\,{{({K^a})}_{cd}}},$$
(11.4)

the sum of the familiar ADM constraints and a boundary term. The Hamiltonian is determined from the requirement of the functional differentiability of H[K], i.e., that the variation δH[K] with respect to the canonical variables should not contain any boundary term on an asymptotically flat Σ (see Sections 2.2.2, 3.2.1, and 3.2.2). For asymptotic translations the boundary term in the Hamiltonian gives the ADM energy-momentum four-vector. In tetrad variables H(K)abc is essentially Sparling’s 3-form [476], and the two-component spinor version of B(Ka)cd is essentially the Nester-Witten 2-form contracted in the name index with the components of Ka (see Eq. (3.13), Section 3.2.1 and the introductory paragraphs in Section 8).

The spirit of the first systematic investigations of the covariant phase space of the classical field theories [158, 33, 197, 336] is similar to that of Nester’s. These ideas were recast into a systematic formalism by Wald and Iyer [536, 287, 288], the covariant Noether charge formalism (see also [535, 336]). This formalism generalizes many of the previous approaches. The Lagrangian 4-form may be any diffeomorphism-invariant local expression of any finite-order derivatives of the field variables. It gives a systematic prescription for the Noether currents, the symplectic structure, the Hamiltonian, etc. In particular, the entropy of the stationary black holes turns out to be just a Noether charge derived from Hilbert’s Lagrangian.

11.3.2 The general expressions of Chen, Nester and Tung: Covariant quasi-local Hamiltonians with explicit reference configurations

The quasi-local Hamiltonian for a large class of geometric theories, allowing torsion and non-metricity of the connection, was investigated by Chen, Nester, and Tung [139, 136, 382] in the covariant approach of Nester, above [377, 380]. Starting with a Lagrangian 4-form for a first-order formulation of the theory and an arbitrary vector field Ka, they determine the general form of the Hamiltonian 3-form H(K)abc, including the boundary 2-form B(Ka)cd. However, in the variation of the corresponding Hamiltonian there will be boundary terms in general. To cancel them, the boundary 2-form has to be modified. Introducing an explicit reference field ϕ0A and canonical momentum \(\pi _A^0\) (which are solutions of the field equations), Chen, Nester, and Tung suggest (in differential form notation) either of the two four-covariant boundary 2-forms

$${B_\phi}({K^a}): = {\iota _{\bf{K}}}{\phi ^A} \wedge ({\pi _A} - \pi _A^0) - {(-)^k}({\phi ^A} - {\phi ^{0A}}) \wedge {\iota _{\bf{K}}}\pi _A^0,$$
(11.5)
$${B_\pi}({K^a}): = {\iota _{\bf{K}}}{\phi ^{0A}} \wedge ({\pi _A} - \pi _A^0) - {(-)^k}({\phi ^A} - {\phi ^{0A}}) \wedge {\iota _{\bf{K}}}{\pi _A},$$
(11.6)

where the configuration variable ϕA is some (tensor-valued) k-form and \({\iota _{\rm{K}}}{\phi ^A}\) is the interior product of the k-form \(\phi _{{a_1} \ldots {a_k}}^A\) and the vector field Ka, i.e., in the abstract index formalism \({({\iota _{\rm{K}}}{\phi ^A})_{{a_2} \ldots {a_k}}} = k{K^a}\phi _{a{a_2} \ldots {a_k}}^A\). Thus, the boundary terms of Chen, Nester and Tung contain not only a general reference term, but the reference values of the canonical variables. Or, in other words, the ‘calibration’ of their quasi-local quantities is made at the level of the basic variables, rather than at the level of the boundary term.

The boundary term in the variation δH[K] of the Hamiltonian with the boundary terms (11.5) and (11.6) is the two-surface integral on ∂Σ of \({\iota _{\rm{K}}}(\delta {\phi ^A} \wedge ({\pi _A} - \pi _A^0))\) and \({\iota _{\rm{K}}}(- ({\phi ^A} - {\phi ^{0A}}) \wedge \delta {\pi _A})\), respectively. Therefore, the Hamiltonian is functionally differentiable with the boundary 2-form Bϕ(Ka) if the configuration variable ϕA is fixed on ∂Σ, but Bπ(Ka) should be used if πA is fixed on ∂Σ. Thus, the first boundary 2-form corresponds to a four-covariant Dirichlet-type, while the second corresponds to a four-covariant Neumann-type boundary condition. Obviously, the Hamiltonian evaluated in the reference configuration \(({\phi ^{0A}},\,\pi _A^0)\) gives zero. Chen and Nester show [136] that Bϕ(Ka) and Bπ(Ka) are the only boundary 2-forms for which the resulting boundary 2-form C(Ka)bc in the variation δH(Ka)bcd of the Hamiltonian 3-form vanishes on ∂Σ, which reflects the type of boundary conditions (i.e., which fields are fixed on the boundary), and is built from the configuration and momentum variables four-covariantly (‘uniqueness’). A further remarkable property of Bϕ(Ka) and Bπ(Ka) is that the corresponding Hamiltonian 3-form can be derived directly from appropriate Lagrangians. One possible choice for the vector field Ka is a Killing vector of the reference geometry. This reference geometry is, however, not yet specified, in general.
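As a short consistency check of the statements above, the two boundary 2-forms (11.5) and (11.6) can be compared directly. The sketch below assumes only that the interior product is a graded derivation of the wedge product, and uses the shorthand \(\Delta \phi^A := \phi^A - \phi^{0A}\), \(\Delta \pi_A := \pi_A - \pi_A^0\), which is our notation rather than that of [136]:

$$B_\phi (K^a) - B_\pi (K^a) = \iota_{\bf K} \Delta\phi^A \wedge \Delta\pi_A + (-)^k \Delta\phi^A \wedge \iota_{\bf K} \Delta\pi_A = \iota_{\bf K}\bigl(\Delta\phi^A \wedge \Delta\pi_A\bigr).$$

Hence the Dirichlet- and Neumann-type expressions differ only by a term that is quadratic in the deviation from the reference configuration. With Ka and the reference fields held fixed, the variation of this term is \(\iota_{\bf K}(\delta\phi^A \wedge \Delta\pi_A) + \iota_{\bf K}(\Delta\phi^A \wedge \delta\pi_A)\), which is precisely the difference of the two boundary integrands quoted above.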

These general ideas were applied to general relativity in the tetrad formalism (and also in the Dirac spinor formulation of the theory [139, 132], yielding a Hamiltonian, which is slightly different from Eq. (11.4)) as well as in the usual metric formalism [132, 137]. In the latter it is the appropriate projections to ∂Σ of \({\phi ^{\alpha \beta}}: = {1 \over {8\pi G}}\sqrt {\left\vert g \right\vert} {g^{\alpha \beta}}\) in some coordinate system {xα} that are chosen to be fixed on ∂Σ. Then the dual of the corresponding Dirichlet and Neumann boundary 2-forms will be, respectively,

$$B_\phi ^{ab}({K^e}): = \delta _{def}^{abc}(\Gamma _{gc}^d - \Gamma _{gc}^{0d})\;{\phi ^{ge}}{K^f} + \delta _{ef}^{ab}\nabla _c^0{K^e}({\phi ^{cf}} - {\phi ^{0cf}}),$$
(11.7)
$$B_\pi ^{ab}({K^e}): = \delta _{def}^{abc}(\Gamma _{gc}^d - \Gamma _{gc}^{0d})\;{\phi ^{0ge}}{K^f} + \delta _{ef}^{ab}{\nabla _c}{K^e}({\phi ^{cf}} - {\phi ^{0cf}}).$$
(11.8)

The first terms are analogous to Freud’s superpotential, while the second ones are analogous to Komar’s superpotential. (Since the boundary 2-form contains \(\Gamma _{\mu \beta}^\alpha\) only in the form \(\Gamma _{\mu \beta}^\alpha - \Gamma _{\mu \beta}^{0\alpha}\), this is always tensorial. If \(\Gamma _{\mu \beta}^{0\alpha}\) is chosen to be vanishing, then the first term reduces to Freud’s superpotential.) Because of the Komar-like term, the quasi-local quantities depend not only on the two-surface data (both in the physical spacetime and the reference configuration), but on the normal directional derivative of Ka as well. The connection between the present expressions and the similar previous results (pseudotensorial, tensorial, and quasi-local) is also discussed in [136, 132]. In particular, the expression based on the Dirichlet-type boundary 2-form (11.7) gives precisely the Katz-Bicak-Lynden-Bell superpotential [306]. In the spinor formulation of these ideas the vector field Ka would be built from a Dirac spinor (or a pair of Weyl spinors). The main difficulty is, however, to find spinor fields representing both translational and boost-rotational displacements [140]. In the absence of a prescription for the reference configuration (even though that should be defined only on an open neighborhood of the two-surface) the construction is still not complete, even if the vector field Ka is chosen to be a Killing vector of the reference spacetime. A recent, manifestly covariant introduction to these ideas is given in [383].

A nice application of the covariant expression is a derivation of the first law of black hole thermodynamics [136]. The quasi-local energy expressions have been evaluated for several specific two-surfaces. For round spheres in the Schwarzschild spacetime, both the four-covariant Dirichlet and Neumann boundary terms (with the Minkowski reference spacetime and Ka as the timelike Killing vector (∂/∂t)a) give m/G at infinity, but at the horizon the former gives 2m/G and the latter is infinite [136]. The Dirichlet boundary term gives, at spatial infinity in the Kerr-anti-de Sitter solution, the standard m/G and ma/G values for the energy and angular momentum, respectively [257]. The center-of-mass is also calculated, both in the metric and the tetrad formulation of general relativity, for the eccentric Schwarzschild solution at spatial infinity [389, 390], and it was found that the ‘Komar-like term’ is needed to recover the correct, expected value. At future null infinity of asymptotically flat spacetimes it gives the Bondi-Sachs energy-momentum and the expression of Katz [305, 310] for the angular momentum [258]. The general formulae are evaluated for the Kerr-Vaidya solution as well.

The quasi-local energy-momentum is calculated on two-surfaces lying in intrinsically-flat spacelike hypersurfaces in static spherically-symmetric spacetimes [138], and, in particular, for two-surfaces in the τ = const. slicing of the Schwarzschild solution in the Painlevé-Gullstrand coordinates. Though these hypersurfaces are flat, and hence, the total (ADM type) energy is expected to be vanishing, the quasi-local energy expression based on Eq. (11.7) and a ‘naturally chosen’ frame field gives 2m/G. (N.B., the Cauchy data on the τ = const. hypersurfaces do not satisfy the falloff conditions of Section 3.2.1. Though the intrinsic metric is flat, the extrinsic curvature tends to zero only as \({r^{- {3 \over 2}}}\), while in the expression of the ADM linear momentum a slightly faster than \({r^{- {3 \over 2}}}\) falloff is needed. Thus, the vanishing of the naïvely introduced ADM-type energy does not contradict the rigidity part of the positive energy theorem.)
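The falloff rate quoted in the remark above can be checked by a short calculation. The sketch below assumes the standard Painlevé-Gullstrand form of the Schwarzschild line element in units G = c = 1 and with signature (−, +, +, +); these conventions, and the overall sign of the extrinsic curvature, are our assumptions and affect only signs, not the falloff:

$$ds^2 = - d\tau^2 + \bigl(dr + \sqrt{2m/r}\,d\tau\bigr)^2 + r^2 (d\theta^2 + \sin^2\theta\,d\varphi^2).$$

On the τ = const. hypersurfaces the induced metric is the flat metric \(dr^2 + r^2 d\Omega^2\), the lapse is 1 and the shift is \(N^r = \sqrt{2m/r}\). With \(\chi_{ij} = D_{(i}N_{j)}\) (up to sign), the components in the orthonormal frame adapted to the spherical coordinates are

$$\chi_{\hat r \hat r} = \partial_r N_r = - {1 \over 2}\sqrt{2m}\,r^{- {3 \over 2}}, \qquad \chi_{\hat\theta \hat\theta} = \chi_{\hat\varphi \hat\varphi} = {N_r \over r} = \sqrt{2m}\,r^{- {3 \over 2}}, \qquad \chi = {3 \over 2}\sqrt{2m}\,r^{- {3 \over 2}}.$$

Thus every component decays exactly as \(r^{- {3 \over 2}}\), which is the borderline rate referred to above.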

The null infinity limit of the quasi-local energy and the corresponding outgoing energy flux, based on Eq. (11.5), are calculated in [563]. It is shown that, with Minkowski spacetime as a reference configuration, and even with three different embeddings of the two-surface \({\mathcal S}\) into the reference spacetime, the null infinity limits of these two quantities are just the standard Bondi energy and Bondi mass-loss, respectively. A more detailed discussion of the general formulae for the quasi-local energy flux, coming from Eqs. (11.5) and (11.6) and the two additional boundary expressions of [137],

$${B_{{\rm{dyn}}}}({K^a}): = {\iota _{\bf{K}}}{\phi ^{0A}} \wedge ({\pi _A} - \pi _A^0) - {(-)^k}\;({\phi ^A} - {\phi ^{0A}}) \wedge {\iota _{\bf{K}}}\pi _A^0,$$
(11.9)
$${B_{{\rm{constr}}}}({K^a}): = {\iota _{\bf{K}}}{\phi ^A} \wedge ({\pi _A} - \pi _A^0) - {(-)^k}\;({\phi ^A} - {\phi ^{0A}}) \wedge {\iota _{\bf{K}}}{\pi _A},$$
(11.10)

is given in [141]. A less technical presentation and further discussions of the energy flux calculations are given in [388].
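The relation between the four boundary expressions (11.5), (11.6), (11.9) and (11.10) can be made explicit by elementary algebra. In the sketch below \(\Delta \phi^A := \phi^A - \phi^{0A}\) and \(\Delta \pi_A := \pi_A - \pi_A^0\) are our shorthand, not the notation of [137, 141]:

$$B_\phi = B_{\rm dyn} + \iota_{\bf K}\Delta\phi^A \wedge \Delta\pi_A, \qquad B_\pi = B_{\rm dyn} - (-)^k \Delta\phi^A \wedge \iota_{\bf K}\Delta\pi_A, \qquad B_{\rm constr} = B_{\rm dyn} + \iota_{\bf K}\Delta\phi^A \wedge \Delta\pi_A - (-)^k \Delta\phi^A \wedge \iota_{\bf K}\Delta\pi_A.$$

All four expressions therefore coincide to first order in the deviation of the dynamical fields from the reference configuration, and they differ only in which of the two quadratic terms are included.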

The quasi-local energy flux of spacetime perturbations on a stationary background is calculated by Tung and Yu [531] using the covariant Noether charge formalism and the boundary terms above. As an example they considered the Vaidya spacetime as a time-dependent perturbation of a stationary one with the orthonormal frame field being adapted to the spherical symmetry. At null infinity they recovered the Bondi mass-loss, while for the dynamical horizons they recovered the flux expression of Ashtekar and Krishnan (see Section 13.3.2).

The quasi-local energy-momentum, based on Eq. (11.7) in the tetrad approach to general relativity, is calculated for arbitrary two-surfaces \({\mathcal S}\) lying in the hypersurfaces of homogeneity in all the Bianchi cosmological models in [391] (see also [340]). In these calculations the tetrad field was chosen to be the geometrically distinguished triad, being invariant with respect to the global action of the isometry group, and the future-pointing unit timelike normal of the hypersurfaces; while the vector field Ka was chosen to have constant components in this frame. For class A models (i.e., for I, II, VI0, VII0, VIII and IX Bianchi types) the quasi-local energy is zero, and for class B models (III, IV, V, VIh and VIIh Bianchi models) it is negative and proportional to the volume of the domain that is bounded by \({\mathcal S}\). (Here a sign error in the previous calculations, reported in [134, 387, 385], is corrected.) The apparent contradiction of the nonpositivity of the energy in the present context and the non-negativity of the energy in general small-sphere calculations indicates that the geometrically distinguished tetrad field in the Bianchi models does not reduce to the ‘natural’ approximate translational Killing fields near a point. Another interpretation of the vanishing and negativity of the quasi-local energy, different from this and those in Section 4.3, is also given.

Instead of the specific boundary terms, So considered a two-parameter family of boundary terms [464], which generalized the special expressions (11.5), (11.6) and (11.9), (11.10). The main idea behind this generalization is that one cannot, in general, expect to be able to control only, for example, either the configuration or the momentum variables, but rather only a combination of them. Hence, the boundary condition is not purely of a Dirichlet or Neumann type, but rather a more general mixed one. It is shown that, with an appropriate value for these parameters, the resulting energy expression for small spheres is positive definite, even in the holonomic description.

11.3.3 The reference configuration of Nester, Chen, Liu and Sun

In the general covariant quasi-local Hamiltonians Chen, Nester and Tung left the reference configuration and the boundary conditions unspecified, and hence their construction was not complete. These have been specified in [386]. The key ideas are as follows.

First, because of its correct, advantageous properties (especially its asymptotic behaviour in asymptotically flat spacetimes), Nester, Chen, Liu and Sun choose (11.7) a priori as their Hamiltonian boundary term. Their reference configuration is chosen to be the Minkowski spacetime, and the generator vector field is the general Killing vector (depending on ten parameters).

Next, to match the physical and the reference geometries, they require the two full 4-dimensional metrics to coincide at the points of the two-surface \({\mathcal S}\) (rather than only the induced two-metrics on \({\mathcal S}\)). This condition leaves two unspecified functions in the quasi-local quantities. To find the ‘best matched’ such embedding of \({\mathcal S}\) into the Minkowski spacetime, Nester, Chen, Liu and Sun propose to choose the one that extremizes the quasi-local mass.

This, and some other related strategies have been used to compute quasi-local energy in various spherically symmetric configurations in [135, 341, 561, 562].

11.3.4 Covariant quasi-local Hamiltonians with general reference terms

Anco and Tung investigated the possible boundary conditions and boundary terms in the quasi-local Hamiltonian using the covariant Noether charge formalism both of general relativity (with the Hilbert Lagrangian and tetrad variables) and of Yang-Mills-Higgs systems [13, 14]. (Some formulae of the journal versions were recently corrected in the latest arXiv versions.) They considered the world tube of a compact spacelike hypersurface Σ with boundary \({\mathcal S}: = \partial \Sigma\). Thus, the spacetime domain they considered is the same as in the Brown-York approach: D ≈ Σ × [t1, t2]. Their evolution vector field Ka is assumed to be tangent to the timelike boundary 3B ≈ ∂Σ × [t1, t2] of the domain D. They derived a criterion for the existence of a well-defined quasi-local Hamiltonian. Dirichlet and Neumann-type boundary conditions are imposed. In general relativity, the variations of the tetrad fields are restricted on 3B by requiring in the first case that the induced metric γab is fixed and the adaptation of the tetrad field to the boundary is preserved, while in the second case that the tetrad components \({\Theta _{ab}}E_{\underline a}^b\) of the extrinsic curvature of 3B are fixed. Then the general allowed boundary condition was shown to be just a mixed Dirichlet-Neumann boundary condition. The corresponding boundary terms of the Hamiltonian, written in the form \(\oint\nolimits_{\mathcal S} {{K^a}{P_a}d{\mathcal S}}\), were also determined [13]. The properties of the co-vectors \(P_a^{\rm{D}}\) and \(P_a^{\rm{N}}\) (called the Dirichlet and Neumann symplectic vectors, respectively) were investigated further in [14]. Their part tangential to \({\mathcal S}\) is not boost gauge invariant, and to evaluate them, the boost gauge determined by the mean extrinsic curvature vector Qa is used (see Section 4.1.2). Both \(P_a^{\rm{D}}\) and \(P_a^{\rm{N}}\) are calculated for various spheres in several special spacetimes.
In particular, for the round spheres of radius r in the t = const. hypersurface in the Reissner-Nordström solution \(P_a^{\rm{D}} = {2 \over r}(1 - 2m/r + {e^2}/{r^2})\delta _a^0\) and \(P_a^{\rm{N}} = - (m/{r^2} - {e^2}/{r^2})\delta _a^0\), and hence, the Dirichlet and Neumann ‘energies’ with respect to the static observer Ka = (∂/∂t)a are \(\oint\nolimits_{{{\mathcal S}_r}} {{K^a}P_a^{\rm{D}}d{{\mathcal S}_r} = 8\pi r - 16\pi [m - {e^2}/(2r)]}\) and \(\oint\nolimits_{{{\mathcal S}_r}} {{K^a}P_a^{\rm{N}}d{{\mathcal S}_r} = 4\pi (m - {e^2}/2r)}\), respectively. Thus, \(P_a^{\rm{N}}\) does not reproduce the standard round-sphere expression, while \(P_a^{\rm{D}}\) gives the standard round sphere and correct ADM energies only if it is ‘renormalized’ by its own value in Minkowski spacetime [14].
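The Dirichlet value quoted above follows from elementary arithmetic, which we spell out as a sketch. It assumes only that \(K^a \delta_a^0 = 1\) and that the integrand is constant on the round sphere of area \(4\pi r^2\):

$$\oint\nolimits_{{\mathcal S}_r} K^a P_a^{\rm D}\,d{\mathcal S}_r = 4\pi r^2 \cdot {2 \over r}\Bigl(1 - {2m \over r} + {e^2 \over r^2}\Bigr) = 8\pi r - 16\pi m + {8\pi e^2 \over r} = 8\pi r - 16\pi \Bigl[m - {e^2 \over 2r}\Bigr].$$

For m = 0 and e = 0 this reduces to 8πr, the value in Minkowski spacetime. Subtracting it leaves a quantity proportional to \(m - e^2/(2r)\), which illustrates why the ‘renormalization’ mentioned above is needed. The overall numerical factor depends on the normalization conventions of [14], which are not reproduced here.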

Anco continued the investigation of the Dirichlet Hamiltonian in [11], which takes the form (see also Eqs. (8.1) and (10.20))

$$H\;[{\bf{K}}] = {1 \over {8\pi G}}\int\nolimits_\Sigma {{K^a}{G_{ab}}{1 \over {3!}}{\varepsilon ^b}_{cde} - {1 \over {8\pi G}}\oint\nolimits_{\partial \Sigma} {{K^a}({}^ \bot {\varepsilon _{ab}}{Q_c}^{cb} + {A_a} + {B_a})\;d{\mathcal S}}}.$$
(11.11)

Here the two-surface ∂Σ is assumed to be mean convex, in which case the boost gauge freedom in the SO(1, 1) gauge potential Aa can be, and, indeed, is, fixed by using the globally-defined orthonormal vector basis \(\{e_0^a,\,e_1^a\}\) in the normal bundle obtained by normalizing the mean curvature basis \(\{{{\tilde Q}_a},\,{Q_a}\}\). The vector field Ka is still arbitrary, and Ba is assumed to have the structure \({B^a} = e_0^aB\) for B as an arbitrary function of qab. This Hamiltonian gives the correct Einstein equations and, for solutions, its value, e.g., with \({K^a} = e_0^a\), is the general expression of the quasi-local energy of Brown and York. (Compare Eq. (11.11) with Eq. (11.3), or with Eqs. (10.8), (10.9) and (10.10).)

However, to rule out the dependence of this notion of quasi-local energy on the completely freely specifiable vector field Ka (i.e., on three arbitrary functions on \({\mathcal S}\)), Anco makes Ka dynamic by linking it to the vector field \({{\tilde Q}^a}\). Namely, let \({K^a}: = {c_0}{[{\rm{Area}}({\mathcal S})]^{{n \over 2}}}{\left| {{{\tilde Q}_e}{{\tilde Q}^e}} \right|^{{{n - 1} \over 2}}}{{\tilde Q}^a}\), where c0 and n are constant, Area(\({\mathcal S}\)) is the area of \({\mathcal S}\), and extend this Ka from \({\mathcal S}\) to Σ in a smooth way. Then Anco proves that, keeping the two-metric qab and Ka fixed on \({\mathcal S}\),

$$H\;[{\bf{K}}] = {1 \over {8\pi G}}\int\nolimits_\Sigma {{K^a}{G_{ab}}{1 \over {3!}}{\varepsilon ^b}_{cde} - {{{c_0}} \over {8\pi G(n + 1)}}{{[{\rm{Area}}({\mathcal S})]}^{{n \over 2}}}\;\oint\nolimits_{\partial \Sigma} {\left({B - {{\left\vert {{{\tilde Q}_e}{{\tilde Q}^e}} \right\vert}^{{{n + 1} \over 2}}}} \right)\;d{\mathcal S}}}$$
(11.12)

is a correct Hamiltonian for the Einstein equations, where B is still an arbitrary function of qab. For n = 1 with the choice \(B = 2\,{}^{\mathcal S}R\) the boundary term reduces to the Hawking energy, and for n = 0 it is the Epp and Kijowski-Liu-Yau energies depending on the choice of B (i.e., the definition of the reference term). For general n, choosing the reference term B appropriately, Anco gives a one-parameter generalization of Hawking and Epp-Kijowski-Liu-Yau-type quasi-local energies (called the ‘mean curvature masses’). In addition, he defines a family of quasi-local angular momenta. Using the positivity of the Kijowski-Liu-Yau energy (n = 0) he shows that the higher power (n > 0) mean curvature masses are bounded from below. Although these masses seem to have the correct large sphere limit at spatial infinity, for general convex two-surfaces in Minkowski spacetime they do not vanish.
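For orientation, the two special cases mentioned above can be written out by direct substitution into the definition of Ka and into the boundary term of Eq. (11.12). The constant c0 and the function B are left unspecified, as in the text, and no particular normalization is assumed:

$$n = 0:\quad K^a = c_0\,{\bigl\vert {\tilde Q}_e {\tilde Q}^e \bigr\vert}^{- {1 \over 2}}\,{\tilde Q}^a, \qquad - {c_0 \over 8\pi G}\oint\nolimits_{\partial\Sigma} \Bigl(B - {\bigl\vert {\tilde Q}_e {\tilde Q}^e \bigr\vert}^{1 \over 2}\Bigr)\,d{\mathcal S};$$

$$n = 1:\quad K^a = c_0\,\sqrt{{\rm Area}({\mathcal S})}\;{\tilde Q}^a, \qquad - {c_0 \over 16\pi G}\,\sqrt{{\rm Area}({\mathcal S})}\oint\nolimits_{\partial\Sigma} \Bigl(B - \bigl\vert {\tilde Q}_e {\tilde Q}^e \bigr\vert\Bigr)\,d{\mathcal S}.$$

For n = 0 the generator is proportional to the normalized dual mean curvature vector and the integrand is linear in the length of the mean curvature vector. For n = 1 the integrand is quadratic in it and carries the overall area factor.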

The boundary conditions on closed untrapped spacelike two-surfaces that make the covariant Hamiltonian functionally differentiable were investigated by Tung [526, 527]. He showed that such a boundary condition might be the following: the area 2-form and the mean curvature vector of \({\mathcal S}\) are fixed, and the evolution vector field Ka is proportional to the dual mean curvature vector, where the factor of proportionality is a function of the area 2-form. Then, requiring that the value of the Hamiltonian reproduce the ADM energy, he recovers the Hawking energy. If, however, Ka is allowed to have a part tangential to \({\mathcal S}\), and KaAa is required to be fixed (up to total δe-divergences), then, though the value of the Hamiltonian is still proportional to the Hawking energy, the factor of proportionality depends on the angular momentum, given by (11.3), as well. With this choice the vector field Ka becomes a generalization of the Kodama vector field [321] (see also Section 4.2.1). The results of [527, 528] are extensions of those in [526].

11.3.5 Pseudotensors and quasi-local quantities

As we discussed briefly in Section 3.3.1, many, apparently different, pseudotensors and SO(1, 3)-gauge-dependent energy-momentum density expressions can be recovered from a single differential form defined on the bundle L(M) of linear frames over the spacetime manifold. The corresponding superpotentials are the pullbacks to M of the various forms of the Nester-Witten 2-form \({u^k}_{ab}\) from L(M) along the various local sections of the bundle [192, 358, 486, 487]. Thus, the different pseudotensors are simply the gauge-dependent manifestations of the same geometric object on the bundle L(M) in the different gauges. Since, however, \({u^k}_{ab}\) is the unique extension of the Nester-Witten 2-form \(u{({\varepsilon ^{\underline K}},\,{{\bar \varepsilon}^{\underline K}})_{ab}}\), on the principal bundle of normalized spin frames \(\{\varepsilon _A^{\underline K}\}\) (given in Eq. (3.10)), and the latter has been proven to be connected naturally to the gravitational energy-momentum, the pseudotensors appear to describe the same physics as the spinorial expressions, though in a slightly old-fashioned form. That this is indeed the case was demonstrated clearly by Chang, Nester, and Chen [131, 137, 382] by showing an intimate connection between the covariant quasi-local Hamiltonian expressions and the pseudotensors. Writing the Hamiltonian H[K] in the form of the sum of the constraints and a boundary term, in a given coordinate system the integrand of this boundary term may be the superpotential of any of the pseudotensors. Then the requirement of the functional differentiability of H[K] gives the boundary conditions for the basic variables at ∂Σ. For example, for the Freud superpotential (for Einstein’s pseudotensor) what is fixed on the boundary ∂Σ is a certain piece of \(\sqrt {\left\vert g \right\vert} {g^{\alpha \beta}}\).

12 Constructions for Special Spacetimes

12.1 The Komar integral for spacetimes with Killing vectors

Although the Komar integral (and, in general, the linkage (3.15) for some α) does not satisfy our general requirements discussed in Section 4.3.1, and does not always give the standard values in specific situations (see, for example, the ‘factor-of-two anomaly’ or the examples below), in the presence of a Killing vector, the Komar integral, built from the Killing field, could be a very useful tool in practice. (For Killing fields the linkage \({L_{\mathcal S}}[{\bf{K}}]\) reduces to the Komar integral for any α.)

One of its most important properties is that in vacuum \({L_{\mathcal S}}[{\bf{K}}]\) depends only on the homology class of the two-surface (see, e.g., [534]). This follows directly from the explicit form of Komar’s canonical Noether current: \(8\pi G{C^a}[{\bf{K}}] = {G^a}_b{K^b} + {\nabla _b}{\nabla ^{[a}}{K^{b]}} = - {1 \over 2}R{K^a} - {\nabla _b}({\nabla ^{(a}}{K^{b)}} - {g^{ab}}{\nabla _c}{K^c})\). In fact, if \({\mathcal S}\) and \({{\mathcal S}{\prime}}\) are any two two-surfaces such that \({\mathcal S} - {{\mathcal S}{\prime}} = \partial \Sigma\) for some compact three-dimensional hypersurface Σ on which the energy-momentum tensor of the matter fields is vanishing and \(K^a\) is a Killing vector, then \({L_{\mathcal S}}[{\bf{K}}] = {L_{{{\mathcal S}{\prime}}}}[{\bf{K}}]\). (Note that, as we already stressed, the structure of the Noether current above dictates that the numerical coefficient in the definition (3.15) of the linkage would have to be \({1 \over {16\pi G}}\) rather than \({1 \over {8\pi G}}\), i.e., the one that gives the correct value of angular momentum (rather than the mass) in Kerr spacetime.) In particular, the Komar integral for the static Killing field in the Schwarzschild spacetime is the mass parameter m of the solution for any two-surface \({\mathcal S}\) surrounding the black hole, but it is zero if \({\mathcal S}\) does not surround it. The explicit form of the current shows that, for a timelike Killing field \(K^a\), the small sphere expression of Komar’s quasi-local energy in the first non-trivial order is \(- {{2\pi} \over 3}{r^3}{T_{ab}}{g^{ab}}{t_c}{K^c}\), i.e., it does not reproduce the expected result (4.9); moreover, in vacuum it always gives zero rather than, e.g., the Bel-Robinson ‘energy’ (see Section 4.2.2).

Furthermore [510], the analogous integral in the Reissner-Nordström spacetime on a metric two-sphere of radius r is \(m - e^2/r\), which deviates from the generally accepted round-sphere value \(m - e^2/(2r)\). Similarly, in Einstein’s static universe for spheres of radius r on a t = const. hypersurface, \({L_{\mathcal S}}[{\mathbf{K}}]\) is zero instead of the round sphere result \({{4\pi} \over 3}{r^3}[\mu + \lambda/8\pi G]\), where μ is the energy density of the matter and λ is the cosmological constant.
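The two Reissner-Nordström values can be compared with a few lines of exact arithmetic. The sketch below is only an illustration: it assumes G = c = 1, the standard metric function \(F(r) = 1 - 2m/r + e^2/r^2\), and the standard reduction of the Komar integral (normalized to give m in Schwarzschild) to \({1 \over 2}{r^2}F'(r)\) on a metric sphere. The function names are hypothetical.

```python
from fractions import Fraction

def metric_function(r, m, e):
    """F(r) = 1 - 2m/r + e^2/r^2 for Reissner-Nordstrom (assumed G = c = 1)."""
    return 1 - 2 * m / r + e**2 / r**2

def metric_function_prime(r, m, e):
    """Analytic derivative dF/dr."""
    return 2 * m / r**2 - 2 * e**2 / r**3

def komar_energy(r, m, e):
    """Komar integral on the metric sphere of radius r: (r^2 / 2) F'(r)."""
    return r**2 * metric_function_prime(r, m, e) / 2

def round_sphere_energy(r, m, e):
    """Round-sphere (Misner-Sharp) energy: (r / 2) (1 - F(r))."""
    return r * (1 - metric_function(r, m, e)) / 2

# Exact rational inputs keep the comparison free of rounding errors.
r, m, e = Fraction(10), Fraction(1), Fraction(1, 2)
difference = komar_energy(r, m, e) - round_sphere_energy(r, m, e)  # equals -e^2/(2r)
```

With these inputs the Komar value is m − e²/r and the round-sphere value is m − e²/(2r), so the two differ by −e²/(2r) and agree only for e = 0.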

Accurate numerical calculations show that in stationary, axisymmetric asymptotically flat spacetimes describing a black hole or a rigidly-rotating dust disc surrounded by a perfect fluid ring the Komar energy of the black hole or the dust disc could be negative, even though the conditions of the positive energy theorem hold [21]. Moreover, the central black hole’s event horizon can be distorted by the ring so that the black hole’s Komar angular momentum is greater than the square of its Komar energy [20].

12.2 The effective mass of Kulkarni, Chellathurai, and Dadhich for the Kerr spacetime

The Kulkarni-Chellathurai-Dadhich [328] effective mass for the Kerr spacetime is obtained from the Komar integral (i.e., the linkage with α = 0) using a hypersurface orthogonal vector field \(X^a\) instead of the Killing vector \(T^a\) of stationarity. The vector field \(X^a\) is defined to be \(T^a + \omega\Phi^a\), where \(\Phi^a\) is the Killing vector of axisymmetry and the function ω is −g(T, Φ)/g(Φ, Φ). This is timelike outside the horizon, it is the asymptotic time translation at infinity, and coincides with the null tangent on the event horizon. On the event horizon \(r = r_+\) it yields \({M_{{\rm{KCD}}}} = \sqrt {{m^2} - {a^2}}\), while in the limit r → ∞ it is the mass parameter m of the solution. The effective mass is computed for the Kerr-Newman spacetime in [133].

12.3 Expressions in static spacetimes

12.3.1 Tolman’s energy for static spacetimes

Let \(K^a\) be a hypersurface-orthogonal timelike Killing vector field, Σ a spacelike hypersurface to which \(K^a\) is orthogonal, and \(f^2 := K_aK^a\). Then \(- {D_a}{D^a}f = 4\pi G(\mu + 3p - {\lambda \over {4\pi G}})f\), a field equation for f, follows from Einstein’s equations (see, e.g., pp. 71–74 of [240] or [199]). Here \(\mu := {T_{ab}}{t^a}{t^b}\) and \(p := - {1 \over 3}{T_{ab}}{h^{ab}}\) are the energy density and the average spatial pressure of the matter fields, respectively, seen by the observer at rest with respect to Σ (or \(K^a\)).

In the study of (‘quasi-static’) equilibrium configurations of self-gravitating systems Tolman [520, 521] found the integral

$${E_{\rm{T}}}(D): = \int\nolimits_D {(\mu + 3p - {\lambda \over {4\pi G}})f\,d\Sigma} = {1 \over {4\pi G}}\oint\nolimits_{\partial D} {{\upsilon ^a}({D_a}f)\;d{\mathcal S}}$$
(12.1)

to be the energy of the system. Here D ⊂ Σ is a compact domain with smooth boundary ∂D, \(\upsilon^a\) is the outward pointing unit normal of ∂D in Σ; and the second expression follows from the field equation for f above. \(E_{\rm T}(D)\) can in fact be interpreted as some form of a quasi-local energy [2, 3], called the Tolman energy. Clearly, for matter fields with non-negative energy density and average pressure and non-positive cosmological constant this is non-negative on the domain where \(K^a\) is timelike. Using the defining equation of \(E_{\rm T}\) in terms of the two-surface integral, one can show that in asymptotically flat spacetimes it tends to the ADM energy as a non-decreasing set function. The second expression in Eq. (12.1) implies that, similarly to Komar’s expression, in vacuum \(E_{\rm T}\) depends only on the homology class of the 2-surface ∂D. Thus, in particular, it associates zero energy with vacuum domains. For spherically symmetric configurations on round spheres with the area radius r (see Section 4.2.1) it is \({{{r^3}} \over {2G}}f{e^\alpha}(8\pi G{T_{ab}}{\nu ^a}{\nu ^b} - \lambda + {1 \over {{r^2}}}[1 - {e^{- 2\alpha}}])\), which in vacuum reduces to the Misner-Sharp energy.
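The homology invariance in vacuum can be illustrated on round spheres in the Schwarzschild solution. The following sketch is an illustration under stated assumptions, not part of the original analysis: it takes c = 1, \(f = \sqrt{1 - 2m/r}\) on a t = const. slice, the outward unit normal \(\sqrt{1 - 2m/r}\,\partial_r\), and the area \(4\pi r^2\), and it evaluates the two-surface form of Eq. (12.1) with a numerical derivative. The function names are hypothetical.

```python
import math

def killing_length(r, m):
    """f = sqrt(1 - 2m/r): length of the static Killing field (assumed c = 1)."""
    return math.sqrt(1.0 - 2.0 * m / r)

def tolman_energy_sphere(r, m, G=1.0):
    """Two-surface form of Eq. (12.1) on the sphere of areal radius r > 2m.

    (1/(4 pi G)) * (4 pi r^2) * (normal component) * df/dr,
    with df/dr approximated by a central difference.
    """
    h = 1.0e-4 * r                      # step scaled with r to balance rounding and truncation
    df_dr = (killing_length(r + h, m) - killing_length(r - h, m)) / (2.0 * h)
    normal = killing_length(r, m)       # radial component of the outward unit normal
    return r**2 * normal * df_dr / G

# Spheres of very different radii give the same value, m/G.
values = [tolman_energy_sphere(r, 1.0) for r in (3.0, 10.0, 1000.0)]
```

Under these assumptions every sphere outside the horizon gives m/G, so the vacuum shell between any two such spheres carries zero Tolman energy.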

The Tolman energy has proved to be a useful tool in practice: By means of \(E_{\rm T}\), Abreu and Visser gave remarkable entropy bounds for localized, but uncollapsed bodies [2, 3]. (We discuss this bound in Section 13.4.3.)

12.3.2 The Katz-Lynden-Bell-Israel energy for static spacetimes

Let \({{\mathcal S}_{\rm{K}}}: = \{f = K\}\), the set of those points of Σ where the length of the Killing field is the value K, i.e., \({{\mathcal S}_{\rm{K}}}\) are the equipotential surfaces in Σ. Let \({D_{\rm{K}}}\) ⊂ Σ be the set of those points where the magnitude of \(K^a\) is not greater than K. Suppose that \({D_{\rm{K}}}\) is compact and connected. Katz, Lynden-Bell, and Israel [309] associate a quasi-local energy to the two-surfaces \({{\mathcal S}_{\rm{K}}}\) as follows. Suppose that the matter fields can be removed from int \({D_{\rm{K}}}\) and concentrated into a thin shell on \({{\mathcal S}_{\rm{K}}}\) in such a way that the space inside is flat but the geometry outside remains the same. Then, denoting the (necessarily distributional) energy-momentum tensor of the shell by \(T_s^{ab}\) and assuming that it satisfies the weak energy condition, the total energy of the shell, \(\int\nolimits_{{D_{\rm{K}}}} {{K_a}T_s^{ab}} {t_b}\,d\Sigma\), is positive. Here \(t^a\) is the future-directed unit normal to Σ. Then, using the Einstein equations, the energy of the shell can be rewritten in terms of geometric objects on the two-surface as

$${E_{{\rm{KLI}}}}({{\mathcal S}_{\rm{K}}}): = {1 \over {8\pi G}}K\oint\nolimits_{{{\mathcal S}_{\rm{K}}}} {[k]\;d{{\mathcal S}_{\rm{K}}}},$$
(12.2)

where [k] is the jump across the two-surface of the trace of the extrinsic curvatures of the two-surface itself in Σ. Remarkably enough, the Katz-Lynden-Bell-Israel quasi-local energy \(E_{\rm KLI}\) in the form (12.2), associated with the equipotential surface \({{\mathcal S}_{\rm{K}}}\), is independent of any distributional matter field, and can also be interpreted as follows. Let \(h_{ab}\) be the metric on Σ, \(k_{ab}\) the extrinsic curvature of \({{\mathcal S}_{\rm{K}}}\) in (Σ, \(h_{ab}\)) and \(k := {h^{ab}}{k_{ab}}\). Then, suppose that there is a flat metric \(h_{ab}^0\) on Σ such that the induced metric from \(h_{ab}^0\) on \({{\mathcal S}_{\rm{K}}}\) coincides with that induced from \(h_{ab}\), and \(h_{ab}^0\) matches continuously to \(h_{ab}\) on \({{\mathcal S}_{\rm{K}}}\). (Thus, in particular, the induced area elements \(d{{\mathcal S}_{\rm{K}}}\) determined on \({{\mathcal S}_{\rm{K}}}\) by \(h_{ab}\) and \(h_{ab}^0\) coincide.) Let the extrinsic curvature of \({{\mathcal S}_{\rm{K}}}\) in \(h_{ab}^0\) be \(k_{ab}^0\), and \({k^0}: = {h^{ab}}k_{ab}^0\). Then \({E_{{\rm{KLI}}}}({{\mathcal S}_{\rm{K}}})\) is the integral on \({{\mathcal S}_{\rm{K}}}\) of K times the difference \(k - k^0\). Apart from the overall factor K, this is essentially the Brown-York energy.

In asymptotically flat spacetimes \({E_{{\rm{KLI}}}}({{\mathcal S}_{\rm{K}}})\) tends to the ADM energy [309]. However, it does not reduce to the round-sphere energy in spherically-symmetric spacetimes [374], and, in particular, gives zero for the event horizon of a Schwarzschild black hole.
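Both limits can be seen explicitly on the round spheres of a t = const. slice of the Schwarzschild solution, which are equipotential surfaces there. The sketch below is a hedged illustration: it assumes c = 1, K = f = \(\sqrt{1 - 2m/r}\), and the flat-reference Brown-York value \((r/G)(1 - \sqrt{1 - 2m/r})\) for the integral of the mean curvature difference, with the sign fixed so that the energy is non-negative. The function names are hypothetical.

```python
import math

def killing_length(r, m):
    """K = f = sqrt(1 - 2m/r) on the sphere of areal radius r >= 2m (assumed c = 1)."""
    return math.sqrt(1.0 - 2.0 * m / r)

def brown_york_energy(r, m, G=1.0):
    """Assumed flat-reference Brown-York energy of the round sphere."""
    return (r / G) * (1.0 - killing_length(r, m))

def kli_energy(r, m, G=1.0):
    """Katz-Lynden-Bell-Israel energy as K times the Brown-York value."""
    return killing_length(r, m) * brown_york_energy(r, m, G)

horizon_value = kli_energy(2.0, 1.0)      # K vanishes on the horizon r = 2m
far_value = kli_energy(1.0e6, 1.0)        # approaches the ADM energy m/G
intermediate = kli_energy(4.0, 1.0)       # differs from the round-sphere value m/G
```

Under these assumptions the value on the horizon is zero because the factor K vanishes there, while at large radius it approaches m/G.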

12.3.3 Static spacetimes and post-Newtonian approximation

The Newtonian limit of general relativity is defined in [240], pp. 71–74, via static and (at spatial infinity) asymptotically flat spacetimes. The Newtonian scalar potential ϕ is identified with the logarithm of the length of the Killing vector, i.e., (in traditional units) it is \(\phi := {c^2}\ln f\). Decomposing the energy density μ as the sum of the rest-mass energy and the internal energy, \(\mu = {c^2}\rho + u\), Einstein’s equations yield the field equation

$$- {h^{ab}}{D_a}{D_b}\phi = 4\pi G\rho + {{4\pi G} \over {{c^2}}}\left({u + 3p - {{{c^4}\lambda} \over {4\pi G}}} \right) + {1 \over {{c^2}}}{h^{ab}}({D_a}\phi)\,({D_b}\phi).$$
(12.3)

Identifying the last term as \({{4\pi G} \over {{c^2}}}(U + 3P)\), the sum of the energy density and three times the average spatial stress of the field ϕ (see the definitions (3.2) in the Newtonian case), the exact field equation (12.3) can be compared with the naïve, relativistically corrected Newtonian field equation (3.3); and one can read off the various relativistic corrections [199]. Then (3.4) motivates the definition of an ‘effective’ quasi-local energy as the integral of all the effective source terms on the right-hand side of the exact field equation (12.3):

$${E_D}: = \int\nolimits_D {(\mu + 3p - {{{c^4}\lambda} \over {4\pi G}} - {1 \over {4\pi G}}\vert {D_a}\phi {\vert ^2}){\rm{d}}\Sigma} = {{{c^2}} \over {4\pi G}}\oint\nolimits_{\mathcal S} {{\upsilon ^a}\,({D_a}\phi)\;d{\mathcal S}}.$$
(12.4)

If the spacetime is asymptotically flat at spatial infinity (in which case λ = 0) such that the hypersurface extends to spatial infinity, then, using e.g. the result of [60], one can show that \(E_D\) tends to the ADM energy as D is enlarged to exhaust the whole Σ [199]. Since in the vacuum region the integrand of the 3-dimensional integral is negative definite, near infinity \(E_D\) tends to the ADM energy as a monotonically decreasing set function. However, because of the extra relativistic correction term \({{4\pi G} \over {{c^2}}}3P = {{4\pi G} \over {{c^2}}}U\) in the source, the rate of change of this set function deviates from the one in the naïve relativistically corrected Newtonian theory of Section 3.1.1. In fact, for a two-sphere of radius r in the Schwarzschild spacetime with mass parameter m the quasi-local energy, for large r, is \({E_{{D_r}}} = {{\rm{m}} \over G}(1 + {{\rm{m}} \over r}) + {\mathcal O}({r^{- 2}})\), rather than \({E_{{D_r}}} = {{\rm{m}} \over G}(1 + {1 \over 2}{{\rm{m}} \over r}) + {\mathcal O}({r^{- 2}})\) (see Section 3.1.1).
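The large-r behaviour can be checked directly from the two-surface form of Eq. (12.4). The sketch below is an illustration under stated assumptions: c = 1, \(\phi = {1 \over 2}\ln (1 - 2m/r)\) on a t = const. slice of the Schwarzschild solution, outward unit normal \(\sqrt{1 - 2m/r}\,\partial_r\), and area \(4\pi r^2\). With these the surface integral evaluates to \(m/(G\sqrt{1 - 2m/r})\). The function names are hypothetical.

```python
import math

def effective_energy_sphere(r, m, G=1.0):
    """Two-surface form of Eq. (12.4) on the sphere of areal radius r > 2m (assumed c = 1)."""
    f_squared = 1.0 - 2.0 * m / r
    dphi_dr = (m / r**2) / f_squared    # derivative of phi = (1/2) ln(1 - 2m/r)
    normal = math.sqrt(f_squared)       # radial component of the outward unit normal
    return r**2 * normal * dphi_dr / G  # (1/(4 pi G)) * 4 pi r^2 * normal * dphi/dr

def leading_order(r, m, G=1.0):
    """Truncated expansion (m/G)(1 + m/r)."""
    return (m / G) * (1.0 + m / r)

energies = [effective_energy_sphere(r, 1.0) for r in (10.0, 100.0, 1000.0)]
```

The values decrease monotonically towards m/G as r grows, and the difference from (m/G)(1 + m/r) falls off like \(r^{-2}\), which is the behaviour described above.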

Though ED is negative in the vacuum regime, for spherically symmetric configurations, when the material source of the gravitational ‘field’ is contained in D, it is positive if an energy condition is satisfied; and it is zero if and only if the domain of dependence of D in the spacetime is flat. (For the details see [199].)

13 Applications in General Relativity

In this section we give a very short review of some of the potential applications of the paradigm of quasi-locality in general relativity. This part of the review is far from complete, and our aim here is not to discuss the problems considered in detail, but rather to give a collection of problems that are (effectively or potentially) related to quasi-local ideas, tools, notions, etc. In some of these problems the various quasi-local expressions and techniques have been used successfully, but others may provide new and promising areas for their application. For a recent review of the applications of these ideas, especially in black hole physics, with an extended bibliography, see [294, 293].

13.1 Calculation of tidal heating

According to astronomical observations, there is intense volcanic activity on the moon Io of Jupiter. One possible explanation of this phenomenon is that Jupiter is heating Io via gravitational tidal forces (like the Moon, whose gravitational tidal forces raise the ocean’s tides on the Earth). To check if this is really the case, one must be able to calculate how much energy is pumped into Io. However, gravitational energy (both in Newtonian theory and in general relativity) is only ambiguously defined (and hence, cannot be localized), while the phenomena mentioned above cannot depend on the mathematics that we use to describe them. The first investigations intended to calculate the tidal work (or heating) of a compact massive body were based on the use of various gravitational pseudotensors [432, 185]. It has been shown that, although in the given (slow motion and isolated body) approximation the interaction energy between the body and its companion is ambiguous, the tidal work that the companion does on the body via the tidal forces is not. This is independent of both the gauge conditions [432] and the actual pseudotensor (Einstein, Møller, Bergmann, or Landau-Lifshitz) [185].

Recently, these calculations were repeated using quasi-local concepts by Booth and Creighton [94]. They calculated the time derivative of the Brown-York energy, given by Eqs. (10.8) and (10.9). Assuming the form of the metric used in the pseudotensorial calculations, for the tidal work they recovered the gauge invariant expressions obtained in [432, 185]. In these approximate calculations the precise form of the boundary conditions (or reference configurations) is not essential, because the results obtained by using different boundary conditions deviate from each other only in higher order.

13.2 Geometric inequalities for black holes

13.2.1 On the Penrose inequality

To rule out a certain class of potential counterexamples to the (weak) cosmic censorship hypothesis [416], Penrose derived an inequality that any asymptotically flat initial data set with (outermost) apparent horizon \({\mathcal S}\) must satisfy [418]: The ADM mass \(m_{\rm ADM}\) of the data set cannot be less than the irreducible mass of the horizon, \(M: = \sqrt {{\rm{Area(}}{\mathcal S}{\rm{)/(16}}\pi {G^2})}\) (see, also, [213, 113, 354]). However, as stressed by Ben-Dov [75], the more careful formulation of the inequality, due to Horowitz [273], is needed: Assuming that the dominant energy condition is satisfied, the ADM mass of the data set cannot be less than the irreducible mass of the two-surface \({{\mathcal S}_{\min}}\), where \({{\mathcal S}_{\min}}\) has the minimum area among the two-surfaces enclosing the apparent horizon \({\mathcal S}\). In [75] a spherically-symmetric asymptotically flat data set with future apparent horizon is given, which violates the first, but not the second version of the Penrose inequality.
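As a simple illustration of the first version of the inequality, one can evaluate the irreducible mass for the Kerr family. The sketch below is not part of the original argument: it assumes c = 1, the standard horizon area \(4\pi (r_+^2 + a^2)\) with \(r_+ = m + \sqrt{m^2 - a^2}\), and ADM mass m/G; the function names are hypothetical.

```python
import math

def irreducible_mass(area, G=1.0):
    """M = sqrt(Area / (16 pi G^2)), as defined in the text."""
    return math.sqrt(area / (16.0 * math.pi * G**2))

def kerr_horizon_area(m, a):
    """Assumed horizon area 4 pi (r_+^2 + a^2) of the Kerr solution, |a| <= m."""
    r_plus = m + math.sqrt(m * m - a * a)
    return 4.0 * math.pi * (r_plus**2 + a * a)

def penrose_gap(m, a, G=1.0):
    """ADM mass minus irreducible mass; the Penrose inequality requires this to be >= 0."""
    return m / G - irreducible_mass(kerr_horizon_area(m, a), G)

gaps = [penrose_gap(1.0, a) for a in (0.0, 0.5, 0.9, 1.0)]
```

Under these assumptions the gap vanishes for Schwarzschild (a = 0), which saturates the inequality, and grows with a up to \(m(1 - 1/\sqrt 2)/G\) in the extremal case.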

The inequality has been proven for the outermost future apparent horizons outside the outermost past apparent horizon in maximal data sets in spherically-symmetric spacetimes [352] (see, also, [578, 250, 251]), for static black holes (using the Penrose mass, as mentioned in Section 7.2.5) [513, 514] and for the perturbed Reissner-Nordström spacetimes [301] (see, also, [302]). Although the original specific potential counterexample has been shown not to violate the Penrose inequality [214], the inequality has not been proven for a general data set. (For the limitations of the proof of the Penrose inequality for the area of a trapped surface and the Bondi mass at past null infinity [345], see [82].) If the inequality were true, then this would be a strengthened version of the positive mass theorem, providing a positive lower bound for the ADM mass.

On the other hand, for time-symmetric data sets the Penrose inequality has been proven, even in the presence of more than one black hole. The proof is based on the use of some quasi-local energy expression, mostly of Geroch or of Hawking. First it is shown that these expressions are monotonic along the normal vector field of a special foliation of the time-symmetric initial hypersurface (see Sections 6.1.3 and 6.2, and also [193]), and then the global existence of such a foliation between the apparent horizon and the two-sphere at infinity is proven. The first complete proof of the latter was given by Huisken and Ilmanen [278, 279]. (An alternative proof, using a conformal technique, was given by Bray [110, 111, 112].) A simple (but complete) proof of the Riemannian Penrose inequality is given in the special case of axisymmetric time-symmetric data sets by using Brill’s energy positivity proof [218].

A more general form of the conjecture, containing the electric charge parameter e of the black hole, was formulated by Gibbons [213]: The ADM mass is claimed not to be exceeded by \(M + {e^2}/(4{G^2}M)\). Although the weaker form of the inequality, the Bogomolny inequality \(m_{\rm ADM} \geq \vert e\vert/G\), has been proven (under assumptions on the matter content, see, e.g., [219, 508, 344, 217, 371, 213]), Gibbons’ inequality for the electric charge has been proven only for special cases (for spherically-symmetric spacetimes see, e.g., [251]), and for time-symmetric initial data sets using Geroch’s inverse mean curvature flow [290]. As a consequence of the results of [278, 279] the latter has become a complete proof. However, this inequality does not seem to work in the presence of more than one black hole: For a time-symmetric data set describing k > 1 nearly-extremal Reissner-Nordström black holes, \(M + {e^2}/(4{G^2}M)\) can be greater than the ADM mass, where \(16\pi {G^2}{M^2}\) is either the area of the outermost marginally-trapped surface [546], or the sum of the areas of the individual black hole horizons. On the other hand, the weaker inequality (13.1) below, derived from the cosmic censorship assumption, does not seem to be violated, even in the presence of more than one black hole.
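For a single Reissner-Nordström black hole the bound is saturated, which can be verified numerically. The sketch below is only an illustration: it assumes c = 1, horizon radius \(r_+ = m + \sqrt{m^2 - e^2}\), horizon area \(4\pi r_+^2\), and ADM mass m/G; the function names are hypothetical.

```python
import math

def rn_irreducible_mass(m, e, G=1.0):
    """M from Area = 4 pi r_+^2 = 16 pi G^2 M^2, i.e. M = r_+ / (2G)."""
    r_plus = m + math.sqrt(m * m - e * e)
    return r_plus / (2.0 * G)

def gibbons_bound(m, e, G=1.0):
    """Lower bound M + e^2 / (4 G^2 M) on the ADM mass."""
    M = rn_irreducible_mass(m, e, G)
    return M + e * e / (4.0 * G**2 * M)

def bogomolny_bound(e, G=1.0):
    """Weaker lower bound |e| / G."""
    return abs(e) / G

bound = gibbons_bound(1.0, 0.6)   # compare with the ADM mass m/G = 1
```

Since \(r_+^2 + e^2 = 2mr_+\), the bound equals m/G exactly for every admissible charge, while the Bogomolny bound |e|/G is attained only in the extremal case.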

Repeating Penrose’s argumentation (weak cosmic censorship hypothesis, the conjecture that the final state of black holes is described by some Kerr-Newman solution, Bondi’s mass-loss and the assumption that the Bondi mass is not greater than the ADM mass) in axisymmetric electrovacuum spacetime, and assuming that the angular momentum ma/G, measured at the future null infinity in the stationary stage (defined by the Komar integral using the Killing vector of axisymmetry) coincides with the ADM angular momentum \(J_{\rm ADM}\), for the irreducible mass M of the black hole we obtain the upper bound (see also [168])

$$2{M^2} \leq m_{{\rm{ADM}}}^2 - {1 \over {2G}}{q^2} + \sqrt {m_{{\rm{ADM}}}^4 - m_{{\rm{ADM}}}^2{{{q^2}} \over G} - {{\left({{{{J_{{\rm{ADM}}}}} \over G}} \right)}^2}}.$$
(13.1)

Here the electric charge q, measured at spatial infinity as well, is related to the charge parameter of the black hole final state as \(q = e/\sqrt G\). If initially there are more, say k, black holes, then M in (13.1) is built from the irreducible masses of the individual black holes as \({M^2}: = \sum\nolimits_{i = 1}^k {M_i^2}\). The inequality (13.1) implies that one of the following inequalities:

$$m_{{\rm{ADM}}}^2 > 2M\left({M + {{{q^2}} \over {4GM}}} \right),$$
(13.2)
$$m_{{\rm{ADM}}}^2 \geq {\left({M + {{{q^2}} \over {4GM}}} \right)^2} + {\left({{{{J_{{\rm{ADM}}}}} \over {2GM}}} \right)^2}$$
(13.3)

holds: If (13.2) is violated, then, by (13.1), the inequality (13.3) holds (though one does not exclude the other). Both inequalities give positive lower bounds for the ADM mass in terms of the irreducible mass M and other quantities measured also at spatial infinity. The Kerr-Newman solution saturates (13.3). However, while lower bounds for \(m_{\rm ADM}\) in terms of q and M can be given even on a general asymptotically flat data set, in the absence of axisymmetry it does not seem to be possible to control ma/G in terms of \(J_{\rm ADM}\), and hence, to derive lower bounds for \(m_{\rm ADM}\) in terms of M, q and \(J_{\rm ADM}\).
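The saturation by the Kerr-Newman solution can be confirmed numerically. The following sketch is a hedged illustration: it assumes c = 1, horizon area \(4\pi (r_+^2 + a^2)\) with \(r_+ = m + \sqrt{m^2 - a^2 - e^2}\), and the identifications \(m_{\rm ADM} = m/G\), \(q = e/\sqrt G\), \(J_{\rm ADM} = ma/G\) used in the text; the function names are hypothetical.

```python
import math

def kerr_newman_data(m, a, e, G=1.0):
    """Return (M, m_ADM, q, J_ADM) for the Kerr-Newman solution (assumed formulas)."""
    r_plus = m + math.sqrt(m * m - a * a - e * e)
    area = 4.0 * math.pi * (r_plus**2 + a * a)
    M = math.sqrt(area / (16.0 * math.pi * G**2))   # irreducible mass
    return M, m / G, e / math.sqrt(G), m * a / G

def rhs_13_1(m_adm, q, J, G=1.0):
    """Right-hand side of Eq. (13.1), an upper bound for 2 M^2."""
    root = math.sqrt(m_adm**4 - m_adm**2 * q * q / G - (J / G)**2)
    return m_adm**2 - q * q / (2.0 * G) + root

def rhs_13_3(M, q, J, G=1.0):
    """Right-hand side of Eq. (13.3), a lower bound for m_ADM^2."""
    return (M + q * q / (4.0 * G * M))**2 + (J / (2.0 * G * M))**2

M, m_adm, q, J = kerr_newman_data(1.0, 0.3, 0.4, G=2.0)
saturates_13_1 = math.isclose(2.0 * M * M, rhs_13_1(m_adm, q, J, G=2.0), rel_tol=1e-10)
saturates_13_3 = math.isclose(m_adm**2, rhs_13_3(M, q, J, G=2.0), rel_tol=1e-10)
```

For Kerr-Newman data (13.1) holds with equality, so \(2{M^2} + {q^2}/(2G) \geq m_{{\rm{ADM}}}^2\) and (13.2) fails; it is therefore (13.3) that holds, also with equality.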

The structure of Eqs. (13.2) and (13.3) suggests another interpretation, too. In fact, since M is a quasi-locally defined property of the black hole itself, it is natural to ask if the lower bound for the ADM mass can be given only in terms of quasi-locally defined quantities. In the absence of charges outside the horizon, q is just the charge measured at \({{\mathcal S}_{\min}}\), and if, in addition, the spacetime is axisymmetric and vacuum, then \(J_{\rm ADM}\) coincides with the Komar angular momentum also at \({{\mathcal S}_{\min}}\). However, in general it is not clear what \(J^2\) would have to be: The magnitude of some quasi-locally defined relativistic angular momentum, or only of the spatial part of the angular momentum, or even the Pauli-Lubanski spin?

Penrose-like inequalities are studied numerically in [295], while counter-examples to a new version, and to a generalized form (including charge) of the Penrose inequality are given in [129] and [166], respectively. Reviews of the Penrose inequality with an extended bibliography are [354, 355].

13.2.2 On the hoop conjecture

In connection with the formation of black holes and the weak cosmic censorship hypothesis, another geometric inequality has also been formulated. This is the hoop conjecture of Thorne [506, 366], saying that ‘black holes with horizons form when and only when a mass m gets compacted into a region whose circumference C in every direction is \(C \leq 4\pi Gm\)’ (see, also, [188, 538]). Mathematically, this conjecture is not precisely formulated. Neither the mass nor the notion of the circumference is well defined. In certain situations the mass might be the ADM or the Bondi mass, but might be the integral of some locally-defined ‘mass density’, as well [188, 50, 350, 320]. The most natural formulation of the hoop conjecture would be based on some spacelike two-surface \({\mathcal S}\) and some reasonable notion of the quasi-local mass, and the trapped nature of the surface would be characterized by the mass and the ‘circumference’ of \({\mathcal S}\). In fact, for round spheres outside the outermost trapped surface and the standard round-sphere definition of the quasi-local energy (4.7) one has 4πGE = 2πr[1 − exp(− 2α)] < 2πr = C, where we use the fact that r is an areal radius (see Section 4.2.1).

Another formulation of the hoop conjecture, also for the spherically symmetric configurations, was given by Ó Murchadha, Tung, Xie and Malec in [402] using the Brown-York energy. They showed that a spherical 2-surface, which is embedded in a spherically symmetric asymptotically flat 3-slice with a regular center and which satisfies \(C < 2\pi G{E_{\rm BY}}\), is trapped. Moreover, if \(C > 2\pi G{E_{\rm BY}}\) holds for all embeddings, then the surface is not trapped. The root of the deviation of the numerical coefficient in front of the quasi-local energy \(E_{\rm BY}\) here (viz. 2π) from the one in Thorne’s original formulation (i.e., 4π) is the fact that \(E_{\rm BY}\) on the event horizon of a Schwarzschild black hole is 2m, rather than the expected m. It is also shown in [402] that no analogous statement can be proven in terms of the Kijowski-Liu-Yau or the Wang-Yau energies.
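The role of the factor 2π can be made concrete on the round spheres of a t = const. slice of the Schwarzschild solution. The sketch below covers only this one embedding and is an illustration, not the argument of [402]: it assumes c = 1, circumference C = 2πr, and the flat-reference Brown-York value \((r/G)(1 - \sqrt{1 - 2m/r})\); the function names are hypothetical.

```python
import math

def brown_york_energy(r, m, G=1.0):
    """Assumed Brown-York energy of the round sphere of areal radius r >= 2m."""
    return (r / G) * (1.0 - math.sqrt(1.0 - 2.0 * m / r))

def hoop_margin(r, m, G=1.0):
    """C - 2 pi G E_BY for the round sphere; positive means C > 2 pi G E_BY."""
    circumference = 2.0 * math.pi * r
    return circumference - 2.0 * math.pi * G * brown_york_energy(r, m, G)

margin_on_horizon = hoop_margin(2.0, 1.0)   # marginal case r = 2m
margin_outside = hoop_margin(3.0, 1.0)      # sphere outside the horizon
```

On the horizon \(E_{\rm BY} = 2m/G\) and \(C = 4\pi m = 2\pi G{E_{\rm BY}}\), so the margin vanishes there, while it is positive for every r > 2m in this slice.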

If, however, \({\mathcal S}\) is not axisymmetric, then there is no natural definition (or, there are several inequivalent ‘natural’ definitions) for the circumference of \({\mathcal S}\). Interesting, necessary and also sufficient conditions for the existence of averaged trapped surfaces in non-spherically-symmetric cases, both in special asymptotically flat and cosmological spacetimes, are found in [350, 320]. For the investigations of the hoop conjecture in the Gibbons-Penrose spacetime of the collapsing thin matter shell see [51, 50, 518, 411], and for colliding black holes see [574]. One reformulation of the hoop conjecture, using the new concept of the ‘trapped circle’ instead of the ill-defined circumference, is suggested by Senovilla [450]. Another version of the hoop conjecture was suggested by Gibbons in terms of the ADM mass and the Birkhoff invariant of horizons of spherical topology, and this form of the conjecture was proved in a number of special cases [215, 159].

13.2.3 On the Dain inequality

The Kerr-Newman solution describes a black hole precisely when the mass parameter dominates the angular momentum and the charge parameters: \({m^2} \geq {a^2} + {e^2}\). Thus, it is natural to ask whether or not an analogous inequality holds for more general, dynamic black holes. As Dain has proven, in the axisymmetric, vacuum case there is an analogous inequality, a consequence of an extremality property of Brill’s form of the ADM mass. Namely, it is shown in [165] that the unique absolute minimum of the ADM mass functional on the set of the vacuum Brill data sets with fixed ADM angular momentum is the extreme Kerr data set. Here a Brill data set is an axisymmetric, asymptotically flat, maximal, vacuum data set, which, in addition, satisfies certain global conditions (viz. the form of the metric is given globally, and nontrivial boundary conditions are imposed) [218, 165]. The key tool is a manifestly positive definite expression of the ADM energy in the form of a three-dimensional integral, given in globally defined coordinates. If the angular momentum is nonzero, then by the assumption of axisymmetry and vacuum, the data set contains a black hole (or black holes), and hence, the extremality property of the ADM energy implies that the ADM mass of this (in general, nonstationary) black hole cannot be less than its ADM angular momentum. For further discussion of this inequality, in particular its role analogous to that of the Penrose inequality, see [164]; and for earlier versions of the extremality result above, see [163, 162, 161].

Since in the above result the spacetime is axisymmetric and vacuum, the ADM angular momentum could be written as the Komar integral built from the Killing vector of axisymmetry on any closed spacelike spherical two-surface homologous to the large sphere near the actual infinity. Thus, the angular momentum in Dain’s inequality can be considered as a quasi-local expression. Hence, it is natural to ask if the whole inequality is a condition on quasi-locally defined quantities or not. However, as already noted in Section 12.1, in the stationary axisymmetric but nonvacuum case it is possible to arrange the matter outside the horizon in such a way that the Komar angular momentum on the horizon is greater than the square of the Komar energy there, or the latter can even be negative [20, 21]. Therefore, if a mass-angular momentum inequality is expected to hold quasi-locally at the horizon, then it is not obvious which definitions for the quasi-local mass and angular momentum should be used. In the stationary axisymmetric case, the angular momentum could still be the Komar expression, but the role of the mass is taken by the area of the event horizon [266]: \({\rm{Area(}}{\mathcal S}{\rm{)}} \geq {\rm{8}}\pi G{J_{\rm{K}}}\). For the extremal case (even in the presence of Maxwell fields), see [22]. (For the extremality of black holes formulated in terms of isolated and dynamic horizons, see [99] and Section 13.3.2.)
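In the vacuum Kerr family the area-angular momentum inequality can be checked directly. The sketch below is only an illustration of that special case: it assumes c = 1, horizon area \(4\pi (r_+^2 + a^2)\) with \(r_+ = m + \sqrt{m^2 - a^2}\), and Komar angular momentum \(J_{\rm K} = ma/G\); the function name is hypothetical.

```python
import math

def area_minus_8piGJ(m, a, G=1.0):
    """Area(S) - 8 pi G J_K on the Kerr event horizon (assumed formulas), |a| <= m."""
    r_plus = m + math.sqrt(m * m - a * a)
    area = 4.0 * math.pi * (r_plus**2 + a * a)
    komar_angular_momentum = m * a / G
    return area - 8.0 * math.pi * G * komar_angular_momentum

margins = [area_minus_8piGJ(1.0, a) for a in (0.0, 0.5, 0.99, 1.0)]
```

Since the area equals \(8\pi mr_+\), the margin is \(8\pi m(r_+ - a)\): it is positive for |a| < m and vanishes exactly in the extremal case a = m.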

For a recent, very readable and comprehensive review of the Dain inequality with an extended bibliography, where both the old and the recent results are summarized, see the topical review of Dain himself in [167].

13.3 Quasi-local laws of black hole dynamics

13.3.1 Quasi-local thermodynamics of black holes

Black holes are usually introduced in asymptotically flat spacetimes [237, 238, 240, 534], and hence, it is natural to derive the formal laws of black hole mechanics/thermodynamics in the asymptotically flat context (see, e.g., [49, 67, 68], and for a comprehensive review, [539]). The discovery of Hawking radiation [239] showed that the laws of black hole thermodynamics are not only analogous to the laws of thermodynamics, but black holes are genuine thermodynamic objects: black hole temperature is a physical temperature, that is ħc/(2πk) times the surface gravity, and its entropy is a physical entropy, \(k{c^3}/(4G\hbar)\) times the area of the horizon (in the traditional units with the Boltzmann constant k, speed of light c, Newton’s gravitational constant G, and Planck’s constant ħ) (see, also, [537]). Apparently, the detailed microscopic (quantum) theory of gravity is not needed to derive black hole entropy, and it can be derived even from the general principles of a conformal field theory on the horizon of black holes [124, 125, 126, 409, 127, 128].
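To give a feeling for the orders of magnitude, the two formulas can be evaluated for a Schwarzschild black hole of one solar mass. The sketch below is an illustration under stated assumptions: the surface gravity is taken with the dimension of inverse length, \(\kappa = {c^2}/(4GM)\), the horizon area is \(16\pi {(GM/{c^2})^2}\), and the constants are rounded SI values. The function names are hypothetical.

```python
import math

# Rounded SI values (approximations, used only for illustration)
HBAR = 1.0546e-34      # reduced Planck constant, J s
C = 2.998e8            # speed of light, m / s
G_NEWTON = 6.674e-11   # Newton's constant, m^3 / (kg s^2)
K_B = 1.3807e-23       # Boltzmann constant, J / K
SOLAR_MASS = 1.989e30  # kg

def hawking_temperature(mass):
    """T = (hbar c / (2 pi k)) * kappa, with kappa = c^2 / (4 G M) in 1/m."""
    surface_gravity = C**2 / (4.0 * G_NEWTON * mass)
    return HBAR * C * surface_gravity / (2.0 * math.pi * K_B)

def entropy_over_k(mass):
    """S / k = c^3 Area / (4 G hbar), with Area = 16 pi (G M / c^2)^2."""
    area = 16.0 * math.pi * (G_NEWTON * mass / C**2)**2
    return C**3 * area / (4.0 * G_NEWTON * HBAR)

temperature = hawking_temperature(SOLAR_MASS)   # of order 1e-7 K
entropy = entropy_over_k(SOLAR_MASS)            # of order 1e77
```

The temperature scales like 1/M and the entropy like M², which is the origin of the negative heat capacity discussed in this section.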

However, black holes are localized objects, thus, one must be able to describe their properties and dynamics even at the quasi-local level. Nevertheless, beyond this rather theoretical claim, there are pragmatic reasons that force us to quasi-localize the laws of black hole dynamics. In particular, it is well known that the Schwarzschild black hole, fixing its temperature at infinity, has negative heat capacity. Similarly, in an asymptotically anti-de Sitter spacetime, fixing black hole temperature via the normalization of the timelike Killing vector at infinity is not justified because there is no such physically-distinguished Killing field (see [116]). These difficulties lead to the need of a quasi-local formulation of black hole thermodynamics. In [116], Brown, Creighton, and Mann investigated the thermal properties of the Schwarzschild-anti-de Sitter black hole. They used the quasi-local approach of Brown and York to define the energy of the black hole on a spherical two-surface \({\mathcal S}\) outside the horizon. Identifying the Brown-York energy with the internal (thermodynamic) energy and (in the k = ħ = c = 1 units) 1/(4G) times the area of the event horizon with the entropy, they calculated the temperature, surface pressure, and heat capacity. They found that these quantities do depend on the location of the surface \({\mathcal S}\). In particular, there is a critical value T0 such that for temperatures T greater than T0 there are two black hole solutions, one with positive and one with negative heat capacity, but there are no Schwarzschild-anti-de Sitter black holes with temperature T less than T0. In [157] the Brown-York analysis is extended to include dilaton and Yang-Mills fields, and the results are applied to stationary black holes to derive the first law of black hole thermodynamics. 

The Noether charge formalism of Wald [536], and Iyer and Wald [287] can be interpreted as a generalization of the Brown-York approach from general relativity to any diffeomorphism invariant theory to derive quasi-local quantities [288]. However, this formalism gave a general expression for the black hole entropy, as well. That is the Noether charge derived from the Hilbert Lagrangian corresponding to the null normal of the horizon, and explicitly this is still 1/(4G) times the area of the horizon. (For related work see, e.g., [205, 253].) A comparison of the various proposals for the surface gravity of dynamic black holes in spherically-symmetric black hole spacetimes is given by Nielsen and Yoon [396].

There is extensive literature on the quasi-local formulation of the black hole dynamics and relativistic thermodynamics in the spherically-symmetric context (see, e.g., [250, 252, 251, 256] and for non-spherically-symmetric cases [372, 254, 96]). These investigations are based on the quasilocally defined notion of trapping horizons [246]. A trapping horizon is a smooth hypersurface that can be foliated by (e.g., future) marginally-trapped surfaces such that the expansion of the outgoing null normals is decreasing along the incoming null normals. (On the other hand, the investigations of [248, 246, 249] are based on gauge-dependent energy and angular momentum definitions; see also Sections 4.1.8 and 6.3.) For reviews of the quasi-local formulations and the various aspects of black hole dynamics based on the notion of trapping horizons, see [41, 294, 395, 255], and, for a recent one with an extended bibliography, see [292].

13.3.2 On isolated and dynamic horizons

The idea of isolated horizons (more precisely, the gradually more restrictive notion of nonexpanding, weakly isolated and isolated horizons, and the special weakly isolated horizon called rigidly rotating) generalizes the notion of Killing horizons by keeping their basic properties without the existence of any Killing vector in general. Thus, while the black hole is thought to be settled down to its final state, the spacetime outside the black hole may still be dynamic. (For a review see [32, 41] and references therein, especially [34, 31].) The phase space for asymptotically flat spacetimes containing an isolated horizon is based on a three-manifold with an asymptotic end (or finitely many such ends) and an inner boundary. The boundary conditions on the inner boundary are determined by the precise definition of the isolated horizon. Then the Hamiltonian is the sum of the constraints and boundary terms, corresponding both to the ends and the horizon. Thus, the appearance of the boundary term on the inner boundary makes the Hamiltonian partly quasi-local. It is shown that the condition of the Hamiltonian evolution of the states on the inner boundary along the evolution vector field is precisely the first law of black hole mechanics [34, 31].

Booth [93] applied the general idea of Brown and York to a domain D whose boundary consists not only of two spacelike submanifolds Σ1 and Σ2 and a timelike one \({}^3B\), but a further, internal boundary Δ as well, which is null. Thus, he made the investigations of the isolated horizons fully quasi-local. Therefore, the topology of Σ1 and Σ2 is \({S^2} \times [a,b]\), and the inner (null) boundary is interpreted as (a part of) a nonexpanding horizon. Then, to have a well-defined variational principle on D, the Hilbert action had to be modified by appropriate boundary terms. However, by requiring Δ to be a rigidly-rotating horizon, the boundary term corresponding to Δ and the allowed variations are considerably restricted. This made it possible to derive the ‘first law of rigidly rotating horizon mechanics’ quasi-locally, an analog of the first law of black hole mechanics. The first law for rigidly-rotating horizons was also derived by Allemandi, Francaviglia, and Raiteri in the Einstein-Maxwell theory [9] using their Regge-Teitelboim-like approach [191]. The first law for ‘slowly evolving horizons’ was derived in [96].

Another concept is the notion of a dynamic horizon [39, 40]. This is a smooth spacelike hypersurface that can be foliated by a geometrically distinguished family of (e.g., future) marginally-trapped surfaces, i.e., it is a generalization of the trapping horizon above. The isolated horizons are thought to be the asymptotic state of dynamic horizons. The local existence of such horizons was proven by Andersson, Mars and Simon [19]: If \({\mathcal S}\) is a (strictly stably outermost) marginally trapped surface lying in a leaf, e.g., Σ0, of a foliation Σt of the spacetime, then there exists a hypersurface \({\mathcal H}\) (the ‘horizon’) such that \({\mathcal S}\) lies in \({\mathcal H}\), and which is foliated by marginally outer-trapped surfaces. (For the related uniqueness properties of the structure of the dynamic horizons see [35].) This structure of the dynamic horizons makes it possible to derive balance equations for the areal radius of the surfaces \({\mathcal S}\) and the angular momentum given by Eq. (11.3) [32, 40] (see also [41]). In particular, the difference of the areal radius of two marginally-trapped surfaces of the foliation, e.g., \({{\mathcal S}_1}\) and \({{\mathcal S}_2}\), is just the flux integral on the portion of \({\mathcal H}\) between \({{\mathcal S}_1}\) and \({{\mathcal S}_2}\) of a positive definite expression: This is the flux of the energy current of the matter fields and terms that can be interpreted as the energy flux carried by the gravitational waves. Interestingly enough, the generator vector field in this flux expression is proportional to the geometrically distinguished outward null normal of the surfaces \({\mathcal S}\), just as in the derivation of black hole entropy as a Noether charge by Wald [536] and Iyer and Wald [287] above. Thus, the second law of black hole mechanics is proven for dynamic horizons. Moreover, this supports the view that the energy that we should associate with marginally-trapped surfaces is the irreducible mass.
For further discussion (and generalizations) of the basic flux expressions see [227, 228]. For a different calculation of the energy flux in the Vaidya spacetime, see [531].
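
The link between the areal radius and the irreducible mass can be made explicit in a short worked relation. This is a sketch in c = 1 units, writing R for the areal radius of a marginally trapped surface \({\mathcal S}\) and \(m_{\rm irr}\) for its irreducible mass; the symbols R, \(m_{\rm irr}\), m and J are notation introduced here only for illustration.

$$R: = \sqrt {{{{\rm{Area}}({\mathcal S})} \over {4\pi}}} ,\qquad {m_{{\rm{irr}}}}: = \sqrt {{{{\rm{Area}}({\mathcal S})} \over {16\pi {G^2}}}} = {R \over {2G}}.$$

Hence a balance equation showing that R cannot decrease along the horizon is equivalent to the statement that \(m_{\rm irr}\) cannot decrease. As a consistency check in a familiar case, for a cross section of the event horizon of a Kerr black hole with mass m and angular momentum J (in G = c = 1 units) the area is \(8\pi m(m + \sqrt{m^2 - J^2/m^2})\), so that

$$m_{{\rm{irr}}}^2 = {1 \over 2}\left({{m^2} + \sqrt {{m^4} - {J^2}}} \right) \leq {m^2},$$

with equality precisely in the Schwarzschild case J = 0.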

In [97, 98] Booth and Fairhurst extended their previous investigations [93, 95] (see above and Section 10.1.5). In [97] a canonical analysis, based on the extended phase space, is given such that the underlying three-manifold has an inner boundary, which can be any of the horizon types above. Though the formalism does not give any explicit expression for the energy on the horizons, an argument is given that supports the expectation that this must be the irreducible mass of the horizon. The variations of marginally trapped surfaces, generated by vector fields orthogonal to the surfaces, are investigated and the corresponding variations of various geometric objects (intrinsic metric, expansions, connection one-form on the normal bundle, etc.) on the surfaces are calculated in [98]. In terms of these, several basic properties of marginally trapped or future outer trapped surfaces (and hence, of the horizons themselves) are derived in a straightforward way.

13.4 Entropy bounds

13.4.1 On Bekenstein’s bounds for the entropy

Having associated the entropy \({S_{{\rm{bh}}}}: = [k{c^3}/(4G\hbar)]{\rm{Area}}({\mathcal S})\) with the (spacelike cross section \({\mathcal S}\) of the) event horizon, it is natural to expect the generalized second law (GSL) of thermodynamics to hold, i.e., the sum \({S_{\rm{m}}} + {S_{{\rm{bh}}}}\) of the entropy of the matter and the black holes cannot decrease in any process. However, as Bekenstein pointed out, it is possible to construct thought experiments (e.g., the Geroch process) in which the GSL is violated, unless a universal upper bound for the entropy-to-energy ratio for bounded systems exists [69, 70]. (For another resolution of the apparent contradiction to the GSL, based on the calculation of the buoyancy force in the thermal atmosphere of the black hole, see [532, 537].) In traditional units this upper bound is given by \({S_{\rm{m}}}/E \leq [2\pi k/(\hbar c)]R\), where E and \({S_{\rm{m}}}\) are, respectively, the total energy and entropy of the system, and R is the radius of the sphere that encloses the system. It is remarkable that this inequality does not contain Newton’s constant, and hence, it can be expected to be applicable even for nongravitating systems. Although this bound is violated for several model systems, for a wide class of systems in Minkowski spacetime the bound does hold [404, 405, 406, 71] (see also [104]). The Bekenstein bound has been extended to systems with electric charge by Zaslavskii [579] and to rotating systems by Hod [269] (see also [72, 226]). Although these bounds were derived for test bodies falling into black holes, interestingly enough these Bekenstein bounds hold for the black holes themselves, provided the generalized Gibbons-Penrose inequality (13.3) holds. Identifying E with \({m_{{\rm{ADM}}}}{c^2}\) and letting R be a radius for which \(4\pi {R^2}\) is not less than the area of the event horizon of the black hole, Eq. (13.3) can be rewritten in the traditional units as

$$2\pi \sqrt {{{(RE)}^2} - {J^2}} \geq {{\hbar c} \over k}{S_{{\rm{bh}}}} + \pi {q^2}.$$
(13.4)

Obviously, the Kerr-Newman solution saturates this inequality, and in the q = 0 = J, J = 0, and q = 0 special cases, (13.4) reduces to the upper bound given, respectively, by Bekenstein, Zaslavskii, and Hod. A further consequence of the GSL is that there is a lower bound for the ratio of the viscosity to the entropy density of fluids [190, 271]. (It is interesting to note that an analogous lower bound for the relaxation time of any perturbed system, derived for nongravitational systems in [270], is saturated by extremal Reissner-Nordström black holes.)
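
The saturation of (13.4) by the Kerr-Newman solution can be checked numerically. The sketch below works in G = c = ħ = k = 1 units, takes R to be the areal radius of the horizon (so that 4πR² equals the horizon area), and uses the standard Kerr-Newman horizon radius and area; the function name and the parameter values are hypothetical and serve only as an illustration.

```python
import math

def kerr_newman_bound_sides(m, a, q):
    """Return (lhs, rhs) of inequality (13.4) for a Kerr-Newman black hole.

    m: mass, a: angular momentum per unit mass (J = m*a), q: charge,
    all in G = c = hbar = k = 1 units (assumed normalization).
    """
    if m * m < a * a + q * q:
        raise ValueError("no horizon: m^2 < a^2 + q^2")
    r_plus = m + math.sqrt(m * m - a * a - q * q)   # outer horizon radius
    area = 4.0 * math.pi * (r_plus**2 + a**2)       # horizon area
    entropy = area / 4.0                            # S_bh = Area/4 in these units
    R = math.sqrt(area / (4.0 * math.pi))           # areal radius: 4 pi R^2 = Area
    J = m * a
    lhs = 2.0 * math.pi * math.sqrt((R * m)**2 - J**2)
    rhs = entropy + math.pi * q**2
    return lhs, rhs

if __name__ == "__main__":
    for params in [(1.0, 0.0, 0.0), (1.0, 0.5, 0.3), (2.0, 1.2, 0.9)]:
        print(params, kerr_newman_bound_sides(*params))
```

Both sides reduce to 2πm r₊, which follows from the horizon condition r₊² − 2m r₊ + a² + q² = 0; choosing any larger R makes the inequality strict.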

One should stress, however, that in general curved spacetimes the notion of energy, angular momentum, and radial distance appearing in Eq. (13.4) are not yet well defined. Perhaps it is just the quasi-local ideas that should be used to make them well defined, and there is a deep connection between the Gibbons-Penrose inequality and the Bekenstein bound. The former is the geometric manifestation of the latter for black holes.

13.4.2 On the holographic hypothesis

In the literature there is another kind of upper bound for the entropy of a localized system, the holographic bound. The holographic principle [504, 482, 104] says that, at the fundamental (quantum) level, one should be able to characterize the state of any physical system located in a compact spatial domain by degrees of freedom on the surface of the domain as well, analogous to the holography by means of which a three-dimensional image is encoded into a two-dimensional surface. Consequently, the number of physical degrees of freedom in the domain is bounded from above by the area of the boundary of the domain instead of its volume, and the number of physical degrees of freedom on the two-surface is not greater than one-fourth of the area of the surface measured in Planck-area units \(L_{\rm{P}}^2: = G\hbar/{c^3}\). This expectation is formulated in the (spacelike) holographic entropy bound [104]. Let Σ be a compact spacelike hypersurface with boundary \({\mathcal S}\). Then the entropy S(Σ) of the system in Σ should satisfy \(S(\Sigma) \leq \,k\,{\rm{Area}}({\mathcal S})/(4L_{\rm{P}}^2)\). Formally, this bound can be obtained from the Bekenstein bound with the assumption that \(2E \leq R{c^4}/G\), i.e., that R is not less than the Schwarzschild radius of E. Also, as with the Bekenstein bounds, this inequality can be violated in specific situations (see also [539, 104]).
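
The formal step from the Bekenstein bound to the spacelike holographic bound can be written out explicitly. This sketch assumes that the enclosing sphere of radius R has area \({\rm{Area}}({\mathcal S}) = 4\pi {R^2}\), and that R is not less than the Schwarzschild radius of E, i.e., \(E \leq R{c^4}/(2G)\):

$$S \leq {{2\pi k} \over {\hbar c}}ER \leq {{2\pi k} \over {\hbar c}}{{{c^4}R} \over {2G}}R = {{k{c^3}} \over {4G\hbar}}4\pi {R^2} = {{k\,{\rm{Area}}({\mathcal S})} \over {4L_{\rm{P}}^2}}.$$

Newton’s constant thus enters only through the second inequality, i.e., through the condition that the system has not collapsed inside its own Schwarzschild radius.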

On the other hand, there is another formulation of the holographic entropy bound, due to Bousso [103, 104]. Bousso’s covariant entropy bound is much more quasi-local than the previous formulations, and is based on spacelike two-surfaces and the null hypersurfaces determined by the two-surfaces in the spacetime. Its classical version has been proven by Flanagan, Marolf, and Wald [189]. If \({\mathcal N}\) is an everywhere noncontracting (or nonexpanding) null hypersurface with spacelike cuts \({{\mathcal S}_1}\) and \({{\mathcal S}_2}\), then, assuming that the local entropy density of the matter is bounded by its energy density, the entropy flux \({{\mathcal S}_{\mathcal N}}\) through \({\mathcal N}\) between the cuts \({{\mathcal S}_1}\) and \({{\mathcal S}_2}\) is bounded: \({{\mathcal S}_{\mathcal N}} \leq k\vert \mathrm {Area}({{\mathcal S}_2}) - \mathrm {Area}({{\mathcal S}_1})\vert/(4L_{\mathrm {P}}^2)\). For a detailed discussion see [539, 104]. For another, quasi-local formulation of the holographic principle see Section 2.2.5 and [498].

13.4.3 Entropy bounds of Abreu and Visser for uncollapsed bodies

Let the spacetime be static and asymptotically flat (and hence we use the notation of Section 12.3.1), and let the localized, uncollapsed body be contained in the domain D ⊂ Σ with smooth, compact boundary \({\mathcal S}: = \partial D\). Then Abreu and Visser define the surface gravity vector to be the acceleration of the Killing observers weighted by the red-shift factor: \({\kappa ^e}: = - f{a^e} = {D^e}f\). Its flux integral on \({\mathcal S}\) is just \(4\pi G/{c^4}\) times the Tolman energy (see Section 12.3.1). Then, by the Gibbs-Duhem relation, the equilibrium and stability conditions of Tolman and the Unruh relation between temperature and surface gravity, Abreu and Visser derive [2, 3] the upper bound

$$S[D] \leq {1 \over 2}{{k{c^3}} \over {G\hbar}}{\rm{Area}}({\mathcal S})$$
(13.5)

for the entropy S[D] of the uncollapsed body. The numerical factor ½ (instead of the well known ¼ in the Bekenstein entropy for black holes) is interpreted to be a consequence of the fact that here temperature is the usual intensive variable for uncollapsed matter, in contrast to the black hole temperature (which is not an intensive variable). The bound (13.5) is generalized and extended to stationary (rotating) uncollapsed bodies in [4].

13.5 Quasi-local radiative modes of general relativity

In Section 8.2.3 we discuss the properties of the Dougan-Mason energy-momenta, and we see that, under the conditions explained there, the energy-momentum is vanishing iff D(Σ) is flat, and it is null iff D(Σ) is a pp-wave geometry with pure radiative matter, and that these properties of the domain of dependence D(Σ) are completely encoded into the geometry of the two-surface \({\mathcal S}\). However, there is an important difference between these two statements. While in the former case we know the metric of D(Σ) is flat, in the second we know only that the geometry admits a constant null vector field, but we do not know the line element itself. Thus, the question arises as to whether the metric of D(Σ) is also determined by the geometry of \({\mathcal S}\) even in the zero quasi-local-mass case.

In [492] it is shown that under the condition above there is a complex valued function Φ on \({\mathcal S}\), describing the deviation of the antiholomorphic and holomorphic spinor dyads from each other, which plays the role of a potential for the curvature \({F^A}_{Bcd}\) on \({\mathcal S}\). Then, assuming that \({\mathcal S}\) is future and past convex and the matter is an N-type zero-rest-mass field, Φ and the value ϕ of the matter field on \({\mathcal S}\) determine the curvature of D(Σ). Since the field equations for the metric of D(Σ) reduce to Poisson-like equations with the curvature as the source, the metric of D(Σ) is also determined by Φ and ϕ on \({\mathcal S}\). Therefore, the (purely radiative) pp-wave geometry and matter field on D(Σ) are completely encoded in the geometry of \({\mathcal S}\) and complex functions defined on \({\mathcal S}\), respectively, in complete agreement with the holographic principle of Section 13.4.

As we saw in Section 2.2.5, the radiative modes of the zero-rest-mass-fields in Minkowski spacetime, defined by their Fourier expansion, can be characterized quasi-locally on the globally hyperbolic subset D(Σ) of the spacetime by the value of the Fourier modes on the appropriately convex spacelike two-surface \({\mathcal S} = \partial \Sigma\). Thus, the two transversal radiative modes of these fields are encoded in certain fields on \({\mathcal S}\). On the other hand, because of the nonlinearity of the Einstein equations, it is difficult to define the radiative modes of general relativity. It could be done when the field equations become linear, i.e., near the null infinity, in the linear approximation and for pp-waves. In the first case the gravitational radiation is characterized on a cut \({{\mathcal S}_\infty}\) of the null infinity ℐ+ by the u-derivative \({\dot \sigma ^0}\) of the asymptotic shear of the outgoing null hypersurface \({\mathcal N}\) for which \({{\mathcal S}_\infty} = {\mathcal N} \cap {{\mathscr I}^ +}\), i.e., by a complex function on \({{\mathcal S}_\infty}\). It is remarkable that it is precisely this complex function, which yields the deviation of the holomorphic and antiholomorphic spin frames at the null infinity (see, for example, [496]). The linear approximation of Einstein’s theory is covered by the analysis of Section 2.2.5, thus those radiative modes can be characterized quasi-locally, while for the pp-waves, the result of [492], reported above, gives just such a quasi-local characterization in terms of a complex function measuring the deviation of the holomorphic and antiholomorphic spin frames. However, the deviation of the holomorphic and antiholomorphic structures on \({\mathcal S}\) can be defined even for generic two-surfaces in generic spacetimes as well, which might yield the possibility of introducing the radiative modes quasi-locally in general.

13.6 Potential applications in cosmology

The systematic deviation of the observed luminosity-red-shift values for type Ia supernovae for large red shift from the expected ones in the standard Friedmann-Robertson-Walker model is usually interpreted as evidence that the expansion of the universe is accelerating. To generate this acceleration, a hypothetical matter field, the dark energy violating the strong energy condition, is postulated. Here the homogeneity and isotropy of the space, i.e., the use of the Friedmann-Robertson-Walker line element, seems to be justified by the isotropy and the thermal nature of the cosmic microwave background radiation. Nevertheless, as is well known, the observed matter distribution is far from being homogeneous. There are huge voids and the matter is distributed as walls between the voids, as in a foam; and hence, the homogeneity of the universe is expected only after an averaging at a larger scale.

However, motivated by quasi-local energy-momentum ideas, Wiltshire [547, 548, 551] suggested a new averaging procedure (see also [550, 549]). Since, by the general relativistic redshift, clocks in the voids run significantly faster than in the presence of matter (i.e., in the walls), the average should be taken in the voids and in the walls separately, and the model of the universe is built from these two like Swiss cheese. Then cosmic acceleration is explained only as an apparent phenomenon, due to the naïve averaging above, in which the general relativistic clock effect was not taken into account, and hence, no dark energy is needed. A readable review of the key ideas is [552].

14 Summary: Achievements, Difficulties, and Open Issues

In the previous sections we have tried to give an objective review of the present state of the art. This section is, however, more subjective: We close the present review with a critical discussion, evaluating strategies, approaches etc. that are explicitly and unambiguously given and (at least in principle) applicable in any generic spacetime.

14.1 On the Bartnik mass and Hawking energy

Although in the literature the notions mass and energy are used almost synonymously, in the present review we have made a distinction between them. By energy we mean the time component of the energy-momentum four-vector, i.e., a reference-frame-dependent quantity, while by mass we mean the length of the energy-momentum, i.e., an invariant. In fact, these two have different properties. The quasi-local energy (both for matter fields and for gravity according to the Dougan-Mason definition) is vanishing precisely for the ‘ground state’ of the theory (i.e., for the vanishing energy-momentum tensor in the domain of dependence D(Σ) and the flatness of D(Σ), see Sections 2.2.5 and 8.2.3, respectively). In particular, for configurations describing pure radiation (purely radiative matter fields and pp-waves, respectively) the energy is positive. On the other hand, the vanishing of the quasi-local mass does not characterize the ‘ground state’, rather that is equivalent only to these purely radiative configurations.
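
The distinction can be stated in one line. As a sketch, in c = 1 units and with a signature in which timelike vectors have positive norm (an assumption of this illustration), write the components of the energy-momentum four-vector in an orthonormal frame as \(({P^0},{P^1},{P^2},{P^3})\), with energy \(E: = {P^0}\). Then

$${m^2}: = {E^2} - \sum\limits_{i = 1}^3 {{{({P^i})}^2}} .$$

For a future-pointing null energy-momentum, as in the purely radiative configurations above, \(E = {(\sum\nolimits_i {{{({P^i})}^2}})^{1/2}} > 0\) while m = 0; both E and m vanish only if the energy-momentum itself vanishes.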

The Bartnik mass is a natural quasi-localization of the ADM mass, and its monotonicity and positivity makes it a potentially very useful tool in proving various statements on the spacetime, because it fully characterizes the nontriviality of the finite Cauchy data by a single scalar. However, our personal opinion is that, by its strict positivity requirement for nonflat three-dimensional domains, it overestimates the ‘physical’ quasi-local mass. In fact, if (Σ, hab, χab) is a finite data set for a pp-wave geometry (i.e., a compact subset of the data set for a pp-wave metric), then it probably has an asymptotically flat extension \((\hat \Sigma, \,{\hat h_{ab}},\,{{\hat \chi}_{ab}})\) satisfying the dominant energy condition with bounded ADM energy and no apparent horizon between Σ and infinity. Thus, while the Dougan-Mason mass of Σ is zero, the Bartnik mass mB(Σ) is strictly positive, unless (Σ, hab, χab) is trivial. Thus, this example shows that it is the procedure of taking the asymptotically flat extension that gives strictly positive mass. Indeed, one possible proof of the rigidity part of the positive energy theorem [38] (see also [488]) is to prove first that the vanishing of the ADM mass implies, through the Witten equation, that the spacetime admits a constant spinor field, i.e., it is a pp-wave spacetime, and then that the only asymptotically flat spacetime that admits a constant null vector field is the Minkowski spacetime. Therefore, it is only the global condition of the asymptotic flatness that rules out the possibility of nontrivial spacetimes with zero ADM mass. Hence, it would be instructive to calculate the Bartnik mass for a compact part of a pp-wave data set. It might also be interesting to calculate its small surface limit to see its connection with the local fields (energy-momentum tensor and probably the Bel-Robinson tensor).

The other very useful definition is the Hawking energy (and its slightly modified version, the Geroch energy). Its advantage is its simplicity, calculability, and monotonicity for special families of two-surfaces, and it has turned out to be a very effective tool in practice in proving for example the Penrose inequality. The small sphere limit calculation shows that the Hawking energy is, in fact, energy rather than mass, so, in principle, one should be able to complete this by a linear momentum to an energy-momentum four-vector. One possibility is Eq. (6.2), but, as far as we are aware, its properties have not been investigated. Unfortunately, although the energy can be defined for two-surfaces with nonzero genus, it is not clear how the four-momentum could be extended for such surfaces. Although Hawking energy is a well-defined two-surface observable, it has not been linked to any systematic (Lagrangian or Hamiltonian) scenario. Perhaps it does not have any such interpretation, and it is simply a natural (but, in general spacetimes for quite general two-surfaces, not quite viable) generalization of the standard round sphere expression (4.8). This view appears to be supported by the fact that Hawking energy has strange properties for nonspherical surfaces, e.g., for two-surfaces in Minkowski spacetime, which are not metric spheres.

14.2 On the Penrose mass

Penrose’s suggestion for the quasi-local mass (or, more generally, energy-momentum and angular momentum) was based on a promising and far-reaching strategy to use twistors at the fundamental level. The basic object of the construction, the kinematical twistor, is intended to comprise both the energy-momentum and angular momentum, and is a well-defined quasi-local quantity on generic spacelike surfaces homeomorphic to S2. It can be interpreted as the value of a quasi-local Hamiltonian, and the four independent two-surface twistors play the role of the quasi-translations and quasi-rotations. The kinematical twistor was calculated for a large class of special two-surfaces and gave acceptable results.

However, the construction is not complete. First, the construction does not work for two-surfaces, whose topology is different from S2, and does not work even for certain topological two-spheres for which the two-surface twistor equation admits more than four independent solutions (‘exceptional two-surfaces’). Second, two additional objects, the infinity twistor and a Hermitian inner product on the space of two-surface twistors, are needed to get the energy-momentum and angular momentum from the kinematical twistor and to ensure their reality. The latter is needed if we want to define the quasi-local mass as a norm of the kinematical twistor. However, no natural infinity twistor has been found, and no natural Hermitian scalar product can exist if the two-surface cannot be embedded into a conformally flat spacetime. In addition, in small surface calculations the quasi-local mass may be complex. If, however, we do not want to form invariants of the kinematical twistor (e.g., the mass), but we do want to extract the energy-momentum and angular momentum from the kinematical twistor and we want them to be real, then only a special combination of the infinity twistor and the Hermitian scalar product, the ‘bar-hook combination’ (see Eq. (7.9)), would be needed.

To save the main body of the construction, the definition of the kinematical twistor was modified. Nevertheless, the mass in the modified constructions encountered an inherent ambiguity in the small surface approximation. One can still hope to find an appropriate ‘bar-hook’, and hence, real energy-momentum and angular momentum, but invariants, such as norms, cannot be formed.

14.3 On the Dougan-Mason energy-momenta and the holomorphic/anti-holomorphic spin angular momenta

From pragmatic points of view the Dougan-Mason energy-momenta (see Section 8.2) are certainly among the most successful definitions. The energy-positivity and rigidity (zero energy implies flatness), and the intimate connection between the pp-waves and the vanishing of the masses make these definitions potentially useful quasi-local tools, like the ADM and Bondi-Sachs energy-momenta in the asymptotically flat context. Similar properties are proven for the quasi-local energy-momentum of the matter fields, in particular for the non-Abelian Yang-Mills fields. The properties depend only on the two-surface data on \({\mathcal S}\), they have a clear Lagrangian interpretation, and the spinor fields that they are based on can be considered as the spinor constituents of the quasi-translations of the two-surface. In fact, in the Minkowski spacetime the corresponding spacetime vectors are precisely the restriction to \({\mathcal S}\) of the constant Killing vectors. These notions of energy-momentum are linked completely to the geometry of \({\mathcal S}\), and are independent of any ad hoc choice for the ‘fleet of observers’ on it. On the other hand, the holomorphic/antiholomorphic spinor fields determine a six-real-parameter family of orthonormal frame fields on \({\mathcal S}\), which can be interpreted as some distinguished class of observers. In addition, they reproduce the expected, correct limits in a number of special situations. In particular, these energy-momenta appear to have been completed by spin angular momenta (see Section 9.2) in a natural way.

However, in spite of their successes, the Dougan-Mason energy-momenta and the spin angular momenta based on Bramson’s superpotential and the holomorphic/antiholomorphic spinor fields have some unsatisfactory properties, as well (see the lists of our expectations in Section 4.3). First, they are defined only for topological two-spheres (but not for other topologies, e.g., for the torus S1 × S1), and, even for certain topological two-spheres, they are not well defined. Such surfaces are, for example, past marginally trapped surfaces in the antiholomorphic (and future marginally trapped surfaces in the holomorphic) case. Although the quasi-local mass associated with a marginally trapped surface \({\mathcal S}\) is expected to be its irreducible mass \(\sqrt {{\rm{Area}}({\mathcal S})/(16\pi {G^2})}\), neither of the Dougan-Mason masses is well defined for the bifurcation surfaces of the Kerr-Newman (or even Schwarzschild) black hole. Second, the role and the physical content of the holomorphicity/antiholomorphicity of the spinor fields is not clear. The use of the complex structure is justified a posteriori by the nice physical properties of the constructions and the pure mathematical fact that it is only the holomorphy and antiholomorphy operators in a large class of potentially acceptable first-order linear differential operators acting on spinor fields that have a two-dimensional kernel. Furthermore, since the holomorphic and antiholomorphic constructions are not equivalent, we have two constructions instead of one, and it is not clear why we should prefer, for example, holomorphicity instead of antiholomorphicity, even at the quasi-local level.

The angular momentum based on Bramson’s superpotential and the antiholomorphic spinors together with the antiholomorphic Dougan-Mason energy-momentum give acceptable Pauli-Lubanski spin for axisymmetric zero-mass Cauchy developments, for small spheres, and at future null infinity, but the global angular momentum at the future null infinity is finite and well defined only if the spatial three-momentum part of the Bondi-Sachs four-momentum is vanishing, i.e., only in the center-of-mass frame. (The spatial infinity limit of the spin angular momenta has not been calculated.)

Thus, the Nester-Witten 2-form appears to serve as an appropriate framework for defining the energy-momentum, and it is the two spinor fields that should probably be changed, and a new choice would be needed. The holomorphic/antiholomorphic spinor fields appear to be ‘too rigid’. In fact, it is the topology of \({\mathcal S}\), namely the zero genus of \({\mathcal S}\), that restricts the solution space to two complex dimensions, instead of the local properties of the differential equations. (Thus, the situation is the same as in the twistorial construction of Penrose.) On the other hand, Bramson’s superpotential is based on the idea of Bergmann and Thomson, that the angular momentum of gravity is analogous to the spin. Thus, the question arises as to whether this picture is correct, or if the gravitational angular momentum also has an orbital part, in which case Bramson’s superpotential describes only (the general form of) its spin part. The fact that our antiholomorphic construction gives the correct, expected results for small spheres, but unacceptable ones for large spheres near future null infinity in frames that are not center-of-mass frames, may indicate the lack of such an orbital term. This term could be neglected for small spheres, but certainly not for large spheres. For example, in the special quasi-local angular momentum of Bergqvist and Ludvigsen for the Kerr spacetime (see Section 9.3), it is the sum of Bramson’s expression and a term that can be interpreted as the orbital angular momentum.

14.4 On the Brown-York-type expressions

The idea of Brown and York that the quasi-local conserved quantities should be introduced via the canonical formulation of the theory is quite natural. In fact, as we saw, one could arrive at their general formulae from different points of departure (functional differentiability of the Hamiltonian two-surface observables). If the a priori requirement that we should have a well-defined action principle for the trace-χ-action yielded undoubtedly well behaving quasi-local expressions, then the results would a posteriori justify this basic requirement (like the holomorphicity or antiholomorphicity of the spinor fields in the Dougan-Mason definitions). However, if not, then that might be considered as an unnecessarily restrictive assumption, and the question arises as to whether the present framework is wide enough to construct reasonable quasi-local energy-momenta and angular momenta.

Indeed, the basic requirement automatically yields the boundary condition that the three-metric \(\gamma_{ab}\) should be fixed on the boundary \({\mathcal S}\), and that the boundary term in the Hamiltonian should be built only from the surface stress tensor \(\tau_{ab}\). Since the boundary conditions are given, no Legendre transformation of the canonical variables on the two-surface is allowed (see the derivation of Kijowski’s expression in Section 10.2). The use of \(\tau_{ab}\) has important consequences. First, the quasi-local quantities depend not only on the geometry of the two-surface \({\mathcal S}\), but also on an arbitrarily chosen boost gauge, interpreted as a ‘fleet of observers \(t^a\) being at rest with respect to \({\mathcal S}\)’. This leaves a huge ambiguity in the Brown-York energy (three arbitrary functions of two variables, corresponding to the three boost parameters at each point of \({\mathcal S}\)) unless a natural gauge choice is prescribed (see Footnote 23). Second, since \(\tau_{ab}\) does not contain the extrinsic curvature of \({\mathcal S}\) in the direction \(t^a\), which is a part of the two-surface data, this extrinsic curvature is ‘lost’ from the point of view of the quasi-local quantities. Moreover, since \(\tau_{ab}\) is a tensor only on the three-manifold \({}^3B\), the integral of \(K^a\tau_{ab}t^b\) on \({\mathcal S}\) is not sensitive to the component of \(K^a\) normal to \({}^3B\). The normal piece \(v^a v_b K^b\) of the generator \(K^a\) is ‘lost’ from the point of view of the quasi-local quantities.

The other important ingredient of the Brown-York construction is the prescription of the subtraction term. Considering the Gauss-Codazzi-Mainardi equations of the isometric embedding of the two-surface into the flat three-space (or rather into a spacelike hyperplane of Minkowski spacetime) only as a system of differential equations for the reference extrinsic curvature, this prescription, contrary to frequently appearing opinions, is as explicit as the condition of the holomorphicity/antiholomorphicity of the spinor fields in the Dougan-Mason definition. (One essential, and, from pragmatic points of view, important, difference is that the Gauss-Codazzi-Mainardi equations form an underdetermined elliptic system constrained by a nonlinear algebraic equation.) As with the Dougan-Mason definitions, the general Brown-York formulae are valid for arbitrary spacelike two-surfaces, but solutions to the equations defining the reference configuration are guaranteed to exist only for topological two-spheres with strictly positive intrinsic scalar curvature. Thus, there are exceptional two-surfaces here, too. On the other hand, the Brown-York expressions (both for the flat three-space and the light cone references) work properly for large spheres.
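
To make the structure of the reference equations concrete, the following is a sketch in notation that is assumed here rather than taken from the text above: \(q_{ab}\) denotes the induced metric on \({\mathcal S}\), \(\delta_a\) its Levi-Civita derivative, \({}^{\mathcal S}R\) its scalar curvature, and \(k^0_{ab}\) the unknown (symmetric) reference extrinsic curvature with trace \(k^0\). For an isometric embedding into flat three-space, the Codazzi-Mainardi and Gauss equations then read

\[
\delta_a k^{0\,a}{}_{b} - \delta_b k^0 = 0, \qquad
{}^{\mathcal S}R = (k^0)^2 - k^0_{ab}k^{0\,ab} = 2\det\bigl(k^{0\,a}{}_{b}\bigr).
\]

The first is a pair of first-order differential equations for the three independent components of \(k^0_{ab}\), which is the sense in which the system is underdetermined; the second is the algebraic constraint, quadratic in \(k^0_{ab}\). As a check, for the round sphere of radius \(r\) one has \(k^0_{ab} = q_{ab}/r\), so \(k^0 = 2/r\), the first equation holds trivially, and \((2/r)^2 - 2/r^2 = 2/r^2 = {}^{\mathcal S}R\).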

At first sight, this choice for the definition of the subtraction term seems quite natural. However, we do not share this view. If the physical spacetime is the Minkowski one, then we expect that the geometry of the two-surface in the reference Minkowski spacetime would be the same as in the physical Minkowski spacetime. In particular, if \({\mathcal S}\) — in the physical Minkowski spacetime — does not lie in any spacelike hyperplane, then we think that it would be unnatural to require the embedding of \({\mathcal S}\) into a hyperplane of the reference Minkowski spacetime. Since in the two Minkowski spacetimes the extrinsic curvatures can be quite different, the quasi-local energy expressions based on this prescription of the reference term can be expected to yield a nonzero value even in flat spacetime. Indeed, there are explicit examples showing this defect. (Epp’s definition is free of this difficulty, because he embeds the two-surface into the Minkowski spacetime by preserving its ‘universal structure’; see Section 4.1.4.)

Another objection against the embedding into flat three-space is that it is not Lorentz covariant. As we discussed in Section 4.2.2, Lorentz covariance (together with the positivity requirement) was used to show that the quasi-local energy expression for small spheres in vacuum is of order \(r^5\) with the Bel-Robinson ‘energy’ as the factor of proportionality. The Brown-York expression (even with the light cone reference \(k^0 = \sqrt{2\,{}^{\mathcal S}R}\)) fails to give the Bel-Robinson ‘energy’.
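
As a simple illustration of how the light cone reference relates to the flat three-space reference, consider a convex surface in flat three-space with principal curvatures \(\kappa_1, \kappa_2 > 0\) (symbols introduced here for illustration only, with the sign convention that the mean curvature of a round sphere is positive). Its mean curvature is \(\kappa_1 + \kappa_2\) and its intrinsic scalar curvature is \({}^{\mathcal S}R = 2\kappa_1\kappa_2\), so

\[
\sqrt{2\,{}^{\mathcal S}R} = 2\sqrt{\kappa_1\kappa_2} \le \kappa_1 + \kappa_2 ,
\]

with equality precisely where \(\kappa_1 = \kappa_2\). In particular, for the round sphere of radius \(r\) both references give \(k^0 = 2/r\), and the two prescriptions can differ only on surfaces that are not round.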

Finally, in contrast to the Dougan-Mason definitions, the Brown-York-type expressions are well defined on marginally trapped surfaces. However, they yield just twice the expected irreducible mass, and they do not reproduce the standard round sphere expression, which, for nontrapped surfaces, arises from all the other expressions discussed in the present section (including Kijowski’s definition). It is remarkable that the derivation of the first law of black hole thermodynamics, based on the identification of the thermodynamic internal energy with the Brown-York energy, is independent of the definition of the subtraction term.
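
The factor of two can be seen in the simplest example. The following is a minimal numerical sketch, not part of the original discussion, under these assumptions: units with \(G = c = 1\); round spheres of areal radius \(r \ge 2m\) in the \(t = \mathrm{const}\) hypersurface of the Schwarzschild solution with mass parameter \(m\); the flat three-space reference; and a sign convention in which the mean curvature of a round sphere is positive. The function names are hypothetical.

```python
import math


def brown_york_energy_round_sphere(r, m):
    """Brown-York energy of a round sphere in the static Schwarzschild slice.

    Assumes G = c = 1 and the flat three-space reference. The physical mean
    curvature is k = (2/r) * sqrt(1 - 2m/r), the reference one is k0 = 2/r,
    and the energy is (1/(8*pi)) times the integral of (k0 - k) over the sphere.
    """
    if r < 2.0 * m:
        raise ValueError("the static slice only covers r >= 2m")
    k_ref = 2.0 / r
    k_phys = (2.0 / r) * math.sqrt(1.0 - 2.0 * m / r)
    area = 4.0 * math.pi * r**2
    # The integrand is constant on a round sphere, so the integral is area * (k0 - k).
    return area * (k_ref - k_phys) / (8.0 * math.pi)


def irreducible_mass(area):
    """Irreducible mass sqrt(area / (16*pi)) of a surface with the given area."""
    return math.sqrt(area / (16.0 * math.pi))


if __name__ == "__main__":
    m = 1.0
    horizon_area = 4.0 * math.pi * (2.0 * m) ** 2
    print("E_BY at r = 2m:   ", brown_york_energy_round_sphere(2.0 * m, m))
    print("irreducible mass: ", irreducible_mass(horizon_area))
    print("E_BY at r = 1e6 m:", brown_york_energy_round_sphere(1.0e6 * m, m))
```

In closed form the function evaluates \(r\,(1 - \sqrt{1 - 2m/r})\). On the marginally trapped sphere \(r = 2m\) this is \(2m\), while the irreducible mass of that sphere is \(m\); for large \(r\) the value tends to \(m\). The standard round sphere expression, by contrast, equals \(m\) for every \(r\) in this example.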