1 Introduction

1.1 Setting the stage

If one does a search on the topic of relativistic fluids on any of the major physics article databases one is overwhelmed by the number of “hits”. This is a manifestation of the importance that the fluid model has long had for modern physics and engineering. For relativistic physics, in particular, the fluid model is essential. After all, many-particle astrophysical and cosmological systems are the best sources of detectable effects associated with General Relativity. Two obvious examples, the expansion of the Universe and oscillations of neutron stars, indicate the vast range of scales on which relativistic fluid models are relevant. A particularly exciting context for general relativistic fluids today is their use in the modeling of sources of gravitational waves. This includes the compact binary inspiral problem, either of two neutron stars or a neutron star and a black hole, the collapse of stellar cores during supernovae, or various neutron star instabilities. One should also not forget the use of special relativistic fluids in modeling collisions of heavy nuclei, astrophysical jets, and gamma-ray burst sources.

This review provides an introduction to the modeling of fluids in General Relativity. Our main target audience is graduate students with a need for an understanding of relativistic fluid dynamics. Hence, we have made an effort to keep the presentation pedagogical. The article will (hopefully) also be useful to researchers who work in areas outside of General Relativity and gravitation per se (e.g. a nuclear physicist who develops neutron star equations of state), but who require a working knowledge of relativistic fluid dynamics.

Throughout most of the article we will assume that General Relativity is the proper description of gravity. Although not too severe, this is a restriction since the problem of fluids in other theories of gravity has interesting aspects. As we hope that the article will be used by students and researchers who are not necessarily experts in General Relativity and techniques of differential geometry, we have included some discussion of the mathematical tools required to build models of relativistic objects. Even though our summary is not a proper introduction to General Relativity we have tried to define the tools that are required for the discussion that follows. Hopefully our description is sufficiently self-contained to provide a less experienced reader with a working understanding of (at least some of) the mathematics involved. In particular, the reader will find an extended discussion of the covariant and Lie derivatives. This is natural since many important properties of fluids, both relativistic and non-relativistic, can be established and understood by the use of parallel transport and Lie-dragging. But it is vital to appreciate the distinctions between the two.

Ideally, the reader should have some familiarity with standard fluid dynamics, e.g. at the level of the discussion in Landau and Lifshitz [66], basic thermodynamics [96], and the mathematics of action principles and how they are used to generate equations of motion [65]. Having stated this, it is clear that we are facing a real challenge. We are trying to introduce a topic on which numerous books have been written (e.g. [112, 66, 72, 9, 123]), and which requires an understanding of much of theoretical physics. Yet, one can argue that an article of this kind is timely. In particular, there have recently been exciting developments for multi-constituent systems, such as superfluid/superconducting neutron star coresFootnote 1. Much of this work has been guided by the geometric approach to fluid dynamics championed by Carter [17, 19, 21]. This provides a powerful framework that makes extensions to multi-fluid situations quite intuitive. A typical example of a phenomenon that arises naturally is the so-called entrainment effect, which plays a crucial role in a superfluid neutron star core. Given the potential for future applications of this formalism, we have opted to base much of our description on the work of Carter and his colleagues.

Even though the subject of relativistic fluids is far from new, a number of issues remain to be resolved. The most obvious shortcoming of the present theory concerns dissipative effects. As we will discuss, dissipative effects are (at least in principle) easy to incorporate in Newtonian theory but the extension to General Relativity is problematic (see, for instance, Hiscock and Lindblom [54]). Following early work by Eckart [39], a significant effort was made by Israel and Stewart [58, 57] and Carter [17, 19]. Incorporation of dissipation is still an active enterprise, and of key importance for future gravitational-wave asteroseismology which requires detailed estimates of the role of viscosity in suppressing possible instabilities.

1.2 A brief history of fluids

The two fluids air and water are essential to human survival. This obvious fact implies a basic human need to divine their innermost secrets. Homo Sapiens has always needed to anticipate air and water behavior under a myriad of circumstances, such as those that concern water supply, weather, and travel. The essential importance of fluids for survival, and that they can be exploited to enhance survival, implies that the study of fluids probably reaches as far back into antiquity as the human race itself. Unfortunately, our historical record of this ever-ongoing study is not so great that we can reach very far accurately.

A wonderful account (now in Dover print) is “A History and Philosophy of Fluid Mechanics” by G.A. Tokaty [111]. He points out that while early cultures may not have had universities, government sponsored labs, or privately funded centers pursuing fluids research (nor the Living Reviews archive on which to communicate results!), there was certainly some collective understanding. After all, there is a clear connection between the viability of early civilizations and their access to water. For example, we have the societies associated with the Yellow and Yangtze rivers in China, the Ganges in India, the Volga in Russia, the Thames in England, and the Seine in France, to name just a few. We should also not forget the Babylonians and their amazing technological (irrigation) achievements in the land between the Tigris and Euphrates, and the Egyptians, whose intimacy with the flooding of the Nile is well documented. In North America, we have the so-called Mississippians, who left behind their mound-building accomplishments. For example, the Cahokians (in Collinsville, Illinois) constructed Monk’s Mound, the largest pre-Columbian earthen structure in existence that is “…over 100 feet tall, 1000 feet long, and 800 feet wide (larger at its base than the Great Pyramid of Giza)” (see http://en.wikipedia.org/wiki/Monk’s_Mound).

In terms of ocean and sea travel, we know that the maritime ability of the Mediterranean people was a main mechanism for ensuring cultural and economic growth and societal stability. The finely-tuned skills of the Polynesians in the South Pacific allowed them to travel great distances, perhaps reaching as far as South America, and certainly making it to the “most remote spot on the Earth”, Easter Island. Apparently, they were remarkably adept at reading the smallest of signs — water color, views of weather on the horizon, subtleties of wind patterns, floating objects, birds, etc. — as indication of nearby land masses. Finally, the harsh climate of the North Atlantic was overcome by the highly accomplished Nordic sailors, whose skills allowed them to reach several sites in North America. Perhaps it would be appropriate to think of these early explorers as adept geophysical fluid dynamicists/oceanographers?

Many great scientists are associated with the study of fluids. Lost are the names of those individuals who, almost 400,000 years ago, carved “aerodynamically correct” [46] wooden spears. Also lost are those who developed boomerangs and fin-stabilized arrows. Among those not lost is Archimedes, the Greek mathematician (287–212 BC), who provided a mathematical expression for the buoyant force on bodies. Earlier, Thales of Miletus (624–546 BC) asked the simple question: What is air and water? His question is profound since it represents a clear departure from the main, myth-based modes of inquiry at that time. Tokaty ranks Hero of Alexandria as one of the great, early contributors. Hero (c. 10–70) was a Greek scientist and engineer, who left behind many writings and drawings that, from today’s perspective, indicate a good grasp of basic fluid mechanics. To make complete account of individual contributions to our present understanding of fluid dynamics is, of course, impossible. Yet it is useful to list some of the contributors to the field. We provide a highly subjective “timeline” in Figure 1. Our list is to a large extent focussed on the topics covered in this review, and includes chemists, engineers, mathematicians, philosophers, and physicists. It recognizes those who have contributed to the development of non-relativistic fluids, their relativistic counterparts, multi-fluid versions of both, and exotic phenomena such as superfluidity. We have provided this list with the hope that the reader can use these names as key words in a modern, web-based literature search whenever more information is required.

Figure 1
figure 1

A “timeline” focussed on the topics covered in this review, including chemists, engineers, mathematicians, philosophers, and physicists who have contributed to the development of non-relativistic fluids, their relativistic counterparts, multi-fluid versions of both, and exotic phenomena such as superfluidity.

Tokaty [111] discusses the human propensity for destruction when it comes to our water resources. Depletion and pollution are the main offenders. He refers to a “Battle of the Fluids” as a struggle between their destruction and protection. His context for this discussion was the Cold War. He rightly points out the failure to protect our water and air resources by the two predominant participants — the USA and USSR. In an ironic twist, modern study of the relativistic properties of fluids has also a “Battle of the Fluids”. A self-gravitating mass can become absolutely unstable and collapse to a black hole, the ultimate destruction of any form of matter.

1.3 Notation and conventions

Throughout the article we assume the “MTW” [80] conventions. We also generally assume geometrized units c = G = 1, unless specifically noted otherwise. A coordinate basis will always be used, with spacetime indices denoted by lowercase Greek letters that range over {0, 1, 2, 3} (time being the zeroth coordinate), and purely spatial indices denoted by lowercase Latin letters that range over {1, 2, 3}. Unless otherwise noted, we assume the Einstein summation convention.

2 Physics in a Curved Spacetime

There is an extensive literature on special and general relativity, and the spacetime-based viewFootnote 2 of the laws of physics. For the student at any level interested in developing a working understanding we recommend Taylor and Wheeler [109] for an introduction, followed by Hartle’s excellent recent text [51] designed for students at the undergraduate level. For the more advanced students, we suggest two of the classics, “MTW” [80] and Weinberg [117], or the more contemporary book by Wald [114]. Finally, let us not forget the Living Reviews archive as a premier online source of up-to-date information!

In terms of the experimental and/or observational support for special and general relativity, we recommend two articles by Will that were written for the 2005 World Year of Physics celebration [122, 121]. They summarize a variety of tests that have been designed to expose breakdowns in both theories. (We also recommend Will’s popular book Was Einstein Right? [119] and his technical exposition Theory and Experiment in Gravitational Physics [120].) To date, Einstein’s theoretical edifice is still standing!

For special relativity, this is not surprising, given its long list of successes: explanation of the Michaelson-Morley result, the prediction and subsequent discovery of anti-matter, and the standard model of particle physics, to name a few. Will [122] offers the observation that genetic mutations via cosmic rays require special relativity, since otherwise muons would decay before making it to the surface of the Earth. On a more somber note, we may consider the Trinity site in New Mexico, and the tragedies of Hiroshima and Nagasaki, as reminders of E = mc2.

In support of general relativity, there are Eötvös-type experiments testing the equivalence of inertial and gravitational mass, detection of gravitational red-shifts of photons, the passing of the solar system tests, confirmation of energy loss via gravitational radiation in the Hulse-Taylor binary pulsar, and the expansion of the Universe. Incredibly, general relativity even finds a practical application in the GPS system: If general relativity is neglected, an error of about 15 meters results when trying to resolve the location of an object [122]. Definitely enough to make driving dangerous!

The evidence is thus overwhelming that general relativity, or at least something that passes the same tests, is the proper description of gravity. Given this, we assume the Einstein Equivalence Principle, i.e. that [122, 121, 120]

  • test bodies fall with the same acceleration independently of their internal structure or composition;

  • the outcome of any local non-gravitational experiment is independent of the velocity of the freely-falling reference frame in which it is performed;

  • the outcome of any local non-gravitational experiment is independent of where and when in the Universe it is performed.

If the Equivalence Principle holds, then gravitation must be described by a metric-based theory [122]. This means that

  1. 1.

    spacetime is endowed with a symmetric metric,

  2. 2.

    the trajectories of freely falling bodies are geodesics of that metric, an

  3. 3.

    in local freely falling reference frames, the non-gravitational laws of physics are those of special relativity.

For our present purposes this is very good news. The availability of a metric means that we can develop the theory without requiring much of the differential geometry edifice that would be needed in a more general case. We will develop the description of relativistic fluids with this in mind. Readers that find our approach too “pedestrian” may want to consult the recent article by Gourgoulhon [49], which serves as a useful complement to our description.

2.1 The metric and spacetime curvature

Our strategy is to provide a “working understanding” of the mathematical objects that enter the Einstein equations of general relativity. We assume that the metric is the fundamental “field” of gravity. For four-dimensional spacetime it determines the distance between two spacetime points along a given curve, which can generally be written as a one parameter function with, say, components xμ(λ). As we will see, once a notion of parallel transport is established, the metric also encodes information about the curvature of its spacetime, which is taken to be of the pseudo-Riemannian type, meaning that the signature of the metric is − + ++ (cf. Equation (2) below).

In a coordinate basis, which we will assume throughout this review, the metric is denoted by gμν = gνμ. The symmetry implies that there are in general ten independent components (modulo the freedom to set arbitrarily four components that is inherited from coordinate transformations; cf. Equations (8) and (9) below). The spacetime version of the Pythagorean theorem takes the form

$${\rm{d}}{s^2} = {g_{\mu \nu}}\,{\rm{d}}{x^\mu}\,{\rm{d}}{x^\nu},$$
(1)

and in a local set of Minkowski coordinates {t, x, y, z} (i.e. in a local inertial frame, or small patch of the manifold) it looks like

$${\rm{d}}{s^2} = - {({{\rm{d}}t})^2} + {({{\rm{d}}x})^2} + {({{\rm{d}}y})^2} + {({{\rm{d}}z})^2}.$$
(2)

This illustrates the − + ++ signature. The inverse metric gμν is such that

$${g^{\mu \rho}}{g_{\rho \nu}} = {\delta ^\mu}_\nu ,$$
(3)

where \({\delta ^\mu}_\nu\) is the unit tensor. The metric is also used to raise and lower spacetime indices, i.e. if we let denote a contravariant vector, then its associated covariant vector (also known as a covector or one-form) is obtained as

$${V_\mu} = {g_{\mu \nu}}{V^\nu}\qquad \Leftrightarrow \qquad {V^\mu} = {g^{\mu \nu}}{V_\nu}.$$
(4)

We can now consider three different classes of curves: timelike, null, and spacelike. A vector is said to be timelike if gμνVμVν < 0, null if gμνVμVν = 0, and spacelike if gμνVμVν > 0. We can naturally define timelike, null, and spacelike curves in terms of the congruence of tangent vectors that they generate. A particularly useful timelike curve for fluids is one that is parameterized by the so-called proper time, i.e. xμ(τ) where

$${\rm{d}}{\tau ^2} = - {\rm{d}}{s^2}.$$
(5)

The tangent to such a curve has unit magnitude; specifically,

$${u^\mu} \equiv {{{\rm{d}}{x^\mu}} \over {{\rm{d}}\tau}},$$
(6)

and thus

$${g_{\mu \nu}}{u^\mu}{u^\nu} = {g_{\mu \nu}}{{{\rm{d}}{x^\mu}} \over {{\rm{d}}\tau}}{{{\rm{d}}{x^\nu}} \over {{\rm{d}}\tau}} = {{{\rm{d}}{s^2}} \over {{\rm{d}}{\tau ^2}}} = - 1.$$
(7)

Under a coordinate transformation \({x^\mu} \to {{\bar x}^\mu}\), contravariant vectors transform as

$${\overline V^\mu} = {{\partial {{\overline x}^\mu}} \over {\partial {x^\nu}}}{V^\nu}$$
(8)

and covariant vectors as

$${\overline V_\mu} = {{\partial {x^\nu}} \over {\partial {{\overline x}^\mu}}}{V_\nu}.$$
(9)

Tensors with a greater rank (i.e. a greater number of indices), transform similarly by acting linearly on each index using the above two rules.

When integrating, as we need to when we discuss conservation laws for fluids, we must be careful to have an appropriate measure that ensures the coordinate invariance of the integration. In the context of three-dimensional Euclidean space the measure is referred to as the Jacobian. For spacetime, we use the so-called volume form ϵϵμνρ. It is completely antisymmetric, and for four-dimensional spacetime, it has only one independent component, which is

$${\epsilon _{0123}} = \sqrt {- g} \qquad {\rm{and}}\qquad {\epsilon ^{0123}} = {1 \over {\sqrt {- g}}},$$
(10)

where g is the determinant of the metric (cf. Appendix A for details). The minus sign is required under the square root because of the metric signature. By contrast, for three-dimensional Euclidean space (i.e. when considering the fluid equations in the Newtonian limit) we have

$${\epsilon _{123}} = \sqrt g \qquad {\rm{and}}\qquad {\epsilon ^{123}} = {1 \over {\sqrt g}},$$
(11)

where g is the determinant of the three-dimensional space metric. A general identity that is extremely useful for writing fluid vorticity in three-dimensional, Euclidean space — using lowercase Latin indices and setting s = 0, n = 3 and j = 1 in Equation (361) of Appendix A — is

$${\epsilon ^{mij}}{\epsilon _{mkl}} = {\delta ^i}_k{\delta ^j}_l - {\delta ^j}_k{\delta ^i}_l.$$
(12)

The general identities in Equations (360, 361, 362) of Appendix A will be used in our discussion of the variational principle.

2.2 Parallel transport and the covariant derivative

In order to have a generally covariant prescription for fluids, i.e. written in terms of spacetime tensors, we must have a notion of derivative that is itself covariant. For example, when acts on a vector Vν a rank-two tensor of mixed indices must result:

$${\bar \nabla _\mu}{\bar V^\nu} = {{\partial {x^\sigma}} \over {\partial {{\bar x}^\mu}}}{{\partial {{\bar x}^\nu}} \over {\partial {x^\rho}}}{\nabla _\sigma}{V^\rho}.$$
(13)

The ordinary partial derivative does not work because under a general coordinate transformation

$${{\partial {{\bar V}^\nu}} \over {\partial {{\bar x}^\mu}}} = {{\partial {x^\sigma}} \over {\partial {{\bar x}^\mu}}}{{\partial {{\bar x}^\nu}} \over {\partial {x^\rho}}}{{\partial {V^\rho}} \over {\partial {x^\sigma}}} + {{\partial {x^\sigma}} \over {\partial {{\bar x}^\mu}}}{{{\partial ^2}{{\bar x}^\nu}} \over {\partial {x^\sigma}\partial {x^\rho}}}{V^\rho}.$$
(14)

The second term spoils the general covariance, since it vanishes only for the restricted set of rectilinear transformations

$${\bar x^\mu} = a_\nu ^\mu {x^\nu} + {b^\mu},$$
(15)

where \(a_\nu ^\mu\) and bμ are constants.

For both physical and mathematical reasons, one expects a covariant derivative to be defined in terms of a limit. This is, however, a bit problematic. In three-dimensional Euclidean space limits can be defined uniquely in that vectors can be moved around without their lengths and directions changing, for instance, via the use of Cartesian coordinates (the {i, j, k} set of basis vectors) and the usual dot product. Given these limits those corresponding to more general curvilinear coordinates can be established. The same is not true for curved spaces and/or spacetimes because they do not have an a priori notion of parallel transport.

Consider the classic example of a vector on the surface of a sphere (illustrated in Figure 2). Take that vector and move it along some great circle from the equator to the North pole in such a way as to always keep the vector pointing along the circle. Pick a different great circle, and without allowing the vector to rotate, by forcing it to maintain the same angle with the locally straight portion of the great circle that it happens to be on, move it back to the equator. Finally, move the vector in a similar way along the equator until it gets back to its starting point. The vector’s spatial orientation will be different from its original direction, and the difference is directly related to the particular path that the vector followed.

Figure 2
figure 2

A schematic illustration of two possible versions of parallel transport. In the first case (a) a vector is transported along great circles on the sphere locally maintaining the same angle with the path. If the contour is closed, the final orientation of the vector will differ from the original one. In case (b) the sphere is considered to be embedded in a three-dimensional Euclidean space, and the vector on the sphere results from projection. In this case, the vector returns to the original orientation for a closed contour.

On the other hand, we could consider that the sphere is embedded in a three-dimensional Euclidean space, and that the two-dimensional vector on the sphere results from projection of a three-dimensional vector. Move the projection so that its higher-dimensional counterpart always maintains the same orientation with respect to its original direction in the embedding space. When the projection returns to its starting place it will have exactly the same orientation as it started with (see Figure 2). It is thus clear that a derivative operation that depends on comparing a vector at one point to that of a nearby point is not unique, because it depends on the choice of parallel transport.

Pauli [88] notes that Levi-Civita [71] is the first to have formulated the concept of parallel “displacement”, with Weyl [118] generalizing it to manifolds that do not have a metric. The point of view expounded in the books of Weyl and Pauli is that parallel transport is best defined as a mapping of the “totality of all vectors” that “originate” at one point of a manifold with the totality at another point. (In modern texts, one will find discussions based on fiber bundles.) Pauli points out that we cannot simply require equality of vector components as the mapping.

Let us examine the parallel transport of the force-free, point particle velocity in Euclidean three-dimensional space as a means for motivating the form of the mapping. As the velocity is constant, we know that the curve traced out by the particle will be a straight line. In fact, we can turn this around and say that the velocity parallel transports itself because the path traced out is a geodesic (i.e. the straightest possible curve allowed by Euclidean space). In our analysis we will borrow liberally from the excellent discussion of Lovelock and Rund [76]. Their text is comprehensive yet readable for one who is not well-versed with differential geometry. Finally, we note that this analysis will be of use later when we obtain the Newtonian limit of the general relativistic equations, in an arbitrary coordinate basis.

We are all well aware that the points on the curve traced out by the particle can be described, in Cartesian coordinates, by three functions xi(t) where t is the standard Newtonian time. Likewise, we know that the tangent vector at each point of the curve is given by the velocity components vi(t) = dxi/dt, and that the force-free condition is equivalent to

$${a^i}(t) = {{{\rm{d}}{v^i}} \over {{\rm{d}}t}} = 0\qquad \Rightarrow \qquad {v^i}(t) = {\rm{const}}.$$
(16)

Hence, the velocity components vi(0) at the point xi(0) are equal to those at any other point along the curve, say vi(T) at xi(T), and so we could simply take vi(0) = vi(T) as the mapping. But as Pauli warns, we only need to reconsider this example using spherical coordinates to see that the velocity components \(\{\dot r,\dot \theta, \dot \phi \}\) must necessarily change as they undergo parallel transport along a straight-line path (assuming the particle does not pass through the origin). The question is what should be used in place of component equality? The answer follows once we find a curvilinear coordinate version of dvi/dt = 0.

What we need is a new “time” derivative \(\bar D/{\rm{d}}t\), that yields a generally covariant statement

$${{\overline {\rm{D}} {{\bar v}^i}} \over {{\rm{d}}t}} = 0,$$
(17)

where the \({{\bar \upsilon}^i}(t) = {\rm{d}}{{\bar x}^i}/{\rm{d}}t\) are the velocity components in a curvilinear system of coordinates. Consider now a coordinate transformation to the curvilinear coordinate system \({{\bar x}^i}\), the inverse being \({x^i} = {x^i}({{\bar x}^i})\). Given that

$${v^i} = {{\partial {x^i}} \over {\partial {{\bar x}^j}}}{\bar v^j}$$
(18)

we can write

$${{{\rm{d}}{v^i}} \over {{\rm{d}}t}} = \left({{{\partial {x^i}} \over {\partial {{\bar x}^j}}}{{\partial {{\bar v}^j}} \over {\partial {{\bar x}^k}}} + {{{\partial ^2}{x^i}} \over {\partial {{\bar x}^k}\partial {{\bar x}^j}}}{{\bar v}^j}} \right){\bar v^k},$$
(19)

where

$${{{\rm{d}}{{\bar v}^i}} \over {{\rm{d}}t}} = {{\partial {{\bar v}^i}} \over {\partial {{\bar x}^j}}}{\bar v^j}.$$
(20)

Again, we have an “offending” term that vanishes only for rectilinear coordinate transformations. However, we are now in a position to show the importance of this term to a definition of covariant derivative.

First note that the metric for our curvilinear coordinate system is obtainable from

$${\bar g_{ij}} = {{\partial {x^k}} \over {\partial {{\bar x}^i}}}{{\partial {x^l}} \over {\partial {{\bar x}^j}}}{\delta _{kl}},$$
(21)

where

$${\delta _{ij}} = \left\{{\begin{array}{*{20}c} {\rm{1}} & {{\rm{for}}\;i = j,}\\ {\rm{0}} & {{\rm{for}}\;i \neq j.}\\ \end{array}} \right.$$
(22)

Differentiating Equation (21) with respect to \({\bar x}\), and permuting indices, we can show that

$${{{\partial ^2}{x^h}} \over {\partial {{\bar x}^i}\partial {{\bar x}^j}}}{{\partial {x^l}} \over {\partial {{\bar x}^k}}}{\delta _{hl}} = {1 \over 2}({{{\bar g}_{ik,j}} + {{\bar g}_{jk,i}} - {{\bar g}_{ij,k}}}) \equiv {\bar g_{il}}\overline {\left\{{\begin{array}{*{20}c} l \\ {j\;\;k} \\ \end{array}} \right\}} ,$$
(23)

where

$${\bar g_{ij,k}} \equiv {{\partial {{\bar g}_{ij}}} \over {\partial {{\bar x}^k}}}.$$
(24)

Using the inverse transformation of \({{\bar g}_{ij}}\) to δij implied by Equation (21), and the fact that

$${\delta ^i}_j = {{\partial {{\bar x}^k}} \over {\partial {x^j}}}{{\partial {x^i}} \over {\partial {{\bar x}^k}}},$$
(25)

we get

$${{{\partial ^2}{x^i}} \over {\partial {{\bar x}^j}\partial {{\bar x}^k}}} = \overline {{\{_j}{l_{\;k}}\}} {{\partial {x^i}} \over {\partial {{\bar x}^l}}}.$$
(26)

Now we subsitute Equation (26) into Equation (19) and find

$${{{\rm{d}}{v^i}} \over {{\rm{d}}t}} = {{\partial {x^i}} \over {\partial {{\bar x}^j}}}{{\overline {\rm{D}} {{\bar v}^j}} \over {{\rm{d}}t}},$$
(27)

where

$${{\overline {\rm{D}} {{\bar v}^i}} \over {{\rm{d}}t}} = {\bar v^j}\left({{{\partial {{\bar v}^i}} \over {\partial {{\bar x}^j}}} + \overline {{\{_k}{i_j}\}} {{\bar v}^k}} \right).$$
(28)

The operator \(\bar D/{\rm{d}}t\) is easily seen to be covariant with respect to general transformations of curvilinear coordinates.

We now identify our generally covariant derivative (dropping the overline) as

$${\nabla _j}{v^i} = {{\partial {v^i}} \over {\partial {x^j}}} + {\{_k}{i_j}\} {v^k} \equiv {v^i}_{;j}.$$
(29)

Similarly, the covariant derivative of a covector is

$${\nabla _j}{v_i} = {{\partial {v_i}} \over {\partial {x^j}}} - {\{_i}{k_j}\} {v_k} \equiv {v_{i;j}}.$$
(30)

One extends the covariant derivative to higher rank tensors by adding to the partial derivative each term that results by acting linearly on each index with \(\left\{{\begin{array}{*{20}c} {\;i} \\ {k\;\;j} \\ \end{array}} \right\}\) using the two rules given above.

By relying on our understanding of the force-free point particle, we have built a notion of parallel transport that is consistent with our intuition based on equality of components in Cartesian coordinates. We can now expand this intuition and see how the vector components in a curvilinear coordinate system must change under an infinitesimal, parallel displacement from xi(t) to xi(t+δt). Setting Equation (28) to zero, and noting that viδt = δxi, implies

$$\delta {v^i} \equiv {{\partial {v^i}} \over {\partial {x^j}}}\delta {x^j} = - {\{_k}{i_j}\} {v^k}\delta {x^j}.$$
(31)

In general relativity we assume that under an infinitesimal parallel transport from a spacetime point xμ(λ) on a given curve to a nearby point xμ(λ + δλ) on the same curve, the components of a vector will change in an analogous way, namely

$$\delta V_\parallel ^\mu \equiv {{\partial {V^\mu}} \over {\partial {x^\nu}}}\delta {x^\nu} = - \Gamma _{\rho \nu}^\mu {V^\rho}\delta {x^\nu},$$
(32)

where

$$\delta {x^\mu} \equiv {{{\rm{d}}{x^\mu}} \over {{\rm{d}}\lambda}}\delta \lambda .$$
(33)

Weyl [118] refers to the symbol \(\Gamma _{\rho \nu}^\mu\) as the “components of the affine relationship”, but we will use the modern terminology and call it the connection. In the language of Weyl and Pauli, this is the mapping that we were looking for.

For Euclidean space, we can verify that the metric satisfies

$${\nabla _i}{g_{jk}} = 0$$
(34)

for a general, curvilinear coordinate system. The metric is thus said to be “compatible” with the covariant derivative. We impose metric compatibility in general relativity. This results in the so-called Christoffel symbol for the connection, defined as

$$\Gamma _{\mu \nu}^\rho = {1 \over 2}{g^{\rho \sigma}}({{g_{\mu \sigma ,\nu}} + {g_{\nu \sigma ,\mu}} - {g_{\mu \nu ,\sigma}}}){.}$$
(35)

The rules for the covariant derivative of a contravariant vector and a covector are the same as in Equations (29) and (30), except that all indices are for full spacetime.

2.3 The Lie derivative and spacetime symmetries

From the above discussion it should be evident that there are other ways to take derivatives in a curved spacetime. A particularly important tool for measuring changes in tensors from point to point in spacetime is the Lie derivative. It requires a vector field, but no connection, and is a more natural definition in the sense that it does not even require a metric. It yields a tensor of the same type and rank as the tensor on which the derivative operated (unlike the covariant derivative, which increases the rank by one). The Lie derivative is as important for Newtonian, non-relativistic fluids as for relativistic ones (a fact which needs to be continually emphasized as it has not yet permeated the fluid literature for chemists, engineers, and physicists). For instance, the classic papers on the Chandrasekhar-Friedman-Schutz instability [44, 45] in rotating stars are great illustrations of the use of the Lie derivative in Newtonian physics. We recommend the book by Schutz [104] for a complete discussion and derivation of the Lie derivative and its role in Newtonian fluid dynamics (see also the recent series of papers by Carter and Chamel [22, 23, 24]). We will adapt here the coordinate-based discussion of Schouten [99], as it may be more readily understood by those not well-versed in differential geometry.

In a first course on classical mechanics, when students encounter rotations, they are introduced to the idea of active and passive transformations. An active transformation would be to fix the origin and axis-orientations of a given coordinate system with respect to some external observer, and then move an object from one point to another point of the same coordinate system. A passive transformation would be to place an object so that it remains fixed with respect to some external observer, and then induce a rotation of the object with respect to a given coordinate system, by rotating the coordinate system itself with respect to the external observer. We will derive the Lie derivative of a vector by first performing an active transformation and then following this with a passive transformation and finding how the final vector differs from its original form. In the language of differential geometry, we will first “push-forward” the vector, and then subject it to a “pull-back”.

In the active, push-forward sense we imagine that there are two spacetime points connected by a smooth curve xμ (δ). Let the first point be at λ = 0, and the second, nearby point at λ = ϵ, i.e. (ϵ); that is,

$$x_\epsilon ^\mu \equiv {x^\mu}(\epsilon) \approx x_0^\mu + \epsilon \,{\xi ^\mu},$$
(36)

where \(x_0^\mu = {x^\mu}(0)\) and

$${\xi ^\mu} = {\left. {{{{\rm{d}}{x^\mu}} \over {{\rm{d}}\lambda}}} \right\vert _{\lambda = 0}}$$
(37)

is the tangent to the curve at λ = 0. In the passive, pull-back sense we imagine that the coordinate system itself is changed to \({{\bar x}^\mu} = {{\bar x}^\mu}({x^\nu})\), but in the very special form

$${\bar x^\mu} = {x^\mu} - \epsilon {\xi ^\mu}.$$
(38)

It is in this last step that the Lie derivative differs from the covariant derivative. In fact, if we insert Equation (36) into Equation (38) we find the result \(\bar x_\epsilon ^\mu = x_0^\mu\). This is called “Lie-dragging” of the coordinate frame, meaning that the coordinates at λ = 0 are carried along so that at λ = ϵ, and in the new coordinate system, the coordinate labels take the same numerical values.

Figure 3
figure 3

A schematic illustration of the Lie derivative. The coordinate system is dragged along with the flow, and one can imagine an observer “taking derivatives” as he/she moves with the flow (see the discussion in the text).

As an interesting aside it is worth noting that Arnold [10], only a little whimsically, refers to this construction as the “fisherman’s derivative”. He imagines a fisherman sitting in a boat on a river, “taking derivatives” as the boat moves with the current. Playing on this imagery, the covariant derivative is cast with the high-test Zebco parallel transport fishing pole, the Lie derivative with the Shimano, Lie-dragging ultra-light. Let us now see how Lie-dragging reels in vectors.

For some given vector field that takes values Vμ(λ), say, along the curve, we write

$$V_0^\mu = {V^\mu}(0)$$
(39)

for the value of Vμ at λ = 0 and

$$V_\epsilon ^\mu = {V^\mu}(\epsilon)$$
(40)

for the value at λ = ϵ. Because the two points \(x_0^\mu\) and \(x_\epsilon ^\mu\) are infinitesimally close (ϵ ≪ 1) and we can thus write

$$V_\epsilon ^\mu \approx V_0^\mu + \epsilon \,{\xi ^\nu}{\left. {{{\partial {V^\mu}} \over {\partial {x^\nu}}}} \right\vert _{\lambda = 0}}$$
(41)

for the value of Vμ at the nearby point and in the same coordinate system. However, in the new coordinate system at the nearby point, we find

$$\bar V_\epsilon ^\mu = {\left. {\left({{{\partial {{\bar x}^\mu}} \over {\partial {x^\nu}}}{V^\nu}} \right)} \right\vert _{\lambda = \epsilon}} \approx V_\epsilon ^\mu - \epsilon \,V_0^\nu {\left. {{{\partial {\xi ^\mu}} \over {\partial {x^\nu}}}} \right\vert _{\lambda = 0}}.$$
(42)

The Lie derivative now is defined to be

$$\begin{array}{*{20}c} {{{\mathcal L}_\xi}{V^\mu} = \underset{\epsilon \rightarrow 0}{\lim} {{\bar V_\epsilon ^\mu - {V^\mu}} \over \epsilon}\quad \quad \quad \quad \;\;\;}\\ {= {\xi ^\nu}{{\partial {V^\mu}} \over {\partial {x^\nu}}} - {V^\nu}{{\partial {\xi ^\mu}} \over {\partial {x^\nu}}}}\;\;\\ {= {\xi ^\nu}{\nabla _\nu}{V^\mu} - {V^\nu}{\nabla _\nu}{\xi ^\mu},}\\ \end{array}$$
(43)

where we have dropped the “0” subscript and the last equality follows easily by noting \(\Gamma _{\mu \nu}^\rho = \Gamma _{\nu \mu}^{^\rho}\).

The Lie derivative of a covector Aμ is easily obtained by acting on the scalar AμVμ for an arbitrary vector Vμ:

$$\begin{array}{*{20}c} {{{\mathcal L}_\xi}{A_\mu}{V^\mu} = {V^\mu}{{\mathcal L}_\xi}{A_\mu} + {A_\mu}{{\mathcal L}_\xi}{V^\mu}\quad \quad \quad \quad \quad \quad \quad \quad \quad\;}\\ {= {V^\mu}{{\mathcal L}_\xi}{A_\mu} + {A_\mu}({{\xi ^\nu}{\nabla _\nu}{V^\mu} - {V^\nu}{\nabla _\nu}{\xi ^\mu}}){.}}\\ \end{array}$$
(44)

But, because AμVμ is a scalar,

$$\begin{array}{*{20}c} {{{\mathcal L}_\xi}{A_\mu}{V^\mu} = {\xi ^\nu}{\nabla _\nu}{A_\mu}{V^\mu}\quad \quad \quad \quad \quad \quad \quad \quad \quad}\\ {= {\xi ^\nu}({{V^\mu}{\nabla _\nu}{A_\mu} + {A_\mu}{\nabla _\nu}{V^\mu}}),}\\ \end{array}$$
(45)

and thus

$${V^\mu}({{{\mathcal L}_\xi}{A_\mu} - {\xi ^\nu}{\nabla _\nu}{A_\mu} - {A_\nu}{\nabla _\mu}{\xi ^\nu}}) = 0{.}$$
(46)

Since Vμ is arbitrary,

$${{\mathcal L}_\xi}{A_\mu} = {\xi ^\nu}{\nabla _\nu}{A_\mu} + {A_\nu}{\nabla _\mu}{\xi ^\nu}.$$
(47)

We introduced in Equation (32) the effect of parallel transport on vector components. By contrast, the Lie-dragging of a vector causes its components to change as

$$\delta V_{\mathcal L}^\mu = {{\mathcal L}_\xi}{V^\mu}\,\epsilon .$$
(48)

We see that if ξVμ = 0, then the components of the vector do not change as the vector is Liedragged. Suppose now that Vμ represents a vector field and that there exists a corresponding congruence of curves with tangent given by ξμ. If the components of the vector field do not change under Lie-dragging we can show that this implies a symmetry, meaning that a coordinate system can be found such that the vector components will not depend on one of the coordinates. This is a potentially very powerful statement.

Let ξμ represent the tangent to the curves drawn out by, say, the μ = a coordinate. Then we can write xa(δ) = δ which means

$${\xi ^\mu} = {\delta ^\mu}_a.$$
(49)

if the Lie derivative of Vμ with respect to ξν vanishes we find

$${\xi ^\nu}{{\partial {V^\mu}} \over {\partial {x^\nu}}} = {V^\nu}{{\partial {\xi ^\mu}} \over {\partial {x^\nu}}} = 0.$$
(50)

Using this in Equation (41) implies \(V_\epsilon ^\mu = V_0^\mu\), that is to say, the vector field Vμ(xν) does not depend on the xa coordinate. Generally speaking, every ξμ that exists that causes the Lie derivative of a vector (or higher rank tensors) to vanish represents a symmetry.

Let us take the spacetime metric gμν as an example. A spacetime symmetry can be represented by a generating vector field ξμ such that

$${{\mathcal L}_\xi}{g_{\mu \nu}} = {\nabla _\mu}{\xi _\nu} + {\nabla _\nu}{\xi _\mu} = 0.$$
(51)

This is known as Killing’s equation, and solutions to this equation are naturally referred to as Killing vectors. As a particular case, let us consider the class of stationary, axisymmetric, and asymptotically flat spacetimes. These are highly relevant in the present context since they capture the physics of rotating, equilibrium configurations. In other words, the geometries that result are among the most fundamental, and useful, for the relativistic astrophysics of spinning black holes and neutron stars. Stationary, axisymmetric, and asymptotically flat spacetimes are such that [14]

  1. 1.

    there exists a Killing vector tμ that is timelike at spatial infinity;

  2. 2.

    there exists a Killing vector ϕμ that vanishes on a timelike 2-surface (called the axis of symmetry), is spacelike everywhere else, and whose orbits are closed curves; and

  3. 3.

    asymptotic flatness means the scalar products tμtμ, and tμϕμ tend to, respectively, −1, +∞, and 0 at spatial infinity.

These conditions imply [16] that a coordinate system exists where

$${t^\mu} = (1,0,0,0),\qquad {\phi ^\mu} = (0,0,0,1){.}$$
(52)

So, for instance, the two Lie derivatives of the metric gμν, say, are such that

$$\begin{array}{*{20}c} {{{\mathcal L}_t}{g_{\mu \nu}} = {t^\sigma}{\nabla _\sigma}{g_{\mu \nu}} + {g_{\mu \sigma}}{\nabla _\nu}{t^\sigma} + {g_{\sigma \nu}}{\nabla _\mu}{t^\sigma}}\\ {= {g_{\mu \sigma}}{\partial _\nu}{t^\sigma} + {g_{\sigma \nu}}{\partial _\mu}{t^\sigma}\quad \quad}\\ {= 0,\quad \quad \quad \quad \quad \quad \quad \quad\;\;}\\ \end{array}$$
(53)

and

$$\begin{array}{*{20}c} {{{\mathcal L}_\phi}{g_{\mu \nu}} = {\phi ^\sigma}{\nabla _\sigma}{g_{\mu \nu}} + {g_{\mu \sigma}}{\nabla _\nu}{\phi ^\sigma} + {g_{\sigma \nu}}{\nabla _\mu}{\phi ^\sigma}}\\ {= {g_{\mu \sigma}}{\partial _\nu}{\phi ^\sigma} + {g_{\sigma \nu}}{\partial _\mu}{\phi ^\sigma}\quad \quad}\\ {= 0.\quad \quad \quad \quad \quad \quad \quad \quad \;\;\;}\\ \end{array}$$
(54)

2.4 Spacetime curvature

The main message of the previous two Sections 2.2 and 2.3 is that one must have an a priori idea of how vectors and higher rank tensors are moved from point to point in spacetime. Another manifestation of the complexity associated with carrying tensors about in spacetime is that the covariant derivative does not commute. For a vector we find

$${\nabla _\rho}{\nabla _\sigma}{V^\mu} - {\nabla _\sigma}{\nabla _\rho}{V^\mu} = {R^\mu}_{\nu \;\rho \sigma}{V^\nu},$$
(55)

where \({R^\mu}_{\nu \rho \sigma}\) is the Riemann tensor. It is obtained from

$${R^\mu}_{\nu \rho \sigma} = \Gamma _{\nu \sigma ,\rho}^\mu - \Gamma _{\nu \rho ,\sigma}^\mu + \Gamma _{\tau \rho}^\mu \Gamma _{\nu \sigma}^\tau - \Gamma _{\tau \sigma}^\mu \Gamma _{\nu \rho}^\tau .$$
(56)

Closely associated are the Ricci tensor Rνσ = Rσν and scalar R that are defined by the contractions

$${R_{\nu \sigma}} = {R^\mu}_{\nu \mu \sigma},\qquad R = {g^{\nu \sigma}}{R_{\nu \sigma}}.$$
(57)

We will later also have need for the Einstein tensor, which is given by

$${G_{\mu \nu}} = {R_{\mu \nu}} - {1 \over 2}R{g_{\mu \nu}}.$$
(58)

It is such that \({\nabla _\mu}{G^\mu}_\nu\) vanishes identically (this is known as the Bianchi identity).

A more intuitive understanding of the Riemann tensor is obtained by seeing how its presence leads to a path-dependence in the changes that a vector experiences as it moves from point to point in spacetime. Such a situation is known as a “non-integrability” condition, because the result depends on the whole path and not just the initial and final points. That is, it is unlike a total derivative which can be integrated and thus depends on only the lower and upper limits of the integration. Geometrically we say that the spacetime is curved, which is why the Riemann tensor is also known as the curvature tensor.

To illustrate the meaning of the curvature tensor, let us suppose that we are given a surface that can be parameterized by the two parameters λ and η. Points that live on this surface will have coordinate labels xμ(λ, η). We want to consider an infinitesimally small “parallelogram” whose four corners (moving counterclockwise with the first corner at the lower left) are given by xμ(λ, η), xμ(λ, η + δη), xμ(λ + δλ, η + δη), and xμ(λ + δλ, η). Generally speaking, any “movement” towards the right of the parallelogram is effected by varying η, and that towards the top results by varying λ. The plan is to take a vector Vμ (λ, η) at the lower-left corner xμ(λ, η), parallel transport it along a λ = const curve to the lower-right corner at xμ(λ, η + δη) where it will have the components Vμ(λ, η + δη), and end up by parallel transporting Vμ at xμ(λ, η + δη) along an η = const curve to the upper-right corner at xμ(λ, δλ, η + δη). We will call this path I and denote the final component values of the vector as \(V_{\rm{I}}^\mu\). We repeat the same process except that the path will go from the lower-left to the upper-left and then on to the upper-right corner. We will call this path II and denote the final component values as \(V_{{\rm{II}}}^\mu\).

Recalling Equation (32) as the definition of parallel transport, we first of all have

$${V^\mu}(\lambda ,\eta + \delta \eta) \approx {V^\mu}(\lambda ,\eta) + {\delta _\eta}V_\parallel ^\mu (\lambda ,\eta) = {V^\mu}(\lambda ,\eta) - \Gamma _{\nu \rho}^\mu {V^\nu}{\delta _\eta}{x^\rho}$$
(59)

and

$${V^\mu}(\lambda + \delta \lambda ,\eta) \approx {V^\mu}(\lambda ,\eta) + {\delta _\lambda}V_\parallel ^\mu (\lambda ,\eta) = {V^\mu}(\lambda ,\eta) - \Gamma _{\nu \rho}^\mu {V^\nu}{\delta _\lambda}{x^\rho},$$
(60)

where

$${\delta _\eta}{x^\mu} \approx {x^\mu}(\lambda ,\eta + \delta \eta) - {x^\mu}(\lambda ,\eta),\qquad {\delta _\lambda}{x^\mu} \approx {x^\mu}(\lambda + \delta \lambda ,\eta) - {x^\mu}(\lambda ,\eta){.}$$
(61)

Next, we need

$$V_{\rm{I}}^\mu \approx {V^\mu}(\lambda ,\eta + \delta \eta) + {\delta _\lambda}V_\parallel ^\mu (\lambda ,\eta + \delta \eta),$$
(62)
$$V_{{\rm{II}}}^\mu \approx {V^\mu}(\lambda + \delta \lambda ,\eta) + {\delta _\eta}V_\parallel ^\mu (\lambda + \delta \lambda ,\eta){.}$$
(63)

Working things out, we find that the difference between the two paths is

$$\Delta {V^\mu} \equiv V_{\rm{I}}^\mu - V_{{\rm{II}}}^\mu = {R^\mu}_{\nu \rho \sigma}{V^\nu}{\delta _\lambda}{x^\sigma}{\delta _\eta}{x^\rho},$$
(64)

which follows because δλδηxμ = δηδλxμ, i.e. we have closed the parallelogram.

3 The Stress-Energy-Momentum Tensor and the Einstein Equations

Any discussion of relativistic physics must include the stress-energy-momentum tensor Tμν. It is as important for general relativity as Gμν in that it enters the Einstein equations in as direct a way as possible, i.e.

$${G_{\mu \nu}} = 8\pi {T_{\mu \nu}}.$$
(65)

Misner, Thorne, and Wheeler [80] refer to Tμν as “… a machine that contains a knowledge of the energy density, momentum density, and stress as measured by any and all observers at that event”.

Without an a priori, physically-based specification for Tμν, solutions to the Einstein equations are devoid of physical content, a point which has been emphasized, for instance, by Geroch and Horowitz (in [52]). Unfortunately, the following algorithm for producing “solutions” has been much abused: (i) specify the form of the metric, typically by imposing some type of symmetry, or symmetries, (ii) work out the components of Gμν based on this metric, (iii) define the energy density to be G00 and the pressure to be G11, say, and thereby “solve” those two equations, and (iv) based on the “solutions” for the energy density and pressure solve the remaining Einstein equations. The problem is that this algorithm is little more than a mathematical game. It is only by sheer luck that it will generate a physically viable solution for a non-vacuum spacetime. As such, the strategy is antithetical to the raison d’être of gravitational-wave astrophysics, which is to use gravitational-wave data as a probe of all the wonderful microphysics, say, in the cores of neutron stars. Much effort is currently going into taking given microphysics and combining it with the Einstein equations to model gravitational-wave emission from realistic neutron stars. To achieve this aim, we need an appreciation of the stress-energy tensor and how it is obtained from microphysics.

Those who are familiar with Newtonian fluids will be aware of the roles that total internal energy, particle flux, and the stress tensor play in the fluid equations. In special relativity we learn that in order to have spacetime covariant theories (e.g. well-behaved with respect to the Lorentz transformations) energy and momentum must be combined into a spacetime vector, whose zeroth component is the energy and the spatial components give the momentum. The fluid stress must also be incorporated into a spacetime object, hence the necessity for Tμν. Because the Einstein tensor’s covariant divergence vanishes identically, we must have also \({\nabla _\mu}{T^\mu}_\nu = 0\) (which we will later see happens automatically once the fluid field equations are satisfied).

To understand what the various components of Tμν mean physically we will write them in terms of projections into the timelike and spacelike directions associated with a given observer. In order to project a tensor index along the observer’s timelike direction we contract that index with the observer’s unit four-velocity Uμ. A projection of an index into spacelike directions perpendicular to the timelike direction defined by Uμ (see [105] for the idea from a “3 + 1” point of view, or [21] from the “brane” point of view) is realized via the operator defined as

$$\bot _\nu ^\mu = {\delta ^\mu}_\nu + {U^\mu}{U_\nu},\qquad {U^\mu}{U_\mu} = - 1.$$
(66)

Any tensor index that has been “hit” with the projection operator will be perpendicular to the timelike direction associated with Uμ in the sense that \(\bot _\nu ^\mu {U^\nu} = 0\). Figure 4 is a local view of both projections of a vector Vμ for an observer with unit four-velocity Uμ. More general tensors are projected by acting with Uμ or \(\bot _\nu ^\mu\) on each index separately (i.e. multi-linearly).

Figure 4
figure 4

The projections at point P of a vector Vμ onto the worldline defined by Uμ and into the perpendicular hypersurface (obtained from the action of \(\bot _\nu ^\mu\)).

The energy density ρ as perceived by the observer is (see Eckart [39] for one of the earliest discussions)

$$\rho = {T_{\mu \nu}}{U^\mu}{U^\nu},$$
(67)

while

$${{\mathcal P}_\mu} = - \bot _\mu ^\rho {U^\nu}{T_{\rho \nu}}$$
(68)

is the spatial momentum density, and the spatial stress \({{\mathcal S}_{\mu \nu}}\) is

$${{\mathcal S}_{\mu \nu}} = \bot _\mu ^\sigma \bot _\nu ^\rho {T_{\sigma \rho}}.$$
(69)

The manifestly spatial component Sij is understood to be the ith-component of the force across a unit area that is perpendicular to the jth-direction. With respect to the observer, the stress-energy-momentum tensor can be written in full generality as the decomposition

$${T_{\mu \nu}} = \rho \,{U_\mu}{U_\nu} + 2{U_{(\mu}}{{\mathcal P}_{\nu)}} + {{\mathcal S}_{\mu \nu}},$$
(70)

where \(2{U_{(\mu}}{{\mathcal P}_\nu}) = {U_\mu}{{\mathcal P}_\nu} + {U_\nu}{{\mathcal P}_\mu}\). Because \({U^\mu}{{\mathcal P}_\mu} = 0\), we see that the trace T = Tμ is

$$T ={\mathcal S} - \rho ,$$
(71)

where \({\mathcal S} = {{\mathcal S}^\mu}_\mu\). We should point out that use of ρ for the energy density is not universal. Many authors prefer to use the symbol ε and reserve ρ for the mass-density. We will later (in Section 6.2) use the above decomposition as motivation for the simplest perfect fluid model.

4 Why Are Fluids Useful Models?

The Merriam-Webster online dictionary (http://www.m-w.com/) defines a fluid as “… a substance (as a liquid or gas) tending to flow or conform to the outline of its container” when taken as a noun and “… having particles that easily move and change their relative position without a separation of the mass and that easily yield to pressure: capable of flowing” when taken as an adjective. The best model of physics is the Standard Model which is ultimately the description of the “substance” that will make up our fluids. The substance of the Standard Model consists of remarkably few classes of elementary particles: leptons, quarks, and so-called “force” carriers (gauge-vector bosons). Each elementary particle is quantum mechanical, but the Einstein equations require explicit trajectories. Moreover, cosmology and neutron stars are basically many particle systems and, even forgetting about quantum mechanics, it is not practical to track each and every “particle” that makes them up, regardless of whether these are elementary (leptons, quarks, etc.) or collections of elementary particles (e.g. stars in galaxies and galaxies in cosmology). The fluid model is such that the inherent quantum mechanical behavior, and the existence of many particles are averaged over in such a way that it can be implemented consistently in the Einstein equations.

Central to the model is the notion of a “fluid particle”, also known as a “fluid element” or “material particle” [68]. It is an imaginary, local “box” that is infinitesimal with respect to the system en masse and yet large enough to contain a large number of particles (e.g. an Avogadro’s number of particles). This is illustrated in Figure 5. We consider an object with characteristic size D that is modeled as a fluid that contains M fluid elements. From inside the object we magnify a generic fluid element of characteristic size L. In order for the fluid model to work we require MN ≫ 1 and DL. Strictly speaking, our model has L infinitesimal, M → ∞, but with the total number of particles remaining finite. An operational point of view is that discussed by Lautrup in his fine text “Physics of Continuous Matter” [68]. He rightly points out that implicit in the model is some statement of the intended precision. At some level, any real system will be discrete and no longer represented by a continuum. As long as the scale where the discreteness of matter and fluctuations are important is much smaller than the desired precision, a continuum approximation is valid.

Figure 5
figure 5

An object with a characteristic size D is modeled as a fluid that contains M fluid elements. From inside the object we magnify a generic fluid element of characteristic size L. In order for the fluid model to work we require MN ≫ 1 and DL.

The explicit trajectories that enter the Einstein equations are those of the fluid elements, not the much smaller (generally fundamental) particles that are “confined”, on average, to the elements. Hence, when we speak later of the fluid velocity, we mean the velocity of fluid elements. In this sense, the use of the phrase “fluid particle” is very apt. For instance, each fluid element will trace out a timelike trajectory in spacetime. This is illustrated in Figure 7 for a number of fluid elements. An object like a neutron star is a collection of worldlines that fill out continuously a portion of spacetime. In fact, we will see later that the relativistic Euler equation is little more than an “integrability” condition that guarantees that this filling (or fibration) of spacetime can be performed. The dual picture to this is to consider the family of three-dimensional hypersurfaces that are pierced by the worldlines at given instants of time, as illustrated in Figure 7. The integrability condition in this case will guarantee that the family of hypersurfaces continuously fill out a portion of spacetime. In this view, a fluid is a so-called three-brane (see [21] for a general discussion of branes). In fact the method used in Section 8 to derive the relativistic fluid equations is based on thinking of a fluid as living in a three-dimensional “matter” space (i.e. the left-hand-side of Figure 7).

Once one understands how to build a fluid model using the matter space, it is straight-forward to extend the technique to single fluids with several constituents, as in Section 9, and multiple fluid systems, as in Section 10. An example of the former would be a fluid with one species of particles at a non-zero temperature, i.e. non-zero entropy, that does not allow for heat conduction relative to the particles. (Of course, entropy does flow through spacetime.) The latter example can be obtained by relaxing the constraint of no heat conduction. In this case the particles and the entropy are both considered to be fluids that are dynamically independent, meaning that the entropy will have a four-velocity that is generally different from that of the particles. There is thus an associated collection of fluid elements for the particles and another for the entropy. At each point of spacetime that the system occupies there will be two fluid elements, in other words, there are two matter spaces (cf. Section 10). Perhaps the most important consequence of this is that there can be a relative flow of the entropy with respect to the particles. In general, relative flows lead to the so-called entrainment effect, i.e. the momentum of one fluid in a multiple fluid system is in principle a linear combination of all the fluid velocities [6]. The canonical examples of two fluid models with entrainment are superfluid He4 [94] at non-zero temperature and a mixture of superfluid He4 and He3 [8].

5 A Primer on Thermodynamics and Equations of State

Fluids consist of many fluid elements, and each fluid element consists of many particles. The state of matter in a given fluid element is determined thermodynamically [96], meaning that only a few parameters are tracked as the fluid element evolves. Generally, not all the thermodynamic variables are independent, being connected through the so-called equation of state. The number of independent variables can also be reduced if the system has an overall additivity property. As this is a very instructive example, we will now illustrate such additivity in detail.

5.1 Fundamental, or Euler, relation

Consider the standard form of the combined First and Second LawsFootnote 3 for a simple, single-species system:

$${\rm{d}}E = T\,{\rm{d}}S - p\,{\rm{d}}V + \mu \,{\rm{d}}N.$$
(72)

This follows because there is an equation of state, meaning that E = E(S, V, N) where

$$T = {\left. {{{\partial E} \over {\partial S}}} \right\vert _{V,N}},\quad \quad p = - {\left. {{{\partial E} \over {\partial V}}} \right\vert _{S,N}},\quad \quad \mu = {\left. {{{\partial E} \over {\partial N}}} \right\vert _{S,V}}.$$
(73)

The total energy E, entropy S, volume V, and particle number N are said to be extensive if when S, V, and N are doubled, say, then E will also double. Conversely, the temperature T, pressure p, and chemical potential μ are called intensive if they do not change their values when V, N, and S are doubled. This is the additivity property and we will now show that it implies an Euler relation (also known as the “fundamental relation” [96]) among the thermodynamic variables.

Let a tilde represent the change in thermodynamic variables when S, V, and N are all increased by the same amount λ, i.e.

$$\tilde S = \lambda S,\qquad \tilde V = \lambda V,\qquad \tilde N = \lambda N.$$
(74)

Taking E to be extensive then means

$$\tilde E(\tilde S,\tilde V,\tilde N) = \lambda E(S,V,N){.}$$
(75)

Of course we have for the intensive variables

$$\tilde T = T,\qquad \tilde p = p,\qquad \tilde \mu = \mu .$$
(76)

Now,

$$\begin{array}{*{20}c} {{\rm{d}}\tilde E = \lambda \,{\rm{d}}E + E\,{\rm{d}}\lambda \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \;}\\ {= \tilde T\,{\rm{d}}\tilde S - \tilde p\,{\rm{d}}\tilde V + \tilde \mu \,{\rm{d}}\tilde N\quad \quad \quad \quad \quad \quad \quad \quad \quad \quad}\\ {= \lambda ({T{\rm{d}}S - p{\rm{d}}V + \mu {\rm{d}}N}) + \left({TS - pV + \mu N} \right){\rm{d}}\lambda ,}\\ \end{array}$$
(77)

and therefore we find the Euler relation

$$E = TS - pV + \mu N.$$
(78)

if we let ρ = E/V denote the total energy density, s = S/V the total entropy density, and n = N/V the total particle number density, then

$$p + \rho = Ts + \mu n.$$
(79)

The nicest feature of an extensive system is that the number of parameters required for a complete specification of the thermodynamic state can be reduced by one, and in such a way that only intensive thermodynamic variables remain. To see this, let λ = 1/V, in which case

$$\tilde S = s,\qquad \tilde V = 1,\qquad \tilde N = n.$$
(80)

The re-scaled energy becomes just the total energy density, i.e. \(\tilde E = E/V = \rho\), and moreover ρ = ρ(s, n) since

$$\rho = \tilde E(\tilde S,\tilde V,\tilde N) = \tilde E(S/V,1,N/V) = \tilde E(s,n){.}$$
(81)

The first law thus becomes

$${\rm{d}}\tilde E = \tilde T\,{\rm{d}}\tilde S - \tilde p\,{\rm{d}}\tilde V + \tilde \mu \,{\rm{d}}\tilde N = T\,{\rm{d}}s + \mu \,{\rm{d}}n,$$
(82)

or

$${\rm{d}}\rho = T\,{\rm{d}}s + \mu \,{\rm{d}}n.$$
(83)

This implies

$$T = {\left. {{{\partial \rho} \over {\partial s}}} \right\vert _n},\quad \;\;\mu = {\left. {{{\partial \rho} \over {\partial n}}} \right\vert _s}.$$
(84)

The Euler relation (79) then yields the pressure as

$$p = - \rho + s{\left. {{{\partial \rho} \over {\partial s}}} \right\vert _n} + n{\left. {{{\partial \rho} \over {\partial n}}} \right\vert _s}.$$
(85)

We can think of a given relation ρ(s, n) as the equation of state, to be determined in the flat, tangent space at each point of the manifold, or, physically, small enough patches across which the changes in the gravitational field are negligible, but also large enough to contain a large number of particles. For example, for a neutron star Glendenning [48] has reasoned that the relative change in the metric over the size of a nucleon with respect to the change over the entire star is about 10−19, and thus one must consider many internucleon spacings before a substantial change in the metric occurs. In other words, it is sufficient to determine the properties of matter in special relativity, neglecting effects due to spacetime curvature. The equation of state is the major link between the microphysics that governs the local fluid behavior and global quantities (such as the mass and radius of a star).

In what follows we will use a thermodynamic formulation that satisfies the fundamental scaling relation, meaning that the local thermodynamic state (modulo entrainment, see later) is a function of the variables N/V, S/V, etc. This is in contrast to the fluid formulation of “MTW” [80]. In their approach one fixes from the outset the total number of particles N, meaning that one simply sets dN = 0 in the first law of thermodynamics. Thus without imposing any scaling relation, one can write

$${\rm{d}}\rho = {\rm{d}}({E/V}) = T\,{\rm{d}}s + {1 \over n}({p + \rho - Ts}){\rm{d}}n{.}$$
(86)

This is consistent with our starting point for fluids, because we assume that the extensive variables associated with a fluid element do not change as the fluid element moves through spacetime. However, we feel that the use of scaling is necessary in that the fully conservative, or non-dissipative, fluid formalism presented below can be adapted to non-conservative, or dissipative, situations where dN = 0 cannot be imposed.

5.2 From microscopic models to the fluid equation of state

Let us now briefly discuss how an equation of state is constructed. For simplicity, we focus on a one-parameter system, with that parameter being the particle number density. The equation of state will then be of the form ρ = ρ(n). In many-body physics (such as studied in condensed matter, nuclear, and particle physics) one can in principle construct the quantum mechanical particle number density nQM, stress-energy-momentum tensor \(T_{\mu \nu}^{QM}\), and associated conserved particle number density current \(n_{QM}^\mu\) (starting with some fundamental Lagrangian, say; cf. [115, 48, 116]). But unlike in quantum field theory in curved spacetime [13], one assumes that the matter exists in an infinite Minkowski spacetime (cf. the discussion following Equation (85)). If the reader likes, the application of \(T_{\mu \nu}^{QM}\) at a spacetime point means that \(T_{\mu \nu}^{QM}\) has been determined with respect to a flat tangent space at that point.

Once \(T_{\mu \nu}^{QM}\) is obtained, and after (quantum mechanical and statistical) expectation values with respect to the system’s (quantum and statistical) states are taken, one defines the energy density as

$$\rho = {u^\mu}{u^\nu}\langle T_{\mu \nu}^{{\rm{QM}}}\rangle ,$$
(87)

where

$${u^\mu} \equiv {1 \over n}\langle n_{{\rm{QM}}}^\mu \rangle ,\qquad n = \langle {n_{{\rm{QM}}}}\rangle .$$
(88)

At sufficiently small temperatures, ρ will just be a function of the number density of particles n at the spacetime point in question, i.e. ρ = ρ(n). Similarly, the pressure is obtained as

$$p = {1 \over 3}({\langle {T^{{\rm{QM}}\mu}}_\mu \rangle + \rho})$$
(89)

and it will also be a function of n.

One must be very careful to distinguish \(T_{\mu \nu}^{QM}\) from Tμν. The former describes the states of elementary particles with respect to a fluid element, whereas the latter describes the states of fluid elements with respect to the system. Comer and Joynt [35] have shown how this line of reasoning applies to the two-fluid case.

6 An Overview of the Perfect Fluid

There are many different ways of constructing general relativistic fluid equations. Our purpose here is not to review all possible methods, but rather to focus on a couple: (i) an “off-the-shelve” consistency analysis for the simplest fluid a la Eckart [39], to establish some key ideas, and then (ii) a more powerful method based on an action principle that varies fluid element world lines. The ideas behind this variational approach can be traced back to Taub [108] (see also [101]). Our description of the method relies heavily on the work of Brandon Carter, his students, and collaborators [19, 36, 37, 28, 29, 67, 90, 91]. We prefer this approach as it utilizes as much as possible the tools of the trade of relativistic fields, i.e. no special tricks or devices will be required (unlike even in the case of our “off-the-shelve” approach). One’s footing is then always made sure by well-grounded, action-based derivations. As Carter has always made clear: When there are multiple fluids, of both the charged and uncharged variety, it is essential to distinguish the fluid momenta from the velocities, in particular in order to make the geometrical and physical content of the equations transparent. A well-posed action is, of course, perfect for systematically constructing the momenta.

6.1 Rates-of-change and Eulerian versus Lagrangian observers

The key geometric difference between generally covariant Newtonian fluids and their general relativistic counterparts is that the former have an a priori notion of time [22, 23, 24]. Newtonian fluids also have an a priori notion of space (which can be seen explicitly in the Newtonian covariant derivative introduced earlier; cf. the discussion in [22]). Such a structure has clear advantages for evolution problems, where one needs to be unambiguous about the rate-of-change of the system. But once a problem requires, say, electromagnetism, then the a priori Newtonian time is at odds with the full spacetime covariance of electromagnetic fields. Fortunately, for spacetime covariant theories there is the so-called “3 + 1” formalism (see, for instance, [105]) that allows one to define “rates-of-change” in an unambiguous manner, by introducing a family of spacelike hypersurfaces (the “3”) given as the level surfaces of a spacetime scalar (the “1”).

Something that Newtonian and relativistic fluids have in common is that there are preferred frames for measuring changes — those that are attached to the fluid elements. In the parlance of hydrodynamics, one refers to Lagrangian and Eulerian frames, or observers. A Newtonian Eulerian observer is one who sits at a fixed point in space, and watches fluid elements pass by, all the while taking measurements of their densities, velocities, etc. at the given location. In contrast, a Lagrangian observer rides along with a particular fluid element and records changes of that element as it moves through space and time. A relativistic Lagrangian observer is the same, but the relativistic Eulerian observer is more complicated to define. Smarr and York [105] define such an observer as one who would follow along a worldline that remains everywhere orthogonal to the family of spacelike hypersurfaces.

The existence of a preferred frame for a one fluid system can be used to great advantage. In the next Section 6.2 we will use an “off-the-shelf” analysis that exploits a preferred frame to derive the standard perfect fluid equations. Later, we will use Eulerian and Lagrangian variations to build an action principle for the single and multiple fluid systems. These same variations can also be used as the foundation for a linearized perturbation analysis of neutron stars [63]. As we will see, the use of Lagrangian variations is absolutely essential for establishing instabilities in rotating fluids [44, 45]. Finally, we point out that multiple fluid systems can have as many notions of Lagrangian observers as there are fluids in the system.

6.2 The single, perfect fluid problem: “Off-the-shelf” consistency analysis

We earlier took the components of a general stress-energy-momentum tensor and projected them onto the axes of a coordinate system carried by an observer moving with four-velocity Uμ. As mentioned above, the simplest fluid is one for which there is only one four-velocity uμ. Hence, there is a preferred frame defined by uμ, and if we want the observer to sit in this frame we can simply take Uμ = uμ. With respect to the fluid there will be no momentum flux, i.e. \({{\mathcal P}_\mu} = 0\). Since we use a fully spacetime covariant formulation, i.e. there are only spacetime indices, the resulting stress-energy-momentum tensor will transform properly under general coordinate transformations, and hence can be used for any observer.

The spatial stress is a two-index, symmetric tensor, and the only objects that can be used to carry the indices are the four-velocity uμ and the metric gμν. Furthermore, because the spatial stress must also be symmetric, the only possibility is a linear combination of gμν and uμuν. Given that \({u^\mu}{{\mathcal S}_{\mu \nu}} = 0\), we find

$${{\mathcal S}_{\mu \nu}} = {1 \over 3}{\mathcal S}({g_{\mu \nu}} + {u_\mu}{u_\nu}){.}$$
(90)

As the system is assumed to be locally isotropic, it is possible to diagonalize the spatial-stress tensor. This also implies that its three independent diagonal elements should actually be equal to the same quantity, which is the local pressure. Hence we have \(p = {\mathcal S}/3\) and

$${T_{\mu \nu}} =({\rho + p}){u_\mu}{u_\nu} + p{g_{\mu \nu}},$$
(91)

which is the well-established result for a perfect fluid.

Given a relation p = p(ρ), there are four independent fluid variables. Because of this the equations of motion are often understood to be \({\nabla _\mu}{T^\mu}_\nu = 0\), which follows immediately from the Einstein equations and the fact that \({\nabla _\mu}{G^\mu}_\nu = 0\). To simplify matters, we take as equation of state a relation of the form ρ = ρ(n) where n is the particle number density. The chemical potential μ is then given by

$${\rm{d}}\rho = {{\partial \rho} \over {\partial n}}{\rm{d}}n \equiv \mu \,{\rm{d}}n,$$
(92)

and we see from the Euler relation (79) that

$$\mu n = p + \rho .$$
(93)

Let us now get rid of the free index of \({\nabla _\mu}{T^\mu}_\nu = 0\) in two ways: first, by contracting it with uν and second, by projecting it with \(\bot _\rho ^\nu\) (letting Uμ = uμ). Recalling the fact that Uμuμ = − 1 we have the identity

$${\nabla _\mu}({{u^\nu}{u_\nu}}) = 0\qquad \Rightarrow \qquad {u_\nu}{\nabla _\mu}{u^\nu} = 0.$$
(94)

Contracting with uν and using this identity gives

$${u^\mu}{\nabla _\mu}\rho + (\rho + p){\nabla _\mu}{u^\mu} = 0.$$
(95)

The definition of the chemical potential μ and the Euler relation allow us to rewrite this as

$$\mu {u^\mu}{\nabla _\mu}n + \mu n{\nabla _\mu}{u^\mu}\qquad \Rightarrow \qquad {\nabla _\mu}{n^\mu} = 0,$$
(96)

where nμ = nuμ. Projection of the free index using \(\bot _\mu ^\rho\) leads to

$${D_\mu}p = - (\rho + p){a_\mu},$$
(97)

where \({D_\mu} \equiv \bot _\mu ^\rho {\nabla _\rho}\) is a purely spatial derivative and aμuννuμ is the acceleration. This is reminiscent of the familiar Euler equation for Newtonian fluids.

However, we should not be too quick to think that this is the only way to understand \({\nabla _\mu}{T^\mu}_\nu = 0\). There is an alternative form that makes the perfect fluid have much in common with vacuum electromagnetism. If we define

$${\mu _\nu} = \mu {u_\nu},$$
(98)

and note that uμduμ = 0, then

$${\rm{d}}\rho = - {\mu _\nu}\,{\rm{d}}{n^\nu}.$$
(99)

The stress-energy-momentum tensor can now be written in the form

$${T^\mu}_\nu = p{\delta ^\mu}_\nu + {n^\mu}{\mu _\nu}.$$
(100)

We have here our first encounter with the fluid element momentum μν that is conjugate to the particle number density current nμ. Its importance will become clearer as this review develops, particularly when we discuss the two-fluid case. If we now project onto the free index of \({\nabla _\mu}{T^\mu}_\nu = 0\) using \(\bot _\nu ^\mu\), as before, we find

$${f_\nu} + \left({{\nabla _\mu}{n^\mu}} \right){\mu _\nu} = 0,$$
(101)

where the force density fν is

$${f_\nu} = {n^\mu}{\omega _{\mu \nu}},$$
(102)

and the vorticity ωμν is defined as

$${\omega _{\mu \nu}} \equiv 2\nabla {}_{\left[ \mu \right.}{\mu _{\left. \nu \right]}} = {\nabla _\mu}{\mu _\nu} - {\nabla _\nu}{\mu _\mu}.$$
(103)

Contracting Equation (101) with nν implies (since ωμν = −ωνμ) that

$${\nabla _\mu}{n^\mu} = 0$$
(104)

and as a consequence

$${f_\nu} = {n^\mu}{\omega _{\mu \nu}} = 0.$$
(105)

The vorticity two-form ωμν has emerged quite naturally as an essential ingredient of the fluid dynamics [72, 19, 11, 60]. Those who are familiar with Newtonian fluids should be inspired by this, as the vorticity is often used to establish theorems on fluid behavior (for instance the Kelvin-Helmholtz theorem [66]) and is at the heart of turbulence modeling [93].

To demonstrate the role of as the vorticity, consider a small region of the fluid where the time direction tμ, in local Minkowski coordinates, is adjusted to be the same as that of the fluid four-velocity so that uμ = tμ = (1, 0, 0, 0). Equation (105) and the antisymmetry then imply that ωμν can only have purely spatial components. Because the rank of ωμν is two, there are two “nulling” vectors, meaning their contraction with either index of ωμν yields zero (a condition which is true also for vacuum electromagnetism). We have arranged already that tμ be one such vector. By a suitable rotation of the coordinate system the other one can be taken as zu = (0, 0, 0, 1), thus implying that the only non-zero component of ωμν is xμν. “MTW” [80] points out that such a two-form can be pictured geometrically as a collection of oriented worldtubes, whose walls lie in the x = const and y = const planes. Any contraction of a vector with a two-form that does not yield zero implies that the vector pierces the walls of the worldtubes. But when the contraction is zero, as in Equation (105), the vector does not pierce the walls. This is illustrated in Figure 6, where the red circles indicate the orientation of each world-tube. The individual fluid element four-velocities lie in the centers of the world-tubes. Finally, consider the closed contour in Figure 6. If that contour is attached to fluid-element worldlines, then the number of worldtubes contained within the contour will not change because the worldlines cannot pierce the walls of the worldtubes. This is essentially the Kelvin-Helmholtz theorem on conservation of vorticity. From this we learn that the Euler equation is an integrability condition which ensures that the vorticity two-surfaces mesh together to fill out spacetime.

Figure 6
figure 6

A local, geometrical view of the Euler equation as an integrability condition of the vorticity for a single-constituent perfect fluid.

As we have just seen, the form Equation (105) of the equations of motion can be used to discuss the conservation of vorticity in an elegant way. It can also be used as the basis for a derivation of other known theorems in fluid mechanics. To illustrate this, let us derive a generalized form of Bernoulli’s theorem. Let us assume that the flow is invariant with respect to transport by some vector field kρ. That is, we have

$${{\mathcal L}_k}{\mu _\rho} = 0,\quad \quad \rightarrow \quad \quad ({\nabla _\rho}{\mu _\sigma} - {\nabla _\sigma}{\mu _\rho}){k^\sigma} = {\nabla _\rho}({k^\sigma}{\mu _\sigma}).$$
(106)

Here one may consider two particular situations. If kσ is taken to be the four-velocity, then the scalar kσμσ represents the “energy per particle”. If instead kσ represents an axial generator of rotation, then the scalar will correspond to an angular momentum. For the purposes of the present discussion we can leave kσ unspecified, but it is still useful to keep these possibilities in mind. Now contract the equation of motion (105) with kσ and assume that the conservation law Equation (104) holds. Then it is easy to show that we have

$${n^\rho}{\nabla _\rho}\left({{\mu _\sigma}{k^\sigma}} \right) = {\nabla _\rho}\left({{n^\rho}{\mu _\sigma}{k^\sigma}} \right) = 0.$$
(107)

In other words, we have shown that nρμσkσ is a conserved quantity.

Given that we have just inferred the equations of motion from the identity that \({\nabla _\mu}{T^\mu}_\nu = 0\), we now emphatically state that while the equations are correct the reasoning is severely limited. In fact, from a field theory point of view it is completely wrong! The proper way to think about the identity is that the equations of motion are satisfied first, which then guarantees that \({\nabla _\mu}{T^\mu}_\nu = 0\). There is no clearer way to understand this than to study the multi-fluid case: Then the vanishing of the covariant divergence represents only four equations, whereas the multi-fluid problem clearly requires more information (as there are more velocities that must be determined). We have reached the end of the road as far as the “off-the-shelf” strategy is concerned, and now move on to an action-based derivation of the fluid equations of motion.

7 Setting the Context: The Point Particle

The simplest physics problem, i.e. the point particle, has always served as a guide to deep principles that are used in much harder problems. We have used it already to motivate parallel transport as the foundation for the covariant derivative. Let us call upon the point particle again to set the context for the action-based derivation of the fluid field equations. We will simplify the discussion by considering only motion in one dimension. We assure the reader that we have good reasons, and ask for patience while we remind him/her of what may be very basic facts.

Early on we learn that an action appropriate for the point particle is

$$I = \int\nolimits_{{t_{\rm{i}}}}^{{t_{\rm{f}}}} {\rm{d}} t\,T = \int\nolimits_{{t_{\rm{i}}}}^{{t_{\rm{f}}}} {\rm{d}} t\left({{1 \over 2}m{{\dot x}^2}} \right),$$
(108)

where m is the mass and T the kinetic energy. A first-order variation of the action with respect to x(t) yields

$$\delta I = - \int\nolimits_{{t_{\rm{i}}}}^{{t_{\rm{f}}}} {\rm{d}} t\left({m\ddot x} \right)\delta x + \left. {\left({m\dot x\delta x} \right)} \right\vert _{{t_{\rm{i}}}}^{{t_{\rm{f}}}}.$$
(109)

if this is all the physics to be incorporated, i.e. if there are no forces acting on the particle, then we impose d’Alembert’s principle of least action [65], which states that those trajectories x(t) that make the action stationary, i.e. δI = 0, are those that yield the true motion. We see from the above that functions x(t) that satisfy the boundary conditions

$$\delta x({t_{\rm{i}}}) = 0 = \delta x({t_{\rm{f}}})$$
(110)

and the equation of motion

$$m\ddot x = 0$$
(111)

will indeed make δI = 0. This same reasoning applies in the substantially more difficult fluid actions that will be considered later.

But, of course, forces need to be included. First on the list are the so-called conservative forces, describable by a potential V(x), which are placed into the action according to

$$I = \int\nolimits_{{t_{\rm{i}}}}^{{t_{\rm{f}}}} {\rm{d}} tL(x,\dot x) = \int\nolimits_{{t_{\rm{i}}}}^{{t_{\rm{f}}}} {\rm{d}} t\left({{1 \over 2}m{{\dot x}^2} - V(x)} \right),$$
(112)

where L = TV is the so-called Lagrangian. The variation now leads to

$$\delta I = - \int\nolimits_{{t_{\rm{i}}}}^{{t_{\rm{f}}}} {\rm{d}} t\left({m\ddot x + {{\partial V} \over {\partial x}}} \right)\delta x + \left. {\left({m\dot x\delta x} \right)} \right\vert _{{t_{\rm{i}}}}^{{t_{\rm{f}}}}.$$
(113)

Assuming no externally applied forces, d’Alembert’s principle yields the equation of motion

$$m\ddot x + {{\partial V} \over {\partial x}} = 0.$$
(114)

An alternative way to write this is to introduce the momentum p (not to be confused with the fluid pressure introduced earlier) defined as

$$p = {{\partial L} \over {\partial \dot x}} = m\dot x,$$
(115)

in which case

$$\dot p + {{\partial V} \over {\partial x}} = 0.$$
(116)

In the most honest applications, one has the obligation to incorporate dissipative, i.e. non-conservative, forces. Unfortunately, dissipative forces Fd cannot be put into action principles. Fortunately, Newton’s second law is of great guidance, since it states

$$m\ddot x + {{\partial V} \over {\partial x}} = {F_{\rm{d}}},$$
(117)

when both conservative and dissipative forces act. A crucial observation of Equation (117) is that the “kinetic” \((m\ddot x = \dot p)\) and conservative (∂V/∂x) forces, which enter the left-hand side, do follow from the action, i.e.

$${{\delta I} \over {\delta x}} = - \left({m\ddot x + {{\partial V} \over {\partial x}}} \right).$$
(118)

When there are no dissipative forces acting, the action principle gives us the appropriate equation of motion. When there are dissipative forces, the action defines for us the kinetic and conservative force terms that are to be balanced by the dissipative terms. It also defines for us the momentum.

We should emphasize that this way of using the action to define the kinetic and conservative pieces of the equation of motion, as well as the momentum, can also be used in a context where the system experiences an externally applied force Fext. The force can be conservative or dissipative, and will enter the equation of motion in the same way as Fd did above, that is

$$- {{\delta I} \over {\delta x}} = {F_{\rm{d}}} + {F_{{\rm{ext}}}}.$$
(119)

Like a dissipative force, the main effect of the external force can be to siphon kinetic energy from the system. Of course, whether a force is considered to be external or not depends on the a priori definition of the system.

To summarize: The variational argument leads to equations of motion of the form

$$- {{\delta I} \over {\delta x}} = m\ddot x + {{\partial V} \over {\partial x}} = \dot p + {{\partial V} \over {\partial x}}\left\{{\begin{array}{*{20}c} {= 0} & {{\rm{for}}\;{\rm{conservative}}\;{\rm{forces}},\quad \quad \quad \quad \,} \\ {\neq 0} & {\quad {\rm{for}}\;{\rm{dissipative}}\;{\rm{and/or}}\;{\rm{external}}\;{\rm{forces}}.} \\ \end{array}} \right.$$
(120)

8 The “Pull-back” Formalism for a Single Fluid

In this section the equations of motion and the stress-energy-momentum tensor for a one-component, general relativistic fluid are obtained from an action principle. Specifically a so-called “pull-back” approach (see, for instance, [36, 37, 34]) is used to construct a Lagrangian displacement of the number density four-current nμ, whose magnitude n is the particle number density. This will form the basis for the variations of the fundamental fluid variables in the action principle.

As there is only one species of particle considered here, nμ is conserved, meaning that once a number of particles N is assigned to a particular fluid element, then that number is the same at each point of the fluid element’s worldline. This would correspond to attaching a given number of particles (i.e. N1, N2, etc.) to each of the worldlines in Figure 7. Mathematically, one can write this as a standard particle-flux conservation equationFootnote 4:

$${\nabla _\mu}{n^\mu} = 0.$$
(121)

for reasons that will become clear shortly, it is useful to rewrite this conservation law in what may (if taken out of context) seem a rather obscure way. We introduce the dual nνλτ to nμ, i.e.

$${n_{\nu \lambda \tau}} = {\epsilon _{\nu \lambda \tau \mu}}{n^\mu},\quad {n^\mu} = {1 \over {3!}}{\epsilon ^{\mu \nu \lambda \tau}}{n_{\nu \lambda \tau}},$$
(122)

such that

$${n^2} = {1 \over {3!}}{n_{\nu \lambda \tau}}{n^{\nu \lambda \tau}}.$$
(123)

This shows that nνλτ acts as a volume measure which, for example, allows us to “count” the number of fluid elements. In Figure 6 we have seen that a two-form gives worldtubes. A three-form is the next higher-ranked object and it can be thought of, in an analogous way, as leading to boxes [80]. The key step here is to realize that the conservation rule is equivalent to having the three-form nνλτ be closed. In differential geometry this means that

$$\nabla {}_{\left[ \mu \right.}{n_\nu}{\lambda _{\left. \tau \right]}} = 0.$$
(124)

This can be shown to be equivalent to Equation (121) by contracting with ϵμνλτ.

Figure 7
figure 7

The push-forward from “fluid-particle” points in the three-dimensional matter space labelled by the coordinates {X1, X2, X3} to fluid-element worldlines in spacetime. Here, the push-forward of the “Ith” (I = 1, 2, …, n) fluid-particle to, say, an initial point on a worldline in spacetime can be taken as \(X_I^A = {X^A}(0,x_I^i)\) where \(x_I^i\) is the spatial position of the intersection of the worldline with the t = 0 time slice.

The main reason for introducing the dual is that it is straightforward to construct a particle number density three-form that is automatically closed, since the conservation of the particle number density current should not — speaking from a strict field theory point of view — be a part of the equations of motion, but rather should be automatically satisfied when evaluated on a solution of the “true” equations.

This can be made to happen by introducing a three-dimensional “matter” space — the left-hand part of Figure 7 — which can be labelled by coordinates XA, where A, B, C, etc. = 1, 2, 3. For each time slice in spacetime, there will be the same configuration in the matter space; that is, as time goes forward, the fluid particle positions in the matter space remain fixed even as the worldlines form their weavings in spacetime. In this sense we are “pushing forward” from the matter space to spacetime (cf. the previous discussion of the Lie derivative). The three-form can be pulled back to its three-dimensional matter space by using the mappings XA. This construction then guarantees a three-form that is automatically closed on spacetime, namely

$${n_{\nu \lambda \tau}} = {N_{ABC}}\left({{\nabla _\nu}{X^A}} \right)\left({{\nabla _\lambda}{X^B}} \right)({\nabla _\tau}{X^C}),$$
(125)

where NABC is completely antisymmetric in its indices and is a function only of the XA. As implied by Figure 7 the XA are also comoving on their respective worldlines, meaning that they are independent of the proper time τ, say, that parameterizes each curve:

$$n{{{\rm{d}}{X^A}} \over {{\rm{d}}\tau}} \equiv {n^\mu}{\nabla _\mu}{X^A} = {{\mathcal L}_n}{X^A} = {1 \over {3!}}{\epsilon ^{\mu \nu \lambda \tau}}{N_{BCD}}\left({{\nabla _\nu}{X^B}} \right)\left({{\nabla _\lambda}{X^C}} \right)\left({{\nabla _\tau}{X^D}} \right)\left({{\nabla _\mu}{X^A}} \right) = 0.$$
(126)

The time part of the spacetime dependence of the XA is thus somewhat ad hoc; that is, if we take the flow of time tμ to be the proper time of the worldlines (tμ is parallel to nμ), the XA do not change. An apparent time dependence in spacetime means that tμ is such as to cut across fluid worldlines (tμ is not parallel to nμ), which of course have different values for the XA.

Because the matter space indices are three-dimensional and the closure condition involves four spacetime indices, and also the XA are scalars on spacetime (and thus two covariant differentiations commute), the construction does indeed produce a closed three-form:

$${\nabla _{\left[ \mu \right.}}{n_{\left. {\nu \lambda \tau} \right]}} = {\nabla _{\left[ \mu \right.}}\left({{N_{ABC}}{\nabla _\nu}{X^A}{\nabla _\lambda}{X^B}{\nabla _{\left. \tau \right]}}{X^C}} \right) \equiv 0.$$
(127)

In terms of the scalar fields XA, we now have particle number density currents that are automatically conserved. Thus, another way of viewing the construction is that the fundamental fluid field variables are the XA. In fact, the variations of nνλτ can now be taken with respect to variations of the XA, namely

$$\begin{array}{*{20}c} {\delta {n_{\nu \lambda \tau}} = \left({{{\partial {N_{ABC}}} \over {\partial {X^D}}}{\nabla _\nu}{X^A}{\nabla _\lambda}{X^B}{\nabla _\tau}{X^C}} \right)\delta {X^D}\quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad} \\ {+ {N_{ABC}}\left({{\nabla _\nu}\delta {X^A}{\nabla _\lambda}{X^B}{\nabla _\tau}{X^C} + {\nabla _\nu}{X^A}{\nabla _\lambda}\delta {X^B}{\nabla _\tau}{X^C} + {\nabla _\nu}{X^A}{\nabla _\lambda}{X^B}{\nabla _\tau}\delta {X^C}} \right).} \\ \end{array}$$
(128)

As the XA are scalars, the covariant derivatives can be replaced with ordinary partial derivatives, meaning that there is no metric variation to consider.

Let us introduce the Lagrangian displacement on spacetime for the particles, to be denoted This is related to the variation δXA via another push-forward from the matter space into spacetime,

$$\delta {X^A} = - \left({{\nabla _\mu}{X^A}} \right){\xi ^\mu} = - {{\mathcal L}_\xi}{X^A},$$
(129)

where ξ is the Lie derivative along (cf. Equation (43)). Using the fact that

$$\begin{array}{*{20}c} {{\nabla _\nu}\delta {X^A} = - {\nabla _\nu}\left({\left[ {{\nabla _\mu}{X^A}} \right]{\xi ^\mu}} \right)\quad \quad \quad \quad \quad \quad \quad \quad \quad} \\ {= - \left({{\nabla _\mu}{X^A}} \right){\nabla _\nu}{\xi ^\mu} - \left({{\nabla _\mu}{\nabla _\nu}{X^A}} \right){\xi ^\mu}} \\ \end{array}$$
(130)

means [67]

$$\delta {n_{\nu \lambda \tau}} = - \left({{\xi ^\sigma}{\nabla _\sigma}{n_{\nu \lambda \tau}} + {n_{\sigma \lambda \tau}}{\nabla _\nu}{\xi ^\sigma} + {n_{\nu \sigma \tau}}{\nabla _\lambda}{\xi ^\sigma} + {n_{\nu \lambda \sigma}}{\nabla _\tau}{\xi ^\sigma}} \right) = - {{\mathcal L}_\xi}{n_{\nu \lambda \tau}}.$$
(131)

Using Equation (122) above and Equation (367) of Appendix A, we can thus infer that

$$\begin{array}{*{20}c} {\delta {n^\mu} = {n^\sigma}{\nabla _\sigma}{\xi ^\mu} - {\xi ^\sigma}{\nabla _\sigma}{n^\mu} - {n^\mu}\left({{\nabla _\sigma}{\xi ^\sigma} + {1 \over 2}{g^{\sigma \rho}}\delta {g_{\sigma \rho}}} \right)} \\ {= - {{\mathcal L}_\xi}{n^\mu} - {n^\mu}\left({{\nabla _\sigma}{\xi ^\sigma} + {1 \over 2}{g^{\sigma \rho}}\delta {g_{\sigma \rho}}} \right).\quad \quad \,\,\,\,} \\ \end{array}$$
(132)

This constraint guarantees that is conserved. By introducing the decomposition

$${n^\mu} = n{u^\mu},\quad {u_\mu}{u^\mu} = - 1,$$
(133)

we can furthermore show that

$$\delta n = - {\nabla _\sigma}\left({n{\xi ^\sigma}} \right) - n\left({{u_\nu}{u^\sigma}{\nabla _\sigma}{\xi ^\nu} + {1 \over 2}\left[ {{g^{\sigma \rho}} + {u^\sigma}{u^\rho}} \right]\delta {g_{\sigma \rho}}} \right),$$
(134)

and

$$\delta {u^\mu} = \left({{\delta ^\mu}_\rho + {u^\mu}{u_\rho}} \right)\left({{u^\sigma}{\nabla _\sigma}{\xi ^\rho} - {\xi ^\sigma}{\nabla _\sigma}{u^\rho}} \right) + {1 \over 2}{u^\mu}{u^\sigma}{u^\rho}\delta {g_{\sigma \rho}}.$$
(135)

The Lagrangian variation resulting from the Lagrangian displacement is given by

$$\Delta \equiv \delta + {{\mathcal L}_\xi},$$
(136)

from which it follows that

$$\Delta {n_{\mu \lambda \tau}} = 0,$$
(137)

which is entirely consistent with the pull-back construction. We also find that

$$\Delta {u^\mu} = {1 \over 2}{u^\mu}{u^\sigma}{u^\rho}\Delta {g_{\sigma \rho}},$$
(138)
$$\Delta {\epsilon _{\nu \lambda \tau \sigma}} = {1 \over 2}{\epsilon _{\nu \lambda \tau \sigma}}{g^{\mu \rho}}\Delta {g_{\mu \rho}},$$
(139)
$$\Delta n = - {n \over 2}\left({{g^{\sigma \rho}} + {u^\sigma}{u^\rho}} \right)\Delta {g_{\sigma \rho}}.$$
(140)

These formulae and their Newtonian analogues have been adroitly used by Friedman and Schutz in their establishment of the so-called Chandrasekhar-Friedman-Schutz (CFS) instability [30, 44, 45] (see Section 13). Later, in Section 12, we take the limit that the speed of light becomes large so as to construct non-relativistic fluid equations. The same procedure can be used to obtain the Newtonian analogues of the formulae constructed here.

At first glance, there appears to be a glaring inconsistency between the pull-back construction and the Lagrangian variation, since the latter seems to have four independent components, but the former clearly has three. In fact, there is a gauge freedom in the Lagrangian variation that can be used to reduce the number of independent components. Take Equation (132) and substitute

$${\xi ^\mu} = {\bar \xi ^\mu} + {{\mathcal G}^\mu}.$$
(141)

Using the fact that ∇μnμ = 0, it can be shown that

$$\delta {n^\mu} = \delta {\bar n^\mu} - {1 \over 2}{\epsilon ^{\mu \sigma \tau \lambda}}{\nabla _\sigma}\left({n{\epsilon _{\tau \lambda \rho \nu}}{u^\rho}{{\mathcal G}^\nu}} \right),$$
(142)

where \(\delta {{\bar n}^\mu}\) is as in Equation (132) except ξμ is replaced with \({{\bar \xi}^\mu}\). If we set

$${{\mathcal G}^\mu} = {\mathcal G}{u^\mu},$$
(143)

then the last term vanishes and \(\delta {n^\mu} = \delta {{\bar n}^\mu}\). Thus, we can use the arbitrary function \({\mathcal G}\) to reduce the number of independent ξμ to three.

With a general variation of the conserved four-current in hand, we can now use an action principle to derive the equations of motion and the stress-energy-momentum tensor. The central quantity in the analysis is the so-called “master” function Λ, which is a function of the scalar n2 = −nμnμ. For this single fluid system, it is such that −Λ corresponds to the local thermodynamic energy density. In the action principle, the master function is the Lagrangian density for the fluid, i.e. for a spacetime region the fluid action is

$${I_{{\rm{fluid}}}} = \int\nolimits_{\mathcal M} {\sqrt {- g}} \,\Lambda \,{{\rm{d}}^4}x.$$
(144)

To obtain the Einstein equation (65) we must add the Einstein-Hilbert action:

$${I_{{\rm{EH}}}} = {1 \over {16\pi}}\int\nolimits_{\mathcal M} {\sqrt {- g}} \,R\,{{\rm{d}}^4}x + {1 \over {8\pi}}\int\nolimits_{\partial {\mathcal M}} {\sqrt h} \,K\,{{\rm{d}}^3}x,$$
(145)

where R is the Ricci scalar, h is the determinant of the metric of the boundary ∂ℳ (if present) of , and K is the trace of the extrinsic curvature of the boundary. This last term is added to ensure that fixing the metric on the boundary leads to a well-posed action principle [124].

We should point out that our consideration of a master function of the form Λ = Λ(n2) is based, in part, on the assumptions that the matter is locally isotropic, meaning that there are no locally preferred directions (such as in a neutron star crust) or another covector Aμ to form the additional scalar nμAμ (as would be the case with coupling to electromagnetism, say). A term that could be added is of the form nμμϕ for an arbitrary scalar field ϕ. Unlike the previous two possible additional terms, it would not affect the equations of motion, since ∇μnμ = 0 by construction, and an integration by parts generates a boundary term. Our point of view is that the master function is fixed by the local microphysics of the matter; (cf. the discussion in Section 5.2).

An unconstrained variation of Λ(n2) is with respect to nμ and the metric gμν, and allows the four components of nμ to be varied independently. It takes the form

$$\delta \Lambda = {\mu _\nu}\delta {n^\nu} + {1 \over 2}{n^\mu}{\mu ^\nu}\delta {g_{\mu \nu}},$$
(146)

where

$${\mu _\nu} = {\mathcal B}{n_\nu},\quad {\mathcal B} \equiv - 2{{\partial \Lambda} \over {\partial {n^2}}}.$$
(147)

The use of the letter \({\mathcal B}\) is to remind us that this is a bulk fluid effect, which is present regardless of the number of fluids and constituents. The momentum covector μν is what we encountered earlier. It is dynamically, and thermodynamically, conjugate to nμ, and its magnitude can be seen to be the chemical potential of the particles by recalling that Λ = −ρ. Correctly identifying the momentum is essential for a concise and transparent formulation of fluids, e.g. for constructing proofs about vorticity, and also for coupling in dissipative terms in the equations of motion (cf. the discussion of Section 7). For later convenience, we also introduce the momentum form defined as

$${\mu ^{\mu \nu \lambda}} = {\epsilon ^{\mu \nu \lambda \rho}}{\mu _\rho}.$$
(148)

If the variation of the four-current was left unconstrained, the equations of motion for the fluid deduced from the above variation of Λ would require, incorrectly, that the momentum covector μμ should vanish in all cases. This reflects the fact that the variation of the conserved four-current must be constrained, meaning that not all components of nμ can be treated as independent. In terms of the constrained Lagrangian displacement of Equation (132), a first-order variation of the general relativity plus fluid Lagrangian yields

$$\begin{array}{*{20}c} {\delta \left({\sqrt {- g} \left[ {{1 \over {16\pi}}R + \Lambda} \right]} \right) = \sqrt {- g} \left[ {- {1 \over {16\pi}}{G^{\mu \nu}} + {1 \over 2}\left({\Psi {\delta ^\mu}_\lambda + {n^\mu}{\mu _\lambda}} \right){g^{\lambda \nu}}} \right]\delta {g_{\mu \nu}} - \sqrt {- g} {f_\nu}{\xi ^\nu}} \\ {{\rm{boundary}}\;{\rm{terms}},} \\ \end{array}$$
(149)

where the boundary terms do not contribute to the field equations or stress-energy-momentum tensor (the divergence theorem implies that they become boundary terms in the action), the force density fν is defined in Equation (102), and the generalized pressure Ψ is defined to be

$$\Psi = \Lambda - {n^\mu}{\mu _\mu}.$$
(150)

We have utilized the well-known formula [80] for the variation of the metric determinant, which can be found in Equation (370) of Appendix A. We also note that the fluid boundary term (FBT) is

$${\rm{FBT}} = {\nabla _\nu}\left({{1 \over 2}\sqrt {- g} {\mu ^{\nu \lambda \tau}}{n_{\lambda \tau \mu}}{\xi ^\mu}} \right),$$
(151)

so that \({\mu ^{\nu \lambda \tau}}{n_{\lambda \tau \mu}}{\xi ^\mu}\) on the boundary is fixed (i.e. ξμ vanishes) to have a well-posed action principle.

At this point we can return to the view that nμ is the fundamental field for the fluid. The principle of least action implies that the coefficients of ξν and δgμν and the boundary terms must vanish. Thus, the equations of motion consist of the original conservation condition from Equation (121), the Euler equation

$${f_\nu} = {n^\mu}{\omega _{\mu \nu}} = 0,$$
(152)

and of course the Einstein equation. When δgμν ≠ 0, it is well-established in the relativity literature that the stress-energy-momentum tensor follows from a variation of the Lagrangian with respect to the metric, that is

$${T^{\mu \nu}}\delta {g_{\mu \nu}} \equiv {2 \over {\sqrt {- g}}}\delta \left({\sqrt {- g} {\mathcal L}} \right) = \left({\Psi {\delta ^\mu}_\lambda + {n^\mu}{\mu _\lambda}} \right){g^{\lambda \nu}}\delta {g_{\mu \nu}}.$$
(153)

When the complete set of field equations is satisfied then we see from Equation (105), which applies here, that it is automatically true that \({\nabla _\mu}{T^\mu}_\nu = 0\).

Let us now recall the discussion of the point particle. There we pointed out that only the fully conservative form of Newton’s Second Law follows from an action, i.e. external or dissipative forces are excluded. However, we argued that a well-established form of Newton’s Second Law is known that allows for external and/or dissipative forces (cf. Equation (120)). There is thus much purpose in using the particular symbol fν in Equation (152). We may take the fν to be the relativistic analogue of the left-hand-side of Equation (120) in every sense. In particular, when dissipation and/or external “forces” act in a general relativistic setting, they are incorporated via the right-hand-side of Equation (152).

9 The Two-Constituent, Single Fluid

Generally speaking, the total energy density ρ can be a function of independent parameters other than the particle number density nn, such as the entropy density ns, assuming that the system scales in the manner discussed in Section 5 so that only densities need enter the equation of state. (For later convenience we will introduce the constituent indices x, y, etc. which range over the two constituents {n, s}, and do not satisfy any kind of summation convention.) If there is no heat conduction, then this is still a single fluid problem, meaning that there is still just one unit flow velocity uμ [36]. This is what we mean by a two-constituent, single fluid. We assume that the particle number and entropy are both conserved along the flow, in the same sense as in Equation (126). Associated with each parameter there is thus a conserved current density four-vector, i.e. \(n_{\rm{n}}^\mu = {n_{\rm{n}}}{u^\mu}\) for the particle number density and \(n_{\rm{s}}^\mu = {n_{\rm{s}}}{u^\mu}\) for the entropy density. Note that the ratio xs = ns/nn is comoving in the sense that

$${u^\mu}{\nabla _\mu}\left({{x_{\rm{s}}}} \right) = 0.$$
(154)

In terms of constituent indices x, y, the associated combined first and second laws can be written in the form

$${\rm{d}}\rho = \sum\limits_{{\rm{x}} = \{n,s\}} {{\mu ^{\rm{x}}}} \,{\rm{d}}{n_{\rm{x}}} = - \sum\limits_{{\rm{x}} = \{n,s\}} {\mu _\nu ^{\rm{x}}} \,{\rm{d}}n_{\rm{x}}^\nu ,$$
(155)

since \(\rho = \rho (n_{\rm{n}}^2,n_{\rm{s}}^2)\), where

$$n_{\rm{x}}^\mu = {n_{\rm{x}}}{u^\mu},\quad n_{\rm{x}}^2 = - {g_{\mu \nu}}n_{\rm{x}}^\mu n_{\rm{x}}^\nu ,$$
(156)

and

$$\mu _\nu ^{\rm{x}} = {g_{\nu \mu}}{{\mathcal B}^{\rm{x}}}n_{\rm{x}}^\mu ,\quad {{\mathcal B}^{\rm{x}}} \equiv 2{{\partial \rho} \over {\partial n_{\rm{x}}^2}}.$$
(157)

Because of Equation (83) μs = T, where T is the temperature.

Given that we only have one four-velocity, the system will still just have one fluid element per spacetime point. But unlike before, there will be an additional conserved number, Ns, that can be attached to each worldline, like the particle number Nn of Figure 7. In order to describe the worldlines we can use the same three scalars XA(xμ) as before. But how do we get a construction that allows for the additional conserved number? Recall that the intersections of the worldlines with some hypersurface, say t = 0, is uniquely specified by the three XA(0, xi) scalars. Each worldline will have also the conserved numbers Nn and Ns assigned to them. Thus, the values of these numbers can be expressed as functions of the XA(0, xi). But most importantly, the fact that each Nx is conserved, means that this function specification must hold for all of spacetime, so that in particular the ratio xs is of the form xs(xμ) = xs(XA(xμ)). Consequently, we now have a construction whereby this ratio identically satisfies Equation (154), and the action principle remains a variational problem just in terms of the three XA scalars.

The variation of the action follows like before, except now a constituent index x must be attached to the particle number density current and three-form:

$$n_{\nu \lambda \tau}^{\rm{x}} = {\epsilon _{\nu \lambda \tau \mu}}n_{\rm{x}}^\mu .$$
(158)

Once again it is convenient to introduce the momentum form, now defined as

$$\mu _{\rm{x}}^{\mu \nu \lambda} = {\epsilon ^{\mu \nu \lambda \rho}}\mu _\rho ^{\rm{x}}.$$
(159)

Since the XA are the same for each \(n_{\nu \lambda \tau}^{\rm{x}}\), the above discussion indicates that the pull-back construction now is to be based on

$$n_{\nu \lambda \tau}^{\rm{x}} = N_{ABC}^{\rm{x}}\left({{\nabla _\nu}{X^A}} \right)\left({{\nabla _\lambda}{X^B}} \right){\nabla _\tau}{X^C},$$
(160)

where \(N_{ABC}^{\rm{x}}\) is completely antisymmetric and a function of the XA. After some due deliberation, the reader should be convinced that the only thing required here in addition to the previous Section 8 is to attach an index x to nνλτ, nμ, and n in Equations (131), (132), and (134), respectively. If we now define the master function as

$$\Lambda = - \rho$$
(161)

and the generalized pressure Ψ to be

$$\Psi = \Lambda - \sum\limits_{{\rm{x}} = \{{\rm{n}},\,{\rm{s}}\}} {\mu _\nu ^{\rm{x}}n_{\rm{x}}^\nu = \Lambda + \sum\limits_{{\rm{x}} = \{{\rm{n}},{\rm{s}}\}} {{\mu ^{\rm{x}}}{n_{\rm{x}}}}} ,$$
(162)

then the first-order variation for Λ is

$$\begin{array}{*{20}c} {\delta \left({\sqrt {- g} \Lambda} \right) = {1 \over 2}\sqrt {- g} \left({\Psi {g^{\mu \nu}} + \left[ {\Psi - \Lambda} \right]{u^\mu}{u^\nu}} \right)\delta {g_{\mu \nu}} - \sqrt {- g} \left({\sum\limits_{{\rm{x}} = \{{\rm{n}},{\rm{s}}\}} f_\nu ^{\rm{x}}} \right){\xi ^\nu} +} \\ {{\nabla _\nu}\left({{1 \over 2}\sqrt {- g} \sum\limits_{{\rm{x}} = \{{\rm{n}},{\rm{s}}\}} \mu _{\rm{x}}^{\nu \lambda \tau}n_{\lambda \tau \mu}^{\rm{x}}{\xi ^\mu}} \right),} \\ \end{array}$$
(163)

where

$$f_\nu ^{\rm{x}} = n_{\rm{x}}^\mu \omega _{\mu \nu}^{\rm{x}},$$
(164)

and

$$\omega _{\mu \nu}^{\rm{x}} = 2\nabla {}_{\left[ \mu \right.}\mu _{\left. \nu \right]}^{\rm{x}}.$$
(165)

Keeping in mind that we need the Einstein-Hilbert action for the Einstein equation, the final fluid equations of motion are

$$\sum\limits_{{\rm{x}} = \{{\rm{n}},{\rm{s}}\}} {f_\nu ^{\rm{x}}} = 0,$$
(166)

and

$${\nabla _\nu}n_{\rm{x}}^\nu = 0,$$
(167)

whereas the stress-energy-momentum tensor takes the familiar form

$${T^{\mu \nu}} = \Psi {g^{\mu \nu}} + (\Psi - \Lambda){u^\mu}{u^\nu}.\,$$
(168)

10 The “Pull-Back” Formalism for Two Fluids

Having discussed the single fluid model, and how one accounts for stratification, it is time to move on to the problem of modeling multi-fluid systems. We will experience for the first time novel effects due to the presence of a relative flow between two interpenetrating fluids, and the fact that there is no longer a single, preferred rest-frame. This kind of formalism is necessary, for example, for the simplest model of a neutron star, since it is generally accepted that the inner crust is permeated by an independent neutron superfluid, and the outer core is thought to contain superfluid neutrons, superconducting protons, and a highly degenerate gas of electrons. Still unknown is the number of independent fluids that would be required for neutron stars that have quark matter in the deep core [1]. The model can also be used to describe superfluid Helium and heat-conducting fluids, which is of importance for incorporation of dissipation (see Section 14). We will focus on this example here. It should be noted that, even though the particular system we concentrate on consists of only two fluids, it illustrates all new features of a general multi-fluid system. Conceptually, the greatest step is to go from one to two fluids. A generalization to a system with more degrees of freedom is straightforward.

In keeping with the previous Section 9, we will rely on use of the constituent index, which for all remaining formulas of this section will range over x, y, z = n, s. In the example that we consider the two fluids represent the particles (n) and the entropy (s). Once again, the number density four-currents, to be denoted \(n_{\rm{x}}^\mu\), are taken to be separately conserved, meaning

$${\nabla _\mu}n_{\rm{x}}^\mu = 0.$$
(169)

As before, we use the dual formulation, i.e. introduce the three-forms

$$n_{\nu \lambda \tau}^{\rm{x}} = {\epsilon _{\nu \lambda \tau \mu}}n_{\rm{x}}^\mu ,\quad n_{\rm{x}}^\mu = {1 \over {3!}}{\epsilon ^{\mu \nu \lambda \tau}}n_{\nu \lambda \tau}^{\rm{x}}.$$
(170)

Also like before, the conservation rules are equivalent to the individual three-forms being closed, i.e.

$$\nabla {}_{\left[ \mu \right.}n_{\left. {\nu \lambda \tau} \right]}^{\rm{x}} = 0.$$
(171)

However, we need a formulation whereby such conservation obtains automatically, at least in principle.

We make this happen by introducing the three-dimensional matter space, the difference being that we now need two such spaces. These will be labelled by coordinates and we recall that A, B, C, etc. = 1, 2, 3. This is depicted in Figure 8, which indicates the important facts that (i) a given spatial point can be intersected by each fluid’s worldline and (ii) the individual worldlines are not necessarily parallel at the intersection, i.e. the independent fluids are interpenetrating and can exhibit a relative flow with respect to each other. Although we have not indicated this in Figure 8 (in order to keep the figure as uncluttered as possible) attached to each worldline of a given constituent will be a fixed number of particles \(N_1^{\rm{x}}\), \(N_2^{\rm{x}}\), etc. (cf. Figure 7). For the same reason, we have also not labelled (as in Figure 7) the “push-forwards” (represented by the arrows) from the matter spaces to spacetime.

Figure 8
figure 8

The push-forward from a point in the xth-constituent’s three-dimensional matter space (on the left) to the corresponding “fluid-particle” worldline in spacetime (on the right). The points in matter space are labelled by the coordinates \(\{X_{\rm{x}}^1,X_{\rm{x}}^2,X_{\rm{x}}^3\}\), and the constituent index x = n, s. There exist as many matter spaces as there are dynamically independent fluids, which for this case means two.

By pulling-back each constituent’s three-form onto its respective matter space we can once again construct three-forms that are automatically closed on spacetime, i.e. let

$$n_{\nu \lambda \tau}^{\rm{x}} = N_{ABC}^{\rm{x}}{\nabla _\nu}X_{\rm{x}}^A{\nabla _\lambda}X_{\rm{x}}^B{\nabla _\tau}X_{\rm{x}}^C,$$
(172)

where \(N_{ABC}^{\rm{x}}\) is completely antisymmetric in its indices and is a function of the \(X_{\rm{x}}^A\). Using the same reasoning as in the single fluid case, the construction produces three-forms that are automatically closed, i.e. they satisfy Equation (171) identically. If we let the scalar fields \(X_{\rm{x}}^A\) (as functions on spacetime) be the fundamental variables, they yield a representation for each particle number density current that is automatically conserved. The variations of the three-forms can now be derived by varying them with respect to the \(X_{\rm{x}}^A\).

Lagrangian displacements on spacetime for each fluid, to be denoted \(\xi _{\rm{x}}^\mu\), will now be introduced. They are related to the variations \(\delta X_{\rm{x}}^A\) via

$$\delta X_{\rm{x}}^A = - \left({{\nabla _\mu}X_{\rm{x}}^A} \right)\xi _{\rm{x}}^\mu .$$
(173)

It should be clear that the analogues of Equations (131, 132, 134, 135) for this two-fluid case are given by the same formulas except that each displacement and four-current will now be associated with a constituent index, using the decomposition

$$n_{\rm{x}}^\mu = {n_{\rm{x}}}u_{\rm{x}}^\mu ,\quad u_\mu ^{\rm{x}}u_{\rm{x}}^\mu = - 1,$$
(174)

where \(u_\mu ^{\rm{x}} = {g_{\mu \nu}}u_{\rm{x}}^\nu\).

Associated with each constituent’s Lagrangian displacement is its own Lagrangian variation. These are naturally defined to be

$${\Delta _{\rm{x}}} \equiv \delta + {{\mathcal L}_{{\xi _{\rm{x}}}}},$$
(175)

so that it follows that

$${\Delta _{\rm{x}}}n_{\mu \lambda \tau}^{\rm{x}} = 0,$$
(176)

as expected for the pull-back construction. Likewise, two-fluid analogues of Equations (138, 139, 140) exist which take the same form except that the constituent index is attached. However, in contrast to the ordinary fluid case, there are many more options to consider. For instance, we could also look at the Lagrangian variation of the first constituent with respect to the second constituent’s flow, i.e. Δsnn, or the other way around, i.e. Δnns. The Newtonian analogues of these Lagrangian displacements were essential to an analysis of instabilities in rotating superfluid neutron stars [7].

We are now in a position to construct an action principle that will yield the equations of motion and the stress-energy tensor. Again, the central quantity is the “master” function Λ, which is now a function of all the different scalars that can be formed from the \(u_\mu ^{\rm{x}}\), i.e. the scalars nx together with

$$n_{{\rm{xy}}}^2 = n_{{\rm{yx}}}^2 = - {g_{\mu \nu}}n_{\rm{x}}^\mu n_{\rm{y}}^\nu .$$
(177)

In the limit where all the currents are parallel, i.e. the fluids are comoving, −Λ corresponds again to the local thermodynamic energy density. In the action principle, Λ is the Lagrangian density for the fluids. It should be noted that our choice to use only the fluid currents to form scalars implies that the system is “locally isotropic” in the sense that there are no a priori preferred directions, i.e. the fluids are equally free to move in any direction. Structures like the crust believed to exist near the surface of a neutron star generally could be locally anisotropic, e.g. sound waves in the lattice move in the preferred directions associated with the lattice.

An unconstrained variation of Λ with respect to the independent vectors \(n_{\rm{x}}^\mu\) and the metric gμν takes the form

$$\delta \Lambda = \sum\limits_{{\rm{x}} = \{{\rm{n}},{\rm{s}}\}} {\mu _\nu ^{\rm{x}}} \,\delta n_{\rm{x}}^\nu + {1 \over 2}{g^{\lambda \nu}}\left({\sum\limits_{{\rm{x}} = \{{\rm{n}},{\rm{s}}\}} {n_{\rm{x}}^\mu} \mu _\lambda ^{\rm{x}}} \right)\delta {g_{\mu \nu}},$$
(178)

where

$$\mu _\nu ^{\rm{x}} = {g_{\nu \mu}}\left({{{\mathcal B}^{\rm{x}}}n_{\rm{x}}^\mu + {{\mathcal A}^{{\rm{xy}}}}n_{\rm{y}}^\mu} \right),$$
(179)
$${{\mathcal A}^{{\rm{xy}}}} = {{\mathcal A}^{{\rm{yx}}}} = - {{\partial \Lambda} \over {\partial n_{{\rm{xy}}}^2}},\quad {\rm{for}}\,{\rm{x}} \neq {\rm{y}}.$$
(180)

The momentum covectors \(\mu _v^{\rm{x}}\) are each dynamically, and thermodynamically, conjugate to their respective number density currents \(n_{\rm{x}}^\mu\), and their magnitudes are the chemical potentials. We see that \({{\mathcal A}^{{\rm{xy}}}}\) represents the fact that each fluid momentum \(\mu _v^{\rm{x}}\) may, in general, be given by a linear combination of the individual currents \(n_{\rm{x}}^\mu\). That is, the current and momentum for a particular fluid do not have to be parallel. This is known as the entrainment effect. We have chosen to represent it by the letter \({\mathcal A}\) for historical reasons. When Carter first developed his formalism he opted for this notation, referring to the “anomaly” of having misaligned currents and momenta. It has since been realized that the entrainment is a key feature of most multi-fluid systems and it would, in fact, be anomalous to leave it out!

In the general case, the momentum of one constituent carries along some mass current of the other constituents. The entrainment only vanishes in the special case where Λ is independent of \(n_{{\rm{xy}}}^2({\rm{x}} \neq {\rm{y}})\) because then we obviously have \({{\mathcal A}^{{\rm{xy}}}} = 0\). Entrainment is an observable effect in laboratory superfluids [94, 110] (e.g. via flow modifications in superfluid 4He and mixtures of superfluid 3He and 4He). In the case of neutron stars, entrainment is an essential ingredient of the current best explanation for the so-called glitches [95, 97]. Carter [19] has also argued that these “anomalous” terms are necessary for causally well-behaved heat conduction in relativistic fluids, and by extension necessary for building well-behaved relativistic equations that incorporate dissipation.

In terms of the constrained Lagrangian displacements, a variation of A now yields

$$\delta \left({\sqrt {- g} \Lambda} \right) = {1 \over 2}\sqrt {- g} \left({\Psi {\delta ^\mu}_\lambda + \sum\limits_{{\rm{x}} = \{{\rm{n}},{\rm{s}}\}} n_{\rm{x}}^\mu \mu _\lambda ^{\rm{x}}} \right){g^{\lambda \nu}}\delta {g_{\mu \nu}} - \sqrt {- g} \sum\limits_{{\rm{x}} = \{{\rm{n}},{\rm{s}}\}} f_\nu ^{\rm{x}}\xi _{\rm{x}}^\nu + {\nabla _\nu}\left({{1 \over 2}\sqrt {- g} \sum\limits_{{\rm{x}} = \{{\rm{n}},{\rm{s}}\}} \mu _{\rm{x}}^{\nu \lambda \tau}n_{\lambda \tau \mu}^{\rm{x}}\xi _{\rm{x}}^\mu} \right),$$
(181)

where \(f_v^{\rm{x}}\) is as defined in Equation (164) except that the individual velocities are no longer parallel. The generalized pressure Ψ is now

$$\Psi = \Lambda - \sum\limits_{{\rm{x}} = \{{\rm{n}},{\rm{s}}\}} {n_{\rm{x}}^\nu \mu _\nu ^{\rm{x}}.}$$
(182)

At this point we will return to the view that \(n_{\rm{n}}^\mu\) and \(n_{\rm{s}}^\mu\) are the fundamental variables for the fluids. Because the \(\xi _{\rm{x}}^\mu\) are independent variations, the equations of motion consist of the two original conservation conditions of Equation (169), plus two Euler type equations

$$f_\nu ^{\rm{x}} = 0,$$
(183)

and of course the Einstein equation (obtained exactly as before by adding in the Einstein-Hilbert term). We also find that the stress-energy tensor is

$${T^\mu}_\nu = \Psi {\delta ^\mu}_\nu + \sum\limits_{{\rm{x}} = \{{\rm{n}},{\rm{s}}\}} n_{\rm{x}}^\mu \mu _\nu ^{\rm{x}}.$$
(184)

When the complete set of field equations is satisfied then it is automatically true that \({\nabla _\mu}{T^\mu}_v = 0\). One can also verify that Tμν is symmetric. The momentum form \(\mu _{\rm{x}}^{\mu v\lambda}\) entering the boundary term is the natural extension of Equation (159) to this two-fluid case.

It must be noted that Equation (183) is significantly different from the multi-constituent version of Equation (166). This is true even if one is solving for a static and spherically symmetric configuration, where the fluid four-velocities would all necessarily be parallel. Simply put, Equation (183) still represents two independent equations. If one takes entropy as an independent fluid, then the static and spherically symmetric solutions will exhibit thermal equilibrium [38]. This explains, for instance, why one must specify an extra condition (e.g. convective stability [117]) to solve for a double-constituent star with only one four-velocity.

11 Speeds of Sound

Crucial to the understanding of black holes is the effect of spacetime curvature on the light-cone structures, that is, the totality of null vectors that emanate from each spacetime point. Crucial to the propagation of massless fields is the light-cone structure. In the case of fluids, it is both the speed of light and the speed (and/or speeds) of sound that dictate how waves propagate through the matter, and thus how the matter itself propagates. We give here a simple analysis that uses plane-wave propagation to derive the speed(s) of sound in the single-fluid, two-constituent single-fluid, and two-fluid cases, assuming that the metric variations vanish (see [19] for a more rigorous derivation). The analysis is local, assuming that the speed of sound is a locally defined quantity, and performed using local, Minkowski coordinates xμ. The purpose of the analysis is to illuminate how the presence of various constituents and multi-fluids impact the local dynamics.

In a small region we will assume that the configuration of the matter with no waves present is locally isotropic, homogeneous, and static. Thus for the background \(n_{\rm{x}}^\mu = \{{n_{\rm{x}}},0,0,0\}\) and the vorticity vanishes. For all cases the general form of the variation of the force density \(f_v^{\rm{x}}\) for each constituent is

$$\delta f_\nu ^{\rm{x}} = 2n_{\rm{x}}^\mu \partial {}_{\left[ \mu \right.}\delta \mu _{\left. \nu \right]}^{\rm{x}}.$$
(185)

Similarly, the conservation of the \(n_{\rm{x}}^\mu\) gives

$${\partial _\mu}\delta n_{\rm{x}}^\mu = 0.$$
(186)

We are here using the point of view that the \(n_{\rm{x}}^\mu\) are the fundamental fluid fields, and thus plane wave propagation means

$$\delta n_{\rm{x}}^\mu = A_{\rm{x}}^\mu {e^{i{k_\nu}{x^\nu}}},$$
(187)

where the amplitudes \(A_{\rm{x}}^\mu\) and the wave vector kμ are constant. Equations (186) and (187) imply simply

$${k_\mu}\delta n_{\rm{x}}^\mu = 0,$$
(188)

i.e. the waves are “transverse” in the spacetime sense. We will give general forms for \(\delta \mu _v^{\rm{x}}\) in what follows, and only afterwards assume static, homogeneous, and isotropic backgrounds.

11.1 Single fluid case

Suppose that there is only one constituent, with index x = n. The master function Λ then depends only on \(n_{\rm{n}}^2\). The variation in the chemical potential due to a small disturbance \(\delta n_{\rm{n}}^\mu\) is

$$\delta \mu _\mu ^{\rm{n}} = {\mathcal B}_{\mu \nu}^{\rm{n}}\delta n_{\rm{n}}^\nu ,$$
(189)

where

$${\mathcal B}_{\mu \nu}^{\rm{n}} = {{\mathcal B}^{\rm{n}}}{g_{\mu \nu}} - 2{{\partial {{\mathcal B}^{\rm{n}}}} \over {\partial n_{\rm{n}}^2}}{g_{\mu \sigma}}{g_{\nu \rho}}n_{\rm{n}}^\sigma n_{\rm{n}}^\rho .$$
(190)

The equation of motion is \(\delta f_\mu ^{\rm{n}} = 0\). It is not difficult to show, by using the condition of transverse wave propagation (188) and contracting with the spatial part of the wave vector ki = gijkj, that the equation of motion reduces to

$$\left({{{\mathcal B}^{\rm{n}}} + {\mathcal B}_{00}^{\rm{n}}{{{k_j}{k^j}} \over {k_0^2}}} \right){k_i}\delta n_{\rm{n}}^i = 0.$$
(191)

The speed of sound is thus

$$C_{\rm{S}}^2 \equiv {{k_0^2} \over {{k^i}{k_i}}} = 1 + {{\partial \ln {{\mathcal B}^{\rm{n}}}} \over {\partial \ln {n_{\rm{n}}}}}.$$
(192)

Given that a well-constructed fluid model should have \(C_{\rm{S}}^2 \leq 1\), we see that the second term must be negative. This would ensure that the model is causal.

11.2 Two-constituent, single fluid case

Now consider the case when there are the two constituents with densities nn and ns, two conserved density currents \(n_{\rm{n}}^\mu\) and \(n_{\rm{s}}^\mu\), two chemical potential covectors \(\mu _\mu ^{\rm{n}}\) and \(\mu _\mu ^{\rm{s}}\), but still only one four-velocity uμ. The master function Λ depends on both \(n_{\rm{n}}^2\) and \(n_{\rm{s}}^2\) meaning that

$$\delta \mu _\mu ^{\rm{x}} = {\mathcal B}_{\mu \nu}^{\rm{x}}\delta n_{\rm{x}}^\nu + {\mathcal X}_{\mu \nu}^{{\rm{xy}}}\delta n_{\rm{y}}^\nu ,$$
(193)

where

$${\mathcal X}_{\mu \nu}^{{\rm{xy}}} = - 2{{\partial {{\mathcal B}^{\rm{x}}}} \over {\partial n_{\rm{y}}^2}}{g_{\mu \sigma}}{g_{\nu \rho}}n_{\rm{x}}^\sigma n_{\rm{y}}^\rho$$
(194)

and \({\mathcal B}_{\mu v}^{\rm{s}}\) is as in Equation (190) except that each n is replaced with s. We have chosen to use the letter \({\mathcal X}\) to represent what is a true multi-constituent effect, which arises due to composition gradients in the system. An alternative would have been to use \({\mathcal C}\) since the effect is due to the presence of different constituents. However, in his papers Carter tends to use \({{\mathcal B}^{\rm{s}}} = {\mathcal C}\), referring to the bulk entropy contribution as “caloric”. Our chosen notation is intended to avoid confusion. It is also the case that the presence of the composition term \({\mathcal X}_{\mu v}^{{\rm{xy}}}\) has not been emphasized in previous work. This may be unfortunate since a composition variation is known to affect the dynamics of a system, e.g. by giving rise to the g-modes in a neutron star [98].

The fact that \(n_{\rm{s}}^\mu\) is parallel to \(n_{\rm{n}}^\mu\) implies that it is only the magnitude of the entropy density current that is independent. One can show that the condition of transverse propagation, as applied to both currents, implies

$$\delta n_{\rm{s}}^\mu = {x_{\rm{s}}}\delta n_{\rm{n}}^\mu .$$
(195)

Now, we proceed as in the previous example, noting that the equation of motion is \(\delta f_\mu ^{\rm{n}} + \delta f_\mu ^{\rm{s}} = 0\), which reduces to

$$\left({\left[ {{{\mathcal B}^{\rm{n}}} + x_{\rm{s}}^2{{\mathcal B}^{\rm{s}}}} \right]{{k_0^2} \over {{k^j}{k_j}}} - \left[ {{{\mathcal B}^{\rm{n}}}c_{\rm{n}}^2 + x_{\rm{s}}^2{{\mathcal B}^{\rm{s}}}c_{\rm{s}}^2 - 2{x_{\rm{s}}}{\mathcal X}_{00}^{{\rm{ns}}}} \right]} \right){k_i}\delta n_{\rm{n}}^i = 0,$$
(196)

where

$$c_{\rm{x}}^2 \equiv 1 + {{\partial \ln {{\mathcal B}^{\rm{x}}}} \over {\partial \ln {n_{\rm{x}}}}}.$$
(197)

The speed of sound is thus

$$C_{\rm{S}}^2 = {{{{\mathcal B}^{\rm{n}}}c_{\rm{n}}^2 + x_{\rm{s}}^2{{\mathcal B}^{\rm{s}}}c_{\rm{s}}^2 - 2{x_{\rm{s}}}{\mathcal X}_{00}^{{\rm{ns}}}} \over {{{\mathcal B}^{\rm{n}}} + x_{\rm{s}}^2{{\mathcal B}^{\rm{s}}}}}.$$
(198)

11.3 Two fluid case

The two-fluid problem is qualitatively different from the previous two cases, since there are now two independent density currents. This fact impacts the analysis in two crucial ways: (i) The master function Λ depends on \(n_{\rm{n}}^2\), \(n_{\rm{s}}^2\), and \(n_{{\rm{ns}}}^2 = n_{{\rm{sn}}}^2\) (i.e. entrainment is present), and (ii) the equations of motion, after taking into account the transverse flow condition, are doubled to \(\delta f_\mu ^{\rm{n}} = 0 = \delta f_\mu ^{\rm{s}}\). As we will see, the key point is that there are now two sound speeds that must be determined.

A variation of the chemical potential covectors that leaves the metric fixed takes the form

$$\delta \mu _\mu ^{\rm{x}} = \left({{\mathcal B}_{\mu \nu}^{\rm{x}} + {\mathcal A}_{\mu \nu}^{{\rm{xx}}}} \right)\delta n_{\rm{x}}^\nu + \left({{\mathcal X}_{\mu \nu}^{{\rm{xy}}} + {\mathcal A}_{\mu \nu}^{{\rm{xy}}}} \right)\delta n_{\rm{y}}^\nu ,$$
(199)

where the complicated terms

$${\mathcal A}_{\mu \nu}^{{\rm{xx}}} = - {g_{\mu \sigma}}{g_{\nu \rho}}\left({{{\partial {{\mathcal B}^{\rm{x}}}} \over {\partial n_{{\rm{xy}}}^2}}\left[ {n_{\rm{x}}^\rho n_{\rm{y}}^\sigma + n_{\rm{x}}^\sigma n_{\rm{y}}^\rho} \right] + {{\partial {{\mathcal A}^{{\rm{xy}}}}} \over {\partial n_{{\rm{xy}}}^2}}n_{\rm{y}}^\sigma n_{\rm{y}}^\rho} \right),$$
(200)
$${\mathcal A}_{\mu \nu}^{{\rm{xy}}} = {{\mathcal A}^{{\rm{xy}}}}{g_{\mu \nu}} - {g_{\mu \sigma}}{g_{\nu \rho}}\left({{{\partial {{\mathcal B}^{\rm{x}}}} \over {\partial n_{{\rm{xy}}}^2}}n_{\rm{x}}^\sigma n_{\rm{x}}^\rho + {{\partial {{\mathcal B}^{\rm{y}}}} \over {\partial n_{{\rm{xy}}}^2}}n_{\rm{y}}^\sigma n_{\rm{y}}^\rho + {{\partial {{\mathcal A}^{{\rm{xy}}}}} \over {\partial n_{{\rm{xy}}}^2}}n_{\rm{y}}^\sigma n_{\rm{x}}^\rho} \right),$$
(201)

exist solely because of entrainment. The same procedure as in the previous two examples leads to the dispersion relation

$$\left({{{\mathcal B}^{\rm{n}}} - [{\mathcal B}_{00}^{\rm{n}} + {\mathcal A}_{00}^{{\rm{nn}}}]{{{k^i}{k_i}} \over {k_0^2}}} \right)\left({{{\mathcal B}^{\rm{s}}} - [{\mathcal B}_{00}^{\rm{s}} + {\mathcal A}_{00}^{{\rm{ss}}}]{{{k^j}{k_j}} \over {k_0^2}}} \right) - {\left({{{\mathcal A}^{{\rm{ns}}}} - [{\mathcal X}_{00}^{{\rm{ns}}} + {\mathcal A}_{00}^{{\rm{ns}}}]{{{k^i}{k_i}} \over {k_0^2}}} \right)^2} = 0.$$
(202)

This is quadratic in \({k^i}{k_i}/k_0^2\) meaning there are two sound speeds. This is a natural result of the doubling of fluid degrees of freedom.

The sound speed analysis is local, but its results are seen globally in the analysis of modes of oscillation of a fluid body. For a neutron star, the full spectrum of modes is quite impressive (see McDermott et al. [77]): polar (or spheroidal) f-, p-, and g-modes, and the axial (or toroidal) r-modes. Epstein [41] was the first to suggest that there should be even more modes in superfluid neutron stars because the superfluidity allows the neutrons to move independently of the protons. Mendell [78] developed this idea further by using an analogy with coupled pendulums. He argued that the new modes should feature a counter-motion between the neutrons and protons, i.e. as the neutrons move out radially, say, the protons will move in. This is in contrast to ordinary fluid motion that would have the neutrons and protons move in more or less “lock-step”. Analytical and numerical studies [70, 74, 38, 5] have confirmed this basic picture and the new modes of oscillation are known as superfluid modes.

12 The Newtonian Limit and the Euler Equations

One reason relativistic fluids are needed is that they can be used to model neutron stars. However, even though neutron stars are clearly general relativistic objects, one often starts with good old Newtonian physics when considering new applications. The main reason is that effects (such as acoustic modes of oscillation) which are primarily due to the fluids that make up the star can often be understood qualitatively from Newtonian calculations. There are also certain regimes in a neutron star where the Newtonian limit is sufficient quantitatively (such as the outer layers).

There has been much progress recently in the analysis of Newtonian multiple fluid systems. Prix [91] has developed an action-based formalism, analogous to what has been used here. Carter and Chamel [22, 23, 24] have done the same except that they use a fully spacetime covariant formalism. We will be somewhat less ambitious (for example, as in [5]) by extracting the Newtonian equations as the non-relativistic limit of the fully relativistic equations. Given the results of Prix as well as of Carter and Chamel we can think of this exercise as a consistency check of our equations.

The Newtonian limit consists of writing the general relativistic field equations to order c0 where c is the speed of light. The Newtonian equations are obtained, strictly speaking, in the limit where c → ∞. To order c0 the metric becomes

$${\rm{d}}{s^2} = - {c^2}\left({1 + {{2\Phi} \over {{c^2}}}} \right){\rm{d}}{t^2} + {g_{ij}}\,{\rm{d}}{x^i}\,{\rm{d}}{x^j} + {\mathcal O}\left({{c^0}} \right),$$
(203)

where the xi are Cartesian coordinates and gij is the metric of Equation (22). The gravitational potential, denoted Φ, is assumed to be small in the sense that

$$- 1 \ll {\Phi \over {{c^2}}} \leq 0.$$
(204)

We will be somewhat lax by simply writing the general relativistic equations to \({\mathcal O}({c^0})\), and thereby avoiding some of the additional complexities that occur because the spacetime metric clearly becomes singular as c → ∞. For the reader interested in a rigorous construction of the Newtonian equations, we recommend the series of papers by Carter and Chamel [22, 23, 24].

If τ is the proper time as measured along a fluid element worldline, then the curve it traces out can be written

$${x^\mu}(\tau) = \{t(\tau),{x^i}(\tau)\} .$$
(205)

Recall that its four-velocity is given by (cf. Equation (6))

$${u^\mu} = {{{\rm{d}}{x^\mu}} \over {{\rm{d}}\tau}},$$
(206)

but because the speed of light has been restored, uμuμ = −c2. The proper time is now given by

$$- {\rm{d}}{s^2} = {c^2}{\rm{d}}{\tau ^2} = {c^2}\left({1 + {{2\Phi} \over {{c^2}}} - {{{g_{ij}}{v^i}{v^j}} \over {{c^2}}}} \right){\rm{d}}{t^2},$$
(207)

where vi = dxi/dt is the Newtonian three-velocity of the fluid. It is considered to be small in the sense that

$${{\vert {{v^i}}\vert} \over c} \ll 1.$$
(208)

Hence, to the correct order the four-velocity components are

$${u^t} = 1 - {\Phi \over {{c^2}}} + {{{v^2}} \over {2{c^2}}},\qquad {u^i} = {v^i},$$
(209)

where v2 = gijvivj. Note that the particle number current becomes

$${n^\mu} = n({{u^\mu}/c}){.}$$
(210)

The factor of c appears because we define nμnμ = −n2, regardless of whether c = 1 or the meter/second value.

In order to write the single fluid Euler equations, keeping terms to the required order, it is necessary to explicitly break up the master function into its mass, kinetic, and “potential” energy ℰ parts, i.e. to write Λ as

$$\Lambda = - mn{c^2} - {1 \over 2}m{g_{ij}}{n^i}{n^j}/n - {\mathcal E}(n) + {\mathcal O}({{c^0}}),$$
(211)

where m is the particle mass. The internal energy is small compared to the mass, meaning

$$0 \leq {{\mathcal E} \over {mn{c^2}}} \ll 1.$$
(212)

With this choice, a variation of Λ that leaves the metric fixed yields

$${\rm{d}}\Lambda = -({m{c^2} + \mu}){\rm{d}}n,$$
(213)

where

$$\mu = {{\partial {\mathcal E}} \over {\partial n}}.$$
(214)

The \({{\mathcal B}^{\rm{n}}}\) coefficient reduces to

$${{\mathcal B}^{\rm{n}}} = m{c^2} + {\mu \over n}.$$
(215)

In terms of these variables, the pressure p is seen to be

$$p = - {\mathcal E} + \mu n.$$
(216)

The Newtonian limit of the general relativistic fluid equations, in a general coordinate basis, is then

$$0 = {{\partial n} \over {\partial t}} + {\nabla _i}(n{v^i})$$
(217)

and

$$0 = {\partial \over {\partial t}}{v^i} + {v^j}{\nabla _j}{v^i} + {g^{ij}}{\nabla _j}\left({\Phi + {\mu \over m}} \right),$$
(218)

where ∇i is the Euclidean space covariant derivative. Of course, the gravitational potential Φ is determined by a Poisson equation with mn as source.

A Newtonian two-fluid system can be obtained in a similar fashion. As discussed in Section 10, the main difference is that we need two sets of worldlines, describable, say, by curves \(x_{\rm{x}}^\mu ({\tau _{\rm{x}}})\) where τx is the proper time along a constituent’s worldline. Of course, entrainment also comes into play. Its presence implies that the relative flow of the fluids is required to specify the local thermodynamic state of the system, and that the momentum of a given fluid is not simply proportional to that fluid’s flux. This is the situation for superfluid He4 [94, 110], where the entropy can flow independently of the superfluid Helium atoms. Superfluid He3 can also be included in the mixture, in which case there will be a relative flow of the He3 isotope with respect to He4, and relative flows of each with respect to the entropy [113].

Let us consider a two-fluid model like a mixture of He4 and He3, or neutrons and protons in a neutron star. We will denote the two fluids as fluids a and b. The magnitude squared of the difference of three-velocities

$$w_i^{{\rm{yx}}} = v_i^{\rm{y}} - v_i^{\rm{x}}$$
(219)

is denoted ω2 so that the equation of state takes the form ω = ω(na, nb, w2). Hence,

$${\rm{d}}{\mathcal E} = {\mu ^{\rm{a}}}\,{\rm{d}}{n_{\rm{a}}} + {\mu ^{\rm{b}}}\,{\rm{d}}{n_{\rm{b}}} + \alpha \,{\rm{d}}{w^2},$$
(220)

where

$${\mu ^{\rm{a}}} = {\left. {{{\partial {\mathcal E}} \over {\partial {n_{\rm{a}}}}}} \right\vert _{{n_{\rm{b}}},{w^2}}},\qquad {\mu ^{\rm{b}}} = {\left. {{{\partial {\mathcal E}} \over {\partial {n_{\rm{b}}}}}} \right\vert _{{n_{\rm{a}}},{w^2}}},\qquad \alpha = {\left. {{{\partial {\mathcal E}} \over {\partial {w^2}}}} \right\vert _{{n_{\rm{a}}},{n_{\rm{b}}}}}.$$
(221)

The a coefficient reflects the effect of entrainment on the equation of state. Similarly, entrainment causes the fluid momenta to be modified to

$${{p_i^{\rm{x}}} \over {{m^{\rm{x}}}}} = v_{\rm{x}}^i + 2{\alpha \over {{\rho _{\rm{x}}}}}w_{{\rm{yx}}}^i.$$
(222)

The number density of each fluid obeys a continuity equation:

$${{\partial {n_{\rm{x}}}} \over {\partial t}} + {\nabla _j}({n_{\rm{x}}}v_{\rm{x}}^j) = 0.$$
(223)

Each fluid is also seen to satisfy an Euler-type equation, which ensures the conservation of total momentum. This equation can be written

$$\left({{\partial \over {\partial t}} + v_{\rm{x}}^j{\nabla _j}} \right)[v_i^{\rm{x}} + {\varepsilon _{\rm{x}}}w_i^{{\rm{yx}}}] + {\nabla _i}(\Phi + {\tilde \mu _{\rm{x}}}) + {\varepsilon _{\rm{x}}}w_j^{{\rm{yx}}}{\nabla _i}v_{\rm{x}}^j = 0,$$
(224)

where

$${\tilde \mu _{\rm{x}}} = {{{\mu ^{\rm{x}}}} \over {{m^{\rm{x}}}}},$$
(225)

i.e. \({{\tilde \mu}_{\rm{x}}}\) is the relevant chemical potential per unit mass, and the entrainment is included via the coefficients

$${\varepsilon _{\rm{x}}} = 2{\rho _{\rm{x}}}\alpha .$$
(226)

for a detailed discussion of these equations, see [91, 6].

13 The CFS Instability

Investigations into the stability properties of rotating self-gravitating bodies are of obvious relevance to astrophysics. By improving our understanding of the relevant issues we can hope to shed light on the nature of the various dynamical and secular instabilities that may govern the spinevolution of rotating stars. The relevance of such knowledge for neutron star astrophysics may be highly significant, especially since instabilities may lead to detectable gravitational-wave signals. In this section we will review the Lagrangian perturbation framework developed by Friedman and Schutz [44, 45] for rotating non-relativistic stars. This will lead to criteria that can be used to decide when the oscillations of a rotating neutron star are unstable. We provide an explicit example proving the instability of the so-called r-modes at all rotation rates in a perfect fluid star.

13.1 Lagrangian perturbation theory

Following [44, 45], we work with Lagrangian variations. We have already seen that the Lagrangian perturbation ΔQ of a quantity Q is related to the Eulerian variation δQ by

$$\Delta Q = \delta Q + {{\mathcal L}_\xi}Q,$$
(227)

where (as before) ξ is the Lie derivative that was introduced in Section 2. The Lagrangian change in the fluid velocity follows from the Newtonian limit of Equation (135):

$$\Delta {v^i} = {\partial _t}{\xi ^i},$$
(228)

where ξi is the Lagrangian displacement. Given this, and

$$\Delta {g_{ij}} = {\nabla _i}{\xi _j} + {\nabla _j}{\xi _i},$$
(229)

we have

$$\Delta {v_i} = {\partial _t}{\xi _i} + {v^j}{\nabla _i}{\xi _j} + {v^j}{\nabla _j}{\xi _i}.$$
(230)

Let us consider the simplest case, namely a barotropic ordinary fluid for which = (n). Then we want to perturb the continuity and Euler equations from the previous Section 12. The conservation of mass for the perturbations follows immediately from the Newtonian limits of Equations (134) and (138) (which as we recall automatically satisfy the continuity equation):

$$\Delta n = - n{\nabla _i}{\xi ^i},\qquad \delta n = - {\nabla _i}(n{\xi ^i}){.}$$
(231)

Consequently, the perturbed gravitational potential follows from

$${\nabla ^2}\delta \Phi = 4\pi Gm\,\delta n = - 4\pi Gm{\nabla _i}(n{\xi ^i}){.}$$
(232)

In order to perturb the Euler equations we first rewrite Equation (218) as

$$({\partial _t} + {{\mathcal L}_v}){v_i} + {\nabla _i}\left({\tilde \mu + \Phi - {1 \over 2}{v^2}} \right) = 0,$$
(233)

where \(\tilde \mu = \mu/m\). This form is particularly useful since the Lagrangian variation commutes with the operator t + v. Perturbing Equation (233) we thus have

$$({\partial _t} + {{\mathcal L}_v})\Delta {v_i} + {\nabla _i}\left({\Delta \tilde \mu + \Delta \Phi - {1 \over 2}\Delta ({v^2})} \right) = 0.$$
(234)

We want to rewrite this equation in terms of the displacement vector ξ. After some algebra one finds

$$\partial _t^2{\xi _i} + 2{v^j}{\nabla _j}{\partial _t}{\xi _i} + {({v^j}{\nabla _j})^2}{\xi _i} + {\nabla _i}\delta \Phi + {\xi ^j}{\nabla _i}{\nabla _j}\Phi - ({\nabla _i}{\xi ^j}){\nabla _j}\tilde \mu + {\nabla _i}\Delta \tilde \mu = 0.$$
(235)

Finally, we need

$$\Delta \tilde \mu = \delta \tilde \mu + {\xi ^i}{\nabla _i}\tilde \mu = \left({{{\partial \tilde \mu} \over {\partial n}}} \right)\delta n + {\xi ^i}{\nabla _i}\tilde \mu = - \left({{{\partial \tilde \mu} \over {\partial n}}} \right){\nabla _i}(n{\xi ^i}) + {\xi ^i}{\nabla _i}\tilde \mu .$$
(236)

Given this, we have arrived at the following form for the perturbed Euler equation:

$$\partial _t^2{\xi _i} + 2{v^j}{\nabla _j}{\partial _t}{\xi _i} + {({v^j}{\nabla _j})^2}{\xi _i} + {\nabla _i}\delta \Phi + {\xi ^j}{\nabla _i}{\nabla _j}({\Phi + \tilde \mu}) - {\nabla _i}\left[ {\left({{{\partial \tilde \mu} \over {\partial n}}} \right){\nabla _j}(n{\xi ^j})} \right] = 0.$$
(237)

This equation should be compared to Equation (15) of [44].

Having derived the perturbed Euler equations, we are interested in constructing conserved quantities that can be used to assess the stability of the system. To do this, we first multiply Equation (237) by the number density n, and then write the result (schematically) as

$$A\partial _t^2\xi + B{\partial _t}\xi + C\xi = 0,$$
(238)

omitting the indices since there is little risk of confusion. Defining the inner product

$$\langle {\eta ^i},{\xi _i}\rangle = \int {{\eta ^{{i_\ast}}}} {\xi _i}\,{\rm{d}}V,$$
(239)

where n and ξ both solve the perturbed Euler equation, and the asterisk denotes complex conjugation, one can now show that

$$\langle \eta ,A\xi \rangle = {\langle \xi ,A\eta \rangle ^{\ast}}\qquad {\rm{and}}\qquad \langle \eta ,B\xi \rangle = - {\langle \xi ,B\eta \rangle ^{\ast}}.$$
(240)

The latter requires the background relation ∇i(nvi) = 0, and holds provided that n → 0 at the surface of the star. A slightly more involved calculation leads to

$$\langle \eta ,C\xi \rangle ={\langle \xi ,C\eta \rangle ^{\ast}}.$$
(241)

Inspired by the fact that the momentum conjugate to ξi is ρ(t + vjj)ξi, we now consider the symplectic structure

$$W(\eta ,\xi) = \left\langle {\eta ,A{\partial _t}\xi + {1 \over 2}B\xi} \right\rangle - \left\langle {A{\partial _t}\eta + {1 \over 2}B\eta ,\xi} \right\rangle .$$
(242)

Given this, it is straightforward to show that W(η, ξ) is conserved, i.e. tW = 0. This leads us to define the canonical energy of the system as

$${E_{\rm{c}}} = {m \over 2}W({\partial _t}\xi ,\xi) = {m \over 2}{\rm{\{}}\langle {\partial _t}\xi ,A{\partial _t}\xi \rangle + \langle \xi ,C\xi \rangle \} .$$
(243)

After some manipulations, we arrive at the following explicit expression:

$${E_{\rm{c}}} = {1 \over 2}\int {\left\{{\rho \vert {\partial _t}\xi \vert ^{2} - \rho \vert {v^j}{\nabla _j}{\xi _i}\vert ^{2} + \rho {\xi ^i}{\xi ^{{j_\ast}}}{\nabla _i}{\nabla _j}(\tilde \mu + \Phi) + \left({{{\partial \mu} \over {\partial n}}} \right)\vert \delta n\vert ^{2} - {1 \over {4\pi G}}\vert {\nabla _i}\delta \Phi \vert ^{2}} \right\}} {\rm{d}}V$$
(244)

which can be compared to Equation (45) of [44]. In the case of an axisymmetric system, e.g. a rotating star, we can also define a canonical angular momentum as

$${J_{\rm{c}}} = - {m \over 2}W({\partial _\varphi}\xi ,\xi) = - {\rm{Re}}\left\langle {{\partial _\varphi}\xi ,A{\partial _t}\xi + {1 \over 2}B\xi} \right\rangle .$$
(245)

The proof that this quantity is conserved relies on the fact that (i) W(η, ξ) is conserved for any two solutions to the perturbed Euler equations, and (ii) commutes with ρvjj in axisymmetry, which means that if ξ solves the Euler equations then so does φξ.

As discussed in [44, 45], the stability analysis is complicated by the presence of so-called “trivial” displacements. These trivials can be thought of as representing a relabeling of the physical fluid elements. A trivial displacement leaves the physical quantities unchanged, i.e. is such that δn = δvi = 0. This means that we must have

$${\nabla _i}(\rho {\zeta ^i}) = 0,$$
(246)
$$({{\partial _t} + {{\mathcal L}_v}}){\zeta ^i} = 0.$$
(247)

The solution to the first of these equations can be written as

$$\rho {\zeta ^i} = {\epsilon ^{ijk}}{\nabla _j}{\chi _k}$$
(248)

where, in order to satisfy the second equations, the vector χk must have time-dependence such that

$$({\partial _t} + {{\mathcal L}_v}){\chi _k} = 0.$$
(249)

This means that the trivial displacement will remain constant along the background fluid trajectories. Or, as Friedman and Schutz [44] put it, the “initial relabeling is carried along with the unperturbed motion”.

The trivials have the potential to cause trouble because they affect the canonical energy. Before one can use the canonical energy to assess the stability of a rotating configuration one must deal with this “gauge problem”. To do this one should ensure that the displacement vector ξ is orthogonal to all trivials. A prescription for this is provided by Friedman and Schutz [44]. In particular, they show that the required canonical perturbations preserve the vorticity of the individual fluid elements. Most importantly, one can also prove that a normal mode solution is orthogonal to the trivials. Thus, normal mode solutions can serve as canonical initial data, and be used to assess stability.

13.2 Instabilities of rotating perfect fluid stars

The importance of the canonical energy stems from the fact that it can be used to test the stability of the system. In particular:

  • Dynamical instabilities are only possible for motions such that Ec = 0. This makes intuitive sense since the amplitude of a mode for which Ec vanishes can grow without bounds and still obey the conservation laws.

  • If the system is coupled to radiation (e.g. gravitational waves) which carries positive energy away from the system (which should be taken to mean that tEc < 0) then any initial data for which Ec < 0 will lead to an unstable evolution.

Consider a real frequency normal-mode solution to the perturbation equations, a solution of form \(\xi = \hat \xi {e^i}^{(\omega t + m\varphi)}\). One can readily show that the associated canonical energy becomes

$${E_{\rm{c}}} = \omega \left[ {\omega \langle \xi ,A\xi \rangle - {i \over 2}\langle \xi ,B\xi \rangle} \right],$$
(250)

where the expression in the bracket is real. For the canonical angular momentum we get

$${J_{\rm{c}}} = - m\;\left[ {\omega \;\langle \xi ,A\xi \rangle - {i \over 2}\;\langle \xi ,B\xi \rangle} \right].$$
(251)

Combining Equation (250) and Equation (251) we see that, for real frequency modes, we will have

$${E_{\rm{c}}} = - {\omega \over m}{J_{\rm{c}}} = {\sigma _{\rm{p}}}{J_{\rm{c}}},$$
(252)

where σp is the pattern speed of the mode.

Now notice that Equation (251) can be rewritten as

$${{{J_{\rm{c}}}} \over {\left\langle {\hat \xi ,\rho \hat \xi} \right\rangle}} = - m\omega + m{{\langle \xi ,i\rho {\upsilon ^j}{\nabla _j}\xi \rangle} \over {\left\langle {\hat \xi ,\rho \hat \xi} \right\rangle}}.$$
(253)

Using cylindrical coordinates, and vj = Ωφj, one can show that

$$- i\rho \xi _i^{\ast}{\upsilon ^j}{\nabla _j}{\xi ^i} = \rho \Omega \;\left[ {m{{\left\vert {\hat \xi} \right\vert}^2} + i{{({{\hat \xi}^{\ast}} \times \hat \xi)}_z}} \right].$$
(254)
$$\left\vert {{{({{\hat \xi}^{\ast}} \times \hat \xi)}_z}} \right\vert \leq {\left\vert {\hat \xi} \right\vert ^2}$$
(255)

and we must have (for uniform rotation)

$${\sigma _{\rm{p}}} - \Omega \left({1 + {1 \over m}} \right) \leq {{{J_{\rm{c}}}/{m^2}} \over {\left\langle {\hat \xi ,\rho \hat \xi} \right\rangle}} \leq {\sigma _{\rm{p}}} - \Omega \;\left({1 - {1 \over m}} \right).$$
(256)

Equation (256) forms a key part of the proof that rotating perfect fluid stars are generically unstable in the presence of radiation [45]. The argument goes as follows: Consider modes with finite frequency in the Ω → 0 limit. Then Equation (256) implies that co-rotating modes (with σp > 0) must have Jc > 0, while counter-rotating modes (for which σp < 0) will have Jc < 0. In both cases Ec > 0, which means that both classes of modes are stable. Now consider a small region near a point where σp = 0 (at a finite rotation rate). Typically, this corresponds to a point where the initially counter-rotating mode becomes co-rotating. In this region Jc < 0. However, Ec will change sign at the point where σp (or, equivalently, the frequency ω) vanishes. Since the mode was stable in the non-rotating limit this change of sign indicates the onset of instability at a critical rate of rotation.

13.3 The r-mode instability

In order to further demonstrate the usefulness of the canonical energy, let us prove the instability of the inertial r-modes. For a general inertial mode we have (cf. [75] for a discussion of the single fluid problem using notation which closely resembles the one we adopt here)

$${\upsilon^i}\sim \delta {\upsilon^i}\sim {\dot \xi ^i}\sim \Omega \qquad {\rm{and}}\qquad \delta \Phi \sim \delta n\sim {\Omega ^2}.$$
(257)

If we also assume axial-led modes, like the r-modes, then we have δvr Ω2 and the continuity equation leads to

$${\nabla _i}\delta {\upsilon ^i}\sim {\Omega ^3}\qquad \rightarrow \qquad {\nabla _i}{\xi ^i}\sim {\Omega ^2}.$$
(258)

Under these assumptions we find that Ec becomes (to order Ω2)

$${E_{\rm{c}}} \approx {1 \over 2}\int \rho \;\left[ {{{\left\vert {{\partial _t}\xi} \right\vert}^2} - {{\left\vert {{\upsilon ^i}{\nabla _i}\xi} \right\vert}^2} + {\xi ^{i{\ast}}}{\xi ^j}{\nabla _i}{\nabla _j}\left({\Phi + \tilde \mu} \right)} \right]{\rm{d}}V.$$
(259)

We can rewrite the last term using the equation governing the axisymmetric equilibrium. Keeping only terms of order we have

$${\xi ^{i{\ast}}}{\xi ^j}{\nabla _i}{\nabla _j}\left({\Phi + \tilde \mu} \right) \approx {1 \over 2}{\Omega ^2}{\xi ^{i{\ast}}}{\xi ^j}{\nabla _i}{\nabla _j}({r^2}{\sin ^2}\theta).$$
(260)

A bit more work then leads to

$${1 \over 2}{\Omega ^2}{\xi ^{i{\ast}}}{\xi ^j}{\nabla _i}{\nabla _j}({r^2}{\sin ^2}\theta) = {\Omega ^2}{r^2}\;\left[ {{{\cos}^2}\theta {{\left\vert {{\xi ^\theta}} \right\vert}^2} + {{\sin}^2}\theta {{\left\vert {{\xi ^\varphi}} \right\vert}^2}} \right]$$
(261)

and

$${\left\vert {{\upsilon ^i}{\nabla _i}{\xi _j}} \right\vert ^2} = {\Omega ^2}\left\{{{m^2}{{\left\vert \xi \right\vert}^2} - 2im{r^2}\sin \theta \cos \theta \left[ {{\xi ^\theta}{\xi ^{\varphi {\ast}}} - {\xi ^\varphi}{\xi ^{\theta {\ast}}}} \right] + {r^2}\left[ {{{\cos}^2}\theta {{\left\vert {{\xi ^\theta}} \right\vert}^2} + {{\sin}^2}\theta {{\left\vert {{\xi ^\varphi}} \right\vert}^2}} \right]} \right\},$$
(262)

which means that the canonical energy can be written in the form

$${E_{\rm{c}}} \approx - {1 \over 2}\int \rho \;\left\{{(m\Omega - \omega)(m\Omega + \omega)\vert \xi {\vert ^2} - \,2im{\Omega ^2}{r^2}\sin \theta \cos \theta \;\left[ {{\xi ^\theta}{\xi ^{\varphi {\ast}}} - {\xi ^\varphi}{\xi ^{\theta {\ast}}}} \right]} \right\}{\rm{d}}V$$
(263)

for an axial-led mode.

Introducing the axial stream function U we have

$${\xi ^\theta} = - {{iU} \over {{r^2}\sin \theta}}{\partial _\varphi}Y_l^m{e^{i\omega t}},$$
(264)
$${\xi ^\varphi} = {{iU} \over {{r^2}\sin \theta}}{\partial _\theta}Y_l^m{e^{i\omega t}},$$
(265)

where \(Y_l^m = Y_l^m(\theta, \varphi)\) are the standard spherical harmonics. This leads to

$$\vert \xi {\vert ^2} = {{\vert U{\vert ^2}} \over {{r^2}}}\;\left[ {{1 \over {{{\sin}^2}\theta}}\vert {\partial _\varphi}Y_l^m{\vert ^2} + \vert {\partial _\theta}Y_l^m{\vert ^2}} \right]$$
(266)

and

$$i{r^2}\sin \theta \cos \theta \;\left[ {{\xi ^\theta}{\xi ^{\varphi {\ast}}} - {\xi ^\varphi}{\xi ^{\theta {\ast}}}} \right] = {1 \over {{r^2}}}{{\cos \theta} \over {\sin \theta}}m\vert U{\vert ^2}\left[ {Y_l^m{\partial _\theta}Y_l^{m{\ast}} + Y_l^{m{\ast}}{\partial _\theta}Y_l^m} \right].$$
(267)

After performing the angular integrals, we find that

$${E_{\rm{c}}} = - {{l(l + 1)} \over 2}\;\left\{{(m\Omega - \omega)(m\Omega + \omega) - {{2{m^2}{\Omega ^2}} \over {l(l + 1)}}} \right\}\int \rho \vert U{\vert ^2}\,{\rm{d}}r.$$
(268)

Combining this with the r-mode frequency [75]

$$\omega = m\Omega \;\left[ {1 - {2 \over {l(l + 1)}}} \right]$$
(269)

we see that Ec < 0 for all l > 1 r-modes, i.e. they are all unstable. The l = m = 1 r-mode is a special case, leading to Ec = 0.

13.4 The relativistic problem

The theoretical framework for studying stellar stability in general relativity was mainly developed during the 1970s, with key contributions from Chandrasekhar and Friedman [31, 32] and Schutz [102, 103]. Their work extends the Newtonian analysis discussed above. There are basically two reasons why a relativistic analysis is more complicated than the Newtonian one. Firstly, the problem is algebraically more complex because one must solve the Einstein field equations in addition to the fluid equations of motion. Secondly, one must account for the fact that a general perturbation will generate gravitational waves. The work culminated in a series of papers [43, 44, 45, 42] in which the role that gravitational radiation plays in these problems was explained, and a foundation for subsequent research in this area was established. The main result was that gravitational radiation acts in the same way in the full theory as in a post-Newtonian analysis of the problem. If we consider a sequence of equilibrium models, then a mode becomes secularly unstable at the point where its frequency vanishes (in the inertial frame). Most importantly, the proof does not require the completeness of the modes of the system.

14 Modelling Dissipation

Although the inviscid model provides a natural starting point for any investigation of the dynamics of a fluid system, the effects of dissipative mechanisms are often key to the construction of a realistic model. Consider, for example, the case of neutron star oscillations and possible instabilities. While it is interesting from the conceptual point of view to establish that an instability (such as the gravitational-wave driven instability of the fundamental f-mode or the inertial r-mode discussed above) may be present in an ideal fluid, it is crucial to establish that the instability actually has opportunity to grow on a reasonably short timescale. To establish this, one must consider the most important damping mechanisms and work out whether they will suppress the instability or not. A recent discussion of these issues in the context of the r-mode instability can be found in [4].

From the point of view of relativistic fluid dynamics, it is clear already from the outset that we are facing a difficult problem. After all, the Fourier theory of heat conduction leads to instantaneous propagation of thermal signals. The fact that this non-causality is built into the description is unattractive already in the context of the Navier-Stokes equations. After all, one would expect heat to propagate at roughly the mean molecular speed in the system. For a relativistic description non-causal behavior would be truly unacceptable. As work by Lindblom and Hiscock [53] has established, there is a deep connection between causality, stability, and hyperbolicity of a dissipative model. One would expect an acceptable model to be hyperbolic, not allowing signals to propagate superluminally.

Our aim in this section is to discuss the three main models that exist in the literature. We first consider the classic work of Eckart [39] and Landau and Lifshitz [66], which is based on a seemingly natural extension of the inviscid equations. However, a detailed analysis of Lindblom and Hiscock [54, 55] has demonstrated that these descriptions have serious flaws and must be considered unsuitable for practical use. However, having discussed these models it is relatively easy to extend them in the way proposed by Israel and Stewart [107, 57, 58]. Their description, the derivation of which was inspired by early work of Grad [50] and Müller [81] and results from relativistic kinetic theory, provides a framework that is generally accepted as meeting the key criteria for a relativistic model [53]. Finally, we describe Carter’s more recent approach to the problem. This model is elegant because it makes maximal use of a variational argument. The construction is also more general than that of Israel and Stewart. In particular, it shows how one would account for several dynamically independent interpenetrating fluid species. This extension is important for, for example, the consideration of relativistic superfluid systems.

14.1 The “standard” relativistic models

It is natural to begin by recalling the equations that describe the evolution of a perfect fluid (see Section 6). For a single particle species, we have the number current nμ which satisfies

$${\nabla _\mu}{n^\mu} = 0,\qquad {\rm{where}}\qquad {n^\mu} = n{u^\mu}.$$
(270)

The stress energy tensor

$${T^\mu}_\nu = p \bot _\nu ^\mu + \rho {u^\mu}{u_\nu},$$
(271)

where p and ρ represent the pressure and the energy density, respectively, satisfies

$${\nabla _\mu}{T^\mu}_\nu = 0.$$
(272)

In order to account for dissipation we need to introduce additional fields. First introduce a vector νμ representing particle diffusion. That is, let

$${n^\mu} \rightarrow n{u^\mu} + {\nu ^\mu},$$
(273)

and assume that the diffusion satisfies the constraint uμνμ = 0. This simply means that it is purely spatial according to an observer moving with the particles in the inviscid limit, exactly what one would expect from a diffusive process. Next we introduce the heat flux qμ and the viscous stress tensor, decomposed into a trace-part τ (not to be confused with the proper time) and a trace-free bit τμν, such that

$${T^{\mu \nu}} \rightarrow (p + \tau){\bot ^{\mu \nu}} + \rho {u^\mu}{u^\nu} + {q^{(\mu}}{u^{\nu)}} + {\tau ^{\mu \nu}},$$
(274)

subject to the constraints

$${u^\mu}{q_\mu} = {\tau ^\mu}_\mu = 0,$$
(275)
$${u^\mu}{\tau _{\mu \nu}} = 0,$$
(276)
$${\tau _{\mu \nu}} - {\tau _{\nu \mu}} = 0.$$
(277)

That is, both the heat flux and the trace-less part of the viscous stress tensor are purely spatial in the matter frame, and τμν is also symmetric. So far, the description is quite general. The constraints have simply been imposed to ensure that the problem has the anticipated number of degrees of freedom.

The next step is to deduce the permissible form for the additional fields from the second law of thermodynamics. The requirement that the total entropy must not decrease leads to the entropy flux sμ having to be such that

$${\nabla _\mu}{s^\mu} \geq 0.$$
(278)

Assuming that the entropy flux is a combination of all the available vectors, we have

$${s^\mu} = s{u^\mu} + \beta {q^\mu} - \lambda {\nu ^\mu},$$
(279)

where β and λ are yet to be specified. It is easy to work out the divergence of this. Then using the component of Equation (272) along uμ, i.e.

$${u_\mu}{\nabla _\nu}{T^{\mu \nu}} = 0,$$
(280)

and the thermodynamic relation

$$n{\nabla _\mu}{x_{\rm{s}}} = {1 \over T}{\nabla _\mu}\rho - {{p + \rho} \over {nT}}{\nabla _\mu}n,$$
(281)

which follows from assuming the equation of state s = s(ρ, n), and we recall that xs = s/n, one can show that

$$\begin{array}{*{20}c} {{\nabla _\mu}{s^\mu} = {q^\mu}\;\left({{\nabla _\mu}\beta - {1 \over T}{u^\nu}{\nabla _\nu}{u_\mu}} \right) + \left({\beta - {1 \over T}} \right)\;{\nabla _\mu}{q^\mu}\quad \quad \quad \quad \quad \quad \quad \quad \quad} \\ {- \left({{x_{\rm{s}}} + \lambda - {{p + \rho} \over {nT}}} \right){\nabla _\mu}{\nu ^\mu} - {\nu ^\mu}{\nabla _\mu}\lambda - {\tau \over T}{\nabla _\mu}{u^\mu} - {{{\tau ^{\mu \nu}}} \over T}{\nabla _\mu}{u_\nu}.} \\ \end{array}$$
(282)

We want to ensure that the right-hand side of this equation is positive definite (or indefinite). An easy way to achieve this is to make the following identifications:

$$\beta = 1/T,$$
(283)

and

$$\lambda = - {x_{\rm{s}}} + {{p + \rho} \over {nT}}.$$
(284)

Here we note that λ = g/nT, where g is the Gibbs free energy density. We also identify

$${\nu ^\mu} = - \sigma {T^2}{\bot ^{\mu \nu}}{\nabla _\nu}\lambda ,$$
(285)

where the “diffusion coefficient” σ ≥ 0, and the projection is needed in order for the constraint uμνμ = 0 to be satisfied. Furthermore, we can use

$$\tau = - \zeta {\nabla _\mu}{u^\mu},$$
(286)

where ζ ≥ 0 is the coefficient of bulk viscosity, and

$${q^\mu} = - \kappa T{\bot ^{\mu \nu}}\left({{1 \over T}{\nabla _\nu}T + {u^\alpha}{\nabla _\alpha}{u_\nu}} \right),$$
(287)

with κ ≥ 0 being the heat conductivity coefficient. To complete the description, we need to rewrite the final term in Equation (282). To do this it is useful to note that the gradient of the four-velocity can generally be written as

$${\nabla _\mu}{u_\nu} = {\sigma _{\mu \nu}} + {1 \over 3}{\bot _{\mu \nu}}\theta + {\varpi _{\mu \nu}} - {a_\nu}{u_\mu},$$
(288)

where the acceleration is defined as

$${a_\mu} = {u^\beta}{\nabla _\beta}{u_\mu},$$
(289)

the expansion is θ = ∇μuμ, and the shear is given by

$${\sigma _{\mu \nu}} = {1 \over 2}\;\left({\bot _\nu ^\alpha {\nabla _\alpha}{u_\mu} + \bot _\mu ^\alpha {\nabla _\alpha}{u_\nu}} \right) - {1 \over 3}{\bot _{\mu \nu}}\theta .$$
(290)

Finally, the “twist” follows fromFootnote 5

$${\varpi _{\mu \nu}} = {1 \over 2}\left({\bot _\nu ^\alpha {\nabla _\alpha}{u_\mu} - \bot _\mu ^\alpha {\nabla _\alpha}{u_\nu}} \right).$$
(291)

Since we want τμν to be symmetric, trace-free, and purely spatial according to an observer moving along uμ, it is useful to introduce the notation

$$\left\langle {{A_{\mu \nu}}} \right\rangle = {1 \over 2} \bot _\mu ^\rho \bot _\nu ^\sigma \left({{A_{\rho \sigma}} + {A_{\sigma \rho}} - {2 \over 3}{\bot _{\rho \sigma}}{\bot ^{\gamma \delta}}{A_{\gamma \delta}}} \right)$$
(292)

for any Aμν. In the case of the gradient of the four-velocity, it is easy to show that this leads to

$$\left\langle {{\nabla _\mu}{u_\nu}} \right\rangle = {\sigma _{\mu \nu}}$$
(293)

and therefore it is natural to use

$${\tau ^{\mu \nu}} = - \eta {\sigma ^{\mu \nu}},$$
(294)

where η ≥ 0 is the shear viscosity coefficient. Given these relations, we can write

$$T\,{\nabla _\mu}{s^\mu} = {{{q^\mu}{q_\mu}} \over {\kappa T}} + {\tau \over \zeta} + {{{\nu ^\mu}{\nu _\mu}} \over {\sigma {T^2}}} + {{{\tau ^{\mu \nu}}{\tau _{\mu \nu}}} \over {2\eta}} \geq 0.$$
(295)

By construction, the second law of thermodynamics is satisfied.

The model we have written down is quite general. In particular, it is worth noticing that we did not yet specify the four-velocity uμ. By doing this we can obtain from the above equations both the formulation due to Eckart [39] and that of Landau and Lifshitz [66]. To arrive at the Eckart description, we associate uμ with the flow of particles. Thus we take νμ = 0 (or equivalently σ = 0). This prescription has the advantage of being easy to implement. The Landau and Lifshitz model follows if we choose the four-velocity to be a timelike eigenvector of the stress-energy tensor. From Equation (274) it is easy to see that, by setting qμ = 0, we get

$${u_\mu}{T^{\mu \nu}} = - \rho {u^\nu}.$$
(296)

This is equivalent to setting κ = 0. Unfortunately, these models, which have been used in most applications of relativistic dissipation to date, are not at all satisfactory. While they pass the key test set by the second law of thermodynamics, they fail several other requirements of a relativistic description. A detailed analysis of perturbations away from an equilibrium state [54] demonstrates serious pathologies. The dynamics of small perturbations tends to be dominated by rapidly growing instabilities. This suggests that these formulations are likely to be practically useless. From the mathematical point of view they are also not acceptable since, being non-hyperbolic, they do not admit a well-posed initial-value problem.

14.2 The Israel-Stewart approach

From the above discussion we know that the most obvious strategy for extending relativistic hydrodynamics to include dissipative processes leads to unsatisfactory results. At first sight, this may seem a little bit puzzling because the approach we took is fairly general. Yet, the formulation suffers from pathologies. Most importantly, we have not managed to enforce causality. Let us now explain how this problem can be solved.

Our previous analysis was based on the assumption that the entropy current sμ can be described as a linear combination of the various fluxes in the system, the four-velocity uμ, the heat-flux qμ and the diffusion νμ. In a series of papers, Israel and Stewart [107, 57, 58] contrasted this “first-order” theory with relativistic kinetic theory. Following early work by Müller [81] and connecting with Grad’s 14-moment kinetic theory description [50], they concluded that a satisfactory model ought to be “second order” in the various fields. If we, for simplicity, work in the Eckart frame (cf. Lindblom and Hiscock [53]) this means that we would use the Ansatz

$${s^\mu} = s{u^\mu} + {1 \over T}{q^\mu} - {1 \over {2T}}\;\left({{\beta _0}{\tau ^2} + {\beta _1}{q_\alpha}{q^\alpha} + {\beta _2}{\tau _{\alpha \beta}}{\tau ^{\alpha \beta}}} \right)\;{u^\mu} + {{{\alpha _0}\tau {q^\mu}} \over T} + {{{\alpha _1}{\tau ^\mu}_\nu {q^\nu}} \over T}.$$
(297)

This expression is arrived at by asking what the most general form of a vector constructed from all the various fields in the problem may be. Of course, we now have a number of new (so far unknown) parameters. The three coefficients β0, β1, and β2 have a thermodynamical origin, while the two coefficients α0 and α1 represent the coupling between viscosity and heat flow. From the above expression, we see that in the frame moving with uμ the effective entropy density is given by

$$- {u_\mu}{s^\mu} = s - {1 \over {2T}}\;\left({{\beta _0}{\tau ^2} + {\beta _1}{q_\alpha}{q^\alpha} + {\beta _2}{\tau _{\alpha \beta}}{\tau ^{\alpha \beta}}} \right).$$
(298)

Since we want the entropy to be maximized in equilibrium, when the extra fields vanish, we should have [β0, β1, β2] ≥ 0. We also see that the entropy flux

$$\bot _\nu ^\mu {s^\nu} = {1 \over T}\;\left[ {(1 + {\alpha _0}\tau){q^\mu} + {\alpha _1}{\tau ^{\mu \nu}}{q_\nu}} \right]$$
(299)

is affected only by the parameters α0 and α1.

Having made the assumption (297) the rest of the calculation proceeds as in the previous case. Working out the divergence of the entropy current, and making use of the equations of motion, we arrive at

$$\begin{array}{*{20}c} {{\nabla _\mu}{s^\mu} = - {1 \over T}\tau \;\left[ {{\nabla _\mu}{u^\mu} + {\beta _0}{u^\mu}{\nabla _\mu}\tau - {\alpha _0}{\nabla _\mu}{q^\mu} - {\gamma _0}T{q^\mu}{\nabla _\mu}\left({{{{\alpha _0}} \over T}} \right) + {{\tau T} \over 2}{\nabla _\mu}\;\left({{{{\beta _0}{u^\mu}} \over T}} \right)} \right]\quad \quad \quad \quad} \\ {- {1 \over T}{q^\mu}\;\left[ {{1 \over T}{\nabla _\mu}T + {u^\nu}{\nabla _\nu}{u_\mu} + {\beta _1}{u^\nu}{\nabla _\nu}{q_\mu} - {\alpha _0}{\nabla _\mu}\tau - {\alpha _1}{\nabla _\nu}{\tau ^\nu}_\mu} \right.\quad \quad \quad \quad \quad \quad} \\ {\quad \quad \; + \left. {{T \over 2}{q_\mu}{\nabla _\nu}\;\left({{{{\beta _1}{u^\nu}} \over T}} \right) - (1 - {\gamma _0})\tau T{\nabla _\mu}\;\left({{{{\alpha _0}} \over T}} \right) - (1 - {\gamma _1})T{\tau ^\nu}_\mu {\nabla _\nu}\;\left({{{{\alpha _1}} \over T}} \right)} \right]} \\ {\quad - {1 \over T}{\tau ^{\mu \nu}}\;\left[ {{\nabla _\mu}{u_\nu} + {\beta _2}{u^\alpha}{\nabla _\alpha}{\tau _{\mu \nu}} - {\alpha _1}{\nabla _\mu}{q_\nu} + {T \over 2}{\tau _{\mu \nu}}{\nabla _\alpha}\;\left({{{{\beta _2}{u^\alpha}} \over T}} \right) - {\gamma _1}T{q_\mu}{\nabla _\nu}\;\left({{{{\alpha _1}} \over T}} \right)} \right].} \\ \end{array}$$
(300)

In this expression it should be noted that we have introduced (following Lindblom and Hiscock) two further parameters, γ0 and γ1. They are needed because without additional assumptions it is not clear how the “mixed” quadratic term should be distributed. A natural way to fix these parameters is to appeal to the Onsager symmetry principle [58], which leads to the mixed terms being distributed “equally” and hence γ0 = γ1 = 1/2.

Denoting the comoving derivative by a dot, i.e. using \({u^\mu}{\nabla _\mu}\tau = \dot \tau\) etc. we see that the second law of thermodynamics is satisfied if we choose

$$\tau = - \zeta \left[ {{\nabla _\mu}{u^\mu} + {\beta _0}\dot \tau - {\alpha _0}{\nabla _\mu}{q^\mu} - {\gamma _0}T{q^\mu}{\nabla _\mu}\left({{{{\alpha _0}} \over T}} \right) + {{\tau T} \over 2}{\nabla _\mu}\left({{{{\beta _0}{u^\mu}} \over T}} \right)} \right],$$
(301)
$$\begin{array}{*{20}c} {{q^\mu} = - \kappa T{\bot ^{\mu \nu}}\left[ {{1 \over T}{\nabla _\nu}T + {{\dot u}_\nu} + {\beta _1}{{\dot q}_\nu} - {\alpha _0}{\nabla _\nu}\tau - {\alpha _1}{\nabla _\alpha}{\tau ^\alpha}_\nu + {T \over 2}{q_\nu}{\nabla _\alpha}\left({{{{\beta _1}{u^\alpha}} \over T}} \right)} \right.\quad \quad} \\ {\qquad \qquad \quad \left. {- (1 - {\gamma _0})\tau T{\nabla _\nu}\left({{{{\alpha _0}} \over T}} \right) - (1 - {\gamma _1})T{\tau ^\alpha}_\nu {\nabla _\alpha}\left({{{{\alpha _1}} \over T}} \right) + {\gamma _2}{\nabla _{\left[ \nu \right.}}{u_{\left. \alpha \right]}}{q^\alpha}} \right],} \\ \end{array}$$
(302)
$${\tau _{\mu \nu}} = - 2\eta \left[ {{\beta _2}{{\dot \tau}_{\mu \nu}} + {T \over 2}{\tau _{\mu \nu}}{\nabla _\alpha}\left({{{{\beta _2}{u^\alpha}} \over T}} \right) + \left\langle {{\nabla _\mu}{u_\nu} - {\alpha _1}{\nabla _\mu}{q_\nu} - {\gamma _1}T{q_\mu}{\nabla _\nu}\left({{{{\alpha _1}} \over T}} \right) + {\gamma _3}{\nabla _{\left[ \mu \right.}}{u_{\left. \alpha \right]}}{\tau _\nu}^\alpha} \right\rangle} \right],$$
(303)

where the angular brackets denote symmetrization as before. In these expression we have added yet another two terms, representing the coupling to vorticity. These bring further “free” parameters γ2 and γ3. It is easy to see that we are allowed to add these terms since they do not affect the entropy production. In fact, a large number of similar terms may, in principle, be considered (see note added in proof in [53]). The presence of coupling terms of the particular form that we have introduced is suggested by kinetic theory [58].

What is clear from these complicated expressions is that we now have evolution equations for the dissipative fields. Introducing characteristic “relaxation” times

$${t_0} = \zeta {\beta _0},\qquad {t_1} = \kappa {\beta _1},\qquad {t_2} = 2\eta {\beta _2},$$
(304)

the above equations can be written

$${t_0}\dot \tau + \tau = - \zeta [ \ldots ],$$
(305)
$${t_1}{\bot ^{\mu \nu}}{\dot q^\nu} + {q^\mu} = - \kappa T{\bot ^{\mu \nu}}[ \ldots ],$$
(306)
$${t_2}{\dot \tau _{\mu \nu}} + {\tau _{\mu \nu}} = - 2\eta [ \ldots ].$$
(307)

A detailed stability analysis by Hiscock and Lindblom [53] shows that the Israel-Stewart theory is causal for stable fluids. Then the characteristic velocities are subluminal and the equations form a hyperbolic system. An interesting aspect of the analysis concerns the potentially stabilizing role of the extra parameters (β0, …, αo, …). Relevant discussions of the implications for the nuclear equation of state and the maximum mass of neutron stars have been provided by Olson and Hiscock [86, 84]. A more detailed mathematical stability analysis can be found in the work of Kreiss et al. [64].

Although the Israel-Stewart model resolves the problems of the first-order descriptions for near equilibrium situations, difficult issues remain to be understood for nonlinear problems. This is highlighted in work by Hiscock and Lindblom [56], and Olson and Hiscock [85]. They consider nonlinear heat conduction problems and show that the Israel-Stewart formulation becomes non-causal and unstable for sufficiently large deviations from equilibrium. The problem appears to be more severe in the Eckart frame [56] than in the frame advocated by Landau and Lifshitz [85]. The fact that the formulation breaks down in nonlinear problems is not too surprising. After all, the basic foundation is a “Taylor expansion” in the various fields. However, it raises important questions. There are many physical situations where a reliable nonlinear model would be crucial, e.g. heavy-ion collisions and supernova core collapse. This problem requires further thought.

14.3 Carter’s canonical framework

The most recent attempt to construct a relativistic formalism for dissipative fluids is due to Carter [20]. His approach is less “phenomenological” than the ones we have considered so far in that it is based on making maximal use of variational principle arguments. The construction is also extremely general. On the one hand this makes it more complex. On the other hand this generality could prove useful in more complicated cases, e.g. for investigations of multi-fluid dynamics and/or elastic media. Given the potential that this formalism has for future applications, it is worth discussing it in detail.

The overall aim is to generalize the variational formulation described in Section 8 in such a way that viscous “stresses” are accounted for. Because the variational foundations are the same, the number currents \(n_{\rm{x}}^\mu\) play a central role (x is a constituent index as before). In addition we can introduce a number of viscosity tensors \(\mathop \tau \nolimits_\sum ^{\mu \nu } \), which we assume to be symmetric (even though it is clear that such an assumption is not generally correct [6]). The index Σ is “analogous” to the constituent index, and represents different viscosity contributions. It is introduced in recognition of the fact that it may be advantageous to consider different kinds of viscosity, e.g. bulk and shear viscosity, separately. As in the case of the constituent index, a repeated index Σ does not imply summation.

The key quantity in the variational framework is the master function Λ. As it is a function of all the available fields, we will now have \(\Lambda \left( {n_x^\mu ,\mathop \tau \nolimits_\sum ^{\mu \nu } ,{g_{\mu \nu }}} \right)\). Then a general variation leads to

$$\delta \Lambda = \sum\limits_{\rm{x}} {\mu _\rho ^{\rm{x}}} \,\delta n_{\rm{x}}^\rho + {1 \over 2}\sum\limits_\Sigma {\pi _{\mu \nu}^\Sigma} \,\delta \tau _\Sigma ^{\mu \nu} + {{\partial \Lambda} \over {\partial {g^{\mu \nu}}}}\delta {g^{\mu \nu}}.$$
(308)

Since the metric piece is treated in the same way as in the non-dissipative problem we will concentrate on the case δgμν = 0. This is, obviously, no real restriction since we already know how the variational principle couples in the Einstein equations in the general case. In the above formula we recognize the momenta \(\mu _\rho ^{\rm{x}}\) that are conjugate to the fluxes. We also have an analogous set of “strain” variables which are defined by

$$\pi _{\mu \nu}^\Sigma = \pi _{(\mu \nu)}^\Sigma = 2{{\partial \Lambda} \over {\partial \tau _\Sigma ^{\mu \nu}}}.$$
(309)

As in the non-dissipative case, the variational framework suggests that the equations of motion can be written as a force-balance equation,

$${\nabla _\mu}T_\nu ^\mu = \sum\limits_{\rm{x}} {f_\nu ^{\rm{x}}} + \sum\limits_\Sigma {f_\nu ^\Sigma} = 0,$$
(310)

where the generalized forces work out to be

$$f_\rho ^{\rm{x}} = \mu _\rho ^{\rm{x}}{\nabla _\sigma}n_{\rm{x}}^\sigma + n_{\rm{x}}^\sigma {\nabla _{\left[ \sigma \right.}}\mu _{\left. \rho \right]}^{\rm{x}}$$
(311)

and

$$f_\rho ^\Sigma = \pi _{\rho \nu}^\Sigma {\nabla _\sigma}\tau _\Sigma ^{\sigma \nu} + \tau _\Sigma ^{\sigma \nu}\left({{\nabla _\sigma}\pi _{\rho \nu}^\Sigma - {1 \over 2}{\nabla _\rho}\pi _{\sigma \nu}^\Sigma} \right).$$
(312)

Finally, the stress-energy tensor becomes

$${T^\mu}_\nu = \Psi {g^\mu}_\nu + \sum\limits_{\rm{x}} {{\mu _\nu}} n_{\rm{x}}^\mu + \sum\limits_\Sigma {\tau _\Sigma ^{\mu \sigma}} \pi _{\sigma \nu}^\Sigma ,$$
(313)

with the generalized pressure given by

$$\Psi = \Lambda - \sum\limits_{\rm{x}} {\mu _\rho ^{\rm{x}}} n_{\rm{x}}^\rho - {1 \over 2}\sum\limits_\Sigma {\tau _\Sigma ^{\rho \sigma}} \pi _{\rho \sigma}^\Sigma .$$
(314)

For reasons that will become clear shortly, it is useful to introduce a set of “convection vectors”. In the case of the currents, these are taken as proportional to the fluxes. This means that we introduce \(\beta _{\rm{x}}^\mu\) according to

$${h_{\rm{x}}}\beta _{\rm{x}}^\mu = n_{\rm{x}}^\mu ,\qquad \mu _\nu ^{\rm{x}}\beta _{\rm{x}}^\nu = - 1\qquad \rightarrow \qquad {h_{\rm{x}}} = - \mu _\nu ^{\rm{x}}n_{\rm{x}}^\nu .$$
(315)

With this definition we can introduce a projection operator

$$\bot _{\rm{x}}^{\mu \nu} = {g^{\mu \nu}} + \mu _{\rm{x}}^\mu \beta _{\rm{x}}^\nu \qquad \rightarrow \qquad \bot _{{\rm{x}}\nu}^\mu \beta _{\rm{x}}^\nu = \bot _{\rm{x}}^{\mu \nu}\mu _\nu ^{\rm{x}} = 0.$$
(316)

From the definition of the force density \(f_{\rm{x}}^\mu\) we can now show that

$${\nabla _\mu}n_{\rm{x}}^\mu = - \beta _{\rm{x}}^\mu f_\mu ^{\rm{x}}$$
(317)

and

$${h_{\rm{x}}}{{\mathcal L}_{\rm{x}}}\mu _\nu ^{\rm{x}} = \bot _{\nu \sigma}^{\rm{x}}f_{\rm{x}}^\sigma ,$$
(318)

where the Lie-derivative along \(\beta _{\rm{x}}^\mu\) is \({{\mathcal L}_{\rm{x}}} = {{\mathcal L}_{\beta _{\rm{x}}^\mu}}\). We see that the component of the force which is parallel to the convection vector \(\beta _{\rm{x}}^\mu\) is associated with particle conservation. Meanwhile, the orthogonal component represents the “change in momentum” along \(\beta _{\rm{x}}^\mu\).

In order to facilitate a similar decomposition for the viscous stresses, it is natural to choose the conduction vector as a unit null eigenvector associated with \(\mathop \pi \nolimits_\sum ^{\mu \nu } \). That is, we take

$$\pi _{\mu \nu}^\Sigma \beta _\Sigma ^\nu = 0$$
(319)

together with

$$u_\mu ^\Sigma = {g_{\mu \nu}}\beta _\Sigma ^\nu \qquad {\rm{and}}\qquad u_\mu ^\Sigma \beta _\Sigma ^\mu = - 1.$$
(320)

Introducing the projection associated with this conduction vector,

$$\bot _{\mu \nu}^\Sigma = {g_{\mu \nu}} + u_\mu ^\Sigma u_\nu ^\Sigma ,$$
(321)

yields

$$\bot _{\mu \nu}^\Sigma \beta _\Sigma ^\nu = 0.$$
(322)

Once we have defined \(\mathop \beta \nolimits_\sum ^{\mu \nu } \), we can use it to reduce the degrees of freedom of the viscosity tensors. So far, we have only required them to be symmetric. However, in the standard case one would expect a viscous tensor to have only six degrees of freedom. To ensure that this is the case in the present framework we introduce the following degeneracy condition:

$$u_\mu ^\Sigma \tau _\Sigma ^{\mu \nu} = 0.$$
(323)

That is, we require the viscous tensor to be purely spatial according to an observer moving along \(\mathop u\nolimits_\sum ^\mu \). With these definitions one can show that

$$\beta _\Sigma ^\mu {{\mathcal L}_\Sigma}\pi _{\mu \nu}^\Sigma = 0,$$
(324)

where \(\mathop {\mathcal{L}}\nolimits_\sum = \mathop {\mathcal{L}}\nolimits_{\mathop \beta \nolimits_\sum ^\mu } \) is the Lie-derivative along \(\mathop \beta \nolimits_\sum ^\mu \), and

$$\tau _\Sigma ^{\mu \nu}{{\mathcal L}_\Sigma}\pi _{\mu \nu}^\Sigma = - 2\beta _\Sigma ^\mu f_\mu ^\Sigma .$$
(325)

Finally, let us suppose that we choose to work in a given frame, moving with four-velocity uμ (and that the associated projection operator is Then we can use the following decompositions for the conduction vectors:

$$\beta _{\rm{x}}^\mu = {\beta _{\rm{x}}}({u^\mu} + v_{\rm{x}}^\mu)\qquad {\rm{and}}\qquad \beta _\Sigma ^\mu = {\beta _\Sigma}({u^\mu} + v_\Sigma ^\mu)\,.$$
(326)

We see that μx = 1/βX represents a chemical type potential for species x with respect to the chosen frame. At the same time, it is easy to see that μΣ = 1/βΣ is a Lorentz contraction factor. Using the norm of \(\mathop \beta \nolimits_\sum ^\mu \) we have

$$\beta _\Sigma ^\mu \beta _\mu ^\Sigma = - \beta _\Sigma ^2\left({1 - v_\Sigma ^2} \right) = - 1,$$
(327)

where \(\upsilon _\Sigma ^2 = \upsilon _\Sigma ^\mu \upsilon _\mu ^\Sigma\). Thus

$${\mu ^\Sigma} = 1/{\beta _\Sigma} = \sqrt {1 - v_\Sigma ^2} ,$$
(328)

which is clearly analogous to the standard Lorentz contraction formula.

So far the construction is quite formal, but we are now set to make contact with the physics. First we note that the above results allow us to demonstrate that

$${u^\sigma}{\nabla _\rho}{T^\rho}_\sigma + \sum\limits_{\rm{x}} {\left({{\mu ^{\rm{x}}}{\nabla _\sigma}n_{\rm{x}}^\sigma + v_{\rm{x}}^\sigma f_\sigma ^{\rm{x}}} \right)} + \sum\limits_\Sigma {\left({v_\Sigma ^\sigma f_\sigma ^\Sigma + {1 \over 2}{\mu ^\Sigma}\tau _\Sigma ^{\rho \sigma}{{\mathcal L}_\Sigma}\pi _{\rho \sigma}^\Sigma} \right)} = 0.$$
(329)

Recall that similar results were central to expressing the second law of thermodynamics in the previous two Sections 14.1 and 14.2. To see how things work out in the present formalism, let us single out the entropy fluid (with index s) by defining \({s^\mu} = n_{\rm{s}}^\mu\) and T = μs. To simplify the final expressions it is also useful to assume that the remaining species are governed by conservation laws of the form

$${\nabla _\mu}n_{\rm{x}}^\mu = {\Gamma _{\rm{x}}},$$
(330)

subject to the constraint of total baryon conservation

$$\sum\limits_{{\rm{x}} \neq {\rm{s}}} {{\Gamma _{\rm{x}}}} = 0.$$
(331)

Given this, and the fact that the divergence of the stress-energy tensor should vanish, we have

$$T{\nabla _\mu}{s^\mu} = - \sum\limits_{{\rm{x}} \neq {\rm{s}}} {{\mu ^{\rm{x}}}} {\Gamma _{\rm{x}}} - \sum\limits_{\rm{x}} {v_{\rm{x}}^\mu} f_\mu ^{\rm{x}} - \sum\limits_\Sigma {\left({v_\Sigma ^\mu f_\mu ^\Sigma + {1 \over 2}{\mu ^\Sigma}\tau _\Sigma ^{\mu \nu}{{\mathcal L}_\Sigma}\pi _{\mu \nu}^\Sigma} \right)} .$$
(332)

Finally, we can bring the remaining two force contributions together by introducing the linear combinations

$$\sum\limits_{\rm{x}} {{\zeta ^{\rm{x}}}} v_{\rm{x}}^\mu = 0\qquad {\rm{and}}\qquad \sum\limits_{\rm{x}} {\zeta _\Sigma ^{\rm{x}}} v_{\rm{x}}^\mu = v_\Sigma ^\mu ,$$
(333)

constrained by

$$\sum\limits_{\rm{x}} {{\zeta ^{\rm{x}}}} = \sum\limits_{\rm{x}} {\zeta _\Sigma ^{\rm{x}}} = 1.$$
(334)

then defining

$$\tilde f_\nu ^{\rm{x}} = f_\nu ^{\rm{x}} + \sum\limits_\Sigma {\zeta _\Sigma ^{\rm{x}}} f_\nu ^\Sigma ,$$
(335)

we have

$$T{\nabla _\mu}{s^\mu} = - \sum\limits_{{\rm{x}} \neq {\rm{s}}} {{\mu ^{\rm{x}}}} {\Gamma _{\rm{x}}} - \sum\limits_{\rm{x}} {v_{\rm{x}}^\mu} \tilde f_\mu ^{\rm{x}} - {1 \over 2}\sum\limits_\Sigma {{\mu ^\Sigma}} \tau _\Sigma ^{\mu \nu}{{\mathcal L}_\Sigma}\pi _{\mu \nu}^\Sigma \geq 0.$$
(336)

The three terms in this expression represent, respectively, the entropy increase due to (i) chemical reactions, (ii) conductivity, and (iii) viscosity. The simplest way to ensure that the second law of thermodynamics is satisfied is to make each of the three terms positive definite.

At this point, the general formalism must be completed by some suitably simple model for the various terms. A reasonable starting point would be to assume that each term is linear. For the chemical reactions this would mean that we expand each Γx according to

$${\Gamma _{\rm{x}}} = - \sum\limits_{{\rm{y}} \neq s} {{C_{{\rm{xy}}}}} {\mu ^{\rm{y}}},$$
(337)

where \({{\mathcal C}_{{\rm{xy}}}}\) is a positive definite (or indefinite) matrix composed of the various reaction rates. Similarly, for the conductivity term it is natural to consider “standard” resistivity such that

$$\tilde f_\rho ^{\rm{x}} = - \sum\limits_{\rm{y}} {{\mathcal R}_{\rho \sigma}^{{\rm{xy}}}} v_{\rm{y}}^\sigma .$$
(338)

Finally, for the viscosity we can postulate a law of form

$$\tau _\Sigma ^{\mu \nu} = - \eta _\Sigma ^{\mu \nu \,\rho \sigma}{{\mathcal L}_\Sigma}\pi _{\rho \sigma}^\Sigma ,$$
(339)

where we would have, for an isotropic model,

$$\eta _\Sigma ^{\mu \nu \,\rho \sigma} = {\eta _\Sigma} \bot _\Sigma ^{\mu (\rho} \bot _\Sigma ^{\sigma)\nu} + {1 \over 3}({\eta _\Sigma} - {\zeta _\Sigma}) \bot _\Sigma ^{\mu \nu} \bot _\Sigma ^{\rho \sigma},$$
(340)

where the coefficients ηΣ and ζΣ can be recognized as representing shear and bulk viscosity, respectively.

A detailed comparison between Carter’s formalism and the Israel-Stewart framework has been carried out by Priou [89]. He concludes that the two models, which are both members of a large family of dissipative models, have essentially the same degree of generality and that they are equivalent in the limit of linear perturbations away from a thermal equilibrium state. Providing explicit relations between the main parameters in the two descriptions, he also emphasizes the key point that analogous parameters may not have the same physical interpretation.

In developing his theoretical framework, Carter argued in favor of an “off the peg” model for heat conducting models [18]. This model is similar to that introduced in Section 10, and was intended as a simple, easier to use alternative to the Israel-Stewart construction. In the particular example discussed by Carter, he chooses to set the entrainment between particles and entropy to zero. This was done in order to simplify the discussion. But, as a discussion by Olson and Hiscock [87] shows, it has disastrous consequences. The resulting model violates causality in two simple model systems. As discussed by Priou [89] and Carter and Khalatnikov [25], this breakdown emphasizes the importance of the entrainment effect in these problems.

14.4 Remaining issues

We have discussed some of the models that have been constructed in order to incorporate dissipative effects in a relativistic fluid description. We have seen that the most obvious ways of doing this, the “text-book” approach of Eckart [39] and Landau and Lifshitz [66], fail completely. They do not respect causality and have serious stability problems. We have also described how this problem can be fixed by introducing additional dynamical fields. We discussed the formulations of Israel and Stewart [107, 57, 58] and Carter [20] in detail. From our discussion it should be clear that these models are examples of an extremely large family of possible theories for dissipative relativistic fluids. Given this wealth of possibilities, can we hope to find the “right” model? To some extent, the answer to this question relies on the extra parameters one has introduced in the theory. Can they be constrained by observations? This question has been discussed by Geroch [47] and Lindblom [73]. The answer seems to be no, we should not expect to be able to use observations to single out a preferred theoretical description. The reason for this is that the different models relax to the Navier-Stokes form on very short timescales. Hence, one will likely only be able to constrain the standard shear and bulk viscosity coefficients, etc. Related questions concern the practicality of the different proposed schemes. To a certain extent, this is probably a matter of taste. Of course, it may well be that the additional parameters required in a particular model are easier to extract from microphysics arguments. This could make this description easier to use in practice, which would be strong motivation for preferring it. Of course, there is no guarantee that the same formulation will be ideal for all circumstances. Clearly, there is scope for a lot more research in this problem area.

15 Heavy Ion Collisions

Relativistic fluid dynamics has regularly been used as a tool to model heavy ion collisions. The idea of using hydrodynamics to study the process of multiparticle production in high-energy hadron collisions can be traced back to work by, in particular, Landau in the early 1950s (see [12]). In the early days these phenomena were observed in cosmic rays. The idea to use hydrodynamics was resurrected as collider data became available [15] and early simulations were carried out at Los Alamos [2, 3]. More recently, modeling has primarily been focussed on reproducing data from, for example, CERN. A useful review of this active area of research can be found in [33].

In a hydrodynamics based model, a high-energy nuclear collision is viewed in the following way: In the center-of-mass frame two Lorentz contracted nuclei collide and, after a complex microscopic process, a hot dense plasma is formed. In the simplest description this matter is assumed to be in local thermal equilibrium. The initial thermalization phase is, of course, out of reach for hydrodynamics. In the model, the state of matter is simply specified by the initial conditions, e.g. in terms of distributions of fluid velocities and thermodynamical quantities. Then follows a hydrodynamical expansion, which is described by the standard conservation equations for energy/momentum, baryon number, and other conserved quantities, such as strangeness, isotopic spin, etc. (see [40] for a variational principle derivation of these equations). As the expansion proceeds, the fluid cools and becomes increasingly rarefied. This eventually leads to the decoupling of the constituent particles, which then do not interact until they reach the detector.

Fluid dynamics provides a well defined framework for studying the stages during which matter becomes highly excited and compressed and, later, expands and cools down. In the final stage when the nuclear matter is so dilute that nucleon-nucleon collisions are infrequent, hydrodynamics ceases to be valid. At this point additional assumptions are necessary to predict the number of particles, and their energies, which may be formed (to be compared to data obtained from the detector). These are often referred to as the “freeze-out” conditions. The problem is complicated by the fact that the “freeze-out” typically occurs at a different time for each fluid cell.

Even though the application of hydrodynamics in this area has led to useful results, it is clear that the theoretical foundation for this description is not a trivial matter. Basically, the criteria required for the equations of hydrodynamics to be valid are

  1. 1.

    many degrees of freedom in the system,

  2. 2.

    a short mean free path,

  3. 3.

    a short mean stopping length,

  4. 4.

    a sufficient reaction time for thermal equilibration, and

  5. 5.

    a short de Broglie wavelength (so that quantum mechanics can be ignored).

An interesting aspect of the hydrodynamic models is that they make use of concepts largely outside traditional nuclear physics, e.g. thermodynamics, statistical mechanics, fluid dynamics, and of course elementary particle physics. This is natural since the very hot, highly excited matter has a large number of degrees of freedom. But it is also a reflection of the basic lack of knowledge. As the key dynamics is uncertain, it is comforting to resort to standard principles of physics like the conservation of momentum and energy.

Another key reason why hydrodynamic models are favored is the simplicity of the input. Apart from the initial conditions which specify the masses and velocities, one needs only an equation of state and an Ansatz for the thermal degrees of freedom. If one includes dissipation one must in addition specify the form and magnitude of the viscosity and heat conduction. The fundamental conservation laws are incorporated into the Euler equations. In return for this relatively modest amount of input, one obtains the differential cross sections of all the final particles, the composition of clusters, etc. Of course, before one can confront the experimental data, one must make additional assumptions about the freeze-out, chemistry, etc. A clear disadvantage of the hydrodynamics model is that much of the microscopic dynamics is lost.

Let us discuss some specific aspects of the hydrodynamics that has been used in this area. As we will recognize, the issues that need to be addressed for heavy-ion collisions are very similar to those faced in studies of relativistic dissipation theory and multi-fluid modeling. The one key difference is that the problem only requires special relativity, so there is no need to worry about the spacetime geometry. Of course, it is still convenient to use a fully covariant description since one is then not tied down to the use of a particular set of coordinates.

In many studies of heavy ions a particular frame of reference is chosen. As we know from our discussion of dissipation and causality (see Section 14), this is an issue that must be approached with some care. In the context of heavy-ion collisions it is common to choose uμ as the velocity of either energy transport (the Landau-Lifshitz frame) or particle transport (the Eckart frame). It is recognized that the Eckart formulation is somewhat easier to use and that one can let uμ be either the velocity of nucleon or baryon number transport. On the other hand, there are cases where the Landau-Lifshitz picture has been viewed as more appropriate. For instance, when ultrarelativistic nuclei collide they virtually pass through one another leaving the vacuum between them in a highly excited state causing the creation of numerous particle-antiparticle pairs. Since the net baryon number in this region vanishes, the Eckart definition of the four-velocity cannot easily be employed. This discussion is a clear reminder of the situation for viscosity in relativity, and the resolution is likely the same. A true frame-independent description will need to include several distinct fluid components.

Multi-fluid models have, in fact, often been considered for heavy-ion collisions. One can, for example, treat the target and projectile nuclei as separate fluids to admit interpenetration, thus arriving at a two-fluid model. One could also use a relativistic multi-fluid model to allow for different species, e.g. nucleons, deltas, hyperons, pions, kaons, etc. Such a model could account for the varying dynamics of the different species and their mutual diffusion and chemical reactions. The derivation of such a model would follow closely our discussion in Section 10. In the heavyion community, it has been common to confuse the issue somewhat by insisting on choosing a particular local rest frame at each space-time point. This is, of course, complicated since the different fluids move at different speeds relative to any given frame. For the purpose of studying heavy ion collisions in the baryon-rich regions of space, the standard option seems to be to define the “baryonic Lorentz frame”. This is the local Lorentz frame in which the motion of the center-of-baryon number (analogous to the center-of-mass) vanishes.

The main problem with the one-fluid hydrodynamics model is the requirement of thermal equilibrium. In the hydrodynamic equations of motion it is implicitly assumed that local thermal equilibrium is “imposed” via an equation of state of the matter. This means that the relaxation timescale and the mean-free path should be much smaller than both the hydrodynamical timescale and the spatial size of the system. It seems reasonable to wonder if these conditions can be met for hadronic and nuclear collisions. On the other hand, from the kinematical point of view, apart from the use of the equation of state, the equations of hydrodynamics are nothing but conservation laws of energy and momentum, together with other conserved quantities such as charge. In this sense, for any process where the dynamics of flow is an important factor, a hydrodynamic framework is a natural first step. The effects of a finite relaxation time and mean-free path might be implemented at a later stage by using an effective equation of state, incorporating viscosity and heat conductivity, or some simplified transport equations. This does, of course, lead us back to the challenging problem of designing a causal relativistic theory for dissipation (see Section 14). In the context of heavy-ion collisions no calculations have yet been performed using a fully three-dimensional, relativistic theory which includes dissipation. In fact, considering the obvious importance of entropy, it is surprising that so few calculations have been reported for either relativistic or nonrelativistic hydrodynamics (although see [59]). An interesting comparison of different dissipative formulations is provided in [82, 83].

16 Superfluids and Broken Symmetries

In this section we discuss models that result when additional constraints are made on the properties of the fluids. We focus on the modeling of superfluid systems, using as our example the case of superfluid He4 [61, 94, 110]. The equations describing more complex systems are readily obtained by generalizing our discussion. We contrast three different descriptions: (i) the model that follows from the variational framework that has been our prime focus so far if we impose that one constituent is irrotational, (ii) the potential formulation due to Khalatnikov and collaborators, and (iii) a “hybrid” formulation which has been used in studies of heavy-ion collisions with broken symmetries.

16.1 Superfluids

Neutron star physics provides ample motivation for the need to develop a relativistic description of superfluid systems. As the typical core temperatures (below 108 K) are far below the Fermi temperature of the various constituents (of the order of 1012 K for baryons) neutron stars are extremely cold on the nuclear temperature scale. This means that, just like ordinary matter at near absolute zero temperature, the matter in the star will most likely freeze to a solid or become superfluid. While the outer parts of the star, the so-called crust, form an elastic lattice, the inner parts of the star are expected to be superfluid. In practice, this means that we must be able to model mixtures of superfluid neutrons and superconducting protons. It is also likely that we need to understand superfluid hyperons and color superconducting quarks. There are many hard physics questions that need to be considered if we are to make progress in this area. In particular, we need to make contact with microphysics calculations that determine the various parameters of such multi-fluid systems. However, we will ignore this aspect and focus on the various fluid models that have been used to describe relativistic superfluids.

One of the key features of a pure superfluid is that it is irrotational. Bulk rotation is mimicked by the formation of vortices, slim “tornadoes” representing regions where the superfluid degeneracy is broken. In practice, this means that one would often, e.g. when modeling global neutron star oscillations, consider a macroscopic model where one “averages” over a large number of vortices. The resulting model would closely resemble the standard fluid model. Of course, it is important to remember that the vortices are present on the microscopic scale, and that they may affect the values of various parameters in the problem. There are also unique effects that are due to the vortices, e.g. the mutual friction that is thought to be the key agent that counteracts relative rotation between the neutrons and protons in a superfluid neutron star core [79].

For the present discussion, let us focus on the simplest model problem of superfluid He4. We then have two fluids, the superfluid Helium atoms with particle number density nn and the entropy with particle number density ns. From the derivation in Section 6 we know that the equations of motion can be written

$${\nabla _\mu}n_{\rm{x}}^\mu = 0$$
(341)

and

$$n_{\rm{x}}^\mu {\nabla _{\left[ \mu \right.}}\mu _{\left. \nu \right]}^{\rm{x}} = 0.$$
(342)

To make contact with other discussions of the superfluid problem [25, 26, 27, 29], we will use the notation \({s^\mu} = n_{\rm{s}}^\mu\) and \({\Theta _v} = \mu _v^{\rm{s}}\). Then the equations that govern the motion of the entropy obviously become

$${\nabla _\mu}{s^\mu} = 0\qquad {\rm{and}}\qquad {s^\rho}\nabla {}_{\left[ \rho \right.}{\Theta _{\left. \nu \right]}} = 0.$$
(343)

Now, since the superfluid constituent is irrotational we will have

$$\nabla {}_{\left[ \sigma \right.}\mu _{\left. \nu \right]}^{\rm{n}} = 0.$$
(344)

Hence, the second equation of motion is automatically satisfied once we impose that the fluid is irrotational. The particle conservation law is, of course, unaffected. This example shows how easy it is to specify the equations that we derived earlier to the case when one (or several) components are irrotational. It is worth emphasizing that it is the momentum that is quantized, not the velocity.

It is instructive to contrast this description with the potential formulation due to Khalatnikov and colleagues [62, 69]. We can obtain this alternative formulation in the following way [26]. First of all, we know that the irrotationality condition implies that the particle momentum can be written as a gradient of a scalar potential α (say). That is, we have

$${V_\rho} = - {{\mu _\rho ^{\rm{n}}} \over m} = - {\nabla _\rho}\alpha .$$
(345)

Here m is the mass of the Helium atom and Vρ is what is traditionally (and somewhat confusedly) referred to as the “superfluid velocity”. We see that it is really a rescaled momentum. Next assume that the momentum of the remaining fluid (in this case, the entropy) is written

$$\mu _\rho ^{\rm{s}} = {\Theta _\rho} = {\kappa _\rho} + {\nabla _\rho}\phi .$$
(346)

Here κρ is Lie transported along the flow provided that sρκρ = 0 (assuming that the equation of motion (343) is satisfied). This leads to

$${{\rm{s}}^\rho}{\nabla _\rho}\phi = {s^\rho}{\Theta _\rho}.$$
(347)

There is now no loss of generality in introducing further scalar potentials β and γ such that κρ = βργ, where the potentials are constant along the flow-lines as long as

$${s^\rho}{\nabla _\rho}\beta = {s^\rho}{\nabla _\rho}\gamma = 0.$$
(348)

Given this, we have

$${\Theta _\rho} = {\nabla _\rho}\phi + \beta {\nabla _\rho}\gamma .$$
(349)

Finally, comparing to Khalatnikov’s formulation [62, 69] we define Θρ = −κωρ and let ϕκζ and βκβ. Then we arrive at the final equation of motion

$$- {{{\Theta _\rho}} \over \kappa} = {w_\rho} = - {\nabla _\rho}\zeta - \beta {\nabla _\rho}\gamma.$$
(350)

Equations (345) and (350), together with the standard particle conservation laws, are the key equations in the potential formulation. As we have seen, the content of this description is identical to that of the canonical variational picture that we have focussed on in this review. We have also seen how the various quantities can be related. Of course, one has to exercise some care in using this description. After all, referring to the rescaled momentum as the “superfluid velocity” is clearly misleading when the entrainment effect is in action.

16.2 Broken symmetries

In the context of heavy-ion collisions, models accounting for broken symmetries have sometimes been considered. At a very basic level, a model with a broken U(1) symmetry should correspond to the superfluid model described above. However, at first sight our equations differ from those used, for example, in [106, 92, 125]. Since we are keen to convince the reader that the variational framework we have discussed in this article is able to cover all cases of interest (in fact, we believe that it is more powerful than alternative formulations) a demonstration that we can reformulate our equations to get those written down for a system with a broken U(1) symmetry has some merit. The exercise is also of interest since it connects with models that have been used to describe other superfluid systems.

Take as the starting point the general two-fluid system. From the discussion in Section 10, we know that the momenta are in general related to the fluxes via

$$\mu _\rho ^{\rm{x}} = {{\mathcal B}^{\rm{x}}}n_\rho ^{\rm{x}} + {{\mathcal A}^{{\rm{xy}}}}n_\rho ^{\rm{y}}.$$
(351)

Suppose that, instead of using the fluxes as our key variables, we want to consider a “hybrid” formulation based on a mixture of fluxes and momenta. In the case of the particle-entropy system discussed in the previous Section 16.1, we can then use

$$n_\rho ^{\rm{n}} = {1 \over {{{\mathcal B}^{\rm{n}}}}}\mu _\rho ^{\rm{n}} - {{{{\mathcal A}^{{\rm{ns}}}}} \over {{{\mathcal B}^{\rm{n}}}}}n_\rho ^{\rm{s}}.$$
(352)

Let us impose irrotationality on the fluid by representing the momentum as the gradient of a scalar potential φ. With \(\mu _\rho ^{\rm{n}} = {\nabla _\rho}\varphi\) we get

$$n_\rho ^{\rm{n}} = {1 \over {{{\mathcal B}^{\rm{n}}}}}{\nabla _\rho}\varphi - {{{{\mathcal A}^{{\rm{ns}}}}} \over {{{\mathcal B}^{\rm{n}}}}}n_\rho ^{\rm{s}}.$$
(353)

Now take the preferred frame to be that associated with the entropy flow, i.e. introduce the unit four-velocity uμ such that \(n_{\rm{s}}^\mu = {n_{\rm{s}}}{u^\mu} = s{u^\mu}\). Then we have

$$n_\rho ^{\rm{n}} = n{u_\rho} - {V^2}{\nabla _\rho}\varphi ,$$
(354)

where we have defined

$$n \equiv - {{s{{\mathcal A}^{{\rm{ns}}}}} \over {{{\mathcal B}^{\rm{n}}}}}\qquad {\rm{and}}\qquad {V^2} = - {1 \over {{{\mathcal B}^{\rm{n}}}}}.$$
(355)

With these definitions, the particle conservation law becomes

$${\nabla _\rho}n_{\rm{n}}^\rho = {\nabla _\rho}\left({n{u^\rho} - {V^2}{\nabla ^\rho}\varphi} \right) = 0.$$
(356)

The chemical potential in the entropy frame follows from

$$\mu = - {u^\rho}\mu _\rho ^{\rm{n}} = - {u^\rho}{\nabla _\rho}\varphi .$$
(357)

One can also show that the stress-energy tensor becomes

$${T^\mu}_\nu = \Psi {\delta ^\mu}_\nu + (\Psi + \rho){u^\mu}{u_\nu} - {V^2}{\nabla ^\mu}\varphi {\nabla _\nu}\varphi ,$$
(358)

where the generalized pressure is given by Ψ as usual, and we have introduced

$$\Psi + \rho = {{\mathcal B}^{\rm{s}}}{s^2} + {{\mathcal A}^{{\rm{sn}}}}sn.$$
(359)

The equations of motion can now be obtained from \({\nabla _\mu}{T^\mu}_v = 0\). (Keep in mind that the equation of motion for x = n is automatically satisfied once we impose irrotationality, as in the previous section.) This essentially completes the set of equations written down by, for example, Son [106]. The argument in favor of this formulation is that it is close to the microphysics calculations, which means that the parameters may be relatively straightforward to obtain. Against the description is the fact that it is a (not very elegant) hybrid where the inherent symmetry amongst the different constituents is lost, and there is also a risk of confusion since one is treating a momentum as if it were a velocity.

17 Final Remarks

In writing this review, we have tried to discuss the different building blocks that are needed if one wants to construct a relativistic theory for fluids. Although there are numerous alternatives, we opted to base our discussion of the fluid equations of motion on the variational approach pioneered by Taub [108] and in recent years developed considerably by Carter [17, 19, 21]. This is an appealing strategy because it leads to a natural formulation for multi-fluid problems. Having developed the variational framework, we discussed applications. Here we had to decide what to include and what to leave out. Our decisions were not based on any particular logic, we simply included topics that were either familiar to us, or interested us at the time. That may seem a little peculiar, but one should keep in mind that this is a “living” review. Our intention is to add further applications when the article is updated. On the formal side, we could consider how one accounts for elastic media and magnetic fields, as well as technical issues concerning relativistic vortices (and cosmic strings). On the application side, we may discuss many issues for astrophysical fluid flows (like supernova core collapse, jets, gamma-ray bursts, and cosmology).

In updating this review we will obviously also correct the mistakes that are sure to be found by helpful colleagues. We look forward to receiving any comments on this review. After all, fluids describe physics at many different scales and we clearly have a lot of physics to learn. The only thing that is certain is that we will enjoy the learning process!