1 Introduction

1.1 Relevance of compact object encounters

The vast majority of stars in the Universe will eventually become a compact stellar object, either a white dwarf, a neutron star or a black hole. Our Galaxy therefore harbors large numbers of them, probably \(\sim 10^{10}\) white dwarfs, several \(10^{8}\) neutron stars and tens of millions of stellar-mass black holes. These objects stretch the physics that is known from terrestrial laboratories to extreme limits. For example, the structure of white dwarfs is governed by electron degeneracy pressure; they are therefore Earth-sized manifestations of the quantum mechanical Pauli principle. Neutron stars, on the other hand, reach in their cores multiples of the nuclear saturation density (\(2.6 \times 10^{14}\,{\rm g\,cm^{-3}}\)), which makes them excellent probes for nuclear matter theories. The dimensionless compactness parameter \(\mathcal{C} = (G/{c^2})(M/R) = {R_{\rm{S}}}/(2R)\), where M, R and \(R_{\rm S}\) are the mass, radius and Schwarzschild radius of the object, can be considered a measure of the strength of a gravitational field. It is proportional to the Newtonian gravitational potential and directly related to the gravitational redshift. For black holes, the parameter takes the value 0.5 at the Schwarzschild radius, and for neutron stars it is only moderately smaller, \(\mathcal{C} \approx 0.2\); both are gigantic in comparison to the solar value of \(\sim 10^{-6}\). General relativity has so far passed all tests to high accuracy (Will, 2014), but most of them have been performed in the limit of weak gravitational fields. Neutron stars and black holes, in contrast, offer the possibility to test gravity in the strong-field regime (Psaltis, 2008).

Compact objects that had time to settle into an equilibrium state possess a high degree of symmetry and are essentially perfectly spherically symmetric. Moreover, they are cold enough to be excellently described in a T = 0 approximation (since for all involved species i the thermal energies are much smaller than the involved Fermi energies, \(k T_i \ll E_{{\rm F},i}\)) and they are in chemical equilibrium. For such systems a number of interesting results can be obtained by (semi-) analytical methods. It is precisely because they are in a “minimum energy state” that such systems are hardly detectable in isolation, and certainly not out to large distances.

Compact objects, however, still possess — at least in principle — very large energy reservoirs and in cases where these reservoirs can be tapped, they can produce electromagnetic emission that is so bright that they can serve as cosmic beacons. For example, each nucleon inside of a carbon-oxygen white dwarf (CO WD) can potentially still release ≈ 0.9 MeV via nuclear burning to the most stable elements, or approximately \(1.7 \times 10^{51}\) erg per solar mass. The gravitational binding energy of a neutron star or black hole is even larger, \(E_{\rm grav} \sim GM^2/R = \mathcal{C}Mc^2 \approx 3.6 \times 10^{53}\,{\rm erg}\,(\mathcal{C}/0.2)(M/M_\odot)\). Tapping these gigantic energy reservoirs, however, usually requires special, often catastrophic circumstances, such as the collision or merger with yet another compact object. For example, the merger of a neutron star with another neutron star or with a black hole likely powers the sub-class of short gamma-ray bursts (GRBs), and mergers of two white dwarfs are thought to trigger type Ia supernovae. Such events are highly dynamic and do not possess enough symmetries to be accurately described by (semi-) analytical methods. They require a careful, three-dimensional numerical modelling of gravity, the hydrodynamics and the relevant “microphysical” ingredients such as neutrino processes, nuclear burning or a nuclear equation of state.

1.2 When/why SPH?

Many such modelling efforts have involved the smoothed particle hydrodynamics (SPH) method, originally suggested by Gingold and Monaghan (1977) and by Lucy (1977). There are a number of excellent numerical methods to deal with problems of ideal fluid dynamics, but each numerical method has its particular merits and challenges, and it is usually pivotal in terms of work efficiency to choose the best method for the problem at hand. For this reason, we want to collect here the major strengths of the SPH method, but also point out its challenges.

Probably the most outstanding feature of SPH is that it allows, in a straightforward way, the exact conservation of mass, energy, momentum and angular momentum by construction, i.e., independent of the numerical resolution. This property is ensured by appropriate symmetries in the SPH particle indices in the evolution equations, see Section 3, together with gradient estimates (usually kernel gradients, but see Section 3.2.4) that are anti-symmetric with respect to the exchange of two particle indices. For example, as will be illustrated in Section 4.1, the mass transfer between two stars and the resulting orbital dynamics are extremely sensitive to the accurate conservation of angular momentum. Eulerian methods are usually seriously challenged in accurately simulating the orbital dynamics in the presence of mass transfer, but this can be achieved when special measures are taken, see, for example, D’Souza et al. (2006) and Motl et al. (2007).

Another benefit that comes essentially for free is the natural adaptivity of SPH. Since the particles move with the local fluid velocity, they naturally trace the flow motion. As a corollary, simulations are not bound, as is usually the case in Eulerian simulations, to a pre-defined “simulation volume”, but instead they can follow whatever the astrophysical system decides to do, be it a collapse or a rapid expansion or both in different parts of the flow. This automatic “refinement on density” is also closely related to the fact that vacuum does not pose a problem: such regions are simply devoid of SPH particles and no computational resources are wasted on regions without matter. In Eulerian methods, in contrast, vacuum is usually treated as a “background fluid” in which the “fluid of interest” moves, and special attention needs to be paid to avoid artifacts caused by the interaction of these two fluids. For example, the neutron star surface of a binary neutron star system close to merger moves with an orbital velocity of ∼ 0.1c against the “vacuum” background medium. This can cause strong shocks and it may become challenging to disentangle, say, physical neutrino emission from the one that is entirely due to the artificial interaction with the background medium. There are, however, also hydrodynamic formulations that would in principle allow one to avoid such an artificial “vacuum” (Duez et al., 2003).

On the other hand, SPH’s “natural tendency to ignore vacuum” may also become a disadvantage in cases where the major interest of the investigation is a tenuous medium close to a dense one, say, for gas that is accreted onto the surface of a compact star. Such cases are probably more efficiently handled with an adaptive Eulerian method that can refine on physical quantities that are different from density. Several examples of hybrid approaches between SPH and Eulerian methods are discussed in Section 4.1.4.

SPH is Galilean invariant and thus independent of the chosen computing frame. The lack of this property can cause serious artefacts for Eulerian schemes if the simulation is performed in an inappropriate reference frame, see Springel (2010b) for a number of examples. Particular examples related to binary mergers have been discussed in New and Tohline (1997) and Swesty et al. (2000): simulating the orbital motion of a binary system in a space-fixed frame can lead to an entirely spurious inspiral and merger, while simulations in a corotating frame may yield accurate results. For SPH, in contrast, it does not matter in which frame the calculation is performed.

Another strong asset of SPH is its exact advection of fluid properties: an attribute attached to an SPH particle is simply carried along as the particle moves. This makes it particularly easy to, say, post-process the nucleosynthesis from a particle trajectory, without any need for additional “tracer particles”. In Eulerian methods high velocities with respect to the computational grid can seriously compromise the accuracy of a simulation. For SPH, this is essentially a “free lunch”, see for example Figure 9, where a high-density wedge is advected with a velocity of 0.9999 c through the computational domain.

The particle nature of SPH also allows for a natural transition to n-body methods. For example, if ejected material from a stellar encounter becomes ballistic so that hydrodynamic forces are negligible, one may decide to simply follow the long-term evolution of point particles in a gravitational potential instead of a fluid. Such a treatment can make time scales accessible that cannot be reached within a hydrodynamic approach (Faber et al., 2006a; Rosswog, 2007a; Lee and Ramirez-Ruiz, 2007; Ramirez-Ruiz and Rosswog, 2009; Lee et al., 2010).

SPH can straightforwardly be combined with highly flexible and accurate gravity solvers such as tree methods. Such approaches are extremely powerful for problems in which fragmentation with complicated geometry occurs due to the interplay of gas dynamics and self-gravity, say in star or planet formation. Many successful examples of couplings of SPH with trees exist in the literature. Maybe the first was the use of the Barnes-Hut oct-tree algorithm (Barnes and Hut, 1986) within the TreeSPH code (Hernquist and Katz, 1989), closely followed by the implementation of a mutual nearest neighbor tree due to Press for the simulation of white-dwarf encounters (Benz et al., 1990). By now, a number of very fast and flexible tree variants are routinely used in SPH (e.g., Dave et al., 1997; Carraro et al., 1998; Springel et al., 2001; Wadsley et al., 2004; Springel, 2005; Wetzstein et al., 2009; Nelson et al., 2009) and for a long time SPH-tree combinations were at the leading edge, ahead of Eulerian approaches that have only caught up later, see for example Kravtsov (1999); Teyssier (2002). More recently, also ideas from the fast multipole method have been borrowed (Dehnen, 2000, 2002; Gafton and Rosswog, 2011; Dehnen, 2014) that allow for a scaling with the particle number N that is close to O(N) or even below, rather than the O(N log N) scaling that is obtained for traditional tree algorithms.

But like any other numerical method, SPH also has to face some challenges. One particular example that has received a lot of attention in recent years is the struggle of standard SPH formulations to properly deal with some fluid instabilities (Thacker et al., 2000; Agertz et al., 2007; Springel, 2010a; Read et al., 2010). As will be discussed in Section 3.1, the problem is caused by surface tension forces that can emerge near contact discontinuities. This insight triggered a flurry of suggestions on how to improve this aspect of SPH (Price, 2008; Cha et al., 2010; Read et al., 2010; Heß and Springel, 2010; Valcke et al., 2010; Junk et al., 2010; Abel, 2011; Gaburov and Nitadori, 2011; Murante et al., 2011; Read and Hayfield, 2012; Hopkins, 2013; Saitoh and Makino, 2013). Arguably the most robust way to deal with this is the reformulation of SPH in terms of different volume elements as discussed in Section 3.1; an example is shown in Section 3.7.3.

Another notoriously difficult field is the inclusion of magnetic fields into SPH. This is closely related to preserving the \(\nabla \cdot \vec B = 0\) constraint during the MHD evolution, but here, too, there has been substantial progress in recent years (Børve et al., 2001, 2004; Price and Monaghan, 2004a, b, 2005; Price and Rosswog, 2006; Børve et al., 2006; Rosswog, 2007a; Dolag and Stasyszyn, 2009; Bürzle et al., 2011a, b; Tricco and Price, 2012, 2013). Moreover, magnetic fields may be very important in regions of low density, and due to SPH’s “tendency to ignore vacuum”, such regions are poorly sampled. For a detailed discussion of the current state of SPH and magnetic fields we refer to the recent review of Price (2012).

Artificial dissipation is also often considered a major drawback. However, if dissipation is steered properly, see Section 3.2.5, the performance should be very similar to that of approximate Riemann solver approaches. A Riemann solver approach may be more appealing from an aesthetic point of view, though, and a number of such approaches have been successfully implemented and tested (Inutsuka, 2002; Cha and Whitworth, 2003; Cha et al., 2010; Murante et al., 2011; Puri and Ramachandran, 2014).

Contrary to what was often claimed in the early literature, however, SPH is not necessarily a very efficient method. It is true that if only the bulk matter distribution is of interest, one can often obtain robust results already with an astonishingly small number of SPH particles. To obtain accurate results for the thermodynamics of the flow, however, usually still requires large particle numbers. In large SPH simulations, it becomes a serious challenge to maintain cache coherence, since particles that were initially close in space and computer memory can later become completely scattered throughout different (mostly slow) parts of the memory. This can be improved by using cache-friendly variables (aggregating data that are frequently used together into common variables, even if they have not much of a physical connection) and/or by various sorting techniques to re-order particles in memory according to their spatial location. This can be done — in the simplest case — by using a uniform hash grid, but in many astrophysical applications hierarchical structures such as trees are highly preferred for sorting the particles, see e.g., Warren and Salmon (1995); Springel (2005); Nelson et al. (2009); Gafton and Rosswog (2011). While such measures can easily improve the performance by factors of a few, they come with some bookkeeping overhead, which, together with, say, individual time steps and particle sinks or sources, can make codes rather unreadable and error-prone.

Finally, we will briefly discuss in Section 3.6 the construction of accurate initial conditions where the particles are in a true (and stable) numerical equilibrium. This is yet another non-trivial SPH issue.

1.3 Roadmap through this review

This text is organized as follows:

  • In Section 2 we discuss those kernel approximation techniques that are needed for the SPH discretizations presented later. We begin with the basics and then focus on issues that are necessary to appreciate some of the recent improvements of the SPH method. Some of these issues are by their very nature rather technical. Readers who are familiar with basic kernel interpolation techniques could skip this section at first reading.

  • In Section 3 we discuss SPH discretizations of ideal fluid dynamics for both the Newtonian and the relativistic case. Since several reviews have appeared in recent years, we only concisely summarize the more traditional formulations and focus on recent improvements of the method.

  • The last section, Section 4, is dedicated to astrophysical applications. We begin with double white-dwarf encounters (Section 4.1) and then turn to encounters between two neutron stars and between a neutron star and a stellar-mass black hole (Section 4.2). In each of these cases, our emphasis is on the more common, gravitational wave-driven binary systems, but we also discuss dynamical collisions as they may occur, for example, in a globular cluster. Sections 2 and 3 provide the basis for an understanding of SPH as a numerical method and they should pave the way to the most recent developments. The less numerically inclined reader may, however, just catch the most basic SPH ideas from Section 3.2.1 and then jump to his/her preferred astrophysical topic in Section 4. The modular structure of Sections 2 and 3 should allow, whenever needed, for a selective consultation on the more technical issues of SPH.

2 Kernel Approximation

SPH uses kernel approximation techniques to calculate densities and pressure gradients from a discrete set of freely moving particles, see Section 3. Since kernel smoothing is one of the key ideas of the SPH technique, we will begin by collecting a number of basic relations that are necessary to understand how the method works and to judge the accuracy of different approaches. The discussed approximation techniques will be applied in Section 3, both for the traditional formulations of SPH and for the recent improvements of the method.

Figure 1:

Sketch of the interaction of a particle a with its neighboring particles. To avoid a computationally expensive interaction of each particle with all other particles, kernels with finite support (indicated by the shaded region) are usually used. The support size of a particle a is set as a multiple, \(Q h_a\), of its smoothing length, \(h_a\). Often, Q = 2 is used.

2.1 Function interpolation

A kernel-smoothed representation of a function A can be obtained via

$$\langle A \rangle (\vec r) = \int A ({\vec r^{\,\prime}})\,W(\vec r - {\vec r^{\,\prime}},h)\,dV^{\prime},$$
(1)

where W is a smoothing kernel. The width of the kernel function is determined by the quantity h, which, in an SPH context, is usually referred to as the “smoothing length”. Obviously, the kernel has the dimension of an inverse volume; it should be normalized to unity, \(\int {W({{\vec r}^{\,\prime}},h)\,dV^{\prime} = 1} \), and it should possess the δ-property,

$$\mathop {\lim }\limits_{h \to 0} \left\langle A \right\rangle (\vec r) = A(\vec r),$$
(2)

so that the original function is reproduced exactly in the limit of a vanishing kernel support. The choice of kernel functions is discussed in more detail in Section 2.3; for now we only assume that the kernels are radial, i.e., \(W(\vec r) = W(\vert\vec r\vert)\), which leads to the very useful property

$${\nabla _a}W(\vert {\vec r_a} - {\vec r_b}\vert, h) = - {\nabla _b}W(\vert {\vec r_a} - {\vec r_b}\vert, h)$$
(3)

that allows for a straightforward enforcement of Nature’s conservation laws. This topic is laid out in much detail in Section 2.4 of Rosswog (2009) and we refer the interested reader to this text for a further discussion and technical details. There are plausible arguments that suggest, for example, spheroidal kernels (Fulbright et al., 1995; Shapiro et al., 1996), but such approaches usually violate angular momentum conservation. In the following we will only discuss radial kernels. Figure 1 sketches how a particle interacts with its neighbor particles via a radial kernel of support radius \(Q h_a\). Let us assume that a function A is known at the positions \({\vec r_b}\), \({A_b} = A({\vec r_b})\). One can then approximate Eq. (1) as

$$\left\langle A \right\rangle (\vec r) \simeq \sum\limits_b {{V_b}} \;{A_b}W(\vec r - {\vec r_b},h),$$
(4)

where \(V_b\) is the volume associated with the fluid parcel (“particle”) at \({\vec r_b}\). In nearly all SPH formulations that are used in astrophysics the particle volume is estimated as \(V_b = m_b/\rho_b\), with \(m_b\) being the (fixed) particle mass and \(\rho_b\) its mass density. While this is a straightforward choice, it is by no means the only option. Recently, SPH formulations have been proposed (Saitoh and Makino, 2013; Hopkins, 2013; Rosswog, 2015) that are based on different volume elements and that can yield substantial improvements. In the following, we will therefore keep a general volume element \(V_b\) in most equations and only occasionally give the equations for specific volume element choices.
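To make the discrete approximation Eq. (4) concrete, here is a minimal 1D sketch (an illustration added here, not part of the original text; particle number, spacing and the choice h = 1.3 dx are arbitrary assumptions). It uses the standard cubic spline kernel of Section 2.3 and the particle spacing as the volume element:

```python
import numpy as np

def w_cubic(q):
    """Shape w4(q) of the M4 cubic spline kernel, support q < 2."""
    return np.where(q < 1.0, 1.0 - 1.5*q**2 + 0.75*q**3,
           np.where(q < 2.0, 0.25*(2.0 - q)**3, 0.0))

def W(r, h):
    """1D-normalized kernel W(|r - r'|, h); sigma = 2/3 in 1D (see Section 2.3)."""
    return (2.0 / 3.0) / h * w_cubic(np.abs(r) / h)

# particles on a regular 1D line in [0, 1]; volume element V_b = particle spacing
N   = 100
dx  = 1.0 / N
r_b = (np.arange(N) + 0.5) * dx
V_b = np.full(N, dx)
A_b = np.sin(2.0 * np.pi * r_b)      # known function values A_b = A(r_b)
h   = 1.3 * dx                       # smoothing length

# kernel-smoothed estimate <A>(r) at an arbitrary point, Eq. (4)
r = 0.37
A_smoothed = np.sum(V_b * A_b * W(r - r_b, h))
print(A_smoothed, np.sin(2.0 * np.pi * r))   # agree up to the smoothing error
```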

The assessment of the kernel approximation quality in a practical SPH simulation is actually a rather non-trivial problem. Such an analysis is straightforward for the continuum kernel approximation (Benz, 1990; Monaghan, 1992), or, in discrete form, for particles that are distributed on a regular lattice (Monaghan, 2005), but both of these estimates have limited validity for practical SPH simulations. In practice, a discrete kernel approximation is used and particles are neither located on a regular grid nor are they randomly distributed; they are “disordered, but in an orderly way” (Monaghan, 2005). The exact particle distribution depends on the dynamics of the flow and on the kernel that is used. An example of a particle distribution that arises in a practical simulation (a Kelvin-Helmholtz instability) is shown in Figure 2. Note in particular that the particles sample space in a regular way and no “clumping” occurs as would be the case for random particle positions. Why the particles arrange in such a regular way is discussed in Section 3.2.3. Apart from the particle distribution, the accuracy depends, of course, on the chosen kernel function W, and this is discussed further in Section 2.3 (see also Figure 6). Since the kernel function determines the distribution into which the particles settle, the kernel function and the particle distribution cannot be considered as independent entities. This makes the analytical accuracy assessment of SPH very difficult and also suggests interpreting results based on a prescribed, fixed particle distribution with some caution. In any case, results should be further scrutinized in practical benchmark tests with known solutions. Initial conditions are another very important issue for the accuracy of SPH simulations; they are further discussed in Section 3.6.

Figure 2:

SPH particle distribution in a Kelvin-Helmholtz instability test. Note that the particles are not distributed randomly, but instead show a high degree of regularity. This is crucial for the accuracy of the SPH kernel interpolation, see the quality indicators Q1 – Q4 in Eqs. (6) and (9). This example uses a high-order Wendland kernel, see Section 2.3, which enforces a particularly regular particle distribution.

To obtain indicators for the quality of a function interpolation, it is instructive to assume that the function A is smooth in the neighborhood of the particle of interest, a, and that it can be expanded in a Taylor-series around \({\vec r_a}\):

$${A_b} = {A_a} + {(\nabla A)_a} \cdot ({\vec r_b} - {\vec r_a}) + {1 \over 2}{({\partial _i}{\partial _j}A)_a}{({\vec r_b} - {\vec r_a})^i}{({\vec r_b} - {\vec r_a})^j} + O(r_{ab}^3),$$
(5)

where \({r_{ab}} = \vert{\vec r_a} - {\vec r_b}\vert\) and the Einstein summation convention has been used. Inserting the expansion into the discrete approximation Eq. (4) and requiring that \(\langle A \rangle_a\) should be a close approximation of \(A_a\), one finds the “interpolation quality indicators”

$${\mathcal{Q}_1}:\sum\limits_b {{V_b}} {W_{ab}}({h_a}) \approx 1\quad {\rm{and}}\quad {\mathcal{Q}_2}:\sum\limits_b {{V_b}} ({\vec r_b} - {\vec r_a}){W_{ab}}({h_a}) \approx 0,$$
(6)

where \({W_{ab}}({h_a}) = W(\vert{\vec r_a} - {\vec r_b}\vert,{h_a})\). The first relation simply states that the particles should provide a good approximation to a partition of unity.
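Both indicators are easy to probe numerically. The following sketch (illustrative only; lattice spacing and the value η = 1.3 are arbitrary assumptions) evaluates Q1 and Q2 on the same type of hexagonal lattice that is used in the experiments of Sections 2.2.5 and 2.3.3:

```python
import numpy as np

def W2D(d, h):
    """2D cubic spline kernel, sigma = 10/(7 pi), see Section 2.3."""
    q = d / h
    w = np.where(q < 1.0, 1.0 - 1.5*q**2 + 0.75*q**3,
        np.where(q < 2.0, 0.25*(2.0 - q)**3, 0.0))
    return 10.0 / (7.0 * np.pi * h**2) * w

# hexagonal lattice in [-1, 1] x [-1, 1]
dx   = 0.05
rows = np.arange(-1.0, 1.0, dx * np.sqrt(3.0) / 2.0)
pos  = np.array([(x + (0.5 * dx if j % 2 else 0.0), y)
                 for j, y in enumerate(rows) for x in np.arange(-1.0, 1.0, dx)])
V    = dx * dx * np.sqrt(3.0) / 2.0      # area per lattice site

a   = np.argmin(np.sum(pos**2, axis=1))  # particle closest to the origin
d   = np.linalg.norm(pos - pos[a], axis=1)
h   = 1.3 * np.sqrt(V)                   # h = eta * V^(1/D) with eta = 1.3, D = 2
Wab = W2D(d, h)

Q1 = np.sum(V * Wab)                                    # Eq. (6): should be ~ 1
Q2 = np.sum(V * (pos - pos[a]) * Wab[:, None], axis=0)  # Eq. (6): should be ~ 0
print(Q1, Q2)
```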

2.2 Function derivatives

We restrict the discussion here to the first-order derivatives that we will need in Section 3; for higher-order derivatives that are required for some applications we refer the interested reader to the literature (Español and Revenga, 2003; Monaghan, 2005; Price, 2012). There are various ways to calculate gradients for function values that are known at given particle positions. Obviously, the accuracy of the gradient estimate is of concern, but so is the symmetry in the particle indices, since it can allow one to enforce exact numerical conservation of physically conserved quantities. An accurate gradient estimate without built-in conservation can be less useful in practice than a less accurate estimate that exactly obeys Nature’s conservation laws, see Section 5 in Price (2012) for a striking example of how a seemingly good gradient without built-in conservation can lead to pathological particle distributions. The challenge is to combine exact conservation with an accurate gradient estimate.

2.2.1 Direct gradient of the approximant

The straightforward (though not the most accurate) gradient estimate is the direct gradient of the interpolant Eq. (4)

$${(\nabla A)^{(0)}}(\vec r) = \sum\limits_b {{V_b}} \;{A_b}\nabla W(\vec r - {\vec r_b},\;h).$$
(7)

Proceeding as above, we can insert again Eq. (5) into Eq. (7),

$$(\nabla A)_a^{(0)} = {A_a}\sum\limits_b {{V_b}} {\nabla _a}{W_{ab}}({h_a}) + \sum\limits_b {{V_b}} ({(\nabla A)_a} \cdot ({\vec r_b} - {\vec r_a})){\nabla _a}{W_{ab}}({h_a}) + \ldots,$$
(8)

which delivers the “gradient quality indicators”

$${\mathcal{Q}_3}:\sum\limits_b {{V_b}} {\nabla _a}{W_{ab}}({h_a}) \approx 0\quad {\rm{and}}\quad {\mathcal{Q}_4}:\sum\limits_b {{V_b}} {({\vec r_b} - {\vec r_a})^i}{({\nabla _a}{W_{ab}}({h_a}))^j} \approx {\delta ^{ij}}$$
(9)

from the requirement that the estimate closely approximates the gradient of A. Q3 is simply the gradient of the quality indicator Q1 and therefore again an expression of the partition of unity requirement. Note, however, that even when all function values are the same, \(A_b = A_0\), the gradient estimate does not necessarily vanish exactly. This is a direct consequence of Eq. (6) not being an exact partition of unity. Note that, therefore, the reproduction of even constant functions is not enforced; this is often referred to as a lack of zeroth-order consistency. There are, however, benefits from a particle distribution noticing its imperfections, since, together with exact conservation, this provides an automatic “re-meshing” mechanism that drives the particles into a regular distribution (Price, 2012). Numerical experiments show that the long-term evolution of a particle distribution with such a re-meshing mechanism built in leads to much more accurate results than seemingly more accurate gradient estimates without such a mechanism. The reason is that in the first case the particles arrange themselves in a regular configuration (such as in Figure 2) where the quality indicators are fulfilled to a high degree, while in the latter case pathological particle distributions can develop that sample the fluid only very poorly.

2.2.2 Constant-exact gradients

An immediate way to improve the gradient estimate is to simply subtract the gradient of the residual in the approximate partition of unity, see Eq. (6), or, equivalently, the first error term in Eq. (8). The resulting gradient estimate

$$(\nabla A)_a^{(1)} = \sum\limits_b {{V_b}} ({A_b} - {A_a}){\nabla _a}{W_{ab}}({h_a}),$$
(10)

now manifestly vanishes for a constant function A, i.e., if all \(A_k\) are the same, independent of the particle distribution.
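A small numerical experiment illustrates the difference between Eqs. (7) and (10); the 1D setup, the jitter amplitude and all parameter values below are purely illustrative assumptions:

```python
import numpy as np

def dW1D(x, h):
    """x-derivative of the 1D cubic spline kernel (sigma = 2/3)."""
    q  = np.abs(x) / h
    wp = np.where(q < 1.0, -3.0*q + 2.25*q**2,
         np.where(q < 2.0, -0.75*(2.0 - q)**2, 0.0))
    return (2.0 / 3.0) / h**2 * wp * np.sign(x)

# slightly jittered 1D particles carrying a constant field A = 5
rng = np.random.default_rng(1)
N   = 200
dx  = 1.0 / N
r   = (np.arange(N) + 0.5) * dx + 0.1 * dx * rng.standard_normal(N)
V   = np.full(N, dx)
A   = np.full(N, 5.0)
h   = 1.3 * dx

a  = N // 2
gW = dW1D(r[a] - r, h)
g_direct = np.sum(V * A * gW)            # Eq. (7): small, but not exactly zero
g_const  = np.sum(V * (A - A[a]) * gW)   # Eq. (10): identically zero
print(g_direct, g_const)
```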

2.2.3 Linear-exact gradients

Exact gradients of linear functions can be constructed in the following way (Price, 2004). Start with the RHS of Eq. (7) at \({\vec r_a}\) and again insert the Taylor expansion of \(A_b\) around \({\vec r_a}\)

$$\sum\limits_b {{V_b}{A_b}{\nabla _a}{W_{ab}} = \sum\limits_b {{V_b}\{ {A_a} + {{(\nabla A)}_a} \cdot ({{\vec r}_b} - {{\vec r}_a}) + \ldots \} {\nabla _a}{W_{ab}}} } $$
(11)

which can be re-arranged into

$$\sum\limits_b {{V_b}({A_b} - {A_a}){{({\nabla _a}{W_{ab}})}^i} = (\nabla A)_a^k{M^{ki}}}.$$
(12)

Here the sum over the common index k is implied and the matrix is given by

$${M^{ki}} = \sum\limits_b {{V_b}} {({\vec r_b} - {\vec r_a})^k}{({\nabla _a}{W_{ab}})^i}.$$
(13)

Solving for the gradient component then yields

$$(\nabla A)_a^k = {({M^{ki}})^{ - 1}}\sum\limits_b {{V_b}} ({A_b} - {A_a}){({\nabla _a}{W_{ab}})^i}.$$
(14)

Note that the sum is the same as in the constant-exact gradient estimate Eq. (10), corrected by the matrix \(M^{-1}\), which depends on the properties of the particle distribution. It is straightforward to double-check that this exactly reproduces the gradient of a linear function. Assume that \(A(\vec r) = {A_0} + {(\nabla A)_0} \cdot (\vec r - {\vec r_0})\) so that

$${A_b} - {A_a} = {(\nabla A)_0} \cdot ({\vec {r_b}} - {\vec {r_a}}).$$
(15)

If this is inserted into Eq. (14) one finds, as expected,

$$(\nabla A)_a^k = {({M^{ki}})^{ - 1}}\sum\limits_b {(\nabla A)_0^l} {({\vec {r_b}} - {\vec {r_a}})^l}{({\nabla _a}{W_{ab}})^i} = (\nabla A)_0^l{({M^{ki}})^{ - 1}}{M^{li}} = (\nabla A)_0^k.$$
(16)

Obviously, the linear-exact gradient comes at the price of inverting a D×D matrix in D dimensions. However, since the inversion of this small matrix can be done analytically, it does not represent a major computational burden.
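The following 2D sketch implements Eqs. (13) and (14) for a deliberately disordered particle cloud (the cubic spline kernel, the particle number and the crude equal-volume assignment are illustrative assumptions, and numpy's generic linear solve is used instead of the analytic inverse mentioned above):

```python
import numpy as np

def gradW2D(rab, h):
    """Gradient (w.r.t. r_a) of the 2D cubic spline kernel, rows are r_a - r_b."""
    d  = np.linalg.norm(rab, axis=-1)
    q  = d / h
    wp = np.where(q < 1.0, -3.0*q + 2.25*q**2,
         np.where(q < 2.0, -0.75*(2.0 - q)**2, 0.0))
    sigma = 10.0 / (7.0 * np.pi * h**2)
    with np.errstate(invalid='ignore', divide='ignore'):
        fac = np.where(d > 0.0, sigma * wp / (h * d), 0.0)
    return fac[:, None] * rab

# disordered particle cloud carrying the linear field A = 2 + 3x - y
rng = np.random.default_rng(2)
pos = rng.uniform(-1.0, 1.0, size=(4000, 2))
V   = np.full(len(pos), 4.0 / len(pos))        # crude equal-volume assignment
A   = 2.0 + 3.0*pos[:, 0] - pos[:, 1]
h   = 3.0 * np.sqrt(4.0 / len(pos))

a   = np.argmin(np.sum(pos**2, axis=1))        # particle near the center
rab = pos[a] - pos
gW  = gradW2D(rab, h)

# Eq. (13): M^{ki} = sum_b V_b (r_b - r_a)^k (grad_a W_ab)^i
M = np.einsum('b,bk,bi->ki', V, -rab, gW)
# Eq. (14): solve for the linear-exact gradient components
S = np.sum(V[:, None] * (A - A[a])[:, None] * gW, axis=0)
print(np.linalg.solve(M.T, S))                 # recovers (3, -1), cf. Eq. (16)
```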

2.2.4 Integral-based gradients

Integral-based higher-order derivatives (Brookshaw, 1985; Monaghan, 2005) have for a long time been appreciated for their insensitivity to particle noise. Surprisingly, integral-based estimates for first-order derivatives have only recently been explored (García-Senz et al., 2012; Cabezón et al., 2012). Start from the vector

$${\tilde{\vec I}_A}(\vec r) = \int {[A(\vec r) - A(\vec r^{\,\prime})]} \,(\vec r - \vec r^{\,\prime})\,W(\vert \vec r - \vec r^{\,\prime}\vert,\;h)\,dV^{\prime}$$
(17)

and, similar to above, insert a first-order Taylor expansion of \(A({\vec r^\prime})\) around \(\vec r\) (sum over k)

$$\tilde I_A^i(\vec r) = \int {\left[ {(\nabla A)_{\vert \vec r}^k{{(\vec r - \vec r^{\prime})}^k} + O({{(\vec r - \vec r^{\prime})}^2})} \right]} \;\;{(\vec r - \vec r^{\prime})^i}W(\vert \vec r - \vec r^{\prime}\vert,\;h)dV^{\prime},$$
(18)

so that the gradient component representation (exact for linear functions) is given by

$$(\nabla A)_{\vert \vec r}^k = {\tilde C^{ki}}(\vec r)\tilde I_A^i(\vec r),$$
(19)

where the matrix \({\tilde C^{ki}}\) is the inverse of

$${\tilde T^{ki}}(\vec r) = {\tilde T^{ik}}(\vec r) = \int {{{(\vec r - \vec r^{\,\prime})}^k}{{(\vec r - \vec r^{\,\prime})}^i}} \,W(\vert \vec r - \vec r^{\,\prime}\vert,\;h)\,dV^{\prime}.$$
(20)

The matrix \({\tilde T^{ki}}\) only depends on the positions while \({\tilde{\vec I}_A}\) contains the function to be differentiated. If we replace the integrals by summations (and drop the tilde), the integral-based gradient estimate reads (sum over d)

$$(\nabla A)_{\vert \vec r}^k = {C^{kd}}(\vec r)I_A^d(\vec r),$$
(21)

where \((C^{kl}) = {({T^{kl}})^{ - 1}}\) and

$${T^{kl}}(\vec r) = \sum\limits_b {{V_b}W(\vert \vec r - {{\vec r}_b}\vert,h){{(\vec r - {{\vec r}_b})}^k}} {(\vec r - {\vec r_b})^l}$$
(22)
$$I_A^l(\vec r) = \sum\limits_b {{V_b}} W(\vert \vec r - {\vec r_b}\vert,\;h)\;[A(\vec r) - A({\vec r_b})]\;{(\vec r - {\vec r_b})^l}.$$
(23)

It is worth mentioning that for a radial kernel its gradient can be written as

$${\nabla _a}{W_{ab}}({h_a}) = ({\vec r_b} - {\vec r_a}){Y_{ab}}({h_a}),$$
(24)

where Y is also a valid, positive definite and compactly supported kernel function. Therefore, if Eq. (24) is inserted into Eqs. (13) and (14), one recovers Eq. (21), i.e., the linear-exact and the integral-based gradients are actually equivalent for radial kernels.

For a good interpolation, where the quality indicator Q2 in Eq. (6) vanishes to a good approximation, one can drop the term containing \(A(\vec r)\)

$$I_A^l(\vec r) \simeq - \sum\limits_b {{V_b}\,W(\vert \vec r - {{\vec r}_b}\vert,\;h)\;{A_b}\,{{(\vec r - {{\vec r}_b})}^l}},$$
(25)

so that the gradient estimate in integral approximation (IA) explicitly reads (sum over d)

$$(\nabla A)_{{\rm{I}}A}^k = \sum\limits_b {{V_b}} {A_b}{C^{kd}}{({\vec r_b} - \vec r)^d}W(\vert \vec r - {\vec r_b}\vert,\;h) \equiv \sum\limits_b {{V_b}} {A_b}G_b^k(\vec r).$$
(26)

Comparison of Eq. (26) with Eq. (7) suggests that \({\vec G_b}(\vec r)\) takes over the role that is usually played by \(\nabla W(\vec r - {\vec r_b},h)\):

$$\nabla W(\vec r - {\vec r_b},\;h) \rightarrow {\vec G_b}(\vec r).$$
(27)

While being slightly less accurate than Eqs. (21) – (23), see Section 2.2.5, the approximation Eq. (26) has the major advantage that \(\vec G\) is anti-symmetric with respect to the exchange of \(\vec r\) and \({\vec r_b}\), just like the direct gradient of the radial kernel, see Eq. (3). Therefore, it allows one to enforce exact momentum conservation in a straightforward way, though with a substantially more accurate gradient estimate. In a large number of benchmark tests this type of gradient has turned out to be superior to the traditional SPH approach.
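A sketch of this IA gradient, combining Eq. (22) with the conservative approximation Eq. (26); kernel choice and setup parameters are again purely illustrative assumptions:

```python
import numpy as np

def W2D(d, h):
    """2D cubic spline kernel, sigma = 10/(7 pi)."""
    q = d / h
    w = np.where(q < 1.0, 1.0 - 1.5*q**2 + 0.75*q**3,
        np.where(q < 2.0, 0.25*(2.0 - q)**3, 0.0))
    return 10.0 / (7.0 * np.pi * h**2) * w

rng = np.random.default_rng(3)
pos = rng.uniform(-1.0, 1.0, size=(4000, 2))
V   = np.full(len(pos), 4.0 / len(pos))
A   = 2.0 + 3.0*pos[:, 0] - pos[:, 1]           # linear test function
h   = 3.0 * np.sqrt(4.0 / len(pos))

a   = np.argmin(np.sum(pos**2, axis=1))
dr  = pos[a] - pos                              # (r - r_b), evaluated at r = r_a
Wab = W2D(np.linalg.norm(dr, axis=1), h)

# Eq. (22): T^{kl} = sum_b V_b W_ab (r - r_b)^k (r - r_b)^l, then C = T^{-1}
T = np.einsum('b,b,bk,bl->kl', V, Wab, dr, dr)
C = np.linalg.inv(T)

# Eq. (26): (grad A)^k_IA = sum_b V_b A_b C^{kd} (r_b - r)^d W_ab
G = (V * Wab)[:, None] * (-dr @ C.T)            # G_b, anti-symmetric in r <-> r_b
print(np.sum(A[:, None] * G, axis=0))           # ~ (3, -1), up to the dropped Q2 term
```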

2.2.5 Accuracy assessment of different gradient prescriptions

We perform a numerical experiment to measure the accuracy of different gradient prescriptions for a regular particle distribution. Since, as outlined above, kernel function and particle distribution are not independent entities, such tests at fixed particle distribution are a useful accuracy indicator, but they should be backed up by tests in which the particles can evolve dynamically. We place SPH particles on a 2D hexagonal lattice in [−1, 1] × [−1, 1]. Each particle is assigned a pressure value according to P(x, y) = 2 + x and we use the different prescriptions discussed in Sections 2.2.1 – 2.2.4 to numerically measure the pressure gradient. The relative average errors, \(\epsilon = {N^{ - 1}}\sum\nolimits_{b = 1}^N {{\epsilon _b}}\) with \({\epsilon _b} = \vert{({\partial _x}P)_b} - 1\vert\), as a function of the kernel support are shown in Figure 3. The quantity η determines the smoothing length via \(h_b = \eta\,(m_b/\rho_b)^{1/D}\), where D denotes the number of spatial dimensions. For this numerical experiment the standard cubic spline kernel M4, see Section 2.3, has been used. We apply the “standard” SPH gradient, Eq. (7), the approximate integral-based gradient (“IA gradient”), Eq. (26), the full integral-based gradient (“fIA gradient”), Eq. (21), and the linear-exact gradient (“LE gradient”), Eq. (14). Note that for this regular particle distribution, the constant-exact gradient, Eq. (10), is practically indistinguishable from the standard prescription, since it differs only by a term that is proportional to the first “gradient quality indicator”, Eq. (9), which vanishes here to a very good approximation. Clearly, the more sophisticated gradient prescriptions yield vast improvements of the gradient accuracy. Both the LE and fIA gradients reproduce the exact value to within machine precision; the IA gradient, which has the desired anti-symmetry property, improves the gradient accuracy of the standard SPH estimate by approximately 10 orders of magnitude.

As expected from the term that was neglected in Eq. (23), the accuracy of the “IA gradient” deteriorates substantially (to an accuracy level comparable to the standard prescription) for a less regular particle distribution, see Rosswog (2015). Therefore, the usefulness of the IA gradient depends on how regularly the SPH particles are distributed in a practical simulation. The original work by García-Senz et al. (2012) and our own extensive numerical experiments (Rosswog, 2015), however, show that the IA gradient is highly superior to the standard kernel gradient prescription also in practical numerical tests.

Figure 3:

Accuracy of different gradient prescriptions. Shown are the average relative errors in the numerical determination of a pressure gradient for different gradient prescriptions, see Eq. (7) for “std. SPH gradient”, Eq. (26) for “IA gradient”, Eq. (21) for “fIA gradient” and Eq. (14) for “LE gradient”. The more sophisticated gradient prescriptions offer vast improvements over the standard SPH gradient. However, of those latter ones only the IA gradient has the same symmetry properties as the standard SPH prescription, which allows for a straightforward enforcement of Nature’s conservation laws. Image adapted from Rosswog (2015).

2.3 Which kernel function to choose?

In the following, we briefly explore the properties of a selection of kernel functions. We write normalized SPH kernels in the form

$$W(\vert \vec r - \vec r^{\prime}\vert,\;h) = {\sigma \over {{h^D}}}w(q),$$
(28)

where h is the smoothing length that determines the support of W, \(q = \vert\vec r - {\vec r^{\,\prime}}\vert/h\), and D is the number of spatial dimensions. The normalizations are obtained from

$$\sigma ^{ - 1} = \left\{ {\begin{array}{*{20}c} {2\int_0^Q {w(q)\,dq} } & {{\rm in\;1D}} \\ {2\pi \int_0^Q {w(q)\,q\, dq} } & {{\rm in\;2D}} \\ {4\pi \int_0^Q {w(q)\,q^2\, dq} } & {{\rm in\;3D},} \\ \end{array} } \right.$$
(29)

where Q is the kernel support (= 2 for most of the following kernels). In the following we will give the kernels in the form that is usually found in the literature. For a fair comparison in terms of computational effort/neighbor number we stretch the kernels in all plots and experiments to a support of 2h. So if a kernel has a normalization \(\sigma_{lh}\) for a support of lh, it has normalization \(\sigma_{kh} = (l/k)^D\,\sigma_{lh}\) if it is stretched to a support of kh.
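As a concrete example of this rule (our own illustration): the M6 kernel below is usually written with support Q = 3 and 3D normalization 1/(120π); stretched to a support of 2h (l = 3, k = 2, D = 3), its normalization becomes \((3/2)^3/(120\pi) = 9/(320\pi)\).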

2.3.1 Kernels with vanishing central derivatives

“Bell-shaped” kernels with their vanishing derivatives at the origin are rather insensitive to the exact positions of nearby particles and, therefore, they are good density estimators (Monaghan, 1992). For this reason they have been widely used in SPH simulations. The kernels that we discuss here and their derivatives are plotted in Figure 4. More recently, kernels with non-vanishing central derivatives have been (re-)suggested, some of them are discussed in Section 2.3.2.

2.3.1.1 B-spline functions: M4 and M6 kernels

The most commonly used SPH kernels are the so-called B-spline functions (Schoenberg, 1946), \(M_n\), which are generated as Fourier transforms:

$${M_n}(x,h) = {1 \over \pi }\int_{ - \infty }^\infty {{{\left[ {{{\sin (kh/2)} \over {kh/2}}} \right]}^n}\cos (kx)\;dk.} $$
(30)

The smoothness of the \(M_n\) functions increases with n and they are continuous up to the (n − 2)-th derivative. Since SPH requires at the very least continuity in the first and second derivative, the cubic spline kernel M4

$$w_4 (q) = \left\{ {\begin{array}{*{20}c} {1 - \tfrac{3} {2}q^2 + \tfrac{3} {4}q^3 } & {0 \leqslant q < 1} \\ {\tfrac{1} {4}(2 - q)^3 } & {1 \leqslant q < 2} \\ 0 & {else} \\ \end{array} } \right.$$
(31)

is the lowest-order SPH kernel that is a viable option; it is often considered the “standard choice” in SPH. As we will show below, much more accurate choices are available at a moderate extra cost. The normalization factor σ of M4 has values of [2/3, 10/(7π), 1/π] in 1, 2 and 3 dimensions.
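For reference, a direct transcription of Eq. (31) with the normalizations just quoted (a minimal sketch; the quadrature check at the end is only an illustration):

```python
import numpy as np

SIGMA_M4 = {1: 2.0/3.0, 2: 10.0/(7.0*np.pi), 3: 1.0/np.pi}   # from the text above

def W_M4(r, h, D=3):
    """M4 kernel W = sigma/h^D * w4(q), q = |r - r'|/h, support q < 2, Eq. (31)."""
    q = np.abs(r) / h
    w = np.where(q < 1.0, 1.0 - 1.5*q**2 + 0.75*q**3,
        np.where(q < 2.0, 0.25*(2.0 - q)**3, 0.0))
    return SIGMA_M4[D] / h**D * w

# quadrature check of the unit normalization in 3D
q, dq = np.linspace(0.0, 2.0, 200001, retstep=True)
print(4.0 * np.pi * np.sum(W_M4(q, 1.0, D=3) * q**2) * dq)    # ~ 1.0
```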

Figure 4:

Comparison of a selection of kernels. The kernel values W (h = 1) are displayed in the upper panel and the kernel derivatives in the lower panel. Note that the M6 and the QCM6 kernels have, for ease of comparison, been scaled to a support of 2h.

The quintic spline kernel M6 has also occasionally been used in SPH simulations. It reads (truncated at support Q = 3)

$$w_6 (q) = \left\{ {\begin{array}{*{20}c} {(3 - q)^5 - 6(2 - q)^5 + 15(1 - q)^5 } & {0 \leqslant q < 1} \\ {(3 - q)^5 - 6(2 - q)^5 } & {1 \leqslant q < 2} \\ {(3 - q)^5 } & {2 \leqslant q < 3} \\ 0 & {else} \\ \end{array} } \right.$$
(32)

with normalizations of [1/120, 7/(478π), 1/(120π)] in 1, 2 and 3 dimensions. Although this is the commonly used form, we re-scale the M6 kernel in all plots to a support of Q = 2 to enable a fair and easy comparison with other kernels in terms of computational effort (i.e., neighbor number).

2.3.1.2 A parametrized family of kernels

More recently, a one-parameter family of kernels has been suggested (Cabezón et al., 2008)

$$W_{H,n} = \frac{{\sigma _{H,n} }} {{h^D }}\left\{ {\begin{array}{*{20}c} 1 & {q = 0} \\ {\left( {\tfrac{{\sin \left[ {\tfrac{\pi } {2}q} \right]}} {{\tfrac{\pi } {2}q}}} \right)^n } & {0 < q \leqslant 2} \\ 0 & {else,} \\ \end{array} } \right.$$
(33)

where n determines the smoothness and the shape of the kernel, see Figure 4. The normalization of this kernel family, σH,n, for integer n from 3 to 9 is given in Table 1. In their Table 2, Cabezón et al. (2008) give a fifth-order polynomial that provides the normalization for continuous n between 3 and 7. The WH,3 kernel is very similar to M4 (not shown), while WH,5 is a close approximation of M6, see Figure 4, provided they have the same support.

Table 1: Parameters of the WH,n and QCM6 kernels. Table adapted from Rosswog (2015).
Table 2: Parameters of the QCM6 kernel.
2.3.1.3 Wendland kernels

An interesting class of kernels with compact support and positive Fourier transforms are the so-called Wendland functions (Wendland, 1995). In several areas of applied mathematics Wendland functions have been well appreciated for their good interpolation properties, but they have not received much attention as SPH kernels and have only recently been explored in more detail (Dehnen and Aly, 2012; Hu et al., 2014; Rosswog, 2015). Dehnen and Aly (2012) have pointed out in particular that these kernels are not prone to the pairing instability, see Section 2.3.4, despite having a vanishing central derivative. These kernels have been explored in a large number of benchmark tests (Rosswog, 2015) and they are highly appreciated for their “cold fluid properties”: the particle distribution remains highly ordered even in dynamical simulations (e.g., Figure 6, upper right panel) and only allows for very little noise.

Here, we only experiment with one particular example, the C6-smooth Wendland kernel

$${{\rm{W}}_{3,3}} = {{{\sigma _W}} \over {{h^3}}}(1 - q)_ + ^8(32{q^3} + 25{q^2} + 8q + 1),$$
(34)

see e.g., Schaback and Wendland (2006), where the symbol (.)+ denotes the cutoff function max(., 0). The normalization \(\sigma_W\) is 78/(7π) in 2 and 1365/(64π) in 3 dimensions.
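Combining Eq. (34) with the stretching rule of this section, a sketch of this kernel rescaled to a support of 2h could look as follows (the rescaled normalization \((1/2)^3 \times 1365/(64\pi)\) and the quadrature check are our own illustrative additions):

```python
import numpy as np

def W_wendland_C6(r, h):
    """Wendland W_{3,3} kernel of Eq. (34), rescaled to a support of 2h.
    Original support is q in [0, 1] with sigma = 1365/(64 pi) in 3D; stretching
    to 2h multiplies the normalization by (1/2)^3 (see the rule above)."""
    u = np.abs(r) / (2.0 * h)                 # rescaled kernel argument
    sigma = 1365.0 / (512.0 * np.pi)          # (1/2)^3 * 1365/(64 pi)
    w = np.where(u < 1.0,
                 (1.0 - u)**8 * (32.0*u**3 + 25.0*u**2 + 8.0*u + 1.0), 0.0)
    return sigma / h**3 * w

# quadrature check of the unit normalization in 3D
r, dr = np.linspace(0.0, 2.0, 200001, retstep=True)
print(4.0 * np.pi * np.sum(W_wendland_C6(r, 1.0) * r**2) * dr)   # ~ 1.0
```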

2.3.2 Kernels with non-vanishing central derivatives

A number of kernels have been suggested in the literature whose derivatives remain finite in the center, so that the repulsive forces between particles never vanish. The major motivation behind such kernels is to achieve a very regular particle distribution and in particular to avoid the pairing instability, see Section 2.3.4. However, as recently pointed out by Dehnen and Aly (2012), the pairing instability is not necessarily related to a vanishing central derivative; instead, non-negative Fourier transforms have been found to be a necessary condition for stability against pairing. Kernels with vanishing central derivatives can also possess this property. We explore here only two peaked kernel functions: one, the “linear quartic” (LIQ) kernel, which has been suggested as a cure to improve SPH’s performance in Kelvin-Helmholtz instabilities, and another, the quartic core M6 (QCM6) kernel, mainly for pedagogical reasons, to illustrate how even a very subtle change in the central part of the kernel can seriously deteriorate the approximation quality. For a more extensive account on kernels with non-vanishing central derivatives we refer to the literature (Thomas and Couchman, 1992; Fulk and Quinn, 1996; Valcke et al., 2010; Read et al., 2010).

2.3.2.1 Linear quartic kernel

The centrally peaked “linear quartic” (LIQ) kernel (Valcke et al., 2010) reads

$$W_{LIQ} (q) = \frac{{\sigma _{LIQ} }} {{h^D }}\left\{ {\begin{array}{*{20}c} {F - q} & {for q \leqslant x_s } \\ {Aq^4 + Bq^3 + Cq^2 + Dq + E} & {for x_s < q \leqslant 1} \\ 0 & {else,} \\ \end{array} } \right.$$
(35)

with \(x_s = 0.3\), A = −500/343, B = 1300/343, C = −900/343, D = −100/343, E = 200/343 and F = 13/20. The normalization constant \(\sigma_{\rm LIQ}\) is 1000/447, 3750/(403π) and 30000/(2419π) in 1, 2 and 3 dimensions.

2.3.2.2 Quartic core M6 kernel

We also explore a very subtle modification of the well-known and appreciated M6 kernel so that it exhibits a small, but non-zero derivative in the center. This “quartic core M6 kernel” (QCM6) is constructed by replacing the second derivative of the M6 kernel for \(q < q_c\) by a parabola whose parameters have been chosen so that the kernel fits smoothly and differentiably onto the M6 kernel at the transition radius defined by \(d^2w_6/dq^2(q_c) = 0\) (numerical value \(q_c = 0.75929848\)). The QCM6 kernel then reads:

$$W_{QCM_6 } (q) = \frac{{\sigma _{QCM_6 } }} {{h^D }}\left\{ {\begin{array}{*{20}c} {Aq^4 + Bq^2 + Cq + D} & {0 \leqslant q < q_c } \\ {(3 - q)^5 - 6(2 - q)^5 + 15(1 - q)^5 } & {q_c \leqslant q < 1} \\ {(3 - q)^5 - 6(2 - q)^5 } & {1 \leqslant q < 2} \\ {(3 - q)^5 } & {2 \leqslant q < 3} \\ 0 & {else.} \\ \end{array} } \right.$$
(36)

The coefficients A, B, C and D are determined by the conditions \({w_{{\rm{QC}}{{\rm{M}}_6}}}({q_c}) = {w_6}({q_c}),w_{{\rm{QC}}{{\rm{M}}_6}}^\prime({q_c}) = w_6^\prime({q_c}),w_{{\rm{QC}}{{\rm{M}}_6}}^{\prime\prime}({q_c}) = w_6^{\prime\prime}({q_c})\) and \(w_{{\rm{QC}}{{\rm{M}}_6}}^{\prime\prime\prime}({q_c}) = w_6^{\prime\prime\prime}({q_c})\), where the primes indicate the derivatives with respect to q. The resulting numerical coefficients are given in Table 1. Note that QCM6 is continuous everywhere up to the third derivative. As can be seen in Figure 4, the QCM6 kernel (violet dots) deviates only subtly from M6 (solid red line), but even this very subtle modification already seriously compromises the accuracy of the kernel. In a recent study (Rosswog, 2015), it was found, however, that this kernel has the property of producing only very little noise: particles placed initially on a quadratic or hexagonal lattice (in pressure equilibrium) remain in this lattice configuration.

The different kernels and their derivatives are summarized in Figure 4. As mentioned above, the M6, QCM6 and W3,3 kernels have been rescaled to a support of Q = 2 to allow for an easy comparison. Note how the kernels become more centrally peaked with increasing order, for example, the WH,9 kernel only deviates noticeably from zero inside of q = 1, so that it is very insensitive to particles entering or leaving its support near q = 2.

2.3.3 Accuracy assessment of different kernels

2.3.3.1 Density estimates

We assess the density estimation accuracy of the different kernels in a numerical experiment. The particles are placed on a 2D hexagonal lattice in [−1, 1] × [−1, 1]. This configuration corresponds to the closest packing of spheres with radius \(r_{\rm s}\) where each particle possesses an effective area of \({A_{{\rm{eff}}}} = 2\sqrt 3 r_{\rm{s}}^2\). Each particle is now assigned the same mass \(m_b = \rho_0 A_{\rm eff}\) to obtain the uniform density ρ0. Subsequently, the densities at the particle locations, \(\rho_b\), are calculated via the standard SPH expression for the density, Eq. (56), and the average error \(\epsilon = {N^{ - 1}}\sum\nolimits_{b = 1}^N {{\epsilon _b}} \) with \({\epsilon _b} = \vert{\rho _b} - {\rho _0}\vert/{\rho _0}\) is determined. Figure 5, upper panel, shows the error as a function of η, which parametrizes the kernel support size via \(h_b = \eta\,(m_b/\rho_b)^{1/D}\), D being again the number of spatial dimensions.
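A minimal version of this experiment might look as follows (lattice spacing, the probed η values and the restriction to a single central particle, chosen to avoid edge effects, are illustrative assumptions; the standard density sum referred to as Eq. (56) is \(\rho_a = \sum_b m_b W_{ab}(h_a)\)):

```python
import numpy as np

def W2D_M4(d, h):
    """2D cubic spline kernel, sigma = 10/(7 pi)."""
    q = d / h
    w = np.where(q < 1.0, 1.0 - 1.5*q**2 + 0.75*q**3,
        np.where(q < 2.0, 0.25*(2.0 - q)**3, 0.0))
    return 10.0 / (7.0 * np.pi * h**2) * w

# hexagonal lattice in [-1, 1] x [-1, 1]; r_s = dx/2, A_eff = 2 sqrt(3) r_s^2
dx   = 0.04
rows = np.arange(-1.0, 1.0, dx * np.sqrt(3.0) / 2.0)
pos  = np.array([(x + (0.5 * dx if j % 2 else 0.0), y)
                 for j, y in enumerate(rows) for x in np.arange(-1.0, 1.0, dx)])
rho0 = 1.0
m    = rho0 * 2.0 * np.sqrt(3.0) * (0.5 * dx)**2   # equal particle masses

a = np.argmin(np.sum(pos**2, axis=1))              # central particle, no edge effects
d = np.linalg.norm(pos - pos[a], axis=1)
for eta in (1.2, 1.6, 2.0):
    h   = eta * np.sqrt(m / rho0)                  # h_b = eta (m_b/rho_b)^(1/D), D = 2
    rho = np.sum(m * W2D_M4(d, h))                 # standard density sum, Eq. (56)
    print(eta, abs(rho - rho0) / rho0)             # relative density error
```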

Interestingly, the “standard” cubic spline kernel (“CS”, solid black line in the figure) actually does not perform well. At typical values of η near 1.3 the relative density error is a few times 10−3. Just replacing it by the quintic spline kernel (“M6”, solid red line in the figure) improves the density estimate for similar values of η already by two orders of magnitude. The Wendland kernel (“W3,3”, dashed blue line in the figure) continuously decreases the error with increasing η and therefore does not show the pairing instability at any value of η (Dehnen and Aly, 2012). It maintains a very regular particle distribution with very little noise (Rosswog, 2015), see also the Gresho-Chan vortex test that is shown in Section 3.7.4. In these fixed particle distribution tests the WH,n-kernels perform particularly well. At large smoothing lengths they deliver exquisite density estimates. For example, the WH,9 kernel is, at η > 1.9, more than two orders of magnitude more accurate than M6.

The centrally peaked kernels turn out to be rather poor density estimators. The shown LIQ kernel (filled black circles) needs a large η beyond 2 to achieve a density estimate better than 1%. Even the subtle modification of the central core in the QCM6 kernel substantially deteriorates the density estimate (violet dots) in comparison to the original M6 kernel. One needs a value of η > 1.8 for a density accuracy that is better than 1%; at this large η the WH,7 kernel is approximately four orders of magnitude more accurate. After extensive numerical experiments, a recent study (Rosswog, 2015) still gave overall preference to the Wendland kernel, Eq. (34). It gave less accurate results on “frozen” particle distributions than, say, the WH,n-kernels, but in dynamical experiments where particles were allowed to move, the Wendland kernel maintained a substantially lower noise level.

Figure 5:

Accuracy of different kernels. The upper/lower panel shows the accuracy of the density/gradient estimate for different kernels. The quantity η determines the smoothing length, h ∝ η. Particles are distributed on a 2D hexagonal lattice and they are assigned a constant mass for the density test and a linearly increasing pressure distribution in the gradient test. Note that the “standard” SPH kernel (“M4”) does not perform particularly well in either task. Images adapted from Rosswog (2015).

2.3.3.2 Gradient estimates

We perform a similar experiment to measure the gradient accuracy. The particles are set up as before and each particle is assigned a pressure P(x, y) = 2 + x. We use the straightforward SPH estimate

$${(\nabla P)_a} = \sum\limits_b {{{{m_b}} \over {{\rho _b}}}{P_b}} {\nabla _a}{W_{ab}}({h_a})$$
(37)

to calculate the pressure gradient. The average error, \(\epsilon = {N^{ - 1}}\sum\nolimits_{b = 1}^N {{\epsilon _b}} \) with \({\epsilon _b} \equiv \vert\nabla P - {(\nabla P)_b}\vert/\vert\nabla P\vert\), is shown in Figure 5, lower panel, for different kernels as a function of η.

Again, the “standard” cubic spline kernel (solid black) does not perform particularly well; only for η > 1.8 does the gradient estimate reach an accuracy better than 1%. In a dynamical situation, however, the particle distribution would, at such a large kernel support, already fall prey to the pairing instability. For moderately small supports, say η < 1.6, the M6 kernel is substantially more accurate. Again, the accuracy of the Wendland kernel increases monotonically with increasing η, and the three WH,n-kernels perform best in this static test. As in the case of the density estimates, the peaked kernels perform rather poorly and only achieve a 1% accuracy for extremely large values of η.

2.3.4 Kernel choice and pairing instability

Figure 5 suggests increasing the kernel support to achieve greater accuracy in density and gradient estimates (though at the price of reduced spatial resolution). For many bell-shaped kernels, however, the support cannot be increased arbitrarily, since for large smoothing lengths the so-called “pairing instability” sets in, where particles start to form pairs. In the most extreme case, two particles can merge into effectively one. Nevertheless, this is a rather benign instability. An example is shown in Figure 6, where for the interaction of a blast wave with a low-density bubble we have once chosen (left panels) a kernel-smoothing length combination (M4 kernel with η = 2.2) that leads to strong particle pairing, and once (right panels) a combination (Wendland kernel with η = 2.2) that produces a very regular particle distribution. Note that despite the strong pairing in the left column, the continuum properties (the lower row shows the density as an example) are still reasonably well reproduced. The pairing simply leads to a loss of resolution, though at the original computational cost.

This instability has traditionally been explained by means of the inflection point in the kernel derivatives of bell-shaped kernels, see Figure 4, that leads to decreasing and finally vanishing repelling forces once the inflection point has been crossed. Dehnen and Aly (2012) in contrast argue that the stability of the Wendland kernels with respect to pairing is due to the density error being monotonically declining, see Figure 5, and non-negative kernel Fourier transforms. They investigated in particular a number of bell-shaped Wendland kernels with strictly positive Fourier transforms and they did not find any sign of the instability despite the vanishing central derivatives. This is, of course, a desirable feature for convergence studies since the neighbor number can be increased without limit.

Figure 6:

Numerical experiment to illustrate the impact of the kernel and the smoothing length on the particle distribution. Shown is the interaction of a blast wave with a low-density bubble of equal pressure. Both experiments were conducted with a large smoothing length (η = 2.2 in Eq. (47)); in the left column the “standard” cubic spline kernel is used, in the right column a high-order Wendland kernel, see Eq. (34). At these very large smoothing lengths, the cubic spline kernel becomes pairing unstable, i.e., the particles form a clumpy distribution. The Wendland kernel, in contrast, produces a very regular particle distribution. Note that despite the somewhat pathological particle distribution in the cubic spline kernel case, the density (lower row) is still reasonably well reproduced.

2.4 Summary: Kernel approximation

The improvement of the involved kernel approximation techniques is one obvious way to further enhance the accuracy of SPH. One should strive, however, to preserve SPH’s most important asset, its exact numerical conservation. The simplest possible, yet effective, improvement is to just replace the kernel function, see Section 2.3. We have briefly discussed a number of kernels and assessed their accuracy in estimating a uniform density and the gradient of a linear function for the case where the SPH particles are placed on a hexagonal lattice. The most widely used kernel function, the cubic spline M4, actually does not perform particularly well, neither for the density nor for the gradient estimate. At moderate extra cost, however, one can use substantially more accurate kernels, for example the quintic spline kernel, M6, or the higher-order members of the WH,n family. Another very promising kernel family are the Wendland functions. They are not prone to the pairing instability and therefore show much better convergence properties than kernels that start forming pairs beyond a critical support size. Moreover, the Wendland kernel that we explored in detail (Rosswog, 2015) is very reluctant to allow for particle motion on a sub-resolution scale and it maintains a very regular particle distribution, even in highly dynamical tests. The explored peaked kernels, in contrast, performed rather poorly in both estimating densities and gradients.

3 SPH Formulations of Ideal Fluid Dynamics

Smoothed particle hydrodynamics (SPH) is a completely mesh-free, Lagrangian method that was originally suggested in an astrophysical context (Gingold and Monaghan, 1977; Lucy, 1977), but by now it has also found many applications in the engineering world, see Monaghan (2012) for a starting point. Since a number of detailed reviews exist, from the “classics” (Benz, 1990; Monaghan, 1992) to more recent ones (Monaghan, 2005; Rosswog, 2009; Springel, 2010a; Price, 2012), we want to avoid becoming too repetitive about SPH basics and therefore put the emphasis here on recent developments. Many of them have very good potential, but have not yet fully made their way into practical simulations. Our emphasis here is also meant as a motivation for computational astrophysicists to keep their simulation tools up-to-date in terms of methodology. A very explicit account of the derivation of various SPH aspects has been provided in Rosswog (2009); therefore we will sometimes refer the interested reader to this text for more technical details.

The basic idea of SPH is to represent a fluid by freely moving interpolation points — the particles — whose evolution is governed by Nature’s conservation laws. These particles move with the local fluid velocity and their densities and gradients are determined by the kernel approximation techniques discussed in Section 2, see Figure 7. The corresponding evolution equations can be formulated in such a way that mass, energy, momentum and angular momentum are conserved by construction, i.e., they are fulfilled independent of the numerical resolution.

In the following, we use the convention that the considered particle is labeled with “a”, its neighbors with “b” and a general particle with “k”, see also the sketch in Figure 1. Moreover, the difference between two vectors is denoted as \({\vec A_{ab}} = {\vec A_a} - {\vec A_b}\) and the symbol \(B_{ab}\) refers to the arithmetic average \(B_{ab} = (B_a + B_b)/2\) of two scalar functions.

3.1 Choice of the SPH volume element

Up to now, the choice of the volume element in the kernel approximation, see Section 2, has been kept open. The traditional choice is simply \(V_b = m_b/\rho_b\), where ρ is calculated as a kernel-weighted sum over nearby masses, see Eq. (56), and m is the SPH particle mass. As will be discussed in Sections 3.3 and 3.4, this translates in a straightforward manner to relativistic SPH. In the latter case the particle mass m is replaced by the SPH particle’s baryon number ν and the mass density ρ is replaced by the baryon number density as calculated in the “computing frame”, see below. It has recently been realized (Saitoh and Makino, 2013) that this “standard” choice of volume element is responsible for the spurious surface tension forces that can occur in many SPH implementations near contact discontinuities. At such discontinuities the pressure is continuous, but density ρ and internal energy u exhibit a jump. For a polytropic equation of state, \(P = (\Gamma - 1)\rho u\), the product of density and internal energy must be the same on both sides to ensure a single value of P at the discontinuity, i.e., \(\rho_1 u_1 = \rho_2 u_2\), where the subscripts label the two sides of the discontinuity. This means in particular that the jumps in u and ρ need to be consistent with each other, otherwise the mismatch can cause spurious forces that act like a surface tension and can suppress weakly triggered fluid instabilities, see for example Thacker et al. (2000); Agertz et al. (2007); Springel (2010a); Read et al. (2010). In the “standard” SPH formulation such a mismatch can occur because the density estimate is smooth, but the internal energy enters the SPH equations as an un-smoothed quantity. One may question, however, whether an unresolvably sharp transition in u is a viable numerical initial condition in the first place.
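To give a concrete numerical example (our own illustration): for Γ = 5/3, the states \((\rho_1, u_1) = (1, 1.5)\) and \((\rho_2, u_2) = (0.5, 3)\) in arbitrary units both yield \(P = (\Gamma - 1)\rho u = 1\), so the pressure is single-valued across the jump; if the discrete jumps in ρ and u fail to match in this way, the two sides of the discontinuity feel different pressures and spurious forces result.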

Figure 7: Sampling of a fluid flow by SPH particles: the continuum is represented by a finite number of sampling points (“particles”) that move with the local fluid velocity. The particles share the total mass and move in a way that they conserve energy, momentum and angular momentum. For some particles (green, filled circles) also their support (“sphere of influence”) is indicated by red circles.

The problem can be alleviated if the internal energy is also smoothed by applying some artificial thermal conductivity, and this has been shown to work well for Kelvin-Helmholtz instabilities (Price, 2008). It is, however, a non-trivial problem to design appropriate triggers that supply conductivity exclusively where needed and not elsewhere. Artificial conductivity applied where it is undesired can have catastrophic consequences, for example by removing physically required energy/pressure gradients for a star in hydrostatic equilibrium.

An alternative cure comes from using different volume elements in the SPH discretization process. The first step in this direction was probably taken by Ritchie and Thomas (2001), who realized that by using the internal energy as a weight in the SPH density estimate a much sharper density transition can be achieved than with the standard SPH density sum, in which each particle is, apart from the kernel, only weighted by its mass. Saitoh and Makino (2013) pointed out that SPH formulations that do not include the density explicitly in the equations of motion avoid the pressure becoming multi-valued at contact discontinuities. Since the density usually enters the equation of motion via the choice of the volume element, a different choice can possibly avoid the problem altogether. This observation is consistent with the findings of Heß and Springel (2010), who used a particle hydrodynamics method, but calculated the volumes via a Voronoi tessellation rather than via a smooth density sum. In their approach no spurious surface tension effects have been observed. Closer to the original SPH spirit is the class of kernel-based particle volume estimates that has recently been suggested by Hopkins (2013) as a generalization of the approach of Saitoh and Makino (2013). Recently, such volume elements have been generalized for use in special-relativistic studies (Rosswog, 2015).

Assume that we have an (exact or approximate) partition of unity so that

$$\sum\limits_b {{\Phi _b}(\vec r) = 1} $$
(38)

and a function U can be approximated as

$$\tilde U(\vec r) \approx \sum\limits_b {{U_b}{\Phi _b}(\vec r),} $$
(39)

where \(U_b\) are the known function values at positions \({\vec r_b}\). This can be used to split up space (total volume V) into volume elements \(V_b\)

$$V = \int {dV = \int {\left({\sum\limits_b {{\Phi _b}(\vec r)} } \right)dV = \sum\limits_b {{V_b},} } } $$
(40)

where Eq. (38) has been inserted and the particle volume

$${V_b} = \int {{\Phi _b}(\vec r)} dV$$
(41)

has been introduced. One may use, for example, a Shepard-type partition of unity (Shepard, 1968)

$${\Phi _b}(\vec r) = {{{W_b}(\vec r)} \over {\sum\nolimits_k {{W_k}(\vec r)} }},$$
(42)

with \({W_b}(\vec r) = W(\vert\vec r - {\vec r_b}\vert)\) being a smoothing kernel so that upon using Eq. (41) the particle volume becomes

$${V_b} = \int {{{{W_b}(\vec r)} \over {\sum\nolimits_k {{W_k}(\vec r)} }}dV.} $$
(43)

Making use of the approximate δ-property of the kernel, this yields

$${V_b} = {1 \over {\sum\nolimits_k {{W_{kb}}} }},$$
(44)

where \({W_{kb}} = W(\vert{\vec r_k} - {\vec r_b}\vert)\). This, of course, has the simple physical interpretation of locally sampling the particle number density (keep in mind that the kernel has the dimension of an inverse volume) and taking its inverse as the particle volume.

While this is a straightforward and plausible approach, one can in principle generalize this volume estimate by weighting the kernel with any scalar property X of the particles (Hopkins, 2013), so that the volume becomes

$$V_b^{(X)} = {{{X_b}} \over {\sum\nolimits_k {{X_k}{W_{kb}}} }} \equiv {{{X_b}} \over {{\kappa _{X,b}}}}$$
(45)

and one can choose to calculate the density via

$$\rho _b^{(X)} = {{{m_b}} \over {V_b^{(X)}}} = {{{m_b}} \over {{X_b}}}{\kappa _{X,b}},$$
(46)

If the smoothing length is adjusted to a multiple of the typical particle separation,

$${h_b} = \eta {(V_b^{(X)})^{1/D}},$$
(47)

D being the number of spatial dimensions, the derivatives of the quantity \({\kappa _{X,b}}\) become

$${\nabla _a}{\kappa _{X,b}} = {1 \over {{\Omega _b}}}\sum\limits_k {{X_k}{\nabla _a}{W_{bk}}({h_b})\quad {\rm{and}}\quad {{d{\kappa _{X,b}}} \over {dt}}} = {1 \over {{\Omega _b}}}\sum\limits_k {{X_k}{{\vec v}_{bk}} \cdot } {\nabla _b}{W_{bk}}({h_b}),$$
(48)

with the generalized “grad-h terms” being

$${\Omega _b} = 1 - {{{m_b}} \over {{X_b}}}{{\partial {h_b}} \over {\partial {\rho _b}}}\sum\limits_k {{X_k}{{\partial {W_{kb}}({h_b})} \over {\partial {h_b}}}}.$$
(49)

For the SPH discretization process one needs the derivatives of the volume elements

$${\nabla _a}{V_b} = - {{V_b^2} \over {{X_b}{\Omega _b}}}\sum\limits_k {{X_k}{\nabla _a}{W_{bk}}({h_b})} $$
(50)

and

$${{d{V_b}} \over {dt}} = - {{V_b^2} \over {{X_b}{\Omega _b}}}\sum\limits_k {{X_k}{{\vec v}_{bk}} \cdot {\nabla _b}{W_{bk}}({h_b}).} $$
(51)
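
To make this prescription concrete, the following Python sketch evaluates Eqs. (44)–(46) for an arbitrary particle weight X. It is only an illustration under simplifying assumptions (a cubic spline kernel, fixed smoothing lengths and a brute-force O(N²) distance loop); the function names are ours and do not refer to any production code.

    import numpy as np

    def w_cubic_spline(r, h):
        # standard M4 cubic spline kernel with 3D normalization 1/(pi h^3)
        q = r / h
        sigma = 1.0 / (np.pi * h**3)
        return sigma * np.where(q < 1.0, 1.0 - 1.5*q**2 + 0.75*q**3,
                       np.where(q < 2.0, 0.25*(2.0 - q)**3, 0.0))

    def generalized_volumes(pos, X, h):
        # V_b = X_b / kappa_{X,b} with kappa_{X,b} = sum_k X_k W(|r_k - r_b|, h_b),
        # cf. Eqs. (44)-(45); the k-sum includes the particle b itself
        V = np.empty(len(X))
        for b in range(len(X)):
            r_kb = np.linalg.norm(pos - pos[b], axis=1)
            V[b] = X[b] / np.sum(X * w_cubic_spline(r_kb, h[b]))
        return V

    # toy usage: equal-mass particles on a slightly perturbed cubic lattice
    rng = np.random.default_rng(1)
    side = np.linspace(0.0, 1.0, 10)
    pos = np.array(np.meshgrid(side, side, side)).reshape(3, -1).T
    pos += 0.01 * rng.standard_normal(pos.shape)
    m = np.full(len(pos), 1.0 / len(pos))
    h = np.full(len(pos), 0.15)
    rho = m / generalized_volumes(pos, m, h)   # X = m: standard density, Eq. (46)

For X = m this recovers the standard density sum, while X = 1 yields the inverse local particle number density of Eq. (44); in a full code the smoothing lengths would in addition be iterated to satisfy Eq. (47).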

3.2 Newtonian SPH

In its most basic form, the task is simply to solve the Lagrangian conservation equations for mass, energy and momentum of an ideal fluid (Landau and Lifshitz, 1959):

$${{d\rho } \over {dt}} = - \rho \nabla \cdot \vec v,$$
(52)
$${{du} \over {dt}} = \left({{P \over {{\rho ^2}}}} \right){{d\rho } \over {dt}},$$
(53)
$${{d\vec v} \over {dt}} = - {{\nabla P} \over \rho },$$
(54)

where ρ is the mass density, \(d/dt = {\partial _t} + \vec v \cdot \nabla \) the Lagrangian time derivative, \(\vec v\) the fluid velocity, u the specific thermal energy and P the pressure. Like in other numerical methods, many different discrete approximations to the continuum equations can be used and they may differ substantially in their accuracy. In the following, we will summarize commonly used SPH discretizations that have been used for simulations of compact objects. They differ in their derivation strategy, the resulting symmetries in the particle indices, the volume elements, the manner in which gradients are calculated and in the way they deal with shocks.

3.2.1 “Vanilla ice” SPH

We will begin with the simplest, but fully conservative, SPH formulation (“vanilla ice”) that is still used in many astrophysical simulations. A detailed step-by-step derivation with particular attention to conservation issues can be found in Section 2.3 of Rosswog (2009). By using the volume element V b = m b /ρ b and the derivative prescription Eq. (10) one finds the discrete form of the Lagrangian continuity equation

$${{d\rho } \over {dt}} = - \rho \nabla \cdot \vec v\;\; \Rightarrow \;\;{{d{\rho _a}} \over {dt}} = \sum\limits_b {{m_b}{{\vec v}_{ab}} \cdot {\nabla _a}{W_{ab}}} $$
(55)

where \({\vec v_{ab}} = {\vec v_a} - {\vec v_b}\) and \({W_{ab}} = W(\vert{\vec r_a} - {\vec r_b}\vert,h)\). As an alternative to this “density-by-integration” approach, one can also estimate the density at a particle position \({\vec r_a}\) as a weighted sum over contributing particles (“density-by-summation”)

$${\rho _a} = \sum\limits_b {{m_b}{W_{ab}}({h_a}).} $$
(56)

In practice, there is little difference between the two. Note that this density estimate corresponds to the choice X = m in Eq. (45). Usually, the particle masses are kept constant so that exact mass conservation is enforced automatically. Most often, the specific energy u is evolved in time; its evolution equation follows directly from Eq. (55) and the adiabatic first law of thermodynamics,

$$\partial {u_b}/\partial {\rho _b} = {P_b}/\rho _b^2$$
(57)

as

$${{du} \over {dt}} = \left({{P \over {{\rho ^2}}}} \right){{d\rho } \over {dt}}\;\; \Rightarrow \;\;{{d{u_a}} \over {dt}} = {{{P_a}} \over {\rho _a^2}}\sum\limits_b {{m_b}{{\vec v}_{ab}} \cdot {\nabla _a}{W_{ab}}.} $$
(58)

A discrete form of the momentum equation that enforces exact conservation can be found by using the identity

$${{\nabla P} \over \rho } = {P \over {{\rho ^2}}}\nabla \rho + \nabla \left({{P \over \rho }} \right)$$
(59)

in the Lagrangian momentum equation

$${{d\vec v} \over {dt}} = - {{\nabla P} \over \rho }\;\; \Rightarrow \;\;{{d{{\vec v}_a}} \over {dt}} = - \sum\limits_b {{m_b}\left({{{{P_a}} \over {\rho _a^2}} + {{{P_b}} \over {\rho _b^2}}} \right)} {\nabla _a}{W_{ab}}.$$
(60)

If the kernel \(W_{ab}\) in Eqs. (58) and (60) is evaluated with a symmetric combination of smoothing lengths, say with \({h_{ab}} = ({h_a} + {h_b})/2\), then Eqs. (56), (58) and (60) form, together with an equation of state, a closed set of equations that enforces the exact conservation of mass, energy, linear and angular momentum by construction. For practical simulations that may possibly involve shocks, Eqs. (58) and (60) need to be augmented by extra measures (e.g., by artificial viscosity) to ensure that entropy is produced in a shock and that kinetic energy is properly transformed into thermal energy, see Section 3.2.5.
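
As an illustration, the following sketch gathers Eqs. (56), (58) and (60) into a single right-hand-side evaluation. It is a minimal version under simplifying assumptions (a single global smoothing length, an ideal-gas equation of state \(P = (\Gamma - 1)\rho u\), brute-force O(N²) pair loops and no artificial viscosity, see Section 3.2.5) rather than a production implementation.

    import numpy as np

    def kernel_and_grad(r_vec, h):
        # cubic spline kernel W(|r|,h) and its gradient w.r.t. the first particle
        r = np.linalg.norm(r_vec)
        q = r / h
        sig = 1.0 / (np.pi * h**3)
        if q < 1.0:
            W, dW = sig*(1.0 - 1.5*q**2 + 0.75*q**3), sig*(-3.0*q + 2.25*q**2)/h
        elif q < 2.0:
            W, dW = sig*0.25*(2.0 - q)**3, sig*(-0.75*(2.0 - q)**2)/h
        else:
            return 0.0, np.zeros(3)
        return W, (dW * r_vec / r if r > 0.0 else np.zeros(3))

    def vanilla_ice_rhs(pos, vel, m, u, h, Gamma=5.0/3.0):
        n = len(m)
        rho = np.zeros(n)
        for a in range(n):                      # density sum, Eq. (56)
            for b in range(n):
                rho[a] += m[b] * kernel_and_grad(pos[a] - pos[b], h)[0]
        P = (Gamma - 1.0) * rho * u             # equation of state
        dv, du = np.zeros((n, 3)), np.zeros(n)
        for a in range(n):
            for b in range(n):
                if a == b:
                    continue
                gradW = kernel_and_grad(pos[a] - pos[b], h)[1]
                # momentum equation, Eq. (60)
                dv[a] -= m[b] * (P[a]/rho[a]**2 + P[b]/rho[b]**2) * gradW
                # energy equation, Eq. (58)
                du[a] += P[a]/rho[a]**2 * m[b] * np.dot(vel[a] - vel[b], gradW)
        return rho, dv, du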

3.2.2 SPH from a variational principle

More elegantly, a set of SPH equations can be obtained by starting from the (discretized) Lagrangian of an ideal fluid (Gingold and Monaghan, 1982; Speith, 1998; Monaghan and Price, 2001; Springel and Hernquist, 2002; Rosswog, 2009; Price, 2012). This ensures that conservation of mass, energy, momentum and angular momentum is, by construction, built into the discretized form of the resulting fluid equations. Below, we use the generalized volume element Eq. (45) without specifying the choice of the weight X, so that a whole class of SPH equations is produced (Hopkins, 2013). In this derivation the volume of an SPH particle takes over the fundamental role that is usually played by the density sum. Therefore we will generally express the SPH equations in terms of volumes rather than densities; we only make use of the latter for comparison with known equation sets.

The Lagrangian of an ideal fluid can be written as (Eckart, 1960)

$$L = \int {\rho (x)\left({{{v{{(x)}^2}} \over 2} - u(x)} \right)\;dx,} $$
(61)

and on using Eq. (39) it can be discretized into

$$L \simeq \int {\left\{ {\sum\limits_b {{\rho _b}\left({{{v_b^2} \over 2} - {u_b}} \right){\Phi _b}(x)} } \right\}dx} = \sum\limits_b {{\rho _b}{V_b}\left({{{v_b^2} \over 2} - {u_b}} \right),} $$
(62)

where we have used the definition of the particle volume Eq. (41). For the choice \({m_b} = {\rho _b}{V_b}\) the standard SPH-Lagrangian

$$L = \sum\limits_b {{m_b}\left({{{v_b^2} \over 2} - {u_b}} \right)} $$
(63)

is recovered. The Euler-Lagrange equations

$${d \over {dt}}{{\partial L} \over {\partial {{\vec v}_a}}} - {{\partial L} \over {\partial {{\vec r}_a}}} = 0$$
(64)

then yield, for a fixed particle mass \(m_a\),

$${{d{{\vec v}_a}} \over {dt}} = {1 \over {{m_a}}}{{\partial L} \over {\partial {{\vec r}_a}}} = - {1 \over {{m_a}}}\sum\limits_b {{m_b}{{{P_b}} \over {\rho _b^2}}{{\partial ({m_b}/{V_b})} \over {\partial {{\vec r}_a}}}} = {1 \over {{m_a}}}\sum\limits_b {{P_b}{{\partial {V_b}} \over {\partial \,{{\vec r}_a}}},} $$
(65)

where Eq. (57) was used. The energy equation follows directly from the first law of thermodynamics as

$${{d{u_a}} \over {dt}} = - {{{P_a}} \over {{m_a}}}{{d{V_a}} \over {dt}}.$$
(66)

With Eqs. (50), (51), \({\nabla _b}{W_{ab}} = - {\nabla _a}{W_{ab}}\) and \({\nabla _a}{W_{bk}} = {\nabla _b}{W_{bk}}({\delta _{ba}} - {\delta _{ka}})\) the SPH equations for a general volume element of the form given in Eq. (45) become

$${V_b} = {{{X_b}} \over {{\kappa _{X,b}}}}$$
(67)
$${m_a}{{d{{\vec v}_a}} \over {dt}} = - \sum\limits_b {{X_a}{X_b}\left\{ {{{{P_a}} \over {\kappa _a^2{\Omega _a}}}{\nabla _a}{W_{ab}}({h_a}) + {{{P_b}} \over {\kappa _b^2{\Omega _b}}}{\nabla _a}{W_{ab}}({h_b})} \right\}} $$
(68)
$${m_a}{{d{u_a}} \over {dt}} = {{{P_a}} \over {\kappa _a^2{\Omega _a}}}\sum\limits_b {{X_a}{X_b}\,{{\vec v}_{ab}} \cdot {\nabla _a}{W_{ab}}({h_a})} $$
(69)

For the choice X = m, this reduces to the commonly used equation set (Monaghan and Price, 2001; Price, 2004; Rosswog and Price, 2007). Explicit forms of the equations, also for other choices of X, are given in Table 2.

Table 2: Explicit forms of the equations for special choices of the weight X for Newtonian and special-relativistic SPH.
3.2.2.1 The Gadget equations

The widely used simulation code Gadget (Springel et al., 2001; Springel, 2005) uses an entropy formulation of SPH. It uses the entropic function A(s) occurring in a polytropic equation of state, \(P = A(s){\rho ^\Gamma }\). If no entropy is created, the quantity A is simply advected by the fluid element and only the momentum equation needs to be solved explicitly. In Gadget the smoothing lengths are adapted so that

$${{4\pi } \over 3}h_a^3{\rho _a} = {N_{{\rm{SPH}}}}\,\bar m$$
(70)

is fulfilled, where \(N_{{\rm{SPH}}}\) is the typical neighbor number and \(\bar m\) is an average particle mass. The Euler-Lagrange equations then yield

$${{d{{\vec v}_a}} \over {dt}} = - \sum\limits_b {{m_b}\left\{ {{{{f_a}{P_a}} \over {\rho _a^2}}{\nabla _a}{W_{ab}}({h_a}) + {{{f_b}{P_b}} \over {\rho _b^2}}{\nabla _a}{W_{ab}}({h_b})} \right\}},$$
(71)

where the f k are the grad-h terms that read

$${f_k} = {\left({1 + {{{h_k}} \over {3{\rho _k}}}{{\partial {\rho _k}} \over {\partial {h_k}}}} \right)^{ - 1}}.$$
(72)

Obviously, in this formulation artificial dissipation terms need to be applied also to A to ensure that entropy is produced in the right amounts in shocks.

3.2.3 Self-regularization in SPH

SPH possesses a built-in “self-regularization” mechanism, i.e., SPH particles feel, in addition to pressure gradients, a force that aims at driving them towards an optimal particle distribution. This corresponds to the (usually ad hoc introduced) “re-meshing” steps that are used in Lagrangian mesh methods. The ability to automatically re-mesh is closely related to the lack of zeroth order consistency of SPH that was briefly described in Section 2.2.1: the particles “realize” that their distribution is imperfect and they have to adjust accordingly. Particle methods without such a re-meshing mechanism can quickly evolve into rather pathological particle configurations that yield, in the long term, very poor results, see Price (2012) for an explicit numerical example.

To understand this mechanism better, it is instructive to expand P b in Eq. (65) around P a (sum over k)

$${m_a}{\left({{{d{{\vec v}_a}} \over {dt}}} \right)^i} = \sum\limits_b {\left\{ {{P_a} + (\nabla P)_a^k{{({{\vec r}_b} - {{\vec r}_a})}^k} + \ldots } \right\}{{\left({{{\partial {V_b}} \over {\partial {{\vec r}_a}}}} \right)}^i}} $$
(73)
$$ \approx {P_a}{\left({{\partial \over {\partial {{\vec r}_a}}}\sum\limits_b {{V_b}} } \right)^i} + (\nabla P)_a^k\sum\limits_b {{{({{\vec r}_b} - {{\vec r}_a})}^k}{{\left({{{\partial {V_b}} \over {\partial {{\vec r}_a}}}} \right)}^i}} $$
(74)
$$ \equiv {P_a}e_0^i - (\nabla P)_a^k{V_a}{D^{ki}}$$
(75)
$$ \equiv f_{{\rm{reg}}}^i + f_{{\rm{hyd}}}^i.$$
(76)

The first term is the “regularization force” that is responsible for driving particles into a good distribution. It only vanishes, \({\vec e_0} = 0\), if the sum of the volumes is constant. The second term, \(f_{{\rm{hyd}}}^i\), is the approximation to the hydrodynamic force. For a good particle distribution, \(e_0^i \rightarrow 0\) and \({D^{ki}} \rightarrow {\delta ^{ki}}\) and thus (using \({\rho _a} = {m_a}/{V_a}\)) the Euler equation (54) is reproduced.

3.2.4 SPH with integral-based derivatives

One can find SPH equations based on the more accurate integral approximation derivatives by using the formal replacement Eq. (27). By replacing the kernel gradients, one finds

$${{d{{\vec v}_a}} \over {dt}} = - {1 \over {{m_a}}}\sum\limits_b {{X_a}{X_b}\left\{ {{{{P_a}} \over {\kappa _a^2}}{{\vec G}_a} + {{{P_b}} \over {\kappa _b^2}}{{\vec G}_b}} \right\}} $$
(77)
$${{d{u_a}} \over {dt}} = {{{P_a}{X_a}} \over {{m_a}\kappa _a^2}}\sum\limits_b {{X_b}{{\vec v}_{ab}} \cdot {{\vec G}_a},} $$
(78)

with

$${\left({{{\vec G}_a}} \right)^k} = {C^{kd}}({\vec r_a},{h_a}){({\vec r_b} - {\vec r_a})^d}{W_{ab}}({h_a})$$
(79)

and

$${\left({{{\vec G}_b}} \right)^k} = {C^{kd}}({\vec r_b},{h_b}){({\vec r_b} - {\vec r_a})^d}{W_{ab}}({h_b}),$$
(80)

where a summation over d is implied and the density can be calculated via Eq. (67). For the conventional choice X = m this reproduces (up to the grad-h terms) the original equation set derived in García-Senz et al. (2012) from a Lagrangian. The experiments in García-Senz et al. (2012); Cabezón et al. (2012); Rosswog (2015) clearly show that the use of this gradient prescription substantially improves the accuracy of SPH.

3.2.5 Treatment of shocks

The equations of gas dynamics allow for discontinuities to emerge even from perfectly smooth initial conditions (Landau and Lifshitz, 1959). At discontinuities, the differential form of the fluid equations is no longer valid, and their integral form needs to be used, which, at shocks, translates into the Rankine-Hugoniot conditions that relate the upstream and downstream properties of the flow. They show, in particular, that the entropy increases in shocks, i.e., that dissipation occurs inside the shock front. For this reason the inviscid SPH equations need to be augmented by further measures that become active near shocks.

Another line of reasoning that suggests using artificial viscosity is the following. One can think of the SPH particles as (macroscopic) fluid elements that follow streamlines, so in this sense SPH is a method of characteristics. Problems can occur when particle trajectories cross, since in such a case fluid properties at the crossing point can become multi-valued. The term linear in the velocities in Eq. (82) was originally also introduced as a measure to avoid particle interpenetration where it should not occur (Hernquist and Katz, 1989) and to damp particle noise. Read and Hayfield (2012) designed special dissipation switches to avoid particle properties becoming multi-valued at trajectory crossings.

3.2.5.1 Common form of the dissipative equations

There are a number of successful Riemann solver implementations (see, e.g., Inutsuka, 2002; Cha and Whitworth, 2003; Cha et al., 2010; Murante et al., 2011; Iwasaki and Inutsuka, 2011; Puri and Ramachandran, 2014), but the most widespread approach is the use of “artificial” dissipation terms in SPH. Such terms are not necessarily meant to mimic physical dissipation; their only purpose is to ensure that the Rankine-Hugoniot relations are fulfilled, though on a numerically resolvable rather than a microscopic scale. The most commonly chosen approach is to add the following terms

$${\left({{{d{{\vec v}_a}} \over {dt}}} \right)_{{\rm{diss}}}} = - \sum\limits_b {{m_b}{\Pi _{ab}}{\nabla _a}{W_{ab}}\quad {\rm{and}}\quad {{\left({{{d{u_a}} \over {dt}}} \right)}_{{\rm{diss}}}} = {1 \over 2}\sum\limits_b {{m_b}{\Pi _{ab}}{{\vec v}_{ab}} \cdot {\nabla _a}{W_{ab}}} } $$
(81)

to the momentum and energy equation, where \({\Pi _{ab}}\) is the artificial viscosity tensor. As long as \({\Pi _{ab}}\) is symmetric in a and b, the conservation of energy, linear and angular momentum is ensured by the form of the equation and the antisymmetry of \({\nabla _a}{W_{ab}}\) with respect to the exchange of a and b. There is some freedom in choosing \({\Pi _{ab}}\), but the most commonly used form is (Monaghan and Gingold, 1983)

$${\Pi _{ab}} = \left\{ {\begin{array}{*{20}{c}} {{{ - \alpha {{\bar c}_{ab}}{\mu _{ab}} + \beta \mu _{ab}^2} \over {{{\bar \rho }_{ab}}}}} & {{\rm{for}}\;{{\vec r}_{ab}} \cdot {{\vec v}_{ab}} < 0} \\ 0 & {{\rm{otherwise}}} \\ \end{array} } \right.,\quad {\rm{where}}\quad {\mu _{ab}} = {{{{\bar h}_{ab}}\,{{\vec r}_{ab}} \cdot {{\vec v}_{ab}}} \over {r_{ab}^2 + \varepsilon \bar h_{ab}^2}},$$
(82)

where the barred quantities are arithmetic averages of particles a and b and \({\vec{v}_{ab}} = {\vec{v}_a} - {\vec{v}_b}\). This is an SPH translation of a bulk and a von Neumann-Richtmyer viscosity, with viscous pressures of \({P_B} \propto - \rho l\nabla \cdot \vec{v}\) and \({P_{{\rm{NR}}}} \propto \rho {l^2}{(\nabla \cdot \vec{v})^2}\), respectively, where l is the local resolution length. This form has been chosen because it is Galilean invariant, vanishes for rigid rotation and ensures exact conservation. The parameters α and β set the strength of the dissipative terms and are usually chosen so that good results are obtained in standard benchmark tests.
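
A per-pair implementation of Eq. (82) might look as follows; the default values α = 1, β = 2 and ε = 0.01 are common choices in the literature, not universal recommendations.

    import numpy as np

    def pi_ab(r_ab, v_ab, rho_a, rho_b, c_a, c_b, h_a, h_b,
              alpha=1.0, beta=2.0, eps=0.01):
        # Monaghan & Gingold (1983) viscosity, Eq. (82); nonzero only for
        # approaching particles, i.e., r_ab . v_ab < 0
        rv = np.dot(r_ab, v_ab)
        if rv >= 0.0:
            return 0.0
        h_bar, c_bar = 0.5*(h_a + h_b), 0.5*(c_a + c_b)   # arithmetic averages
        rho_bar = 0.5*(rho_a + rho_b)
        mu = h_bar * rv / (np.dot(r_ab, r_ab) + eps * h_bar**2)
        return (-alpha * c_bar * mu + beta * mu**2) / rho_bar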

Comparison with Riemann solver approaches (Monaghan, 1997) suggests an alternative form that involves signal velocities and jumps in variables across characteristics. The main idea of these “discontinuity capturing terms” is that for any conserved scalar variable A with \(\sum\nolimits_a {{m_a}} d{A_a}/dt = 0\) a dissipative term of the form

$${\left({{{d{A_a}} \over {dt}}} \right)_{{\rm{diss}}}} = \sum\limits_b {{m_b}{{{\alpha _{A,b}}{v_{{\rm{sig}}}}} \over {{{\bar \rho }_{ab}}}}({A_a} - {A_b}){{\hat e}_{ab}} \cdot {\nabla _a}{W_{ab}}} $$
(83)

should be added, where the parameter \({\alpha _{A,b}}\) determines the exact amount of dissipation and \(v_{{\rm{sig}}}\) is a signal velocity between particles a and b. Applied to the velocity and the thermokinetic energy \(e = u + {v^2}/2\), this yields

$${\left({{{d{{\vec v}_a}} \over {dt}}} \right)_{{\rm{diss}}}} = \sum\limits_b {{m_b}{{\alpha {v_{{\rm{sig}}}}({{\vec v}_a} - {{\vec v}_b}) \cdot {{\hat e}_{ab}}} \over {{{\bar \rho }_{ab}}}}{\nabla _a}{W_{ab}}} $$
(84)
$${\left({{{d{{\hat e}_a}} \over {dt}}} \right)_{{\rm{diss}}}} = \sum\limits_b {{m_b}{{e_a^{\ast} - e_b^{\ast}} \over {{{\bar \rho }_{ab}}}}{{\hat e}_{ab}} \cdot {\nabla _a}{W_{ab}},} $$
(85)

where, following Price (2008), the energy e* includes velocity components along the line joining particles a and b, \(e_a^{\ast} = {\textstyle{1 \over 2}}\alpha {v_{{\rm{sig}}}}{({\vec{v}_a} \cdot {\hat{e}_{ab}})^2} + {\alpha _u}v_{{\rm{sig}}}^u{u_a}\). Note that in this equation different signal velocities and dissipation parameters can be used for the velocities and the thermal energy terms. Using \(d{u_a}/dt = d{\hat{e}_a}/dt - {\vec{v}_a} \cdot d{\vec{v}_a}/dt\), this translates into

$${\left({{{d{u_a}} \over {dt}}} \right)_{{\rm{diss}}}} = \sum\limits_b {{{{m_b}} \over {{{\bar \rho }_{ab}}}}\left[ {\alpha {v_{{\rm{sig}}}}{1 \over 2}{{({{\vec v}_{ab}} \cdot {{\hat e}_{ab}})}^2} + {\alpha _u}v_{{\rm{sig}}}^u({u_a} - {u_b})} \right]{{\hat e}_{ab}} \cdot {\nabla _a}{W_{ab}}} $$
(86)

for the thermal energy equation. The first term in this equation bears similarities to the “standard” artificial viscosity prescription, see Eq. (82). The second one expresses the exchange of thermal energy between particles and, therefore, represents an artificial thermal conductivity, which smoothes discontinuities in the specific energy. Such artificial conductivity had been suggested earlier to cure the so-called “wall heating problem” (Noh, 1987). Tests have shown that artificial conductivity substantially improves SPH’s performance in simulating Sedov blast waves (Rosswog and Price, 2007) and in the treatment of Kelvin-Helmholtz instabilities (Price, 2008). Note that the general strategy suggests using dissipative terms also for the continuity equation, so that particles can exchange mass in a conservative way. This has, so far, rarely been applied, but Read and Hayfield (2012) find good results with this strategy in multi-particle-mass simulations.

3.2.5.2 Time dependent dissipation parameters

The choice of the dissipation parameters is crucial and simply using fixed parameters applies dissipation also where it is not needed and actually unwanted. As a result, one simulates some type of viscous fluid rather than the intended inviscid medium. Therefore, Morris and Monaghan (1997) suggested an approach with time-dependent parameters for each particle and with triggers that indicate where dissipation is needed. Using \({\beta _a} = 2{\alpha _a}\) in Eq. (82) they evolved \({\alpha _a}\) according to

$${{d{\alpha _a}} \over {dt}} = \mathcal{A}_a^ + - \mathcal{A}_a^ - \quad {\rm{with}}\quad \mathcal{A}_a^ + = \max (- {(\nabla \cdot \vec v)_a},0)\quad {\rm{and}}\quad \mathcal{A}_a^ - = {{{\alpha _a}(t) - {\alpha _{\min }}} \over {{\tau _a}}},$$
(87)

where \({\alpha _{\min }}\) represents a minimum, “floor” value for the viscosity parameter and \({\tau _a}\sim {h_a}/{c_{{\rm{s}},a}}\) is the individual decay time scale. This approach (or slight modifications of it) has been shown to substantially reduce unwanted effects in practical simulations (Rosswog et al., 2000; Dolag et al., 2005; Wetzstein et al., 2009) but, as already realized in the original publication, triggering on the velocity divergence also raises the viscosity in a slow compression with \((\nabla \cdot \vec{v}) = {\rm{const}}\), where it is actually not needed.
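
In code, one explicit Euler step of Eq. (87) could be written as below; the decay parameter ℓ in \({\tau _a} = {h_a}/(\ell \,{c_{{\rm{s}},a}})\) and the ceiling on α are illustrative assumptions.

    import numpy as np

    def evolve_alpha(alpha, div_v, h, c_s, dt,
                     alpha_min=0.1, ell=0.2, alpha_max=2.0):
        # source term A+ = max(-div v, 0) raises alpha in compression,
        # decay term (alpha - alpha_min)/tau lets it relax back, Eq. (87)
        source = np.maximum(-div_v, 0.0)
        tau = h / (ell * c_s)
        return np.clip(alpha + dt * (source - (alpha - alpha_min) / tau),
                       alpha_min, alpha_max)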

3.2.5.3 New shock triggers

More recently, some improvements of the basic Morris and Monaghan idea were suggested by Cullen and Dehnen (2010). First, the authors argued that the floor value can be safely set to zero, provided that α can grow fast enough. To ensure the latter, the current value of \({\alpha _a}\) is compared to values indicated by triggers and, if necessary, \({\alpha _a}\) is increased instantaneously rather than by solving Eq. (87). Second, instead of using \({(\nabla \cdot \vec{v})_a}\) one triggers on its time derivative to find the locally desired dissipation parameter:

$${A_a} = {\xi _a}\max \left[ { - {{d{{(\nabla \cdot \vec v)}_a}} \over {dt}},0} \right]\quad {\rm{and}}\quad {\alpha _{a,{\rm{des}}}} = {\alpha _{\max }}{{{A_a}} \over {{A_a} + c_a^2/h_a^2}},$$
(88)

where \(c_a\) is a signal velocity, \(\xi_a\) a limiter, see below, and \({\alpha _{\max }}\) a maximally admissible dissipation value. If \({\alpha _{a,{\rm{des}}}} > {\alpha _a}\), then \({\alpha _a}\) is raised immediately to \({\alpha _{a,{\rm{des}}}}\); otherwise it decays according to the \(\mathcal{A}_a^ - \)-term in Eq. (87). Apart from overall substantially reducing dissipation, this approach possesses the additional virtue that α peaks ∼ 2 smoothing lengths ahead of a shock front and decays immediately after. Tests show that this produces much less unwanted dissipation than previous approaches.
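
In code, this trigger amounts to only a few lines. The sketch below assumes that an estimate of \(d{(\nabla \cdot \vec v)_a}/dt\) is available, e.g., from finite-differencing accurate velocity-divergence estimates between time steps; the value of \({\alpha _{\max }}\) is an illustrative assumption.

    def alpha_desired(xi_a, ddivv_dt_a, c_a, h_a, alpha_max=1.5):
        # shock indicator in the spirit of Cullen & Dehnen (2010), Eq. (88)
        A_a = xi_a * max(-ddivv_dt_a, 0.0)
        return alpha_max * A_a / (A_a + c_a**2 / h_a**2)

    def update_alpha(alpha_a, alpha_des, dt, tau_a, alpha_min=0.0):
        # raise alpha instantaneously, otherwise let it decay as in Eq. (87)
        return alpha_des if alpha_des > alpha_a else \
               alpha_a - dt * (alpha_a - alpha_min) / tau_a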

Read and Hayfield (2012) suggest a similar strategy, but trigger on the spatial change of the compression

$${A_a} = {\xi _a}h_a^2{\left| {\nabla (\nabla \cdot \vec v)} \right|_a}\quad {\rm{and}}\quad {\alpha _{a,{\rm{des}}}} = {\alpha _{\max }}{{{A_a}} \over {{A_a} + {h_a}{{\left| {\nabla \cdot \vec v} \right|}_a} + 0.05\,{c_a}}},$$
(89)

for \(\nabla \cdot \vec{v} < 0\), and \({\alpha _{a,{\rm{des}}}} = 0\) otherwise. The major idea is to detect convergence before it actually occurs. Particular care is taken to ensure that all fluid properties remain single valued as particles approach each other, and higher-order gradient estimators are used. With this approach they find good results with only very little numerical noise.

3.2.5.4 Noise triggers

The time scale \({\tau _a}\) on which dissipation is allowed to decay is usually chosen in a trial-and-error approach. If it is chosen too large, the dissipation may be higher than desired; if it decays too quickly, velocity noise may build up again after a shock has passed. Therefore, a safer strategy is to allow for a fast decay, but to devise additional triggers that switch on if noise is detected. The main idea for identifying noise is to monitor sign fluctuations of \(\nabla \cdot \vec{v}\) in the vicinity of a particle. A particle is considered “noisy” if some of its neighbors are being compressed while others expand. For example, the ratio

$${{{S_{1,a}}} \over {{S_{2,a}}}} \equiv {{\sum\nolimits_b {{{(\nabla \cdot \vec v)}_b}} } \over {\sum\nolimits_b {\vert \nabla \cdot \vec v{\vert_b}}}}$$
(90)

can be used to construct a noise trigger. It deviates from ±1 in a noisy region, since contributions of different sign are added up in \(S_{1,a}\), and therefore such deviations can be used as a noise indicator:

$$\mathcal{N}_a^{(1)} = \left\vert {{{{{\tilde S}_{1,a}}} \over {{S_{2,a}}}} - 1} \right\vert,$$
(91)

where the quantity

$${\tilde S_{1,a}} = \left\{ {\begin{array}{*{20}{c}} { - {S_{1,a}}} & {{\rm{if}}\;{{(\nabla \cdot \vec v)}_a} < 0} \\ {{S_{1,a}}} & {{\rm{else}}.} \\ \end{array} } \right.$$
(92)

If all particles in the neighborhood are either compressed or expanding, \(\mathcal{N}_a^{(1)}\) vanishes. This trigger only reacts on sign changes, but does not account for the strengths of the (de-)compressions. A second noise trigger that accounts for this uses

$$\mathcal{S}_a^ + = {1 \over {{N^ + }}}\sum\limits_{b,\nabla \cdot {{\vec v}_b} > 0}^{{N^ + }} {\nabla \cdot {{\vec v}_b}} \quad {\rm{and}}\quad \mathcal{S}_a^ - = - {1 \over {{N^ - }}}\sum\limits_{b,\nabla \cdot {{\vec v}_b} < 0}^{{N^ - }} {\nabla \cdot {{\vec v}_b}},$$
(93)

where \(N^+\)/\(N^-\) is the number of neighbor particles with positive/negative \(\nabla \cdot \vec{v}\). The trigger then reads

$$\mathcal{N}_a^{(2)} = \sqrt {\mathcal{S}_a^ + \mathcal{S}_a^ - }.$$
(94)

If there are sign fluctuations in \(\nabla \cdot \vec{v}\) but they are small compared to \(c_{\rm s}/h\), the product is very small; if we have a uniform expansion or compression, one of the factors will be zero. So only for sign changes and significantly large compressions/expansions will the product have a substantial value. Like in Eq. (88) the noise triggers \(\mathcal{N}_a^{(1)}\) and \(\mathcal{N}_a^{(2)}\) can be compared against reference values to steer the dissipation parameters. For more details on noise triggers see Rosswog (2015).
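
The two noise indicators can be evaluated from the velocity divergences of a particle and its neighbors, as in the following sketch (neighbor search and gradient estimation are assumed to happen elsewhere):

    import numpy as np

    def noise_triggers(div_v_a, div_v_neigh):
        # Eqs. (90)-(92): sign-fluctuation indicator N1
        S1 = np.sum(div_v_neigh)
        S2 = np.sum(np.abs(div_v_neigh))
        S1_tilde = -S1 if div_v_a < 0.0 else S1
        N1 = abs(S1_tilde / S2 - 1.0) if S2 > 0.0 else 0.0
        # Eqs. (93)-(94): strength-weighted indicator N2
        plus  = div_v_neigh[div_v_neigh > 0.0]
        minus = div_v_neigh[div_v_neigh < 0.0]
        S_plus  = plus.mean() if plus.size else 0.0
        S_minus = -minus.mean() if minus.size else 0.0
        N2 = np.sqrt(S_plus * S_minus)
        return N1, N2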

Note that in all these triggers (Cullen and Dehnen, 2010; Read and Hayfield, 2012; Rosswog, 2015) the gradients can straightforwardly be calculated from accurate expressions such as Eq. (21) or (14) rather than from kernel gradients.

3.2.5.5 Limiters

Complementary to these triggers one can also try to actively suppress dissipation where it is unwanted (with ideally working triggers this should, of course, not be necessary). For example, Balsara (1995) suggested distinguishing between shock and shear motion based on the ratio

$$\xi _a^{\rm{B}} = {{{{\left\vert {\nabla \cdot \vec v} \right\vert}_a}} \over {{{\left\vert {\nabla \cdot \vec v} \right\vert}_a} + {{\left\vert {\nabla \times \vec v} \right\vert}_a} + 0.0001\,{c_{{\rm{s}},a}}/{h_a}}}.$$
(95)

Dissipation is suppressed, \(\xi _a^{\rm{B}} \rightarrow 0\), where \(\vert \nabla \times \vec{v}{\vert _a}\; \gg \;\vert \nabla \cdot \vec{v}{\vert _a}\), whereas \(\xi _a^{\rm{B}}\) tends to unity in the opposite limit. If symmetrized limiters are applied, \(\Pi _{ab}^{\prime} = {\Pi _{ab}}\;{{\bar{\xi} }_{ab}}\), exact conservation is ensured. The Balsara limiter has been found very useful in many applications (Steinmetz, 1996; Navarro and Steinmetz, 1997; Rosswog et al., 2000), but it can be challenged if shocks occur in a shearing environment like an accretion disk (Owen, 2004). Part of this challenge comes from the finite accuracy of the standard SPH-derivatives. It is, however, straightforward (Cullen and Dehnen, 2010; Read and Hayfield, 2012; Rosswog, 2015) to use more accurate derivatives such as, for example, Eq. (14), in the limiters. Suppression of dissipation in shear flows can then be obtained by simply replacing the SPH-gradient operators in Eq. (95) by more accurate expressions, or by simply adding terms proportional to (accurate estimates for) \(\vert \nabla \times \vec{v}\vert \) in the denominators of Eqs. (88) and (89) (instead of multiplying \({\alpha _{{\rm{des}}}}\) by a limiter). Another alternative is the limiter proposed in Cullen and Dehnen (2010) that also makes use of more accurate gradient estimates:

$$\xi _a^{\rm{C}} = {{{{\left\vert {2{{(1 - {R_a})}^4}{{(\nabla \cdot \vec v)}_a}} \right\vert}^2}} \over {{{\left\vert {2{{(1 - {R_a})}^4}{{(\nabla \cdot \vec v)}_a}} \right\vert}^2} + tr({{\bf{S}}_a} \cdot {\bf{S}}_a^t)}},$$
(96)

where, in our notation and for generalized volume elements,

$${R_a} = {{{V_a}} \over {{X_a}}}\sum\limits_b {{\rm{sign}}{{(\nabla \cdot \vec v)}_b}} {X_b}{W_{ab}}.$$
(97)

Note that \(R_a\) is simply the ratio of a density summation, in which each term is weighted by the sign of \(\nabla \cdot \vec{v}\), and the normal density, Eq. (46). Therefore, near a shock \({R_a} \rightarrow - 1\). The matrix S is the traceless symmetric part of the velocity gradient matrix \(({\partial _i}{v_j})\) and a measure for the local shear. Similar to the Balsara factor, \(\xi^{\rm C}\) approaches unity if compression clearly dominates over shear and it vanishes in the opposite limit.
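
As an illustration, a Balsara-type limiter, Eq. (95), is a one-liner once divergence and curl estimates are available; as stressed above, its usefulness is tied to the accuracy of these gradient estimates.

    import numpy as np

    def balsara_limiter(div_v, curl_v, c_s, h):
        # Eq. (95): ~1 in (nearly) pure compression, -> 0 where shear dominates
        return abs(div_v) / (abs(div_v) + np.linalg.norm(curl_v)
                             + 1.0e-4 * c_s / h)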

3.3 Special-relativistic SPH

The special-relativistic SPH equations can — like the Newtonian ones — be elegantly derived from a variational principle (Monaghan and Price, 2001; Rosswog, 2009, 2010a, b). We discuss here a formulation (Rosswog, 2015) that uses generalized volume elements, see Eq. (45). It is assumed that space-time is flat, that the metric \({\eta _{\mu \nu }}\) has the signature (−,+,+,+) and that units in which the speed of light is equal to unity, c = 1, are adopted. We use the Einstein summation convention and reserve Greek letters for space-time indices from 0…3 with 0 being the temporal component, while i and j refer to spatial components and SPH particles are labeled by a, b and k.

The Lagrangian of a special-relativistic perfect fluid can then be written as (Fock, 1964)

$$L = - \int {{T^{\mu \nu }}{U_\mu }{U_\nu }dV,} $$
(98)

where the energy-momentum tensor of an ideal fluid reads

$${T^{\mu \nu }} = (e + P){U^\mu }{U^\nu } + P{\eta ^{\mu \nu }}$$
(99)

with e being the energy density, P the pressure and \(U^\mu\) the four-velocity of a fluid element. One can write the energy density as a sum of a contribution from the rest mass density and one from the internal energy

$$e = {\rho _{{\rm{rest}}}}{c^2} + u{\rho _{{\rm{rest}}}} = n{m_0}{c^2}(1 + u/{c^2}),$$
(100)

where the speed of light was, for clarity, shown explicitly. The baryon number density n is measured in the local fluid rest frame and the average baryon mass is denoted by \(m_0\). With the conventions that all energies are written in units of \(m_0c^2\) and c = 1, we can use the normalization of the four-velocity, \({U_\mu }{U^\mu } = - 1\), to simplify the Lagrangian to

$$L = - \int {n(1 + u)dV.} $$
(101)

The calculation will be performed in an a priori chosen “computing frame” (CF) and — due to length contraction — the baryon number density measured in this frame, N, is increased by a Lorentz factor γ with respect to the local fluid rest frame

$$N = \gamma n.$$
(102)

Therefore, the Lagrangian can be written as

$$L = - \int {dVN\left({{{1 + u} \over \gamma }} \right)\quad {\rm{or}}\quad } L \simeq - \sum\limits_b {{V_b}{N_b}} {{1 + {u_b}} \over {{\gamma _b}}} = - \sum\limits_b {{\nu _b}} {{1 + {u_b}} \over {{\gamma _b}}},$$
(103)

where the baryon number carried by particle b, \(\nu_b\), has been introduced. If volume elements of the form Eq. (45) are used, one can calculate CF baryon number densities as in the Newtonian case [see Eq. (46)]

$${N_b} = {{{\nu _b}} \over {{V_b}}} = {{{\nu _b}} \over {{X_b}}}\sum\limits_k {{X_k}{W_{bk}}({h_b})},$$
(104)

which reduces for the choice X = ν to the equivalent of the standard SPH sum, Eq. (56), but with ρ/m being replaced by N/ν. To obtain the equations of motion, \({\nabla _a}{N_b}\) and \(d{N_a}/dt\) are needed, which follow from Eq. (45) as

$${\nabla _a}{N_b} = {{{\nu _b}} \over {{X_b}{{\tilde \Omega }_b}}}\sum\limits_k {{X_k}{\nabla _a}{W_{bk}}({h_b})} \quad {\rm{and}}\quad {{d{N_a}} \over {dt}} = {{{\nu _a}} \over {{X_a}{{\tilde \Omega }_a}}}\sum\limits_k {{X_k}{{\vec v}_{ak}} \cdot {\nabla _a}{W_{ak}}({h_a})} $$
(105)

and which contain the generalized “grad-h” terms

$${\tilde \Omega _b} = 1 - {{{\nu _b}} \over {{X_b}}}{{\partial {h_b}} \over {\partial {N_b}}}\sum\limits_k {{X_k}{{\partial {W_{kb}}({h_b})} \over {\partial {h_b}}}.} $$
(106)

It turns out to be beneficial to use the canonical momentum per baryon

$${\vec S_a} \equiv {1 \over {{\nu _a}}}{{\partial L} \over {\partial {{\vec v}_a}}} = {\gamma _a}{\vec v_a}\left({1 + {u_a} + {{{P_a}} \over {{n_a}}}} \right)$$
(107)

as a numerical variable. Its evolution equation follows directly from the Euler-Lagrange equations as (Rosswog, 2015)

$${{d{{\vec S}_a}} \over {dt}} = - {1 \over {{\nu _a}}}\sum\limits_b {\left\{ {{{{P_a}V_a^2} \over {{{\tilde \Omega }_a}}}{{{X_b}} \over {{X_a}}}{\nabla _a}{W_{ab}}({h_a}) + {{{P_b}V_b^2} \over {{{\tilde \Omega }_b}}}{{{X_a}} \over {{X_b}}}{\nabla _a}{W_{ab}}({h_b})} \right\}}.$$
(108)

For the choice V k = ν k /N k , this reduces to the momentum equation given in Rosswog (2010b).

The canonical energy

$$E \equiv \sum\limits_a {{{\partial L} \over {\partial {{\vec v}_a}}} \cdot {{\vec v}_a} - L = } \sum\limits_a {{\nu _a}{\epsilon _a}}, $$
(109)

suggests to use

$${\epsilon _a} = {\vec v_a} \cdot {\vec S_a} + {{1 + {u_a}} \over {{\gamma _a}}} = {\gamma _a}\left({1 + {u_a} + {{{P_a}} \over {{n_a}}}} \right) - {{{P_a}} \over {{N_a}}} = {\gamma _a}{\mathcal{E}_a} - {{{P_a}} \over {{N_a}}},$$
(110)

as numerical energy variable. Here the specific, relativistic enthalpy was abbreviated as

$${\mathcal{E}_a} = 1 + {u_a} + {{{P_a}} \over {{n_a}}}.$$
(111)

The subsequent derivation is identical to the one in Rosswog (2009) up to their Eq. (165),

$${{d{\epsilon _a}} \over {dt}} = {\vec v_a} \cdot {{d{{\vec S}_a}} \over {dt}} + {{{P_a}} \over {N_a^2}}{{d{N_a}} \over {dt}},$$
(112)

which, upon using Eqs. (105) and (108), yields the special-relativistic energy equation

$${{d{\epsilon _a}} \over {dt}} = - {1 \over {{\nu _a}}}\sum\limits_b {\left\{ {{{{P_a}V_a^2} \over {{{\tilde \Omega }_a}}}{{{X_b}} \over {{X_a}}}{{\vec v}_b} \cdot {\nabla _a}{W_{ab}}({h_a}) + {{{P_b}V_b^2} \over {{{\tilde \Omega }_b}}}{{{X_a}} \over {{X_b}}}{{\vec v}_a} \cdot {\nabla _a}{W_{ab}}({h_b})} \right\}}.$$
(113)

Again, for V k = ν k /Nk, this reduces to the energy equation given in Rosswog (2010b). Of course, the set of equations needs to be closed by an equation of state which in the simplest case can be a polytrope.

Note that by choosing the canonical energy and momentum as numerical variables, one avoids complications such as time derivatives of Lorentz factors that have plagued earlier SPH formulations (Laguna et al., 1993a). The price one has to pay is that the physical variables (such as u and \(\vec v\)) need to be recovered at every time step from \(N\), \(\epsilon\) and \(\vec S\) by solving a non-linear equation. For this task approaches very similar to what is used in Eulerian relativistic hydrodynamics can be applied (Chow and Monaghan, 1997; Rosswog, 2010b).
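
The sketch below illustrates one possible recovery scheme, assuming an ideal-fluid equation of state \(P = (\Gamma - 1)nu\) and units with \(m_0 = c = 1\): for a trial pressure, Eqs. (107) and (110) give \(\vec v = \vec S/(\epsilon + P/N)\), from which γ, n and u follow, and the pressure is iterated (here by simple bisection) until it is consistent with the equation of state. Production codes typically use faster, safeguarded Newton-Raphson variants.

    import numpy as np

    def recover_primitives(N, S_vec, eps, Gamma=5.0/3.0, tol=1e-12):
        # recover (P, n, u, v) from N (Eq. 102), S (Eq. 107), epsilon (Eq. 110)
        def residual(P):
            v_vec = S_vec / (eps + P / N)               # since gamma*E = eps + P/N
            v2 = min(np.dot(v_vec, v_vec), 1.0 - 1e-12)  # guard for trial values
            gamma = 1.0 / np.sqrt(1.0 - v2)
            n = N / gamma
            u = (eps + P / N) / gamma - 1.0 - P / n
            return (Gamma - 1.0) * n * u - P, n, u, v_vec

        P_lo, P_hi = 0.0, max(1.0, (Gamma - 1.0) * N * eps)
        while residual(P_hi)[0] > 0.0:        # expand until the root is bracketed
            P_hi *= 10.0
        for _ in range(200):                  # bisection on the pressure
            P_mid = 0.5 * (P_lo + P_hi)
            f, n, u, v_vec = residual(P_mid)
            if f > 0.0:
                P_lo = P_mid
            else:
                P_hi = P_mid
            if P_hi - P_lo < tol * max(P_hi, 1.0):
                break
        return P_mid, n, u, v_vec

    # round trip: build (N, S, eps) from a known state and recover it
    Gamma, n0, u0 = 5.0/3.0, 1.0, 0.1
    v0 = np.array([0.3, 0.0, 0.0]); g0 = 1.0 / np.sqrt(1.0 - v0 @ v0)
    P0 = (Gamma - 1.0) * n0 * u0; E0 = 1.0 + u0 + P0 / n0
    print(recover_primitives(g0 * n0, g0 * E0 * v0, g0 * E0 - P0 / (g0 * n0)))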

As in the non-relativistic case, these equations need to be augmented by extra measures to deal with shocks. The dissipative terms (Chow and Monaghan, 1997)

$${\left({{{d{{\vec S}_a}} \over {dt}}} \right)_{{\rm{diss}}}} = - \sum\limits_b {{\nu _b}{\Pi _{ab}}\overline {{\nabla _a}{W_{ab}}} } \;{\rm{with}}\;{\Pi _{ab}} = - {{K{v_{{\rm{sig}}}}} \over {{{\bar N}_{ab}}}}(\vec S_a^{\ast} - \vec S_b^{\ast}) \cdot {\hat e_{ab}}$$
(114)
$${\left({{{d{\epsilon _a}} \over {dt}}} \right)_{{\rm{diss}}}} = - \sum\limits_b {{\nu _b}{{\vec \Omega }_{ab}} \cdot \overline {{\nabla _a}{W_{ab}}} } \;{\rm{with}}\;{\vec \Omega _{ab}} = - {{K{v_{{\rm{sig}}}}} \over {{{\bar N}_{ab}}}}(\epsilon _a^{\ast} - \epsilon _b^{\ast})\,{\hat e_{ab}}$$
(115)

with symmetrized kernel gradients

$$\overline {{\nabla _a}{W_{ab}}} = {1 \over 2}[{\nabla _a}{W_{ab}}({h_a}) + {\nabla _a}{W_{ab}}({h_b})],$$
(116)
$$\gamma _k^{\ast} = {1 \over {\sqrt {1 - {{({{\vec v}_k} \cdot {{\hat e}_{ab}})}^2}} }},\quad \vec S_k^{\ast} = \gamma _k^{\ast}\left({1 + {u_k} + {{{P_k}} \over {{n_k}}}} \right){\vec v_k}\quad {\rm{and}}\quad \epsilon _k^{\ast} = \gamma _k^{\ast}\left({1 + {u_k} + {{{P_k}} \over {{n_k}}}} \right) - {{{P_k}} \over {{N_k}}},$$
(117)

where the asterisk denotes the projection onto the line connecting two particles, give good results, even in very challenging shock tests (Rosswog, 2010b). A good choice for \(v_{{\rm{sig}}}\) is (Rosswog, 2010b)

$${v_{{\rm{sig,ab}}}} = \max ({\alpha _a},{\alpha _b}),$$
(118)

where

$$\alpha _k^ \pm = \max (0, \pm \lambda _k^ \pm)$$
(119)

with \(\lambda _k^ \pm \) being the extreme local eigenvalues of the Euler equations, see e.g., Martí and Müller (2003),

$$\lambda _k^ \pm = {{{v_\parallel }(1 - c_{\rm{s}}^2) \pm {c_{\rm{s}}}\sqrt {(1 - {v^2})(1 - v_\parallel ^2 - v_ \bot ^2c_{\rm{s}}^2)} } \over {1 - {v^2}c_{\rm{s}}^2}}$$
(120)

and \({c_{{\rm{s}},k}}\) is the relativistic sound velocity of particle k, \({c_{{\rm{s}},k}} = \sqrt {{{(\Gamma - 1)(\mathcal{E} - 1)} \over \mathcal{E}}}\). In 1D, this simply reduces to the usual velocity addition law, \(\lambda _k^ \pm = ({v_k} \pm {c_{{\rm{s}},k}})/(1 \pm {v_k}{c_{{\rm{s}},k}})\). As in the non-relativistic case, the challenge lies in designing triggers that switch on where needed, but not otherwise. The strategies that can be applied here are straightforward translations of those described in Section 3.2.5. We refer to Rosswog (2015) for more details on this topic.
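
For concreteness, a direct transcription of Eqs. (119) and (120) is sketched below; the pairing of the \(\alpha^\pm\) values in the final maximum is one plausible reading of Eq. (118).

    import numpy as np

    def v_sig(v_a, v_b, e_hat, cs_a, cs_b):
        # extreme local eigenvalues of Eq. (120); v_par is the velocity
        # component along the unit vector e_hat joining the two particles
        def lambdas(v_vec, cs):
            v2 = np.dot(v_vec, v_vec)
            v_par = np.dot(v_vec, e_hat)
            v_perp2 = v2 - v_par**2
            disc = np.sqrt((1.0 - v2) * (1.0 - v_par**2 - v_perp2 * cs**2))
            denom = 1.0 - v2 * cs**2
            return ((v_par * (1.0 - cs**2) + cs * disc) / denom,
                    (v_par * (1.0 - cs**2) - cs * disc) / denom)
        lp_a, lm_a = lambdas(v_a, cs_a)
        lp_b, lm_b = lambdas(v_b, cs_b)
        alpha_a = max(0.0, lp_a, -lm_a)   # alpha_k^+- = max(0, +-lambda_k^+-)
        alpha_b = max(0.0, lp_b, -lm_b)
        return max(alpha_a, alpha_b)      # Eq. (118)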

3.3.1 Integral approximation-based special-relativistic SPH

Like in the Newtonian case, alternative SPH equations can be obtained (Rosswog, 2015) by replacing the kernel gradients by the functions \(\vec G\), see Eq. (26):

$${{d{{\vec S}_a}} \over {dt}} = - {1 \over {{\nu _a}}}\sum\limits_b {\left\{ {{P_a}V_a^2{{{X_b}} \over {{X_a}}}{{\vec G}_a} + {P_b}V_b^2{{{X_a}} \over {{X_b}}}{{\vec G}_b}} \right\}} $$
(121)

and

$${{d{\epsilon _a}} \over {dt}} = - {1 \over {{\nu _a}}}\sum\limits_b {\left\{ {{P_a}V_a^2{{{X_b}} \over {{X_a}}}{{\vec v}_b} \cdot {{\vec G}_a} + {P_b}V_b^2{{{X_a}} \over {{X_b}}}{{\vec v}_a} \cdot {{\vec G}_b}} \right\}},$$
(122)

where (sum over d)

$${\left({{{\vec G}_a}} \right)^k} = {C^{kd}}({\vec r_a},\;{h_a}){({\vec r_b} - {\vec r_a})^d}{W_{ab}}({h_a})$$
(123)

and

$${\left({{{\vec G}_b}} \right)^k} = {C^{kd}}({\vec r_b},\;{h_b}){({\vec r_b} - {\vec r_a})^d}{W_{ab}}({h_b}).$$
(124)

The density calculation remains unchanged from Eq. (104). The same dissipative terms as for the kernel-gradient-based approach can be used, but it is important to replace \(\overline {{\nabla _a}{W_{ab}}} \) by

$${\overline {\vec G} _{ab}} = {1 \over 2}\left[ {{{\vec G}_a} + {{\vec G}_b}} \right],$$
(125)

since otherwise numerical instabilities can occur. This form has been extensively tested (Rosswog, 2015) and shown to deliver significantly more accurate results than the kernel-gradient-based approach.

3.4 General-relativistic SPH

For binaries that contain a neutron star or a black hole, general-relativistic effects are important. The first relativistic SPH formulations were developed by Kheyfets et al. (1990) and Mann (1991, 1993). Shortly after, Laguna et al. (1993a) developed a 3D, general-relativistic SPH code that was subsequently applied to the tidal disruption of stars by massive black holes (Laguna et al., 1993b). Their SPH formulation is complicated by several issues: the continuity equation contains a gravitational source term that requires SPH kernels for curved space-times. Moreover, owing to their choice of variables, the equations contain time derivatives of Lorentz factors that are treated by finite difference approximations and restrict the ability to handle shocks to only moderate Lorentz factors. The Laguna et al. formulation has been extended by Rantsiou et al. (2008) and applied to neutron star-black hole binaries, see Section 4.2.6. We focus here on SPH in a fixed background metric; approximate GR treatments are briefly discussed in Section 4.2.

In smooth continuation of the Newtonian and special-relativistic approaches, we focus here on the derivation from a Lagrangian. To this end, we assume that a prescribed metric \({g_{\mu \nu }}\) is known as a function of the coordinates and that the perturbations that the fluid induces to the space-time geometry can be safely neglected. Again, c = 1 and signature (−, +, +, +) are assumed and the same index conventions as in the special relativistic section are adopted. Contravariant spatial indices of a vector quantity w at particle a are denoted as \(w_a^i\), while covariant ones will be written as \({({w_i})_a}\). The line element and proper time are given by \(d{s^2} = {g_{\mu \nu }}d{x^\mu }d{x^\nu }\;{\rm{and}}\;d{\tau ^2} = - d{s^2}\) and the proper time is related to a coordinate time t by

$$\Theta d\tau = dt,$$
(126)

where a generalization of the Lorentz-factor

$$\Theta \equiv {1 \over {\sqrt { - {g_{\mu \nu }}{v^\mu }{v^\nu }} }}\quad {\rm{with}}\quad {v^\alpha } = {{d{x^\alpha }} \over {dt}}$$
(127)

was introduced. This relates to the four-velocity \(U^\mu\), normalized to \({U_\mu }{U^\mu } = - 1\), by

$${v^\mu } = {{d{x^\mu }} \over {dt}} = {{d{x^\mu }} \over {d\tau }}{{d\tau } \over {dt}} = {{{U^\mu }} \over \Theta } = {{{U^\mu }} \over {{U^0}}}.$$
(128)

The Lagrangian of an ideal relativistic fluid can be written as (Fock, 1964)

$$L = - \int {{T^{\mu \nu }}{U_\mu }{U_\nu }} \sqrt { - g} dV,$$
(129)

where \(g = \det ({g_{\mu \nu }})\) and Tμν denotes the energy-momentum tensor of an ideal fluid without viscosity and conductivity

$${T^{\mu \nu }} = (e + P){U^\mu }{U^\nu } + P{g^{\mu \nu }}$$
(130)

with the energy density e given in Eq. (100). With these conventions the Lagrangian can be written, similar to the special-relativistic case, as

$$L = - \int {n(1 + u)} \sqrt { - g} dV.$$
(131)

As before, for practical simulations we give up general covariance and choose a particular “computing frame”. To find an SPH discretization in terms of a suitable density variable one can express local baryon number conservation, \({({U^\mu }n)_{;\mu }} = 0\), as (Siegler and Riffert, 2000)

$${\partial _\mu }(\sqrt { - g} {U^\mu }n) = 0,$$
(132)

or, more explicitly, as

$${\partial _t}(N) + {\partial _i}(N{v^i}) = 0,$$
(133)

where Eq. (128) was used and the computing frame baryon number density

$$N = \sqrt { - g} \Theta n$$
(134)

was introduced. The total conserved baryon number can then be expressed as a sum over fluid parcels with volume ΔV b located at \({\vec r_b}\), where each parcel carries a baryon number ν b

$$\mathcal{N} = \int {N\;dV \simeq \sum\limits_b {{N_b}\Delta {V_b} = } \sum\limits_b {{\nu _b}}.} $$
(135)

Eq. (133) looks like the Newtonian continuity equation, which suggests using it in the SPH discretization process of a quantity f:

$$\tilde f(\vec r) \simeq \sum\limits_b {{f_b}} {{{\nu _b}} \over {{N_b}}}W(\vec r - {\vec r_b},h),$$
(136)

where the subscript b indicates that a quantity is evaluated at a position \({\vec r_b}\) and W is the smoothing kernel. If all ν b are kept constant in time, exact baryon number conservation is guaranteed and no continuity equation needs to be solved (this can be done, though, if desired). If Eq. (136) is applied to the baryon number density N at the position of particle a, one finds

$${N_a} = N({\vec r_a}) = \sum\limits_b {{\nu _b}} W({\vec r_a} - {\vec r_b},{h_a}).$$
(137)

Note that the locally smoothed quantities are evaluated with flat-space kernels, which assumes that the local space-time curvature radius is large in comparison to the local fluid resolution length. Such an approach is very convenient, but (more involved) alternatives to this approach exist (Kheyfets et al., 1990; Laguna et al., 1993a). It is usually desirable to adjust the smoothing length locally to fully exploit the natural adaptivity of a particle method. As before, one can adapt the smoothing length according to \({h_a} = \eta {({\nu _a}/{N_a})^{1/3}}\). Due to the mutual dependence an iteration is required at each time step to obtain consistent values for both density and smoothing length, similar to the Newtonian and special-relativistic case; a minimal sketch of such an iteration is given below, after Eq. (139). Motivated by Eqs. (134) and (135), one can re-write the fluid Lagrangian, Eq. (131), in terms of our computing frame number density N,

$$L = - \int {{{1 + u} \over \Theta }} N\;dV,$$
(138)

or, in discretized form,

$${L_{{\rm{SPH}}}} = - \sum\limits_b {{\nu _b}} {\left({{{1 + u} \over \Theta }} \right)_b}.$$
(139)
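
The density-smoothing length iteration mentioned above might, in its simplest form, look as follows (flat-space cubic spline kernel, brute-force distances; η ≈ 1.2–1.5 is a typical choice, and the fixed-point update is only one possible scheme):

    import numpy as np

    def w_cubic(r, h):
        # cubic spline kernel, 3D normalization
        q = r / h
        sig = 1.0 / (np.pi * h**3)
        return sig * np.where(q < 1.0, 1.0 - 1.5*q**2 + 0.75*q**3,
                     np.where(q < 2.0, 0.25*(2.0 - q)**3, 0.0))

    def density_and_h(pos, nu, h0, eta=1.4, n_iter=50, tol=1e-10):
        # iterate N_a (Eq. 137) and h_a = eta*(nu_a/N_a)^(1/3) to consistency
        h = h0.copy()
        for _ in range(n_iter):
            N = np.array([np.sum(nu * w_cubic(
                    np.linalg.norm(pos - pos[a], axis=1), h[a]))
                    for a in range(len(nu))])
            h_new = eta * (nu / N)**(1.0 / 3.0)
            converged = np.max(np.abs(h_new - h) / h) < tol
            h = h_new
            if converged:
                break
        return N, h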

This Lagrangian has the same form as in the special-relativistic case, see Eq. (103), with the Lorentz factor γ being replaced with its generalized form Θ. Using the Euler-Lagrange equations and taking care of the velocity dependence of both the generalized Lorentz-factor Θ and the internal energy, \({{\partial {u_b}} \over {\partial v_a^i}} = {{\partial {u_b}} \over {\partial {n_b}}}{{\partial {n_b}} \over {\partial v_a^i}}\), one finds the canonical momentum per baryon

$${({S_i})_a} \equiv {1 \over {{\nu _a}}}{{\partial L} \over {\partial v_a^i}} = {\Theta _a}\left({1 + {u_a} + {{{P_a}} \over {{n_a}}}} \right){({g_{i\mu }}{v^\mu })_a} = \left({1 + {u_a} + {{{P_a}} \over {{n_a}}}} \right){({U_i})_a}.$$
(140)

which generalizes Eq. (107). The Euler-Lagrange equations (Rosswog, 2010a) deliver

$${{d{{({S_i})}_a}} \over {dt}} = - \sum\limits_b {{\nu _b}} \left\{ {{{{P_a}\sqrt { - {g_a}} } \over {{\Omega _a}N_a^2}}{{\partial {W_{ab}}({h_a})} \over {\partial x_a^i}} + {{{P_b}\sqrt { - {g_b}} } \over {{\Omega _b}N_b^2}}{{\partial {W_{ab}}({h_b})} \over {\partial x_a^i}}} \right\} + {\left({{{\sqrt { - g} } \over {2N}}{T^{\mu \nu }}{{\partial {g_{\mu \nu }}} \over {\partial {x^i}}}} \right)_a},$$
(141)

again very similar to the special-relativistic case. Like before, the canonical energy

$$E \equiv \sum\limits_a {{{\partial L} \over {\partial v_a^i}}} v_a^i - L = \sum\limits_a {{\nu _a}} \left({v_a^i{{({S_i})}_a} + {{1 + {u_a}} \over {{\Theta _a}}}} \right)$$
(142)

suggests to use

$${e_a} \equiv v_a^i{({S_i})_a} + {{1 + {u_a}} \over {{\Theta _a}}}$$
(143)

as numerical variable whose evolution equation follows from straight forward differentiation (Rosswog, 2010a) as

$${{d{e_a}} \over {dt}} = - \sum\limits_b {{\nu _b}} \left\{ {{{{P_a}\sqrt { - {g_a}} v_b^i} \over {{\Omega _a}N_a^2}}{{\partial {W_{ab}}({h_a})} \over {\partial x_a^i}} + {{{P_b}\sqrt { - {g_b}} v_a^i} \over {{\Omega _b}N_b^2}}{{\partial {W_{ab}}({h_b})} \over {\partial x_a^i}}} \right\} - {\left({{{\sqrt { - g} } \over {2N}}{T^{\mu \nu }}{\partial _t}{g_{\mu \nu }}} \right)_a}.$$
(144)

Together with an equation of state, Eqs. (137), (141) and (144) represent a complete and self-consistently derived set of SPH equations. The gravitational terms are identical to those of Siegler and Riffert (2000) and Monaghan and Price (2001), but the hydrodynamic terms differ in both the particle symmetrization and the presence of the grad-h terms

$${\Omega _b} = 1 - {{\partial {h_b}} \over {\partial {N_b}}}\sum\limits_k {{\nu _k}} {{\partial {W_{bk}}({h_b})} \over {\partial {h_b}}}.$$
(145)

Note that the only choices in our above derivation were the h-dependence in Eq. (137) and how to adapt the smoothing length. The subsequent calculation contained no arbitrariness concerning the symmetry in particle indices, everything followed stringently from the first law of thermodynamics and the Euler-Lagrange equations. Another important point to note is that the derived energy equation, Eq. (144), does not contain destabilizing time derivatives of Lorentz factors (Norman and Winkler, 1986) on the RHS — in contrast to earlier SPH formulations (Laguna et al., 1993a). The above SPH equations can also be recast in the language of the 3+1 formalism; this can be found in Tejeda (2012).

3.4.1 Limiting cases

It is a straightforward exercise to show that, in the limit of vanishing hydrodynamic terms (i.e., u = P = 0), the evolution equations (141) and (144) reduce to the equation of geodesic motion (Tejeda, 2012). If, in the opposite limit, we neglect the gravitational terms in Eqs. (141) and (144) and assume flat space-time with Cartesian coordinates, one has \(\sqrt { - g} \rightarrow 1\) and Θ → γ, and Eq. (134) becomes N = γn. The momentum and energy equations reduce in this limit to the previous equations (108) and (113) (for X = ν).

3.5 Frequently used SPH codes

A number of SPH codes are regularly used throughout various areas of astrophysics and it is beyond the scope of this review to give a detailed account of each of them. In Table 3 we briefly summarize some of the basic features of Newtonian SPH codes that have been used in the context of compact object simulations and that will be referred to in the subsequent sections of this review. For a more detailed code description we refer to the original papers. SPH approaches that use GR or various approximations to it are further discussed in Section 4.2.

Table 3: Frequently used Newtonian SPH codes

3.6 Importance of initial conditions

It is not always sufficiently appreciated how important the initial conditions are for the accuracy of SPH simulations. Unfortunately, it can become rather non-trivial to construct them. One obvious requirement is the regularity of the particle distribution, so that the quality indicators \({\mathcal{Q}_1} - {\mathcal{Q}_4}\), Eqs. (6) and (9), are fulfilled with high accuracy. This suggests placing the particles initially on some type of lattice. However, these lattices should ideally possess at least two more properties: a) they should not contain preferred directions and b) they should represent a stable minimum energy configuration for the applied SPH formulation. Condition a) is desirable since otherwise shocks propagating along a symmetry axis of a lattice will continuously “collect” particles in this direction and this can lead to “lattice ringing effects”. Condition b) is important since a regular lattice by no means guarantees that the configuration actually represents an equilibrium for the SPH particles. As briefly outlined in Section 3.2.2, the momentum equation derived from a Lagrangian also contains a “regularization force” contribution that strives to achieve a good particle distribution. If the initial lattice does not represent such a configuration, the particles will start to move “off the lattice” and this will introduce noise. In Figure 8 we show the result of a numerical experiment. 10 K particles are placed either on a quadratic or a hexagonal lattice in the domain [−1,1] × [−1,1], with particles at the edges being fixed. If the configurations are allowed to evolve without dissipation (here we use the common cubic spline kernel), the particles in the quadratic lattice case soon begin to move off the grid and gain average velocities of \(\langle v \rangle \approx 0.015\,{c_{\rm{s}}}\), while they remain in the initial configuration when initially placed on a hexagonal lattice (\(\langle v \rangle \approx {10^{ - 4}}\,{c_{\rm{s}}}\)). This is discussed in more detail in Rosswog (2015). From a theoretical point of view, the relation between kernel, particle configuration and stability/noise is to date only incompletely understood and must be addressed in careful future studies.

Figure 8: Numerical experiment to illustrate the importance of the initial particle lattice. In this experiment particles are placed on either a quadratic or a hexagonal lattice and they are subsequently allowed to evolve (no dissipation, cubic spline kernel). In the quadratic lattice case, the particles begin to move around and rearrange themselves (middle panel; after 10 sound-crossing times), while they stay on the original lattice for the hexagonal case (left panel; same time). The logarithm of the average particle velocities for both cases (velocities in units of the sound speed, times in sound crossing times) is shown in the right panel.

In experiments (Rosswog, 2015) different kernels show a very different noise behavior and this seems to be uncorrelated with the accuracy properties of the kernels: kernels that are rather poor density and gradient estimators may be excellent in producing very little noise (see, for example, the QCM6 kernel described in Rosswog, 2015). On the other hand, very accurate kernels (like \(W_{H,9}\)) may still allow for a fair amount of noise in a dynamical simulation. The family of Wendland kernels (Wendland, 1995) has a number of interesting properties, among them the stability against particle pairing despite having a vanishing central derivative (Dehnen and Aly, 2012) and their reluctance to tolerate sub-resolution particle motion (Rosswog, 2015), in other words noise. In those experiments where particle noise is relevant for the overall accuracy, for example in the Gresho-Chan vortex test, see Section 3.7.4, the Wendland kernel, Eq. (34), performs substantially better than any of the other explored kernels.

A heuristic approach to construct good, low-noise initial conditions for the actually used kernel function is to apply a velocity-dependent damping term, \({\vec f_{{\rm{damp}},a}} \propto - {\vec v_a}/{\tau _a}\), with a suitably chosen damping time scale \({\tau _a}\), to the momentum equation. This “relaxation process” can be applied until some convergence criterion is met (say, the kinetic energy or some density oscillation amplitude has dropped below some threshold). This procedure becomes, however, difficult to apply for more complicated initial conditions and it can become as time consuming as the subsequent production simulation. For an interesting recent suggestion on how to construct more general initial conditions see Diehl et al. (2012).
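
A minimal version of such a relaxation loop might look as follows; the damping time scale, the crude sound-speed proxy and the convergence threshold are all ad hoc choices for illustration, and rhs stands for any right-hand-side evaluation returning (ρ, dv/dt, du/dt), such as the vanilla-ice sketch of Section 3.2.1.

    import numpy as np

    def relax(pos, vel, m, u, h, rhs, dt, n_steps=2000, f_tau=10.0):
        # damp particle motion with f_damp = -v/tau until the kinetic
        # energy has dropped below a (problem-dependent) threshold
        for _ in range(n_steps):
            _, dv, du = rhs(pos, vel, m, u, h)
            tau = f_tau * h / np.sqrt(u)      # crude per-particle time scale
            dv -= vel / tau[:, None]          # damping term
            vel += dt * dv
            u = np.maximum(u + dt * du, 1e-12)
            pos += dt * vel
            if 0.5 * np.sum(m * np.sum(vel**2, axis=1)) < 1e-12:
                break
        return pos, vel, u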

3.7 The performance of SPH

Here we cannot give an exhaustive overview of the performance of SPH in general. We do want to address, however, a number of issues that are of particular relevance in astrophysics, with particular emphasis on the recent improvements of SPH that were discussed in the previous sections. Most astrophysical simulations, however, are still carried out with older methodologies. This is natural since, on the one hand, SPH is still a relatively young numerical method for which improvements are constantly being suggested and, on the other hand, writing a well-tested and robust production code is usually a rather laborious endeavor. Nevertheless, efforts should be made to ensure that the latest developments find their way into production codes. In this sense, the examples shown below are also meant as a motivation for computational astrophysicists to keep their simulation tools up-to-date.

The accuracy of SPH with respect to commonly used techniques can be enhanced by using

  • higher-order kernels, for example, the WH,n or the Wendland kernel W3,3, see Section 2.3

  • different volume elements, see Section 3.1

  • more accurate integral-based derivatives, see Section 2.2.4

  • modern dissipation triggers, see Section 3.2.5.

In the examples shown here, we use a special-relativistic SPH code (“SPHINCS_SR”, Rosswog, 2015) to demonstrate SPH’s performance in a few astrophysically relevant examples. Below, we refer several times to the “\({\mathcal{F}_1}\) formulation”: it consists of the baryon number density calculated via Eq. (104) with volume weight \(X = {P^{0.05}}\), the integral approximation-based form of the SPH equations, see Eqs. (121) and (122), the shock trigger of Eq. (88) and noise triggers; see Rosswog (2015) for more details.

3.7.1 Advection

SPH is essentially perfect in terms of advection: if a particle carries a certain property, say some electron fraction, it simply transports this property with it (unless the property is changed by physical processes, of course). The advection accuracy is, contrary to Eulerian schemes, independent of the numerical resolution and governed only by the integration accuracy of the involved ordinary differential equations. We want to briefly illustrate the advection accuracy with an example that is a very serious challenge for Eulerian schemes, but essentially a “free lunch” for SPH: the highly supersonic advection of a density pattern through the computational domain with periodic boundary conditions. To this end, we set up 7 K SPH particles inside the domain [−1, 1] × [−1, 1]. Each fluid element is assigned a velocity in the x-direction of υ x = 0.9999 (in units of c), corresponding to a Lorentz factor of γ ≈ 70.7, and periodic boundary conditions are applied. The result of this experiment is shown in Figure 9: after crossing the computational domain ten times (more than 23 000 time steps; right panel) the shape of the triangle has not changed in any noticeable way with respect to the initial condition (left panel).
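The only box-specific ingredient of such a test is the periodic wrapping of the particle positions (together with a periodic neighbour search); everything else is carried by the particles themselves. A minimal sketch, with illustrative names:

```python
import numpy as np

v_x = 0.9999                              # advection velocity in units of c
gamma = 1.0 / np.sqrt(1.0 - v_x**2)       # Lorentz factor, ~ 70.7

def wrap_periodic(pos, lo=-1.0, hi=1.0):
    """Map particle positions back into the periodic box [lo, hi)."""
    return lo + np.mod(pos - lo, hi - lo)
```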

Figure 9:
figure 9

Demonstration of SPH’s superb advection properties. A high-density wedge (N = 2) in pressure equilibrium with the background (N = 1) is advected with velocity υ x = 0.9999, corresponding to a Lorentz factor of Γ = 70.7, through a box with periodic boundaries. For this test only 7000 particles are used. After crossing the box 10 times (or more than 23 000 time steps; right panel) no deterioration of the shape with respect to the initial condition (left panel) is noticeable.

3.7.2 Shocks

Since shocks play a crucial role in many areas of astrophysics, we want to briefly discuss SPH’s performance in shocks. As an illustration, we show the result of a 2D relativistic shock tube test where the left state is given by [P, υ x ,υ y ,N] L = [40/3, 0, 0, 10] and the right state by [P,υ x ,υ y ,N] R = [10−6, 0, 0, 1]. The polytropic exponent is Γ = 5/3 and the results are shown at t = 0.25. In the experiment, 80 K particles were initially placed on a hexagonal lattice in the region [−0.4, 0.4] × [−0.02, 0.02]. Overall, the exact solution (red solid line) is well captured, see Figure 10. The discontinuities, however, are less sharp than those obtained with modern high-resolution shock-capturing methods at the same resolution. Characteristic for SPH is that the particles that were initially located on a lattice get compressed in the shock and, in the post-shock region, need to “re-grid” themselves into a new particle distribution. This also means that the particles have to pass each other and, therefore, acquire a small velocity component in the y-direction. As a result, there is some unavoidable “re-meshing noise” behind the shock front. This can be tamed by using smoother kernels and, in general, the details of the results depend on the exact choice of dissipation parameters and initial conditions. Note that the noise trigger that is used in the shown simulation applies dissipation in the re-gridding region behind the shock; therefore, the contact discontinuity has been somewhat broadened. Reducing the amount of dissipation in this region (i.e., raising the tolerable reference value for noise) would sharpen the contact discontinuity, but at the price of increased re-meshing noise in the velocity.
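For reference, the initial states can be encoded as follows. Note that with equal-baryon-number particles the density jump is not assigned per particle but must be built into the lattice spacing; this is a sketch under that assumption:

```python
import numpy as np

# Left/right states [P, vx, vy, N] of the 2D relativistic shock tube
left  = dict(P=40.0 / 3.0, vx=0.0, vy=0.0, N=10.0)
right = dict(P=1.0e-6,     vx=0.0, vy=0.0, N=1.0)
Gamma = 5.0 / 3.0                                  # polytropic exponent

# With equal baryon-number particles the 10:1 jump in N is realized purely
# by the lattice constant: in 2D the spacing on the low-density side must
# be larger by sqrt(N_left / N_right) per direction.
spacing_ratio = np.sqrt(left["N"] / right["N"])    # ~ 3.16
```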

Figure 10:
figure 10

Two-dimensional relativistic shock tube test. The exact solution is shown as red, solid line, the SPH results as open squares. Each SPH particle is shown.

3.7.3 Fluid instabilities

The standard formulation of SPH has recently been criticized (Thacker et al., 2000; Agertz et al., 2007; Springel, 2010a; Read et al., 2010) for its performance in handling fluid instabilities. As discussed in Section 3.1, this is related to the different smoothness of density and internal energy, which can lead to spurious pressure forces acting like a surface tension. We want to briefly illustrate that a modern formulation can capture Kelvin-Helmholtz instabilities accurately. Three stripes of hexagonal lattices are placed in the domain [−1, 1] × [−1, 1] so that a density of N = 2 is realized in the middle stripe (∣y∣ < 0.3) and N = 1 in the surrounding stripes. The high-density stripe moves with υ x = 0.2, the other two with υ x = −0.2; the pressure is P0 = 10 everywhere and the polytropic exponent is Γ = 5/3. No perturbation mode is triggered explicitly; we wait until the small perturbations that occur as particles at the interface pass each other grow into a healthy Kelvin-Helmholtz instability. Figure 11 shows snapshots at t = 2, 3.5 and 5.0. Note that “standard methods” (fixed, high dissipation parameters, volume element ν/N or m/ρ and direct gradients of the M4 kernel) do not, for this setup, lead to a noticeable instability on the shown time scale (Rosswog, 2015).
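A minimal sketch of the corresponding initial state assignment is shown below. The function and parameter names are illustrative, and the particle positions pos are assumed to already realize the 2:1 density contrast through their lattice spacing:

```python
import numpy as np

def kh_initial_state(pos, v_shear=0.2, y_stripe=0.3,
                     N_in=2.0, N_out=1.0, P0=10.0):
    """Three-stripe shear initial condition on a given particle set.

    Pressure is uniform, so only the shear layers at |y| = y_stripe are
    out of (numerical) equilibrium; no perturbation mode is seeded, the
    instability grows from particle-scale irregularities."""
    y = pos[:, 1]
    inside = np.abs(y) < y_stripe
    N  = np.where(inside, N_in, N_out)           # baryon number density
    vx = np.where(inside, v_shear, -v_shear)     # opposing stream velocities
    vy = np.zeros_like(vx)
    P  = np.full(vx.shape, P0)                   # uniform pressure
    return N, vx, vy, P
```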

Figure 11:
figure 11

Example of an untriggered Kelvin-Helmholtz instability. The central stripe has a density of N = 2 and moves with velocity υ x = 0.2; the surrounding stripes have density N = 1 and υ x = −0.2. No particular mode was excited; the instability is triggered by small numerical irregularities at the interface that grow into Kelvin-Helmholtz billows. The shown snapshots are taken at t = 2, 3.5 and 5.0; 1000 K SPH particles were used. The simulation has been performed with the \({\mathcal{F}_1}\) SPH formulation of Rosswog (2015). Note that with “standard techniques” (constant, high dissipation, standard volume elements and direct gradients of the M4 kernel) no Kelvin-Helmholtz billows develop on the shown time scale.

3.7.4 The Gresho-Chan vortex

The Gresho-Chan vortex (Gresho and Chan, 1990) is considered a particularly difficult test in general, and for SPH in particular. As shown in Springel (2010a), standard SPH shows very poor convergence in this test. The test deals with a stationary vortex that should be in stable equilibrium: since centrifugal forces and pressure gradients balance exactly, any deviation from the initial configuration that develops over time is spurious and of purely numerical origin. The azimuthal component of the velocity in this test rises linearly up to a maximum value of υ0, which is reached at r = R1, and subsequently decreases linearly back to zero at 2R1:

$$v_\phi (r) = v_0 \begin{cases} u & {\rm for}\;\; u \leq 1 \\ 2 - u & {\rm for}\;\; 1 < u \leq 2 \\ 0 & {\rm for}\;\; u > 2, \end{cases}$$
(146)

where u = r/R1. We require that centrifugal and pressure accelerations balance, therefore, the pressure becomes

$$P(r) = P_0 + \begin{cases} \frac{1}{2} v_0^2 u^2 & {\rm for}\;\; u \leq 1 \\ 4 v_0^2 \left( \frac{u^2}{8} - u + \ln u + 1 \right) & {\rm for}\;\; 1 < u \leq 2 \\ 4 v_0^2 \left( \ln 2 - \frac{1}{2} \right) & {\rm for}\;\; u > 2. \end{cases}$$
(147)
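As a concrete reference, here is a minimal Python sketch of these profiles, Eqs. (146) and (147). The default P0 = 5 is a common choice in the non-relativistic literature and, like the function name, is an assumption here:

```python
import numpy as np

def gresho_chan(r, v0=1.0, R1=0.2, P0=5.0):
    """Azimuthal velocity and pressure of the Gresho-Chan vortex,
    Eqs. (146) and (147), with u = r/R1.  Both branches are continuous
    at u = 1 and u = 2; P0 only sets the background pressure level."""
    u = np.asarray(r, dtype=float) / R1
    v_phi = v0 * np.where(u <= 1.0, u, np.where(u <= 2.0, 2.0 - u, 0.0))
    u_safe = np.maximum(u, 1.0e-30)          # avoid log(0) in unused branch
    P = P0 + np.where(
        u <= 1.0, 0.5 * v0**2 * u**2,
        np.where(u <= 2.0,
                 4.0 * v0**2 * (u**2 / 8.0 - u + np.log(u_safe) + 1.0),
                 4.0 * v0**2 * (np.log(2.0) - 0.5)))
    return v_phi, P
```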

In the literature on non-relativistic hydrodynamics (Liska and Wendroff, 2003; Springel, 2010a; Read and Hayfield, 2012; Dehnen and Aly, 2012), usually υ0 = 1 is chosen together with R1 = 0.2, a uniform density ρ = 1 and a polytropic exponent of 5/3. Since we perform this test with the special-relativistic code SPHINCS_SR in the Newtonian limit, we use most of these values, but choose R1 = 2 × 10−4 and υ0 = 10−3 to be safely in the non-relativistic regime. We use this test here to illustrate what difference modern concepts can make in this challenging test, see Figure 12. The figure compares the result from a modern SPH formulation (formulation \({\mathcal{F}_1}\) from Rosswog, 2015) with one that applies “traditional” approaches (fixed, high dissipation parameters, volume element ν/N or m/ρ and direct gradients of the M4 kernel). The traditional approach fails completely and actually does not converge to the correct solution (Springel, 2010a; Rosswog, 2015), while the new, more sophisticated approach yields very good results.

Figure 12:
figure 12

Comparison of the performance of a modern SPH formulation (good dissipation triggers, high-order kernel, integral-based gradient calculation; this is the \({\mathcal{F}_1}\) formulation from Rosswog, 2015) and a more traditional one (constant high dissipation, cubic spline kernel, gradients from kernel derivatives; this is the \({\mathcal{F}_4}\) formulation from the same paper) in the challenging Gresho-Chan vortex problem at t = 1. At this time, the combination of noise and high dissipation has, in the second case, already very seriously deteriorated the ideally stationary solution. The modern formulation, in contrast, has stayed very close to the initial condition. Also indicated is the L1 error of the velocity.

3.8 Summary: SPH

The most outstanding property of SPH is its exact numerical conservation. This can be achieved straightforwardly via symmetries in the particle indices of the SPH equations together with anti-symmetric gradient estimates. The most elegant and least arbitrary strategy to obtain a conservative SPH formulation is to start from a fluid Lagrangian and to derive the evolution equations via a variational principle. This approach can be applied in the Newtonian, special- and general-relativistic cases, see Sections 3.2, 3.3 and 3.4. Advection of fluid properties is essentially perfect in SPH and, in particular, it does not depend on the coordinate frame in which the simulation is performed.

SPH robustly captures shocks, but they are, at a given resolution, not as sharp as those from state-of-the-art high-resolution shock-capturing schemes. Moreover, standard SPH has been criticized for its (in)ability to resolve fluid instabilities under certain circumstances. Another issue that requires attention when performing SPH simulations is the initial conditions. When not prepared carefully, they can easily lead to noisy results, since the regularization force discussed in Section 3.2.3 leads, for poor particle distributions, to a fair amount of particle velocity fluctuations. This issue is particularly severe when poor kernel functions and/or low neighbor numbers are used, see Section 2.

Recently, a number of improvements to SPH techniques have been suggested. These include a) more accurate gradient estimates, see Section 2.2, b) new volume elements that eliminate spurious surface tension effects, see Section 3.1, c) higher-order kernels, see Section 2.3, and d) more sophisticated dissipation switches, see Section 3.2.5. As illustrated, for example, by the Gresho-Chan vortex test in Section 3.7.4, enhancing SPH with these improvements can substantially increase its accuracy with respect to older SPH formulations.

4 Astrophysical Applications

The remaining part of this review is dedicated to actual applications of SPH to astrophysical studies of compact objects. We focus on encounters of

  • two white dwarfs (Section 4.1),

  • two neutron stars (Section 4.2.5) and

  • a neutron star with a black hole (Section 4.2.6).

In each case, the focus is on gravitational-wave-driven binary mergers. In locations with large stellar number densities, e.g., globular clusters, dynamical collisions between stars occur frequently, and encounters between two neutron stars or between a neutron star and a stellar-mass black hole may yield very interesting signatures. Therefore, such encounters are also briefly discussed.

In each of these fields, a wealth of important results has been achieved with a number of different methods. Naturally, since the scope of this review is SPH methods, we will focus our attention here on those studies that are at least partially based on SPH simulations. For further studies that are based on different methods we refer to the literature.

4.1 Double white-dwarf encounters

4.1.1 Relevance

White dwarfs (WDs) are the evolutionary end stages of most stars in the Universe: for every solar mass of stars that forms, ∼ 0.22 WDs will be produced on average. As a result, the Milky Way contains ∼ 1010 WDs in total (Napiwotzki, 2009) and ∼ 108 double WD systems (Nelemans et al., 2001b). About half of these systems have separations that are small enough (orbital periods < 10 hrs) that gravitational wave emission will bring them into contact within a Hubble time, making them a major target for the eLISA mission (Amaro-Seoane et al., 2013). Once in contact, the binary system will merge in almost all cases; in the remaining small fraction of cases, mass transfer may stabilize the orbital decay and lead to long-lived interacting binaries such as AM CVn systems (Paczyński, 1967; Warner, 1995; Nelemans et al., 2001a; Nelemans, 2005; Solheim, 2010).

Those systems that merge may have a variety of interesting outcomes. The merger of two He WDs may produce a low-mass He star (Webbink, 1984; Iben Jr and Tutukov, 1986; Saio and Jeffery, 2000; Han et al., 2002), He-CO mergers may form hydrogen-deficient giant or R CrB stars (Webbink, 1984; Iben Jr et al., 1996; Clayton et al., 2007) and, if two CO WDs merge, the outcome may be a more massive, possibly hot and high B-field WD (Bergeron et al., 1991; Barstow et al., 1995; Segretain et al., 1997). A good fraction of the CO-CO merger remnants probably transforms into ONeMg WDs which finally, due to electron captures on Ne and Mg, undergo an accretion-induced collapse (AIC) to a neutron star (Saio and Nomoto, 1985; Nomoto and Kondo, 1991; Saio and Nomoto, 1998). Given that the nuclear binding energy that can still be released by burning to iron-group elements (1.6 MeV per nucleon from He, 1.1 MeV from C and 0.8 MeV from O) is large, it is not too surprising that there are also various pathways to thermonuclear explosions. The ignition of helium on the surface of a WD may lead to weak thermonuclear explosions (Bildsten et al., 2007; Foley et al., 2009; Perets et al., 2010), sometimes called “.Ia” supernovae. The modern view is that WDWD mergers might also trigger type Ia supernovae (SN Ia) (Webbink, 1984; Iben Jr and Tutukov, 1984) and, in some cases, even particularly bright “super-Chandrasekhar” explosions, e.g., Howell et al. (2006); Hicken et al. (2007).

SN Ia are important as cosmological distance indicators, as factories for intermediate-mass and iron-group nuclei, as cosmic-ray accelerators, as kinetic energy sources for galaxy evolution, and simply in their own right as end points of binary stellar evolution. After having been regarded for decades as only the second-best option behind the “single degenerate” model, it now seems entirely possible that double-degenerate mergers are behind a sizable fraction of SN Ia. With the re-discovery of double degenerates as promising type Ia progenitors, an interesting time for supernova research has begun. See Howell (2011) and Maoz et al. (2013) for two excellent recent reviews on this topic.

Below, we will briefly summarize the challenges in a numerical simulation of a WDWD merger (Section 4.1.2) and then discuss recent results concerning mass-transferring systems (Section 4.1.3) and, closely related, the final merger of a WDWD binary and possibilities to trigger SN Ia (Section 4.1.4). We will also briefly discuss dynamical collisions of WDs (Section 4.1.5). For SPH studies that explore the gravitational wave signatures of WDWD mergers we refer to the literature (Lorén-Aguilar et al., 2005; Dan et al., 2011; van den Broek et al., 2012).

Note that in this section we explicitly include the constants G and c in the equations to allow for a simple link to the astrophysical literature.

4.1.2 Challenges

WDWD merger simulations are challenging for a number of reasons not the least of which are the onset of mass transfer and the self-consistent triggering of thermonuclear explosions.

While two white dwarfs revolve around their common center of mass, gravitational-wave emission reduces the separation a of a circular binary orbit at a rate of (Peters and Mathews, 1963; Peters, 1964)

$${\dot a_{{\rm{GW}}}} = - {{64{G^3}} \over {5{c^5}}}{{{m_1}{m_2}M} \over {{a^3}}},$$
(148)

where m1 and m2 are the component masses and M = m1 + m2 is the total mass. Although it is gravitational wave emission that drives the binary towards mass transfer/merger in the first place, its dynamical impact at the merger stage is completely negligible since

$${{{\tau _{{\rm{GW}}}}} \over {{P_{{\rm{orb}}}}}} = {{a/\vert {{\dot a}_{{\rm{GW}}}}\vert } \over {2\pi/{\omega _{\rm{K}}}}} = 6.6 \cdot {10^8}{\left({{a \over {2 \times {{10}^9}{\rm{cm}}}}} \right)^{5/2}}\left({{{0.6{M_ \odot }} \over {{m_1}}}} \right)\left({{{0.6{M_ \odot }} \over {{m_2}}}} \right){\left({{{1.2{M_ \odot }} \over M}} \right)^{1/2}}.$$
(149)

Mass transfer will set in once the size of the Roche lobe of one of the stars has become comparable to the size of the star itself. Due to the inverted mass-radius relationship of WDs, it is always the less massive WD (the “secondary”) that fills its Roche lobe first. From the Roche-lobe size and Kepler’s third law, the average density \({\bar{\rho }}\) of the donor star can be related to the orbital period (Paczyński, 1971; Frank et al., 2002)

$$\overline \rho \approx {{115\;{\rm{g}}\;{\rm{c}}{{\rm{m}}^{ - 3}}} \over {P_{{\rm{hr}}}^2}},$$
(150)

where Phr is the orbital period measured in hours. In other words: the shorter the orbital period, the higher the density of the mass-donating star. For periods below 1 hr the donor densities exceed those of main-sequence stars, which signals that a compact star is involved. If one uses the typical dynamical time scale of a WD, \({\tau _{{\rm{dyn}}}} \approx {(G\bar{\rho})^{-1/2}}\), one finds

$${{{P_{{\rm{orb}}}}} \over {{\tau _{{\rm{dyn}}}}}} \approx 10,$$
(151)

so that a single orbit already requires ∼ 10 000 (ndyn/1000) numerical time steps, if ndyn denotes the number of numerical time steps per stellar dynamical time. This demonstrates that long-lived mass transfer over tens of orbital periods can become computationally quite expensive and may place limits on the numerical resolution that can be afforded in such a simulation. On the other hand, when numerically resolvable mass transfer sets in, it already has a rate of

$${\dot M_{\lim }}\sim{{1\,{\rm{particle}}\,{\rm{mass}}} \over {{\rm{orbital}}\,{\rm{period}}}}\sim 2 \times {10^{ - 8}}{{{M_ \odot }} \over {\rm{s}}}\left({{{{{10}^6}} \over {{\rm{npart}}}}} \right){\left({{M \over {1{M_ \odot }}}} \right)^{3/2}}{\left({{{2 \cdot {{10}^9}{\rm{cm}}} \over {{a_0}}}} \right)^{3/2}},$$
(152)

where npart is the total number of SPH particles, M is the total mass of the binary and a0 is the separation between the stars at the onset of mass transfer. This limit due to finite numerical resolution is several orders of magnitude above the Eddington limit of WDs; therefore, sub-Eddington accretion rates are hardly ever resolvable within a global, 3D SPH simulation. The transferred matter comes initially from the tenuous WD surface which, in SPH, is the most poorly resolved region of the star. Following this matter is also a challenge for Eulerian methods, since it needs to be disentangled from the “vacuum” background and, due to the resolution-dependent angular momentum conservation, it is difficult to obtain the correct feedback on the orbital evolution. In other words: the consistent simulation of mass transfer and its feedback on the binary dynamics is a serious challenge for every numerical method.
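The scaling relations (149) and (152) are easy to evaluate. The following sketch (in cgs units; function names are illustrative) reproduces the quoted numbers:

```python
import numpy as np

G, c, Msun = 6.674e-8, 2.998e10, 1.989e33          # cgs units

def tau_gw_over_porb(a, m1, m2):
    """Ratio of the GW decay time a/|a_dot| (from Eq. 148) to the
    orbital period, cf. Eq. (149)."""
    M = m1 + m2
    a_dot = 64.0 * G**3 * m1 * m2 * M / (5.0 * c**5 * a**3)
    P_orb = 2.0 * np.pi * np.sqrt(a**3 / (G * M))
    return (a / a_dot) / P_orb

def mdot_resolution_limit(npart, M, a0):
    """Smallest resolvable mass transfer rate, Eq. (152): one particle
    mass per orbital period (returned in Msun per second)."""
    P_orb = 2.0 * np.pi * np.sqrt(a0**3 / (G * M))
    return (M / Msun) / npart / P_orb

print(tau_gw_over_porb(2.0e9, 0.6 * Msun, 0.6 * Msun))   # ~ 7e8, cf. Eq. (149)
print(mdot_resolution_limit(1.0e6, 1.0 * Msun, 2.0e9))   # ~ 2e-8 Msun/s
```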

The onset of mass transfer represents a juncture in the life of a WDWD binary, since now the stability of mass transfer decides whether the binary can survive or will inevitably merge. The latter depends sensitively on the internal structure of the donor star, the binary mass ratio and the angular momentum transport mechanisms (e.g., Marsh et al., 2004; Gokhale et al., 2007). Due to the inverse mass-radius relationship of WDs, the secondary will expand on mass loss and, therefore, tend to speed up the mass loss further. On the other hand, since the mass is transferred to the higher-mass object, momentum/center-of-mass conservation will tend to widen the orbit and, therefore, to reduce the mass transfer. If the circularization radius of the transferred matter is smaller than the primary radius, the matter will directly impact on the stellar surface and tend to spin up the accreting star. In this way, orbital angular momentum is lost to the spin of the primary which, in turn, decreases the orbital separation and accelerates the mass transfer. If, on the other hand, the circularization radius is larger than the primary radius and a disk can form, angular momentum can, via the large lever arm of the disk, be fed back into the orbital motion and stabilize the binary system (Iben Jr et al., 1998; Piro, 2011). To make things even more complicated, if tidal interaction substantially heats up the mass-donating star, this may alter its internal structure and, therefore, change its response to mass loss.

Capturing these complex angular momentum transfer mechanisms reliably in a simulation requires very accurate numerical angular momentum conservation. We want to briefly illustrate this point with a small numerical experiment. A 0.3 + 0.6 M WD binary system is prepared in a Keplerian orbit, so that mass transfer is about to set in. To mimic the effect of numerical angular momentum loss in a controllable way, we add small artificial forces similar to those emerging from gravitational wave emission (Peters and Mathews, 1963; Peters, 1964; Davies et al., 1994) and adjust their overall value so that 4% or 0.5% of the angular momentum is lost per orbit. These results are compared to a simulation without artificial loss terms, in which the angular momentum is conserved to better than 0.01% per orbit, see Figure 13. The effect on the mutual separation a (in 109 cm) is shown in the upper panels; the gravitational wave amplitude h+ (scaled with the distance to the observer r), as calculated in the quadrupole formalism, is shown in the lower panels.

Even the moderate loss of 0.5% angular momentum per orbit leads to a quick artificial merger and a mass transfer duration that is reduced by more than a factor of three. These conservation requirements make SPH a natural choice for WDWD merger simulations, and it has indeed been the first method used for this type of problem.
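Such a controlled angular momentum drain is simple to set up. The sketch below uses a plain velocity drag rather than the GW-like force terms of the actual experiment (a deliberate simplification); the key point is only that the removal rate per orbit is prescribed:

```python
import numpy as np

def am_drain_accel(vel, porb, frac_per_orbit):
    """Toy drag that removes a fixed fraction of the angular momentum
    per orbit.  Since dL/dt = sum_a m_a r_a x (-v_a/tau) = -L/tau,
    choosing tau = P_orb / frac removes approximately `frac_per_orbit`
    of L during each orbital period (for frac << 1)."""
    tau = porb / frac_per_orbit
    return -vel / tau
```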

As outlined above, one of the most exciting possibilities is the triggering of thermonuclear explosions during the interaction of two WDs. Such explosions can either be triggered by a shock wave, where the thermonuclear energy generation behind the shock overcomes possible dissipative losses, or spontaneously, if the local conditions for burning are favorable enough that it occurs faster than the star can react by expanding. Seitenzahl et al. (2009) have studied detonation conditions in detail via local simulations and found that critical detonation conditions can require that length scales down to centimeters be resolved, which is, of course, a serious challenge for global, 3D simulations of objects with radii of ∼ 109 cm.

Figure 13:
figure 13

Numerical experiment to illustrate the sensitivity of a binary system to the non-conservation of angular momentum. A WDWD binary system (0.3 and 0.6 M) is adiabatically relaxed to the point where mass transfer is about to set in, and then its orbital evolution is followed in a 3D hydrodynamic simulation. To mimic numerical loss of angular momentum, an artificial force is applied that removes angular momentum at a rate of 0.5% (middle panel) and 4% per orbit (right panel). The simulation shown in the leftmost panel conserves angular momentum to better than 0.01% per orbit. Shown are the binary separation a (upper row) and the gravitational wave amplitude h+ times the distance to the source r (lower row). Image courtesy of Marius Dan.

There is also a huge disparity in terms of time scales. Whenever nuclear burning is important for the dynamics of the gas flow, the nuclear time scales are many orders of magnitude shorter than the admissible hydrodynamic time steps. Therefore, nuclear networks are usually implemented via operator-splitting methods, see e.g., Benz et al. (1989); Rosswog et al. (2009b); Raskin et al. (2010); García-Senz et al. (2013). Because of the exact advection in SPH, the post-processing of hydrodynamic trajectories with larger nuclear networks to obtain detailed abundance patterns is straightforward. For burning processes in tenuous surface layers, however, SPH is seriously challenged since there the resolution is poorest. For such problems, hybrid approaches that combine SPH with, say, AMR methods (Guillochon et al., 2010; Dan et al., 2015) seem to be the best strategy.

4.1.3 Dynamics and mass transfer in white-dwarf binary systems

Three-dimensional simulations of WDWD mergers were pioneered by Benz et al. (1990). Their major motivation was to understand the merger dynamics and the possible role of double-degenerate systems as SN Ia progenitors. They used an SPH formulation as described in Section 3.2.1 (“vanilla ice”) together with 7000 SPH particles and an equation of state for a non-degenerate ideal gas with a completely degenerate, fully relativistic electron component, and they restricted themselves to the study of a 0.9–1.2 M system. No attempt was made to include nuclear burning in this study (but see Benz et al., 1989). Each star was relaxed in isolation, see Section 3.6, and subsequently placed in a circular Keplerian orbit such that the secondary was overfilling its critical lobe by ∼ 8%. Under these conditions the secondary star was disrupted within slightly more than two orbital periods, forming a three-component system: a rather unperturbed primary, a hot pressure-supported spherical envelope and a rotationally supported outer disk. About 0.6% of a solar mass was able to escape; the remaining ∼ 1.7 M, supported mainly by pressure gradients, showed no sign of collapse.

Rasio and Shapiro (1995) were more interested in the equilibrium and the (secular, dynamical and mass transfer) stability properties of close binary systems. They studied systems both with stiff (Γ > 5/3; as models for neutron stars) and soft (Γ = 5/3) polytropic equations of state, the latter as approximations for the EOS of (not too massive) WDs and of low-mass main-sequence star binaries. They put particular emphasis on constructing accurate, synchronized initial conditions (Rasio and Shapiro, 1994). These were obtained by relaxing the binary system in a corotating frame where, in equilibrium, all velocities should vanish. The resulting configurations satisfied the virial theorem to an accuracy of about one part in 103. With these initial conditions they found a more gradual increase in the mass transfer rate in comparison to Benz et al. (1990), but the binary was nevertheless disrupted after only a few orbital periods.

Segretain et al. (1997) focussed on the question of whether particularly massive and hot WDs could be the result of binary mergers (Bergeron et al., 1991). They applied a simulation technology similar to Benz et al. (1990) and concentrated on a binary system of non-spinning WDs of 0.6 and 0.9 M. They showed, for example, that such a merger remnant would need to lose about 90% of its angular momentum in order to reproduce the properties of the observed candidate WDs.

Although Rasio and Shapiro (1995) had already explored the construction of accurate initial conditions, essentially all subsequent simulations (Guerrero et al., 2004; Lorén-Aguilar et al., 2005; Yoon et al., 2007; Lorén-Aguilar et al., 2009; Pakmor et al., 2010, 2011; Zhu et al., 2013) were carried out with rather approximate initial conditions consisting of spherical stars placed in orbits where, according to simple Roche-lobe geometry estimates, mass transfer should set in. Marsh et al. (2004) had identified, in a detailed orbital stability analysis, regions of definitely stable mass transfer (roughly for primary masses substantially larger than the companion mass), of definitely unstable mass transfer (mass ratios between 2/3 and 1), and an intermediate region where the stability of mass transfer is less clear. Motivated by large discrepancies in the mass transfer duration that had been observed between careful grid-based (Motl et al., 2002; D’Souza et al., 2006; Motl et al., 2007) and earlier SPH simulations (Benz et al., 1990; Rasio and Shapiro, 1995; Segretain et al., 1997; Guerrero et al., 2004; Yoon et al., 2007; Pakmor et al., 2010), Dan et al. (2009, 2011) focussed on the mass transfer in this unclear regime. They very carefully relaxed the binary system in a corotating frame and thereby adiabatically reduced the mutual separation until the first particle climbed up to the saddle point L1 of the effective potential, see Figure 14.

Figure 14:
figure 14

Initial conditions for a synchronized WDWD binary system at the onset of resolvable mass transfer. The red line is the point-mass Roche potential; the SPH particle values are shown as filled black circles. Once the first SPH particle has crossed the L1 point, the system is transformed from the corotating binary frame to a space-fixed frame, where it is hydrodynamically evolved. Image reproduced with permission from Dan et al. (2011), copyright by AAS.

In their study, such carefully constructed initial conditions were compared to the previously commonly used approximate initial conditions. Apart from the inaccuracies inherent in the analytical Roche-lobe estimates, approximate initial conditions also neglect the tidal deformation of the stars and, therefore, seriously underestimate the initial separation at the onset of mass transfer. For this reason, such initial conditions have up to 15% too little angular momentum, and binary systems with inaccurate initial conditions consequently merge too violently on a much too short time scale. As a result, temperatures and densities in the final remnant are overestimated and the size of the tidal tails is underestimated. The carefully prepared binary systems all showed dozens of orbits of numerically resolvable mass transfer. Given that, due to the finite resolution, the mass transfer is already highly super-Eddington when it starts being resolvable, all results on the mass transfer duration have to be considered strict lower limits. One particular example, a 0.2 M He-WD with a 0.8 M CO-WD, merged with approximate initial conditions within two orbital periods (comparable to earlier SPH results), but only after a painfully long 84 orbital periods when the initial conditions were prepared carefully. This particular example also illustrates the suitability of SPH for such investigations: during the orbital evolution, which corresponds to ≈ 17 000 dynamical time scales, energy and angular momentum were conserved to better than 1%! All of the investigated binary systems (unstable according to the Marsh et al. analysis) merged in the end, although only after several dozens of orbital periods. Some systems showed a systematic widening of the orbit after the onset of mass transfer. Although they were still disrupted in the end, this indicates that systems in the parameter-space vicinity of this 0.2–0.8 M system may evolve into short-period AM CVn systems.

In two recent studies, Dan et al. (2012, 2014) systematically explored the parameter space by simulating 225 different binary systems with masses ranging from 0.2 to 1.2 M. All initial conditions were prepared as carefully as in Dan et al. (2011). Despite the only moderate resolution (40 K particles) that could be afforded in such a broad study, they found excellent agreement with the orbital evolution predicted by mass transfer stability analysis (Marsh et al., 2004; Gokhale et al., 2007).

4.1.4 Double white-dwarf mergers and possible pathways to thermonuclear supernovae

The merger of two white dwarfs, the so-called “double degenerate scenario”, had already been suggested relatively early on (Webbink, 1984; Iben Jr and Tutukov, 1984) as a promising type Ia progenitor channel. It was initially modelled as CO-rich matter being accreted from a thick disk onto a central, cold WD (Nomoto and Iben Jr, 1985; Saio and Nomoto, 1985, 1998; Piersanti et al., 2003a, b; Saio and Nomoto, 2004). Since for such thick disks accretion rates close to the Eddington limit (Ṁ ∼ 10−5 M yr−1) are expected, most studies concluded that carbon ignition would start in the envelope of the central WD and, as the burning flame propagates inwards within ∼ 5000 years, it would transform the WD from CO into ONeMg (Saio and Nomoto, 1985, 1998). When approaching the Chandrasekhar mass, Ne and Mg would undergo electron captures and the final result would be an accretion-induced collapse to a neutron star rather than a SN Ia. Partially based on these studies, the double-degenerate model was long regarded as only the second-best model: it had good motivation (consistent rates, lack of hydrogen in SN Ia spectra, the Chandrasekhar mass as an explanation for the uniformity of type Ia properties), but lacked a convincing pathway to an explosion.

Figure 15:
figure 15

Illustration of the morphology of a WDWD merger (mass ratio q = 0.78). Image reproduced with permission from Diehl et al. (2008), copyright by the author(s).

The Barcelona group were the first to explore the effect of nuclear burning during a WD merger event (Guerrero et al., 2004). They implemented the reduced 14-isotope α-network of Benz et al. (1989) with updated reaction rates into a “vanilla-ice” SPH code with artificial viscosity enhanced by the Balsara factor, see Sections 3.2.1 and 3.2.5. They typically used 40 K particles and approximate initial conditions as described above, and explored six different combinations of masses/chemical compositions. They found orbital dynamics similar to Benz et al. (1989); Segretain et al. (1997) and, although temperatures around 109 K were encountered in the outer, partially degenerate layers of the central core, no dynamically important nuclear burning was observed. Whenever burning set in, the remnant had time to quench it by expansion, both for the He- and the CO-accreting systems. Therefore, they concluded that direct SN Ia explosions were unlikely, but that some remnants could evolve into subdwarf B objects, as suggested in Iben Jr (1990).

Yoon et al. (2007) challenged the “classical picture” of the cold WD accreting from a thick disk as an oversimplification. Instead, they suggested that the subsequent secular evolution of the remnant would be better studied by treating the central object as a differentially rotating CO star with a central, slowly rotating, cold core engulfed by a rapidly rotating hot envelope, which, in turn, is embedded in and fed by a centrifugally supported Keplerian accretion disk. The further evolution of such a system is then governed by the thermal cooling of the hot envelope, the redistribution of angular momentum inside the central remnant, and the accretion of matter from the disk into the envelope. They based their study of the secular remnant evolution on a dynamical merger calculation of two CO WDs with 0.6 and 0.9 M. To this end, they used an SPH code originally developed for neutron star merger calculations (Rosswog et al., 1999, 2000; Rosswog and Davies, 2002; Rosswog and Liebendörfer, 2003; Rosswog et al., 2008) extended by the Helmholtz EOS (Timmes, 1999) and a quasi-equilibrium reduced α-network (Hix et al., 1998). Particular care was taken to avoid artefacts from the artificial viscosity treatment; time-dependent viscosity parameters (Morris and Monaghan, 1997) and a Balsara switch (Balsara, 1995), see Section 3.2.5, were used in the simulation. As suggested by the work of Segretain et al. (1997), they assumed non-synchronized stars and started the simulations from approximate initial conditions, see Section 4.1.3. Once a stationary remnant had formed, the results were mapped into a 1D hydrodynamic stellar evolution code (Yoon and Langer, 2004) and its secular evolution was followed, including the effects of rotation and angular momentum transport. They found that the growth of the stellar core is controlled by the neutrino cooling at the interface between the core and the envelope and that carbon ignition could be avoided provided that a) once the merger reaches a quasi-static equilibrium, temperatures are below the carbon ignition threshold, b) the angular momentum loss occurs on a time scale longer than the neutrino cooling time scale, and c) the mass accretion from the centrifugally supported disk is low enough (Ṁ ≤ 5 × 10−6–10−5 M yr−1). From such remnants an explosion may be triggered ∼ 105 years after the merger. Such systems, however, may require unrealistically low viscosities.

A more recent study (Shen et al., 2012) started from two remnants of CO WD mergers (Dan et al., 2011) and followed their viscous long-term evolution. Their conclusion was more in line with earlier studies: they expected the long-term result to be an ONe WD or an accretion-induced collapse to a neutron star rather than a SN Ia.

In recent years, WD mergers have been extensively explored as possible pathways to SN Ia. Not too surprisingly, a number of pathways have been discovered that very likely lead to a thermonuclear explosion directly prior to or during the merger. Whether these explosions are responsible for (some fraction of) normal SNe Ia or for peculiar subtypes needs to be explored further in future work. Many of the recent studies used a number of different numerical tools to explore various aspects of WDWD mergers. We focus here on those studies in which SPH simulations were involved.

4.1.4.1 Explosions prior to merger

Dan et al. (2011) had carefully studied the impact of mass transfer on the orbital dynamics. In these SPH simulations the feedback on the orbit is accurately modelled, but due to SPH’s automatic “refinement on density”, the properties of the transferred matter are not well resolved. Therefore, a “best-of-both-worlds” approach was followed in Guillochon et al. (2010) and Dan et al. (2011): the impact of the mass transfer on the orbital dynamics was simulated with SPH, while the orbital evolution and the mass transfer rate were recorded. This information was used in a second set of simulations performed with the Flash code (Fryxell et al., 2000), which focussed on the detailed hydrodynamic interaction of the transferred mass with the accretor star. For He-CO binary systems where helium directly impacts on a primary with a mass > 0.9 M, they found that helium surface explosions can be self-consistently triggered via Kelvin-Helmholtz (KH) instabilities. These instabilities occur at the interface between the incoming helium stream and an already formed helium torus around the primary. “Knots” produced by the KH instabilities can lead to local ignition points once the triple-alpha time scale becomes shorter than the local dynamical time scale. The resulting detonations travel around the primary surface and collide on the side opposite to the ignition point. Such helium surface detonations may resemble weak type Ia SNe (Bildsten et al., 2007; Foley et al., 2009; Perets et al., 2010), and they may drive shock waves into the CO core which concentrate in one or more focal points, similar to what was found in the 2D study of Fink et al. (2007). This could possibly lead to an explosion via a “double-detonation” mechanism. In a subsequent large-scale parameter study, Dan et al. (2012, 2014) found, based on a comparison between nuclear burning and hydrodynamical time scales, that a large fraction of the helium-accreting systems produce explosions early on: all dynamically unstable systems with primary masses < 1.1 M and secondary masses > 0.4 M triggered helium detonations at surface contact. A good fraction of these systems could, in addition, produce KH-instability-induced detonations as described in detail in Guillochon et al. (2010). There was no definitive evidence for explosions prior to contact for any of the studied CO-transferring systems.

4.1.4.2 Explosions during merger

Pakmor et al. (2010) studied double-degenerate mergers, but — contrary to earlier studies — they focussed on massive WDs with masses close to 0.9 M. They used the Gadget code (Springel, 2005) with some modifications (Pakmor et al., 2012a); for example, the Timmes EOS (Timmes, 1999) and a 13-isotope network were implemented for their study. In order to facilitate the network implementation, the energy equation (instead of, as usual in Gadget, the entropy equation) was evolved. No efforts were undertaken to reduce the constant, untriggered artificial viscosity. They placed the stars in orbit with the approximate initial conditions described above and found the secondary to be disrupted within two orbital periods. In a second step, several hot (> 2.9 × 109 K) particles were identified and the remnant was artificially ignited in these hot spots. The explosion was followed with a grid-based hydrodynamics code (Fink et al., 2007; Röpke and Niemeyer, 2007) that had been used in earlier SN Ia studies. In a third step, the nucleosynthesis was post-processed and synthetic light curves were calculated (Kromer and Sim, 2009). The explosion resulted in a moderate amount of 56Ni (0.1 M), large amounts (1.1 M) of intermediate-mass elements and oxygen (0.5 M) and less than 0.1 M of unburnt carbon. The kinetic energy of the explosion (1.3 × 1051 erg) was typical for a SN Ia, but the resulting velocities were relatively small, so that the explosion resembled a sub-luminous 1991bg-like supernova. An important condition for reaching ignition is a mass ratio close to unity. Some variation in total mass is expected, but cases with a primary mass of less than 0.9 M would struggle to reach ignition temperatures and — if successful — the lower densities would lead to even lower resulting 56Ni masses and therefore lower luminosities. For substantially higher masses, in contrast, the burning would proceed at larger densities and, therefore, result in much larger amounts of 56Ni. Based on population synthesis models (Ruiter et al., 2009) they estimated that mergers of this type of system could account for 2–11% of the observed SN Ia rate. In a follow-up study (Pakmor et al., 2011) the sensitivity of the proposed model to the mass ratio was studied. The authors concluded that binaries with a primary mass near 0.9 M ignite a detonation immediately at contact, provided that the mass ratio q exceeds 0.8. Both the abundance tomography and the lower-than-standard velocities provided support for the idea that this type of merger produces sub-luminous, 1991bg-type supernovae.

As a variation on the theme, Pakmor et al. (2012b) also explored the merger of a higher-mass system with 0.9 and 1.1 M. Using the same assumptions about the triggering of detonations as before, they found a substantially larger mass of 56Ni (0.6 M), 0.5 M of intermediate-mass elements, 0.5 M of oxygen and about 0.15 M of unburnt carbon. Due to its higher density, only the primary is able to burn to 56Ni and, therefore, the brightness of the SN Ia would be closely related to the primary mass. The secondary is only incompletely burnt and thus provides the bulk of the intermediate-mass elements. Overall, the authors concluded that such a merger reproduces the observational properties of normal SN Ia reasonably well. In Kromer et al. (2013) the results of a 0.9 and 0.76 M CO-CO merger were analyzed; unburned oxygen close to the center of the ejecta was found, which produces narrow emission lines of [O I] in the late-time spectrum, similar to what is observed in the sub-luminous SN 2010lp (Leibundgut et al., 1993).

Fryer et al. (2010) applied a sequence of computational tools to study the spectra that can be expected from a supernova triggered by a double-degenerate merger. Motivated by population synthesis calculations, they simulated a CO-CO binary of 0.9 and 1.2 M with the SNSPH code (Fryer et al., 2006). They assumed that the remnant would explode at the Chandrasekhar mass limit into a gas cloud consisting of the remaining merger debris. They found a density profile with ρ ∝ r−4, inserted an explosion (Meakin et al., 2009) into such a matter distribution and calculated signatures with the radiation-hydrodynamics code Rage (Fryer et al., 2007, 2009). They found that in such “enshrouded” SNe Ia the debris extends and delays the X-ray flux from the shock breakout and produces a signal that is closer to a SN Ib/c. The V-band peak was also extended and much broader than in a normal SN Ia, with the early spectra being dominated by CO lines only. They concluded that, within their model, a CO-CO merger with a total mass > 1.5 M would not produce spectra and light curves that resemble normal type Ia supernovae.

One of the insights gained from the study of Pakmor et al. (2010) was that (massive) mergers with similar masses are more likely progenitors of SNe Ia than the mergers with larger mass differences that were studied earlier. This motivated Zhu et al. (2013) to perform a large parameter study in which they systematically scanned the parameter space from 0.4–0.4 M up to 1.0–1.0 M. In their study, they used the Gasoline code (Wadsley et al., 2004) together with the Helmholtz EOS (Timmes, 1999; Timmes and Swesty, 2000); no nuclear reactions were included. All simulations were performed with non-spinning stars and approximate initial conditions as outlined above. Mergers with “similar” masses produced a well-mixed, hot central core, while “dissimilar” masses produced a rather unaffected cold core surrounded by a hot envelope and disk, consistent with earlier studies. They found that a central density ratio of the accreting and donating star of ρ a /ρ d > 0.6 is a good criterion for identifying those systems that produce hot cores (i.e., for defining “similar” masses).

Dan et al. (2014) also performed a very broad scan of the parameter space. They studied the temperature distribution inside the remnant for different stellar spins: tidally locked initial conditions produce hot spots (which are the most likely locations for detonations to be initiated) in the outer layers of the core, while irrotational systems produce them deep inside the core, consistent with the results of Zhu et al. (2013). Thus, the spin state of the WDs may be decisive for the question of where the ignition is triggered, which may have a bearing on the resulting supernova. Dan et al. found essentially no chemical mixing between the stars for mass ratios below q ≈ 0.45, but maximum mixing for a mass ratio of unity. Contrary to Zhu et al. (2013), complete mixing was found nowhere, but this difference can be convincingly attributed to the different stellar spins that were investigated (tidal locking in Dan et al., 2014; no spins in Zhu et al., 2013). In addition to the helium-accreting systems that likely undergo a detonation for a total mass beyond 1.1 M, they also found CO binaries with total masses beyond 2.1 M to be prone to a CO explosion. Such systems may be candidates for the so-called super-Chandrasekhar SN Ia explosions. They also discussed the possibility of “hybrid supernovae”, in which an ONeMg core with a significant helium layer collapses and forms a neutron star. While technically being a (probably weak) core-collapse supernova (Podsiadlowski et al., 2004; Kitaura et al., 2006), most of the explosion energy may come from helium burning. Such hybrid supernovae may be candidates for the class of “Ca-rich” SNe Ib (Perets et al., 2010), as the burning conditions seem to favor the production of intermediate-mass elements.

Raskin et al. (2012) explored 10 merging WDWD binary systems with total masses between 1.28 and 2.12 M with the SNSPH code (Fryer et al., 2006) coupled to the Helmholtz EOS and a 13-isotope nuclear network. They used a heuristic procedure to construct tidally deformed, synchronized binaries: the stars were allowed to fall towards each other in free fall while the fluid velocities were repeatedly reset to zero. They had coated their CO WDs with atmospheres of helium (less than 2.5% of the total mass) and found that in all cases where the primary had a mass of 1.06 M the helium detonated, though no carbon detonation was encountered.

Given the difficulty of identifying the central engine of a SN Ia on purely theoretical grounds, Raskin and Kasen (2013) explored possible observational signatures stemming from the tidal tails of a WD merger. They followed the ejecta from a SNSPH simulation (Fryer et al., 2006) with n-body methods and — assuming spherical symmetry — explored by means of a 1D Lagrangian code how a supernova would interact with such a medium. Provided the time lag between merger and supernova is short enough (< 100 s), detectable shock emission at radio, optical, and/or X-ray wavelengths is expected. For delay times between 108 s and 100 years one expects broad Na I D absorption features and, since these have not been observed to date, they concluded that if (some) type Ia supernovae are indeed caused by WDWD mergers, the delay times need to be either short (< 100 s) or rather long (> 100 years). If the tails can expand and cool for ∼ 104 years, they produce the observable narrow Na I D and Ca II H & K lines which are seen in some fraction of type Ia supernovae.

An interesting study from a Santa Cruz-Berkeley collaboration (Moll et al., 2014; Raskin et al., 2014) again combined the strengths of different numerical methods. The merger process was calculated with the SNSPH code (Fryer et al., 2006), the subsequent explosion with the grid-based code Castro (Almgren et al., 2010; Zhang et al., 2011), and synthetic light curves and spectra were calculated with Sedona (Kasen et al., 2006). For some cases without immediate explosion, the viscous remnant evolution was followed further with ZEUS-MP2 (Hayes et al., 2006). The first part of the study (Moll et al., 2014) focussed on prompt detonations, while the second part explored the properties of detonations that emerge in later phases, after the secondary has been completely disrupted. For the first part, three mergers (1.20–1.06, 1.06–1.06 and 0.96–0.81 M) were simulated with the SPH code and subsequently mapped into Castro, where soon after simulation start (< 0.1 s) detonations emerged. They found the best agreement with common SNe Ia (0.58 M of 56Ni) for the binary with 0.96–0.81 M. More massive systems led to more 56Ni and therefore to unusually bright SNe Ia. The remnant asymmetry at the moment of detonation leads to large asymmetries in the elemental distributions and therefore to strong viewing-angle effects for the resulting supernova. Depending on the viewing angle, the peak bolometric luminosity varied by a factor of two and the flux in the ultraviolet even by an order of magnitude. All three models approximately fulfilled the width-luminosity relation (“brighter means broader”; Phillips et al., 1999).

The companion study (Raskin et al., 2014) explored cases where the secondary has been completely disrupted before a detonation sets in, so that the primary explodes into a disk-like CO structure. To initiate detonations, the SPH simulations were stopped once a stationary structure had formed, all material with ρ > 106 g cm−3 was burnt in a separate simulation and, subsequently, the generated energy was deposited back as thermal energy into the merger remnant and the SPH simulation was resumed. As a double-check of the robustness of this approach, two simulations were also mapped into Castro and detonated there as well. Overall, there was good agreement both in morphology and in the nucleosynthetic yields. The explosion inside the disk produced an hourglass-shaped remnant geometry with strong viewing-angle effects. The disk scale height, initially set by the mass ratio and the different merger dynamics and burning processes, turned out to be an important factor for the viewing-angle dependence of the later supernova. The other crucial factor was the primary mass, which determines the resulting amount of 56Ni. Interestingly, the location of the detonation spot, whether at the surface or in the core of the primary, had a relatively small effect compared to the presence of an accretion disk. While qualitatively in agreement with the width-luminosity relation, the lightcurves lasted much longer than those of standard SNe Ia. The surrounding CO disk from the secondary remained essentially unburnt but, by impeding the expansion, it led to relatively small absorption velocities of the intermediate-mass elements. The large asymmetries in the abundance distribution could lead to a large overestimate of the involved 56Ni masses if spherical symmetry is assumed in interpreting observations. The lightcurves and spectra were peculiar, with weak features from intermediate-mass elements but relatively strong carbon absorption. The study also explored how a longer-term viscous evolution before a detonation sets in would affect the supernova. Longer delay times were found to likely produce larger 56Ni masses and more symmetrical remnants. Such systems might be candidates for super-Chandra SNe Ia.

4.1.5 Simulations of white dwarf—white dwarf collisions

A physically interesting alternative to gravitational-wave-driven binary mergers are dynamical collisions between two WDs, as they are expected in globular clusters and galactic cores. They were first explored, once more, by Benz et al. (1989), in probably one of the first hydrodynamic simulations of WDs that included a nuclear reaction network. At that time they found that nuclear burning could, in central collisions, help to unbind a substantial amount of matter, but this amount was found to be irrelevant for the chemical evolution of galaxies. More recently, the topic has been taken up again by Rosswog et al. (2009a) and, independently, by Raskin et al. (2009). Both of these studies employed SPH simulations (2 × 106 SPH particles in Rosswog et al., 2009a; 8 × 105 particles in Raskin et al., 2009) coupled to small nuclear reaction networks. Both groups concluded that an amount of radioactive 56Ni could be produced that is comparable to what is observed in normal type Ia supernovae (∼ 0.5 M⊙). In Rosswog et al. (2009a), one of the simulations was repeated with the Flash code (Fryxell et al., 2000) to judge the robustness of the result and, given the enormous sensitivity of the nuclear reactions to temperature, overall good agreement was found, see Figure 16. Rosswog et al. also post-processed the nucleosynthesis with larger nuclear networks and calculated synthetic light curves and spectra with the Sedona code (Kasen et al., 2006). Interestingly, the resulting light curves and spectra (Rosswog et al., 2009a) look like those of normal type Ia supernovae; even the width-luminosity relationship (Phillips et al., 1999) is fulfilled to good accuracy. These results have recently been confirmed in further studies (Raskin et al., 2010; Hawley et al., 2012; Kushnir et al., 2013; García-Senz et al., 2013).

Figure 16:
figure 16

Comparison of the head-on collision of two WDs, calculated once with an SPH code (Rosswog et al., 2008) (upper row) and once with the Adaptive Mesh Refinement code Flash (Fryxell et al., 2000) (lower row). Each of the panels is split into density (left) and temperature (right). Given the enormous temperature sensitivity of explosive nuclear burning, the agreement is remarkably good; the produced energy is log(E/erg) = 51.21 for the SPH calculation and log(E/erg) = 51.11 for the Flash simulation. Image reproduced with permission from Rosswog et al. (2009a), copyright by AAS.

This result is interesting for a number of reasons. First, the detonation mechanism is parameter-free and extremely robust: the free-fall velocity between WDs naturally produces relative velocities in excess of the WD sound speeds, therefore strong shocks are inevitable. Moreover, the most likely involved WD masses are near the peak of the mass distribution, ∼ 0.6 M⊙, or, due to mass segregation effects, possibly slightly larger, but well below the Chandrasekhar mass. This has the benefit that the nuclear burning occurs at moderate densities (ρ ∼ 107 g cm−3) and thus naturally produces the observed mix of ∼ 0.5 M⊙ of 56Ni and intermediate-mass elements, without any fine-tuning such as the deflagration-detonation transition that is required in the single-degenerate scenario (Hillebrandt and Niemeyer, 2000). However, based on simple order-of-magnitude estimates, the original studies (Raskin et al., 2009; Rosswog et al., 2009a) concluded that, while being very interesting explosions, the rates would likely be too low to make a substantial contribution to the observed supernova sample. More recently, however, there have been claims (Thompson, 2011; Kushnir et al., 2013) that the Kozai-Lidov mechanism in triple stellar systems may substantially boost the rates of WDWD collisions so that they could constitute a sizeable fraction of the SN Ia rate. Contrary to these claims, a recent study by Hamers et al. (2013) finds that the contribution from the triple-induced channels to SNe Ia is small. Here, further studies are needed to quantify how relevant collisions really are for explaining normal SNe Ia.
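The robustness argument can be checked with a one-line estimate. The following minimal Python sketch compares the free-fall velocity of two WDs at contact with the sound speed of a non-relativistically degenerate electron gas; the WD mass and radius and the envelope density at which the shock forms are fiducial assumptions, not values taken from the simulations above.

    # Hedged back-of-envelope check: Mach number in a head-on WD-WD collision.
    G, Msun = 6.674e-8, 1.989e33              # cgs units
    M_wd, R_wd = 0.6 * Msun, 8.7e8            # assumed CO WD mass/radius [g, cm]

    # relative free-fall velocity of two equal WDs at contact
    v_ff = (2.0 * G * (2.0 * M_wd) / (2.0 * R_wd))**0.5

    # sound speed of a non-relativistic degenerate electron gas,
    # P = K1 (rho/mu_e)^(5/3), at an assumed envelope density
    K1, mu_e, rho = 1.0036e13, 2.0, 1.0e5     # [cgs], [-], [g/cm^3]
    c_s = (5.0 / 3.0 * K1 * (rho / mu_e)**(2.0 / 3.0) / mu_e)**0.5

    print(f"v_ff ~ {v_ff/1e5:.0f} km/s, c_s ~ {c_s/1e5:.0f} km/s, "
          f"Mach ~ {v_ff/c_s:.1f}")           # supersonic impact, Mach of a few

Even with conservative numbers the impact is supersonic, which is why the shock-triggered detonation does not rely on any tunable parameter.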

4.1.6 Summary: Double white-dwarf encounters

SPH has very often been used to model mergers of, and later on also collisions between, two WDs. This is mainly because SPH is not restricted by any predefined geometry and has excellent conservation properties. As illustrated in the numerical experiment shown in Figure 13, even small (artificial) losses of angular momentum can lead to very large errors in the prediction of the mass-transfer duration and the gravitational-wave signal. SPH’s tendency to “follow the density” makes it ideal to predict, for example, the gravitational-wave signatures of WD mergers. But it is exactly this tendency which makes it very difficult for SPH to study thermonuclear ignition processes in low-density regions, which are very important, for example, for the double-detonation scenario. This suggests applying a combination of numerical tools in such cases: SPH for bulk motion and orbital dynamics, and Eulerian (adaptive-mesh) hydrodynamics for low-density regions that need high resolution. As outlined above, there have recently been a number of successful studies that have followed such strategies.

SPH simulations have played a pivotal role in “re-discovering” the importance of white-dwarf mergers (and possibly, to some extent, collisions) as progenitor systems of SNe Ia. In the last few years, a number of new possible pathways to thermonuclear explosions prior to or during a WD merger have been discovered. There is, however, not yet a clear consensus on whether they produce just “peculiar” SN Ia-like events or whether they may even be responsible for the bulk of “standard” SNe Ia. Here, a lot of progress can be expected within the next few years, both on the modelling and on the observational side.

4.2 Encounters between neutron stars and black holes

4.2.1 Relevance

The relevance of compact binary systems consisting of two neutron stars (NS) or a neutron star and a stellar-mass black hole (BH) is hard to overrate. The first observed double neutron-star system, PSR 1913+16, proved, although indirectly, the existence of gravitational waves (GWs): the orbit decays in excellent agreement with the predictions of general relativity (GR) (Taylor and Weisberg, 1989; Weisberg et al., 2010). By now, there are 10 binary systems that are thought to consist of two neutron stars (Lorimer, 2008), but to date no NSBH system has been identified, although both types of systems should form in similar evolutionary processes. This is usually attributed to a possibly smaller number of NSBH systems in comparison to NSNS systems (Belczynski and Ziolkowski, 2009) and to a lower detection probability in current surveys. Once formed, gravitational-wave emission drives compact binary systems towards coalescence on a time scale approximately given by (Lorimer, 2008)

$${\tau _{{\rm{GW}}}} = 9.83 \times {10^6}{\rm{years}}{\left({{{{P_{{\rm{orb}}}}} \over {{\rm{hr}}}}} \right)^{8/3}}{\left({{M \over {{M_ \odot }}}} \right)^{ - 2/3}}{\left({{\mu \over {{M_ \odot }}}} \right)^{ - 1}}{(1 - {e^2})^{7/2}},$$
(153)

where Porb is the current orbital period, M the total and μ the reduced mass, and e the eccentricity. This implies that the initial orbital period must be ≤ 1 day for a coalescence to occur within a Hubble time. As the inspiral time depends sensitively on the orbital period and eccentricity, which are set by the individual evolutionary histories, one expects a large spread in inspiral times. Near its innermost stable circular orbit (ISCO), a compact binary system emits gravitational waves at a frequency and amplitude of

$$f \approx 594\,{\rm{Hz}}{\left({{{6GM} \over {{c^2}a}}} \right)^{3/2}}\left({{{7.4{M_ \odot }} \over M}} \right)$$
(154)
$$h \approx 3.6 \times {10^{ - 22}}\left({{{{m_1}} \over {6{M_ \odot }}}} \right)\left({{{{m_2}} \over {1.4{M_ \odot }}}} \right)\left({{{6GM} \over {{c^2}a}}} \right)\left({{{100{\rm{Mpc}}} \over r}} \right),$$
(155)

where a is the binary separation, r the distance to the GW source, and the scalings are chosen for an NSBH system. Thus, the emission from the late inspiral stages will sweep through the frequency bands of the advanced LIGO and Virgo detector facilities, from ∼ 10 to ∼ 3000 Hz, making compact binary mergers (CBMs) the prime targets for ground-based gravitational-wave detections. The estimated coalescence rates are rather uncertain, though; they range from 1–1000 Myr−1 MWEG−1 for NSNS binaries and from 0.05–100 Myr−1 MWEG−1 for NSBH binaries (Abadie, 2010), where MWEG stands for “Milky Way Equivalent Galaxy”.
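To get a feeling for these numbers, the short Python sketch below evaluates Eqs. (153)–(155) for an illustrative 1.4 + 1.4 M⊙ NSNS system; the orbital period and the ISCO-like separation are assumptions made purely for the example.

    # Hedged evaluation of Eqs. (153)-(155); all inputs are illustrative.
    m1 = m2 = 1.4                           # component masses [Msun]
    M, mu = m1 + m2, m1 * m2 / (m1 + m2)    # total and reduced mass [Msun]
    P_orb, e = 8.0, 0.0                     # assumed period [hr], eccentricity

    # Eq. (153): inspiral time driven by GW emission
    tau_gw = 9.83e6 * P_orb**(8.0/3.0) * M**(-2.0/3.0) / mu * (1.0 - e*e)**3.5
    print(f"tau_GW ~ {tau_gw:.1e} yr")      # ~2e9 yr for these inputs

    # Eqs. (154)-(155): GW frequency and strain where 6GM/(c^2 a) = 1
    x = 1.0
    f = 594.0 * x**1.5 * (7.4 / M)          # [Hz]
    h = 3.6e-22 * (m1 / 6.0) * (m2 / 1.4) * x   # at r = 100 Mpc
    print(f"f ~ {f:.0f} Hz, h ~ {h:.1e}")   # ~1570 Hz, ~8e-23

Note that Eq. (155) is scaled to an NSBH system, so for the NSNS example the strain simply picks up the smaller factor m1/6 M⊙.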

In addition, CBMs have long been suspected to be the engines of short gamma-ray bursts (sGRBs) (Paczyński, 1986; Goodman, 1986; Eichler et al., 1989; Narayan et al., 1992). Although their projected distribution on the sky and their fluence distribution already pointed to a cosmological source (Goodman, 1986; Paczyński, 1986; Schmidt et al., 1988; Meegan et al., 1992; Piran, 1992; Schmidt, 2001; Guetta and Piran, 2005), their cosmological origin was only firmly established by the first afterglow observations for short bursts in 2005 (Hjorth et al., 2005; Bloom et al., 2006). This established the scale for both distance and energy and proved that sGRBs occur in both early- and late-type galaxies. CBMs are natural candidates for sGRBs since accreting compact objects are very efficient converters of gravitational energy into electromagnetic radiation, they occur at rates that are consistent with those of sGRBs (Guetta and Piran, 2005; Nakar et al., 2006; Guetta and Piran, 2006) and they are expected to occur in both early- and late-type galaxies. Kicks imparted at birth provide a natural explanation for the observed projected offsets of ∼ 5 kpc from their host galaxies (Fong et al., 2013), and, with dynamical time scales of ∼ 1 ms (either orbital at the ISCO or the neutron star oscillation time scales), they naturally provide the observed short-time fluctuations. Moreover, for cases where an accretion torus forms, the expected viscous lifetime is roughly comparable to a typical sGRB duration (∼ 0.2 s). While this picture is certainly not without open questions, see Piran (2004); Lee and Ramirez-Ruiz (2007); Nakar (2007); Gehrels et al. (2009); Berger (2011, 2013) for recent reviews, it has survived the confrontation with three decades of observational results and, while competitors have emerged, it is still the most commonly accepted model for the engine of short GRBs.

Lattimer and Schramm (1974, 1976) and Lattimer et al. (1977) suggested that the decompression of initially cold neutron star matter could lead to rapid neutron capture, or “r-process”, nucleosynthesis, so that CBMs may actually also be an important source of heavy elements. Although discussed convincingly in a number of subsequent publications (Symbalisty and Schramm, 1982; Eichler et al., 1989), this idea kept the status of an exotic alternative to the prevailing idea that the heaviest elements are formed in core-collapse supernovae, see Arcones and Thielemann (2013) for a recent review. Early SPH simulations of NSNS mergers (Rosswog et al., 1998, 1999) showed that of order ∼ 1% of neutron-rich material is ejected per merger event, enough to be a substantial or even the major source of r-process elements. A nucleosynthesis post-processing of these SPH results (Freiburghaus et al., 1999) confirmed that this material is indeed a natural candidate for the robust, heavy r-process (Sneden et al., 2008). Initially it was doubted (Qian, 2000; Argast et al., 2004) that CBMs as the main r-process source are consistent with galactic chemical evolution, but a recent study based on a detailed chemical evolution model (Matteucci et al., 2014) finds room for a substantial contribution from neutron star mergers, and recent hydrodynamic galaxy formation studies (Shen et al., 2014; van de Voort et al., 2015) even come to the conclusion that neutron star mergers are consistent with being the dominant r-process source in the Universe. Moreover, state-of-the-art supernova models seem unable to provide the conditions that are needed to produce heavy elements with nucleon numbers in excess of A = 90 (Arcones et al., 2007; Fischer et al., 2010; Hüdepohl et al., 2010; Roberts et al., 2010). On the other hand, essentially all recent studies agree that CBMs eject enough mass to be at least a major r-process source (Oechslin et al., 2007; Bauswein et al., 2013b; Rosswog et al., 2014; Hotokezaka et al., 2013b; Kyutoku et al., 2013) and that the resulting abundance pattern beyond A ≈ 130 resembles the solar-system distribution (Metzger et al., 2010b; Roberts et al., 2011; Korobkin et al., 2012; Goriely et al., 2011a,b; Eichler et al., 2015). Recent studies (Wanajo et al., 2014; Just et al., 2014) suggest that compact binary mergers, with their different ejection channels for neutron-rich matter, could actually be responsible for the whole range of the r-process.

In June 2013, the Swift satellite detected a relatively nearby (z = 0.356) sGRB, GRB 130603B (Melandri et al., 2013), for which the HST detected, 9 days after the burst, a nIR point source (Tanvir et al., 2013; Berger et al., 2013) with properties close to what had been predicted (Kasen et al., 2013; Barnes and Kasen, 2013; Grossman et al., 2014; Tanaka and Hotokezaka, 2013; Tanaka et al., 2014) by models for “macro-” or “kilonovae” (Li and Paczyński, 1998; Kulkarni, 2005; Rosswog, 2005; Metzger et al., 2010a,b; Roberts et al., 2011), radioactively powered transients from the decay of freshly produced r-process elements. If this interpretation is correct, GRB 130603B would provide the first observational confirmation of the long-suspected link between CBMs, nucleosynthesis and gamma-ray bursts.

4.2.2 Differences between double neutron-star and neutron star—black hole mergers

Both NSNS and NSBH systems share the same three stages of the merger: a) the secular inspiral, where the mutual separation is much larger than the object radii and the orbital evolution can be very accurately described by post-Newtonian methods (Blanchet, 2014), b) the merger phase, where relativistic hydrodynamics is important, and c) the subsequent ringdown phase. Although similar in their formation paths and in their relevance, there are a number of differences between NSNS and NSBH systems. For example, when the surfaces of two neutron stars come into contact, the neutron star matter can heat up and radiate neutrinos. Closely related, if matter from the interaction region is dynamically ejected, it may, due to the large temperatures, have the chance to change its electron fraction via weak interactions. Such material may have a different nucleosynthetic signature than the matter that is ejected during an NSBH merger. Moreover, the interface between two neutron stars is prone to hydrodynamic instabilities that can amplify existing neutron star magnetic fields (Price and Rosswog, 2006; Anderson et al., 2008; Rezzolla et al., 2011; Zrake and MacFadyen, 2013). The arguably largest differences between the two types of mergers, however, are the total binary mass and its mass ratio q. For neutron stars, the deviation from unity ∣1 − q∣ is small (for masses that are known to better than 0.01 M⊙, J1807−2500B has the largest deviation, with a mass ratio of q = 0.88; Lattimer, 2012), while for NSBH systems a broad range of total masses is expected. Since the merger dynamics is very sensitive to the mass ratio, a much larger diversity is expected in the dynamical behavior of NSBH systems. The larger BH mass also has the consequence that the plunging phase of the orbit sets in at larger separations and therefore lower GW frequencies, see Eq. (154). Moreover, for black holes the dimensionless spin parameter can be close to unity, aBH = cJ/GM2 ≈ 1, while for neutron stars it is restricted by the mass-shedding limit to somewhat lower values, aNS < 0.7 (Lo and Lin, 2011); therefore, spin-orbit coupling effects could potentially be larger for NSBH systems and lead to observable effects, e.g., Buonanno et al. (2003); Grandclément et al. (2004).

4.2.3 Challenges

Compact binary mergers are challenging to model since in principle each of the fundamental interactions enters: the strong (equation of state), weak (composition, neutrino emission, nucleosynthesis), electromagnetic (transients, neutron star crust) and of course (strong) gravity. Huge progress has been achieved during the last decade, but so far none of the approaches “has it all” and, depending on the focus of the study, different compromises have to be accepted.

The compactness parameters, \(\mathcal{C} = GM/(Rc^2)\), of ∼ 0.2 for neutron stars and ∼ 0.5 for black holes suggest a general-relativistic treatment. Contrary to the WDWD case discussed in Section 4.1, the gravitational and orbital time scales can now become comparable,

$${{{\tau _{{\rm{GW}}}}} \over {{P_{{\rm{orb}}}}}} = {{a/\vert{{\dot a}_{{\rm{GW}}}}\vert} \over {2\pi/{\omega _{\rm{K}}}}} = 5.0{\left({{a \over {6G(M/2.8{M_ \odot })/{c^2}}}} \right)^{5/2}}\left({{{1.4{M_ \odot }} \over {{m_1}}}} \right)\left({{{1.4{M_ \odot }} \over {{m_2}}}} \right){\left({{{2.8{M_ \odot }} \over M}} \right)^{1/2}},$$
(156)

so that the backreaction of the gravitational-wave emission on the dynamical evolution can no longer be safely neglected. And, unless one considers the case where the black hole completely dominates the spacetime geometry, it is difficult to make admissible approximations to dynamical GR gravity.
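A quick evaluation of Eq. (156) shows how rapidly the backreaction becomes dynamically relevant during the late inspiral; the separations below, in units of the fiducial ISCO separation used in the equation, are illustrative choices.

    # Hedged evaluation of Eq. (156) for a 1.4+1.4 Msun binary.
    m1 = m2 = 1.4
    M = m1 + m2
    for a_hat in (5.0, 2.0, 1.0):   # separation in units of 6G(2.8 Msun)/c^2
        ratio = 5.0 * a_hat**2.5 * (1.4/m1) * (1.4/m2) * (2.8/M)**0.5
        print(f"a = {a_hat:3.1f} a_ISCO:  tau_GW / P_orb ~ {ratio:6.1f}")
    # ~280 at five ISCO separations, but only ~5 at the ISCO itself

While the binary still completes hundreds of orbits per gravitational-wave decay time at a few ISCO separations, the two time scales differ by less than an order of magnitude close to contact.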

Of comparable importance is the neutron star equation of state (EOS). Together with gravity, it determines the compactness of the neutron stars, which in turn impacts on the peak GW frequencies. It also influences the torus masses, the amount of ejected matter, and, since it sets the β-equilibrium value of the electron fraction Ye in neutron star matter, also the resulting nucleosynthesis and the possibly emerging electromagnetic transients. Unfortunately, the EOS is not well known beyond ∼ 3 times nuclear density. On the other hand, since the EOS seriously impacts on a number of observable properties of a CBM, one can be optimistic and hope to conversely constrain the high-density EOS via astronomical observations.

Since the bulk of a CBM remnant consists of high-density matter (ρ > 1010 g cm−3), photons are very efficiently trapped and the only cooling agents are neutrinos. Moreover, with temperatures of order MeV, weak interactions become so fast that they change the electron fraction Ye substantially on a dynamical time scale (∼ 1 ms). While such effects can be safely ignored when gravitational-wave emission is the main focus, these processes are crucial for the neutrino signature, for the “engine physics” of GRBs (e.g., via \(\nu \bar \nu\) annihilation), and for nucleosynthesis, since the nuclear reactions are very sensitive to the neutron-to-proton ratio, which is set by the weak interactions. Fortunately, the treatment of neutrino interactions is not as delicate as in the core-collapse supernova case, where changes on the percent level can decide between a successful and a failed explosion (Janka et al., 2007). Thus, depending on the exact focus, leakage schemes or simple transport approximations may be admissible in the case of compact binary mergers. But approximate treatments, too, face the challenges that the remnant is intrinsically three-dimensional, that the optical depths change from τ ∼ 104 in the hypermassive neutron star (e.g., Rosswog and Liebendörfer, 2003) to essentially zero in the outer regions of the disk, and that the neutrino-nucleon interactions are highly energy-dependent.

Neutron stars are naturally endowed with strong magnetic fields and a compact binary merger offers a wealth of possibilities to amplify them. They may be decisive for the fundamental mechanism to produce a GRB in the first place, but they may also determine — via transport of angular momentum — when the central object produced in a NSNS merger collapses into a black hole or how accretion disks evolve further under the transport mediated via the magneto-rotational instability (MRI) (Balbus and Hawley, 1998).

In addition to these challenges from different physical processes, there are also purely numerical challenges which are, however, closely connected to the physical ones. For example, the length scales that need to be resolved to follow the growth of the MRI can be minute in comparison to the natural length scales of the problem. Or, another example, the viscous dissipation time scale of an accretion disk,

$${\tau _{{\rm{visc}}}}\sim 0.3{\rm{s}}\left({{{0.1} \over \alpha }} \right){\left({{r \over {300{\rm{km}}}}} \right)^{3/2}}{\left({{{3{M_ \odot }} \over M}} \right)^{1/2}}{\left({{{r/H} \over 2}} \right)^2}$$
(157)

with α being the Shakura-Sunyaev viscosity parameter (Shakura and Sunyaev, 1973), M the central mass, r a typical disk radius and H the disk scale height, may be challengingly long in comparison to the hydrodynamic time step that is allowed by the CFL stability criterion (Press et al., 1992),

$$\Delta t < {10^{ - 5}}{\rm{s}}\left({{{\Delta x} \over {1\,{\rm{km}}}}} \right)\left({{{0.3c} \over {{c_{\rm{s}}}}}} \right).$$
(158)
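The severity of this gap can be illustrated by combining Eqs. (157) and (158); the disk and resolution parameters in the sketch below are the fiducial values of the two equations, and even for this optimistic choice one viscous time costs tens of thousands of CFL-limited steps.

    # Hedged estimate of the time-scale gap between Eqs. (157) and (158).
    alpha, r_km, M, r_over_H = 0.1, 300.0, 3.0, 2.0      # fiducial disk values
    tau_visc = (0.3 * (0.1 / alpha) * (r_km / 300.0)**1.5
                * (3.0 / M)**0.5 * (r_over_H / 2.0)**2)  # [s], Eq. (157)

    dx_km, cs_over_c = 1.0, 0.3                          # assumed resolution
    dt_cfl = 1.0e-5 * dx_km * (0.3 / cs_over_c)          # [s], Eq. (158)

    print(f"tau_visc ~ {tau_visc:.2f} s, dt ~ {dt_cfl:.0e} s, "
          f"steps ~ {tau_visc/dt_cfl:.0e}")              # ~3e4 time steps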

4.2.4 The current status of SPH- vs grid-based simulations

A lot of progress has been achieved in recent years in simulations of CBMs. This includes many different microphysical aspects as well as the dynamic solution of the Einstein equations. The main focus of this review is SPH methods, and therefore we will restrict the detailed discussion to work that makes use of SPH. Nevertheless, it is worth briefly comparing the current status of SPH-based simulations with those that have been obtained with grid-based methods. As will be explained in more detail below, SPH had from the beginning a very good track record with respect to the implementation of various microphysics ingredients. On the relativistic gravity side, however, it is lagging behind in terms of implementations that dynamically solve the Einstein equations, a task that was achieved with Eulerian methods already more than a decade ago (Shibata, 1999; Shibata and Uryū, 2000). In SPH, apart from Newtonian gravity, post-Newtonian and conformal flatness approaches (CFA) exist, but up to now no coupling between SPH hydrodynamics and a dynamic spacetime solver has been achieved.

Naturally, this has implications for the types of problems that have been addressed, and there are interesting questions related to NSNS and NSBH mergers that have so far not yet (or only approximately) been tackled with SPH approaches. One example with far-reaching astrophysical consequences is the collapse of a hypermassive neutron star (HMNS) that temporarily forms after a binary neutron star merger. Observations now indicate a lower limit on the maximum neutron-star mass of around 2.0 M⊙ (1.97 ± 0.04 M⊙ for PSR J1614+2230, see Demorest et al., 2010, and 2.01 ± 0.04 M⊙ for PSR J0348+0432, see Antoniadis et al., 2013) and, therefore, it is very likely that the hot and rapidly differentially rotating central remnant of an NSNS merger is at least temporarily stabilized against gravitational collapse to a black hole, see e.g., Hotokezaka et al. (2013a). It appears entirely plausible that the low-mass end of the NSNS distribution may actually leave behind a massive but stable neutron star rather than a black hole. In SPH, the question of when a collapse sets in can so far only be addressed within the conformal flatness approximation (CFA), see below. Being exact for spherical symmetry, the CFA should be fairly accurate in describing the collapse itself. It is, however, less clear how accurate the CFA is during the last inspiral stages, where the deviations from spherical symmetry are substantial. Therefore, quantitative statements about HMNS lifetimes need to be interpreted with care. On the other hand, differential rotation has a major impact on the stabilization, and therefore hydrodynamic resolution is also crucial for the question of the collapse time. Here, SPH with its natural tendency to refine on density should perform very well once coupled to a dynamic spacetime solver.

Another question of high astrophysical significance is the amount of matter that becomes ejected into space during a NSNS or NSBH merger. This matter is likely one of the major sources of r-process elements in the cosmos and is thought to cause electromagnetic transients similar to the recently observed “macronova” event in the aftermath of GRB 130603B (Tanvir et al., 2013; Berger et al., 2013). In terms of mass ejection, one could expect large differences between the results of fully relativistic, grid-based hydrodynamics and Newtonian or approximate-GR SPH approaches. The expected differences are twofold. On the one hand, Newtonian/approximate-GR treatments may yield stars of a different compactness, which in turn would influence the dynamics and torques and therefore the ejecta amount. But apart from gravity, SPH has a clear edge in dealing with ejecta: mass conservation is exact, advection is exact (i.e., a composition only changes due to nuclear reactions but not due to numerical effects), angular momentum conservation is exact, and vacuum really corresponds to the absence of matter. Eulerian schemes, in contrast, usually face the challenges that the conservation of mass and angular momentum and the accuracy of advection are resolution-dependent and that vacuum is most often modelled as a background fluid of lower density. Given this rather long list of challenges, it is actually encouraging that the results of different groups with very different methods agree reasonably well these days. For NSNS mergers, Newtonian SPH simulations (Rosswog, 2013) find a range from 8 × 10−3–4 × 10−2 M⊙, approximate-GR SPH calculations (Bauswein et al., 2013b) find a range from 10−3–2 × 10−2 M⊙, and full GR calculations (Hotokezaka et al., 2013b) find 10−4–10−2 M⊙. Even the results from Newtonian NSBH calculations agree quite well with the GR results (compare Table 1 in Rosswog, 2013 and the results of Kyutoku et al., 2013).

Another advantage of SPH is that one can also decide to focus exclusively on the ejected matter. For example, a recent SPH-based study (Rosswog et al., 2014) has followed the evolution of the dynamic merger ejecta for as long as 100 years, while Eulerian methods are usually restricted to a few tens of milliseconds. During this expansion the density was followed from supra-nuclear densities (> 2 × 1014 g cm−3) down to values that are below the interstellar matter density (< 10−25 g cm−3).
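A rough plausibility check for such long-term calculations follows from assuming homologous expansion, ρ ∝ t−3; the start time and the densities in this Python sketch are illustrative assumptions rather than values from the quoted study.

    # Hedged check: dilution time for homologously expanding ejecta.
    t0, rho0 = 1.0e-3, 2.0e14     # assumed start: [s], [g/cm^3] at merger end
    rho1 = 1.0e-25                # target density below the ISM value
    t1 = t0 * (rho0 / rho1)**(1.0 / 3.0)    # rho ~ t^-3  =>  t ~ rho^(-1/3)
    print(f"t ~ {t1:.1e} s ~ {t1/3.15e7:.0f} yr")   # a few hundred years

which is indeed in the same ballpark as the ∼ 100 years over which the ejecta were followed.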

For many of the topics that will be addressed below, there have been parallel efforts on the Eulerian side and within the scope of this review we will not be able to do justice to all these parallel developments. As a starting point, we want to point to a number of excellent textbooks (Alcubierre, 2008; Baumgarte and Shapiro, 2010; Rezzolla and Zanotti, 2013) that deal with relativistic (mostly Eulerian) fluid dynamics and to various recent review articles (Duez, 2010; Shibata and Taniguchi, 2011; Faber and Rasio, 2012; Pfeiffer, 2012; Lehner and Pretorius, 2014).

4.2.5 Neutron star—neutron star mergers

4.2.5.1 Early Newtonian calculations with polytropic equations of state

The earliest NSNS merger calculations (Rasio and Shapiro, 1992; Davies et al., 1994; Zhuge et al., 1994; Rasio and Shapiro, 1994, 1995; Zhuge et al., 1996) were performed with Newtonian gravity and a polytropic equation of state, sometimes with a simple gravitational-wave backreaction force added. While these initial studies were, of course, rather simple, they set the stage for future efforts and settled questions about the qualitative merger dynamics and some of the emerging phenomena. For example, they established the emergence of a Kelvin-Helmholtz-unstable vortex sheet at the interface between the two stars which, due to the larger shear, is more pronounced for initially non-rotating stars. Moreover, they confirmed the expectation that a relatively baryon-free funnel would form along the binary rotation axis (Davies et al., 1994) (although this conclusion may need to be revisited in the light of emerging, neutrino-driven winds, see e.g., Dessart et al., 2009; Perego et al., 2014; Martin et al., 2015). These early simulations also established the basic morphological differences between tidally locked and irrotational binaries and between binaries of different mass ratios. In addition, these studies drove technical developments that became very useful in later work, such as the relaxation techniques to construct synchronized binary systems (Rasio and Shapiro, 1994) or the procedures to analyze the GW energy spectrum in the frequency band (Zhuge et al., 1994, 1996). See also Rasio and Shapiro (1999) for a review of this early research.

4.2.5.2 Studies with focus on microphysics

Studies with a focus on microphysics (in a Newtonian framework) were pioneered by Ruffert et al. (1996), who implemented a nuclear equation of state and a neutrino-leakage scheme into their Eulerian (PPM) hydrodynamics code. In SPH, the effects of a nuclear equation of state (EOS) were first explored in Rosswog et al. (1999). The authors implemented the Lattimer-Swesty EOS (Lattimer and Swesty, 1991) and neutrino cooling in the simple free-streaming limit to bracket the effects that neutrino emission may possibly have. To avoid artefacts from excessive artificial dissipation, they also included the time-dependent viscosity parameters suggested by Morris and Monaghan (1997), see Section 3.2.5. They found torus masses between 0.1 and 0.3 M⊙ and, maybe most importantly, that between 4 × 10−3 and 4 × 10−2 M⊙ of neutron star matter becomes ejected. A companion paper (Freiburghaus et al., 1999) post-processed trajectories from this study and found that all of this matter undergoes an r-process and yields an abundance pattern close to the one observed in the solar system, provided that the initial electron fraction is Ye ≈ 0.1. A subsequent study (Rosswog et al., 2000) explored the effects of different initial stellar spins and of mass ratios q ≠ 1 on the ejecta masses.

The simulation ingredients were further refined in Rosswog and Davies (2002); Rosswog and Liebendörfer (2003); Rosswog et al. (2003). Here, the Shen et al. (1998a, b) EOS, extended down to very low densities, was used, and a detailed multi-flavor neutrino leakage scheme was developed (Rosswog and Liebendörfer, 2003) that takes particular care to avoid using average neutrino energies. These studies were also performed at a substantially higher resolution (up to 106 particles) than previous SPH studies of the topic. The typical neutrino luminosities turned out to be ∼ 2 × 1053 erg/s, with typical average neutrino energies of 8/15/20 MeV for νe, \({\bar \nu _e}\) and the heavy-lepton neutrinos. Since GRBs were a major focus of the studies, neutrino annihilation was calculated in a post-processing step and, barring possible complications from baryonic pollution, it was concluded that \(\nu \bar \nu \) annihilation should lead to relativistic outflows and could produce moderately energetic sGRBs. Simple estimates indicated, however, that strong neutrino-driven winds are likely to occur, which could, depending on the wind geometry, pose a possible threat to the emergence of ultra-relativistic outflow/an sGRB. A more recent, 2D study of the neutrino emission from a merger remnant (Dessart et al., 2009) indeed found strong baryonic winds with mass-loss rates Ṁ ∼ 10−3 M⊙/s emerging along the binary rotation axis. Recently, studies of neutrino-driven winds have been extended to 3D (Perego et al., 2014) and the properties of the blown-off material have been studied in detail. This complex of topics (ν-driven winds, baryonic pollution, collapse of the central merger remnant) will certainly receive more attention in the future. Based on simple estimates, it was also suggested that any initial seed magnetic fields should be amplified to values near equipartition, making magnetically launched/aided outflows likely. Subsequent studies (Price and Rosswog, 2006; Anderson et al., 2008; Rezzolla et al., 2011; Zrake and MacFadyen, 2013) indeed found a strong amplification of initial seed magnetic fields.

In a recent set of studies, the neutron star mass parameter space was scanned by systematically exploring mass combinations from 1.0 to 2.0 M⊙ (Korobkin et al., 2012; Rosswog et al., 2013; Rosswog, 2013). The main focus was the dynamically ejected mass and its possible observational signatures. One interesting result (Korobkin et al., 2012) was that the nucleosynthetic abundance pattern is essentially identical for the dynamic ejecta of all mass combinations, and even NSBH systems yield practically the same pattern. While extremely robust to a variation of the astrophysical parameters, the pattern showed some sensitivity to the involved nuclear physics, for example to a change of the mass formula or the distribution of the fission fragments. The authors concluded that the dynamic ejecta of neutron star mergers are excellent candidates for the source of the heavy, so-called “strong r-process” that is observed in a variety of metal-poor stars and shows the same relative abundance pattern for nuclei beyond barium in each of them (Sneden et al., 2008). Based on these results, predictions were made for the resulting “macronovae” (sometimes called “kilonovae”) (Li and Paczyński, 1998; Kulkarni, 2005; Rosswog, 2005; Metzger et al., 2010a,b; Roberts et al., 2011). The first set of models assumed spherical symmetry (Piran et al., 2013; Rosswog et al., 2013), but a subsequent study (Grossman et al., 2014) was based on the 3D remnant structure obtained from hydrodynamic simulations of the expanding ejecta, which included the nuclear energy release during the hydrodynamic evolution (Rosswog et al., 2014). That study substantially profited from SPH’s geometric flexibility and its treatment of vacuum as just empty (i.e., SPH-particle-free) space: the ejecta expansion was followed for as many as 40 orders of magnitude in density, from nuclear matter down to the densities of interstellar matter. Since the 3D remnant structure was known from these calculations, viewing-angle effects for macronovae could also be explored (Grossman et al., 2014). Accounting for the very large opacities of the r-process ejecta (Barnes and Kasen, 2013; Kasen et al., 2013), Grossman et al. (2014) predicted that the resulting macronova should peak, depending on the binary system, between 3 and 5 days after the merger in the nIR, roughly consistent with what has been observed in the aftermath of GRB 130603B (Tanvir et al., 2013; Berger et al., 2013).
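The days-long rise to an nIR peak can be made plausible with the standard diffusion estimate used in the macronova literature, t_peak ≈ (3κ m_ej/4π c v)1/2; the opacity, ejecta mass and velocity below are assumed fiducial values, not those of any specific model discussed above.

    # Hedged order-of-magnitude sketch of the macronova peak time.
    import math
    c, Msun = 3.0e10, 1.989e33                 # [cm/s], [g]
    kappa   = 10.0                             # assumed opacity [cm^2/g]
    m_ej    = 0.01 * Msun                      # assumed ejecta mass [g]
    v       = 0.1 * c                          # assumed ejecta velocity [cm/s]
    t_peak  = math.sqrt(3.0 * kappa * m_ej / (4.0 * math.pi * c * v))
    print(f"t_peak ~ {t_peak/86400.0:.0f} d")  # of order a week

The large lanthanide opacities, tens of cm2/g instead of the ∼ 0.1 cm2/g typical of iron-group material, are what push the peak from hours to days and from the optical into the nIR.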

4.2.5.3 Studies with approximate GR gravity

A natural next step beyond Newtonian gravity is the application of post-Newtonian expansions. Blanchet et al. (1990) developed an approximate formalism for moderately relativistic, self-gravitating fluids which allows one to write all the equations in a quasi-Newtonian form and casts all relativistic non-localities in terms of Poisson equations with compactly supported sources. The 1PN equations require the solution of eight Poisson equations, and accounting for the lowest-order radiation reaction terms requires the solution of yet another Poisson equation. While computationally already quite heavy, with nine Poisson equations to solve, the efforts to implement the scheme into SPH by two groups (Faber and Rasio, 2000; Ayal et al., 2001; Faber et al., 2001; Faber and Rasio, 2002) turned out to be not particularly useful, mainly because for realistic neutron stars with compactness \(\mathcal{C} \approx 0.17\) the corrective 1PN terms are comparable to the Newtonian ones, which can lead to instabilities. As a result, one of the groups (Ayal et al., 2001) decided to study “neutron stars” of small compactness (M < 1 M⊙, R ≈ 30 km), while the other (Faber and Rasio, 2000; Faber et al., 2001; Faber and Rasio, 2002) artificially downsized the 1PN effects by choosing a different speed of light for the corresponding terms. While both approaches represented admissible first steps, the corresponding results are astrophysically difficult to interpret.

A second, more successful approach was the resort to the so-called conformal flatness approximation (CFA) (Isenberg, 2008; Wilson and Mathews, 1995; Wilson et al., 1996; Mathews and Wilson, 1997; Mathews et al., 1998). Here the basic assumption is that the spatial part of the metric is conformally flat, i.e., that it can be written as a multiple of the Kronecker symbol, \(\gamma_{ij} = \Psi^4 \delta_{ij}\), where the prefactor Ψ depends on space and time and absorbs the overall scale of the metric, and that it remains so during the subsequent evolution. The latter, however, is an assumption and by no means guaranteed. Physically, this corresponds to a gravitational-wave-free spacetime. Consequently, the inspiral of a binary system has to be achieved by adding an ad hoc radiation reaction force. The CFA also cannot handle frame-dragging effects. On the positive side, for spherically symmetric spacetimes the CFA coincides exactly with GR, and for small deviations from spherical symmetry, say for rapidly rotating neutron stars, it has been shown (Cook et al., 1996) to deliver very accurate results. For more general cases, such as a binary merger, the accuracy is difficult to assess. Nevertheless, given how complicated the overall problem is, the CFA is certainly a very useful approximation to full GR, in particular since it is computationally much more efficient than solving Einstein’s field equations.

The CFA was implemented into SPH by Oechslin et al. (2002) and slightly later by Faber et al. (2004). The major difference between the two approaches was that Oechslin et al. solved the set of six coupled, non-linear elliptic field equations by means of a multi-grid solver (Press et al., 1992), while Faber et al. used spectral methods from the Lorene library on two spherically symmetric grids around the stars. Both studies used polytropic equations of state (Oechslin et al., 2002 used Γ = 2.0, 2.6 and 3.0; Faber et al., 2004 used Γ = 2.0) and approximate radiation reaction terms based on the Burke-Thorne potential (Burke, 1971; Thorne, 1969). Oechslin et al. used a combination of a bulk and a von Neumann-Richtmyer artificial viscosity, steered similarly to the Morris and Monaghan (1997) approach, while Faber et al. argued that shocks would not be important and that artificial dissipation would not be needed.

In a subsequent study, Oechslin et al. (2004) explored how the presence of quark matter in neutron stars would impact on an NSNS merger and its gravitational-wave signal. They combined a relativistic mean-field model (above ρ = 2 × 1014 g cm−3) with a stiff polytrope as a model for the hadronic EOS and added an MIT bag model such that quark matter would appear at 5 × 1014 g cm−3 and completely dominate the EOS at high densities (> 1015 g cm−3). While the impact on the GW frequencies at the ISCO remained moderate (< 10%), the post-merger GW signal was substantially influenced in those cases where the central object did not collapse immediately into a BH. In a subsequent study, Oechslin and Janka (2006) implemented the Shen et al. (1998a, b) EOS and explored a range of NS mass ratios, mainly with respect to the question of how large the resulting torus masses would be and whether such merger remnants could plausibly power bursts similar to GRBs 050509b, 050709, 050724 and 050813. The resulting range of disk masses, 1–9% of the baryonic mass of the NSNS binary, was considered promising and broadly consistent with CBMs being the central engines of sGRBs.

In Oechslin et al. (2007), the same group also explored the Lattimer-Swesty EOS (Lattimer and Swesty, 1991), the cold EOS of Akmal et al. (1998) and ideal-gas equations of state with parameters fitted to nuclear EOSs. The merger outcome was rather sensitive to the nuclear matter EOS: the remnant collapsed either immediately or very soon after the merger for the soft Lattimer-Swesty EOS, while in all other cases it did not show signs of collapse for tens of dynamical time scales. Both ejecta and disk masses were found to increase with an increasing deviation of the mass ratio from unity. The ejecta masses were in a range between 10−3 and 10−2 M⊙, comparable to, but slightly lower than, the earlier Newtonian estimates (Rosswog et al., 1999). In terms of the GW signature, it turned out that the peak in the GW energy spectrum that is related to the formation of the hypermassive merger remnant has a frequency that is sensitive to the nuclear EOS (Oechslin and Janka, 2007). In comparison, the mass ratio and neutron star spin had only a weak impact.

This line of work was subsequently continued by Bauswein et al. in a series of papers (Bauswein et al., 2009, 2010b, a; Bauswein and Janka, 2012; Bauswein et al., 2012, 2013b, a). In their first study, they explored the merger of two strange stars (Bauswein et al., 2009, 2010b). If strange quark matter is really the ground state of matter, as hypothesized (Bodmer, 1971; Witten, 1984), compact stars made of strange quark matter might exist. Such stars would differ from neutron stars in the sense that they are self-bound, do not have an overall inverse mass-radius relationship and can be more compact. Therefore, the gravitational torques close to merger are different and it is more difficult to tidally tear apart a strange star. In their study (Bauswein et al., 2010b), the quark matter EOS was modelled via the MIT bag model (Chodos et al., 1974; Farhi and Jaffe, 1984) for two different bag constants (60 and 80 MeV/fm3) and a large number of binary systems (130 K SPH particles each) was explored. The coalescence of two strange stars is indeed morphologically different from a neutron star merger: the result is a differentially rotating hypermassive object with sharp boundary layers, surrounded by a thin and clumpy strange-matter disk, see Figure 17 for an example of a strange star merger with 1.2 and 1.35 M⊙ (Bauswein et al., 2010b). Moreover, due to the greater compactness, the peak GW frequencies were larger during both the inspiral and the subsequent ringdown phase. If the merger of two strange stars ejected matter in the form of “strangelets”, these should contribute to the cosmic-ray flux. The ejected mass, and therefore the contribution to the cosmic-ray flux, strongly depends on the chosen bag constant, and for large values no mass loss could be resolved. For such values neutron stars and strange stars could coexist, since neutron stars would not be converted into strange stars by capturing cosmic strangelets. In this and another study (Bauswein et al., 2010a), thermal effects (and their consistent treatment) were shown to have a substantial impact on the remnant structure.

Figure 17:

Merger of two strange stars with masses of 1.2 and 1.35 M⊙. Shown is the density in the orbital plane. Image reproduced with permission from Bauswein et al. (2010b), copyright by APS.

In subsequent work, a very large number of microphysical EOSs was explored (Bauswein and Janka, 2012; Bauswein et al., 2012, 2013b, a). Here, the authors systematically explored which imprint the nuclear EOS would have on the GW signal. They found that the peak frequency of the post-merger signal correlates well with the radii of the non-rotating neutron stars (Bauswein and Janka, 2012; Bauswein et al., 2012) and concluded that a GW detection would allow one to constrain the NS radius to within a few hundred meters. In a follow-up study, Bauswein et al. (2013a) explored the threshold mass beyond which a prompt collapse to a black hole occurs. The study also showed that the ratio between this threshold mass and the maximum mass is tightly correlated with the compactness of the non-rotating maximum-mass configuration.

In a separate study (Bauswein et al., 2013b), they used their large set of equations of state and several mass ratios to systematically explore dynamic mass ejection. According to their study, softer equations of state, with correspondingly smaller radii, eject a larger amount of mass. In the explored cases, they found a range from 10−3 to 2 × 10−2 M⊙ of dynamically ejected matter. For the arguably most likely case with 1.35 and 1.35 M⊙, they found a range of ejecta masses of about one order of magnitude, determined by the equation of state. Moreover, consistent with other studies, they found a robust r-process that produces a close-to-solar abundance pattern beyond nucleon number A = 130, and they discussed the implications for “macronovae” and for radio remnants that may emerge due to the ejecta.

4.2.6 Neutron star—black hole mergers

A number of issues that have complicated the merger dynamics in the WDWD case, such as the stability of mass transfer or the formation of a disk, see Section 4.1.3, are also very important for NSBH mergers. Here, however, they are further complicated by the poorly known high-density equation of state, which determines the mass-radius relationship and therefore the reaction of the neutron star to mass loss, by general-relativistic effects such as the appearance of an innermost stable circular orbit or effects of the BH spin, and by the fact that now the GW radiation-reaction time scale can become comparable to the dynamical time scale, see Eq. (156).

Of particular relevance is the question for which binary systems sizeable accretion tori form, since these tori are thought to be the crucial “transformation engines” that channel available energy into (relativistic) outflow. The final answer to this question requires 3D numerical simulations with the relevant physics, but a qualitative idea can be gained from simple estimates (based on the experience with WDWD binaries, however, keep in mind that even plausible approximations can yield rather poor results, see Section 4.1.3). Mass transfer is expected to set in when the Roche volume becomes comparable to the volume of the neutron star. By applying Paczyński’s estimate for the Roche-lobe radius (Paczyński, 1971) and equating it with the NS radius, one finds that the onset of mass transfer (which we use here as a proxy for the tidal disruption radius) can be expected near a separation of

$${a_{{\rm{MT}}}} = 2.17{R_{{\rm{NS}}}}{\left({{{1 + q} \over q}} \right)^{1/3}} \approx 26\,{\rm{km}}\left({{{{R_{{\rm{NS}}}}} \over {12\,{\rm{km}}}}} \right){\left({{{1 + q} \over q}} \right)^{1/3}}.$$
(159)

Since aMT grows, in the limit q ≪ 1, only proportionally to \(M_{{\rm{BH}}}^{1/3}\), while the ISCO and the event horizon grow ∝ MBH, the onset of mass transfer/disruption can take place inside the ISCO for large BH masses. At the very high end of BH masses, the neutron star is swallowed as a whole without being disrupted at all. A qualitative illustration (for fiducial neutron star properties, MNS = 1.4 M⊙ and RNS = 12 km) is shown in Figure 18. Roughly, already for black holes near MBH ≈ 8 M⊙ the mass transfer/disruption occurs near the ISCO, which makes it potentially difficult to form a massive torus from NS debris. Low-mass black holes are therefore clearly preferred as GRB engines. Numerical simulations (Faber et al., 2006a) have shown, however, that even if the disruption occurs deep inside the ISCO, this does not necessarily mean that all the matter is doomed to fall straight into the hole; a torus can still possibly form.
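The qualitative content of Figure 18 can be reproduced with a few lines of Python that compare Eq. (159) with the Newtonian ISCO estimate RISCO = 6GMBH/c2 for the same fiducial neutron star; the chosen black hole masses are illustrative.

    # Hedged comparison of Eq. (159) with the ISCO of a non-spinning BH.
    G, c, Msun = 6.674e-8, 3.0e10, 1.989e33     # cgs units
    M_ns, R_ns = 1.4, 12.0e5                    # fiducial NS: [Msun], [cm]

    for M_bh in (4.0, 8.0, 12.0, 20.0):         # illustrative BH masses [Msun]
        q = M_ns / M_bh
        a_mt = 2.17 * R_ns * ((1.0 + q) / q)**(1.0 / 3.0)   # Eq. (159)
        r_isco = 6.0 * G * M_bh * Msun / c**2               # non-spinning BH
        print(f"M_bh = {M_bh:4.1f} Msun:  a_MT / R_ISCO = {a_mt/r_isco:.2f}")
    # ratios ~1.2, 0.7, 0.5, 0.4: the disruption point moves inside the
    # ISCO as the BH mass grows, in qualitative agreement with Figure 18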

When discussing disk formation in a GRB context, it is worth keeping in mind that even seemingly small disk masses allow, at least in principle, for the extraction of energies,

$${E_{{\rm{extr}}}}\sim 1.8 \times {10^{51}}{\rm{erg}}\left({{\epsilon \over {0.1}}} \right)\left({{{{M_{{\rm{disk}}}}} \over {0.01{M_ \odot }}}} \right),$$
(160)

that are large enough to accommodate the isotropic gamma-ray energies, Eγ,iso ∼ 1050 erg, that have been inferred for short bursts (Berger, 2013). If short bursts are collimated into a half-opening angle θ, their true energies are substantially lower than this number, Eγ,true = (Eγ,iso/65) (θ/10°)2.
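A minimal sketch, assuming a fiducial efficiency, disk mass and opening angle, shows that the budget of Eq. (160) comfortably covers the beaming-corrected energies quoted above.

    # Hedged evaluation of Eq. (160) and of the beaming correction.
    eps, M_disk = 0.1, 0.01                # assumed efficiency, disk mass [Msun]
    E_extr = 1.8e51 * (eps / 0.1) * (M_disk / 0.01)       # [erg], Eq. (160)

    E_iso, theta = 1.0e50, 10.0            # assumed E_iso [erg], angle [deg]
    E_true = (E_iso / 65.0) * (theta / 10.0)**2           # [erg]
    print(f"E_extr ~ {E_extr:.1e} erg, E_true ~ {E_true:.1e} erg")
    # a 0.01 Msun disk exceeds the collimation-corrected demand by ~10^3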

Figure 18:

Illustration of the mass-transfer separation aMT relative to the horizon and the innermost stable circular orbit, based on Newtonian estimates (for fiducial neutron star properties, MNS = 1.4 M⊙ and RNS = 12 km).

4.2.6.1 Early Newtonian calculations with polytropic equations of state

The first NSBH simulations were performed by Lee and Kluźniak (1999a, b) and Lee (2000, 2001) using Newtonian physics and polytropic equations of state. Although simple and missing some qualitative features of black holes (such as a last stable orbit), these simulations provided insight and a qualitative understanding of the system dynamics and of the impact of neutron star spin, the mass ratio and the equation of state. The first set of simulations (Lee and Kluźniak, 1999a) was carried out with an SPH formulation similar to that of Hernquist and Katz (1989), but with an alternative kernel-gradient symmetrization. In this study, the artificial viscosity tensor, Eq. (82), was implemented with fixed dissipation parameters α and β, and ∼ 104 SPH particles were used. The black hole was modelled as a Newtonian point mass with an absorbing boundary condition at the Schwarzschild radius RS; no backreaction from gravitational-wave emission was accounted for. In the simulations of initially tidally locked binaries with a stiff EOS (Γ = 3; Lee and Kluźniak, 1999a), they found that the neutron star survived the onset of mass transfer and kept orbiting, at a reduced mass, for several orbital periods. A similar behavior was later also found in work with a stiff nuclear equation of state (Rosswog et al., 2004; Rosswog, 2007b). In a follow-up paper (Lee and Kluźniak, 1999b), a simple point-mass backreaction force was applied and, in one case, the Paczyński-Wiita potential (Paczyński and Wiita, 1980) was used (now with an absorbing boundary at 1.5 RS). The main focus of this study, however, was to explore the effect of a softer equation of state (Γ = 5/3). In all explored cases the system dynamics was very different from the previous study: the neutron star was completely disrupted and formed a massive disk of ∼ 0.2 M⊙, with ∼ 0.01 M⊙ being dynamically ejected. The sensitivity to the EOS stiffness is not entirely surprising, since the solutions of the Lane-Emden equation give the mass-radius relationship

$${{d\log R} \over {d\log M}} = (\Gamma - 2)/(3\Gamma - 4)$$
(161)

for a polytropic star, so that the neutron star reacts differently to mass loss: it shrinks for Γ > 2, so mass loss is quenched, while it expands for Γ < 2, which further enhances the mass transfer.
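A short numerical check of Eq. (161) for a few representative values of Γ makes this dichotomy explicit.

    # Hedged check of Eq. (161): mass-radius response of Newtonian polytropes.
    for gamma in (5.0 / 3.0, 2.0, 2.5, 3.0):
        slope = (gamma - 2.0) / (3.0 * gamma - 4.0)   # d log R / d log M
        print(f"Gamma = {gamma:.2f}:  dlogR/dlogM = {slope:+.2f}")
    # 5/3 -> -0.33 (expands on mass loss: runaway transfer)
    # 2   ->  0.00 (radius unchanged)
    # 2.5, 3 -> +0.14, +0.20 (shrinks on mass loss: transfer quenched)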

In a second set of calculations (≈ 80 K SPH particles), they explored non-rotating neutron stars that were modelled as compressible triaxial ellipsoids following the semi-analytic work of Lai et al. (1993a, b, 1994), both with stiff (Γ = 2.5 and 3) (Lee, 2000) and soft (Γ = 5/3 and 2) (Lee, 2001) polytropic equations of state. They used the same simulation technology, but now also applied a Balsara limiter, see Eq. (95), in their artificial viscosity treatment; only purely Newtonian interaction between the NS and the BH was considered. For the Γ = 3 case, the neutron star again survived until the end of the simulation; with Γ = 2.5 it survived the first mass-transfer episode but was subsequently completely disrupted and formed a disk of nearly 0.2 M⊙, with about 0.03 M⊙ being dynamically ejected.

Lee et al. (2001) also simulated mergers between a black hole and a strange star, which was modelled with a simple quark-matter EOS. The dynamical evolution of such systems was quite different from the polytropic case: the strange star was stretched into a thin matter stream that wound around the black hole and was finally swallowed. Although “starlets” of ≈ 0.03 M⊙ formed during the disruption process, all of them were in the end swallowed by the hole within milliseconds, and no mass loss could be resolved.

4.2.6.2 Studies with focus on microphysics

The first NSBH studies based on Newtonian gravity but including detailed microphysics were performed by Ruffert and Janka (1999) and Janka et al. (1999) using a Eulerian PPM code on a Cartesian grid. The first Newtonian-gravity-plus-microphysics SPH simulations of NSBH mergers were discussed in Rosswog et al. (2004); Rosswog (2005). Here the black hole was simulated as a Newtonian point mass with an absorbing boundary and a simple GW backreaction force was applied. For the neutron star, the Shen et al. (1998a, b) temperature-dependent nuclear EOS was used and the star was modelled with 3 × 105 − 106 SPH particles. In addition, neutrino cooling and electron/positron captures were followed with a detailed multi-flavor leakage scheme (Rosswog and Liebendörfer, 2003). The initial study focussed on systems with low-mass black holes (q = 0.5−0.1), since this way the chances are greater that the neutron star is disrupted outside of the ISCO, see above. Moreover, both (carefully constructed) corotating and irrotational neutron stars were studied. In all cases the core of the neutron star (0.15−0.68 M⊙) survived the initial mass-transfer episodes until the end of the simulations (22−64 ms). If disks formed at all during the simulated time, they had only moderate masses (∼ 0.005 M⊙). One of the NSBH systems (MNS = 1.4 M⊙, MBH = 3 M⊙) was followed throughout the whole mass-transfer episode (Rosswog, 2007b), which lasted for 220 ms or 47 orbital revolutions and only ended when the neutron star finally became disrupted, resulting in the formation of a disk of 0.05 M⊙. A set of test calculations with a stiff (Γ = 3) and a softer (Γ = 2) polytropic EOS indicated that such episodic mass transfer is related to the stiffness of the NS EOS and only occurs for stiff cases, consistent with the results of Lee (2000). Subsequent studies with better approximations to relativistic gravity, e.g., Faber et al. (2006b), have seen qualitatively similar effects for stiffer EOSs, but after a few orbital periods the neutron star was always disrupted. Shibata and Taniguchi (2011) discussed such episodic, long-lived mass transfer in a GR context and concluded that, while possible, it has so far never been seen in fully relativistic studies.

A study (Rosswog, 2005) with simulation tools similar to Rosswog et al. (2004) focussed on higher-mass, non-spinning black holes (MBH = 14–20 M⊙) that were approximated by pseudo-relativistic potentials (Paczyński and Wiita, 1980). While very simple to implement, this approach mimics some GR effects quite well; in particular, it has an innermost stable circular particle orbit at the correct location (6GMBH/c2), see Tejeda and Rosswog (2013) for a quantitative assessment of various properties. In none of these high black hole mass cases was episodic mass transfer observed; the neutron star was always completely disrupted shortly after the onset of mass transfer. Although disks formed for systems below 18 M⊙, a large part of them was inside the ISCO and fell practically radially into the hole on a dynamical time scale. As a result, the disks were thin and cold and not considered promising GRB engines. It was suggested, however, that even black holes at the high end of the mass distribution could possibly be GRB engines, provided they spin rapidly enough, since then both the ISCO and the horizon move closer to the hole. The investigated systems ejected between 0.01 and 0.2 M⊙ at large velocities (∼ 0.5 c), and analytical estimates suggested that such systems should produce bright optical/near-infrared transients (“macronovae”) powered by the radioactive decay of the freshly produced r-process elements within the ejecta, as originally suggested by Li and Paczyński (1998).
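That the Paczyński-Wiita potential, Φ = −GM/(r − RS), indeed places the innermost stable circular orbit at 6GM/c2 can be verified in a few lines; the sketch below works in geometrized units (G = c = M = 1, so RS = 2) and simply locates the minimum of the specific angular momentum of circular orbits.

    # Hedged numerical check of the ISCO in the Paczynski-Wiita potential.
    R_S = 2.0                      # Schwarzschild radius for G = c = M = 1

    def l2(r):
        # squared specific angular momentum of a circular orbit,
        # from v^2/r = GM/(r - R_S)^2  =>  l^2 = r^3/(r - R_S)^2
        return r**3 / (r - R_S)**2

    # stable circular orbits require l^2 to increase outwards;
    # the minimum of l^2(r) therefore marks the ISCO
    radii = [2.5 + 0.001 * i for i in range(10000)]
    r_isco = min(radii, key=l2)
    print(f"r_ISCO = {r_isco:.3f}  (expected: 3 R_S = 6.0)")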

4.2.6.3 Studies with approximate GR gravity around a non-spinning black hole

Faber et al. (2006b, a) studied the merger of a non-rotating black hole with a polytropic neutron star in approximate GR gravity. While the hydrodynamics code was fully relativistic, the self-gravity of the neutron star was treated within the conformal flatness (CF) approximation. Since the black hole was kept at a fixed position, its mass needed to be substantially larger than that of the neutron star; therefore, a mass ratio of q = 0.1 was chosen. The neutron star matter was modelled with two (by nuclear-matter standards) relatively soft polytropes (Γ = 1.5 and 2). In the first study (Faber et al., 2006b), they focussed on tidally locked neutron stars and solved the five linked non-linear field equations of the CF approach by means of the Lorene libraries; the second study (Faber et al., 2006a) used irrotational neutron stars and solved the CF equations by means of a fast-Fourier-transform solver. In a first case they considered a neutron star of compactness \(\mathcal{C} = 0.15\), and, to simulate a case where the disruption of the neutron star occurs near the ISCO, they also considered a second case where \(\mathcal{C}\) was only 0.09. The first case turned out to be astrophysically unspectacular: the entire neutron star was swallowed whole without leaving any matter behind. The second, “undercompact” case, however, see Figure 19, showed some very interesting behavior: the neutron star spiralled towards the black hole and became tidally stretched, and although at some point 98% of the NS mass was inside the ISCO (see the second panel), a substantial fraction of the matter was ejected as a one-armed spiral via a rapid redistribution of angular momentum. Approximately 12% of the initial neutron star mass formed a disk and an additional 13% was tidally ejected into unbound orbits. Such systems, they concluded, would be interesting sGRB engines.

Figure 19:

Disruption of an “undercompact” neutron star (compactness \(\mathcal{C} = 0.09\)) by a non-rotating black hole (q = 0.1), simulated within the conformal flatness approximation. Although at some stage 98% of the neutron star is within the innermost stable circular orbit (dashed circle), rapid redistribution of angular momentum leads to the ejection of a one-armed spiral. Finally, a disk made of 0.12 MNS forms and 0.13 MNS are dynamically ejected. Image reproduced with permission from Faber et al. (2006a), copyright by AAS.

4.2.6.4 Studies in the fixed metric of a spinning black hole

Rantsiou et al. (2008) explored how the outcome of a neutron star—black hole merger depends on the spin of the black hole and on the inclination angle of the binary orbit with respect to the equatorial plane of the black hole. They used the relativistic SPH code originally developed by Laguna et al. (1993a, b) to study the tidal disruption of a main-sequence star by a massive black hole. The new code version employed Kerr-Schild coordinates to avoid the coordinate singularities at the horizon that appear in the frequently used Boyer-Lindquist coordinates. Since the spacetime was kept fixed, they focused on a small mass ratio, q = 0.1, where the impact of the neutron star on the spacetime is sub-dominant. Both the black hole mass and spin were frozen at their initial values during the simulation, and the GW backreaction was implemented via the quadrupole approximation in the point-mass limit, similar to the one used by Lee and Kluźniak (1999b). The neutron star itself was modelled as a Γ = 2.0 polytrope with Newtonian self-gravity; the artificial dissipation parameters were fixed at 0.2 (instead of values near unity, which are needed to properly deal with shocks). Note that Γ = 2 is a special choice, since such a Newtonian star does not change its radius if mass is added or lost, see Eq. (161). The bulk of the simulations was calculated with 104 SPH particles; in one case 105 particles were used to confirm the robustness of the results.

For the case of a Schwarzschild black hole (\(a_{\rm BH} = 0\)) they found that neither a disk formed nor was any material ejected. For equatorial mergers with spinning black holes, a spin parameter of \(a_{\rm BH} > 0.7\) was required for any mass to form a disk or to become ejected. For a rapidly spinning black hole (\(a_{\rm BH} = 0.75\)) an amount of matter of order 0.01 \(M_\odot\) became unbound, while for a close-to-maximally spinning black hole (\(a_{\rm BH} = 0.99\)) a huge amount of matter (> 0.4 \(M_\odot\)) was ejected. Mergers with inclination angles > 60° led to the complete swallowing of the neutron star. An example with a close-to-maximally spinning black hole (\(a_{\rm BH} = 0.99\)) and a neutron star whose orbital plane is inclined by 45° with respect to the black-hole spin is shown in Figure 20. Here, as much as 25% of the neutron star, in the shape of a helix, becomes unbound.

Figure 20:

Disruption of a polytropic neutron star (Γ = 2.0) by a close-to-maximally spinning black hole (\(a_{\rm BH} = 0.99\)), where the binary orbit has an initial inclination of 45° with respect to the black-hole spin. The “helix” of unbound material contains 25% of the original neutron-star mass. Image reproduced with permission from Rantsiou et al. (2008), copyright by AAS.

4.2.7 Collisions between two neutron stars and between a neutron star and a black hole

Traditionally, the focus of compact-object encounter studies has been on GW-driven binary systems such as the Hulse-Taylor pulsar (Taylor and Weisberg, 1989; Weisberg et al., 2010). More recently, however, dynamical collisions/high-eccentricity encounters between two compact objects have also attracted a fair amount of interest (Kocsis et al., 2006; O’Leary et al., 2009; Lee et al., 2010; East et al., 2012; Kocsis and Levin, 2012; Gold and Brügmann, 2013). Unfortunately, their rates are at least as difficult to estimate as those of GW-driven mergers.

Collisions differ from gravitational-wave-driven mergers in a number of ways. For example, since gravitational-wave emission from eccentric binaries removes angular momentum more efficiently than energy, primordial binaries will have radiated away their eccentricity and will finally merge from a nearly circular orbit. Binaries that have formed dynamically, say in a globular cluster, in contrast, start from a small orbital separation but with large eccentricities, and may not have had time to circularize before merger. This leads to markedly different gravitational-wave signatures: “chirping” signals of increasing frequency and amplitude for mergers, and initially well-separated, repeated GW bursts that continue from minutes to days in the case of collisions. Moreover, compact binaries are strongly gravitationally bound at the onset of the dynamical merger phase, while collisions have total orbital energies close to zero and need to shed energy and angular momentum via GW emission and/or mass-shedding episodes in order to form a single remnant. Due to the strong dependence on the impact parameter and the lack of strong constraints on it, one expects a much larger variety of dynamical behavior for collisions than for mergers.
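
The circularization argument can be made quantitative with the orbit-averaged evolution equations of Peters (1964) for the semi-major axis a and the eccentricity e. The minimal Python sketch below (the initial values are illustrative choices, not parameters from any of the cited studies) integrates these equations and shows how even a moderately eccentric primordial binary arrives at merger with e ≈ 0:

import numpy as np
from scipy.integrate import solve_ivp

G, c, Msun = 6.674e-8, 2.998e10, 1.989e33   # cgs units

def peters_rhs(t, y, m1, m2):
    """Orbit-averaged da/dt and de/dt of Peters (1964)."""
    a, e = y
    k = G**3 * m1 * m2 * (m1 + m2) / c**5
    dadt = -(64/5) * k * (1 + (73/24)*e**2 + (37/96)*e**4) \
           / (a**3 * (1 - e**2)**3.5)
    dedt = -(304/15) * k * e * (1 + (121/304)*e**2) \
           / (a**4 * (1 - e**2)**2.5)
    return [dadt, dedt]

m1 = m2 = 1.4 * Msun                  # two "canonical" neutron stars
a0, e0 = 1.0e11, 0.6                  # illustrative initial orbit [cm]

def contact(t, y, m1, m2):            # stop once a has shrunk to ~30 km
    return y[0] - 3.0e6
contact.terminal = True

sol = solve_ivp(peters_rhs, [0.0, 1.0e18], [a0, e0], args=(m1, m2),
                events=contact, rtol=1e-8)
print(f"time to merger: {sol.t[-1]:.2e} s, "
      f"final eccentricity: {sol.y[1, -1]:.1e}")   # e has decayed to ~0

A dynamically formed binary, by contrast, starts with e close to unity and a pericenter of only a few stellar radii, so this slow secular decay has essentially no time to act before the first close passage.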

Lee et al. (2010) provided detailed rate estimates of compact-object collisions and concluded that such encounters could possibly make an interesting contribution to the observed GRB rate. They also performed the first SPH simulations of such encounters. Using the SPH code from their earlier studies (Lee and Kluźniak, 1999a,b; Lee, 2000, 2001), they explored the dynamics and remnant structure of encounters of different strengths between all types of compact stellar objects (WD/NS/BH; typically with \(10^5\) SPH particles). Polytropic equations of state were used, and black holes were treated as Newtonian point masses with absorbing boundaries at their Schwarzschild radii. Their calculations indicated in particular that such encounters would produce interesting GRB engines with massive disks and additional external reservoirs (one tidal tail for each close encounter), where large amounts of matter (> 0.1 \(M_\odot\)) could be stored to possibly prolong the central engine activity, as observed in some bursts. In addition, a substantial amount of mass was dynamically ejected (0.03 \(M_\odot\) for NSNS and up to 0.2 \(M_\odot\) for NSBH systems).
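
The “Newtonian point mass with an absorbing boundary” treatment of the black hole used here (and in several studies above and below) is conceptually simple. The sketch below shows one possible realization in Python; the function and variable names are hypothetical and the snippet is not code from the cited papers, whose implementation details (e.g., whether the hole’s mass and momentum are updated) may differ:

import numpy as np

G, c = 6.674e-8, 2.998e10            # cgs units

def absorb_particles(pos, vel, mass, bh_pos, bh_mass):
    """Remove SPH particles that cross the Schwarzschild radius and
    add their mass to the hole (schematic 'absorbing boundary')."""
    r_s = 2.0 * G * bh_mass / c**2                  # absorbing radius
    r = np.linalg.norm(pos - bh_pos, axis=1)        # BH-particle distances
    swallowed = r < r_s
    bh_mass += mass[swallowed].sum()                # hole grows accordingly
    keep = ~swallowed
    return pos[keep], vel[keep], mass[keep], bh_mass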

In Rosswog et al. (2013), various signatures of gravitational-wave-driven mergers and dynamical collisions were compared, both for NSNS and NSBH encounters. The study applied Newtonian SPH (with up to 8 × \(10^6\) particles) together with a nuclear equation of state (Shen et al., 1998a,b) and a detailed neutrino-leakage scheme (Rosswog and Liebendörfer, 2003). As above, black holes were modelled as Newtonian point masses with absorbing boundaries at their Schwarzschild radii. A simulation result of a strong encounter between a 1.3 and a 1.4 \(M_\odot\) neutron star (pericenter distance equal to the average of the two neutron-star radii) is shown in Figure 21. Due to the strong shear at their interface, a string of Kelvin-Helmholtz vortices forms in each of the close encounters before a final central remnant forms. Such conditions offer plenty of opportunity for magnetic-field amplification (Price and Rosswog, 2006; Anderson et al., 2008; Obergaulinger et al., 2010; Rezzolla et al., 2011; Zrake and MacFadyen, 2013). In all explored cases, the neutrino luminosity was at least comparable to the merger case, \(L_\nu \approx 10^{53}\) erg/s, and in the more extreme cases it exceeded this value by an order of magnitude. Thus, if neutrino annihilation should be the main agent driving a GRB outflow, the chances for collisions should be at least as good as in the merger case. Both scenarios, however, also share the same caveat: neutrino-driven baryonic pollution could, in at least a fraction of cases, prevent the emergence of relativistic outflows. In NSBH collisions the neutron star usually took several encounters before being completely disrupted. In some cases its core survived several encounters and was finally ejected with a mass of ∼ 0.1 \(M_\odot\). This offers a number of interesting possibilities (production of low-mass neutron stars, explosion of the NS core at the minimal mass, etc.), but such events may be very rare, and it remains to be seen whether this behavior can occur at all in the general-relativistic case.

Generally, both NSNS and NSBH collisions ejected large quantities of unbound debris. Collisions between neutron stars ejected a few percent of a solar mass (depending on the impact strength), while all investigated NSBH collisions ejected ∼ 0.15 \(M_\odot\), consistent with the findings of Lee et al. (2010). Since NSBH encounters should dominate the rates (Lee et al., 2010), it was concluded in Rosswog et al. (2013) that collisions must occur at (possibly much) less than 10% of the NSNS merger rate to avoid a conflict with constraints from the chemical evolution of galaxies.

Figure 21:

mpg-Movie (46.0 MB) — Collision between a 1.3 \(M_\odot\) and a 1.4 \(M_\odot\) neutron star (modelled with 8 × \(10^6\) SPH particles; pericenter distance equal to the average of the two neutron-star radii). Only matter below the orbital plane is shown; color-coded is the temperature in MeV. The corresponding simulations are discussed in detail in Rosswog et al. (2013). (For video see appendix)

Since the ejecta velocities and masses are here substantially larger (\(v_{\rm ej} \sim 0.2c\) and \(m_{\rm ej} \sim 0.1\,M_\odot\)) than in the neutron-star merger case (\(v_{\rm ej} \sim 0.1c\) and \(m_{\rm ej} \sim 0.01\,M_\odot\)), simple scaling relations (Grossman et al., 2014) suggest that the resulting radioactively powered macronova should peak after

$${t_{\rm{P}}} = 11\,{\rm{days}}\,{\left({{\kappa \over {10\,{\rm{cm}}^2/{\rm{g}}}}\,{{{m_{{\rm{ej}}}}} \over {0.1\,{M_\odot}}}\,{{0.2c} \over {{v_{{\rm{ej}}}}}}} \right)^{1/2}}$$
(162)

with a luminosity of

$${L_{\rm{P}}} = 8.8 \times {10^{40}}\,{\rm{erg/s}}\,{\left({{{{v_{{\rm{ej}}}}} \over {0.2c}}\,{{10\,{\rm{cm}}^2/{\rm{g}}} \over \kappa}} \right)^{0.65}}{\left({{{{m_{{\rm{ej}}}}} \over {0.1\,{M_\odot}}}} \right)^{0.35}}.$$
(163)

Here, κ is the r-process material opacity (Kasen et al., 2013) and a radioactive heating rate \(\dot{\epsilon} \propto {t^{ - 1.3}}\) (Metzger et al., 2010b; Korobkin et al., 2012) has been assumed.
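
For orientation, Eqs. (162) and (163) are easy to evaluate numerically; the small Python helper below simply encodes the two scaling relations (the function name and default values are for illustration only):

def macronova_peak(m_ej=0.1, v_ej=0.2, kappa=10.0):
    """Peak time [days] and peak luminosity [erg/s] from Eqs. (162)-(163);
    m_ej in solar masses, v_ej in units of c, kappa in cm^2/g."""
    t_p = 11.0 * ((kappa / 10.0) * (m_ej / 0.1) * (0.2 / v_ej)) ** 0.5
    L_p = 8.8e40 * ((v_ej / 0.2) * (10.0 / kappa)) ** 0.65 \
          * (m_ej / 0.1) ** 0.35
    return t_p, L_p

# collision-like ejecta (the values quoted above)
print(macronova_peak(0.1, 0.2))    # -> (11.0 days, 8.8e40 erg/s)
# merger-like ejecta for comparison: peaks earlier and fainter
print(macronova_peak(0.01, 0.1))   # -> (~4.9 days, ~2.5e40 erg/s)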

4.2.8 Post-merger disk evolution

SPH simulations have also been applied to study the long-term evolution of the accretion disks that form in a CBM. Lee and collaborators (Lee and Ramirez-Ruiz, 2002; Lee et al., 2004, 2005, 2009) started from their NSBH merger simulations, see Section 4.2.6, and followed the fate of the resulting disks. Since the viscous disk time scale, see Eq. (157), by far exceeds the time span that can be covered with explicit time steps subject to the CFL condition, Eq. (158), the merger results were mapped into a 2D version of their code, and the subsequent evolution, driven by a Shakura-Sunyaev “α-viscosity” prescription (Shakura and Sunyaev, 1973), was followed for hundreds of milliseconds. Consistent with their NSBH merger simulations, the black hole was treated as a Newtonian point mass with an absorbing boundary at \(R_{\rm S} = 2GM_{\rm BH}/c^2\); the disk’s self-gravity was neglected. In the first study (Lee and Ramirez-Ruiz, 2002) the disk matter was modelled with a polytropic EOS (Γ = 4/3), and locally dissipated energy was assumed to be emitted via neutrinos. Subsequent studies (Lee et al., 2004, 2005) applied increasingly sophisticated microphysics; the latter study accounted for pressure contributions from relativistic electrons with an arbitrary degree of degeneracy and from an ideal gas of nucleons and alpha particles. These later studies included opacity-dependent neutrino cooling and also considered trapped neutrinos as a source of pressure, but no distinction between the different neutrino flavors was made. It turned out that at the transition between the inner, neutrino-opaque and the outer, transparent regions an inversion of the lepton-number gradient builds up, with minimum values \(Y_e \approx 0.05\) close to the transition radius (∼ \(10^7\) cm), values close to 0.1 near the BH, and proton-rich conditions (\(Y_e > 0.5\)) at large radii. Such lepton-number gradients drive strong convective motions that shape the inner disk regions. Overall, neutrino luminosities around ≈ \(10^{53}\) erg/s were found, and around \(10^{52}\) erg were emitted in neutrinos over the lifetime of the disk (∼ 0.4 s).
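
The separation of time scales that makes the remapping step necessary can be illustrated with the standard Shakura-Sunyaev estimate, \(t_{\rm visc} \approx R^2/\nu\) with \(\nu = \alpha c_{\rm s} H\). The numbers in the Python sketch below are rough, illustrative choices, not values from the cited simulations:

def viscous_timescale(R, H, c_s, alpha=0.1):
    """Viscous inflow time t_visc = R^2 / nu with the Shakura-Sunyaev
    alpha-prescription nu = alpha * c_s * H (all inputs in cgs)."""
    return R**2 / (alpha * c_s * H)

R   = 1.0e7                # disk radius [cm]
H   = 0.3 * R              # disk scale height [cm]
c_s = 3.0e9                # sound speed [cm/s], ~0.1c for a hot inner disk
print(f"t_visc ~ {viscous_timescale(R, H, c_s):.1e} s")   # ~ 0.1 s

# compare to an explicit (CFL-limited) time step for an assumed
# SPH smoothing length of h = 1 km:
h = 1.0e5
print(f"CFL step ~ {h / c_s:.1e} s")   # ~ 3e-5 s -> millions of steps

With a viscous time of order 0.1 s against a CFL step of tens of microseconds, a direct 3D SPH integration over the disk lifetime is prohibitively expensive, which motivates the mapping to a cheaper 2D treatment.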

4.2.9 Summary: Encounters between neutron stars and black holes

Compact binary mergers are related to a number of vibrant astrophysical topics. They are likely the first sources whose gravitational waves will be detected directly, they probably power short gamma-ray bursts, and they may be the astrophysical sites where a large fraction of the heaviest elements in the Universe are forged. With this great promise for several fields comes the necessity to reliably include a broad range of physical processes. Here, much progress has been achieved in the last one and a half decades. The task for the future will be to bring the different facets of compact binary encounters into a coherent, bigger astrophysical picture. Very likely, our understanding will be challenged once, say, the first gamma-ray burst is also observed as a gravitational-wave event. If the interpretation of the recent nIR transient related to GRB 130603B as a “macronova” is correct, it constitutes the first link between two suspected, but previously unproven, aspects of the compact-binary-merger phenomenon: gamma-ray bursts and heavy-element nucleosynthesis. It may just have been one of the first heralds of the beginning era of multi-messenger astronomy.

Figure 22:

mpg-Movie (14.7 MB) Still from a movie — Grazing collision between a 5 \(M_\odot\) black hole and a 1.3 \(M_\odot\) neutron star. Shown are the densities in the orbital plane after the first, second and third close encounter. In this particular case, the neutron star’s core survives all three encounters, each time producing a tidal tail, and is finally ejected with ∼ 0.1 \(M_\odot\). The corresponding simulations are discussed in detail in Rosswog et al. (2013). (For video see appendix)

SPH has played a major role in achieving our current understanding of the astrophysics of compact binary mergers. The main reasons for its use have been its geometrical flexibility, its excellent numerical conservation properties, and the ease with which new physics ingredients can be implemented. A broad range of physical ingredients has been incorporated into the simulations that exist to date, including a large number of different equations of state, weak interactions/neutrino emission, and magnetic fields. In terms of gravitational physics, mergers have been simulated in Newtonian, post-Newtonian and conformal-flatness approximations to GR. An important milestone, however, which at the time of writing (June 2014) has yet to be achieved, is an SPH simulation in which the spacetime is self-consistently evolved in dynamical GR.