1 Introduction

1.1 The scope and style of the review

The historical development of laser interferometers for application as gravitational-wave detectors [47] has involved the combination of relatively simple optical subsystems into more and more complex assemblies. The individual elements that compose the interferometers, including mirrors, beam splitters, lasers, modulators, various polarising optics, photo detectors and so forth, are individually well described by relatively simple, mostly-classical physics. Complexity arises from the combination of multiple mirrors, beam splitters etc. into optical cavity systems, have narrow resonant features, and the consequent requirement to stabilise relative separations of the various components to sub-wavelength accuracy, and indeed in many cases to very small fractions of a wavelength.

Thus, classical physics describes the interferometer techniques and the operation of current gravitational-wave detectors. However, we note that at signal frequencies above a couple of hundreds of Hertz, the sensitivity of current detectors is limited by the photon counting noise at the interferometer readout, also called shot-noise. The next generation systems such as Advanced LIGO [23, 5], Advanced Virgo [4] and LCGT [36] are expected to operate in a regime where the quantum physics of both light and mirror motion couple to each other. Then, a rigorous quantum-mechanical description is certainly required. Sensitivity improvements beyond these ‘Advanced’ detectors necessitate the development of non-classical techniques. The present review, in its first version, does not consider quantum effects but reserves them for future updates.

The components employed tend to behave in a linear fashion with respect to the optical field, i.e., nonlinear optical effects need hardly be considered. Indeed, almost all aspects of the design of laser interferometers are dealt with in the linear regime. Therefore the underlying mathematics is relatively simple and many standard techniques are available, including those that naturally allow numerical solution by computer models. Such computer models are in fact necessary as the exact solutions can become quite complicated even for systems of a few components. In practice, workers in the field rarely calculate the behaviour of the optical systems from first principles, but instead rely on various well-established numerical modelling techniques. An example of software that enables modelling of either time-dependent or frequency-domain behaviour of interferometers and their component systems is Finesse [22, 19]. This was developed by one of us (AF), has been validated in a wide range of situations, and was used to prepare the examples included in the present review.

The target readership we have in mind is the student or researcher who desires to get to grips with practical issues in the design of interferometers or component parts thereof. For that reason, this review consists of sections covering the basic physics and approaches to simulation, intermixed with some practical examples. To make this as useful as possible, the examples are intended to be realistic with sensible parameters reflecting typical application in gravitational wave detectors. The examples, prepared using Finesse, are designed to illustrate the methods typically applied in designing gravitational wave detectors. We encourage the reader to obtain Finesse and to follow the examples (see Appendix A).

1.2 Overview of the goals of interferometer design

As set out in very many works, gravitational-wave detectors strive to pick out signals carried by passing gravitational waves from a background of self-generated noise. The principles of operation are set out at various points in the review, but in essence, the goal has been to prepare many photons, stored for as long as practical in the ‘arms’ of a laser interferometer (traditionally the two arms are at right angles), so that tiny phase shifts induced by the gravitational waves form as large as possible a signal, when the light leaving the appropriate ‘port’ of the interferometer is detected and the resulting signal analysed.

The evolution of gravitational-wave detectors can be seen by following their development from prototypes and early observing systems towards the Advanced detectors, which are currently in the final stages of planning or early stages of construction. Starting from the simplest Michelson interferometer [18], then by the application of techniques to increase the number of photons stored in the arms: delay lines [31], Fabry-Pérot arm cavities [16, 17] and power recycling [15]. The final step in the development of classical interferometry was the inclusion of signal recycling [41, 30], which, among other effects, allows the signal from a gravitational-wave signal of approximately-known spectrum to be enhanced above the noise.

Reading out a signal from even the most basic interferometer requires minimising the coupling of local environmental effects to the detected output. Thus, the relative positions of all the components must be stabilised. This is commonly achieved by suspending the mirrors etc. as pendulums, often multi-stage pendulums in series, and then applying closed-loop control to maintain the desired operating condition. The careful engineering required to provide low-noise suspensions with the correct vibration isolation, and also low-noise actuation, is described in many works. As the interferometer optics become more complicated, the resonance conditions, i.e., the allowed combinations of inter-component path lengths required to allow the photon number in the interferometer arms to reach maximum, become more narrowly defined. It is likewise necessary to maintain angular alignment of all components, such that beams required to interfere are correctly co-aligned. Typically the beams need to be aligned within a small fraction (and sometimes a very small fraction) of the far-field diffraction angle, and the requirement can be in the low nanoradian range for km-scale detectors [44, 21]. Therefore, for each optical component there is typically one longitudinal (i.e., along the direction of light propagation), plus two angular degrees of freedom (pitch and yaw about the longitudinal axis). A complex interferometer can consist of up to around seven highly sensitive components and so there can be of order 20 degrees of freedom to be measured and controlled [3, 57].

Although the light fields are linear, the coupling between the position of a mirror and the complex amplitude of the detected light field typically shows strongly nonlinear dependence on mirror positions due to the sharp resonance features exhibited by cavity systems. However, the fields do vary linearly or at least smoothly close to the desired operating point. So, while well-understood linear control theory suffices to design the control system needed to maintain the optical configuration at its operating point, bringing the system to that operating condition is often a separate and more challenging nonlinear problem. In the current version of this work we consider only the linear aspects of sensing and control.

Control systems require actuators, and those employed are typically electrical-force transducers that act on the suspended optical components, either directly or — to provide enhanced noise rejection — at upper stages of multi-stage suspensions. The transducers are normally coil-magnet actuators, with the magnets on the moving part, or, less frequently, electrostatic actuators of varying design. The actuators are frequently regarded as part of the mirror suspension subsystem and are not discussed in the current work.

1.3 Overview of the physics of the primary interferometer components

To give order to our review we consider the main physics describing the operation of the basic optical components (mirrors, beam splitters, modulators, etc.) required to construct interferometers. Although all of the relevant physics is generally well known and not new, we take it as a starting point that permits the introduction of notation and conventions. It is also true that the interferometry employed for gravitational-wave detection has a different emphasis than other interferometer applications. As a consequence, descriptions or examples of a number of crucial optical properties for gravitational wave detectors cannot be found in the literature. The purpose of this first version of the review is especially to provide a coherent theoretical framework for describing such effects. With the basics established, it can be seen that the interferometer configurations that have been employed in gravitational-wave detection may be built up and simulated in a relatively straightforward manner.

As mentioned above, we do not address the newer physics associated with operation at or beyond the standard quantum limit. The interested reader can begin to explore this topic from the following references.

  • The standard quantum limit [10, 32]

  • Squeezing [38, 53]

  • Quantum nondemolition interferometry [9, 24]

These matters are to be included in a future revision of this review.

1.4 Plane-wave analysis

The main optical systems of interferometric gravitational-wave detectors are designed such that all system parameters are well known and stable over time. The stability is achieved through a mixture of passive isolation systems and active feedback control. In particular, the light sources are some of the most stable, low-noise continuous-wave laser systems so that electromagnetic fields can be assumed to be essentially monochromatic. Additional frequency components can be modelled as small modulations (in amplitude or phase). The laser beams are well collimated, propagate along a well-defined optical axis and remain always very much smaller than the optical elements they interact with. Therefore, these beams can be described as paraxial and the well-known paraxial approximations can be applied.

It is useful to first derive a mathematical model based on monochromatic, scalar, plane waves. As it turns out, a more detailed model including the polarisation and the shape of the laser beam as well as multiple frequency components, can be derived as an extension to the plane-wave model. A plane electromagnetic wave is typically described by its electric field component: with E0 as the (constant) field amplitude in V/m, \({\vec e_p}\) the unit vector in the direction of polarisation, such as, for example, \({\vec e_y}\) for \({\mathscr I}\)-polarised light, ω the angular oscillation frequency of the wave, and \(\vec k = {\vec e_k}\omega/c\) the wave vector pointing the in the direction of propagation. The absolute phase φ only becomes meaningful when the field is superposed with other light fields.

figure 1

Figure 1

In this document we will consider waves propagating along the optical axis given by the z-axis, so that \(\vec k\vec r = kz\). For the moment we will ignore the polarisation and use scalar waves, which can be written as

$$E(z,t) = {E_0}\cos (\omega t - kz + \varphi){.}$$
(1)

Further, in this document we use complex notation, i.e.,

$$E = \Re\{E^{\prime}\} \quad {\rm{with}}\quad E^{\prime} = E_0^{\prime}\exp ({\rm{i}}(\omega t - kz)).$$
(2)

This has the advantage that the scalar amplitude and the phase φ can be given by one, now complex, amplitude E′0 = E0 exp(iφ) We will use this notation with complex numbers throughout. For clarity we will simply use the unprimed letters for the auxiliary field. In particular, we will use the letter E and also a and b to denote complex electric-field amplitudes. But remember that, for example, in E = E0 exp(−i kz) neither E nor E0 are physical quantities. Only the real part of E exists and deserves the name field amplitude.

1.5 Frequency domain analysis

In most cases we are either interested in the fields at one particular location, for example, on the surface of an optical element, or we want to know the fields at all places in the interferometer but at one particular point in time. The latter is usually true for the steady state approach: assuming that the interferometer is in a steady state, all solutions must be independent of time so that we can perform all computations at t = 0 without loss of generality. In that case, the scalar plane wave can be written as

$$E = {E_0}\exp (-{\rm{i}}\;kz).$$
(3)

The frequency domain is of special interest as numerical models of gravitational-wave detectors tend to be much faster to compute in the frequency domain than in the time domain.

2 Optical Components: Coupling of Field Amplitudes

When an electromagnetic wave interacts with an optical system, all of its parameters can be changed as a result. Typically optical components are designed such that, ideally, they only affect one of the parameters, i.e., either the amplitude or the polarisation or the shape. Therefore, it is convenient to derive separate descriptions concerning each parameter. This section introduces the coupling of the complex field amplitude at optical components. Typically, the optical components are described in the simplest possible way, as illustrated by the use of abstract schematics such as those shown in Figure 2.

Figure 2
figure 2

This set of figures introduces an abstract form of illustration, which will be used in this document. The top figure shows a typical example taken from the analysis of an optical system: an incident field Ein is refiected and transmitted by a semi-transparent mirror; there might be the possibility of second incident field Ein2. The lower left figure shows the abstract form we choose to represent the same system. The lower right figure depicts how this can be extended to include a beam splitter object, which connects two optical axes.

2.1 Mirrors and spaces: reflection, transmission and propagation

The core optical systems of current interferometric gravitational interferometers are composed of two building blocks: a) resonant optical cavities, such as Fabry-Pérot resonators, and b) beam splitters, as in a Michelson interferometer. In other words, the laser beam is either propagated through a vacuum system or interacts with a partially-reflecting optical surface.

The term optical surface generally refers to a boundary between two media with possibly different indices of refraction n, for example, the boundary between air and glass or between two types of glass. A real fused silica mirror in an interferometer features two surfaces, which interact with a reffected or transmitted laser beam. However, in some cases, one of these surfaces has been treated with an anti-reffection (AR) coating to minimise the effect on the transmitted beam.

The terms mirror and beam splitter are sometimes used to describe a (theoretical) optical surface in a model. We define real amplitude coefficients for reflection and transmission r and t, with 0 ≤ r, t ≤ 1, so that the field amplitudes can be written as The π/2 phase shift upon transmission (here given by the factor i) refers to a phase convention explained in Section 2.4.

The free propagation of a distance D through a medium with index of refraction n can be described with the following set of equations: In the following we use n = 1 for simplicity.

Note that we use above relations to demonstrate various mathematical methods for the analysis of optical systems. However, refined versions of the coupling equations for optical components, including those for spaces and mirrors, are also required, see, for example. Section 2.6.

2.2 The two-mirror resonator

The linear optical resonator, also called a cavity is formed by two partially-transparent mirrors, arranged in parallel as shown in Figure 5. This simple setup makes a very good example with which to illustrate how a mathematical model of an interferometer can be derived, using the equations introduced in Section 2.1.

The cavity is defined by a propagation length D (in vacuum), the amplitude reflectivities r1, r2 and the amplitude transmittances t1, t2. The amplitude at each point in the cavity can be computed simply as the superposition of fields. The entire set of equations can be written as

$$\begin{array}{*{20}c} {{a_1} = {\rm{i}}\;{t_1}{a_0} + {r_1}a_3^\prime \quad}\\ {a_1^\prime = \exp (- {\rm{i}}\;kD){a_1}}\\ {{a_2} = {\rm{i}}\;{t_2}a_1^\prime \quad \quad \quad}\\ {{a_3} = {r_2}a_1^\prime \quad \quad \quad \quad}\\ {a_3^\prime = \exp (- {\rm{i}}\;kD){a_3}}\\ {{a_4} = {r_1}{a_0} + {\rm{i}}\;{t_1}a_3^\prime \quad}\\\end{array}$$
(4)

The circulating field impinging on the first mirror (surface) a3 can now be computed as

$$\begin{array}{*{20}c} {a_3^\prime = \exp (- {\rm{i}}\,kD){a_3} = \exp (- {\rm{i}}\,kD){r_2}a_1^\prime = \exp (- {\rm{i}}\,2kD){r_2}{a_1}} \\ {= \exp (- {\rm{i}}\,{\rm{2}}kD)\,{r_2}({\rm{i}}\,{t_1}{a_0} + {r_1}a_3^\prime).\quad \quad \quad \quad \quad \quad \quad \,\,\,} \\ \end{array}$$
(5)

This then yields

$$a_3^{\prime} = {a_0}{{{\rm{i}}\;{r_2}{t_1}\exp (- {\rm{i\;2}}kD)} \over {1 - {r_1}{r_2}\exp (- {\rm{i\;2}}kD)}}.$$
(6)

We can directly compute the reflected field to be

$${a_4} = {a_0}\left({{r_1} - {{{r_2}t_1^2\exp (- {\rm{i\;2}}kD)} \over {1 - {r_1}{r_2}\exp (- {\rm{i\;2}}kD)}}} \right) = {a_0}\left({{{{r_1} - {r_2}(r_1^2 + t_1^2)\exp (- {\rm{i\;2}}kD)} \over {1 - {r_1}{r_2}\exp (- {\rm{i\;2}}kD)}}} \right),$$
(7)

while the transmitted field becomes

$${a_2} = {a_0}{{- {t_1}{t_2}\exp (- {\rm{i}}\;kD)} \over {1 - {r_1}{r_2}\exp (- {\rm{i\;2}}kD)}}.$$
(8)

The properties of two mirror cavities will be discussed in more detail in Section 5.1.

2.3 Coupling matrices

Computations that involve sets of linear equations as shown in Section 2.2 can often be done or written efficiently with matrices. Two methods of applying matrices to coupling field amplitudes are demonstrated below, using again the example of a two mirror cavity. First of all, we can rewrite the coupling equations in matrix form. The mirror coupling as given in Figure 3 becomes and the amplitude coupling at a ‘space’, as given in Figure 4, can be written as In these examples the matrix simply transforms the ‘known’ impinging amplitudes into the ‘unknown’ outgoing amplitudes.

figure 3

Figure 3

figure 4

Figure 4

2.3.1 Coupling matrices for numerical computations

An obvious application of the matrices introduced above would be to construct a large matrix for an extended optical system appropriate for computerisation. A very flexible method is to setup one equation for each field amplitude. The set of linear equations for a mirror would expand to

$$\left({\begin{array}{*{20}c} 1 & 0 & 0 & 0 \\ {- {\rm{i}}\;t} & 1 & 0 & {- r} \\ 0 & 0 & 1 & 0 \\ {- r} & 0 & {- {\rm{i}}\;t} & 1 \\ \end{array}} \right)\left({\begin{array}{*{20}c} {{a_1}} \\ {{a_2}} \\ {{a_3}} \\ {{a_4}} \\ \end{array}} \right) = \left({\begin{array}{*{20}c} {{a_1}} \\ 0 \\ {{a_3}} \\ 0 \\ \end{array}} \right) = {M_{{\rm{system}}}}{\vec a_{{\rm{sol}}}} = {\vec a_{{\rm{input}}}},$$
(9)

where the input vectorFootnote 1 \({{\vec a}_{{\rm{input}}}}\) has non-zero values for the impinging fields and \({{\vec a}_{{\rm{sol}}}}\) is the ‘solution’ vector, i.e., after solving the system of equations the amplitudes of the impinging as well as those of the outgoing fields are stored in that vector.

As an example we apply this method to the two mirror cavity. The system matrix for the optical setup shown in Figure 5 becomes

$$\left({\begin{array}{*{20}c} 1 & 0 & 0 & 0 & 0 & 0 & 0 \\ {- {\rm{i}}\;{t_1}} & 1 & 0 & {- {r_1}} & 0 & 0 & 0 \\ {- {r_1}} & 0 & 1 & {- {\rm{i}}\;{t_1}} & 0 & 0 & 0 \\ 0 & 0 & 0 & 1 & 0 & 0 & {- {e^{- {\rm{i}}\;kD}}} \\ 0 & {- {e^{- {\rm{i}}\;kD}}} & 0 & 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 0 & {- {\rm{i}}\;{t_1}} & 1 & 0 \\ 0 & 0 & 0 & 0 & 0 & {- {r_2}} & 1 \\ \end{array}} \right)\left({\begin{array}{*{20}c} {{a_0}} \\ {{a_1}} \\ {{a_4}} \\ {a_3^\prime} \\ {a_1^\prime} \\ {{a_2}} \\ {{a_3}} \\ \end{array}} \right) = \left({\begin{array}{*{20}c} {{a_0}} \\ 0 \\ 0 \\ 0 \\ 0 \\ 0 \\ 0 \\ \end{array}} \right)$$
(10)

This is a sparse matrix. Sparse matrices are an important subclass of linear algebra problems and many efficient numerical algorithms for solving sparse matrices are freely available (see, for example, [13]). The advantage of this method of constructing a single matrix for an entire optical system is the direct access to all field amplitudes. It also stores each coupling coefficient in one or more dedicated matrix elements, so that numerical values for each parameter can be read out or changed after the matrix has been constructed and, for example, stored in computer memory. The obvious disadvantage is that the size of the matrix quickly grows with the number of optical elements (and with the degrees of freedom of the system, see, for example, Section 7).

Figure 5
figure 5

Simplified schematic of a two mirror cavity. The two mirrors are defined by the amplitude coefficients for reflection and transmission. Further, the resulting cavity is characterised by its length D. Light field amplitudes are shown and identified by a variable name, where necessary to permit their mutual coupling to be computed.

2.3.2 Coupling matrices for a compact system descriptions

The following method is probably most useful for analytic computations, or for optimisation aspects of a numerical computation. The idea behind the scheme, which is used for computing the characteristics of dielectric coatings [28, 40] and has been demonstrated for analysing gravitational wave detectors [43], is to rearrange equations as in Figure 6 and Figure 7 such that the overall matrix describing a series of components can be obtained by multiplication of the component matrices. In order to achieve this, the coupling equations have to be re-ordered so that the input vector consists of two field amplitudes at one side of the component. For the mirror, this gives a coupling matrix of

$$\left({\begin{array}{*{20}c} {{a_1}}\\ {{a_4}}\\ \end{array}} \right) = {{\rm{i}} \over t}\left({\begin{array}{*{20}c} {- 1} & \quad r\\ {- r} & {{r^2} + {t^2}}\\ \end{array}} \right)\left({\begin{array}{*{20}c} {{a_2}}\\ {{a_3}}\\ \end{array}} \right).$$
(11)

In the special case of the lossless mirror this matrix simplifies as we have r2 + t2 = R + T =1. The space component would be described by the following matrix:

$$\left({\begin{array}{*{20}c} {{a_1}} \\ {{a_4}} \\ \end{array}} \right) = \left({\begin{array}{*{20}c} {\exp ({\rm{i}}\;kD)} & {\quad 0} \\ {\quad 0} & {\exp (- {\rm{i}}\;kD)} \\ \end{array}} \right)\left({\begin{array}{*{20}c} {{a_2}} \\ {{a_3}} \\\end{array}} \right).$$
(12)

With these matrices we can very easily compute a matrix for the cavity with two lossless mirrors as

$${M_{{\rm{cav}}}} = {M_{{\rm{mirror}}1}} \times {M_{{\rm{space}}}} \times {M_{{\rm{mirror}}2}}$$
(13)
$$= {{- 1} \over {{t_1}{t_2}}}\left({\begin{array}{*{20}c} {{e^ +} - {r_1}{r_2}{e^ -}} & {\quad - {r_2}{e^ +} + {r_1}{e^ -}}\\ {- {r_2}{e^ -} + {r_1}{e^ +}} & {{e^ -} - {r_1}{r_2}{e^ +}}\\ \end{array}} \right),$$
(14)

with e+ = exp(i kD) and e = exp(−ikD). The system of equation describing a cavity shown in Equation (4) can now be written more compactly as

$$\left({\begin{array}{*{20}c} {{a_0}}\\ {{a_4}}\\ \end{array}} \right) = {{- 1} \over {{t_1}{t_2}}}\left({\begin{array}{*{20}c} {{e^ +} - {r_1}{r_2}{e^ -}} & {\quad - {r_2}{e^ +} + {r_1}{e^ -}}\\ {- {r_2}{e^ -} + {r_1}{e^ +}} & {{e^ -} - {r_1}{r_2}{e^ +}}\\ \end{array}} \right)\left({\begin{array}{*{20}c} {{a_2}}\\ 0\\ \end{array}} \right).$$
(15)

This allows direct computation of the amplitude of the transmitted field resulting in

$${a_2} = {a_0}{{- {t_1}{t_2}\exp (- {\rm{i}}\;kD)} \over {1 - {r_1}{r_2}\exp (- {\rm{i\;2}}kD)}},$$
(16)

which is the same as Equation (8).

figure 6

Figure 6

figure 7

Figure 7

The advantage of this matrix method is that it allows compact storage of any series of mirrors and propagations, and potentially other optical elements, in a single 2 × 2 matrix. The disadvantage inherent in this scheme is the lack of information about the field amplitudes inside the group of optical elements.

2.4 Phase relation at a mirror or beam splitter

The magnitude and phase of reflection at a single optical surface can be derived from Maxwell’s equations and the electromagnetic boundary conditions at the surface, and in particular the condition that the field amplitudes tangential to the optical surface must be continuous. The results are called Fresnel’s equations [33]. Thus, for a field impinging on an optical surface under normal incidence we can give the reflection coefficient as

$$r = {{{n_1} - {n_2}} \over {{n_1} + {n_2}}},$$
(17)

with n1 and n2 the indices of refraction of the first and second medium, respectively. The transmission coefficient for a lossless surface can be computed as t2 = 1 − r2. We note that the phase change upon reflection is either 0 or 180°, depending on whether the second medium is optically thinner or thicker than the first. It is not shown here but Fresnel’s equations can also be used to show that the phase change for the transmitted light at a lossless surface is zero. This contrasts with the definitions given in Section 2.1 (see Figure (3)ff.), where the phase shift upon any reflection is defined as zero and the transmitted light experiences a phase shift of π/2. The following section explains the motivation for the latter definition having been adopted as the common notation for the analysis of modern optical systems.

2.4.1 Composite optical surfaces

Modern mirrors and beam splitters that make use of dielectric coatings are complex optical systems, see Figure 8 whose reflectivity and transmission depend on the multiple interference inside the coating layers and thus on microscopic parameters. The phase change upon transmission or reflection depends on the details of the applied coating and is typically not known. In any case, the knowledge of an absolute value of a phase change is typically not of interest in laser interferometers because the absolute positions of the optical components are not known to sub-wavelength precision. Instead the relative phase between the incoming and outgoing beams is of importance. In the following we demonstrate how constraints on these relative phases, i.e., the phase relation between the beams, can be derived from the fundamental principle of power conservation. To do this we consider a Michelson interferometer, as shown in Figure 9, with perfectly-reflecting mirrors. The beam splitter of the Michelson interferometer is the object under test. We assume that the magnitude of the reflection r and transmission t are known. The phase changes upon transmission and reflection are unknown. Due to symmetry we can say that the phase change upon transmission should be the same in both directions. However, the phase change on reflection might be different for either direction, thus, we write for the reflection at the front and for the reflection at the back of the beam splitter.

Figure 8
figure 8

This sketch shows a mirror or beam splitter component with dielectric coatings and the photograph shows some typical commercially available examples [45]. Most mirrors and beam splitters used in optical experiments are of this type: a substrate made from glass, quartz or fused silica is coated on both sides. The reflective coating defines the overall reflectivity of the component (anything between R ≈ 1 and R ≈ 0, while the anti-reflective coating is used to reduce the reflection at the second optical surface as much as possible so that this surface does not influence the light. Please note that the drawing is not to scale, the coatings are typically only a few microns thick on a several millimetre to centimetre thick substrate.

Figure 9
figure 9

The relation between the phase of the light field amplitudes at a beam splitter can be computed assuming a Michelson interferometer, with arbitrary arm length but perfectly-reflecting mirrors. The incoming field E0 is split into two fields E1 and E2 which are reflected atthe end mirrors and return to the beam splitter, as E3 and E4, to be recombined into two outgoing fields. These outgoing fields E5 and E6 are depicted by two arrows to highlight that these are the sum of the transmitted and reflected components of the returning fields. We can derive constraints for the phase of E1 and E2 with respect to the input field E0 from the conservation of energy: |E0|2 = |E5|2 + |E6|2.

Then the electric fields can be computed as

$${E_1} = r\;{E_0}\;{e^{{\rm{i}}{\varphi _{r1}}}};\quad {E_2} = t\;{E_0}\;{e^{{\rm{i}}{\varphi _t}}}.$$
(18)

We do not know the length of the interferometer arms. Thus, we introduce two further unknown phases: Φ1 for the total phase accumulated by the field in the vertical arm and Φ2 for the total phase accumulated in the horizontal arm. The fields impinging on the beam splitter compute as

$${E_3} = r\;{E_0}\;{e^{{\rm{i}}({\varphi _{r1}} + {\Phi _1})}};\quad {E_4} = t\;{E_0}\;{e^{{\rm{i}}({\varphi _t} + {\Phi _2})}}.$$
(19)

The outgoing fields are computed as the sums of the reflected and transmitted components:

$$\begin{array}{*{20}c} {{E_5} = \ {E_0}(R\ {e^{{\rm{i}}(2{\varphi _{r1}} + {\Phi _1})}} + T\ {e^{{\rm{i}}(2{\varphi _t} + {\Phi _2})}}).}\quad\\ {{E_6} = \ {E_0}\;rt(\ {e^{{\rm{i}}({\varphi _t} + {\varphi _{r1}} + {\Phi _1})}} + {e^{{\rm{i}}({\varphi _t} + {\varphi _{r2}} + {\Phi _2})}}),}\\ \end{array}$$
(20)

with R = r2 and T = t2.

It will be convenient to separate the phase factors into common and differential ones. We can write

$${E_5} = {E_0}\;{e^{{\rm{i}}{\alpha _ +}}}(R{e^{{\rm{i}}{\alpha _ -}}} + T{e^{{\rm{- i}}{\alpha _ -}}}),$$
(21)

with

$${\alpha _ +} = {\varphi _{r1}} + {\varphi _t} + {1 \over 2}({\Phi _1} + {\Phi _2});\quad {\alpha _ -} = {\varphi _{r1}} - {\varphi _t} + {1 \over 2}({\Phi _1} - {\Phi _2}),$$
(22)

and similarly

$${E_6} = {E_0}\;rt\;{e^{{\rm{i}}{\beta _{+}}}}2\cos ({\beta _ -}),$$
(23)

with

$${\beta _ +} = {\varphi _t} + {1 \over 2}({\varphi _{r1}} + {\varphi _{r2}} + {\Phi _1} + {\Phi _2});\quad {\beta _ -} = {1 \over 2}({\varphi _{r1}} - {\varphi _{r2}} + {\Phi _1} - {\Phi _2}){.}$$
(24)

for simplicity we now limit the discussion to a 50:50 beam splitter with \(r = t = 1/\sqrt 2\), for which we can simplify the field expressions even further:

$${E_5} = {E_0}\; {e^{{\rm{i}}{\alpha _ +}}}\cos ({\alpha _ -});\quad {E_6} = {E_0} \;{e^{{\rm{i}}{\beta _ +}}}\cos ({\beta _ -}){.}$$
(25)

Conservation of energy requires that |E0|2 = |E5|2 + |E6|2, which in turn requires

$${\cos ^2}({\alpha _ -}) + {\cos ^2}({\beta _ -}) = 1,$$
(26)

which is only true if

$${\alpha _ -} - {\beta _ -} = (2N + 1){\pi \over 2},$$
(27)

with N as in integer (positive, negative or zero). This gives the following constraint on the phase factors

$${1 \over 2}({\varphi _{r1}} + {\varphi _{r2}}) - {\varphi _t} = (2N + 1){\pi \over 2}.$$
(28)

One can show that exactly the same condition results in the case of arbitrary (lossless) reflectivity of the beam splitter [48].

We can test whether two known examples fulfill this condition. If the beam-splitting surface is the front of a glass plate we know that φt = 0, φr1 = π φr2 = 0, which conforms with Equation (28). A second example is the two-mirror resonator, see Section 2.2. If we consider the cavity as an optical ‘black box’, it also splits any incoming beam into a reflected and transmitted component, like a mirror or beam splitter. Further we know that a symmetric resonator must give the same results for fields injected from the left or from the right. Thus, the phase factors upon reflection must be equal φr = φr1 = φr2. The reflection and transmission coefficients are given by Equations (7) and (8) as

$${r_{{\rm{cav}}}} = \left({{r_1} - {{{r_2}t_1^2\exp (- {\rm{i}}\;{\rm{2}}kD)} \over {1 - {r_1}{r_2}\exp (- {\rm{i}}\;{\rm{2}}kD)}}} \right),$$
(29)

and

$${t_{{\rm{cav}}}} = {{- {t_1}{t_2}\exp (- {\rm{i}}\;kD)} \over {1 - {r_1}{r_2}\exp (- {\rm{i\;2}}kD)}}.$$
(30)

We demonstrate a simple case by putting the cavity on resonance (kD = ). This yields

$${r_{{\rm{cav}}}} = \left({{r_1} - {{{r_2}t_1^2} \over {1 - {r_1}{r_2}}}} \right);\quad {t_{{\rm{cav}}}} = {{{\rm{i}}\;{t_1}{t_2}} \over {1 - {r_1}{r_2}}},$$
(31)

with rcav being purely real and tcav imaginary and thus φt = π/2 and φr = 0 which also agrees with Equation (28).

In most cases we neither know nor care about the exact phase factors. Instead we can pick any set which fulfills Equation (28). For this document we have chosen to use phase factors equal to those of the cavity, i.e., φt = π/2 and φr = 0, which is why we write the reflection and transmission at a mirror or beam splitter as

$${E_{{\rm{refl}}}} = r\;{E_{0\quad}}{\rm{and}}\quad {E_{{\rm{trans}}}} = {\rm{i}}\;t\;{E_0}.$$
(32)

In this definition r and t are positive real numbers satisfying r2 +t2 = 1 for the lossless case.

Please note that we only have the freedom to chose convenient phase factors when we do not know or do not care about the details of the optical system, which performs the beam splitting. If instead the details are important, for example when computing the properties of a thin coating layer, such as anti-reflex coatings, the proper phase factors for the respective interfaces must be computed and used.

2.5 Lengths and tunings: numerical accuracy of distances

The resonance condition inside an optical cavity and the operating point of an interferometer depends on the optical path lengths modulo the laser wavelength, i.e., for light from an Nd:YAG laser length differences of less than 1 µm are of interest, not the full magnitude of the distances between optics. On the other hand, several parameters describing the general properties of an optical system, like the finesse or free spectral range of a cavity (see Section 5.1) depend on the macroscopic distance and do not change significantly when the distance is changed on the order of a wavelength. This illustrates that the distance between optical components might not be the best parameter to use for the analysis of optical systems. Furthermore, it turns out that in numerical algorithms the distance may suffer from rounding errors. Let us use the Virgo [56] arm cavities as an example to illustrate this. The cavity length is approximately 3 km, the wavelength is on the order of 1 µm, the mirror positions are actively controlled with a precision of 1 µm and the detector sensitivity can be as good as 10−18 m, measured on ∼ 10 ms timescales (i.e., many samples of the data acquisition rate). The floating point accuracy of common, fast numerical algorithms is typically not better than 10−15. If we were to store the distance between the cavity mirrors as such a floating point number, the accuracy would be limited to 3 pm, which does not even cover the accuracy of the control systems, let alone the sensitivity.

Figure 10
figure 10

Illustration of an arm cavity of the Virgo gravitational-wave detector [56]: the macroscopic length L of the cavity is approximately 3 km, while the wavelength of the Nd:YAG laser is λ ∼ 1 µm. The resonance condition is only affected by the microscopic position of the wave nodes with respect to the mirror surfaces and not by the macroscopic length, i.e., displacement of one mirror by Δx = λ/2 re-creates exactly the same condition. However, other parameters of the cavity, such as the finesse, only depend on the macroscopic length L and not on the microscopic tuning.

A simple and elegant solution to this problem is to split a distance D between two optical components into two parameters [29]: one is the macroscopic ‘length’ L, defined as the multiple of a constant wavelength λ0 yielding the smallest difference to D. The second parameter is the microscopic tuning T that is defined as the remaining difference between L and D, i.e., D = L + T. Typically, λ0 can be understood as the wavelength of the laser in vacuum, however, if the laser frequency changes during the experiment or multiple light fields with different frequencies are used simultaneously, a default constant wavelength must be chosen arbitrarily. Please note that usually the term λ in any equation refers to the actual wavelength at the respective location as λ = λ0/n with n the index of refraction at the local medium.

We have seen in Section 2.1 that distances appear in the expressions for electromagnetic waves in connection with the wave number, for example,

$${E_{\rm{2}}} = {E_{1\quad}}\exp (- {\rm{i}}\;kz)\;.$$
(33)

Thus, the difference in phase between the field at z = z1 and z = z1 + D is given as

$$\varphi = - kD.$$
(34)

We recall that k = 2π/λ = ω/c. We can define ω0 = 2π c/λ0 and k0 = ω0/c. For any given wavelength λ we can write the corresponding frequency as a sum of the default frequency and a difference frequency ω = ω0 + Δω. Using these definitions, we can rewrite Equation (34) with length and tuning as

$$- \varphi = kD = {{{\omega _0}L} \over c} + {{\Delta \omega L} \over c} + {{{\omega _0}T} \over c} + {{\Delta \omega T} \over c}.$$
(35)

The first term of the sum is always a multiple of 2π, which is equivalent to zero. The last term of the sum is the smallest, approximately of the order Δω · 10−14. For typical values of L ≈ 1 m, T < 1 µm and Δω < 2π · 100 MHz we find that

$${{{\omega _0}L} \over c} = 0,\quad {{\Delta \omega L} \over c}\underset{\approx}{<} 2,\quad {{{\omega _0}T} \over c} \underset{\approx}{<} 6,\quad {{\Delta \omega T} \over c} \underset{\approx}{<} 2\;\;\;{10^{- 6}},$$
(36)

which shows that the last term can often be ignored.

We can also write the tuning directly as a phase. We define as the dimensionless tuning

$$\phi = {\omega _0}T/c.$$
(37)

This yields

$$\exp \left({{\rm{i}}{\omega \over c}T} \right) = \exp \left({{\rm{i}}{{{\omega _0}} \over c}T{\omega \over {{\omega _0}}}} \right) = \exp \left({{\rm{i}}{\omega \over {{\omega _0}}}\phi} \right).$$
(38)

The tuning ϕ is given in radian with 2π referring to a microscopic distance of one wavelengthFootnote 2 λ0.

Finally, we can write the following expression for the phase difference between the light field taken at the end points of a distance D:

$$\varphi = - kD = - \left({{{\Delta \omega L} \over c} + \phi {\omega \over {{\omega _0}}}} \right),$$
(39)

or if we neglect the last term from Equation (36) we can approximate (ω/ω0 ≈ 1) to obtain

$$\varphi \approx - \left({{{\Delta \omega L} \over c} + \phi} \right).$$
(40)

This convention provides two parameters L and ϕ that can describe distances with a markedly improved numerical accuracy. In addition, this definition often allows simplification of the algebraic notation of interferometer signals. By convention we associate a length L with the propagation through free space, whereas the tuning will be treated as a parameter of the optical components. Effectively the tuning then represents a microscopic displacement of the respective component. If, for example, a cavity is to be resonant to the laser light, the tunings of the mirrors have to be the same whereas the length of the space in between can be arbitrary.

2.6 Revised coupling matrices for space and mirrors

Using the definitions for length and tunings we can rewrite the coupling equations for mirrors and spaces introduced in Section 2.1 as follows. The mirror coupling becomes (compare this to Figure 6), and the amplitude coupling for a ‘space’, formally written as in Figure 7, is now written as

figure 11

Figure 11

figure 12

Figure 12

2.7 Finesse examples

2.7.1 Mirror reflectivity and transmittance

We use Finesse to plot the amplitudes of the light fields transmitted and reflected by a mirror (given by a single surface). Initially, the mirror has a power reflectance and transmittance of R = T = 0.5 and is, thus, lossless. For the plot in Figure 13 we tune the transmittance from 0.5 to 0. Since we do not explicitly change the reflectivity, R remains at 0.5 and the mirror loss increases instead, which is shown by the trace labelled ‘total’ corresponding to the sum of the reflected and transmitted light power. The plot also shows the phase convention of a 90° phase shift for the transmitted light.

Figure 13
figure 13

Finesse example: Mirror reflectivity and transmittance.

Finesse input file for ‘Mirror reflectivity and transmittance’

figure Graphic1

2.7.2 Length and tunings

This Finesse file demonstrates the conventions for lengths and microscopic positions introduced in Section 2.5. The top trace in Figure 14 depicts the phase change of a beam reflected by a beam splitter as the function of the beam splitter tuning. By changing the tuning from 0 to 180° the beam splitter is moved forward and shortens the path length by one wavelength, which by convention increases the light phase by 360°. On the other hand, if a length of a space is changed, the phase of the transmitted light is unchanged (for the default wavelength Δk = 0), as shown the in the lower trace.

Figure 14
figure 14

Finesse example: Length and tunings.

Finesse input file for ‘Length and tunings’

figure Graphic2

3 Light with Multiple Frequency Components

So far we have considered the electromagnetic field to be monochromatic. This has allowed us to compute light-field amplitudes in a quasi-static optical setup. In this section, we introduce the frequency of the light as a new degree of freedom. In fact, we consider a field consisting of a finite and discrete number of frequency components. We write this as

$$E(t,z) = \sum\limits_j {{a_j}\exp ({\rm{i}}({\omega _j}t - {k_j}z)),}$$
(41)

with complex amplitude factors aj, ωj as the angular frequency of the light field and kj = ωj/c. In many cases the analysis compares different fields at one specific location only, in which case we can set z = 0 and write

$$E(t) = \sum\limits_j {{a_j}\exp ({\rm{i}}{\omega _j}t){.}}$$
(42)

In the following sections the concept of light modulation is introduced. As this inherently involves light fields with multiple frequency components, it makes use of this type of field description. Again we start with the two-mirror cavity to illustrate how the concept of modulation can be used to model the effect of mirror motion.

3.1 Modulation of light fields

Laser interferometers typically use three different types of light fields: the laser with a frequency of, for example, f ≈ 2.8 · 1014 Hz, radio frequency (RF) sidebands used for interferometer control with frequencies (offset to the laser frequency) of f ≈ 1 • 10e to 150 • 10e Hz, and the signal sidebands at frequencies of 1 to 10,000 HzFootnote 3. As these modulations usually have as their origin a change in optical path length, they are often phase modulations of the laser frequency, the RF sidebands are utilised for optical readout purposes, while the signal sidebands carry the signal to be measured (the gravitational-wave signal plus noise created in the interferometer).

Figure 15 shows a time domain representation of an electromagnetic wave of frequency ω0, whose amplitude or phase is modulated at a frequency One can easily see some characteristics of these two types of modulation, for example, that amplitude modulation leaves the zero crossing of the wave unchanged whereas with phase modulation the maximum and minimum amplitude of the wave remains the same. In the frequency domain in which a modulated field is expanded into several unmodulated field components, the interpretation of modulation becomes even easier: any sinusoidal modulation of amplitude or phase generates new field components, which are shifted in frequency with respect to the initial field. Basically, light power is shifted from one frequency component, the carrier, to several others, the sidebands. The relative amplitudes and phases of these sidebands differ for different types of modulation and different modulation strengths. This section demonstrates how to compute the sideband components for amplitude, phase and frequency modulation.

Figure 15
figure 15

Example traces for phase and amplitude modulation: the upper plot a) shows a phase-modulated sine wave and the lower plot b) depicts an amplitude-modulated sine wave. Phase modulation is characterised by the fact that it mostly affects the zero crossings of the sine wave. Amplitude modulation affects mostly the maximum amplitude of the wave. The equations show the modulation terms in red with m the modulation index and Ω the modulation frequency.

3.2 Phase modulation

Phase modulation can create a large number of sidebands. The number of sidebands with noticeable power depends on the modulation strength (or depth) given by the modulation index m. Assuming an input field

$${E_{{\rm{in}}}} = {E_0}\exp ({\rm{i}}{\omega _0}t),$$
(43)

a sinusoidal phase modulation of the field can be described as

$$E = {E_0}\exp \left({{\rm{i}}({\omega _0}t + m\cos (\Omega t))} \right).$$
(44)

This equation can be expanded using the identity [27]

$$\exp ({\rm{i}}z\cos \varphi) = \sum\limits_{k = - \infty}^\infty {{{\rm{i}}^k}{J_k}} (z)\exp ({\rm{i}}\;k\varphi),$$
(45)

with Bessel functions of the first kind Jk(m). We can write

$$E = {E_0}\exp ({\rm{i}}{\omega _0}t) \sum\limits_{k = - \infty}^\infty {{{\rm{i}}^k}{J_k}} (m)\exp ({\rm{i}}\;k\Omega t){.}$$
(46)

The field for k = 0, oscillating with the frequency of the input field ω0, represents the carrier. The sidebands can be divided into upper (k > 0) and lower (k < 0) sidebands. These sidebands are light fields that have been shifted in frequency by k Ω. The upper and lower sidebands with the same absolute value of k are called a pair of sidebands of order k. Equation (46) shows that the carrier is surrounded by an infinite number of sidebands. However, for small modulation indices (m < 1) the Bessel functions rapidly decrease with increasing k (the lowest orders of the Bessel functions are shown in Figure 16). For small modulation indices we can use the approximation [2]

$${J_k}(m) = {\left({{m \over 2}} \right)^k}\sum\limits_{n =0}^\infty {{{{{\left({- {{{m^2}} \over 4}} \right)}^n}} \over {n!(k + n)!}} = {1 \over {k!}}} {\left({{m \over 2}} \right)^k} + O({m^{k + 2}}).$$
(47)

In which case, only a few sidebands have to be taken into account. For m ≪ 1 we can write

$$\begin{array}{*{20}c} {E = {E_0}\exp ({\rm{i}}{\omega _0}t)}\qquad \qquad \qquad \qquad \quad\quad\quad\quad\quad\quad\quad \\ {\times \left({{J_0}(m) - {\rm{i}}\;{J_{- 1}}(m)\;\exp (- {\rm{i}}\Omega t) + {\rm{i}}\;{J_1}(m)\exp ({\rm{i}}\Omega t)} \right),}\\ \end{array}$$
(48)

and with

$${J_{- k}}(m) = {(- 1)^k}{J_k}(m),$$
(49)

we obtain

$$E = {E_0}\exp ({\rm{i}}{\omega _0}t)\left({1 + {\rm{i}}{m \over 2}\left({\exp (- {\rm{i}}\Omega t) + \exp ({\rm{i}}\Omega t)} \right)} \right),$$
(50)

as the first-order approximation in m. In the above equation the carrier field remains unchanged by the modulation, therefore this approximation is not the most intuitive. It is clearer if the approximation up to the second order in is given:

$$E = {E_0}\exp ({\rm{i}}{\omega _0}t)\left({1 - {{{m^2}} \over 4} + {\rm{i}}{m \over 2}\left({\exp (- {\rm{i}}\Omega t) + \exp ({\rm{i}}\Omega t)} \right)} \right),$$
(51)

which shows that power is transferred from the carrier to the sideband fields.

Figure 16
figure 16

Some of the lowest-order Bessel functions Jk(x) of the first kind. For small x the expansion shows a simple dependency and higher-order functions can often be neglected.

Higher-order expansions in m can be performed simply by specifying the highest order of Bessel function, which is to be used in the sum in Equation (46), i.e.,

$$E = {E_0}\exp ({\rm{i}}{\omega _0}t)\sum\limits_{k = - order}^{order} {{i^k}\;{J_k}(m)\exp ({\rm{i}}\;k\Omega t){.}}$$
(52)

3.3 Frequency modulation

For small modulation, indices, phase modulation and frequency modulation can be understood as different descriptions of the same effect [29]. Following the same spirit as above we would assume a modulated frequency to be given by

$$\omega = {\omega _0} + {m^{\prime}}\cos (\Omega t),$$
(53)

and then we might be tempted to write

$$E = {E_0}\;\exp \left({{\rm{i(}}{\omega _0} + {m^{\prime}}\cos (\Omega t))t} \right),$$
(54)

which would be wrong. The frequency of a wave is actually defined as ω/(2π) = f = /dt. Thus, to obtain the frequency given in Equation (53), we need to have a phase of

$${\omega _0}t + {{m^{\prime}} \over \Omega}\sin (\Omega t){.}$$
(55)

for consistency with the notation for phase modulation, we define the modulation index to be

$$m = {{m^{\prime}} \over \Omega} = {{\Delta \omega} \over \Omega},$$
(56)

with Δω as the frequency swing — how far the frequency is shifted by the modulation — and Ω the modulation frequency — how fast the frequency is shifted. Thus, a sinusoidal frequency modulation can be written as

$$E = {E_0}\;\exp ({\rm{i}}\varphi) = {E_0}\;\exp \left({{\rm{i}}\left({{\omega _0}t + {{\Delta \omega} \over \Omega}\cos (\Omega t)} \right)} \right),$$
(57)

which is exactly the same expression as Equation (44) for phase modulation. The practical difference is the typical size of the modulation index, with phase modulation having a modulation index of m < 10, while for frequency modulation, typical numbers might be m > 104. Thus, in the case of frequency modulation, the approximations for small m are not valid. The series expansion using Bessel functions, as in Equation (46), can still be performed, however, very many terms of the resulting sum need to be taken into account.

3.4 Amplitude modulation

In contrast to phase modulation, (sinusoidal) amplitude modulation always generates exactly two sidebands. Furthermore, a natural maximum modulation index exists: the modulation index is defined to be one (m = 1) when the amplitude is modulated between zero and the amplitude of the unmodulated field.

If the amplitude modulation is performed by an active element, for example by modulating the current of a laser diode, the following equation can be used to describe the output field:

$$\begin{array}{*{20}c} {E = {E_0}\exp ({\rm{i}}{\omega _0}t)\left({1 + m\cos (\Omega t)} \right)}\qquad\qquad\qquad\qquad\\ {= {E_0}\exp ({\rm{i}}{\omega _0}t)\left({1 + {m \over 2}\exp ({\rm{i}}\Omega t) + {m \over 2}\exp (- {\rm{i}}\Omega t)} \right)}\\ \end{array}.$$
(58)

However, passive amplitude modulators (like acousto-optic modulators or electro-optic modulators with polarisers) can only reduce the amplitude. In these cases, the following equation is more useful:

$$\begin{array}{*{20}c} {E = {E_0}\exp ({\rm{i}}{\omega _0}t)\left({1 - {m \over 2}\left({1 - \cos (\Omega t)} \right)} \right)}\qquad\qquad\qquad\\ {\quad = {E_0}\exp ({\rm{i}}{\omega _0}t)\left({1 - {m \over 2} + {m \over 4}\exp ({\rm{i}}\Omega t) + {m \over 4}\exp (- {\rm{i}}\Omega t)} \right).}\\ \end{array}$$
(59)

3.5 Sidebands as phasors in a rotating frame

A common method of visualising the behaviour of sideband fields in interferometers is to use phase diagrams in which each field amplitude is represented by an arrow in the complex plane.

We can think of the electric field amplitude E0 exp(i ω0t) as a vector in the complex plane, rotating around the origin with angular velocity ω0. To illustrate or to help visualise the addition of several light fields it can be useful to look at this problem using a rotating reference frame, defined as follows. A complex number shall be defined as z = x + iy so that the real part is plotted along the x-xis, while the y-axis is used for the imaginary part. We want to construct a new coordinate system (x′, y′) in which the field vector is at a constant position. This can be achieved by defining

$$\begin{array}{*{20}c} {x = x^{\prime}\;\cos {\omega _0}t - y^{\prime}\;\sin {\omega _0}t}\\ {y = x^{\prime}\;\sin {\omega _0}t + y^{\prime}\;\cos {\omega _0}t,}\\ \end{array}$$
(60)

or

$$\begin{array}{*{20}c} {x^{\prime} = x \cos (- {\omega _0}t) - y \sin (- {\omega _0}t)}\\ {y^{\prime} = x \sin (- {\omega _0}t) + y \cos (- {\omega _0}t){.}}\\ \end{array}$$
(61)

Figure 17 illustrates how the transition into the rotating frame makes the field vector to appear stationary. The angle of the field vector in a rotating frame depicts the phase offset of the field. Therefore these vectors are also called phasors and the illustrations using phasors are called phasor diagrams. Two more complex examples of how phasor diagrams can be employed is shown in Figure 18 [11].

Figure 17
figure 17

Electric field vector E0 exp(iω0t) depicted in the complex plane and in a rotating frame (x′, y′) rotating at ω0 so that the field vector appears stationary.

Figure 18
figure 18

Amplitude and phase modulation in the ‘phasor’ picture. The upper plots a) illustrate how a phasor diagram can be used to describe phase modulation, while the lower plots b) do the same for amplitude modulation. In both cases the left hand plot shows the carrier in blue and the modulation sidebands in green as snapshots at certain time intervals. One can see clearly that the upper sideband (ω0 + Ω) rotates faster than the carrier, while the lower sideband rotates slower. The right plot in both cases shows how the total field vector at any given time can be constructed by adding the three field vectors of the carrier and sidebands. [Drawing courtesy of Simon Chelkowski]

Phasor diagrams can be especially useful to see how frequency coupling of light field amplitudes can change the type of modulation, for example, to turn phase modulation into amplitude modulation. An extensive introduction to this type of phasor diagram can be found in [39].

3.6 Phase modulation through a moving mirror

Several optical components can modulate transmitted or reflected light fields. In this section we discuss in detail the example of phase modulation by a moving mirror. Mirror motion does not change the transmitted light; however, the phase of the reflected light will be changed as shown in Equation (11).

We assume sinusoidal change of the mirror’s tuning as shown in Figure 19. The position modulation is given as xm = cos(ωst + φs), and thus the reflected field at the mirror becomes (assuming a4 = 0)

$${a_3} = r{a_1}\exp (- {\rm{i\;2}}{\phi _0})\exp ({\rm{i}}2k{x_{\rm{m}}}) \approx r{a_1}\exp (- {\rm{i\;2}}{\phi _0})\exp \left({- {\rm{i\;2}}{k_0}{a_{\rm{s}}}\cos ({\omega _{\rm{s}}}t + {\varphi _{\rm{s}}})} \right),$$
(62)

setting m = 2k0as. This can be expressed as

$$\begin{array}{*{20}c} {{a_3} = r{a_1}\exp (- {\rm{i\;2}}{\phi _0})\left({1 + {\rm{i}}{m \over 2}\exp \left({- {\rm{i}}({\omega _{\rm{s}}}t + {\varphi _{\rm{s}}})} \right) + {\rm{i}}{m \over 2}\exp \left({{\rm{i}}({\omega _{\rm{s}}}t + {\varphi _{\rm{s}}})} \right)} \right)}\\ {= r{a_1}\exp (- {\rm{i\;2}}{\phi _0})\left({1 + {m \over 2}\exp \left({- {\rm{i}}({\omega _{\rm{s}}}t + {\varphi _{\rm{s}}} - \pi/2)} \right)} \right.}\qquad\qquad\qquad\\ {\left. {+ {m \over 2}\exp \left({{\rm{i}}({\omega _{\rm{s}}}t + {\varphi _{\rm{s}}} + \pi/2)} \right)} \right).}\qquad\qquad\qquad\qquad\qquad\qquad\\ \end{array}$$
(63)
Figure 19
figure 19

A sinusoidal signal with amplitude as frequency ωs and phase offset φs is applied to a mirror position, or to be precise, to the mirror tuning. The equation given for the tuning ϕ assumes that ωs/ω0 ≪ 1, see Section 2.5.

3.7 Coupling matrices for beams with multiple frequency components

The coupling between electromagnetic fields at optical components introduced in Section 2 referred only to the amplitude and phase of a simplified monochromatic field, ignoring all the other parameters of the electric field of the beam given in Equation (1). However, this mathematical concept can be extended to include other parameters provided that we can find a way to describe the total electric field as a sum of components, each of which is characterised by a discrete value of the related parameters. In the case of the frequency of the light field, this means we have to describe the field as a sum of monochromatic components. In the previous sections we have shown how this could be done in the special case of an initial monochromatic field that is subject to modulation: if the modulation index is small enough we can limit the amount of frequency components that we need to consider. In many cases it is actually sufficient to describe a modulation only by the interaction of the carrier at φ0 (the unmodulated field) and two sidebands with a frequency offset of °φm to the carrier. A beam given by the sum of three such components can be described by a complex vector:

$$\vec a = \left({\begin{array}{*{20}c} {\quad a({\omega _0})}\\ {a({\omega _0} - {\omega _m})}\\ {a({\omega _0} + {\omega _m})}\\ \end{array}} \right) = \left({\begin{array}{*{20}c} {{a_{\omega 0}}}\\ {{a_{\omega 1}}}\\ {{a_{\omega 2}}}\\ \end{array}} \right)$$
(64)

with φ0 = φ0, φ0φm = φ1 and φ0 + φm = φ2. In the case of a phase modulator that applies a modulation of small modulation index m to an incoming light field \({{\vec a}_1}\), we can describe the coupling of the frequency component as follows:

$$\begin{array}{*{20}c} {{a_{2,\omega 0}} = {J_0}(m){a_{1,\omega 0}} + {J_1}(m){a_{1,\omega 1}} + {J_{- 1}}(m){a_{1,\omega 2}}}\;\;\\ {{a_{2,\omega 1}} = {J_0}(m){a_{1,\omega 1}} + {J_{- 1}}(m){a_{1,\omega 0}}}\qquad\qquad\qquad\\ {{a_{2,\omega 2}} = {J_0}(m){a_{1,\omega 2}} + {J_1}(m){a_{1,\omega 0}},}\qquad\qquad\qquad\\ \end{array}$$
(65)

which can be written in matrix form:

$${\vec a_2} = \left({\begin{array}{*{20}c} {{J_0}(m)} & {{J_1}(m)} & {{J_{- 1}}(m)}\\ {{J_{- 1}}(m)} & {{J_0}(m)} & 0\\ {{J_1}(m)} & 0 & {{J_0}(m)}\\ \end{array}} \right){\vec a_1}.$$
(66)

And similarly, we can write the complete coupling matrix for the modulator component, for example, as

$$\left({\begin{array}{*{20}c} {{a_{2,w0}}} \\ {{a_{2,w1}}} \\ {{a_{2,w2}}} \\ {{a_{4,w0}}} \\ {{a_{4,w1}}} \\ {{a_{4,w2}}} \\ \end{array}} \right)\left({\begin{array}{*{20}c} {{J_0}(m)} & {{J_1}(m)} & {{J_{- 1}}(m)} & 0 & 0 & 0 \\ {{J_{- 1}}(m)} & {{J_0}(m)} & 0 & 0 & 0 & 0 \\ {{J_1}(m)} & 0 & {{J_0}(m)} & 0 & 0 & 0 \\ 0 & 0 & 0 & {{J_0}(m)} & {{J_1}(m)} & {{J_{- 1}}(m)} \\ 0 & 0 & 0 & {{J_{- 1}}(m)} & {{J_0}(m)} & 0 \\ 0 & 0 & 0 & {{J_1}(m)} & 0 & {{J_0}(m)} \\ \end{array}} \right)\left({\begin{array}{*{20}c} {{a_{1,w0}}} \\ {{a_{1,w1}}} \\ {{a_{1,w2}}} \\ {{a_{3,w0}}} \\ {{a_{3,w1}}} \\ {{a_{3,w2}}} \\ \end{array}} \right)$$
(67)

3.8 Finesse examples

3.8.1 Modulation index

This file demonstrates the use of a modulator. Phase modulation (with up to five higher harmonics is applied to a laser beam and amplitude detectors are used to measure the field at the first three harmonics. Compare this to Figure 16 as well.

Figure 20
figure 20

Finesse example: Modulation index.4

Finesse input file for ‘Modulation index’

figure Graphic3

3.8.2 Mirror modulation

Finesse offers two different types of modulators: the ‘modulator’ component shown in the example above, and the ‘fsig’ command, which can be used to apply a signal modulation to existing optical components. The main difference is that ‘fsig’ is meant to be used for transfer function computations. Consequently Finesse discards all nonlinear terms, which means that the sideband amplitude is proportional to the signal amplitude and harmonics are not created.

Figure 21
figure 21

Finesse example: Mirror modulation.

Finesse input file for ‘Mirror modulation’

figure Graphic4

4 Optical Readout

In previous sections we have dealt with the amplitude of light fields directly and also used the amplitude detector in the Finesse examples. This is the advantage of a mathematical analysis versus experimental tests, in which only light intensity or light power can be measured directly. This section gives the mathematical details for modelling photo detectors.

The intensity of a field impinging on a photo detector is given as the magnitude of the Poynting vector, with the Poynting vector given as [58]

$$\vec S = \vec E \times \vec H = {1 \over {{\mu _0}}}\vec E \times \vec B.$$
(68)

Inserting the electric and magnetic components of a plane wave, we obtain

$$\vert \vec S\vert = {1 \over {{\mu _0}c}}{E^2} = c{\epsilon _0}E_0^2{\cos ^2}(\omega t) = {{c{\epsilon _0}} \over 2}E_0^2(1 + \cos (2\omega t)),$$
(69)

with ϵ0 the electric permeability of vacuum and the speed of light.

The response of a photo detector is given by the total flux of effective radiationFootnote 4 during the response time of the detector. For example, in a photo diode a photon will release a charge in the n-p junction. The response time is given by the time it takes for the charge to travel through the detector (and further time may be taken up in the electronic processing of the signal). The size of the photodiode and the applied bias voltage determine the travel time of the charges with typical values of approximately 10 ns. Thus, frequency components faster than perhaps 100 MHz are not resolved by a standard photodiode. For example, a laser beam with a wavelength of = 1064 nm has a frequency of f = c/λ ≈ 282 1012 Hz = 282 THz. Thus, the 2ω component is much too fast for the photo detector; instead, it returns the average power

$$\vert \overline {\vec S} \vert = {{c{\epsilon _0}} \over 2}E_0^2.$$
(70)

In complex notation we can write

$$\vert \overline {\vec S} \vert = {{c{\epsilon _0}} \over 2}EE^\ast.$$
(71)

However, for more intuitive results the light fields can be given in converted units, so that the light power can be computed as the square of the light field amplitudes. Unless otherwise noted, throughout this work the unit of light field amplitudes \(\sqrt {{\rm{watt}}}\). Thus, the notation used in this document to describe the computation of the light power of a laser beam is

$$P = EE^\ast.$$
(72)

4.1 Detection of optical beats

What is usually called an optical beat or simply a beat is the sinusoidal behaviour of the intensity of two overlapping and coherent fields. For example, if we superpose two fields of slightly different frequency, we obtain

$$\begin{array}{*{20}c} {E = {E_0}\cos ({\omega _1}t) + {E_0}\cos ({\omega _2}t)}\qquad\qquad\qquad\qquad\qquad\qquad\\ {P = {E^2} = E_0^2\left({{{\cos}^2}({\omega _1}t) + {{\cos}^2}({\omega _2}t) + 2\cos ({\omega _1}t)\cos ({\omega _1}t)} \right)}\;\;\;\;\\ {= E_0^2\left({{{\cos}^2}({\omega _1}t) + {{\cos}^2}({\omega _2}t) + \cos ({\omega _ +}t) + \cos ({\omega _ -}t)} \right),}\\ \end{array}$$
(73)

with ω+ = ω1 + ω2 and ω = ω1ω2. In this equation the frequency ω can be very small and can then be detected with the photodiode as illustrated in Figure 22.

$${P_{{\rm{diode}}}} = E_0^2(1 + \cos ({\omega _ -}t))$$
(74)

Using the same example photodiode as before: in order to be able to detect an optical beat ω would need to be smaller than 100 MHz. If we take two, sightly detuned Nd:YAG lasers with f = 282 THz, this means that the relative detuning of these lasers must be smaller than 10−7.

Figure 22
figure 22

A beam with two frequency components hits the photo diode. Shown in this plot are the field amplitude, the corresponding intensity and the electrical output of the photodiode.

In general, for a field with several frequency components, the photodiode signal can be written as

$$\vert E{\vert ^2} = E \cdot {E^{\ast}} = \sum\limits_{i = 0}^N {\sum\limits_{j = 0}^N {{a_i}a_j^{\ast}{e^{{\rm{i}}({\omega _i}{\rm{-}}{\omega _j})t}}.}}$$
(75)

for example, if the photodiode signal is filtered with a low-pass filter, such that only the DC part remains, we can compute the resulting signal by looking for all components without frequency dependence. The frequency dependence vanishes when the frequency becomes zero, i.e., in all parts of Equation (75) with ωi = ωj. The output is a real number, calculated like this:

$$x = \sum\limits_i {\sum\limits_j {{a_i}a_j^{\ast}\quad {\rm{with}}\quad {\rm{\{}}i,j\vert i,j} \in \{0, \ldots, N\} \wedge {\omega _i} = {\omega _j}\} {.}}$$
(76)

4.2 Signal demodulation

A typical application of light modulation, is its use in a modulation-demodulation scheme, which applies an electronic demodulation to a photodiode signal. A ‘demodulation’ of a photodiode signal at a user-defined frequency ωx, performed by an electronic mixer and a low-pass filter, produces a signal, which is proportional to the amplitude of the photo current at DC and at the frequency Interestingly, by using two mixers with different phase offsets one can also reconstruct the phase of the signal, or to be precise the phase difference of the light at ω0 ± ωx with respect to the carrier light. This feature can be very powerful for generating interferometer control signals.

Mathematically, the demodulation process can be described by a multiplication of the output with a cosine: cos(ωx+φj) (is the demodulation phase), which is also called the ‘local oscillator’. After the multiplication was performed only the DC part of the result is taken into account. The signal is

$${S_0} = \vert E{\vert ^2} = E \cdot E^\ast = \sum\limits_{i = 0}^N {\sum\limits_{j = 0}^N {{a_i}a_j^{\ast}{e^{{\rm{i}}({\omega _i} - {\omega _j})t}}.}}$$
(77)

Multiplied with the local oscillator it becomes

$$\begin{array}{*{20}c} {{S_1} = {S_0} \cdot \cos ({\omega _x}t + {\varphi _x}) = {S_0}{1 \over 2}({e^{{\rm{i}}({\omega _x}t + {\varphi _x})}} + {e^{{\rm{- i}}({\omega _x}t + {\varphi _x})}})} \\ {= {1 \over 2}\sum\limits_{i = 0}^N {\sum\limits_{j = 0}^N {{a_i}a_j^{\ast}{e^{{\rm{i}}({\omega _i} - {\omega _j})t}} \cdot ({e^{{\rm{i}}({\omega _x}t + {\varphi _x})}} + {e^{{\rm{- i}}({\omega _x}t + {\varphi _x})}})}}.} \\ \end{array}$$
(78)

With \({A_{ij}} = {a_i}a_j^{\ast}\) and \({e^{{\rm{i}}{\omega _{ij}}\,t}} = {e^{{\rm{i}}\,{{{\rm{(}}{\omega _i}{\rm{-}}{\omega _j})}^t}}}\) we can write

$${S_1} = {1 \over 2}\left({\sum\limits_{i = 0}^N {{A_{ii}} + \sum\limits_{i = 0}^N {\sum\limits_{j = i + 1}^N {({A_{ij}}{e^{{\rm{i}}{\omega _{ij}}t}} + A_{ij}^{\ast}{e^{{\rm{- i}}{\omega _{ij}}t}})}}}} \right)\cdot({e^{{\rm{i}}({\omega _x}t + {\varphi _x})}} + {e^{{\rm{- i}}({\omega _x}t + {\varphi _x})}}){.}$$
(79)

When looking for the DC components of S1 we get the following [20]:

$$\begin{array}{*{20}c} {{S_{1,{\rm{DC}}}} = \sum\limits_{ij} {{1 \over 2}({A_{ij}}\,{e^{- {\rm{i}}\,{\varphi _x}}} + A_{ij}^{\ast} \,{e^{{\rm{i}}\,{\varphi _x}}})\quad {\rm{with}}\quad \{i,j\vert i,j \in \{0, \ldots ,N\} \wedge {\omega _{ij}} = {\omega _x}\}}} \\ {= \sum\limits_{ij} {\Re \{{A_{ij}}\,{e^{- {\rm{i}}\,{\varphi _x}}}\} {.}} \qquad \quad \qquad \qquad \qquad \qquad \qquad \qquad \qquad \,\,} \\ \end{array}$$
(80)

This would be the output of a mixer and a subsequent low-pass filter. The results for φx = 0 and φx = π/2 are called in-phase and in-quadrature, respectively (or also first and second quadrature). They are given by

$$\begin{array}{*{20}c} {{S_{1,{\rm{DC,phase}}}} = \sum\limits_{ij} {\Re \{{A_{ij}}\},}}\\ {{S_{1,{\rm{DC,quad}}}} = \sum\limits_{ij} {\Im \{{A_{ij}}\}.}}\\\end{array}$$
(81)

if only one mixer is used, the output is always real and is determined by the demodulation phase. However, with two mixers generating the in-phase and in-quadrature signals, it is possible to construct a complex number representing the signal amplitude and phase:

$$z = \sum\limits_{ij} {{a_i}a_j^{\ast}\quad {\rm{with}}\quad \{i,j\vert i,j \in \{0, \ldots, N\} \wedge {\omega _{ij}} = {\omega _x}\}{.}}$$
(82)

Often several sequential demodulations are applied in order to measure very specific phase information. For example, a double demodulation can be described as two sequential multiplications of the signal with two local oscillators and taking the DC component of the result. First looking at the whole signal, we can write:

$${S_2} = {S_0} \cdot \cos ({\omega _x}t + {\varphi _x})\cos ({\omega _y}t + {\varphi _y}){.}$$
(83)

This can be written as

$$\begin{array}{*{20}c} {{S_2} = {S_0}{1 \over 2}(\cos ({\omega _y}t + {\omega _x}t + {\varphi _y} + {\varphi _x}) + \cos ({\omega _y}t - {\omega _x}t + {\varphi _y} - {\varphi _x}))}\\ {= {S_0}{1 \over 2}(\cos ({\omega _ +}t + {\varphi _ +}) + \cos ({\omega _ -}t + {\varphi _ -})),}\qquad \qquad \qquad\quad \\ \end{array}$$
(84)

and thus reduced to two single demodulations. Since we now only care for the DC component we can use the expression from above (Equation (82)). These two demodulations give two complex numbers:

$$\begin{array}{*{20}c} {z1 = \sum\limits_{ij} {{A_{ij}}\quad {\rm{with}}\quad \{i,j\vert i,j \in \{0, \ldots, N\} \wedge {\omega _i} - {\omega _j} = {\omega _ +}\},}}\\ {z2 = \sum\limits_{ij} {{A_{kl}}\quad {\rm{with}}\quad \{k,l\vert k,l \in \{0, \ldots, N\} \wedge {\omega _k} - {\omega _l} = {\omega _ -}\}{.}}}\\\end{array}$$
(85)

The demodulation phases are applied as follows to get a real output (two sequential mixers)

$$x = \Re \{({z_1}\;{e^{- {\rm{i}}{\varphi _x}}} + {z_2}\;{e^{{\rm{i}}{\varphi _x}}}){e^{{\rm{- i}}{\varphi _y}}}\}{.}$$
(86)

In a typical setup, a user-defined demodulation phase for the first frequency (here is given. If two mixers are used for the second demodulation, we can reconstruct the complex number

$$z = {z_1}\;{e^{- {\rm{i}}{\varphi _x}}} + {z_2}\;{e^{{\rm{i}}{\varphi _x}}}.$$
(87)

More demodulations can also be reduced to single demodulations as above.

4.3 Finesse examples

4.3.1 Optical beat

In this example two laser beams are superimposed at a 50:50 beam splitter. The beams have a slightly different frequency: the second beam has a 10 kHz offset with respect to the first (and to the default laser frequency). The plot illustrates the output of four different detectors in one of the beam splitter output ports, while the phase of the second beam is tuned from 0° to 180°. The photodiode ‘pd1’ shows the total power remaining constant at 1. The amplitude detectors ‘ad1’ and ‘ad10k’ detect the laser light at 0 Hz (default frequency) and 10 kHz respectively. Both show a constant absolute \(\sqrt {1/2}\) and the detector ‘ad10k’ tracks the tuning of the phase of the second laser beam. Finally, the detector ‘pd10k’ resembles a photodiode with demodulation at 10 kHz. In fact, this represents a photodiode and two mixers used to reconstruct a complex number as shown in Equation (82). One can see that the phase of the resulting electronic signal also directly follows the phase difference between the two laser beams.

Figure 23
figure 23

Finesse example: Optical beat.

Finesse input file for ‘Optical beat’

figure Graphic5

5 Basic Interferometers

The large interferometric gravitational-wave detectors currently in operation are based on two fundamental interferometer topologies: the Fabry-Pérot and the Michelson interferometer. The main instrument is very similar to the original interferometer concept used in the famous experiment by Michelson and Morley, published in 1887 [42]. The main difference is that modern instruments use laser light to illuminate the interferometer to achieve much higher accuracy. Already the first prototype by Forward and Weiss has thus achieved a sensitivity a million times better than Michelson’s original instrument [18]. In addition, in current gravitational-wave detectors, the Michelson interferometer has been enhanced by resonant cavities, which in turn have been derived from the original idea for a spectroscopy standard published by Fabry and Pérot in 1899 [16]. The following section will describe the fundamental properties of the Fabry-Pérot interferometer and the Michelson interferometer. A thorough understanding of these basic instruments is essential for the study of the high-precision interferometers used for gravitational-wave detection.

5.1 The two-mirror cavity: a Fabry-Pérot interferometer

We have computed the field amplitudes in a linear two-mirror cavity, also called Fabry-Pérot interferometer, in Section 2.2. In order to understand the features of this optical instrument it is of interest to have a closer look at the power circulation in the cavity. A typical optical layout is shown in Figure 24: two parallel mirrors form the Fabry-Pérot cavity. A laser beam is injected through the first mirror (at normal incidence).

Figure 24
figure 24

Typical optical layout of a two-mirror cavity, also called a Fabry-Pérot interferometer. Two mirrors form the Fabry-Pérot interferometer, a laser beam is injected through one of the mirrors and the reflected and transmitted light can be detected by photo detectors.

The behaviour of the (ideal) cavity is determined by the length of the cavity L, the wavelength of the laser λ and the reflectivity and transmittance of the mirrors. Assuming an input power of |a0|2 = 1, we obtain

$${P_1} = \vert {a_1}{\vert ^2} = {{{T_1}} \over {1 + {R_1}{R_2} - 2{r_1}{r_2}\cos (2kL)}},$$
(88)

with k = 2π/λ, P, T = t2 and R = r2, as defined in Section 1.4. Similarly we could compute the transmission of the optical system as the input-output ratio of the field amplitudes. For example,

$${{{a_2}} \over {{a_0}}} = {{- {t_1}{t_2}\exp (- {\rm{i}}\;kL)} \over {1 - {r_1}{r_2}\exp (- {\rm{i}}\;2kL)}}$$
(89)

is the frequency-dependent transfer function of the cavity in transmission (the frequency dependency is hidden inside the k = 2πf/c).

Figure 25 shows a plot of the circulating light power i over the laser frequency. The maximum power is reached when the cosine function in the denominator becomes equal to one, i.e., at kL = with N an integer. This is called the cavity resonance. The lowest power values are reached at anti-resonance when kL = (N + 1/2)π. We can also rewrite

$$2kL = \omega {{2L} \over c} = 2\pi f{{2L} \over c} = {{2\pi f} \over {{\rm{FSR}}}},$$
(90)

with FSR being the free-spectral range of the cavity as shown in Figure 25. Thus, it becomes clear that resonance is reached for laser frequencies

$${f_r} = N \cdot {\rm{FSR}},$$
(91)

where N is an integer.

Figure 25
figure 25

Power enhancement in a two-mirror cavity as a function of the laser-light frequency. The peaks marks the resonances of the cavity, i.e., modes of operation in which the injected light is resonantly enhanced. The frequency distance between two peaks is called free-spectral range (FSR).

Another characteristic parameter of a cavity is its linewidth, usually given as full width at half maximum (FWHM) or its pole frequency, fp. In order to compute the linewidth we have to ask at which frequency the circulating power becomes half the maximum:

$$\vert {a_1}({f_p}){\vert ^2}\overset{!}{=}{1 \over 2}\vert {a_{1,\max}}\vert ^{2}.$$
(92)

This results in the following expression for the full linewidth:

$${\rm{FWHM = 2}}{f_p} = {{2{\rm{FSR}}} \over \pi}\arcsin \left({{{1 - {r_1}{r_2}} \over {2\sqrt {{r_1}{r_2}}}}} \right).$$
(93)

The ratio of the linewidth and the free spectral range is called the finesse of a cavity:

$$F = {{{\rm{FSR}}} \over {{\rm{FWHM}}}}={\pi \over {{\rm{2arcsin}}\left({{{1 - {r_1}{r_2}} \over {2\sqrt {{r_1}{r_2}}}}} \right)}}.$$
(94)

In the case of high finesse, i.e., r1 and r2 are close to 1 we can use the fact that the argument of the arcsin function is small and make the approximation

$$F \approx {{\pi \sqrt {{r_1}{r_2}}} \over {1 - {r_1}{r_2}}} \approx {\pi \over {1 - {r_1}{r_2}}}.$$
(95)

The behaviour of a two mirror cavity depends on the length of the cavity (with respect to the frequency of the laser) and on the reflectivities of the mirrors. Regarding the mirror parameters one distinguishes three casesFootnote 5:

  • when T1 < T2 the cavity is called undercoupled

  • when T1 = T2 the cavity is called impedance matched

  • when T1 > T2 the cavity is called overcoupled

The differences between these three cases can seem subtle mathematically but have a strong impact on the application of cavities in laser systems. One of the main differences is the phase evolution of the light fields, which is shown in Figure 26. The circulating power shows that the resonance effect is better used in over-coupled cavities; this is illustrated in Figure 27, which shows the transmitted and circulating power for the three different cases. Only in the impedance-matched case can the cavity transmit (on resonance) all the incident power. Given the same total transmission T1 + T2, the overcoupled case allows for the largest circulating power and thus a stronger ‘resonance effect’ of the cavity, for example, when the cavity is used as a mode filter. Hence, most commonly used cavities are impedance matched or overcoupled.

Figure 26
figure 26

This figure compares the fields reflected by, transmitted by and circulating in a Fabry-Pérot cavity for the three different cases: over-coupled, under-coupled and impedance matched cavity (in all cases T1 + T2 = 0.2 and the round-trip loss is 1%). The traces show the phase and amplitude of the electric field as a function of laser frequency detuning.

Figure 27
figure 27

Power transmitted and circulating in a two mirror cavity with input power 1 W. The mirror transmissions are set such that T1 + T2 = 0.8 and the reflectivities of both mirrors are set as R = 1 − T. The cavity is undercoupled for T1 < 0.4, impedance matched at T1 = T2 = 0.4 and overcoupled for T1 > 0.4. The transmission is maximised in the impedance-matched case and falls similarly for over or undercoupled settings. However, the circulating power (and any resonance performance of the cavity) is much larger in the overcoupled case.

5.2 Michelson interferometer

We came across the Michelson interferometer in Section 2.4 when we discussed the phase relation at a beam splitter. The typical optical layout of the Michelson interferometer is shown again in Figure 28: a laser beam is split by a beam splitter and send along two perpendicular interferometer arms. The four directions seen from the beam splitter are called North, East, West and South. The ends of these arms (North and East) are marked by highly reflective end mirrors, which reflect the beams back into themselves so that they can be recombined by the beam splitter. Generally, the Michelson interferometer has two outputs, namely the so far unused beam splitter port (South) and the input port (West). Both output ports can be used to obtain interferometer signals, however, most setups are designed such that the signals with high signal-to-noise ratios are detected in the South port.

Figure 28
figure 28

Typical optical layout of a Michelson interferometer: a laser beam is split into two and sent along two perpendicular interferometer arms. We will label the directions in a Michelson interferometer as North, East, West and South in the following. The end mirrors reflect the beams such that they are recombined at the beam splitter. The South and West ports of the beam splitter are possible output port, however, in many cases, only the South port is used.

The Michelson interferometer output is determined by the laser wavelength λ, the reflectivity and transmittance of the beam splitter and the end mirrors, and the length of the interferometer arms. In many cases the end mirrors are highly reflective and the beam splitter ideally a 50:50 beam splitter. In that case, we can compute the output for a monochromatic field as shown in Section 2.4. Using Equation (20) we can write the field in the South port as

$${E_S} = {E_0}{{\rm{i}} \over 2}({e^{{\rm{i}}2k{L_N}}} + {e^{{\rm{i}}2k{L_E}}}).$$
(96)

We define the common arm length and the arm-length difference as

$$\begin{array}{*{20}c}{\bar L = {{{L_N} + {L_E}} \over 2}} \\ {\Delta L = {L_N} - {L_E},}\\\end{array}$$
(97)

which yield \(2{L_N} = 2\bar L + \Delta L\) and \(2{L_E} = 2\bar L - \Delta L\). Thus, we can further simplify to get

$${E_S} = {E_0}{{\rm{i}} \over 2}{e^{{\rm{i}}2k\bar L}}({e^{{\rm{i}}\;k\Delta L}} + {e^{{\rm{- i}}k\Delta L}}) = {E_0}\;{\rm{i}}{e^{{\rm{i}}2k\bar L}}\cos (k\Delta L){.}$$
(98)

The photo detector then produces a signal proportional to

$$S = {E_S}E_S^{\ast} = {P_0}{\cos ^2}(k\Delta L) = {P_0}{\cos ^2}(2\pi \Delta L/\lambda){.}$$
(99)

This signal is depicted in Figure 29; it shows that the power in the South port changes between zero and the input power with a period of ΔL/λ = 0.5. The tuning at which the output power drops to zero is called the dark fringe. Current interferometric gravitational-wave detectors operate their Michelson interferometer at or near the dark fringe.

Figure 29
figure 29

Power in the South port of a symmetric Michelson interferometer as a function of the arm length difference ΔL.

The above seems to indicate that the macroscopic arm-length difference plays no role in the Michelson output signal. However, this is only correct for a monochromatic laser beam with infinite coherence length. In real interferometers care must be taken that the arm-length difference is well below the coherence length of the light source. In gravitational-wave detectors the macroscopic arm-length difference is an important design feature; it is kept very small in order to reduce coupling of laser noise into the output but needs to retain a finite size to allow the transfer of phase modulation sidebands from the input to the output port; this is illustrated in the Finesse example below and will be covered in detail in Section 6.4.

5.3 Finesse examples

5.3.1 Michelson power

The power in the South port of a Michelson detector varies as the cosine squared of the microscopic arm length difference. The maximum output can be equal to the input power, but only if the Michelson interferometer is symmetric and lossless. The tuning for which the South port power is zero is referred to as the dark fringe.

Figure 30
figure 30

Finesse example: Michelson power.

Finesse input file for ‘Michelson power’

figure Graphic6

5.3.2 Michelson modulation

This example demonstrates how a macroscopic arm length difference can cause different ‘dark fringe’ tuning for injected fields with different frequencies. In this case, some of the 10 MHz modulation sidebands are transmitted when the interferometer is tuned to a dark fringe for the carrier light. This effect can be used to separate light fields of different frequencies. It is also the cause for transmission of laser noise (especially frequency noise) into the Michelson output port when the interferometer is not perfectly symmetric.

Finesse input file for ‘Michelson modulation’

figure Graphic7
Figure 31
figure 31

Finesse example: Michelson modulation.

6 Interferometric Length Sensing and Control

In this section we introduce interferometers as length sensing devices. In particular, we explain how the Fabry-Pérot interferometer and the Michelson interferometer can be used for high-precision measurements and that both require a careful control of the base length (which is to be measured) in order to yield their large sensitivity. In addition, we briefly introduce the general concepts of error signals and transfer functions, which are used to describe most essential features of length sensing and control.

6.1 Error signals and transfer functions

In general, we will call an error signal any measured signal suitable for stabilising a certain experimental parameter p with a servo loop. The aim is to maintain the variable p at a user-defined value, the operating point, p0. Therefore, the error signal must be a function of the parameter p. In most cases it is preferable to have a bipolar signal with a zero crossing at the operating point. The slope of the error signal at the operating point is a measure of the ‘gain’ of the sensor (which in the case of interferometers is a combination of optics and electronics).

Transfer functions describe the propagation of a periodic signal through a plant and are usually given as plots of amplitude and phase over frequency. By definition a transfer function describes only the linear coupling of signals inside a system. This means a transfer function is independent of the actual signal size. For small signals or small deviations, most systems can be linearised and correctly described by transfer functions.

Experimentally, network analysers are commonly used to measure a transfer function: one connects a periodic signal (the source) to an actuator of the plant (which is to be analysed) and to an input of the analyser. A signal from a sensor that monitors a certain parameter of the plant is connected to the second analyser input. By mixing the source with the sensor signal the analyser can determine the amplitude and phase of the input signal with respect to the source (amplitude equals one and the phase equals zero when both signals are identical).

Mathematically, transfer functions can be modeled similarly: applying a sinusoidal signal sin(ωst) to the interferometer, e.g., as a position modulation of a cavity mirror, will create phase modulation sidebands with a frequency offset of ±ωs to the carrier light. If such light is detected in the right way by a photodiode, it will include a signal at the frequency component ωs, which can be extracted, for example, by means of demodulation (see Section 4.2).

Transfer functions are of particular interest in relation to error signals. Typically a transfer function of the error signal is required for the design of the respective electronic servo. A ‘transfer function of the error signal’ usually refers to a very specific setup: the system is held at its operating point, such that, on average, \(\bar p = {p_0}\). A signal is applied to the system in the form of a very small sinusoidal disturbance of p. The transfer function is then constructed by computing for each signal frequency the ratio of the error signal and the injected signal. Figure 32 shows an example of an error signal and its corresponding transfer function. The operating point shall be at

$${x_{\rm{d}}} = 0\quad {\rm{and}}\quad {x_{{\rm{EP}}}}({x_{\rm{d}}} = 0) = 0$$
(100)

The optical transfer function \({T_{{\rm{opt,}}{{\rm{x}}_{\rm{d}}}}}\) with respect to this error signal is defined by

$${\tilde x_{{\rm{EP}}}}(f) = {T_{{\rm{opt}},{{\rm{x}}_{\rm{d}}}}}{T_{\det}}{\tilde x_{\rm{d}}}(f),$$
(101)

with Tdet as the transfer function of the sensor. In the following, Tdet is assumed to be unity. At the zero crossing the slope of the error signal represents the magnitude of the transfer function for low frequencies:

$${\left\vert {{{d{x_{{\rm{EP}}}}} \over {d{x_{\rm{d}}}}}} \right\vert _{\vert {x_{\rm{d}}} = 0}} = \vert {T_{{\rm{opt}},{{\rm{x}}_{\rm{d}}}}}{\vert _{\vert f \rightarrow 0}}$$
(102)

The quantity above will be called the error-signal slope in the following text. It is proportional to the optical gain |Topt,xd|, which describes the amplification of the gravitational-wave signal by the optical instrument.

Figure 32
figure 32

Example of an error signal: the top graph shows the electronic interferometer output signal as a function of mirror displacement. The operating point is given as the zero crossing, and the error-signal slope is defined as the slope at the operating point. The right graph shows the magnitude of the transfer function mirror displacement error signal. The slope of the error signal (left graph) is equal to the low frequency limit of the transfer function magnitude (see Equation (102)).

6.2 Fabry-Pérot length sensing

In Figure 25 we have plotted the circulating power in a Fabry-Pérot cavity as a function of the laser frequency. The steep features in this plot indicate that such a cavity can be used to measure changes in the laser frequency. From the equation for the circulating power (see Equation (88)),

$${P_1}/{P_0} = {{{T_1}} \over {1 + {R_1}{R_2} - 2{r_1}{r_2}\cos (2kL)}} = {{{T_1}} \over d},$$
(103)

we can see that the actual frequency dependence is given by the cos(2kL) term. Writing this term as

$$\cos (2kL) = \cos \left({2\pi {{Lf} \over c}} \right),$$
(104)

we can highlight the fact that the cavity is in fact a reference for the laser frequency in relation to the cavity length. If we know the cavity length very well, a cavity should be a good instrument to measure the frequency of a laser beam. However, if we know the laser frequency very accurately, we can use an optical cavity to measure a length. In the following we will detail the optical setup and behaviour of a cavity used for a length measurement. The same reasoning applies for frequency measurements. If we make use of the resonant power enhancement of the cavity to measure the cavity length, we can derive the sensitivity of the cavity from the differentiation of Equation (88), which gives the slope of the trace shown in Figure 25,

$${{d\;{P_1}/{P_0}} \over {d\;L}} = {{- 4{T_1}{r_1}{r_2}k\sin (2kL)} \over {{d^2}}},$$
(105)

with d as defined in Equation (103). This is plotted in Figure 33 together with the cavity power as a function of the cavity tuning. From Figure 33 we can deduce a few key features of the cavity:

  • The cavity must be held as near as possible to the resonance for maximum sensitivity. This is the reason that active servo control systems play an important role in modern laser interferometers.

  • If we want to use the power directly as an error signal for the length, we cannot use the cavity directly on resonance because there the optical gain is zero. A suitable error signal (i.e., a bipolar signal) can be constructed by adding an offset to the light power signal. A control system utilising this method is often called DC-lock or offset-lock. However, we show below that more elegant alternative methods for generating error signals exist.

  • The differentiation of the cavity power looks like a perfect error signal for holding the cavity on resonance. A signal proportional to such differentiation can be achieved with a modulation-demodulation technique.

Figure 33
figure 33

The top plot shows the cavity power as a function of the cavity tuning. A tuning of 360° refers to a change in the cavity length by one laser wavelength. The bottom plot shows the differentiation of the upper trace. This illustrates that near resonance the cavity power changes very rapidly when the cavity length changes. However, for most tunings the cavity seems not sensitive at all.

6.3 The Pound-Drever-Hall length sensing scheme

This scheme for stabilising the frequency of a light field to the length of a cavity, or vice versa, is based on much older techniques for performing very similar actions with microwaves and microwave resonators. Drever and Hall have adapted such techniques for use in the optical regime [14] and today what is now called the Pound-Drever-Hall technique can be found in a great number of different types of optical setups. An example layout of this scheme is shown in Figure 34, in this case for generating a length (or frequency) signal of a two-mirror cavity. The laser is passed through an electro-optical modulator, which applies a periodic phase modulation at a fixed frequency. In many cases the modulation frequency is chosen such that it resides in the radio frequency band for which low-cost, low-noise electronic components are available. The phase modulated light is then injected into the cavity. However, from the frequency domain analysis introduced in Section 5, we know that in most cases not all the light can be injected into the cavity. Let’s consider the example of an over-coupled cavity with the reflectivity of the end mirror R2 < 1. Such a cavity would have a frequency response as shown in the top traces of Figure 26 (recall that the origin of the frequency axis refers to an arbitrarily chosen default frequency, which for this figure has been selected to be a resonance frequency of the cavity). If the cavity is held on resonance for the unmodulated carrier field, this field enters the cavity, gets resonantly enhanced and a substantial fraction is transmitted. If the frequency offset of the modulation sidebands is chosen such that it does not coincide with (or is near to) an integer multiple of the cavity’s free spectral range, the modulation sidebands are mostly reflected by the cavity and will not be influenced as much by the resonance condition of the cavity as the carrier. The photodiode measuring the reflected light will see the optical beat between the carrier field and the modulation sidebands. This includes a component at the modulation frequency which is a measure of the phase difference between the carrier field and the sidebands (given the setup as described above). Any slight change of the cavity length would introduce a proportional change in the phase of the carrier field and no change in the sideband fields. Thus the photodiode signal can be used to measure the length changes of the cavity. One of the advantages of this method is the fact that the so-generated signal is bipolar with a zero crossing and steep slope exactly at the cavity’s resonance, see Figure 35.

Figure 34
figure 34

Typical setup for using the Pound-Drever-Hall scheme for length sensing and with a two-mirror cavity: the laser beam is phase modulated with an electro-optical modulator (EOM). The modulation frequency is often in the radio frequency range. The photodiode signal in reflection is then electrically demodulated at the same frequency.

Figure 35
figure 35

This figure shows an example of a Pound-Drever-Hall (PDH) signal of a two-mirror cavity. The plots refer to a setup in which the cavity mirrors are stationary and the frequency of the input laser is tuned linearly. The upper trace shows the light power circulating in the cavity. The three peaks correspond to the frequency tunings for which the carrier (main central peak) or the modulation sidebands (smaller side peaks) are resonant in the cavity. The lower trace shows the PDH signal for the same frequency tuning. Coincident with the peaks in the upper trace are bipolar structures in the lower trace. Each of the bipolar structures would be suitable as a length-sensing signal. In most cases the central structure is used, as experimentally it can be easily identified because its slope has a different sign compared to the sideband structures.

6.4 Michelson length sensing

Similarly to the two-mirror cavity, we can start to understand the length-sensing capabilities of the Michelson interferometer by looking at the output light power as a function of a mirror movement, as shown in Figure 29. The power changes as sine squared with the maximum slope at the point when the output power (in what we call the South port) is half the input power. The slope of the output power, which is the optical gain of the instrument for detecting a differential arm-length change ΔL with a photo detector in the South port can be written as

$${{dS} \over {d\Delta L}} = {{2\pi {P_0}} \over \lambda}\sin \left({{{4\pi} \over \lambda}\Delta L} \right)$$
(106)

and is shown in Figure 36. The most notable difference of the optical gain of the Michelson interferometer with respect to the Fabry-Pérot interferometer (see Figure 33) is the wider, more smooth distribution of the gain. This is due to the fact that the cavity example is based on a high-finesse cavity in which the optical resonance effect is dominant. In a basic Michelson interferometer such resonance enhancement is not present.

Figure 36
figure 36

Power and slope of a Michelson interferometer. The upper plot shows the output power of a Michelson interferometer as detected in the South port (as already shown in Figure 29). The lower plot shows the optical gain of the instrument as given by the slope of the upper plot.

However, the main difference is that the measurement is made differentially by comparing two lengths. This allows one to separate a larger number of possible noise contributions, for example noise in the laser light source, such as amplitude or frequency noise. This is why the main instrument for gravitational-wave measurements is a Michelson interferometer. However, the resonant enhancement of light power can be added to the Michelson, for example, by using Fabry-Pérot cavities within the Michelson. This construction of new topologies by combining Michelson and Fabry-Pérot interferometers will be described in detail in a future version of this review.

The Michelson interferometer has two longitudinal degrees of freedom. These can be represented by the positions (along the optical axes) of the end mirrors. However, it is more efficient to use proper linear combinations of these and describe the Michelson interferometer length or position information by the common and differential arm length, as introduced in Equation (97):

$$\begin{array}{*{20}c}{\bar L = {{{L_N} + {L_E}} \over 2}}\quad\\{\Delta L = {L_N} - {L_E}.}\\\end{array}$$

The Michelson interferometer is intrinsically insensitive to the common arm length \({\bar L}\).

6.5 The Schnupp modulation scheme

Similar to the Fabry-Pérot cavity, the Michelson interferometer is also often used to set an operating point where the optical gain of a direct light power detection is zero. This operating point, given by ΔL/λ = (2N + 1) • 0.25 with N a non-negative integer, is called dark fringe. This operating point has several advantages, the most important being the low (ideally zero) light power on the diode. Highly efficient and low-noise photodiodes usually use a small detector area and thus are typically not able to detect large power levels. By using the dark fringe operating point, the Michelson interferometer can be used as a null instrument or null measurement, which generally is a good method to reduce systematic errors [49].

One approach to make use of the advantages of the dark fringe operating point is to use an operating point very close to the dark fringe at which the optical gain is not yet zero. In such a scenario a careful trade-off calculation can be done by computing the signal-to-noise with noises that must be suppressed, such as the laser amplitude noise. This type of operation is usually referred to as DC control or offset control and is very similar to the similarly-named mechanism used with Fabry-Pérot cavities.

Another option is to employ phase modulated light, similar to the Pound-Drever-Hall scheme described in Section 6.3. The optical layout of such a scheme is depicted in Figure 37: an electro-optical modulator is used to apply a phase modulation at a fixed (usually RF type) frequency to the (monochromatic) laser light before it enters the interferometer. The photodiode signal from the interferometer output is then demodulated at the same frequency. This scheme allows one to operate the interferometer precisely on the dark fringe. The method originally proposed by Lise Schnupp is also sometimes referred to as frontal modulation.

Figure 37
figure 37

This length sensing scheme is often referred to as frontal or Schnupp modulation: an EOM is used to phase modulate the laser beam before entering the Michelson interferometer. The signal of the photodiode in the South port is then demodulated at the same frequency used for the modulation.

The optical gain of a Michelson interferometer with Schnupp modulation is shown in Figure 39 in Section 6.6.

Figure 38
figure 38

Finesse example: Cavity power and slope.

Figure 39
figure 39

Finesse example: Michelson with Schnupp modulation.

6.6 Finesse examples

6.6.1 Cavity power and slope

Figure 33 shows a plot of the analytical functions describing the power inside a cavity and its differentiation by the cavity tuning. This example recreates the plot using a numerical model in Finesse.

Finesse input file for ‘Cavity power and slope’

figure Graphic8

6.6.2 Michelson with Schnupp modulation

Figure 39 shows the demodulated photodiode signal of a Michelson interferometer with Schnupp modulation, as well as its differentiation, the latter being the optical gain of the system. Comparing this figure to Figure 36, it can be seen that with Schnupp modulation, the optical gain at the dark fringe operating points is maximised and a suitable error signal for these points is obtained.

Finesse input file for ‘Michelson with Schnupp modulation’

figure Graphic9

7 Beam Shapes: Beyond the Plane Wave Approximation

In previous sections we have introduced a notation for describing the on-axis properties of electric fields. Specifically, we have described the electric fields along an optical axis as functions of frequency (or time) and the location z. Models of optical systems may often use this approach for a basic analysis even though the respective experiments will always include fields with distinct off-axis beam shapes. A more detailed description of such optical systems needs to take the geometrical shape of the light field into account. One method of treating the transverse beam geometry is to describe the spatial properties as a sum of ‘spatial components’ or ‘spatial modes’ so that the electric field can be written as a sum of the different frequency components and of the different spatial modes. Of course, the concept of modes is directly related to the use of a sort of oscillator, in this case the optical cavity. Most of the work presented here is based on the research on laser resonators reviewed originally by Kogelnik and Li [35]. Siegman has written a very interesting historic review of the development of Gaussian optics [52, 51] and we use whenever possible the same notation as used in his textbook ‘Lasers’ [50].

This section introduces the use of Gaussian modes for describing the spatial properties along the transverse orthogonal x and y directions of an optical beam. We can write

$$E(t,x,y,z) = \sum\limits_j {\sum\limits_{n,m} {{a_{jnm}}\;{u_{nm}}(x,y,z)} \;\exp ({\rm{i}}({\omega _j}t - {k_j}z)),}$$
(107)

with unm as special functions describing the spatial properties of the beam and ajnm as complex amplitude factors (ωj is again the angular frequency and kj = ωj/c). For simplicity we restrict the following description to a single frequency component at one moment in time (t = 0), so

$$E(x,y,z) = \exp (- {\rm{i}}\;kz)\;\sum\limits_{n,m} {{a_{nm}} {u_{nm}}(x,y,z){.}}$$
(108)

In general, different types of spatial modes unm can be used in this context. Of particular interest are the Gaussian modes, which will be used throughout this document. Many lasers emit light that closely resembles a Gaussian beam: the light mainly propagates along one axis, is well collimated around that axis and the cross section of the intensity perpendicular to the optical axis shows a Gaussian distribution. The following sections provide the basic mathematical framework for using Gaussian modes for analysing optical systems.

7.1 The paraxial wave equation

Mathematically, Gaussian modes are solutions to the paraxial wave equation — a specific wave equation for electromagnetic fields. All electromagnetic waves are solutions to the general wave equation, which in vacuum can be given as:

$$\Delta \vec E - {1 \over {{c^2}}}\ddot \vec E = 0.$$
(109)

But laser light fields are special types of electromagnetic waves. For example, they are characterised by low diffraction. Hence, a laser beam will have a characteristic length ω describing the ‘width’ (the dimension of the field transverse to the main propagation axis), and a characteristic length l defining some local length along the propagation over which the beam characteristics do not vary much. By definition, for what we call a beam ω is typically small and l large in comparison, so that ω/l can be considered small. In fact, the paraxial wave equation (and its solutions) can be derived as the first-order terms of a series expansion of Equation (109) into orders of ω/l [37].

A simpler approach to the paraxial-wave equation goes as follows: A particular beam shape shall be described by a function u(x, y, z) so that we can write the electric field as

$$E(x,y,z) = u(x,y,z)\;\exp (- {\rm{i}}\;kz){.}$$
(110)

Substituting this into the standard wave equation yields a differential equation for u:

$$(\partial _x^2 + \partial _y^2 + \partial _z^2)u(x,y,z) - 2{\rm{i}}\;k{\partial _z}u(x,y,z) = 0{.}$$
(111)

Now we put the fact that u(x, y, z) should be slowly varying with z in mathematical terms. The variation of u(x, y, z) with z should be small compared to its variation with x or y. Also the second partial derivative in z should be small. This can be expressed as

$$\vert \partial _z^2u(x,y,z)\vert \ll \vert 2k{\partial _z}u(x,y,z)\vert, \vert \partial _x^2u(x,y,z)\vert, \vert \partial _y^2u(x,y,z)\vert.$$
(112)

With this approximation, Equation (111) can be simplified to the paraxial wave equation,

$$(\partial _x^2 + \partial _y^2)u(x,y,z) - 2{\rm{i}}\;k{\partial _z}u(x,y,z) = 0{.}$$
(113)

Any field u that solves this equation represents a paraxial beam shape when used in the form given in Equation (110).

7.2 Transverse electromagnetic modes

In general, any solution u(x, y, z) of the paraxial wave equation, Equation (113), can be employed to represent the transverse properties of a scalar electric field representing a beam-like electromagnetic wave. Especially useful in this respect are special families or sets of functions that are solutions of the paraxial wave equation. When such a set of functions is complete and countable, it’s called a set of transverse electromagnetic modes (TEM). For instance, the set of Hermite-Gauss modes are exact solutions of the paraxial wave equation. These modes are represented by an infinite, countable and complete set of functions. The term complete means they can be understood as a base system of the function space defined by all solutions of the paraxial wave equation. In other words, we can describe any solution of the paraxial wave equation u′ by a linear superposition of Hermite-Gauss modes:

$$u^{\prime}(x,y,z) = \sum\limits_{n,m} {{a_{jnm}}\;{u_{nm}}(x,y,z),}$$
(114)

which in turn allows us to describe any laser beam using a sum of these modes:

$$E(t,x,y,z) = \sum\limits_j {\sum\limits_{n,m} {{a_{jnm}}\;{u_{nm}}(x,y,z)\;\exp ({\rm{i}}({\omega _j}t - {k_j}z)).}}$$
(115)

The Hermite-Gauss modes as given in this document (see Section 7.5) are orthonormal so that

$$\int \int dxdy\;{u_{nm}}u_{{n^\prime}{m^\prime}}^\ast = {\delta _{n{n^\prime}}}{\delta _{m{m^\prime}}} = \left\{{\begin{array}{*{20}c} {1\quad {\rm{if}}\quad n = n^\prime\quad {\rm{and}}\quad m = m^\prime}\\ {0\quad {\rm{otherwise}}\quad \quad \quad \quad \quad \quad}\\ \end{array}} \right\}.$$
(116)

This means that, in the function space defined by the paraxial wave equation, the Hermite-Gauss functions can be understood as a complete set of unit-length basis vectors. This fact can be utilised for the computation of coupling factors. Furthermore, the power of a beam, as given by Equation (108), being detected on a single-element photodetector (provided that the area of the detector is large with respect to the beam) can be computed as

$$EE^\ast = \sum\limits_{n,m} {{a_{nm}}a_{nm}^{\ast},}$$
(117)

or for a beam with several frequency components (compare with Equation (76)) as

$$EE^\ast = \sum\limits_{n,m} {\sum\limits_i {\sum\limits_j {{a_{inm}}a_{jnm\quad}^{\ast}{\rm{with}}\quad \{i,j\vert i,j \in \{0, \ldots, N\}} \wedge {\omega _i} = {\omega _j}\}}.}$$
(118)

7.3 Properties of Gaussian beams

The basic or ‘lowest-order’ Hermite-Gauss mode is equivalent to what is usually called a Gaussian beam and is given by

$$u(x,y,z) = \sqrt {{2 \over \pi}} {1 \over {w(z)}}\exp ({\rm{i}}\Psi (z))\;\exp \left({- {\rm{i}}\;k{{{x^2} + {y^2}} \over {2{R_C}(z)}} - {{{x^2} + {y^2}} \over {{w^2}(z)}}} \right).$$
(119)

The parameters of this equation are explained in detail below. The shape of a Gaussian beam is quite simple: the beam has a circular cross section, and the radial intensity profile of a beam with total power P is given by

$$I(r) = {{2P} \over {\pi {w^2}(z)}}\exp (- 2{r^2}/{w^2}),$$
(120)

with ω the spot size, defined as the radius at which the intensity is 1/e2 times the maximum intensity I(0). This is a Gaussian distribution, see Figure 40, hence the name Gaussian beam.

Figure 40
figure 40

One dimensional cross-section of a Gaussian beam. The width of the beam is given by the radius ω at which the intensity is 1/e2 of the maximum intensity.)

Figure 41 shows a different cross section through a Gaussian beam: it plots the beam size as a function of the position on the optical axis.

Figure 41
figure 41

Gaussian beam profile along z: this cross section along the x-z-plane illustrates how the beam size ω(z) of the Gaussian beam changes along the optical axis. The position of minimum beam size ω0 is called beam waist. See text for a description of the parameters Θ, zr and Rc.

Such a beam profile (for a beam with a given wavelength λ) can be completely determined by two parameters: the size of the minimum spot size ω0 (called beam waist) and the position z0 of the beam waist along the z-axis.

To characterise a Gaussian beam, some useful parameters can be derived from ω0 and z0. A Gaussian beam can be divided into two different sections along the z-axis: a near field — a region around the beam waist, and a far field — far away from the waist. The length of the near-field region is approximately given by the Rayleigh range zR. The Rayleigh range and the spot size are related by

$${z_{\rm{R}}} = {{\pi w_0^2} \over \lambda}.$$
(121)

With the Rayleigh range and the location of the beam waist, we can usefully write

$$w(z) = {w_0}\sqrt {1 + {{\left({{{z - {z_0}} \over {{z_{\rm{R}}}}}} \right)}^2}}.$$
(122)

This equation gives the size of the beam along the z-axis. In the far-field regime (zzR, z0), it can be approximated by a linear equation, when

$$w(z) \approx {w_0}{z \over {{z_{\rm{R}}}}} = {{z\lambda} \over {\pi {w_0}}}.$$
(123)

The angle Θ between the z-axis and ω(z) in the far field is called the diffraction angleFootnote 6 and is defined by

$$\Theta = \arctan \left({{{{w_0}} \over {{z_{\rm{R}}}}}} \right) = \arctan \left({{\lambda \over {\pi {w_0}}}} \right) \approx {{{w_0}} \over {{z_{\rm{R}}}}}.$$
(124)

Another useful parameter is the radius of curvature of the wavefront at a given point z. The radius of curvature describes the curvature of the ‘phase front’ of the electromagnetic wave — a surface across the beam with equal phase — intersecting the optical axis at the position z. We obtain the radius of curvature as a function of z:

$${R_C}(z) = z - {z_0} + {{z_{\rm{R}}^2} \over {z - {z_0}}}.$$
(125)

We also find:

$$\begin{array}{*{20}c} {{R_C} \approx \infty, \quad z - {z_0} \ll {z_{\rm{R}}}} & {({\rm{beam}}\;{\rm{waist}})\quad \quad \quad \quad} \\ {{R_C} \approx z,\;\;\;\;\;\;\;z \gg {z_{\rm{R}}},{z_{\rm{0}}}} & {({\rm{far}}\;{\rm{field}})\quad \quad \quad \quad \quad} \\ {{R_C} = 2{z_{\rm{R}}},\quad z - {z_0} = {z_{\rm{R}}}} & {({\rm{maximum}}\;{\rm{curvature}})\;.} \\ \end{array}$$
(126)

7.4 Astigmatic beams: the tangential and sagittal plane

If the interferometer is confined to a plane (here the xz plane), it is convenient to use projections of the three-dimensional description into two planes [46]: the tangential plane, defined as the xz plane and the sagittal plane as given by y and z.

The beam parameters can then be split into two respective parameters: z0,s, ω0,s for the sagittal plane and z0,t and ω0,t for the tangential plane so that the Hermite-Gauss modes can be written as

$${u_{nm}}(x,y) = {u_n}(x,{z_{0,t}},{w_{0,t}})\;{u_m}(y,{z_{0,s}},{w_{0,s}}){.}$$
(127)

Beams with different beam waist parameters for the sagittal and tangential plane are astigmatic.

Remember that these Hermite-Gauss modes form a base system. This means one can use the separation into sagittal and tangential planes even if the actual optical system does not show this special type of symmetry. This separation is very useful in simplifying the mathematics. In the following, the term beam parameter generally refers to a simple case where ω0,x = ω0,y and z0,x = z0,y but all the results can also be applied directly to a pair of parameters.

7.5 Higher-order Hermite-Gauss modes

The complete set of Hermite-Gauss modes is given by an infinite discrete set of modes unm(x, y, z) with the indices n and m as mode numbers. The sum n+m is called the order of the mode. The term higher-order modes usually refers to modes with an order n + m > 0. The general expression for Hermite-Gauss modes can be given as [35]

$${u_{{\rm{nm}}}}(x,y,z) = {u_{\rm{n}}}(x,z)\;{u_{\rm{m}}}(y,z),$$
(128)

with

$$\begin{array}{*{20}c} {{u_{\rm{n}}}(x,z) = {{\left({{2 \over \pi}} \right)}^{1/4}}{{\left({{{\exp ({\rm{i}}(2n + 1)\Psi (z))} \over {{2^n}n!w(z)}}} \right)}^{1/2}} \times}\qquad\qquad\qquad\\ {{H_n}\left({{{\sqrt 2 x} \over {w(z)}}} \right)\exp \left({- {\rm{i}}{{k{x^2}} \over {2{R_C}(z)}} - {{{x^2}} \over {{w^2}(z)}}} \right),}\\ \end{array}$$
(129)

and Hn(x) the Hermite polynomials of order n. The first Hermite polynomials, without normalisation, can be written

$$\begin{array}{*{20}c} {{H_0}(x) = 1\quad \quad \;\;\;{H_1}(x) = 2x}\qquad\;\\ {{H_2}(x) = 4{x^2} - 2\;{H_3}(x) = 8{x^3} - 12x.}\\ \end{array}$$
(130)
$${H_{n + 1}}(x) = 2x{H_n}(x) - 2n{H_{n - 1}}(x).$$
(131)

Further orders can be computed recursively since

$$\begin{array}{*{20}c} {{u_{{\rm{nm}}}}(x,y,z) = {{({2^{n + m - 1}}n!m!\pi)}^{- 1/2}}{1 \over {w(z)}}\;\exp ({\rm{i}}(n + m + 1)\Psi (z)) \times}\qquad \qquad \qquad \qquad\qquad \\ {{H_n}\left({{{\sqrt 2 x} \over {w(z)}}} \right){H_m}\left({{{\sqrt 2 y} \over {w(z)}}} \right)\exp \left({- {\rm{i}}{{k({x^2} + {y^2})} \over {2{R_C}(z)}} - {{{x^2} + {y^2}} \over {{w^2}(z)}}} \right).}\\ \end{array}$$
(132)

for both transverse directions we can also rewrite the above to

$${1 \over {q(z)}} = {1 \over {{R_C}(z)}} - {\rm{i}}{\lambda \over {\pi {w^2}(z)}}.$$
(133)

The latter form has the advantage of clearly showing the extra phase shift along the z-axis of (n + m +1)Ψ(z) called the Gouy phase; see Section 7.8.

7.6 The Gaussian beam parameter

For a more compact description of the interaction of Gaussian modes with optical components we will make use of the Gaussian beam parameter q [34]. The beam parameter is a complex quantity defined as

$$q(z) = {\rm{i}}{z_{\rm{R}}} + z - {z_0} = {q_0} + z - {z_0}\quad {\rm{and}}\quad {q_0} = {\rm{i}}{z_{\rm{R}}}.$$
(134)

It can also be written as

$$u(x,y,z) = {1 \over {q(z)}}\exp \left({- {\rm{i}}\;k{{{x^2} + {y^2}} \over {2q(z)}}} \right).$$
(135)

Using this parameter, Equation (119) can be rewritten as

$${w^2}(z) = {\lambda \over \pi}{{\vert q{\vert ^2}} \over {\Im \{q\}}},$$
(136)

Other parameters, like the beam size and radius of curvature, can also be written in terms of the beam parameter q:

$$w_0^2 = {{\Im \{q\} \lambda} \over \pi},$$
(137)
$${z_{\rm{R}}} = \Im \{q\}$$
(138)
$${R_C}(z) = {{\vert q\vert ^{2}} \over {\Re \{q\}}}.$$
(139)

and

$$\begin{array}{*{20}c} {{u_{{\rm{nm}}}}(x,y,z) = {u_{\rm{n}}}(x,z){u_{\rm{m}}}(y,z)\quad {\rm{with}}}\qquad\qquad\qquad\qquad\qquad\qquad\qquad\qquad\quad \;\;\\ {u_{\rm n}(x,z) = {{\left({{2 \over \pi}} \right)}^{1/4}}{{\left({{1 \over {{2^n}n!{w_0}}}} \right)}^{1/2}}{{\left({{{{q_0}} \over {q(z)}}} \right)}^{1/2}}{{\left({{{{q_0}\;q^\ast(z)} \over {q_0^{\ast}\;q(z)}}} \right)}^{n/2}}{H_n}\left({{{\sqrt 2 x} \over {w(z)}}} \right)\exp \left({- {\rm{i}}{{k{x^2}} \over {2q(z)}}} \right).}\\ \end{array}$$
(140)

The Hermite-Gauss modes can also be written using the Gaussian beam parameter asFootnote 7

$$\Psi (z) = \arctan \left({{{z - {z_0}} \over {{z_{\rm{R}}}}}} \right),$$
(141)

7.7 Properties of higher-order Hermite-Gauss modes

Some of the properties of Hermite-Gauss modes can easily be described using cross sections of the field intensity or field amplitude. Figure 42 shows such cross sections, i.e., the intensity in the xy plane, for a number of higher-order modes. This shows a xy symmetry for mode indices n and m. We can also see how the size of the intensity distribution increases with the mode index, while the peak intensity decreases.

Figure 42
figure 42

This plot shows the intensity distribution of Hermite-Gauss modes unm. One can see that the intensity distribution becomes wider for larger mode indices and the peak intensity decreases. The mode index defines the number of dark stripes in the respective direction.

Similarly, Figure 44 shows the amplitude and phase distribution of several higher-order Hermite-Gauss modes. Some further features of Hermite-Gauss modes:

  • The size of the intensity profile of any sum of Hermite-Gauss modes depends on z while its shape remains constant over propagation along the optical axis.

  • The phase distribution of Hermite-Gauss modes shows the curvature (or radius of curvature) of the beam. The curvature depends on z but is equal for all higher-order modes.

Note that these are special features of Gaussian beams and not generally true for arbitrary beam shapes. Figure 43, for example, shows the amplitude and phase distribution of a triangular beam at the point where it is (mathematically) created and after a 10 m propagation. Neither the shape is preserved nor does it show a spherical phase distribution.

Figure 43
figure 43

These top plots show a triangular beam shape and phase distribution and the bottom plots the diffraction pattern of this beam after a propagation of z = 5 m. It can be seen that the shape of the triangular beam is not conserved and that the phase front is not spherical.

Figure 44
figure 44

These plots show the amplitude distribution and wave front (phase distribution) of Hermite-Gaussian modes unm (labeled as HGnm in the plot). All plots refer to a beam with λ = 1 µm, ω = 1 mm and distance to waist z = 1 m. The mode index (in one direction) defines the number of zero crossings (along that axis) in the amplitude distribution. One can also see that the phase distribution is the same spherical distribution, regardless of the mode indices.

7.8 Gouy phase

The equation for Hermite-Gauss modes shows an extra longitudinal phase lag. This Gouy phase [8, 26, 25] describes the fact that, compared to a plane wave, the Hermite-Gauss modes have a slightly slower phase velocity, especially close to the waist. The Gouy phase can be written as

$$\Psi (z) = \arctan \left({{{\Re \{q\}} \over {\Im \{q\}}}} \right).$$
(142)

or, using the Gaussian beam parameter,

$$\varphi = (n + m + 1)\Psi (z).$$
(143)

Compared to a plane wave, the phase lag φ of a Hermite-Gauss mode is

$$\varphi = \left({n + {1 \over 2}} \right){\Psi _t}(z) + \left({m + {1 \over 2}} \right){\Psi _s}(z),$$
(144)

With an astigmatic beam, i.e., different beam parameters in the tangential and sagittal planes, this becomes

$${\Psi _t}(z) = \arctan \left({{{\Re \{{q_t}\}} \over {\Im \{{q_t}\}}}} \right),$$
(145)

with

$$\begin{array}{*{20}c} {{u_{p,l}}(r, \phi, z) = {1 \over {w(z)}}\sqrt {{{2p!} \over {\pi (\vert l\vert + p)!}}} \exp ({\rm{i}}(2p + \vert l\vert + 1)\Psi (z))}\qquad \qquad \qquad\\ {\times \left({{{\sqrt 2 r} \over {w(z)}}} \right)\;\;L_p^{(l)}\left({{{2{r^2}} \over {w{{(z)}^2}}}} \right)\exp \left({- {\rm{i}}\;k{{{r^2}} \over {2q(z)}} + {\rm{i}}l\phi} \right),}\\ \end{array}$$
(146)

as the Gouy phase in the tangential plane (and is similarly defined in the sagittal plane).

7.9 Laguerre-Gauss modes

Laguerre-Gauss modes are another complete set of functions, which solve the paraxial wave equation. They are defined in cylindrical coordinates and can have advantages over Hermite-Gauss modes in the presence of cylindrical symmetry. More recently, Laguerre-Gauss modes are being investigated in a different context: using a pure higher-order Laguerre-Gauss mode instead of the fundamental Gaussian beam can significantly reduce the impact of mirror thermal noise on the sensitivity of gravitational wave detectors [54, 12]. Laguerre-Gauss modes are commonly given as [50]

$$L_p^{(l)}(x) = {1 \over {p!}}\sum\limits_{j = 0}^p {{{p!} \over {j!}}} \left({\begin{array}{*{20}c} {l + p} \\ {p - j} \\ \end{array}} \right){(- x)^j}.$$
(147)

with r, ϕ and z as the cylindrical coordinates around the optical axis. The letter p is the radial mode index, l the azimuthal mode indexFootnote 8 and \(L_p^{(l)}(x)\) are the associated Laguerre polynomials:

$$\begin{array}{*{20}c} {u_{p,l}^{{\rm{alt}}}(r, \phi, z) = {2 \over {w(z)}}\sqrt {{{p!} \over {(1 + {\delta _{0l}}\pi (\vert l\vert + p)!}}} \exp ({\rm{i}}(2p + \vert l\vert + 1)\Psi (z))}\qquad\quad\quad\\ {\times \left({{{\sqrt 2 r} \over {w(z)}}} \right)\;\;L_p^{(l)}\left({{{2{r^2}} \over {w{{(z)}^2}}}} \right)\exp \left({- {\rm{i}}\;k{{{r^2}} \over {2q(z)}}} \right)\cos (l\phi){.}}\\\end{array}$$
(148)

All other parameters (w(z), q(z),…) are defined as above for the Hermite-Gauss modes.

The dependence of the Laguerre modes on ϕ as given in Equation (146) results in a spiraling phase front, while the intensity pattern will always show unbroken concentric rings; see Figure 45. These modes are also called helical Laguerre-Gauss modes because of the their special phase structure.

Figure 45
figure 45

These plots show the amplitude distribution and wave front (phase distribution) of helical Laguerre-Gauss modes upl. All plots refer to a beam with λ = 1 pm, ω = 1 mm and distance to waist z = 1 m.

The reader might be more familiar with a slightly different type of Laguerre modes (compare Figure 46 and Figure 47) that features dark radial lines as well as dark concentric rings. Mathematically, these can be described simply by replacing the phase factor exp(i ) in Equation (146) by a sine or cosine function. For example, an alternative set of Laguerre-Gauss modes is given by [55]

$$u_{n,m}^{LG}(x,y,z) = \sum\limits_{k = 0}^N {{{\rm{i}}^k}b(n,m,k)u_{N - k,k}^{HG}(x,y,z),}$$
(149)

This type of mode has a spherical phase front, just as the Hermite-Gauss modes. We will refer to this set as sinusoidal Laguerre-Gauss modes throughout this document.

Figure 46
figure 46

Intensity profiles for helical Laguerre-Gauss modes upl. The u00 mode is identical to the Hermite-Gauss mode of order 0. Higher-order modes show a widening of the intensity and decreasing peak intensity. The number of concentric dark rings is given by the radial mode index p.

Figure 47
figure 47

Intensity profiles for sinusoidal Laguerre-Gauss modes \(u_{pl}^{{\rm{alt}}}\). The up0 modes are identical to the helical modes. However, for azimuthal mode indices l > 0 the pattern shows l dark radial lines in addition to the p dark concentric rings.

For the purposes of simulation it can be sometimes useful to decompose Laguerre-Gauss modes into Hermite-Gauss modes. The mathematical conversion for helical modes is given as [7, 1]

$$b(n,m,k) = \sqrt {{{(N - k)!k!} \over {{2^N}n!m!}}} {1 \over {k!}}{({\partial _t})^k}{[{(1 - t)^n}{(1 + t)^m}]_{t = 0}},$$
(150)

with real coefficients

$$P_n^{\alpha, \beta}(x) = {{{{(- 1)}^n}} \over {{2^n}n!}}{(1 - x)^{- \alpha}}{(1 + x)^{- \beta}}{({\partial _x})^n}{(1 - x)^{\alpha + n}}{(1 + x)^{\beta + n}},$$
(151)

if N = n + m. This relates to the common definition of Laguerre modes as upl as follows: p = min(n, m) and l = nm. The coefficients h(n, m, k) can be computed numerically by using Jacobi polynomials. Jacobi polynomials can be written in various forms:

$$P_n^{\alpha, \beta}(x) = {1 \over {{2^n}}}\sum\limits_{j = 0}^n {\left({\begin{array}{*{20}c} {n + \alpha}\\ {\;\;j}\\ \end{array}} \right)\left({\begin{array}{*{20}c} {n + \beta}\\ {n - j}\\ \end{array}} \right)} {(x - 1)^{n - j}}{(1 + x)^j},$$
(152)

or which leads to

$$b(n,m,k) = \sqrt {{{(N - k)!k!} \over {{2^N}n!m!}}} {(- 2)^k}P_k^{n - k,m - k}(0).$$
(153)

7.10 Tracing a Gaussian beam through an optical system

Whenever Gauss modes are used to analyse an optical system, the Gaussian beam parameters (or equivalent waist sizes and locations) must be defined for each location at which field amplitudes are to be computed (or at which coupling equations are to be defined). In our experience the quality of a computation or simulation and the correctness of the results depend critically on the choice of these beam parameters. One might argue that the choice of a basis should not alter the result. This is correct, but there is a practical limitation: the number of modes having non-negligible power might become very large if the beam parameters are not optimised, so that in practice a good set of beam parameters is usually required.

In general, the Gaussian beam parameter of a mode is changed at every optical surface in a well-defined way (see Section 7.11). Thus, a possible method of finding reasonable beam parameters for every location in the interferometer is to first set only some specific beam parameters at selected locations and then to derive the remaining beam parameters from these initial ones: usually it is sensible to assume that the beam at the laser source can be properly described by the (hopefully known) beam parameter of the laser’s output mode. In addition, in most stable cavities the light fields should be described by using the respective cavity eigenmodes. Then, the remaining beam parameters can be computed by tracing the beam through the optical system. ‘Trace’ in this context means that a beam starting at a location with an already-known beam parameter is propagated mathematically through the optical system. At every optical element along the path the beam parameter is transformed according to the ABCD matrix of the element (see below).

7.11 ABCD matrices

The transformation of the beam parameter can be performed by the ABCD matrix-formalism [34, 50]. When a beam passes an optical element or freely propagates though space, the initial beam parameter q1 is transformed into q2. This transformation can be described by four real coefficients as follows:

$${{{q_2}} \over {{n_2}}} = {{A{{{q_1}} \over {{n_1}}} + B} \over {C{{{q_1}} \over {{n_1}}} + D}},$$
(154)

with the coefficient matrix

$$M\left({\begin{array}{*{20}c} A & B\\ C & D\\ \end{array}} \right),$$
(155)

n1 being the index of refraction at the beam segment defined by q1, and n2 the index of refraction at the beam segment described by q2. ABCD matrices for some common optical components are given below, for the sagittal and tangential plane.

7.11.1 Transmission through a mirror:

A mirror in this context is a single, partly-reflecting surface with an angle of incidence of 90°. The transmission is described by with RC being the radius of curvature of the spherical surface. The sign of the radius is defined such that RC is negative if the centre of the sphere is located in the direction of propagation. The curvature shown above (in Figure 48), for example, is described by a positive radius. The matrix for the transmission in the opposite direction of propagation is identical.

figure 48

Figure 48

figure 49

Figure 49

7.11.2 Reflection at a mirror:

The matrix for reflection is given by The reflection at the back surface can be described by the same type of matrix by setting C = 2n2/RC.

7.11.3 Transmission through a beam splitter:

A beam splitter is understood as a single surface with an arbitrary angle of incidence α1. The matrices for transmission and reflection are different for the sagittal and tangential planes (Ms and Mt): with α2 given by Snell’s law:

$${n_1}\sin ({\alpha _1}) = {n_2}\sin ({\alpha _2}),$$
(156)

and Δn by

$$\Delta n = {{{n_2}\cos ({\alpha _2}) - {n_1}\cos ({\alpha _1})} \over {\cos ({\alpha _1})\cos ({\alpha _2})}}.$$
(157)

if the direction of propagation is reversed, the matrix for the sagittal plane is identical and the matrix for the tangential plane can be obtained by changing the coefficients A and D as follows:

$$\begin{array}{*{20}c} {A \rightarrow 1/A,} \\ {D \rightarrow 1/D.} \\ \end{array}$$
(158)
figure 50

Figure 50

7.11.4 Reflection at a beam splitter:

The reflection at the front surface of a beam splitter is given by: To describe a reflection at the back surface the matrices have to be changed as follows:

$$\begin{array}{*{20}c} {{R_C} \rightarrow - {R_C},}\\ {{n_1} \rightarrow {n_2},}\quad\\ {{\alpha _1}\rightarrow - {\alpha _2}.}\\ \end{array}$$
(159)
figure 51

Figure 51

7.11.5 Transmission through a thin lens:

A thin lens transforms the beam parameter as follows: where f is the focal length. The matrix for the opposite direction of propagation is identical. Here it is assumed that the thin lens is surrounded by ‘spaces’ with index of refraction n = 1.

figure 52

Figure 52

7.11.6 Transmission through a free space:

As mentioned above, the beam in free space can be described by one base parameter q0. In some cases it is convenient to use a matrix similar to that used for the other components to describe the z-dependency of q(z) = q0 + z. On propagation through a free space of the length L and index of refraction n, the beam parameter is transformed as follows. The matrix for the opposite direction of propagation is identical.

figure 53

Figure 53

8 Interferometer Matrix with Hermite-Gauss Modes

In the plane-wave analysis Section 1.4, a laser beam is described in general by the sum of various frequency components of its electric field

$$E(t,z) = \sum\limits_j {{a_j}\;\exp \left({{\rm{i}}({\omega _j}t - {k_j}z)} \right).}$$
(160)

Here we include the geometric shape of the beam by describing each frequency component as a sum of Hermite-Gauss modes:

$$E(t,x,y,z) = \sum\limits_j {\sum\limits_{n,m} {{a_{jnm}}\;{u_{nm}}(x,y)\;\exp ({\rm{i}}({\omega _j}t - {k_j}z))}.}$$
(161)

The shape of such a beam does not change along the z-axis (in the paraxial approximation). More precisely, the spot size and the position of the maximum intensity with respect to the z-axis may change, but the relative intensity distribution across the beam does not change its shape. Each part of the sum may be treated as an independent field that can be described using the equation for plane-waves with only two exceptions:

  • the propagation through free space has to include the Gouy phase shift, and

  • upon reflection or transmission at a mirror or beam splitter the different Hermite-Gauss modes may be coupled (see below).

The Gouy phase shift can be included in the simulation in several ways. For example, for reasons of flexibility the Gouy phase has been included in Finesse as a phase shift of the component space.

8.1 Coupling of Hermite-Gauss modes

Let us consider two different cavities with different sets of eigenmodes. The first set is characterised by the beam parameter q1 and the second by the parameter q2. A beam with all power in the fundamental mode u00(q1) leaves the first cavity and is injected into the second. Here, two ‘misconfigurations’ are possible:

  • if the optical axes of the beam and the second cavity do not overlap perfectly, the setup is called misaligned,

  • if the beam size or shape at the second cavity does not match the beam shape and size of the (resonant) fundamental eigenmode (q1(zcav) ≠ q2(zcav)), the beam is then not mode-matched to the second cavity, i.e., there is a mode mismatch.

The above misconfigurations can be used in the context of simple beam segments. We consider the case in which the beam parameter for the input light is specified. Ideally, the ABCD matrices then allow one to trace a beam through the optical system by computing the proper beam parameter for each beam segment. In this case, the basis system of Hermite-Gauss modes is transformed in the same way as the beam, so that the modes are not coupled.

For example, an input beam described by the beam parameter q1 is passed through several optical components, and at each component the beam parameter is transformed according to the respective ABCD matrix. Thus, the electric field in each beam segment is described by Hermite-Gauss modes based on different beam parameters, but the relative power between the Hermite-Gauss modes with different mode numbers remains constant, i.e., a beam in a u00 mode is described as a pure u00 mode throughout the entire system.

In practice, it is usually impossible to compute proper beam parameter for each beam segment as suggested above, especially when the beam passes a certain segment more than once. A simple case that illustrates this point is reflection at a spherical mirror. Let the input beam be described by q1. From Figure 49 we know that the proper beam parameter of the reflected beam is

$${q_2} = {{{q_1}} \over {- 2{q_1}/{R_C} + 1}},$$
(162)

with RC being the radius of curvature of the mirror. In general, we get q1q2 and thus two different ‘proper’ beam parameters for the same beam segment. Only one special radius of curvature would result in matched beam parameters (q1 = q2).

8.2 Coupling coefficients for Hermite-Gauss modes

Hermite-Gauss modes are coupled whenever a beam is not matched to a cavity or to a beam segment or if the beam and the segment are misaligned. This coupling is sometimes referred to as ‘scattering into higher-order modes’ because in most cases the laser beam is a considered as a pure TEM00 mode and any mode coupling would transfer power from the fundamental into higher-order modes. However, in general, every mode with non-zero power will transfer energy into other modes whenever mismatch or misalignment occur, and this effect also includes the transfer from higher orders into a low order.

To compute the amount of coupling the beam must be projected into the base system of the cavity or beam segment it is being injected into. This is always possible, provided that the paraxial approximation holds, because each set of Hermite-Gauss modes, defined by the beam parameter at a position z, forms a complete set. Such a change of the basis system results in a different distribution of light power in the new Hermite-Gauss modes and can be expressed by coupling coefficients that yield the change in the light amplitude and phase with respect to mode number.

Let us assume that a beam described by the beam parameter q1 is injected into a segment described by the parameter q2. Let the optical axis of the beam be misaligned: the coordinate system of the beam is given by (x, y, z) and the beam travels along the z-axis. The beam segment is parallel to the z′-axis and the coordinate system (x′, y′, z′) is given by rotating the (x, y, z) system around the y-axis by the misalignment angle γ. The coupling coefficients are defined as

$${u_{nm}}({q_1})\exp \left({{\rm{i}}(\omega t - kz)} \right) = \sum\limits_{n^{\prime},m^{\prime}} {{k_{n,m,n^{\prime},m^{\prime}}}} {u_{n^{\prime}m^{\prime}}}({q_2})\exp \left({{\rm{i}}(\omega t - kz^{\prime})} \right),$$
(163)

where unm(q1) are the Hermite-Gauss modes used to describe the injected beam and \({u_{{n^{\prime}}\,{m^{\prime}}}}({q_2})\) are the ‘new’ modes that are used to describe the light in the beam segment. Note that including the plane wave phase propagation within the definition of coupling coefficients is very important because it results in coupling coefficients that are independent of the position on the optical axis for which the coupling coefficients are computed.

Using the fact that the Hermite-Gauss modes unm are orthonormal, we can compute the coupling coefficients by the convolution [6]

$${k_{n,m,n^{\prime},m^{\prime}}} = \exp \left({{\rm{i}}2kz^{\prime}\sin^{2} \left({{\gamma \over 2}} \right)} \right)\int {\int {dx^{\prime}} dy^{\prime}} \;{u_{n^{\prime}m^{\prime}}}\exp ({\rm{i}}\;kx^{\prime}\sin \gamma)u_{nm}^{\ast}.$$
(164)

Since the Hermite-Gauss modes can be separated with respect to x and y, the coupling coefficients can also be split into \({k_{nm{n^{\prime}}{m^{\prime}}}} = {k_{n{n^{\prime}}}}{k_{m{m^{\prime}}}}\). These equations are very useful in the paraxial approximation as the coupling coefficients decrease with large mode numbers. In order to be described as paraxial, the angle γ must not be larger than the diffraction angle. In addition, to obtain correct results with a finite number of modes the beam parameters q1 and q2 must not differ too much.

The convolution given in Equation (164) can be computed directly using numerical integration. However, this is computationally very expensive. The following is based on the work of Bayer-Helms [6]. Another very good description of coupling coefficients and their derivation can be found in the work of Vinet [55]. In [6] the above projection integral is partly solved and the coupling coefficients are given by simple sums as functions of γ and the mode mismatch parameter K, which are defined by

$$K = {1 \over 2}({K_0} + {\rm{i}}\;{K_2}),$$
(165)

where K0 = (zRz′R)/z′R and K2 = ((zz0) − (z′z′R))/zRR. This can also be written using q = i zr + zz0, as

$$K = {{{\rm{i}}(q - q^{\prime})^\ast} \over {2\Im \{q^{\prime}\}}}.$$
(166)

The coupling coefficients for misalignment and mismatch (but no lateral displacement) can then be written as

$${k_{nn^{\prime}}} = {(- 1)^{n^{\prime}}}{E^{(x)}}{(n!n^{\prime}!)^{1/2}}{(1 + {K_0})^{n/2 + 1/4}}{(1 + K^\ast)^{- (n + n^{\prime} + 1)/2}}\{{S_g} - {S_u}\},$$
(167)

where

$$\begin{array}{*{20}c} {{S_g} = \sum\limits_{\mu = 0}^{[n/2]} {\sum\limits_{\mu^{\prime} = 0}^{[n^{\prime}/2]} {{{{{(- 1)}^\mu}{{\bar X}^{n - 2\mu}}{X^{n^{\prime}- 2\mu^{\prime}}}} \over {(n - 2\mu)!(n^{\prime} - 2\mu^{\prime})!}}\sum\limits_{\sigma = 0}^{\min (\mu, \mu^{\prime})} {{{{{(- 1)}^\sigma}{{\bar F}^{\mu - \sigma}}{F^{\mu^{\prime} - \sigma}}} \over a}(2\sigma)!(\mu - \sigma)!(\mu^{\prime} - \sigma)!,}}}}\\ {{S_u} = \sum\limits_{\mu = 0}^{[(n - 1)/2]} {\sum\limits_{\mu^{\prime} = 0}^{[(n^{\prime} - 1)/2]} {{{{{(- 1)}^\mu}{{\bar X}^{n - 2\mu - 1}}{X^{n^{\prime} - 2\mu^{\prime} - 1}}} \over {(n - 2\mu - 1)!(n^{\prime} - 2\mu^{\prime} - 1)!}}\sum\limits_{\sigma = 0}^{\min (\mu, \mu^{\prime})} {{{{{(- 1)}^\sigma}{{\bar F}^{\mu - \sigma}}{F^{\mu^{\prime} - \sigma}}} \over {(2\sigma + 1)!(\mu - \sigma)!(\mu^{\prime} - \sigma)!}}.}}}}\\ \end{array}$$
(168)

The corresponding formula for \({k_{m{m^{\prime}}}}\) can be obtained by replacing the following parameters: nm, n′ → m′, X, \(\bar X \rightarrow 0\) and E(x) → 1 (see below). The notation [n/2] means

$$\left[ {{m \over 2}} \right] = \left\{{\begin{array}{*{20}c} {m/2}\quad & {{\rm{if}}\;m\;{\rm{is}}\;{\rm{even}},} \\ {(m - 1)/2} & {{\rm{if}}\;m\;{\rm{is}}\;{\rm{odd}}.} \\ \end{array}} \right.$$
(169)

The other abbreviations used in the above definition are

$$\begin{array}{*{20}c} {\bar X = ({\rm{i}}z_{\rm{R}}^\prime - {z^\prime})\sin (\gamma)/(\sqrt {1 + K^\ast} {w_0}),}\\ {X = ({\rm{i}}{z_{\rm{R}}} - {z^\prime})\sin (\gamma)/(\sqrt {1 + K^\ast} {w_0}),}\\ {F = K/(2(1 + {K_0}))},\\ {\bar F = K^\ast/2,}\\ {{E^{(x)}} = \exp \left({- {{X\bar X} \over 2}} \right).}\\ \end{array}$$
(170)

In general, the Gaussian beam parameter might be different for the sagittal and tangential planes and a misalignment can be given for both possible axes (around the y-axis and around the x-axis), in this case the coupling coefficients are given by

$${k_{nmm^{\prime}n^{\prime}}} = {k_{nn^{\prime}}}{k_{mm^{\prime}}},$$
(171)

where \({k_{n{n^{\prime}}}}\) is given above with

$$\begin{array}{*{20}c} {q \rightarrow {q_t}\quad \quad \quad}\\ {{\rm{and}}\quad \quad \quad \quad}\\ {{w_0} \rightarrow {w_{t,0}},{\rm{etc}}.}\\ \end{array}$$
(172)

and γγy is a rotation about the y-axis. The \({k_{m{m^{\prime}}}}\) can be obtained with the same formula, with the following substitutions:

$$\begin{array}{*{20}c} {n \rightarrow m,\quad \quad \quad}\\ {{n^\prime} \rightarrow {m^\prime},\quad \quad \quad}\\ {q \rightarrow {q_s},\quad \quad \quad}\\ {{\rm{thus}}\quad \quad \quad \quad \;}\\ {{w_0} \rightarrow {w_{s,0}},{\rm{etc}}.}\\\end{array}$$
(173)

and γγx is a rotation about the x-axis.

At each component a matrix of coupling coefficients has to be computed for transmission and reflection; see Figure 54.

Figure 54
figure 54

Coupling coefficients for Hermite-Gauss modes: for each optical element and each direction of propagation complex coefficients k for transmission and reflection have to be computed. In this figure k1, k2, k3, k4 each represent a matrix of coefficients \({k_{nm{n^{\prime}}{m^{\prime}}}}\) describing the coupling of un,m into \({u_{{n^{\prime}},{m^{\prime}}}}\).

8.3 Finesse examples

8.3.1 Beam parameter

This example illustrates a possible use of the beam parameter detector ‘bp’: the beam radius of the laser beam is plotted as a function of distance to the laser. For this simulation, the interferometer matrix does not need to be solved. ‘bp’ merely returns the results from the beam tracing algorithm of Finesse.

Figure 55
figure 55

Finesse example: Beam parameter

Finesse input file for ‘Beam parameter’

figure Graphic10

8.3.2 Mode cleaner

This example uses the ‘tem’ command to create a laser beam which is a sum of equal parts in u00 and u10 modes. This beam is passed through a triangular cavity, which acts as a mode cleaner. Being resonant for the u00, the cavity transmits this mode and reflects the u10 mode as can be seen in the resulting plots.

Finesse input file for ‘Mode cleaner’

figure Graphic11
figure Graphic12
Figure 56
figure 56

Finesse example: Mode cleaner

8.3.3 LG33 mode

Finesse uses the Hermite-Gauss modes as a base system for describing the spatial properties of laser beams. However, Laguerre-Gauss modes can be created using the coefficients given in Equation (149). This example demonstrates this and the use of a ‘beam’ detector to plot amplitude and phase of a beam cross section.

Figure 57
figure 57

Finesse example: LG33 mode. The ring structure in the phase plot is due to phase jumps, which could be removed by applying a phase ‘unwrap’.

Finesse input file for ‘LG33 mode’

figure Graphic13