Zum Hauptinhalt springen

CFL Optimized Forward-Backward Runge-Kutta Schemes for the Shallow Water Equations

Lilly, Jeremy R. ; Engwirda, Darren ; et al.
2023
Online report

CFL Optimized Forward–Backward Runge–Kutta Schemes for the Shallow-Water Equations 

We present the formulation and optimization of a Runge–Kutta-type time-stepping scheme for solving the shallow-water equations, aimed at substantially increasing the effective allowable time step over that of comparable methods. This scheme, called FB-RK(3,2), uses weighted forward–backward averaging of thickness data to advance the momentum equation. The weights for this averaging are chosen with an optimization process that employs a von Neumann–type analysis, ensuring that the weights maximize the admittable Courant number. Through a simplified local truncation error analysis and numerical experiments, we show that the method is at least second-order in time for any choice of weights and exhibits low dispersion and dissipation errors for well-resolved waves. Further, we show that an optimized FB-RK(3,2) can take time steps up to 2.8 times as large as a popular three-stage, third-order strong stability-preserving Runge–Kutta method in a quasi-linear test case. In fully nonlinear shallow-water test cases relevant to oceanic and atmospheric flows, FB-RK(3,2) outperforms SSPRK3 in admittable time step by factors roughly between 1.6 and 2.2, making the scheme approximately twice as computationally efficient with little to no effect on solution quality. Significance Statement: The purpose of this work is to develop and optimize time-stepping schemes for models relevant to oceanic and atmospheric flows. Specifically, for the shallow-water equations we optimize for schemes that can take time steps as large as possible while retaining solution quality. We find that our optimized schemes can take time steps between 1.6 and 2.2 times larger than schemes that cost the same number of floating point operations, translating directly to a corresponding speedup. Our ultimate goal is to use these schemes in climate-scale simulations.

Keywords: Shallow-water equations; Numerical analysis/modeling; Model evaluation/performance

1. Introduction

Large-scale climate models, such as the U.S. Department of Energy's Energy Exascale Earth System Model (E3SM) ([13]), grow in complexity every year as new numerical techniques are developed and as high-performance computing (HPC) systems become more powerful. Complex, climate-scale simulations can take days or weeks or more of real time to complete; it is vital that these run times are controlled without making undue sacrifices to overall model quality. In the present work, we develop a way to formulate and optimize a class of time-stepping schemes for the shallow-water equations (SWEs). Our ultimate goal is to improve the efficiency of simulations at the climate scale, but here we focus on the shallow-water equations as a starting point. Although this work is directed to models of ocean and atmosphere circulation, the ideas and methods developed here may generalize to other coupled systems.

The central issue here is the following. The Courant–Friedrichs–Lewy (CFL) condition is a well-known necessary condition for the stability of a given model under an explicit time-stepping scheme. In essence, the CFL condition places a restriction on the size of the model's time step, where this restriction depends on the model problem itself and the chosen time and space discretizations. Formally, it is necessary for stability that

(1) ν=cΔtΔxνmax,

where c is a speed such as a gravity wave speed. We refer to ν as the Courant number and νmax as the maximal Courant number for the given situation. In this work, we develop time-stepping schemes optimized to have maximal νmax. We do this by starting with a three-stage second-order Runge–Kutta method from [36], presently used in the Model for Prediction Across Scales-Atmosphere (MPAS-A) ([33]). This scheme is augmented by adding forward–backward averaging of the thickness tendencies used to update the fluid velocity. Resulting schemes have large νmax and good dispersion and dissipation relations, and they compare well against other schemes in both solution accuracy and effective maximal Courant number.

The SWEs serve as a standard test problem for algorithms and methods designed for applications to ocean and atmosphere modeling; in both cases, the model equations used for climate-scale simulations are generalizations of the SWEs, in which additional vertical tendencies and forcing terms are included, and variable density is allowed. In recent years, strong stability preserving Runge–Kutta (SSPRK) methods ([14]) for solving the SWEs have become quite popular; they have the property that at each time level, a spatial norm of the solution is less than or equal to the norm of the solution at the preceding level. These methods are used to solve the SWEs in a variety of numerical paradigms and applications, such as with finite element spatial discretizations ([2]; [7]; [11]), finite volume discretizations ([3]; [4]; [15]; [17]; [23]; [30]; [34]), and layered models that use a time splitting of barotropic and baroclinic motions ([16]; [19]). Despite this widespread use, strong stability preserving (SSP) time-integration methods may not be the best choice for most large-scale geophysical SWE applications. The SSP property is useful in systems where strong shocks and discontinuities can develop and propagate, but such dynamics are rare in climate-scale oceanic and atmospheric flows, in which barotropic evolution is typically characterized by Rossby- and Kelvin-type wave modes. Numerous contemporary large-scale ocean and atmosphere models employ time-stepping methods that do not have the SSP property, motivating an investigation of its optimality for this class of flow. For example, to solve the barotropic equations in split barotropic/baroclinic systems, the Model for Prediction Across Scales-Ocean (MPAS-O) ([29]; [26]) and Model for Prediction Across Scales-Atmosphere (MPAS-A) ([33]) use a two-stage predictor-corrector method and a three-stage second-order Runge–Kutta method, respectively. The Community Atmosphere Model (CAM) ([6]), the Parallel Ocean Program (POP) ([18]), the Nucleus for European Modeling of the Ocean (NEMO) ([24]), and the Hybrid Coordinate Ocean Model (HYCOM) ([5]) all use variations of semi-implicit leapfrog time integration schemes for their barotropic systems. The Regional Ocean Modeling System (ROMS) ([31]) uses a multistep forward–backward Adams–Bashforth/Adams–Moulton method.

With the potential performance benefits of the ability to take larger time steps in mind, we turn our attention toward methods that are optimal in this sense. Taking advantage of the structure of the SWEs, we employ a strategy similar to that used by [25], wherein the thickness equation is advanced in time with a simple forward Euler step, then the momentum equation is advanced with an explicit backward Euler step using the recently obtained data for the thickness. The method, referred to by Mesinger as the forward–backward method, was more computationally efficient than popular leapfrog methods of the time. The idea of advancing the variables of a system of ODE separately, using the most recently computed data from previously advanced variables, has been employed by the developers of ROMS. Specifically, [31], [32]) present an extension of a two-stage second-order Runge–Kutta scheme that, within each RK stage, first advances the thickness variable, then uses a weighted average of the most recently computed data with previously computed data to advance the momentum equation in a shallow-water system. The coefficients of this scheme are selected to optimize the allowable time-step length for a one-dimensional gravity wave system based on a von Neumann–type stability analysis. In this work, we develop a new, forward–backward weighted three-stage Runge–Kutta method with favorable CFL limits using a similar approach. The optimal weights for our forward–backward scheme are obtained by von Neumann–type analysis on the two-dimensional system, including a linear Coriolis term and linearization of the advection operators about a nonzero mean state. Using a numerical optimization algorithm, we obtain a scheme that is very efficient in terms of its maximal Courant number, as well as its dissipation and dispersion characteristics.

The paper is structured as follows. We begin by introducing our new time-stepping scheme, referred to as FB-RK(3,2), for a general system of ODEs. Then, we describe the process by which we obtain optimal weights for the FB averaging within each RK stage. This process includes the derivation of a von Neumann formulation of the linearized SWEs with a nonzero mean flow and the formulation of an optimization problem. Next, we present the results of this optimization and give a brief discussion of the dispersion and dissipation errors of the optimal FB-RK(3,2) schemes. Finally, we perform a number of numerical experiments that demonstrate the computational efficiency and efficacy of the optimal FB-RK(3,2) schemes as compared to the three-stage third-order SSPRK scheme (SSPRK3) from [14]. These experiments include an investigation of CFL performance across five shallow-water test cases, convergence tests, and comparisons of solution quality. We show that an optimized FB-RK(3,2) scheme can take time steps up to 2.2 times as large as SSPRK3 in nonlinear test cases while retaining solution quality, and because both schemes consist of three RK stages, this translates directly to a corresponding increase in efficiency.

2. CFL optimization of FB-RK(3,2)

Here, we introduce a time-stepping scheme for solving the shallow-water equations (SWEs) that uses weighted forward–backward averaging in the layer thickness data to advance the momentum equation. We then use a von Neumann analysis and a numerical optimization scheme to tune the forward–backward weights so that the scheme has a large Courant number when applied to the linearized SWEs.

a. FB-RK(3,2)

The time-stepping scheme presented here is an extension of the three-stage, second-order Runge–Kutta time-stepping scheme RK(3,2) from [36]. RK(3,2) was originally presented as a third-order method and referred to as RK3, but it is shown by [27] that the method is third-order only on linear problems, reducing to second-order on nonlinear problems. When applied to the SWEs, our extension of RK(3,2) allows us to use the most recently obtained data for the layer thickness to update the momentum data within each Runge–Kutta stage of the method. This is done by taking a weighted average of layer thickness data at the old time level tn and the most recent RK stage, then applying this to the momentum equation. For the remainder of the manuscript, we will call our extension of RK(3,2) that uses forward–backward averaging FB-RK(3,2), as it is a three-stage, second-order scheme.

Consider a general system of ODEs in independent variables u = u(t) and h = h(t) of the following form:

(2) dudt=Φ(u,h),dhdt=Ψ(u,h),

where t is the time coordinate. Let unu(tn) and hnh(tn) hnh(tn) be the numerical approximations to u and h at time t = tn. Let Δt be a time step such that tn+1 = tn + Δt. Also, let tn+1/m=tn+(Δt/m) for any positive integer m. Then, FB-RK(3,2) is given by

(3a) h¯n+1/3=hn+Δt3Ψ(un,hn),u¯n+1/3=un+Δt3Φ(un,h*),h*=β1h¯n+1/3+(1β1)hn,

(3b) h¯n+1/2=hn+Δt2Ψ(u¯n+1/3,h¯n+1/3),u¯n+1/2=un+Δt2Φ(u¯n+1/3,h**),h**=β2h¯n+1/2+(1β2)hn,

(3c) hn+1=hn+ΔtΨ(u¯n+1/2,h¯n+1/2),un+1=un+ΔtΦ(u¯n+1/2,h***),h***=β3hn+1+(12β3)h¯n+1/2+β3hn

The weights β1, β2, and β3 are called the forward–backward (FB) weights. The best choice for the FB weights βi is a primary concern of this work.

The choice to calculate data at time tn+1/3 and tn+1/2 in the first and second stages comes from the original RK(3,2) scheme; the treatment of the thickness variable h in (3) shows the formulation of RK(3,2). FB-RK(3,2) can be approximately reduced to RK(3,2) by taking β1 = 0, β2=2/3 , and β3 = 0. This produces an approximation to RK(3,2) as, while the first and third stages are exact, the FB averaged h**=(2/3)h¯n+1/2+(1/3)hn data in the second stage is an approximation to h¯n+1/3 . The FB averages are chosen in this way so that in each stage, data are symmetrically distributed in time as preliminary experiments showed that this provided the best CFL performance. In the first two RK stages, the FB averaging is done on data from the beginning and the end of the current time interval, i.e., in the first stage, we compute values for h and u at time tn+1/3, so the FB averaging is done with data at time tn and time tn+1/3. In the third RK stage, we use data from the beginning, middle, and end of the current time interval.

In the context of this work, h would be the fluid layer thickness and u the fluid velocity. We have chosen to use an FB average of thickness to advance the velocity variable. However, one could reasonably define a scheme similar to that given by (3) by taking FB averages of u to advance h instead. The subsequent analysis would remain largely the same, but the optimal choices for the FB weights could change substantially, along with the performance of the scheme as a whole. What happens in this case is an open question.

As stated above, the original RK(3,2) scheme is O[(Δt)2] . For any choice of the FB weights, FB-RK(3,2) remains O[(Δt)2] while greatly increasing the maximum allowable time step for most problems. A generalized local truncation error analysis is outside the scope of this paper, so we show this holds for a particular problem of interest in section a of the appendix.

b. Linearized shallow-water equations

A primary concern of this work is the application of FB-RK(3,2) to the nonlinear SWEs on a rotating sphere. Application of a von Neumann–type analysis requires a linear set of PDEs, and we consider the two-dimensional system with linear Coriolis and linearized advection operators to ensure the time-step optimization is conducted on a system that is a close approximation to the fully nonlinear equations. The linearized SWEs centered about a nonzero mean flow are given by

(4) u˜t+f(U˜+u˜)+ζ˜U˜=U˜u˜gh˜,h˜t+H(u˜)+(h˜U˜)=0,

where u˜=[u˜(x,y,t),υ˜(x,y,t)] is a perturbation of the horizontal fluid velocity, x and y are the spatial coordinates, t is the time coordinate, f = 2Ω sinφ is the Coriolis parameter, where Ω is the angular velocity of the rotating sphere and φ denotes latitude, U˜=(U˜,V˜) is the constant horizontal mean fluid velocity, u˜=(υ˜,u˜) , ζ˜=(υ˜/x)(u˜/y) is the vertical component of vorticity ×u˜ , g is the gravitational constant, h˜=h˜(x,y,t) is a perturbation of the layer thickness, and H is the (constant in space) fluid layer thickness when at rest. Note that while u˜ and h˜ depend on x and y throughout this work, we will often suppress this dependence in the notation. The full derivation of these equations is given in section b of the appendix.

We can write these in terms of dimensionless quantities by writing c=gH,u=(u,υ)=(u˜/c),U=(U˜/c) , and η=h˜/H . Then, the dimensionless, linearized SWEs centered about a nonzero mean flow are given by

(5) ut+f(U+u)+cζU=cUucη,ηt+c(u)+c[(ηU)]=0.

c. von Neumann analysis

In the analysis that follows, we assume that the system (5) is discretized on a rectangular, staggered Arakawa C-grid, where η is computed at cell centers, u is computed at the left and right edges of cells, and υ is computed at the top and bottom edges of cells ([1]). For brevity, we skip some details of calculations relating to the spatial discretization in the main text and instead provide them in section c of the appendix.

With the goal of using a von Neumann–type analysis to obtain an optimal choice of FB weights, we apply a Fourier transform in x and y to the linearized SWEs (5). This is equivalent to seeking solutions of the following form:

u(x,y,t)=u^(k,l,t)eikx+ily,υ(x,y,t)=υ^(k,l,t)eikx+ily,η(x,y,t)=η^(k,l,t)eikx+ily,

where k is the wavenumber with respect to x, and l is the wavenumber with respect to y. In doing this, spatial derivatives are now all of the form (/x)(eikx+ily) or (/y)(eikx+ily) . Assuming a rectangular C-grid, using centered differences these spatial derivatives become

(6) x(eikx+ily)=i2sin(kΔx2)Δxeikx+ily,

(7) y(eikx+ily)=i2sin(lΔy2)Δyeikx+ily.

To compute the Coriolis terms, we need the values of u at υ points and the values of υ at u points; call these uυ and υu, respectively. On a rectangular C-grid, we achieve this with unweighted four-point averages of the nearest data, which gives

(8) uυ(x,y,t)=cos(kΔx2)cos(lΔy2)u^(k,l,t)eikx+ily,

(9) υu(x,y,t)=cos(kΔx2)cos(lΔy2)υ^(k,l,t)eikx+ily.

The details of these calculations are given in section c of the appendix. Set K=2sin[k(Δx/2)],L=2sin[l(Δy/2)] , and ψ=fcos[k(Δx/2)]cos[l(Δy/2)] . Then, accounting for the Fourier transform and the C-grid spatial discretization, (5) becomes

(10) u^t=fV+ψυ^(iUcKΔx+iVcLΔy)u^icKΔxη^,υ^t=fUψu^(iUcKΔx+iVcLΔy)υ^icLΔyη^,η^t=(icKΔxu^+icLΔyυ^)(iUcKΔx+iVcLΔy)η^

Let νx=c(Δt/Δx) and νy=c(Δt/Δy) be the Courant numbers with respect to x and y, respectively. Also, set φ = Δ. Then, we apply FB-RK(3,2) to (10):

(11a) η^¯n+1/3=η^n13[iKνxu^n+iLνyυ^n+(iUKνx+iVLνy)η^n],u^¯n+1/3=u^n+13[ΔtfV+φυ^n(iUKνx+iVLνy)u^niKνxη^*],υ^¯n+1/3=υ^n+13[ΔtfUφu^n(iUKνx+iVLνy)υ^niKνxη^*],η^*=β1η^¯n+1/3+(1β1)η^n,

(11b) η^¯n+1/2=η^n12[iKνxu^¯n+1/3+iLνyυ^¯n+1/3+(iUKνx+iVLνy)η^¯n+1/3],u^¯n+1/2=u^n+12[ΔtfV+φυ^¯n+1/3(iUKνx+iVLνy)u^¯n+1/3iKνxη^**],υ^¯n+1/2=υ^n+12[ΔtfUφu^¯n+1/3(iUKνx+iVLνy)υ^¯n+1/3iKνxη^**],η^**=β2η^¯n+1/2+(1β2)η^n,

(11c) η^n+1=η^n[iKνxu^¯n+1/2+iLνyυ^¯n+1/2+(iUKνx+iVLνy)η^¯n+1/2],u^n+1=u^n+[ΔtfV+φυ^¯n+1/2(iUKνx+iVLνy)u^¯n+1/2iKνxη^***],υ^n+1=υ^n+[ΔtfUφu^¯n+1/2(iUKνx+iVLνy)υ^¯n+1/2iKνxη^***],η^***=β3η^n+1+(12β3)η^¯n+1/2+β3η^n.

Note that (11) forms a linear system in u^n,υ^n , and η^n , that is, there is a matrix G and column vector b such that

(12) w^n+1=Gw^n+b,

where w^n=(u^n,υ^n,η^n)T . It follows from (12) that

(13) w^n=Gnw^0+(IGn)(IG)1b,

where I is the identity matrix. Note that in the above, the superscript on w^ is an index on the time step, whereas the superscript on G is an exponent. The matrix G is called the amplification matrix; properties of von Neumann analysis tell us that a time-stepping method is stable if the eigenvalues of its amplification matrix reside within in the unit circle centered at the origin in the complex plane ([20]). This property is fundamental to the analysis in section 2d.

d. CFL optimization

With FB-RK(3,2) applied to the von Neumann formulation of the linearized SWEs (11) and the corresponding amplification matrix G from (12), we can more clearly state the goal of the analysis presented in this section. We are interested in selecting the FB weights β1, β2, and β3 in FB-RK(3,2) so that the Courant numbers νx=c(Δt/Δx) and νy=c(Δt/Δy) can be taken as large as possible while retaining stability of the method.

The eigenvalues of the amplification matrix G depend on the Courant numbers νx and νy and on β1, β2, and β3. To simplify the analysis, we will assume that νx = νy = ν is the Courant number (this is equivalent to assuming that the spatial discretization is on a grid consisting of sqaure cells). Then, we can write G = G (ν, β1, β2, β3). For a given choice of β1, β2, and β3, let ν(β1,β2,β3)max be the largest value of the Courant number for which FB-RK(3,2) is stable, i.e., largest value of the Courant number for which the eigenvalues of the amplification matrix G reside within the unit circle. Symbolically, ν(β1,β2,β3)max is the largest number such that

(14) |λmax[G(ν,β1,β2,β3)]|1,forallν(0,ν(β1,β2,β3)max],

where λmax( G ) is the largest eigenvalue, in absolute value, of the matrix G .

While the matrix G is too complex to analyze directly, we can use numerical optimization tools to find optimal FB weights. That is, we can formulate this problem explicitly as an optimization problem with an appropriate cost function and constraints, then use numerical optimization algorithms to search the space of admittable FB weights. We can maximize νmax by minimizing an appropriate cost function:

(15) C=C(β1,β2,β3),

subject to the constraint that the eigenvalues of G satisfy (14).

The particular numerical optimization scheme we employ here is the simplicial homology global optimization (SHGO) algorithm developed by [8], implemented in the Python programming language via the open-source Python library SciPy ([35]). SHGO is suitable for so-called black-box optimization problems wherein the cost function can be nonsmooth and discontinuous. Further, SHGO is guaranteed to converge to a global optimum as the number of iterations increases. As such, SHGO is an appropriate choice for our purposes, where the eigenvalues of the matrix G are sensitive to small changes in the FB weights.

The matrix G also depends on additional parameters Δtf, kΔx, and lΔy as can be seen in (11). For the purposes of the numerical optimizations we perform here, we set Δtf = 10−2 since taking f = 10−4 represents the value of the Coriolis parameter at midlatitudes and we expect a FB-RK(3,2) to be capable of a time step on the order of 102 in the nonlinear SWE test cases described in section 3. Preliminary tests showed that the choice of value for Δtf had little to no effect on the optimizations in practice. We take kΔx=lΔy=π , which corresponds to the case of gridscale waves. This choice is informed by the fact that instability often comes from gridscale processes. That is, we perform the optimization on gridscale waves, which represents a worst-case with the hope that this provides the best overall stability.

1) The choice of cost functions

Given a particular choice of β1, β2, and β3, we can easily estimate the value of ν(β1,β2,β3)max to arbitrary precision. Our goal is to maximize this parameter, so we seek to minimize the cost function:

(16) C1(β1,β2,β3)=1ν(β1,β2,β3)max.

In the special case where the mean flow U = (U, V) is zero, we additionally consider a second choice of cost function. When U = 0 the column vector b in (12) is also zero, so (13) simplifies to

(17) w^n=Gnw^0.

In this case, it is easy to show that the numerical amplification matrix G approximates G˜:=eAΔt , where

(18) A=(0fickf0iclickicl0).

The matrix G˜ is the analytical counterpart to the numerical amplification matrix G that arises from solving a spatially continuous version of (10). In short, if we assume that our time-stepping method, which is determined by the matrix G , produces the exact solution at time t = Δt, then G=G˜ . We give the details of this in appendix, section d.

Since we wish for G to approximate G˜ for a given choice of FB weights, we aim to minimize a second choice of cost function:

(19) C2(β1,β2,β3)=1ν(β1,β2,β3)max+0π/6G˜(ν)G(ν,β1,β2,β3)dν,

where ∥⋅∥ is the Frobenius norm:

(20) A=i,j(aij)2.

This choice of cost function seeks to maximize νmax while also producing a scheme that has favorable dissipation and dispersion behavior. The choice of norm in the integral term of C2 was chosen as it is quick to compute during the optimization process. There is a certain freedom in the choice of limits of integration for the integral term in C2; this term requires that the difference between G and G˜ is minimal as for small values of ν, which correspond to well-resolved waves. In essence, while C1 only considers the CFL performance of the resulting method, C2 does this while also considering that G should accurately approximate its analytical counterpart G˜ .

In summary, we consider two choices of cost function given by C1 in (16) and C2 in (19). The function C1 is valid in any case, while C2 applies only when the mean flow U is taken to be zero.

2) Optimization results

In the context of the dimensionless Eq. (5), the parameter |U| is called the Froude number. We perform the optimization described above five total times, considering different values of the Froude number |U| that appear in oceanic and atmospheric flows.

As noted above, the SHGO algorithm used here is guaranteed to converge to a globally optimal solution if run for enough iterations. As our problem is relatively complex and computing resources finite, it is more likely in practice that the method will convergence to a local extremum. For example, consider the first two rows of Table 1 and notice that the value of ν(β1,β2,β3)max given by optimizing C2 is greater than that given by optimizing C1. If these were globally optimal solutions, we would have that ν(0.500,0.500,0.344)maxν(0.516,0.532,0.331)max since C1 consists only of a term that asks for a maximal ν(β1,β2,β3)max , while C2 is the sum of the same term with a nonnegative term that asks for G to approximate G˜ . Despite the problem being sufficiently complex that globally optimal solutions are difficult to obtain, the results from section 3 show that the choices of FB weights from Table 1 are extremely efficient in practice.

Graph: Table 1. Results of optimizing FB-RK(3,2) for the given cost functions, rounded to three decimal places. Note that the reported values of ν(β1,β2,β3)max are the maximal Courant numbers in the context of a problem consisting only of gridscale waves, i.e., the problem for which FB-RK(3,2) was optimized.

A similar Runge–Kutta-based method using forward–backward averaging in the momentum equation is developed by [31] (from here on denoted ShMc). In ShMc, the efficiency of their optimized FB method is reported in terms of the parameter α := ckΔt, by considering the maximal value αmax for which the method is stable. Translating this into the notation used here, we obtain

(21) α=ckΔt=νkΔx.

For their optimal RK2-based FB method, ShMc reports that (νkΔx)max ≈ 2.141; since their scheme uses two RK stages, this translates to what we refer to as an effective CFL number of 2.141/2 ≈ 1.071. Here, we have taken kΔx = π and have νmax = 1.804 in the best case, so (νkΔx)max = 1.804π ≈ 5.667. FB-RK(3,2) uses three RK stages, so we obtain an effective CFL number of 5.667/3 ≈ 1.889, an increase of approximately 1.764 times over ShMc's RK2 FB scheme. Note that ShMc's (νkΔx)max comes from their FB scheme as applied to a one-dimensional wave system analogous to (22), whereas ours is obtained from the two-dimensional linearized SWEs. The comparison between these two values is not one-to-one, but do serve as a general comparison of the efficiency of both methods.

To visualize FB-RK(3,2)'s treatment of gravity waves within the SWEs, we can apply the scheme to a simple wave system and analyze the resulting dissipation and dispersion errors (Fig. 1). Applying a Fourier transform in space to the 1D linearized SWEs (A1) gives

(22) η^t=icku^u^t=ickη^

Choose a time step Δt and let K˜=kΔx and ν=c(Δt/Δx) . The FB-RK(3,2) scheme for this system is then given by

(23a) η¯n+1/3=ηniK˜ν3un,u¯n+1/3=uniK˜ν3[β1η¯n+1/3+(1β1)ηn],

(23b) η¯n+1/2=ηniK˜ν2u¯n+1/3,u¯n+1/2=uniK˜ν2[β2η¯n+1/2+(1β2)ηn],

(23c) ηn+1=ζniK˜νu¯n+1/2,un+1=uniK˜ν[β3ηn+1+(12β3)η¯n+1/2+β3ηn].

Equation (23) defines a system of linear equations, which in turn defines a numerical amplification matrix G 0 similar to that in (12). One can show that the general solution to the system (22) is

(24) [η(t)u(t)]=exp[t(0ickick0)][η(0)u(0)],

and that eigenvalues of the matrix exponential above are λ = e±ickt. At time t = Δt, these eigenvalues are λ=e±iK˜ν .

Graph: Fig. 1. An illustration of dissipation and dispersion errors for FB-RK(3,2) with each choice of FB weights from Table 1 and SSPRK3. Eigenvalues λ=e±iK˜ν of the matrix in the exact solution (24) are given by orange dots. Drawing a ray from the origin to one of these dots and measuring the angle that ray makes with the positive real axis gives the corresponding value of K˜ν. In other words, K˜ν is the argument of the corresponding complex number. Eigenvalues of the numerical amplification matrix G 0 as K˜ν =kΔxν increases are given by the blue curve and the blue stars. Dashed lines connect exact eigenvalues to numerically determined counterparts, which gives a measure of dissipation and dispersion errors. Values of K˜ν close to zero correspond to well-resolved waves, while values approaching π correspond to temporal gridscale waves.

Note that although we have used the Courant number ν=c(Δt/Δx) in this discussion in an effort to use notation analogous to the rest of this section, the system considered here is not discretized in space; we have that K˜ν=kΔx×c(Δt/Δx)=ckΔt . Values of K˜ν close to zero correspond to well-resolved waves, while values approaching π correspond to temporal gridscale waves. Dissipation error can be read as the distance a numerical eigenvalue is from the unit circle, resulting in the corresponding wave being damped out of the system. Dispersion error can be read as the difference in the angles made with the positive real axis between a numerically determined eigenvalue with its analytical counterpart, resulting in an error in the phase speed of the corresponding wave. For all choices of FB weights in Table 1, dissipation and dispersion errors for well resolved waves are very low (Fig. 1). As waves become less well resolved, each of the five schemes exhibits different behavior. In particular, note the behavior of the eigenvalues in Fig. 1c; well resolved waves are handled well as expected, and as waves move toward grid scale, the dissipation errors become very high, i.e., the blue curve becomes closer to the origin. This means that waves that are not well-resolved are quickly damped out of the system, which may explain the strong performance of FB-RK(3,2) with the corresponding choices of FB weights. Contrast this with the behavior of SSPRK3 in Fig. 1f; as we move toward the grid scale, rather than damping out the potentially problematic motions, the eigenvalues simply leave the unit circle, signaling that the method has become unstable.

Note that the paradigm in which the plots of Fig. 1 exist is different than the paradigm for which the FB weights were optimized. That is, the matrix G 0 comes from applying FB-RK(3,2) to a one-dimensional gravity wave system (22), whereas the FB weights presented in Table 1 are optimized for the linearized SWEs (10). We present the plots in Fig. 1 to show that the schemes optimized for stability in the linearized SWEs treat well-resolved gravity waves correctly in the best case, where the equations are not discretized in space. This also allows for a quick visual comparison between other methods presented with similar plots in the literature; in particular with the methods presented in ShMc.

3. Numerical experiments

Here, we present a number of numerical experiments that demonstrate the efficiency and accuracy of the CFL-optimized FB-RK(3,2) schemes presented in section 2. These experiments compare SSPRK3 and RK(3,2) to FB-RK(3,2) in terms of time-stepping efficiency, temporal convergence, and solution accuracy. We consider a total of five different nonlinear shallow-water test cases described below. The fully nonlinear SWEs are given by (A14) in section b of the appendix.

  • Quasi-linear gravity wave (QLW): A simple gravity wave traveling from the north pole to the south, then returning to the north, a process which takes approximately seven days of simulated time. This is the case of an external wave mode, also called the Lamb mode in the context of atmospheric modeling. The momentum advection terms in the momentum equation of (A14) are turned off, making the momentum equation purely linear. The thickness equation (A14) is formally nonlinear, but because the fluid depth is sufficiently large compared to any perturbation in the thickness, this behaves as though it is linear. Therefore, we refer to this test case as quasi-linear. The fluid velocity is initialized to zero, while the thickness is initialized as a Gaussian bell curve centered at the north pole of the form exp[100(xπ)2100y2]+500 .
  • Barotropically unstable jet (BUJ): Originally formulated by [12], this case consists of a strong zonal flow at midlatitude with a small perturbation that causes the barotropic jet to become unstable. This flow is driven by strongly nonlinear momentum advection tendencies, physical processes that are of particular relevance to weather prediction and climate modeling.
  • Williamson test case 2 (WTC2): A steady state solution to the nonlinear SWEs, one of the standard SWE test cases developed by [37]. The case consists of a geostrophically balanced zonal flow that is constant for all time.
  • Williamson test case 4 (WTC4): A forced nonlinear flow consisting of a translating low pressure center superimposed on a jet stream, one of the standard [37] SWE test cases.
  • Williamson test case 5 (WTC5): A zonal flow over an isolated mountain, one of the standard [37] SWE test cases. Here, the initial flow is exactly as in WTC2, but the seafloor topography is nonuniform, consisting of an isolated mountain at midlatitude. This perturbation alters the geostrophic balance of the flow, triggering the development of advection-driven instabilities.

All test cases take place on a rotating aquaplanet with angular velocity Ω = 7.292 × 10−5 s−1 and a radius of 6371.22 km. The seafloor topography is uniform unless otherwise specified. Each test case is implemented using a TRiSK spatial discretization ([28]), which is a finite volume-type spatial discretization made for unstructured, variable-resolution polygonal grids. Here, we slightly alter the standard TRiSK discretization by using an upwind potential vorticity flux to remove gridscale noise in the evolution of vorticity. Icosahedral meshes were used throughout, with quasi-uniform 60-km resolution employed in all test cases. Meshes were optimized according to the spherical centroidal Voronoi tessellation (SCVT) metric using the JIGSAW library ([9]), ensuring the TRiSK discretization achieves approximately second-order accuracy in the L2 norm. Balanced initial conditions for the BUJ and WTC test cases are generated following the procedure described in section e of the appendix. The code used to run all the experiments described in this section is open source and available on GitHub and Zenodo; see the data availability statement at the end of this document.

a. CFL performance

On all of the test cases described above, the optimized FB-RK(3,2) schemes from section 2d admit time steps between 1.59 and 2.81 times as large as SSPRK3 or RK(3,2). This demonstrates the ability for the FB averaging employed by FB-RK(3,2) and the subsequent optimization of the FB weights to greatly increase the stability range of an existing method across multiple problems. SSPRK3 is used as a baseline comparison here since it admits the smallest time steps for each case out of the schemes considered here, and it also consists of three Runge–Kutta stages, costing the same number of floating-point operations as RK(3,2) and FB-RK(3,2).

The largest increases in the admittable time step are in the QLW test case, with the largest increase at 2.81 times in the best case (Table 2). For practical reasons, we used a nonlinear SWE model to run the above test cases, but QLW was intentionally designed so that the model equations, while still formally nonlinear, approximate the behavior of the linear SWEs. That is, QLW provides a test case as close as possible to the paradigm for which FB-RK(3,2) was optimized. As noted, the comparison is not one-to-one. FB-RK(3,2) was optimized for the linearized SWEs on a square C-grid, while QLW uses quasi-linear SWEs and a hexagonal grid with a TRiSK discretization. Nonetheless, the improvement in CFL performance is remarkable and clearly demonstrates the efficacy of the optimization performed in section 2d.

Graph: Table 2. CFL performance of FB-RK(3,2) with optimal FB weights from Table 1 compared to SSPRK3 and RK(3,2) on various test cases. The recorded values for Δ t represent the largest time step for which the given test case could run without becoming unstable for the given duration. Time steps were determined experimentally by running the model for the given duration, increasing the time step until the model became unstable. For practical reasons, time steps were found as multiples of 5.

The remaining four test cases are fully nonlinear, covering a range of standard shallow-water test cases relevant to atmospheric and oceanic flows. In these, FB-RK(3,2) admits time steps between 1.59 and 2.18 times larger than SSPRK3. As SSPRK3 consists of the same number of Runge–Kutta stages as FB-RK(3,2), this directly translates to an increase of overall efficiency of between 1.59 and 2.18. The increases in admittable time step in these nonlinear cases are slightly less than those in the QLW case in part because these problems include fully nonlinear motions that cannot be accounted for in the optimization process used here; because our optimization relies on properties of von Nuemann stability analysis, we are limited to optimizing on a linear set of equations. However, Table 1 suggests that optimizing for the linearized SWEs provides substantial benefits even for the nonlinear SWEs. Indeed, as noted above, our analysis and the subsequent optimization was performed under a number of simplifications versus the paradigm in which these numerical experiments were performed. Nonetheless, our results show that the optimized FB-RK(3,2) schemes are robust when applied to fully nonlinear problems with a full TRiSK discretization. This provides evidence that our schemes are at least partially agnostic to the details of the spatial discretization.

Additionally, one can observe that the optimized FB-RK(3,2) schemes are sensitive to the Froude number; as this increases, the corresponding values of ν(β1,β2,β3)max decrease (Table 1). As the Froude number increases, the flow becomes dominated by advection rather than gravity wave propagation. When the stability is dominated by fast waves, FB-RK(3,2) works best as evidenced by its performance in the QLW case. As the wave and advective phase speeds come closer together, the speed-up from FB is reduced, due to the flow taking on more advective nature as can be seen in the four fully nonlinear test cases.

Turning our attention to the four nonlinear test cases, we see that FB-RK(3,2) with the FB weights obtained by optimizing C1 with |U| = 0.05 admits the largest time step in each case except WTC2 where it instead admits the second largest time step. This points to (β1, β2, β3) = (0.531, 0.531, 0.313) as a robust option for a broad span of problems. However, it is almost certain that the optimal choice of coefficient will be problem dependent.

b. Temporal convergence

In section a of the appendix, we perform a simplified local truncation error analysis that shows that FB-RK(3,2) is second-order in time for any choice of FB weights on the one dimensional linearized SWEs. Here, we support the assertion that FB-RK(3,2) is second-order in time with numerical experiments.

Each tested variation of FB-RK(3,2) is second-order in time (Fig. 2). Differences in accuracy between the variations of FB-RK(3,2) are negligible, showing that the choice of optimized FB weights has little effect on accuracy in this case. RK(3,2), the scheme on which FB-RK(3,2) is based, is third-order as expected as this is a linear problem ([27]).

Graph: Fig. 2. Temporal convergence for FB-RK(3,2) for a selection of FB weights. Convergence tests were performed using the QLW case, run for 7 simulated days. The L 2 errors are computed against a reference solution generated by the classical four-stage, fourth-order Runge–Kutta method (RK4) with a time step of 10 s.

c. Solution quality

We evaluate the solution quality of an optimal FB-RK(3,2) to that of SSPRK3 in two test cases of interest, BUJ and WTC5, comparing the relative vorticity produced by FB-RK(3,2) with (β1, β2, β3) = (0.531, 0.531, 0.313) to that produced by SSPRK3. The strong stability preserving property of SSPRK3 makes it a good choice for a point of comparison; if FB-RK(3,2) can produce a solution of comparable quality, we can be confident that it is a good choice for the SWEs and related problems.

In both cases, shown in Figs. 3 and 4, FB-RK(3,2) produces solutions of comparable quality to SSPRK3 even with a significantly increased time step. In BUJ, the two schemes produce relative velocities that differ by at most 10−7. In WTC5, this difference is on the order of 10−8. We also ran both test cases using RK(3,2) which produced plots, which we omit, that are visually identical to those given here.

Graph: Fig. 3. Relative vorticity produced by (a) SSPRK3 and (b) FB-RK(3,2) with (β 1 , β 2 , β 3) = (0.531, 0.531, 0.313) on the BUJ test case after 6 days of simulated time. SSPRK3 uses a time step of 108 s and FB-RK(3,2) uses a time step of 192 s. (c) The absolute difference between the two solutions. The plots themselves are lat/lon projections of the sphere.

Graph: Fig. 4. Relative vorticity produced by (a) SSPRK3 and (b) FB-RK(3,2) with (β 1 , β 2 , β 3) = (0.531, 0.531, 0.313) on the WTC5 test case after 50 days of simulated time. SSPRK3 uses a time step of 135 s and FB-RK(3,2) uses a time step of 288 s. (c) The absolute difference between the two solutions. The plots themselves are lat/lon projections of the sphere.

FB-RK(3,2) uses a time step 1.78 and 2.13 times larger that that used by SSPRK3 on BUJ and WTC5, respectively. Despite taking about half the wall-clock time and computational resources, FB-RK(3,2) produces vorticities that are qualitatively equivalent to SSPRK3. As noted in section 3a, though FB-RK(3,2) is lower order in time than SSPRK3, the produced solutions are qualitatively equivalent; likely partially due to lower-order spatial errors which dominate. In operational configurations of climate-scale models, such spatial errors are generally dominant, so one can use a more economical time-stepping scheme like FB-RK(3,2) without sacrificing model quality.

4. Conclusions

With the goal of producing computationally efficient time-stepping schemes for the SWEs, we have presented a class of three-stage second-order Runge–Kutta schemes that use forward–backward averaging to advance the momentum equation. The FB weights in these FB-RK(3,2) schemes were then optimized using a von Neumann–type analysis on the shallow-water equations linearized about a nonzero mean flow. Even though it is impossible to perform a similar analysis and optimization on the fully nonlinear SWEs, the results of our linearized analysis clearly result in significant increases in efficiency when solving nonlinear test problems.

Our optimized FB-RK(3,2) schemes compare favorably in terms of effective CFL number to an existing forward–backward scheme in [31]. In numerical experiments, the schemes are up to 2.8 times more computationally efficient than the RK(3,2) scheme from which they were derived and a popular three-stage third-order strong stability preserving Runge–Kutta scheme (Table 2). Despite being significantly more computationally economical than other schemes, FB-RK(3,2) produces solutions that are of comparable quality to other schemes while using a time step approximately twice as long (Figs. 3 and 4). FB-RK(3,2) schemes are generally applicable to geophysical fluid flows in the ocean and atmosphere, with a range of Froude numbers, as evidenced by their performance across a number of relevant test cases. Their use provides an economical pathway to increasing the computational efficiency of models of any scope, but particularly models at the climate scale wherein time-to-solution can be restrictive.

Moving forward, we are interested in adapting FB-RK(3,2) to a local time-stepping (LTS) scheme similar to that used in [7] or [15], thereby combining the benefits of the CFL performance of FB-RK(3,2) with the obvious performance implications of LTS schemes. Another potentially valuable application of FB-RK(3,2) would be its use as a barotropic solver within a split-explicit, layered model wherein barotropic and baroclinic motions are separated and solved by different methods. The barotropic equations are essentially the SWEs and our work here clearly demonstrates that FB-RK(3,2) is a good choice for solving such a system. As the number of MPI processes increases, the communication costs with in the barotropic solver begin to dominate within a split ocean model. Taking a longer barotropic time step would help to assuage some of these costs.

Acknowledgments.

JRL was supported by the U.S. Department of Energy (DOE), Office of Science, Office of Workforce Development for Teachers and Scientists, Office of Science Graduate Student Research (SCGSR) program. The SCGSR program is administered by the Oak Ridge Institute for Science and Education (ORISE) for the DOE. ORISE is managed by ORAU under Contract DE-SC0014664. JRL was also supported by the National Science Foundation (NSF) Mathematical Sciences Graduate Internship (MSGI). GC, MRP, and DE were supported by the Earth System Model Development program area of the U.S. DOE, Office of Science, Office of Biological and Environmental Research as part of the multi-program, collaborative Integrated Coastal Modeling (ICoM) project. The authors would also like to thank Pedro S. Peixoto and two anonymous reviewers for their insightful comments and suggestions.

APPENDIX

Supplementary Mathematical Details

a. Local truncation error analysis

What follows is a simplified local truncation error analysis of FB-RK(3,2) that shows that the method is at least O[(Δt)2] on a particular problem for any choice of the FB weights βi. To ease notation and simplify symbolic calculations, consider a simplified version of the linearized SWEs (5) with zero mean flow, in one spatial dimension, and where f = 0:

(A1) ut=cηx,ηt=cux.

Assume that u is sufficiently smooth that derivatives up to sixth order in both t and x exist, and that η is sufficiently smooth that derivatives up to fifth order in both t and x exist. Further, assume that both u and η are sufficiently smooth that (/t)(k/xk)u=(k/xk)(/t)u and (/t)(k/xk)η=(k/xk)(/t)η for k = 1, 2. This second assumption gives us

(A2) kutk={ckkηxk,ifkisoddckkuxk,ifkiseven,

(A3) kηtk={ckkuxk,ifkisoddckkηxk,ifkiseven,

for k = 1, 2. Also, note that in the analysis that follows, the dependence of u and η on x is suppressed to simplify notation.

Now, by applying FB-RK(3,2) to (A1), we can write

(A4) un+1unΔt=cηnx+Δt2c22unx2(Δt)26c3(β3+1)3ηnx3+(Δt)336c4(4β1β3+2β1+9β2β3)4unx4(Δt)412c5β2β35ηnx5+(Δt)536c6β1β2β36unx6=U(un,ηn),

(A5) ηn+1ηnΔt=cunx+Δt2c22ηnx2(Δt)24c3β23unx3+(Δt)312c4β24ηnx4(Δt)436c5β1β25unx5=H(un,ηn).

Notice that the right-hand side of (A4) and (A5) are equations in un and ηn; call them U(un, ηn) and H(un, ηn), respectively.

The local truncation error for a given time-stepping scheme can be thought of as the error introduced by said scheme during one time step, assuming exact data at the beginning of the time step. For this problem, the local truncation errors in u and η are defined to be the quantities τun and τηn such that

(A6) u(tn+1)u(tn)Δt=U[u(tn),η(tn)]+τun,

(A7) η(tn+1)η(tn)Δt=H[u(tn),η(tn)]+τηn.

To calculate τun and τηn , rearrange (A6) and (A7), then expand u(tn+1) and η(tn+1) in t as Taylor polynomials centered at t = tn to get

(A8) Δtτun=u(tn+1)u(tn)ΔtU[u(tn),η(tn)]={k=06(Δt)kk!kutk(tn)+O[(Δt)7]}u(tn)ΔtU[u(tn),η(tn)],

(A9) Δtτηn=η(tn+1)η(tn)ΔtH[u(tn),η(tn)]={k=05(Δt)kk!kηtk(tn)+O[(Δt)6]}η(tn)ΔtH[u(tn),η(tn)].

Now, simplify (A8) and (A9). Starting with (A8), we get

(A10) Δtτun={k=03(Δt)kk!kutk(tn)+O[(Δt)4]}u(tn)ΔtU[u(tn),η(tn)]={u(tn)+Δtut(tn)=cηx(tn)+(Δt)222ut2(tn)=c22ux2(tn)+(Δt)363ut3(tn)+O[(Δt)4]}  +{u(tn)+Δtcηnx(tn)(Δt)22c22unx2(tn)+(Δt)36c3(β3+1)3ηnx3(tn)+O[(Δt)4]}={(Δt)363ut3(tn)+(Δt)36c3(β3+1)3ηx3(tn)+O[(Δt)4]}.

From (A10), we see that Δtτun=O[(Δt)3] no matter the values of the FB weights, so

(A11) τun=O[(Δt)2].

Next simplify (A9) in a similar way:

(A12) Δtτηn={k=03(Δt)kk!kηtk(tn)+O[(Δt)4]}η(tn)ΔtH[u(tn),η(tn)]={η(tn)+Δtηt(tn)=cux(tn)+(Δt)222ηt2(tn)=c22ηx2(tn)+(Δt)363ηt3(tn)+O[(Δt)4]}  +{η(tn)+Δtcunx(tn)(Δt)22c22ηnx2(tn)+(Δt)34c3β23unx3(tn)+O[(Δt)4]}={(Δt)363ηt3(tn)+(Δt)34c3β23ux3(tn)+O[(Δt)4]}.

From (A12), we see that with no assumptions on the FB weights, we have Δtτηn=O[(Δt)3] , so

(A13) τηn=O[(Δt)2].

b. Linearization of the shallow-water equations

Here, we present the derivation of (4), the linearized SWEs centered about a nonzero mean flow. The nonlinear SWEs on a rotating sphere are given by

(A14) ut+(×u+fk)×u=|u|22gh,ht+(hu)=0,

where u(x,y,t)=[u(x,y,t),υ(x,y,t)] is the horizontal fluid velocity, x and y are the spatial coordinates, t is the time coordinate, f is the Coriolis parameter, g is the gravitational constant, and h is the fluid thickness.

To linearize (A14) about a nonzero mean flow, assume that

(A15) u(x,y,t)=U˜+u˜(x,y,t),

(A16) h(x,y,t)=H+h˜(x,y,t),

where U˜=(U˜,V˜) is the constant (with respect to time and space) mean horizontal fluid velocity, u˜=[u˜(x,y,t),υ˜(x,y,t)] is a small perturbation of U˜ , H is the constant (with respect to time and space) fluid thickness when at rest, and h˜=h˜(x,y,t) is a small perturbation of H. In both cases, small means that the quantities themselves, as well as their derivatives, are negligible when multiplied together or squared.

Now, we make the substitutions u=U˜+u˜ and h=H+h˜ and simplify term by term. The kinetic energy term becomes

(A17) u22=12(u2+υ2)=12(U˜2+2U˜u˜+u˜2+V˜2+2V˜υ˜+υ˜2)=U˜u˜V˜υ˜12(u˜2+υ˜2)U˜u˜V˜υ˜=U˜u˜,

where u˜ is the Jacobian matrix,

(xu˜xυ˜yu˜yυ˜),

and U˜u˜ denotes a 2 × 1 column vector wherein the ith entry is the dot product of U˜ with the ith row of u˜ . The potential vorticity and Coriolis term becomes

(A18) (×u+fk)×u=[(υxuy)k+fk]×u=i[υ(υxuy+f)]+j[u(υxuy+f)]=i[f(V˜+υ˜)V˜(υ˜xu˜y)υ˜(υ˜xu˜y)]  +j[f(U˜+u˜)+U˜(υ˜xu˜y)+u˜(υ˜xu˜y)]i[f(V˜+υ˜)V˜(υ˜xu˜y)]+j[f(U˜+u˜)+U˜(υ˜xu˜y)]=f(U˜+u˜)+ζ˜U˜,

where ζ˜=(υ˜/x)(u˜/y) is the vertical component of the vorticity ×u˜ . The thickness gradient becomes

(A19) gh=g(H+h˜)=gh˜

Finally, the mass advection term becomes

(A20) (hu)=(HU˜+Hu˜+h˜U˜+h˜u˜)=(Hu˜+h˜U˜+h˜u˜)H(u˜)+(h˜U˜)

So, linearized SWEs with a nonzero mean flow are given by

(A21) u˜t+f(U˜+u˜)+ζ˜U˜=U˜u˜gh˜h˜t+H(u˜)+(h˜U˜)=0.

c. Arakawa C-grid discretization

Figure A1 shows an example of a rectangular, staggered Arakawa C-grid [1]. Here, the mass variable η is computed at cell centers and the normal components of velocity u and υ are computed at the center of cell edges.

Graph: Fig. A1. An example Arakawa C-grid.

To see how to Eqs. (6) and (7) are obtained, consider the following example wherein we compute a derivative in x at the point (x1, y1) in Fig. A1. On this grid, the derivative /x of some quantity can be approximated by taking a centered difference of values at the two nearest points:

x[exp(ikx1+ily1)]=exp[ik(x1+Δx2)+ily1]exp[ik(x1+Δx2)+ily1]Δx=eik(Δx/2)eik(Δx/2)Δxexp(ikx1+ily1)=i2sin(kΔx2)Δxexp(ikx1+ily1)

An identical calculation hold for derivatives in y.

To calculate the Coriolis terms, we need data for u at υ points, and for υ at u points, called uυ and υu, respectively. To see how Eqs. (8) and (9) are obtained, consider the following example wherein we calculate υu at (x2, y2) in Fig. A1 by taking the average of υ data at the four nearest υ points:

υu(x2,y2,t)=14υ^(k,l,t){exp[ik(x2+Δx2)+il(y2+Δy2)]+exp[ik(x2+Δx2)il(y2+Δy2)]+exp[ik(x2Δx2)+il(y2+Δy2)]+exp[ik(x2Δx2)+il(y2Δy2)]}=14[exp(ikΔx2+ilΔy2)+exp(ikΔx2ilΔy2)+exp(ikΔx2+ilΔy2)+exp(ikΔx2ilΔy2)]υ^(k,l,t)exp(ikx2+ily2)=14[4cos(kΔx2)cos(lΔy2)]υ^(k,l,t)exp(ikx2+ily2)=cos(kΔx2)cos(lΔy2)υ^(k,l,t)exp(ikx2+ily2).

An identical calculation holds for uυ.

d. Formulation of G ˜

Assume that the mean flow U in the linearized SWEs (5) is taken to be zero, and apply a Fourier transform in space. The resulting equations can be written as

(A22) t(u^υ^η^)=(0fickf0iclickicl0)(u^υ^η^).

These equations are spatially continuous counterparts to (10).

The general solution to this system is given by

(A23) w^(t)=eAtw^(0),

where w^(t)=[u^(t),υ^(t),η^(t)]T and A is as given in (18). Now, consider the general solution of the system at time t = Δt. Then,

(A24) w^(Δt)=eAΔtw^(0).

Multiplying initial data w^(0)=w^0 by the numerical amplification matrix G defined in (12) returns the numerical solution after one time step. Assume that the FB-RK(3,2) scheme produces an exact solution, that is, assume Gw^0=w^(Δt) . Then

(A25) Gw^0=eAΔtw^0.

This holds for any choice of initial data w^0 . Take w^0=ei , where ei is the ith standard basis vector for i = 1, 2, 3 to get

(A26) Gei=eAΔteifori=1,2,3.

Because Gei and eAΔtei are just the ith columns of G and eAΔt , respectively, it follows that G = eAΔt . The reverse implication is trivial. To simplify notation, we write

(A27) G˜:=eAΔt.

e. Constructing balanced initial conditions

While the shallow-water cases described by [12] and [37] have been successfully implemented for various structured grid models, we briefly describe an initialization procedure that is used to produce geostrophically balanced initial conditions for the unstructured TRiSK-like model used in this study. Introducing a matrix–vector form of the nonlinear SWE system presented in (A14)

(A28) ut+(Cu+f)u=G[g(h+zb)+K],ht+D(uh)=0,K=12|u|2,

where D,G,C are sparse linear operators encoding the action of divergence, gradient and curl on a given discrete vector, as per [28]. A balanced thickness distribution h0 can be found for a given velocity field u0 by setting u/t=0 and taking the divergence of momentum tendencies:

(A29) K0=12|u0|2,D[(Cu0+f)(u0)+GK0]=gDGh0.

Recognizing the product DG=L is an approximation of the continuous Laplacian ∇2, (A29) can be seen as an elliptic problem that can be solved by inverting the system of linear equations:

(A30) h0=1gL1D[(Cu0+f)(u0)+GK0]

In this study we solve (A30) using SciPy's ([35]) gcrotmk iterative solver. Compared to the evaluation of analytical profiles, the construction of initial conditions through the solution of (A30) is agnostic to details of the computational mesh used and ensures the discrete state is balanced with respect to the spatial operators used to advance the flow. This approach has proved robust for the unstructured model configurations considered in this study.

Footnotes 1 © 2023 American Meteorological Society. This published article is licensed under the terms of the default AMS reuse license. For information regarding reuse of this content and general copyright information, consult the AMS Copyright Policy (www.ametsoc.org/PUBSReuseLicenses). REFERENCES Arakawa, A., and V. R. Lamb, Eds., 1977 : Computational design of the basic dynamical processes of the UCLA general circulation model. Methods in Computational Physics: Advances in Research and Applications, J. Chang, Ed., General Circulation Models of the Atmosphere, Vol. 17, Elsevier, 173–265, https://doi.org/10.1016/B978-0-12-460817-7.50009-4. 2 Azerad, P., J.-L. Guermond, and B. Popov, 2017 : Well-balanced second-order approximation of the shallow water equation with continuous finite elements. SIAM J. Numer. Anal., 55, 3203 – 3224, https://doi.org/10.1137/17M1122463. 3 Bao, L., R. D. Nair, and H. M. Tufo, 2014 : A mass and momentum flux-form high-order discontinuous Galerkin shallow water model on the cubed-sphere. J. Comput. Phys., 271, 224 – 243, https://doi.org/10.1016/j.jcp.2013.11.033. 4 Capodaglio, G., and M. Petersen, 2022 : Local time stepping for the shallow water equations in MPAS. J. Comput. Phys., 449, 110818, https://doi.org/10.1016/j.jcp.2021.110818. 5 Chassignet, E. P., H. E. Hurlburt, O. M. Smedstad, G. R. Halliwell, P. J. Hogan, A. J. Wallcraft, R. Baraille, and R. Bleck, 2007 : The HYCOM (HYbrid Coordinate Ocean Model) data assimilative system. J. Mar. Syst., 65, 60 – 83, https://doi.org/10.1016/j.jmarsys.2005.09.016. 6 Collins, W., and Coauthors, 2004 : Description of the NCAR Community Atmosphere Model (CAM 3.0). NCAR Tech. Note NCAR/TN-464+STR, 214 pp., https://doi.org/10.5065/D63N21CH. 7 Dawson, C., C. J. Trahan, E. J. Kubatko, and J. J. Westerink, 2013 : A parallel local timestepping Runge–Kutta discontinuous Galerkin method with applications to coastal ocean modeling. Comput. Methods Appl. Mech. Eng., 259, 154 – 165, https://doi.org/10.1016/j.cma.2013.03.015. 8 Endres, S. C., C. Sandrock, and W. W. Focke, 2018 : A simplicial homology algorithm for Lipschitz optimisation. J. Global Optim., 72, 181 – 217, https://doi.org/10.1007/s10898-018-0645-y. 9 Engwirda, D., 2017 : JIGSAW-GEO (1.0): Locally orthogonal staggered unstructured grid generation for general circulation modelling on the sphere. Geosci. Model Dev., 10, 2117 – 2140, https://doi.org/10.5194/gmd-10-2117-2017. Engwrida, D., and J. R. Lilly, 2023 : Source code for swe-python, version 1. Zenodo, accessed 22 May 2023, https://doi.org/10.5281/zenodo.7958483. Fu, G., 2022 : A high-order velocity-based discontinuous Galerkin scheme for the shallow water equations: Local conservation, entropy stability, well-balanced property, and positivity preservation. J. Sci. Comput., 92, 86, https://doi.org/10.1007/s10915-022-01902-y. Galewsky, J., R. K. Scott, and L. M. Polvani, 2004 : An initial-value problem for testing numerical models of the global shallow-water equations. Tellus, 56A, 429 – 440, https://doi.org/10.3402/tellusa.v56i5.14436. Golaz, J.-C., and Coauthors, 2022 : The DOE E3SM model version 2: Overview of the physical model and initial model evaluation. J. Adv. Model. Earth Syst., 14, e2022MS003156, https://doi.org/10.1029/2022MS003156. Gottlieb, S., and C.-W. Shu, 1998 : Total variation diminishing Runge-Kutta schemes. Math. Comput., 67, 73 – 85, https://doi.org/10.1090/S0025-5718-98-00913-2. Hoang, T.-T.-P., W. Leng, L. Ju, Z. Wang, and K. Pieper, 2019 : Conservative explicit local time-stepping schemes for the shallow water equations. J. Comput. Phys., 382, 152 – 176, https://doi.org/10.1016/j.jcp.2019.01.006. Kärnä, T., S. C. Kramer, L. Mitchell, D. A. Ham, M. D. Piggott, and A. M. Baptista, 2018 : Thetis coastal ocean model: Discontinuous Galerkin discretization for the three-dimensional hydrostatic equations. Geosci. Model Dev., 11, 4359 – 4382, https://doi.org/10.5194/gmd-11-4359-2018. Katta, K. K., R. D. Nair, and V. Kumar, 2015 : High-order finite volume shallow water model on the cubed-sphere: 1D reconstruction scheme. Appl. Math. Comput., 266, 316 – 327, https://doi.org/10.1016/j.amc.2015.04.053. Kerbyson, D. J., and P. W. Jones, 2005 : A performance model of the parallel ocean program. Int. J. High Perform. Comput. Appl., 19, 261 – 276, https://doi.org/10.1177/1094342005056114. Lan, R., L. Ju, Z. Wang, M. Gunzburger, and P. Jones, 2022 : High-order multirate explicit time-stepping schemes for the baroclinic-barotropic split dynamics in primitive equations. J. Comput. Phys., 457, 111050, https://doi.org/10.1016/j.jcp.2022.111050. LeVeque, R. J., Ed., 2007 : Diffusion equations and parabolic problems. Finite Difference Methods for Ordinary and Partial Differential Equations, Other Titles in Applied Mathematics, Society for Industrial and Applied Mathematics, 181–200, https://doi.org/10.1137/1.9780898717839.ch9. Lilly, J. R., 2023a : Data for 'CFL optimized forward-backward Runge-Kutta schemes for the shallow water equations,' version 1. Zenodo, accessed 22 May 2023, https://doi.org/10.5281/zenodo.7958518. Lilly, J. R., 2023b : Source code for rk-fb-optimization, version 1. Zenodo, accessed 22 May 2023, https://doi.org/10.5281/zenodo.7958458. Lilly, J. R., G. Capodaglio, M. R. Petersen, S. R. Brus, D. Engwirda, and R. L. Higdon, 2023 : Storm surge modeling as an application of local time-stepping in MPAS-ocean. J. Adv. Model. Earth Syst., 15, e2022MS003327, https://doi.org/10.1029/2022MS003327. Madec, G., and Coauthors, 2017 : NEMO ocean engine, version 3.6. Notes du Pôle de modélisation de l'Institut Pierre-Simon Laplace (IPSL) 27, 412 pp., https://doi.org/10.5281/zenodo.3248739. Mesinger, F., 1977 : Forward-backward scheme, and its use in a limited area model. Contrib. Atmos. Phys., 50, 200 – 210. Petersen, M. R., and Coauthors, 2019 : An evaluation of the ocean and sea ice climate of E3SM using MPAS and interannual CORE-II forcing. J. Adv. Model. Earth Syst., 11, 1438 – 1458, https://doi.org/10.1029/2018MS001373. Purser, R. J., 2007 : Accuracy considerations of time-splitting methods for models using two-time-level schemes. Mon. Wea. Rev., 135, 1158 – 1164, https://doi.org/10.1175/MWR3339.1. Ringler, T. D., J. Thuburn, J. B. Klemp, and W. C. Skamarock, 2010 : A unified approach to energy conservation and potential vorticity dynamics for arbitrarily-structured C-grids. J. Comput. Phys., 229, 3065 – 3090, https://doi.org/10.1016/j.jcp.2009.12.007. Ringler, T. D., M. Petersen, R. L. Higdon, D. Jacobsen, P. W. Jones, and M. Maltrud, 2013 : A multi-resolution approach to global ocean modeling. Ocean Modell., 69, 211 – 232, https://doi.org/10.1016/j.ocemod.2013.04.010. Roullet, G., and T. Gaillard, 2022 : A fast monotone discretization of the rotating shallow water equations. J. Adv. Model. Earth Syst., 14, e2021MS002663, https://doi.org/10.1029/2021MS002663. Shchepetkin, A. F., and J. C. McWilliams, 2005 : The Regional Oceanic Modeling System (ROMS): A split-explicit, free-surface, topography-following-coordinate oceanic model. Ocean Modell., 9, 347 – 404, https://doi.org/10.1016/j.ocemod.2004.08.002. Shchepetkin, A. F., and J. C. McWilliams, 2009 : Computational kernel algorithms for fine-scale, multiprocess, longtime oceanic simulations. Handbook of Numerical Analysis, R. M. Temam and J. J. Tribbia, Eds., Computational Methods for the Atmosphere and the Oceans, Vol. 14, Elsevier, 121–183, https://doi.org/10.1016/S1570-8659(08)01202-0. Skamarock, W. C., J. B. Klemp, M. G. Duda, L. D. Fowler, S.-H. Park, and T. D. Ringler, 2012 : A multiscale nonhydrostatic atmospheric model using centroidal Voronoi tesselations and C-grid staggering. Mon. Wea. Rev., 140, 3090 – 3105, https://doi.org/10.1175/MWR-D-11-00215.1. Ullrich, P. A., C. Jablonowski, and B. van Leer, 2010 : High-order finite-volume methods for the shallow-water equations on the sphere. J. Comput. Phys., 229, 6104 – 6134, https://doi.org/10.1016/j.jcp.2010.04.044. Virtanen, P., and Coauthors, 2020 : SciPy 1.0: Fundamental algorithms for scientific computing in Python. Nat. Methods, 17, 261 – 272, https://doi.org/10.1038/s41592-019-0686-2. Wicker, L. J., and W. C. Skamarock, 2002 : Time-splitting methods for elastic models using forward time schemes. Mon. Wea. Rev., 130, 2088 – 2097, https://doi.org/10.1175/1520-0493(2002)130<2088:TSMFEM>2.0.CO;2. Williamson, D. L., J. B. Drake, J. J. Hack, R. Jakob, and P. N. Swarztrauber, 1992 : A standard test set for numerical approximations to the shallow water equations in spherical geometry. J. Comput. Phys., 102, 211 – 224, https://doi.org/10.1016/S0021-9991(05)80016-6.

By Jeremy R. Lilly; Darren Engwirda; Giacomo Capodaglio; Robert L. Higdon and Mark R. Petersen

Reported by Author; Author; Author; Author; Author

Titel:
CFL Optimized Forward-Backward Runge-Kutta Schemes for the Shallow Water Equations
Autor/in / Beteiligte Person: Lilly, Jeremy R. ; Engwirda, Darren ; Capodaglio, Giacomo ; Higdon, Robert L. ; Petersen, Mark R.
Link:
Veröffentlichung: 2023
Medientyp: report
DOI: 10.1175/MWR-D-23-0113.1
Schlagwort:
  • Mathematics - Numerical Analysis
Sonstiges:
  • Nachgewiesen in: arXiv
  • Collection: Computer Science ; Mathematics
  • Document Type: Working Paper

Klicken Sie ein Format an und speichern Sie dann die Daten oder geben Sie eine Empfänger-Adresse ein und lassen Sie sich per Email zusenden.

oder
oder

Wählen Sie das für Sie passende Zitationsformat und kopieren Sie es dann in die Zwischenablage, lassen es sich per Mail zusenden oder speichern es als PDF-Datei.

oder
oder

Bitte prüfen Sie, ob die Zitation formal korrekt ist, bevor Sie sie in einer Arbeit verwenden. Benutzen Sie gegebenenfalls den "Exportieren"-Dialog, wenn Sie ein Literaturverwaltungsprogramm verwenden und die Zitat-Angaben selbst formatieren wollen.

xs 0 - 576
sm 576 - 768
md 768 - 992
lg 992 - 1200
xl 1200 - 1366
xxl 1366 -