High-order filtered schemes for the Hamilton-Jacobi continuum limit of nondominated sorting

We investigate high-order finite difference schemes for the Hamilton-Jacobi equation continuum limit of nondominated sorting. Nondominated sorting is an algorithm for sorting points in Euclidean space into layers by repeatedly removing minimal elements. It is widely used in multi-objective optimization, which finds applications in many scientific and engineering contexts, including machine learning. In this paper, we show how to construct filtered schemes, which combine high order possibly unstable schemes with first order monotone schemes in a way that guarantees stability and convergence while enjoying the additional accuracy of the higher order scheme in regions where the solution is smooth. We prove that our filtered schemes are stable and converge to the viscosity solution of the Hamilton-Jacobi equation, and we provide numerical simulations to investigate the rate of convergence of the new schemes.


Introduction
In this paper, we investigate high-order finite difference schemes for the Hamilton-Jacobi equation u x 1 · · · u xn = f in (0, 1] n u = 0 on ∂[0, 1] n \ (0, 1] n , (1.1) where f ≥ 0. This Hamilton-Jacobi equation is the continuum limit of nondominated sorting, which is an algorithm for arranging points in Euclidean space into layers by peeling away extremal points. Nondominated sorting is fundamental in multi-objective optimization problems, which are ubiquitous in science and engineering, and more recently in machine learning. For more details on the connection to nondominated sorting and applications, we refer the reader to [3,[5][6][7][8][9][11][12][13][14]. The Hamilton-Jacobi equation (1.1) has a unique non-decreasing viscosity solution. In order to select the viscosity solution of (1.1), the finite difference scheme is required to be monotone [1]. Roughly speaking, the monotonicity property leads to a maximum principle for the scheme, which is one of the main techniques for proving stability and ensuring convergence to the viscosity solution. Unfortunately, all monotone schemes are necessarily first order at best [4].
It has been observed [1,16] that the monotonicity property can be relaxed to hold only approximately, with a residual error that vanishes as the grid is refined, while still ensuring the scheme converges to the viscosity solution. This allows one to design so-called filtered schemes, which blend together high-order nonmonotone schemes with monotone first-order schemes in such a way that the resulting filtered scheme is approximately monotone. The idea at a high level is to use the higher order scheme in regions where the solution is smooth while falling back on the monotone scheme near singularities. High order filtered schemes have received a lot of attention recently [2,10,[15][16][17][18]20].
In this paper, we show how to construct arbitrary order filtered upwind finite schemes for the Hamilton-Jacobi equation (1.1). The upwind direction for this Hamilton-Jacobi equation is to look backwards along the coordinate axes. Therefore, our upwind schemes all use backward difference quotients in order to follow the flow of the characteristics and select the viscosity solution. This allows the schemes to be solved in a single pass yielding fast (linear complexity) algorithms. We prove that our filtered schemes are stable and convergent for any order, and we present numerical simulations investigating rates of convergence.
Our numerical simulations lead to two surprising conclusions that merit future work. First, we observe that backward differences (without filtering) for the two dimensional version of (1.1) appear to be stable for order k ≤ 2, borderline stable for k = 3 (numerical solutions remain bounded but do not converge), and highly unstable for k ≥ 4. We recall that it is a classical fact that backward differences for the one-dimensional version of (1.1) (e.g., u ′ (x) = f (x)) are stable for order k ≤ 6 and unstable for k ≥ 7. It would be interesting to prove the order k = 2 scheme is stable for (1.1), and examine the situation in higher dimensions. We note the filtered higher order schemes are stable for any k.
Second, we observe that filtering higher order schemes with first order monotone schemes is only successful at increasing accuracy for order k = 2. For order k ≥ 3, we find that the filtering relies too much on the first order scheme, and while the schemes are stable and convergent, the order of accuracy is closer to first order. This is true even when the solution of (1.1) is smooth. As far as we are aware, this observation has not appeared in the literature on filtered schemes; the existing literature [2,10,[15][16][17][18]20] has only considered numerical experiments with second order schemes. This observation refutes the conventional wisdom that one can filter any higher order scheme-the choice of the higher order scheme may be crucial, and would be an interesting problem to pursue in future work.
We should note that while some filtered schemes show higher order convergence rates in some test cases, there are no proofs that any schemes have convergence rates better than first order in general. This is a limitation in the viscosity solution theory; in fact, since solutions are not classical, the best provable rate in general is O( √ h), with a one-sided O(h) rate when the solution is semi-concave [4]. Even in the special case where the solution of the Hamilton-Jacobi equation is smooth, it is generally difficult to prove a higher order convergence rate, since it requires a strong stability result for the higher order scheme and the maximum principle is unavailable.
The coefficients of the difference quotients in our schemes can be obtained by solving a small linear system involving a Vandermonde matrix, as is common in the literature. As an interesting addition to the paper, we give explicit formulas for the coefficients for arbitrary order backward difference quotients, and give simple direct proofs of the formulas. We expect these coefficients have appeared explicitly before in the literature, but we include the results for completeness. Our method extends to computing the coefficients for nonsymmetric and offset centered differences, which we explore in Section 4. We do not explicitly invert the Vandermonde matrix, but this could be done via an LU factorization as was given in [19].
In [7] a fast approximate nondominated sorting algorithm was developed based on estimating the distribution of the data, which is the right-hand side f , and solving the Hamilton-Jacobi equation (1.1) numerically. The algorithm is called PDE-based ranking and was shown to be significantly faster than nondominated sorting in low dimensions. The higher order filtered schemes developed in this paper can be directly used in the PDE-based ranking algorithm to improve the accuracy with minimal additional computational cost. Thus, this work has the potential to have a broad impact in applications of nondominated sorting.
This paper is organized as follows. In Section 2 we present our filtered schemes and prove stability and convergence. In Section 3 we present the results of numerical simulations, and in Section 4 we give explicit formulas for the coefficients of various difference quotients.

Filtered schemes
In this section we introduce our filtered schemes and prove stability and convergence.

Backward differences
Since the upwind direction for (1.1) is the negative orthant, our higher order schemes will all use backward difference quotients. The following theorem gives the exact coefficients for all backward different quotients for first derivatives. We expect this is known in the literature, but we give the proof for completeness. Theorem 2.1. Let k be a positive integer and f ∈ C k+1 . Then The proof relies on an elementary lemma, which is useful to state independently. Proof. Let So, It's can also be observed that for all k ≤ n − 1, q k (1) = 0. Therefore, We now give the proof of Theorem 2.1.
Proof. By using a Taylor series expansion, we obtain Rearranging the left hand side of equation ( By binomial expansion, By Lemma 2.2, for j = 2, . . . , k, Therefore, we yield as desired.
We can also express the backward difference quotient in a more usual form.
Proof. It's easy to see that d i = c i (defined as in Theorem 2.1) for i = 0. In the case when i = 0, we have

High-order filtered finite difference schemes
The Hamilton-Jacobi equation (1.1) does not have smooth or even Lipschitz solutions, due to a gradient singularity near the boundary where x i = 0. Indeed, in the special case that f = 1 the viscosity solution is u = n(x 1 · · · x n ) 1/n . Following [5] we first perform a singularity factorization before solving the Hamilton-Jacobi equation. In particular, let u be the viscosity solution of (1.1) and define It is possible to show [5] that w is Lipschitz continuous and satisfies We can also show that w is the unique bounded viscosity solution of the Hamilton-Jacobi equation Although it appears that (2.4) does not have a boundary condition, it is in fact encoded into the Hamilton-Jacobi equation, since w(0) n = f (0). See [5] for more details on the above two statements. The idea from [5], which we borrow here, is to solve (2.4) numerically and then undo the transformation (2.2) to obtain a numerical approximation of u. The work in [5] explored first order finite difference schemes for (2.4) and proved optimal convergence rates of O( √ h). We extend these ideas here to higher order filtered schemes. In light of Corollary 2.3 we define for u : [0, 1] n → R the k th -order backward difference quotient to be Here, h > 0 is the grid resolution, and as a convention we take u(x) = 0 whenever x ∈ [0, 1] n . The boundary value is irrelevant and does not enter into the scheme. By Corollary 2.3 we have that We define the first order upwind scheme approximating the left-hand side of (2.4) to be (2.6) The first order scheme from [5] corresponds to solving where Ω h := Ω ∩ hZ n for any Ω ⊆ R n . This scheme has a unique solution, and is monotone, stable, and convergent to the viscosity solution. Furthermore, this scheme can be solved with a (linear time) fast sweeping algorithm visiting each grid point exactly once. For more details on this scheme, we refer the reader to [5]. We define the k th -order upwind finite difference scheme approximating the left-hand side of (2.4) by The k th -order upwind finite difference scheme is then given by We note that this scheme is not monotone and may not have a unique solution. We can construct a solution with a fast sweeping method, as we did with the first order scheme, but there is no guarantee that the scheme will be stable or convergent to the viscosity solution. In Section 3 we present results of simulations suggesting that the second order scheme is stable and convergent, but higher order schemes are unstable. The k th -order filtered upwind finite difference approximation of the left-hand side of (2.4) is given by The k th -order filtered upwind finite difference scheme is then given by The idea behind the filtering is that when the solution is smooth we should have |F k −F 1 | ≤ Ch and so the higher order scheme is selected. In regions where the solution is not smooth, the filtered scheme falls back on the monotone and stable first order scheme. The key property of filtered schemes is that any solution w h of (2.10) also satisfies By approximately solving the first order scheme, the solutions of the filtered schemes inherit the stability and convergence properties of the first-order scheme. We note that solutions of (2.10) may not be unique, since the scheme is not monotone. We compute the solution as follows: At each grid point we solve both of the schemes F 1 = f and F k = f for w h (x), taking the largest root when there are multiple solutions. We then take the solution of F k = f and check if the first property in (2.9) is satisfied. If so, we choose this solution for w h (x), otherwise we select the solution of F 1 = f . We analyze stability and convergence in the next section.

Stability and convergence
We prove here stability and convergence of the filtered schemes given in (2.10).
Theorem 2.4. Assume that f ≥ 0 and let w h be a solution of (2.10). Then we have This establishes a lower bound of w h .
To determine an upper bound of w h , we note that the filtered scheme satisfies At the maximum of w h , we have Thus max w h ≤ (max f + √ h) 1/n . This establishes an upper boundary of w h . Therefore, and the stability follows.
Given the stability result in Theorem 2.4, it is now a standard application of the Barles-Souganidis framework [1] to prove convergence of the filtered scheme (2.10) to the viscosity solution of (2.4). In fact, the proof of convergence of the filtered schemes is very similar to the results in [5]. Let us mention, however, that these convergence results do not establish any convergence rate. The best provable convergence rate for the filtered scheme is still O( √ h). In practice, we do often see far better convergence rates for the filtered schemes, and we present simulations investigating convergence rates in Section 3.

Numerical simulations
We run simulations on both backward difference schemes and filtered schemes of orders 1, 2, 3, 5, 8, and 13 with two probability density functions f 1 and f 2 that were introduced before in [5]. The function f 1 is defined as follows: where k > 0. In the simulations, we set k = 20. The solution of (1.1) in this case is known to be We note that the solution u 1 is smooth on (0, 1] 2 . The function f 2 is defined as follows: where x(i) = x πx(i) for a permutation π x such that x(1) ≤ x(2), and We set C = 10 in the simulations. The solution in this case is known to be The function f 2 is Lipschitz and the solution u 2 has a gradient discontinuity. This is common in Hamilton-Jacobi equations due to crossing characteristics.
Given these f 1 and f 2 , we gather the errors from their numerical solutions compared to the known solutions for each order of each scheme in different mesh sizes h. These errors are measured in both the L 1 norm and the L ∞ norm as numerical evidence of the rate of    Table 1: Fraction of grid points for which the k th order scheme being used in the filtered schemes.
convergence. The results from the simulations are shown in Figure 1 - Figure 8. Note that here we use "S" to denote "backward difference scheme" and "FS" to denote "filtered scheme". The results from the simulations suggest that the unfiltered 2 nd order backward difference scheme is convergent with a second order rate. On the other hand, the backward difference schemes of order higher than two appear to be unstable. In fact, the errors for the unfiltered schemes for order k = 5, 8, 13 are so large they are not shown in the figures.
Observing the errors from filtered schemes in both the L 1 norm and the L ∞ norm, we see that the 2 nd order filtered scheme also tends to give better accuracy than the 1 st order one, but other higher order filtered schemes only give comparable accuracies to the 1 st order scheme. A further investigation shows that high order filtered schemes rely most of the time on the first order scheme in solving (1.1). This explains why higher order filtered schemes do not produce better accuracy than lower order ones. To give a better idea, we show the fraction of grid points for which the k th order scheme is being used for various mesh sizes and orders when setting f = f 1 in Table 1. Since setting f = f 2 gives a similar result, we omit showing the same table of data in this case. This explains why filtering is not successful for higher order schemes. It would be interesting to determine why this is happening and whether it can be improved by using different schemes or a different type of filtering.

Explicit formulas for difference quotients
Our proof of the backward difference formulas in Section 2.1 can be extended to more general difference quotients. We present these results in this section. The first part will present results regarding the coefficients of finite difference quotients for f ′ . The second part will be devoted for generalizing the coefficients of finite difference quotients for f (n) where n is any positive integer. Although we expect some of these coefficients to have appeared in the literature before, we include this section for completeness of the paper. Additionally, we hope to provide different approaches for proving the formulas that may be simpler and more direct.

Other finite differences for f ′ (x)
We begin by introducing the notation for an extension of binomial coefficient.
We found that Lemma 2.2 can be used for computing the k−order forward difference quotients as well. Compared to backward differences, these quotients simply have opposite signs as stated in Theorem 4.1 and Corollary 4.3.
The proofs for both of the above theorem and corollary will be omitted as they can be proceeded in the same way as the proofs for Theorem 2.1 and Corollary 2.3, respectively.
Another finite difference quotient that is often implemented is centered difference quotient. With Lemma 2.2, we can also generalize centered difference quotients as shown in Theorem 4.1.  Proof. By using a Taylor series expansion, we obtain
The next finite difference quotient that will be introduced in Theorem 4.1 uses arithmetic progression differences in computing f ′ (x). Essential lemmas for proving its coefficients are given below.

Proof.
Writing one can compare the coefficient of x p on both sides of the equation and yield Corollary 4.6. For any given real number r and positive integer p, the polynomial r (x) = x r (x + 1) −r , and for k ≥ 1, dx .
Then for k = 1, 2, . . . , p, r (x) is a polynomial of degree less than p.
Proof. We will show, by mathematical induction, that for all k ≥ 1, where g(x) is a polynomial of degree less than k. It is true for k = 1 because s [1] r (x) = r · x r · (x + 1) −(r+1) . Let's assume that this is true for some positive integer l. Then we will have that Since g(x) has degree less than l, it is not difficult to see that has degree less than l + 1. Hence, g(x) has degree less than k for all k ≥ 1. Therefore, for k = 1, 2, . . . , p, is a polynomial of degree less than p.
By letting, By Lemma 4.7, for k = 1, 2, , . . . , p , the polynomial Q r,p (x) is of degree less than p. Therefore, Theorem 4.9 (Arithmetic Progression Difference). Let a be a real number, d be a nonzero real numbers, and k be a positive integer such that 0 / ∈ {a 1 , a 2 , . . . , a k }, where a i = a + d(i − 1). Then where for i = 1, 2, . . . , k.
Proof. By using a Taylor series expansion, we obtain We claim that Rearranging the left hand side of equation (4.2), we deduce and the desired result follows.

General forms of finite differences for f (n)
We extend the idea from previous results to generalize the coefficients of finite difference quotients for f (n) . In computing them, we need to solve for the inverses of the matrices defined as in Theorem 4.14 and 4.15. Because of its generality, the results presented in this section will also hold for all the previous results. Indeed, the first columns of these inverses represent the coefficients of a finite difference quotient for f ′ , the second columns represent the coefficients of a finite difference quotient for f ′′ , and so on. Some notations and lemmas that are essential for proving them are given below. λ,r (x) = x r (x + 1) −(r−λ) , and for k ≥ 1, λ,r (x) dx .
Then for k = 1, 2, . . . , p, the coefficient of the term x p in the polynomial is λ k .
Proof. We will show by mathematical induction that for all k ≥ 1, where g(x) is a polynomial of degree k with the leading coefficient of λ k . It is true when k = 1 because t [1] λ,r (x) = (λx + r) · x r · (x + 1) −r+λ+1 .
Let's assume that this is true for some positive integer l. Then we will have that Since the leading term of g(x) is λ l x l , it follows that the leading coefficient of Hence, g(x) is a polynomial of degree k with the leading coefficient of λ k for all k ≥ 1. Therefore, for k = 1, 2, . . . , p, the coefficient of the term x p in the polynomial is λ k .
Lemma 4.11. Let r be a real number and p be a positive integer. Define the polynomial Then, for i = 1, . . . , p, Lemma 4.14. For any real number a, nonzero real number d, and positive integer n such that 0 / ∈ T = {a, a + d, . . . , a + (n − 1)d}, define A to be an n × n−matrix with Then Proof. We will first show that AB = I n . By directly computing the product of A and B, we deduce We note that (AB) ij equals the coefficient of This establishes AB = I n as desired. Now we will show that BA = I n . By directly computing the product of B and A, we deduce where H i is defined as in Lemma 4.11. Thus, by Lemma 4.11, This establishes BA = I n . Therefore, A −1 = B.
and D * is an n × (n + m)−matrix with The proof for Lemma 4.15 is omitted since it can be proceeded in the similar way as the proof for Lemma 4.14. The reader who is interested in proving it may set d = 1 for simplicity. Once the case d = 1 is proven, it is not difficult to see that the result also holds in the case d = 1.
It is not practical to explicitly express matrix B. However, it is not too difficult to express the first, the last, and the second last columns of B. We already show how to compute the first column in the previous two sections (aka Theorem 2.1 -4.1). In this section, we will only show how to compute the last and the second last columns. Their proofs will be omitted due to the fact that they are derived directly from Lemma 4.14 and Lemma 4.15.
Computing the last column of B Theorem 4.16. Let a be a real number, d be a nonzero real number, and n be a positive integer such that 0 / ∈ {a 1 , a 2 , . . . , a n }, where a i = a + d(i − 1). Then where c a i = (−1) n+i a i · 1 d n−1 (i − 1)!(n − i)! , for i = 1, 2, . . . , n. Computing the second last column of B Theorem 4.18. Let a be a real number, d be a nonzero real number, and n be a positive integer such that 0 / ∈ {a 1 , a 2 , . . . , a n }, where a i = a + d(i − 1). Then where c a i = (−1) n+i−1 a i · n 2 (a 1 + a n ) − a i d n−1 (i − 1)!(n − i)! , for i = 1, 2, . . . , n.

Conclusions
In this paper, we showed how to construct filtered schemes for the Hamilton-Jacobi equation continuum limit of nondominated sorting by combining high order possibly unstable schemes with first order monotone and stable schemes. We proved that the filtered schemes are stable and convergent for all orders. We then investigated both high-order unfiltered and filtered schemes for the Hamilton-Jacobi equation by implementing both schemes for order k = 1, 2, 3, 5, 8, and 13 numerically solving the equations in various mesh sizes. The errors from their numerical solutions compared to the known solutions were measured in the L 1 norm and the L ∞ norm. Our results suggest that the unfiltered schemes of order higher than 2 are unstable while the 1 st order and 2 nd order unfiltered schemes remain stable. Moreover, we see that the 2 nd order unfiltered scheme shows 2 nd order accuracy. Similarly to the unfiltered schemes, we see that the 2 nd order filtered scheme seems to show 2 nd order accuracy. However, it turns out that the filtered schemes of order higher than 2 only exhibit a 1 st order convergence rate. Upon further investigation, this appears to be due to fact that the filtering relies too often on the 1 st order scheme. Future work would include proving stability of the second order unfiltered scheme, and investigating techniques to improve the accuracy of the higher order filtered schemes.