A Clustering Method to Solve Backward Stochastic Differential Equations with Jumps
1. Introduction
Solving BSDEs is of great importance in mathematical finance. Derivative pricing problems, dynamic portfolio choice problems, and even dynamic stochastic general equilibrium problems can all be related to various types of BSDEs. However, due to its recursive nature, a BSDE with jumps cannot be solved directly by a plain Monte Carlo method.
In this paper, we apply the methodology for evaluating conditional expectations originally introduced in [1] to solve BSDEs with jumps. This method simplifies the structure of the approximation to the true solution by working locally. Under the assumption that the solution is continuous, a local linear regression approximation is accurate enough at a very limited computational budget, and convergence is shown. Specifically, we use Monte Carlo simulation together with a machine-learning clustering method to divide the state space into small clusters, within each of which we approximate the true solution via linear regression.
Recent literature on machine learning methods for solving PDEs/BSDEs can be found, for example, in [2] [3] [4] [5] and [6]. In those references, the authors perform a brute-force machine learning approximation of the true solutions, and the methods are therefore time-consuming. Our method can be viewed as a generalization of these approaches, in that, when the number of clusters is set to 1, our method degenerates to the aforementioned ones.
The organization of this paper is as follows. Section 2 describes the method and the convergence proof. Section 3 discusses an application in finance and Section 4 concludes. All proofs can be found in the Appendix.
2. A Formal Description of Our Approach
2.1. The BSDE
Throughout the paper, we consider a filtered probability space $(\Omega, \mathcal{F}, \mathbb{F}, \mathbb{P})$, with $\mathbb{F} = \{\mathcal{F}_t\}_{0 \le t \le T}$ and a finite horizon $T > 0$. The space supports a d-dimensional Brownian motion $W$ and a Poisson random measure $N$ on $([0,T] \times E, \mathcal{B}([0,T]) \otimes \mathcal{E})$, where $\mathcal{B}([0,T])$ is the Borel σ-algebra on $[0,T]$ and $(E, \mathcal{E})$ is a measurable space. Define $E \subseteq \mathbb{R}^q \setminus \{0\}$ and $\mathcal{E} = \mathcal{B}(E)$ as the Borel σ-algebra on $E$. $\mathbb{P}$ is the probability measure on $(\Omega, \mathcal{F})$. The filtration $\mathbb{F}$ is completed with all $\mathbb{P}$-null sets, is right-continuous, and $\mathcal{F}_t$ is generated by $\{W_s,\ N([0,s] \times A) : 0 \le s \le t,\ A \in \mathcal{E}\}$ for $t \in [0,T]$. Assume that $\mathcal{F} = \mathcal{F}_T$ and that $W$ and $N$ are mutually independent under $\mathbb{P}$. Suppose that the compensating measure of $N$ is $\nu(de)\,dt$, where $\nu$ is a σ-finite measure on $(E, \mathcal{E})$ satisfying $\int_E (1 \wedge |e|^2)\,\nu(de) < \infty$. The corresponding compensated Poisson random measure is defined by $\tilde{N}(dt, de) = N(dt, de) - \nu(de)\,dt$.
The BSDEJ under consideration, whose solution is denoted by $(Y, Z, V)$, is

$\begin{cases} X_t = X_0 + \displaystyle\int_0^t \mu(s, X_s)\,ds + \int_0^t \sigma(s, X_s)\,dW_s + \int_0^t\!\!\int_E \beta(s, X_{s-}, e)\,\tilde{N}(ds, de), \\[4pt] Y_t = g(X_T) + \displaystyle\int_t^T f(s, X_s, Y_s, Z_s, V_s)\,ds - \int_t^T Z_s\,dW_s - \int_t^T\!\!\int_E V_s(e)\,\tilde{N}(ds, de), \end{cases}$ (1)

where the terminal condition $Y_T = g(X_T)$ is given by a smooth and bounded function $g$. The vector of state variables $X$ is r-dimensional and satisfies the Forward-SDE (FSDE) indicated. The standard Brownian motion $W$ is d-dimensional, $Y$ is 1-dimensional, $Z$ is d-dimensional, the jump marks $e \in E$ are q-dimensional, the driver $f$ is 1-dimensional and $V$ is defined on the q-dimensional mark space $E$. The triple $(Y, Z, V)$ is called the solution to the BSDEJ system (1). We assume that
throughout the paper without loss of generality.
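Before describing the numerical scheme, here is a minimal sketch of how M sample paths of a forward jump-diffusion process could be simulated with an Euler scheme. The one-dimensional dynamics, the parameter values and the function name simulate_jump_diffusion_paths are illustrative assumptions of this sketch, not the specific FSDE in (1).

```python
import numpy as np

def simulate_jump_diffusion_paths(M, H, T, x0, mu, sigma, lam,
                                  jump_mean, jump_std, seed=0):
    """Euler scheme for an illustrative one-dimensional jump-diffusion.
    Returns an array of shape (M, H + 1) with paths sampled at t_i = i*T/H."""
    rng = np.random.default_rng(seed)
    dt = T / H
    X = np.empty((M, H + 1))
    X[:, 0] = x0
    for i in range(H):
        dW = rng.normal(0.0, np.sqrt(dt), size=M)        # Brownian increments
        jump = rng.random(M) < lam * dt                   # at most one jump per step (small-dt approximation)
        jump_sizes = np.where(jump, rng.normal(jump_mean, jump_std, size=M), 0.0)
        X[:, i + 1] = X[:, i] * (1.0 + mu * dt + sigma * dW + jump_sizes)
    return X

# Example: 50,000 paths on [0, 1] with 50 time steps.
paths = simulate_jump_diffusion_paths(M=50_000, H=50, T=1.0, x0=100.0,
                                      mu=0.05, sigma=0.2, lam=0.3,
                                      jump_mean=-0.1, jump_std=0.05)
```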
2.2. The Numerical Approximation Scheme
This section describes the construction of the approximate solution to the BSDEJ (1). Given a time grid $0 = t_0 < t_1 < \cdots < t_H = T$, the intermediate solutions can be obtained by the following equation, according to [7],

$Y_{t_i} = \mathbb{E}\big[\, Y_{t_{i+1}} + f\big(t_{i+1}, X_{t_{i+1}}, Y_{t_{i+1}}, Z_{t_{i+1}}, V_{t_{i+1}}\big)\,(t_{i+1} - t_i) \,\big|\, X_{t_i} \big],$ (2)

together with analogous conditional-expectation formulas for $Z_{t_i}$ and $V_{t_i}$ (see [7], Equation 2.10), where $Y_{t_i}$, $Z_{t_i}$ and $V_{t_i}$ are functions of $X_{t_i}$. It is therefore crucial to be able to evaluate the above conditional expectations for each $t_i$.
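For concreteness, the following is a minimal sketch of how one such conditional expectation could be approximated by least-squares regression on simulated samples. The polynomial basis, the helper name conditional_expectation_ols and the illustrative response variable are assumptions of this sketch rather than the paper's specification.

```python
import numpy as np

def conditional_expectation_ols(x_now, y_next, degree=3):
    """Approximate E[y_next | x_now] by regressing y_next on a polynomial
    basis of the (one-dimensional) state x_now via ordinary least squares.
    Returns the fitted values and a function x -> estimated conditional expectation."""
    basis = np.vander(x_now, degree + 1)                  # columns: x^degree, ..., x, 1
    coef, *_ = np.linalg.lstsq(basis, y_next, rcond=None)
    fitted = basis @ coef
    predictor = lambda x: np.vander(np.atleast_1d(x), degree + 1) @ coef
    return fitted, predictor

# Illustrative use with the simulated paths from the previous sketch:
# regress a payoff observed at t_{i+1} on the state at t_i.
# x_now, y_next = paths[:, i], np.maximum(paths[:, i + 1] - 100.0, 0.0)
# cond_exp, f_hat = conditional_expectation_ols(x_now, y_next)
```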
Suppose that the domain of the state variables $X$ is $\mathcal{D}$.1 Partition $\mathcal{D}$ by a class of disjoint sets $\{A_k\}_{k=1}^{K}$. Denote the distance $d(x, A) = \inf_{y \in A} \|x - y\|$, where $\|\cdot\|$ is the canonical Euclidean distance. It is known that the evaluation of a conditional expectation is essentially a regression problem, in the following sense: for a square-integrable random variable $U$,

$\mathbb{E}[\, U \mid X_t \,] = \varphi^*(X_t), \qquad \varphi^* = \operatorname*{arg\,min}_{\varphi \in \mathcal{H}}\ \mathbb{E}\big[\, (U - \varphi(X_t))^2 \,\big],$ (3)

where $\mathcal{H}$ is an appropriate function space with domain $\mathcal{D}$.2 First, we need the following result.
Lemma 1. We have
(4)
in
sense.
With the above result in mind, we will always write $\mathcal{D}_I$ simply as $\mathcal{D}$ and assume that $\mathcal{D}$ is bounded and closed. Many methods have emerged to find global solutions to Equation (3). However, in this paper we follow a divide-and-conquer approach and fit the regression locally in each of the subspaces $A_k$, for $k = 1, \ldots, K$. This means that we should evaluate

$\varphi_k^* = \operatorname*{arg\,min}_{\varphi_k \in \mathcal{H}_k}\ \mathbb{E}\big[\, (U - \varphi_k(X_t))^2\, \mathbf{1}_{\{X_t \in A_k\}} \,\big],$ (5)

where $\mathcal{H}_k$ is an appropriate function space with domain $A_k$, under a suitable consistency constraint on the $\mathcal{H}_k$. ( [1], Theorem 23) concludes that, indeed, concatenating the local solutions $\varphi_k^*$ over $k = 1, \ldots, K$ recovers the global regression function of (3). In ( [1], Theorem 24), this constraint is relaxed under certain conditions. In this paper, we use the two aforementioned theorems and show that polynomial regressions and Taylor expansions for the local regression functions, combined with Monte Carlo simulation to generate samples and a machine-learning clustering method to partition the simulated paths, result in convergent numerical methods. In order to use a Taylor expansion, we need the regression function to be $l$-smooth, meaning it should have bounded derivatives up to order $l$ in $\mathcal{D}$. This requires us to approximate it using $l$-smooth functions. The following assumption applies throughout the paper.
Assumption 1. (On
). Assume that, in some sense,
can be approximated by a sequence of
-smooth functions
, with l finite or
. Moreover, we have
(6)
for any
.
Remark 2. For example,
can converge to
in a point-wise sense or
sense. Because of Assumption 1, we will always assume that
is
-smooth.
Then, we have the following theorems.
Theorem 3 (On Polynomial Regression). Assume that3
(7)
where
is the space of all polynomials of degree equal to or less than J with domain
. Then, we have
(8)
in
sense with
fixed.
Theorem 4 (On Taylor Expansion). Assume that
and
(9)
where
. Then, we have
(10)
in
sense with
fixed.
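To make the two local approximations concrete, here is a hedged illustration, in notation chosen for this sketch rather than taken from the theorems above, of the type of object fitted within a single cluster $A_k$ with center $\bar{x}_k$: a local polynomial of degree J, estimated by least squares in the regression variant, or built from derivatives at $\bar{x}_k$ in the Taylor-expansion variant.

```latex
% Local approximation on cluster A_k (multi-index alpha, cluster center \bar{x}_k):
\widehat{\varphi}_k(x) \;=\; \sum_{|\alpha| \le J} c_{k,\alpha}\, (x - \bar{x}_k)^{\alpha},
\qquad x \in A_k .
% Polynomial-regression variant: the c_{k,\alpha} are least-squares coefficients
% computed from the simulated samples falling in A_k.
% Taylor-expansion variant: c_{k,\alpha} plays the role of D^{\alpha}\varphi(\bar{x}_k)/\alpha!,
% with accuracy controlled by the l-smoothness of \varphi and the diameter of A_k.
```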
The following theorem establishes the convergence under Monte Carlo simulation.
Theorem 5 (On Monte Carlo Simulation). Assume that there is a time-discretization
and we obtain M samples
. Then, we have
(11)
where
is obtained via OLS in the cases of polynomial regression and Taylor expansion, in
sense with
fixed.
Theorem 5 is ultimately the convergence result for the methodology we propose. We first generate M copies of the sample paths of X from 0 to T. Then, we use the MiniBatchKMeans function from the Python scikit-learn library to partition the M simulated values of X, at each time $t_i$, into K clusters. Within each cluster $A_k$, for $k = 1, \ldots, K$, we perform the analysis shown above and concatenate the different pieces to obtain the values of the conditional expectations at time $t_i$. The methodology works backwards in time. Of course, there are many different choices of the local function spaces $\mathcal{H}_k$, depending on how we want to approximate the conditional expectations within each cluster $A_k$.
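The following is a minimal sketch of this clustering-plus-local-regression step for a single time node, using MiniBatchKMeans from scikit-learn as mentioned above. The number of clusters, the local polynomial degree and the one-dimensional state are illustrative assumptions.

```python
import numpy as np
from sklearn.cluster import MiniBatchKMeans

def clustered_conditional_expectation(x_now, y_next, n_clusters=20, degree=1, seed=0):
    """Partition the simulated states x_now into clusters with MiniBatchKMeans,
    then approximate E[y_next | x_now] within each cluster by a local polynomial
    (linear by default) least-squares regression. Returns the fitted values."""
    km = MiniBatchKMeans(n_clusters=n_clusters, random_state=seed)
    labels = km.fit_predict(x_now.reshape(-1, 1))     # cluster label of each sample
    fitted = np.empty_like(y_next, dtype=float)
    for k in range(n_clusters):
        idx = labels == k
        if not np.any(idx):
            continue                                   # skip empty clusters
        basis = np.vander(x_now[idx], degree + 1)      # local polynomial basis
        coef, *_ = np.linalg.lstsq(basis, y_next[idx], rcond=None)
        fitted[idx] = basis @ coef                     # concatenate the local fits
    return fitted

# Illustrative use at one time step, with paths from the earlier sketch:
# cont_value = clustered_conditional_expectation(paths[:, i], y_next, n_clusters=50)
```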
With the above discussion on the evaluation of conditional expectations, we can now describe the main numerical method for a BSDEJ. First, define a set of time nodes $0 = t_0 < t_1 < \cdots < t_H = T$, with $\Delta t_i = t_{i+1} - t_i$ and $h = \max_i \Delta t_i$. Suppose that $h \to 0$ as $H \to \infty$. Second, we simulate M copies of the forward process X at the discretization nodes $t_0, t_1, \ldots, t_H$. Third, we iterate backwards in time using the methodologies for evaluating conditional expectations outlined above. On the time interval $[t_i, t_{i+1}]$, use the MiniBatchKMeans method to cluster the simulated values $\{X_{t_i}^m\}_{m=1}^{M}$ into K disjoint subsets $A_1, \ldots, A_K$. Within each cluster $A_k$, apply the polynomial regression or Taylor expansion method to evaluate the conditional expectations in (2), which yields an intermediate function $Y_{t_i}^{H,M}$, where we add H and M to the notation to remind the reader that this intermediate function now depends on the time discretization and on the Monte Carlo simulation, respectively. After we obtain $Y_{t_i}^{H,M}$, we can treat it as the intermediate solution to the BSDEJ at time $t_i$. In order to evaluate Z and V, taking Z as an example, we use a relation of the form $Z_{t_i} = \frac{1}{\Delta t_i}\,\mathbb{E}\big[\, Y_{t_{i+1}}\, \Delta W_i \,\big|\, X_{t_i} \big]$, with $\Delta W_i = W_{t_{i+1}} - W_{t_i}$. The equations for Z and V are stated in ([7], Equation 2.10). When we obtain the intermediate approximate solution $(Y_{t_i}^{H,M}, Z_{t_i}^{H,M}, V_{t_i}^{H,M})$, we plug the tuple into the driver f of the BSDEJ, and $Y_{t_i}^{H,M}$ serves as the terminal condition for the evaluations on the time interval $[t_{i-1}, t_i]$. Iterate until we reach $t_0 = 0$.
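Putting the pieces together, below is a compact sketch of the backward iteration for a simplified BSDE without the jump component, so it is not the full BSDEJ scheme of this paper. The forward dynamics, the driver, the terminal function and all parameter values are illustrative assumptions.

```python
import numpy as np
from sklearn.cluster import MiniBatchKMeans

rng = np.random.default_rng(0)

# --- illustrative problem data (assumptions, not the paper's example) ---
T, H, M, K = 1.0, 50, 20_000, 20
dt = T / H
r, sigma, x0 = 0.05, 0.2, 100.0
g = lambda x: np.maximum(x - 100.0, 0.0)              # terminal condition g(X_T)
f = lambda y, z: -r * y                                # driver f(y, z); here -r*y, i.e. discounting

# --- forward simulation of X (geometric Brownian motion, Euler scheme) ---
dW = rng.normal(0.0, np.sqrt(dt), size=(M, H))
X = np.empty((M, H + 1)); X[:, 0] = x0
for i in range(H):
    X[:, i + 1] = X[:, i] * (1.0 + r * dt + sigma * dW[:, i])

def local_regression(x, y, n_clusters=K, degree=1):
    """Clustered local least-squares estimate of E[y | x] (see the previous sketch)."""
    labels = MiniBatchKMeans(n_clusters=n_clusters, random_state=0).fit_predict(x.reshape(-1, 1))
    fitted = np.zeros_like(y)
    for k in range(n_clusters):
        idx = labels == k
        if not np.any(idx):
            continue
        basis = np.vander(x[idx], degree + 1)
        coef, *_ = np.linalg.lstsq(basis, y[idx], rcond=None)
        fitted[idx] = basis @ coef
    return fitted

# --- backward iteration ---
Y = g(X[:, -1])                                        # terminal values Y_{t_H}
for i in range(H - 1, -1, -1):
    Z = local_regression(X[:, i], Y * dW[:, i]) / dt   # Z_{t_i} ~ E[Y_{t_{i+1}} dW_i | X_{t_i}] / dt
    EY = local_regression(X[:, i], Y)                  # E[Y_{t_{i+1}} | X_{t_i}]
    Y = EY + f(EY, Z) * dt                             # explicit backward Euler step

print("Approximate Y_0:", Y.mean())
```

In the full method, the jump component V would be estimated at each step from an additional conditional-expectation relation ([7], Equation 2.10), exactly as Z is here.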
2.3. Convergence Results
First, we introduce the necessary spaces to facilitate the discussion of the convergence results. For an $\mathbb{R}^n$-valued function $\phi$, let the sup-norm be

$\|\phi\|_{\infty} = \sup_x |\phi(x)|.$ (12)

We shall use the following spaces for stochastic processes, with $\beta \ge 0$:

• $\mathbb{S}^2_{\beta}$ is the set of $\mathbb{R}$-valued adapted càdlàg processes X such that

$\|X\|^2_{\mathbb{S}^2_{\beta}} = \mathbb{E}\Big[\, \sup_{0 \le t \le T} e^{\beta t} |X_t|^2 \Big] < \infty.$ (13)

Here $\beta$ is a nonnegative constant. We sometimes write $\mathbb{S}^2_{\beta}$ as $\mathbb{S}^2$ if doing so causes no ambiguity. The same is true for the spaces to be defined below.

• $\mathbb{H}^2_{\beta}$ is the set of progressively measurable $\mathbb{R}^d$-valued processes Z such that

$\|Z\|^2_{\mathbb{H}^2_{\beta}} = \mathbb{E}\Big[\, \int_0^T e^{\beta t} |Z_t|^2\, dt \Big] < \infty.$ (14)

• $\mathbb{H}^2_{\nu, \beta}$ is the set of q-dimensional processes $V = \{V_t(e)\}$, which are predictable (i.e., $\mathcal{P} \otimes \mathcal{E}$-measurable) and satisfy

$\|V\|^2_{\mathbb{H}^2_{\nu, \beta}} = \mathbb{E}\Big[\, \int_0^T \int_E e^{\beta t} |V_t(e)|^2\, \nu(de)\, dt \Big] < \infty.$ (15)

• $\mathbb{D}_{\beta}$ is the set of processes $(Y, Z, V)$ in the space $\mathbb{S}^2_{\beta} \times \mathbb{H}^2_{\beta} \times \mathbb{H}^2_{\nu, \beta}$ with the norm

$\|(Y, Z, V)\|^2_{\mathbb{D}_{\beta}} = \|Y\|^2_{\mathbb{S}^2_{\beta}} + \|Z\|^2_{\mathbb{H}^2_{\beta}} + \|V\|^2_{\mathbb{H}^2_{\nu, \beta}}.$ (16)

Let $C_b^g(\mathcal{D})$ be the space of bounded functions that have continuous and bounded derivatives up to order g in the domain $\mathcal{D}$ and $C^g(\mathcal{D})$ the space of functions that have continuous derivatives up to order g. In what follows, we will only consider the norms with $\beta = 0$ and drop the subscript $\beta$ accordingly. In order to provide the final convergence result, we need the following assumption.
Assumption 6 (On the BSDEJ). Assume that the BSDEJ (1) admits a unique solution $(Y, Z, V)$ in $\mathbb{S}^2 \times \mathbb{H}^2 \times \mathbb{H}^2_{\nu}$ and that the time-discretized solution $(Y^{H}, Z^{H}, V^{H})$ converges to $(Y, Z, V)$ in the $\mathbb{D}$-norm sense as $H \to \infty$.
Then, the following convergence result justifies the methodology introduced above.
Theorem 7 (Main Convergence Result). Under Assumptions 1 and 6, we have
(17)
where
and
.
3. Applications in Finance
The BSDE associated with American option prices can be found in [8]. We numerically solve ( [8], Equation 2.10) using the methodology introduced in Section 2. Performance is documented below. Here M is the number of sample paths and H is the number of time discretization nodes. The time to maturity of the American call option is
.
is the initial price of the underlying asset and the strike price is 100. Table 1 and Table 2 contain efficiency results.
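For orientation, below is a heavily simplified sketch of a regression-based backward induction for an American call, in the spirit of the [9]-type regression mentioned in the Conclusion. The geometric Brownian dynamics, the parameter values and the use of a single global polynomial regression, rather than the clustered BSDEJ scheme of Section 2, are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(1)

# --- illustrative parameters (assumptions; the paper's exact setting is in Tables 1-2) ---
T, H, M = 1.0, 50, 50_000
r, sigma, x0, strike = 0.05, 0.2, 100.0, 100.0
dt = T / H
disc = np.exp(-r * dt)

# simulate geometric Brownian motion paths of the underlying
dW = rng.normal(0.0, np.sqrt(dt), size=(M, H))
X = x0 * np.exp(np.cumsum((r - 0.5 * sigma**2) * dt + sigma * dW, axis=1))
X = np.hstack([np.full((M, 1), x0), X])

payoff = lambda x: np.maximum(x - strike, 0.0)

# backward induction: compare exercise value with a regressed continuation value
V = payoff(X[:, -1])
for i in range(H - 1, 0, -1):
    itm = payoff(X[:, i]) > 0.0                        # regress only on in-the-money paths
    basis = np.vander(X[itm, i], 4)                    # cubic polynomial basis
    coef, *_ = np.linalg.lstsq(basis, disc * V[itm], rcond=None)
    continuation = basis @ coef
    exercise = payoff(X[itm, i])
    V = disc * V                                       # discount continuation values to t_i
    V[itm] = np.where(exercise > continuation, exercise, V[itm])

price = disc * V.mean()
print("American call price estimate:", price)
```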
4. Conclusion
In this paper, we propose a new machine learning method, based on clustering, to run [9]-type regressions to solve BSDEs with jumps. A convergence proof is given and numerical results show good performance. The generalization of the proposed methodology to more complicated BSDEs with jumps, for example McKean-Vlasov type BSDEs, is of interest. We leave it for future research.
Table 1. Numerical results when M = 50000.
Table 2. Numerical results when M = 75000.
Appendix: Proofs
Proof of Lemma 1. Denote
. Because we have
, it is obvious that
. Therefore, we have
, where
(18)
We immediately have
. In addition, we have
(19)
(20)
(21)
Here
. The term
(22)
(23)
Here C is independent of i and we have
. Combine this result with
and we will obtain the claim of this lemma.
Lemma 2 (On Lead-Lag Regression). Suppose that
, then we have
(24)
where
.
Proof of Lemma 2. The proof of this lemma follows directly from the Repeated Projection Theorem (see, e.g., [1], Theorem 8).
Proof of Theorem 3. Apply Lemma 2 and ( [1], Theorem 23); the claim follows immediately.
Proof of Theorem 4. The proof of the theorem follows from the error term of the Taylor expansion, Lemma 2 and ( [1], Theorem 24).
NOTES
1Of course, the set $\mathcal{D}$ can be time-dependent and unbounded. For brevity we ignore the time variable throughout the paper. But we will use a sequence of bounded and closed subsets of $\mathcal{D}$, denoted by $\{\mathcal{D}_I\}_{I \ge 1}$, and assume that $\mathcal{D}_I \subseteq \mathcal{D}_{I+1}$ and $\bigcup_{I \ge 1} \mathcal{D}_I = \mathcal{D}$. Then, we will choose a sufficiently large I and perform computations in $\mathcal{D}_I$ instead of $\mathcal{D}$.
2For example, we can ask that $\mathcal{H}$ consist of all measurable functions $\varphi: \mathcal{D} \to \mathbb{R}$ with $\mathbb{E}\big[\varphi(X_t)^2\big] < \infty$.
3The equation below means that we run a lead-lag polynomial regression within each cluster $A_k$, with the response (Y) variable taken from the later time node $t_{i+1}$ and the regressor (X) variable being $X_{t_i}$.