MATH4UQ Seminar


The MATH4UQ seminar features talks by internal and external colleagues and collaborators as well as guests visiting the chair. Everybody interested is welcome to attend. Please subscribe to our MATH4UQ seminar mailing list to receive notifications about upcoming seminars. Recordings of several previous talks can also be found on our MATH4UQ YouTube channel.

21.11.2023, Tuesday, 12:00 (CET)

  • Speaker:  Prof. David Pardo,  University of the Basque Country
  • Title: Multimodal Variational Autoencoder for Inverse Problems in Geophysics.
  • Abstract: 

    Estimating subsurface properties from geophysical measurements is a common challenge in inverse problems. In numerous geophysical applications, there is no unique solution; instead, there are multiple plausible solutions. In this context, some Bayesian methods are designed to solve geophysical inverse problems and quantify their associated uncertainties.

    In this presentation, based on [1], we describe a novel approach to solve geophysical inverse problems with uncertainty quantification (UQ): a multimodal variational autoencoder (MVAE) model that employs a mixture of truncated Gaussian densities to provide multiple solutions to the inverse problem. These solutions contain information about their probability of occurrence and their UQ.

    The MVAE model consists of two main components: an encoder and a decoder. The encoder employs a neural network to generate a mixture of truncated Gaussian densities, representing the distribution of the inverse problem solution. Conversely, the decoder computes the numerical solution to the forward problem based on geophysical principles.

    MVAE allows for a more comprehensive exploration of potential solutions to geophysical inverse problems, accommodating their inherent UQ and providing a richer understanding of the solution space.

    The objectives of the talk are (a) to describe the need for UQ in geophysical exploration methods, (b) to introduce neural networks as a family of methods that offers new opportunities toward achieving UQ, (c) to explain MVAE as an example of a method that provides UQ, and, more importantly, (d) to open a collaboration space to develop new practical UQ methods in geophysics based on neural networks.

     [1] Rodriguez, O., Taylor, J. M., & Pardo, D. (2023). Multimodal variational autoencoder for inverse problems in geophysics: application to a 1-D Magnetotelluric problem. Geophysical Journal International, 235(3), 2598-2613.


28.11.2023, Tuesday, 12:00 (CET)

  • Speaker:  Prof. Sebastian Reich,  University of Potsdam
  • Title: An application of the EnKF to optimal control.
  • Abstract: 

    Stochastic optimal control problems are typically phrased in terms of the
    associated Hamilton-Jacobi-Bellman equation. Solving such partial differential equations remains 
    challenging. In this talk, an alternative approach involving forward and backward mean-field
    evolution equations will be considered. In their simplest incarnation, these equations become
    equivalent to the formulations used in score generative modelling. General optimal control problems
    lead to more complex forward-backward mean-field equations which require further approximations for which
    we will employ variants of the ensemble Kalman filter. 


05.12.2023, Tuesday, 13:30 (CET)

  • Speaker:  Prof. Björn Sprungk, TU Bergakademie Freiberg
  • Title: Metropolis-adjusted interacting particle sampling.
  • Abstract: 

    In recent years, several interacting particle samplers have been proposed in order to sample approximately from complicated target distributions such as posterior measures occurring in Bayesian inverse problems. These interacting particle samplers use an ensemble of interacting particles moving in the product state space according to coupled stochastic differential equations. In practice, we have to apply numerical time stepping to simulate these systems, such as the Euler-Maruyama scheme. However, the time discretization affects the invariance of the particle system with respect to the target distribution and, thus, introduces a bias. In order to correct for this, we study the application of a Metropolization step similar to the Metropolis-adjusted Langevin algorithm. We discuss ensemble- and particle-wise Metropolization and state the basic convergence of the resulting ensemble Markov chain to the product target distribution. We show the benefits of this correction in numerical examples for common interacting particle samplers such as the affine invariant interacting Langevin dynamics (ALDI), consensus-based sampling, and stochastic Stein variational gradient descent.
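
    The correction step mentioned above can be illustrated with the classical single-particle Metropolis-adjusted Langevin algorithm. The following is a minimal sketch, not the interacting particle method of the talk: it assumes a one-dimensional standard normal target, and the names `log_target`, `mala_step` and `run_chain` are hypothetical.

```python
import math
import random


def log_target(x):
    """Log-density (up to a constant) of the assumed standard normal target."""
    return -0.5 * x * x


def grad_log_target(x):
    """Gradient of the log-density above."""
    return -x


def log_proposal(a, b, step):
    """Log-density (up to a constant) of the Euler-Maruyama proposal a -> b."""
    mean = a + step * grad_log_target(a)
    return -((b - mean) ** 2) / (4.0 * step)


def mala_step(x, step, rng):
    """One Metropolis-adjusted Langevin step; returns (new state, accepted?)."""
    # Euler-Maruyama step for dX = grad log pi(X) dt + sqrt(2) dW.
    y = x + step * grad_log_target(x) + math.sqrt(2.0 * step) * rng.gauss(0.0, 1.0)
    # The Metropolis-Hastings ratio removes the time-discretization bias.
    log_alpha = (log_target(y) + log_proposal(y, x, step)
                 - log_target(x) - log_proposal(x, y, step))
    if log_alpha >= 0.0 or rng.random() < math.exp(log_alpha):
        return y, True
    return x, False


def run_chain(n_steps, step=0.5, seed=0):
    """Run the chain from x = 0 and return (samples, acceptance rate)."""
    rng = random.Random(seed)
    x, samples, accepted = 0.0, [], 0
    for _ in range(n_steps):
        x, ok = mala_step(x, step, rng)
        samples.append(x)
        accepted += ok
    return samples, accepted / n_steps
```

    Without the accept/reject step, the same Euler-Maruyama recursion would sample from a distribution that differs from the target by a step-size-dependent bias.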


14.11.2023, Tuesday, 12:00 (CET)

  • Speaker: Xinzhu Liang, University of Manchester
  • Title: A randomized multi-index Monte Carlo method
  • Abstract: 

    We consider the problem of estimating expectations with respect to a target distribution with an unknown normalizing constant, and where even the unnormalized target needs to be approximated at finite resolution. Under such an assumption, we extend a recently introduced multi-index Sequential Monte Carlo (SMC) ratio estimator, which provably enjoys the complexity improvements of multi-index Monte Carlo (MIMC) and the efficiency of SMC for inference. The present work leverages a randomization strategy to remove bias entirely, which simplifies estimation substantially, particularly in the MIMC context, where the choice of index set is otherwise important. We show theoretically that, under appropriate assumptions, the proposed method achieves the same canonical complexity of $\mathrm{MSE}^{-1}$ as the original method, but without discretization bias. It is illustrated on examples of Bayesian inverse and spatial statistics problems.


07.11.2023, Tuesday, 12:00 (CET)

  • Speaker: Prof. Jonas Latz, University of Manchester
  • Title: Subsampling in Continuous Time: from Optimisation to Sampling.
  • Abstract: 

    Stochastic Gradient Langevin Dynamics (SGLD) is widely used to approximate Bayesian posterior distributions in statistical learning procedures with large-scale data. As opposed to many usual Markov chain Monte Carlo (MCMC) algorithms, SGLD is not stationary with respect to the posterior distribution; two sources of error appear: the first error is introduced by an Euler-Maruyama discretisation of a Langevin diffusion process, the second error comes from the data subsampling that enables its use in large-scale data settings. In this work, we consider an idealised version of SGLD to analyse the method's pure subsampling error that we then see as a best-case error for diffusion-based subsampling MCMC methods. Based on a recent continuous-time formulation of the Stochastic Gradient Descent algorithm, we introduce and study the Stochastic Gradient Langevin Diffusion (SGLDiff). This dynamical system is a continuous-time Markov process that follows the Langevin diffusion corresponding to a data subset and switches this data subset after exponential waiting times. We show that the Wasserstein distance between the posterior and the limiting distribution of SGLDiff is bounded above by a fractional power of the mean waiting time. Importantly, this fractional power does not depend on the dimension of the state space. We bring our results into context with other analyses of SGLD.


24.10.2023, Tuesday, 12:00 (CEST)

  • Speaker: Sophia Wiechert, RWTH Aachen University
  • Title: Generic Importance Sampling via Optimal Control for Stochastic Reaction Networks.
  • Abstract: 

    Stochastic reaction networks (SRNs) are a class of continuous-time discrete-state Markov processes describing the random interaction of $d$ species through reactions. SRNs are commonly used for modeling diverse phenomena such as (bio)chemical reactions, epidemics, risk theory, queuing, and supply chain networks. In this context, we propose two alternative importance sampling (IS) approaches to improve the efficiency of Monte Carlo (MC) estimators for various statistical quantities. Our special interest lies in estimating rare event probabilities in high-dimensional SRNs, in which $d \gg 1$. The challenge in IS is to choose an appropriate change of probability measure to achieve a substantial variance reduction. We propose an automated approach to obtain a highly efficient path-dependent measure change based on an original connection between finding optimal IS parameters and solving a variance minimization problem via a stochastic optimal control formulation. We pursue two alternative approaches to mitigate the curse of dimensionality when solving the resulting backward Hamilton-Jacobi-Bellman (HJB) equation. In the first approach [1], we propose a learning-based method to approximate the value function using an ansatz function (e.g. a neural network), where the parameters are determined via a stochastic optimization algorithm. As an alternative, we present in [2] a dimension reduction method, based on mapping the problem to a significantly lower-dimensional space via the Markovian projection (MP) idea. The output of this model reduction technique is a much lower-dimensional SRN that preserves at all time steps the marginal distribution of the original high-dimensional problem. By solving the HJB equation for a projected lower-dimensional process, we get projected IS parameters, which can be mapped back to the original $d$-dimensional SRN.
    Our analysis and numerical experiments verify that both proposed IS (learning-based and MP-HJB) approaches substantially reduce the MC estimator’s variance, resulting in a lower computational complexity in the rare event regime than standard MC estimators.

    [1] Ben Hammouda, C., Ben Rached, N., Tempone, R., & Wiechert, S. (2023). Learning-based importance sampling via stochastic optimal control for stochastic reaction networks. Statistics and Computing, 33(3), 58.

    [2] Ben Hammouda, C., Ben Rached, N., Tempone, R., & Wiechert, S. (2023). Automated Importance Sampling via Optimal Control for Stochastic Reaction Networks: A Markovian Projection-based Approach. arXiv preprint arXiv:2306.02660.


17.10.2023, Tuesday, 12:00 (CEST)

  • Speaker: Arved Bartuska, RWTH Aachen University
  • Title: Efficient nested integration estimators for expected information gain.
  • Abstract: 

    Nested integration problems consist of one outer and one inner integral separated by a nonlinear function. Estimating these integrals poses significant computational challenges. We employ the randomized quasi-Monte Carlo method to estimate both the outer and inner integrals, yielding a double-loop quasi-Monte Carlo estimator.  Furthermore, we derive asymptotic error bounds to obtain the optimal number of outer and inner samples, depending on a specified error tolerance. This estimator is then applied to find the expected information gain (EIG) of an experiment, allowing for efficient data collection. 

    Mathematical models of experiments often contain nuisance uncertainty, i.e., uncertainty about parameters that are not of interest to the experimenter directly. If these uncertainties are to be quantified rigorously, a second inner integral is added to the nested structure of the EIG. We utilize the Laplace approximation to derive two novel and efficient estimators that consider nuisance uncertainty and demonstrate the effectiveness of these estimators via three numerical examples.
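
    The nested structure described above can be sketched with a plain double-loop Monte Carlo estimator. This is only an illustration under stated assumptions: it uses ordinary pseudo-random sampling instead of the randomized quasi-Monte Carlo points of the talk, it ignores nuisance uncertainty, and it assumes a toy model $y = \theta + \varepsilon$ with Gaussian prior and noise, for which the EIG is known in closed form. The function names are hypothetical.

```python
import math
import random


def log_likelihood(y, theta, sigma):
    """Log-density of y given theta for y = theta + noise, noise ~ N(0, sigma^2)."""
    return -0.5 * ((y - theta) / sigma) ** 2 - math.log(sigma * math.sqrt(2.0 * math.pi))


def eig_double_loop(n_outer, n_inner, prior_std=1.0, sigma=0.5, seed=0):
    """Double-loop Monte Carlo estimate of the expected information gain."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_outer):
        # Outer loop: draw a parameter from the prior and simulate an observation.
        theta = rng.gauss(0.0, prior_std)
        y = theta + rng.gauss(0.0, sigma)
        # Inner loop: approximate the evidence p(y) with fresh prior samples.
        evidence = 0.0
        for _ in range(n_inner):
            evidence += math.exp(log_likelihood(y, rng.gauss(0.0, prior_std), sigma))
        evidence /= n_inner
        # The logarithm is the nonlinear function separating the two integrals.
        total += log_likelihood(y, theta, sigma) - math.log(evidence)
    return total / n_outer


def eig_exact(prior_std=1.0, sigma=0.5):
    """Closed-form EIG of the assumed linear Gaussian toy model."""
    return 0.5 * math.log(1.0 + (prior_std / sigma) ** 2)
```

    Because the logarithm is applied to a noisy inner average, the estimator is biased for any finite number of inner samples, which is why the numbers of outer and inner samples have to be balanced against the error tolerance.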


10.10.2023, Tuesday, 12:00 (CEST)

  • Speaker: Prof.  Ahmed Kebaier, University of Evry-University Paris-Saclay
  • Title: The interpolated drift implicit Euler scheme Multilevel Monte Carlo method for pricing Barrier options and applications to the CIR and CEV models.
  • Abstract: 

    Recently, Giles et al. [14] proved that the efficiency of the Multilevel Monte Carlo (MLMC) method for evaluating Down-and-Out barrier options for a diffusion process $(X_t)_{t\in[0,T]}$ with globally Lipschitz coefficients can be improved by combining a Brownian bridge technique and a conditional Monte Carlo method, provided that the running minimum $\inf_{t\in[0,T]} X_t$ has a bounded density in the vicinity of the barrier.
    In the present work, thanks to the Lamperti transformation technique and using a Brownian interpolation of the drift implicit Euler scheme of Alfonsi [2], we show that the efficiency of the MLMC can also be improved for the evaluation of barrier options for models with non-Lipschitz diffusion coefficients under certain moment constraints. We study two example models: the Cox-Ingersoll-Ross (CIR) and the Constant Elasticity of Variance (CEV) processes, for which we show that the conditions of our theoretical framework are satisfied under certain restrictions on the model parameters. In particular, we develop semi-explicit formulas for the densities of the running minimum and running maximum of both CIR and CEV processes, which are of independent interest. Finally, numerical tests are presented to illustrate our results.

    [14] M. B. Giles, K. Debrabant, and A. Rössler. 'Analysis of multilevel Monte Carlo path simulation using the Milstein discretisation.' Discrete Contin. Dyn. Syst. Ser. B, 24(8):3881–3903, 2019.

    [2] A. Alfonsi. 'Strong order one convergence of a drift implicit Euler scheme: Application to the CIR process.' Statistics & Probability Letters, 83(2):602–607, 2013.


09.05.2023, Tuesday, 15:00 (CEST)

  • Speaker: Prof.  Michael Herty, IGPM, RWTH Aachen University
  • Title: UQ for hyperbolic problems 
  • Abstract: 

    We are interested in quantifying uncertainties that appear in nonlinear hyperbolic partial differential equations arising in a variety of applications from fluid flow to traffic modeling. A common approach to treat the stochastic components of the solution is by using generalized polynomial chaos (gPC) expansions. This method was successfully applied in particular for general elliptic and parabolic PDEs as well as linear hyperbolic stochastic equations. More recently, gPC methods have been successfully applied to particular hyperbolic PDEs using the explicit form of nonlinearity or the particularity of the studied system structure as, e.g., in the p-system. While such models arise in many applications, e.g., in atmospheric flows, fluid flows under uncertain gas compositions and shallow water flows, a general gPC theory with corresponding numerical methods is still lacking. Typical analytical and numerical challenges that appear for the gPC expanded systems are loss of hyperbolicity and positivity of solutions (like gas density or water depth). Any of those effects might trigger severe instabilities within classical finite-volume or discontinuous Galerkin methods. We will discuss properties and conditions to guarantee stability and present numerical results on selected examples.


02.05.2023, Tuesday, 15:00 (CEST)

  • Speaker: Dr. Burcu Aydogan, RWTH Aachen University
  • Title: Optimal investment strategies under the relative performance in jump-diffusion markets 
  • Abstract: 

    We work on a portfolio management problem for a representative agent and a group of people, forming a market under relative performance concerns in a continuous-time setting. Herein, we define two wealth dynamics: the agent’s and the market’s wealth. The wealth dynamics appear in jump-diffusion markets. In our setting, we measure the performances of the market and the individual agent with preferences linked to the market performance. Therefore, we have two classical Merton problems to determine what the market does and the agent’s optimal strategy relative to the market performance. Furthermore, our framework assumes that the agent’s utility performance does not affect the market, while the market affects the agent’s utility. We explore the optimal investment strategies for both the agent and the market.

    This is joint work with Mogens Steffensen of the University of Copenhagen.


18.04.2023, Tuesday, 15:00 (CEST)

  • Speaker: Prof. Ian Hugh Sloan, University of New South Wales (UNSW Australia)
  • Title: High dimensional approximation – avoiding the curse of dimensionality, doubling the proven convergence rate.
  • Abstract: 

    High dimensional approximation problems commonly arise from parametric PDE problems in which the parametric input depends on very many independent univariate random variables.  Typically (as in the method of “generalized polynomial chaos”, or GPC) the dependence on these variables is modelled by multivariate polynomials, leading to exponentially increasing difficulty and cost (expressed as the “curse of dimensionality”) as the dimension increases.  For this reason sparsity of coefficients is a major focus in implementations of GPC.

    In this lecture we develop a different approach to one version of GPC.  In  this method there is no need for sparsification, and no curse of dimensionality.    The method, proposed in a 2022 paper with Frances Kuo, Vesa Kaarnioja, Yoshihito Kazashi and Fabio Nobile, uses kernel interpolation with periodic kernels, with the kernels located at lattice points, as advocated long ago by Hickernell and colleagues.

    The lattice points and the kernels depend on parameters called “weights”.  In the 2022 paper the recommended weights were “SPOD” weights, leading to a cost growing as the square of the number of lattice points.   A newer 2023 paper with Kuo and Kaarnioja introduced “serendipitous” weights, for which the cost grows only linearly with both dimension and number of lattice points, allowing practical computations in as many as 1,000 dimensions.

    The rate of convergence proved in the above papers was of the order $n^{-\alpha/2}$, for interpolation using the reproducing kernel of a space with mixed smoothness of order $\alpha$.  A new result with Frances Kuo doubles the proven convergence rate to $n^{-\alpha}$.


11.04.2023, Tuesday, 15:00 (CEST)

  • Speaker: Yang Liu, King Abdullah University of Science and Technology
  • Title: Goal-oriented adaptive finite element multilevel Monte Carlo with convergence rates
  • Abstract: 

    We propose our Adaptive Multilevel Monte Carlo (AMLMC) [Beck, Joakim, et al. "Goal-oriented adaptive finite element multilevel Monte Carlo with convergence rates." Computer Methods in Applied Mechanics and Engineering (2022)] method to solve an elliptic partial differential equation (PDE) with lognormal random input data, where the PDE model is subject to geometry-induced singularity. 

    The previous work [Moon, K-S., et al. "Convergence rates for an adaptive dual weighted residual finite element algorithm." BIT Numerical Mathematics 46.2 (2006)] developed convergence rates for a goal-oriented adaptive algorithm based on isoparametric d-linear quadrilateral finite element approximations and the dual weighted residual error representation in the deterministic setting. This algorithm refines the mesh based on the error contribution to the QoI. 

    This work aims to combine MLMC and the adaptive finite element solver. Contrary to the standard Multilevel Monte Carlo methods, where each sample is computed using a discretization-based numerical method whose resolution is linked to the level, our AMLMC algorithm uses a sequence of tolerances as the levels. Specifically, for a given realization of the input coefficient and a given accuracy level, the AMLMC constructs its approximate sample using the first mesh in the sequence of deterministic, non-uniform meshes generated by the above-mentioned adaptive algorithm that satisfies the sample-dependent bias constraint.


28.03.2023, Tuesday, 15:00 (CET)

  • Speaker: Dr. André Carlon, KAUST
  • Title: Bayesian quasi-Newton method for stochastic optimization
  • Abstract: 

    Stochastic optimization problems arise in many fields, like data sciences, reliability engineering, and finance. The stochastic gradient descent (SGD) method is a cheap approach to solving such problems, relying on noisy gradient estimates to converge to local optima. However, in the case of μ-convex, L-smooth problems, the convergence is deeply affected by the condition number of the problem, L/μ. Here, we propose a Bayesian approach to find a suitable matrix to pre-condition gradient estimates in stochastic optimization that reduces the effect of large condition numbers. We show that maximizing the posterior distribution to find a suitable pre-conditioning matrix is a constrained deterministic strongly convex problem that can be solved efficiently using the Newton-CG method with a path-following approach. Numerical results on stochastic problems with large condition numbers show that our Bayesian quasi-Newton pre-conditioner improves the convergence of SGD.
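
    The effect of the condition number L/μ on SGD, and the benefit of pre-conditioning the gradient estimates, can be seen on a toy problem. The sketch below does not implement the Bayesian quasi-Newton method of the talk: it assumes a two-dimensional quadratic objective and simply uses the exact inverse Hessian as a diagonal pre-conditioner, which is the ideal case the Bayesian estimate would try to approach. All names and parameter values are illustrative assumptions.

```python
import random


def noisy_grad(x, mu, L, noise_std, rng):
    """Noisy gradient of f(x) = 0.5 * (L * x[0]**2 + mu * x[1]**2)."""
    return [L * x[0] + rng.gauss(0.0, noise_std),
            mu * x[1] + rng.gauss(0.0, noise_std)]


def objective(x, mu, L):
    """Value of the assumed quadratic objective at x."""
    return 0.5 * (L * x[0] ** 2 + mu * x[1] ** 2)


def sgd(precond, step, n_steps=200, mu=1.0, L=1000.0, noise_std=0.1, seed=0):
    """Run SGD from (1, 1) with a diagonal pre-conditioner; return the final objective."""
    rng = random.Random(seed)
    x = [1.0, 1.0]
    for _ in range(n_steps):
        g = noisy_grad(x, mu, L, noise_std, rng)
        # Scale each gradient component before taking the step.
        x = [xi - step * pi * gi for xi, pi, gi in zip(x, precond, g)]
    return objective(x, mu, L)


mu, L = 1.0, 1000.0
# Plain SGD: the step size is limited by L, so the mu-direction converges slowly.
plain = sgd(precond=[1.0, 1.0], step=1.0 / L)
# Pre-conditioned SGD: inverse-Hessian scaling removes the dependence on L / mu.
scaled = sgd(precond=[1.0 / L, 1.0 / mu], step=0.5)
```

    With the same number of steps and the same noise level, `scaled` ends far closer to the minimum than `plain`, since the pre-conditioned iteration contracts both coordinates at the same rate.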


21.03.2023, Tuesday, 15:00 (CET)

  • Speaker:  Prof. Antonis Papapantoleon, the Delft Institute of Applied Mathematics,  Institute of Applied and Computational Mathematics, FORTH
  • Title: A splitting deep Ritz method for multi-asset option pricing in Lévy models
  • Abstract: 

    Solving high-dimensional differential equations is still a challenging field for researchers. In recent years, many works have been presented that provide approximations by training neural networks using loss functions based on the differential operator of the equation at hand, as well as its initial/terminal and boundary conditions. In this work, we use a machine learning approach for pricing European (basket) options written with respect to a set of correlated underlyings whose dynamics undergo random jumps. We approximate the solution of the corresponding partial integro-differential equation using a variant of the deep Ritz method that splits the differential operator into symmetric and asymmetric parts. The method is driven by a modified version of the neural network introduced in the deep Galerkin method. The structure of the proposed neural network ensures the asymptotic behavior of the solution for large values of the underlyings. Moreover, it leads the outputs of the network to be consistent with the prior known qualitative properties of the solution. We present results on the Merton jump-diffusion model.


14.03.2023, Tuesday, 15:00 (CET)

  • Speaker: Prof. Dr. Markus Bachmayr, RWTH Aachen University
  • Title:  Optimality of adaptive stochastic Galerkin methods for affine-parametric elliptic PDEs
  • Abstract: 

    We consider the computational complexity of approximating elliptic PDEs with random coefficients by sparse product polynomial expansions. Except for special cases (for instance, when the spatial discretisation limits the achievable overall convergence rate), previous approaches for a posteriori selection of polynomial terms and corresponding spatial discretizations do not guarantee optimal complexity in the sense of computational costs scaling linearly in the number of degrees of freedom. We show that one can achieve optimality of an adaptive Galerkin scheme for discretizations by spline wavelets in the spatial variable when a multiscale representation of the affinely parameterized random coefficients is used. This is joint work with Igor Voulis.

    M. Bachmayr and I. Voulis, An adaptive stochastic Galerkin method based on multilevel expansions of random fields: Convergence and optimality, ESAIM:M2AN 56(6), pp. 1955-1992, 2022. Preprint: arXiv:2109.09136.


07.03.2023, Tuesday, 15:00 (CET)

  • Speaker: Prof. dr. ir. C.W. (Kees) Oosterlee, Utrecht University
  • Title: On the application of machine learning to enhance algorithms in computational finance
  • Abstract: In this presentation, we will first give a brief overview of our experiences with the use of artificial neural networks (ANNs) in finance. We'll give an example of supervised, unsupervised and reinforcement learning.
    After this we will outline the use of neural networks for the calibration of a financial asset price model in the context of financial option pricing. To provide an efficient calibration framework, a data-driven approach is proposed to learn the solutions of financial models and to reduce the corresponding computation time significantly. 
    Specifically, fitting model parameters is formulated as training hidden neurons within a machine-learning framework.
    The rapid on-line computation of ANNs combined with a flexible optimization method (i.e. Differential Evolution) provides fast calibration without getting stuck in local minima.

21.02.2023, Tuesday, 15:00 (CET)

  • Speaker: Shyam Mohan Subbiah Pillai, RWTH Aachen University
  • Title: Importance sampling for McKean-Vlasov stochastic differential equation
  • Abstract: We are interested in Monte Carlo (MC) methods for estimating probabilities of rare events associated with solutions to the McKean-Vlasov stochastic differential equation (MV-SDE). MV-SDEs arise in the mean-field limit of stochastic interacting particle systems, which have many applications in pedestrian dynamics, collective animal behaviour and financial mathematics. Importance sampling (IS) is used to reduce high relative variance in MC estimators of rare event probabilities. The optimal change of measure is methodically derived from variance minimisation, yielding a high-dimensional partial differential control equation which is cumbersome to solve. This problem is circumvented by using a decoupling approach, resulting in a lower-dimensional control PDE. The decoupling approach necessitates the use of a double loop Monte Carlo (DLMC) estimator. We further combine IS with a novel multilevel DLMC estimator which not only reduces complexity from $\mathcal{O}(\mathrm{TOL}^{-4})$ to $\mathcal{O}(\mathrm{TOL}^{-3})$ but also drastically reduces the associated constant, enabling computationally feasible estimation of rare event probabilities.

14.02.2023, Tuesday, 15:00 (CET)

  • Speaker: Christian Bayer, Weierstrass Institute for Applied Analysis and Stochastics
  • Title: Motivated by the challenges related to the calibration of financial models, we consider the problem of solving numerically a singular McKean-Vlasov equation.
  • Abstract: 

    $$dS_t = \sigma(t,S_t)\, S_t\, \frac{\sqrt{v_t}}{\sqrt{\mathbb{E}[v_t \mid S_t]}}\, dW_t,$$
    where $W$ is a Brownian motion and $v$ is an adapted diffusion process. This equation can be considered as a singular local stochastic volatility model.
    Whilst such models are quite popular among practitioners, unfortunately, their well-posedness has not been fully understood yet and, in general, is possibly not guaranteed at all.
    We develop a novel regularization approach based on the reproducing kernel Hilbert space (RKHS) technique and show that the regularized model is well-posed. Furthermore, we prove propagation of chaos. We demonstrate numerically that a thus regularized model is able to perfectly replicate option prices due to typical local volatility models. Our results are also applicable to more general McKean-Vlasov equations.

    (Joint work with Denis Belomestny, Oleg Butkovsky, and John Schoenmakers.)


07.02.2023, Tuesday, 15:00 (CET)

  • Speaker: Prof. Per-Christian Hansen, Technical University of Denmark
  • Title:   Edge-Preserving Computed Tomography (CT) with Uncertain View Angles.
  • Abstract: In computed tomography, data consist of measurements of the attenuation of X-rays passing through an object. The goal is to reconstruct an image of the linear attenuation coefficient of the object's interior. For each position of the X-ray source, characterized by its angle with respect to a fixed coordinate system, one measures a set of data referred to as a view. A common assumption is that these view angles are known, but in some applications they are known with imprecision.

    We present a Bayesian inference approach to solving the joint inverse problem for the image and the view angles, while also providing uncertainty estimates. For the image, we impose a Laplace difference prior enabling the representation of sharp edges in the image; this prior has connections to total variation regularization. For the view angles, we use a von Mises prior which is a 2π-periodic continuous probability distribution.
    Numerical results show that our algorithm can jointly identify the image and the view angles, while also providing uncertainty estimates of both. We demonstrate our method with simulations of 2D X-ray computed tomography problems using fan beam configurations.

    This is joint work with N. A. B. Riis and Y. Dong, Technical University of Denmark; F. Uribe, LUT University, Finland; and J. M. Bardsley, University of Montana.


06.12.2022, Tuesday, 15:00 (CET)

  • Speaker: Prof. Michael Feischl, TU Wien (Institute for Analysis and Scientific Computing)
  • Title:  A quasi-Monte Carlo data compression algorithm for machine learning.
  • Abstract: We present an algorithm to reduce large data sets using so-called digital nets, which are well distributed point sets in the unit cube. The algorithm efficiently scans the data and computes certain data dependent weights. Those weights are used to approximately represent the data, without making any assumptions on the distribution of the data points. Under smoothness assumptions on the model, we then show that this can be used to reduce the computational effort needed in finding good parameters in machine learning problems which aim to minimize standard loss functions. While the principal idea of the approximation might also work with other point sets, the particular structural properties of digital nets can be exploited to make the computation of the necessary weights extremely fast.  

29.11.2022, Tuesday, 15:00 (CET)

  • Speaker: Dr. Truong-Vinh Hoang, Chair of Mathematics for Uncertainty Quantification at RWTH Aachen University
  • Title:  A likelihood-free nonlinear filtering approach using the machine-learning-based approximation of conditional expectation.
  • Abstract: We discuss the machine learning-based ensemble conditional mean filter (ML-EnCMF) developed for nonlinear data assimilation based on the orthogonal projection of the conditional mean. The updated mean of the filter matches that of the posterior. Moreover, we show that the filter's updated covariance coincides with the expected conditional covariance. Implementing the EnCMF requires computing the conditional mean. A likelihood-based estimator is prone to significant errors for small ensemble sizes, causing filter divergence. We develop a systematic methodology for integrating machine learning into the EnCMF using the conditional expectation's orthogonal projection property. First, we use a combination of an artificial neural network (ANN) and a linear function, obtained based on the ensemble Kalman filter (EnKF), to approximate the conditional mean, enabling the ML-EnCMF to inherit EnKF's advantages. Secondly, we apply a suitable variance reduction technique to reduce statistical errors when estimating the loss function. Lastly, we propose a model selection procedure for selecting the applied filter element-wise. We demonstrate the ML-EnCMF performance using the Lorenz-63 and Lorenz-96 systems and show that the ML-EnCMF outperforms the EnKF and the likelihood-based EnCMF.


22.11.2022, Tuesday, 15:00 (CET)

  • Speaker: Prof. Raúl Tempone, RWTH Aachen University and KAUST
  • Title:  A simple approach to proving the existence, uniqueness, and strong and weak convergence rates for a broad class of McKean-Vlasov equations
  • Abstract: By employing a system of interacting stochastic particles as an approximation of the McKean–Vlasov equation and utilizing classical stochastic analysis tools, namely Itô’s formula and the Kolmogorov–Chentsov continuity theorem, we prove the existence and uniqueness of strong solutions for a broad class of McKean–Vlasov equations as a limit of the conditional expectation of exchangeable particles. Considering an increasing number of particles in the approximating stochastic particle system, we also prove the $L^p$ strong convergence rate and derive the weak convergence rates using the Kolmogorov backward equation and variations of the stochastic particle system. Our convergence rates were verified by numerical experiments, which also indicate that the assumptions made here and in the literature can be relaxed.