Thursday 12 September 2024, 11:00-12:00
Watson Building LTC
Recently, there has been growing interest in machine learning (ML) models capable of simulating physical phenomena from data. However, it is often challenging to train these models effectively, especially when only limited data are available. One promising approach is to use prior knowledge from physics (e.g., energy-conservation laws) as an inductive bias during training. In this talk, we present our recent work that leverages the theory of Hamiltonian mechanics to train ML models for physics simulations more effectively. Specifically, we discuss (1) Gaussian process models that incorporate symplectic structure, and (2) a training algorithm for neural operators that uses a regulariser derived from the energy-based theory of physics.
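The sketch below is a rough illustration of the general idea, not the speaker's GP or neural-operator constructions: a small network learns a scalar Hamiltonian H(q, p), and the predicted dynamics are its symplectic gradient, so the learned vector field conserves the learned energy by construction.

```python
# Minimal sketch (not the speaker's exact models): Hamiltonian structure as an
# inductive bias. A small network learns a scalar Hamiltonian H(q, p), and the
# predicted dynamics are its symplectic gradient (dH/dp, -dH/dq).
import torch
import torch.nn as nn

class HamiltonianNet(nn.Module):
    def __init__(self, dim=1, hidden=64):
        super().__init__()
        self.H = nn.Sequential(
            nn.Linear(2 * dim, hidden), nn.Tanh(),
            nn.Linear(hidden, 1),
        )
        self.dim = dim

    def forward(self, x):
        # x = (q, p); return (dq/dt, dp/dt) = (dH/dp, -dH/dq)
        x = x.requires_grad_(True)
        H = self.H(x).sum()
        grad = torch.autograd.grad(H, x, create_graph=True)[0]
        dq = grad[..., self.dim:]      # dH/dp
        dp = -grad[..., :self.dim]     # -dH/dq
        return torch.cat([dq, dp], dim=-1)

# Training would match these symplectic-gradient predictions to observed time
# derivatives (e.g. with an MSE loss) rather than fitting an unconstrained field.
model = HamiltonianNet(dim=1)
x = torch.randn(8, 2)                  # batch of (q, p) states
print(model(x).shape)                  # torch.Size([8, 2])
```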
Wednesday 23 October 2024, 14:00-15:00
Watson Building 310
In this talk, I will introduce the concept of differential privacy, the prevailing framework for developing statistical procedures while quantifying the amount of privacy offered to each individual in the data set. Differential privacy guarantees are often achieved by injecting noise into deterministic algorithms, a fact that makes a large class of sampling algorithms naturally private without any modification. I will focus on the simplest such sampler, the unadjusted Langevin algorithm, and discuss several attempts to characterise its privacy guarantees under the differential privacy framework.
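For reference, the unadjusted Langevin algorithm can be written in a few lines; the Gaussian noise injected at every step is exactly the kind of randomness that differential-privacy analyses exploit. The quadratic potential below is an illustrative choice, not part of the talk.

```python
# Minimal sketch of the unadjusted Langevin algorithm (ULA). The target here
# is a standard Gaussian, i.e. U(x) = ||x||^2 / 2, chosen only for illustration.
import numpy as np

def grad_U(x):
    return x  # gradient of U(x) = ||x||^2 / 2

def ula(x0, step, n_steps, rng):
    x = np.array(x0, dtype=float)
    samples = []
    for _ in range(n_steps):
        noise = rng.standard_normal(x.shape)
        x = x - step * grad_U(x) + np.sqrt(2.0 * step) * noise
        samples.append(x.copy())
    return np.array(samples)

rng = np.random.default_rng(0)
samples = ula(x0=[3.0], step=0.1, n_steps=5000, rng=rng)
print(samples[1000:].mean(), samples[1000:].std())  # roughly 0 and 1
```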
Monday 4 November 2024, 13:00-14:00
Arts 201
Stochastic differential equations (SDEs) driven by white noise are important models for stochastic dynamical systems in natural science and engineering. The statistical inference of the parameters of such models from noisy observations has also attracted considerable interest in the machine learning community. Using Girsanov's change-of-measure approach, one can apply powerful variational techniques to solve the inference problem. A limitation of standard SDE models is that they typically show a fast decay of correlation functions. If one is interested in stochastic processes with long-time memory, a well-known possibility is to replace the Brownian motion in the SDE by so-called fractional Brownian motion (fBM), which is no longer a Markov process. Unfortunately, variational inference for this case is much less straightforward. Our approach to this problem utilises a somewhat overlooked idea by Carmona and Coutin (1998), who showed that fBM can be exactly represented as an infinite-dimensional linear combination of Ornstein-Uhlenbeck processes with different time constants. Using an appropriate discretisation, we arrive at a finite-dimensional approximation which is an 'ordinary' SDE model in an augmented space. For this new model we can apply (more or less) off-the-shelf variational inference approaches.
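A schematic sketch of the Carmona-Coutin idea: a handful of Ornstein-Uhlenbeck processes with different time constants, all driven by the same Brownian increments, are combined with weights to mimic a process with long-range memory. The speeds and weights below are placeholders; in practice they would come from a quadrature of the exact, Hurst-dependent integral representation.

```python
# Schematic sketch: approximate fractional Brownian motion by a finite,
# weighted combination of Ornstein-Uhlenbeck processes driven by the SAME
# Brownian increments. Speeds and weights here are illustrative placeholders.
import numpy as np

def ou_mixture(T=1.0, n_steps=1000, speeds=(0.5, 2.0, 8.0, 32.0),
               weights=(0.4, 0.3, 0.2, 0.1), seed=0):
    rng = np.random.default_rng(seed)
    dt = T / n_steps
    y = np.zeros(len(speeds))          # state of each OU component
    path = np.zeros(n_steps + 1)
    for k in range(n_steps):
        dW = np.sqrt(dt) * rng.standard_normal()   # shared Brownian increment
        y = y - np.array(speeds) * y * dt + dW     # Euler step for each OU process
        path[k + 1] = np.dot(weights, y)           # weighted combination
    return path

print(ou_mixture()[-5:])
```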
Monday 4 November 2024, 14:00-15:00
Arts 201
In this talk, I will give an overview of the field of graph-based learning, a field that has matured over the last 15 years and is rich in both practical applications and theoretical underpinnings. The key idea of graph-based learning is to understand interrelated data as a graph, to solve variational problems and PDEs on that graph to analyse that data, and to study the limits of such models as the number of nodes goes to infinity. I will begin by motivating the approach and then will discuss the mathematical framework, three classic methods in the field, the nuances of implementing these methods, and finally the theoretical underpinnings of this field.
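To fix ideas, here is a minimal sketch of one classic graph-based method (Laplace learning, i.e. harmonic-function label propagation): labels on a few nodes are extended to the rest of the data by solving a Laplace equation on a similarity graph. The Gaussian weights and toy data are illustrative choices, not tied to the speaker's implementations.

```python
# Minimal sketch of Laplace learning: propagate a few labels over a similarity
# graph by solving a discrete Laplace equation with the labels as boundary data.
import numpy as np

def laplace_learning(X, labelled_idx, labels, sigma=1.0):
    # Gaussian similarity weights and graph Laplacian L = D - W
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    W = np.exp(-d2 / (2 * sigma ** 2))
    np.fill_diagonal(W, 0.0)
    L = np.diag(W.sum(1)) - W

    n = len(X)
    unlabelled = np.setdiff1d(np.arange(n), labelled_idx)
    u = np.zeros(n)
    u[labelled_idx] = labels
    # Solve L_uu u_u = -L_ul u_l for the unlabelled nodes
    u[unlabelled] = np.linalg.solve(
        L[np.ix_(unlabelled, unlabelled)],
        -L[np.ix_(unlabelled, labelled_idx)] @ u[labelled_idx],
    )
    return u

# Toy usage: two Gaussian clusters, one labelled point per cluster
X = np.vstack([np.random.randn(20, 2), np.random.randn(20, 2) + 4.0])
u = laplace_learning(X, labelled_idx=np.array([0, 20]), labels=np.array([-1.0, 1.0]))
print((u[:20] < 0).mean(), (u[20:] > 0).mean())   # fraction assigned to the correct side
```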
Monday 11 November 2024, 14:00-15:00
Watson Building 310
In this talk, I will give a short overview of recent research projects that focus on the development of statistical theory for problems originating from applied probability and machine learning. Concretely, I will first demonstrate how nonparametric statistical methods can be employed to develop data-driven solutions for singular optimal control problems in the presence of model uncertainty. Our statistical techniques build on the ergodic properties of reflected diffusion processes, which we use for generative modelling in the second part of the talk. Here, I will present our recent findings on minimax optimality of denoising reflected diffusion models, which build on the idea of time-reversing a symmetric reflected diffusion process to generate new data from an unknown target distribution. The infinitesimal dynamics of the forward model that we employ for this purpose are described by a weighted Laplacian, which is also at the heart of the third statistical problem that I will discuss, albeit with a significant twist: here, a weighted Laplacian with broken diffusivity determines the dynamics of a stochastic heat equation driven by space-time white noise. The presence of a jump in the diffusivity naturally leads us to the statistical identification problem of its spatial location, which translates into a change-point estimation problem for SPDEs.
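For intuition, the snippet below simulates only the forward half of such a construction: a diffusion on the unit interval with reflection at the boundary, approximated by an Euler-Maruyama step followed by folding back into [0, 1]. The time-reversal used for generation, and the weighted-Laplacian generator, are not shown here.

```python
# Illustrative sketch of the forward dynamics behind reflected diffusion models:
# Brownian motion on [0, 1] with reflection at both endpoints.
import numpy as np

def reflect(x):
    # Fold x back into [0, 1] (reflection at both endpoints)
    x = np.abs(x) % 2.0
    return np.where(x > 1.0, 2.0 - x, x)

def reflected_forward(x0, n_steps=1000, T=1.0, seed=0):
    rng = np.random.default_rng(seed)
    dt = T / n_steps
    x = np.array(x0, dtype=float)
    for _ in range(n_steps):
        x = reflect(x + np.sqrt(dt) * rng.standard_normal(x.shape))
    return x

data = np.full(10000, 0.2)                  # toy "data" concentrated at 0.2
out = reflected_forward(data)
print(out.mean(), out.std())                # approaches the uniform law on [0, 1]
```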
Monday 18 November 2024, 14:00-15:00
Watson Building 310
My research aims to further our understanding of neural networks. The first part of the talk will focus on parameter constraints. Common techniques used to improve the generalisation performance of deep neural networks (such as L2 regularisation and batch normalisation) are tantamount to imposing a constraint on the neural network parameters, but despite their widespread use they are often not well understood. In the talk I will describe an approach for efficiently incorporating hard constraints into a stochastic gradient Langevin dynamics framework. Our constraints offer direct control of the parameter space, which allows us to study their effect on generalisation. In the second part of the talk, I will focus on the role played by individual layers and substructures of neural networks: layers differ in their sensitivity to the choice of initialisation and optimiser hyperparameter settings, and training neural network layers differently may lead to enhanced generalisation and/or reduced computational cost. Specifically, I will show that 1) a multirate approach can be used to train deep neural networks for transfer learning applications in half the time, without reducing the generalisation performance of the model, and 2) solely applying the sharpness-aware minimisation (SAM) technique to the normalisation layers of the network enhances generalisation, while providing computational savings.
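The following is a simplified illustration of the first idea, assuming a plain projection onto a norm ball as the constraint (the talk's method enforces hard constraints within the Langevin integrator itself, which is more delicate): each stochastic gradient Langevin step is followed by a projection, giving direct control of the parameter space.

```python
# Simplified illustration (NOT the speaker's algorithm): an SGLD update on
# network parameters followed by projection onto a norm-ball constraint.
import torch

def sgld_projected_step(params, loss_fn, step=1e-3, temperature=1e-4, radius=10.0):
    loss = loss_fn(params)
    grads = torch.autograd.grad(loss, params)
    with torch.no_grad():
        for p, g in zip(params, grads):
            # Langevin step: gradient descent plus injected Gaussian noise
            p.add_(-step * g + (2 * step * temperature) ** 0.5 * torch.randn_like(p))
            norm = p.norm()
            if norm > radius:          # project back onto the ball ||p|| <= radius
                p.mul_(radius / norm)
    return loss.item()

# Toy usage: two parameter tensors and a quadratic "loss"
params = [torch.randn(5, requires_grad=True), torch.randn(3, requires_grad=True)]
loss_fn = lambda ps: sum((p ** 2).sum() for p in ps)
for _ in range(100):
    sgld_projected_step(params, loss_fn)
print([p.norm().item() for p in params])
```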