Thursday 12 September 2024, 11:00-12:00
Watson Building LTC
Recently, there has been growing interest in machine learning (ML) models capable of simulating physical phenomena from data. However, it is often challenging to train these models effectively, especially when only limited data are available. One promising approach is to use prior knowledge from physics (e.g., energy-conservation laws) as an inductive bias during training. In this talk, we present our recent work that leverages the theory of Hamiltonian mechanics to train ML models for physics simulations more effectively. Specifically, we discuss (1) Gaussian process models that incorporate symplectic structure, and (2) a training algorithm for neural operators that uses a regulariser derived from the energy-based theory of physics.
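The sketch below is a rough illustration of the general idea, not the speaker's GP or neural-operator constructions: a small network learns a scalar Hamiltonian H(q, p), and the predicted dynamics are its symplectic gradient, so the learned vector field conserves the learned energy by construction.

```python
# Minimal sketch (not the speaker's exact models): Hamiltonian structure as an
# inductive bias. A small network learns a scalar Hamiltonian H(q, p), and the
# predicted dynamics are its symplectic gradient (dH/dp, -dH/dq).
import torch
import torch.nn as nn

class HamiltonianNet(nn.Module):
    def __init__(self, dim=1, hidden=64):
        super().__init__()
        self.H = nn.Sequential(
            nn.Linear(2 * dim, hidden), nn.Tanh(),
            nn.Linear(hidden, 1),
        )
        self.dim = dim

    def forward(self, x):
        # x = (q, p); return (dq/dt, dp/dt) = (dH/dp, -dH/dq)
        x = x.requires_grad_(True)
        H = self.H(x).sum()
        grad = torch.autograd.grad(H, x, create_graph=True)[0]
        dq = grad[..., self.dim:]      # dH/dp
        dp = -grad[..., :self.dim]     # -dH/dq
        return torch.cat([dq, dp], dim=-1)

# Training would match these symplectic-gradient predictions to observed time
# derivatives (e.g. with an MSE loss) rather than fitting an unconstrained field.
model = HamiltonianNet(dim=1)
x = torch.randn(8, 2)                  # batch of (q, p) states
print(model(x).shape)                  # torch.Size([8, 2])
```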
Wednesday 23 October 2024, 14:00-15:00
Watson Building 310
In this talk, I will introduce the concept of differential privacy, the prevailing framework for developing statistical procedures while quantifying the amount of privacy offered to each individual in the data set. Differential privacy guarantees are often achieved by injecting noise into deterministic algorithms, a fact that makes a large class of sampling algorithms naturally private without any modification. I will focus on the simplest such sampler, the unadjusted Langevin algorithm, and discuss several attempts to characterise its privacy guarantees under the differential privacy framework.
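For reference, the unadjusted Langevin algorithm can be written in a few lines; the Gaussian noise injected at every step is exactly the kind of randomness that differential-privacy analyses exploit. The quadratic potential below is an illustrative choice, not part of the talk.

```python
# Minimal sketch of the unadjusted Langevin algorithm (ULA). The target here
# is a standard Gaussian, i.e. U(x) = ||x||^2 / 2, chosen only for illustration.
import numpy as np

def grad_U(x):
    return x  # gradient of U(x) = ||x||^2 / 2

def ula(x0, step, n_steps, rng):
    x = np.array(x0, dtype=float)
    samples = []
    for _ in range(n_steps):
        noise = rng.standard_normal(x.shape)
        x = x - step * grad_U(x) + np.sqrt(2.0 * step) * noise
        samples.append(x.copy())
    return np.array(samples)

rng = np.random.default_rng(0)
samples = ula(x0=[3.0], step=0.1, n_steps=5000, rng=rng)
print(samples[1000:].mean(), samples[1000:].std())  # roughly 0 and 1
```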
Monday 4 November 2024, 13:00-14:00
Arts 201
Stochastic differential equations (SDEs) driven by white noise are important models for stochastic dynamical systems in natural science and engineering. The statistical inference of the parameters of such models from noisy observations has also attracted considerable interest in the machine learning community. Using Girsanov's change-of-measure approach, one can apply powerful variational techniques to solve the inference problem. A limitation of standard SDE models is that they typically show a fast decay of correlation functions. If one is interested in stochastic processes with long-time memory, a well-known possibility is to replace the Brownian motion in the SDE by so-called fractional Brownian motion (fBM), which is no longer a Markov process. Unfortunately, variational inference for this case is much less straightforward. Our approach to this problem utilises a somewhat overlooked idea by Carmona and Coutin (1998), who showed that fBM can be exactly represented as an infinite-dimensional linear combination of Ornstein-Uhlenbeck processes with different time constants. Using an appropriate discretisation, we arrive at a finite-dimensional approximation which is an 'ordinary' SDE model in an augmented space. For this new model we can apply (more or less) off-the-shelf variational inference approaches.
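A schematic sketch of the Carmona-Coutin idea: a handful of Ornstein-Uhlenbeck processes with different time constants, all driven by the same Brownian increments, are combined with weights to mimic a process with long-range memory. The speeds and weights below are placeholders; in practice they would come from a quadrature of the exact, Hurst-dependent integral representation.

```python
# Schematic sketch: approximate fractional Brownian motion by a finite,
# weighted combination of Ornstein-Uhlenbeck processes driven by the SAME
# Brownian increments. Speeds and weights here are illustrative placeholders.
import numpy as np

def ou_mixture(T=1.0, n_steps=1000, speeds=(0.5, 2.0, 8.0, 32.0),
               weights=(0.4, 0.3, 0.2, 0.1), seed=0):
    rng = np.random.default_rng(seed)
    dt = T / n_steps
    y = np.zeros(len(speeds))          # state of each OU component
    path = np.zeros(n_steps + 1)
    for k in range(n_steps):
        dW = np.sqrt(dt) * rng.standard_normal()   # shared Brownian increment
        y = y - np.array(speeds) * y * dt + dW     # Euler step for each OU process
        path[k + 1] = np.dot(weights, y)           # weighted combination
    return path

print(ou_mixture()[-5:])
```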
Monday 4 November 2024, 14:00-15:00
Arts 201
In this talk, I will give an overview of the field of graph-based learning, a field that has matured over the last 15 years and is rich in both practical applications and theoretical underpinnings. The key idea of graph-based learning is to understand interrelated data as a graph, to solve variational problems and PDEs on that graph to analyse that data, and to study the limits of such models as the number of nodes goes to infinity. I will begin by motivating the approach and then will discuss the mathematical framework, three classic methods in the field, the nuances of implementing these methods, and finally the theoretical underpinnings of this field.
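To fix ideas, here is a minimal sketch of one classic graph-based method (Laplace learning, i.e. harmonic-function label propagation): labels on a few nodes are extended to the rest of the data by solving a Laplace equation on a similarity graph. The Gaussian weights and toy data are illustrative choices, not tied to the speaker's implementations.

```python
# Minimal sketch of Laplace learning: propagate a few labels over a similarity
# graph by solving a discrete Laplace equation with the labels as boundary data.
import numpy as np

def laplace_learning(X, labelled_idx, labels, sigma=1.0):
    # Gaussian similarity weights and graph Laplacian L = D - W
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    W = np.exp(-d2 / (2 * sigma ** 2))
    np.fill_diagonal(W, 0.0)
    L = np.diag(W.sum(1)) - W

    n = len(X)
    unlabelled = np.setdiff1d(np.arange(n), labelled_idx)
    u = np.zeros(n)
    u[labelled_idx] = labels
    # Solve L_uu u_u = -L_ul u_l for the unlabelled nodes
    u[unlabelled] = np.linalg.solve(
        L[np.ix_(unlabelled, unlabelled)],
        -L[np.ix_(unlabelled, labelled_idx)] @ u[labelled_idx],
    )
    return u

# Toy usage: two Gaussian clusters, one labelled point per cluster
X = np.vstack([np.random.randn(20, 2), np.random.randn(20, 2) + 4.0])
u = laplace_learning(X, labelled_idx=np.array([0, 20]), labels=np.array([-1.0, 1.0]))
print((u[:20] < 0).mean(), (u[20:] > 0).mean())   # fraction assigned to the correct side
```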
Monday 11 November 2024, 14:00-15:00
Watson Building 310
In this talk, I will give a short overview of recent research projects that focus on the development of statistical theory for problems originating from applied probability and machine learning. Concretely, I will first demonstrate how nonparametric statistical methods can be employed to develop data-driven solutions for singular optimal control problems in the presence of model uncertainty. Our statistical techniques build on the ergodic properties of reflected diffusion processes, which we use for generative modelling in the second part of the talk. Here, I will present our recent findings on minimax optimality of denoising reflected diffusion models, which build on the idea of time-reversing a symmetric reflected diffusion process to generate new data from an unknown target distribution. The infinitesimal dynamics of the forward model that we employ for this purpose are described by a weighted Laplacian, which is also at the heart of the third statistical problem that I will discuss, albeit with a significant twist: here, a weighted Laplacian with broken diffusivity determines the dynamics of a stochastic heat equation driven by space-time white noise. The presence of a jump in the diffusivity naturally leads us to the statistical identification problem of its spatial location, which translates into a change-point estimation problem for SPDEs.
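For intuition, the snippet below simulates only the forward half of such a construction: a diffusion on the unit interval with reflection at the boundary, approximated by an Euler-Maruyama step followed by folding back into [0, 1]. The time-reversal used for generation, and the weighted-Laplacian generator, are not shown here.

```python
# Illustrative sketch of the forward dynamics behind reflected diffusion models:
# Brownian motion on [0, 1] with reflection at both endpoints.
import numpy as np

def reflect(x):
    # Fold x back into [0, 1] (reflection at both endpoints)
    x = np.abs(x) % 2.0
    return np.where(x > 1.0, 2.0 - x, x)

def reflected_forward(x0, n_steps=1000, T=1.0, seed=0):
    rng = np.random.default_rng(seed)
    dt = T / n_steps
    x = np.array(x0, dtype=float)
    for _ in range(n_steps):
        x = reflect(x + np.sqrt(dt) * rng.standard_normal(x.shape))
    return x

data = np.full(10000, 0.2)                  # toy "data" concentrated at 0.2
out = reflected_forward(data)
print(out.mean(), out.std())                # approaches the uniform law on [0, 1]
```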
Monday 18 November 2024, 14:00-15:00
Watson Building 310
My research aims to further our understanding of neural networks. The first part of the talk will focus on parameter constraints. Common techniques used to improve the generalisation performance of deep neural networks (such as L2 regularisation and batch normalisation) are tantamount to imposing a constraint on the neural network parameters, but despite their widespread use they are often not well understood. In the talk I will describe an approach for efficiently incorporating hard constraints into a stochastic gradient Langevin dynamics framework. Our constraints offer direct control of the parameter space, which allows us to study their effect on generalisation. In the second part of the talk, I will focus on the role played by individual layers and substructures of neural networks: layers differ in their sensitivity to the choice of initialisation and optimiser hyperparameter settings, and training neural network layers differently may lead to enhanced generalisation and/or reduced computational cost. Specifically, I will show that 1) a multirate approach can be used to train deep neural networks for transfer learning applications in half the time, without reducing the generalisation performance of the model, and 2) solely applying the sharpness-aware minimisation (SAM) technique to the normalisation layers of the network enhances generalisation, while providing computational savings.
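The following is a simplified illustration of the first idea, assuming a plain projection onto a norm ball as the constraint (the talk's method enforces hard constraints within the Langevin integrator itself, which is more delicate): each stochastic gradient Langevin step is followed by a projection, giving direct control of the parameter space.

```python
# Simplified illustration (NOT the speaker's algorithm): an SGLD update on
# network parameters followed by projection onto a norm-ball constraint.
import torch

def sgld_projected_step(params, loss_fn, step=1e-3, temperature=1e-4, radius=10.0):
    loss = loss_fn(params)
    grads = torch.autograd.grad(loss, params)
    with torch.no_grad():
        for p, g in zip(params, grads):
            # Langevin step: gradient descent plus injected Gaussian noise
            p.add_(-step * g + (2 * step * temperature) ** 0.5 * torch.randn_like(p))
            norm = p.norm()
            if norm > radius:          # project back onto the ball ||p|| <= radius
                p.mul_(radius / norm)
    return loss.item()

# Toy usage: two parameter tensors and a quadratic "loss"
params = [torch.randn(5, requires_grad=True), torch.randn(3, requires_grad=True)]
loss_fn = lambda ps: sum((p ** 2).sum() for p in ps)
for _ in range(100):
    sgld_projected_step(params, loss_fn)
print([p.norm().item() for p in params])
```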