The Learning on Graphs and Geometry seminar series at the University of Oxford is a forum for discussing recent advances in the field of geometric deep learning, which lies at the intersection of machine learning and geometry. The group focuses on exploring the theoretical foundations and practical applications of graph neural networks, manifold learning, and other techniques for processing and analyzing data with a non-Euclidean structure. Members of the group include researchers and graduate students from various departments, including Computer Science, Engineering, Mathematics, and Statistics, who share an interest in developing novel machine learning methods for understanding complex data sets with graph or geometric structure.
The seminar series will normally take place biweekly during term time. Please look out for talk announcements for the exact time and location. To receive announcements, please subscribe to our mailing list by sending an email here.
Future Talks
Abstract
Deep neural networks (DNNs) exhibit remarkable generalization abilities, yet the mechanisms behind these capabilities remain poorly understood, defying the established wisdom of statistical learning theory. Recent research has revealed a compelling link between the fractal structures formed during iterative training and the resulting generalization performance. In this talk, Dr. Birdal sheds new light on these connections by presenting a novel framework that ties complexity measures to the topological properties of the training process. The presentation begins by bounding the generalization error through the fractal dimension of training trajectories, computed in practice using tools from persistent homology, introducing the 'persistent homology dimension' as a new, insightful proxy for generalization. Building on this, Dr. Birdal introduces more computationally efficient topological complexity measures that bypass the need for continuous training trajectories. These measures consistently show strong correlations with the generalization gap across diverse models, including transformers and graph networks. The findings hold transformative implications for both theory and practice, offering a new lens to study, understand, and optimize the generalization power of modern AI systems.

Relevant Publications: [1] https://arxiv.org/abs/2111.13171 (NeurIPS 2021) [2] https://arxiv.org/abs/2407.08723 (NeurIPS 2024)

Bio: Tolga Birdal is an assistant professor and UKRI Future Leaders Fellow in the Department of Computing at Imperial College London. Previously, he was a Postdoctoral Research Fellow at Stanford University within the Geometric Computing Group of Prof. Leonidas Guibas. Tolga defended his master's and Ph.D. theses at the Computer Vision Group of the Chair for Computer Aided Medical Procedures, Technical University of Munich, led by Prof. Nassir Navab. He was also a doctoral researcher at Siemens AG under the supervision of Dr. Slobodan Ilic, working on “Geometric Methods for 3D Reconstruction from Large Point Clouds”. His current research interests include topological and geometric machine learning and 3D computer vision; his more theoretical work investigates the limits of geometric computing and non-Euclidean inference, as well as principles of deep learning. Tolga is an area chair for major vision conferences such as CVPR, ICCV, ECCV and 3DV, and has published at well-respected venues including NeurIPS, CVPR, ICCV, ECCV, ICLR, ICML, T-PAMI, IJCV, ICRA, IROS, ICASSP and 3DV. Aside from his academic life, Tolga has co-founded multiple companies, including Befunky, a widely used web-based image editing platform.
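For readers curious what the 'persistent homology dimension' looks like in practice, here is a minimal sketch in the spirit of [1], under two assumptions: the training trajectory is given as an array of flattened weight iterates, and zero-dimensional total persistence of a Euclidean point cloud is computed via its minimum spanning tree (to which it is equivalent), keeping the dependencies to NumPy/SciPy only. The sampling sizes and the toy trajectory are illustrative.

```python
import numpy as np
from scipy.sparse.csgraph import minimum_spanning_tree
from scipy.spatial.distance import pdist, squareform

def total_persistence(points, alpha=1.0):
    """Sum of 0-dimensional persistence bar lengths raised to the power alpha.
    For a Euclidean point cloud this equals the sum of MST edge lengths ** alpha."""
    dists = squareform(pdist(points))
    mst = minimum_spanning_tree(dists)
    return float(np.sum(mst.data ** alpha))

def ph_dimension(trajectory, alpha=1.0, sizes=(200, 400, 800, 1600), seed=0):
    """Estimate the persistent homology dimension of a trajectory of iterates
    (array of shape [n_iterates, n_params]): regress log E_alpha(n) on log n over
    random subsamples of size n, then use dim ~= alpha / (1 - slope)."""
    rng = np.random.default_rng(seed)
    log_n, log_e = [], []
    for n in sizes:
        idx = rng.choice(len(trajectory), size=min(n, len(trajectory)), replace=False)
        log_n.append(np.log(len(idx)))
        log_e.append(np.log(total_persistence(trajectory[idx], alpha)))
    slope = np.polyfit(log_n, log_e, 1)[0]
    return alpha / (1.0 - slope)

# Toy check: iterates scattered on a 2-D subspace of a 100-D weight space
# should yield an estimate close to 2.
rng = np.random.default_rng(1)
traj = rng.normal(size=(3000, 2)) @ rng.normal(size=(2, 100))
print(ph_dimension(traj))
```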
Past Talks
Abstract
Graphs are widely applicable to problems in the biological domain. In this talk I will discuss two application areas. First, the use of graph-based dimensionality reduction techniques for exploratory single-cell analysis. Graph-based dimensionality reduction techniques such as t-SNE, UMAP, and diffusion maps are almost universal in the analysis of single-cell data. I will discuss how these methods are used in practice, how they work, and where they may fail. Second, I will discuss the use of graph-based methods in protein folding and design tasks. Graph-based representations of proteins have been a key driver behind the most recent Nobel Prize in Chemistry. I will discuss how these representations work and why they might be particularly effective in this application.

Bio: Alexander (Alex) Tong is a postdoctoral fellow at Mila with Yoshua Bengio, a visiting postdoctoral fellow at Oxford with Michael Bronstein, a co-founder of Dreamfold, a protein design startup, and an incoming assistant professor at Duke University starting July 2025. Alex completed his Ph.D. in Computer Science at Yale University in 2021 with Smita Krishnaswamy. His interests span generative modeling, graph signal processing, and optimal transport for understanding biological systems, with a focus on cells and proteins. https://www.alextong.net/
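Not the speaker's pipeline, just a minimal sketch of the kind of graph-based embedding the abstract refers to: t-SNE (and UMAP, noted in a comment) builds a neighbourhood graph over cells before laying it out in two dimensions. The "expression matrix" here is a synthetic stand-in.

```python
import numpy as np
from sklearn.manifold import TSNE

rng = np.random.default_rng(0)

# Synthetic stand-in for a cells-by-genes expression matrix:
# three "cell types" as Gaussian blobs in 50 dimensions.
X = np.vstack([rng.normal(loc=mu, scale=1.0, size=(300, 50)) for mu in (0.0, 3.0, 6.0)])

# t-SNE builds a nearest-neighbour affinity graph and embeds it in 2-D.
emb = TSNE(n_components=2, perplexity=30, init="pca", random_state=0).fit_transform(X)
print(emb.shape)  # (900, 2)

# UMAP (umap-learn) follows the same graph-then-layout recipe:
#   import umap; emb = umap.UMAP(n_neighbors=15).fit_transform(X)
```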
Abstract
We study the asymptotic behaviour of GNN classifiers across a range of distributions and show that a wide class of models converges asymptotically almost surely to a constant function. This establishes a new uniform upper bound on the expressiveness of these architectures. The prominent features of our work are as follows. Our results are stated in terms of a new term language, which captures many message passing neural networks, transformer architectures, and arbitrary combinations of architectural variations, such as skip connections and global readout. The distributions studied include the sparse Erdős–Rényi and Barabási–Albert models. These are closer to real-world distributions than those which have received the most attention in this area. The core of our theoretical work is an almost-sure quantifier elimination result, which is of independent interest.

Bio: https://samadamday.com/
Abstract
In this talk, we will discuss our recent work leveraging generative techniques to enhance the performance and interpretability of graph neural networks. First, we will introduce a method that models the unknown graph underlying a set of entities as a first-order probabilistic graph, which, when combined with a GCN, enables similar entities to communicate and improve the predictive task. Next, we will demonstrate how redefining graph convolution can add interpretability by learning the distribution of key local substructures. Finally, we will explore an alternative approach to graph generative modeling, where we use a truncated eigenbasis of the graph Laplacian, inspired by shape analysis, and apply a denoising diffusion probabilistic model that can later be conditioned on both eigenvalues and partial eigenvectors.

Bio: https://www.dais.unive.it/~cosmo/
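A small sketch (not the speaker's model) of the spectral ingredient mentioned at the end: a truncated eigenbasis of the graph Laplacian onto which node signals can be projected, so that a generative model acts on a few spectral coefficients rather than on raw signals. The graph and signal are toy placeholders.

```python
import numpy as np
import networkx as nx

# Toy graph standing in for a mesh- or shape-like graph.
G = nx.random_geometric_graph(200, radius=0.15, seed=0)
L = nx.normalized_laplacian_matrix(G).toarray()

# Truncated eigenbasis: the k lowest-frequency eigenvectors capture coarse geometry,
# analogous to truncated Laplace-Beltrami bases in shape analysis.
eigvals, eigvecs = np.linalg.eigh(L)
k = 16
basis = eigvecs[:, :k]

# Project a node signal onto the truncated basis and reconstruct its smooth part;
# a diffusion model could then operate on the k coefficients instead of the signal.
x = np.random.default_rng(0).normal(size=G.number_of_nodes())
coeffs = basis.T @ x
x_smooth = basis @ coeffs
print(eigvals[:4], float(np.linalg.norm(x - x_smooth)))
```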
Abstract
Irregular spatiotemporal data are characterized by asynchronous observations occurring at different times and spatial locations. These data are commonly encountered in sensor networks, such as traffic management systems or online social networks. Unlike their regular counterparts, the inherent irregularities in the data prevent the use of traditional methods for discrete-time sequences and Euclidean spaces. Instead, in the spatial domain, we can often rely on relational information between locations, such as distances or similarities among observations. In this talk, we will see how graph deep learning helps us exploit these relational dependencies for both interpolation and extrapolation tasks. We will discuss methods for reconstructing missing data from a restricted set of valid observations by enforcing spatiotemporal consistency and examine forecasting methodologies to predict future observations from sparse data.

Bio: https://marshka.github.io/
Abstract
Task arithmetic has recently emerged as a cost-effective and scalable approach to edit pre-trained models directly in weight space: By adding the fine-tuned weights of different tasks, the model’s performance can be improved on these tasks, while negating them leads to task forgetting. Yet, our understanding of the effectiveness of task arithmetic and its underlying principles remains limited. We present a comprehensive study of task arithmetic in vision-language models and show that weight disentanglement is the crucial factor that makes it effective. This property arises during pre-training and manifests when distinct directions in weight space govern separate, localized regions in function space associated with the tasks. Notably, we show that fine-tuning models in their tangent space by linearizing them amplifies weight disentanglement. This leads to substantial performance improvements across multiple task arithmetic benchmarks and diverse models. Building on these findings, we provide theoretical and empirical analyses of the neural tangent kernel (NTK) of these models and establish a compelling link between task arithmetic and the spatial localization of the NTK eigenfunctions. Overall, our work uncovers novel insights into the fundamental mechanisms of task arithmetic and offers a more reliable and effective approach to edit pre-trained models through the NTK linearization.
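A toy sketch of the weight-space editing the abstract describes (plain task vectors, not the linearised NTK variant studied in the talk): fine-tuned minus pre-trained weights act as "task vectors" that can be added to gain tasks or negated to forget them. The state dicts below are dummy stand-ins for real checkpoints.

```python
import torch

def task_vector(pretrained, finetuned):
    """Task vector = fine-tuned weights minus pre-trained weights, per parameter."""
    return {k: finetuned[k] - pretrained[k] for k in pretrained}

def apply_task_vectors(pretrained, task_vectors, coeffs):
    """Edit the pre-trained model in weight space: positive coefficients add tasks,
    negative coefficients 'forget' them."""
    edited = {k: v.clone() for k, v in pretrained.items()}
    for tau, alpha in zip(task_vectors, coeffs):
        for k in edited:
            edited[k] = edited[k] + alpha * tau[k]
    return edited

# Dummy state dicts standing in for a pre-trained model and two fine-tuned copies.
pre  = {"w": torch.zeros(4, 4), "b": torch.zeros(4)}
ft_a = {"w": torch.ones(4, 4),  "b": torch.ones(4)}
ft_b = {"w": -torch.ones(4, 4), "b": torch.zeros(4)}

taus = [task_vector(pre, ft_a), task_vector(pre, ft_b)]
edited = apply_task_vectors(pre, taus, coeffs=[1.0, -1.0])  # add task A, negate task B
print(edited["w"][0, 0].item())  # 2.0
```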
Abstract
We introduce the concept of geometry-informed neural networks (GINNs), which encompass (i) learning under geometric constraints, (ii) neural fields as a suitable representation, and (iii) generating diverse solutions to under-determined systems often encountered in geometric tasks. Notably, the GINN formulation does not require training data, and as such can be considered generative modeling driven purely by constraints. We add an explicit diversity loss to mitigate mode collapse. We consider several constraints, in particular, the connectedness of components which we convert to a differentiable loss through Morse theory. Experimentally, we demonstrate the efficacy of the GINN learning paradigm across a range of two and three-dimensional scenarios with increasing levels of complexity.
Abstract
Spectral Convolutional Networks were among the first successfully developed graph learning architectures and continue to provide state-of-the-art results on a variety of tasks to this day. However, conventional wisdom within the graph learning community dictates that their utility is limited: On the one hand, they supposedly cannot be extended to directed graphs, as the standard underlying graph Fourier transform is not well defined for such graphs. On the other hand, it is believed that after training, spectral methods do not generalize to previously unseen graphs, as spectral quantities may vary significantly between such graphs. In this talk, we show both of these assumptions to be in error: Making use of certain advanced tools from complex analysis and spectral theory, we first extend spectral convolutions to arbitrary directed graphs. We then investigate the influence of the basis used to express the developed spectral filters and discuss the interplay with the characteristic operators on which networks are based. In particular, we develop an architecture yielding state-of-the-art performance for node classification on heterophilic graphs. Finally, we show that by using a specific class of complex filter functions, networks can be rendered transferable between distinct graphs describing the same underlying object at different resolutions. As we will discuss, making use of a triangle-inequality argument, this indeed establishes the ability to generalize to generic previously unseen graphs.
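For context, here is a sketch of the standard undirected construction that the talk extends: a spectral filter acts on a graph signal as y = U g(Lambda) U^T x, with U and Lambda from the eigendecomposition of the (symmetric) graph Laplacian. The heat-kernel filter below is an illustrative choice, not one from the talk.

```python
import numpy as np
import networkx as nx

def spectral_filter(G, x, g):
    """Apply y = U g(Lambda) U^T x on an undirected graph.
    This classical graph-Fourier construction relies on L being symmetric,
    which is exactly what breaks down for directed graphs."""
    L = nx.normalized_laplacian_matrix(G).toarray()
    eigvals, U = np.linalg.eigh(L)            # graph Fourier basis
    return U @ (g(eigvals) * (U.T @ x))       # filter in the spectral domain

G = nx.karate_club_graph()
x = np.random.default_rng(0).normal(size=G.number_of_nodes())

# Example filter: a heat kernel g(lam) = exp(-t * lam), a low-pass choice.
y = spectral_filter(G, x, lambda lam: np.exp(-2.0 * lam))
print(y[:5])
```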
Abstract
In recent years Deep Generative Models have seen remarkable success in a variety of data domains such as images, text, and audio, to name a few. However, the predominant approach in many of these models (e.g. GANs, VAEs, Normalizing Flows) is to treat data as fixed-dimensional continuous vectors in some Euclidean space, despite significant evidence to the contrary (e.g. 3D molecules). This lecture places a direct emphasis on learning generative models for complex geometries described via manifolds, such as spheres, tori, hyperbolic spaces, implicit surfaces, and homogeneous spaces. Towards this goal, we will cover how to build two prominent approaches to generative modeling, diffusion models and flow matching, from a first-principles point of view, and show how to elevate them to the geometric data types of interest.
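A minimal Euclidean flow-matching sketch, i.e. the flat-space starting point the lecture builds from before lifting to manifolds: sample a noise/data pair, interpolate linearly, and regress a small network onto the constant target velocity x1 - x0. The toy "ring" data distribution and network sizes are illustrative.

```python
import torch
import torch.nn as nn

# Small velocity-field network v_theta(x, t) on R^2 (input: x concatenated with t).
net = nn.Sequential(nn.Linear(3, 64), nn.SiLU(), nn.Linear(64, 2))
opt = torch.optim.Adam(net.parameters(), lr=1e-3)

def sample_data(n):
    # Toy data distribution: points on a ring of radius 2.
    angle = 2 * torch.pi * torch.rand(n)
    return torch.stack([2 * torch.cos(angle), 2 * torch.sin(angle)], dim=1)

for step in range(1000):
    x1 = sample_data(256)                  # data samples
    x0 = torch.randn_like(x1)              # noise samples
    t = torch.rand(x1.shape[0], 1)
    xt = (1 - t) * x0 + t * x1             # linear interpolation path
    target = x1 - x0                       # conditional target velocity
    loss = ((net(torch.cat([xt, t], dim=1)) - target) ** 2).mean()
    opt.zero_grad(); loss.backward(); opt.step()

# Generate: integrate dx/dt = v_theta(x, t) from noise (t=0) to data (t=1) by Euler steps.
x = torch.randn(512, 2)
with torch.no_grad():
    for i in range(100):
        t = torch.full((x.shape[0], 1), i / 100)
        x = x + 0.01 * net(torch.cat([x, t], dim=1))
print(x.norm(dim=1).mean())  # should move towards ~2 after training
```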
Abstract
In this lecture, I will start by reconsidering the question of equivariance for the segmentation/partition function for labelling images. This will lead naturally to the study of braid groups and an intuitive introduction to Algebraic Topology. After reviewing the historical origins of topology in the work of Riemann and Betti, I will outline how Topological Data Analysis (TDA) is an effective featurization engine for shape in data, requiring new mathematics to understand its statistical properties. Finally, I will show how topology naturally leads to sheaf theory, which was recently used by Bodnar and Bronstein to address the problem of oversmoothing in heterophilic graphs. By the end of this talk, you will have an intuitive understanding of topology and sheaves, and will hopefully be inspired to integrate TDA and sheaves with Geometric Deep Learning in order to develop new state-of-the-art methods.
Abstract
Drug discovery is a very long and expensive process, taking on average more than 10 years and costing $2.5B to develop a new drug. Artificial intelligence has the potential to significantly accelerate the process of drug discovery by extracting evidence from a huge amount of biomedical data, and hence to revolutionize the entire pharmaceutical industry. In particular, graph representation learning and geometric deep learning, a fast-growing topic in the machine learning and data mining community focusing on deep learning for graph-structured and 3D data, presents great opportunities for drug discovery, as much of the data in the domain is represented as graphs or 3D structures (e.g. molecules, proteins, biomedical knowledge graphs). In this talk, I will introduce our recent progress on geometric deep learning for drug discovery, in particular pretraining methods for learning geometric encoders of 3D molecular structures and generative diffusion models for 3D molecule generation.
Abstract
Combining physics with machine learning is a rapidly growing field of research. Most of this work focuses on leveraging machine learning methods to solve problems in physics. Here, however, we focus on the converse, i.e., physics-inspired machine learning, which can be described as incorporating structure from physical systems into machine learning methods to obtain models with better inductive biases. More concretely, we propose several physics-inspired deep learning architectures for sequence modeling based on nonlinear coupled oscillators, Hamiltonian systems, and multi-scale dynamical systems. The proposed architectures tackle central problems in the field of recurrent sequence modeling, namely the vanishing and exploding gradients problem as well as the issue of insufficient expressive power. Moreover, we discuss physics-inspired learning on graphs, wherein the dynamics of the message-passing propagation are derived from physical systems. We further prove that these methods mitigate the over-smoothing issue, thereby enabling the construction of deep graph neural networks (GNNs). We extensively test all proposed methods on a variety of synthetic and real-world datasets, ranging from image recognition, speech recognition, medical applications, and scientific computing for the sequence models, to citation networks, computational chemistry applications, and networks of articles and websites for the graph learning models.
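A sketch of the basic idea behind oscillator-based recurrences in the spirit of the coupled-oscillator models the talk covers; the constants and exact layout below are illustrative assumptions, not the published architecture. The hidden state is a position/velocity pair updated by a damped, driven second-order system discretised with an explicit scheme, whose bounded oscillatory dynamics is what helps with exploding and vanishing gradients.

```python
import torch
import torch.nn as nn

class OscillatorRNNCell(nn.Module):
    """Hidden state (y, z) ~ (position, velocity) of damped, driven oscillators:
        z <- z + dt * ( tanh(W y + V u + b) - gamma * y - eps * z )
        y <- y + dt * z
    """
    def __init__(self, input_dim, hidden_dim, dt=0.05, gamma=1.0, eps=0.1):
        super().__init__()
        self.W = nn.Linear(hidden_dim, hidden_dim, bias=False)  # recurrent coupling
        self.V = nn.Linear(input_dim, hidden_dim)               # input drive (includes bias b)
        self.dt, self.gamma, self.eps = dt, gamma, eps

    def forward(self, u, y, z):
        z = z + self.dt * (torch.tanh(self.W(y) + self.V(u)) - self.gamma * y - self.eps * z)
        y = y + self.dt * z
        return y, z

cell = OscillatorRNNCell(input_dim=8, hidden_dim=32)
u_seq = torch.randn(100, 4, 8)            # (time, batch, features)
y = z = torch.zeros(4, 32)
for u in u_seq:                           # unroll over the sequence
    y, z = cell(u, y, z)
print(y.shape)  # torch.Size([4, 32])
```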
Abstract
Curvature bridges geometry and topology, using local information to derive global statements. While well-known in a differential topology context, it was recently extended to the domain of graphs. In fact, graphs give rise to various notions of curvature, which differ in expressive power and purpose. We will give a brief overview of curvature in graphs, define some relevant concepts, and show their utility for data science and machine learning applications. In particular, we shall discuss two applications: first, the use of curvature to distinguish between different models for synthesising new graphs from some unknown distribution; second, a novel framework for defining curvature for hypergraphs, whose structural properties require a more generic setting. We will also describe new applications that are specifically geared towards a treatment by curvature, thus underlining the utility of this concept for data science.

Bio: https://bastian.rieck.me/
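To make one of these notions concrete, here is a small sketch of an augmented Forman-Ricci curvature for edges of an unweighted graph, assuming the commonly used form 4 - deg(u) - deg(v) + 3 * #triangles through the edge; Ollivier-Ricci curvature, which requires optimal transport, is omitted. This is background illustration, not the speaker's hypergraph framework.

```python
import networkx as nx

def forman_curvature(G, u, v):
    """Augmented Forman-Ricci curvature of edge (u, v) in an unweighted graph
    (assumed form: 4 - deg(u) - deg(v) + 3 * #triangles containing the edge).
    Negative values flag tree-like bottleneck edges; positive values sit in dense regions."""
    triangles = len(list(nx.common_neighbors(G, u, v)))
    return 4 - G.degree[u] - G.degree[v] + 3 * triangles

G = nx.karate_club_graph()
curv = {e: forman_curvature(G, *e) for e in G.edges()}
# The most negative edges tend to bridge communities; the most positive sit inside cliques.
print(sorted(curv.items(), key=lambda kv: kv[1])[:3])
```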
Abstract
The expressive power of Graph Neural Networks (GNNs) has been studied extensively through the Weisfeiler-Leman (WL) graph isomorphism test. However, standard GNNs and the WL framework are inapplicable for geometric graphs embedded in Euclidean space, such as biomolecules, materials, and other physical systems. In this work, we propose a geometric version of the WL test (GWL) for discriminating geometric graphs while respecting the underlying physical symmetries: permutations, rotation, reflection, and translation. We use GWL to characterise the expressive power of geometric GNNs that are invariant or equivariant to physical symmetries in terms of distinguishing geometric graphs. GWL unpacks how key design choices influence geometric GNN expressivity: (1) Invariant layers have limited expressivity as they cannot distinguish one-hop identical geometric graphs; (2) Equivariant layers distinguish a larger class of graphs by propagating geometric information beyond local neighbourhoods; (3) Higher order tensors and scalarisation enable maximally powerful geometric GNNs; and (4) GWL's discrimination-based perspective is equivalent to universal approximation.

Bio: https://www.chaitjo.com/
Abstract
Convolutional layers within graph neural networks operate by aggregating information about local neighbourhood structures; one common way to encode such substructures is through random walks. The distribution of these random walks evolves according to a diffusion equation defined using the graph Laplacian. We extend this approach by leveraging classic mathematical results about hypo-elliptic diffusions. This results in a novel tensor-valued graph operator, which we call the hypo-elliptic graph Laplacian. We provide theoretical guarantees and efficient low-rank approximation algorithms. In particular, this gives a structured approach to capture long-range dependencies on graphs that is robust to pooling. Besides the attractive theoretical properties, our experiments show that this method competes with graph transformers on datasets requiring long-range reasoning but scales only linearly in the number of edges as opposed to quadratically in nodes.

Bio: https://www.maths.ox.ac.uk/people/csaba.toth
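For background, a sketch of the classical, scalar-valued object that the talk's tensor-valued operator generalises: the k-step random-walk distribution from a node is obtained by repeatedly applying the transition matrix D^{-1} A, which is one explicit Euler step of a discrete diffusion driven by the random-walk Laplacian I - P.

```python
import numpy as np
import networkx as nx

G = nx.karate_club_graph()
A = nx.to_numpy_array(G)
P = A / A.sum(axis=1, keepdims=True)   # random-walk transition matrix D^{-1} A
# With row vectors of probabilities, one walk step p <- p P equals p <- p - p (I - P),
# i.e. an explicit Euler step of the diffusion equation defined by the random-walk Laplacian.

p = np.zeros(A.shape[0]); p[0] = 1.0   # walk starts at node 0
for _ in range(5):                     # 5-step walk distribution
    p = p @ P
print(p.round(3))
```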
Abstract
Score-based generative models (SGMs) are a powerful class of generative models that exhibit remarkable empirical performance. Score-based generative modelling (SGM) consists of a “noising” stage, whereby a diffusion is used to gradually add Gaussian noise to data, and a generative model, which entails a “denoising” process defined by approximating the time-reversal of the diffusion. Existing SGMs assume that data is supported on a Euclidean space, i.e. a manifold with flat geometry. In many domains such as robotics, geoscience or protein modelling, data is often naturally described by distributions living on Riemannian manifolds and current SGM techniques are not appropriate. We introduce here Riemannian Score-based Generative Models (RSGMs), a class of generative models extending SGMs to Riemannian manifolds. We demonstrate our approach on a variety of manifolds, and in particular with earth and climate science spherical data.

Bio: http://mjhutchinson.info/
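A toy sketch of the geometric ingredient (not the authors' implementation): the forward "noising" process on a manifold can be approximated by a geodesic random walk, here small tangent-space Gaussian steps on the 2-sphere followed by re-projection onto the sphere.

```python
import numpy as np

rng = np.random.default_rng(0)

def tangent_noise(x, scale, rng):
    # Sample an ambient Gaussian and project it onto the tangent plane at x.
    v = rng.normal(scale=scale, size=x.shape)
    return v - (v @ x) * x

def noise_on_sphere(x0, n_steps=100, dt=1e-2, rng=rng):
    """Approximate forward noising by a geodesic random walk on S^2:
    small tangent-space Gaussian steps, each followed by re-projection."""
    x = x0 / np.linalg.norm(x0)
    for _ in range(n_steps):
        x = x + tangent_noise(x, np.sqrt(dt), rng)
        x = x / np.linalg.norm(x)   # project back onto the sphere
    return x

x0 = np.array([0.0, 0.0, 1.0])
print(noise_on_sphere(x0))
```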
Abstract
In this talk I will provide an overview of the over-squashing phenomenon for Message Passing Neural Networks (MPNNs) and of the different strategies that have been proposed to tackle this problem under the umbrella of graph-rewiring techniques. I will start by discussing existing works in the literature that have shaped the problem. I will then go through a recent paper where we provide a more rigorous theoretical understanding of over-squashing, connecting it to classical random-walk properties of the graph. This allows us to draw a unified framework explaining why the different rewiring operations studied thus far do indeed help mitigate over-squashing. I will then discuss new work where we introduce notions of dynamic edge addition and a delay mechanism, resulting in a novel way of treating the underlying graph that can enhance any existing MPNN and achieve strong performance on long-range tasks. Finally, I will focus on the bigger picture and on why I think these directions of research are important for understanding whether Graph Neural Networks are here to stay.

Bio: https://francescodgv.github.io/
Abstract
Neural networks that are able to reliably execute algorithmic computation may hold transformative potential to both machine learning and theoretical computer science. On one hand, they could enable the kind of extrapolative generalisation scarcely seen with deep learning models. On another, they may allow for running classical algorithms on inputs previously considered inaccessible to them. Both of these promises are shepherded by the neural algorithmic reasoning blueprint, which has been recently proposed in a position paper I co-authored with Charles Blundell. On paper, this is a remarkably elegant pipeline for reasoning on natural inputs which carefully leverages the tried-and-tested power of deep neural networks as feature extractors. In practice, how far did we actually take it? In this tutorial, we aim to provide the foundations needed to answer three key questions of neural algorithmic reasoning: how to develop neural networks that execute algorithmic computation, how to deploy such neural networks in real-world problems, and how to deepen their theoretical links to classical algorithms.

Bio: https://petar-v.com/
Abstract
Whilst considered the leading architectures for Deep Learning on graph-structured data, Message Passing Neural Networks (MPNNs) are intrinsically bounded in their expressive power by the WL heuristic for graph isomorphism testing, a limitation that has recently ignited prolific research aimed at breaking this fundamental bottleneck. Yet, the most prominent approaches either suffer from high computational complexity and weak inductive biases or require some form of domain knowledge for their effective application. In this talk, which I structure in two parts, I will argue that these limitations can be mitigated by resorting to a principled modelling of subgraphs. We will first observe that graphs not distinguishable by MPNNs often contain distinguishable subgraphs. In the first part of the talk, we will build upon this intuition to design a novel framework dubbed Equivariant Subgraph Aggregation Networks (ESAN), which prescribes representing a graph as a bag of subgraphs returned by a pre-specified policy and processing it with a suitable equivariant architecture. We show that this approach effectively increases the expressive power of both MPNNs and more expressive architectures, and we study the impact of its design choices. Interestingly, we notice a surge in concurrent approaches which, sometimes unwittingly, make use of subgraphs for powerful graph representations. In the second part of the talk we will focus on the most prominent form of these methods: those where subgraphs are directly “generated” by nodes in the original graph (node-based methods). A novel symmetry analysis allows us to unify and better characterise this class of approaches: we prove an upper bound on their expressive power and conceive a framework serving as a design space for equivariant node-based subgraph architectures. Finally, we introduce a novel instantiation of this framework: a new method dubbed SUN, which captures previous architectures while providing better empirical performance on multiple benchmarks.

Bio: https://noired.github.io/
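A minimal sketch of the bag-of-subgraphs idea, using node deletion as one simple choice of the pre-specified policy the abstract mentions (the equivariant architecture that processes the bag is omitted). The example pair below, two triangles versus a single 6-cycle, is a classic case that 1-WL, and hence plain MPNNs, cannot distinguish, yet their subgraph bags differ.

```python
import networkx as nx

def node_deletion_bag(G):
    """Bag of subgraphs under a node-deletion policy: one subgraph per removed node."""
    return [G.subgraph([n for n in G.nodes if n != v]).copy() for v in G.nodes]

# Two 6-node, 2-regular graphs indistinguishable by 1-WL / plain message passing:
G1 = nx.disjoint_union(nx.cycle_graph(3), nx.cycle_graph(3))  # two triangles
G2 = nx.cycle_graph(6)                                        # one 6-cycle

bag1, bag2 = node_deletion_bag(G1), node_deletion_bag(G2)
# Deleting any node of G1 leaves a triangle plus a 2-path (2 components);
# deleting any node of G2 leaves a single 5-path (1 component).
print([nx.number_connected_components(H) for H in bag1])  # [2, 2, 2, 2, 2, 2]
print([nx.number_connected_components(H) for H in bag2])  # [1, 1, 1, 1, 1, 1]
```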
About
Organisers (Faculty): Michael Bronstein, Ismail Ilkan Ceylan, Mihai Cucuringu, Xiaowen Dong, Renaud Lambiotte
Organisers (Students): Álvaro Arroyo, Jacob Bamberger, Federico Barbero, Ben Finkelshtein, Emily Jin