"False"
Skip to content
printicon
Main menu hidden.

Mathematical Foundations of AI Seminar

The Seminars in Mathematical Foundations of Artificial Intelligence are open to employees and students of Umeå University.

Subscribe to the seminar e-mail list

Subscribe to mathfoundations-ai-seminar@lists.umu.se for notification of future seminars.

Send an email to sympa@lists.umu.se and in the subject/heading of the email write: subscribe mathfoundations-ai-seminar

Leave the body of the email blank.

28 March 2025, 15:15-16:15

On the Geometry and Optimization of Polynomial Convolutional Networks

Speaker: Vahid Shahverdi, KTH

Abstract: In this talk, I will explore the rich interplay between algebraic geometry and convolutional neural networks (CNNs) with polynomial activation functions. At the heart of this study is the parameterization map, which translates network parameters into functions. We show that this map is a regular morphism and an isomorphism almost everywhere, up to scaling symmetries. The image of this map, which we call the “neuromanifold,” exhibits intricate geometric properties. I will discuss its dimension, degree, and singularities, and their implications for the learning process. Beyond structural insights, I will highlight a connection between the geometry of the neuromanifold and optimization: for large generic datasets, we compute the number of critical points that arise during training with a quadratic loss function, using tools from metric algebraic geometry. This is joint work with Giovanni Luca Marchetti and Kathlén Kohn.
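The scaling symmetries mentioned in the abstract can be made concrete with a toy example (my own illustration, not taken from the talk): in a two-layer 1D CNN with activation x ↦ x², rescaling the first filter by c and dividing the second by c² yields different parameters but the same function, so the parameterization map is only an isomorphism up to such symmetries.

```python
import numpy as np

# Illustrative sketch (not from the talk): a tiny 1D CNN with polynomial
# activation x -> x**2, demonstrating a scaling symmetry of the
# parameterization map from parameters to functions.

def conv1d(x, w):
    # valid cross-correlation of signal x with filter w
    k = len(w)
    return np.array([x[i:i + k] @ w for i in range(len(x) - k + 1)])

def network(x, w1, w2):
    h = conv1d(x, w1) ** 2      # polynomial activation of degree 2
    return conv1d(h, w2)

rng = np.random.default_rng(0)
x = rng.standard_normal(10)
w1 = rng.standard_normal(3)
w2 = rng.standard_normal(3)

c = 2.5
out_a = network(x, w1, w2)
out_b = network(x, c * w1, w2 / c**2)  # compensate the squared rescaling

print(np.allclose(out_a, out_b))  # True: same function, different parameters
```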

13 March 2025, 16:00-16:45

Equivariant Neural Tangent Kernels

Speaker: Philipp Misof, Chalmers University of Technology

Abstract: In recent years, the neural tangent kernel (NTK) has proven to be a valuable tool for studying the training dynamics of neural networks (NNs) analytically. In this talk, I will present how the NTK framework can be extended to equivariant NNs based on group convolutional NNs (GCNNs). Not only does this enable the analytic study of the influence of hyperparameters, training biases, etc. in equivariant NNs, but it also allows us to draw an interesting connection between data augmentation and manifestly equivariant architectures. In particular, we show that the mean predictions of an ensemble of data-augmented non-equivariant networks coincide with the mean predictions of an ensemble of specific GCNNs at all training times in the infinite-width limit. We further provide explicit implementations of the equivariant NTK for roto-translations in the plane and 3D rotations. To evaluate the performance of the equivariant infinite-width solution, we benchmark the models on quantum mechanical property prediction and medical image classification. This talk is based on joint work with Jan Gerken and Pan Kessel.
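A finite-group toy version of the augmentation-vs-architecture correspondence in the abstract (my own illustration, not the NTK result itself): averaging a non-invariant model's predictions over a group's action on the input yields a model that is exactly invariant under that group.

```python
import numpy as np

# Illustrative sketch (setup and names are my own, not from the talk):
# group-averaging a generic model over cyclic shifts produces a
# shift-invariant predictor, the finite analogue of averaging over
# data augmentations.

def model(x, w):
    # a generic (non-invariant) linear model
    return x @ w

def group_averaged(x, w):
    # average predictions over all cyclic shifts of the input
    n = len(x)
    return np.mean([model(np.roll(x, g), w) for g in range(n)])

rng = np.random.default_rng(1)
x = rng.standard_normal(6)
w = rng.standard_normal(6)

# invariance check: shifting the input leaves the averaged output unchanged
print(np.isclose(group_averaged(x, w), group_averaged(np.roll(x, 2), w)))  # True
```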

Place: MIT.A.356

6 March 2025, 14:15-14:55

Learning-Based Surrogate Models for the Fluid Dynamics of a Pharmaceutical Bioreactor

Speaker: Umut Kaya, Daiichi Sankyo Europe GmbH  / University of Ghent

Abstract: We developed learning-based surrogate models to predict fluid dynamics in pharmaceutical bioreactors. Traditional CFD simulations take too long to run, making real-time process optimization impractical. By using machine learning, including graph neural networks and reduced-order modeling, we built models that provide fast and accurate predictions of hydrodynamic stress and mixing behavior. These surrogate models significantly cut down computational costs while maintaining reliability, making them valuable for biopharmaceutical process development. Our work bridges the gap between physics-based modeling and data-driven approaches, helping improve bioprocess design, monitoring, and control.
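One of the building blocks named in the abstract, reduced-order modeling, can be sketched in a few lines (my own illustration, not the speaker's implementation): proper orthogonal decomposition (POD) compresses high-dimensional flow snapshots onto a few dominant modes, which is what makes fast surrogate evaluation possible.

```python
import numpy as np

# Illustrative sketch (my own, not from the talk): POD via SVD.
# Snapshots of a high-dimensional field are projected onto a few
# leading modes, giving a cheap low-rank reconstruction.

rng = np.random.default_rng(0)
# synthetic snapshot matrix: 200 spatial points x 50 time snapshots,
# generated from 3 underlying modes plus small noise
modes_true = rng.standard_normal((200, 3))
coeffs = rng.standard_normal((3, 50))
snapshots = modes_true @ coeffs + 0.01 * rng.standard_normal((200, 50))

# POD: the leading left-singular vectors form the reduced basis
U, s, Vt = np.linalg.svd(snapshots, full_matrices=False)
basis = U[:, :3]                  # keep r = 3 modes
reduced = basis.T @ snapshots     # 3 x 50 reduced coordinates
reconstructed = basis @ reduced

rel_err = np.linalg.norm(snapshots - reconstructed) / np.linalg.norm(snapshots)
print(rel_err < 0.05)  # True: 3 modes capture almost all of the signal
```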

Place: MIT.A.356

13 February 2025, 15:15-16:15

Communication-efficient distributed optimization algorithms

Speaker: Laurent Condat, King Abdullah University of Science and Technology (KAUST)

Abstract: In distributed optimization and machine learning, a large number of machines perform computations in parallel and communicate back and forth with a distant server. Communication can be costly and slow, in particular in federated learning. To address the communication bottleneck, two strategies are popular: 1) communicate less frequently; 2) compress the communicated vectors. In addition, a robust algorithm should allow for partial participation. I will present several randomized algorithms we developed recently in this area, with proven convergence guarantees and state-of-the-art communication complexity.
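Strategy 2) in the abstract, compressing communicated vectors, is commonly implemented with randomized compressors such as rand-k sparsification. A minimal sketch (my own illustration, not one of the speaker's algorithms): each worker sends only k of d coordinates, rescaled by d/k so that the compressor is unbiased, E[C(x)] = x.

```python
import numpy as np

# Illustrative sketch (not from the talk): an unbiased rand-k sparsifier,
# a standard compressor for reducing communication in distributed
# optimization. Only k of d coordinates are transmitted, scaled by d/k
# to keep the estimator unbiased.

def rand_k(x, k, rng):
    d = len(x)
    idx = rng.choice(d, size=k, replace=False)  # coordinates to transmit
    out = np.zeros_like(x)
    out[idx] = x[idx] * (d / k)                 # rescale for unbiasedness
    return out

rng = np.random.default_rng(0)
x = np.arange(1.0, 9.0)          # d = 8
# empirical mean over many draws approaches x, confirming E[C(x)] = x
est = np.mean([rand_k(x, 2, rng) for _ in range(20000)], axis=0)
print(np.allclose(est, x, atol=0.5))  # True
```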

Place: MIT.A.346

Latest update: 2025-03-24