### Wed 24 Apr 14:00: Feature Learning in Two-layer Neural Networks: The Effect of Data Covariance

We study the effect of gradient-based optimization on feature learning in two-layer neural networks. We consider a setting where the number of samples is of the same order as the input dimension and show that, when the input data is isotropic, gradient descent always improves upon the initial random features model in terms of prediction risk, for a certain class of targets. Further leveraging the practical observation that data often contains additional structure, i.e., the input covariance has non-trivial alignment with the target, we prove that the class of learnable targets can be significantly extended, demonstrating a clear separation between kernel methods and two-layer neural networks in this regime.

- Speaker: Murat A. Erdogdu (University of Toronto)
- Wednesday 24 April 2024, 14:00-15:00
- Venue: Centre for Mathematical Sciences, MR14.
- Series: Applied and Computational Analysis; organiser: Nicolas Boulle.

### Thu 25 Apr 14:00: Machine Learning and Dynamical Systems Meet in Reproducing Kernel Hilbert Spaces with Insights from Algorithmic Information Theory

Since its inception in the 19th century, through the efforts of Poincaré and Lyapunov, the theory of dynamical systems has addressed the qualitative behavior of systems as understood from models. From this perspective, modeling dynamical processes in applications demands a detailed understanding of the processes to be analyzed. This understanding leads to a model, which approximates observed reality and is often expressed by a system of ordinary/partial, underdetermined (control), deterministic/stochastic differential or difference equations. While these models are very precise for many processes, for some of the most challenging applications of dynamical systems, such as climate dynamics, brain dynamics, biological systems, or financial markets, developing such models is notably difficult. On the other hand, the field of machine learning is concerned with algorithms designed to accomplish specific tasks, whose performance improves with more data input. Applications of machine learning methods include computer vision, stock market analysis, speech recognition, recommender systems, and sentiment analysis in social media. The machine learning approach is invaluable in settings where no explicit model is formulated, but measurement data are available. This is often the case in many systems of interest, and the development of data-driven technologies is increasingly important in many applications. The intersection of the fields of dynamical systems and machine learning is largely unexplored, and the objective of this talk is to show that working in reproducing kernel Hilbert spaces offers tools for a data-based theory of nonlinear dynamical systems.

In the first part of the talk, we introduce simple methods to learn surrogate models for complex systems. We present variants of the method of Kernel Flows as simple approaches for learning the kernels that appear in the emulators we use in our work. First, we will discuss the method of parametric and nonparametric kernel flows for learning chaotic dynamical systems. We will also explore learning dynamical systems from irregularly sampled time series and from partial observations. We will introduce the methods of Sparse Kernel Flows and Hausdorff-metric based Kernel Flows (HMKFs) and apply them to learn 132 chaotic dynamical systems. We draw parallels between Minimum Description Length (MDL) and Regularization in Machine Learning (RML), showcasing that the method of Sparse Kernel Flows offers a natural approach to kernel learning. By considering code lengths and complexities rooted in Algorithmic Information Theory (AIT), we demonstrate that data-adaptive kernel learning can be achieved through the MDL principle, bypassing the need for cross-validation as a statistical method. Finally, we extend the method of Kernel Mode Decomposition to design kernels with a view to detecting critical transitions in some fast-slow random dynamical systems.

Then, we introduce a data-based approach to estimating key quantities which arise in the study of nonlinear autonomous, control, and random dynamical systems. Our approach hinges on the observation that much of the existing linear theory may be readily extended to nonlinear systems – with a reasonable expectation of success – once the nonlinear system has been mapped into a high- or infinite-dimensional Reproducing Kernel Hilbert Space. We develop computable, non-parametric estimators approximating controllability and observability energies for nonlinear systems. We apply this approach to the problem of model reduction of nonlinear control systems. We also show that the controllability energy estimator provides a key means for approximating the invariant measure of an ergodic, stochastically forced nonlinear system. Finally, we show how kernel methods can be used to approximate center manifolds, propose a data-based version of the center manifold theorem, and construct Lyapunov functions for nonlinear ODEs.

- Speaker: Boumediene Hamzi
- Thursday 25 April 2024, 14:00-15:00
- Venue: Centre for Mathematical Sciences, MR14.
- Series: Applied and Computational Analysis; organiser: Matthew Colbrook.
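
For readers unfamiliar with Kernel Flows, the sketch below evaluates the loss that the basic method is usually described as minimizing: rho = 1 - ||u_half||^2 / ||u_full||^2, where u_full and u_half are kernel interpolants of all the data and of a random half of it. A kernel for which rho is small loses little when half the data is dropped. This is an illustration under stated assumptions and not the speaker's implementation: the logistic map as test system, the Gaussian kernel, the nugget value, and all function names are choices made here, and the sparse and Hausdorff-metric variants mentioned in the abstract are not shown.

```python
import math
import random


def gaussian_kernel(x, y, lengthscale):
    """Gaussian (RBF) kernel on scalars; the kernel family is an assumption."""
    return math.exp(-((x - y) ** 2) / (2.0 * lengthscale ** 2))


def solve(a, b):
    """Solve the linear system a z = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    z = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][c] * z[c] for c in range(r + 1, n))
        z[r] = (m[r][n] - s) / m[r][r]
    return z


def rkhs_norm_sq(xs, ys, lengthscale, nugget):
    """Squared norm y^T (K + nugget * I)^{-1} y of the regularized interpolant."""
    k = [[gaussian_kernel(a, b, lengthscale) + (nugget if i == j else 0.0)
          for j, b in enumerate(xs)] for i, a in enumerate(xs)]
    alpha = solve(k, ys)
    return sum(y * a for y, a in zip(ys, alpha))


def kernel_flows_rho(xs, ys, lengthscale, nugget=1e-6, seed=0):
    """Relative loss incurred by interpolating with a random half of the data."""
    rng = random.Random(seed)
    half = sorted(rng.sample(range(len(xs)), len(xs) // 2))
    full = rkhs_norm_sq(xs, ys, lengthscale, nugget)
    sub = rkhs_norm_sq([xs[i] for i in half], [ys[i] for i in half],
                       lengthscale, nugget)
    return 1.0 - sub / full


# Hypothetical test system: a short trajectory of the logistic map.
trajectory = [0.2]
for _ in range(20):
    trajectory.append(4.0 * trajectory[-1] * (1.0 - trajectory[-1]))
xs, ys = trajectory[:-1], trajectory[1:]  # learn the map x_n -> x_{n+1}

# Crude grid search over the lengthscale; Kernel Flows itself would follow
# the gradient of rho over random batches instead.
rhos = {ls: kernel_flows_rho(xs, ys, ls) for ls in (0.05, 0.2, 0.5, 1.0)}
best_lengthscale = min(rhos, key=rhos.get)
print(rhos, best_lengthscale)
```

The grid search stands in for the stochastic gradient descent on kernel parameters used in practice; it only shows how rho ranks candidate kernels on one fixed split of the data.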

### Thu 16 May 15:00: Efficient Computation through Tuned Approximation

Numerical software is being reconstructed to provide opportunities to dynamically tune the accuracy of computation to the requirements of the application, resulting in savings of memory, time, and energy. Floating point computation in science and engineering has a history of “oversolving” relative to requirements or worthiness for many models. So often are real datatypes defaulted to double precision that GPUs did not gain wide acceptance in simulation environments until they provided in hardware operations not required in their original domain of graphics. However, driven by performance or energy incentives, much of computational science is now reverting to lower-precision arithmetic where possible. Many matrix operations considered at a blockwise level allow for lower precision and, in addition, many blocks can be approximated with low-rank near equivalents. This leads to a smaller memory footprint, which implies higher residency on memory hierarchies, leading in turn to less time and energy spent on data copying, which may even dwarf the savings from fewer and cheaper flops. We provide examples from several application domains, including a look at campaigns in geospatial statistics and seismic processing that earned Gordon Bell Prize finalist status in 2022 and 2023, respectively.

- Speaker: David Keyes (KAUST)
- Thursday 16 May 2024, 15:00-16:00
- Venue: Centre for Mathematical Sciences, MR14.
- Series: Applied and Computational Analysis; organiser: Hamza Fawzi.
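
As a rough illustration of the memory argument in this abstract, the sketch below compares the storage of one dense matrix block in double precision with the storage of a low-rank factorisation of the same block in single precision, and shows the rounding error incurred by storing a value in single precision. The block size, the rank, and the function names are hypothetical choices made here for the arithmetic; they are not figures from the talk.

```python
import struct


def to_single(x):
    """Round a Python float (double precision) to the nearest single-precision value."""
    return struct.unpack("f", struct.pack("f", x))[0]


def dense_bytes(n, bytes_per_value):
    """Storage for a dense n-by-n block."""
    return n * n * bytes_per_value


def low_rank_bytes(n, rank, bytes_per_value):
    """Storage for the factors of A ~ U V^T, where U and V are both n-by-rank."""
    return 2 * n * rank * bytes_per_value


n, rank = 1024, 16  # illustrative block size and rank, not taken from the talk

dense_double = dense_bytes(n, 8)              # 8 bytes per double
low_rank_single = low_rank_bytes(n, rank, 4)  # 4 bytes per single
ratio = dense_double / low_rank_single

# Relative error from storing 0.1 in single precision instead of double.
rel_err = abs(to_single(0.1) - 0.1) / 0.1

print(dense_double, low_rank_single, ratio)  # 8388608 131072 64.0
print(rel_err)
```

Under these assumptions the block shrinks by a factor of 64, while each stored value keeps a relative accuracy bounded by the single-precision unit roundoff of 2^-24. Whether that accuracy and that rank are acceptable depends on the application, which is the tuning decision the abstract describes.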

### Thu 02 May 14:00: Title to be confirmed

Abstract not available

- Speaker: Sina Mohammadtaheri
- Thursday 02 May 2024, 14:00-15:00
- Venue: Centre for Mathematical Sciences, MR14.
- Series: Applied and Computational Analysis; organiser: Matthew Colbrook.

### Thu 11 Apr 14:00: The Algorithmic Transparency Requirement

Deep learning still has drawbacks in terms of trustworthiness, a notion that describes a method as comprehensible, fair, safe, and reliable. To mitigate the potential risks of AI, clear obligations associated with trustworthiness have been proposed via regulatory guidelines, e.g., in the European AI Act. A central question is therefore to what extent trustworthy deep learning can be realized. Establishing the properties that constitute trustworthiness requires that the factors influencing an algorithmic computation can be retraced, i.e., that the algorithmic implementation is transparent. We derive a mathematical framework which enables us to analyze whether a transparent implementation in a computing model is feasible. Finally, as an example, we apply our trustworthiness framework to analyze deep learning approaches for inverse problems in digital computing models represented by Turing machines.

- Speaker: Adalbert Fono (LMU Munich)
- Thursday 11 April 2024, 14:00-15:00
- Venue: Centre for Mathematical Sciences, MR14.
- Series: Applied and Computational Analysis; organiser: Matthew Colbrook.
