Fall 2020

Date Speaker Topic
M Sep 14 Organizational Meeting  




M Sep 21  




M Sep 28 Vince Lyzinski (UMD) The Importance of Being Correlated: Implications of Dependence in Joint Spectral Inference across Multiple Networks


Abstract: Spectral inference on multiple networks is a rapidly-developing subfield of graph statistics. Recent work has demonstrated that joint, or simultaneous, spectral embedding of multiple independent network realizations can deliver more accurate estimation than individual spectral decompositions of those same networks. Little attention has been paid, however, to the network correlation that such joint embedding procedures necessarily induce. In this paper, we present a detailed analysis of induced correlation in a generalized omnibus embedding for multiple networks. We show that our embedding procedure is flexible and robust, and, moreover, we prove a central limit theorem for this embedding and explicitly compute the limiting covariance. We examine how this covariance can impact inference in a network time series, and we construct an appropriately calibrated omnibus embedding that can detect changes in real biological networks that previous embedding procedures could not discern. Our analysis confirms that the effect of induced correlation can be both subtle and transformative, with import in theory and practice.
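For readers unfamiliar with the construction this talk generalizes, the classical omnibus embedding can be sketched in a few lines: stack the m adjacency matrices into one block matrix whose (i, j) block is (A_i + A_j)/2, then spectrally embed the whole thing at once. The sizes, embedding dimension, and random-graph model below are illustrative choices, not the ones from the talk:

```python
import numpy as np

def omnibus_embed(adjacencies, d):
    """Classical omnibus embedding: stack m graphs into an mn x mn
    block matrix with (i, j) block (A_i + A_j) / 2, then use the top-d
    scaled eigenvectors as a joint embedding of all m graphs."""
    m = len(adjacencies)
    n = adjacencies[0].shape[0]
    M = np.zeros((m * n, m * n))
    for i in range(m):
        for j in range(m):
            M[i*n:(i+1)*n, j*n:(j+1)*n] = (adjacencies[i] + adjacencies[j]) / 2
    vals, vecs = np.linalg.eigh(M)
    idx = np.argsort(np.abs(vals))[::-1][:d]        # top d by magnitude
    X = vecs[:, idx] * np.sqrt(np.abs(vals[idx]))   # scale by sqrt(eigenvalue)
    return X.reshape(m, n, d)                       # one n x d embedding per graph

# three independent Erdos-Renyi graphs with the same edge probability
rng = np.random.default_rng(0)
n = 20
graphs = []
for _ in range(3):
    A = (rng.random((n, n)) < 0.3).astype(float)
    A = np.triu(A, 1)
    A = A + A.T                                     # symmetric, zero diagonal
    graphs.append(A)
X = omnibus_embed(graphs, d=1)
```

Because each graph's rows share the same block matrix, the m per-graph embeddings are automatically aligned; the correlation that this sharing induces across graphs is precisely what the talk analyzes.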

M Oct 5 Howard Elman (UMD) A Low-Rank Solver for the Stochastic Unsteady Navier-Stokes Problem


Abstract: We study a low-rank iterative solver for the unsteady Navier–Stokes equations for incompressible flows with a stochastic viscosity. The equations are discretized using the stochastic Galerkin method, and we consider an all-at-once formulation where the algebraic systems at all the time steps are collected and solved simultaneously. The problem is linearized with Picard’s method. To efficiently solve the linear systems at each step, we use low-rank tensor representations within the Krylov subspace method, which leads to significant reductions in storage requirements and computational costs. Combined with effective mean-based preconditioners and the idea of inexact solve, we show that only a small number of linear iterations are needed at each Picard step. The proposed algorithm is tested with a model of flow in a two-dimensional symmetric step domain with different settings to demonstrate the computational efficiency.
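The low-rank idea can be illustrated in isolation: the solver carries each iterate in factored form and re-compresses it whenever an operation (such as a Krylov matrix-vector product) inflates the rank. A minimal numpy sketch of such a truncation step, with arbitrary sizes and tolerance; the actual solver applies this to tensor-structured iterates inside the Krylov method:

```python
import numpy as np

def truncate(U, V, tol=1e-8):
    """Re-compress a low-rank iterate X ~= U @ V.T: thin QR of each
    factor, SVD of the small core, drop small singular values."""
    Qu, Ru = np.linalg.qr(U)
    Qv, Rv = np.linalg.qr(V)
    W, s, Zt = np.linalg.svd(Ru @ Rv.T)
    k = max(1, int(np.sum(s > tol * s[0])))         # numerical rank
    return Qu @ W[:, :k] * s[:k], Qv @ Zt[:k].T

# a rank-3 matrix carried in redundant rank-10 factors
rng = np.random.default_rng(1)
U = np.hstack([rng.standard_normal((200, 3)), np.zeros((200, 7))])
V = np.hstack([rng.standard_normal((150, 3)), np.zeros((150, 7))])
U2, V2 = truncate(U, V)
```

The storage drops from the full 200 x 150 matrix to the two thin factors, which is the source of the memory and cost savings the abstract describes.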

M Oct 12 No seminar University Holiday
M Oct 19 Rongjie Lai (RPI) Chart Auto-encoder for Manifold-Structured Data



Abstract: Deep generative models have made tremendous advances in image and signal representation learning and generation. These models employ the full Euclidean space or a bounded subset as the latent space, whose flat geometry, however, is often too simplistic to meaningfully reflect the manifold structure of the data. In this talk, I will discuss our recent work advocating the use of a multi-chart latent space for better data representation. We analyze the topology requirement of the latent space for a faithful latent representation of manifold-structured data. Inspired by differential geometry, we propose a Chart Auto-Encoder (CAE) and prove a universal approximation theorem on its representation capability. We show that the training data size and the network size scale exponentially in approximation error with an exponent depending on the intrinsic dimension of the data manifold. CAE admits desirable manifold properties that auto-encoders with a flat latent space fail to obey, most notably preservation of data proximity. We conduct extensive experimentation with synthetic and real-life examples to demonstrate that CAE provides reconstruction with high fidelity, preserves proximity in the latent space, and generates new data that remain near the manifold. This is joint work with Stefan Schonsheck and Jie Chen.

M Oct 26 Gal Mishne (UCSD)  Multiway Tensor Analysis with Neuroscience Applications



Abstract: Experimental advances in neuroscience enable the acquisition of increasingly large-scale, high-dimensional and high-resolution neuronal and behavioral datasets; however, addressing the full spatiotemporal complexity of these datasets poses significant challenges for data analysis and modeling. We propose to model such datasets as multiway tensors with an underlying graph structure along each mode, learned from the data. In this talk I will present three frameworks we have developed to model, analyze and organize tensor data that infer the coupled multi-scale structure of the data, reveal latent variables and visualize short- and long-term temporal dynamics, with applications in calcium imaging analysis, fMRI and artificial neural networks.

M Nov 2 Sui Tang (UCSB) Learning interaction laws in multi-agent systems from data




Abstract: Multi-agent systems are ubiquitous in science, from the modeling of particles in physics, to predator-prey dynamics in biology, to opinion dynamics in economics and the social sciences, where the interaction law between agents yields a rich variety of collective dynamics. Inferring the interaction laws between agents from observational trajectory data is a fundamental task for modeling and prediction.



Given abundant data sampled from multiple trajectories, we use tools from statistical/machine learning to construct estimators for interaction kernels with provably good statistical and computational properties, under the minimal assumption that the interaction kernels depend only on pairwise distances. In particular, we show that despite the high dimensionality of the systems, optimal learning rates can still be achieved, equal to those of a one-dimensional regression problem. Numerical simulations on a variety of examples suggest the learnability of kernels in models used in practice, show that our estimators are robust to noise, and produce accurate predictions of collective dynamics over relatively large time intervals, even when they are learned from data collected over short time intervals. This talk is based on joint work with Mauro Maggioni, Jason Miller, Fei Lu and Ming Zhong.
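A toy version of this estimation problem: simulate a first-order system dx_i/dt = (1/N) Σ_j φ(|x_j − x_i|)(x_j − x_i) with a known kernel, then recover φ by least squares over a piecewise-constant basis in the pairwise distance. Everything below (the kernel, the bin count, the dynamics) is an illustrative assumption, not the estimator from the talk:

```python
import numpy as np

rng = np.random.default_rng(0)
N, T, dt = 8, 400, 0.01
phi_true = lambda r: np.exp(-r)              # assumed pairwise kernel

def rhs(X, phi):
    """dx_i/dt = (1/N) * sum_j phi(|x_j - x_i|) (x_j - x_i)."""
    D = X[None, :, :] - X[:, None, :]        # D[i, j] = x_j - x_i
    R = np.linalg.norm(D, axis=2)
    W = phi(R)
    np.fill_diagonal(W, 0.0)
    return (W[:, :, None] * D).sum(axis=1) / N

# forward-Euler trajectories; record the exact velocities along the way
X = rng.standard_normal((N, 2))
states, vels = [], []
for _ in range(T):
    V = rhs(X, phi_true)
    states.append(X.copy()); vels.append(V)
    X = X + dt * V

# least-squares fit of phi on a piecewise-constant basis in r
M = 20
edges = np.linspace(0.0, 4.0, M + 1)
rows, targets = [], []
for Xk, Vk in zip(states, vels):
    D = Xk[None, :, :] - Xk[:, None, :]
    R = np.linalg.norm(D, axis=2)
    bins = np.clip(np.digitize(R, edges) - 1, 0, M - 1)
    for i in range(N):
        feat = np.zeros((2, M))
        for j in range(N):
            if j != i:
                feat[:, bins[i, j]] += D[i, j] / N
        rows.append(feat)
        targets.append(Vk[i])
A_mat = np.vstack(rows)                      # design matrix, one row per velocity component
b_vec = np.concatenate(targets)
coef, *_ = np.linalg.lstsq(A_mat, b_vec, rcond=None)
```

Each basis coefficient estimates φ on one distance bin; the fit is a one-dimensional regression in r even though the trajectories live in a much higher-dimensional state space, which is the phenomenon behind the learning rates quoted above.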
M Nov 9 Bahareh Tolooshams (Harvard) On the relationship between dictionary learning and sparse autoencoders



Abstract: Dictionary learning refers to the problem of learning a dictionary from a dataset in which each data point can be represented as a sparse linear combination of atoms of the dictionary. We discuss dictionary learning from a probabilistic generative model perspective, and describe its generalization beyond Gaussian-distributed data. We explore connections between dictionary learning and neural networks. In this framework, the encoder, performing sparse coding, maps the data into a sparse representation, and the decoder contains the dictionary reconstructing the data. We demonstrate the usefulness of dictionary learning and its corresponding neural network in inverse problems such as denoising.
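A minimal sketch of this encoder/decoder view, assuming an ISTA-based sparse-coding encoder (a standard choice for the sparse-coding step; the talk's architecture may differ): the encoder iterates gradient steps plus soft thresholding, and the decoder is simply multiplication by the dictionary.

```python
import numpy as np

def soft_threshold(z, lam):
    return np.sign(z) * np.maximum(np.abs(z) - lam, 0.0)

def ista_encode(D, x, lam=0.05, n_iter=500):
    """Encoder: sparse coding by ISTA, approximately minimizing
    0.5 * ||x - D z||^2 + lam * ||z||_1 over the code z."""
    L = np.linalg.norm(D, 2) ** 2       # Lipschitz constant of the gradient
    z = np.zeros(D.shape[1])
    for _ in range(n_iter):
        z = soft_threshold(z + D.T @ (x - D @ z) / L, lam / L)
    return z

rng = np.random.default_rng(0)
D = rng.standard_normal((50, 100))
D /= np.linalg.norm(D, axis=0)          # dictionary with unit-norm atoms
z_true = np.zeros(100)
z_true[[3, 40, 77]] = [1.5, -2.0, 1.0]  # a 3-sparse code
x = D @ z_true                          # data generated by the dictionary
z = ista_encode(D, x)                   # encoder: sparse representation
x_hat = D @ z                           # decoder: linear reconstruction
```

Unrolling a fixed number of these ISTA iterations and making D trainable yields exactly the kind of sparse autoencoder whose relationship to dictionary learning the talk examines.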

M Nov 16 Kasso Okoudjou (Tufts) Optimal l1 rank one matrix decomposition




For a positive semidefinite d\times d matrix A, let

    \[\begin{cases} \gamma_+(A):=\inf_{A=\sum_{n\geq 1}g_ng_n^*}\sum_{n\geq 1}\|g_n\|_1^2\\ \gamma(A):=\inf_{A=\sum_{n\geq 1}g_nh_n^*}\sum_{n\geq 1}\|g_n\|_1\|h_n\|_1\\ \gamma_0(A):=\inf_{A=\sum_{n\geq 1}g_ng_n^*-\sum_{k\geq 1}h_kh_k^*}\Big(\sum_{n\geq 1}\|g_n\|_1^2+\sum_{k\geq 1}\|h_k\|_1^2\Big) \end{cases}\]

where \|x\|_1=\sum_{k=1}^d|\langle x, e_k\rangle| with respect to a fixed ONB \{e_k\}_{k=1}^d of \mathbb{C}^d.

In this talk, I will discuss optimal rank one decompositions of positive semidefinite d\times d matrices with respect to the above functionals. For some classes of positive semidefinite matrices, including diagonally dominant matrices and certain of their generalizations, 2\times 2 matrices, and a class of 3\times 3 matrices, I will give these optimal decompositions explicitly. I will motivate the talk with a brief discussion of a related infinite-dimensional problem.

This talk is based on joint work with R. Balan, M. Rawson, Z. Rui and Y. Wang.

T Nov 24 (1:30-2:30) Michael Perlmutter (UCLA)  Understanding Convolutional Neural Networks via Scattering



Abstract: The scattering transform is a mathematical framework for understanding convolutional neural networks (CNNs), originally introduced by Stéphane Mallat for functions defined on \mathbb{R}^n. Similar to standard CNNs, the scattering transform is constructed as an alternating cascade of convolutions and nonlinearities. It differs from traditional CNNs by using predesigned wavelet filters rather than filters learned from training data. This leads to a network that provably has desirable mathematical properties, such as translation invariance and diffeomorphism stability.

In addition to these theoretical properties, the scattering transform is also a practical object that can achieve near-state-of-the-art numerical results in certain settings. Moreover, beyond performing well on tasks such as image recognition, the scattering transform can also be used to extract statistical information about stochastic processes with stationary increments. I will provide an overview of Mallat's original construction and also discuss recent variations of the scattering transform which are customized for certain tasks, including geometric deep learning and quantum chemistry.
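A toy one-dimensional version of the cascade, using periodic Haar wavelets as the predesigned filters (Mallat's construction uses more carefully designed wavelets; this sketch only shows the convolve-modulus-average structure and the resulting translation invariance):

```python
import numpy as np

def haar_filters(n, J):
    """Dyadic Haar wavelets psi_j and an averaging filter phi,
    all of length n, for a periodic 1-D signal."""
    psis = []
    for j in range(J):
        w = 2 ** j
        h = np.zeros(n)
        h[:w] = 1.0 / (2 * w)
        h[w:2*w] = -1.0 / (2 * w)
        psis.append(h)
    phi = np.ones(n) / n                 # global average
    return psis, phi

def conv_periodic(x, h):
    return np.real(np.fft.ifft(np.fft.fft(x) * np.fft.fft(h)))

def scattering(x, J=3):
    """Two-layer scattering: S0 = phi*x, S1_j = phi*|psi_j*x|,
    S2_{j,j'} = phi*|psi_j'*|psi_j*x||  for j' > j."""
    n = len(x)
    psis, phi = haar_filters(n, J)
    feats = [conv_periodic(x, phi).mean()]
    for j, p in enumerate(psis):
        u = np.abs(conv_periodic(x, p))           # convolution + modulus
        feats.append(conv_periodic(u, phi).mean())
        for j2 in range(j + 1, J):
            u2 = np.abs(conv_periodic(u, psis[j2]))
            feats.append(conv_periodic(u2, phi).mean())
    return np.array(feats)

x = np.sin(2 * np.pi * np.arange(64) / 64)
S = scattering(x, J=3)
S_shift = scattering(np.roll(x, 5), J=3)  # circular shift of the signal
```

Because the final averaging is global and the convolutions are circular, `S` and `S_shift` agree to machine precision, which is the (here exact, periodic) analogue of the translation invariance cited above.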

M Nov 30 Rachel Ward (UT Austin)   Stochastic Gradient Descent: From Practice Back to Theory



Abstract: Stochastic Gradient Descent (SGD) is an increasingly popular optimization algorithm for a variety of large-scale learning problems, due to its computational efficiency and ease of implementation.  In particular, SGD is the standard algorithm for training neural networks.  In the neural network industry, certain “tricks” added on top of the basic SGD algorithm have been demonstrated to improve convergence rates and generalization accuracy in practice, without theoretical foundations.  In this talk, we focus on two such tricks: AdaGrad, an adaptive gradient method which automatically adjusts the learning rate schedule in SGD to reduce the need for hyperparameter tuning, and Weight Normalization, where SGD is implemented (essentially) with respect to polar rather than Cartesian coordinates.  We reframe each of these tricks as a general-purpose modification to the standard SGD algorithm, and provide the first theoretical guarantees of convergence, robustness, and generalization performance in a general context.
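The AdaGrad update itself is easy to state: each coordinate's step is divided by the square root of that coordinate's accumulated squared gradients, so the learning-rate schedule adapts automatically. A minimal sketch on a badly scaled deterministic quadratic (the step size and test problem are illustrative, not from the talk):

```python
import numpy as np

def adagrad(grad, x0, eta=1.0, eps=1e-8, n_steps=500):
    """AdaGrad: per-coordinate step eta / sqrt(sum of squared past
    gradients), so steep coordinates automatically get smaller steps."""
    x = np.array(x0, dtype=float)
    G = np.zeros_like(x)                 # running sum of squared gradients
    for _ in range(n_steps):
        g = grad(x)
        G += g * g
        x -= eta * g / (np.sqrt(G) + eps)
    return x

# badly scaled quadratic f(x) = 0.5 * (100 * x0^2 + x1^2)
grad = lambda x: np.array([100.0 * x[0], x[1]])
x_min = adagrad(grad, [1.0, 1.0])
```

Note how the steep coordinate (curvature 100) and the shallow one (curvature 1) are handled by the same hyperparameter `eta`; with plain SGD a single step size either diverges on the first coordinate or crawls on the second.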

M Dec 7 Monika Nitsche (UNM) Evaluation of near-singular integrals with application to vortex sheet and Stokes flow


Abstract: Boundary integral formulations yield efficient numerical methods to solve elliptic boundary value problems. They are the method of choice for interfacial fluid flow, either in the inviscid vortex sheet limit, or the viscous Stokes limit. The fluid velocity at a target point is given by integrals over the interfaces. However, for target points near, but not on, the interface, the integrals are near-singular and standard quadratures lose accuracy. While several accurate methods exist to resolve the analytic integrals that appear in planar geometries, they don't generally apply to the non-analytic case that arises in axisymmetric geometries. Motivated by the latter, we developed a simple method that accurately resolves a large class of integrals. This talk describes the method, presents analytical convergence results, and applies it to resolve planar vortex sheet and axisymmetric Stokes flow.
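The accuracy loss being described is easy to reproduce on a model integral: for I(d) = ∫_{-1}^{1} d/(t^2+d^2) dt = 2 arctan(1/d), whose integrand peaks more and more sharply as the target distance d shrinks, the trapezoidal rule degrades from spectral-looking accuracy to O(1) error. This model integral is only an illustration, not one of the kernels from the talk:

```python
import numpy as np

def trapezoid_error(d, n=200):
    """Trapezoidal-rule error for I(d) = int_{-1}^{1} d / (t^2 + d^2) dt
    = 2*arctan(1/d): a model 'layer potential' evaluated at a target
    point a distance d from the interval of integration."""
    t = np.linspace(-1.0, 1.0, n + 1)
    f = d / (t ** 2 + d ** 2)            # peaks like 1/d at t = 0
    h = t[1] - t[0]
    approx = h * np.sum((f[:-1] + f[1:]) / 2.0)
    exact = 2.0 * np.arctan(1.0 / d)
    return abs(approx - exact)

err_far = trapezoid_error(0.5)       # target well separated: tiny error
err_near = trapezoid_error(0.001)    # target near the interface: O(1) error
```

With 200 panels the far evaluation is accurate to several digits while the near evaluation is off by an amount comparable to the integral itself, which is exactly the regime the specialized methods in this talk are built for.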

M Dec 14 Eyal Neuman (Imperial)   Scaling Properties of a Moving Polymer




Abstract: We set up an SPDE model for a moving, weakly self-avoiding polymer with intrinsic length J taking values in (0,\infty). Our main result states that the effective radius of the polymer is approximately J^{5/3}; evidently, for large J the polymer undergoes stretching. This contrasts with the equilibrium situation without the time variable, where many earlier results show that the effective radius is approximately J. For such a moving polymer taking values in \mathbf{R}^2, we offer a conjecture that the effective radius is approximately J^{5/4}.