Modifying Manifold Learning Algorithms to Collect and Exploit Data for Chemical Engineering Modeling

Aug 25, 2022, 9:00 am - 10:30 am



Event Description

Large data sets from observations of complex dynamical systems are becoming increasingly prevalent in science and engineering. The high dimensionality of these data sets complicates (and can even render impractical) human understanding as well as many algorithmic tasks. However, many systems exhibit an effective low dimensionality in their parameter or state space, which can enable valuable analysis and modeling tools. While analytical techniques are sometimes applicable, data-driven techniques are often required. This dissertation makes extensive use of diffusion maps, a manifold learning algorithm that can characterize nonlinear structure in high-dimensional data. The data sets analyzed were collected under particular data collection modes relevant to chemical/biological system dynamics, motivating modifications to, and extensions of, standard diffusion maps. First, we analyze dynamical system data in the form of 3D tensors, with the axes of the tensor representing parameters, measurement channels, and time. With such data, the questionnaire metric for diffusion maps allows separate embeddings for each axis, iteratively updated based on the embeddings of the other two axes. With no a priori knowledge of the effective dimensionality of the parameter or state space, we can still learn an effective low-dimensional description (and effective evolution equations) in a data-derived emergent space. This dissertation demonstrates the approach on (a) a system described by ordinary differential equations, with the goal of bifurcation analysis, as well as on (b) a Drosophila embryonic development model that can be approximated by partial differential equations. Finally, we consider data from two different sensors, each observing a different view of the same system (each sensor also having its own independent, sensor-specific information). Alternating Diffusion Maps can filter out the sensor-specific (“uncommon”) information and construct an embedding that captures only the “common” information.
With this tool (as well as its “jointly smooth functions” alternative), we can learn which variables from one sensor can be written as a function of which variables of the other sensor. We also discuss the subsequent parameterization of each sensor’s uncommon information, as well as how time delays between the two sensors’ measurements can enable the approximation of evolution equations.
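
For readers unfamiliar with the method, the following is a minimal sketch of the standard diffusion maps construction mentioned above. It does not show the questionnaire-metric or Alternating Diffusion Maps variants developed in the dissertation. The function name `diffusion_maps`, the kernel scale `epsilon`, the density-normalization exponent `alpha`, and the synthetic helix data are all assumptions chosen for illustration.

```python
import numpy as np

def diffusion_maps(X, n_coords=2, epsilon=0.1, alpha=1.0):
    """Basic diffusion maps embedding of the rows of X (illustrative sketch)."""
    # Pairwise squared Euclidean distances between samples.
    sq = np.sum(X**2, axis=1)
    D2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * (X @ X.T), 0.0)
    # Gaussian kernel; epsilon is a user-chosen scale (assumption).
    K = np.exp(-D2 / epsilon)
    # Density normalization (alpha = 1 reduces the effect of sampling density).
    q = K.sum(axis=1)
    K = K / np.outer(q**alpha, q**alpha)
    # Symmetric matrix similar to the row-stochastic Markov matrix P = D^-1 K.
    d = K.sum(axis=1)
    S = K / np.sqrt(np.outer(d, d))
    vals, vecs = np.linalg.eigh(S)
    order = np.argsort(vals)[::-1]
    vals, vecs = vals[order], vecs[:, order]
    # Convert eigenvectors of S into right eigenvectors of P.
    psi = vecs / np.sqrt(d)[:, None]
    # Drop the trivial constant eigenvector (eigenvalue 1).
    lam = vals[1:n_coords + 1]
    return lam, psi[:, 1:n_coords + 1] * lam

# Synthetic example: a one-dimensional curve (helix segment) embedded in 3D.
rng = np.random.default_rng(0)
t = rng.uniform(0.0, 1.0, size=200)          # hidden intrinsic parameter
X = np.column_stack([np.cos(3 * t), np.sin(3 * t), t])

eigvals, coords = diffusion_maps(X, n_coords=2, epsilon=0.1)
# The leading diffusion coordinate should vary monotonically with t.
corr = np.corrcoef(t, coords[:, 0])[0, 1]
print("correlation of first diffusion coordinate with t:", corr)
```

In this toy setting the three observed coordinates depend on a single hidden parameter, and the leading diffusion coordinate is expected to recover that parameter up to a monotone reparameterization and sign.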