NEURAL LAGRANGIAN SCHRÖDINGER BRIDGE: DIFFUSION MODELING FOR POPULATION DYNAMICS

Abstract

Population dynamics is the study of temporal and spatial variation in the size of populations of organisms and is a major part of population ecology. One of the main difficulties in analyzing population dynamics is that, due to experimental costs or measurement constraints, we can obtain only fixed-point observation data at coarse time intervals. Recently, modeling population dynamics with continuous normalizing flows (CNFs) and dynamic optimal transport has been proposed to infer sample trajectories from a fixed-point observed population. While the sample behavior in CNFs is deterministic, actual samples in biological systems move in an essentially random yet directional manner. Moreover, when a sample moves from point A to point B in a dynamical system, its trajectory typically follows the principle of least action, under which the corresponding action takes the smallest possible value. To satisfy these requirements on the sample trajectories, we formulate the Lagrangian Schrödinger bridge (LSB) problem and propose to solve it approximately by modeling the advection-diffusion process with a regularized neural stochastic differential equation (SDE). We also develop a model architecture that enables faster computation of the loss function. Experimental results show that the proposed method can efficiently approximate the population-level dynamics even for high-dimensional data, and that the prior knowledge introduced by the Lagrangian enables us to estimate sample-level dynamics with stochastic behavior.

1. INTRODUCTION

The population dynamics of time-evolving individuals appears in various scientific fields, such as cell population in biology (Schiebinger et al., 2019; Yang & Uhler, 2018), air in meteorology (Fisher et al., 2009), and healthcare statistics (Manton et al., 2008) in medicine. However, tracking individuals over a long period is often difficult due to experimental costs. Furthermore, it can sometimes be impossible to track the time evolution. For example, since single-cell RNA sequencing (scRNA-seq) destroys all measured cells, we cannot analyze the behavior of individual cells over time in cell transcriptome measurements. Instead, we only obtain individual samples from cross-sectional populations without alignment across time steps at a few distinct time points. Under these constraints on data measurements, our goal is to better understand the time evolution of samples in the populations. Existing methods attempt to estimate population-level dynamics following the Wasserstein gradient flow using a recurrent neural network (RNN) (Hashimoto et al., 2016) or the Jordan-Kinderlehrer-Otto (JKO) flow (Bunne et al., 2021). Recent studies have attempted to interpolate the trajectories of individual samples between cross-sectional populations at multiple time points by using optimal transport (OT) (Schiebinger et al., 2019; Yang & Uhler, 2018) or CNF (Tong et al., 2020). Using a CNF generates continuous-time non-linear sample trajectories from multiple time points. In addition, Tong et al. (2020) proposed a regularization for CNF that encourages a straight trajectory on the basis of the OT theory. Since the probability distribution transformation based on ordinary differential equations (ODEs) is used in CNF, the behavior of each sample is described by its initial condition in a



Figure 1: Example of trajectories by NLSB.

