TRANSFORMER MEETS BOUNDARY VALUE INVERSE PROBLEMS

Abstract

A Transformer-based deep direct sampling method is proposed for electrical impedance tomography, a well-known severely ill-posed nonlinear boundary value inverse problem. A real-time reconstruction is achieved by evaluating the learned inverse operator between carefully designed data and the reconstructed images. An effort is made to give a specific example to a fundamental question: whether and how one can benefit from the theoretical structure of a mathematical problem to develop task-oriented and structure-conforming deep neural networks? Specifically, inspired by direct sampling methods for inverse problems, the 1D boundary data in different frequencies are preprocessed by a partial differential equation-based feature map to yield 2D harmonic extensions as different input channels. Then, by introducing learnable non-local kernels, the direct sampling is recast to a modified attention mechanism. The new method achieves superior accuracy over its predecessors and contemporary operator learners and shows robustness to noises in benchmarks. This research shall strengthen the insights that, despite being invented for natural language processing tasks, the attention mechanism offers great flexibility to be modified in conformity with the a priori mathematical knowledge, which ultimately leads to the design of more physics-compatible neural architectures.

1. INTRODUCTION

Boundary value inverse problems aim to recover the internal structure or distribution of multiple media inside an object (a 2D reconstruction) from data available only on the boundary (1D signal input). Such problems arise in many imaging techniques, e.g., electrical impedance tomography (EIT) (Holder, 2004), diffuse optical tomography (DOT) (Culver et al., 2003), and magnetic induction tomography (MIT) (Griffiths et al., 1999). Since no internal data are needed, these techniques are generally non-invasive, safe, and cheap, and thus well suited to monitoring applications. In this work, we take EIT as an example to illustrate how a more structure-conforming neural network architecture leads to better results in certain physics-based tasks. Given a 2D bounded domain Ω and an inclusion D, the forward model is the following partial differential equation (PDE):

∇ · (σ∇u) = 0 in Ω,  where σ = σ_1 in D and σ = σ_0 in Ω\D,  (1)

where σ is a piecewise constant function defined on Ω with known values σ_0 and σ_1, but the shape of the inclusion D buried in Ω is unknown. The goal is to recover the shape of D using only the boundary data on ∂Ω (Figure 1). Specifically, by exerting a current g on the boundary, one solves (1) with the Neumann boundary condition σ∇u · n|_∂Ω = g, where n is the outward unit normal of ∂Ω, to obtain a unique u on the whole domain Ω. In practice, only the Dirichlet boundary value, i.e., the voltage f = u|_∂Ω, can be measured. This procedure defines the Neumann-to-Dirichlet (NtD) map:
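To make the NtD map concrete, the following sketch (our own illustration, not the paper's code) works out the one case that admits a closed form: a homogeneous unit disk with constant conductivity σ_0 and no inclusion. There the Neumann current g = cos(lθ) is answered by the voltage f = cos(lθ)/(σ_0 l), since u = r^l cos(lθ)/(σ_0 l) solves the forward problem; in the cosine basis the map is a diagonal matrix.

```python
import numpy as np

# Illustrative sketch: on the homogeneous unit disk (sigma == sigma0, no
# inclusion D), the NtD map is diagonal in the Fourier basis. The Neumann
# current g = cos(l*theta) gives u = r**l * cos(l*theta) / (sigma0 * l),
# so the measured Dirichlet trace is f = cos(l*theta) / (sigma0 * l).

def ntd_matrix(sigma0: float, L: int) -> np.ndarray:
    """L x L block of A_sigma in the cosine basis {cos(l*theta)}_{l=1..L}."""
    return np.diag(1.0 / (sigma0 * np.arange(1, L + 1)))

A = ntd_matrix(sigma0=2.0, L=4)
g = np.array([1.0, 0.0, 0.0, 0.0])  # coefficient vector of g = cos(theta)
f = A @ g                           # voltage coefficients: [0.5, 0, 0, 0]
```

Once an inclusion D with σ_1 ≠ σ_0 is present, A_σ is no longer diagonal, and its deviation from this reference matrix carries the information about D.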

Λ_σ : g ↦ f.  (2)

Published as a conference paper at ICLR 2023

For notation and the Sobolev space formalism, we refer readers to Appendix A; for a brief review of the theoretical background of EIT, we refer readers to Appendix B. The NtD map in (2) can be expressed as

f = A_σ g,  (3)

where g and f are (infinite-dimensional) vector representations of the functions g and f relative to a chosen basis, and A_σ is the matrix representation of Λ_σ (see Appendix B for an example). The original mathematical setup of EIT is to use the NtD map Λ_σ in (2) to recover σ, referred to as the case of full measurement (Calderón, 2006). In this case, the forward and inverse operators associated with EIT can be formulated as

F : σ ↦ Λ_σ  and  F⁻¹ : Λ_σ ↦ σ.  (4)

Fix a basis {g_l}_{l=1}^∞ of the corresponding Hilbert space containing all admissible currents. Then, mathematically speaking, "knowing the operator Λ_σ" means that one can measure all the current-to-voltage pairs {(g_l, f_l := Λ_σ g_l)}_{l=1}^∞ and construct the infinite-dimensional matrix A_σ. However, as infinitely many boundary data pairs are not attainable in practice, the problem of more practical interest is to use only a few data pairs {(g_l, f_l)}_{l=1}^L for reconstruction. In this case, the forward and inverse problems can be formulated as

F_L : σ ↦ {(g_1, Λ_σ g_1), ..., (g_L, Λ_σ g_L)}  and  F_L⁻¹ : {(g_1, Λ_σ g_1), ..., (g_L, Λ_σ g_L)} ↦ σ.  (5)

With limited data pairs, the inverse operator F_L⁻¹ is extremely ill-posed or even not well-defined (Isakov & Powell, 1990; Barceló et al., 1994; Kang & Seo, 2001; Lionheart, 2004); namely, the same boundary measurements may correspond to different σ. In terms of the matrix representation A_σ, choosing g_l = e_l for l = 1, ..., L, with e_l the unit vectors of a chosen basis, the measurements (f_1, ..., f_L) give only the first L columns of A_σ.
It is possible that two matrices A_σ and A_σ̃ have similar first L columns while ∥σ − σ̃∥ is large. How to deal with this ill-posedness is a central theme in the theory of boundary value inverse problems. The operator learning approach has the potential to tame the ill-posedness by restricting F_L⁻¹ to a set of sampled data D := {σ^(k)}_{k=1}^N, with different shapes and locations following a certain distribution. The problem then becomes to approximate

F_L⁻¹|_D : {(g_1, Λ_{σ^(k)} g_1), ..., (g_L, Λ_{σ^(k)} g_L)} ↦ σ^(k),  k = 1, ..., N.  (6)

The fundamental assumption here is that this map is "well-defined" enough to be regarded as a high-dimensional interpolation (learning) problem on a compact data submanifold (Seo et al., 2019; Ghattas & Willcox, 2021), and that the learned approximate mapping can be evaluated at newly incoming σ's. The incomplete information of Λ_σ due to a small L for a single σ is compensated by sampling a large number N ≫ 1 of different σ's.
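The sampled-data setup above can be sketched in a few lines. The snippet below is a minimal, hypothetical data-generation routine of our own (parameter ranges and names are illustrative assumptions, not the paper's benchmark): each σ^(k) is a piecewise-constant image with a random elliptical inclusion, taking the value σ_1 inside D and σ_0 outside. In the full pipeline, each σ^(k) would additionally be paired with L boundary data pairs (g_l, Λ_{σ^(k)} g_l) produced by a forward PDE solver, which is stubbed out here.

```python
import numpy as np

rng = np.random.default_rng(0)

def random_sigma(n=64, sigma0=1.0, sigma1=2.0):
    """One sample sigma^(k): a random elliptical inclusion D in the unit square.
    Illustrative only; parameter ranges are our own assumptions."""
    xx, yy = np.meshgrid(np.linspace(0, 1, n), np.linspace(0, 1, n))
    cx, cy = rng.uniform(0.35, 0.65, size=2)   # inclusion center
    ax, ay = rng.uniform(0.05, 0.2, size=2)    # semi-axes of the ellipse
    inside = ((xx - cx) / ax) ** 2 + ((yy - cy) / ay) ** 2 < 1.0
    return np.where(inside, sigma1, sigma0)

# The sampled set D = {sigma^(k)}_{k=1}^N; a forward solver would attach the
# L current-to-voltage pairs for each sample to form the training data.
dataset = np.stack([random_sigma() for _ in range(8)])   # N = 8 here
```

Varying the inclusion's shape and location across samples is exactly what makes the restriction F_L⁻¹|_D learnable despite the small L per sample.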

2. BACKGROUND, RELATED WORK, AND CONTRIBUTIONS

Classical iterative methods. In general, there are two types of methodology for solving inverse problems. The first is a large family of iterative or optimization-based methods (Dobson &

