Teaching unit 3: A mathematical approach to machine learning in imaging

Lecturer: Xue-Cheng Tai

In this course, I will present recent research exploring several innovative directions in neural network design, grounded in mathematical modeling and algorithmic insights.

First, we introduce a general framework for constructing neural networks via operator-splitting schemes. Starting from a suitable control problem, we discretize it in time with a carefully designed splitting method; unrolling the resulting scheme yields new network architectures. We demonstrate this approach with two examples, a simplified UNet and the recently proposed PottsMGNet, both of which arise directly from the discretization.
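
As an illustration of the unrolling idea only, the following minimal sketch applies a sequential (Lie) splitting to a toy control problem. The specific dynamics u_t = W(t) u + b(t) with the constraint u in [0, 1], the dense matrices standing in for convolutions, and all function names are assumptions made for this example; they are not taken from the course material.

```python
import numpy as np

def linear_substep(u, W, b, dt):
    """Explicit Euler step for u_t = W u + b, where (W, b) are the control variables."""
    return u + dt * (W @ u + b)

def constraint_substep(u):
    """Implicit step for the constraint u in [0, 1].

    Backward Euler applied to the indicator function of [0, 1] reduces to the
    projection onto that interval, which plays the role of an activation function.
    """
    return np.clip(u, 0.0, 1.0)

def unrolled_splitting(u0, controls, dt):
    """Run one splitting step per (W, b) pair; each time step acts like one layer."""
    u = np.asarray(u0, dtype=float)
    for W, b in controls:
        u = linear_substep(u, W, b, dt)   # learnable affine part
        u = constraint_substep(u)         # fixed nonlinear part
    return u

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    n, layers = 4, 3
    controls = [(rng.normal(size=(n, n)), rng.normal(size=n)) for _ in range(layers)]
    print(unrolled_splitting(rng.uniform(size=n), controls, dt=0.1))
```

In this reading, the number of time steps fixes the depth of the network, and the controls (W, b) at each step are the trainable weights.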

Second, we offer a new mathematical explanation of the widely used UNet architecture. While UNet has been immensely successful in image segmentation tasks, its structure has lacked a rigorous theoretical interpretation. We show that UNet can be viewed as a one-step operator-splitting method for a control problem. Each component of the architecture corresponds to an element of the control formulation, and multigrid techniques are used to decompose the control variables. This perspective not only explains the effectiveness of UNet but also connects it with numerical methods for PDEs.

Third, we turn to shape representation and segmentation using neural networks, particularly through the lens of the PottsMGNet framework. Encoder-decoder architectures are prevalent in image processing, yet their mathematical foundations remain incomplete. We reinterpret these architectures using the two-phase Potts model, formulating the segmentation problem as a control problem in the continuous setting. The problem is then discretized in time via operator splitting and in space via multigrid methods, which yields PottsMGNet. The resulting network structure is provably equivalent to encoder-decoder architectures. With a soft-thresholding regularizer, PottsMGNet is robust to changes in network width and depth and to high noise levels, and it matches or outperforms state-of-the-art networks in accuracy and Dice score.
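
To give a feeling for what a soft-thresholding step can look like in a two-phase model, the sketch below assumes an entropic smoothing term, which this abstract does not specify. Under that assumption, the pointwise problem of minimizing f u + eps (u log u + (1 - u) log(1 - u)) over u in (0, 1) has the closed-form solution u = 1 / (1 + exp(f / eps)), a sigmoid that tends to a hard threshold as eps goes to 0. The names f, eps, and the functions are hypothetical.

```python
import numpy as np

def soft_threshold_labels(f, eps):
    """Closed-form minimizer of f*u + eps*(u*log(u) + (1-u)*log(1-u)) over u in (0, 1).

    Written as exp(-log(1 + exp(f/eps))) so that large |f/eps| does not overflow.
    """
    f = np.asarray(f, dtype=float)
    return np.exp(-np.logaddexp(0.0, f / eps))

def hard_threshold_labels(f):
    """Limit eps -> 0: a pixel is labeled 1 exactly where the region force f is negative."""
    return (np.asarray(f) < 0).astype(float)

if __name__ == "__main__":
    f = np.array([-1.0, -0.1, 0.1, 1.0])
    for eps in (1.0, 0.1, 0.001):
        print(eps, soft_threshold_labels(f, eps))
    print("hard", hard_threshold_labels(f))
```

The smooth labels are differentiable in f, which is what makes a step of this kind usable as a layer inside a trainable network.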

We further extend this framework to handle convex shape representation using level set methods. We derive necessary and sufficient conditions for level set functions to represent convex shapes and apply them to variational models for image segmentation. Efficient numerical algorithms are developed and validated through experiments. To improve segmentation in complex images, we incorporate landmark constraints, which require either that the boundary passes through specified points or that certain regions belong to the foreground or background. These techniques apply broadly to convex shape optimization and can be adapted to other applications.
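
The abstract does not state which convexity condition is used. One condition of this type known from the level set literature is that, for a signed distance function phi taken negative inside the shape, the shape is convex when the Laplacian of phi is nonnegative. The sketch below checks a discrete version of that condition on two synthetic shapes; the grid, the tolerance, and the function names are assumptions made for illustration.

```python
import numpy as np

def signed_distance_disk(X, Y, cx, cy, r):
    """Signed distance to a disk of radius r centered at (cx, cy), negative inside."""
    return np.sqrt((X - cx) ** 2 + (Y - cy) ** 2) - r

def discrete_laplacian(phi, h):
    """Standard 5-point Laplacian evaluated at the interior grid points."""
    return (phi[2:, 1:-1] + phi[:-2, 1:-1] + phi[1:-1, 2:] + phi[1:-1, :-2]
            - 4.0 * phi[1:-1, 1:-1]) / h ** 2

def looks_convex(phi, h, tol=1e-8):
    """Discrete test of 'Laplacian(phi) >= 0', up to a small rounding tolerance."""
    return bool(discrete_laplacian(phi, h).min() >= -tol)

if __name__ == "__main__":
    x = np.linspace(-1.0, 1.0, 101)
    h = x[1] - x[0]
    X, Y = np.meshgrid(x, x, indexing="ij")

    disk = signed_distance_disk(X, Y, 0.0, 0.0, 0.5)
    # Union of two overlapping disks: a non-convex shape with two inward corners.
    union = np.minimum(signed_distance_disk(X, Y, -0.3, 0.0, 0.5),
                       signed_distance_disk(X, Y, 0.3, 0.0, 0.5))

    print("disk passes the check: ", looks_convex(disk, h))
    print("union passes the check:", looks_convex(union, h))
```

A pointwise inequality of this kind is convenient in a variational model because it can be imposed as a constraint on phi during the minimization.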