Biography and Research Interests

Brief Biography

X.Y. Han is a final-year PhD Candidate supervised by Adrian S. Lewis at Cornell ORIE; previously, he earned an MS from Stanford Statistics—where he began still-ongoing research mentored by David L. Donoho—and a BSE from Princeton ORFE. He discovered the now-widely-studied Neural Collapse phenomenon in deep neural network training (with V. Papyan and D. Donoho) and invented the Survey Descent method for nonsmooth optimization (with A. Lewis). For these works, he received the ICLR 2022 Outstanding Paper Award and was a finalist for the ICCOPT 2022 Best Paper Prize for Young Researchers. He maintains real-world collaborations with the Frick Art Reference Library in NYC, the USC Keck School of Medicine, and the Veolia North America utilities company. Broadly, his research on nonsmooth optimization and deep learning mathematically analyzes and builds new methods based on realistic phenomena observed in modern computational practices.

Full Biography and Research Interests

Nonsmooth Optimization

I am a final-year PhD Candidate in Cornell’s Operations Research and Information Engineering (ORIE) Program under the supervision of Prof. Adrian Lewis. Recently, we developed the Survey Descent iteration, a multipoint generalization of gradient descent for nonsmooth optimization. Specifically, for smooth, strongly convex objectives, classic theory guarantees the linear* convergence of gradient descent. An analogous guarantee for nonsmooth objectives is challenging: while traditional remedies such as subgradient and bundle methods are empirically successful, guarantees of their convergence have generally remained sublinear. We prove that Survey Descent achieves linear convergence when the nonsmooth objective possesses a “max-of-smooth” structure, while our experiments suggest a more general phenomenon. This work was selected as a finalist for the ICCOPT 2022 Best Paper Prize for Young Researchers.

(*In optimization, “linear” means “linear in log-scale”. So, linear convergence means error decreases exponentially in the number of iterations, while sublinear convergence means error decreases at a polynomial rate.)
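
As a small illustration of this footnote (a sketch only, not code from the Survey Descent work), the snippet below runs plain gradient descent on a hypothetical two-variable strongly convex quadratic. The function name `gradient_descent_errors` and the constants `mu` and `L` are assumptions chosen for the example.

```python
import math


def gradient_descent_errors(mu=1.0, L=10.0, steps=30):
    """Run gradient descent on f(x, y) = 0.5 * (mu * x**2 + L * y**2).

    The minimizer is the origin, so the error at each iterate is simply
    its distance to the origin. `mu` and `L` are illustrative constants.
    """
    x, y = 1.0, 1.0
    step = 1.0 / L  # classic step size for an L-smooth objective
    errors = []
    for _ in range(steps):
        errors.append(math.hypot(x, y))
        # The gradient of f at (x, y) is (mu * x, L * y).
        x -= step * mu * x
        y -= step * L * y
    return errors


errors = gradient_descent_errors()

# Ratio of consecutive errors, skipping the very first step.
ratios = [errors[k + 1] / errors[k] for k in range(1, len(errors) - 1)]

# Linear convergence: log(error) drops by the same amount every iteration.
log_errors = [math.log(e) for e in errors]
```

After the first step, each iteration shrinks the error by the same constant factor, `1 - mu / L`, so the logarithm of the error falls on a straight line in the iteration count. A sublinear method would instead show error ratios creeping up toward 1.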

In addition to Survey Descent, in an earlier work, we demonstrated that optimizing the numerical radius of square matrices often leads to disk matrix solutions. This gave a simple, concrete illustration of previously abstract theorems claiming that the commonly referenced proximal map snaps onto (“identifies”) partly smooth manifolds, even if the manifolds are low-dimensional compared to the ambient space. Moreover, this example is practically motivated: numerical radius optimization is a fundamental operation in engineering control systems.

Deep Learning and Artificial Intelligence

Since 2017, I have also been conducting ongoing research analyzing the mathematical geometry of deep neural networks under Prof. David Donoho, with whom I worked as a PhD student before moving to Cornell. Most notably, Dave, Vardan Papyan, and I discovered the Neural Collapse (NC) phenomenon, which occurs prevalently during the training of canonical deep classification networks with the popular cross-entropy loss. During NC, last-layer features collapse to their class-means, both classifiers and class-means collapse to the same Simplex Equiangular Tight Frame, and classifier behavior collapses to the nearest-class-mean decision rule.
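
To make the geometry above concrete, here is a minimal sketch of a standard Simplex Equiangular Tight Frame and a nearest-class-mean rule. The helper names `simplex_etf`, `dot`, and `nearest_class_mean` are hypothetical and chosen for illustration. The construction takes centered one-hot vectors and rescales them to unit norm. It is not training code and does not reproduce any experiment.

```python
import math


def simplex_etf(num_classes):
    """Return the vectors of a standard simplex ETF, one per class.

    Vector c is sqrt(C / (C - 1)) * (e_c - ones / C): the one-hot vector
    for class c, centered and rescaled to unit norm.
    """
    C = num_classes
    scale = math.sqrt(C / (C - 1))
    return [
        [scale * ((1.0 if i == j else 0.0) - 1.0 / C) for j in range(C)]
        for i in range(C)
    ]


def dot(u, v):
    """Euclidean inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def nearest_class_mean(feature, class_means):
    """Nearest-class-mean rule: index of the class mean closest to `feature`."""
    def squared_distance(mean):
        return sum((a - b) ** 2 for a, b in zip(feature, mean))

    return min(range(len(class_means)),
               key=lambda c: squared_distance(class_means[c]))


means = simplex_etf(4)
self_product = dot(means[0], means[0])    # unit norm
cross_product = dot(means[0], means[1])   # equals -1 / (C - 1)

# A feature lying near the class-2 mean is assigned to class 2.
feature = [value + 0.01 for value in means[2]]
predicted = nearest_class_mean(feature, means)
```

Every vector has unit norm, and every pair of distinct vectors has the same inner product, `-1 / (C - 1)`. This is the "equiangular" structure to which the class-means and classifiers converge under NC.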

In a recent follow-up, we demonstrate that NC also occurs in deep nets trained with mean squared error (MSE) loss, present theory that decomposes the MSE loss into NC-interpretable terms, elaborate upon the connection between NC and the classic signal-to-noise ratio (SNR), and derive explicit-form dynamics for the SNR. This work received the ICLR 2022 Outstanding Paper Award.

Dave, Vardan, and I also maintain an active collaboration with the Frick Art Reference Library in New York City, where we apply new computer vision techniques to expedite their image cataloging process.

High-dimensional Statistics

Earlier, from 2014 to 2016, I conducted statistical research at Princeton University on high-dimensional graphical inference under the supervision of Prof. Han Liu, for whom I worked as an undergraduate research assistant.
