Community corner

Jarvis Haupt

What are your current research interests?

Probably like many of us, it’s fair to say that overall my technical interests are pretty broad. That said, my specific research interests generally pertain to the development and analysis of solutions for various kinds of inverse problems. Some areas I’ve worked on recently include sparse inference (which generally aims to identify and exploit parsimonious intrinsic structure in data, to facilitate inference in high dimensional settings where data may be scarce), tensor decompositions and their analyses (to identify simple latent structure in multidimensional data that can facilitate accurate inference), and deep learning (for example, trying to understand how neural networks trained via fairly simple algorithms like stochastic gradient descent are able to achieve their remarkable generalization abilities).

On the more applied side, one of our current thrusts explores the ability to image in certain non-line-of-sight settings. One easy way to imagine these kinds of problems is that you want to use ordinary walls as if they were mirrors, to “see” what’s hidden around occlusions and barriers, or simply to infer aspects of a scene that are not otherwise directly visible. The key idea is that your camera is not directly observing the scenes of interest as it would in more traditional settings. What you get, instead, is light scattered from the scene and bounced off of other surfaces (like diffusely scattering walls). We’ve explored these kinds of problems from an information theoretic perspective, to quantify the “hardness” of certain parameter estimation tasks, and are continuing to explore variations of these problems. It turns out there is some intriguing and elegant math that arises in some of these settings, and we’re having fun ironing all of that out.

Last but not least, I’ve always had a bit of a passion for magnetic resonance imaging (MRI). We’re exploring a number of problems along these lines, including trying to produce accurate images in some unconventional settings (most recently, settings where the strong, static, so-called “polarizing” magnetic field has non-negligible spatial inhomogeneities).

How do you define Data Science?

To me, Data Science is more of a concept than it is a specific discipline. I know by now it sounds a bit cliché to say, but data really are everywhere, coming from medical devices or scanners, embedded systems or wearable devices, sensors on Internet of Things (IoT) devices, the stock market, agriculture, weather, stresses and strains on structures, and so on. More generally, I believe it is an innate human desire to observe, quantify, and measure our world. If you buy that, and take those activities as the first steps in a discovery process, then Data Science really is a suite of tools, methods, and ideas that address the natural next steps. We have data; now what do we do with it? The answers can be as disparate as the application areas, but they really are unified in the purpose of understanding and being able to make usable predictions. This is one of the fascinating aspects of Data Science – it unites so many different people, areas, and ideas with a common, innately human purpose. Very cool.

Can you share an interesting or surprising result you’ve found in your data?

I can think of a couple of instances. First, in one of our recent MRI-related works, we were dealing with problems that, at their essence, could be understood as regularized linear inverse problems. The challenge we faced was that the (discrete) linear operators in question were huge. Specifically, we were considering problems where one aims to image a 3D volume at a resolution of a few hundred voxels in each dimension, given a few thousand or tens of thousands of observations (resulting in operators having billions of entries, or more, even for modest spatial resolutions). The high dimensionality is less of a challenge in conventional settings, since the observed data often have interpretations in the Fourier domain, and fast Fourier transforms allow one to “sidestep” the dimensionality issues. The complications come, for example, when imaging in the presence of modest polarizing field inhomogeneities and/or when using some alternative (so-called non-Fourier) imaging sequences. Using off-the-shelf tools for these kinds of problems is possible in theory, but not really in practice, since the dimension of the problem renders them infeasible.

Maybe more interestingly, the linear system model inherent to these problems is generated sequentially (via the solution of the governing differential equations in MRI, known as the Bloch equations). When viewed as a (giant) matrix, this means the model is generated essentially row by row, from top to bottom. This additional temporal aspect makes it challenging to apply methods like stochastic gradient descent – a go-to in “big data” processing. What we did, instead, was try to identify some latent structure in the linear system model as it was being generated, as a kind of pre-processing step, so that we could more efficiently operate with (and store!) a parsimonious approximation of the overall linear system model. At the outset, this was just an idea, but what was surprising to us was that there was quite a bit of structure in this data that we could efficiently exploit to enable inference in these challenging settings.
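To give a rough flavor of that kind of pre-processing – and to be clear, this is just a generic toy sketch with made-up dimensions, not the method from our work – here is one standard streaming idea (Frequent Directions sketching) that builds a parsimonious summary of a tall matrix one row at a time, without ever holding the full operator in memory:

```python
import numpy as np

def fd_sketch(row_stream, n_cols, ell):
    """Frequent Directions: maintain a small sketch B (at most 2*ell rows) so that
    B.T @ B approximates A.T @ A, touching each row of A exactly once."""
    B = np.zeros((2 * ell, n_cols))
    filled = 0
    for row in row_stream:
        if filled == 2 * ell:                      # sketch is full: shrink it
            _, s, Vt = np.linalg.svd(B, full_matrices=False)
            delta = s[ell - 1] ** 2                # damp by the ell-th singular value
            s = np.sqrt(np.maximum(s ** 2 - delta, 0.0))
            B = s[:, None] * Vt                    # the bottom rows are now (near) zero
            filled = ell - 1
        B[filled] = row
        filled += 1
    return B

# Toy demo: a tall matrix with simple latent (low-rank) structure, streamed row by row.
# (In a real setting the rows would be generated on the fly rather than stored up front.)
rng = np.random.default_rng(0)
m, n, rank, ell = 20_000, 300, 10, 40
A = rng.standard_normal((m, rank)) @ rng.standard_normal((rank, n))
B = fd_sketch((A[i] for i in range(m)), n_cols=n, ell=ell)
err = np.linalg.norm(A.T @ A - B.T @ B, 2) / np.linalg.norm(A, 'fro') ** 2
print(f"sketch rows: {B.shape[0]}, relative covariance error: {err:.3e}")
```

The point of the sketch is simply that, when the rows really do share low-dimensional structure, a summary with a few dozen rows can stand in for a matrix with millions of them.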

I can think of a few other examples that have popped up in the Machine Learning/Data Science course that I developed in Electrical and Computer Engineering. Throughout the semester we examine a fairly broad range of data sets for different purposes. I remember being surprised by some of the usage patterns we identified when clustering power consumption data (from smart meters), some of the (possibly predictable!) quasi-oscillatory behaviors in cryptocurrency price data at different time scales, and the (possibly actionable) intrinsic low-dimensionality of the feature data associated with one of the more widely used classification benchmark datasets.

Aside from some benign cases like white noise, if you can’t find any interesting structure in your data, you’re probably not yet looking at it in the right way.
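As a quick illustration of what I mean by checking intrinsic dimensionality – this is just a generic example using scikit-learn’s small digits dataset, not necessarily the benchmark from the course – you can look at how quickly the spectrum of the centered feature matrix decays:

```python
import numpy as np
from sklearn.datasets import load_digits   # any benchmark feature matrix would do

X = load_digits().data                      # 1797 samples x 64 pixel features
Xc = X - X.mean(axis=0)                     # center the features
s = np.linalg.svd(Xc, compute_uv=False)     # singular values of the centered data
energy = np.cumsum(s**2) / np.sum(s**2)     # cumulative fraction of variance captured
k = int(np.searchsorted(energy, 0.95)) + 1  # components needed for 95% of the variance
print(f"{k} of {X.shape[1]} directions capture 95% of the variance")
```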

Are there any interesting new tools or libraries you or your students have been using?

One of the directions I’ve become intrigued by lately is the quest to parallelize code to make it GPU-ready. I love using our GPUs with packages like PyTorch (or TensorFlow) to explore neural network training, and those packages already do a fantastic job of parallelization behind the scenes. That said, I’m trying to bias myself toward thinking about parallel processing in the other code I write for other projects. It’s not always possible, of course, but it’s a lot of fun when it is.
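In PyTorch, for instance, most of that parallelism comes “for free” once tensors live on the GPU – here is just the basic device-placement pattern (nothing specific to our projects, and it falls back to the CPU if no GPU is available):

```python
import torch

# Use the GPU when one is available; fall back to the CPU otherwise.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

model = torch.nn.Linear(1024, 10).to(device)   # parameters live on `device`
x = torch.randn(256, 1024, device=device)      # a batch created directly on `device`
y = model(x)                                   # the forward pass runs wherever the tensors live
print(y.shape, y.device)
```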

I “grew up” (academically speaking) programming in Matlab, and I’ve been impressed to see how it has remained relevant on this front. Matlab always had its place as a tool for rapid prototyping, for example, with users ready and willing to accept that, as an interpreted language, it is slower than things like C/C++ that are compiled directly for a given architecture. In some ways Python is similar, and in general sits more on the “convenience” side of the spectrum than the “performance” side, though specialized application areas like the deep learning toolboxes I mentioned really do a nice job of integrating parallel processing effectively.

That said, there are some very cool new-ish tools that bring the efficiency of parallel programming to both of these languages, sometimes without the overhead of having to write (NVIDIA) CUDA code directly. Some examples are ‘arrayfun’ in Matlab for “single instruction, multiple data” type computations (which run on the GPU when applied to gpuArray data), and the Numba package in Python, which does just-in-time compilation of Python functions for native architectures, including NVIDIA GPUs. Packages like PyCUDA also allow for more direct integration of CUDA into Python code, but the learning curve for that is a little steeper if you don’t come from a C/C++ background.
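Here is a minimal Numba sketch of that GPU idea – assuming a CUDA-capable GPU with the usual driver/toolkit and the numba package installed; the kernel and sizes are made up purely for illustration:

```python
import numpy as np
from numba import cuda

@cuda.jit
def saxpy(a, x, y, out):
    i = cuda.grid(1)                 # global thread index
    if i < x.size:
        out[i] = a * x[i] + y[i]     # each GPU thread handles one element

n = 1_000_000
x = np.random.rand(n).astype(np.float32)
y = np.random.rand(n).astype(np.float32)
out = np.zeros_like(x)

threads = 256
blocks = (n + threads - 1) // threads
saxpy[blocks, threads](2.0, x, y, out)   # Numba handles the host/device transfers here
print(np.allclose(out, 2.0 * x + y))
```

What I like about this style is that the function body is still plain Python/NumPy-flavored code; the decorator and the launch configuration are the only CUDA-specific pieces you have to think about.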

What are you most excited about in the field of data science in the next 5 years?

I would probably say AI, but maybe not in the sense that one might immediately think. By now we’ve probably all seen some of the incredible advances that AI can produce, from image generation to large language models and their applications, and even things like voice/speech and video synthesis. I think it’s safe to assume that the capabilities of these tools will only improve. In this sense, the power and potential of these tools are sort of taken as a given.

For me, I’m interested to see more about how – and in many cases whether – society adopts them. It’s one thing to produce a tool. It’s quite another for people to embrace it and trust it enough to use it. Notwithstanding the great research that’s gone into the analysis of deep learning in recent years, I think it’s fair to say that some of these powerful AI models and tools are still beyond our current understanding.

To bring it back to your question, this is an exciting prospect for me because I think we are at a kind of “pause and reflect” moment that will (hopefully) usher in new advances in our understanding of how these types of AI tools function at their essence. If we (as a society) can be confident in the trustworthiness and reliability of these tools, they really do have the potential to spark a new kind of technical revolution.
