CISE Seminar: Julio Castrillon, Research Assistant Professor, Boston University

  • Starts: 3:00 pm on Friday, February 24, 2023
  • Ends: 4:00 pm on Friday, February 24, 2023

Stochastic coordinate transformations in Hilbert subspaces for anomaly detection and machine learning

For many problems in machine learning, feature information can be diffuse, making detection or classification hard. The same is true when feature information overlaps in the training set. However, class separation might exist in a more appropriate space. By applying a proper stochastic coordinate transformation to standard features, new separations between classes (or anomalies) can potentially be identified. We will discuss connections between stochastic functional analysis, anomaly detection, and machine learning. Based on tensor product representations and stochastic processes, new orthogonal nested subspaces are constructed for detecting anomalous signal components. This can succeed even if the behavior of the signal changes, either globally or locally. The approach applies to general topologies, including geo-spatial, spatio-temporal, manifold, and network domains, or any combination of them. The nested subspace constructions involve efficient algorithms from computational applied mathematics and high-performance computing. A mathematical approach based on multilevel subspaces is developed to locate components and measure their sizes with respect to the class of interest. An effective hypothesis test is constructed that depends on the degree of the multilevel spaces and the decay of the eigenvalues of the covariance. No assumptions on the independence or distribution of the data are needed. We illustrate applications to Amazon deforestation detection in remote sensing. These new subspace features are also novel in machine learning. Applications to cancer classification, synthetic biology, and Alzheimer's subtyping are given, producing significant and sometimes dramatic increases in accuracy. In addition, tests on unbalanced datasets confirm increased accuracy as datasets become more unbalanced, in stark contrast to traditional ML, where accuracy decreases.

Julio Enrique Castrillon Candas is a Research Assistant Professor in the Department of Mathematics and Statistics at Boston University. He holds a master's degree and a PhD in Electrical Engineering and Computer Science from the Massachusetts Institute of Technology. His interests include uncertainty quantification, machine and statistical learning, non-linear stochastic networks, (non-linear) stochastic partial differential equations, and anomaly detection. Applications include protein interactions, power systems (the electric grid), deforestation (the Amazon), and genomics.

Faculty Host: Mark Kon

Student Host: Ahmad Ahmad

8 Saint Mary's Street (PHO 203)