SE PhD Prospectus Defense of Aditya Gangrade

TITLE: STRUCTURAL TESTING OF NETWORKS AND GRAPHICAL MODELS

ABSTRACT: High dimensional data in fields such as sociology, molecular biology, and neuroscience is inherently network structured - relations between people, groups of proteins, or individual neurons are captured by an underlying network that drives the bulk behaviour of these systems. Hypothesis testing of the network structure encompasses a basic problem in such fields. For instance, biologists may want to test if proteins interact differently in diseased populations as opposed to healthy ones, while neuroscientists may want to detect changes in the neuronal connectome in response to learning a task.

A baseline method to test models is to simply recover the underlying structures and compare them, the so called ‘testing via learning’ strategy. However, sucha strategy can often be profligate in terms of data costs due to the inherent high-dimensionality of the setting. For these testing tasks, one expects that directtesting methods should succeed with much less data, since we are only trying to infer very crude information about the networks.

The work investigates the data costs of structural testing in such settings, with a focus towards establishing separations between these and the correspondingcosts of recovery. Concretely, we consider two models in two distinct observational regimes. The first is the Ising Model, which captures settings where individualagents can be directly observed, but the links between them cannot, and the second is the Stochastic Block Model (SBM), which captures settings where thenetwork is directly observable and the goal is to reason about latent low rank structures such as clusters.

For these models, I will describe fundamental limits on how large a change needs to be to be detectable, and in particular detectable much more easily than the recovery of the same structure. This is done by both determining information theoretic lower bounds on the amount of data needed to allow reliable detection of changes of a given size, and by developing statistically efficient testing schema that match these bounds for the SBM, and for some classes of Ising models. The main observation is that testing of structural changes exhibits a biphasic behaviour. For both models, there is a critical size scale of changes such thatdetection of changes smaller than this is difficult, in the sense that the data costs are comparable to that of recovery, and conversely, the detection of changeslarger than this critical scaling is easy, in the sense that data costs vanish compared to the cost of recovery.

COMMITTEE: ADVISOR/CHAIR Bobak Nazer, SE/ECE; Venkatesh Saligrama, SE/ECE; Ery Arias-Castro, USCD Math; Daniel Sussman, Math & Statistics

When 4:00 pm on Monday, February 8, 2021
Location ZOOM