Covariance Models for Genetic Cohorts with Complex Relatedness Structures - Quantitative Biology Seminar Series

  • Starts: 12:30 pm on Thursday, December 4, 2025
  • Ends: 1:30 pm on Thursday, December 4, 2025
Modern population genetic datasets are getting larger and are more likely to include individuals with different ancestries, admixture, and close and distant family relatives. However, many key statistical genetics analyses are not well suited for these data, especially when they assume that samples are drawn from one or a few homogeneous populations. My lab is focused on developing models for arbitrary relatedness. The key observation is that, while standard models assume that samples are independent, complex relatedness structures result in strong covariance patterns between genetic variants of individuals, so modeling such covariance is imperative. The main model of genetic covariance is called the kinship model. I will review previous models of relatedness for genetic association studies, particularly principal components regression and linear mixed-effects models (LMMs), and related mathematical tricks for decorrelating data. We have shown that LMMs are critical for modeling real datasets, which have numerous distant relatives. I developed a kinship estimator that is unbiased when there is population structure, which in practice yields more interpretable values, is crucial for accurate estimation of heritability, and permits fast admixture model fitting. Lastly, I will describe our most recent work on modeling linkage disequilibrium under relatedness, extending the kinship model to cross-correlation between different loci and individuals. Our generalized model enables modeling of arbitrary relatedness in common polygenic risk score approaches. Overall, our work is laying the foundation for models that can be applied to all individuals in a dataset regardless of their ancestry.
Location:
CDS 1646