With NIH Funding, Pawel Przytycki Seeks to Decode Cell State Progression
In most biology textbooks, cells live in tidy boxes: neurons, heart cells, immune cells, each with a fixed “type” and a stable identity. For Assistant Professor of Computing & Data Sciences Pawel Przytycki, those boxes feel too rigid.
“When you think about the cells in your body, you think of them being very specialized,” he says. “But a lot of biology doesn’t fit neatly into these categories of cell types, especially during disease and development.”
His new five-year, $2.2 million NIH project tackles that mismatch. Instead of treating cell identity as a set of discrete labels, his lab is building tools that model cell state as continuous, to understand how cells gradually shift from one state to another over time, particularly in cancer, fibrosis, and development.
The idea traces back to Przytycki’s postdoctoral work, where meetings about brain development often devolved into debates over how many types of neurons exist.

“There was this constant back and forth,” he remembers. “The brain has 100 types of neurons. No, the brain only has 20 types. The brain has 500. Everyone had a different opinion of what the number of neuron types was.”
Even more telling, as single-cell datasets got larger, the number of “cell types” seemed to grow along with them.
“At some point, if more data just gives you more cell types, then clearly cell types just aren’t the right unit to operate on,” he says. “They’re not meaningful anymore if collecting more data just makes more of them.”
That pushed him to a more dynamic view. During development, stem cells gradually become neurons or heart cells through many subtle intermediate steps. In cancer, cells slowly drift away from their original identity as mutations accumulate. Those processes are not clean jumps between labels, but rather continuous journeys.
“Not all cells are constantly changing,” he notes. “Your heart cells, for example, are pretty stable. But we’re really interested in the times when cell state is changing, and in understanding the mechanisms that drive those transitions.”
Most single-cell analysis pipelines deal with huge, sparse, noisy data by clustering: grouping similar cells and averaging their measurements. That makes the data easier to work with, but it also erases trajectories that are visible only at a finer resolution.
“The main technical challenge is noise and sparsity at scale,” Przytycki says. “If you have sparse noisy data, you usually sum over many examples. But if you want continuous states, you’re trying to get around noise and sparsity and scale without just taking a cluster and summing over a bunch of cells.”
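A toy Python sketch can make that trade-off concrete. It is only an illustration: the expression values and the `resting` and `activated` labels are invented, and the helper `cluster_means` is hypothetical rather than part of any real pipeline. Ten cells sit along a smooth gradient, and averaging within two clusters collapses them to two values.

```python
# Toy illustration (invented numbers): averaging within clusters removes
# the ordering of cells along a gradual progression.

def cluster_means(values, labels):
    """Return the mean value for each cluster label."""
    sums, counts = {}, {}
    for value, label in zip(values, labels):
        sums[label] = sums.get(label, 0.0) + value
        counts[label] = counts.get(label, 0) + 1
    return {label: sums[label] / counts[label] for label in sums}

# Hypothetical marker level for ten cells along a smooth progression.
expression = [0.0, 0.1, 0.2, 0.3, 0.4, 0.6, 0.7, 0.8, 0.9, 1.0]

# A hard two-way split, standing in for a clustering step.
labels = ["resting" if value < 0.5 else "activated" for value in expression]

means = cluster_means(expression, labels)

# After clustering, each cell is represented only by its cluster mean.
summarized = [means[label] for label in labels]

print("distinct values before clustering:", len(set(expression)))
print("distinct values after clustering:", len(set(summarized)))
```

Before clustering, every cell has its own position on the gradient. Afterward, cells within a cluster are indistinguishable, which is the loss of trajectory information described above.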
His solution relies on network-based methods. In his earlier tool, CellWalker, cells and annotations are nodes in a graph, and information flows along edges via random walks. Originally, that framework was built to map cells to discrete types.
“CellWalker was really focused on discrete cell states,” he says. “The whole design assumed you had distinct types.” More recently, his group developed a version that sits between hard clusters and smooth trajectories, using hierarchical cell states.
“We consider individual discrete states, but also how they might relate to each other,” he explains. “That captures the fact that some might be really distinct cell types, and some might be closely related states we shouldn’t treat as fully distinct.” What stays constant, however, is the graph perspective.
“You can think about walking along a graph,” he says. “You have all these cells, and they’re sparse and noisy, but if there are enough edges, as you walk around you still move in the right direction across the graph. Without ever aggregating, you’re using as much information as possible from a bunch of noisy signals together.”
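The graph-walking idea can be sketched in a few lines of Python. This is a minimal toy under stated assumptions, not CellWalker's actual implementation: the six-cell chain, the two label nodes, the restart probability of 0.3, and the function names `random_walk_with_restart` and `activation_position` are all hypothetical. Cells and labels are nodes, and a random walk with restart from each cell measures how much it is influenced by each label, without averaging any cells together.

```python
# Toy random walk with restart on a small cell/label graph.
# The graph layout and parameters are assumptions made for illustration.

def random_walk_with_restart(neighbors, start, restart=0.3, n_iter=200):
    """Approximate the visiting distribution of a walk that returns
    to `start` with probability `restart` at every step."""
    scores = {node: 0.0 for node in neighbors}
    scores[start] = 1.0
    for _ in range(n_iter):
        updated = {node: 0.0 for node in neighbors}
        for node, mass in scores.items():
            # Spread the non-restarting mass evenly over the node's edges.
            share = (1.0 - restart) * mass / len(neighbors[node])
            for neighbor in neighbors[node]:
                updated[neighbor] += share
        updated[start] += restart  # mass that jumps back to the start cell
        scores = updated
    return scores

# Six cells connected in a chain, standing in for a gradual progression.
cells = [f"cell{i}" for i in range(6)]
neighbors = {name: [] for name in cells + ["resting", "activated"]}

def add_edge(a, b):
    neighbors[a].append(b)
    neighbors[b].append(a)

for a, b in zip(cells, cells[1:]):
    add_edge(a, b)
add_edge("resting", cells[0])      # label node attached to one end
add_edge("activated", cells[-1])   # label node attached to the other end

def activation_position(cell):
    """Relative influence of the 'activated' label on one cell (0 to 1)."""
    scores = random_walk_with_restart(neighbors, cell)
    return scores["activated"] / (scores["resting"] + scores["activated"])

positions = [activation_position(cell) for cell in cells]
for cell, position in zip(cells, positions):
    print(cell, round(position, 3))
```

In this toy graph each cell gets its own score between the two labels, and the scores rise steadily along the chain, so the cells are ordered along a continuum rather than sorted into two bins. Real single-cell graphs are far larger and noisier, with edges derived from measured similarity between cells.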
A second pillar of the project is multimodal single-cell data, especially pairing gene expression (RNA) with chromatin accessibility (ATAC).
“We’re pretty sure that using these different modalities at the same time is crucial for capturing how cells are changing state,” Przytycki says. “If you look at just RNA expression, you won’t really capture these transitions so well.”
The NIH project has three main goals:
- Track how individual cells move along continuous trajectories, rather than forcing them into clusters,
- Study how cells organize and communicate with each other and how that shapes these trajectories, and
- Pinpoint the genetic drivers, such as mutations, enhancers, and transcription factors, that push cells along those paths.
The team is testing this approach in several systems. In cardiac fibrosis, for example, heart fibroblasts gradually shift from a resting state to an activated, scar-forming state after stress. “Even if you look just at fibroblasts in the heart, you can see there are activated and non-activated fibroblasts,” Przytycki says. “But it’s clearly not a discrete thing where you go from one type to another. There’s a progression across some trajectory.”
By working with collaborators who can knock out specific genes or enhancers in mouse models, the lab can see which elements actually control that shift. “If we remove this enhancer, does fibrotic activation still happen? That starts to tell us about the mechanisms of the change,” he says.
Looking ahead five to ten years, Przytycki envisions continuous cell state modeling moving from abstract trajectories to actionable levers.
“I think we’re going to finally start building an understanding of how cells transition, even if you still talk about discrete cell types like immature and mature,” he says. “In ten years, for at least some systems, we should understand those progressions well enough to actually target them.”
For fibrosis, that could mean pinpointing a step in the activation pathway where a drug could prevent or delay scarring.
For cancer, it might mean disrupting transitions that push cells toward more aggressive states. For development, the payoff is a deeper mechanistic map of how cells move from one fate to another: a map drawn not as a set of boxes, but as the continuous paths that connect them.
-- Shriya Jonnalagadda (CDS'28), Data Science Research Communications Intern