From Frogs to Frontiers: CDS Freshman Publishes Research on Machine Learning and Conservation

While still in high school in Irvine, California, Nemai Anand set out to solve a surprisingly complex problem: how to identify frog species based on their calls despite having access to only sparse, noisy data. Driven by persistence and resourcefulness, he developed a machine learning solution on his home computer, working alongside his father, Anand Sampath. The result? A published research paper in the Journal of Emerging Investigators (March 2025) and the beginning of his academic journey at Boston University’s Faculty of Computing & Data Sciences (CDS), where he is now a freshman.

“It’s incredibly rare to see a student enter their undergraduate studies with a published research paper already under their belt—especially one that tackles a real-world challenge with such ingenuity,” noted CDS Associate Provost Azer Bestavros. “Nemai’s work reflects not only technical skill but also a curiosity and drive that embody the spirit of CDS. We’re excited to support his continued journey and look forward to the contributions he’ll make in the years ahead.”

Boston University Freshman Nemai Anand (CDS'28)

Classifying the Unheard

Anand's research centers on building a machine learning classifier capable of identifying frog species in the Western Ghats, a mountain range in India recognized as a biodiversity hotspot with a large number of endemic amphibians. “Because so many species in this region have very few recorded calls, there’s a real challenge in training accurate models,” Nemai explains. “We wanted to test how well a classifier could perform under data-scarce conditions.” With so many of the region's frogs under-recorded, researchers lack the large, robust datasets that machine learning models are usually trained on.

“I was interested in computer science because solving problems in coding was like a fun puzzle,” Nemai said. His father, Anand Sampath, suggested the idea of building a frog call classifier, and the two chose to focus on the Western Ghats, which run along the southwest coast of India.

Anand’s challenge was to build a system that could learn to recognize these species despite the scarcity of recordings. With only limited data available, he tested four data augmentation techniques: pitch shifting, time stretching, noise injection, and spectral augmentation. Each one artificially expands the training set by creating altered copies of the existing recordings. While spectral augmentation worked best individually, the real gains came from combining methods. “Each technique perturbs the data differently,” Anand explained. “Combining them made the dataset more diverse, and the model learned better from it.”
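
For readers curious what combining augmentations can look like in practice, here is a minimal sketch in Python using only NumPy. It is not the code from the study, which this article does not reproduce. Every function name, parameter value, and the synthetic tone standing in for a frog call are assumptions made for illustration. Noise injection and spectrogram masking (one common form of spectral augmentation) are shown directly. A simple speed change by resampling stands in for pitch shifting and time stretching, since separating those two effects requires more involved signal processing.

```python
import numpy as np


def inject_noise(wave, snr_db, rng):
    """Add white Gaussian noise at the requested signal-to-noise ratio (dB)."""
    signal_power = np.mean(wave ** 2)
    noise_power = signal_power / (10 ** (snr_db / 10))
    noise = rng.normal(0.0, np.sqrt(noise_power), size=wave.shape)
    return wave + noise


def change_speed(wave, rate):
    """Resample by linear interpolation so the clip plays `rate` times faster.

    This alters pitch and duration together. It is a crude stand-in for
    separate pitch-shift and time-stretch steps, not an implementation of them.
    """
    new_length = int(round(len(wave) / rate))
    old_positions = np.arange(len(wave))
    new_positions = np.linspace(0, len(wave) - 1, new_length)
    return np.interp(new_positions, old_positions, wave)


def magnitude_spectrogram(wave, frame_length=256, hop=128):
    """Short-time Fourier magnitudes, shape (frequency bins, frames)."""
    window = np.hanning(frame_length)
    n_frames = 1 + (len(wave) - frame_length) // hop
    frames = np.stack(
        [wave[i * hop:i * hop + frame_length] * window for i in range(n_frames)]
    )
    return np.abs(np.fft.rfft(frames, axis=1)).T


def mask_spectrogram(spec, max_freq_width, max_time_width, rng):
    """Zero one random band of frequency rows and one band of time columns."""
    masked = spec.copy()
    n_freq, n_time = masked.shape
    f_width = rng.integers(0, max_freq_width + 1)
    f_start = rng.integers(0, n_freq - f_width + 1)
    masked[f_start:f_start + f_width, :] = 0.0
    t_width = rng.integers(0, max_time_width + 1)
    t_start = rng.integers(0, n_time - t_width + 1)
    masked[:, t_start:t_start + t_width] = 0.0
    return masked


def augment(wave, rng):
    """Chain the three perturbations to produce one new training example."""
    rate = rng.uniform(0.9, 1.1)  # hypothetical range of speed changes
    noisy = inject_noise(change_speed(wave, rate), snr_db=20.0, rng=rng)
    spec = magnitude_spectrogram(noisy)
    return mask_spectrogram(spec, max_freq_width=16, max_time_width=8, rng=rng)


rng = np.random.default_rng(0)
sample_rate = 16000
t = np.arange(sample_rate) / sample_rate
call = np.sin(2 * np.pi * 2000 * t)  # synthetic 1-second tone, not a real frog call
variants = [augment(call, rng) for _ in range(4)]
```

Each call to `augment` draws a different speed, noise pattern, and pair of masks, so one recording yields several distinct spectrograms. This is the sense in which stacking perturbations makes a small dataset more varied.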

Why Frogs?

Frogs are often described as biological alarms. Their sensitivity to environmental changes makes them ideal indicators of ecosystem health. But monitoring them in dense, remote regions like the Western Ghats is labor-intensive and inefficient. Anand’s classifier could serve as the backbone of a lightweight, automated tool for researchers in the field: a way to detect species presence without intrusive surveys or costly equipment.

“The Western Ghats are home to many endemic species, but there’s little data on each one,” Nemai noted. “Our goal was to see whether we could still build something useful with what little we had.”


Building Without Blueprints

What stands out most about the project is how independently it came together. With no formal advisor, lab access, or institutional resources, Anand built the entire study from the ground up, relying on self-teaching, experimentation, and occasional input from people he reached out to online. The final study was published in the Journal of Emerging Investigators, a peer-reviewed journal for high school researchers. Since its release, the work has drawn interest for its interdisciplinary approach and its promise as a model for low-cost, technology-driven conservation tools.


From Amphibians to Antibodies

Now at BU, Nemai (CDS’28) continues to pursue his interests in machine learning and bioinformatics. He plans to conduct research at the National Emerging Infectious Diseases Laboratories (NEIDL) under Dr. John Misasi, where he’ll apply machine learning techniques to questions in immunology and bioinformatics. His transition from ecological data to biomedical research reflects his interest in using computation to uncover patterns across disciplines.

For Anand, the next phase is about leveraging the flexibility of data science to pursue questions that matter, whether in a rainforest or a research lab. “I want to keep exploring how data science can be used across disciplines—from conservation biology to infectious disease,” he says. “There’s so much potential, and I’m just getting started.”

Neeza Singh, CDS Research Communications Intern; Maureen McCarthy, Contributor