Boston University’s EVAL Collaborative Names Amira Learning Winner of Inaugural EVAL EdTech Evaluation Challenge

Advancing Evidence-Based Literacy Reform in Massachusetts

On March 3, 2026, as Massachusetts considers landmark literacy legislation requiring schools to adopt evidence-based reading instruction, Boston University’s Evidence-Based AI in Learning (EVAL) Industry Collaborative announced Amira Learning as the winner of its inaugural EVAL EdTech Evaluation Challenge. The challenge, a national competition, identified AI-powered K–12 educational tools with preliminary evidence of effectiveness and readiness for gold-standard, independent evaluation.

“Generative AI is rapidly transforming K–12 education, offering new opportunities for teaching and learning, yet fewer than 10% of AI tools in classrooms have undergone independent validation,” says Ola Ozernov-Palchik, Ph.D., Director of the EVAL Collaborative. “That leaves educators, families, and policymakers uncertain about which products truly improve learning outcomes.”

Ola Ozernov-Palchik, Research Assistant Professor at Boston University Wheelock College of Education and Human Development; Core Faculty of the AI and Education Initiative; Director of the EVAL Industry Collaborative; Research Scientist at MIT’s McGovern Institute for Brain Research

The EVAL Collaborative was created to provide a trusted, independent evaluation infrastructure to address this gap. “Through the EVAL Challenge, we aim to encourage rigorous evidence generation and remove barriers that make gold-standard studies, like randomized controlled trials, difficult for companies to conduct,” says Dr. Ozernov-Palchik.

The timing is significant. State lawmakers have advanced legislation (H 4683 and S 2940) that would require Massachusetts districts to adopt literacy curricula aligned with five research-based components of reading instruction: phonemic awareness, phonics, fluency, vocabulary, and comprehension. These bills reflect growing urgency around student outcomes across the Commonwealth.

According to 2025 MCAS results, 68 percent of third graders in Massachusetts are not proficient readers, and only 38 percent of fifth graders read proficiently. Among English learners in fourth grade, 79 percent scored “not meeting expectations” in English language arts.

States that have embraced science-of-reading-aligned curriculum reforms have demonstrated measurable gains. New York City saw a 3.5 percentage point increase in reading proficiency following implementation of literacy initiatives, and several southern states have shown upward trends.

Dr. Ozernov-Palchik advocated for the Massachusetts literacy bill and now works with organizations across the state to develop implementation guidelines.

“This legislation represents a commitment to ensuring that every child in Massachusetts is taught to read using evidence-based approaches,” says Dr. Ozernov-Palchik. “But implementation must be guided by rigorous research. Schools deserve independent, gold-standard evidence about which tools genuinely improve outcomes.”

Innovative Cross-Disciplinary Randomized Controlled Trial

Reflecting a growing commitment among EdTech companies to ensure AI tools are effective, validated, and equitable, the competition drew 32 applicants spanning literacy and math platforms, tutoring, assessment, and classroom support systems for students from the early grades through high school.

Amira Learning was selected for its grounding in learning science and its readiness for large-scale classroom use. Amira’s AI-powered reading tutor listens to students read aloud and provides real-time feedback and targeted support across decoding, fluency, vocabulary, and comprehension.

As the Challenge winner, Amira Learning will undergo a rigorous randomized controlled trial (RCT) designed by a cross-disciplinary Boston University research team, which is collaboratively building an innovative evaluation framework around the tool.

“We’re honored to be the inaugural winner of the EVAL AI in Learning Challenge,” said Mark Angel, CEO and Co-founder of Amira Learning. “Partnering with Boston University’s cross-disciplinary research team allows us to build on our validation work and contribute to a new standard for independent, high-quality evidence in AI and literacy.”

Rather than applying a standard, off-the-shelf RCT model, the BU team is co-designing an advanced, methodologically innovative randomized evaluation tailored to AI-enabled literacy tools—integrating expertise in computing, AI, biostatistics, economics, special education, and learning sciences.

The cross-disciplinary team includes:

  • Nathan Jones, Associate Professor of Special Education (Wheelock); Hariri Institute Faculty Affiliate
  • Mayank Varia, Associate Professor, Faculty of Computing & Data Sciences; Hariri Institute Core Faculty
  • John Liagouris, Assistant Professor of Computer Science (CAS); Hariri Institute Faculty Affiliate
  • Eshed Ohn-Bar, Assistant Professor of Electrical & Computer Engineering (ENG); Hariri Institute Core Faculty
  • Mark Chang, Adjunct Professor of Biostatistics (SPH); Founder of AGInception
  • Wei Jin, Assistant Professor of Biostatistics (SPH)
  • Shahar Lahad, PhD Student in Economics (CAS)

Together, the team is developing a next-generation RCT model that integrates:

  • Advanced statistical and biostatistical design
  • AI-informed measurement and real-time usage analytics
  • Privacy-preserving and secure data infrastructure
  • Economic and policy-relevant impact analysis
  • Subgroup and equity-focused evaluation, including effects for English learners and students with disabilities

This collaborative design approach ensures that the evaluation framework itself advances the science of how AI tools should be tested in real-world classrooms—producing findings that are rigorous, transparent, and actionable for policymakers implementing Massachusetts’ literacy reform.

“By aligning rigorous, cross-disciplinary research with state policy reform, the EVAL Collaborative aims to ensure that AI tools adopted under Massachusetts’ proposed literacy legislation are not only innovative but demonstrably effective,” says Dr. Ozernov-Palchik.

EVAL EdTech Evaluation Challenge Winner & Finalists

2026 Winner:

Amira Learning (Amira): An AI-powered reading tutor that listens, assesses, and provides real-time feedback and micro-interventions to accelerate literacy growth.

Finalists:

  • Wolfram Research (Wolfram Tutor): An AI-powered algebra tutor that identifies student misconceptions and provides step-by-step assistance to improve math knowledge.
  • ClasseAI (ReadGenie): An AI-powered structured literacy app that assesses speech and handwriting and provides real-time feedback to promote literacy skills.

About the Evidence-Based AI in Learning (EVAL) Industry Collaborative

The Evidence-Based AI in Learning (EVAL) Collaborative partners with EdTech companies to rigorously test AI tools in K–12 classrooms, translating research into practice and generating trusted evidence. Co-sponsored by Boston University’s Wheelock College of Education & Human Development, the Hariri Institute for Computing and Computational Science & Engineering, and the AI & Education Initiative, EVAL advances foundational and translational research at the intersection of AI and learning—ensuring innovations are effective, equitable, and aligned with evidence-based educational practice.

Media Contact

Maureen Stanton, Assistant Director of Communications, Hariri Institute for Computing
Email: stanton@bu.edu
Phone: 617-358-5973