Teaching AI to Personalize: Aldo Pacchiano Introduces a New Method for Adaptive Large Language Models
As large language models (LLMs) like ChatGPT become embedded in everyday life, from drafting emails to debugging code, expectations are growing that they should understand how different people want to be helped. Yet despite their sophistication, most of these models still treat all users more or less the same. This is where Aldo Pacchiano, Assistant Professor at Boston University’s Faculty of Computing and Data Sciences (CDS), saw a problem worth solving. “Current models optimize for what most people tend to like,” he explains, “but they’re not actually learning what you like.”

In a new paper titled “Language Model Personalization via Reward Factorization,” Pacchiano and his co-authors introduce Personalization via Reward Factorization (PReF): a framework that personalizes LLM responses to individual users without requiring extensive retraining or massive datasets. Instead of averaging preferences across millions of users, PReF asks the user to choose between pairs of candidate responses and gradually learns which traits they value most, such as humor, brevity, or formality. With just a handful of these comparisons, it builds a personalized profile that guides the model’s future responses to better match the individual’s style. The approach offers a more flexible and efficient alternative to the one-size-fits-all behavior of systems like GPT-4o, and moves toward AI that better understands and serves each individual, not just the average user.
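The factorization idea can be sketched in a few lines: each candidate response is scored along a small set of trait dimensions, a user's reward is modeled as a weighted sum of those scores, and the user-specific weights are fit from their pairwise choices. The sketch below uses a simple Bradley-Terry style logistic model with made-up trait names and numbers; it is meant only to illustrate the general recipe, not the paper's exact formulation.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative trait axes for candidate responses. These names and values are
# invented for this sketch; the paper's actual feature set differs.
TRAITS = ["humor", "brevity", "formality"]

def user_reward(features, weights):
    """A user's reward is modeled as a weighted sum of trait scores."""
    return features @ weights

def fit_user_weights(comparisons, dim, lr=0.5, steps=500):
    """Estimate a user's trait weights from pairwise comparisons.

    Each comparison is (features_of_chosen, features_of_rejected). We fit a
    Bradley-Terry style logistic model by gradient ascent on the log-likelihood,
    with a small L2 penalty to keep the weights bounded.
    """
    w = np.zeros(dim)
    for _ in range(steps):
        grad = np.zeros(dim)
        for f_win, f_lose in comparisons:
            diff = f_win - f_lose
            p_win = 1.0 / (1.0 + np.exp(-diff @ w))
            grad += (1.0 - p_win) * diff
        w += lr * (grad / len(comparisons) - 0.01 * w)
    return w

# A hidden "true" user who likes brevity and dislikes formality.
true_w = np.array([0.2, 1.5, -1.0])

# Simulate a handful of pairwise comparisons between random candidate responses.
comparisons = []
for _ in range(15):
    f_a, f_b = rng.normal(size=3), rng.normal(size=3)
    # The user picks whichever response has the higher (slightly noisy) reward.
    if user_reward(f_a, true_w) + rng.normal(scale=0.1) > user_reward(f_b, true_w) + rng.normal(scale=0.1):
        comparisons.append((f_a, f_b))
    else:
        comparisons.append((f_b, f_a))

est_w = fit_user_weights(comparisons, dim=3)
print("estimated preference direction:", dict(zip(TRAITS, est_w.round(2))))
```

Even with a dozen or so comparisons, the estimated weights point in roughly the same direction as the hidden preferences, which is the intuition behind personalizing from only a handful of interactions.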
To test how well this lightweight personalization could work, Pacchiano and his team first trained PReF on synthetic users, essentially role-played personalities created by prompting a language model to favor specific traits. Each simulated user chose between dozens of response pairs, gradually revealing their underlying preferences. “It’s a trick,” he explains, “but it works. You tell the model, ‘Pretend you’re a user who likes things this way,’ and that generates reliable training data.” Once PReF could consistently adapt to these synthetic users, the team moved on to a more difficult benchmark: real people. Using the PRISM dataset, which includes thousands of user profiles and prompt–response interactions, they tested whether the model could still match individuals’ preferences even when the traits weren’t cleanly labeled or pre-defined. The results were promising: with just 10 to 20 response comparisons, PReF could generate replies that real users preferred over GPT-4o’s standard outputs.
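A rough sketch of how such synthetic preference data might be collected is shown below. Here `query_llm` is a placeholder for whatever chat API is used, and the persona and prompt wording are illustrative rather than taken from the paper.

```python
def query_llm(system_prompt: str, user_prompt: str) -> str:
    """Placeholder: call an LLM of your choice and return its text reply."""
    raise NotImplementedError

def synthetic_comparison(persona: str, prompt: str, response_a: str, response_b: str) -> str:
    """Ask a role-played persona which of two responses it prefers ('A' or 'B')."""
    system = f"Pretend you are a user who {persona}. Answer only 'A' or 'B'."
    question = (
        f"Prompt: {prompt}\n\n"
        f"Response A: {response_a}\n\n"
        f"Response B: {response_b}\n\n"
        "Which response do you prefer?"
    )
    return query_llm(system, question).strip().upper()

# Example usage (runs once query_llm is wired to a real model):
# choice = synthetic_comparison(
#     persona="prefers short, direct answers with no humor",
#     prompt="How do I reverse a list in Python?",
#     response_a="Call my_list.reverse(), or use my_list[::-1] for a copy.",
#     response_b="Great question! There are many delightful ways to do this...",
# )
```

Collecting many such choices per persona yields the pairwise training data described above, without needing real annotators during development.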
While general-purpose tools like ChatGPT allow users to adjust tone or style by tweaking their prompts, Pacchiano believes this places too much burden on the user. In contrast, he sees PReF as especially useful in applications where users either don’t know how to steer the model or won’t interact with it frequently enough to learn how. “If you’re deploying a customer support bot, or even an internal tool like a Slack assistant, the user might only interact with it once,” he notes. “But in that short time, the system still needs to feel helpful, efficient, and intuitive.” In these settings, PReF’s ability to adapt automatically, without explicit instruction, offers a major advantage. Whether the user prefers long explanations or direct answers, formal language or casual phrasing, the model can learn and adjust accordingly, even with minimal input. This opens the door to more responsive AI agents in education, workplace tools, healthcare communication, and other domains where personalization can directly improve trust, usability, and satisfaction.
Ultimately, this work raises a deeper question: if AI is going to be everywhere, should it learn to meet us more than halfway? For Pacchiano, personalization is a shift in how we think about human-AI interaction. His next steps involve extending this adaptability beyond style and tone, toward reasoning and decision-making in real-world settings. He imagines systems that don’t just tailor their language, but learn how to move through space, solve unfamiliar problems, and adapt with limited feedback. That may sound far from a simple pairwise comparison, but the principle is the same. If AI can start by listening more closely to how we prefer to be answered, perhaps it can also learn how we prefer to be helped.
- Neeza Singh (CDS'25), CDS Research Communications Intern