FURI | Spring 2025
Merging Large Language Models: Threats and Opportunities

While merged generative AI models offer a promising way to create domain-specific experts with significantly less compute, the safety and detectability of merged large language models have not been formally evaluated. This gap, and merging as a whole, is especially relevant to open-source models (like DeepSeek and Meta’s Llama) since they are publicly accessible and can be modified by anyone on the internet. Using real-world datasets, safety benchmarks, and diverse attack scenarios, this research assesses the impact of model merging techniques and quantifies the trade-offs between performance and safety. The findings are expected to contribute to the design of safer and more reliable model merging techniques.
Student researcher
Aryan Vinod Keluskar
Computer science
Hometown: Mumbai and Hyderabad, India
Graduation date: Spring 2026