FURI | Spring 2025
Optimizing Large Language Models to Minimize Attention Latency for Increasingly Complex Inputs

This project aims to improve Large Language Models’ context retention and coherence by optimizing attention mechanisms, adjusting parameters, and refining inputs. Unlike previous studies, this approach focuses on modifying context-window and word-graph parameters so the model can process inputs that exceed its limited context retention. A simplified AI model will be designed and evaluated for improved attention, measured by output relevance and changes in responses. By extending the model’s effective attention span, this research aims to advance LLMs’ contextual awareness and improve the conversational quality of generative AI. These findings could contribute to the broader advancement and reliability of AI applications.
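As background for the attention-latency framing above, the sketch below shows standard scaled dot-product attention in NumPy and times it over growing input lengths; the implementation, shapes, and timing loop are illustrative assumptions rather than the project’s actual model, but they show how the seq_len × seq_len score matrix makes latency grow quadratically as the context window widens.

```python
# Minimal illustrative sketch (not the project's method): scaled dot-product
# attention and how its cost grows with input length (the "context window").
import time
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """Standard attention: softmax(Q K^T / sqrt(d)) V.

    Q, K, V have shape (seq_len, d). The (seq_len x seq_len) score matrix
    is the source of the quadratic cost in sequence length.
    """
    d = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d)                   # (seq_len, seq_len)
    scores -= scores.max(axis=-1, keepdims=True)    # numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)  # softmax over keys
    return weights @ V                              # (seq_len, d)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    d = 64
    for seq_len in (256, 512, 1024, 2048):          # growing context window
        Q, K, V = (rng.standard_normal((seq_len, d)) for _ in range(3))
        start = time.perf_counter()
        scaled_dot_product_attention(Q, K, V)
        print(f"seq_len={seq_len:5d}  latency={time.perf_counter() - start:.4f}s")
```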
Student researcher
Kiera Charlotte Lai
Computer systems engineering
Hometown: Pleasanton, California, United States
Graduation date: Spring 2028