FURI | Spring 2025
Optimizing Large Language Models to Minimize Attention Latency for Increasingly Complex Inputs

This project aims to improve Large Language Models’ context retention and coherence by optimizing attention mechanisms, adjusting parameters, and refining inputs. Unlike previous studies, this approach focuses on modifying context-window and word-graph parameters so the model can process inputs that exceed its limited context retention. A simplified AI model will be designed and evaluated for improved attention, measured by output relevance and changes in responses. By extending the model’s effective attention span, this research aims to advance LLMs’ contextual awareness and improve the conversational quality of generative AI. These findings could contribute to the broader advancement and reliability of AI applications.
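As background for the attention-latency framing above, the sketch below shows standard scaled dot-product attention in NumPy and times it over growing input lengths; the implementation, shapes, and timing loop are illustrative assumptions rather than the project’s actual model, but they show how the seq_len × seq_len score matrix makes latency grow quadratically as the context window widens.

```python
# Minimal illustrative sketch (not the project's method): scaled dot-product
# attention and how its cost grows with input length (the "context window").
import time
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """Standard attention: softmax(Q K^T / sqrt(d)) V.

    Q, K, V have shape (seq_len, d). The (seq_len x seq_len) score matrix
    is the source of the quadratic cost in sequence length.
    """
    d = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d)                   # (seq_len, seq_len)
    scores -= scores.max(axis=-1, keepdims=True)    # numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)  # softmax over keys
    return weights @ V                              # (seq_len, d)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    d = 64
    for seq_len in (256, 512, 1024, 2048):          # growing context window
        Q, K, V = (rng.standard_normal((seq_len, d)) for _ in range(3))
        start = time.perf_counter()
        scaled_dot_product_attention(Q, K, V)
        print(f"seq_len={seq_len:5d}  latency={time.perf_counter() - start:.4f}s")
```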
Student researcher
Kiera Charlotte Lai
Computer systems engineering
Hometown: Pleasanton, California, United States
Graduation date: Spring 2028