FURI | Summer 2025
Text-to-Image Models Do NOT “Know” How to Interpret “No”
Modern text-to-image (T2I) models often misinterpret negated instructions, generating the very objects, attributes, or relationships that users ask them to omit. This inability to handle negation limits the precision and reliability of these creative tools. To characterize the failure, this study builds a benchmark dataset of prompts containing negation and uses it to systematically evaluate state-of-the-art (SOTA) T2I models. The methodology also assesses current automated T2I evaluation metrics and investigates the underlying causes of the negation deficiency. By quantifying how poorly current models handle negation, the benchmark ranks SOTA T2I models and provides a standard against which future efforts to solve the problem can be measured.
Student researcher
Anish Pravin Kulkarni
Computer science
Hometown: Pune, Maharashtra, India
Graduation date: Spring 2026