FURI | Fall 2024, Summer 2024

Next-Gen Immunotherapies: Analysis of Large TCR Repertoires to Create a TCR Cluster Benchmark Using the catELMo Embedding Technique

Accurate clusters in large-scale TCR datasets provide candidate disease-specific receptors and are vital to repertoire classification for personalized immunotherapy. This study investigates why catELMo, a bidirectional LSTM model, outperforms GIANA, a handcrafted static embedder, on the unsupervised learning task of clustering TCR-epitope data. It compares the two embeddings across several metrics and evaluates noise mitigation strategies using one density-based and one distribution-based clustering algorithm.

Student researcher

Muhammed Hunaid Topiwala

Computer science

Hometown: Bengaluru, Karnataka, India

Graduation date: Spring 2026