FURI | Fall 2023

Identifying the Optimal Orientation for Selecting Embedding Models for TCR-Epitope Binding Affinity Prediction


Accurate prediction of T cell receptor (TCR)-epitope binding is crucial for personalized healthcare and immunotherapy. However, encoding TCR amino acid sequences into numerical representations remains challenging. A recent study demonstrated that catELMo, a bidirectional-LSTM-based model, outperformed the Transformer-based models prevalent in Natural Language Processing. This project investigated the reasons behind catELMo’s superior performance, exploring two possibilities: 1) that bi-LSTMs are better suited to TCR analysis than Transformers, and 2) the impact of the learning objective. To test these possibilities, GPT, a Transformer-based model with the same learning objective as catELMo, was trained as the embedding model and compared to previous models. In addition, a variant of catELMo built on a vanilla (unidirectional) LSTM was trained to identify the impact of bidirectional language modeling.
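The difference between a forward-only and a bidirectional language model can be illustrated with a minimal sketch of how next-token training pairs might be built from an amino acid sequence. This is not the project's code: the function names and the example sequence are hypothetical, and the sketch only shows which context each objective would see for a given target residue.

```python
from typing import List, Tuple

Pair = Tuple[str, str]  # (context, target residue)


def forward_pairs(seq: str) -> List[Pair]:
    """Left-to-right pairs: the context is every residue before the target."""
    return [(seq[:i], seq[i]) for i in range(1, len(seq))]


def backward_pairs(seq: str) -> List[Pair]:
    """Right-to-left pairs: the context is every residue after the target.

    The context is shown in its original order for readability; a backward
    model would consume it in reverse.
    """
    return [(seq[i + 1:], seq[i]) for i in range(len(seq) - 1)]


def training_pairs(seq: str, bidirectional: bool) -> List[Pair]:
    """Forward pairs only (unidirectional) or forward plus backward pairs."""
    pairs = forward_pairs(seq)
    if bidirectional:
        pairs = pairs + backward_pairs(seq)
    return pairs


if __name__ == "__main__":
    example = "CASS"  # made-up sequence fragment, for illustration only
    for context, target in training_pairs(example, bidirectional=True):
        print(f"context={context!r} -> target={target!r}")
```

Under this sketch, a unidirectional model would train on the forward pairs alone, while a bidirectional model would also learn from the residues that follow each target.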

Student researcher

Aiko Muraishi

Computer science

Hometown: Miyama, Fukuoka, Japan

Graduation date: Spring 2024