FURI | Fall 2023
Identifying the Optimal Orientation for Selecting Embedding Models for TCR-Epitope Binding Affinity Prediction
Accurate prediction of T cell receptor (TCR)-epitope binding affinity is crucial for personalized healthcare and immunotherapy. However, encoding TCR amino acid sequences into numerical representations remains challenging. A recent study demonstrated that catELMo, a bidirectional-LSTM-based embedding model, outperformed the Transformer-based models prevalent in Natural Language Processing. In this project, the reasons behind catELMo’s superior performance were investigated by exploring two possibilities: 1) bi-LSTM’s suitability for TCR analysis compared to Transformers, and 2) the impact of the learning objective. To test these possibilities, GPT, a Transformer-based model with the same learning objective as catELMo, was trained as the embedding model and compared to the previous models. Moreover, a variant of catELMo with a vanilla (unidirectional) LSTM was trained to identify the impact of bidirectional language modeling.