Large-scale modeling of pHLA complexes for structure-based immunogenicity prediction

Sae Hee Choi*, Martiela Vaz de Freitas, Rafael Lorenzo Belle, Didier Devaurs, Tiago Coelho Ferreto, Dinler Amaral Antunes

*Corresponding author for this work

Research output: Contribution to journal › Conference abstract › peer-review

Abstract

Background Cancer immunotherapies have become an effective tool for fighting cancer by manipulating the cellular immune response. A key step of cellular immunity involves peptides binding to Human Leukocyte Antigen (HLA) molecules and forming stable peptide-HLA (pHLA) complexes. pHLAs are transported to the cell surface, where they can be ‘inspected’ by T-cells through highly specific T-cell receptors (TCRs). Understanding such interactions is central to designing peptide-based vaccines and T-cell-based immunotherapies. Key limitations to the widespread use of cancer immunotherapies include the lack of immunogenicity of tumor-associated peptide antigens and the risk of off-target effects causing immune-related adverse events. Different approaches are being explored to address these issues. Accurate prediction of pHLA immunogenicity is therefore a critical area for advancing the design of cancer immunotherapies. However, current immunogenicity prediction tools are limited to identifying peptide motifs, neglecting critical structural features and TCR-specific recognition. We developed a structure-based machine learning tool that utilizes models of HLA-peptide-TCR complexes to extract features predictive of immunogenicity.

Methods A labelled dataset of over 15 million pHLA sequences with experimentally determined immunogenicity results was obtained from BigMHC, a previously developed sequence-based immunogenicity prediction tool [1]. APE-Gen 2.0 [2] and Boltz-2 [3] were used to generate structural models for each pHLA, to be used for the extraction of structural features. APE-Gen 2.0 was selected for its tailored, scalable pHLA modeling and docking-based scoring of conformational ensembles [2]. Boltz-2, a novel AI-based tool, was selected for outperforming AlphaFold2 by incorporating improved biophysical refinement [3]. It also predicts the binding affinity of the complex using a new AI-based approach.

Results We modeled a pilot dataset of 77 complexes with binding affinity and immunogenicity labels using APE-Gen 2.0 and Boltz-2. The top-scored conformations from the APE-Gen 2.0 ensembles are being analyzed to determine which structural features contribute to immunogenicity. Different properties and featurization approaches will be explored. We are also evaluating the accuracy of Boltz-2 affinity predictions in comparison with APE-Gen 2.0 and Rosetta. Large-scale modeling of the entire dataset of 15 million complexes is ongoing.
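
As an illustration of the selection step described above, the following is a minimal sketch of how the top-scored conformation could be picked from each modeled ensemble. It is not the authors' pipeline and does not call APE-Gen 2.0 or Boltz-2. The `Conformation` record, its field names, and the convention that a lower score is better are all assumptions made for this example.

```python
from __future__ import annotations

from typing import NamedTuple


class Conformation(NamedTuple):
    """One modeled conformation of a pHLA complex (hypothetical record)."""
    complex_id: str   # hypothetical identifier for the pHLA complex
    model_file: str   # hypothetical path to the modeled structure file
    score: float      # assumed docking-style score; lower is treated as better


def top_scored(ensemble: list[Conformation]) -> dict[str, Conformation]:
    """Return the best-scoring conformation for each complex in the ensemble."""
    best: dict[str, Conformation] = {}
    for conf in ensemble:
        current = best.get(conf.complex_id)
        # Keep the conformation with the lowest score seen so far.
        if current is None or conf.score < current.score:
            best[conf.complex_id] = conf
    return best


# Made-up placeholder values, not results from the study.
ensemble = [
    Conformation("complex_a", "a_0.pdb", -7.1),
    Conformation("complex_a", "a_1.pdb", -8.4),
    Conformation("complex_b", "b_0.pdb", -6.0),
]
best = top_scored(ensemble)
```

If the scoring function in use treats higher values as better, the comparison in `top_scored` would need to be reversed.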

Conclusions By integrating structural features of the entire pHLA complex, we aim to overcome the limitations of current sequence-based approaches and enable more accurate screening of therapeutic peptides, improving the safety of next-generation immunotherapies. Once modeled with both independent approaches (~30 million pHLA complexes), our dataset will be the largest available for machine-learning training. We will then explore how to further improve immunogenicity prediction by leveraging different structural sources, featurization methods and AI models.
Original language: English
Pages (from-to): A1234
Number of pages: 1
Journal: Journal for ImmunoTherapy of Cancer
Volume: 13
Issue number: Suppl 2
Publication status: Published - 4 Nov 2025
Event: SITC 40th Annual Meeting - National Harbor, United States
Duration: 5 Nov 2025 to 9 Nov 2025

Keywords

  • Cancer immunotherapies
  • immunogenicity
