Overview

SEMA (Spatial Epitope Modelling with Artificial intelligence) is a platform that applies artificial intelligence to a range of immunology problems. SEMA provides two analyses: conformational B-cell epitope prediction from the primary protein sequence or tertiary structure ("Find epitopes" tab), and structural comparison of antigen epitopes ("Compare epitopes" tab).
The epitope prediction tool offers sequence-based (SEMA-1D) and structure-based (SEMA-3D) approaches. The SEMA-1D model is based on an ensemble of ESM-1v transformer protein language models. The SEMA-3D model is based on an ensemble of ESM-IF1 inverse folding models. Both models were fine-tuned to predict the propensity of each antigen amino acid residue to interact with Fab regions of immunoglobulins. In addition, the epitope prediction tool includes a model, also based on ESM-1v, that predicts N-glycosylation sites from the primary protein sequence. SEMA reports an interpretable score, the log-scaled expected number of contacts with antibody residues, and labels predicted N-glycosylation sites.
The epitope comparison tool can identify structurally similar conformational epitopes between two antigens even when the antigens have very low overall similarity. It is useful for comparing proteins from different viral or bacterial strains and is based on a neural network trained on embeddings derived from the ESM-IF1 model.

Usage

The SEMA web platform is open access and simple to use. For the analysis of a large number of sequences, however, we recommend using the Python implementation available on GitHub.

"Find epitopes" tab

This tab provides conformational B-cell epitope prediction from the primary protein sequence (SEMA-1D) or from the tertiary structure (SEMA-3D).
SEMA-1D

Input
SEMA-1D takes an amino acid sequence as input. A sequence can be submitted in two ways: paste or type a string of interest

Output
The output includes a predicted epitope score and an N-glycosylation label for each residue in the amino acid (AA) sequence. The epitope score is the logarithm of the contact number, which is calculated as the number of antibody residues in contact with any atom of the antigen residue within a distance radius of 8 Å. The sequence is colored according to the predicted epitope score, from brown (not epitope) to cyan (epitope). Predicted N-glycosylated residues are marked with an asterisk.
The user can download the results in JSON or CSV format.
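
The contact number behind the epitope score can be illustrated with a minimal Python sketch. This is not the SEMA implementation. The function names, the toy coordinates, and the use of log(1 + n) as the logarithmic transform are assumptions made for illustration; only the 8 Angstrom radius and the counting rule come from the description above.

```python
import math


def contact_number(antigen_residue, antibody_residues, radius=8.0):
    """Count antibody residues that have any atom within `radius`
    (in Angstrom) of any atom of the given antigen residue.

    Each residue is a list of (x, y, z) atom coordinates.
    """
    count = 0
    for antibody_residue in antibody_residues:
        in_contact = any(
            math.dist(a, b) <= radius
            for a in antigen_residue
            for b in antibody_residue
        )
        if in_contact:
            count += 1
    return count


def log_contact_score(n_contacts):
    # Assumption: log(1 + n), so that zero contacts give a score of 0.
    # The exact transform used by SEMA is not stated in this guide.
    return math.log1p(n_contacts)


# Toy coordinates, invented for illustration only.
antigen_residue = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0)]
antibody_residues = [
    [(6.0, 0.0, 0.0)],                    # within 8 Angstrom
    [(0.0, 30.0, 0.0), (0.0, 7.0, 0.0)],  # second atom within 8 Angstrom
    [(0.0, 0.0, 40.0)],                   # too far away
]

n = contact_number(antigen_residue, antibody_residues)
print(n, log_contact_score(n))
```

In this toy case two of the three antibody residues are counted, because a single atom within the radius is enough for a residue to be in contact.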

SEMA-3D

Input
SEMA-3D takes a tertiary structure as input. The user can submit a tertiary structure with a target chain or an entire structure:

Note: Please do not submit both a PDB ID and a PDB file at once.

Output
The output includes a predicted epitope score and an N-glycosylation label for each residue in the AA sequence. The epitope score is the logarithm of the contact number, which is calculated as the number of antibody residues in contact with any atom of the antigen residue within a distance radius of 8 Å.
The AA sequence is colored according to the predicted epitope score, from brown (not epitope) to cyan (epitope). The tertiary structure of the protein is shown with the same color gradient. Predicted N-glycosylated residues are marked with an asterisk in the sequence and shown as spheres in 3D.
The user can download the results in JSON, CSV or PDB format.
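
For users who want to reproduce a similar brown-to-cyan coloring in their own figures, the following Python sketch maps a score to a color by linear interpolation. The RGB values, the score range, and the function name are hypothetical. The exact palette and scaling used by the SEMA viewer are not specified here.

```python
BROWN = (165, 42, 42)  # hypothetical RGB for "not epitope"
CYAN = (0, 255, 255)   # hypothetical RGB for "epitope"


def score_to_rgb(score, low, high):
    """Linearly interpolate between BROWN (score <= low) and CYAN (score >= high)."""
    if high <= low:
        raise ValueError("high must be greater than low")
    # Clamp the normalized score to [0, 1] so out-of-range values saturate.
    t = min(max((score - low) / (high - low), 0.0), 1.0)
    return tuple(round(b + t * (c - b)) for b, c in zip(BROWN, CYAN))


# Example with an assumed score range of 0 to 2.
for s in (0.0, 1.0, 2.0):
    print(s, score_to_rgb(s, low=0.0, high=2.0))
```

Clamping keeps scores outside the chosen range at the end colors rather than producing invalid RGB values.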

"Compare epitopes" tab

This tab provides structural comparison of antigen epitopes in order to find similar regions.

Input
The tool takes tertiary structures as input. For each of the proteins under study, the user submits a tertiary structure with a target chain. Each structure can be submitted in two ways:

Note: Please don't forget to enter the chain.

Output
The output includes predicted similarity and epitope scores for each residue in the AA sequence of both proteins under study. The epitope score is the logarithm of the contact number, which is calculated as the number of antibody residues in contact with any atom of the antigen residue within a distance radius of 8 Å. The similarity score indicates, for each residue, the degree of similarity between the corresponding sites.
Similar parts of the tertiary structures (similarity score > 2) are painted in the same colors, while non-similar parts are shown in grey. Epitopes are shown as sticks whose radius corresponds to the epitope score.
The user can download the results in JSON, CSV or PDB format.
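
The similarity threshold described above can be applied to per-residue scores with a short Python sketch. The record layout (the "residue" and "similarity" keys) and the residue labels are hypothetical and do not reflect the actual structure of the downloaded files. Only the cutoff of 2 comes from the text.

```python
def classify_residues(records, threshold=2.0):
    """Label each residue 'similar' if its similarity score exceeds
    the threshold, otherwise 'grey' (shown as non-similar)."""
    return [
        (rec["residue"], "similar" if rec["similarity"] > threshold else "grey")
        for rec in records
    ]


# Hypothetical per-residue records, invented for illustration only.
records = [
    {"residue": "A101", "similarity": 3.4},
    {"residue": "A102", "similarity": 2.0},
    {"residue": "A103", "similarity": 0.7},
]

for residue, label in classify_residues(records):
    print(residue, label)
```

The comparison is strict, so a residue with a score of exactly 2 is treated as non-similar, matching the "> 2" rule.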

Contact

SEMA was developed at AIRI.
Contact e-mail: bioinformatics@airi.net

Citation

Shashkova, T.I., Umerenkov, D., Salnikov, M., Strashnov, P.V., Konstantinova, A.V., Lebed, I., Shcherbinin, D.N., Asatryan, M.N., Kardymon, O.L., Ivanisenko, N.V. (2022). SEMA: Antigen B-cell conformational epitope prediction using deep transfer learning. Front. Immunol. doi: 10.3389/fimmu.2022.960985