ASAP-SML: An antibody sequence analysis pipeline using statistical testing and machine learning
Antibodies are capable of potently and specifically binding individual antigens and, in some cases, disrupting their functions. The key challenge in generating antibody-based inhibitors is the lack of fundamental information relating sequences of antibodies to their unique properties as inhibitors....
Saved in:
Published in | PLoS computational biology Vol. 16; no. 4; p. e1007779 |
---|---|
Main Authors | , , |
Format | Journal Article |
Language | English |
Published |
United States
Public Library of Science
01.04.2020
Public Library of Science (PLoS) |
Subjects | |
Online Access | Get full text |
Cover
Loading…
Summary: | Antibodies are capable of potently and specifically binding individual antigens and, in some cases, disrupting their functions. The key challenge in generating antibody-based inhibitors is the lack of fundamental information relating sequences of antibodies to their unique properties as inhibitors. We develop a pipeline, Antibody Sequence Analysis Pipeline using Statistical testing and Machine Learning (ASAP-SML), to identify features that distinguish one set of antibody sequences from antibody sequences in a reference set. The pipeline extracts feature fingerprints from sequences. The fingerprints represent germline, CDR canonical structure, isoelectric point and frequent positional motifs. Machine learning and statistical significance testing techniques are applied to antibody sequences and extracted feature fingerprints to identify distinguishing feature values and combinations thereof. To demonstrate how it works, we applied the pipeline on sets of antibody sequences known to bind or inhibit the activities of matrix metalloproteinases (MMPs), a family of zinc-dependent enzymes that promote cancer progression and undesired inflammation under pathological conditions, against reference datasets that do not bind or inhibit MMPs. ASAP-SML identifies features and combinations of feature values found in the MMP-targeting sets that are distinct from those in the reference sets. |
---|---|
Bibliography: | new_version ObjectType-Article-1 SourceType-Scholarly Journals-1 ObjectType-Feature-2 content type line 23 The authors have declared that no competing interests exist. |
ISSN: | 1553-7358 1553-734X 1553-7358 |
DOI: | 10.1371/journal.pcbi.1007779 |