The Database of Antimicrobial Activity and Structure of Peptides (DBAASP) has been created to provide users with detailed information on the chemical structure and biological activity of peptides tested experimentally against particular targets. The database is manually curated and contains information on ribosomal, nonribosomal, and synthetic peptides that show antimicrobial activity as Monomers, Multimers, and Multi-Peptides.

In DBAASP:

Ribosomal peptides are peptides whose amino acid sequence is genetically encoded and naturally produced, regardless of any modification of their termini.

Nonribosomal peptides are a class of peptide secondary metabolites, usually produced by microorganisms like bacteria and fungi. They are not synthesized by ribosomes.

Monomer - consists of one polypeptide chain.

Multimer - consists of two or more polypeptide chains with an interchain covalent bond(s).

Multi-Peptide - consists of two or more distinct polypeptide chains in equimolar concentrations without interchain covalent bonds.

Utilities of the database are: Search, Ranking Search, Statistics, Property Calculation, Prediction, API.

Statistics presents three kinds of data: General Data, Compositional Data, and Physicochemical Data.

General Data consists of data on a) compositions according to the type of synthesis, complexity, target groups etc.; b) distribution of lengths.

Compositional Data consists of data on a) amino acid compositions; b) the occurrence of sequential pairs of residues.

Physicochemical Data consists of data on the distributions of values of different physicochemical parameters such as hydrophobicity, amphipathicity, isoelectric point, etc.

Property calculation allows calculating various physicochemical characteristics of peptides such as hydrophobicity, hydrophobic moment, net charge, isoelectric point, etc.

API describes how the database can be accessed with programs. All resources (individual entries as well as sets of entries retrieved by queries) are accessible using simple URLs.

Prediction predicts antimicrobial and hemolytic/cytotoxic activity of a peptide.

Users are provided with information on synonyms (based on the NCBI Taxonomy Database). Information about the existence of synonyms of particular target species can be obtained with the search tool.

For detailed information see the links below:

Search

The Search page includes several options:

ID(s)

Peptides can be found using a single ID (77), several IDs (77, 99), or a range of IDs (77-99).
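
The three ID forms above can be expanded into an explicit list of IDs. The following is a minimal sketch, not the database's own parser; the function name parse_id_query is hypothetical, and it assumes that a range such as 77-99 includes both end points.

```python
def parse_id_query(query: str) -> list[int]:
    """Expand a query such as "77", "77, 99" or "77-99" into a sorted list of IDs."""
    ids: set[int] = set()
    for part in query.split(","):
        part = part.strip()
        if not part:
            continue  # tolerate stray commas
        if "-" in part:
            start_text, end_text = part.split("-", 1)
            start, end = int(start_text), int(end_text)
            if start > end:
                raise ValueError(f"invalid range: {part!r}")
            # Assumption: both ends of the range are included.
            ids.update(range(start, end + 1))
        else:
            ids.add(int(part))
    return sorted(ids)
```

For example, parse_id_query("77, 80-82") yields the IDs 77, 80, 81, and 82.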

Name

Enter either the full peptide name or part of it in this field.

Sequence

Search by sequence can be performed using two options: 1) Full Sequence and 2) Part of Sequence (the peptide is found by a fragment of its amino acid chain).

Sequence Length

Finds peptides whose sequence length falls within a specified range.

N Terminus, C Terminus

Peptides can be found according to their termini modifications. Modification types are presented in the dropdown menu. The description of each type is given below in Abbreviations.

Complexity

The peptides are divided into three types by complexity - Monomers, Multimers and Multi-Peptides. Complexity type should be selected from the dropdown menu.

Unusual Amino Acid

This field allows finding peptides containing unusual, posttranslationally modified, or artificial amino acids. Such amino acids are presented in the dropdown menu (see abbreviations).

Bond

Finds monomers with intrachain covalent bonds. Bond type should be selected from the dropdown menu (see abbreviations).

Synthesis Type

Finds peptides according to their synthesis type (ribosomal, nonribosomal and synthetic). Synthesis type should be selected from the dropdown menu.

Kingdom

This field allows finding peptides according to the kingdom level of their source organism. Kingdoms can be selected from the dropdown menu.

Source

This field allows selecting peptides according to the name of the source organism. Enter the Latin name (or part of the name) of the peptide's source organism.

Target Group

Selects peptides according to the groups of target species. To select more than one group, hold down the Ctrl key and click each group.

Target Object of Cell

Target Object of Cell is the subcellular structure or molecule that the peptide interacts with. Selection is done as described in Target Group.

Target Species

Selects peptides according to the target species (bacteria, fungus, virus, cancer). Name or part of the name of the target species can be used for searching.

Nonstandard Experimental Conditions

Selects peptides whose activities were measured under nonstandard salt or pH conditions.

Hemolytic and Cytotoxic activities

Selects peptides with hemolytic and/or cytotoxic activity.

UniProt ID

Selects peptides according to UniProt ID. We distinguish three types of UniProt IDs: Peptide ID, Precursor ID, and Probable Precursor ID.

1) UniProt ID is defined as "Peptide ID" if the amino acid sequence coincides with the sequence in the UniProt entry.

2) UniProt ID is defined as "Precursor ID" if the amino acid sequence corresponds to a part of a UniProt entry that is defined as a precursor.

3) UniProt ID is defined as "Probable Precursor ID" if the amino acid sequence corresponds to a part of a UniProt entry that can be considered a precursor.

3D Structure

Selects peptides that have a link to the PDB database and/or information about an MD model.

Synergy

This field allows finding peptides that are synergistic with other peptides or with antimicrobials (including antibiotics). Antimicrobials are presented in the dropdown menu. To get synergy data for a particular peptide, combine the ID of the peptide with the option "All with data on synergy" from the "Synergy" dropdown. The table "Synergy between current peptide and Antimicrobials" in the peptide card gives information on synergistic relations between the query peptide and other peptides. To get synergy data for an antimicrobial, choose only the name of the antimicrobial.

Search result

Each row of the table corresponds to a particular peptide and contains the peptide ID, name, N terminus modification, sequence, and C terminus modification. A lowercase letter indicates a D-amino acid. Posttranslationally modified, unusual, or artificial amino acids are depicted by X or x. (The exception is cysteines involved in disulfide bonds, which are depicted by C or c.) To get full information on a peptide, click View at the right edge of the row.
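
The sequence notation described above can be decoded residue by residue. This is only an illustrative sketch: the function name describe_residues and the labels "L", "D", and "nonstandard" are hypothetical, and it assumes that an uppercase letter other than X denotes an ordinary L-amino acid, which the text implies but does not state.

```python
def describe_residues(sequence: str) -> list[tuple[str, str]]:
    """Label each symbol of a search-result sequence.

    Returns (one-letter code in uppercase, label) pairs, where the label is
    "nonstandard" for X/x, "D" for other lowercase letters, and "L" otherwise.
    """
    described = []
    for symbol in sequence:
        if symbol in ("X", "x"):
            # Posttranslationally modified, unusual, or artificial residue.
            label = "nonstandard"
        elif symbol.islower():
            # Lowercase letters indicate D-amino acids.
            label = "D"
        else:
            # Assumption: uppercase letters are ordinary L-amino acids.
            label = "L"
        described.append((symbol.upper(), label))
    return described
```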

Ranking Search

Ranking Search provides information about peptide activities against specific target species/cells, ranked by activity value for an activity/lysis measure that can be selected from the dropdown menu. It can be used in combination with other search options: Sequence Length, N Terminus, C Terminus, Complexity, Unusual Amino Acid, Bond, Synthesis Type, Kingdom, Source, Medium, CFU. For ranking search, some activity measures are systematized.

Hemolytic/cytotoxic measures of activity

In the literature, several measures are in use for cytotoxicity and hemolysis. ECn (n=25, 50), HCn (n=10, 50, 100), HDn (n=50), HLn (n=50), LCn (n=50), LDn (n=50) are used as measures of hemolysis. They define the peptide concentration at which n% erythrocyte lysis occurs. In addition, MHC is the minimal concentration of peptide that causes no observable hemolytic activity. Some authors define MHC as the peptide concentration that causes n% hemolysis (n=0–10). For Ranking Search it is reasonable to standardize these measures, so the common term "n% hemolysis" is used instead of the various denotations.
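
The standardization described above amounts to mapping each literature label to the common term. The sketch below is an illustration under assumptions, not the database's code: the function name to_common_term is hypothetical, and MHC is deliberately left unmapped because its n varies between authors.

```python
import re
from typing import Optional

# Prefixes listed in the text as hemolysis measures, followed by n (percent lysis).
_HEMOLYSIS_MEASURE = re.compile(r"^(EC|HC|HD|HL|LC|LD)\s*(\d{1,3})$")


def to_common_term(measure: str) -> Optional[str]:
    """Map a label such as "HC50" to the common term "50% hemolysis".

    Returns None for labels that do not follow the <prefix><n> pattern
    (for example "MHC", whose n is author-dependent).
    """
    match = _HEMOLYSIS_MEASURE.match(measure.strip().upper())
    if match is None:
        return None
    percent = int(match.group(2))
    if not 0 < percent <= 100:
        return None
    return f"{percent}% hemolysis"
```

With this mapping, HC50, HD50, and LD50 all collapse to the same ranking key, "50% hemolysis".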

Antiviral activity measurements

The antiviral activity of an AMP should be distinguished from its antibacterial activity. Bacteria are fully self-sufficient organisms and are able to grow without host cells. Therefore, bacterial growth inhibition or killing is measured directly. Viruses are not self-sufficient and cannot reproduce without a host cell. Consequently, antiviral activity is measured as the inhibition of processes connected to viral multiplication in the host cell, such as the activity of integrase, reverse transcriptase, and protease, replication, cell fusion/entry, etc. In the literature, the characteristic measures of these processes are IC50 and EC50.

In ranking search, activity measurements – integrase activity, reverse transcriptase activity, protease activity, and virus replication – are systematized as follows:

1) Integrase 3'-end processing and strand transfer measurements are designated IC50 IN 3’EP and IC50 IN ST, respectively.

2) Retroviral reverse transcriptase has three activities: RNA-dependent DNA polymerase, DNA-dependent DNA polymerase, and ribonuclease H. For these activities, we use IC50 RT RDDP, IC50 RT DDDP, and IC50 RT RH, respectively.

3) Activity measures for protease activity, replication, cell-cell fusion, viral entry, and infection are denoted by IC50 PR, IC50 REP, IC50 F, IC50 E, and IC50 I, respectively. Other activity measures do not take part in the ranking search.
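
The designations listed in points 1) to 3) can be collected into a single lookup table. The labels are taken from the text above; the dictionary keys and the function name ranking_label are hypothetical names chosen for this sketch.

```python
from typing import Optional

# Process name (hypothetical key) -> designation used in ranking search.
ANTIVIRAL_LABELS = {
    "integrase 3'-end processing": "IC50 IN 3’EP",
    "integrase strand transfer": "IC50 IN ST",
    "RNA-dependent DNA polymerase": "IC50 RT RDDP",
    "DNA-dependent DNA polymerase": "IC50 RT DDDP",
    "ribonuclease H": "IC50 RT RH",
    "protease activity": "IC50 PR",
    "replication": "IC50 REP",
    "cell-cell fusion": "IC50 F",
    "viral entry": "IC50 E",
    "infection": "IC50 I",
}


def ranking_label(process: str) -> Optional[str]:
    """Return the ranking-search designation, or None if the measure is not ranked."""
    return ANTIVIRAL_LABELS.get(process)
```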

Medium

As antimicrobial activities are found to differ under various conditions, the Peptide Card includes information on Medium and CFU.

Ranking search provides the available information on the antimicrobial activities of peptides against a specific target species in a particular medium and at a definite CFU.

The prediction section predicts the antimicrobial activity of peptides. The website contains three types of prediction tools: Prediction of general antibacterial activity, Prediction of activity against specific microbial species, and Strain-specific AMP prediction based on microbial genome analysis.

Prediction of general antibacterial activity is a tool for predicting the antimicrobial potential of linear peptides only, that is, whether a peptide is active against some bacterial strain. It is based on a machine learning algorithm. Initially, the following physicochemical characteristics of peptides and hydrophobic scales were used:

Physicochemical characteristics: Normalized Hydrophobic Moment, Normalized Hydrophobicity, Charge Density, Isoelectric Point, Penetration Depth, Orientation of Peptides relative to the surface of the membrane (Tilt Angle), Propensity to Disordering, Linear Moment, in vitro aggregation (Tango), and in vivo aggregation (Aggrascan).

Hydrophobic scales: MF – Moon and Fleming scale [1], KD – Kyte and Doolittle scale [2], WW – Wimley and White scale [3], EW – Eisenberg and Weiss scale [4], UH – Unified Hydrophobicity scale [5], HW – Hessa and White scale [6].

Finally, the MF hydrophobic scale and the following characteristics were selected: Hydrophobic moment, Charge density, and depth-dependent potential (for details, see [7]). The peptide should consist of the 20 canonical amino acids, and the sequence should be in FASTA format.
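
Before submitting a sequence, the two input requirements above (FASTA format, only the 20 canonical amino acids) can be checked locally. This is a minimal sketch, not the server's validation; the function names read_fasta and non_canonical_residues are hypothetical, and it assumes uppercase one-letter codes.

```python
CANONICAL = set("ACDEFGHIKLMNPQRSTVWY")  # the 20 canonical amino acids


def read_fasta(text: str) -> dict[str, str]:
    """Parse FASTA text into a {header: sequence} dictionary."""
    records: dict[str, str] = {}
    header = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            header = line[1:].strip()
            if header in records:
                raise ValueError(f"duplicate header: {header!r}")
            records[header] = ""
        elif header is None:
            raise ValueError("sequence data found before the first '>' header")
        else:
            records[header] += line  # sequences may span several lines
    return records


def non_canonical_residues(sequence: str) -> list[str]:
    """Return the sorted symbols in the sequence that are not canonical amino acids."""
    return sorted(set(sequence) - CANONICAL)
```

An empty list from non_canonical_residues means the sequence meets the composition requirement under the assumption stated above.
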
Prediction of activity against specific microbial species is a tool for predicting the antimicrobial potential of linear AMPs against particular species. An active peptide exhibits MIC<25 µg/ml, and a non-active peptide exhibits MIC>100 µg/ml. The strain can be selected from the drop-down menu. The number of strains will keep growing. The length of the peptide should not exceed 30 amino acids. The server produces one of two predictive values: the positive predictive value (PPV) for peptides predicted as active and the negative predictive value (NPV) for peptides predicted as not active. For activity against human erythrocytes, non-active peptides are considered positive, so the server generates PPV for peptides predicted as not active against human erythrocytes, and NPV otherwise.
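
The activity thresholds and the two predictive values can be written down compactly. The sketch below uses the MIC cut-offs from the text and the standard definitions PPV = TP / (TP + FP) and NPV = TN / (TN + FN). The function names are hypothetical, and the "unlabeled" class for MIC values between 25 and 100 µg/ml is an assumption, since the text does not say how such peptides are treated.

```python
def label_by_mic(mic_ug_per_ml: float) -> str:
    """Label a peptide by its MIC (µg/ml) using the thresholds from the text."""
    if mic_ug_per_ml < 25:
        return "active"
    if mic_ug_per_ml > 100:
        return "non-active"
    # Assumption: values from 25 to 100 µg/ml belong to neither class.
    return "unlabeled"


def positive_predictive_value(true_positives: int, false_positives: int) -> float:
    """Fraction of peptides predicted positive that really are positive."""
    return true_positives / (true_positives + false_positives)


def negative_predictive_value(true_negatives: int, false_negatives: int) -> float:
    """Fraction of peptides predicted negative that really are negative."""
    return true_negatives / (true_negatives + false_negatives)
```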

A semi-supervised machine-learning approach relying on the density-based clustering algorithm DBSCAN was developed to optimize the predictive model. The Moon and Fleming hydrophobic scale and the following characteristics are used in the QSAR study: Normalized Hydrophobic Moment, Normalized Hydrophobicity, Charge, Isoelectric Point, Penetration Depth, Orientation of Peptides relative to the surface of the membrane (Tilt Angle), Propensity to Disordering, Linear Moment, and in vitro aggregation [8].
Strain-specific AMP prediction based on microbial genome analysis is a tool for predicting the antimicrobial potential of linear AMPs active against particular strains. The prediction algorithm uses information on the genomic sequences of specific microbial strains. An active peptide exhibits MIC<25 µg/ml, and a non-active peptide exhibits MIC>100 µg/ml. The microbial strains can be selected in two ways: from the drop-down menu, or by providing the genomic sequence of the desired strain, which can be uploaded or accessed through its GenBank ID. It is also possible to predict activity against several microbial strains: some can be selected from the drop-down menu, and the user can add one more strain using the corresponding genome sequence. The length of the peptide should not exceed 30 amino acids.

Predictive models for the tool have been developed based on a comparative analysis of various machine learning algorithms. The most accurate and stable model for each strain is used in the AMP predictor. For the considered strains, the Random Forest or RealAdaBoost machine learning algorithm (depending on the target strain) gives the best prediction performance. The RealAdaBoost algorithm is used for predicting AMPs against new strains, which the user provides by uploading the genomic sequence of the desired strain.

The Property Calculation section allows calculating the following physicochemical properties of peptides: Normalized Hydrophobic Moment, Normalized Hydrophobicity, Net Charge, Isoelectric Point, Penetration Depth, Tilt Angle, Disordered Conformation Propensity, Linear Moment, Propensity to in vitro Aggregation, Angle Subtended by the Hydrophobic Residues, Amphiphilicity Index, and Propensity to PPII coil.

Normalized Hydrophobic Moment – Peptide hydrophobic moment in the α-helical approximation. Calculated by the Eisenberg formula (Eisenberg et al., Nature 1982, 299, 371-374).

Normalized Hydrophobicity – Calculated as the ratio (a/b) of the sum of free energy of peptide amino acid residues transferred from the polar to the nonpolar environment (a) to the peptide sequence length (b).
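
The two definitions above can be illustrated with a short calculation. The sketch below uses the Kyte and Doolittle (KD) scale and the Eisenberg hydrophobic moment with a 100° angle between successive residues of an α-helix. Dividing the moment by the sequence length is an assumption about what "normalized" means here, and the function names are hypothetical, so the values need not match those reported in the Peptide Cards.

```python
import math

# Kyte and Doolittle hydropathy values (J. Mol. Biol. 1982, 157, 105-132).
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def normalized_hydrophobicity(sequence: str, scale: dict = KD) -> float:
    """Sum of per-residue scale values (a) divided by the sequence length (b)."""
    if not sequence:
        raise ValueError("empty sequence")
    return sum(scale[aa] for aa in sequence) / len(sequence)


def normalized_hydrophobic_moment(
    sequence: str, scale: dict = KD, angle_deg: float = 100.0
) -> float:
    """Eisenberg hydrophobic moment divided by the sequence length.

    Each residue n contributes a vector of length H_n at angle n * delta,
    where delta is 100 degrees for an ideal alpha-helix.
    """
    if not sequence:
        raise ValueError("empty sequence")
    delta = math.radians(angle_deg)
    sin_sum = sum(scale[aa] * math.sin(n * delta) for n, aa in enumerate(sequence))
    cos_sum = sum(scale[aa] * math.cos(n * delta) for n, aa in enumerate(sequence))
    # Assumption: normalization means division by the number of residues.
    return math.hypot(sin_sum, cos_sum) / len(sequence)
```

A sequence of 18 identical residues covers exactly five helical turns, so its contributions cancel and the moment is zero, whereas an amphipathic helix with hydrophobic residues on one face gives a large moment.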

Net Charge – The sum of charges of charged amino acid residues (K, R, D, E) of the peptide at neutral pH; calculated according to the method described in https://pepcalc.com/notes.php?all.
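
As a rough illustration of the definition above, the net charge can be approximated by counting charged residues. This is a simplified sketch with a hypothetical function name: it assigns +1 to K and R and -1 to D and E, and it ignores the termini and any partial charges, which the referenced method may handle differently.

```python
def approximate_net_charge(sequence: str) -> int:
    """Count-based net charge at neutral pH: (K + R) minus (D + E)."""
    positive = sum(sequence.count(aa) for aa in "KR")
    negative = sum(sequence.count(aa) for aa in "DE")
    return positive - negative
```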

Isoelectric Point – Calculated according to the method described in https://pepcalc.com/notes.php?all.

Penetration Depth – Calculated using the depth-dependent potential according to the method described in Senes et al., J. Mol. Biol. 2007, 366, 436-448, and Vishnepolsky and Pirtskhalava, J. Chem. Inf. Model. 2014, 54, 1512-1523.

Tilt Angle – Calculated using the depth-dependent potential according to the method described in Senes et al., J. Mol. Biol. 2007, 366, 436-448, and Vishnepolsky and Pirtskhalava, J. Chem. Inf. Model. 2014, 54, 1512-1523.

Disordered Conformation Propensity – Calculated based on the balance between hydrophobic and positively charged residues according to Uversky's formula (Uversky et al., Proteins: Struct. Funct. Gen. 2000, 41, 415-427).

Linear Moment – Describes the separation of hydrophobic and hydrophilic residues along the peptide chain; calculated according to the method described in Vishnepolsky and Pirtskhalava J. Chem. Inf. Model. 2014, 54, 1512-1523.

Propensity to in vitro Aggregation – Calculated based on the TANGO software (Fernandez-Escamilla et al. Nat. Biotechnol. 2004, 22, 1302-1306).

Angle Subtended by the Hydrophobic Residues – Calculated based on helical wheel representation of the polypeptide chain in the ideal α-helical approximation.

Amphiphilicity Index – Calculated as the ratio of the sum of amphiphilicity of peptide amino acid residues to the peptide sequence length (Kawashima and Kanehisa, Nucleic Acids Res. 2000, 28, 374).

Propensity to PPII coil – Calculated as the ratio (a/b) of the sum of propensity to PPII coil of peptide amino acid residues (a) to the peptide sequence length (b) (Adzhubei et al., Biochem. Biophys. Res. Commun. 1987, 146, 934-938).

The following properties, Normalized Hydrophobic Moment, Normalized Hydrophobicity, Linear Moment, and Angle Subtended by the Hydrophobic Residues, can be calculated for different hydrophobic scales: MF – Moon and Fleming scale (Moon C. P., Fleming K. G. Proc. Natl. Acad. Sci. U. S. A. 2011, 108 (25), 10174-10177), KD – Kyte and Doolittle scale (Kyte J., Doolittle R. F. J. Mol. Biol. 1982, 157 (1), 105-132), WW – Wimley and White scale (Wimley W. C., White S. H. Nat. Struct. Biol. 1996, 3, 842-848), EW – Eisenberg and Weiss scale (Eisenberg D. et al. Proc. Natl. Acad. Sci. U. S. A. 1984, 81 (1), 140-144), UH – Unified Hydrophobicity scale (Koehler J. et al. Proteins 2009, 76, 13-29), HW – Hessa and White scale (Hessa T. et al. Nature 2005, 433, 377-381).

Values of the properties given in the Peptide Cards have been calculated for the KD scale.

The features of AMPs, players of the innate immune system, should be considered while developing new peptide-based antimicrobials. Synergism is a widespread phenomenon among AMPs expressed in multicellular organisms. A designed AMP has to interact with the defense cocktail of the host organism. Therefore, using a combination of synergistically interacting peptides, or of an AMP and traditional antibiotics, can increase efficiency. Consequently, studying the mechanism of synergy is essential to estimate the efficiency of a designed AMP in combination with host defense peptides.

Thus, to develop a new peptide-based antimicrobial and to use a combination of an AMP with traditional antibiotics, it is valuable to have information on the synergy between AMPs and between AMPs and antibiotics. Synergy is currently receiving close attention, and the data on synergy between different AMPs and antibiotics is rapidly increasing. The current database version offers information on synergy in a dedicated table in the peptide card of each peptide for which corresponding data are available. The table holds information on the synergy between the peptide-card peptide and another peptide or antibiotic/antimicrobial agent. The data include the combined activity of the specific peptide from the database with another peptide or antimicrobial against a particular target strain, represented as the FICI (Fractional Inhibitory Concentration Index) value.
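
For two agents A and B, the FICI is conventionally computed as the sum of each agent's MIC in combination divided by its MIC alone. The sketch below shows that calculation; the function names are hypothetical, and the cut-off of 0.5 for synergy is a commonly used convention rather than something stated in this document, so check the original publication for the threshold behind each entry.

```python
def fici(
    mic_a_alone: float,
    mic_b_alone: float,
    mic_a_combined: float,
    mic_b_combined: float,
) -> float:
    """Fractional Inhibitory Concentration Index for agents A and B.

    All four MIC values must be in consistent units for each agent.
    """
    return mic_a_combined / mic_a_alone + mic_b_combined / mic_b_alone


def is_synergistic(fici_value: float, threshold: float = 0.5) -> bool:
    """Assumption: a FICI at or below the threshold is read as synergy."""
    return fici_value <= threshold
```

For example, if a peptide's MIC drops from 16 to 4 µg/ml and an antibiotic's MIC drops from 8 to 2 µg/ml in combination, the FICI is 0.25 + 0.25 = 0.5.
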
NCBI Taxonomy database synonyms