PepX

The Peptide eXpression annotator (PepX) takes a peptide as input, identifies the proteins from which the peptide can be derived, and returns an estimate of the expression level of those source proteins from selected public databases (“Peptide/Gene Summary” tab on the results page).

PepX also aggregates those expression levels to provide an estimate of the abundance level of the peptide itself (“Peptide Summary” tab on the results page).

The PepX database currently contains all peptides derived from the Ensembl GRCh38 (release 106) FASTA file. Every possible peptide was generated from the sequences in this file, excluding peptides that are shorter than 8 amino acids or contain ‘X’.
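The peptide derivation described above can be sketched as a sliding window over each protein sequence. This is a minimal illustration, not the PepX implementation: the function name, the toy sequence, and the upper length bound `MAX_LENGTH` are assumptions (the text states only the minimum length of 8 and the exclusion of ‘X’).

```python
from typing import Iterator

MIN_LENGTH = 8   # from the text: peptides shorter than 8 residues are excluded
MAX_LENGTH = 10  # hypothetical upper bound, chosen only to keep the example small


def enumerate_peptides(protein: str,
                       min_len: int = MIN_LENGTH,
                       max_len: int = MAX_LENGTH) -> Iterator[str]:
    """Yield every substring of `protein` whose length lies in
    [min_len, max_len] and that does not contain the residue 'X'."""
    for length in range(min_len, max_len + 1):
        for start in range(len(protein) - length + 1):
            peptide = protein[start:start + length]
            if "X" not in peptide:
                yield peptide


# Toy sequence for illustration only; it is not a real Ensembl entry.
toy_protein = "MQKEITALAPXSTMK"
peptides = sorted(set(enumerate_peptides(toy_protein)))
print(peptides)
```

Every window that overlaps the ‘X’ residue is dropped, so only peptides from the stretch before it survive in this toy case.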

Parameter selection

PepX Parameters

  • Quantitation Level

    • Selects whether gene-level or transcript-level TPM values are retrieved from the expression data.

  • Data Source / Dataset

    • HPA

      • Pre-calculated gene-level and transcript-level TPM values for 256 healthy tissues were downloaded from the Human Protein Atlas (HPA) (Uhlen et al., 2010).

    • GTEx

      • Pre-calculated gene-level and transcript-level TPM values for 54 healthy tissue subtypes were downloaded from The Genotype-Tissue Expression (GTEx) project data portal (Carithers and Moore, 2015). Median TPM values were calculated for each of the 31 main tissue types.

    • TCGA

      • Pre-calculated gene-level and transcript-level TPM values for the TCGA Pan-cancer cohort for 33 cancer types were downloaded from the UCSC Xena data pages (Goldman et al., 2020).

    • CCLE

      • Pre-calculated gene-level and transcript-level TPM values for 1,019 cell lines were downloaded from the Cancer Cell Line Encyclopedia (CCLE) (Ghandi et al., 2019).

    • RNA-Seq data of a B721.221 cell line (Abelin et al., 2017)

      • Only gene-level expression data is available.

    • HeLa cell line (Cantarella et al., 2019).

      • Only gene-level expression data is available.

    Note

    All datasets were downloaded in July 2022.
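For the GTEx dataset, the median calculation per main tissue type can be sketched as a group-by followed by a median. The text does not say which values the median is taken over; this sketch assumes it is taken over the subtype-level TPM values that belong to each main tissue type. The TPM numbers and the function name are made up for illustration.

```python
from collections import defaultdict
from statistics import median

# Toy TPM values for a single gene, keyed by (main tissue type, tissue subtype).
# The numbers are invented; only the grouping logic is of interest.
subtype_tpm = {
    ("Brain", "Brain - Cortex"): 12.0,
    ("Brain", "Brain - Cerebellum"): 30.0,
    ("Brain", "Brain - Hippocampus"): 18.0,
    ("Heart", "Heart - Left Ventricle"): 5.0,
    ("Heart", "Heart - Atrial Appendage"): 7.0,
}


def median_by_main_tissue(values: dict) -> dict:
    """Group subtype-level TPMs by main tissue type and take the median
    (assumed aggregation; see the note above)."""
    grouped = defaultdict(list)
    for (main_tissue, _subtype), tpm in values.items():
        grouped[main_tissue].append(tpm)
    return {tissue: median(tpms) for tissue, tpms in grouped.items()}


main_tissue_tpm = median_by_main_tissue(subtype_tpm)
print(main_tissue_tpm)
```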

Results

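The PepX tool returns two tables for the selected quantitation level. At the gene level, these are the Peptide Table (Gene) and the Peptide Gene Summary Table; at the transcript level, they are the Peptide Table (Transcript) and the Peptide Transcript Summary Table.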

1. Peptide Table (Gene)

This table has 1 row per input peptide. Data for all genes in which the peptide is found are collapsed here.

PepX Result 1

By default, the table shows only Total Peptide TPM and Gene Symbols. Additional columns can be enabled through Display Columns.

Many of the fields are lists of values derived from the peptide/gene summary table, where you will find associated descriptions.

Field | Description | Example
--- | --- | ---
Peptide | Peptide sequence | MQKEITAL
Gene Symbol | List of gene symbols where peptide is found. | ACTB;ACTA2;ACTA1;ACTC1;ACTG2;ACTG1
Total Peptide TPM | Sum of Peptide TPMs for all genes. | 9093.988
Median Peptide TPM | Median Peptide TPM for all genes. | 37.008
Total Scaled Peptide TPM | Sum of Scaled Peptide TPMs for all genes. | 6498.506
Median Scaled Peptide TPM | Median Scaled Peptide TPM for all genes. | 12.408
Gene ENSG IDs | List of corresponding Ensembl gene identifiers. | ENSG00000075624;ENSG00000107796;ENSG00000143632;ENSG00000159251;ENSG00000163017;ENSG00000184009
Gene TPMs | List of Gene TPMs for corresponding genes. | 5209;73.763;0.252;0.0045;0.048;3810.92
Peptide TPMs | List of Peptide TPMs for corresponding genes. | 5209.000;73.763;0.252;0.005;0.048;3810.920
Scaled Peptide TPMs | List of Scaled Peptide TPMs for corresponding genes. | 3062.892;24.563;0.252;0.005;0.021;3410.773
Proteins Encoded by Gene | List of 'Proteins Encoded by Gene' for corresponding genes. | 17;3;3;1;7;19
Proteins Containing Peptide (per Gene) | List of 'Proteins Containing Peptide' for corresponding genes. | 10;1;3;1;3;17
Fraction of Proteins Containing Peptide (per Gene) | List of 'Fraction of Matching Proteins' for corresponding genes. | 0.588;0.333;1.000;1.000;0.429;0.895
Gene Mean Occurrences per Protein | List of 'Mean Occurrences per Protein' for corresponding genes. | 1.000;1.000;1.000;1.000;1.000;1.000
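The total and median fields in the table above are plain aggregates of the per-gene lists. The sketch below parses the ';'-separated example cells for peptide MQKEITAL and recomputes the four summary values. The helper name `parse_list` is an assumption made for illustration and is not part of PepX.

```python
from statistics import median


def parse_list(cell: str) -> list[float]:
    """Split a ';'-separated table cell into a list of floats."""
    return [float(value) for value in cell.split(";")]


# Example cells copied from the table above (peptide MQKEITAL).
peptide_tpms = parse_list("5209.000;73.763;0.252;0.005;0.048;3810.920")
scaled_peptide_tpms = parse_list("3062.892;24.563;0.252;0.005;0.021;3410.773")

summary = {
    "Total Peptide TPM": sum(peptide_tpms),
    "Median Peptide TPM": median(peptide_tpms),
    "Total Scaled Peptide TPM": sum(scaled_peptide_tpms),
    "Median Scaled Peptide TPM": median(scaled_peptide_tpms),
}

for field, value in summary.items():
    print(f"{field}: {value:.3f}")
```

Because the list cells are already rounded to three decimals, the recomputed values agree with the Example column only up to rounding of the last digit.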



2. Peptide Gene Summary Table

This table has 1 row per input peptide and matched gene.

PepX Result 2

Field | Description | Example
--- | --- | ---
Peptide | Peptide sequence | HETTFNSI
Gene ENSG ID | Ensembl gene identifier | ENSG00000075624
Gene Symbol | HGNC gene symbol | ACTB
Proteins Encoded by Gene | Number of proteins/transcripts associated with the gene | 17
Proteins Containing Peptide | Number of proteins/transcripts associated with the gene that also contain the peptide | 9
Fraction of Matching Proteins | Fraction of proteins/transcripts associated with the gene that also contain the peptide | 0.529
Mean Occurrences per Protein | The total number of occurrences of this peptide divided by 'Proteins Containing Peptide'. This will usually be 1 except in unusual circumstances (e.g., low-complexity peptides, repetitive genes). | 1
Gene TPM | TPM of the gene | 5209
Peptide TPM | Gene TPM x Mean Occurrences per Protein | 5209
Scaled Peptide TPM | Gene TPM x Fraction of Matching Proteins | 2755.561
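The derived fields of a single peptide/gene row can be sketched from the definitions above. The class and attribute names are hypothetical. One detail is inferred from the example values rather than stated in the text: 2755.561 equals 5209 x 0.529, not 5209 x 9/17, so the sketch rounds the fraction to three decimals before multiplying. Whether PepX does this internally is an assumption.

```python
from dataclasses import dataclass


@dataclass
class PeptideGeneRow:
    """One peptide/gene match (hypothetical structure, for illustration)."""
    proteins_encoded: int      # Proteins Encoded by Gene
    proteins_containing: int   # Proteins Containing Peptide
    total_occurrences: int     # occurrences summed over matching proteins
    gene_tpm: float            # Gene TPM

    @property
    def fraction_matching(self) -> float:
        # Rounded to 3 decimals to reproduce the example (assumption).
        return round(self.proteins_containing / self.proteins_encoded, 3)

    @property
    def mean_occurrences(self) -> float:
        return self.total_occurrences / self.proteins_containing

    @property
    def peptide_tpm(self) -> float:
        # Peptide TPM = Gene TPM x Mean Occurrences per Protein
        return self.gene_tpm * self.mean_occurrences

    @property
    def scaled_peptide_tpm(self) -> float:
        # Scaled Peptide TPM = Gene TPM x Fraction of Matching Proteins
        return round(self.gene_tpm * self.fraction_matching, 3)


# Example row from the table above: HETTFNSI in ACTB.
row = PeptideGeneRow(proteins_encoded=17, proteins_containing=9,
                     total_occurrences=9, gene_tpm=5209)
print(row.fraction_matching, row.peptide_tpm, row.scaled_peptide_tpm)
```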



3. Peptide Table (Transcript)

This table has 1 row per input peptide. Data for all transcripts in which the peptide is found are collapsed here.

PepX Result 3

Field | Description | Example
--- | --- | ---
Peptide | See peptide summary for genes. | MQKEITAL
Gene Symbols | See peptide summary for genes. | ACTA1;ACTA2;ACTB;ACTC1;ACTG1;ACTG2
Total Peptide TPM | Sum of the Peptide TPMs for all transcripts in all genes where the peptide occurs. | 5815.28
Median Peptide TPM | Median Peptide TPM over all transcripts in all genes in which the peptide occurs. | 0.83
Number of Genes | Number of genes with transcripts encoding the peptide. | 6
Number of Transcripts | Number of transcripts encoding the peptide. | 35
Gene ENSG IDs | See peptide summary for genes. | ENSG00000075624;ENSG00000107796;ENSG00000143632;ENSG00000159251;ENSG00000163017;ENSG00000184009
Protein ENSP IDs | List of Ensembl protein identifiers containing the peptide. | ENSP00000224784;ENSP00000290378;ENSP00000295137;ENSP00000355644;ENSP00000355645;ENSP00000386857;ENSP00000386929;ENSP00000407473;ENSP00000458162;ENSP00000458435;ENSP00000459119;ENSP00000459124;ENSP00000460464;ENSP00000460660;ENSP00000461407;ENSP00000461672;ENSP00000466346;ENSP00000477968;ENSP00000493648;ENSP00000494269;ENSP00000494750;ENSP00000495059;ENSP00000495995;ENSP00000496101;ENSP00000501773;ENSP00000501862;ENSP00000502286;ENSP00000502821;ENSP00000505060;ENSP00000505193;ENSP00000505235;ENSP00000506126;ENSP00000506201;ENSP00000506253;ENSP00000508084
Number of Transcript Occurrences | List of 'Number of Occurrences' for corresponding transcripts. | 1;1;1;1;1;1;1;1;1;1;1;1;1;1;1;1;1;1;1;1;1;1;1;1;1;1;1;1;1;1;1;1;1;1;1
Transcript TPMs | List of individual Transcript TPM values. | 0.37;0;0;0.04;0;0;0;14.62;1.26;2951.5;0.48;0.31;0;1.48;78.18;156.79;8.04;0.16;34.47;5.07;2468.8;0.69;1.54;39.26;1.37;0;34.65;6.61;2.69;5.95;0.12;0;0.83;0;0
Transcript Peptide TPMs | List of individual Peptide TPM values. | 0.370;0.000;0.000;0.040;0.000;0.000;0.000;14.620;1.260;2951.500;0.480;0.310;0.000;1.480;78.180;156.790;8.040;0.160;34.470;5.070;2468.800;0.690;1.540;39.260;1.370;0.000;34.650;6.610;2.690;5.950;0.120;0.000;0.830;0.000;0.000
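At the transcript level, each Peptide TPM is the Transcript TPM multiplied by the number of occurrences of the peptide in that transcript, and the summary fields are the sum and median over all matching transcripts. The sketch below recomputes them from the example cells above. Variable names are illustrative only.

```python
from statistics import median

# Example cells copied from the table above (peptide MQKEITAL, 35 transcripts).
transcript_tpms_cell = (
    "0.37;0;0;0.04;0;0;0;14.62;1.26;2951.5;0.48;0.31;0;1.48;78.18;156.79;"
    "8.04;0.16;34.47;5.07;2468.8;0.69;1.54;39.26;1.37;0;34.65;6.61;2.69;"
    "5.95;0.12;0;0.83;0;0"
)
occurrences_cell = ";".join(["1"] * 35)

transcript_tpms = [float(v) for v in transcript_tpms_cell.split(";")]
occurrences = [int(v) for v in occurrences_cell.split(";")]

# Peptide TPM = Transcript TPM x Number of Occurrences, per transcript.
transcript_peptide_tpms = [
    tpm * count for tpm, count in zip(transcript_tpms, occurrences)
]

total_peptide_tpm = sum(transcript_peptide_tpms)
median_peptide_tpm = median(transcript_peptide_tpms)

print(f"Number of Transcripts: {len(transcript_peptide_tpms)}")
print(f"Total Peptide TPM: {total_peptide_tpm:.2f}")
print(f"Median Peptide TPM: {median_peptide_tpm:.2f}")
```

Note that the median is low even though the total is high: most of the 35 transcripts have little or no expression, and two transcripts account for most of the total.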



4. Peptide Transcript Summary Table

This table has 1 row per input peptide and matched transcript.

PepX Result 4

Field | Description | Example
--- | --- | ---
Peptide | See peptide/gene summary. | HETTFNSI
Gene ENSG ID | See peptide/gene summary. | ENSG00000184009
Protein ENSP ID | Ensembl protein identifier. | ENSP00000458435
Gene Symbol | See peptide/gene summary. | ACTG1
Number of Occurrences | The number of times the peptide appears in the transcript/protein. In most cases, this will be 1. | 1
Transcript TPM | TPM of the transcript. | 2951.5
Peptide TPM | Transcript TPM x Number of Occurrences. | 2951.5