Critical Assessment of Fully Automated Structure Prediction


EVALUATION OF CAFASP-DP METHODS


Number of targets evaluated: 27
Number of single-domain targets: 16
Number of two-domain targets: 11


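The performance of the domain prediction methods has been evaluated using the following measures: the absolute number of correctly predicted targets, sensitivity and specificity, and the overlap score.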

Separate values are computed for the single-domain targets, the two-domain targets and all targets.

Ten CAFASP-DP methods have been evaluated:

  1. Adda
  2. Armadillo
  3. Biozon
  4. Dompred-domssea
  5. Dompred-dps
  6. Dopro
  7. Mateo
  8. SSep
  9. Robetta-ginzu
  10. Robetta-rosettadom


Absolute Number of Correctly Predicted Targets

    In addition to the 10 domain prediction servers, three controls have been used: Control1, Control2 and Random. Control1 predicts every target as a single-domain protein. Control2 predicts every target as a two-domain protein with the domain boundary at the centre of the sequence. The Random control chooses between Control1 and Control2 at random. Finally, a consensus prediction has been calculated by majority vote, with a weighting scheme applied in case of a tie.

    A prediction is considered correct if the number of predicted domains is correct. For two-domain targets, the predicted domains must also be continuous, i.e., with no split domains.

    Number of targets in each prediction category, per method:

    Methods             Predicted as    Predicted       Predicted as    Predicted       Predicted as multi-domain
                        single-domain   correctly as    two-domain      correctly as    (number of domains > 2)
                                        single-domain                   two-domain      or split-domains
    Control1            27              16              0               0               0
    Control2            0               0               27              11              0
    Random              14              8               13              5               0
    ADDA                21              12              5               1               1
    Armadillo           3               2               9               3               12
    Biozon              2               2               12              4               13
    Dompred-Domssea     23              15              4               3               0
    Dompred-DPS         17              12              9               5               1
    Dopro               18              13              8               5               1
    Mateo               11              8               4               2               12
    SSep                20              14              6               4               1
    Robetta-Ginzu       10              9               13              6               4
    Robetta-Rosettadom  13              11              11              6               3
    CONSENSUS           18              13              7               5               2
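
    As a rough illustration of the three controls and the correctness criterion described above, the following Python sketch may help. The representation of a prediction (a list of domains, each a list of 1-based inclusive segments), the function names, the placeholder sequence length and the equal-probability choice in the Random control are assumptions made for this example; they are not taken from the CAFASP-DP evaluation code.

```python
import random

# Hypothetical representation: a prediction is a list of domains, and each
# domain is a list of (start, end) segments, 1-based and inclusive.

def control1(length):
    """Control1: one domain covering the whole sequence."""
    return [[(1, length)]]

def control2(length):
    """Control2: two domains with the boundary at the centre of the sequence."""
    mid = length // 2
    return [[(1, mid)], [(mid + 1, length)]]

def random_control(length, rng=random):
    """Random: pick Control1 or Control2 (equal probability is an assumption)."""
    return rng.choice([control1, control2])(length)

def is_correct(prediction, true_domain_count):
    """Correct if the domain count matches; two-domain targets also need
    continuous (unsplit) domains."""
    if len(prediction) != true_domain_count:
        return False
    if true_domain_count == 2:
        return all(len(segments) == 1 for segments in prediction)
    return True

# 16 single-domain and 11 two-domain targets, as in the evaluation set.
# The sequence length of 200 is a placeholder; it does not affect the counts.
true_counts = [1] * 16 + [2] * 11
control1_correct = sum(is_correct(control1(200), n) for n in true_counts)
control2_correct = sum(is_correct(control2(200), n) for n in true_counts)
print(control1_correct, control2_correct)  # prints: 16 11
```

    Under these assumptions the two fixed controls reproduce the 16 and 11 correct predictions listed for Control1 and Control2.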

Sensitivity and Specificity for Single-Domain, Two-Domain and All Targets

    For each method, sensitivity (Sen) and specificity (Spec) have been calculated and plotted separately for single-domain targets, for two-domain targets and for all targets, as

    Sen = TP / (TP + FN)

    Spec = TP / (TP + FP)

    where TP is the number of true positives, FN the number of false negatives and FP the number of false positives.
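
    The two formulas can be illustrated with a small worked example in Python. How TP, FN and FP map onto the counts reported in this evaluation is an assumption of the sketch: for the single-domain class, TP is taken as the number of targets predicted correctly as single-domain, FN as the remaining single-domain targets, and FP as the other targets predicted as single-domain. The function names are illustrative. The ADDA counts (21 targets predicted as single-domain, 12 of them correctly, out of 16 single-domain targets) come from the table of correctly predicted targets.

```python
def sensitivity(tp, fn):
    """Sen = TP / (TP + FN); returns None when the denominator is zero."""
    total = tp + fn
    return tp / total if total else None

def specificity(tp, fp):
    """Spec = TP / (TP + FP), as defined in this evaluation; None if undefined."""
    total = tp + fp
    return tp / total if total else None

# Assumed mapping for the single-domain class, using the ADDA counts.
n_single_targets = 16      # single-domain targets in the evaluation set
predicted_single = 21      # targets ADDA predicted as single-domain
correct_single = 12        # of those, correctly predicted

tp = correct_single
fn = n_single_targets - correct_single   # single-domain targets that were missed
fp = predicted_single - correct_single   # other targets predicted as single-domain

sen = sensitivity(tp, fn)
spec = specificity(tp, fp)
print(f"Sen = {sen:.3f}, Spec = {spec:.3f}")  # Sen = 0.750, Spec = 0.571
```

    Spec as defined here is the quantity often called precision or positive predictive value. The helpers return None when a method makes no prediction in a class, as Control1 does for two-domain targets.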

Overlap Score (Single-Domain Targets)

    Targets Control1 Control2 Random Adda Armadillo Biozon Dompred-domssea Dompred-dps Dopro Mateo SSep Robetta-ginzu Robetta-rosettadom Consensus
    T0196 100.00 50.00 100.00 92.24 50.86 56.90 100.00 100.00 74.13 93.97 74.14 100.00 100.00 100.00
    T0198 100.00 50.21 50.21 100.00 X 48.09 100.00 54.46 87.23 97.02 100.00 53.62 100.00 100.00
    T0203 100.00 50.00 100.00 99.48 23.04 45.55 100.00 100.00 100.00 98.17 100.00 65.71 65.71 100.00
    T0204 100.00 50.14 50.14 98.86 27.20 27.64 60.40 61.25 52.14 41.88 56.13 55.84 55.84 56.98
    T0205 100.00 50.00 100.00 83.85 100.00 74.62 100.00 100.00 67.69 93.85 100.00 100.00 100.00 100.00
    T0207 100.00 50.00 50.00 91.03 100.00 100.00 100.00 100.00 96.15 91.03 97.44 100.00 100.00 100.00
    T0208 100.00 50.14 100.00 48.18 57.42 58.82 100.00 62.19 96.08 77.87 97.48 75.63 75.63 62.66
    T0212 100.00 50.00 50.00 96.03 52.38 59.52 100.00 100.00 93.65 94.44 96.03 100.00 100.00 100.00
    T0231 100.00 50.00 100.00 99.30 59.16 54.93 100.00 100.00 93.66 100.00 93.66 100.00 100.00 100.00
    T0234 100.00 50.30 50.30 100.00 21.97 67.27 100.00 100.00 100.00 42.42 100.00 100.00 100.00 100.00
    T0236 100.00 50.00 100.00 96.36 52.73 52.73 100.00 100.00 100.00 51.82 100.00 61.82 100.00 100.00
    T0246 100.00 50.00 50.00 98.59 27.01 40.96 100.00 100.00 100.00 35.31 100.00 100.00 100.00 100.00
    T0247 100.00 50.00 100.00 97.80 40.11 40.93 100.00 59.07 74.73 100.00 98.35 79.95 79.95 79.95
    T0252 100.00 50.00 50.00 98.71 27.10 56.13 100.00 100.00 99.36 26.12 96.77 57.10 57.10 100.00
    T0254 100.00 50.00 100.00 100.00 42.06 100.00 100.00 100.00 100.00 70.09 100.00 100.00 100.00 100.00
    T0270 100.00 50.20 50.20 91.97 47.39 46.19 100.00 100.00 90.36 46.19 48.19 100.00 100.00 100.00
    Average 100.00 50.09 75.05 88.85 42.26 50.15 97.53 87.73 85.97 59.89 91.14 79.22 86.44 92.22
    *X: No prediction
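
    The source does not spell out how the overlap score is computed. One plausible reading, consistent with Control1 scoring 100 and Control2 scoring about 50 on every single-domain target, is the percentage of residues assigned to the same domain in the prediction and in the reference, after matching predicted domains to reference domains in the most favourable way. The Python sketch below implements that assumed definition only; the function name, the per-residue label encoding and the 235-residue example length are all hypothetical.

```python
from itertools import permutations

def overlap_score(true_labels, pred_labels):
    """Assumed definition: percentage of residues with matching domain labels
    under the best one-to-one mapping of predicted to reference domains."""
    if len(true_labels) != len(pred_labels):
        raise ValueError("label lists must have the same length")
    true_ids = sorted(set(true_labels))
    pred_ids = sorted(set(pred_labels))
    # Pad the reference ids with None so every predicted domain maps to
    # either a distinct reference domain or to nothing.
    size = max(len(true_ids), len(pred_ids))
    padded_true = true_ids + [None] * (size - len(true_ids))
    best = 0
    for perm in permutations(padded_true):
        mapping = dict(zip(pred_ids, perm))
        matches = sum(1 for t, p in zip(true_labels, pred_labels)
                      if mapping[p] == t)
        best = max(best, matches)
    return 100.0 * best / len(true_labels)

# Hypothetical single-domain target of 235 residues.
length = 235
reference = [1] * length
control1_labels = [1] * length                                  # one domain
control2_labels = [1] * (length // 2) + [2] * (length - length // 2)  # centre split

score1 = overlap_score(reference, control1_labels)
score2 = overlap_score(reference, control2_labels)
print(round(score1, 2), round(score2, 2))  # prints: 100.0 50.21
```

    The brute-force search over mappings is only practical for the small domain counts considered here.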

Plot of overlap score vs the percentage of correctly predicted single-domain targets

Overlap Score (Two-Domain Targets)

    Targets Control1 Control2 Random Adda Armadillo Biozon Dompred-domssea Dompred-dps Dopro Mateo SSep Robetta-ginzu Robetta-rosettadom Consensus
    T0199 74.67 75.15 74.67 74.56 X 98.82 91.42 92.01 70.41 69.53 88.17 92.01 63.61 89.65
    T0202 61.85 87.95 87.95 61.85 X 61.04 61.85 61.85 54.62 61.85 61.85 43.37 67.07 61.85
    T0209 50.63 99.58 50.63 87.87 60.25 75.73 50.63 87.87 67.78 79.50 88.70 89.12 50.63 89.12
    T0216 51.49 98.39 98.39 51.49 66.21 54.02 51.49 79.77 55.63 97.47 73.33 63.91 86.44 88.28
    T0222 84.72 65.42 84.72 84.72 59.25 34.32 98.39 87.94 84.72 84.72 96.78 93.30 93.30 93.30
    T0223 58.25 91.26 91.26 58.25 84.47 89.81 58.25 58.25 58.25 58.25 58.25 58.25 58.25 58.25
    T0225 52.14 97.86 52.14 51.79 81.07 55.00 52.14 52.14 29.29 52.14 52.14 70.36 70.36 52.14
    T0228 65.04 84.85 84.85 64.80 50.58 54.55 95.80 46.85 63.40 82.75 61.77 85.32 85.32 85.32
    T0229 68.12 81.88 68.12 68.12 92.03 68.12 68.12 68.12 84.06 68.12 68.12 94.93 94.93 92.03
    T0235 75.95 73.95 73.95 64.13 59.92 65.33 75.95 75.75 48.70 91.18 96.59 47.90 47.90 47.90
    T0255 56.60 93.08 56.60 56.60 74.84 90.57 56.60 56.60 74.84 45.28 49.06 95.60 95.60 56.60
    Average 63.59 86.31 74.84 65.83 57.15 67.94 69.15 69.74 62.88 71.89 72.25 75.82 73.94 74.04
    *X: No prediction

Plot of overlap score vs the percentage of correctly predicted two-domain targets

Plot of average overlap score vs the percentage of correctly predicted targets (all targets)

Prediction Performance Separately on Homology Modelling (HM) and Fold Recognition (FR) Targets

    Number of HM targets: 16
    Number of HM targets which are single-domain: 10
    Number of HM targets which are two-domain: 6

    Number of FR targets: 11
    Number of FR targets which are single-domain: 6
    Number of FR targets which are two-domain: 5

    Number of targets predicted correctly as single-domain or two-domain:

    Methods             HM targets                      FR targets
                        Single-domain   Two-domain      Single-domain   Two-domain
    Control1            10              0               6               0
    Control2            0               6               0               5
    Random              5               2               3               3
    ADDA                8               0               4               1
    Armadillo           2               3               0               0
    Biozon              1               3               1               1
    Dompred-Domssea     9               1               6               2
    Dompred-DPS         7               1               5               4
    Dopro               7               2               6               3
    Mateo               5               2               3               0
    SSep                9               2               5               2
    Robetta-Ginzu       6               4               3               2
    Robetta-Rosettadom  6               4               5               2
    CONSENSUS           7               2               6               3
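
    To compare methods across the two target categories, the counts in the table can be turned into percentages of correctly predicted targets. The following Python sketch does this for two of the methods. The dictionary layout and the function name are illustrative choices; the counts and category sizes are those reported above.

```python
# Category sizes reported above: (single-domain, two-domain) targets.
TARGETS = {"HM": (10, 6), "FR": (6, 5)}

# Correct predictions as (single-domain, two-domain), copied from the table.
CORRECT = {
    "SSep": {"HM": (9, 2), "FR": (5, 2)},
    "Robetta-Ginzu": {"HM": (6, 4), "FR": (3, 2)},
}

def percent_correct(method, category):
    """Percentage of targets in a category whose domain number was predicted correctly."""
    correct = sum(CORRECT[method][category])
    total = sum(TARGETS[category])
    return 100.0 * correct / total

for method in CORRECT:
    hm = percent_correct(method, "HM")
    fr = percent_correct(method, "FR")
    print(f"{method}: HM {hm:.2f}%, FR {fr:.2f}%")
```

    For SSep this gives 68.75% on HM targets and 63.64% on FR targets.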

Plot of Sensitivity vs Specificity for HM Targets

Plot of Sensitivity vs Specificity for FR Targets