Assessing and Improving Protein-Protein Interaction Prediction in E. coli

Abstract
  • This thesis evaluates and extends the state of the art in sequence-based binary protein-protein interaction (PPI) prediction for bacterial species. Accurate PPI prediction for bacteria helps researchers quickly identify targets for antimicrobial drug development and expands knowledge of bacterial interactomes. E. coli is used here as a model bacterial organism. A systematic and unbiased evaluation of four classifiers (SPRINT, DPPI, DEEPFE, and PIPR) is conducted on new E. coli datasets. The classifiers are then enhanced using a stacked reciprocal perspective (RP) classifier, a technique recently developed by the cuBIC lab. In cross-validation, the area under the precision-recall curve (auPR) improves by 16.6% compared to the best base classifier, and by 262.5% under a 1:100 positive-to-negative sample imbalance. The results also indicate the need for new benchmark datasets, more bacterial PPI data, and consistent evaluation protocols for new PPI predictions.

Rights Notes
  • Copyright © 2022 the author(s). Theses may be used for non-commercial research, educational, or related academic purposes only. Such uses include personal study, research, scholarship, and teaching. Theses may only be shared by linking to Carleton University Institutional Repository and no part may be used without proper attribution to the author. No part may be used for commercial purposes directly or indirectly via a for-profit platform; no adaptation or derivative works are permitted without consent from the copyright owner.
Date Created
  • 2022

