COVID-19: predicted interactome

The 2019-2020 COVID-19 pandemic resulting from infection with the SARS-CoV-2 virus has claimed hundreds of thousands of lives. There is currently no effective treatment for individuals afflicted, and as such, there is a pressing need for therapies that can reduce the burden caused by infections with the virus.

While multiple groups are aiming to tackle this challenge through vaccine development, we believe that peptide therapeutics targeting proteins of the SARS-CoV-2 proteome hold great promise. Indeed, peptides have the potential to specifically disrupt protein-protein interactions (PPIs) involving SARS-CoV-2 proteins and human proteins. One of the most easily targetable interactions is that of hACE2 and Spike. Other interactions may constitute targets of high therapeutic value, and as such, we sought to predict the entire SARS-CoV-2-human interactome to identify novel targetable interactions. We ran PIPE4 and SPRINT on state-of-the-art computer clusters to predict the full interactome. This high-quality dataset is now freely available for all to use.

In addition to predicting the interactome, we used PIPE-Sites to home in on the regions that are likely to mediate the interactions.

Our hybrid workflow combines high-performance computing with wet-lab experiments to identify and validate novel interactions, with the goal of designing therapeutic peptides that disrupt interactions between SARS-CoV-2 and human proteins.

Description of the dataset


You may consult our pre-print on bioRxiv for additional details on the dataset.

Project contributors

Kevin Dick

Kevin Dick is currently pursuing a PhD in biomedical engineering, specializing in data science and bioinformatics, as part of the Carleton University Biomedical Informatics Colaboratory (cuBIC) in Ottawa, Canada. His research interests include data science, machine learning, high-performance computing, secondary use of autonomous vehicle data, and scientometrics.

Francois Charih

François Charih is currently a PhD student in Electrical and Computer Engineering in the Carleton University Biomedical Informatics Colaboratory (cuBIC). His research interests include bioinformatics, applied machine learning and software development for applications in personalized medicine. Other areas of expertise include cloud computing, web development and science outreach.

Kyle K. Biggar

Dr. Biggar is an assistant professor in the Department of Biology and Institute of Biochemistry at Carleton University. His research focuses on the discovery and characterization of novel peptide inhibitors of protein interaction and function. Current research projects include the prediction and functional characterization of peptide inhibitors as novel cancer therapeutics; the use of systematic peptide arrays to study how proteins interact with each other, including how enzymes recognize substrates; and how reversible post-translational lysine methylation of histone and non-histone proteins regulates protein-protein interactions and function. Dr. Biggar is also the director of the Carleton Functional Proteomics Facility.

James R. Green

Dr. Green is a full professor in the Department of Systems and Computer Engineering at Carleton University. His research focuses on the application of machine learning to challenges in biomedical informatics, particularly in the presence of class imbalance. Current research projects include the prediction of protein structure, function, and interaction; the use of supervised and semi-supervised machine learning for the identification of microRNA in unique species; unobtrusive and non-contact neonatal patient monitoring; developing ML for audiology; applying computer vision to autonomous vehicle imagery; and the acceleration of scientific computing using parallel computing.


If you use our dataset in your work, please cite the following references:

	author = {Dick, Kevin and Biggar, Kyle K. and Green, James R.},
	publisher = {Scholars Portal Dataverse},
	title = "{Comprehensive Prediction of the SARS-CoV-2 vs. Human Interactome using PIPE4, SPRINT, and PIPE-Sites}",
	UNF = {UNF:6:LaU8vpF6y1UavvDQrlXzpg==},
	year = {2020},
	version = {V1},
	doi = {10.5683/SP2/JZ77XA},
	url = {}

Download the PPI dataset

Our dataset consists of the following:

  • The SARS-CoV-2-human interactome predicted with PIPE4 (one .tab file per SARS-CoV-2 protein)
  • The SARS-CoV-2-human interactome predicted with SPRINT (one .tab file per SARS-CoV-2 protein)
  • Matrices (.mat files) describing the contribution score of amino acids to the interactions
  • Heatmaps (.pdf files) illustrating the interacting regions of interacting proteins

Our data is made available to researchers around the world through the Scholars Portal Dataverse.
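As a starting point for working with the per-protein .tab files, the following minimal Python sketch reads tab-delimited predictions and ranks them by score. The column layout (viral protein ID, human protein ID, numeric score) is an assumption made for illustration and should be checked against the downloaded files; the function names and the sample rows are hypothetical placeholders, not values from the dataset.

```python
import csv
import io
from typing import Iterable, List, Tuple

Prediction = Tuple[str, str, float]


def read_predictions(handle: Iterable[str]) -> List[Prediction]:
    """Parse tab-delimited rows into (viral_id, human_id, score) tuples.

    ASSUMPTION: the first three columns are viral protein ID, human
    protein ID, and a numeric score. Verify this against the real files.
    """
    rows: List[Prediction] = []
    for record in csv.reader(handle, delimiter="\t"):
        if len(record) < 3:
            continue  # skip blank or truncated lines
        try:
            score = float(record[2])
        except ValueError:
            continue  # skip a header line or a malformed score
        rows.append((record[0], record[1], score))
    return rows


def top_predictions(rows: List[Prediction], n: int = 10) -> List[Prediction]:
    """Return the n highest-scoring predicted pairs."""
    return sorted(rows, key=lambda row: row[2], reverse=True)[:n]


# Placeholder content standing in for one downloaded .tab file.
example = io.StringIO(
    "viral\thuman\tscore\n"
    "VIRAL_A\tHUMAN_1\t0.91\n"
    "VIRAL_A\tHUMAN_2\t0.12\n"
    "VIRAL_B\tHUMAN_3\t0.47\n"
)
rows = read_predictions(example)
print(top_predictions(rows, n=2))
```

To use it on a real file, pass an open file handle instead of the in-memory example, for instance `with open(path, newline="") as handle: rows = read_predictions(handle)`, where `path` points to one of the downloaded .tab files.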