Abstract:
Drug resistance to almost all known antimalarials is widespread and rapidly increasing. This resistance
is due to the overuse and misuse of these antimalarials; new antimalarial drugs are therefore necessary
to help prevent and cure this widespread disease. In-depth studies are continually being
done on a handful of putative targets for future exploitation, but few resources are available
that focus on data mining and target identification across the complete malaria genome
and relate the results to chemical compounds.
The DISCOVERY Database is a web-based system developed for the in silico selection of drug
target proteins and lead compounds. It collects malaria information together with features that
might influence the druggability of a malaria parasite protein, and it can guide a scientist in choosing a
suitable ligand for a protein. DISCOVERY can aid in predicting the interaction of ligands
with proteins of interest, in associating chemical compounds with malaria proteins and in performing
selective chemical similarity searches. It can be used to mine information on malaria proteins, predict
ligands and compare human and mosquito host characteristics.
DISCOVERY2 was developed in Java with NetBeans. The protein sequences for the Plasmodium
spp. included in DISCOVERY were downloaded from PlasmoDB; the Homo sapiens proteins were
downloaded from Ensembl and the Anopheles gambiae proteins were downloaded from VectorBase.
Even though DISCOVERY is primarily focused on Plasmodium falciparum, it also contains information
for all proteins from Plasmodium vivax, Plasmodium yoelii, Plasmodium knowlesi, Plasmodium
chabaudi and Plasmodium berghei, as well as for the human host and mosquito vector. Protein information
includes sequences and annotations, functional predictions, gene ontology terms, orthology information,
structural information, metabolic pathways, predicted putative protein-ligand interactions,
druggability predictions and literature links. Chemical compounds are also included.
Recent approaches have illustrated the value of predicting the association of chemical compounds
with putative drug targets, especially when the targets of compounds with known activity against the
parasite, such as those in the GlaxoSmithKline dataset, can be extrapolated using protein-ligand
interaction databases such as ChemProt. DISCOVERY uses a similar approach, associating chemical
compounds with malaria proteins through sequence homology and selective chemical similarity
searches.
Chapter 1 of this dissertation is a literature review focusing on the in silico identification of potential
drug targets. It also outlines several techniques for accomplishing this, as well as
target databases that can support the identification process. Chapter 2 describes the steps
taken to run and score the Plasmodium falciparum proteins in a high-throughput manner through
DISCOVERY. Chapter 3 gives four case studies from DISCOVERY: a protein with a low weighted
score, a protein with a very high weighted score and two proteins with intermediate weighted
scores. Chapter 4 concludes by considering how researchers can use this study as a starting point.
In this dissertation, DISCOVERY2 was used in conjunction with Taverna pipelines to study all
Plasmodium falciparum proteins in a high-throughput manner, in order to identify possible drug
targets that might be of importance for future drug identification.