Dr Etienne de Villiers

Research Area: Bioinformatics & Stats (inc. Modelling and Computational Biology)
Technology Exchange: Bioinformatics
Scientific Themes: Genetics & Genomics
Keywords: Bioinformatics and Genomics
I am the Bioinformatics group leader at the Wellcome – KEMRI – Oxford collaborative research programme in Kilifi, Kenya, where I established a Bioinformatics and Genomics platform to support research projects across the programme.

Name | Department | Institution | Country
Professor Kevin Marsh | Tropical Medicine | Oxford University, NDM Research Building | United Kingdom
Professor James A Berkley | Tropical Medicine | Oxford University, Kilifi | Kenya
Jean Langhorne | | National Institute of Medical Research Mill Hill | United Kingdom
Professor Eduard Sanders | Tropical Medicine | Oxford University, Kilifi | Kenya
Dr Francis Ndungu | Tropical Medicine | University of Oxford | United Kingdom

Bronkhorst AJ, Wentzel JF, Ungerer V, Peters DL, Aucamp J, de Villiers EP, Holdenrieder S, Pretorius PJ. 2018. Sequence analysis of cell-free DNA derived from cultured human bone osteosarcoma (143B) cells. Tumour Biol, 40 (9), pp. 1010428318801190.

The true importance of cell-free DNA in human biology, together with the potential scale of its clinical utility, is tarnished by a lack of understanding of its composition and origin. In investigating the cell-free DNA present in the growth medium of cultured 143B cells, we previously demonstrated that the majority of cell-free DNA is neither a product of apoptosis nor necrosis. In the present study, we investigated the composition and origin of this cell-free DNA population using next-generation sequencing. We found that the cell-free DNA consists mainly of repetitive DNA, including α-satellite DNA, minisatellites, and transposons that are currently active or exhibit the capacity to become reactivated. A significant portion of these cell-free DNA fragments originates from specific chromosomes, especially chromosomes 1 and 9. In healthy adult somatic cells, the centromeric and pericentromeric regions of these chromosomes are normally densely methylated. However, in many cancer types, these regions are preferentially hypomethylated. This can lead to double-stranded DNA breaks or it can directly impair the formation of proper kinetochore structures. This type of chromosomal instability is a precursor to the formation of nuclear anomalies, including lagging chromosomes and anaphase bridges. DNA fragments derived from these structures can recruit their own nuclear envelope and form secondary nuclear structures known as micronuclei, which can localize to the nuclear periphery and bud out from the membrane. We postulate that the majority of cell-free DNA present in the growth medium of cultured 143B cells originates from these micronuclei.

Hernández-de-Diego R, de Villiers EP, Klingström T, Gourlé H, Conesa A, Bongcam-Rudloff E. 2017. The eBioKit, a stand-alone educational platform for bioinformatics. PLoS Comput Biol, 13 (9), pp. e1005616.

Bioinformatics skills have become essential for many research areas; however, the availability of qualified researchers is usually lower than the demand and training to increase the number of able bioinformaticians is an important task for the bioinformatics community. When conducting training or hands-on tutorials, the lack of control over the analysis tools and repositories often results in undesirable situations during training, as unavailable online tools or version conflicts may delay, complicate, or even prevent the successful completion of a training event. The eBioKit is a stand-alone educational platform that hosts numerous tools and databases for bioinformatics research and allows training to take place in a controlled environment. A key advantage of the eBioKit over other existing teaching solutions is that all the required software and databases are locally installed on the system, significantly reducing the dependence on the internet. Furthermore, the architecture of the eBioKit has demonstrated itself to be an excellent balance between portability and performance, not only making the eBioKit an exceptional educational tool but also providing small research groups with a platform to incorporate bioinformatics analysis in their research. As a result, the eBioKit has formed an integral part of training and research performed by a wide variety of universities and organizations such as the Pan African Bioinformatics Network (H3ABioNet) as part of the initiative Human Heredity and Health in Africa (H3Africa), the Southern Africa Network for Biosciences (SAnBio) initiative, the Biosciences eastern and central Africa (BecA) hub, and the International Glossina Genome Initiative.

Omedo I, Mogeni P, Rockett K, Kamau A, Hubbart C, Jeffreys A, Ochola-Oyier LI, de Villiers EP, Gitonga CW, Noor AM et al. 2017. Geographic-genetic analysis of Plasmodium falciparum parasite populations from surveys of primary school children in Western Kenya. Wellcome Open Res, 2 pp. 29.

Background. Malaria control, and finally malaria elimination, requires the identification and targeting of residual foci or hotspots of transmission. However, the level of parasite mixing within and between geographical locations is likely to impact the effectiveness and durability of control interventions and thus should be taken into consideration when developing control programs. Methods. In order to determine the geographic-genetic patterns of Plasmodium falciparum parasite populations at a sub-national level in Kenya, we used the Sequenom platform to genotype 111 genome-wide distributed single nucleotide polymorphic (SNP) positions in 2486 isolates collected from children in 95 primary schools in western Kenya. We analysed these parasite genotypes for genetic structure using principal component analysis and assessed local and global clustering using statistical measures of spatial autocorrelation. We further examined the region for spatial barriers to parasite movement as well as directionality in the patterns of parasite movement. Results. We found no evidence of population structure and little evidence of spatial autocorrelation of parasite genotypes (correlation coefficients <0.03 among parasite pairs in distance classes of 1 km, 2 km and 5 km; p value <0.01). An analysis of the geographical distribution of allele frequencies showed weak evidence of variation in distribution of alleles, with clusters representing a higher than expected number of samples with the major allele being identified for 5 SNPs. Furthermore, we found no evidence of the existence of spatial barriers to parasite movement within the region, but observed directional movement of parasites among schools in two separate sections of the region studied. Conclusions. Our findings illustrate a pattern of high parasite mixing within the study region. If this mixing is due to rapid gene flow, then "one-off" targeted interventions may not be currently effective at the sub-national scale in Western Kenya, due to the high parasite movement that is likely to lead to re-introduction of infection from surrounding regions. However, repeated targeted interventions may reduce transmission in the surrounding regions.

Omedo I, Mogeni P, Bousema T, Rockett K, Amambua-Ngwa A, Oyier I, C Stevenson J, Y Baidjoe A, de Villiers EP, Fegan G et al. 2017. Micro-epidemiological structuring of Plasmodium falciparum parasite populations in regions with varying transmission intensities in Africa. Wellcome Open Res, 2 pp. 10.

Background: The first models of malaria transmission assumed a completely mixed and homogeneous population of parasites. Recent models include spatial heterogeneity and variably mixed populations. However, there are few empirical estimates of parasite mixing with which to parameterize such models. Methods: Here we genotype 276 single nucleotide polymorphisms (SNPs) in 5199 P. falciparum isolates from two Kenyan sites (Kilifi county and Rachuonyo South district) and one Gambian site (Kombo coastal districts) to determine the spatio-temporal extent of parasite mixing, and use Principal Component Analysis (PCA) and linear regression to examine the relationship between genetic relatedness and distance in space and time for parasite pairs. Results: Using 107, 177 and 82 SNPs that were successfully genotyped in 133, 1602, and 1034 parasite isolates from The Gambia, Kilifi and Rachuonyo South district, respectively, we show that there are no discrete geographically restricted parasite sub-populations, but instead we see a diffuse spatio-temporal structure to parasite genotypes. Genetic relatedness of sample pairs is predicted by relatedness in space and time. Conclusions: Our findings suggest that targeted malaria control will benefit the surrounding community, but unfortunately also that emerging drug resistance will spread rapidly through the population.

Gimode D, Odeny DA, de Villiers EP, Wanyonyi S, Dida MM, Mneney EE, Muchugi A, Machuka J, de Villiers SM. 2016. Identification of SNP and SSR Markers in Finger Millet Using Next Generation Sequencing Technologies. PLoS One, 11 (7), pp. e0159437.

Finger millet is an important cereal crop in eastern Africa and southern India with excellent grain storage quality and unique ability to thrive in extreme environmental conditions. Since negligible attention has been paid to improving this crop to date, the current study used Next Generation Sequencing (NGS) technologies to develop both Simple Sequence Repeat (SSR) and Single Nucleotide Polymorphism (SNP) markers. Genomic DNA from cultivated finger millet genotypes KNE755 and KNE796 was sequenced using both Roche 454 and Illumina technologies. Non-organelle sequencing reads were assembled into 207 Mbp representing approximately 13% of the finger millet genome. We identified 10,327 SSRs and 23,285 non-homeologous SNPs and tested 101 of each for polymorphism across a diverse set of wild and cultivated finger millet germplasm. For the 49 polymorphic SSRs, the mean polymorphism information content (PIC) was 0.42, ranging from 0.16 to 0.77. We also validated 92 SNP markers, 80 of which were polymorphic with a mean PIC of 0.29 across 30 wild and 59 cultivated accessions. Seventy-six of the 80 SNPs were polymorphic across 30 wild germplasm with a mean PIC of 0.30 while only 22 of the SNP markers showed polymorphism among the 59 cultivated accessions with an average PIC value of 0.15. Genetic diversity analysis using the polymorphic SNP markers revealed two major clusters; one of wild and another of cultivated accessions. Detailed STRUCTURE analysis confirmed this grouping pattern and further revealed 2 sub-populations within wild E. coracana subsp. africana. Both STRUCTURE and genetic diversity analysis assisted with the correct identification of the new germplasm collections. These polymorphic SSR and SNP markers are a significant addition to the existing 82 published SSRs, especially with regard to the previously reported low polymorphism levels in finger millet. Our results also reveal an unexploited finger millet genetic resource that can be included in the regional breeding programs in order to efficiently optimize productivity.

Bishop RP, Fleischauer C, de Villiers EP, Okoth EA, Arias M, Gallardo C, Upton C. 2015. Comparative analysis of the complete genome sequences of Kenyan African swine fever virus isolates within p72 genotypes IX and X. Virus Genes, 50 (2), pp. 303-309.

Twelve complete African swine fever virus (ASFV) genome sequences are currently publicly available and these include only one sequence from East Africa. We describe genome sequencing and annotation of a recent pig-derived p72 genotype IX, and a tick-derived genotype X isolate from Kenya using the Illumina platform and comparison with the Kenya 1950 isolate. The three genomes constitute a cluster that was phylogenetically distinct from other ASFV genomes, but 98-99% conserved within the group. Vector-based compositional analysis of the complete genomes produced a similar topology. Of the 125 previously identified 'core' ASFV genes, two ORFs of unassigned function were absent from the genotype IX sequence which was 184 kb in size as compared to 191 kb for the genotype X. There were multiple differences among East African genomes in the 360 and 110 multicopy gene families. The gene corresponding to 360-19R has transposed to the 5' variable region in both genotype X isolates. Additionally, there is a 110 ORF in the tick-derived genotype X isolate formed by fusion of 13L and 14L that is unique among ASFV genomes. In future, functional analysis based on the variations in the multicopy families may reveal whether they contribute to the observed differences in virulence between genotype IX and X viruses.

De Villiers EP, Bongcam-Rudloff E. eBioKit bioinformatics workshops in Dar es Salaam, Tanzania EMBnet.journal, 20 pp. 755-755.

Fischer A, Liljander A, Kaspar H, Muriuki C, Fuxelius H-H, Bongcam-Rudloff E, de Villiers EP, Huber CA, Frey J, Daubenberger C et al. 2013. Camel Streptococcus agalactiae populations are associated with specific disease complexes and acquired the tetracycline resistance gene tetM via a Tn916-like element. Vet Res, 44 (1), pp. 86.

Camels are the most valuable livestock species in the Horn of Africa and play a pivotal role in the nutritional sustainability for millions of people. Their health status is therefore of utmost importance for the people living in this region. Streptococcus agalactiae, a Group B Streptococcus (GBS), is an important camel pathogen. Here we present the first epidemiological study based on genetic and phenotypic data from African camel derived GBS. Ninety-two GBS were characterized using multilocus sequence typing (MLST), capsular polysaccharide typing and in vitro antimicrobial susceptibility testing. We analysed the GBS using Bayesian linkage, phylogenetic and minimum spanning tree analyses and compared them with human GBS from East Africa in order to investigate the level of genetic exchange between GBS populations in the region. Camel GBS sequence types (STs) were distinct from other STs reported so far. We mapped specific STs and capsular types to major disease complexes caused by GBS. Widespread resistance (34%) to tetracycline was associated with acquisition of the tetM gene that is carried on a Tn916-like element, and observed primarily among GBS isolated from mastitis. The presence of tetM within different MLST clades suggests acquisition on multiple occasions. Wound infections and mastitis in camels associated with GBS are widespread and should ideally be treated with antimicrobials other than tetracycline in East Africa.

Zubair S, de Villiers EP, Younan M, Andersson G, Tettelin H, Riley DR, Jores J, Bongcam-Rudloff E, Bishop RP. 2013. Genome Sequences of Two Pathogenic Streptococcus agalactiae Isolates from the One-Humped Camel Camelus dromedarius. Genome Announc, 1 (4), pp. 31.

Streptococcus agalactiae causes a range of clinical syndromes in camels (Camelus dromedarius). We report the genome sequences of two S. agalactiae isolates that induce abscesses in Kenyan camels. These genomes provide novel data on the composition of the S. agalactiae "pan genome" and reveal the presence of multiple genomic islands.

Zubair S, de Villiers EP, Fuxelius HH, Andersson G, Johansson K-E, Bishop RP, Bongcam-Rudloff E. 2013. Genome Sequence of Streptococcus agalactiae Strain 09mas018883, Isolated from a Swedish Cow. Genome Announc, 1 (4), pp. 142.

We announce the complete genome sequence of Streptococcus agalactiae strain 09mas018883, isolated from the milk of a cow with clinical mastitis. The availability of this genome may allow identification of candidate genes, leading to discovery of antigens that might form the basis for development of a vaccine as an alternative means of mastitis control.

Ferguson ME, Hearne SJ, Close TJ, Wanamaker S, Moskal WA, Town CD, de Young J, Marri PR, Rabbi IY, de Villiers EP. 2012. Identification, validation and high-throughput genotyping of transcribed gene SNPs in cassava. Theor Appl Genet, 124 (4), pp. 685-695.

The availability of genomic resources can facilitate progress in plant breeding through the application of advanced molecular technologies for crop improvement. This is particularly important in the case of less researched crops such as cassava, a staple and food security crop for more than 800 million people. Here, expressed sequence tags (ESTs) were generated from five drought stressed and well-watered cassava varieties. Two cDNA libraries were developed: one from root tissue (CASR), the other from leaf, stem and stem meristem tissue (CASL). Sequencing generated 706 contigs and 3,430 singletons. These sequences were combined with those from two other EST sequencing initiatives and filtered based on the sequence quality. Quality sequences were aligned using CAP3 and embedded in a Windows browser called HarvEST:Cassava which is made available. HarvEST:Cassava consists of a Unigene set of 22,903 quality sequences. A total of 2,954 putative SNPs were identified. Of these 1,536 SNPs from 1,170 contigs and 53 cassava genotypes were selected for SNP validation using Illumina's GoldenGate assay. As a result 1,190 SNPs were validated technically and biologically. The location of validated SNPs on scaffolds of the cassava genome sequence (v.4.1) is provided. A diversity assessment of 53 cassava varieties reveals some sub-structure based on the geographical origin, greater diversity in the Americas as opposed to Africa, and similar levels of diversity in West Africa and southern, eastern and central Africa. The resources presented allow for improved genetic dissection of economically important traits and the application of modern genomics-based approaches to cassava breeding and conservation.

De Villiers E, Kumuthini J, Bongcam-Rudloff E. ISCB Africa ASBCB Conference on Bioinformatics and eBioKit Workshop EMBnet.journal, 17 (2), pp. 7-7.

Visendi P, Ng'ang'a W, Bulimo W, Bishop R, Ochanda J, de Villiers EP. 2011. TparvaDB: a database to support Theileria parva vaccine development. Database (Oxford), 2011 pp. bar015.

We describe the development of TparvaDB, a comprehensive resource to facilitate research towards development of an East Coast fever vaccine, by providing an integrated user-friendly database of all genome and related data currently available for Theileria parva. TparvaDB is based on the Generic Model Organism Database (GMOD) platform. It contains a complete reference genome sequence, Expressed Sequence Tags (ESTs), Massively Parallel Signature Sequencing (MPSS) expression tag data and related information from both public and private repositories. The Artemis annotation workbench provides online annotation functionality. TparvaDB represents a resource that will underpin and promote ongoing East Coast fever vaccine development and biological research. Database URL: http://tparvadb.ilri.cgiar.org.

Ommeh S, Budd A, Ngara MV, Njaci I, de Villiers EP. 2011. Basic Molecular Evolution Workshop--A trans-African virtual training course: "Virtual Workshops": Is Africa ready to embrace the concept? Bioessays, 33 (4), pp. 243-247.

Fuxelius H, Bongcam E, Jaufeerally Y. The contribution of the eBioKit to Bioinformatics Education in Southern Africa EMBnet.journal, 16 (1), pp. 29-29.

de Villiers EP, Gallardo C, Arias M, da Silva M, Upton C, Martin R, Bishop RP. 2010. Phylogenomic analysis of 11 complete African swine fever virus genome sequences. Virology, 400 (1), pp. 128-136.

Viral molecular epidemiology has traditionally analyzed variation in single genes. Whole genome phylogenetic analysis of 123 concatenated genes from 11 ASFV genomes, including E75, a newly sequenced virulent isolate from Spain, identified two clusters. One contained South African isolates from ticks and warthog, suggesting derivation from a sylvatic transmission cycle. The second contained isolates from West Africa and the Iberian Peninsula. Two isolates, from Kenya and Malawi, were outliers. Of the nine genomes within the clusters, seven were within p72 genotype 1. The 11 genomes sequenced comprised only 5 of the 22 p72 genotypes. Comparison of synonymous and non-synonymous mutations at the genome level identified 20 genes subject to selection pressure for diversification. A novel gene of the E75 virus evolved by the fusion of two genes within the 360 multicopy family. Comparative genomics reveals high diversity within a limited sample of the ASFV viral gene pool.

Gichora NN, Fatumo SA, Ngara MV, Chelbat N, Ramdayal K, Opap KB, Siwo GH, Adebiyi MO, El Gonnouni A, Zofou D et al. 2010. Ten simple rules for organizing a virtual conference--anywhere. PLoS Comput Biol, 6 (2), pp. e1000650.

Adam F, Villiers E, Watson S, Coyne K, Blackwood L. 2009. Clinical pathological and epidemiological assessment of morphologically and immunologically confirmed canine leukaemia. Vet Comp Oncol, 7 (3), pp. 181-195.

Traditionally, classification of leukaemia in dogs has relied on morphological examination and cytochemical staining patterns, but aberrant cellular morphology and stain uptake often curtails accurate categorization, and historical data based on this classification may be unreliable. Immunophenotyping is now the gold standard for classification of leukaemias. The purpose of this prospective study was to assess the clinical pathological and epidemiological features of a population of dogs with morphologically and immunologically confirmed leukaemia and to compare them within categories: acute and chronic lymphoid leukaemia (ALL and CLL), and acute and chronic myeloid leukaemia (AML and CML). There were 64 cases of morphologically and immunologically confirmed leukaemia: 25 cases of ALL, 17 cases of CLL and 22 cases of AML. Prevalence of B and T immunophenotypes in ALL and CLL was not statistically different. Dogs with AML were significantly younger than those with ALL at presentation (P = 0.04). Golden Retriever dogs in the study population were overrepresented in comparison with a control population of dogs (6/25 ALL cases, 8/64 leukaemia cases). No sex was overrepresented. Dogs with ALL had significantly more severe neutropenia (P = 0.001) and thrombocytopenia (P = 0.002) than those with CLL and had significantly more cytopenias. The severity and numbers of cytopenias seen in ALL and AML were not significantly different. Twenty-one of the leukaemia cases showed one cytopenia, fourteen had two cytopenias and twenty-one cases had pancytopenia. Anaemia was the most common cytopenia seen in isolation (17/21). No dogs had neutropenia without anaemia and/or thrombocytopenia. Total white blood cell counts were not different between the groups. The atypical cell counts within the peripheral blood were significantly higher in ALL than AML; both in isolation and as a percentage of the total white blood cell count (P = 0.03). This study strengthens the hypothesis that acute leukaemias give rise to more profound cytopenias, affecting more cell lines, than chronic leukaemias.

Weir W, Sunter J, Chaussepied M, Skilton R, Tait A, de Villiers EP, Bishop R, Shiels B, Langsley G. 2009. Highly syntenic and yet divergent: a tale of two Theilerias. Infect Genet Evol, 9 (4), pp. 453-461.

The published genomic sequences of the two major host-transforming Theileria species of cattle represent a rich resource of information that has allowed novel bioinformatic and experimental studies into these important apicomplexan parasites. Since their publication in 2005, the genomes of T. annulata and T. parva have been utilised for a diverse range of applications, ranging from candidate antigen discovery to the identification of genetic markers for population analysis. This has led to advancements in the quest for a sub-unit vaccine, while providing a greater understanding of variation among parasite populations in the field. The unique ability of these Theileria species to induce host cell transformation is the subject of considerable scientific interest and the availability of full genomic sequences has provided new insights into this area of research. This article reviews the data underlying published comparative analyses, focussing on the general features of gene expression, the major Tpr/Tar multi-copy gene family and a re-examination of the predicted macroschizont secretome. Codon usage between the Theileria species is reviewed in detail, as this underpins ongoing comparative studies investigating selection at the intra- and inter-species level. The TashAT/TpshAT family of genes, conserved between T. annulata and T. parva, encodes products targeted to the host nucleus and has been implicated in contributing to the transformed bovine phenotype. Species-specific expansion and diversification at this critical locus is discussed with reference to the availability, in the near future, of genomic datasets which are based on non-transforming Theileria species.

Githui EK, De Villiers EP, McArthur AG. 2009. Plasmodium possesses dynein light chain classes that are unique and conserved across species. Infect Genet Evol, 9 (3), pp. 337-343.

Plasmodium belongs to the phylum Apicomplexa. Within the Apicomplexa, Plasmodium, Toxoplasma and Cryptosporidium are parasites of considerable medical importance while Theileria and Eimeria are animal pathogens. P. falciparum is particularly important as it causes malaria, resulting in more than 1 million deaths each year. The malaria parasite actively invades the host cell in which it propagates and several proteins associated with the apical organelles have been implicated to be crucial in the invasion process. The biogenesis of the apical organelles is not well understood, but several studies indicate that microtubule-based vesicular transport is involved. Vesicular transport proteins are also present in Plasmodium and are presumed to be involved in transcellular transport in infected erythrocytes. Dynein is a multi-subunit motor protein involved in microtubule-based vesicular transport. In this study, we analyzed the cytoplasmic dynein light chains (Dlcs) of P. falciparum since they provide adaptor surface to the cargoes and are likely to be involved in differential transport. Dlcs consist of three different families: TcTex1/2, LC8 and LC7/roadblock. The data presented demonstrate that P. falciparum Dlcs sequences and functional domains show high sequence similarity within the species, but that only the Dlc group 1 (LC8) has a high similarity to human orthologues. TcTex1 and LC7/roadblock have low similarity to human orthologues. This sequence variation could be targeted for vaccine or drug development.

Boutet P, Heath F, Archer J, Villiers E. 2009. Comparison of quantitative immunoturbidimetric and semiquantitative latex-agglutination assays for D-dimer measurement in canine plasma. Vet Clin Pathol, 38 (1), pp. 78-82.

BACKGROUND: D-dimer measurement in dogs is considered the most reliable test for detecting disseminated intravascular coagulation or thromboembolism. OBJECTIVES: The purposes of this study were to compare 2 D-dimer assays, a quantitative immunoturbidimetric and a semiquantitative latex agglutination assay, and to assess the effect of hemolysis and storage conditions on D-dimer concentration using the quantitative assay. METHODS: The immunoturbidimetric assay was validated using canine citrated plasma samples containing different concentrations of D-dimer. The effect of storage at various temperatures and times was assessed. Hemolysis was produced by adding lysed RBCs to the samples for a final hemoglobin concentration of 0.35 g/dL. RESULTS: For clinically relevant values (>250 µg/L), intra-assay and interassay coefficients of variation were 6.8% and 7.2%. The assay was linear (r² = 1.00), and the tests had good agreement (kappa = 0.685, P < .001). Storage at 4 °C and -20 °C and hemolysis had no significant effect on D-dimer concentrations. In hemolyzed samples stored at room temperature for ≥48 hours, fine clots were noted and often resulted in falsely increased D-dimer concentrations. CONCLUSIONS: Our findings suggest that the immunoturbidimetric assay validated in this study is reliable and accurate for the measurement of D-dimer in canine plasma. Samples can be stored for up to 1 month at -20 °C and moderate hemolysis does not significantly affect the D-dimer concentration in frozen or refrigerated samples.

Sunter JD, Patel SP, Skilton RA, Githaka N, Knowles DP, Scoles GA, Nene V, de Villiers E, Bishop RP. 2008. A novel SINE family occurs frequently in both genomic DNA and transcribed sequences in ixodid ticks of the arthropod sub-phylum Chelicerata. Gene, 415 (1-2), pp. 13-22.

Reassociation kinetics and flow cytometry data indicate that ixodid tick genomes are large, relative to most arthropods, containing ≥10⁹ base pairs. The molecular basis for this is unknown. We have identified a novel small interspersed element with features of a tRNA-derived SINE, designated Ruka, in genomic sequences of Rhipicephalus appendiculatus and Boophilus (Rhipicephalus) microplus ticks. The SINE was also identified in expressed sequence tag (EST) databases derived from several tissues in four species of ixodid ticks, namely R. appendiculatus, B. (R.) microplus, Amblyomma variegatum and also the more distantly related Ixodes scapularis. Secondary structure predictions indicated that Ruka could adopt a tRNA structure that was, atypically, most similar to a serine tRNA. By extrapolation the frequency of occurrence in the randomly selected BAC clone sequences is consistent with approximately 65,000 copies of Ruka in the R. appendiculatus genome. Real time PCR analyses on genomic DNA indicate copy numbers for specific Ruka subsets between 5800 and 38,000. Several putative conserved Ruka insertion sites were identified in EST sequences of three ixodid tick species based on the flanking sequences associated with the SINEs, indicating that some Ruka transpositions probably occurred prior to speciation within the metastriate division of the Ixodidae. The data strongly suggest that Class I transposable elements form a significant component of tick genomes and may partially account for the large genome sizes observed.

Langsley G, van Noort V, Carret C, Meissner M, de Villiers EP, Bishop R, Pain A. 2008. Comparative genomics of the Rab protein family in Apicomplexan parasites. Microbes Infect, 10 (5), pp. 462-470.

Rab genes encode a subgroup of small GTP-binding proteins within the ras super-family that regulate targeting and fusion of transport vesicles within the secretory and endocytic pathways. These genes are of particular interest in the protozoan phylum Apicomplexa, since a family of Rab GTPases has been described for Plasmodium and most putative secretory pathway proteins in Apicomplexa have conventional predicted signal peptides. Moreover, peptide motifs have now been identified within a large number of secreted Plasmodium proteins that direct their targeting to the red blood cell cytosol, the apicoplast, the food vacuole and Maurer's clefts; in contrast, motifs that direct proteins to secretory organelles (rhoptries, micronemes and microspheres) have yet to be defined. The nature of the vesicle in which these proteins are transported to their destinations remains unknown and morphological structures equivalent to the endoplasmic reticulum and trans-Golgi stacks typical of other eukaryotes cannot be visualised in Apicomplexa. Since Rab GTPases regulate vesicular traffic in all eukaryotes, and this traffic in intracellular parasites could regulate import of nutrients and drugs and export of antigens, host cell modulatory proteins and lactate, we compare and contrast here the Rab families of Apicomplexa.

Graham SP, Pellé R, Yamage M, Mwangi DM, Honda Y, Mwakubambanya RS, de Villiers EP, Abuya E, Awino E, Gachanja J et al. 2008. Characterization of the fine specificity of bovine CD8 T-cell responses to defined antigens from the protozoan parasite Theileria parva. Infect Immun, 76 (2), pp. 685-694.

Immunity against the bovine intracellular protozoan parasite Theileria parva has been shown to be mediated by CD8 T cells. Six antigens targeted by CD8 T cells from T. parva-immune cattle of different major histocompatibility complex (MHC) genotypes have been identified, raising the prospect of developing a subunit vaccine. To facilitate further dissection of the specificity of protective CD8 T-cell responses and to assist in the assessment of responses to vaccination, we set out to identify the epitopes recognized in these T. parva antigens and their MHC restriction elements. Nine epitopes in six T. parva antigens, together with their respective MHC restriction elements, were successfully identified. Five of the cytotoxic-T-lymphocyte epitopes were found to be restricted by products of previously described alleles, and four were restricted by four novel restriction elements. Analyses of CD8 T-cell responses to five of the epitopes in groups of cattle carrying the defined restriction elements and immunized with live parasites demonstrated that, with one exception, the epitopes were consistently recognized by animals of the respective genotypes. The analysis of responses was extended to animals immunized with multiple antigens delivered in separate vaccine constructs. Specific CD8 T-cell responses were detected in 19 of 24 immunized cattle. All responder cattle mounted responses specific for antigens for which they carried an identified restriction element. By contrast, only 8 of 19 responder cattle displayed a response to antigens for which they did not carry an identified restriction element. These data demonstrate that the identified antigens are inherently dominant in animals with the corresponding MHC genotypes.

Graham SP, Honda Y, Pellé R, Mwangi DM, Glew EJ, de Villiers EP, Shah T, Bishop R, van der Bruggen P, Nene V, Taracha ELN. 2007. A novel strategy for the identification of antigens that are recognised by bovine MHC class I restricted cytotoxic T cells in a protozoan infection using reverse vaccinology. Immunome Res, 3 (1), pp. 2.

BACKGROUND: Immunity against the bovine protozoan parasite Theileria parva has previously been shown to be mediated through lysis of parasite-infected cells by MHC class I restricted CD8+ cytotoxic T lymphocytes. It is hypothesized that identification of CTL target schizont antigens will aid the development of a sub-unit vaccine. We exploited the availability of the complete genome sequence data and bioinformatics tools to identify genes encoding secreted or membrane anchored proteins that may be processed and presented by the MHC class I molecules of infected cells to CTL. RESULTS: Of the 986 predicted open reading frames (ORFs) encoded by chromosome 1 of the T. parva genome, 55 were selected based on the presence of a signal peptide and/or a transmembrane helix domain. Thirty six selected ORFs were successfully cloned into a eukaryotic expression vector, transiently transfected into immortalized bovine skin fibroblasts and screened in vitro using T. parva-specific CTL. Recognition of gene products by CTL was assessed using an IFN-gamma ELISpot assay. A 525 base pair ORF encoding a 174 amino acid protein, designated Tp2, was identified by T. parva-specific CTL from 4 animals. These CTL recognized and lysed Tp2 transfected skin fibroblasts and recognized 4 distinct epitopes. Significantly, Tp2 specific CD8+ T cell responses were observed during the protective immune response against sporozoite challenge. CONCLUSION: The identification of an antigen containing multiple CTL epitopes and its apparent immunodominance during a protective anti-parasite response makes Tp2 an attractive candidate for evaluation of its vaccine potential.

Graham SP, Pellé R, Honda Y, Mwangi DM, Tonukari NJ, Yamage M, Glew EJ, de Villiers EP, Shah T, Bishop R et al. 2006. Theileria parva candidate vaccine antigens recognized by immune bovine cytotoxic T lymphocytes. Proc Natl Acad Sci U S A, 103 (9), pp. 3286-3291.

East Coast fever, caused by the tick-borne intracellular apicomplexan parasite Theileria parva, is a highly fatal lymphoproliferative disease of cattle. The pathogenic schizont-induced lymphocyte transformation is a unique cancer-like condition that is reversible with parasite removal. Schizont-infected cell-directed CD8+ cytotoxic T lymphocytes (CTL) constitute the dominant protective bovine immune response after a single exposure to infection. However, the schizont antigens targeted by T. parva-specific CTL are undefined. Here we show the identification of five candidate vaccine antigens that are the targets of MHC class I-restricted CD8+ CTL from immune cattle. CD8+ T cell responses to these antigens were boosted in T. parva-immune cattle resolving a challenge infection and, when used to immunize naïve cattle, induced CTL responses that significantly correlated with survival from a lethal parasite challenge. These data provide a basis for developing a CTL-targeted anti-East Coast fever subunit vaccine. In addition, orthologs of these antigens may be vaccine targets for other apicomplexan parasites.

Glinka EM, Edelweiss EF, Sapozhnikov AM, Deyev SM. 2006. A new vector for controllable expression of an anti-HER2/neu mini-antibody-barnase fusion protein in HEK 293T cells. Gene, 366 (1), pp. 97-103.

Tumor-targeted vectors with controllable expression of therapeutic genes and specific antitumor antibodies are promising tools for the reduction of malignant tumors. Here we describe a new plasmid for the eukaryotic expression of an anti-HER2/neu mini-antibody-barnase fusion protein (4D5 scFv-barnase-His₅) with an NH₂-terminal leader peptide. The 4D5 scFv-barnase-His₅ gene was placed downstream of the tetracycline responsive-element minimal promoter in the vector using the Tet-Off gene-expression system. The Bacillus amyloliquefaciens ribonuclease barnase is toxic to the host cells. To overcome this problem, the barstar gene under its own minimal cytomegalovirus promoter was included in the designed vector. Barstar inhibits the background level of barnase in the cells in the presence of tetracycline in the culture medium. The HEK 293T cells were transfected with the designed vector, and the 4D5 scFv-barnase-His₅ fusion protein was identified by anti-barnase antibodies in cell culture medium and after purification from cell lysates using metal-affinity chromatography. The overexpression of the anti-HER2/neu mini-antibody-barnase fusion protein decreased the intensity of fluorescence of HEK 293T cells co-transfected with the generated plasmid and a plasmid containing the gene of enhanced green fluorescent protein (pEGFP-N1), in comparison with the intensity of fluorescence of HEK 293T cells transfected with pEGFP-N1, in the absence of tetracycline in the medium. The effect of the 4D5 scFv-barnase-His₅ on EGFP fluorescence indicates that the introduced barnase functions as a ribonuclease inside the cells. The anti-HER2/neu mini-antibody could be used to deliver barnase to HER2/neu-positive cells and enable its penetration into the target cells, as HER2/neu is a ligand-internalizing receptor. This expression vector has potential applications to both gene and antibody therapies of cancer.

Shah T, de Villiers E, Nene V, Hass B, Taracha E, Gardner MJ, Sansom C, Pelle R, Bishop R. 2006. Using the transcriptome to annotate the genome revisited: application of massively parallel signature sequencing (MPSS). Gene, 366 (1), pp. 104-108.

Transcriptome analysis can provide useful data for refining genome sequence annotation. Application of massively parallel signature sequencing (MPSS) revealed reproducible transcription, in multiple MPSS cycles, from 73% of computationally predicted genes in the Theileria parva schizont lifecycle stage. Signatures spanning consecutive exons confirmed 142 predicted introns. MPSS identified 83 putative genes of >100 codons that had been overlooked by annotation software, and 139 potentially incorrect gene models (with either truncated ORFs or overlooked exons) by interfacing signature locations with stop codon maps. Twenty representative models were confirmed as likely to be incorrect using reverse transcription PCR amplification from independent schizont cDNA preparations. More than 50% of the 60 putative single copy genes in T. parva that were absent from the genome of the closely related T. annulata had MPSS signatures. This study illustrates the utility of MPSS for improving annotation of small, gene-rich microbial eukaryotic genomes.

Lambson B, Nene V, Obura M, Shah T, Pandit P, Ole-Moiyoi O, Delroux K, Welburn S, Skilton R, de Villiers E, Bishop R. 2005. Identification of candidate sialome components expressed in ixodid tick salivary glands using secretion signal complementation in mammalian cells. Insect Mol Biol, 14 (4), pp. 403-414.

Ixodid ticks manipulate mammalian host pathways by secreting molecules from salivary glands. Novel cDNAs containing functional secretion signals were isolated from ixodid tick salivary glands using a signal sequence trap. Only 15/61 Rhipicephalus appendiculatus and 1/7 Amblyomma variegatum cDNAs had significant identity (< 1e-15) to previously identified sequences. Polypeptides that may interact with host pathways included a kinase inhibitor. Two proteins encoded homologues of the yolk protein vitellogenin and seventeen contained glycine-rich motifs. Four proteins without sequence matches had conserved structural folds, identified using a Threading algorithm. Predicted secretion signals were between fifteen and fifty-seven amino acids long. Four homologous polymorphic proteins contained conserved (26/27 residues) signal peptides. Ten functional tick secretion signals could not be unambiguously identified using predictive algorithms.

Pain A, Renauld H, Berriman M, Murphy L, Yeats CA, Weir W, Kerhornou A, Aslett M, Bishop R, Bouchier C et al. 2005. Genome of the host-cell transforming parasite Theileria annulata compared with T. parva. Science, 309 (5731), pp. 131-133.

Theileria annulata and T. parva are closely related protozoan parasites that cause lymphoproliferative diseases of cattle. We sequenced the genome of T. annulata and compared it with that of T. parva to understand the mechanisms underlying transformation and tropism. Despite high conservation of gene sequences and synteny, the analysis reveals unequally expanded gene families and species-specific genes. We also identify divergent families of putative secreted polypeptides that may reduce immune recognition, candidate regulators of host-cell transformation, and a Theileria-specific protein domain [frequently associated in Theileria (FAINT)] present in a large number of secreted proteins.

Gardner MJ, Bishop R, Shah T, de Villiers EP, Carlton JM, Hall N, Ren Q, Paulsen IT, Pain A, Berriman M et al. 2005. Genome sequence of Theileria parva, a bovine pathogen that transforms lymphocytes. Science, 309 (5731), pp. 134-137.

We report the genome sequence of Theileria parva, an apicomplexan pathogen causing economic losses to smallholder farmers in Africa. The parasite chromosomes exhibit limited conservation of gene synteny with Plasmodium falciparum, and its plastid-like genome represents the first example where all apicoplast genes are encoded on one DNA strand. We tentatively identify proteins that facilitate parasite segregation during host cell cytokinesis and contribute to persistent infection of transformed host cells. Several biosynthetic pathways are incomplete or absent, suggesting substantial metabolic dependence on the host cell. One protein family that may generate parasite antigenic diversity is not telomere-associated.

Collins NE, Liebenberg J, de Villiers EP, Brayton KA, Louw E, Pretorius A, Faber FE, van Heerden H, Josemans A, van Kleef M et al. 2005. The genome of the heartwater agent Ehrlichia ruminantium contains multiple tandem repeats of actively variable copy number. Proc Natl Acad Sci U S A, 102 (3), pp. 838-843.

Heartwater, a tick-borne disease of domestic and wild ruminants, is caused by the intracellular rickettsia Ehrlichia ruminantium (previously known as Cowdria ruminantium). It is a major constraint to livestock production throughout sub-Saharan Africa, and it threatens to invade the Americas, yet there is no immediate prospect of an effective vaccine. A shotgun genome sequencing project was undertaken in the expectation that access to the complete protein coding repertoire of the organism will facilitate the search for vaccine candidate genes. We report here the complete 1,516,355-bp sequence of the type strain, the stock derived from the South African Welgevonden isolate. Only 62% of the genome is predicted to be coding sequence, encoding 888 proteins and 41 stable RNA species. The most striking feature is the large number of tandemly repeated and duplicated sequences, some of continuously variable copy number, which contributes to the low proportion of coding sequence. These repeats have mediated numerous translocation and inversion events that have resulted in the duplication and truncation of some genes and have also given rise to new genes. There are 32 predicted pseudogenes, most of which are truncated fragments of genes associated with repeats. Rather than being the result of the reductive evolution seen in other intracellular bacteria, these pseudogenes appear to be the product of ongoing sequence duplication events.

Bishop R, Shah T, Pelle R, Hoyle D, Pearson T, Haines L, Brass A, Hulme H, Graham SP, Taracha ELN et al. 2005. Analysis of the transcriptome of the protozoan Theileria parva using MPSS reveals that the majority of genes are transcriptionally active in the schizont stage. Nucleic Acids Res, 33 (17), pp. 5503-5511.

Massively parallel signature sequencing (MPSS) was used to analyze the transcriptome of the intracellular protozoan Theileria parva. In total, 1,095,000 20-bp sequences representing 4371 different signatures were generated from T. parva schizonts. Reproducible signatures were identified within 73% of potentially detectable predicted genes and 83% had signatures in at least one MPSS cycle. A predicted leader peptide was detected on 405 expressed genes. The quantitative range of signatures was 4-52,256 transcripts per million (t.p.m.). Rare transcripts (<50 t.p.m.) were detected from 36% of genes. Sequence signatures approximated a lognormal distribution, as in microarray. Transcripts were widely distributed throughout the genome, although only 47% of 138 telomere-associated open reading frames exhibited signatures. Antisense signatures comprised 13.8% of the total, comparable with Plasmodium. Eighty five predicted genes with antisense signatures lacked a sense signature. Antisense transcripts were independently amplified from schizont cDNA and verified by sequencing. The MPSS transcripts per million for seven genes encoding schizont antigens recognized by bovine CD8 T cells varied 1000-fold. There was concordance between transcription and protein expression for heat shock proteins that were very highly expressed according to MPSS and proteomics. The data suggest a low level of baseline transcription from the majority of protein-coding genes.

Nene V, Lee D, Kang'a S, Skilton R, Shah T, de Villiers E, Mwaura S, Taylor D, Quackenbush J, Bishop R. 2004. Genes transcribed in the salivary glands of female Rhipicephalus appendiculatus ticks infected with Theileria parva. Insect Biochem Mol Biol, 34 (10), pp. 1117-1128.

We describe the generation of an auto-annotated index of genes that are expressed in the salivary glands of four-day fed female adult Rhipicephalus appendiculatus ticks. A total of 9162 EST sequences were derived from an uninfected tick cDNA library and 9844 ESTs were from a cDNA library from ticks infected with Theileria parva, which develops in type III salivary gland acini. There were no major differences between abundantly expressed ESTs from the two cDNA libraries, although there was evidence for an up-regulation in the expression of some glycine-rich proteins in infected salivary glands. Gene ontology terms were also assigned to sequences in the index and those with potential enzyme function were linked to the Kyoto Encyclopedia of Genes and Genomes database, allowing reconstruction of metabolic pathways. Several genes code for previously characterized tick proteins such as receptors for myokinin or ecdysteroid and an immunosuppressive protein. cDNAs coding for homologs of heme-lipoproteins, which are major components of tick hemolymph, were identified by searching the database with published N-terminal peptide sequence data derived from biochemically purified Boophilus microplus proteins. The EST data will be a useful resource for construction of microarrays to probe vector biology, vector-host and vector-pathogen interactions and to underpin gene identification via proteomics approaches.

Hide W, Mizrahi V, Venkatesh B, Brenner S, Simpson A, Blatch G, Soodyall H, Denby K, Wingfield M, Wingfield B et al. 2001. A platform for genomics in South Africa. S Afr Med J, 91 (12), pp. 1006-1007.

de Villiers EP, Brayton KA, Zweygarth E, Allsopp BA. 2000. Genome size and genetic map of Cowdria ruminantium. Microbiology, 146 (Pt 10), pp. 2627-2634.

Cowdria ruminantium is the cause of a serious tick-borne disease of domestic ruminants, known as heartwater or cowdriosis. The organism belongs to the tribe Ehrlichieae, which contains obligate intracellular pathogens, causing several important animal and human diseases. Although a few C. ruminantium genes have been cloned and sequenced, very little is known about the size, gross structure and organization of the genome. This paper presents a complete physical map and a preliminary genetic map for C. ruminantium. Chromosomal C. ruminantium DNA was examined by PFGE and Southern hybridization. PFGE analysis revealed that C. ruminantium has a circular chromosome approximately 1576 kb in size. A physical map was derived by combining the results of PFGE analysis of DNA fragments resulting from digestion of the whole genome with KspI, RsrII and SmaI and Southern hybridization analysis with a series of gene probes and isolated macrorestriction fragments. A genetic map for C. ruminantium with a mean resolution of 290 kb was established, the first for a member of the Ehrlichieae. A total of nine genes or cloned C. ruminantium DNA fragments were mapped to specific KspI, RsrII and SmaI fragments, including the major antigenic protein gene, map-1.

de Villiers EP, Brayton KA, Zweygarth E, Allsopp BA. 2000. Macrorestriction fragment profiles reveal genetic variation of Cowdria ruminantium isolates. J Clin Microbiol, 38 (5), pp. 1967-1970.

Macrorestriction profile analysis by pulsed-field gel electrophoresis (PFGE) was used to distinguish between seven isolates of Cowdria ruminantium from geographically different areas. Characteristic profiles were generated for each isolate by using the restriction endonucleases KspI, SalI, and SmaI with chromosomal sizes ranging between 1,546 and 1,692 kb. Statistical analysis of the macrorestriction profiles indicated that all the isolates were distinct from each other; these data contribute to a better understanding of the epidemiology of this pathogen and may be exploited for the identification of genotype-specific DNA probes.

Brayton KA, De Villiers EP, Fehrsen J, Nxomani C, Collins NE, Allsopp BA. 1999. Cowdria ruminantium DNA is unstable in a SuperCos1 library. Onderstepoort J Vet Res, 66 (2), pp. 111-117.

A Cowdria ruminantium genomic library was constructed in a cosmid vector to serve as a source of easily accessible and pure C. ruminantium DNA for molecular genetic studies. The cosmid library contained 846 clones which were arrayed into microtitre plates. Restriction enzyme digestion patterns indicated that these clones had an average insert size of 35 kb. Probing of the arrays did not detect any bovine clones and only one of the known C. ruminantium genes, pCS20, was detected. Due to the high AT content and the fact that C. ruminantium genes are active in the Escherichia coli host, the C. ruminantium clones were unstable in the SuperCos1 vector and most clones did not grow reproducibly. The library was contaminated with E. coli clones and these clones were maintained with greater fidelity than the C. ruminantium clones, resulting in a skewed representation over time. We have isolated seven C. ruminantium clones which we were able to serially culture reproducibly; two of these clones overlap. These clones constitute the first large regions of C. ruminantium DNA to be cloned and represent almost 10% of the C. ruminantium genome.

Collins NE, De Villiers EP, Brayton KA, Allsopp BA. 1998. DNA sequence of a cosmid clone of Cowdria ruminantium. Ann N Y Acad Sci, 849 (1), pp. 365-368.

de Villiers EP, Brayton KA, Zweygarth E, Allsopp BA. 1998. Purification of Cowdria ruminantium organisms for use in genome analysis by pulsed-field gel electrophoresis. Ann N Y Acad Sci, 849 (1), pp. 313-320.

Cowdria ruminantium is an obligate intracellular rickettsial pathogen which is responsible for a tick-borne disease of domestic and wild ruminants called heartwater or cowdriosis. Although several genes have been cloned and partially sequenced, the size, gross structure, and organization of the C. ruminantium genome are unknown. Genome analysis of the organism has been hindered because it is difficult to obtain C. ruminantium DNA free from contaminating host cell DNA, and this probably accounts for the lack of genome size data for this organism. In this study we investigated several methods for purifying C. ruminantium from bovine cellular contaminants and organisms of a relatively high purity were obtained. These were used to prepare Cowdria DNA which was analyzed by pulsed-field gel electrophoresis (PFGE) and which revealed a genome approximately 1900 kbp in length plus an additional extrachromosomal fragment migrating with an apparent size of 815 kbp. This is the first time that the genome size of C. ruminantium has been determined and the first demonstration of an extrachromosomal element.

Brayton KA, Fehrsen J, de Villiers EP, van Kleef M, Allsopp BA. 1997. Construction and initial analysis of a representative lambda ZAPII expression library of the intracellular rickettsia Cowdria ruminantium: cloning of map1 and three other Cowdria genes. Vet Parasitol, 72 (2), pp. 185-199.

The causative agent of heartwater, the rickettsia Cowdria ruminantium, is very poorly understood at the molecular level owing to a profound lack of suitable tools. We have developed an immunoaffinity chromatographic method to purify C. ruminantium from host cell components and the purified rickettsial cells have been used to prepare substantially pure Cowdria DNA. This DNA has been used to construct what we believe to be the first fully representative C. ruminantium expression library. A clone containing the complete Cowdria map1 gene has been isolated and sequenced. This gene has been expressed in E. coli cells from the native Cowdria promoter, suggesting that the mechanisms for gene transcription and translation are similar between these two organisms. Parts of three other Cowdria genes have also been isolated and sequenced.

Swart P, De Villiers EP, Swart AC, van der Merwe KJ, Todres PC. 1993. The interaction of biogenic amines with adrenal cytochrome P450-dependent enzymes. Biochem Soc Trans, 21 (4), pp. 413S.

Hernández-de-Diego R, de Villiers EP, Klingström T, Gourlé H, Conesa A, Bongcam-Rudloff E. 2017. The eBioKit, a stand-alone educational platform for bioinformatics. PLoS Comput Biol, 13 (9), pp. e1005616.

Bioinformatics skills have become essential for many research areas; however, the availability of qualified researchers is usually lower than the demand and training to increase the number of able bioinformaticians is an important task for the bioinformatics community. When conducting training or hands-on tutorials, the lack of control over the analysis tools and repositories often results in undesirable situations during training, as unavailable online tools or version conflicts may delay, complicate, or even prevent the successful completion of a training event. The eBioKit is a stand-alone educational platform that hosts numerous tools and databases for bioinformatics research and allows training to take place in a controlled environment. A key advantage of the eBioKit over other existing teaching solutions is that all the required software and databases are locally installed on the system, significantly reducing the dependence on the internet. Furthermore, the architecture of the eBioKit has demonstrated itself to be an excellent balance between portability and performance, not only making the eBioKit an exceptional educational tool but also providing small research groups with a platform to incorporate bioinformatics analysis in their research. As a result, the eBioKit has formed an integral part of training and research performed by a wide variety of universities and organizations such as the Pan African Bioinformatics Network (H3ABioNet) as part of the initiative Human Heredity and Health in Africa (H3Africa), the Southern Africa Network for Biosciences (SAnBio) initiative, the Biosciences eastern and central Africa (BecA) hub, and the International Glossina Genome Initiative.

Bishop RP, Fleischauer C, de Villiers EP, Okoth EA, Arias M, Gallardo C, Upton C. 2015. Comparative analysis of the complete genome sequences of Kenyan African swine fever virus isolates within p72 genotypes IX and X. Virus Genes, 50 (2), pp. 303-309.

Twelve complete African swine fever virus (ASFV) genome sequences are currently publicly available and these include only one sequence from East Africa. We describe genome sequencing and annotation of a recent pig-derived p72 genotype IX, and a tick-derived genotype X isolate from Kenya using the Illumina platform and comparison with the Kenya 1950 isolate. The three genomes constitute a cluster that was phylogenetically distinct from other ASFV genomes, but 98-99% conserved within the group. Vector-based compositional analysis of the complete genomes produced a similar topology. Of the 125 previously identified 'core' ASFV genes, two ORFs of unassigned function were absent from the genotype IX sequence which was 184 kb in size as compared to 191 kb for the genotype X. There were multiple differences among East African genomes in the 360 and 110 multicopy gene families. The gene corresponding to 360-19R has transposed to the 5' variable region in both genotype X isolates. Additionally, there is a 110 ORF in the tick-derived genotype X isolate formed by fusion of 13L and 14L that is unique among ASFV genomes. In future, functional analysis based on the variations in the multicopy families may reveal whether they contribute to the observed differences in virulence between genotype IX and X viruses.

Visendi P, Ng'ang'a W, Bulimo W, Bishop R, Ochanda J, de Villiers EP. 2011. TparvaDB: a database to support Theileria parva vaccine development. Database (Oxford), 2011 pp. bar015.

We describe the development of TparvaDB, a comprehensive resource to facilitate research towards development of an East Coast fever vaccine, by providing an integrated user-friendly database of all genome and related data currently available for Theileria parva. TparvaDB is based on the Generic Model Organism Database (GMOD) platform. It contains a complete reference genome sequence, Expressed Sequence Tags (ESTs), Massively Parallel Signature Sequencing (MPSS) expression tag data and related information from both public and private repositories. The Artemis annotation workbench provides online annotation functionality. TparvaDB represents a resource that will underpin and promote ongoing East Coast fever vaccine development and biological research. Database URL: http://tparvadb.ilri.cgiar.org.

de Villiers EP, Gallardo C, Arias M, da Silva M, Upton C, Martin R, Bishop RP. 2010. Phylogenomic analysis of 11 complete African swine fever virus genome sequences. Virology, 400 (1), pp. 128-136.

Viral molecular epidemiology has traditionally analyzed variation in single genes. Whole genome phylogenetic analysis of 123 concatenated genes from 11 ASFV genomes, including E75, a newly sequenced virulent isolate from Spain, identified two clusters. One contained South African isolates from ticks and warthog, suggesting derivation from a sylvatic transmission cycle. The second contained isolates from West Africa and the Iberian Peninsula. Two isolates, from Kenya and Malawi, were outliers. Of the nine genomes within the clusters, seven were within p72 genotype 1. The 11 genomes sequenced comprised only 5 of the 22 p72 genotypes. Comparison of synonymous and non-synonymous mutations at the genome level identified 20 genes subject to selection pressure for diversification. A novel gene of the E75 virus evolved by the fusion of two genes within the 360 multicopy family. Comparative genomics reveals high diversity within a limited sample of the ASFV viral gene pool.

Langsley G, van Noort V, Carret C, Meissner M, de Villiers EP, Bishop R, Pain A. 2008. Comparative genomics of the Rab protein family in Apicomplexan parasites. Microbes Infect, 10 (5), pp. 462-470.

Rab genes encode a subgroup of small GTP-binding proteins within the ras super-family that regulate targeting and fusion of transport vesicles within the secretory and endocytic pathways. These genes are of particular interest in the protozoan phylum Apicomplexa, since a family of Rab GTPases has been described for Plasmodium and most putative secretory pathway proteins in Apicomplexa have conventional predicted signal peptides. Moreover, peptide motifs have now been identified within a large number of secreted Plasmodium proteins that direct their targeting to the red blood cell cytosol, the apicoplast, the food vacuole and Maurer's clefts; in contrast, motifs that direct proteins to secretory organelles (rhoptries, micronemes and microspheres) have yet to be defined. The nature of the vesicle in which these proteins are transported to their destinations remains unknown and morphological structures equivalent to the endoplasmic reticulum and trans-Golgi stacks typical of other eukaryotes cannot be visualised in Apicomplexa. Since Rab GTPases regulate vesicular traffic in all eukaryotes, and this traffic in intracellular parasites could regulate import of nutrient and drugs and export of antigens, host cell modulatory proteins and lactate, we compare and contrast here the Rab families of Apicomplexa.

Graham SP, Pellé R, Honda Y, Mwangi DM, Tonukari NJ, Yamage M, Glew EJ, de Villiers EP, Shah T, Bishop R et al. 2006. Theileria parva candidate vaccine antigens recognized by immune bovine cytotoxic T lymphocytes. Proc Natl Acad Sci U S A, 103 (9), pp. 3286-3291.

East Coast fever, caused by the tick-borne intracellular apicomplexan parasite Theileria parva, is a highly fatal lymphoproliferative disease of cattle. The pathogenic schizont-induced lymphocyte transformation is a unique cancer-like condition that is reversible with parasite removal. Schizont-infected cell-directed CD8(+) cytotoxic T lymphocytes (CTL) constitute the dominant protective bovine immune response after a single exposure to infection. However, the schizont antigens targeted by T. parva-specific CTL are undefined. Here we show the identification of five candidate vaccine antigens that are the targets of MHC class I-restricted CD8(+) CTL from immune cattle. CD8(+) T cell responses to these antigens were boosted in T. parva-immune cattle resolving a challenge infection and, when used to immunize naïve cattle, induced CTL responses that significantly correlated with survival from a lethal parasite challenge. These data provide a basis for developing a CTL-targeted anti-East Coast fever subunit vaccine. In addition, orthologs of these antigens may be vaccine targets for other apicomplexan parasites.

Pain A, Renauld H, Berriman M, Murphy L, Yeats CA, Weir W, Kerhornou A, Aslett M, Bishop R, Bouchier C et al. 2005. Genome of the host-cell transforming parasite Theileria annulata compared with T. parva. Science, 309 (5731), pp. 131-133.

Theileria annulata and T. parva are closely related protozoan parasites that cause lymphoproliferative diseases of cattle. We sequenced the genome of T. annulata and compared it with that of T. parva to understand the mechanisms underlying transformation and tropism. Despite high conservation of gene sequences and synteny, the analysis reveals unequally expanded gene families and species-specific genes. We also identify divergent families of putative secreted polypeptides that may reduce immune recognition, candidate regulators of host-cell transformation, and a Theileria-specific protein domain [frequently associated in Theileria (FAINT)] present in a large number of secreted proteins.

Gardner MJ, Bishop R, Shah T, de Villiers EP, Carlton JM, Hall N, Ren Q, Paulsen IT, Pain A, Berriman M et al. 2005. Genome sequence of Theileria parva, a bovine pathogen that transforms lymphocytes. Science, 309 (5731), pp. 134-137.

We report the genome sequence of Theileria parva, an apicomplexan pathogen causing economic losses to smallholder farmers in Africa. The parasite chromosomes exhibit limited conservation of gene synteny with Plasmodium falciparum, and its plastid-like genome represents the first example where all apicoplast genes are encoded on one DNA strand. We tentatively identify proteins that facilitate parasite segregation during host cell cytokinesis and contribute to persistent infection of transformed host cells. Several biosynthetic pathways are incomplete or absent, suggesting substantial metabolic dependence on the host cell. One protein family that may generate parasite antigenic diversity is not telomere-associated.

Collins NE, Liebenberg J, de Villiers EP, Brayton KA, Louw E, Pretorius A, Faber FE, van Heerden H, Josemans A, van Kleef M et al. 2005. The genome of the heartwater agent Ehrlichia ruminantium contains multiple tandem repeats of actively variable copy number. Proc Natl Acad Sci U S A, 102 (3), pp. 838-843.

Heartwater, a tick-borne disease of domestic and wild ruminants, is caused by the intracellular rickettsia Ehrlichia ruminantium (previously known as Cowdria ruminantium). It is a major constraint to livestock production throughout sub-Saharan Africa, and it threatens to invade the Americas, yet there is no immediate prospect of an effective vaccine. A shotgun genome sequencing project was undertaken in the expectation that access to the complete protein coding repertoire of the organism will facilitate the search for vaccine candidate genes. We report here the complete 1,516,355-bp sequence of the type strain, the stock derived from the South African Welgevonden isolate. Only 62% of the genome is predicted to be coding sequence, encoding 888 proteins and 41 stable RNA species. The most striking feature is the large number of tandemly repeated and duplicated sequences, some of continuously variable copy number, which contributes to the low proportion of coding sequence. These repeats have mediated numerous translocation and inversion events that have resulted in the duplication and truncation of some genes and have also given rise to new genes. There are 32 predicted pseudogenes, most of which are truncated fragments of genes associated with repeats. Rather than being the result of the reductive evolution seen in other intracellular bacteria, these pseudogenes appear to be the product of ongoing sequence duplication events.

Bishop R, Shah T, Pelle R, Hoyle D, Pearson T, Haines L, Brass A, Hulme H, Graham SP, Taracha ELN et al. 2005. Analysis of the transcriptome of the protozoan Theileria parva using MPSS reveals that the majority of genes are transcriptionally active in the schizont stage. Nucleic Acids Res, 33 (17), pp. 5503-5511.

Massively parallel signature sequencing (MPSS) was used to analyze the transcriptome of the intracellular protozoan Theileria parva. In total, 1,095,000 20 bp sequences representing 4371 different signatures were generated from T. parva schizonts. Reproducible signatures were identified within 73% of potentially detectable predicted genes and 83% had signatures in at least one MPSS cycle. A predicted leader peptide was detected on 405 expressed genes. The quantitative range of signatures was 4-52,256 transcripts per million (t.p.m.). Rare transcripts (<50 t.p.m.) were detected from 36% of genes. Sequence signatures approximated a lognormal distribution, as in microarray. Transcripts were widely distributed throughout the genome, although only 47% of 138 telomere-associated open reading frames exhibited signatures. Antisense signatures comprised 13.8% of the total, comparable with Plasmodium. Eighty-five predicted genes with antisense signatures lacked a sense signature. Antisense transcripts were independently amplified from schizont cDNA and verified by sequencing. The MPSS transcripts per million for seven genes encoding schizont antigens recognized by bovine CD8 T cells varied 1000-fold. There was concordance between transcription and protein expression for heat shock proteins that were very highly expressed according to MPSS and proteomics. The data suggest a low level of baseline transcription from the majority of protein-coding genes.

