Dung beetle database: comparison with other invertebrate transcriptomes.

The dung beetle E. intermedius, a member of the highly diverse order Coleoptera, has immense economic benefits. It was estimated that insect ecological services in the United States amounted to some $60 billion in 2006, with dung beetles being major contributors. E. intermedius may be endowed with a robust immune system given its microbe-rich habitat. Dung beetles feed on the liquid and microbes in dung and are therefore potential models for the study of infectious agents and ecological damage. The E. intermedius database is a web-based system for the genome and transcriptome of the dung beetle. The database will be expanded to include genes differentially expressed in response to various stresses, especially infectious agents such as fungi, bacteria and viruses.
Availability: The dung beetle transcriptome database is freely available at http://flylab.wits.ac.za/


Background:
The order Coleoptera is the most diverse on earth. The scarab beetle, Euoniticellus intermedius, is highly effective in new habitats and often more successful than native beetles. E. intermedius makes tunnels beneath dung pats, where it feeds and reproduces. Adults obtain nutrients from the microbe-rich liquid portion of manure and do not actually consume the dung itself. The larvae consume most of the dung in the brood balls where the eggs are laid.
Dung beetles have important agricultural benefits and are often introduced into the environment to alleviate ecological damage. A recent meeting of some 1500 conservation biologists at Columbia University found that loss of dung beetles could hurt the ecosystem [1]. The benefits include nutrient recycling, improvements to soil tilth and pest control. E. intermedius beetles are the most beneficial to pasture health and agriculture, as they enhance soil conditions by increasing percolation. Dung beetles also reduce the number of parasites acquired by cattle and greatly reduce populations of pestiferous flies that are dangerous to livestock, such as the African buffalo fly.
So far, mostly behavioural, ecological and taxonomic studies of E. intermedius have been conducted. This is the first significant molecular biology study of E. intermedius. We are interested in these beetles because their microbe-rich habitat suggests that they may have a potent immune system. Their immune system could provide useful strategies for the treatment of infectious diseases affecting humans and animals. The study of these beetles could also help in managing ecosystems and in enhancing agricultural practices.

Methodology:
Data collection and sequencing
cDNA was synthesized from mRNA isolated from adult beetles and sequenced with GS FLX technology by a commercial facility (Inqaba Biotec (Pty) Ltd).

Sequence processing and analysis
Sequence processing and analysis were performed with the EST2uni (EST analysis software to create an annotated UNIgene database) pipeline [4]. The EST2uni pipeline automates the pre-processing, clustering, annotation, database creation, data mining and retrieval of the EST collection. To reach a fully functional EST2uni pipeline on a Unix-based server, the following

Database content
The database contains 6064 ESTs, further processed into 744 contigs and 1918 singletons. Sequence comparisons between the E. intermedius transcriptome and other species were performed with TBLASTX and BLASTX at a cutoff E-value < 1 × 10^-10. The compared species fall into the two main insect branches: holometabolous (Apis mellifera, Bombyx mori, Tribolium castaneum, Drosophila melanogaster) and hemimetabolous (Acyrthosiphon pisum). The comparison shows that E. intermedius has closer sequence homology to Tribolium than to the other species. Functional comparison was based on the Drosophila melanogaster gene ontology. Motifs and domains were identified using Pfam and HMMER at a cutoff E-value < 1 × 10^-10. These sources show that 472 (17%) of the unigenes in the E. intermedius transcriptome database have identifiable motifs and domains. The database can be searched using BLAST and by unigene ID (Figure 1).
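
As an illustration of the E-value cutoff used for the sequence comparisons, the following minimal Python sketch keeps, for each query, the best hit below the threshold. It assumes the BLAST results were exported in the standard 12-column tabular format (query ID in column 1, subject ID in column 2, E-value in column 11). The function name, the sequence identifiers and the sample hit lines are hypothetical and are not taken from the database. The last lines redo the unigene arithmetic under the further assumption that the unigene set is the sum of contigs and singletons.

```python
E_VALUE_CUTOFF = 1e-10  # cutoff stated in the text


def best_hits(tabular_lines, cutoff=E_VALUE_CUTOFF):
    """Return {query_id: (subject_id, evalue)} for the best hit below the cutoff.

    Assumes 12-column tab-separated BLAST output (hypothetical input here).
    """
    best = {}
    for line in tabular_lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment lines
        fields = line.split("\t")
        query, subject, evalue = fields[0], fields[1], float(fields[10])
        if evalue >= cutoff:
            continue  # hit is not significant at the chosen cutoff
        if query not in best or evalue < best[query][1]:
            best[query] = (subject, evalue)
    return best


# Hypothetical hit lines, for illustration only.
sample = [
    "contig_1\tTC000001\t85.0\t100\t15\t0\t1\t300\t1\t100\t1e-40\t160",
    "contig_1\tDM000002\t60.0\t100\t40\t0\t1\t300\t1\t100\t1e-12\t70",
    "singleton_7\tTC000003\t40.0\t50\t30\t0\t1\t150\t1\t50\t1e-05\t40",
]
hits = best_hits(sample)

# Counts reported in the text; unigenes = contigs + singletons is an assumption.
contigs, singletons = 744, 1918
unigenes = contigs + singletons
with_domains = 472
fraction = with_domains / unigenes
print(f"{unigenes} unigenes, {fraction:.1%} with motifs or domains")
```

In this sketch only the first query retains a hit, because the single hit for the second query has an E-value above the cutoff. Keeping one best hit per query is a design choice made for the example; the pipeline itself may retain several hits per unigene.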

Future development:
The database will be expanded to include genome sequences of E. intermedius. In addition, genes differentially expressed in response to a variety of stresses in D. melanogaster and E. intermedius will be included. To this end, the website is called the Flylab Genomebase to reflect its broader purpose.

Figure 1:
A snapshot of the database. In addition to the BLAST page for comparing user sequences with the database, this page allows the user to view the full sequences in the dataset and to search entries by sequence ID.