Analysis of sequence diversity among IS6110 sequence of Mycobacterium tuberculosis: possible implications for PCR based detection.

IS6110 belongs to the family of insertion sequences (IS) of the IS3 category. It was reported to be specific for the Mycobacterium tuberculosis complex and is therefore widely used, through the polymerase chain reaction (PCR), for laboratory detection of the agent of tuberculosis and for epidemiological investigations. IS6110 is 1361 bp long, and different regions within it have been used as PCR targets for identifying M. tuberculosis. However, the results are not always consistent, specific and sensitive. In recent years, a few clinical investigations have raised concerns over the specificity and sensitivity of IS6110 in the diagnosis of tuberculosis, because IS6110-specific primers can give false-positive results (homology with target DNA from organisms other than M. tuberculosis) or false-negative results (absence of IS6110 copies). To characterize the variation in IS6110, we carried out an in silico analysis of IS6110 sequences from different strains of M. tuberculosis. Our comparative analysis of IS6110 sequences from the M. tuberculosis complex suggests that IS6110 harbors sequence variation, as is evident from the phylogenetic analysis. Importantly, IS6110 copies within the same strain diverge from one another and fall into different clusters. A list of IS6110-specific primers used in various clinical investigations of tuberculosis was obtained from the literature and their performance was scrutinized. Our study emphasizes the need to develop PCR assays (multiplex format) that target more than one region of the M. tuberculosis genome.


Background and Description:
Tuberculosis (TB) is an infectious disease that affects millions of people worldwide and is caused by the Mycobacterium tuberculosis complex. In 2009, WHO estimated that there were about 9.4 million new cases and 1.3 million deaths globally due to tuberculosis [1]. This high incidence necessitates research on precise diagnostic methods to guide specific treatment and management. Currently, the disease is diagnosed by sputum smear examination, which is rapid and cheap but lacks specificity. Traditional microbial culture on solid (Lowenstein-Jensen) or liquid media provides a definitive diagnosis of active infection but is time consuming (6-8 weeks). With the advent of automated or semi-automated liquid culture systems, the time to detect growth of mycobacterial species has been shortened significantly (to roughly 14 days) [2]. However, these systems are expensive and are unavailable even in many tertiary care centres in developing countries. In recent years, a number of molecular diagnostic methods for tuberculosis have been developed based on polymerase chain reaction (PCR) amplification of targets such as IS6110, hsp65, 16S rRNA, the 85B antigen and the 38 kDa antigen. IS6110 belongs to a family of insertion sequences (IS) of the IS3 category, and its amplification is the approach most commonly used to detect M. tuberculosis because it is highly conserved. In addition to diagnosis, IS6110 has been used for molecular epidemiological analysis of clinical isolates. However, the sensitivity and specificity of IS6110 in the diagnosis of tuberculosis remain uncertain and need to be appraised by in silico analysis. In recent years, IS6110-based diagnosis has been shown to be hampered by strains that carry this repetitive sequence in low copy number or lack it altogether.
A few clinical investigations have reported low copy numbers of IS6110 in M. tuberculosis strains from several regions. In Tunisia, 75% of the strains showed 6-10 copies [3]; in Denmark, 50% of the strains analyzed showed 11-15 copies [4]; and in different geographical regions of India, 11 to 20% of the strains showed no copies or only 1-2 copies of IS6110 [5]. A small number (<1%) of IS6110-deficient strains has been reported in San Francisco [6], and 2% in Vietnam [7]. In the present study, our primary aim was to detect IS6110 sequence differences within the M. tuberculosis complex by constructing phylogenetic trees from the IS6110 sequences available in the GenBank database. The secondary aim was to analyze the discriminatory potential of primer pairs reported in the literature for PCR-based diagnosis of tuberculosis.

Retrieval of IS6110 sequences:
The IS6110 sequences were retrieved from the GenBank repository for the M.

Sequence alignment and phylogeny construction:
The ClustalW program in MEGA 5.03 [8] was used to align the IS6110 nucleotide sequences. Each alignment was examined visually to detect short misalignments and poorly aligned regions. From the aligned sequences, a phylogenetic tree was constructed by the Maximum Likelihood method with the Tamura-Nei model. We also analyzed the discriminatory potential of a few primer sets that have been used successfully to amplify different regions of IS6110 in PCR assays for the detection of M. tuberculosis in various clinical investigations.
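As a rough illustration of how divergence between aligned IS6110 copies can be quantified, the following Python sketch computes the uncorrected p-distance (the proportion of differing sites) for every pair of aligned sequences. This is a simpler measure than the Maximum Likelihood and Tamura-Nei approach used in MEGA, and it is not the procedure used in this study. The function names and the toy alignment are hypothetical and are not real IS6110 data.

```python
from itertools import combinations


def p_distance(seq_a, seq_b):
    """Proportion of differing sites between two aligned sequences.

    Columns where either sequence has a gap ('-') are skipped.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    differences = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == "-" or b == "-":
            continue
        compared += 1
        if a != b:
            differences += 1
    if compared == 0:
        raise ValueError("no comparable (ungapped) sites")
    return differences / compared


def distance_table(aligned):
    """Pairwise p-distances for a dict mapping copy name -> aligned sequence."""
    return {
        (name_a, name_b): p_distance(aligned[name_a], aligned[name_b])
        for name_a, name_b in combinations(aligned, 2)
    }


# Hypothetical toy alignment standing in for IS6110 copies of one strain.
toy_alignment = {
    "copy_1": "ATGGCC-TTA",
    "copy_2": "ATGGCCATTA",
    "copy_3": "ATGACC-TTG",
}

table = distance_table(toy_alignment)
for pair, dist in table.items():
    print(pair, round(dist, 3))
```

Copies separated by non-zero distances would be expected to fall on different branches of a tree built from the same alignment, although a model-based method such as the one used in MEGA also corrects for multiple substitutions at the same site.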

Phylogenetic analysis:
The phylogenetic analysis of the IS6110 sequences of the M. tuberculosis H37Ra strain revealed sequence divergence among the individual copies of IS6110 within a given strain (Figure 1) IS6110 (Figure 2).
Different regions of the IS6110 sequence are used as targets for PCR amplification. Many of the published assays do not describe their sensitivity and specificity, and the reported data show varied sensitivity and specificity for different regions of the IS6110 sequence. This makes it difficult for laboratory staff to select an optimal primer set. The majority of assays target the nt 762-883 region for PCR amplification. One primer set, used for PCR identification from both buffy coat and sputum, was reported to have 100% sensitivity and 95.1% specificity in a nested format compared with blood culture (BacT/ALERT 3D, bioMérieux) [22]. These primers were found to be more efficient than other primers used to identify M. tuberculosis strains lacking IS6110 [23]. Thus, information on each primer set, together with its clinical and assay sensitivity and specificity and its detection limit, would help identify an optimal primer set for rapid diagnosis.
[24], i.e., IS6110 is likely to have originated from (or been passed on to) other organisms, and certain regions of DNA may have remained conserved among these organisms during evolution. This raised the possibility that some laboratories may have amplified stretches of DNA related to IS6110 from organisms other than M. tuberculosis, which would explain some of the false-positive tests reported in the literature. In a GenBank search with a 181-bp sequence, 63.3% homology in a 177-bp overlap was found with the insertion sequence IS629 of Shigella sonnei, and 63.8% homology in a 177-bp overlap was found with the insertion sequence IS3411 of Escherichia coli.
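The effect of sequence variation on primer binding can be explored in silico before any laboratory work. The following Python sketch is a minimal illustration, not the procedure used in this study: it scans a template for a forward primer and for the reverse complement of a reverse primer, tolerates a chosen number of mismatches, and reports the predicted product sizes. The function names, the toy template, its variant and both primers are hypothetical and are not real IS6110 sequences.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string (A, C, G, T only)."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def binding_sites(template, primer, max_mismatches=0):
    """Start positions where the primer matches with at most max_mismatches."""
    template, primer = template.upper(), primer.upper()
    hits = []
    for start in range(len(template) - len(primer) + 1):
        window = template[start:start + len(primer)]
        mismatches = sum(a != b for a, b in zip(window, primer))
        if mismatches <= max_mismatches:
            hits.append(start)
    return hits


def predicted_amplicons(template, forward, reverse, max_mismatches=0):
    """Sizes of products expected from one forward/reverse primer pair."""
    fwd_starts = binding_sites(template, forward, max_mismatches)
    # The reverse primer anneals to the opposite strand, so on the given
    # strand we look for its reverse complement.
    rev_starts = binding_sites(
        template, reverse_complement(reverse), max_mismatches
    )
    sizes = []
    for f in fwd_starts:
        for r in rev_starts:
            if r >= f + len(forward):  # reverse site lies downstream
                sizes.append(r + len(reverse) - f)
    return sorted(sizes)


# Hypothetical toy data: VARIANT differs from TEMPLATE by one base
# inside the forward primer site.
TEMPLATE = "TTTTACGTGCAAGGGCCCTTTAAACATGGTCAAAAA"
VARIANT = "TTTTACGAGCAAGGGCCCTTTAAACATGGTCAAAAA"
FORWARD = "ACGTGCAA"
REVERSE = "TGACCATG"

exact = predicted_amplicons(TEMPLATE, FORWARD, REVERSE)
strict = predicted_amplicons(VARIANT, FORWARD, REVERSE)
relaxed = predicted_amplicons(VARIANT, FORWARD, REVERSE, max_mismatches=1)
print(exact, strict, relaxed)
```

In this toy case a single substitution under the forward primer abolishes the exact match, which mirrors how variation among IS6110 copies could produce a false-negative result. Loosening the mismatch tolerance recovers the site, but the same tolerance would also admit related sequences from other organisms.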
In addition, this central region was amplified with the IS6110-specific primers from the majority of non-M. tuberculosis species and Southern hybridized with an IS6110-derived probe; we thereby confirmed that this central region is homologous to sequences in mycobacteria other than tuberculosis. Together, these data suggest that care must be taken when designing specific primers for the IS6110 sequence of M. tuberculosis. The presence of multiple copies of IS6110 in the majority of M. tuberculosis strains enhances the sensitivity of PCR tests by providing a source of "preamplified" sequence; in addition, if one copy is deleted or mutated, the others should remain intact for amplification. It is well documented that M. tuberculosis isolates containing five or fewer IS6110 copies cannot be reliably differentiated by the RFLP method [25]. In conclusion, we suggest that great care be taken in designing primer pairs for IS6110 to prevent false-positive or false-negative results. Where IS6110 RFLP typing is not possible because the element is present in low copy number or is absent, an alternative genotyping method should be considered for accurate epidemiologic or diagnostic purposes. For diagnostic PCR, multiplexing by targeting two regions, such as IS6110 and hsp65, could be a good strategy.
ISSN 0973-2063 (online) 0973-8894 (print). Bioinformation 6(7): 283-285 (2011). © 2011 Biomedical Informatics.