TB revealed

THE origins of tuberculosis (TB) can be traced back to the early days of agriculture, about 10,000 years ago, which brought Homo sapiens into close contact with cattle infected with the disease. The cattle form of the bacterium is thought to have derived from the soil. The bacterium gradually evolved to adapt itself to the niche provided by the human lung. In most individuals, the infection is dormant and does not lead to any obvious symptoms. When this peaceful coexistence breaks down, as happens in a relatively small proportion of individuals (who nevertheless amount to a million or more every year), the disease can be fatal.

Going by current figures, tuberculosis has emerged as one of the biggest killers of our time. Poor hygiene and sanitation are major causes of this, as is the premature termination of treatment. The BCG vaccine is not as effective as one would like it to be. Also, the widespread use of antibiotics has led to the emergence of several resistant strains. Finally, tuberculosis appears to be an almost compulsory accompaniment to AIDS and returns the favour by accelerating the progression of this dreaded disease.

This is what makes the report that the entire DNA sequence of a virulent strain of Mycobacterium tuberculosis (M tuberculosis) has been deciphered such a welcome one. The work, co-authored by as many as 41 scientists from four institutions in as many countries, was conducted by a group led by S T Cole of the Sanger Centre in the UK and has several interesting and valuable implications.

First of all, as a matter of interest, the bacterium has 4,411,529 base pairs in its DNA, and can be expected to contain around 4,000 different genes. In this respect, it is not very different from the more common and usually benign intestinal bacterium Escherichia coli. It has an unusually high representation of two bases out of the four, namely guanine and cytosine, and consequently the composition of its proteins also shows a bias.

A very large number of genes seem to be involved in making up the cell wall of this bacterium, an observation that might have been expected on the grounds that its cell wall has an unusual structure. Given this, these genes offer possible routes for vaccine development. Similarly, the genome sequence showed the existence of two large families of proteins rich in the amino acid glycine, and this again could be useful from the point of view of a potential vaccine. The molecular basis of virulence (the essential difference between a virulent strain and a non-virulent one) is still not understood, nor is the precise reason why the infection is virulent in some individuals but not in others. For that matter, one does not know why the infection lies dormant in many individuals. Some scientists think varying levels of stress between one person and another may be a part of the story.

Once again, the knowledge of the full genome sequence should speed up the hunt for virulence factors; just three have been discovered so far, all before the genome sequence was known. Curiously, the degree of genetic differentiation between strains appears to be extremely small. However, among the variable elements, there is a family of sequences that appear to specify proteins with common repeated regions, and by varying the pattern of these proteins, the bacterium could be fooling the human immune system.

The sequence of a related mycobacterium, the one responsible for causing leprosy, is also being worked out and should be available soon. The availability of tools for artificially manipulating the genome of M tuberculosis suggests the sequence data can be put to use rapidly in interventionist experiments. The World Health Organisation has categorised tuberculosis as a global emergency, and any hope of containing its menace must be followed up.