Mapping disease risk

Print edition : June 20, 2008

An Indian database has been created by profiling the population on the basis of changes in genes with disease linkages.

THE first results from the project of the Indian Genome Variation Consortium (IGVC) have clearly demonstrated (Frontline, June 6) that even as the Indian population exhibits a genetic diversity unmatched anywhere in the world, there are within it pockets of homogeneous ethnic groups that have remained relatively genetically unadmixed. (The IGVC's multi-institutional project was set up to evolve a disease-linked genetic map of India.)

The project has also shown that the Indian population includes groups that show genetic affinities to the populations of Central Europe (CE), China (CH) and Japan (JP) and are genetically well separated from the rest of the Indian groups. One of the findings of the International HapMap Project launched in 2002, which included the above population groups but significantly did not sample any Indian population groups, was that certain (functional) alleles, or variants of disease-associated genes, had distinct distribution frequencies across the groups studied, notwithstanding the fact that the sample sizes in HapMap were small.

Following HapMap, but going much further, an Indian Genome Variation database (IGVdb) has been created by profiling the population on the basis of changes, or polymorphisms, in genes with established or suggested disease linkages. The database has revealed a significant differentiation of these polymorphisms across Indian population groups. At the end of the project's first phase, the IGVdb includes data on 75 genes and 405 associated polymorphisms. The data were obtained by genotyping the 1,871 samples drawn from the 54 distinct ethnic groups that represent the four major linguistic groups (Austro-Asiatic, Tibeto-Burman, Indo-European and Dravidian) and the six geographical areas of the country (north, south, east, west, central and north-east) and the one outgroup known to be of Negroid origin.

The significance of this basal data lies in the fact that it has, for the first time, provided a proper basis for the identification of appropriate cohort groups for pooling samples in gene-disease association studies in the country. A countrywide random sampling would miss this stratification and could lead to false conclusions. Following the IGVC project, a population stratification on the basis of variants of established or candidate disease genes now exists in the Indian context, and the database could serve as the background template for a genetic map for disease risk.
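The risk of ignoring stratification can be illustrated with a small calculation. The sketch below uses two invented population groups (the group sizes, carrier frequencies and disease rates are hypothetical and are not IGVdb figures) in which the allele has no effect on disease within either group. Pooling the two groups nevertheless produces an apparent association, simply because one group has both more carriers and more disease.

```python
def odds_ratio(a, b, c, d):
    """Odds ratio for a 2x2 table.

    a, b: carriers and non-carriers among cases
    c, d: carriers and non-carriers among controls
    """
    return (a * d) / (b * c)


def expected_table(n, carrier_freq, prevalence):
    """Expected 2x2 counts in a group where the allele has NO effect on disease."""
    cases = n * prevalence
    controls = n - cases
    return (
        cases * carrier_freq,
        cases * (1 - carrier_freq),
        controls * carrier_freq,
        controls * (1 - carrier_freq),
    )


# Hypothetical groups: A has many carriers and a high disease rate,
# B has few carriers and a low disease rate.
group_a = expected_table(1000, carrier_freq=0.40, prevalence=0.20)
group_b = expected_table(1000, carrier_freq=0.05, prevalence=0.02)

# A countrywide sample that ignores group membership simply adds the tables.
pooled = tuple(x + y for x, y in zip(group_a, group_b))

print("group A odds ratio:", round(odds_ratio(*group_a), 2))
print("group B odds ratio:", round(odds_ratio(*group_b), 2))
print("pooled odds ratio: ", round(odds_ratio(*pooled), 2))
```

Within each group the odds ratio is 1, that is, no association; in the pooled table it comes out at roughly 2.2, a false signal created entirely by the mixing of groups.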

This article describes some studies on validated functional polymorphisms of certain disease-linked genes and the implications for the different population groups of the country of the data generated by the IGVdb. In fact, the IGVdb, particularly after its second phase, which is based on 1,000 genes and nearly 5,000 genetic markers, becomes a potential tool for population-specific, pre-emptive medical intervention in the country's public health programmes of the future. These initial association studies also indicate directions for such future investigations in India and elsewhere.

"We first looked at things that we can do; disorders where the number of genes expressed is low, like the eye, muscles, diabetes and also genes which are well studied at the protein level, like the drug metabolism genes or the drug response genes where polymorphisms can lead to altered metabolism rates," points out Samir K. Brahmachari, Director General of the Council of Scientific and Industrial Research (CSIR) and former Director of the Institute of Genomics and Integrative Biology (IGIB), the project's nodal institution. "For instance, dosage-dependent, long-acting neuropsychiatric drugs can lead to complications if the drugs are metabolised more. So the first step was to obtain data on such disorders and the associated set of gene polymorphisms on limited clinical samples," says Brahmachari. In fact, many of the 405 gene polymorphisms used for the project were identified on the basis of several earlier clinical studies in India. "Now with the IGVdb, we can look at the frequencies of these polymorphisms across the country. One can then ask what do the data imply for countrywide population groups," Brahmachari adds.

In a significant 2001 study, Partha Majumder and Badal Dey of the Indian Statistical Institute (ISI), Kolkata, found that the bulk of the Indian population was susceptible to the human immunodeficiency virus (HIV). They studied 1,436 individuals, unrelated at least to the second cousin level and belonging to 40 ethnic groups, and found that a protective allele, a certain polymorphism of a gene called beta-chemokine receptor gene CCR5, was absent in most ethnic populations inhabiting the eastern, north-eastern, southern and central regions of the country. The highest frequency of this allele was found among Muslims of the north. This obviously has important implications for the epidemiology of HIV and for the strategy to control acquired immune deficiency syndrome (AIDS) in the country, especially when projections indicate that HIV infection will spread rapidly in India (notwithstanding the recent downward revision of infection rates).

HIV gains entry into a susceptible cell through a certain family of receptors of chemokines (proteins secreted by cells), the most important of which is the one encoded by the CCR5 gene. But the polymorphism of the CCR5 gene (Fig. 1) that confers protection against HIV-1, the dominant transmitting virus strain in India, is largely absent in Indian populations. This protective allele has a 32 base-pair deletion in the gene CCR5. Majumder and Dey also found that the individuals who were homozygous in this particular allele were the ones highly resistant to HIV-1 infection and not those who were heterozygous.

Homozygotes are those who carry two copies of the allele in the paired chromosomes, at identical loci, and heterozygotes carry one copy of the allele and one copy of the unchanged gene. In some studies, heterozygotes too were found to have significantly lower viral loads. Interestingly, the ISI study found no homozygotes at all, even in populations where a few samples carried the allele. It was originally believed that non-European populations did not carry the protective allele.

A more recent global survey, however, has found that the allele is present at frequencies of 2 to 5 per cent throughout Europe, West Asia and the Indian subcontinent. But the survey was limited, in the sense that it covered only the northern and western regions of the country, where the allele was found to be present at levels of 1.5 to 4.7 per cent. Majumder and Dey point out that the surveyed regions are known to have had a high admixture with Caucasians about 8,000 to 10,000 years ago, and the relatively higher frequencies in these regions may be due to the Caucasian gene flow. The ISI study, while consistent with the global survey, points to a situation of concern for the country as a whole, one that calls for appropriate disease-control strategies. The IGVdb data, drawn from a somewhat larger sample than the ISI study, only seem to confirm the earlier findings.
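A rough back-of-the-envelope check, which is not part of the ISI study, suggests why homozygotes are so hard to find at these allele frequencies. The sketch below assumes Hardy-Weinberg proportions (random mating, no selection), under which the expected share of homozygotes is the square of the allele frequency. The function name is illustrative, and applying the survey's frequencies to the whole ISI sample is a deliberately generous assumption, since most groups in that study carried the allele rarely or not at all.

```python
def expected_homozygotes(n_individuals, allele_freq):
    """Expected number of people carrying two copies of an allele.

    Assumes Hardy-Weinberg proportions: a fraction q**2 of individuals
    is homozygous when the allele frequency is q.
    """
    return n_individuals * allele_freq ** 2


# Allele frequencies reported for northern and western India
# (1.5 to 4.7 per cent), applied to a sample the size of the
# ISI study (1,436 individuals).
for q in (0.015, 0.047):
    print(q, round(expected_homozygotes(1436, q), 2))
```

Even at the upper end of the range, only about three homozygotes would be expected among 1,436 people, so finding none is not surprising.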

Variants of as many as 30 to 40 genes have been associated with susceptibility or resistance to Plasmodium falciparum malaria. It is also believed that malaria has been the selective pressure for several disorders of the red blood cells such as sickle cell anaemia and thalassaemia. Different population groups across the world have exhibited differential association of the disease with these genes. However, most polymorphism-disease association studies have been carried out in African and South-East Asian populations, and only limited studies exist in Indian populations.

The incidence of and mortality rates from P. falciparum malaria in India are high. Several regions of the country are at high risk for P. falciparum, with the infection accounting for more than 80 per cent of all malaria cases in some areas (Fig. 2D). "Pathogenicity of the disease is quite different in several Indian population groups compared with African populations," points out Saman Habib, a malaria expert at the Central Drug Research Institute (CDRI), Lucknow, and a lead scientist in the IGVC project.

In a case-control study published in January, Saman Habib and associates looked at the correlation of the severity of the disease and susceptibility to severe malaria with polymorphisms in two genes known to play an important role in the pathogenesis of P. falciparum malaria. The study included 86 subjects from regions of endemicity (Antagarh, Chhattisgarh, and Sundargarh, Orissa) and 197 subjects from regions of non-endemicity (Lucknow and surrounding regions in Uttar Pradesh). The endemic population comprised mainly tribal groups of Austro-Asiatic and Dravidian linguistic lineage, while the non-endemic population mainly comprised large caste and religious groups of Indo-European lineage.

The genes studied were the tumour necrosis factor (TNF) and FCGR2A genes. The former is involved in inflammatory and immune responses and the latter is a gene for an immunoglobulin-G receptor, which serves as an important link between T-cell-mediated immune response and antibody-mediated response.

The study found differences in the frequencies of Single Nucleotide Polymorphisms (or SNPs, pronounced "snips") of the TNF-enhancer and FCGR2A genes between the populations of endemic and non-endemic regions. (SNPs constitute the simplest mutation a gene undergoes, involving a change in a single base molecule (A, T, C or G) at a particular position.) Two SNPs of the TNF-enhancer gene (involving switches from base molecule T to C and C to A respectively) were found to have significant association with the disease in the endemic populations, and these were found to correlate well with enhanced TNF levels.

A single SNP of the FCGR2A gene, involving a switch from the base molecule G to A, altered the affinity of the receptor for immunoglobulin-G, with the G allele having a low affinity and the A allele having a high affinity. Interestingly, the major, or the G, allele was associated with susceptibility to severe malaria and the minor, or the A, allele was found to offer protection against severe disease manifestation. While the latter observation is in conformity with studies on a Sudanese population, it is the opposite of what studies conducted elsewhere have found.

These findings were validated against the 1,871 samples from 55 ethnic groups of the IGVC study to arrive at a pan-India malaria susceptibility map (Fig. 2). "The IGVdb provides background frequencies for the associated alleles and enables pooling of appropriate control population and helps in deciding the kind of case-control study that should be undertaken," says Saman Habib. In two months' time, she plans to launch, with more collaborators, a more elaborate gene-disease association study involving 13 to 14 genes from the 40-odd known ones. She is also engaged in studying the role of the complement receptor 1 (CR1) gene, whose various polymorphisms are known to play a role in the levels of receptor molecules.

By identifying genetic variants associated with drug efficacy or drug response, pharmacogenomics (genetic-information-based medicine) offers a premise for personalised treatment or population-specific medical intervention based on the differentiation of the polymorphisms across population groups. Several studies have focussed on such an association with salbutamol, or albuterol, the most commonly prescribed medication for asthma. However, results from such studies have differed in the associations found and also show considerable variations between populations. Now, with the availability of the IGVdb, it has become possible to use the data for mapping the population groups in the country for the variants in the gene for the receptor, the target for the asthma drug. Salbutamol, which is a short-acting beta2-adrenergic receptor (ADRB2) agonist, is the commonly used bronchodilator for asthma and other respiratory disorders. It acts on the receptor and causes smooth muscle relaxation, dilation of bronchial pathways, vasodilation, and so on.

Variants in the ADRB2 gene can result in varying responses of the receptor to the drug. A collaborative study involving the IGIB and the pharmaceutical company Nicholas Piramal found as many as 10 SNPs in the gene in Indian populations, of which one was significantly associated with the receptor's response to salbutamol. In this SNP, the base molecule A is replaced by G, and by genotyping the patients for the SNP, the study found that homozygotes in the A allele (individuals carrying two copies of the A allele) were poor responders (with a probability of 0.81) and homozygotes in the G allele were good responders (with a probability of 0.73) (Fig. 3). Heterozygotes (individuals carrying one G allele and one A allele) showed an intermediate response.

Interestingly, this finding in Indian populations is in contrast with studies in Caucasian and Japanese populations but similar to studies in African populations. A countrywide map of the SNP provided by the IGVdb can be used to extend the study to larger control groups to confirm the findings. These could then form the basis of future epidemiological studies to identify populations with differential drug response and to evolve a genotype-specific drug-dosage regimen.

Elevated levels of homocysteine (Hcy) in the human system have been implicated in a range of disorders, including paediatric acute lymphoblastic leukaemia, ischaemic stroke and cardiovascular diseases. Hcy is an important molecule, which is formed in body cells, and is a key intermediate in the metabolism cycle of the amino acid methionine. Methionine metabolism is critical to transmethylation reactions, which play an important role in the protein expression of genes. Thus, an increased level of Hcy implies its reduced re-methylation to methionine. This dysfunction in the methionine cycle can result in reduced levels of critical proteins required for body functions.

The levels of Hcy can get enhanced because of both genetic and dietary factors. As for the latter, folate (present in green leafy vegetables) and vitamin B12 are critical to maintaining the proper conversion of Hcy to methionine by an enzyme called MTHFR (methylenetetrahydrofolate reductase). Sources of B12 are mainly animal (chiefly bacterial) proteins, and thus, vegetarians are usually deficient in it. As regards genetic factors, defects or polymorphisms in the gene MTHFR that codes for the enzyme can result in altered levels of Hcy.

Among the known polymorphisms that play a dominant role in elevating Hcy levels is the allele MTHFR C677T of the MTHFR gene, a SNP where T replaces the base molecule C. Thus, the presence of this MTHFR polymorphism, along with folate and B12 deficiency, would have a compounding effect in increasing Hcy levels and thus increasing the risk for cardiovascular diseases.

According to Shantanu Sengupta of the IGIB, who has looked at the role of MTHFR polymorphisms in cardiovascular diseases, compared with the normal enzyme activity in homozygote individuals in the C allele, enzyme activity in heterozygotes in C and T alleles is reduced by 30 per cent and in homozygotes in the T allele by 70 per cent (bar chart in Fig. 4). In a separate study, Sengupta also demonstrated that vegetarians, in the presence of MTHFR polymorphism, do have higher levels of Hcy. The IGVdb gives the genotype profile of the Indian population (map in Fig. 4) and therefore provides a basis for the possible selective targeting of populations for folate and B12 supplementation.

However, in the Indian population, the overall frequency for this allele is much lower than in the populations covered by HapMap except for the group of Negroid origin, which is closer to the Indian frequency. According to the IGVdb, only 3 per cent of the subjects genotyped were homozygous in T, and this variant was not observed in 29 of the 55 population groups covered by the IGVC project. The interesting thing to note in the context of linkage to vegetarianism, however, is the predominance of the risk genotype in the north, which has a relatively larger share of non-vegetarians compared with the rest of the country. "Further investigation, with larger samples than this initial study, is required to establish the reasons for this," Sengupta says. And the larger database that would become available after the second phase of the IGVC project would be important for this.
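To see how a genotype profile might feed into such targeting, the activity reductions quoted by Sengupta can be combined with a group's genotype frequencies to give an average relative enzyme activity. The sketch below is purely illustrative: the function and variable names are invented here, the 3 per cent TT share is the overall IGVdb figure, and the CC and CT shares are hypothetical placeholders rather than IGVdb data.

```python
# Relative MTHFR enzyme activity by genotype, from the figures quoted in
# the text: CT is reduced by 30 per cent and TT by 70 per cent versus CC.
RELATIVE_ACTIVITY = {"CC": 1.0, "CT": 0.7, "TT": 0.3}


def mean_relative_activity(genotype_freqs):
    """Frequency-weighted average enzyme activity for a population group."""
    total = sum(genotype_freqs.values())
    if abs(total - 1.0) > 1e-9:
        raise ValueError("genotype frequencies must sum to 1")
    return sum(RELATIVE_ACTIVITY[g] * f for g, f in genotype_freqs.items())


# Hypothetical group: 3 per cent TT (the overall IGVdb figure);
# the CC and CT shares are invented for illustration.
example_group = {"CC": 0.77, "CT": 0.20, "TT": 0.03}
print(round(mean_relative_activity(example_group), 3))
```

Comparing this figure across groups would give one crude way of ranking them for folate and B12 supplementation; any real programme would, of course, also need dietary and clinical data.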

Glaucoma is one of the eye disorders that can now be investigated using the countrywide database. A group led by Kunal Ray, an eye genetics specialist of the Indian Institute of Chemical Biology (IICB), Kolkata, has recently reported that a SNP in the gene CYP1B1 is an allele for susceptibility to a form of glaucoma known as primary open angle glaucoma (POAG), a complex disease that affects the optic nerve. The IGVC has mapped this SNP across the various population groups. "Glaucoma was chosen as a model disease because, while cataracts can be treated very easily, 50 per cent of the 70 million across the world affected with glaucoma become blind," says Arijit Mukhopadhyay, a student of Ray's currently at the IGIB. "Because glaucoma is a nerve disease, we also wanted to ask a broader question about its commonality with other neurodegenerative disorders," he adds.

CYP1B1 was investigated because it is a known causative factor for recessive primary congenital glaucoma (PCG) and has also recently been implicated in expediting disease onset in inherited, or familial, cases of POAG. It has also been seen that 72 per cent of all POAG cases represent the inherited form of the disease but without a clear Mendelian pattern of inheritance. These observations were seen to be indicative of possible interactions of classical mutations in the known causative genes for glaucoma with other potential risk factors, such as SNPs in CYP1B1, which can play a major role in complex diseases by having subtle effects on the protein expressed and lead to POAG.

In the study, a group of 264 POAG patients (regardless of their familial history of the disease) and 95 ethnically matched controls from West Bengal were investigated for statistically significant association of the disease with five potential SNPs in the gene CYP1B1. It was found that a particular SNP (where the base molecule C is replaced by G at a particular locus in the gene) showed much greater statistical association with the disease than the remaining four, thus suggesting it to be a risk allele for predisposition to POAG. Genotyping for this SNP showed that patients were also homozygotes for this allele, that is, the patients carried the same G allele at identical loci in the paired chromosomes.
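The kind of test behind a phrase like "statistical association" can be sketched with a chi-square test on allele counts. The example below is an illustration only and is not the analysis used by Ray's group: the allele counts are hypothetical, chosen merely to match the study's sample sizes (264 patients and 95 controls, each carrying two alleles), and the function name is invented here.

```python
def chi_square_2x2(a, b, c, d):
    """Pearson chi-square statistic for the 2x2 table [[a, b], [c, d]].

    One degree of freedom, no continuity correction.
    """
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    if 0 in (row1, row2, col1, col2):
        raise ValueError("every row and column needs at least one count")
    return n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)


# HYPOTHETICAL allele counts, not data from the study.
cases_g, cases_c = 300, 228        # 264 patients -> 528 alleles
controls_g, controls_c = 80, 110   # 95 controls  -> 190 alleles

statistic = chi_square_2x2(cases_g, cases_c, controls_g, controls_c)

# Critical value of chi-square with 1 degree of freedom at the 5 per cent level.
CRITICAL_5_PERCENT = 3.841
print(statistic > CRITICAL_5_PERCENT)
```

A statistic above the critical value means the allele is distributed differently between patients and controls more often than chance alone would explain. With five SNPs tested, a real analysis would also need to correct for multiple comparisons.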

According to Ray, earlier functional studies of this polymorphism elsewhere suggested that this particular SNP caused the higher generation of reactive oxygen species (oxygen radicals and other reactive oxygen molecules such as peroxide), which might lead to the degeneration of ocular cells that results in glaucoma. To get the distribution of this particular polymorphism in different population groups of the country, data on 542 individuals from 24 ethnic subgroups were studied.

"Remarkable variation of the SNP has been observed among the ethnic groups of India. This could provide insight for future epidemiological studies on POAG in these population groups," points out Ray. This study is now being extended to include 1,000 patients across population groups.

Apart from glaucoma, Ray's group is also engaged in studying the genes involved in the phototransduction pathway, the conversion of light signals to electric signals in the eye. According to Mukhopadhyay, as many as 30 to 40 genes are involved in this. The first phase of the study will look at SNPs associated with 15 genes and try to find out, using statistical analysis of the variance of these SNPs across population groups, whether one can pick out the SNPs that are under positive or negative selection pressure.

Schizophrenia and bipolar disorder (manic depression) are two neuropsychiatric disorders (NPDs) whose genetic predisposition can now be investigated across population groups in the country on the basis of IGVdb data, according to Sanjeev Jain of the National Institute of Mental Health and Neurosciences (NIMHANS), Bangalore, who has studied the genetic basis of these diseases extensively.

A number of genes on chromosome 22q, according to Jain, are known to be associated with these two major NPDs. The IGVdb data set clearly shows that polymorphisms of these vary markedly across ethnic groups in the country. "Our studies at NIMHANS on two or three large pedigree [groups] have shown that polymorphisms in one particular gene, known as MLC1, which has a role in learning and adaptability in the adult brain, are associated with both NPDs. We also know that this is a genetic defect as five or more people in the same family were found to be affected," says Jain.

The involvement of MLC1 in both disorders suggests a common pathway in their pathogenesis, but predisposition to one or the other seems to depend on the different SNPs of the gene that an individual harbours. According to Jain, a case-control study on 216 bipolar disorder and 193 schizophrenia patients from South India showed that individual patients had minor changes in the same gene, indicating that the gene was prone to functional polymorphisms. Interestingly enough, while the findings seem to be similar to those in Chinese populations, within the country the associations were found to be significant only in the large population groups and not in isolated tribal groups, such as those in the north-east.

"This suggests that there may be something peculiar happening in the particular gene by way of interaction with other genes and selection, which needs further investigation with larger samples from across various population groups of the country," says Jain. The IGVdb allows this to be carried out, by comparing population groups across the country as well as populations elsewhere. A larger collaborative study between NIMHANS, the IGIB and Manipal University aims to study 1,000 schizophrenic patients. "With the involvement of Manipal, we will be able to study populations of the Western Ghats, which are much more admixed with tribal populations," says Jain.

Several other such gene-disease association studies have been done in the country with limited clinical samples, and these can now acquire a pan-India character with the availability of the IGVdb. The data have also generated an increased interest among medical specialists and their involvement should strengthen such studies to evolve, in the not-too-distant future, a much improved disease-linked genetic profile that gains importance and usefulness from a public health perspective.
