David Reich: Harappan ancestry is the single largest source population for almost all people in India today

Interview with David Reich, Professor at the School of Medicine, Harvard University.

Published : Sep 13, 2020 06:40 IST

Prof. David Reich: “It is quite clear that before two million years ago there were no humans in Eurasia.”

Professor David Reich is a scholar of genetics and is Professor at the School of Medicine at Harvard University. His phenomenal work Who We Are and How We Got Here (2018), the result of long years of research, came into discussion in India through a serialised review in The Hindu, which was detailed and laudatory in tone. The review aroused my interest in the work and I quickly got hold of a copy of the book. Ever since I read it, I have not stopped feeling amazed at the range of facts and ideas Prof. Reich has presented in the book. The work should be of interest to Indian readers and scholars of history and culture because it has serious implications for the history of India as it has been read so far.

Neither the leftist historiography used for representing Indian history nor the revisionist-nationalist historiography has ever privileged the south over the north. Prof. Reich’s observations and the results of his analysis suggest the need for a paradigm shift in the approach. One must add that for the last three decades in India intellectuals have witnessed a raging conflict of views on history, the origin and the place of the Sanskrit language and the people who used it being the main ploy in that noisy debate. One does not know if Prof. Reich is aware of the historiography debate, but he surely is keenly aware of the wilful distortion of history during the fascist regime in Germany. At the very start of his book, he places on record his stand on Hitler’s notion of a ‘superior race’ and the contempt resulting out of that for other races. For a scientist dealing with genetics, I should think, this is really a significant starting point, because the science they practice, if used tendentiously, may lead to genocide. I had requested Prof. Reich for time for a serious conversation. The online conversation took place in June 2020. It is produced here in a version of the transcription approved by Prof. Reich.


In your book, you have covered a phenomenally long span of history and pre-history of many continents, including India and East Asia. The first inhabitants on the subcontinent, in your opinion, came out of Africa, and are in Andaman. From there, they spread out to what is now south India, and from there they moved north. In that movement, their language also migrated with them. And probably, as you mention several times, the link between the Indus Valley Civilisation and those who were in the south is a subject worth exploring. It has not been finally determined if such a link actually exists. I was reading about the undeciphered Indus script; and it is still an unsolved mystery despite many claims to solving it. I would like you to comment on the probability, rather high probability, of pre-Dravidians moving to the Indus Valley, and then being pushed back. How does your science reflect on some of these areas which are still a mystery for Indian historians?

Thank you for the opportunity to have a conversation and for your kind words about my book. I’m really happy it was able to reach you and some people in India. Well, you touched on many topics and your questions range over a wide timescale. I’m going to briefly touch on the earlier part of your question and then focus on the latter part.

The earlier part of your question concerns the first humans in India. It is quite clear that before two million years ago there were no humans in Eurasia. ‘Homo’, which is our genus, arose in the paleo-anthropological record in Africa and spread out of Africa into Eurasia two million years ago, with the first known skeletal remains and archaeological sites in the Caucasus and in Indonesia, and then spread to other places throughout Eurasia, including India.

But the ancestors of modern humans in India only spread there after 50,000 years ago, when there is evidence for a later large-scale spread of anatomically modern people, peoples whose elements look like ours and look fully modern. This migration, once again, is out of Africa, and it displaces the previous set of humans throughout Eurasia, in Europe, East Asia, South Asia, and elsewhere, albeit with a little bit of mixture with the archaic, previously established humans. The modern ones had skeletons like ours, and the modern occupation of Eurasia, including India, begins in earnest after about 50,000 years ago. We don’t know where modern humans arrived first out of Africa and what was the path of their spread. In the Andaman Islands today, we have people who seem to be relatively unmixed descendants of some of the first people in different parts of Eurasia, just like the ones in Australia and New Guinea. It is not clear whether there has been continuous occupation of the Andaman Islands since that time, or even whether there was very early habitation of these islands. Most likely, Andaman ancestors lived in South-East Asia or the South Asian mainland. The mainland is much more likely to be the source of this population than an early migration to the Andaman Islands themselves because the mainland is so much bigger and a richer environment. Where in the mainland is unclear though; it might not have been in the Indian subcontinent itself but rather in parts of Indonesia (for example, Sumatra) or in Myanmar.

But your question revolves around the formation of the present-day population structure of South Asia. South Asia is one of the most diverse places on earth, with many hundreds of languages, reflecting great human diversity, and there is great genetic diversity as well. One of the things that we see when we look at genetic data from diverse people in India is that the great majority of groups in India, but by no means all groups, speak Indo-European and Dravidian languages. These are the two largest language families in India, and the groups that speak them are genetically well described as being arrayed on a gradient of different proportions of ancestry, different proportions of inheritance, from two very different ancestral populations, as different from each other as Europeans and East Asians. We have known about this gradient of different proportions of these two highly divergent ancestries in India for 11 years.

This came out of the work that we did in collaboration with K. Thangaraj in Hyderabad’s Centre for Cellular and Molecular Biology [CCMB]. We spent a lot of time over the past decade, trying to understand the origin of this major gradient of ancestry in South Asia that accounts for almost all of the ancestry in people speaking Indo-European and Dravidian languages, but does not account for unique and special ancestry that is common in people speaking Austro-Asiatic and Burmese languages. So most of the work we have done is focussed on that gradient. I’m happy to tell you what we have learned in the last decade about the origin of that group.

Professor Reich, after those 10,000 samples that you studied in the Hyderabad-CCMB, is there an ongoing collection of samples, and are you aware if they are being analysed again?

This project is being led in Hyderabad. There was a time in the late 2000s when the collection was intense. There was a large-scale programme by the Indian government through the CCMB that involved many students, usually master’s students, who would do collections in diverse villages throughout India, trying to catalogue ethnic diversity inside the Indian subcontinent. Today, if I remember right, there are 18,000 or so samples from something like 500 groups. There is a minimum of 4,500 documented ethnic groups in India. There are some 10 times more unsampled groups than sampled ones, depending on how you count.

And so this is in the end only a fraction, maybe 10 per cent, of the diversity in terms of groups in this incredibly diverse country. But that collection is ongoing. It is not as well funded as it used to be before. So the new collection is not as active, but I know this is an absolutely central project. And it is important for health care for India. Nearly everyone belongs to these groups and each group has its own genetic susceptibilities to different illnesses, just like other groups in the world do, and each one needs to be studied ‘on its own and for its own sake’ in order to characterise those susceptibilities.


Yes, of course, the study of the unspoiled cells makes sense in stem-cell research. It will be important for health, of course. However, some of the indigenous communities are terribly worried about this. That is, they have ethical issues related to such research; but I’m not bringing those questions in, because you are far more aware of the ethical questions in science research. I will not get into that. I want to ask you one question about the possibility of migrations. When you draw conclusions about migrations, do other sciences help in consolidating your conclusions or are they entirely based on the study of the genomic properties and dating? You mentioned earlier archaeology, which is from your perspective more recent history. But because one does not have great clarity about several views of at least two or three very distinct migrations, one does not have great clarity in the field of history and historiography related to India. One is the northern migration to the south bringing with it language, epics, culture, myths, gods and food habits. And it is clear that, that happened. What we have is only mixed myths; and they are no evidence for building any theory about that migration. But in the ancient migrations that you have discussed in your book, the use of ships is to be taken into account, at least of some kind of seafaring, particularly going from south India, right up to the area of the Sumerian Civilisation. Was there in use any such transportation device about 6,000 to 7,000 years before our time? Do you have or do you normally use all that supporting evidence drawn from study of material culture or oceans study or agronomy, for instance? India has a unique dual kind of agricultural practice. One is wheat based and the other is rice based. We have been following two cultivation seasons. Did you make use of all those inputs in drawing your conclusions?

Absolutely, we do. I think there are many areas of inquiry. We have limited information about the deep past, and writing really captures only the last 3,500 years or so; the oral tradition is probably not more than 3,500 or 4,000 years old anywhere in the world, or 5,000 years old in the places where writing was earliest. And so we don’t have any information deeper in time from writing, which is a particularly rich source of information. Maybe we have some information from mythology that gets passed down, which is another type of information we could potentially use. And then there are the languages people speak, since languages are related to each other and we can reconstruct their ancient vocabularies and decode that there are linkages between the languages that are spoken in different parts of the world or in different regions. And when there is a shared language, we know from ethnographic and anthropological studies of present-day people that language usually, not always, spreads through large-scale movements of people, especially women. And so, the fact that people share languages is often a clue that there is some movement of people.

Then finally, there is the archaeological evidence, the material culture remains that people leave behind, the types of buildings they make, the types of tools, the shape of the stones, the styles of making bricks, the types of ornamentations that people made, and the types of foods they ate. You can actually see from the remains of their food what types of food people ate and see where some people ate some kinds of foods and other people did not eat those kinds of foods; this is again evidence of connections between people, which we can directly measure. One thing that is interesting about the past is that it is not the only thing that is interesting. The past may not even be the most important thing, but it is a thing. And what that is, is whether we can get a human skeleton and obtain DNA from it, which has become possible in the past 10 years. We can ask the question whether people from two different ages and sites are closely related to each other or not. And if we obtain human skeletal remains over time, some from 10,000 years ago, some from 7,000 years ago, from 4,000 years ago, and 2,000 years ago, and compare them to people from the present, we can see whether it is the case that people who live in that region today have descended from the same ancestors who lived there 10,000 years ago, or whether there has been additional movement and migration and contribution from people who lived outside that geographical area.

What has become possible in the last decade, not just in India but in very many places in the world, is to get DNA out of ancient archaeological sites (where archaeologists have thoroughly studied, characterised and understood the associated culture) and see how they are related to later archaeological sites in the same region or to people today. What you can see in many cases, and what we are learning from the genetic data, is that it is almost never the case that in any place today, people are directly descended without mixture, without external input, from the same people who lived there 10,000 years ago. We know that very well in Europe where we have a huge amount of data.

The people in Britain today, for example, inherit almost no ancestry from the people who lived in Britain 10,000 years ago. We have DNA now from people who lived in Britain right after the end of the last ice age, 10,000 years ago, and from the genes of these people we can predict the colour of their eyes and the darkness of their skin. They probably had blue eyes and very dark skin, almost as dark as people from sub-Saharan Africa. But about 6,000 years ago in Britain, there was a large-scale movement of people bringing farming from the continent. And there was a 99 per cent replacement. The British population became one of farmers; and then again, a little bit after 4,500 years ago, there was another 90 per cent replacement. So the people of Britain cannot claim that most of their ancestry comes from people who lived in Britain 10,000 years ago. To the contrary, there has been a tremendous amount of churn.

It is the same with India. The people who live in any one place today inherit some unbroken ancestry, sometimes probably more than in the case of Britain, but most of the ancestry of any person does not come from people who lived within the same 500-km or even 1,000-km radius of where they now live. Today, we are all mixed within the chain of human heritage and we have received input from many different places. And what we gained during the last decade of research in India is that now we know a lot about the origins of the gradient of ancestry. I told you about two groups. They are as different from each other as Europeans and East Asians. We know how that population collision and mixture occurred during or after the end of the Harappan Civilisation, a little bit after 4,000 years ago. The mixture began at that point.

Jati and varna

This brings me to the question of jati and varna, which you have briefly discussed in your book when you give this extraordinary example of the Vyasa community in Telangana. Your thesis is that, probably, this community has managed to keep itself untouched by external genetic contact. India has now something like 4,000 or more communities, and each one thinks of itself as a jati. Some describe themselves as a ‘tribe’. Historically, the varna system has caused a huge amount of injustice to women and those who were seen as ‘lower’ in the jati and varna hierarchy. Now, of course, science need not worry about this history, because it is what people did. But the conclusion about non-mixing of the genetic traits within a community, is that overconfidence of science? I may be asking this question out of sheer ignorance, but for us in India, this is a very important issue. In fact, the ‘dream of India’, the idea of modern India, is to get out of the cage of caste and varna and think of all of us as equals. Caste and varna are, even today, unfortunately a part of the mindset of most Indians. So, while your entire book convinces me that all of us are basically mixed humans, suddenly getting evidence that some Indian communities have preserved themselves in the purest form, though such communities (population islands) may be small in numbers, makes me uneasy. I am not questioning your method or conclusion. I should add that in your book you so vehemently and clearly reject the idea of racism and stereotyping. You refer to how Hitler actually misused science to create a fraudulent political discourse. My question is: ‘What more needs to be done after your book in order to convey to Indians that these jati islands are something other than jati islands?’ The jati is a concept, and what genetics is describing is the genetic features of a group, and not the thinking surrounding that group’s identity.
In other words, I am asking you how to use your research without its conclusions being misunderstood as a justification of jati, though entirely unintended by you? My question implies that science may be an articulation of some previously unknown truth. And this objective need not be explicitly stated. But, in strange times such as the present time, ideas bundled together as ‘ideology’ can easily prove to be the enemy of truth. Therefore, I would like you to respond.

What a wonderful question. I see science as an attempt to get at truth, but how the truth is used or misused is terribly important. What genetic analysis of Indian population history has shown, and what we see in South Asia, is an alternation between periods of mixing of groups and periods of genetic isolation, as for example in the group that you mentioned. And this has happened over a long time, within historical memory, with many people and many groups. Often what people remember is the periods of isolation, and they think it has been always the case. However, every few hundred years, or maybe every thousand or two thousand years, groups form and mingle with other groups in a very, very dramatic way. And then when that gets forgotten, people regain a false sense that migration and mixture are not important. So the group that you mentioned is like every other group in India: it is the product of a mixture between two or three very different populations that mixed across many lines of ancestry and tradition. Then that mixed population developed its own kind of endogamy with limited influx until the next big mixing event occurred. For that group, what we found is that there could not have been more than about 1 per cent input of new people every generation for the last couple of thousand years, which is a very extreme isolation. However, that does not mean that there might not have been one episode or two episodes of major mixture 1,000-2,000 years ago.

In other groups in India, it is well known that there is a lot of mixing across jati lines. Often, it is the case that people have alliances with people often from ‘lower’ caste, lower social status jati groups that get incorporated into their group. In such an instance, you would not see the same degree of impact of the ancestors of the original group.

For me, as I describe in my book, I have an analogy from my personal background very much in mind. I am from a Jewish background, which is a little bit like a jati or a caste group, but in Europe where you might not think of castes. However, historically, my people had a ‘caste’ function with an economic role in society, accompanied by social segregation; in my group, there has been a tremendous degree of isolation for thousands of years. And so I have a visceral understanding of how this can occur. Of course, my group of people, in particular, has had massive inputs from other groups. For example, the Ashkenazi Jews, my community, had about half its ancestry come from mixing with Europeans, which has occurred in the last 1,500 years. That is a profound part of the ancestry of my group. What all this goes to show is that the idea that any one group is pure is genetically wrong. We are now learning a lot more about patterns in human ancestry; and the pattern that we are learning is that the myths and the stories that we tell ourselves about our history and about where our ancestors lived do not work. Almost every time we are able to test those myths, they are falsified with data. So even if we are not able to get DNA from a particular group, the best bet is that the idea of unmixed and pure lineage is wrong in some profound way. So many stories about the past have been falsified by genetic data. The science does not really have a moral aspect; it is just what it is. It is an uncontrollable force. In this case, its effect has been to explode mythology, to explode prejudicial understanding and explode the narratives of isolation on a very large timescale.

Visual culture

I now turn to another area. And I need some advice from you in that. I have been working on languages in India, with a large group of almost 3,000 volunteers. I documented more than 700 languages. I have also worked with the indigenous people, the Adivasis (‘tribal’ is a bad word). When I was looking at the languages and what is called the folk culture of communities in Rajasthan and Kutch adjacent to Gujarat (the area where the Indus Valley Civilisation sprang up), we noticed in the visual culture and the languages of those people many signs that are not inherited from the Indo-Aryan but from something else, known as Prakrit, and not Sanskrit. So far, the proto-Prakrit has not been reconstructed, and, therefore, its past remains still hazy. But many times persons in that area, many starkly illiterate persons, tell me, when I show them some Harappan figures and designs, that they find them familiar. I would like to assume as a working hypothesis that there may be some links between people who now exist in that region and people who existed in that region 4,500 years before our time. Do you think a composite research, studying visual culture together with very extensive genetic sampling and examination, might yield some better results? You have already given us this wonderful book based on extensive research. But is there some further scope to study the Harappan Civilisation, its disappearance, and its unnoticed continuities? It has disappeared from the perspective of history. But surely biological continuity is bound to be there because people did not evaporate into the skies. They must have left some descendants and some bones and the people of generations that succeeded. Would you think this kind of pursuit related to that specific area, together with archaeologists, experts of scripts and of ancient mythology, might actually yield more than what we know? The trouble with the script is that there are not enough samples available to draw a final conclusion.
So many partial explanations are there, but nothing explains all the mysteries related to the Harappan script. And we have no known sample of their speech. So, genetics, in collaboration with other sciences, could that be a composite project to unravel the mystery that is necessary to be unravelled for the sake of Indian history and Indian society. For, if it is not unravelled, then the Sanskritist, fascist, hegemonist powers will continue to infest the Indian thinking about history.

Absolutely. I think that in India, as elsewhere in the world, there is precious little information about the time before the emergence of writing and even from the time after writing, as writing is only done by some people, and so many people did not have their stories handed down. And so in order to get at that past, to really understand the nature of how culture forms over time, we need to use every type of information we have, be that mythology, be that musicology, studies of languages and similarities of words and shared words that are unique to those regions, be that phonetics, be those studies of the ancient building tools, crops and foods that people ate and left behind. By putting together the various lines of evidence, we can hope to begin obtaining some meaningful information. Since the book was published, we have made a lot of further progress in understanding the history of South Asian populations. We now have a clearer sense of the ancestry of at least some people who were part of the Indus Valley Civilisation. We have DNA from just one individual from the Harappan site called Rakhigarhi in Haryana, but we also have DNA from over a dozen other individuals living in South Asia itself, who we think are almost certainly immigrants from the Harappan Civilisation and living in the area with Turkmenistan to the north and Iran to the west. The places where they lived were trading hubs. They were trading; the sites were full of Harappan archaeological material that they were trading with the other civilisations to the west and to the north. And these people were outliers in their communities. Genetically, they were different from the other primary groups, and were genetically more Iranian; and they have a mixture related to present-day south Indians.
So, I think there is a good chance that these people, who are genetically similar to what we see in the Rakhigarhi individual, represent the gradient and the range of ancestry that was present in at least some part of the Harappan world. It was a gradient, but it was a different gradient than it is today.

What we now know is that the formation of present-day South Asian population genetically arose after the decline of the Harappan Civilisation. If we can use genetic data to date when the mixture occurred, we can show that the mixture included people of this Harappan ancestry type. They mixed with people, more to the south, more similar to the Andamanese and population in South-East Asia, and to some of the ancestry in Austro-Asiatic speakers, like Khasi. And, on the other hand, they also mixed with people to the north, who have ancestry related to the south. The mixture of these people associated with this gradient of ancestry that we have now documented through ancient DNA, both people to the north, that forms one of the ancestral populations of India, and people from the south, that forms the later ancestral populations. But most of the ancestry of both of the groups comes from the Harappan gradient. So the Harappan-related ancestry is actually the single largest source population for almost all people in India today. It mixed with other groups, probably in peninsular India or the south-east as well as groups from the north. These contributed important components, but the single largest component both of northern ancestral groups and other ancestral groups is the Harappan ancestral type. Now, we do not currently know where that ancestral group was distributed. It may not have just been within the range of the Harappan Civilisation. There may have been people genetically like it spread further out; and maybe we are seeing in some of those people not just people who participated in that culture. 
But a very reasonable prospect, because it was so widely distributed, is that disruptions associated with dramatic cultural changes that are documented in archaeological record, including the disruption of Harappan Civilisation, were associated with movements of people; and a mixture both of the south and the north, and then further movements associated with the admixture of these groups forming the gradient of today fully by about 2,000 years ago.

Well, I shall eagerly wait for the publication of this work. Thank you.

G.N. Devy is Chair, People’s Linguistic Survey of India
