A review article based on recent genetic studies

Background: The Indian Subcontinent has been connected with the Middle East for ages. Since the time of the Indus Valley & Mesopotamian civilizations (2500 BCE), there have been cultural & economic exchanges between the two regions. Down the centuries, South Asia hosted some of the earliest Judeo-Christian missionaries. Even before the arrival of Islam in Arabia in the 7th century, Arabs were engaged with the southern part of South Asia as maritime traders. It was here that the first contact of Islam with the Indian subcontinent took place in the 7th century, via Arab traders who had been trading in the Indian Ocean for centuries. To secure coastal trading routes close to Sindh, the Umayyad armies made two futile attempts against Raja Dahir’s armies. In 712 CE, a young Arab general, Muhammad Bin Qasim, defeated Raja Dahir, & the Umayyad dominion expanded up to Sindh & Multan. This was the first & last massive Arab expedition in South Asia; in later centuries, it was Turks, Afghans, Mughals & other Central Asians who invaded & expanded their dominion there. During all these centuries, limited migration of Arabs into the Indian subcontinent continued, as scholars, Sufis, mercenaries & traders. After the Arab invasion, a massive conversion took place in Sindh, where the majority of converts had followed Buddhism and had been socially segregated during the rule of Raja Dahir. The Arab settlers married locals &, down the centuries, became Indianized. Their number was so small that it does not seem to have made any major change in the genetic profile of the region. Yet in the South Asian context, innumerable communities believe in their Arab or Semitic origin.
These claims of Arab origin can be seen by revisiting hundred-year-old archival documents of colonial India. They seem to reflect a phenomenon of disconnection from indigenous roots that evolved down the centuries. With changing regimes of Muslim domination in South Asia, disconnection from indigenous backgrounds became a trend among local converts.

Interaction between Harappa people and Aryans way back in time left indelible marks and scars that defined the subcontinent for all times to come. People here in their march have changed faith several times. But has that changed their history and obliterated the past? Sadly this is what the elite in Pakistan has been trying to do. It has set for itself a task fraught with problems which are insoluble. Can people be securely anchored by sacrificing their natural affinity at the altar of the imagined one? If they do, they will be adrift as they are in Pakistan today. The drift is best expressed by certain tribes or castes whose claim to ‘nobility’ rests on their real or imagined non-South Asian origins.

Mushtaq Soofi, 2020, Dawn News

Pashtuns: When Pashtuns (Afghans) came to power, an attempt was made to disconnect from local roots & an assertion of Semitic origin was made. With the advent of modern genetic research in the 21st century, we now have evidence-based samples from diverse Pashtun populations. Like populations of other regions & clines, they are an admixture of different ancient populations. What sets them apart is the low contribution of the AASI (Ancient Ancestral South Indian) component, i.e., Paleolithic Indian, which peaks among tribal, Dalit & most South Indian communities. For example, in G25 samples AASI averages 8-10 percent among North Pashtuns & 11.5-16 percent among South Pashtuns, while among Punjabi Chuhras it goes up to 42 percent. Another major difference is the higher proportion of Steppe ancestry among Pashtuns.

The human Y-chromosome accumulates roughly two mutations per generation. Y-DNA haplogroups represent major branches of the Y-chromosome phylogenetic tree that share hundreds or even thousands of mutations unique to each haplogroup. The haplotype is used to identify the haplogroup of an individual. Thus, the haplogroup represents a group of people who have inherited common genetic characteristics from the same most recent common ancestor (MRCA) going back several thousand years. All humans belong to haplogroups which are designated according to their Y-DNA and MT-DNA.

Mahal, D. G., & Matsoukas, I. G. (2018). The geographic origins of ethnic groups in the Indian subcontinent: exploring ancient footprints with Y-DNA haplogroups. Frontiers in Genetics, 9, 4.
Admixture analysis of different samples from G25

A research study titled “A Genealogical Study of the Origin of Pashtuns” (2013) examined the Y haplogroups of 718 Pashtun subjects and found R1a in 61.6%, L in 31.6% & J2 in 6.7%. It is interesting to note that none of them tested positive for J1c3d (a subclade of J1), which is presumed to be the Abrahamic Y haplogroup. The descendants of Prophet Ishaaq & Prophet Ismail (Peace be Upon them) have tested positive for this Y haplogroup.
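To make the reported shares easier to read as head counts, here is a minimal Python sketch. It uses only the figures quoted above (718 subjects; R1a, L & J2 percentages); the function name `approximate_counts` is a hypothetical helper, and the counts are rounded estimates rather than figures taken from the study itself.

```python
# Reported Y-haplogroup shares among 718 Pashtun subjects (figures quoted in the text).
SAMPLE_SIZE = 718
reported_percent = {"R1a": 61.6, "L": 31.6, "J2": 6.7}


def approximate_counts(percentages, n):
    """Convert percentage shares into rounded head counts for a sample of size n."""
    return {hg: round(n * pct / 100) for hg, pct in percentages.items()}


# Rounded estimates only; the study's exact per-haplogroup counts are not quoted here.
counts = approximate_counts(reported_percent, SAMPLE_SIZE)

# The three reported shares add up to just under 100 percent because of rounding.
total_percent = round(sum(reported_percent.values()), 1)

for haplogroup, count in counts.items():
    print(f"{haplogroup}: about {count} of {SAMPLE_SIZE} subjects")
print(f"Shares reported in total: {total_percent}%")
```

Read this way, the result is that roughly 442 subjects carried R1a, about 227 carried L & about 48 carried J2, with no J1 share reported at all.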

Research from Cornell University (2011) explaining the concept of Y haplogroups & their diversity

Awans: Next we discuss another large Punjabi tribe, the Awans. The tribe is spread across the north, west & central parts of Punjab, with significant distribution across KPK. Down the centuries, it has been a thriving community & its representation can be seen in all walks of life. Its legendary origin goes back to Hazrat Ali Bin Abu Talib & his second wife, Hanafiya. But genetic research gives a completely different picture. In autosomal admixture, Awans look the same as other North Western communities of South Asia: a blend of AASI (Ancient Ancestral South Indian), Eastern Iranian & Steppe components. Their admixture is quite close to that of other Punjabi communities.

GEDmatch admixture proportions of some Punjabi & North West Indian tribes. Excel source: Mr. Mansour Ali Chaudhry. The label S Indian stands for Paleolithic Indian or Indian hunter-gatherers; Baloch stands for Eastern Iranian (a primitive component that mixed with Paleolithic Indian to form our first civilization); Caucasian and NE European are the components that were brought by the Steppe (Aryan) migration.
The genetic distance of a few Arain samples from other communities. The Awan sample can be seen at number four.

A 2018 research study on the mitochondrial DNA of seven Punjabi communities (Rajput, Awan, Arain, Jaat, Mughals, & Gujjar) found that 38% of the mtDNA haplogroups were M (Paleolithic Indian) & 24% were U (a Eurasian haplogroup that came with the Steppe migration). The phylogenetic analysis in the same study showed close proximity of Awans with Gujjars, and of Rajputs with Jaats, while Arains formed basal branches with higher diversity than the rest of the groups.

Arains: The third large group of Punjab that claims Arab origin is the Arains. They are an agrarian tribe that has been settled in Punjab for centuries, with roots in Sindh. Mukhtar Ahmad, in his book The Arains: A Historical Perspective, holds the modern genetic view that, like all other tribes & castes, this community formed as a blend of diverse genetic populations. According to Ibbetson, the tribe migrated on a large scale from Sindh to Punjab during the 11th century in search of agrarian fortunes. As the historic epicenter of the tribe was Sindh, it was easy to connect their distant past with the Arab armies of Muhammad Bin Qasim. However, genetic studies show their autosomal admixture to be a blend of three major components, as in other North Western communities of South Asia: AASI (Ancient Ancestral South Indian), Eastern Iranian & Steppe. Their AASI (Paleolithic Indian) component is on the lower side among Punjabi communities, & their Eastern Iranian component is a little higher. Regarding Y haplogroups, 52 samples of the Arain community from 23andMe showed R1a1 at 44.6%, L at 19%, J2 at 17.9%, Q at 8.9%, G at 5.4% & H at 3.6%.

Sayyads: Now we review the single largest subgroup of South Asia that claims Arab origin, the Sayyads. Sayyads are believed to be direct descendants of Prophet Muhammad (PBUH). Over the centuries, Sayyads migrated to South Asia from the Middle East & Central Asia. Many old Sayyad settlements existed in different parts of erstwhile Awadh, Bihar & other parts of South Asia, along with various subdivisions such as Hashemi, Hasani, Husaini, Abedi, Zaidi, Jafri, Naqvi & Kazmi. By the early 20th century, the number of Sayyads in the official census seems to have been highly inflated: the 1901 census of British India put their number at 1,339,734. Seen through the lens of modern genetic tools & research studies, the Y haplogroups of South Asian Sayyad samples tested at 23andMe show a diverse range (H1, R1a, R1b, R2, J1 & J2). A Prophetic paternal lineage is therefore possible only for those who have tested positive for J1. Haplogroup J1 represents Arab roots, but not all J1 carriers are descendants of Prophet Muhammad (PBUH); only one subclade of J1 (J1c3d) connects with the Prophetic chain. A 2010 study of 56 South Asian Sayyads conducted in the UK found only 14 carriers of haplogroup J among the 56 subjects, & that figure also includes J2 and other subclades of J1. This study received a critical review from an American geneticist, Razib Khan, on one of his blogs.

And yet this paper was published in 2010. We now know through various tests of confirmed descendants of Muhammad, and who descend in the male line from his cousin Ali, that they carry a branch of haplogroup J1.
Even among the Syeds, most do not descend from Muhammad assuredly. There are nearly as many scions of Lord Indra, R1a1, as those who bear haplogroup J. Of the J’s within the Syed community, I think the most likely scenario if they are not South Asia is that they are Iranian. J is found at frequencies of 35% in Iran, and Iranians, along with Turks, were the most common migrants into South Asia.

Razib Khan, https://www.brownpundits.com/2019/03/05/the-syeds-of-south-asia-are-the-sons-of-hindus-and-magians/
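The arithmetic behind the 2010 study's figures can be laid out in a short Python sketch. It relies only on the two numbers quoted in this article (56 subjects, 14 carriers of haplogroup J); the helper `share` and the variable names are hypothetical, and the J1c3d figure is an upper bound implied by those numbers, not a result reported by the study.

```python
# Figures quoted in this article for the 2010 study of self-identified Sayyads.
TOTAL_SUBJECTS = 56
HAPLOGROUP_J_CARRIERS = 14  # includes J2 and all subclades of J1


def share(count, total):
    """Return count as a percentage of total."""
    return 100 * count / total


j_share = share(HAPLOGROUP_J_CARRIERS, TOTAL_SUBJECTS)
non_j_subjects = TOTAL_SUBJECTS - HAPLOGROUP_J_CARRIERS

# J1c3d carriers are a subset of the J carriers, so their share cannot exceed j_share.
max_j1c3d_share = j_share

print(f"Haplogroup J: {j_share:.1f}% of subjects")
print(f"Subjects outside haplogroup J: {non_j_subjects}")
print(f"J1c3d can account for at most {max_j1c3d_share:.1f}% of subjects")
```

In other words, at least three quarters of the sampled Sayyads fall outside haplogroup J altogether, and the share that could belong to the Prophetic subclade is smaller still.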

Many individual samples shared through commercial companies and genetics blogs, such as GEDmatch, Anthrogenica & others, showed similar trends.
A 2022 research study from Khyber Pakhtunkhwa on 678 individuals from five ethnic groups (Gujars, Jadoons, Syeds, Tanolis, and Yousafzais) found that the majority of the Syeds in the sample carried the R1a Y haplogroup. The research also concluded that Gujars, Syeds & Yousafzais had an affinity with neighboring Central Asian & Pakistani populations. One can’t deny the presence of Syed ancestry in South Asia, but the numbers are highly exaggerated; this is what the genetic research shows. Down the centuries, local converts and many Iranian & Central Asian immigrants blended into this cline. Some open-minded individuals whose families claimed Najibul Tarafain on the basis of ancestral family trees have themselves busted these claims by sharing their results.

As stated earlier, Sayyads did immigrate to the Subcontinent, but at present this affiliation is hyperinflated. Genetic testing has nonetheless identified families that carry subclades of the Prophetic line.
In the Quraysh & Bani Hashim genealogical project at Family Tree DNA, there are samples that were tested & found to carry subclades of the Prophetic line. They include a Rizvi Sayyad sample with roots in Karari (Allahabad), a Zaidi Sayyad sample from Pakistan with roots in the historic 18th-century Barha Sayyads & a few other samples with roots in Kashmir. In haplogroup detection, even a single sample carries weight, as the entire male line of the subject’s paternal kin shares the same Y haplogroup, out to the farthest cousins. According to genetic scientists, even a single subject’s DNA sample is presumed to be strong evidence, as it represents hundreds of his ancestors. Another point needs elaboration here: even where the Y haplogroup of a small proportion of Sayyads matches the Abrahamic origin, their autosomal admixture is closer to that of South Asian communities, with a few differences.

For example, Urdu-speaking Sayyads of Karachi (Urdu speakers with roots in erstwhile Awadh, the United Provinces & Bihar) show a Mediterranean component rather than a North East European component in their admixture. The Mediterranean component was brought by Middle Eastern immigrants in the post-Islamic age, while the North East European component was brought by the Steppe migration, i.e., almost 5,000 years ago.
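The earlier point, that one tested man stands in for his whole paternal line, can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a method from any of the cited studies: the pedigree in `FATHER_OF`, the names in it and both helper functions are hypothetical, and the sketch ignores the new mutations that define downstream subclades.

```python
# Hypothetical patrilineal pedigree: each man maps to his father (None = no recorded father).
FATHER_OF = {
    "founder": None,
    "son_a": "founder",
    "son_b": "founder",
    "grandson_a1": "son_a",
    "grandson_b1": "son_b",
    "unrelated_man": None,
}


def patrilineal_founder(person, father_of):
    """Walk up the father-to-son line until reaching a man with no recorded father."""
    while father_of[person] is not None:
        person = father_of[person]
    return person


def infer_haplogroups(tested, father_of):
    """Assign each tested man's haplogroup to everyone sharing his patrilineal founder."""
    founder_haplogroup = {
        patrilineal_founder(person, father_of): haplogroup
        for person, haplogroup in tested.items()
    }
    return {
        person: founder_haplogroup.get(patrilineal_founder(person, father_of))
        for person in father_of
    }


# A single tested man (an assumed J1 result) is enough to label his whole male line.
inferred = infer_haplogroups({"grandson_a1": "J1"}, FATHER_OF)
for person, haplogroup in inferred.items():
    print(person, haplogroup)
```

Testing only `grandson_a1` labels his father, grandfather, uncle and cousin as well, while `unrelated_man` stays unassigned because he shares no paternal ancestor with the tested subject. The same logic says nothing about autosomal admixture, which is inherited from all ancestral lines.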

A simple genetic map representing admixture proportions of our South Asian ancestral populations. Reference: https://araingang.medium.com/pakistan-genetic-map-81db19cd0465

In quite rare instances, the Y haplogroup of a Sayyad sample was found to be haplogroup E. This confirms an ancestral connection with the Middle East, as E is one of the common haplogroups in Egypt & among other North African tribes, with substantial numbers in Yemen & other parts of the Arab world. It seems that over time their ancestors merged into the cline of Sayyads by renaming or through matrimonial ties.

Sheikhs: This category makes up a major chunk of Ashraf Muslims. As a broad title, it has also been used by many converts from higher-caste Hindus. Within it come categories like Siddiqui, Qureshi, Ansari, Faruqi, and Bani Israeli.

“Some of the Sheikhs belonged to Old families of repute, but majority of them are descendants of Hindu Converts. While the fact, they claim connection with the recognized tribes of Sheikhs probably indicates that they assumed the name & the race of the Qazi/Mufti at whose hands, they were admitted into Islam”

District Gazette, Bareilly, United Provinces (1911), HR Nevill
Autosomal Admixture of Sheikh Siddiqui Sample from Lucknow
Source: GEDmatch

In light of recent genetic studies, only a handful of historic families among all these clines would have ancestry from the Arab world. A few Siddiqui samples from Pakistan showed diverse Y haplogroups & none of them fell in any subclade of J1. Even the research of traditional genealogists, who relied on family trees & archival documents, shows that only a few historic families among Ansari, Siddiqui & Faruqi Sheikhs seem to have Arab roots.

PS: The purpose of this piece is simply to offer a new perspective on genetic studies and ancestries. We do not deny that several people of Arab descent settled in the Indian Subcontinent over the centuries. Our sole purpose, however, is to highlight new research and findings concerning several communities that claim descent from Arabs.
Among the most recent examples of Arab admixture in South Asia is the small Chaush community of Hyderabad, which has roots among the Hadhrami Arabs of Yemen. Similarly, Daudi Bohras have a trace of Sub-Saharan ancestry in their admixture. As we have seen, the belief of the major chunk of South Asian Ashrafs in their Arab origin is grounded in legend. In the 21st century, when the complete human genome has been mapped, it is easy to trace the genetic past of a population. The new technology of studying DNA from ancient human remains has revolutionized the study of genetic history. This is how human populations formed all across the globe: the first human, or biological Adam, set foot outside Africa & human populations then evolved as an outcome of admixture. For example, the Eastern Iranian component found in South Asians is also present at up to thirty percent among modern Iranians & up to 10 percent among Middle Eastern populations. Another component, Anatolian farmer, which forms the base of European populations & is also present in Middle Eastern populations, can be seen in different proportions among South Asians, as it was brought by Steppe (Aryan) immigrants.
In the coming days, as more individuals take these tests, a more holistic perspective will unfold. Muslims in South Asia have been divided into Ashrafs & Ajlafs for centuries. As modern genetic research suggests, this seems to be another version of the Varna system that may have been imbibed when the top strata of the South Asian social fabric came into the fold of Islam in the distant past.

Mixing is in Human Nature, and no one population is or could be pure.

Professor David Reich, Department of Genetics, Harvard Medical School
David Reich: The truth about us, and where we come from


  1. Mushtaq Soofi, 2020, retrieved from:
  2. Mahal, D. G., & Matsoukas, I. G. (2018). The geographic origins of ethnic groups in the Indian subcontinent: exploring ancient footprints with Y-DNA haplogroups. Frontiers in Genetics, 9, 4. https://www.frontiersin.org/articles/10.3389/fgene.2018.00004/full
  3. Khan, H. U., & Ahmed, N. (2013, July). A genealogical study of the origin of Pashtuns. In International conference on intelligent computing (pp. 402-410). Springer, Berlin, Heidelberg. https://link.springer.com/chapter/10.1007/978-3-642-39479-9_48
  5. Belle, E., Shah, S., Parfitt, T., & Thomas, M. G. (2010). Y chromosomes of self-identified Syeds from the Indian subcontinent show evidence of elevated Arab ancestry but not of a recent common patrilineal origin. Archaeological and Anthropological Sciences, 2(3), 217-224.
  6. Razib Khan, 2019, The Syeds of South Asia are sons of Hindus & Magians. Retrieved from: https://www.brownpundits.com/2019/03/05/the-syeds-of-south-asia-are-the-sons-of-hindus-and-magians/
  7.  The Quraysh & Bani Hashim Genealogical Project, Family Tree DNA, https://fgc8712.com/2021/12/08/%d8%aa%d8%ad%d8%af%d9%8a%d8%ab-%d9%86%d8%aa%d9%8a%d8%ac%d8%a9-%d8%a7%d9%84%d8%b3%d8%a7%d8%af%d8%a9-%d8%a2%d9%84-%d9%81%d8%ae%d8%b1%d9%8a-%d8%a7%d9%84%d9%83%d8%b4%d9%85%d9%8a%d8%b1%d9%8a/
  8. Tariq, M., Ahmad, H., Hemphill, B. E., Farooq, U., & Schurr, T. G. (2022). Contrasting maternal and paternal genetic histories among five ethnic groups from Khyber Pakhtunkhwa, Pakistan. Scientific Reports, 12(1), 1-18. https://www.nature.com/articles/s41598-022-05076-3
  9. Pakistan Genetic Map, 2019, Retrieved from:  https://araingang.medium.com/pakistan-genetic-map-81db19cd0465
  10. Bareilly, A Gazetteer, Nevill, HR, 1911, https://archive.org/details/in.ernet.dli.2015.4798
  11. David Reich: The truth about us, and where we come from, 2020, https://youtu.be/3-vHByC14bc

Rehan Asad is a medical doctor & currently working as an anatomy faculty. He has a penchant for writing people, food & culture stories.

