Depending on the age of the language family under consideration, its homeland may be known with near-certainty (in the case of historical or near-historical migrations) or it may be very uncertain (in the case of deep prehistory). In addition to internal linguistic evidence, the reconstruction of a prehistoric homeland draws on a variety of disciplines, including archaeology and archaeogenetics.
There are several methods to determine the homeland of a given language family. One method is based on the vocabulary that can be reconstructed for the proto-language. This vocabulary, especially terms for flora and fauna, can provide clues to the geographical and ecological environment in which the proto-language was spoken. An estimate of the time depth of the proto-language is necessary in order to account for prehistoric changes in climate and in the distribution of flora and fauna.
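The vocabulary-based method can be illustrated with a minimal sketch. It assumes that each candidate region is described by the set of plants and animals present there at the estimated time depth of the proto-language, and it simply counts how many reconstructed terms have a referent in each region. The function name `score_regions` and all data below are hypothetical placeholders, not reconstructions from any real family.

```python
from typing import Dict, Set


def score_regions(
    reconstructed_terms: Set[str],
    region_biota: Dict[str, Set[str]],
) -> Dict[str, int]:
    """For each candidate region, count the reconstructed terms
    whose referent is present in that region's flora and fauna."""
    return {
        region: len(reconstructed_terms & biota)
        for region, biota in region_biota.items()
    }


# Hypothetical toy data: referents of reconstructed flora/fauna terms.
terms = {"tree_x", "fish_y", "insect_z"}

# Hypothetical biota per candidate region, as it is assumed to have
# been at the estimated time depth (not the present-day distribution).
regions = {
    "region_A": {"tree_x", "fish_y", "insect_z", "tree_w"},
    "region_B": {"insect_z", "tree_v"},
}

scores = score_regions(terms, regions)
best = max(scores, key=scores.get)
print(scores, best)
```

A raw count like this treats every term as equally informative. In practice a term for a narrowly distributed species would constrain the homeland far more than a term for a widespread one, so any real application would need some form of weighting.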
Another method is based on the linguistic migration theory (first proposed by Edward Sapir), which states that the most likely candidate for the last homeland of a language family can be located in the area of its highest linguistic diversity. This presupposes an established view about the internal subgrouping of the language family. Different assumptions about higher-order subgrouping can thus lead to very divergent proposals for a linguistic homeland (e.g. Isidore Dyen's proposal for New Guinea as the center of dispersal of the Austronesian languages). The linguistic migration theory has its limits because it only works when linguistic diversity evolves continuously without major disruptions. Its results can be distorted, for example, when this diversity is wiped out by more recent migrations.
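The diversity criterion can be sketched in a few lines. The example below assumes that diversity is measured simply as the number of distinct top-level branches represented in a region, which is one possible reading of the theory rather than an established metric. The function names and the toy languages, branches and regions are hypothetical.

```python
from collections import defaultdict
from typing import Dict


def branches_per_region(
    language_branch: Dict[str, str],
    language_region: Dict[str, str],
) -> Dict[str, int]:
    """Count the distinct top-level branches attested in each region."""
    seen = defaultdict(set)
    for language, branch in language_branch.items():
        seen[language_region[language]].add(branch)
    return {region: len(branches) for region, branches in seen.items()}


def center_of_diversity(
    language_branch: Dict[str, str],
    language_region: Dict[str, str],
) -> str:
    """Return the region with the most top-level branches."""
    counts = branches_per_region(language_branch, language_region)
    return max(counts, key=counts.get)


# Hypothetical toy data: six languages in two regions.
region_of = {
    "l1": "region_X", "l2": "region_X", "l3": "region_X",
    "l4": "region_Y", "l5": "region_Y", "l6": "region_Y",
}

# Subgrouping 1: the languages of region_X each form their own branch.
subgrouping_1 = {"l1": "B1", "l2": "B2", "l3": "B3",
                 "l4": "B3", "l5": "B3", "l6": "B3"}

# Subgrouping 2: the languages of region_Y each form their own branch.
subgrouping_2 = {"l1": "B1", "l2": "B1", "l3": "B1",
                 "l4": "B2", "l5": "B3", "l6": "B4"}

print(center_of_diversity(subgrouping_1, region_of))
print(center_of_diversity(subgrouping_2, region_of))
```

The two subgroupings use the same languages and the same locations but yield different homelands, which is the dependence on higher-order subgrouping described above.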
Limitations of the concept
The concept of a (single, identifiable) "homeland" of a given language family implies a purely genealogical view of the development of languages. This assumption is often reasonable and useful, but it is by no means a logical necessity, as languages are well known to be susceptible to areal change such as substrate or superstrate influence.
Over a sufficient period of time, in the absence of evidence of intermediary steps in the process, it may be impossible to observe linkages between languages that have a shared Urheimat: given enough time, natural language change will obliterate any meaningful linguistic evidence of a common genetic source.
This general concern is a manifestation of the larger issue of "time depth" in historical linguistics.
For example, the languages of the New World are believed to be descended from a relatively "rapid" peopling of the Americas (relative to the duration of the Upper Paleolithic) within a few millennia (roughly between 20,000 and 15,000 years ago), but their genetic relationship has become completely obscured over the more than ten millennia which have passed between their separation and their first written record in the early modern period. Similarly, the Australian Aboriginal languages are divided into some 28 families and isolates for which no genetic relationship can be shown.
The Urheimaten reconstructed using the methods of comparative linguistics typically estimate separation times dating to the Neolithic or later. It is undisputed that fully developed languages were present throughout the Upper Paleolithic, and possibly into the deep Middle Paleolithic (see origin of language, behavioral modernity). These languages would have spread with the early human migrations of the first "peopling of the world", but they are no longer amenable to linguistic reconstruction. The Last Glacial Maximum (LGM) imposed linguistic separation lasting several millennia on many Upper Paleolithic populations in Eurasia, as they were forced to retreat into "refugia" before the advancing ice sheets. After the end of the LGM, Mesolithic populations of the Holocene again became more mobile, and most of the prehistoric spread of the world's major linguistic families seems to reflect the expansion of population cores during the Mesolithic followed by the Neolithic Revolution.
The Nostratic theory is the best-known attempt to extend the deep prehistory of the main language families of Eurasia (excepting Sino-Tibetan and the languages of Southeast Asia) to the beginning of the Holocene. First proposed in the early 20th century, the Nostratic theory still receives serious consideration, but it is by no means generally accepted. The more recent and more speculative "Borean" hypothesis attempts to combine Nostratic with Dené-Caucasian and Austric in a "mega-phylum" that would unite most languages of Eurasia, with a time depth going back to the Last Glacial Maximum.
The argument surrounding the "Proto-Human language", finally, is almost completely detached from linguistic reconstruction and instead concerns questions of phonology and the origin of speech. Time depths involved in the deep prehistory of all the world's extant languages are of the order of at least 100,000 years.
Language contact and creolization
The concept of an Urheimat only applies to populations speaking a proto-language defined by the tree model. This is not always the case.
For example, in places where language families meet, the relationship between a group that speaks a language and the Urheimat for that language is complicated by "processes of migration, language shift and group absorption" that have been documented by linguists and ethnographers in groups that are themselves "transient and plastic." Thus, in the contact area in western Ethiopia between languages belonging to the Nilo-Saharan and Afroasiatic families, the Nilo-Saharan-speaking Nyangatom and the Afroasiatic-speaking Daasanach have been observed to be closely related to each other but genetically distinct from neighboring Afroasiatic-speaking populations. This reflects the fact that the Daasanach, like the Nyangatom, originally spoke a Nilo-Saharan language, with the ancestral Daasanach later adopting an Afroasiatic language around the 19th century.
Creole languages are hybrids of languages that are sometimes unrelated. Similarities between creoles arise from the creole formation process, rather than from genetic descent. For example, a creole language may lack significant inflectional morphology, lack tone on monosyllabic words, or lack semantically opaque word formation, even if these features are found in all of the parent languages from which the creole was formed.
Some languages are language isolates. That is to say, they have no well-accepted language family connection, no nodes in a family tree, and therefore no known Urheimat. An example is the Basque language of northern Spain and southwest France. Nevertheless, it is a scientific fact that all languages evolve. An unknown Urheimat may still be hypothesized, such as that for a Proto-Basque, and may be supported by archaeological and historical evidence.
Sometimes relatives are found for a language originally believed to be an isolate. An example is the Etruscan language, which, even though only partially understood, is believed to be related to the Rhaetic language and to the Lemnian language. A single family may be an isolate. In the case of the non-Austronesian indigenous languages of Papua New Guinea and the indigenous languages of Australia, there is no published linguistic hypothesis supported by any evidence that these languages have links to any other families. Nevertheless, an unknown Urheimat is implied. The entire Indo-European family itself is a language isolate: no further connections are known. This lack of information does not prevent some professional linguists from formulating additional hypothetical nodes (Nostratic) and additional homelands for the speakers.
Homelands of major language families
Map showing the present-day distribution of Indo-European languages in Eurasia (light green) and the likely Proto-Indo-European homeland (dark green).
Although Dravidian languages are now concentrated in southern India, with isolated pockets further north, placenames and substrate influences on Indo-Aryan languages indicate that they were once spoken more widely across the subcontinent. Reconstructed Proto-Dravidian terms for flora and fauna support the idea that Dravidian is indigenous to India. Proponents of a migration from the northwest cite the location of Brahui, a hypothesized connection to the undeciphered Indus Valley Script, and claims of a link to Elamite.
All modern Koreanic varieties are descended from the language of Unified Silla, which ruled the southern two-thirds of the Korean peninsula between the 7th and 10th centuries. Evidence for the earlier linguistic history of the peninsula is extremely sparse. The orthodox view among Korean social historians is that the Korean people migrated to the peninsula from the north, but no archaeological evidence of such a migration has been found.
The reconstruction of Sino-Tibetan is much less developed than for other major families, so its higher-level structure and time depth remain unclear. Proposed homelands and periods include: the upper and middle reaches of the Yellow River about 4-8 kya, associated with the hypothesis of a top-level branching between Chinese and the rest; southwestern Sichuan around 9 kya, associated with the hypothesis that Chinese and Tibetan form a subbranch; Northeast India (the area of maximal diversity) 9-10 kya.
Austroasiatic is widely held to be the oldest family in mainland Southeast Asia, with its current discontinuous distribution resulting from the later arrival of other families. The various branches share a great deal of vocabulary concerning rice cultivation, but few terms related to metals. Identification of the homeland of the family has been hampered by the lack of progress on its branching. The main proposals are northern India (favoured by those who assume an early branching of Munda), Southeast Asia (the area of maximal diversity) and southern China (based on claimed loanwords in Chinese).
The homeland of the Austronesian languages is widely accepted by linguists to be Taiwan, since nine of its ten branches are found there, with all Austronesian languages found outside Taiwan belonging to the remaining Malayo-Polynesian branch.
Some authorities on the history of the Uto-Aztecan language group place the Proto-Uto-Aztecan homeland in the border region between the USA and Mexico, namely the upland regions of Arizona and New Mexico and the adjacent areas of the Mexican states of Sonora and Chihuahua, roughly corresponding to the Sonoran Desert. The proto-language would have been spoken by foragers, about 5,000 years ago. Hill (2001) proposes instead a homeland further south, making the assumed speakers of Proto-Uto-Aztecan maize cultivators in Mesoamerica, who were gradually pushed north, bringing maize cultivation with them, during the period of roughly 4,500 to 3,000 years ago, the geographic diffusion of speakers corresponding to the breakup of linguistic unity.
The countries and autonomous regions where a Turkic language has official status.
There is considerable dispute over the time and place of origin of the Turkic languages, with candidates for their ancient homeland ranging from the Transcaspian steppe to Manchuria in Northeast Asia and south-central Siberia. The lack of written records prior to the earliest Chinese accounts, and the fact that the early Turkic peoples were nomadic pastoralists, and hence mobile, make localizing and dating the earliest homeland of the Turkic languages difficult. Attempts to localize the Proto-Turkic Urheimat are usually connected with the early archaeological horizon of west and central Siberia and the region south of it.
By the 6th century CE, the Turkic peoples lived in the Eurasian Steppe, including North China (especially Xinjiang Province and Inner Mongolia), Mongolia and the West Siberian Plain, possibly as far west as Lake Baikal and the Altai Mountains. After the Turkic migration, by the 10th century CE, most of Central Asia, formerly dominated by Iranian peoples, was settled by Turkic tribes. From the 11th century, the Seljuk Turks invaded Anatolia, ultimately resulting in permanent Turkic settlement there and the establishment of the Turkish nation. The Turkic languages are now spoken in Turkey, Iran, Central Asia and Siberia.
The Afro-Asiatic languages include Arabic, Hebrew, Berber, and a variety of other languages now found mostly in Northeastern Africa, although the exact boundaries of this language family are disputed in the case of a small number of languages spoken by small numbers of individuals in a few localized areas of Sudan and East Africa.
The limited area of the Afro-Asiatic Sprachraum (prior to its expansion to new areas in the historic era) narrows the potential areas where that family's Urheimat could be. Generally speaking, two proposals have been developed: that Afro-Asiatic arose in a Semitic Urheimat in the Middle East (Southwest Asia), or that Afro-Asiatic languages arose in northeast Africa (generally, either between Darfur and Tibesti or in Ethiopia and the other countries of the Horn of Africa). The African hypothesis is currently considered rather more likely, because of the greater diversity of languages with more distant relationships to each other there.
Serious linguistic proponents can be found for almost every conceivable set of relationships among the Afro-Asiatic subfamilies, although there is reasonably broad consensus concerning the subfamily classification of all but a few of the Afro-Asiatic languages. Some of the difficulty in resolving the Afro-Asiatic family tree flows from the time depth of these languages. The Afro-Asiatic Egyptian language of ancient Egypt (whose latest stage is known as Coptic) is one of the two oldest written languages on Earth (the other being Sumerian, a language isolate), dating in written form to approximately 3000 BCE, and the Semitic Akkadian language was also attested in writing from a very early date (ca. 2000 BCE). A common Afro-Asiatic proto-language is necessarily older than these very old written languages, which belonged to branches that had already diverged considerably from each other by that point. Nor is there a single genetic profile shared by Afro-Asiatic speakers that clearly unites them. There are also competing theories on whether the Afro-Asiatic language family owes its expansion to the Neolithic Revolution, which originated in an area that includes the range of the Afro-Asiatic languages, or was already widespread in the Upper Paleolithic era.
There has been speculation regarding the homeland of the Semitic subfamily of Afro-Asiatic languages in particular, again with the Horn of Africa and Southwest Asia (specifically the Levant) being the most common proposals. The large number of Semitic languages present in the Horn of Africa seems at first glance to support the hypothesis that the Semitic homeland lies there. However, the Semitic languages in the Horn of Africa all belong to the South Semitic subfamily and all appear to have relatively recent common origins in a single Ethio-Semitic proto-language, while the East and Central Semitic languages are native solely to Asia. These features, and the presence in all Ethio-Semitic languages of certain common Semitic lexical items referring to items that arrived in Africa from the Levant at a time after Semitic languages were known to have been spoken in the Levant, have lent weight to the Levantine proposal.
Hebrew is relatively closely related to the Arabic language even within the Semitic language family, being part of the same Central Semitic group.
The Maltese language, the only Semitic language of Europe, is a derivative of the Arabic language as it was spoken in Sicily starting sometime after the rise of the Islamic empire in North Africa.
The homeland of the Niger-Congo languages is not known in time or place. (The family includes the Benue-Congo subfamily, which in turn includes the Bantu languages.) The homeland probably lay in or near the area where these languages were spoken prior to the Bantu expansion (i.e. West Africa or Central Africa), and the family probably predated the Bantu expansion of ca. 3000 BCE through 500 CE by many thousands of years. Its expansion may have been associated with the expansion of Sahel agriculture in the African Neolithic period.
According to linguist Roger Blench, as of 2004, all specialists in Niger-Congo languages believe the languages to have a common origin, rather than merely constituting a typological classification, for reasons including their shared noun-class system, their shared verbal extensions and their shared basic lexicon. Similar classifications have been made ever since Diedrich Westermann in 1922. Joseph Greenberg continued that tradition, making it the starting point for modern linguistic classification in Africa, with some of his most notable publications going to press starting in the 1960s. However, there has been active debate for many decades over the appropriate subclassification of the languages in that family, and subclassification is a key tool in localizing a family's place of origin. No definitive "Proto-Niger-Congo" lexicon or grammar has been developed for the language family as a whole.
An important unresolved issue in determining the time and place where the Niger-Congo languages originated, and their range prior to recorded history, is this language family's relationship to the Kordofanian languages now spoken in the Nuba mountains of Sudan, an area that is not contiguous with the remainder of the Niger-Congo-speaking region and lies at its northeasternmost extent. The current prevailing linguistic view is that the Kordofanian languages are part of the Niger-Congo language family, and that among the many languages still surviving in that region these may be the oldest. The evidence is insufficient to determine whether this outlier group of Niger-Congo speakers represents a prehistoric range of a Niger-Congo linguistic region that has since contracted as other languages have intruded, or instead represents a group of Niger-Congo speakers who migrated to the area at some point in prehistory and were an isolated linguistic community from the beginning.
The prehistoric range of the Niger-Congo languages has implications not just for the history of the Niger-Congo languages, but also for the origins of the Afro-Asiatic and Nilo-Saharan languages, whose homelands have been hypothesized by some to overlap with the Niger-Congo linguistic range prior to recorded history. If the consensus view regarding the origins of the Nilo-Saharan languages which came to East Africa is adopted, and a North African or Southwest Asian origin for Afro-Asiatic languages is assumed, the linguistic affiliation of East Africa prior to the arrival of Nilo-Saharan and Afro-Asiatic languages is left open. The overlap between the potential areas of origin for these languages in East Africa is particularly notable because it includes the regions from which the Proto-Eurasians who took anatomically modern humans out of Africa, and presumably their original proto-language or languages, originated.
However, there is more agreement regarding the place of origin of the Benue-Congo subfamily of languages, which is the largest subfamily of the group, and the place of origin of the Bantu languages and the time at which they started to expand are known with great specificity.
The classification of the relatively divergent Ubangian languages, which are centered in the Central African Republic, as part of the Niger-Congo language family (where Greenberg placed them in 1963, with subsequent scholars concurring) was called into question by linguist Gerrit Dimmendaal in a 2008 article.
The Benue-Congo homeland
Roger Blench, relying particularly on prior work by Professor Kay Williamson of the University of Port Harcourt, and the linguist P. De Wolf, who each took the same position, has argued that a Benue-Congo linguistic subfamily of the Niger-Congo language family, which includes the Bantu languages and other related languages and would be the largest branch of Niger-Congo, is an empirically supported grouping which probably originated at the confluence of the Benue and Niger Rivers in Central Nigeria. These estimates of the place of origin of the Benue-Congo language family do not fix a date for the start of that expansion other than that it must have been sufficiently prior to the Bantu expansion to allow for the diversification of the languages within this language family that includes Bantu.
Bowern, Claire; Atkinson, Quentin (2012). "Computational Phylogenetics and the Internal Structure of Pama-Nyungan". Language. 88 (4): 817-845.
Kayser, Manfred (2010), "The Human Genetic History of Oceania: Near and Remote Views of Dispersal", Current Biology, 20 (4): R194-201, doi:10.1016/j.cub.2009.12.004, PMID 20178767, S2CID 7282462
Bengtson and Ruhlen (1994) offered a list of 27 "global etymologies". Bengtson, John D. and Merritt Ruhlen. 1994. "Global etymologies". In Ruhlen 1994a, pp. 277-336. This approach has been criticized as flawed by Campbell and Poser (2008), who used the same criteria employed by Bengtson and Ruhlen to identify "cognates" in Spanish known to be false. Campbell, Lyle, and William J. Poser. 2008. Language Classification: History and Method. Cambridge: Cambridge University Press, 370-372.
McWhorter, J. H. (1998), "Identifying the Creole Prototype: Vindicating a Typological Class", Language, 74 (4): 788-818, doi:10.2307/417003, JSTOR 417003
McWhorter, John H. (1999), "The Afrogenesis Hypothesis of Plantation Creole Origin", in Huber, Magnus; Parkvall, Mikael (eds.), Spreading the Word: The Issue of Diffusion among the Atlantic Creoles, London: Westminster University Press, pp. 111-152
Serafim, Leon A. (2008). "The uses of Ryukyuan in understanding Japanese language history". In Frellesvig, Bjarne; Whitman, John (eds.). Proto-Japanese: Issues and Prospects. John Benjamins. pp. 79-99. ISBN 978-90-272-4809-1. p. 98.
Roger Blench, "Stratification in the peopling of China: how far does the linguistic evidence match genetics and archaeology?," Paper for the Symposium "Human migrations in continental East Asia and Taiwan: genetic, linguistic and archaeological evidence". Geneva June 10-13, 2004. Université de Genève.
Ostapirat, Weera. (2005). "Kra-Dai and Austronesian: Notes on phonological correspondences and vocabulary distribution", pp. 107-131 in Sagart, Laurent, Blench, Roger & Sanchez-Mazas, Alicia (eds.), The Peopling of East Asia: Putting Together Archaeology, Linguistics and Genetics. London/New York: Routledge-Curzon.
Sidwell, Paul (2015). "Austroasiatic Classification". In Jenny, Mathias; Sidwell, Paul (eds.). The Handbook of the Austroasiatic Languages. Leiden: BRILL. pp. 144-220. ISBN 978-90-04-28295-7. p. 146.
Rau, Felix; Sidwell, Paul (2019). "The Munda maritime hypothesis". Journal of the Southeast Asian Linguistics Society. 12 (2): 35-57. hdl:10524/52454. pp. 42-44.
Potter, Ben A. (2010). "Archaeological Patterning in Northeast Asia and Northwest North America: An Examination of the Dene-Yeniseian Hypothesis". Anthropological Papers of the University of Alaska. 5 (1-2): 138-167.
Bakker, Peter (2013). "Diachrony and typology in the history of Cree". In Folke Josephson; Ingmar Söhrman (eds.). Diachronic and typological perspectives on verbs. Amsterdam: John Benjamins. pp. 223-260.
Golla, Victor (2011). California Indian Languages. Berkeley: University of California Press. p. 256.
Jane H. Hill, "Proto-Uto-Aztecan", American Anthropologist, 2001. JSTOR 684121.
Mallory, J.P. (1989), In Search of the Indo-Europeans: Language, Archaeology, and Myth, London: Thames & Hudson.
Mallory, James P. (1997), "The homelands of the Indo-Europeans", in Blench, Roger; Spriggs, Matthew (eds.), Archaeology and Language, I: Theoretical and Methodological Orientations, London: Routledge, ISBN 978-0-415-11760-9.
Mallory, J.P.; Adams, D.Q. (2006), The Oxford Introduction to Proto-Indo-European and the Proto-Indo-European World (Repr. ed.), Oxford: Oxford University Press, ISBN 978-0-19-928791-8.