A language isolate, in the absolute sense, is a natural language with no demonstrable genealogical (or "genetic") relationship with other languages, one that has not been demonstrated to descend from an ancestor common with any other language. Language isolates are in effect language families consisting of a single language. Commonly cited examples include Ainu, Basque, Sumerian, Elamite, and Vedda, though in each case a minority of linguists claim to have demonstrated a relationship with other languages.
Some sources use the term "language isolate" to indicate a branch of a larger family with only one surviving member. For instance, Albanian, Armenian and Greek are commonly called Indo-European isolates. While part of the Indo-European family, they do not belong to any established branch (such as the Romance, Indo-Iranian, Celtic, Slavic or Germanic branches), but instead form independent branches. Similarly, within the Romance languages, Sardinian is a relative isolate. However, without a qualifier, isolate is understood to mean having no demonstrable genetic relationship to any other known language.
Some languages once seen as isolates may be reclassified as small families. This happened with Japanese (now included in the Japonic family along with Ryukyuan languages such as Okinawan) and with Korean (now sometimes placed in a Koreanic family along with the Jeju language). The Etruscan language of Italy has long been considered an isolate, but some have proposed that it is related to the so-called Tyrsenian languages, an extinct family of closely related ancient languages proposed by Helmut Rix in 1998, including the Rhaetian language, formerly spoken in the central Alps, and the Lemnian language, formerly spoken on the Greek island of Lemnos.
Language isolates may be seen as a special case of unclassified languages that remain unclassified even after extensive efforts. If such efforts eventually do prove fruitful, a language previously considered an isolate may no longer be considered one, as happened with the Yanyuwa language of northern Australia, which has been placed in the Pama-Nyungan family. Since linguists do not always agree on whether a genetic relationship has been demonstrated, it is often disputed whether a language is an isolate or not.
The term "genetic relationship" is meant in the genealogical sense of historical linguistics, which groups most languages spoken in the world today into a relatively small number of families, according to reconstructed descent from common ancestral languages. A "genetic relationship" is a connection between languages, like similarities in vocabulary or grammar, that can be attributed to a common ancestral proto-language that diverged into multiple languages or branches. For example, English is related to other Indo-European languages and Mandarin Chinese is related to other Sino-Tibetan languages. By this criterion, each language isolate constitutes a family of its own, which explains the exceptional interest that these languages have received from linguists.
In some situations, a language with no ancestor can arise. This frequently happens with sign languages, most famously in the case of Nicaraguan Sign Language, where deaf children with no language were placed together and developed a new language. Similarly, if deaf parents were to raise a group of hearing children who have no contact with others until adulthood, those children might develop an oral language among themselves and keep using it later, teaching it to their children, and so on. Eventually, it could develop into the full-fledged language of a population. With spoken languages, this is not very likely to occur at any one time but, over the tens of thousands of years of human prehistory, the likelihood of this occurring at least a few times increases. There are also creole languages and constructed languages such as Esperanto, which do not descend directly from a single ancestor but have become the language of a population; however, they do take elements from existing languages.
Caution is required when speaking of extinct languages as isolates. Despite their great age, Sumerian and Elamite can be safely classified as isolates, as the languages are well enough known that, if modern relatives existed, they would be recognizably related.
However, many extinct languages are very poorly attested, and the fact that they cannot be linked to other languages may be a reflection of our poor knowledge of them. Hattic, Gutian, and Kassite are also believed by the mainstream majority to be isolates, but their status is disputed by a minority of linguists. Many extinct languages of the Americas such as Cayuse and Majena may likewise have been isolates. A language thought to be an isolate may turn out to be relatable to other languages once enough material is recovered, but material is unlikely to be recovered if a language was not documented in writing.
A number of sign languages have arisen independently, without any ancestral language, and thus are true language isolates. The most famous of these is Nicaraguan Sign Language, a well-documented case of what has happened in schools for the deaf in many countries. In Tanzania, for example, there are seven schools for the deaf, each with its own sign language with no known connection to any other language. Sign languages have also developed outside schools, in communities with high incidences of deafness, such as Kata Kolok in Bali, the Adamorobe Sign Language in Ghana, the Urubu Sign Language in Brazil, several Mayan sign languages, and half a dozen sign languages of the hill tribes in Thailand including the Ban Khor Sign Language.
These and others are all presumed to be isolates or small local families: many deaf communities are made up of people whose hearing parents do not use sign language, and the languages themselves show that they were not borrowed from other deaf communities during their recorded history.
Below is a list of known language isolates, arranged by continent, along with notes on possible relations to other languages or language families.
The Status column indicates the long-term viability of the language, according to the definitions of the UNESCO Atlas of the World's Languages in Danger. "Vibrant" languages are those in full use by speakers of every generation, with consistent native acquisition by children. "Vulnerable" languages have a similarly wide base of native speakers, but a restricted use and the long-term risk of language shift. "Endangered" languages are either acquired irregularly or spoken only by older generations. "Moribund" languages have only a few remaining native speakers, with no new acquisition, highly restricted use, and near-universal bilingualism. "Extinct" languages have no native speakers, but are sufficiently documented to be classified as isolates.
With few exceptions, all of Africa's languages have been gathered into four major phyla: Afroasiatic, Niger-Congo, Nilo-Saharan and Khoisan. However, the genetic unity of some language families, like Nilo-Saharan, is questionable, and so there may be many more language families and isolates than currently accepted. Data for several African languages, like Kwadi and Kwisi, are not sufficient for classification. In addition, Jalaa, Shabo, Laal, Kujargé, and a few other languages within Nilo-Saharan and Afroasiatic-speaking areas may turn out to be isolates upon further investigation. Defaka and Ega are highly divergent languages located within Niger-Congo-speaking areas, and may also be language isolates.
|Bangime||2,000||Vibrant||Mali||Spoken in the Bandiagara Escarpment. Used as an anti-language.|
|Hadza||1,000||Vulnerable||Tanzania||Spoken on the southern shore of Lake Eyasi in the southwest of Arusha Region. Once listed as an outlier among the Khoisan languages. Language use is vigorous, though there are fewer than 1,000 speakers.|
|Jalaa||Extinct||Nigeria||Strongly influenced by Dikaka, but most vocabulary is very unusual.|
|Laal||750||Moribund||Chad||Spoken in three villages along the Chari River in Moyen-Chari Region. Poorly known. Also known as Gori. Possibly a distinct branch of Niger-Congo, Chadic of the Afroasiatic languages, or mixed.|
|Sandawe||60,000||Vibrant||Tanzania||Spoken in the northwest of Dodoma Region. Tentatively linked to the Khoe languages.|
|Shabo||400||Endangered||Ethiopia||Spoken in Anderaccha, Gecha, and Kaabo of the Southern Nations, Nationalities, and Peoples' Region. Linked to the Gumuz and Koman families in the proposed Komuz branch of the Nilo-Saharan languages.|
|Ainu ?||2||Moribund||Japan, Russia||Formerly spoken on southern Sakhalin, and all of the Kuril Islands and Hokkaido, now reduced to a handful of speakers in Hokkaido. May actually constitute a small language family, if the extinct varieties are classed as languages rather than dialects. Possibly related to the unattested language of the Emishi.|
|Burushaski||96,800||Vulnerable||Pakistan||Spoken in the Hunza Valley of Gilgit-Baltistan. Linked to Caucasian languages, Indo-European, and Na-Dene languages in various proposals.|
|Elamite||Extinct||Iran||Formerly spoken in Elam, along the northeast coast of the Persian Gulf. Attested from around 2800 BC to 300 BC. Some propose a relationship to the Dravidian languages (see Elamo-Dravidian), but this is not well-supported.|
|Korean ?||77,230,000||Vibrant||North Korea, South Korea and Northeast China||More speakers than all other language isolates combined. Connections to the Altaic languages have been proposed but are widely discredited. It has also been proposed that Korean may be related to Japanese in the Japanese-Korean classification hypothesis, both with and without a common Altaic ancestor. Sometimes classified as a language family, forming the Koreanic family, if the Jeju dialect is classified as a separate language rather than a Korean dialect.|
|Kusunda||87 (2014)||Moribund||Nepal||Spoken in the Gandaki Zone. The recent discovery of a few speakers shows that it is not demonstrably related to anything else.|
|Nihali||2,000||Endangered||India||Also known as Nahali. Spoken in northeastern Maharashtra and southwestern Madhya Pradesh, along the Tapti River. Strong lexical Munda influence from Korku. Used as an anti-language by speakers.|
|Nivkh ?||200||Moribund||Russia||Also known as Gilyak. Spoken in the lower Amur River basin and in the northern part of Sakhalin. Dialects sometimes considered two languages. Has been linked to Chukotko-Kamchatkan languages.|
|Sumerian||Extinct||Iraq||Spoken in Mesopotamia until around 1800 BC, but used as a classical language until 100 AD. Long-extinct but well-attested language of ancient Sumer. Included in various proposals involving everything from Basque to the Sino-Tibetan languages.|
Current research holds that the "Papuasphere" centered on New Guinea includes as many as 37 isolates. (As more becomes known about these languages, they become more likely to be assigned to a known language family.) To these, one must add several isolates found among the non-Pama-Nyungan languages of Australia:
|Abinomn||300||Vibrant||Indonesia||Spoken in the far north of New Guinea. Also known as Bas or Foia. Language use is vigorous, despite low number of speakers.|
|Anêm ?||800||Vibrant||Papua New Guinea||Spoken on the northwest coast of New Britain. Perhaps related to Yélî Dnye and Ata.|
|Ata ?||2,000||Vibrant||Papua New Guinea||Spoken in the central highlands of New Britain. Also known as Wasi. Perhaps related to Yélî Dnye and Anêm.|
|Giimbiyu ?||Extinct||Australia||Spoken in the northern part of Arnhem Land until the early 1980s. Sometimes considered a small language family consisting of Mengerrdji, Urningangk and Erre. Part of a proposal for the undemonstrated Arnhem Land language family.|
|Kol||4,000||Vibrant||Papua New Guinea||Spoken in the northeastern part of New Britain. Possibly related to the poorly-known Sulka, or the Baining languages.|
|Kuot||2,400||Vulnerable||Papua New Guinea||Spoken on New Ireland. Also known as Panaras.|
|Malak-Malak||10||Moribund||Australia||Spoken in northern Australia. Often considered part of one Northern Daly family together with Tyeraity. Used to be considered genetically related to the Wagaydyic languages, but nowadays they are considered genetically distinct.|
|Murrinh-patha ?||1,973||Vibrant||Australia||Spoken on the eastern coast of Joseph Bonaparte Gulf in the Top End. The proposed linkage to Ngan'gityemerri in one Southern Daly family is generally accepted to be valid.|
|Ngan'gityemerri ?||26||Moribund||Australia||Spoken in the Top End along the Daly River. The proposed linkage to Murrinh-patha in one Southern Daly family is generally accepted to be valid.|
|Sulka||2,500-3,000||Vibrant||New Britain, Papua New Guinea||Possible language isolate spoken across the eastern end of New Britain. Poorly attested.|
|Tayap||>50||Moribund||Papua New Guinea||Formerly spoken in the village of Gapun. Links to the Lower Sepik languages and Torricelli languages have been explored, but the general consensus among linguists is that it is an isolate unrelated to surrounding languages.|
|Tiwi||2,040||Vulnerable||Australia||Spoken in the Tiwi Islands in the Timor Sea. Traditionally Tiwi is polysynthetic, but the Tiwi spoken by younger generations is not.|
|Wagiman||11||Moribund||Australia||Spoken in the southern part of the Top End. May be distantly related to the Yangmanic languages, which might in turn be a member of the Macro-Gunwinyguan family, but neither link has been demonstrated.|
|Wardaman||50||Moribund||Australia||Spoken in the southern part of the Top End. The extinct and poorly-attested Dagoman and Yangman dialects are sometimes treated as separate languages, forming a Yangmanic family, to which Wagiman may be distantly related. Possibly a member of the Macro-Gunwinyguan family, but this has yet to be demonstrated.|
|Basque||751,500 (2016), 1,185,500 passive speakers||Vulnerable||Spain, France||Natively known as Euskara, the Basque language, found in the historical region of the Basque Country between France and Spain, is the second most widely spoken language isolate after Korean. It has no known living relatives, although Aquitanian is commonly regarded as related to or a direct ancestor of Basque. Some linguists have claimed similarities with various languages of the Caucasus that are indicative of a relationship, while others have proposed a relation to Iberian and to the hypothetical Dené-Caucasian languages.|
|Alsea||Extinct||United States||Poorly attested. Spoken along the central coast of Oregon until 1942. Sometimes regarded as two separate languages. Often included in the Penutian hypothesis in a Coast Oregon Penutian branch.|
|Atakapa||Extinct||United States||Spoken on the Gulf coast of eastern Texas and southwestern Louisiana until the early 1900s. Often linked to Muskogean in a Gulf hypothesis.|
|Chimariko||Extinct||United States||Spoken in northern California until the 1950s. Part of the Hokan hypothesis.|
|Chitimacha||Extinct||United States||Well-attested. Spoken along the Gulf coast of southeastern Louisiana until 1940. Possibly in the Totozoquean family of Mesoamerica.|
|Coahuilteco||Extinct||United States, Mexico||Spoken in southern Texas and northeastern Mexico until the 1700s. Part of the Hokan hypothesis.|
|Cuitlatec||Extinct||Mexico||Spoken in northern Guerrero until the 1960s. Formerly considered Macro-Chibchan.|
|Esselen||Extinct||United States||Poorly known. Spoken in the Big Sur region of California until the early 1800s. Part of the Hokan hypothesis.|
|Haida||24||Moribund||Canada, United States||Spoken in the Haida Gwaii archipelago off the northwest coast of British Columbia, and the southern islands of the Alexander Archipelago in southeastern Alaska. Some proposals connect it to the Na-Dené languages, but these have fallen into disfavor.|
|Huave||18,000||Endangered||Mexico||Spoken in the Isthmus of Tehuantepec, in the southeast of Oaxaca state. Part of the Penutian hypothesis when extended to Mexico, but this idea has generally been abandoned.|
|Karuk||12||Moribund||United States||Spoken along the Klamath River in northwestern California. Part of the Hokan hypothesis.|
|Keres||10,670||Endangered||United States||Spoken in several pueblos throughout New Mexico, including Cochiti and Acoma Pueblos. Has two main dialects: Eastern and Western. Sometimes those two dialects are separated into languages in a Keresan family.|
|Kutenai||245||Moribund||Canada, United States||Spoken in the Rockies of northeastern Idaho, northwestern Montana and southeastern British Columbia. Attempts have been made to place it in a Macro-Algic or Macro-Salishan family, but these have not gained significant support.|
|Natchez||Extinct||United States||Spoken in southern Mississippi and eastern Louisiana until 1957. Often linked to Muskogean in a Gulf hypothesis. Attempts at revival have produced 6 people with some fluency.|
|Purépecha||124,494||Endangered||Mexico||Spoken in the north of Michoacán state. Language of the ancient Tarascan kingdom. Sometimes regarded as two languages.|
|Salinan||Extinct||United States||Spoken along the south-central coast of California until 1958. Part of the Hokan hypothesis.|
|Seri||764||Vulnerable||Mexico||Spoken along the coast of the Gulf of California, in the southwest of Sonora state. Formerly spoken on Tiburón Island in the Gulf of California. Part of the Hokan hypothesis.|
|Siuslaw||Extinct||United States||Spoken on the southwest coast of Oregon until the 1970s. Likely related to Alsea, Coosan languages, or possibly the Wintuan languages. Part of the Penutian hypothesis.|
|Takelma||Extinct||United States||Spoken in western Oregon until 1934. Part of the Penutian hypothesis. A specific relationship with Kalapuyan is now rejected.|
|Timucua||Extinct||United States||Well attested. Spoken in northern Florida and southern Georgia until the late 1700s. A connection with the poorly known Tawasa language has been suggested, but Tawasa may be a dialect of Timucua.|
|Tonkawa||Extinct||United States||Spoken in central and northern Texas until the early 1940s.|
|Tunica||Extinct||United States||Spoken in western Mississippi, northeastern Louisiana, and southeastern Arkansas until 1948. Attempts at revitalization have produced 32 second-language speakers.|
|Washo||20||Moribund||United States||Spoken along the Truckee River in the Sierra Nevada of eastern California and northwestern Nevada. Part of the Hokan hypothesis.|
|Yana||Extinct||United States||Well-attested. Spoken in northern California until 1916. Part of the Hokan hypothesis.|
|Yuchi||4||Moribund||United States||Spoken in Oklahoma, but formerly spoken in eastern Tennessee. A connection to the Siouan languages has been proposed.|
|Zuni||9,620||Vulnerable||United States||Spoken in Zuni Pueblo in northwestern New Mexico. Links to Penutian and Keres have been proposed.|
|Aikanã||200||Endangered||Brazil||Spoken in the Amazon of eastern Rondônia. Arawakan has been suggested.|
|Andoque||370||Endangered||Colombia, Peru||Spoken on the upper reaches of the Japurá River. Extinct in Peru. Possibly Witotoan.|
|Betoi||Extinct||Venezuela||Spoken in the Apure River basin near the Colombian border until the 18th century. Paezan has been suggested.|
|Camsá||4,000||Endangered||Colombia||Spoken in Sibundoy in the Putumayo Department. Also known as Kamsa, Coche, Sibundoy, Kamentxa, Kamse, or Camëntsëá.|
|Candoshi-Shapra||1,100||Endangered||Peru||Spoken along the Chapuli, Huitoyacu, Pastaza, and Morona river valleys in southwestern Loreto. Could be related to the extinct and poorly-attested Chirino language.|
|Canichana||Extinct||Bolivia||Spoken in the Llanos de Moxos region of Beni Department until around 2000. A connection with the extinct Tequiraca (Auishiri) has been proposed.|
|Cayuvava||4||Moribund||Bolivia||Spoken in the Amazon west of Mamore River, north of Santa Ana del Yacuma in the Beni Department.|
|Chimane||5,300||Vulnerable||Bolivia||Spoken along the Beni River in Beni Department. Also spelled Tsimané. Sometimes split into multiple languages in a Mosetén family. Linked to the Chonan languages in a Mosetén-Chonan hypothesis.|
|Chiquitano||5,900||Endangered||Bolivia, Brazil||Spoken in the eastern part of Santa Cruz department and the southwestern part of Mato Grosso state. Formerly regarded as a member of the Macro-Jê family, but this claim was unsubstantiated.|
|Cofán||2,400||Endangered||Colombia, Ecuador||Spoken in northern Sucumbíos Province and southern Putumayo Department. Also called A'ingae. Sometimes classified as Chibchan, but the similarities appear to be due to borrowings. Seriously endangered in Colombia.|
|Fulniô||1,000||Moribund||Brazil||Spoken in the states of Paraíba, Pernambuco, Alagoas, Sergipe, and the northern part of Bahia. Divided into two dialects, Fulniô and Yatê. Sometimes classified as a Macro-Jê language, but there is little evidence to support this.|
|Guató||6||Moribund||Brazil||Spoken in the far south of Mato Grosso near the Bolivian border. Previously classified as Macro-Jê, but no evidence was found to support this.|
|Itonama||5||Moribund||Bolivia||Spoken in the far-eastern part of Beni Department. Paezan has been suggested.|
|Kanoê||5||Moribund||Brazil||Spoken in southeastern Rondônia. Also known as Kapishana. Part of a Macro-Paesan proposal.|
|Kunza||Extinct||Chile||Spoken in areas near Salar de Atacama until the 1950s. Also known as Atacameño. Part of a Macro-Paesan proposal.|
|Kwaza||54||Moribund||Brazil||Spoken in eastern Rondônia. Connections have been proposed with Aikanã and Kanoê.|
|Leco||20||Moribund||Bolivia||Spoken in the Andes east of Lake Titicaca.|
|Mapuche||260,000||Vulnerable||Chile, Argentina||Spoken in areas of the far-southern Andes and in the Chiloé Archipelago. Also known as Mapudungun, Araucano or Araucanian. Considered a family of 2 languages by Ethnologue. Variously part of Andean, Macro-Panoan, or Mataco-Guaicuru proposals. Sometimes Huilliche is treated as a separate language, reclassifying Mapuche into an Araucanian family.|
|Munichi||Extinct||Peru||Spoken in the southern part of Loreto Region until the late 1990s. Possibly related to the Arawakan languages.|
|Movima||1,400||Vulnerable||Bolivia||Spoken in the Llanos de Moxos, in the north of Beni Department.|
|Oti||Extinct||Brazil||Spoken in São Paulo until the early 1900s. Macro-Jê has been suggested.|
|Páez||60,000||Vulnerable||Colombia||Spoken in the northern part of Cauca Department. Several proposed relationships in the Paezan hypothesis but nothing conclusive.|
|Puelche||Extinct||Argentina, Chile||Spoken in the Pampas region until the 1930s. Sometimes linked to Het. Included in a proposed Macro-Jibaro family.|
|Tequiraca||Extinct||Peru||Spoken in the central part of Loreto until the 1950s. Also known as Auishiri. A connection with Canichana has been proposed.|
|Trumai||51||Moribund||Brazil||Speakers settled on the upper Xingu River and currently reside in the Xingu National Park in the northern part of Mato Grosso.|
|Urarina||3,000||Vulnerable||Peru||Spoken in the central part of the Loreto Region. Part of the Macro-Jibaro proposal.|
|Waorani||2,000||Vulnerable||Ecuador, Peru||Also known as Sabela. Spoken between the Napo and Curaray rivers. Could be spoken by several uncontacted groups.|
|Warao||28,000||Endangered||Guyana, Suriname, Venezuela, Trinidad and Tobago||Spoken in the Orinoco Delta. Sometimes linked to Paezan.|
|Yaghan||1||Moribund||Chile||Spoken in far-southern Tierra del Fuego. Also called Yámana. Last native speaker is Cristina Calderón, who is 90 years old.|
|Yaruro||7,900||Vibrant||Venezuela||Spoken along the Orinoco, Cinaruco, Meta, and Apure rivers. Linked to the extinct Esmeralda language.|
|Yuracaré||2,700||Endangered||Bolivia||Spoken in the foothills of the Andes, in Cochabamba and Beni Departments. Connections to Mosetenan, Pano-Tacanan, Arawakan, and Chonan have been suggested.|