The linguistic and genetic mosaic of the Northwest Caucasus by Asya Pereltsvaig

Asya Pereltsvaig | Languages of the World

The Northwest Caucasus – including Russia’s internal republics of Adygea, Karachai-Cherkessia, and Kabardino-Balkaria, as well as parts of Krasnodar Krai in Russia proper – presents a veritably kaleidoscopic ethno-linguistic picture. As can be seen from this ethno-linguistic map of Karachai-Cherkessia, based on 2002 census data, Indo-European-speaking groups such as the Russians (shown in blue) and the Ossetians (in brown) coexist with Turkic-speaking peoples like the Karachais and Nogais (in two shades of green) and Turkic-speaking Greeks (in blue-green), as well as with ethnic groups who speak Northwest Caucasian languages (this map depicts “Cherkess” in orange and “Abaza” in yellow; more on these terms below). Similarly, the 2010 Adygean census lists “Adyghe” and “Cherkess” as constituting about a quarter of the region’s population, alongside 62% Russians, with the rest divided between three other Indo-European-speaking groups: Ukrainians, Armenians and Kurds. Kabardino-Balkaria is home to Northwest-Caucasian-speaking Kabardins; Turkic-speaking Balkars and Turks; Indo-European-speaking Russians, Ukrainians and Ossetians; as well as to some 2,500 Germans, 1,300 Jews and 4,700 Koreans. Finally, the 2002 census data for Krasnodar Krai lists the following groups (in the order of decreasing numbers): Russians, Armenians, Ukrainians, Greeks, Belorussians, Tatars, Adyghe, Georgians, Germans, Turks, Azeris, and Gypsies.

Two problems quickly become apparent when such ethno-demographic data are considered. First, while most ethnic groups used for census data gathering are defined in terms of the languages they speak, several groups are not, most notably the Turkic-speaking Greeks and the mostly Russian-speaking Jews, Germans and Gypsies. A more intractable problem concerns the groups speaking Northwest Caucasian languages: various sources use the terms “Adyghe”, “Cherkess”, “Circassian”, “Kabardin”, and “Shapsug” in different yet overlapping ways. For example, the 2002 census listing for Krasnodar Krai specifies in a footnote that the term “Adyghe” is meant to include “Adygeis, Kabardians, Cherkess, and Shapsug” (the original Russian terminology is adygi for ‘Adyghe’ and adygejcy for ‘Adygeis’). The UNESCO list of endangered languages lists Adyghe separately, while grouping Kabardian and Cherkess into one “Kabard-Cherkes” language (incidentally, both of these are listed as “vulnerable”, meaning that children speak the language but for the most part only at home). Some sources use “Cherkess” and “Adyghe” interchangeably; still other sources use “Cherkess” or “Circassian” as an umbrella term for all Northwest-Caucasian-speaking groups.

The origins of these terms help explain their confused usage today. Adyghe is the autonym (self-designation) of the group. Moreover, Kabardins refer to themselves as Kebertei or Kebertei-Adyghe, and Shapsugs use Shapsyg-Adyghe. In contrast, the term Circassian is the exonym by which speakers of Northwest Caucasian languages are most commonly known to the outside world. It derives from the Turkic designation “Cherkess” that has been adopted by Russian and other languages and became fixed in the European and Asian literatures.

Curiously, the linguistic classification of Northwest Caucasian languages correlates closely with the self-designations of these groups. As the alternative name of this family, Abkhazo-Adygean, indicates, it consists of two main subfamilies: Abkhazian and Adygean. The latter includes two literary languages, Adyghe and Kabardian, and their local dialects (e.g. Shapsug, Temirgoy, and Abadzakh are properly considered dialects of Adyghe). These two languages exhibit close affinity to each other, and some scholars consider them to form a dialect continuum, meaning that some local dialects occupy an intermediate position between the literary standards of Adyghe and Kabardian (for example, the Beslenei dialect is closer to Kabardian, except that its pronunciation of consonants is more Adyghe-like). This high degree of similarity between Adyghe and Kabardian means that the two languages share a relatively recent common ancestor; indeed, we learn from historical sources that the split between the two subgroups of the Adyghe people happened about 1,500 years ago. Today, Adyghe is spoken by approximately 500,000 people in Turkey, Russia, Jordan, Syria, Iraq, Israel, Macedonia, and the United States (there is even a Circassian Association of California). Its closest relative, Kabardian, has about 1,600,000 speakers in Turkey, Russia, Jordan, Syria, and Germany. To avoid further confusion, in what follows I will use the term Circassian as an umbrella designation for the Adyghe and the Kabardians, but not for the rest of the Northwest Caucasian groups.

The second branch of the Northwest Caucasian language family, the Abkhaz–Abaza branch, consists of two languages: Abaza (with 48,000 speakers in Russia and Turkey) and Abkhaz (with 117,000 speakers, mostly in the Greater Caucasus mountain range in Abkhazia, as well as in smaller communities in Turkey). The fifth Northwest Caucasian language, Ubykh, now extinct, was once spoken in the area around Sochi in Russia. After the expulsions in 1864, Ubykh was mostly spoken in the Istanbul area, near the Sea of Marmara, but its use gradually declined, and its last fully competent speaker, Tevfik Esenç, died on October 7, 1992. The ethnic Ubykh community now speaks a distinct dialect of Adyghe, according to the Ethnologue website.
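As a compact summary of the classification just described, the family can be written down as a small nested structure. This is only an illustrative sketch in Python: the dictionary layout and the names `FAMILY` and `total_speakers` are hypothetical, the speaker counts are the approximate figures quoted above, and Ubykh is kept at the top level because the text does not assign it to either subfamily.

```python
# Illustrative sketch only: the layout and names are assumptions; the
# figures are the approximate speaker counts quoted in the text.
FAMILY = {
    "Adygean": {
        "Adyghe": {
            "speakers": 500_000,
            "dialects": ["Shapsug", "Temirgoy", "Abadzakh"],
        },
        "Kabardian": {"speakers": 1_600_000},
    },
    "Abkhaz-Abaza": {
        "Abaza": {"speakers": 48_000},
        "Abkhaz": {"speakers": 117_000},
    },
    # Extinct since 1992; not assigned to either subfamily in the text.
    "Ubykh": {"speakers": 0},
}


def total_speakers(node):
    """Sum speaker counts over a single language or a whole branch."""
    if "speakers" in node:  # a single language entry
        return node["speakers"]
    return sum(total_speakers(child) for child in node.values())


print(total_speakers(FAMILY["Adygean"]))  # 2100000
print(total_speakers(FAMILY))             # 2265000
```

On these figures, the two Circassian languages account for the great majority of all Northwest Caucasian speakers.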

Among the linguistic peculiarities of the Northwest Caucasian languages, including Adyghe, is the relative paucity of vowels coupled with the abundance and complexity of consonants. Depending on the analysis, Northwest Caucasian languages have just 2 or 3 vowels, but to compensate for the shortage of vowels, they have very rich systems of consonants. Adyghe and Kabardian actually have the simplest consonantal inventories of the five Northwest Caucasian languages. For example, Kabardian features a “mere” 48 consonants, including some rather unusual ejective fricatives, pharyngeals (i.e. sounds articulated with the root of the tongue against the pharynx, at the back of the throat) and interdentals (i.e. “th”-sounds). Ubykh, on the other hand, had one of the largest consonant inventories in the world, and probably the largest outside the Khoisan languages – a whopping 81 consonants (according to some analyses).

How does a language end up with such a skewed ratio of consonants to vowels? Historical linguists believe that Northwest Caucasian languages developed so many consonants at the expense of vowels because of a historical change in which vowel features were reassigned to preceding consonants. For example, ancestral */ki/ became /kʲə/, with palatalization (i.e. moving the tongue closer to the roof of the mouth) being reassigned from the vowel /i/ to the consonant /kʲ/; similarly, ancestral */ku/ became /kʷə/, with labialization (i.e. the rounding of lips) similarly reassigned. Note that in both cases, the vowels have been neutralized to a schwa /ə/.
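The reassignment described above can be mimicked with a toy rewrite rule. The following Python sketch is an illustration rather than a reconstruction of the actual sound laws: it encodes only the two correspondences given here (*/ki/ > /kʲə/ and */ku/ > /kʷə/), and the names `VOWELS`, `FEATURE_OF` and `reassign_vowel_features`, as well as the small vowel set, are assumptions made for the demonstration.

```python
# Toy model of the change described above; it encodes only the two
# correspondences given in the text (*ki > kʲə, *ku > kʷə).
VOWELS = {"a", "i", "u", "ə"}      # assumed segment inventory for the demo
FEATURE_OF = {"i": "ʲ", "u": "ʷ"}  # palatalization, labialization


def reassign_vowel_features(segments):
    """Move the feature of /i/ or /u/ onto a preceding consonant."""
    out = []
    for seg in segments:
        prev_is_consonant = bool(out) and out[-1] not in VOWELS
        if seg in FEATURE_OF and prev_is_consonant:
            out[-1] += FEATURE_OF[seg]  # k -> kʲ or kʷ
            out.append("ə")             # the vowel is neutralized to schwa
        else:
            out.append(seg)
    return out


print("".join(reassign_vowel_features(["k", "i"])))  # kʲə
print("".join(reassign_vowel_features(["k", "u"])))  # kʷə
```

Each application trades one vowel contrast for one new consonant, which is how a language can end up with a single plain /k/ multiplied into /k/, /kʲ/ and /kʷ/ while its vowels collapse toward schwa.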

From the grammatical point of view, the most interesting property of the Northwest Caucasian languages, including Adyghe, is their polysynthetic nature: each verb is marked for agreement with all arguments, not only with subjects (as in more familiar Indo-European languages like English: The children play but The child plays), but also with objects and indirect objects. (Other polysynthetic languages around the world include: Wichita, a Caddoan language spoken in west-central Oklahoma; Nahuatl, an Uto-Aztecan language of central Mexico; Mapudungun, an Araucanian language spoken in central Chile; Nunggubuyu, an Australian aboriginal language from the Gunwingguan family; Chukchi and Koryak, two Chukotko-Kamchatkan languages in northeastern Siberia; Ainu of northern Japan; and Sora, a Munda language spoken in India.)

Since agreement prefixes on the verb encode “who did what to whom”, the systems of noun cases in Northwest Caucasian languages are rather underdeveloped (for instance, Abkhaz distinguishes just two cases, the nominative and the adverbial). More generally, the verb is where the morphosyntactic “action” is centered in these languages, and the verb may include not only agreement prefixes but also locative, directional, reflexive, causative, and other morphemes. In effect, virtually the entire syntactic structure of the sentence can be packed within the verb, as in the following examples of Adyghe verbs:

(1)  i-    zo-   gˆa-   tq
     him-  I-    make-  write
     ‘I will force him to write.’

(2)  nə-     pq-   ue-   z-   gˆa-   tqa
     there-        you-  I-   make-  write
     ‘I forced him to write you something.’

(3)  u-    q̣a-    ze-   gˆa-   pṭl
     you-  here-        make-  look
     ‘He forces you to look here at me.’

[examples from Shakryl 1971, pp. 24-25, using standard transliteration]
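To make the "whole sentence inside one verb" point concrete, example (1) can be treated as an ordered list of slots, each morpheme paired with its gloss. This is a minimal Python sketch, not a morphological analyzer: the names `MORPHEMES`, `GLOSSES` and `interlinear` are hypothetical, and only example (1) is used because every one of its morphemes has a gloss.

```python
# Example (1) from the text, split into its morphemes and their glosses.
MORPHEMES = ["i-", "zo-", "gˆa-", "tq"]
GLOSSES = ["him-", "I-", "make-", "write"]


def interlinear(morphemes, glosses):
    """Pair each morpheme with its gloss, preserving slot order."""
    if len(morphemes) != len(glosses):
        raise ValueError("every morpheme needs exactly one gloss")
    return list(zip(morphemes, glosses))


for slot, (morpheme, gloss) in enumerate(interlinear(MORPHEMES, GLOSSES), start=1):
    print(f"slot {slot}: {morpheme} = {gloss}")
```

Three of the four slots are prefixes and only the last is the root, so the two participants and the causative relation between them are all expressed before the verb stem itself appears.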

The polysynthetic nature of verbs coupled with the paucity of cases is rather peculiar to Northwest Caucasian languages; Northeast Caucasian languages, in contrast, place the morphosyntactic “action” on nouns rather than on verbs (i.e. these languages have rich case systems and little to no agreement on verbs), whereas South Caucasian languages combine a relatively rich verbal agreement with relatively rich case systems.

Another notable aspect of Adyghe is its vocabulary. In addition to the core word stock shared with other Northwest Caucasian languages, Adyghe also has a significant number of loanwords (some of which are also shared with other Northwest Caucasian languages, especially with Kabardian). The main sources of such loanwords are Russian, Arabic, Farsi (Persian) and Turkic languages. According to A. K. Shagirov, loanwords from Russian constitute the bulk of the foreign vocabulary in Adyghe, indicating the extent and diversity of contacts between the Russians and the Adyghes. Unsurprisingly, however, the Adyghe language of diasporic communities in the former Ottoman lands features many more loanwords from Turkic languages.

Each source language contributed words in certain semantic domains. For example, Russian borrowings include words for everyday objects (Adyghe kastrul ‘pot’, cf. Russian kastrjulja), clothing (Adyghe trusik ‘underpants’, cf. Russian trusiki), foods (Adyghe pičen ‘cookie’, cf. Russian pečenje), construction and engineering terms (Adyghe tormaz ‘brakes’, cf. Russian tormoz), medicine (Adyghe prastud ‘common cold’, cf. Russian prostuda), education, science, culture and sports (Adyghe mestaimenija ‘pronoun’, cf. Russian mestoimenije), government, administration, military and the law (Adyghe zakon ‘law’, cf. Russian zakon). In contrast, many of the Arabic loanwords in Adyghe have to do with Islam and Muslim ethics, traditions and holidays: älah ‘God’, hädžə ‘one who made pilgrimage to Mecca’, din ‘religion, faith’, among others. Some of these Arabic loanwords, and all loanwords from Farsi, are said to have been borrowed into Adyghe via such Turkic languages as Turkish and Crimean Tatar. Turkic-derived vocabulary in Adyghe – borrowed from a variety of languages including Turkish, Tatar, Nogai, Karachai-Balkar, and others – includes words for everyday objects, foods, items of clothing, names of animals and plants, trade and military terms, such as bajäu ‘paint’, ästlän ‘lion’, äjvä ‘quince’, äqšə ‘money’, and many others.

One particularly interesting loanword in Adyghe, mentioned by Shagirov, is the word qazar meaning ‘one who asks too much for his goods and does not reduce the price’. Shagirov relates the etymology of this word to the ethnonym Khazars, a Turkic-speaking group whose territory comprised large portions of the northern Caucasus, including the lands of the Adyghe (see map on the left). It is well known that the Khazars were instrumental in maintaining trade networks between Europe and the East; if the etymology of the Adyghe word qazar proposed by Shagirov is correct, it speaks volumes about the trade relations between the Adyghe and the Khazars.

More generally, both the extent of the loanword vocabulary in Adyghe and the semantic areas in which loanwords are especially common reveal that the contacts between the Adyghes and the other groups – the Russians, the Turkic-speaking peoples, the Arabs, and the Iranians – were limited mostly to military conquest, administrative rule, and trade. The grammatical peculiarities of Adyghe and other Northwest Caucasian languages, such as their intricate verbal complexes and the paucity of nominal cases, indicate that the contacts could not have involved much intermarriage, which would have led to more extensive penetration of grammatical features from the languages of the other groups.

One would hope that genetic findings would shed more light on the origin of Northwest Caucasian groups and their interactions with their neighbors, especially intermarriage between groups. However, there has been relatively little research into the DNA of Northwest-Caucasian-speaking groups. The paucity of work on this topic is compounded by the confusing terminology, as described above, which makes it hard to figure out exactly which groups were tested for which study. But the overall picture that emerges from studies of Y-DNA by Balanovsky et al. (2011) and by Yunusbayev et al. (2011) is as follows. The most frequent haplogroup among Northwest Caucasian peoples is haplogroup G2a; however, unlike the Ossetians, who are highest in subgroup G2a1-P16 (blue in the map to the left, from Balanovsky et al.), Northwest Caucasian peoples are highest in subgroup G2a3b1-P303 (yellow in the map on the left). This subgroup seems to be peculiar to them, though it is also found in some European populations. In the Caucasus its frequency is highest in the west and diminishes to the east. Thus, Shapsugs, the westernmost Northwest Caucasian group, have the highest frequency of this subgroup, over 80%.

The two other haplogroups that are frequent in the Northwest Caucasian populations are J2a and R1a1. The former, especially its subgroup J2a4b*-M67, is found in about 6-12% of Northwest Caucasian men (depending on the study and the group), and has its highest frequency among the Ingush, up to 88% (shown in red in the map above). Haplogroup R1a1-M17 is found at a frequency similar to that of J2a – about 6-14% of Northwest Caucasian men have this genetic signature (again, depending on the study and the group). This haplogroup became known as the “Eastern European DNA”; it is also the only haplogroup found in what are thought to be Scythian skeletons from the south Siberian steppes to the northeast of the Caucasus. In fact, Keyser et al. (2009) called this haplogroup the “mark [of] the eastward migration of the early Indo-Europeans”.


Balanovsky, Oleg; Khadizhat Dibirova; Anna Dybo; Oleg Mudrak; Svetlana Frolova; Elvira Pocheshkhova; Marc Haber; Daniel Platt; Theodore Schurr; Wolfgang Haak; Marina Kuznetsova; Magomed Radzhabov; Olga Balaganskaya; Alexey Romanov; Tatiana Zakharova; David F. Soria Hernanz; Pierre Zalloua; Sergey Koshel; Merritt Ruhlen; Colin Renfrew; R. Spencer Wells; Chris Tyler-Smith; Elena Balanovska; and The Genographic Consortium (2011) Parallel Evolution of Genes and Languages in the Caucasus Region. Molecular Biology and Evolution 28(10): 2905–2920.

Keyser C., Bouakaze C., Crubézy E., Nikolaev V.G., Montagnon D., Reis T., Ludes B. (2009) Ancient DNA provides new insights into the history of south Siberian Kurgan people. Human Genetics 126(3): 395-410.

Shagirov, A. K. (1962) Essays on comparative lexicology of Adyghean languages. Nalchik: Kabardino-Balkarian Book Publishing. [in Russian]

Shakryl, K. S. (1971) Essays on Abkhaz-Adyghe languages. Sukhumi: Alashara. [in Russian]

Yunusbayev, Bayazit; Mait Metspalu; Mari Järve; Ildus Kutuev; Siiri Rootsi; Ene Metspalu; Doron M. Behar; Kärt Varendi; Hovhannes Sahakyan; Rita Khusainova; Levon Yepiskoposyan; Elza K. Khusnutdinova; Peter A. Underhill; Toomas Kivisild; and Richard Villems (2011) The Caucasus as an asymmetric semipermeable barrier to ancient human migrations. Molecular Biology and Evolution, online.

Source: Languages of the World