All living languages evolve over time, adding & losing vocabulary, morphological behavior, and syntactic structures, and changing in the ways they are pronounced by their speakers. Even without knowing how or why these evolutionary mechanisms operate, one can still get a feel for their effects; for example, they account for the differences between American and British English, and for the fact that neither Americans nor Brits can understand Beowulf at all without first being taught how to read the Old English language in which it was composed. Even the writings of Shakespeare -- much more recent than Beowulf -- can be difficult for modern English speakers to interpret. The field of study that concerns itself with language evolution is called historical linguistics.
A large number of related languages form what is called the Indo-European macrofamily. These languages all evolved from a common ancestral tongue called Proto-Indo-European (PIE), spoken ca. 6,000 years ago by a people living (by "traditional" hypothesis) somewhere in the general vicinity of the Pontic Steppe north of the Black Sea and east to the Caspian -- an area that, perhaps not accidentally, seems to coincide with the land of the ancient Scythians, from Ukraine across far southwestern Russia to western Kazakhstan. (N.B. Many claims on this page are debated in their details, but on the whole they seem to fit the evidence best and are accepted by most scholars; herein, we shall not acknowledge the myriad debates but instead present a broad-brush picture for a general audience.)
Proto-Indo-European speakers grew in number and influence -- they are credited with the domestication of horses and the invention of the chariot, among many other innovations -- and spread east & west, north & south. But before the invention of any writing system known to its speakers, PIE had died out: as Indo-Europeans expanded from the ancestral homeland and brought forth new generations, PIE evolved, first into disparate dialects, and then into mutually incomprehensible daughter languages. Ten "proto-language" families are identified today: using what historical linguists call the comparative method, their probable forms (and that of Proto-Indo-European itself) can be reconstructed based on similarities and differences among descendants that were attested in inscriptions and literary & religious texts. (Such written records began to appear about a thousand years after PIE was last spoken.) For a sketch of the evolution of PIE into its major proto-languages, see Evolution of IE Families.
The Indo-European proto-languages themselves evolved, each giving rise to its own family of languages. Each family is identified with the proto-language from which it sprang; these families are conventionally listed in order, roughly from west to east with respect to the homelands their speakers came to occupy. The ten families are: Celtic, Germanic, Italic, Balto-Slavic, Balkan, Hellenic, Anatolian, Armenian, Indo-Iranian, and Tocharian.
Each table that follows presents a highly schematic sketch of the evolutionary paths leading from the family ancestor to later, attested languages -- up to the present time, in the case of families that did not entirely die out. (Anatolian and Tocharian are the only known families that are now extinct.) By highly schematic we mean, for example, that dates are very approximate: we adopt, for sheer presentation convenience, quite arbitrary ranges of 500 or 1000 years that have little to do with accurate dates even when these might be known, which is seldom. What is important is that the general picture is instructive; for details the reader is referred to the vast literature of historical linguistics, now well over 200 years in the making and brimming with hypotheses, supporting arguments, and disagreements major & minor.
In the tables that follow, columns show 500/1000-year ranges, reading left to right; successive rows display groupings of sub-families (in bold face), the languages within them (italicized if dead), and, again reading left to right, a sequence that is not just chronological but evolutionary (except for the Balkan languages). After each family section heading, important points related to the table that follows are briefly surveyed; for the reader's convenience, most geographic names are in modern English. Note: even where surviving languages in a family may number in the hundreds, and may be spoken by over a billion people (as in the case of the Indo-Iranian family), only a very few languages are selected for illustration here. For every family except Balkan, there are one or more languages for which online texts & lessons are or will be available in our Early Indo-European Online (EIEOL) series; links are provided from those languages to their series introductions.
Proto-Celtic speakers moved generally west from the PIE homeland, probably alongside groups from the Italic branch, spreading across southern Europe into central Turkey, northern Italy, France, Spain, and eventually the British Isles. As centuries passed, their language evolved into one group of languages labelled Continental (spoken by "Gauls" across southern Europe and mentioned by Julius Caesar among others), and another labelled Insular (spoken in the British Isles). Continental Celts later adopted Latin, or Greek in the case of those in Turkey, and the Continental Celtic languages, attested from the 6th century B.C., were lost. Insular Celtic split into a Goidelic subgroup that developed in Ireland, and a Brythonic subgroup that developed in England & Wales. Later in history, Goidelic Celts migrated to Scotland; also later in history, Brythonic Celts under pressure from the Anglo-Saxons returned to the Continent and settled in Brittany, on the western point of France.
2000-1000 | 1000-500 | 500-1 BC | 1-500 AD | 500-1000 | 1000-1500 | 1500-2000 | |||||
---|---|---|---|---|---|---|---|---|---|---|---|
Proto-Celtic | Continental | Celtiberian | |||||||||
Gaulish | |||||||||||
Lepontic | |||||||||||
Noric | |||||||||||
Galatian | |||||||||||
Insular | Goidelic | Ogham Irish | Old Irish | Middle Irish | Irish Gaelic | ||||||
Scots Gaelic | |||||||||||
Manx | |||||||||||
Brythonic | Old Welsh | Middle Welsh | Welsh | ||||||||
Old Cornish | Middle Cornish | Cornish | |||||||||
Old Breton | Middle Breton | Breton |
The Germanic tribes generally followed behind the Celts, but moved somewhat further north. Their language developed into three groups of tongues labelled East, North, and West for their geographic distribution, with Runic now being considered the likely ancestor of the latter two. Gothic is the only attested language from the east, with a 4th-century translation of the Bible, although Vandalic is known to have been spoken by Vandals who migrated across the fading Roman Empire through Spain to North Africa (see also map of the Germanic Kingdoms in 526). Most of the Goths blended into the Empire and their language was replaced by local Latin dialects, but some migrated east into Crimea, where their language survived to the 16th century.
Limited amounts of "Northwest Germanic" text survive from the 1st/2nd centuries A.D., carved in Runic script; later, the North Germanic languages developed in far north Europe (primarily the Scandinavian countries Denmark, Sweden, Norway, and their islands). Old Norse was the language of the Vikings, who settled Iceland as well as Scandinavia.
West Germanic languages developed in two main groups, one ("High German") at higher elevations, in southern Germany, Switzerland, and Austria, and the other ("Low German") further north and along the coast, including the Netherlands and Belgium. Modern German evolved from the former; modern English, via Old English a.k.a. Anglo-Saxon (see the map of Angles & Saxons about 600 A.D.), from the latter. (The term "Pennsylvania Dutch" is a modern misnomer: the original speakers came from central & southern Germany, even Switzerland -- not from the Netherlands.)
2000-500 | 500-1 BC | 1-500 AD | 500-1000 | 1000-1500 | 1500-2000 | ||
---|---|---|---|---|---|---|---|
Proto-Germanic | East | Gothic | Crimean Gothic | ||||
Vandalic | |||||||
Runic | North | Old Norse | Old Icelandic | Icelandic | |||
Old Norwegian | Norwegian | ||||||
Old Swedish | Swedish | ||||||
Old Danish | Danish | ||||||
West | Old High German | Middle High German | German | ||||
Swiss German | |||||||
Pennsylvania Dutch | |||||||
Yiddish | |||||||
Old Saxon | Middle Low German | Low German | |||||
Old English | Middle English | English | |||||
Old Dutch | Middle Dutch | Dutch | |||||
Afrikaans |
The Italic peoples began their descent into the Italian peninsula around the 2nd millennium B.C. Two subgroups developed from Proto-Italic -- Sabellic and Latino-Faliscan, both attested by 7th-century B.C. inscriptions (the former in Umbrian, the latter in Faliscan). But the growing strength of the Latin speakers, culminating in the Roman Empire, resulted in most competing tongues in Italy (and many elsewhere, for example Continental Celtic) being extinguished. With the collapse of the Empire, the provincial Vulgar Latin dialects rather than Classical Latin survived, and in time developed into the Romance languages (see map of the European Provinces of Rome).
2000-1000 | 1000-500 | 500-1 BC | 1-500 AD | 500-1000 | 1000-1500 | 1500-2000 | ||||||
---|---|---|---|---|---|---|---|---|---|---|---|---|
Proto-Italic | Sabellic | Oscan | ||||||||||
Umbrian | ||||||||||||
Latino-Faliscan | Faliscan | |||||||||||
Latin | Classical Latin | Vulgar | Romanian | |||||||||
Old Italian | Italian | |||||||||||
Old French | French | |||||||||||
Old Provençal | Provençal | |||||||||||
Old Spanish | Spanish | |||||||||||
Old Portuguese | Portuguese |
While the Balto-Slavic (and especially the Baltic) languages of eastern Europe are attested only late, even by Indo-European standards, there are characteristics that strongly suggest they are highly conservative (most especially Baltic) and retain features akin to Proto-Indo-European. No Slavic language is attested until the mid-9th century A.D. (Old Church Slavonic), and no Baltic language until the 14th century (some Old Prussian words & phrases). Old Church Slavonic and Old Prussian became extinct, but Slavic and Baltic sibling languages survived.
2000-1000 | 1000-1 BC | 1-500 AD | 500-1000 | 1000-1500 | 1500-2000 | |||||
---|---|---|---|---|---|---|---|---|---|---|
Proto-Balto-Slavic | Proto-Baltic | Western | Old Prussian | |||||||
Eastern | Old Lithuanian | Lithuanian | ||||||||
Old Latvian | Latvian | |||||||||
Proto-Slavic | South | Old Church Slavonic | ||||||||
Eastern South | Bulgarian | |||||||||
Western South | Serbian | |||||||||
East | Old Russian | Russian | ||||||||
West | Old Polish | Polish |
The "family" of Balkan languages (see also the old map of Macedonia, Thrace, Illyria, Moesia and Dacia) is exceptional in that there are far too few early texts to support strong hypotheses about genetic relationships among the erstwhile members. This doesn't mean there are no hypotheses -- they are, in fact, numerous! -- but it does mean that no firm conclusions can be drawn because evidence is paltry or absent. As one example, the "traditional" hypothesis is that Illyrian is the ancestor of Albanian; but as there are no native texts in Illyrian, it is difficult to say much of anything certain about it. It seems nevertheless that these two differ in a fundamental manner that, in Indo-European linguistics, has always marked a crucial distinction (denoted by the terms "centum" vs. "satem"). The languages in the table below are grouped into a "family" for reasons as much geographic as linguistic, and the chronological sequence of languages, left to right, cannot be taken to suggest their evolutionary sequence.
2000-1000 | 1000-500 | 500-1 BC | 1-500 AD | 500-1000 | 1000-1500 | 1500-2000 | ||||||
---|---|---|---|---|---|---|---|---|---|---|---|---|
Proto-Balkan | Phrygian | Thracian | Dacian | Albanian | ||||||||
Illyrian |
For all practical purposes, the Hellenic family is represented by a single language spoken in Greece and the Aegean Islands: Greek, which is attested in a number of dialects spanning more than three millennia. The oldest texts, in Mycenaean Greek, pre-date the 14th century B.C. (see map of Mycenaean Greece), and were written in the script known as Linear B. But an invasion of (illiterate?) Dorian tribes ca. 1100 B.C. was followed by the collapse of Mycenaean civilization and the loss of the art of Greek writing. A few hundred years later the Greeks adapted a Phoenician script -- adding, for the first time, letters representing vowels. This script developed into what we know as the Greek alphabet, which formed the early basis of the Etruscan & Roman alphabets among others (a more modern example being Cyrillic).
2000-1500 | 1500-1000 | 1000-500 | 500-1 BC | 1-500 AD | 500-1500 | 1500-2000 | ||||
---|---|---|---|---|---|---|---|---|---|---|
Proto-Greek | Mycenaean | Ancient Greek | Attic Greek | Koine Greek | Middle Greek | Greek | ||||
Homeric Greek | ||||||||||
Doric Greek |
The Anatolian family includes the oldest attested Indo-European languages: some Hittite documents are dated as early as the 18th century B.C. It is thought to have been the first branch of Indo-European to separate from PIE, and it was also the first branch [known to us] to become extinct, being replaced by Greek ca. 2nd/1st century B.C. Buried and lost until modern times, Hittite cuneiform tablets were first unearthed in the early 20th century in north-central Turkey, and helped revolutionize Indo-European linguistics. A sister language, Luwian, was probably spoken in Homer's Troy, located southwest of the Dardanelles.
2500-2000 | 2000-1500 | 1500-1000 | 1000-500 | 500-1 BC | 1-1000 AD | 1000-2000 | ||||||
---|---|---|---|---|---|---|---|---|---|---|---|---|
Proto-Anatolian | Old Hittite | Middle/New Hittite | Lydian | |||||||||
Luwian | Lycian |
The earliest documentary evidence concerning the Armenians is a 6th-century B.C. inscription at Behistun by the Persian king Darius I. Herodotus, writing a century later, stated that the Armenians had lived in Thrace and moved into Phrygia, from which they crossed into the [later] territory of Armenia. But though Armenians are known to history as a people, their language was first attested by a translation of the Bible a full thousand years later, following the invention by Mesrop, a Christian monk, of a suitable alphabet; by that time, Classical Armenian evidenced strong influence by Iranian tongues, especially Parthian. Other loan words from Anatolian languages attest to early Armenian presence in western and central Turkey. Due to manifold linguistic influences, evidenced for example by many isoglosses with Greek, it is difficult to support arguments for a close connection with any other Indo-European language family in particular.
2000-1000 | 1000-500 | 500-1 BC | 1-500 AD | 500-1000 | 1000-1500 | 1500-2000 | |||
---|---|---|---|---|---|---|---|---|---|
Proto-Armenian | Classical Armenian | Middle Armenian | Armenian |
Proto-Indo-Iranian speakers moved east & south from the PIE ancestral homeland. Then, still in prehistoric times, the Indo-Iranian family split into Indic and Iranian branches, labelled for their early literary centers (roughly speaking) in India and Iran.
Although written Indic documents do not exist of an age comparable to that of Hittite, the language of the Rigveda is thought to be well-preserved from a form dating to perhaps the early 2nd millennium B.C. In particular, when the grammar for Sanskrit was being composed by Panini ca. 400 B.C., Rigvedic was already archaic and, in many respects, no longer understood -- a situation analogous to modern English speakers' problems understanding the language of Beowulf. Even some of the poetic structures of the Rigveda were no longer recognized -- again, a situation analogous to our modern ignorance of Old English poetic structures. Nevertheless, oral transmission of liturgy and poetry can be, and for the Rigveda is believed to have been, amazingly accurate. Accordingly, early Indic compositions can be studied with almost as much confidence as is invested in later, written texts in Pali, Prakrit, etc.
Somewhat like Rigvedic (a close descendant of Proto-Indic), Avestan (a descendant of Proto-Iranian) was represented by memorized religious compositions for centuries before they were written down. The Avestan language itself, then, is of unknown but great age. Although it is still important in Zoroastrian liturgy, it does not have living descendants. Two languages closely related to it, Bactrian and Old Persian, have many modern descendants including Pashto and Farsi.
2000-1500 | 1500-1000 | 1000-500 | 500-1 BC | 1-500 AD | 500-1000 | 1000-1500 | 1500-2000 | ||
---|---|---|---|---|---|---|---|---|---|
Proto-Indo-Iranian | Proto-Indic | Rigvedic | Sanskrit | ||||||
Pali | Prakrit | Apabhramsha | Old Hindi | Hindi/Urdu | |||||
Proto-Iranian | Avestan | ||||||||
Eastern | Bactrian | Sogdian | Pashto | ||||||
Western | Old Persian | Pahlavi | Farsi |
Like the Anatolian language family, the Tocharian family is extinct; also like Anatolian, Tocharian texts were deciphered in the early 20th century and their study has suggested major changes to theories about early Indo-European (IE) languages. Prominent among these is the fact that Tocharian exhibits some fundamental affinities to the more western language families, such as Celtic, Italic, Hellenic and especially Germanic, that distinguish it from the geographically much closer eastern language families, such as Indo-Iranian or even Balto-Slavic. This does not mean that Tocharian is particularly close to any western European language family, though many individual parallels have been drawn, but only that it seems closer to them as a group than to the eastern IE languages. How western European (?) Tocharian speakers came to live in the Tarim Basin in Xinjiang, China, is a mystery yet unresolved. However, it is noteworthy that the Silk Road was established through that area around the same time Tocharian speakers seem to have arrived: the appearance of a highly mobile European people at the inception of a major Eurasian trade link might not be a coincidence.
It is by no means certain that western European affinities demonstrate a prior western European presence: sometimes similarities exist by chance; but if chance is ruled out, there may have been sufficient linguistic contact between Proto-Tocharian speakers and others destined to live in western Europe, before the IE break-up. It seems rather likely that Tocharian peoples migrated directly east from the PIE homeland and discovered exotic trade goods awaiting further exploitation. Tocharian, unattested, later evolved into two separate languages, conventionally denoted as Tocharian A (eastern, a.k.a. Turfanian) and Tocharian B (western, a.k.a. Kuchean), both located along the north rim of the Tarim Basin; in the 6th-8th century A.D. texts so far discovered, A seems to have been in liturgical use only, while B was yet a living vernacular. Evidence for yet a third offshoot, Tocharian C, somewhat older than the other two, has been unearthed along the southern rim of the Tarim Basin.
2000-1000 | 1000-500 | 500-1 BC | 1-500 AD | 500-1000 | 1000-1500 | 1500-2000 | ||||||
---|---|---|---|---|---|---|---|---|---|---|---|---|
Proto-Tocharian | Tocharian? | Tocharian A | ||||||||||
Tocharian B | ||||||||||||
Tocharian C |