Chapter 13: Lexicostatistical glottochronology and other delights

I must now try to reconcile my New Linguistics with recent work in historical linguistics. This immediately involves major conflicts of terminology, assumption, and even standards of proof. It is unclear how I can define my new approach in terms intelligible to an academic linguist, though there is no reason why an academic linguist should not be able to understand what I have done. It is much more difficult for the average reader to understand the ideas current in academic linguistics. Anyone who tries immediately faces a barrier of extremely technical language and a host of concepts which are often peculiar to one author and disputed by everyone else.

This lack of consensus is both striking and a sign of serious disjunction at the heart of this discipline. A notable feature of the historical-linguistic literature is the number of discrepant and often idiosyncratic proposals in circulation at any one time. There is little sign of the ruthless but ultimately constructive peer pressure which recently moved human genetics within a decade from pretty pictures to a (more or less) coherent account of global settlement. Any lack of coherence in human genetics reflects the geneticists’ misguided but honest attempts to reconcile their results with obsolete linguistic theory promoted by historical linguists. The lack of coherence in historical linguistics reflects its practitioners’ addiction to a form of civil war in which each individual is preoccupied with defending his own, usually novel, ideas. Stephen Oppenheimer complains, with perfect justice, that historical linguistics has always offered ‘promises of great insight into the past’ but that academic publications reveal ‘more confusion than clarity’.1

It is nevertheless possible to grasp something of historical linguistics without mastering all of it, just as one can drive a car without reading the manual. For example, I understand that head marking and dependent marking are mutually exclusive features of all languages and that it is possible to calculate the H/D ratio of a region and compare it with that of other regions.2 But I make no claim to understand grammatical or phonetic debate. On the other hand I do understand something about the discipline’s treatment of words.

Its lexical initiatives over the past century have been few and mostly ill-founded.

The most intelligent proposal was made by the Danish linguist Holger Pedersen (1867-1953), who in 1903 postulated a superfamily known as Nostratic. It grouped Indo-European with the Uralic, Altaic, Yukaghir, Eskimo, and Afro-Asiatic language families. He no doubt glimpsed the pattern I have begun to demonstrate using the TEC and the wordlists, but in more than a century no-one has managed to bring his Nostratic ideas to proof. In 1951 Pedersen published his glottalic theory, which postulates the equivalence of B/D/G and P/T/C. Since this equivalence is currently found in Welsh and Gaelic and was published as a fact by William Watson in 1926, it is difficult to see why it had to be published again as a theory.

There is some truth in Holger Pedersen's insights. One cannot say as much for glottochronology, 'an approach in historical linguistics for estimating the time at which languages diverged, based on the assumption that the basic (core) vocabulary of a language changes at a constant average rate.'3 The idea of constant change for no reason is a constant in Indo-European studies and in this case was invoked by the American Morris Swadesh (1909-1967).4 Glottochronology, like lexicostatistics, is a pseudo-science embedded in pseudo-mathematical logic. The aim of both is to identify the original language, whether Indo-European or some larger and older Ursprache. More recently Aharon Dolgopolsky has worked towards the same end with fifteen words.5 He envisages Nostratic as 'a hypothetical macro-family of languages, embracing Indo-European, Afro-Asiatic, Kartvelian, Uralic, Altaic and Dravidian.' But his fifteen words do not work either.6
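For readers who have not met the method, the constant-rate assumption quoted above can be made concrete with a short sketch. In the usual textbook form, if two languages share a proportion c of cognates on the test list, and each is assumed to retain a proportion r of its core vocabulary per thousand years, the time since they diverged is t = ln c / (2 ln r) millennia. The function name below is illustrative only, and the default retention rate of 0.805 is the figure commonly quoted in the literature for the 200-word list, not a value taken from this chapter.

```python
import math


def divergence_time_millennia(shared_cognates: float,
                              retention_rate: float = 0.805) -> float:
    """Estimate time since divergence under the constant-rate assumption.

    shared_cognates: proportion (0 < c <= 1) of list items judged cognate.
    retention_rate:  assumed proportion (0 < r < 1) of core vocabulary that
                     one language retains per millennium (an assumption,
                     commonly quoted as about 0.805 for the 200-word list).
    Returns t = ln(c) / (2 * ln(r)), measured in millennia.
    """
    if not 0.0 < shared_cognates <= 1.0:
        raise ValueError("shared_cognates must be in the interval (0, 1]")
    if not 0.0 < retention_rate < 1.0:
        raise ValueError("retention_rate must be in the interval (0, 1)")
    # The factor 2 reflects that both languages are assumed to change
    # independently at the same rate after the split.
    return math.log(shared_cognates) / (2.0 * math.log(retention_rate))


if __name__ == "__main__":
    # Two languages sharing half of the list items.
    print(round(divergence_time_millennia(0.5), 2))
```

On these assumptions, two languages sharing half the list would be dated to about 1.6 millennia apart. The sketch also shows how much rests on the single number r: every date the method produces is only as good as the claim that vocabulary is lost at that one rate, everywhere and always.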

It cannot be said, after a century, that much progress is visible. Languages certainly diverge, and it is in fact quite easy to deduce the date of their divergence from lexical features, but they never change by so much as a word without reason. Swadesh's 200-word list is given below. It seems premature to make a list of 100 or 200 or even 15 words without first understanding how language evolved. To which point in the evolution of language does this list apply? There is no way of knowing.

Swadesh List of 200 words

All, and, animal, ashes, at, back, bad, bark, because, belly, big, bird, bite, black, blood, blow, bone, breathe, burn, child, cloud, cold, come, count, cut, day, die, dig, dirty, dog, drink, dry, dull, dust, ear, earth, eat, egg, eye, fall, far, fat [grease], father, fear, feather, few, fight, fire, fish, five, float, flow, flower, fly, fog, foot, four, freeze, fruit, give, good, grass, green, guts, hair, hand, he, head, hear, heart, heavy, here, hit, hold [take], how, hunt, husband, I, ice, if, in, kill, know, lake, laugh, leaf, left side, leg, lie, live, liver, long, louse, man [male], many, meat [flesh], mother, mountain, mouth, name, narrow, near, neck, new, night, nose, not, old, one, other, person, play, pull, push, rain, red, right [correct], right side, river, road, root, rope, rotten, rub, salt, sand, say, scratch, sea, see, seed, sew, sharp, short, sing, sit, skin, sky, sleep, small, smell, smoke, smooth, snake, snow, some, spit, split, squeeze, stab [pierce], stand, star, stick, stone, straight, suck, sun, swell, swim, tail, that, there, they, thick, thin, think, this, thou, three, throw, tie, tongue, tooth, tree, turn, two, vomit, walk, warm, wash, water, we, wet, what, when, where, white, who, wide, wife, wind, wing, wipe, with, woman, woods, worm, ye, year, yellow.

Apart from its random and imaginary content, the Swadesh list has two flaws which between them condemn 11 per cent of these words before we start. Swadesh includes the names of the numbers from one to five (5 words) and of several parts of the body (17 words), or 22 of the 200. Names for numbers are not necessarily old, or a natural part of a language. The original counting system was a simple binary system, multiplying and dividing by two, and is found from Babylon to the Outer Hebrides. With urban development and literacy this was supplanted by more evolved systems using 5s, 6s, 10s, and eventually dozens and scores. The Yan Tan jingle, used to count sheep by 20s, is a good example. Names of numbers are not ‘basic vocabulary’, though the words for ‘1’ and ‘2’ are probably early.

Names for parts of the body are not basic vocabulary either but a varied collection of post-settlement metaphor. As proof of this, Roget’s Thesaurus (Longman’s edition) has no category for parts of the body. An arm is also a weapon, a hand is also a workman, a leg is also a distance or stage in a journey. A mouth is a gaping hole or where a river meets the sea. Scots lug is both an ear and a handle. Head is a generic word for a chief or leader or a high sea-promontory. Back, bottom, foot, nail and neck are not specific to the human body.

A similar set of metaphors is found in Gaelic, though, significantly, the words are entirely different from those used in English and more often refer to landscape features. The head, ceann, is the rounded top of a Highland loch. The knee, glun, and the elbow, uileann, are V-shaped pens or traps with a wide opening at one end. The armpit, lag-na-h-achlais, is seen as a deep four-sided enclosure. The tooth, fiacail, is a pun on a similar word for a deer trap. The forehead or face, aodann, is a steep outstanding cliff or rock face with a flat top which was used as a beacon site – the rock on which Edinburgh Castle sits is a typical aodann. The eye, sùil, was originally used of a fire (E. eye may be related to G. aodh ‘fire’). The nose, sron, is a promontory projecting into a level bog or a loch. Like Scots lug, G. cluas ‘ear’ is ‘a handle on any object equipped with two such things’. G. ruige 'the arm' means ‘outstretched’ and is applied to the spreading base of a mountain which is not otherwise at all like an arm.

The outstanding example of this confusion, and possibly the reason for all the rest, is the ‘Indo-European’ view of European languages which I referred to in Chapter 1. It is very strange to find that this theory is simultaneously accepted by many historical linguists as an indisputable fact and rejected as aberrant by others. Some believe that modern European languages are genetically so different from those of the rest of the world that they cannot be included in the same analysis. Neither archaeology nor genetics supports this view and no-one seems to find it worth investigating. The absence of critical attention to this point must raise doubts about the validity of other linguistic arguments and assumptions.

One might hope to find a few islets of rationality in this centennial disorder. One such is Johanna Nichols, an American professor of Slavic languages. In Linguistic Diversity in Space and Time (1992) she lays out a mass of evidence, based on grammatical analysis and backed up by statistical proof, for ‘great time depth’ in the evolution of human languages. She believes that divergence is significant, that it can be measured and mapped, and that the pattern she has uncovered is relevant to the peopling of the world. If Nichols is right – and she cannot be entirely wrong – she has single-handedly done for world language what a much greater number of geneticists have done in the same period for human DNA. Her dating is necessarily tentative, and it remains to be seen how well her proposals meld with the patterns revealed by archaeology and human genetics. But it is inevitable that eventually the historical-linguistic, the archaeological-cultural and the genetic schemes of prehistory will match up: there was, after all, only one past. Oppenheimer has used some of Nichols’ data to produce a tentative outline of settlement in the Americas.7 Her achievement is remarkable but inaccessible. Only a handful of people in the world are capable of properly understanding her published work, of making constructive criticism, or of taking it further, and the current disorder in historical linguistics makes this a remote hope. Nichols used ten markers but suggests that more are needed, and says this approach is limited by the irreparable loss of certain languages such as Tasmanian. Given the huge number of languages that do survive I find it difficult to believe that the loss of one, and a peripheral one which might have vanished anyway without record, can be fatal to any well-founded theory.

But then I do not believe that grammar is as neutral and stable a feature as lexicon. Even in preliterate societies there are fashionable and unfashionable ways of using words. Priests and poets, in their efforts to communicate with invisible powers, are liable to invent complex grammars, of which Sanskrit is an extreme but otherwise typical example. If grammar proves difficult to work with, linguists might consider adopting my new approach to lexicon. Nichols does not dismiss comparative work on lexicon out of hand but believes that ‘basic typological comparison can give us more usable information about great time depths than lexical comparison can’.8 She is no doubt influenced by the baneful example of Swadesh and the list of 100 or 200 words which make up the ‘basic vocabulary’ of glottochronology. She also fears that problems such as bound inalienable possession - the incorporation into a word of a possessive prefix - might obscure the original root consonants.9 I discussed the problem of prefixes in Chapter 2 and concluded that when the sample is large enough they are either easy to recognise or unimportant. But the new approach does not depend on selected individual words.

Happily, New Linguistics has not yet developed to the point where technical definitions are needed. It is powerful, rational, simple and accessible, and appears to be capable of unlimited expansion. Words, just as they come, represent an inexhaustible supply of raw material, processed and ready to use. Lexicographers have collected words in their millions, from modern slang to ‘hard’ or learned words. They have, very kindly, taken immense pains to make exact translations and exact definitions. This new approach to language origins suggests that words are resistant in their structure, that they have evolved logically, and that their range of meaning matches what we know of prehistoric culture. I have barely touched on what different geographical distributions might show. The preliterate layers of European languages remain to be demonstrated. The languages of Africa, Asia, the Pacific and the New World are virtually unexplored. Genetic links between related languages remain to be explored and dated. Above all, lexicon is not subject to the kind of uncontrolled postulates which currently dog historical linguistics. Those who created the grammatical complexities of Sanskrit also simplified and codified its words but they did not invent a new lexicon. A monkey was always a monkey.

