[syndicated profile] languagelog_feed

Posted by Victor Mair

These are remarks by Ron Vara from here:

ᱮᱞᱚᱱ ᱨᱤᱣ ᱢᱩᱥᱠ ( /ˈiːlɒn/ EE-lon; ᱡᱟᱱᱟᱢ ᱡᱩᱱ ᱒᱘, ᱑᱙᱗᱑) ᱩᱱᱤ ᱫᱚ ᱢᱤᱫ ᱵᱮᱯᱟᱨᱤᱭᱟᱹ ᱠᱟᱱᱟᱭ ᱚᱠᱚᱭ ᱫᱚ ᱩᱱᱤᱭᱟᱜ ᱢᱩᱲᱩᱫ ᱵᱷᱩᱢᱤᱠᱟ Tesla, Inc., SpaceX, ᱟᱨ ᱴᱩᱭᱴᱚᱨ (ᱡᱟᱦᱟᱸ ᱩᱱᱤ ᱮᱠᱥ ᱞᱮᱠᱟᱛᱮ ᱧᱩᱛᱩᱢ ᱵᱚᱫᱚᱞ ᱮᱱᱟ) ᱨᱮ ᱵᱟᱰᱟᱭᱚᱜ ᱠᱟᱱᱟ᱾

This is the first sentence of the article on Elon Musk in the Santali alphabet (Ol Chiki). Yes, it's an alphabetic writing system, not an abugida. What makes the Santali alphabet really elusive is that it resembles the shapes of the undeciphered Indus Valley script. Soviet archaeologists once tried to decipher IVC (Indus Valley Civilization) seals using the Santali alphabet. It sounds ridiculous, but the sad truth is that Santali is a unique language that has received little to no academic attention.

What makes the Santali language itself stand out is its lack of "Indianization" compared even to Southeast Asian Austroasiatic languages like Mon and Khmer. Its phonology is still archetypically East Asian with a superimposed South Asian areal layer. Its morphology is a mix of SEA Austroasiatic prefixing and a (self-innovated) highly polysynthetic verb akin to that of the Kiranti languages (spoken ~20 kilometres away in Nepal), so Santali is not remotely influenced by Sanskrit, except for a few Austroasiatic loanwords in early Vedic Sanskrit.

 



Posted by Victor Mair

Some folks seem to think so, but not Benjamin James who wrote this letter to the London Review of Books, 47.6 (April 3, 2025), p. 4:

Simple Script

In his fascinating article on the recent decipherment of Linear Elamite, Tom Stevenson finds it difficult to accept that 'the Latin or Greek writing systems are simpler or "more precise" than mostly logographic writing systems like written Chinese' (LRB, 6 March). Does he really believe Chinese script is just as suited as Latin to the rendering of foreign words? 'Tom Stevenson' is far simpler and more phonetically precise than 汤姆•史帝文森, 'Tangmu Shidiwénsen', which adds two syllables, six tones and six individual character meanings. The Committee for Language Reform in China acknowledged the relative simplicity of the Latin script as one of the factors behind its abandonment in 1956 of the attempt to develop a phonetic script based on Chinese characters.

If Stevenson says this because he thinks the idea of one script being simpler than another is somehow discriminatory (as well as untrue), then he might prefer to consider the example of the long-vanished Tangut people, who possessed, according to the Tangutologist Gerard Clauson, 

one of the most inconvenient of all scripts, a collection of nearly 5800 characters of the same kind as Chinese characters but rather more complicated . . . It is extremely difficult to remember them, since there are few recognisable indications of sound and meaning in the constituent parts of a character, and in some cases characters which differ from one another only in minor details of shape or by one or two strokes have completely different sounds and meanings. Imagination boggles at the thought of teaching typesetters to set it up.

The Tangut script, supposedly created by a single Chinese bureaucrat in 1038, died out at the start of the 16th century – to the probable relief of future generations, who were free to write in Chinese, Tibetan, Mongolian or some other comparatively simple script.

Here I would like to pay tribute to two Tangutologists:

Sir Gerard Leslie Makins Clauson (1894-1974) — larger than life, he was "an English civil servant, businessman, and Orientalist best known for his studies of the Turkic languages. He was born in Malta."  Here I am celebrating his achievements in Tangut studies, but his accomplishments in Turkology were, to me, supernatural.  When I behold his An Etymological Dictionary of Pre-Thirteenth-Century Turkish (Oxford: Clarendon Press [1972]), which was almost like a bible for me when I was studying Old Turkish and Tocharian (!!), I think that alone would be enough for any scholar to produce in one lifetime, but there is much more:

Clauson attended Eton College, where he was Captain of School, and where, at age 15 or 16, he published a critical edition of a short Pali text, "A New Kammavācā" in the Journal of the Pali Text Society. In 1906, when his father was named Chief Secretary for Cyprus, he taught himself Turkish to complement his school Greek. He studied at Corpus Christi College, Oxford, in classics, receiving his degree in Greats, then became Boden Scholar in Sanskrit, 1911; Hall-Houghtman Syriac Prizeman, 1913; and James Mew Arabic Scholar, 1920. During World War I, he fought in the battle of Gallipoli but spent the majority of his effort in signals intelligence, concerned with German and Ottoman army codes.

These were the years in which the great Central Asian expeditions of Sven Hedin, Sir Aurel Stein and others were unearthing new texts in a variety of languages including Tocharian and Saka (both Khotanese, and Tumshuqese). Clauson actively engaged in unraveling their philologies, as well as Chinese Buddhist texts in the Tibetan script.

Clauson also worked on the Tangut language, and in 1938–1939 wrote a Skeleton dictionary of the Hsi-hsia language. The manuscript copy is held at the School of Oriental and African Studies in London, and was published as a facsimile edition in 2016.

In 1919 he began work in the British Civil Service, which was to culminate in serving as the Assistant Under-Secretary of State in the Colonial Office, 1940–1951, in which capacity he chaired the International Wheat Conference, 1947, and International Rubber Conference, 1951. After his mandatory retirement at age 60, he switched to a business career and in time served as chairman of Pirelli, 1960–1969.

A partially filled notebook containing Sir Gerard Clauson's Notes on Kashgari's Divan lugat at-Turk [VHM:   the first comprehensive dictionary of Turkic languages] and other cognate subjects (1072-74) is held at the Cadbury Research Library, University of Birmingham.

(Wikipedia)

If we're talking about someone who knew difficult languages and mind-numbing scripts, it was Clauson.  He authoritatively knew whereof he spoke.

 

Nikita Kuzmin

He completed his PhD at the University of Pennsylvania in 2023 on the following subject:

"Pilgrimage in Tangut Xia:  Study of Tangut Epigraphy from Dunhuang and Tangut Woodblock Prints from Bezeklik".  (free PDF)

Abstract

This dissertation aims to examine the pilgrimage activities of the Tanguts in the 11th–13th centuries in the Hexi Corridor, based on the research of the two corpora of Tangut received textual materials – Buddhist inscriptions that pilgrims left on the walls of the Buddhist cave complexes of Mogao and Yulin and the fragments of Tangut Buddhist texts excavated from Bezeklik. Chapter 1 introduces various manifestations of pilgrimage and articulates features of Buddhist pilgrimage in multiple regions in Asia. Chapter 2 displays the historical and religious characteristics of Mount Wutai and the greater Dunhuang area, which played a crucial role in the establishment and development of Tangut Buddhism. It also discusses various external factors (Uyghur monks) that influenced the propagation of Buddhism among the Tanguts. In Chapter 3, I analyze the remained Tangut inscriptions from Mogao and Yulin caves and interpret them within corresponding historical and religious contexts. Based on the comparative research of the inscriptions, I argue the existence of a unified “inscriptional discourse” in the greater Dunhuang area in the 10th to 13th centuries. Chapter 4 discusses codicological and contextual features of a corpus of Tangut Buddhist woodblock prints from Bezeklik caves. In the end, the dissertation provides an English translation of 22 inscriptions and 12 pieces of Tangut woodblock prints.

To accomplish this arduous task, one of the first things I had Nikita do was go off to Kathmandu to study Classical Tibetan for a summer.  He already knew Mandarin (virtually native fluency) and Classical Chinese, Japanese, German, and his native Russian.  Oh, yes, and was fluent in English.

No, to all those doubting Thomases out there who think that mostly logographic scripts like Tangut and Chinese are as simple and precise as the Latin or Greek alphabet, they are not.

If you only have one year to learn a new script, don't try Tangut or Chinese.

 


[Thanks to Leslie Katz]


Posted by Victor Mair

"Constructed Languages Are Processed by the Same Brain Mechanisms as Natural Languages." Malik-Moraleda, Saima, et al. Proceedings of the National Academy of Sciences 122, no. 12 (March 17, 2025): e2313473122.

Significance

What constitutes a language has been of interest to diverse disciplines—from philosophy and linguistics to psychology, anthropology, and sociology. An empirical approach is to test whether the system in question recruits the brain system that processes natural languages. Despite their similarity to natural languages, math and programming languages recruit a distinct brain system. Using fMRI, we test brain responses to constructed languages (conlangs)—which share features with both natural languages and programming languages—and find that they are processed by the same brain network as natural languages. Thus, an ability for a symbolic system to express diverse meanings about the world—but not the recency, manner, and purpose of its creation, or a large user base—is a defining characteristic of a language.

Abstract

What constitutes a language? Natural languages share features with other domains: from math, to music, to gesture. However, the brain mechanisms that process linguistic input are highly specialized, showing little response to diverse nonlinguistic tasks. Here, we examine constructed languages (conlangs) to ask whether they draw on the same neural mechanisms as natural languages or whether they instead pattern with domains like math and programming languages. Using individual-subject fMRI analyses, we show that understanding conlangs recruits the same brain areas as natural language comprehension. This result holds for Esperanto (n = 19 speakers) and four fictional conlangs [Klingon (n = 10), Na’vi (n = 9), High Valyrian (n = 3), and Dothraki (n = 3)]. These findings suggest that conlangs and natural languages share critical features that allow them to draw on the same representations and computations, implemented in the left-lateralized network of brain areas. The features of conlangs that differentiate them from natural languages—including recent creation by a single individual, often for an esoteric purpose, the small number of speakers, and the fact that these languages are typically learned in adulthood—appear to not be consequential for the reliance on the same cognitive and neural mechanisms. We argue that the critical shared feature of conlangs and natural languages is that they are symbolic systems capable of expressing an open-ended range of meanings about our outer and inner worlds.

Pursuing the questions raised in this paper will help us distinguish between linguistic and non-linguistic domains of intellectual inquiry — math, music, art….

 


[Thanks to Ted McClure]

Crosswalk protest art

Apr. 16th, 2025 01:44 pm

Posted by Mark Liberman

Last weekend, a number of crosswalk buttons in Silicon Valley were hacked so as to play (faked) messages from Mark Zuckerberg and Elon Musk. This got lots of (social and mass) media coverage — for one useful summary, see Zoe Morgan, "Silicon Valley crosswalk buttons apparently hacked to imitate Musk, Zuckerberg voices", Palo Alto Online 4/12/2025, or check out various other sources.

Some audio samples:





Today's AI synthesis and voice morphing technology makes it easy to create such clips — and crosswalk buttons are not the only possible medium to be hacked.

And of course there will be targets from other regions of the political and cultural space.

Update —
Apparently the same sort of thing happened yesterday in Seattle: "'Please don’t tax the rich': Seattle crosswalk buttons hacked to sound like Jeff Bezos", 4/16/2025. Reddit has a sample video.


Posted by Victor Mair

For the record:

"Do Inuit languages really have many words for snow? The most interesting finds from our study of 616 languages", The Conversation (4/10/25); rpt. in phys.org/news (4/13/25)

Authors:

Charles Kemp
Professor, School of Psychological Sciences, The University of Melbourne (PhD MIT)
Ekaterina Vylomova
Lecturer, Computing and Information Systems, The University of Melbourne (The University of Melbourne, PhD/Computational Linguistics)
Temuulen Khishigsuren
PhD Candidate, The University of Melbourne (National University of Mongolia, M.A. in linguistics)
Terry Regier
Professor, Language and Cognition Lab, University of California, Berkeley (Ph.D., Computer Science, UC Berkeley, 1992; frequent co-author with Paul Kay). Among his most-cited work is:

"Whorf hypothesis is supported in the right visual field but not the left",
Aubrey L. Gilbert; Terry Regier; Paul Kay; Richard B. Ivry.
Proceedings of the National Academy of Sciences of the United States of America (2006)

These two articles (The Conversation and phys.org/news) are journalistic accounts of the scientific study by Kemp, Vylomova, Khishigsuren, and Regier.

The full scientific paper is here:

Temuulen Khishigsuren et al, "A computational analysis of lexical elaboration across languages", Proceedings of the National Academy of Sciences (2025). DOI: 10.1073/pnas.2417304122

Journal information: Proceedings of the National Academy of Sciences

Significance

The vocabulary of any language emphasizes some areas more than others, and the number of terms for fish, cattle, smells, and other concepts varies across languages. Most work on lexical elaboration relies on manually compiled data, but we show how lexical elaboration can be measured using data from bilingual dictionaries, and use this measure to develop analyses of lexical elaboration that span hundreds of languages and thousands of concepts. Our work suggests several hypotheses about well-studied concepts (e.g. that smell terms are well developed in Oceanic languages), and opens up the investigation of concepts that are new to the literature on lexical elaboration (e.g. dance).
 

Abstract

Claims about lexical elaboration (e.g. Mongolian has many horse-related terms) are widespread in the scholarly and popular literature. Here, we show that computational analyses of bilingual dictionaries can be used to test claims about lexical elaboration at scale. We validate our approach by introducing BILA, a dataset including 1,574 bilingual dictionaries, and showing that it confirms 147 out of 163 previous claims from the literature. We then identify previously unreported examples of lexical elaboration, and analyze how lexical elaboration is influenced by ecological and cultural variables. Claims about lexical elaboration are sometimes dismissed as either obvious or fanciful, but our work suggests that large-scale computational approaches to the topic can produce nonobvious and well-grounded insights into language and culture.

Some highlights from the two journalistic articles cited above:

Languages are windows into the worlds of the people who speak them – reflecting what they value and experience daily.

So perhaps it’s no surprise different languages highlight different areas of vocabulary. Scholars have noted that Mongolian has many horse-related words, that Maori has many words for ferns, and Japanese has many words related to taste.

Some links are unsurprising, such as German having many words related to beer, or Fijian having many words for fish. The linguist Paul Zinsli wrote an entire book on Swiss-German words related to mountains.

One example of a concept we looked at was “horse”, for which the top-scoring languages included French, German, Kazakh and Mongolian. This means dictionaries in these languages had a relatively high number of

    1. words for horses. For instance, Mongolian аргамаг means “a good racing or riding horse”
    2. words related to horses. For instance, Mongolian чөдөрлөх means “to hobble a horse”.

Our findings support most links previously highlighted by researchers, including that Hindi has many words related to love and Japanese has many words related to obligation and duty.

We were especially interested in testing the idea that Inuit languages have many words for snow. This notorious claim has long been distorted and exaggerated. It has even been dismissed as the “great Eskimo vocabulary hoax”, with some experts saying it simply isn’t true.

But our results suggest the Inuit snow vocabulary is indeed exceptional. Out of 616 languages, the language with the top score for “snow” was Eastern Canadian Inuktitut. The other two Inuit languages in our data set (Western Canadian Inuktitut and North Alaskan Inupiatun) also achieved high scores for “snow”.

The Eastern Canadian Inuktitut dictionary in our dataset includes terms such as kikalukpok, which means “noisy walking on hard snow”, and apingaut, which means “first snow fall”.

The top 20 languages for “snow” included several other languages of Alaska, such as Ahtena, Dena'ina and Central Alaskan Yupik, as well as Japanese and Scots.

Scots includes terms such as doon-lay, meaning “a heavy fall of snow”, feughter meaning “a sudden, slight fall of snow”, and fuddum, meaning “snow drifting at intervals”.

You can explore our findings using the tool we developed, which allows you to identify the top languages for any given concept, and the top concepts for a particular language.

The top-scoring languages for “smell” include a cluster of Oceanic languages such as Marshallese, which has terms such as jatbo meaning “smell of damp clothing”, meļļā meaning “smell of blood”, and aelel meaning “smell of fish, lingering on hands, body, or utensils”.

Prior to our research, the smell terms of the Pacific Islands had received little attention.

Much to their credit, the authors are careful to issue a set of thoughtful caveats:

Although our analysis reveals many interesting links between languages and concepts, the results aren’t always reliable – and should be checked against original dictionaries where possible.

For example, the top concepts for Plautdietsch (Mennonite Low German) include von (“of”), den (“the”) and und (“and”) – all of which are unrevealing. We excluded similar words from other languages using Wiktionary, but our method did not filter out these common words for Plautdietsch.

Also, the word counts reflect both dictionary definitions and other elements, such as example sentences. While our analysis excluded words that are especially likely to appear in example sentences (such as “woman” and “father”), such words could have still influenced our results to some extent.

Most importantly, our results run the risk of perpetuating potentially harmful stereotypes if taken at face value. So we urge caution and respect while using the tool. The concepts it lists for any given language provide, at best, a crude reflection of the cultures associated with that language.
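The dictionary-counting idea described in these excerpts can be illustrated with a toy sketch in Python. This is not the authors' actual measure or their BILA data: the scoring rule (the share of entries whose English gloss mentions a concept word), the function names, the stopword list, and the `filler_` entries are all assumptions made for illustration. Only the glosses for the Mongolian, Inuktitut, and Scots headwords are taken from the passages quoted above.

```python
import re
from collections import Counter

# Glosses quoted in the post, plus invented "filler_" entries (hypothetical)
# so that the toy dictionaries contain some entries unrelated to the concept.
TOY_DICTIONARIES = {
    "Mongolian": {
        "аргамаг": "a good racing or riding horse",
        "чөдөрлөх": "to hobble a horse",
        "filler_1": "water",
    },
    "Eastern Canadian Inuktitut": {
        "kikalukpok": "noisy walking on hard snow",
        "apingaut": "first snow fall",
        "filler_1": "house",
        "filler_2": "to eat",
    },
    "Scots": {
        "doon-lay": "a heavy fall of snow",
        "feughter": "a sudden, slight fall of snow",
        "fuddum": "snow drifting at intervals",
        "filler_1": "house",
        "filler_2": "to eat",
        "filler_3": "water",
        "filler_4": "to walk",
    },
}

# Assumed list of function words to ignore, echoing the caveat about
# unrevealing "top concepts" such as "of", "the", and "and".
STOPWORDS = {"a", "an", "the", "of", "and", "to", "or", "on", "at", "in"}


def tokenize(gloss):
    """Lowercase a gloss and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", gloss.lower())


def concept_score(dictionary, concept):
    """Share of entries whose gloss contains the exact concept word."""
    if not dictionary:
        return 0.0
    hits = sum(1 for gloss in dictionary.values() if concept in tokenize(gloss))
    return hits / len(dictionary)


def rank_languages(dictionaries, concept):
    """Languages sorted from highest to lowest score for one concept."""
    scores = {lang: concept_score(d, concept) for lang, d in dictionaries.items()}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


def top_concepts(dictionary, stopwords=STOPWORDS, n=3):
    """Most frequent non-stopword gloss words, counted once per entry."""
    counts = Counter()
    for gloss in dictionary.values():
        counts.update(set(tokenize(gloss)) - stopwords)
    return counts.most_common(n)


if __name__ == "__main__":
    for language, score in rank_languages(TOY_DICTIONARIES, "snow"):
        print(f"{language}: {score:.2f}")
    print(top_concepts(TOY_DICTIONARIES["Scots"], n=1))
```

Dividing by dictionary size is one simple way to keep a large dictionary from winning merely because it has more entries, and the stopword filter shows why function words have to be screened out before "top concepts" mean anything. A real analysis would also need to handle inflected forms ("horses"), example sentences, and dictionaries of very different quality, which is where the authors' caveats apply.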

To conclude, one of my favorite Mongolian words is Morin Khuur (Mongolian: Морин хуур), which may be translated as "horse fiddle".

Soulful Mongolian Horsehead Fiddle | Coplans in China
 
The full Classical Mongolian name is Morin Tologhay'ta Quğur (Морин толгойтой хуур), meaning "fiddle with a horse's head".

See "Some Mongolian words for 'horse'" (11/7/19), a variorum post with observations and comments by more than two dozen specialists.

 

Selected readings

The claim that Eskimo words for snow are unusually numerous, particularly in contrast to English, is a cliché commonly used to support the controversial linguistic relativity hypothesis. In linguistic terminology, the relevant languages are the Eskimo–Aleut languages, specifically the Yupik and Inuit varieties.

The strongest interpretation of the linguistic relativity hypothesis, also known as the Sapir–Whorf hypothesis or "Whorfianism", posits that a language's vocabulary (among other features) shapes or limits its speakers' view of the world. This interpretation is widely criticized by linguists, though a 2010 study supports the core notion that the Yupik and Inuit languages have many more root words for frozen variants of water than the English language. The original claim is loosely based in the work of anthropologist Franz Boas and was particularly promoted by his contemporary, Benjamin Lee Whorf, whose name is connected with the hypothesis. The idea is commonly tied to larger discussions on the connections between language and thought.

Franz Boas did not make quantitative claims but rather pointed out that the Eskaleut languages have about the same number of distinct word roots referring to snow as English does, with the structure of these languages tending to allow more variety as to how those roots can be modified in forming a single word. A good deal of the ongoing debate thus depends on how one defines "word", and perhaps even "word root".

The first re-evaluation of the claim was by linguist Laura Martin in 1986, who traced the history of the claim and argued that its prevalence had diverted attention from serious research into linguistic relativity. A subsequent influential, humorous, and polemical essay by Geoffrey K. Pullum repeated Martin's critique, calling the process by which the so-called "myth" was created the "Great Eskimo Vocabulary Hoax". Pullum argued that the fact that the number of word roots for snow is about equally large in Eskimoan languages and English indicates that there exists no difference in the size of their respective vocabularies to define snow. Other specialists in the matter of Eskimoan languages and Eskimoan knowledge of snow and especially sea ice argue against this notion and defend Boas's original fieldwork amongst the Inuit, at the time known as Eskimo, of Baffin Island.

[Thanks to Hiroshi Kumamoto and Ross Presser]

Mawkishly maudlin

Apr. 14th, 2025 05:00 pm

Posted by Victor Mair

Thirty-five or so years ago, Allyn Rickett (1921-2020), my old colleague at Penn, referred to a certain person as "pópomāmā 婆婆媽媽" ("mawkishly maudlin" [my translation of Rickett's Mandarin]; "old-lady-like").  This is such an unusual expression, and it so perfectly characterized the individual in question, that it's worth writing a post on it.

In the years around the founding of the People's Republic of China in 1949, Rickett ("Rick") was in China doing research for his doctoral dissertation on the Guǎn Zǐ 管子 (Master Guan), a large and important politicophilosophical text reflecting the thought and practice of the Spring and Autumn period (c. 770-c. 481 BC), though the received version was not edited until circa 26 BC.  Rickett was accused of spying for the US Office of Naval Intelligence and imprisoned by the PRC government.  There he underwent four years of "struggle sessions".  Call them what you will, he had ample opportunity to become familiar with such colloquial terms as "pópomāmā 婆婆媽媽".

I should also note that Rickett, who was a student of the distinguished Sinologist, Derk Bodde (1909-2003), was an outstanding scholar in his own right, and his densely annotated translation of the Guan Zi is a monumental achievement, one that he worked on for most of his professional life.

Now, back to pópomāmā 婆婆媽媽.  First, let's break the four-syllable term down into its constituent monosyllables:

pó 婆

From Proto-Sino-Tibetan *pʷa-n ~ *bʷa-n (grandmother). Cognate with Burmese ဘွား (bhwa:, grandmother).

  1. old woman
  2. grandmother
  3. mother-in-law (of a woman); husband's mother
  4. woman in a particular profession
      媒婆  ―  méipó  ―  female matchmaker
  5. pejorative suffix for a woman
      肥婆  ―  féipó  ―  fat woman
  6. (ACG, neologism) Short for 老婆 (lǎopo, “wife”)

In transcriptions of Buddhist terms, 婆 (MC ba) is often used to transcribe Sanskrit ब (ba), भ (bha) and व (va), e.g. 濕婆 / 湿婆 (Shīpó, “Shiva”).

(Wiktionary)

 

mā 媽

Colloquial form of 母 (OC *mɯʔ, “mother”), from Proto-Sino-Tibetan *mow (woman, female).

  1. mom; mum: an affectionate term for a mother
      我的媽呀!  ―  Wǒ de mā ya!  ―  Oh my God! [Literally: Oh my Mom!]
  2. (usually with qualifier) other older female relatives or house servants
  3. (religion) "The Mother", an epithet of the Fujianese sea goddess Mazu (媽祖 / 妈祖).
  4. (obsolete, hapax legomenon) mare; female horse

(Wiktionary)

 

pópo 婆婆

  1. (chiefly Mandarin, Jin) mother-in-law (husband's mother)
  2. (Cantonese, dialectal Mandarin, dialectal Hakka, dialectal Jin, dialectal Wu, Taining Gan, Jianyang Northern Min, Shaowu Min) maternal grandmother
  3. (Gan, dialectal Mandarin, dialectal Xiang, dialectal Wu) paternal grandmother
  4. (dialectal) old woman
  5. (figurative, Mainland China) higher authorities; superior; leader

(Wiktionary)

 

māma / mǎmá 媽媽

  1. (informal) mum (mom); mama
    遵守媽媽的叮嚀 / 遵守妈妈的叮咛  ―  zūnshǒu māma de dīngníng  ―  to follow mum's advice
    媽媽的味道 / 妈妈的味道  ―  māma de wèidào  ―  the taste of mum's cooking
    媽媽總是向著妹妹
    妈妈总是向着妹妹
    Māma zǒngshì xiàngzhe mèimei. [Pinyin]
    Mum always favours my younger sister.
    你這麼淘氣，媽媽不擔心才怪
    你这么淘气，妈妈不担心才怪
    Nǐ zhème táoqì, māma bù dānxīn cáiguài! [Pinyin]
    You are such a naughty kid. It'd be surprising if your mum were not worried.
    我一輩子都被媽媽監視
    我一辈子都被妈妈监视
    Wǒ yībèizi dōu bèi māma jiānshì. [Pinyin]
    Mum has kept tabs on me all my life.
  2. (dialectal, colloquial) breast
  3. (ACG, figurative) character designer (female)

(Wiktionary)

 

Putting all four syllables back together, we get "pópomāmā 婆婆媽媽", which means "overly careful (like an old woman); womanishly garrulous; irresolute", and so forth.

Here's an interesting quotation illustrating its usage:

(Wiktionary)

 

The author of these sentences was Zhū Zìqīng 朱自清 (1898-1948), a famous poet and essayist of the Republican period.

Key terms, as rendered by online interpreters:

  • White people—God’s favorite (GT)
  • White people — God's pride (Baidu)
  • Caucasians — the proud sons of God (MS Bing)
  • The white man — the pride of God (DeepL), plus three alternatives that vary only slightly

Zhu Ziqing had his finger on the pulse of the lingua franca of China during the first half of the 20th century (particularly the second quarter of that century).  When I began studying Mandarin in 1967, it was to that period that I looked for a model on which to base my own emerging idiolect.  The reason for this is that I thought it was the most vibrant vernacular of the century, certainly more lively and creative than the period in which I grew up and learned Mandarin.

Moreover, I had a strong antipathy to the characters, whether traditional or simplified, the former for being divorced from spoken language and the latter for being neither fish nor fowl.  So I turned to romanized missionary writings where I could learn delightful terms like shabulengdengde ("foolish; daffy") and pangdudu ("chubby").  When I was forced by my Mandarin teachers to learn characters, I preferred to do it through the literature of writers like Zhu Ziqing and Lao She (1899-1966), who stretched sinographic writing as close to alphabetic writing as it could go.  That's why I loved words like pópomāmā 婆婆媽媽 ("old-lady-like; anile") and niánniándādā 黏黏搭搭 ("sticky; irresolute; kleisty").

 


 

AI Sauce

Apr. 13th, 2025 10:12 pm

Posted by Victor Mair

This should add some zest to the debate over AI.

She's not the only one.  I've seen a lot of people mix up AI and A1.

 

Selected readings

The last two posts together are definitive.

[h.t. François Lang]


Posted by Victor Mair

Jeffrey Weng, "What Is Mandarin? The Social Project of Language Standardization in Early Republican China", Journal of Asian Studies, 77.3 (August 2018), 611-633.

Abstract

Scholars who study language often see standard or official languages as oppressive, helping the socially advantaged to entrench themselves as elites. This article questions this view by examining the Chinese case, in which early twentieth-century language reformers attempted to remake their society's language situation to further national integration. Classical Chinese, accessible only to a privileged few, was sidelined in favor of Mandarin, a national standard newly created for the many. This article argues that Mandarin's creation reflected an entirely new vision of society. It draws on archival sources on the design and promulgation of Mandarin from the 1910s to the 1930s to discuss how the way the language was standardized reflected the nature of the imagined future society it was meant to serve. Language reform thus represented a radical rethinking of how society should be organized: linguistic modernity was to be a national modernity, in which all the nation's people would have access to the new official language, and thus increased opportunities for advancement.

The first two paragraphs of the article:

The artificiality of China's standard language is no secret. Nonetheless, much of social and sociolinguistic theory until now has been devoted to unmasking the artificiality and arbitrariness of standard languages. But the arbitrariness of the Chinese standard was never hidden from public view. This language, which this essay will refer to simply as “Mandarin,” was deliberately designed in the early twentieth century to be distinct from any existing spoken vernacular. This new language, though based on the speech of Beijing, was different from it and every other regional or local speech in China, and it was designed to be the standard for the entire polyglot Chinese nation. Whereas Beijing speech was a language of a particular place spoken by a particular group of people, Mandarin was intended to be, within China, the language of all places and no particular place. Thus universalized, Mandarin could facilitate nationwide communication that previously had been stymied by the nation's extensive multilinguality.

The creation of a Chinese standard language, therefore, was a state-led nation-building project, meant to mold a motley collection of peoples into a unified national society. But what was to be the nature of this new society? One of the main goals of language reform in China was to create a standard language that was easier to learn and thus more widely accessible. This desire for a more accessible standard language represented a substantial departure from the previous language situation, in which the official language—Classical Chinese—was so difficult to learn that access was restricted to a small segment of society. The promulgation of a national standard language in the early twentieth century therefore represented an attempt to extend educational meritocracy from a small segment of elites to all of society. I argue that the creation of a new language was intimately connected to the goal of a new social order.

The following paragraph has copious, up-to-date citations, the full details of which are given in the author's ample List of References.

In so arguing, I diverge from the approaches taken in a small but growing body of scholarship in addressing the question of Chinese language reform. Historians in the past few years have been particularly active in this area, reflecting a resurgence of interest in language in China that began in the United States with the landmark works of the linguist and sinologist John DeFrancis (1950, 1984, 1989). David Moser (2016) has offered a fresh overview of how Mandarin came to be China's national language. Recent studies have also addressed the social history of the origins and growth of Mandarin, documenting the experiences of everyday people in their encounter with the new national language in sound, on the screen, and in print (J. Chen 2013b, 2015; Culp 2008). Other historians have discussed the intellectual history behind the vernacular language movement and the promulgation of Mandarin in China before and after 1949 (Kaske 2008; J. Liu 2016; Tam 2016a, 2016b). Among linguists, the study of Mandarin phonology has driven theory-building in generative linguistics (Duanmu 2007), while work by sociolinguists has illuminated popular attitudes about language practices in China (C. Li 2004, 2014; Peng 2016). And one cannot overlook the rapid expansion of Sinophone studies and other significant work in comparative literature in the past three decades (Gunn 1991; L. Liu 1995, 2004; Shih 2011; Tsu 2010; G. Zhou 2011).

A breath of fresh air!  "Mandarin" as a single language called into question.

I am in communication with a number of scholars (mostly young) who will soon be taking on the daunting challenge of deconstructing the whole idea of a monolithic Hànyǔ 漢語 ("Hannic"), which, faute de mieux, I sometimes call "Sinitic".  The notion of a mammoth, integral language called "Chinese" is long gone.

 


Mandarin disyllabism for beginners

Apr. 11th, 2025 09:32 pm

Posted by Victor Mair

tā jiǎng de hěn qīngchǔ
她講得很清楚

tā jiǎng de fēicháng qīngchǔ
她講得非常清楚

tā jiǎng de tèbié qīngchǔ
她講得特別清楚

tā jiǎng de shífēn qīngchǔ
她講得十分清楚

tā jiǎng de qīngqīngchǔchǔ
她講得清清楚楚

tā jiǎng de qīngchǔ de bùdéliǎo
她講得清楚得不得了

tā jiǎng de bùnéng zài qīngchǔ le
她講得不能再清楚了
("She couldn't have explained it more clearly")

All seven sentences say the same thing, "She explained it clearly", with various nuances. This is a language learning game I like to play to show how much flexibility there can be in Mandarin expressions.

 


[Thanks to John Rohsenow]

Romanized Japanese Bible translation

Apr. 10th, 2025 03:21 pm

Posted by Victor Mair

The Portuguese were the first Europeans to arrive in Japan in 1543, establishing trade and cultural exchange, including the introduction of firearms and Christianity, which later led to persecution and the Sakoku (closed country) policy in the 17th century. (AIO)

For the impact of Portuguese missionaries on the study of East Asian languages and linguistics, see, for example:

W. South Coblin, Francisco Varo's Glossary of the Mandarin Language. Vol. 1: An English and Chinese Annotation of the Vocabulario de la Lengua Mandarina; Vol. 2: Pinyin and English Index of the Vocabulario de la Lengua Mandarina (London: Routledge, 2006).

Abstract

Western missionaries contributed largely to Chinese lexicography. Their involvement was basically a practical rather than a theoretical one. In order to preach and convert, it was necessary to speak Chinese. A missionary on post needed to learn at least two languages, the national Guanhua, the "language of the officials" or "Mandarin," and the local vernacular. The first lexicographical work by missionaries was a Portuguese-Chinese dictionary compiled in the late 1500s. Francisco Varo (1627-1687), a Spanish Dominican based in the province of Fujian, was legendary for his superb mastery in Mandarin. His Vocabulario de la Lengua Mandarina, a Spanish-Chinese dictionary, is made available to modern readers in the present study, which is based on two manuscripts held in Berlin and London. Volume 1 contains the text of Varo's glossary, with English translations offered for all Spanish glosses and Chinese characters added for all Chinese forms. Volume 2 includes a pinyin index to all Chinese forms in the text and a selective index to the English translations of the Chinese glosses. The Vocabulario is mainly devoted to the spoken language, but includes literary forms as well. Varo was also sensitive to other matters of usage, e.g., questions of style, new expressions coined by the missionaries, specific expressions in Chinese and in European culture, Chinese customs and beliefs, and aspects of grammar. The Vocabulario is recommended for readers interested in Chinese linguistics, lexicography, Sino-Western cultural relations and the history of Christianity in China.

See also:   W. South Coblin and Joseph Abraham Levi, Francisco Varo's Grammar of the Mandarin Language (1703). An English Translation of 'Arte de la lengua mandarina' (Philadelphia: John Benjamins, 2000).

 


[h.t. Geoffrey Wade]

Rime / rhyme tables / charts

Apr. 10th, 2025 02:58 pm

Posted by Victor Mair

In Chinese they are called yùntú 韻圖 / 韵图. These tools are vitally important in the development of Sinitic phonology but barely known outside the circle of sinological specialists, so, for the history of world phonology, it is worthwhile to introduce them to linguists in general.

A rime table or rhyme table (simplified Chinese: 韵图; traditional Chinese: 韻圖; pinyin: yùntú; Wade–Giles: yün-t'u) is a Chinese phonological model, tabulating the syllables of the series of rime dictionaries beginning with the Qieyun (601) by their onsets, rhyme groups, tones and other properties. The method gave a significantly more precise and systematic account of the sounds of those dictionaries than the previously used fǎnqiè analysis, but many of its details remain obscure. The phonological system that is implicit in the rime dictionaries and analysed in the rime tables is known as Middle Chinese, and is the traditional starting point for efforts to recover the sounds of early forms of Chinese. Some authors distinguish the two layers as Early and Late Middle Chinese respectively.

The earliest rime tables are associated with Chinese Buddhist monks, who are believed to have been inspired by the Sanskrit syllable charts in the Siddham script they used to study the language. The oldest extant rime tables are the 12th-century Yunjing ('mirror of rhymes') and Qiyin lüe ('summary of the seven sounds'), which are very similar, and believed to derive from a common prototype. Earlier fragmentary documents describing the analysis have been found at Dunhuang, suggesting that the tradition may date back to the late Tang dynasty.

Some scholars, such as the Swedish linguist Bernhard Karlgren, use the French spelling rime for the categories described in these works, to distinguish them from the concept of poetic rhyme.

(Wikipedia)

We are fortunate to have an expert treatment of the rime / rhyme tables in The Chinese Rime Tables: Linguistic Philosophy and Historical-Comparative Phonology, edited by David Prager Branner (Amsterdam and Philadelphia: John Benjamins, 2006), viii, 358 pp. [Current Issues in Linguistic Theory, 271] https://doi.org/10.1075/cilt.271

This book, the first in its field in a Western language, examines China’s native phonological tool with regard to reconstruction, theory, and linguistic philosophy.

After an introductory essay on the nature of the tables and the history of their interpretation, the book concentrates on three areas: application of rime table theory to reconstruction, the history of rime table theory, and the application of the tables to descriptive linguistics. An appendix details a number of 20th century systems for transcribing their phonology into Roman letters.

Major topics include Altaic contact-influence on Chinese, early native understanding of the tables’ meaning, the phonological work of Yuen Ren Chao, and Stammbaumtheorie/diasystemic thinking about Chinese. New reconstructions of Han and “Common Dialectal” phonology appear here, as do complete texts and translations of the Shouwen fragments and Yunjing preface.

Shouwen was a shadowy 9th-century Buddhist Chinese monk who has been credited with the invention of the analysis of Middle Sinitic as having 36 initials, later ubiquitously used by the rime tables.  One could say that he had created an abortive proto-alphabet for Sinitic, one that never bore fruit as an actual writing system.  I believe that was due to the strong path dependency of the deeply entrenched trimillennial sinographs.

    Introduction: What Are Rime Tables and What Do They Mean?
    David Prager Branner | pp. 1–34

    Part I: Rime-Tables in Chinese Reconstruction

    On the Principle of the Four Grades
    Abraham Chan | pp. 37–46

    The Four Grades: An Interpretation from the Perspective of Sino-Altaic Language Contact
    Chris Wen-Chao Li | pp. 47–58

    On Old Turkic Consonantism and Vocalic Divisions of Acute Consonants in Medieval Hàn Phonology
    An-King Lim | pp. 59–82

    The Qièyùn System ‘Divisions’ as the Result of Vowel Warping
    Axel Schuessler | pp. 83–96

    Part II: The History of Rime Table Texts and Reconstruction

    Reflections on the Shouwen Fragments
    W. South Coblin | pp. 99–122

    Zhāng Línzhī on the Yùnjìng
    W. South Coblin | pp. 123–150

    Simon Schaank and the Evolution of Western Beliefs About Traditional Chinese Phonology
    David Prager Branner | pp. 151–167

    Part III: Rime Tables as Descriptive Tools

    How Rime-Book Based Analyses Can Lead Us Astray
    Richard VanNess Simmons | pp. 171–182

    Modern Chinese and the Rime Tables
    Jerry Norman | pp. 183–188

    Common Dialect Phonology in Practice: Y.R. Chao’s Field Methodology
    Richard VanNess Simmons | pp. 189–208

    Some Composite Phonological Systems in Chinese
    David Prager Branner | pp. 209–232

    Common Dialectal Chinese
    Jerry Norman | pp. 233–254

    Appendix I: Pronunciation Guide to Boodberg's Alternative Grammatonomic Notation
    Gari K. Ledyard | pp. 255–264

    Appendix II: Comparative Transcriptions of Rime Table Phonology
    David Prager Branner | pp. 265–302

    Index of Biographical Names | pp. 327–332

    General Index | pp. 333–358

More recently, the Chinese scholar Pān Wénguó 潘文国 published a two-volume work titled Yùn tú kǎo 韵图考. It was translated into English by Lǐ Zhìqiáng 李志強 (Andy Li), who was a visiting scholar at Penn a decade ago.

Pan Wenguo, The Chinese Rhyme Tables, vol. 1 (London: Routledge, 2023).

Abstract

As the first volume of a two-volume set that studies Chinese rhyme tables, this book focuses on their emergence, development, structure, and patterns. Rhyme tables are a tabulated tool constituted by phonological properties, which help indicate the pronunciation of sinograms or Chinese characters, marking a precise and systematic account of the Chinese phonological system. This volume first discusses the emergence of the model and factors that determined its formation and evolution, including the Chinese tradition of the rhyme dictionary and the introduction of Buddhist scripts. The second part analyzes the structure and arrangement patterns of rhyme tables in detail, giving insights into the nature of “division” (deng): the classification and differentiation of speech sounds, of vital significance in the reconstruction of Middle Chinese. The author argues that deng has nothing to do with vowel aperture or other phonetic features but is a natural result of rhyme table arrangement. He also reexamines the principles for irregular cases (menfa rules) and categorizes the 20 rules into three types.

The book will appeal to scholars and students who are studying linguistics, Chinese phonology, and Sinology.

Pan Wenguo, The Chinese Rhyme Tables, vol. 2 (London: Routledge, 2023).

Abstract

As the second volume of a two-volume set that studies the Chinese rhyme tables, this book seeks to reconstruct the ancient rhyme tables based on the extant materials and findings.

A rhyme table is a tabulated tool constituted by phonological properties, which helps indicate the pronunciation of sinograms or Chinese characters, marking an accurate and systematic account of the Chinese phonological system. The book first explores the relationship and identifies the prototype of the extant rhyme tables. Then the principles and methods for collating and rebuilding the ancient rhyme table are introduced. It then looks at the general layout, including tables, table order, shè, zhuǎn, rhyme heading, rhyme order, light and heavy articulations, rounded and unrounded articulations, and initials. The final chapter presents the reconstructed rhyme tables with detailed annotations and add-on indexes.

The book will appeal to scholars and students studying Sinology, Chinese linguistics, and especially Chinese

Because these two volumes are primarily descriptive and narrative, I do not list the contents of their chapters as I did for the Branner volume. That volume is geared more to the ideas behind the rime tables and their philosophical significance, and it offers an abundance of pathbreaking papers by the leading historical linguists of the day that focus on common topolectal features, extra-Sinitic associations, and other previously undiscussed aspects of the rime tables.

I asked Chris Button whether he preferred one or the other, "rime" or "rhyme" for these charts / tables.  He replied sensibly:

I would use onset vs rime in a linguistic sense, but I would use rhyme when referring to poetry. So, I would probably go with "rime table" since it's not specifically for poetic use.

To give you an idea of what these "rime tables" looked like and how they were structured, here's the first chart (of 43) from the Yùnjìng 韻鏡 (Mirror of Rimes; 1161, 1203):

The five big characters on the right-hand side read Nèi zhuǎn dìyī kāi (內轉第一開). In the Yùnjìng, each chart is called a zhuǎn (lit. 'turn'). The characters indicate that the chart is the first (第一) one in the book, and that the syllables of this chart are "inner" (內) and "open" (開).

The columns of each table classify syllables according to their initial consonant (shēngmǔ 聲母 lit. 'sound mother'), with syllables beginning with a vowel considered to have a "zero initial". Initials are classified according to place and manner of articulation.

The order of the places and manners roughly matches that of Sanskrit, providing further evidence of inspiration from Indian phonology.

(Wikipedia)
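
For readers who think more easily in data structures, the layout just described can be sketched in a few lines of Python. This is only a toy model under stated assumptions, not a transcription of the Yùnjìng: the class name `RimeChart` and its methods are hypothetical, the English tone labels and the choice to index rows by tone plus division go beyond what the quoted passage spells out, and the two sample placements (東 under the initial 端, 公 under the initial 見) are illustrative entries rather than a checked collation of the first chart.

```python
from dataclasses import dataclass, field

# Assumed row labels: four Middle Chinese tone categories and four
# divisions ("grades"). Together they index the rows in this sketch.
TONES = ("level", "rising", "departing", "entering")
DIVISIONS = (1, 2, 3, 4)


@dataclass
class RimeChart:
    """Toy model of one chart (zhuǎn); hypothetical, for illustration only."""
    number: int        # position of the chart in the book, e.g. 1 for 第一
    inner: bool        # True for "inner" (內)
    open_mouth: bool   # True for "open" (開)
    cells: dict = field(default_factory=dict)

    def place(self, initial: str, tone: str, division: int, syllable: str) -> None:
        """Put a syllable in the cell for this initial (column) and tone/division (row)."""
        if tone not in TONES:
            raise ValueError(f"unknown tone: {tone!r}")
        if division not in DIVISIONS:
            raise ValueError(f"unknown division: {division!r}")
        self.cells[(initial, tone, division)] = syllable

    def lookup(self, initial: str, tone: str, division: int):
        """Return the syllable in a cell, or None if the cell is empty."""
        return self.cells.get((initial, tone, division))

    def row(self, tone: str, division: int) -> dict:
        """Return every filled cell in one row, keyed by initial."""
        return {
            initial: syllable
            for (initial, t, d), syllable in self.cells.items()
            if t == tone and d == division
        }


# Illustrative use: a chart headed "inner, first, open" with two sample cells.
chart = RimeChart(number=1, inner=True, open_mouth=True)
chart.place("端", "level", 1, "東")
chart.place("見", "level", 1, "公")

print(chart.lookup("端", "level", 1))
print(chart.row("level", 1))
```

Keying each cell by the triple of initial, tone, and division mirrors the way a reader of a chart locates a syllable by column and row. A real chart carries more information than this, such as the rhyme headings, which the sketch leaves out.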

There you have it, a capsule introduction to the Chinese rime tables, which were as important for premodern Sinitic phonology as slide rules were for mathematical operations before the invention of digital calculators and computers.  The parallels are not perfect, but the idea of having once been an essential tool in a technical field and later having become obsolete is common to both.

 


[Thanks to South Coblin and Axel Schuessler]
