The Journey of the Entering Tone from Middle Chinese 中古音からの入声の道

Introduction 紹介
One of the reasons I am interested in historical linguistics is how consistent it is. For example, in Japanese, the Nara period /p/ sound became /h/ word-initially, remained /p/ after a sokuon (っ), and became /w/ elsewhere in the middle of a word (this /w/ later dropped out before /i, u, e, o/). For evidence of this, look at the following words spelled in their historical kana orthography (歴史的仮名遣い) and modern kana orthography (現代仮名遣い).

私が歴史言語学に興味がある理由の一つは歴史言語学の整然さだ。例えば、日本語においては、奈良時代の/p/の音が、語頭で/h/になり、促音(っ)の後で/p/のまま残り、それ以外の語中で/w/になった(/i, u, e, o/の前で、この/w/が脱落した)。証拠として、歴史的仮名遣いと現代仮名遣いで綴られた以下の単語を見よ。

Kanji 漢字 | Historical Kana Orthography 歴史的仮名遣い | Modern Kana Orthography 現代仮名遣い

Note: The はひふへほ characters signified the /pa, pi, pu, pe, po/ sounds when the historical kana orthography was created in the early Heian period.

注意:「はひふへほ」は歴史的仮名遣いが作られた平安初期に/pa, pi, pu, pe, po/の音を表していた。
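The three-way /p/ rule described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `modernize_p`, the romanized input forms, and the use of `Q` to stand for the sokuon are all hypothetical conventions chosen for this sketch, and every other sound change is ignored.

```python
def modernize_p(word: str) -> str:
    """Apply the /p/ rule to a romanized early form.

    Assumptions (hypothetical notation for this sketch only):
    - the input is a plain romanization with one letter per segment
    - 'Q' marks the sokuon (っ)
    """
    out = []
    for i, ch in enumerate(word):
        if ch != "p":
            out.append(ch)          # every other segment is left untouched
        elif i == 0:
            out.append("h")         # word-initial /p/ > /h/
        elif word[i - 1] == "Q":
            out.append("p")         # /p/ next to a sokuon stays /p/
        else:
            # word-medial /p/ > /w/, and that /w/ survives only before /a/
            nxt = word[i + 1] if i + 1 < len(word) else ""
            out.append("w" if nxt == "a" else "")
    return "".join(out)


for early in ("pana", "kapa", "kopi", "niQpoN"):
    print(early, "->", modernize_p(early))
```

The sample inputs are romanizations picked for illustration (for example `kapa` for かは and `kopi` for こひ), not forms quoted from a source.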

So, with that said, you can imagine my disappointment when I found out that in Beijing Mandarin Chinese there is a class of syllables whose tone cannot be determined based on their Middle Chinese form. This class is composed of syllables which had the entering tone and a voiceless initial (except the glottal stop) in Middle Chinese. For brevity, I will refer to this class as ETVI (Entering Tone Voiceless Initial). According to David Branner’s A Neutral Transcription System for Teaching Medieval Chinese (p. 36), the ETVI syllables could develop into any of the four Modern Beijing Mandarin tones, which is in stark contrast to the other Chinese syllables, which followed diachronic rules very closely. Branner suggests this may be the result of dialect mixing. Branner also notes that, according to his correspondence with Jerry Norman, ETVI syllables tended to become either tone 1 or 3 in colloquial readings, and tone 2 or 4 in literary readings, the latter due to a practice of forced rhyming called Xieyun 協韻.

そこでは、中古音形式によって現代北京語反映形の声調を予測できない音節があるとわかった時に、残念に思った。このクラスは、無声声母(声門閉鎖音以外)と入声の中古音形式の音節から成り立っている。以下、このクラスをETVI(Entering Tone Voiceless Initial 入声無声声母)と呼ぶ。デービッド・ブラナーの『中古音の教えのための中立表記』(A Neutral Transcription System for Teaching Medieval Chinese)(36頁)によると、通時的ルールに従う他の音節とは対照的に、ETVI音節は現代北京語の4種類の声調のどちらにもなりうる。この現象は方言同士が交流した結果であるかもしれないとブラナーは提案する。また、ジェリー・ノーマンとの協議の結果、ETVI音節が、口語の場合は1声、3声になり、文語の場合、「協韻」と呼ばれる強制押韻のために、2声、4声になる傾向があると、ブラナーは指摘している。

The thing that shocked me most about this phenomenon is that according to Pulleyblank’s The Nature of the Middle Chinese Tones and Their Development to Early Mandarin, by Early Mandarin, the language of Northern China during the Yuan period, the ETVI syllables had become tone 3. This naturally brings up the question, why did the ETVI syllables go from being regular to being irregular? While I could not immediately tackle this question since my knowledge of the history of the Chinese language is lacking, I figured a good first step was to collect statistical data on just how random the tonal distribution of the ETVI syllables actually is in Modern Beijing Mandarin. But before I get to the methodology, let’s go over Chinese syllable structure, and how Chinese developed from Early Middle Chinese to Modern Beijing Mandarin.

プリーブランクの『中古音の声調の性質と早期官話への発展』(The Nature of the Middle Chinese Tones and Their Development to Early Mandarin)によると、早期官話(元朝の北部中国の言語)までにETVI音節は3声になったとされるが、これは先ほど述べた発展には合致しない。故に、なぜETVI音節が規則的なものから不規則的なものになったかという問題が生じる。私は中国語の歴史について詳しくないので、この疑問に答えることができない。しかし現代北京語におけるETVI音節の声調分布がどれほどランダムであるかに関する統計データを集めると、この疑問を解決するための良い手段となるだろう。しかし方法を説明する前に、中国語音節構造と、中古音から現代北京語までの中国語歴史を見てみよう。

Chinese Syllable Structure 中国語の音節構成
Chinese syllables are traditionally divided into three parts: an initial, a final, and a tone. The initial consists of the onset of the syllable, the initial consonant. The final consists of the onglide, the nucleus, and the coda – the rest of the syllable. Lastly, the tone consists of a pitch that is carried for the duration of the syllable. For example, the modern Mandarin word shuāng 雙 can be analyzed as follows.


Initial 声母 | Final 韻母 | Tone 声調
sh [ʂ] | uang [wɑŋ] | 1 [˥]

The Development of Tone from Early Middle Chinese to Modern Beijing Mandarin 前期中古音から現代北京語までの声調の発展
This explanation will only cover what is relevant for us to understand the origin and development of ETVI syllables. This explanation will exclusively follow Pulleyblank’s Late Middle Chinese – Part I/II, The Nature of the Middle Chinese Tones and Their Development to Early Mandarin, Middle Chinese: A Study in Historical Phonology, and Lexicon of Reconstructed Pronunciation in Early Middle Chinese, Late Middle Chinese, and Early Mandarin.

この説明では、ETVI音節の由来と発展を理解するために必要な部分だけを扱う。この説明はもっぱら、プリーブランクの『後期中古音 第1・2部』(Late Middle Chinese – Part I/II)『中古音の声調の性質と早期官話への発展』(The Nature of the Middle Chinese Tones and Their Development to Early Mandarin)、『中古音:歴史音韻論でのケーススタディ』(Middle Chinese: A Study in Historical Phonology)、『前期中古音、後期中古音、早期官話での再建された発音の字典』(Lexicon of Reconstructed Pronunciation in Early Middle Chinese, Late Middle Chinese, and Early Mandarin)に従っている。

Early Middle Chinese 前期中古音
Initials 声母
Early Middle Chinese (the language of the Sui dynasty), hereinafter EMC, had 39 initials, listed as follows.

前期中古音(隋朝での言語)、以下EMC(Early Middle Chinese)、は以下のリストにある通り39声母を持っていた。

Initial name 声母の名称 | Pronunciation 発音 | Notes 注釈
云 | w | Some scholars view 匣 and 云 as allophones instead of separate initials
牀 |  | 牀 is also known as 崇
山 | ʂ | 山 is also known as 生
照 |  | 照 is also known as 章
穿 | tɕʰ | 穿 is also known as 昌
禪 |  | The rime tables reverse the positions of 禪 and 神, so some reconstructions have their pronunciations reversed
審 | ɕ | 審 is also known as 書
神 | ʑ | 神 is also known as 船. The rime tables reverse the positions of 禪 and 神, so some reconstructions have their pronunciations reversed
羊 | j | 羊 is also known as 以
匣 | ɣ | Some scholars view 匣 and 云 as allophones instead of separate initials
 | H | H is regarded as a “zero initial” and is left out of transcription, but Pulleyblank intends for it to be a “voiced laryngeal glide”. The only EMC characters to have this initial are the enclitics 矣 and 焉. Pulleyblank created this EMC initial because he believes the evidence for reconstructing initial 云 as [w] is indisputable, but 矣 and 焉, both starting with initial 云, have no trace of a labial glide. I am unaware of any other reconstruction which reconstructs an extra initial for EMC like this.


Tones 声調
EMC had four tones: the level tone 平, the rising tone 上, the departing tone 去, and the entering tone 入. While these tones likely did have an associated pitch contour, their most salient features were their final segments.
  • The level tone had no final voiceless segment (and was likely a bit longer than the other tones)
  • The rising tone had a final glottal stop (inherited from Old Chinese final -ʔ)
  • The departing tone had final aspiration (derived from Old Chinese final -s)
  • The entering tone had a final stop consonant (-p, -t, -k)

  • 平声は無声分節音で終わらない(そして他の声調より長い可能性が高い)
  • 上声は声門閉鎖音で終わる(上古音の-ʔに由来する)
  • 去声は有気で終わる(上古音の-sに由来する)
  • 入声は閉鎖子音(-p, -t, -k)で終わる

Late Middle Chinese 後期中古音
Initials 声母
The initials of Late Middle Chinese (the language of the Tang and Song dynasties), hereinafter LMC, changed from those of EMC in a few ways.
  • The bilabials (幫, 滂, 並, 明) labiodentalized before certain finals to form a set of new initials, (非 [f], 敷 [f], 奉 [fʱ], 微 [ʋ]) (even though 非 and 敷 were listed as separate initials, their distinction was likely artificial).
  • 羊 and 云 merged into a new initial: 喻 [H].
  • Voiced stops and affricates became voiceless but breathy. However, when I refer to “voiceless initials”, I am not referring to these, but rather to the initials that were voiceless in EMC.
  • The palatals (照, 穿, 禪, 書, 船) merged with the retroflex sibilants (莊, 初, 牀, 山, 俟) (even though 牀 and 俟 were listed as separate initials, their distinction was likely artificial).
  • The pronunciation of the 日 initial went from a palatal nasal to a retroflex approximant [ɻ].

後期中古音(唐朝と宋朝での言語)、以下LMC(Late Middle Chinese)、はEMCから声母が以下のように変化した。
  • 両唇音(幫、滂、並、明)は、ある種の韻母の前にある場合は、唇歯化し、新しい声母(非 [f]、敷 [f]、奉 [fʱ]、微 [ʋ])を形成した(「非」と「敷」は別個の声母として記載されていたが、同じ音で発音されていた可能性が高い)。
  • 「羊」と「云」が合流し、新しい声母、喻 [H]、を形成した。
  • 有声閉鎖音と破擦音は、息もれ無声になった。ただし、この記事において、「無声声母」はこれらの声母を指さず、EMCでの無声声母を指す。
  • 硬口蓋音(照、穿、禪、書、船)はそり舌歯察音(莊、初、牀、山、俟)と合流した(「牀」と「俟」は別個の声母として記載されていたが、同じ音で発音されていた可能性が高い)。
  • 「日」の発音は、硬口蓋鼻音からそり舌接近音 [ɻ]になった。

Tones 声調
The LMC tonal system differs from that of EMC in the following ways.
  • The four EMC tones each split into two registers, a dark 陰 and light 陽 register (also known as the “upper” and “lower” registers, respectively). The dark register had a high pitch, while the light register had a low pitch. Syllables with voiceless initials went to the dark register while those with sonorant and breathy voiced aspirate initials went to the light register.
  • The final aspiration of the departing tone became a breathy voiced aspiration [ʱ].
  • EMC syllables with a breathy voiced aspirate initial and a rising tone merged with the departing tone light register, likely because the breathy voice from the initial spread to the glottal stop.
  • The final stops of the entering tone alternated with continuants accompanied by a glottal stop (-ʋʔ, -ɻʔ, -ɣʔ).

  • EMCの四声は「陰調」と「陽調」というレジスターにそれぞれ分裂した(「陰調」が「upper」と呼ばれ、「陽調」が「lower」と呼ばれる場合もある)。「陰調」はピッチが高いが、「陽調」はピッチが低い。無声声母を持つ音節は「陰調」になったが、共鳴音と息もれ声有気音声母を持つ音節は「陽調」になった。
  • 去声の末尾有気は息もれ声有気 [ʱ]になった。
  • 息もれ声有気音声母を持つEMC上声の音節は陽去声と合流した。その理由はおそらく、声母の息もれ声が声門閉鎖音に広がったからだ。
  • 入声の末尾閉鎖音は、声門閉鎖音を伴う継続音(-ʋʔ, -ɻʔ, -ɣʔ)と、交替するようになった。

Early Mandarin 早期官話
Initials 声母
There’s no need to go too deep into the changes that occurred to the initials by Early Mandarin (the language of the Yuan dynasty, also known as Old Mandarin), hereinafter EM. However, there is one change relevant to the development of EM tones.
  • 影 [ʔ] merged with 喻 /H/.

早期官話(元朝での言語)、以下EM(Early Mandarin)、古官話、までに起こった声母の変化について詳しく説明する必要はないだろう。ただしEM声調の成長に関係する変化が一つ存在する。
  • 「影」[ʔ]が「喻」/H/と合流した。

Tones 声調
Just like the previous stages, tones in EM were primarily based on laryngeal features, rather than simply pitch. The tonal classes of EM and their features are listed below.
  • The LMC level tone dark register was defined as tone 1 and had the features [-breath +long].
  • The LMC level tone light register was defined as tone 2 and had the features [+breath +long].
  • The LMC rising tone (both dark and light registers) was defined as tone 3 and had the features [-breath -long].
  • The LMC departing tone (both dark and light registers) was defined as tone 4 and had the features [+breath -long].

  • LMCの陰平声は第1声として定義され、その素性は[-息もれ +長い]であった。
  • LMCの陽平声は第2声として定義され、その素性は[+息もれ +長い]であった。
  • LMCの上声(陰上声と陽上声)は第3声として定義され、その素性は[-息もれ -長い]であった。
  • LMCの去声(陰去声と陽去声)は第4声として定義され、その素性は[+息もれ -長い]であった。

As for the entering tone, the stops/continuants accompanied by a glottal stop disappeared. The entering tone syllables were redistributed to the other tones as follows.
  • Entering tone syllables with LMC voiceless initials (except the glottal stop) merged with tone 3.
  • Entering tone syllables with LMC breathy voiced aspirate initials merged with tone 2.
  • Entering tone syllables with LMC sonorant initials (and the glottal stop) merged with tone 4.

  • LMC無声声母(声門閉鎖音以外)を持つ入声音節は第3声と合流した。
  • LMC息もれ声有気音声母を持つ入声音節は第2声と合流した。
  • LMC共鳴音声母(声門閉鎖音も)を持つ入声音節は第4声と合流した。

The explanation for this distribution is that the entering tone was originally short; therefore, the non-breathy LMC voiceless initial entering tone syllables merged with tone 3, and the LMC sonorant initial entering tone syllables, having secondary breath, merged with tone 4. As for the LMC breathy voiced aspirate initial entering tone syllables, the breathy voice from the initial spread to the final stop, lengthening it, and thereby they merged with tone 2.
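The mapping from LMC tone and initial class to Early Mandarin tone, as summarized above, can be written as a small lookup. This is a minimal sketch, not Pulleyblank’s own formalization: the names `Initial` and `early_mandarin_tone` and the string labels for the tones are hypothetical, and treating the glottal stop as a dark-register initial in the level tone is an assumption based on it being voiceless.

```python
from enum import Enum


class Initial(Enum):
    """LMC initial classes used in the rules above (labels are illustrative)."""
    VOICELESS = "voiceless"        # voiceless in EMC, excluding the glottal stop
    GLOTTAL_STOP = "glottal stop"  # 影 [ʔ]
    BREATHY = "breathy"            # breathy voiced aspirates
    SONORANT = "sonorant"


def early_mandarin_tone(lmc_tone: str, initial: Initial) -> int:
    """Return the EM tone (1-4) for an LMC tone category and initial class."""
    if lmc_tone == "level":
        # Dark register > tone 1, light register > tone 2.
        # Assumption: the glottal stop patterns with the voiceless initials here.
        dark = initial in (Initial.VOICELESS, Initial.GLOTTAL_STOP)
        return 1 if dark else 2
    if lmc_tone == "rising":
        return 3  # both registers
    if lmc_tone == "departing":
        return 4  # both registers
    if lmc_tone == "entering":
        if initial is Initial.VOICELESS:
            return 3  # the ETVI class
        if initial is Initial.BREATHY:
            return 2
        return 4      # sonorants and the glottal stop
    raise ValueError(f"unknown LMC tone: {lmc_tone!r}")


print(early_mandarin_tone("entering", Initial.VOICELESS))  # the ETVI case
```

Under this account every ETVI syllable comes out as tone 3, which is exactly the regularity that the Modern Beijing Mandarin data fails to show.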


Modern Beijing Mandarin 現代北京語
Like Early Mandarin, Modern Beijing Mandarin has four tones (it is sometimes said to have a “fifth” tone, but this is better analyzed as a neutralized tone characteristic of weak syllables). However, these tonal distinctions came to be based entirely on pitch rather than on laryngeal features. The only tonal merger/split was the aforementioned redistribution of ETVI syllables from tone 3 across tones 1, 2, 3, and 4.


Methodology 方法論
Now with all of that context out of the way, we can finally focus on the question at hand: what is the distribution of tones in Modern Beijing Mandarin for syllables that had the entering tone and a voiceless initial (except the glottal stop) in Middle Chinese?


To find this distribution, I decided to compare a dataset of Middle Chinese character readings, with a dataset of Modern Beijing Mandarin character readings. For the dataset of Middle Chinese character readings, I went with the Guangyun rime dictionary, digitized by the incredible Kanji Database Project. Even though the Guangyun was created in the Song Dynasty, it faithfully reflects the phonological system of the Qieyun, a Sui dynasty rime dictionary. The Guangyun dataset includes the readings of each character according to Karlgren’s Ancient Chinese reconstruction (analogous to Middle Chinese), making it quite easy to work with. For the modern-day dataset, I went with CC-CEDICT, because while expansive, it does not have many overly archaic readings. In addition, I needed a way to check if a modern-day reading comes from a historic reading. To do this, I used the Expected Mandarin Reflex module found on Wiktionary. While there were some reading pairs that the module said were not related but I suspected were, for the most part the module was very helpful at identifying which modern-day readings were related to historic readings.

この分布を明らかにするために、中古音の字音データセットと現代北京語の字音データセットを比較することにした。中古音の字音データセットについては、すばらしい「漢字データベースプロジェクト」によってディジタル化された『広韻』という韻書を使用することにした。『広韻』は宋朝に作られたが、『切韻』という隋朝における韻書の音韻論に従っている。『広韻』のデータセットはカールグレンの「Ancient Chinese」(中古音)再建による字音も含んでいるので、使いやすい。現代北京語データセットは規模が大きいが古語が多すぎないCC-CEDICTを使用することにした。加えて、現代北京語のある字音が歴史的な字音に基づくか確認する必要があった。そうするために、ウィクショナリーでの予想北京語反映形モジュールを使用した。モジュールが関係しないとしているが、関連すると私が思う字音が存在するものの、多くの場合、モジュールは現代北京語の字音と歴史的な字音の関連を特定することに役立っている。

To select the ETVI readings from the Guangyun dataset, I used Karlgren’s reconstruction. Since ETVI readings need to begin with non-glottal voiceless initials (幫, 滂, 非, 敷, 端, 透, 知, 徹, 精, 清, 心, 莊, 初, 生, 章, 昌, 書, 見, 溪, 曉) and end with a voiceless stop (k, p, t) in LMC, their Karlgren Ancient Chinese reconstructions will begin with one of “p, pʰ, t, tʰ, ţ, ţʰ, ts, tsʰ, s, ʧ, ʧʰ, ʃ, tɕ, tɕʰ, ɕ, k, kʰ, x” and end with one of “k, p, t”. From this, I found a total of 1143 ETVI readings. Note that some characters had more than one ETVI reading, and thus were counted more than once. Now, all I had to do was categorize each of these readings as tone 1, tone 2, tone 3, or tone 4 based on its Modern Beijing Mandarin reflex...

ETVI字音を『広韻』のデータセットから抽出するために、カールグレンの再建を使用した。ETVI字音はLMCにおいて声門音以外の無声声母(幫、滂、非、敷、端、透、知、徹、精、清、心、莊、初、生、章、昌、書、見、溪、曉)で始まり、無声閉鎖音(k, p, t)で終わるものなので、カールグレン再建は「p, pʰ, t, tʰ, ţ, ţʰ, ts, tsʰ, s, ʧ, ʧʰ, ʃ, tɕ, tɕʰ, ɕ, k, kʰ, x」の中の一つで始まり、「k, p, t」の中の一つで終わる。こうしてETVI字音を1143見つけた。ただしETVI字音を複数個持つため、複数回数えられた漢字もある。さて次はこれらのETVI字音を現代北京語反映形に基づいて第1声、第2声、第3声、第4声に分類しよう...
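The selection step described above amounts to a prefix-and-suffix test on each reconstructed reading. The following is a minimal sketch of that test, assuming each reading is available as a plain string; the function name `is_etvi` and the sample strings are hypothetical placeholders, not rows from the Guangyun dataset, whose actual file layout is not shown here.

```python
# Onsets and codas exactly as listed in the text above.
VOICELESS_ONSETS = (
    "p", "pʰ", "t", "tʰ", "ţ", "ţʰ", "ts", "tsʰ", "s",
    "ʧ", "ʧʰ", "ʃ", "tɕ", "tɕʰ", "ɕ", "k", "kʰ", "x",
)
STOP_CODAS = ("k", "p", "t")


def is_etvi(reading: str) -> bool:
    """True if the reading starts with a listed voiceless onset and ends in -k, -p, or -t.

    str.startswith and str.endswith both accept a tuple of candidates, so
    overlapping onsets such as "t", "ts", and "tsʰ" need no special handling.
    """
    return reading.startswith(VOICELESS_ONSETS) and reading.endswith(STOP_CODAS)


# Made-up sample strings, for illustration only.
samples = ["pat", "kuok", "tsʰit", "ŋak", "ʔit", "dat", "kuŋ"]
etvi = [r for r in samples if is_etvi(r)]
print(etvi, len(etvi))
```

Because the test is applied per reading rather than per character, a character with two qualifying readings contributes two entries, matching the way the 1143 total was counted.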

Results 結果
Out of the 1143 Middle Chinese ETVI readings, 422 did not have a Modern Beijing Mandarin reflex. The tonal reflexes of the remaining 721 readings are distributed as follows. Note that some Middle Chinese ETVI readings have multiple Modern Beijing Mandarin reflexes; in those cases, the reading was counted under each reflex’s tone class, which is why the counts below sum to 752 rather than 721.


Modern Beijing Mandarin Tone Class 現代北京語声調の種類 | Number of Readings 字音数 | Percentage of Total ETVI Readings that have a Modern Beijing Mandarin Reflex in this Class 総ETVI字音において、この種類に現代北京語の反映形がある割合
Tone 1 第1声 | 188 | 25%
Tone 2 第2声 | 209 | 27.79%
Tone 3 第3声 | 57 | 7.58%
Tone 4 第4声 | 298 | 39.63%
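As a quick arithmetic check, the percentages in the table can be recomputed from the counts. The sketch below suggests that the denominator is the 752 tone assignments (the sum of the four counts) rather than the 721 readings, which is what one would expect when a reading with several reflexes is counted once per tone class; the variable names are illustrative.

```python
# Counts per Modern Beijing Mandarin tone class, taken from the table above.
counts = {1: 188, 2: 209, 3: 57, 4: 298}

total = sum(counts.values())  # number of tone assignments, not of readings
shares = {tone: round(100 * n / total, 2) for tone, n in counts.items()}

print(total)
for tone, share in shares.items():
    print(f"Tone {tone}: {share}%")
```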

As for the raw data, here are the characters that have a Middle Chinese ETVI reading, sorted by their Modern Beijing Mandarin tonal reflex.


Tone 1 第1声: 七侂倐倬儵八刮削剔剟剥割劄劈劐匹厀叔吸呷哭哳唂唧嘓噏圣墼失夾尗屩崞帀帖忽怗息悉惚惜慼慽戌戚扎扑拙捉捌掇掐接插揳撆撥撲擊攉攴昔朳析柒桌椄欻歇汁沰涿淅溼滴漆潝濕焟煞熄猲瘜癤癶發的皙盋督瞎瞥矻禿窟窸答節紮結緆緝缺翕翖耷聒胳腊膝臿荅菽蜇蜥蟈蟋蠍裰裼褉褡託詘豁貼踏踢轂逼遢郭鉢錫鏑鏚闕闟隻霹飥魠鴿鵖偪出切咄唧喝堀屈扒折捌揭撮昒梜楔殺激濕皀瞥砉磕積織缺脫菥蝃蠚袼褡踔鍤馲鴰作焌磕說鎩皀著揭

Tone 2 第2声: 乀伋伯侄傕則劂劄劫卓博即卽厥叕吉哫哲喆嗝嚗嚞國執夾妲婕宊察巿帗幅幯弗彴彿得德怛急悊惪戛戠抉拂摭攫斫斮斲晫札柏柞格桔桷棘椓樴橐橘殛汲淂潔潗灼炟熁燭爵玃玦琢瘃瘚的睫矍磔福禚秷稙窋竹笰答節級紱紼結絜綍縶羯聝職脚腳膈膕艴芨茍茖荅葍葛蕨虢蚻蛤蜇蜐蝍蝠蟨袷裌襋襏觡觼訣詧諑謫譎讁讋趹跖跲蹐蹠輒輵郟郹酌鈌鎛鏃鏑钁閣閤隔革靮靼餺馘駁駮骼髆鬲鮿鴂黻䅵亟厥啄嫡幗懫折拮挌摺斮晢格決祓竺脅茀蕝袺襮覺角觖訐識足跕蹢蹶輻韍髴卒搏梲笪芾茁蝍訐蹶髴

Tone 3 第3声: 丮乞劈匹卜囑塔尺属屬帖戟撮斸椁榖榾槨法渴瀔灋獺甲癖眨矚礤穀竺筆篤索給胛脊葛蓇谷轂鐵雪靸骨鰨丿合扢癖蹶鉀銕靸驖竺蓋蹶

Tone 4 第4声: 㣟㥦㲋䘏䟆促俶倜偰僻克刷刹刻剋剒劐勅勣匧卌却卹各呫咠嚃嚳埆塙壁壑夙夾奭妾婥媟客室屑屧屮帖帹廓式彳必忒怯怵恤恪恰悐惕愙愜愨慝抶拓拭挈撻擉擘攃攝敕斥朅朔朴束柏桎梏榻榼槊樕檗歠殈毾泣洫浙渫湢湱漯漷潟潷澼灄煞燮爍牿猝玊玓珀珌璧甋畢疶瘯的皕矗矟碏碧確碻磧稷窒窣竊笏築篋篳簇籜粕粟紲綽緙緤績翣肅肸腹膈葺蒪蒴蓽蔌蔎蕮薩藒藿蘀虩蛞蛭蝮螫蟀血衋褻襞襫觱觸設訹謋謔豁豖賉赤赩赫趯跡踏踥蹕蹟蹴躄躞軾輹辵迹逖速逴適郅鄎酷釋釳銍錔鑕鑠闃闊闋闕闥陟霍鞹韘颯飭飾餗餮騭驌髮鬩鯽鱐鵲鷫黜齪䒗亍侐僻切副卒卻厝咇喝嗒嚇塞宿徹掣搉撤暍柷楅橚欶歃歙歜泄涑湱溘炙猲玊矗磕磭祝筑箾籊翛肸舄萐蔟蛭複适逴鄐鉻銍錯鍥霎魄鯽啜嗃婼數涑磕礐箑芍覆辟啜

Here are the characters that had an ETVI reading in Middle Chinese, but that ETVI reading does not have a Modern Beijing Mandarin reflex (according to CC-CEDICT and the Expected Mandarin Reflex module).


NA 該当なし: 䅵䒗仄仡佸偪側僣劼勗勺匊北卒卻吃咄咥咭哱唊啄啅喀喫嗀嗇嗒嘁嘎嘖堲妁嫡尐屈岊嶨幘庴惙惻愊慉慴憋懾戄戢扢扱拍括拮拶挶掬搉搨搭摑摘撅擖攥斠旭曲柮柷栻梜梲棤椈楅楔楫榷槭樎橜櫛櫡欂欱歃歙歰汔汨決沏洁活浹測滑澀澁濇濈濮烕焃焯熇狘獝瑟璞璱畟百眣砉砝硅秸穡窄筈策筴箑簀絀綌緁縮繘羍膊臛舄舴色茁茇莢菊萐蓄蓿蕺薛蚱蛣蛺蜨蝨蝶蠋蠚蠽袚袺襆襮訖詰諕謖貜責趵趿跕踖踘踧蹙蹜蹼躂躠躩輂轖迄迫迮适達遫郝郤鄐醭鈒鉀鉿鋏鋦鍤鍥钃閘閾隙雀霅霎霩鞠鞫頡頰顣餄髺鬄魄鰈鴰鶻麴黒鼈䗪倅借債刜別刷刺剟劃劊匃咠咭唬啑喋喌嗑嗽囁堲契嬙寨帗帥幅弔怕愬憋押拓掇搏摘擖擷攃攴晫柣柵栔栝梏梲楬檜檝欱歁歊汏浩涉淢準溥滷漷潎潚濇濼焱煠熇燋爆爚爝獺率畜畟畷癟眣矠砝稭窒筈筏筴篧粥索綴緎繣繳舴茁莔菐萴葉葺蓄蓫薔薜蜈蝍蟋袷觳詧譫豞質跮跲踖蹴躩較輟辟迮透適郃郜郝醊鈒鉆鉍鑿閉霅霫靼鞠韘韣頊頜颮餟駃騤骱鱖鵖鵯鷩齱不了副呿咋咥嗽嘖契愒扒敫柲柵樸泌潏熇爆猲瘛皀笈笮筴索翯胠蓫蛭觕詆謏適郝鄗銛馲啐啜杓樸濊芍覆詀蹻鞄揭濼芍

Conclusions 結論
Glancing at the data, reflexes of tone 4 seem to be the most common, those of tone 1 and 2 seem to be somewhat common, and those of tone 3 seem to be the least common. This is notable since according to Pulleyblank’s theory, they would all be tone 3 in Early Mandarin. There is also no apparent distribution analogous to what David Branner and Jerry Norman discussed.


Now this is where I intended to conclude my article, with empirical data on the reflexes of ETVI readings in Modern Beijing Mandarin. However, that feels a bit incomplete to me. So, if I may, I would like to provide what I see as the simplest modification of Pulleyblank’s tonal theory in order to explain the reflex tonal distribution we see here. Please note that I am not proposing a theory based on the most up to date information, rather I am just giving what I see as the necessary steps in order to make Pulleyblank’s, possibly outdated, tonal theory consistent with the observed data.


Pulleyblank weighs the evidence that the entering tone had merged with the non-entering tones by Early Mandarin, and ultimately concludes that they had completely merged. However, if the ETVI syllables had merged with tone 3 by EM, it would be impossible for the ETVI syllables to then become randomly distributed in Modern Beijing Mandarin while the non-ETVI syllables stayed as tone 3. Therefore, I conclude that the entering tone was at least somewhat distinct from the non-entering tones in EM. I propose that this distinction was a remnant glottal stop in place of the former stop consonant for the entering tone.


This remnant glottal stop is actually seen in some contemporary Mandarin dialects such as those in the Jianghuai region. With this, I propose that Early Mandarin tones had a third feature in addition to breath and length: glottalization. Since EM tone 3 had a final glottal stop in LMC, it is not much of a stretch to assume that it was also glottalized in EM (though its glottal stop was likely more salient than that of the entering tone). Thus, I think that the Early Mandarin tonal system can be described as follows:


Tone 1 第1声 | Tone 2 第2声 | Tone 3 第3声 | Tone 4 第4声 | Tone 2.1 第2.1声 | Tone 3.1 第3.1声 | Tone 4.1 第4.1声

Note that “Tone 2.1” refers to entering tone syllables with LMC breathy voiced aspirate initials, “Tone 3.1” refers to entering tone syllables with LMC voiceless initials (except the glottal stop), and “Tone 4.1” refers to entering tone syllables with LMC sonorant initials (and glottal stop). Herein, tones 2.1, 3.1, and 4.1 are distinct from tones 2, 3, and 4, but quite similar to them. This is why some works from the Yuan period separate the entering tones from the non-entering tones, but others use them to gloss each other. Something notable in this system is that each tone has at least one laryngeal feature, implying that the tones were identified by their laryngeal feature rather than lack thereof.


Next, the entering tones lost their final glottal stop in Beijing Mandarin yielding a system as follows.


Tone 1 第1声 | Tone 2 第2声 | Tone 3 第3声 | Tone 4 第4声 | Tone 2.1 第2.1声 | Tone 3.1 第3.1声 | Tone 4.1 第4.1声

Due to this, tone 2.1 merged with tone 2, and tone 4.1 merged with tone 4. However, tone 3.1, lacking all laryngeal features, was not a stable tonal category. Accordingly, the words of tone 3.1 were redistributed to the other tones inversely to the salience of the laryngeal features of the tonal system. Glottalization would be the most salient, length would be somewhat salient, and breath would be the least salient. That is to say, the lack of glottalization of tone 3.1 would make a tone 3.1 syllable unlikely to become tone 3, the lack of length of tone 3.1 would make a tone 3.1 syllable somewhat unlikely to become tone 1 or 2, and the lack of breath of tone 3.1 would make a tone 3.1 syllable only slightly unlikely to become tone 4.
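The merger logic in this proposal can be checked mechanically with a three-feature encoding. The sketch below is one possible reading of the prose, not a definitive statement of the system: the breath and length values come from the Early Mandarin tone list given earlier, while the glottalization values (absent for tones 1, 2, and 4, present for tone 3 and the three entering categories) are an assumption inferred from the text, and the names `Tone`, `after_glottal_loss`, and `merge_target` are hypothetical.

```python
from typing import NamedTuple, Optional


class Tone(NamedTuple):
    breath: bool
    long: bool
    glottal: bool


ENTERING = ("2.1", "3.1", "4.1")

# Proposed Early Mandarin system (glottal values are an assumption, see above).
EARLY_MANDARIN = {
    "1":   Tone(breath=False, long=True,  glottal=False),
    "2":   Tone(breath=True,  long=True,  glottal=False),
    "3":   Tone(breath=False, long=False, glottal=True),
    "4":   Tone(breath=True,  long=False, glottal=False),
    "2.1": Tone(breath=True,  long=True,  glottal=True),
    "3.1": Tone(breath=False, long=False, glottal=True),
    "4.1": Tone(breath=True,  long=False, glottal=True),
}


def after_glottal_loss(system: dict) -> dict:
    """Drop glottalization from the entering categories only."""
    return {
        name: tone._replace(glottal=False) if name in ENTERING else tone
        for name, tone in system.items()
    }


def merge_target(name: str, system: dict) -> Optional[str]:
    """Return the non-entering tone with identical features, if there is one."""
    for other, tone in system.items():
        if other not in ENTERING and tone == system[name]:
            return other
    return None


later = after_glottal_loss(EARLY_MANDARIN)
for name in ENTERING:
    print(name, "->", merge_target(name, later))
```

In this encoding tones 2.1 and 4.1 land exactly on tones 2 and 4, while tone 3.1 is left with no positive feature and no matching tone. The encoding does not model the difference in glottal salience between tone 3 and tone 3.1 before the loss, nor the salience-weighted redistribution itself.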


Once again, this theory is not based on the most cutting-edge research, but I can appreciate how little is needed to modify Pulleyblank’s tonal development theory in order to explain the tonal distribution of the ETVI syllables’ reflexes. Any Chinese tonal development theory worth its salt needs to be able to explain this phenomenon.


Future Research 将来の研究
While I am happy with how this analysis turned out, there is much work to be done. First, one should utilize other historical Chinese tone theories, and see how they can explain the tonal distribution of ETVI syllables in Modern Beijing Mandarin. Then, someone could repeat this methodology with a different Modern Beijing Mandarin dictionary or expected Mandarin reflex algorithm and see if it yields the same results. Finally, one could go over my data with a fine-tooth comb and see if there are any patterns that can better explain this phenomenon. There is still a lot of work to be done, so for those of you who are trying to figure out what to write your paper on, please, help yourself~


Verbal Prefixes in the Hachijō Dialect 八丈方言の動詞接頭辞

For those who have been following this blog, you may have noticed that I haven’t posted in almost a year. The two reasons why I haven’t posted in a while are that I canceled one of my articles, and that the research that’s been occupying my time hasn’t been targeted towards this blog. In April 2021 I started my tenure as a Research Student at Kyoto University. My area of research was the Hachijō Dialect. Because I wanted to focus all my attention on that research from then on, I was planning to release an article on this blog in early April. The article was going to be about gō-yōon (e.g. the historical /kwa, gwa/ readings of 火 and 瓦). I was planning to write about the orthography used to represent these sounds in the Ruiju Myōgishō 類聚名義抄 and why /kwi, gwi/ were excluded in Bjarke Frellesvig’s A History of the Japanese Language. However, as I was writing the article, I came across some evidence that disproved my hypothesis and would require me to completely reframe my findings. Since it was already April, I decided to shelve the article indefinitely and begin my research on the Hachijō Dialect. I may return to gō-yōon in a future blog post, but I’m not sure.

このブログで、私が約一年間、投稿しなかったと気づいた人がいるかもしれない。しばらく投稿しなかった二つの理由は、研究していたテーマがこのブログ向けではないということと、投稿するつもりだった記事を削除したことだ。2021年4月、京都大学で研究生になった。研究テーマは八丈方言であった。その時から、研究生としての研究のみに集中したかったから、このブログでは新たな記事を4月初旬に投稿するつもりだった。その記事は合拗音(例:「火」、「瓦」の/kwa, gwa/という歴史的な音読み)についてである。『類聚名義抄』においてこの合拗音を表す綴りと、なぜビャーケ・フレレスビッグの『日本語の歴史』(A History of the Japanese Language)で/kwi, gwi/が除外されたか、について書くつもりだった。しかし記事を書いていた時、仮説を覆す証拠を見つけたから、記事を書き直す必要があった。そうこうしているうちに、研究生としての生活が始まる四月になったから、その記事の編集を中止し、八丈方言の研究を始めた。将来、記事で合拗音について書くかもしれない。

I chose the Hachijō Dialect as the topic for my Research Student research because recently there has been some excellent scholarship by Western scholars about it and its relationship to Eastern Old Japanese (e.g. Kupchik’s On the Etymology of the Eastern Japanese Word tego and A Grammar of the Eastern Old Japanese Dialects, and Iannucci’s The Hachijō Language of Japan: Phonology and Historical Development). In particular, The Hachijō Language of Japan: Phonology and Historical Development was invaluable to my research because its appendix is a compilation of several Hachijō Dialect dictionaries. In addition, unexpectedly, in March 2021, the Hachijō Grammar Wikipedia page was created by a user known as LhikJovan. The page is primarily a summary of Akihiro Kaneda’s Basic Research on Verbs in the Hachijō Dialect 八丈方言動詞の基礎研究, but it is very well-written and was quite helpful. This page has made the inner workings of the Hachijō Dialect more accessible to a Western audience than ever before.

研究生の研究テーマは八丈方言とした理由は、最近、八丈方言及び八丈方言との上代東国方言の関係について、西洋の学者によるすばらしい論文が出たことである(例:ジョーン・カプチックによる『東国方言の語彙、「テゴ」、について』(On the Etymology of the Eastern Japanese Word tego)と『上代日本語方言の文法』(A Grammar of the Eastern Old Japanese Dialects)及び、イアヌッチによる『日本の八丈語:音韻論と歴史的発展』(The Hachijō Language of Japan: Phonology and Historical Development))。特に、『日本の八丈語:音韻論と歴史的発展』には付録がさまざまな八丈方言辞書を基にした集大成辞典だから、私の研究にとても役立つ。その上、2021年三月に、八丈方言文法のウィキペディアページがLhikJovanというユーザーによって制作された。主にこのページは章宏金田の『八丈方言動詞の基礎研究』の要約だけれど、よく書かれていて、役立った。このページは西洋人に、これまで以上に八丈方言の仕組みをわかりやすく説明した。

My research ended up focusing on verbal prefixes within the Hachijō Dialect, and it took about 9 months to conduct my research and write an Undergraduate Thesis on the findings. So, what is a verbal prefix? A verbal prefix is a morpheme which cannot stand on its own and attaches to the beginning of a verb. For a standard Japanese example, the くっ in くっ付く is a verbal prefix. くっ by itself is not a word, and it attaches to the beginning of verbs such as 付く, so it is classified as a verbal prefix. I was planning to write up the findings of my research on this blog as an article, but having completed my paper, I am reluctant to. For an article to meet the standards of this blog I need to have confidence that my findings are interesting and conclusive. If I do not think an article has both qualities, I will edit it until it does, or scrap it (like the aforementioned article on gō-yōon). On the other hand, a paper has a hard deadline, so even if its conclusions are a bit bland, it still needs to be submitted. In addition, an undergraduate paper is not expected to be perfect, and is more for practice than anything else. Unfortunately, as for my paper, because I focused on such a small subset of words, my conclusions felt suggestive rather than confirmatory. For example, as my paper focused almost exclusively on the Hachijō Dialect and did not cover other Northeastern Japanese dialects, it was not possible to separate which phenomena were loans and which were innovations. While I would have loved to incorporate as many Japanese dialects as I could, the research would have taken too long and likely become too derailed for a single undergraduate paper. However, while they are not beyond dispute, I do think the suggested conclusions of my paper are fascinating, and I certainly think that they could serve as inspiration for others’ future research. Therefore, I am going to post my Undergraduate Thesis on this blog. You can download it from here.
Enjoy reading it, but please keep in mind that it is a Research Student paper. And as for the future, I don’t want to make any promises, but I already have some ideas brewing on what my next article will be. Hope to see you there~


Kobun Tomodachi Privacy Policy

In accordance with the new Google Play Store requirements, I present the Privacy Policy for my app Kobun Tomodachi.

Kobun Tomodachi does not collect any user data.

This Privacy Policy was last updated on October 28th, 2021.

Kobun Tomodachi: Issues and Prospects 古文友達:課題と展望

Last week I released update 1.7.10.22 for my app Kobun Tomodachi. You can download it from the Play Store for Android or the Microsoft Store for Windows. I made this update mainly to change the origin character of hiragana /wu/ in accordance with the new article on this blog, “Reply to The Origin of Hiragana /wu/”, but I also added some new information, changed the backend of the kyūjitai/shinjitai converter, and added HIME-sama functionality to the Origin of Kana block. Ironically, my previous update was also inspired by a desire to change the hiragana /wu/ origin character in accordance with the first article on this blog, “The Origin of Hiragana /wu/”. Therefore, it’s clear that my app Kobun Tomodachi and my blog Kobun World are in some ways closely related. However, I have always been hesitant to write about Kobun Tomodachi on this blog because of how different their intended audiences are. Kobun Tomodachi was designed for an audience of self-studiers who were just beginning to learn Classical Japanese. It takes inspiration from websites like Tae Kim’s Blog and Imabi. In fact, Imabi is the site which inspired me to self-study Classical Japanese, and the block system of Kobun Tomodachi was very closely modeled after it. Therefore, keeping in line with these websites, I very casually cited my sources, wrote in a mixture of English and Japanese, and omitted some of the more complex details. While this style works very well for an audience of people who are just starting to learn Classical Japanese, my blog is written to present my own original research and thus requires a more academic style. Therefore, on my blog, I aim to properly cite my sources, write only in English with rōmaji, and fully explain what I’m writing about. Thus, due to this incongruence in styles and audiences, I have been hesitant to write about my app on this blog.

先週、「古文友達」という、私が作ったアプリのアップデート1.7.10.22を公開した。アンドロイド版はPlayストアから、ウインドウズ版はMicrosoftストアから、ダウンロードできる。主に、このブログの新たな記事「平仮名のわ行うの字源に対する新たな発見」に従い、アップデートを作成したが、新しい情報を加えたり、旧字体・新字体の変換の手順を改めたり、「Origin of Kana」のブロックにHIME-samaの機能を加えたりした。皮肉にも以前のアップデートも、このブログの最初の記事「平仮名のわ行うの字源」に従って、平仮名のわ行うの字源を改良するために行った。だから私のアプリ「古文友達」と私のブログ「古文・ワールド」は密接につながっていることが分かる。しかしアプリとブログの読者対象が非常に異なるので、このブログで「古文友達」について書くのをいつもためらっていた。「古文友達」は自分で古文を勉強しはじめる人達向けのアプリだ。「古文友達」は「Tae Kim’s Blog」や「今日(いまび)」などウェブサイトから影響を受けている。実際、「今日」が私にとって、古文を勉強するきっかけになった。加えて「古文友達」のブロックシステムのモデル化は「今日」の影響を受けている。これらのウェブサイトに倣って、参考文献を何気なく引用したり、英語と日本語の交じり文で書いたり、複雑な詳細の説明を省いたりした。このような「古文友達」での文章は、古文を勉強しはじめる読者にはふさわしい。しかし私のブログでは独自の研究を発表する為、よりアカデミックなスタイルが必要である。故にこのブログでは参考文献をきちんと引用し、英語とローマ字の交じり文で書き、詳細を詳しく説明している。だから、読者対象と文章のスタイルの差から、このブログで私のアプリについて書くのをためらっていた。

However, I think it is time to stop ignoring my app and write at least something about it on this blog. After I initially released the app in 2018, I posted an article about it on a friend’s blog (where I had previously written an analysis of a Classical Japanese poem around 2016). I also did a presentation on the app and made some handouts for it in 2018. I packaged this promotional material into a zip file so that those who are interested in the origin of this app can check it out. In addition, I included the original logo, which was made by a friend and served as the basis for the current logo. I would like to note that some of this promotional material may contain mistakes, since I wrote it so long ago. You can download this content from here.


Going forward, I am not particularly interested in taking Kobun Tomodachi much further. I think it has about as much information and functionality as I would want it to have. However, I also think it is quite likely that I will update it again in the future to maintain support. Even now, I already have ideas on what I would want to change for next time, so I guess we will have to wait and see. But, warts and all, Kobun Tomodachi was my first real publication on the Japanese language, and for that, it will always hold a special place in my heart.


Microsoft Word Right Vertical Zhuyin Tone Mark Workaround マイクロソフト・ワード右縦寄せ注音声調記号ワークアラウンド

Introduction 紹介
To keep myself occupied during the pandemic, I have been reading and translating kanbun 漢文 and kanshi 漢詩. These refer to texts and poetry written in Classical Chinese, some of which were written by Japanese authors. Following academic tradition, when I translate these texts into English, I render proper nouns in Hanyu Pinyin. For example, I would render 蓬莱 as Pénglái or Penglai rather than Hōrai. When I encountered a proper noun while reading a Classical Chinese text, I originally would gloss the character with Pinyin to remind myself of the Chinese reading for my translation. For horizontal texts, Pinyin is quite easy to read as it follows the same text orientation as the Latin alphabet, e.g. (zhī). But as Classical Chinese is traditionally written vertically, Pinyin can be quite difficult to read. One can write the Pinyin gloss vertically without rotation (zhī) or rotated 90 degrees (zhī). I find both formats quite cumbersome to read. Therefore, I began to use Zhuyin Fuhao 注音符号.


Zhuyin Fuhao 注音符号
Zhuyin Fuhao, also known as Bopomofo or Zhuyin, is a transliteration system for Mandarin Chinese. In 1912 the government of the Republic of China established the Commission on the Unification of Pronunciation, an organization to standardize Mandarin pronunciation and create a Mandarin phonetic system. The system they created, Zhuyin, was used in mainland China until 1958, when Hanyu Pinyin became the official transliteration system of the People’s Republic of China. However, Zhuyin continues to be used in Taiwan for teaching Mandarin pronunciation.


As for the actual system itself, each Zhuyin character represents either a Mandarin initial, or a medial/final. For example, the Zhuyin character ㄇ is the initial /m/ while the Zhuyin character ㄚ is the final /a/. Therefore, the reading of 媽, /ma/, is written in Zhuyin as ㄇㄚ. Here is a chart of Zhuyin characters with their Pinyin and IPA equivalents.



Zhuyin is flexible because it can be written both vertically and horizontally. For example, one can gloss 媽 as ㄇㄚ in horizontal writing and ㄇㄚ in vertical writing. Zhuyin can be considered analogous to katakana, because not only are the characters used for phonetic transcription, but they also originate from Chinese characters. For example, the Zhuyin character ㄖ comes from the Chinese character 日 and the Zhuyin character ㄓ comes from the Chinese character 之.


(I would like to have discussed the promulgation and origin of the Zhuyin characters more in-depth, but I encountered much difficulty in finding primary sources on the creation of Zhuyin and its adoption in the early 20th century. Maybe I will return to this one day.)


I saved the discussion of Zhuyin tones for last because it is the most relevant to this article and thus requires a more in-depth discussion. To indicate tone, Zhuyin uses four tone marks. They are as follows.


Tone 声調 | Mark
1 | Omitted 省かれた
2 | ˊ (U+02CA)
3 | ˇ (U+02C7)
4 | ˋ (U+02CB)
Neutral | ˙ (U+02D9)

According to The Manual of the Phonetic Symbols of Mandarin Chinese by the Ministry of Education of the Republic of China, “Marks of the four tones should be noted at the upper-right corner of Bopomofo in both portrait and landscape text.”



However, “The mark of neutral tone should be
   a. Noted on the top of Bopomofo in portrait text. For example:
   a. 縦書きでボポモフォの上に記すものとする。例えば

   b. Noted at the very front of Bopomofo in landscape text. For example:
   b. 横書きでボポモフォの最初に記すものとする。例えば

Using Microsoft Word to Gloss Chinese Characters マイクロソフト・ワードで漢字に注音を振る方法
So now that we have a basic understanding of Zhuyin, let’s talk about how I ended up finding this problem in the first place. In addition to kanbun and kanshi, I also have been reading Classical Chinese poems aloud in Mandarin. However, because I find memorizing tones to be quite difficult, I prefer to gloss every single word. Originally, I would gloss them manually, but soon I found that Microsoft Word could do this for me.


Now I’m going to discuss how to install Zhuyin support for Microsoft Word. Because Microsoft Office and Windows are frequently being updated, the process to set up Zhuyin glossing for you may be different than what I will describe. If you are having difficulty, please leave a comment and I will try my best to help.


Microsoft Office does not include a Chinese character dictionary of its own, so to provide one you need to install the Windows Chinese (Traditional, Taiwan) language pack. To do so, open Word, click on Review, then Language, then Language Preferences, and then Install additional keyboards from Windows Settings. From there, press Add a language, and select Chinese (Traditional, Taiwan).








After that finishes installing, you need to install the Traditional Chinese Language Pack for Office. You can do this by opening Word, clicking on Review, then Language, then Language Preferences, and finally Install additional display languages from Office.com. From that list choose Chinese (Traditional) and run the file from Office.com to install the language pack.



インストールを終えた後、「オフィス用中国語 (繁体字)パック」をインストールする必要がある。インストールするために、ワードの「校閲」をクリックし、「言語」をクリックし、「言語の設定」をクリックして、「Office.comから追加の表示言語をインストール」をクリックする。その表から「中国語 (繁体字)」を選んで、Office.comからファイルを実行する。



Now that the language pack is installed, let’s try to gloss some characters. First, copy and paste some Chinese characters into Microsoft Word. If Word does not automatically set the proofing language to Chinese (Taiwan), do so manually by highlighting the text and clicking Review, then Language, and then Set Proofing Language. From there select Chinese (Taiwan) and hit OK.




Now finally, highlight the text and click the Phonetic Guide button.




Thereupon this menu will open.




Word is now using Windows’ Zhuyin dictionary to easily gloss the text for you. From my experience, Word is even pretty good at glossing Chinese characters with multiple Mandarin readings, known as duōyīnzì 多音字 or pòyīnzì 破音字, based on the context of the sentence. The only downside is that this method cannot gloss more than about 30 characters at a time, so it is necessary to click the Phonetic Guide button for each sentence.


So let’s look at how Word glosses Li Bai’s 李白 8th century poem Quiet Night Thought 靜夜思.
Please note that the text orientation is currently horizontal. The first line is 床前明月光.



Overall, this glossing is pretty good in my opinion. I would note that the tone marks are a little high, but that’s just a nitpick. As for the readings themselves, they are also pretty good. Word was able to correctly gloss the duōyīnzì 地 and 頭. The only point of contention is 地上. While I have seen this word glossed as ㄉㄧˋㄕㄤˋ (dìshàng) in some sources, the more contemporary pronunciation is ㄉㄧˋ˙ㄕㄤ (dìshang). You can modify the gloss by highlighting the text and pressing Phonetic Guide again.

大抵上手く、注音を振ったと思う。声調記号が少し高すぎるが、まあいいだろう。候補に出てきた北京語の読みもいいと思う。ワードは多音字の「地」と「頭」に正しい読みの注音を振った。問題の一つは「地上」である。この言葉の読みは「ㄉㄧˋㄕㄤˋ」(dìshàng)と振られたが、より現代的な読みは「ㄉㄧˋ˙ㄕㄤ」(dìshang)である。読みを変えるためにテキストをハイライトして、もう一度「ルビ」をクリックする。
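For readers who would rather script this correction, here is a minimal sketch that uses Word's Range.PhoneticGuide method to re-gloss the selected character. The macro name is hypothetical, and I am assuming that you have selected only the single character 上 and that calling PhoneticGuide on an already glossed character replaces its gloss. The alignment, raise and font size are left at Word's defaults, which may differ from what the Phonetic Guide dialog chooses for Zhuyin.

```vba
' Hypothetical helper: gloss the selected character with the neutral-tone
' reading of 上 (neutral tone dot + two Zhuyin letters).
Sub GlossSelectionAsNeutralShang()
    Dim gloss As String

    ' U+02D9 (neutral tone dot), U+3115 and U+3124 (the Zhuyin for "shang")
    gloss = ChrW(729) & ChrW(12565) & ChrW(12580)

    ' Only act on an ordinary text selection, not on an insertion point
    If Selection.Type = wdSelectionNormal Then
        Selection.Range.PhoneticGuide Text:=gloss
    End If
End Sub
```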

After changing the gloss of 地上, here is the result.


But while this looks good, I would argue that Pinyin should be the preferred phonetic guide system of horizontal text, on the basis of how widespread it is. The only reason I am even interested in using Zhuyin is because of the awkwardness of using Pinyin with a vertical text orientation. (That’s also why I will not be commenting on the strange glitches Word has in rendering horizontal Zhuyin). So, let’s change the text orientation to Vertical and see how it affects the Zhuyin. You can change the text direction by going to Layout, and then Text Direction.




This yields


If you are unfamiliar with Chinese, you may not notice the difference at first, but compare the Zhuyin in the vertically oriented text to that of the horizontally oriented text. The tone of 床 appears to be 4th tone in the vertical text and 2nd tone in the horizontal text. Looking at 舉 reveals the problem: the Zhuyin tone marks are rotated 90 degrees in vertical text orientation. Judging from old Microsoft Community forum threads, this bug has existed in Word for a long time. I imagine the user base for this feature is so small that there is little demand to correct it, even though the code change required would likely be quite small. I could get over the tone marks being a tad too high, but the fact that the rotated 4th tone mark looks like the standard 2nd tone mark, and vice versa, is simply unacceptable.


I tried to find some way to fix this bug, but short of patching Word, I do not think this bug can be fixed by a user. However, I have created some workarounds to this problem which have decent results.


These workarounds come in the form of macros. A macro is a scripted sequence of computer instructions that can be run on demand. For example, there can be a macro that capitalizes the name of every file in a folder. While a human could manually go through each file and capitalize its name, a macro can save time by doing this task automatically. Microsoft Office macros are written in a programming language called Visual Basic for Applications (VBA). I should note that Office macros have been used as a vector for computer viruses before, so one should always take caution when enabling or running Office macros from an unknown source. To create a macro in Word, go to View then Macros.


このワークアラウンドはマクロである。マクロとはインプットできる命令の順序である。例えば、パソコンでいうと、フォルダーの中にあるファイルの名前の頭文字を大文字にするマクロを作れる。人は手動で各ファイルの名前の頭文字を大文字にすることができるけれど、マクロは自動で頭文字を大文字にすることを可能にするため、使い手の手間が省ける。マイクロソフト・オフィスのマクロはビジュアルベーシック・フォー・アプリケーションズ(Visual Basic for ApplicationsあるいはVBA)というプログラミング言語で書かれた。オフィスのマクロを使う前に、注意するべきことがある。オフィスのマクロを実行すると、ウィルスをもらうことがある。だから起源不明なオフィスのマクロを実行する時には、注意する必要がある。マクロを作るために、ワードで「表示」をクリックし、そして「マクロ」をクリックする。


Then name your macro and press Create.




This will cause a window titled Microsoft Visual Basic for Applications to open. In this window there will be a text box, and in that text box, you paste the code for your macro.

するとMicrosoft Visual Basic for Applicationsというウィンドウが開かれる。このウィンドウにあるテキストボックスにマクロのコードを貼り付ける。


To run your macro, go to the Macros menu again, select the macro you want, and hit Run.




Before we begin discussing the macros, I would like to note that my macros involve replacing all instances of certain text sequences. While it is possible to undo a macro’s changes using the standard Word undo button, you will need to press undo for each replaced sequence, which can be very cumbersome. So, I recommend backing up your original text before running any of these macros, especially when you are not exactly sure which macro would best suit your needs.
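One possible way to ease the undo problem is Word's UndoRecord object (available in Word 2010 and later), which groups everything a macro does into a single entry on the undo stack. The wrapper below is only a sketch: the name RunZhuyinMacroAsOneUndo and the undo label are my own, and it assumes a macro named ReverseTones_Bopomofo (the first workaround in this article) has already been created. A backup is still the safer option.

```vba
' Hypothetical wrapper: run one of this article's macros so that all of
' its replacements can be reverted with a single press of Undo.
Sub RunZhuyinMacroAsOneUndo()
    Dim undo As UndoRecord
    Set undo = Application.UndoRecord

    ' Everything between StartCustomRecord and EndCustomRecord is
    ' recorded as one undo step with the given label.
    undo.StartCustomRecord "Zhuyin tone mark workaround"

    ' If the macro fails, still close the undo record before exiting.
    On Error GoTo Finish
    Application.Run "ReverseTones_Bopomofo"

Finish:
    undo.EndCustomRecord
End Sub
```

To wrap a different macro, change the name passed to Application.Run.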


So, without further ado, let’s get macroing.

Workaround #1: Reverse Tones 2 and 4 ワークアラウンド#1:第二声記号と第四声記号を交換
As I said before, the most egregious problem with the rotated tone marks is that the rotated 2nd tone mark looks like the standard 4th tone mark and vice versa. While the 3rd tone mark looks unappealing, at least there is no confusion about what it is. Therefore, the simplest solution is to change the 2nd tone marks into 4th tone marks and the 4th tone marks into 2nd tone marks. To do this:
   1. I replace all 2nd tone marks (U+02CA) with the string “UPTONE”
   2. I replace all 4th tone marks (U+02CB) with 2nd tone marks
   3. I replace all instances of the string “UPTONE” with 4th tone marks

   1. すべての第二声記号(U+02CA)を「UPTONE」という文字列とする
   2. すべての第四声記号(U+02CB)を第二声記号とする
   3. すべての 「UPTONE」という文字列を第四声記号(U+02CB)とする

Keep in mind that if your document has the string “UPTONE” in it, that string will be changed into a 4th tone mark. Also, in case you do not like the results, running this macro again will revert the tones to their original form. This macro also runs very quickly because it only needs to search the document three times.
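The three-step swap and its reversibility can be checked on an ordinary string before touching a document. The following is a minimal sketch with a hypothetical function name, SwapTones, which applies the same three replacements using VBA's built-in Replace function instead of Word's Find.

```vba
' Hypothetical helper: apply the three-step swap to a plain string.
Function SwapTones(ByVal s As String) As String
    Const PLACEHOLDER As String = "UPTONE"
    s = Replace(s, ChrW(714), PLACEHOLDER)   ' 2nd tone mark -> placeholder
    s = Replace(s, ChrW(715), ChrW(714))     ' 4th tone mark -> 2nd tone mark
    s = Replace(s, PLACEHOLDER, ChrW(715))   ' placeholder   -> 4th tone mark
    SwapTones = s
End Function

Sub SwapTonesDemo()
    Dim gloss As String
    ' U+3109, U+3127 and the 4th tone mark: the Zhuyin gloss of "di" (4th tone)
    gloss = ChrW(12553) & ChrW(12583) & ChrW(715)

    ' One pass turns the 4th tone mark into a 2nd tone mark
    Debug.Assert SwapTones(gloss) = ChrW(12553) & ChrW(12583) & ChrW(714)
    ' A second pass restores the original gloss
    Debug.Assert SwapTones(SwapTones(gloss)) = gloss
    ' Text that already contains the placeholder is altered, as warned above
    Debug.Assert SwapTones("UPTONE") = ChrW(715)
End Sub
```

Running SwapTonesDemo from the VBA editor should finish silently; execution only pauses if one of the assertions fails.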


The output of this macro is


The code for this macro is

Sub ReverseTones_Bopomofo()
    ' Swap the 2nd tone mark (U+02CA = 714) and the 4th tone mark
    ' (U+02CB = 715) in every part of the document.
    Dim rngStory As Range
    For Each rngStory In ActiveDocument.StoryRanges
        ' Step 1: 2nd tone marks -> placeholder string
        With rngStory.Find
            .Text = ChrW(714)
            .Replacement.Text = "UPTONE"
            .Wrap = wdFindContinue
            .Execute Replace:=wdReplaceAll
        End With
        ' Step 2: 4th tone marks -> 2nd tone marks
        With rngStory.Find
            .Text = ChrW(715)
            .Replacement.Text = ChrW(714)
            .Wrap = wdFindContinue
            .Execute Replace:=wdReplaceAll
        End With
        ' Step 3: placeholder string -> 4th tone marks
        With rngStory.Find
            .Text = "UPTONE"
            .Replacement.Text = ChrW(715)
            .Wrap = wdFindContinue
            .Execute Replace:=wdReplaceAll
        End With
    Next rngStory
End Sub

While not perfect, someone with a rudimentary understanding of Zhuyin tone marks should have no difficulty in identifying the tone of the glossed character. But even if understandable, the rotated 3rd tone mark still does not conform to the Zhuyin standard. The quest to properly rotate the 3rd tone mark led me to Workaround #2.


Workaround #2: Use Combining Characters ワークアラウンド#2:結合文字の使用
Without going into character encoding too much, a Combining Character is a diacritic that combines with another character. For example, the combining character ◌̅ (U+0305) can combine with ‘e’ to create e̅. These contrast with Spacing Modifier Letters, the type of independent characters which the Zhuyin tone marks are. And fortunately, when the combining characters ◌̀ (U+0300), ◌́ (U+0301), and ◌͐ (U+0350) are rotated 90 degrees, they look very similar to the Zhuyin 2nd, 4th, and 3rd tone marks respectively. (My macros actually insert U+0340 and U+0341, the combining grave and acute tone marks, which are canonically equivalent to U+0300 and U+0301.)
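The prose in this article gives code points in hexadecimal U+ notation, while the macros pass decimal numbers to ChrW. To make it easier to compare the two, here is a small sketch that converts the decimal values into U+ labels. The names CodePointLabel and ListToneMarkCodePoints are hypothetical, and the helper only handles code points that fit in four hexadecimal digits.

```vba
' Hypothetical helper: format a decimal code point as "U+XXXX".
Function CodePointLabel(ByVal code As Long) As String
    CodePointLabel = "U+" & Right$("0000" & Hex$(code), 4)
End Function

Sub ListToneMarkCodePoints()
    Dim codes As Variant
    Dim i As Long

    ' The decimal values that appear in this article's macros
    codes = Array(711, 714, 715, 832, 833, 848)

    ' Prints each value next to its label in the Immediate window:
    ' 711 U+02C7, 714 U+02CA, 715 U+02CB, 832 U+0340, 833 U+0341, 848 U+0350
    For i = LBound(codes) To UBound(codes)
        Debug.Print codes(i), CodePointLabel(codes(i))
    Next i
End Sub
```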


But to which characters will these Combining Characters combine? For the output most accurate to the Zhuyin standard, they should combine with the final Zhuyin character. The simplest way to do this is to look for every combination of Zhuyin final and Zhuyin tone mark. My macro pairs each of the 23 Zhuyin characters that can end a syllable, ㄓ (U+3113) through ㄩ (U+3129), with each of the three tone marks, so the entire document needs to be searched sixty-nine times. Therefore, this macro can take some time to run.
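If the sixty-nine searches prove too slow, Word's wildcard search might reduce them to three, because a bracketed range can match any of the 23 characters at once and \1 can reinsert whichever one was found. The macro below is a sketch under that assumption, with a hypothetical name. I have kept the same search and replacement strings as the macro in this section, but I cannot promise that wildcard matching behaves identically inside phonetic guide text, so try it on a copy of your document first.

```vba
' Hypothetical variant: one wildcard search per tone mark instead of one
' search per combination of Zhuyin character and tone mark.
Sub Vertical_Bopomofo_Wildcard()
    Dim rngStory As Range
    Dim marks As Variant, combining As Variant
    Dim finals As String
    Dim i As Long

    ' Wildcard range covering U+3113 through U+3129
    finals = "[" & ChrW(12563) & "-" & ChrW(12585) & "]"
    marks = Array(714, 715, 711)      ' 2nd, 4th and 3rd tone marks
    combining = Array(832, 833, 848)  ' their combining replacements

    For Each rngStory In ActiveDocument.StoryRanges
        For i = LBound(marks) To UBound(marks)
            With rngStory.Find
                .ClearFormatting
                .Replacement.ClearFormatting
                .MatchWildcards = True
                ' (group 1 = the Zhuyin character) followed by the tone mark
                .Text = "(" & finals & ")" & ChrW(marks(i))
                ' combining mark, then the character captured as group 1
                .Replacement.Text = ChrW(combining(i)) & "\1"
                .Wrap = wdFindContinue
                .Execute Replace:=wdReplaceAll
            End With
        Next i
    Next rngStory
End Sub
```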


The output for this macro is


The code for this macro is

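Sub Vertical_Bopomofo()
    ' Replace each Zhuyin character + tone mark pair with a combining
    ' mark followed by the same Zhuyin character.
    Dim rngStory As Range
    Dim zhuyinchar As Long
    ' 12563 (U+3113) to 12585 (U+3129): the characters that can end a syllable
    For zhuyinchar = 12563 To 12585
        For Each rngStory In ActiveDocument.StoryRanges
            ' 2nd tone mark (714) -> combining grave tone mark (832)
            With rngStory.Find
                .Text = ChrW(zhuyinchar) & ChrW(714)
                .Replacement.Text = ChrW(832) & ChrW(zhuyinchar)
                .Wrap = wdFindContinue
                .Execute Replace:=wdReplaceAll
            End With
            ' 4th tone mark (715) -> combining acute tone mark (833)
            With rngStory.Find
                .Text = ChrW(zhuyinchar) & ChrW(715)
                .Replacement.Text = ChrW(833) & ChrW(zhuyinchar)
                .Wrap = wdFindContinue
                .Execute Replace:=wdReplaceAll
            End With
            ' 3rd tone mark (711) -> combining right arrowhead above (848)
            With rngStory.Find
                .Text = ChrW(zhuyinchar) & ChrW(711)
                .Replacement.Text = ChrW(848) & ChrW(zhuyinchar)
                .Wrap = wdFindContinue
                .Execute Replace:=wdReplaceAll
            End With
        Next rngStory
    Next zhuyinchar
End Sub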

I like how the lines are much closer together than when using Spacing Modifier Letters, but unfortunately this leads to the tone marks occasionally being quite difficult to see. For example, the tone mark on 是 can be easy to miss, especially if one prints out the page. On top of this, there is the quite lengthy search process. To combat these issues I present Workaround #3.


Workaround #3: Use Combining Characters after the Final ワークアラウンド#3:韻母後における結合文字の使用
To alleviate the two problems of the second macro, namely the difficulty in seeing the tone marks and the long run time, I created my third macro. This macro also uses Combining Characters, but instead of attaching them to the Zhuyin final, it attaches them to a whitespace character (U+0020) which is appended after the final. This macro only searches the document three times, so it is quite fast.


This macro results in


The code for this macro is

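Sub Vertical_ModifyLast_Bopomofo()
    ' Replace each tone mark with a combining mark followed by a space.
    Dim rngStory As Range
    For Each rngStory In ActiveDocument.StoryRanges
        ' 2nd tone mark (714) -> combining grave tone mark (832) + space
        With rngStory.Find
            .Text = ChrW(714)
            .Replacement.Text = ChrW(832) & " "
            .Wrap = wdFindContinue
            .Execute Replace:=wdReplaceAll
        End With
        ' 4th tone mark (715) -> combining acute tone mark (833) + space
        With rngStory.Find
            .Text = ChrW(715)
            .Replacement.Text = ChrW(833) & " "
            .Wrap = wdFindContinue
            .Execute Replace:=wdReplaceAll
        End With
        ' 3rd tone mark (711) -> combining right arrowhead above (848) + space
        With rngStory.Find
            .Text = ChrW(711)
            .Replacement.Text = ChrW(848) & " "
            .Wrap = wdFindContinue
            .Execute Replace:=wdReplaceAll
        End With
    Next rngStory
End Sub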

I don’t think this macro completely resolves the difficulty in seeing the tone marks, for example 床, but for other characters, like 是, it does help.


Conclusion 結論
At the end of the day, these macros are only workarounds to the problem. If you want true Zhuyin support, you are likely better off using another program rather than Microsoft Word. However, if you just need basic vertical Zhuyin support, I think my macros make Word an option. They may not have the cleanest execution, but at least their results do adhere to the Zhuyin standard. I also would like to note that I have very limited experience in VBA, so I am sure that there is some way to make this code more efficient or the output more presentable. If you have any suggestions, please leave a comment. Also, if you are interested in getting this bug fixed, feel free to leave feedback within Microsoft Word (Help, Feedback, and then I don’t like something) and also upvote this UserVoice suggestion.




I am pessimistic that this issue will be resolved because it affects so few people, but because I imagine the fix to be quite simple, who knows, maybe we’ll get lucky.