Early Japanese Diplomatic Codes: 1874-1933

Table of Contents

*Japanese Diplomatic Code of 1874 ... a small code of some 261 elements

Some Domestic Ciphers: 1876-1879 ... substitution ciphers for kana syllabary

*Japanese Diplomatic Code of 1885 ... five-digit code words for English vocabulary

*Kana Code of 1904 ... two digits representing kana

**Dictionary Code Words (Used in 1904) ... dictionary code words for English vocabulary

An Encoded Telegram from 1918 ... ten-letter code words (in transmission)

Japanese Codes Broken by Yardley's Cipher Bureau: 1919-1929

Variety of Diplomatic Codes: 1919-1933


Another article in Japanese supersedes the sections marked with * and supplements the section marked with **.

Japanese Diplomatic Code of 1874

There is a document titled "Enlargement and Revision of Telegraphic Code" from 1874 (Meiji 7) in the Diplomatic Archives of the Ministry of Foreign Affairs of Japan (formerly known as the Diplomatic Record Office in English) (JACAR (Japanese Center for Asian Historical Records) Ref.A03030300000). The enlarged code is said to have three additional lines and six revisions in red, which, however, cannot be identified in the grayscale photos.

The first page has a table with rows and columns numbered 1-9 for expressing kana (Japanese characters) and some words and names with two-digit combinations. Kana characters to be expressed are in regular I-RO-HA order, followed by kana with voiced consonants in A-I-U-E-O order.

The next two pages continue the table with 20 further columns headed by A-V (no I and U) as well as slightly disordered kana ("i" "ro" "mo" "ho" ...) to express further words and names with letter-digit combinations. Thus, "mo-7 to-1" represents "denshin shodakuseri" [telegram accepted]. Presumably, alphabetic letters could be used instead of kana (e.g., "C7 F1").

Some Domestic Ciphers: 1876-1879

The 1876 revision of the secret communication system of the Ministry of War introduced a substitution cipher as follows.

Plaintext: i ro ha ni ho he to chi ri nu ru ...
Cipher: chi to he ho ni ha ro i su se mo ...

While the plaintext row lists kana (Japanese characters) in ordinary I-RO-HA order, their substitutes are arranged in reverse I-RO-HA order and shifted by 7. The shift may also be another (odd) number, which provides a polyalphabetic substitution with 22 different substitution tables called "1st cipher", "2nd cipher", etc.
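The reversed-and-shifted table can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the alphabet is taken to be the 47 kana of the I-RO-HA sequence in romanized form (without the final "n"), because that choice reproduces the sample row above, and the function names are hypothetical.

```python
# Assumption: the 47-kana I-RO-HA sequence (romanized, no final "n").
IROHA = ("i ro ha ni ho he to chi ri nu ru wo wa ka yo ta re so tsu ne na "
         "ra mu u wi no o ku ya ma ke fu ko e te a sa ki yu me mi shi we hi "
         "mo se su").split()

def substitution_table(shift=7):
    """Map the kana at position p to the kana at position (shift - p) mod 47."""
    n = len(IROHA)
    return {IROHA[p]: IROHA[(shift - p) % n] for p in range(n)}

def encipher(kana, shift=7):
    """Substitute each kana of a list using the reversed, shifted sequence."""
    table = substitution_table(shift)
    return [table[k] for k in kana]

# Reproduces the cipher row of the table above for the first eleven kana.
print(" ".join(encipher(IROHA[:11])))
```

Because the cipher sequence is simply the plaintext sequence reversed and rotated, the table built this way is reciprocal: applying `encipher` twice with the same shift returns the original text.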

The 1877 revision of the Home Ministry cipher used a substitution table associating characters in A-I-U-E-O order with those in reverse A-I-U-E-O order.

The Papers of Mishima Michitsune include several ciphers for communication between the Home Ministry, the Finance Ministry, and prefectural governors' offices. Among simple substitution ciphers like those above, there is a cipher wheel issued in November 1879 (Meiji 12). The inner wheel has kana in regular I-RO-HA order for plaintext characters, while the outer wheel has the same sequence of kana in reverse order. The association between plaintext characters and cipher characters may be changed by rotating the wheel. Under the first character "i" of the inner wheel is an opening, through which one of the numbers 1-5 can be seen; this number could be used to identify one of five possible variations. (Somehow only five positions were used.) (See here and English version.)

Papers discovered in 2014 include a similar cipher wheel used by Iwakura Tomomi in 1877 (See, e.g., here (English) and here (Japanese)).

Japanese Diplomatic Code of 1885

A Guideline for a Japanese Diplomatic Code of 1904 (translated below) refers to an Old Code of 1885 (Meiji 18) for English vocabulary. (At the time, Japanese diplomats used English in coded telegrams (Inaba 1993 p. 241).) It had the following code words:

15768 and
15314 are
22534 by
24443 characterized
52115 Japanese Gov't
57941 moderation
66719 The proposals of
68448 reasonableness
93657 8

Apparently, this assigns five-figure code numbers to alphabetically arranged plaintext words and phrases. Towards the end of the volume, Page 863 provides code numbers for numerals such as "93657" for 8.

If the full range of 00001 to 99999 was used, the size of the vocabulary was comparable to that of leading commercial codes at the time such as the A1 Code of 1887/1888 (almost 88,000 entries) and the ABC Code (5th ed.) of 1901 (about 103,000 entries). It should be noted, however, that most commercial codes in Britain and America, including the A1 and ABC, used code words, which were less prone to transmission errors than figures (see another article).

Kana Code of 1904

In 1904 (Meiji 37), a new code for kana was prepared. It encodes kana with two-digit code numbers, which were to be regrouped into five-figure code groups for transmission. (Telegraphing figures was charged at a rate of five figures to one word.)

The assignment of numbers appears to follow the I-RO-HA order of the kana syllabary (i ro ha ni ho he to chi ri nu ru wo ...) in a boustrophedon manner. Kana with identical sounds in modern Japanese are not repeated. For example, 88 is not "wo" but "o" and, between "no" and "ku", "o" is omitted. These omissions as well as the A-I-U-E-O order in the section of voiced consonants (ga gi gu ge go) are also seen in the 1874 code above.


It is interesting to note that the sole use in the example given in the guideline (translated below) for the code number "93" for the numeral "one" is in the context "ichi-chi", which represents a Japanese word "itchi" (agree), which is written with a kanji "ichi" (meaning "one") and another "chi".

The guideline specifies that an English text in a telegram should be encoded with the old code of 1885. The mixture of two-figure code numbers for kana and five-figure code numbers for English text was to be regrouped into five-figure code groups for transmission, as shown below, where bars (|) show the boundaries of semantic units. Five-figure code words, separated across the five-figure groups, are shown with color.

69|37|7 9|68|57 06|45|0 3|15|29 70|936 82|667 44|521 40|153 39|244 68|225 59|579 66|157 93|684 73|69|0 9|37|39 88|09|7 6|09|74 38|15|6 6|09|93 79|25|8 9|7070
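The regrouping can be checked mechanically. The sketch below rebuilds the groups above from the two-digit kana numbers, the Version A indicator 70, and the Old Code numbers augmented by 25 as prescribed in the guideline translated below. The function names are hypothetical, and the handling of a shortfall other than four digits (truncating the filler 7070) is an assumption, since the guideline gives only the one example.

```python
def old_code(number):
    """Old Code (1885) number augmented by 25, zero-padded to five digits."""
    return f"{number + 25:05d}"

def regroup(units, filler="7070"):
    """Concatenate code numbers and cut the digit stream into five-figure groups."""
    digits = "".join(units)
    shortfall = -len(digits) % 5
    # Assumption: if fewer than four filler digits are needed, the filler is truncated.
    digits += filler[:shortfall]
    return [digits[i:i + 5] for i in range(0, len(digits), 5)]

kana_before = "69 37 79 68 57 06 45 03 15 29".split()
# Word count (8), then the eight English words/phrases, all in the Old Code.
english = [old_code(n) for n in
           (93657, 66719, 52115, 15314, 24443, 22534, 57941, 15768, 68448)]
kana_after = "69 09 37 39 88 09 76 09 74 38 15 66 09 93 79 25 89".split()

groups = regroup(kana_before + ["70"] + english + kana_after)
print(" ".join(groups))
```

The stream has 101 digits, so four filler digits are needed and the last group becomes 97070, as in the example.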

Copies of this same code were provided to the legations and consulates overseas. The idea of using separate codes for different channels was not yet introduced.

A Guideline for the Japanese Diplomatic Code of 1904

The following is a translation of the full text of a guideline for the Japanese Diplomatic Code of 1904. (The interlined format of the encoding example is not reproduced here.) The original is in the Diplomatic Archives (7-1-4-27). (JACAR B13080297300)

While the newly prepared Kana Code in Versions A and B on separate sheets, from No. __ to No. __, each __ copies, is being delivered, it shall be used in accordance with the following guideline for use.

1. The present code can be used from the day it is received.

2. Version A should be used during six of the months: January, March, May, July, September, and November and Version B should be used during six of the months: February, April, June, August, October, and December until otherwise instructed.

3. The present code constitutes one kana [character] (and several frequently used words and phrases) with two digits. When transmitting an encrypted telegram, they should be divided into five-figure groups. When there is an extra less than five, the vacancy should be filled with 7070 when Version A is used and with 5858 when Version B is used.

4. In preparing a telegram with the present code, for portions desired to be written in English, the numerical codes of the Code of the 18th Year of Meiji [the 1885 Code] (referred to as the Old Code) should be used with it, in which case the numerical codes corresponding to the word communicated plus 25 should be used.

5. In the case of Paragraph 4, a numerical code for "the following ... words are in the Old Code", that is, 70 for Version A and 58 for Version B should be placed where transition to English is about to be made. The above-mentioned "... words are in the Old Code" does not refer to the number of words in the original writing but the number of code words, for which the number desired is to be found on page 863 of the Old Code and is augmented by +25 according to the provision of the last clause of Paragraph 4 and the resulting number shall be used.

6. When the present code is used, the frequently used opening "Kiden dai 20 go (or 10-gatsu 11-nichi zuke kiden)" [Your telegram No. 20 (or Your telegram dated 11 October)] etc. should be written in plaintext as "Kiden 20" or "Kiden 11/10" and "Our telegram No. 20 (or Our Telegram dated 11 October)" should be written in plaintext as "Ohden 20" or "Ohden 11/10".

7. Since the numerical codes of the Old Code are organized in five digits, when they are used with the present code, even numbers less than ten thousand must always be padded at the beginning with such a number of zero(s) enough to make them five digits. For example, 01356 00456 00032.

8. An example is shown below.
When a minister to the United States sends the Foreign Minister a telegram "Regarding your telegram No. 35, newspapers here all agree in reporting 'The proposals of the Japanese Gov't are characterized by moderation and reasonableness.'", its use with the present code results in the following.

Kiden 35 (or 12/10, i.e., 12 October)
69|37|7 9|68|57 (to-u-chi-ka-ku) 06|45|0 3|15|29 (shi-n-bu-n-ha) 70(the following ... words are in the Old Code) 93682(8 (augmented with +25)) 66744(The proposals of) 52140(Japanese Gov't) 15339(are) 24468(characterized) 22559(by) 57966(moderation) 15793(and) 68473(reasonableness) 69(to)|0 9|37|39(i-u-ni) 88|09|7 6|09|74(o-i-te-i-zu) 38|15|6 6|09|93(re-mo-a-i-ichi[1]) 79|25|8 9(chi-se-ri)|7070(padding number)

9. The present code has been sent to the twelve legations in Britain, Russia, France, Germany, the United States, Austria, Italy, Spain, Belgium, the Netherlands, China, and Korea in the first despatch and is to be sent to other legations and consulates when opportunity arises.

3 February 1904
Foreign Minister, Baron KOMURA Jutaro

Modification

Instructions from Foreign Minister Komura Jutaro, effective on 1 May 1905, slightly modified the code by incrementing every code number of the Kana Code (Versions A and B) by 1. Accordingly, the code number for "denpo" (telegram) is changed from 30 to 31 for Version A and 18 to 19 for Version B. The code number 99 (representing kana "nu" in Version A and word "kuni" (country) in Version B) is incremented to 00. (When in 1906 Prime Minister Saionji Kinmochi repeated the above guideline to Sugimura Fukashi, minister resident in Brazil, he used incremented code numbers in the example encoding.)
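The modification amounts to adding 1 modulo 100 to each two-digit code number. A minimal sketch follows (the function name is hypothetical, and the instructions are not quoted here as saying whether indicators or filler were treated the same way):

```python
def modified_1905(code):
    """Increment a two-digit Kana Code number by 1, wrapping 99 around to 00."""
    return f"{(int(code) + 1) % 100:02d}"

for old in ("30", "18", "99"):
    print(old, "->", modified_1905(old))
```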

Dictionary Code Words (Used in 1904)

Inaba (1993), p.251, prints an encoded telegram No. 23 dated 17 October 1904 from Mitsuhashi Nobukata, minister to the Netherlands, to Komura Jutaro, Foreign Minister, together with the plaintext (the original draft before encoding (Inaba 1993 p.247, 241)). It was one of many telegrams in code found in the Russian archives in the 1990s (Inaba 1998 p.1, 7), while there are few extant in the Diplomatic Archives in Japan (Inaba 1998 p.7, p.20 n.26). Those telegrams were intercepted at Paris and somehow decoded by the French and were provided to Russia (Inaba 1998 p.2, 9-13; Inaba 1993 p.241-242). While telegrams in other codes (several five-figure codes) have also been found in Russia, this is the only code broken (Inaba 1993 p.241, 240; Inaba 1998 p.10).

23 iapyx deteriorar aienabile querqueros reinrneht medusian fulsimas essnerar pargneau piromncies gawbries esturbato fatieins fashianist reprimand involarla chrenbild deplantabo Mitsuhashi

23 In reference to your telegram 18 / the / invitations / were destined / only / recent[l]y / no answer has yet been received / but it is believed that / no objection / will be raised / by any / signatory [?] / and / I hope that / full powers / will be sent / in any time. Mitsuhashi

The code words, ranging from 5 to 10 letters, look like real words in some European languages. Use of such code words from any of the eight languages of English, French, German, Italian, Dutch, Portuguese, Spanish and Latin had been common among commercial codes after it was expressly allowed by the international telegraph regulation of 1879. The maximum length of 10 characters had been adopted (for messages outside Europe) in the 1875 regulation. (See another article.)

"Iapyx" at the beginning may be some indicator as the one for indicating time and date in the US Blue Code (see another article). (Later, Yardley (1931) (p.41) observed "For over a year the Tokyo Government had prefaced their code telegrams with a series of letters. From the text of the telegrams it was clear that the letters had nothing to do with the messages themselves. For this reason no one considered it worth while to attempt to find out what they meant." but found the first two groups in one message indicated "Message No. 186, sent on 13th (February 1927), referring to Message 181" (see below).)

Inaba (2011), p.18-19, prints another encoded telegram, again found in Moscow, dated 28 October 1904 from Motono Ichiro, minister to Paris, to Komura Jutaro, together with its decoding in French (found in Moscow) and its original microfilmed from the records in Japan. (I found this article after releasing the first version of the present article, which necessitated correction of my original conjecture. -- 24 January 2014)

The image below presents code words occurring in the above two telegrams with corresponding plaintext words/phrases/syllables. The assignment of the plaintext to the code words is my conjecture. The image also includes a listing of code words in alphabetical order. Although entries in red may require further study, the code words from the two telegrams sort well together. The two telegrams even use the same code word "reprimand" for the plaintext "and." Thus, we may safely conclude that the Foreign Minister used the same code with ministers in the Netherlands and France. Further, the code appears to include sections of dates and numbers, place names, words and phrases in reverse alphabetical order (of core words), and letters and syllables again in reverse alphabetical order.

For those familiar with classic diplomatic codes as used by Thomas Jefferson or Louis XIV, the fragmentary plaintext ("are said to be", "among them", etc.) represented by these code words may look strange. But this was the kind of vocabulary employed by many commercial codes for telegraphy at the time. In order to reduce cable cost, commercial codes included many frequently used phrases, including fragments to be completed by the following code words.

Similarity to commercial codes was so evident that Manuilov, virtually the head of Russian agents in Paris (Inaba 1993 p.232, Inaba 1998 p.5), purchased a general English codebook and tried, without success, to break this kind of Japanese code in June 1904 (Inaba 1998 p.10).


References for this Section

Chiharu INABA (1993), "Roshia kokuritsu monjokan ni miru Nichiro senso chu no nihon kanren monjo: Roshia himitsu keisatsu ni nusumareta denpo, shokan", Shakai kagaku tokyu, No. 112, p.231 ff., March 1993 (in Japanese) (稲葉千晴(1993), 「ロシア国立文書館にみる日露戦争中の日本関連文書」, 社会科学討究)

Chiharu INABA (1998), "Franco-Russian Intelligence Collaboration against Japan during the Russo-Japanese War, 1904-05", Japanese Slavic and East European Studies, Vol. 19 (open access)

Chiharu INABA (2011), "Japanese Ciphers and Codes during the Russo-Japanese War (1904-05)", Urban Science Studies, No. 16, 2011, pp. 17-24 (in Japanese) (稲葉千晴(2011), 「日露戦争中の日本の暗号」, 都市情報学研究)


For an overview of commercial codes, see another article. Some of the codebooks available online are as follows:
Slater (1870) 24000 words (no phrases); five-digit code groups (not code words).
Bloomer (1874) About 7500 words and phrases; code words and serial numbers.
Anglo-American (1891) About 27000 words and phrases; code words and serial numbers.
ABC4 (1881) About 25000 words and phrases; code words and serial numbers.
Armand Coste (1888) 30000 entries for words/phrases/syllables.
Lieber's (1896) About 75000 words and phrases; code words and serial numbers.
Westinghouse (1902) About 45000 words and phrases; code words and serial numbers.
Bentley (1906) 30000 words and phrases; five-letter artificial code words (no serial numbers).

An Encoded Telegram from 1918

It appears that there was no convenient code for overseas transmission (in alphabetic letters) of telegrams in Japanese during the 19th century. Those firms engaged in international trade appear to have used codebooks in English such as ABC, A1, and Western Union, supplemented with some private code, according to a 1907 business manual in Japanese (Digital Library from the Meiji Era, p.84).

A new possibility was opened by the 1903 revision of the international telegraph regulations, which allowed any pronounceable letter groups up to ten letters. For English codebooks, the revision resulted in a practice of combining five-letter code words into ten-letter groups, thereby reducing the telegram cost by half (see another article). Herbert O. Yardley, the very person who led a team that broke Japanese diplomatic codes (see below), published a commercial code with such five-letter code words (Universal Trade Code) in 1921.

For Japanese telegrams, the 1903 revision opened a way for transmission of encoded Japanese messages in ten-letter groups, in contrast to the traditional code words that looked like real words and had varying lengths, as used in the 1904 telegram above.

Below is one of the few encoded telegrams extant in the Diplomatic Archives. The telegram, dated 23 May 1918, is from Foreign Minister Goto Shinpei.


While five-letter code words were used by naval attaché codes, diplomatic codes such as Ja, Jb, Jc, and Jp used two-letter code words (plus four-letter code words for Jp) combined into ten-letter groups for transmission. The ten-letter code groups in the above may be something similar to Ja, Jb, and Jc.

Two-letter code words grouped into ten-letter groups for transmission appear to have been used at least as early as 1916. A letter from the Foreign Minister to the consul general in Sydney (JACAR Ref. B13080294100) complained that the beginning of a received telegram, "oyraon 28 otodanodve", should have been "oyraononut otodanodve", without interspersed plaintext, and explained that a succession of codes for five digits would cause no trouble in decoding. This appears to indicate that a digit is represented by two letters, such as "on" (2) and "ut" (8).

Japanese Codes Broken by Yardley's Cipher Bureau: 1919-1929

Yardley's Cipher Bureau, known as the American Black Chamber, solved many of the intercepted Japanese codes which they designated Ja, Jb, Jc, ....

Ja, the first solved, was a simple code used only for low-level communications (Kahn p.68), representing Japanese characters with two-letter combinations:

AS FY OK   WI UB PO MO IL RE   RE OS OK BO    RE UB BO
 o wa ri    a  i ru ra  n do   do ku ri tsu   do  i tsu
 (end)         (Ireland)      (independence)  (Germany)
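The two-letter structure can be illustrated with the specimen above. The sketch below uses only the equivalents shown in the specimen (the real Ja table was larger), and it treats the specimen words as one continuous run of text purely for illustration; the function names are hypothetical.

```python
# Equivalents taken only from the specimen above.
JA_SAMPLE = {"AS": "o", "FY": "wa", "OK": "ri", "WI": "a", "UB": "i", "PO": "ru",
             "MO": "ra", "IL": "n", "RE": "do", "OS": "ku", "BO": "tsu"}

def to_groups(text, size=10):
    """Run the code words together and cut them into ten-letter groups."""
    letters = text.replace(" ", "")
    return [letters[i:i + size] for i in range(0, len(letters), size)]

def decode(groups, table=JA_SAMPLE):
    """Rejoin the groups, split them into two-letter units, and look each one up."""
    letters = "".join(groups)
    pairs = [letters[i:i + 2] for i in range(0, len(letters), 2)]
    return [table.get(pair, "?") for pair in pairs]

specimen = "AS FY OK WI UB PO MO IL RE RE OS OK BO RE UB BO"
groups = to_groups(specimen)
print(groups)
print("-".join(decode(groups)))
```

Once cut into ten-letter groups, the boundaries between the two-letter units are no longer visible, which is the difficulty the codebreakers faced.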

The simplicity of the code does not mean that its solution was simple. In the first place, since the code words were transmitted in ten-letter groups, it had to be established first that the code groups consisted of two letters rather than five letters etc. (Occurrence of the sequence "BA IL LY" beginning at the 1st, 3rd, 5th, 7th, or 9th letter in a ten-letter group strongly suggested that the code unit consisted of two letters (ASA3, Wayne G. Barker (ed.), The History of Codes and Ciphers in the United States during the Period between the World Wars, Part I. 1919-1929 p.96).) Kahn tells how Yardley's team broke the code (the breakthrough came on the night of 12-13 December 1919) with additional details to fill the gap in Yardley's own narrative (Kahn p.64-67). The feat is all the more impressive because, while a member of the team had some knowledge of the Japanese language, a translator for the decoded message could be found only after the solution (Kahn p.67-68).
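The positional argument about "BA IL LY" can be sketched as follows. The ten-letter groups here are invented for illustration and are not real intercepts; the function name is hypothetical, and the sketch assumes every group has exactly ten letters.

```python
def start_positions(groups, pattern, size=10):
    """1-based positions within a ten-letter group at which pattern begins."""
    stream = "".join(groups)
    positions = []
    i = stream.find(pattern)
    while i != -1:
        positions.append(i % size + 1)
        i = stream.find(pattern, i + 1)
    return positions

# Invented groups (not real intercepts) built from two-letter units.
groups = ["BAILLYASFY", "OKBAILLYWI", "UBPOMOBAIL", "LYREOSOKBO"]
positions = start_positions(groups, "BAILLY")
print(positions)
print(all(p % 2 == 1 for p in positions))
```

If the unit were five letters long, a repeated sequence would be expected to begin only at the 1st or 6th letter; a sequence that keeps turning up at odd positions points to a two-letter unit.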

The first codes solved contained some 200 to 250 code groups for seventy-three kana and some frequently used words (Kahn p.68). This means they were no larger than the old code of 1874, which had 9*29=261 entries.

Jb was merely a rearrangement of Ja into eight alphabetical sequences (Kahn p.68).

Jc was a two-letter code solved by May 1920 (Kahn p.68; ASA3 p.100).

Jd appeared to be a naval code and was not solved as of May 1920 (Kahn p.68; ASA3 p.100).

Je was a code in English, solved by May 1920 (Kahn p.68; ASA3 p.100).

Jf was a military attache code and was solved several months before September 1920, though no significant intelligence was found (ASA3 p.100-101).

Jg contained about 700 code groups. More than half were three-letter code groups, which all turned out to begin with V, W, X, Y, or Z (e.g., VAB, VAC, ..., VUZ, etc.). (Kahn p.260, 69)

Jh was first thought to contain one hundred thousand groups but the figure was later halved. It turned out to have an English vocabulary (Kahn p.69; ASA3 p.101). It was used with or without encipherment. When enciphered, messages might be either in digits or letters. By the end of March 1921, about 2000 code groups had been identified and encipherments had been identified (ASA3 p.101).

One English code that had been solved as of December 1921 (which may be Jh) was reported to employ an "ingenious method" for defining an ad hoc code word within a message. Specifically, a frequently occurring long phrase is surrounded by a blank five-letter code word on its first occurrence and thereafter the phrase could be represented with the single blank code word. Yardley appreciated this scheme by saying "It is possible that such a system might be profitably used by our own Communications Section. As far as I know it is entirely original with the Japanese and certainly presents a very difficult problem to the cryptographer." (ASA3 p.108)
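The mechanism can be sketched from the decoder's side. Everything in this example is hypothetical: the blank code word XABCD, the sample message, and the use of plain English words in place of code words are invented only to show the bracketing idea described in the report.

```python
def expand(tokens, blanks):
    """Expand ad hoc definitions: BLANK phrase BLANK defines; a later BLANK recalls."""
    defined = {}
    out = []
    i = 0
    while i < len(tokens):
        tok = tokens[i]
        if tok in blanks and tok in defined:
            # Later occurrence: the blank word stands for the whole phrase.
            out.extend(defined[tok])
            i += 1
        elif tok in blanks:
            # First occurrence: the phrase runs up to the matching blank word.
            end = tokens.index(tok, i + 1)
            defined[tok] = tokens[i + 1:end]
            out.extend(defined[tok])
            i = end + 1
        else:
            out.append(tok)
            i += 1
    return out

message = ("the XABCD Imperial Japanese Government XABCD proposes "
           "and XABCD agrees").split()
print(" ".join(expand(message, {"XABCD"})))
```

For a codebreaker, the difficulty is that the meaning of the blank word is fixed only within one message and changes from message to message.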

Ji, as well as Jj and Jl, was first thought to contain thirty thousand to fifty thousand code groups but the figure was later halved. Ji was a naval attaché code and had five-letter code words, including those representing numerals from 1 to 50 and letters A-Z. (Kahn p.69) Five-letter code words were prevalent from 1904 among commercial codes because they could be combined into ten-letter code groups for reduction of the cost by half (see another article). Yardley described his success with this code in June 1921. Even where superencipherment was used, it did not forestall the codebreaking (Kahn p.69).

Ji had five-letter code groups for numerals and for the letters of the alphabet. That ODWOK represented "34" was discovered from a plaintext message from Tokyo to Washington acknowledging that message number 34 had been received. A spelled-out English word "p-r-o-t-o-c-o-l", with three occurrences of o, allowed identification of the entire Roman alphabet (ASA3 p.105).

Jj, also a naval attaché code, could not be solved because of its size (Kahn p.69; ASA3 p.105-106).

Jk was a military attaché code, of which first translations were reported on 13 September 1920 (ASA3 p.101).

Jl had, as with Jh, English vocabulary. It had code words such as AFFECTED, BUCHILLE, PLANXIMUS, RACINOLANO, etc. After it was discovered that assignment of code groups to plaintext was in reverse alphabetical order, about 200 groups were identified. (ASA3 p.102 says it was for "diplomatic traffic"; p.115 says it was the only system "known to be used by the Japanese Treasury".) Such reverse alphabetical order for an English-vocabulary code is also seen in a diplomatic code used in 1904 (see another article).

Jm was another military code, which was made readable by 8 January 1921. It used one of eleven keys for encipherment. From the beginning of 1921, a new code baffled the attempts of the Cipher Bureau for nearly a year. In contrast to Jm, it employed code switching at irregular intervals of a little less than four lines on average. Jr, Jn, and Jq below belong to this new series. (ASA3 p.101, 107, 109, 115)

Jp appeared in the summer of 1921. This was a combination of two-letter and four-letter code groups, transmitted in ten-letter groups. This was the code used in instructions to the Japanese plenipotentiary to the Washington Naval Conference (see another article).

Jr was an army code and consisted of eleven different small codes (JR1, JR2, ..., JR11). The sum of the digits of the serial number of the message (transmitted in plaintext) indicates the sub-code used. Thus, a telegram numbered 52 used the sub-code JR7. There may be a switch to another sub-code in the midst of a message, indicated by a special indicator code group. This code was broken by May 1922. From November 1921, it appears that a new indication system was introduced, for many messages were no longer numbered. (Kahn p.85-86; ASA3 p.106-107, 115)
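The digit-sum rule is simple enough to state as a one-line function. This is a minimal sketch with a hypothetical function name; what was done when the digit sum fell outside 1 to 11 is not stated in the sources cited here, so the sketch does not handle that case.

```python
def jr_subcode(serial_number):
    """Name of the Jr sub-code indicated by the digit sum of the serial number."""
    return f"JR{sum(int(d) for d in str(serial_number))}"

print(jr_subcode(52))
```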

Jn and Jq were army codes similar to Jr. Jq was only used for routine messages. (Kahn p.85-86)

Js, Jt, and Jz were naval attaché codes. Js was a one-part code in English. Jt was never solved. Jz was based on syllables as with diplomatic codes. (ASA3 p.115)

Ju and Jv were diplomatic codes. The first message in Ju was solved in September 1924. (ASA3 p.112)

Jw was an English-language code, solution of which was nearly completed as of August 1924. This code was changed every two months and the versions were dubbed JWA, JWB, etc. (ASA3 p.112)

An English-language code, probably Jw, was broken in 1924 by a clerk, who correctly suspected that encrypted telegrams would include quotations from the New York Times (Kahn p.86; ASA3 p.113). In 1927, one message encoded in Jw conveyed a memorandum from the US Secretary of State to the Japanese Foreign Office (Yardley (1931) p.40).

Jx was also a diplomatic code. Messages in it were first intercepted in January 1925. This was never solved (ASA3 p.114).

Jy and Jaa were also diplomatic codes. In 1927, for each of them, more than 100 messages were intercepted and about 70 were translated. (ASA3 p.114)

Jz was a naval attaché code. As of 1927, 18 messages in it were intercepted but it was still under study. (ASA3 p.114)

Jbb was the 28th Japanese code broken by Yardley's bureau. Yardley (1931) provides its specimen "FAFEOZIDNY VAFEITQUPU EXAPAJJAJI AGENCICIJI FOLOUKRAAZ OJEGJEGLU ATNIOWUDJI ..." from 1927. With the materials already deciphered, the message was found to concern the Chinese situation. After additional efforts, the beginning was deciphered as FA(0)FE(1)OZ(86)ID(-)NY(13th) VA(my)FE(1)IT(81), that is, Message No. 186, sent on 13th (February 1927), referring to an earlier message No. 181. (Yardley (1931) p.40)

Jcc and Jdd were diplomatic codes. As of 1927, 60 and 18 messages were intercepted respectively but they were still under study. (ASA3 p.114)

Jee was a diplomatic code. (ASA3 p.115)

Despite these successes, cryptanalysis against Japan dwindled from 1923, partly because of the loss of funding and staff (Kahn p.86; ASA3 p.110 etc.). In June 1929, the Cipher Bureau still solved what Yardley considered "a series of important code messages", but soon thereafter it was dissolved by the new secretary of state, Henry L. Stimson, who held that "Gentlemen do not read each other's mail" (Kahn p.97-98).

References for this Section

David Kahn (2004), The Reader of Gentlemen's Mail, Herbert O. Yardley and the Birth of American Codebreaking

Army Security Agency [ASA3], (ed.) Wayne G. Barker (1946, 1979), The History of Codes and Ciphers in the United States during the Period between the World Wars. Part I. 1919-1929, A Cryptographic Series

Herbert O. Yardley (1931), "Double-Crossing America", Liberty, 10 October 1931, pp.38-42

Variety of Diplomatic Codes: 1919-1933

Several classes of Japanese diplomatic codes are described in a memorandum dated 9 October 1933 by Sakuma, Chief of Telegraph Section of the Foreign Ministry, titled "Regarding the Russian Disclosure of Purported Secret Documents of Ours" (JACAR Ref. B12080889500 p.20-36; printed in Sakuma (1998)), which describes several codes during 1919-1933.

Since the Paris Peace Conference (1919), the Foreign Ministry used new-style codes, which were simpler than the complicated older codes. The older codes probably refer to codes consisting of dictionary code words as used in 1904, not the ten-letter groups from 1918 in the above image.

The first of such new codes was a two-letter code (about 100 entries). Rarely used as of 1933.

Then came hybrid two-letter/four-letter codes, ranging from about 400 entries at the beginning to about 1000 in 1933. This class encompassed most of the codes, including NI and OTSU, which were in current use as of 1933. The NI Code (about 1000 entries) had four sections used alternately for three months each.

A relatively simple one of this class (about 800 entries) is the Jp code broken by Yardley (see above). It was called YA (Kahn (2004) p.69; Kahn (1967) p.358, probably after Yardley p.289, says YU; though Kahn (2004) does not say the reason of his change, he may have some evidence that Yardley mixed up Jp with "Japanese air force code: Yu" mentioned on Page 281 of Kahn (2004)).

There was also a hybrid two-letter/three-letter code (about 700 entries) for top secret matters. Probably, this corresponds to Jg, as designated by Yardley.

For the London Naval Conference (1930), new codes were prepared for top secret matters: I and RO.

The I Code was quadruplex, meaning that each entry had four code words, which were alternately used according to a certain key. It took about two hours for encoding a telegram of about five lines.

The more tractable RO Code was a three-letter code (about 1200 entries). But it was inferior to the hybrid two-letter/four-letter code in security, telegram cost, and mutilation detection. After an incident of telegram theft at the embassy in Turkey (at the end of June 1932), its use was discontinued.

The RO Code for top secret matters was replaced by the HA Code, which was a three-letter/four-letter code (about 1200 entries). It was duplex, meaning that each entry had two code words. Its use started in August 1931 in the embassy in Russia, Vladivostok, and Harbin and in June 1932 in other places. However, its complex structure required two or three times the encoding time for the NI Code. As of 1933, the most used codes were this HA Code for top secret matters and the NI and OTSU Codes otherwise.

The Russian disclosure on this very day (9 October 1933), which motivated this memorandum, revealed that the HA Code was known to Russia. When the compromise had been suspected in May, it was decided to adopt superencipherment (with a memorable key) used by the Navy and, as of October 1933, Special Use for the NI Code was about to be delivered to overseas establishments. (Similar Special Use was also specified for the HA Code except for establishments in Russia (JACAR Ref. B12080889500 p.48).) (The arrangement of such special use could be relatively easily conveyed to Europe and North America because several establishments had been provided with a cipher machine, the one called RED by the US codebreakers. For Manchuria and China, couriers should be sent. For South America, it was considered unnecessary to make such an arrangement for the time.) The HA Code would be immediately discontinued for establishments in Russia and as soon as the Special Use for the NI Code was delivered for the other parts. However, experiment showed that superencipherment would increase the time taken for encoding by three to five times and the time for decoding by eight to ten times. It could only be expected to be used for the most sensitive portions.

There was another quadruplex, hybrid two-letter/four-letter code compiled at the time of the Su Bingwen Incident (Wikipedia) in China. It had not yet been used and would be delivered overseas.

Hierarchy of Codes

Use of these various codes was specified in instructions dated 17 September 1931 as follows (JACAR B12080889500 p.10 ff.; draft B13080930600 p.76 ff.).

(a) For telegrams that require absolute secrecy: the I Code or Special Use of the HA Code;

(b) For telegrams whose content is known or will be known to the other country: Ordinary Use of the HA Code, Special Use of the RO Code, or the NI Code;

(c) For telegrams whose verbatim content is known or will be known to the other country: Ordinary Use of the RO Code, the TEI Code, the U Code, or the NI Code, etc.;

(d) For telegrams not so secret or encoded just for cost reduction: the L Code; and

(e) For telegrams for passport control etc.: the HO Code, being delivered.

Regarding the Japanese designations of these codes, I, RO, HA, NI, and HO are the first five letters of the old Japanese syllabary. OTSU and TEI belong to the traditional ten calendar signs (KO, OTSU, HEI, TEI, ...) and are often used to mean "Class-B" or "Class-D." "U" and "L" are letters of the Latin alphabet.

Courier Rather Than Code

In the age of manual encryption, higher security of codes meant more labor in encryption and decryption. In July 1923, in view of the higher complexity of the recently introduced code, it was decided to use couriers to carry documents between the Foreign Ministry and establishments overseas (JACAR B12080839900). As of April 1924, however, the ambassador in Rome was still urging the Foreign Minister to put this into practice because the small staff were spending too much time and labor on the complicated telegraphic code (JACAR B12080840000). In 1934, it was proposed to "revive" the courier system in Europe in view of concerns about the security of mail in Germany, Italy, etc., with the explanation that it would ameliorate the delay in decrypting telegrams (JACAR B12080839800).

Codes Broken by US Codebreakers

While Yardley's Cipher Bureau was abolished in 1929, Japanese secret telegrams were studied by the Signal Intelligence Service (SIS) under William F. Friedman of the Army Signal Corps, established in April 1930.

The LA Code, so called by US codebreakers from its indicator prefix, was a passport code like HO and had been in use since 1925. One typical message it conveyed was "The following has been authorized as the year-end bonus for employee typists of your office." It was a two-letter/four-letter code representing kana and some combinations with two-letter groups (vowel+consonant or consonant+vowel combinations). For example, ki, to, ka+n, and 4 are represented by CI, IF, CE, and ZO, respectively. There was regularity in the assignment: every kana ending in -e was represented with a code group beginning with A (ke=AC, se=AD) and every kana beginning with k- was represented with a code group containing C. Examples of four-letter groups are TUVE (dollars), SISA (consul), and XYGY (Yokohama). (Kahn (1967) pp.14-15)
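The regularity described above can be checked mechanically against the handful of entries quoted here. The following Python sketch is only an illustration: the table holds just the six two-letter groups mentioned in the text, and the names LA_GROUPS, e_rule_holds, k_rule_holds, and encode are hypothetical, not part of any codebook.

```python
# The six two-letter groups quoted in the text (from Kahn 1967).
# "ka+n" is a kana combination; "4" is a numeral.
LA_GROUPS = {
    "ki": "CI",
    "to": "IF",
    "ka+n": "CE",
    "4": "ZO",
    "ke": "AC",
    "se": "AD",
}


def e_rule_holds(table):
    """Every kana ending in -e has a code group beginning with A."""
    return all(group.startswith("A")
               for kana, group in table.items() if kana.endswith("e"))


def k_rule_holds(table):
    """Every entry beginning with k- has a code group containing C."""
    return all("C" in group
               for kana, group in table.items() if kana.startswith("k"))


def encode(units, table=LA_GROUPS):
    """Look up each unit and join the code groups with spaces."""
    return " ".join(table[unit] for unit in units)


print(e_rule_holds(LA_GROUPS), k_rule_holds(LA_GROUPS))
print(encode(["ki", "to"]))
```

Such regularity makes a code easy to memorize but also gives a codebreaker a foothold, since one recovered group hints at the shape of its neighbors.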

More complex than this was the PA Code as designated by US codebreakers. Although this may belong to a later period, the RO or NI Code may be similar to this. This was a two-letter/four-letter code but had a larger vocabulary than LA in irregular order. One message was as follows, with two encoding errors spotted by Kahn.

BYDH (4th) DOST (gogo) JE (1) YO (kei) IA (jun) OQ (() GU (ho) RA (no) HY (ru) HY (ru) UQ ()) VI (kata) LA (ta) [sic] YJ (hayaku) AY (shutsu) EC (ko) TY (re) [sic] FI BANL (Morimura)
(On the 4th [December 1941], at 1 o'clock pm, a light cruiser, the "Honolulu" class, hastily departed -- Morimura)

Its superenciphered version was designated as PA-K2 by US codebreakers. To apply superencipherment, first the encoded message (here padded with an extra "I" to make the final five-letter group complete) is written letter by letter under a key number (a sequence of numbers).

key: 10 15 11 16 2 8 1 5 17 3 7 13 19 4 18 6 12 9 14
BYDHDOSTJEYOIAOQGUR
AHYHYUQVILAYJAYECTY
F IB AN L I

Transcribing line by line according to the order of the numbers in the key sequence gives a superenciphered encoded message: "SDEAT QYOUB DGORY ...." For transmission, it is prefixed with SIKYU (for "urgent"), the message number 02500, a system indicator GIGIG, and a key indicator AUDOB. (Kahn (1967) p.15)
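The procedure can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not a reconstruction of the actual PA-K2 rules: the key is read as the nineteen numbers 10 15 11 16 2 8 1 5 17 3 7 13 19 4 18 6 12 9 14 (a reading that reproduces the groups quoted above), only the two complete rows are transcribed, and the short third row, which appears to leave some columns blank, is left out. The names KEY, GROUPS, pad_to_multiple, transcribe_row, and five_letter_groups are hypothetical.

```python
# Key numbers, one per column (assumed reading of the key).
KEY = [10, 15, 11, 16, 2, 8, 1, 5, 17, 3, 7, 13, 19, 4, 18, 6, 12, 9, 14]

# Code groups of the sample message, in order.
GROUPS = ("BYDH DOST JE YO IA OQ GU RA HY HY UQ "
          "VI LA YJ AY EC TY FI BANL").split()


def pad_to_multiple(text, size=5, filler="I"):
    """Pad the text so that its length is a multiple of `size`."""
    remainder = len(text) % size
    return text if remainder == 0 else text + filler * (size - remainder)


def transcribe_row(row, key):
    """Read one complete row in the order of the key numbers 1, 2, 3, ..."""
    if len(row) != len(key):
        raise ValueError("row must be as long as the key")
    order = sorted(range(len(key)), key=lambda i: key[i])
    return "".join(row[i] for i in order)


def five_letter_groups(text):
    """Split text into space-separated groups of five letters."""
    return " ".join(text[i:i + 5] for i in range(0, len(text), 5))


encoded = pad_to_multiple("".join(GROUPS))
width = len(KEY)
# Keep only the rows that fill every column under the key.
full_rows = [encoded[i:i + width]
             for i in range(0, len(encoded) - width + 1, width)]

first_row = transcribe_row(full_rows[0], KEY)
print(five_letter_groups(first_row))
```

With this reading of the key, the first row comes out beginning SDEAT QYOUB DGORY, which agrees with the groups given by Kahn.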

Kahn (1967) reproduces a page of what seems to be an original Japanese codebook from around 1931. It is part of a four-letter code, with rows and columns headed by two-letter groups of consonant+vowel or vowel+consonant. Although the assignment looks irregular at first glance, entries beginning with the same sound or having other common features tend to be grouped on a diagonal line downward to the right. The page includes simple kana combinations (AKIF for ze+n, CEIF for te+tsu, etc.), words (AVIN for Germany, BAIN for plenipotentiary, CEIN for notification, etc.), phrases (EGIF for beforehand, EGIN for not, etc.), and symbols (REEW for comma, SAFO for dash) (Kahn (1967) p.17).
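The row-and-column layout can be pictured as a two-level lookup. The Python sketch below is only an illustration built from the nine groups quoted above. It assumes that each four-letter group is simply one two-letter header followed by another; which half heads the rows and which the columns is not stated here, so the sketch just calls them "first" and "second". The names PAGE_ENTRIES, is_header, and build_grid are hypothetical.

```python
VOWELS = set("AEIOU")  # assumption: Y does not occur in the quoted groups

# Four-letter groups quoted in the text and their meanings.
PAGE_ENTRIES = {
    "AKIF": "ze+n",
    "CEIF": "te+tsu",
    "AVIN": "Germany",
    "BAIN": "plenipotentiary",
    "CEIN": "notification",
    "EGIF": "beforehand",
    "EGIN": "not",
    "REEW": "comma",
    "SAFO": "dash",
}


def is_header(pair):
    """True for a consonant+vowel or vowel+consonant pair."""
    return (len(pair) == 2 and pair.isalpha()
            and sum(ch in VOWELS for ch in pair) == 1)


def build_grid(entries):
    """Arrange entries as grid[first header][second header] = meaning."""
    grid = {}
    for group, meaning in entries.items():
        first, second = group[:2], group[2:]
        if not (is_header(first) and is_header(second)):
            raise ValueError(f"unexpected header shape in {group}")
        grid.setdefault(first, {})[second] = meaning
    return grid


grid = build_grid(PAGE_ENTRIES)
print(grid["CE"])
```

Every quoted group splits cleanly into two such headers, and groups sharing a half (for example CEIF and CEIN, or AKIF, CEIF, and EGIF) fall on the same line of the grid.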

References for this Section

JACAR Ref. B12080889500 etc.

Shiraishi Masaaki (1998), "Iwayuru 'Kaibunsho Jiken' ni kansuru Sakuma Denshin Kacho Kiso Chosho ni tsuite", Gaikoshiryokanpo, June 1998, pp.97-104 (白石仁章「いわゆる"怪文書事件"に関する佐久間電信課長起草調書について」,『外交史料館報』,1998.6,pp.97-104)

David Kahn (1967), The Codebreakers



©2013 S.Tomokiyo
First posted on 16 October 2013. Enlarged to the year 1933 on 7 March 2014. Last modified on 14 January 2016.