An Ormonde-Clanricarde Cipher -- An Example of Codebreaking

I recently succeeded in deciphering a fragment of a letter written entirely in cipher, which Davys (1737) printed for the reader. The letter was from the Earl of Clanricarde to the Marquis of Ormonde. Although the cipher turned out to be of the simplest kind (it took Davys only about four hours), a brief description of how I solved it "might be of use to an industrious man," as Davys put it (p.30).

Cipher in Question

90 6645737747 83576045 109 655383814976 34
99 677867767358495077 23 70577852 108 26 50495371
8378 36 435476415368 836977 22 40
5972786058415368 39 764553 91 446577736975
78 30 5345735169 93 29 108 664568
43724176426957 105 22 104 32 8145795457836944
33 70577852 97109 23 108 74495371 4269754569844582
34 61695175 93 5245 666083 24 824163458348 73
48658469 5378 20 5578854557 4970 49 72656045 111
21 105 40 106 7687 44608387 91 7041735972 30
8354 97 20 99 75785883 105 7669 30 91 111
566971606581686944 102 25 105 37 73 48418469
47787769 76606748 658259814163 105 38 5869457669
426083 22 50775485 39 111 617269574953 34 775481
61486583 24 6754845782 5978 83417469 25 91 666059
102 7663 67787770734469536745 7353 25 107 30 91
8169587945675982 39 8378 26 107 49 584854847544
102 69604557 80847359 83724958 36 847772657987
5049534744785245

Reduction of the Problem

Numerical ciphers in seventeenth-century England generally used numbers up to several hundred (see here for ciphers of Charles I). Although there are a few instances of numbers above 1000 (Peterson's, Manning's, and Bampfield's; see here), none ran to figures of so many digits. Davys also says he had never seen this kind of cipher.

On reflection, however, the number of distinct codes would not be much more than 1000, if at all. These long figures should therefore represent combinations of a limited number of low figures. The simplest assumption would be that the long figures are actually two-digit codes run into one another. This assumption is supported by the fact that all the long figures (i.e., those other than 102, 107, etc.) have an even number of digits. Further, when one breaks the long figures down into two-digit groups, the groups generally fall in the range from 40 to 90. Thus the cipher appears to be essentially a two-digit code system in which each letter of the alphabet is assigned two figures (at least on average). It would be similar to many of the ciphers of Charles I (see here).
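The splitting step described above can be sketched in Python as follows. The helper name split_pairs is made up for illustration, and treating every figure of up to three digits as a single unit is an assumption drawn from the working hypothesis, not something Davys states.

```python
def split_pairs(token):
    """Split a run of digits into two-digit groups.

    Returns None when the run has an odd number of digits, since it
    cannot then be a plain concatenation of two-digit codes.
    """
    if len(token) % 2 != 0:
        return None
    return [token[i:i + 2] for i in range(0, len(token), 2)]


# First line of the ciphertext as printed above.
line = "90 6645737747 83576045 109 655383814976 34"

for token in line.split():
    if len(token) <= 3:
        # Short figures are left whole (assumption: nulls or code words).
        print(token, "-> kept whole")
    else:
        print(token, "->", split_pairs(token))
```

For this line, every two-digit group obtained from the long figures lies between 40 and 90, which is what the hypothesis predicts.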

(In reaching this working hypothesis, I have to admit that, unlike Davys, I had seen a cipher like this before. A letter dated Rome, January 12 1675, addressed to Edward Coleman, who was executed in the turmoil of the Popish Plot, contains similar runs of many digits, which turned out to consist of two-digit groups. (Hay p.203 ff.) [After writing this, I found Davys described this cipher letter in his Postscript. See another article.])

It would then be natural to assume that three-digit figures, and possibly figures above 90, represent frequently used words or names. Breaks between figures probably represent word breaks. This last assumption may be challenged by sequences such as "78 30" and "37 73", because two one-letter words ("a" or "I") in succession are unlikely in English. However, such a problem may be explained by nulls or transcription errors ("7830" might easily be copied as "78 30").

Frequency Analysis

The usual frequency analysis indicated the following high-frequency figures: "69" (21 times) - "45" (17) - "83" (15) - "78" (14) - "53" (12) - "60"/"73"/"49"/"76" (10). These figures are likely to represent letters among ETAONRISH, the letters known to be of high frequency in English. In particular, the most frequent figure, "69", may represent "e", but the margin is too slim to say anything conclusive.
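A count of this kind can be sketched in Python with collections.Counter. The function name pair_frequencies is hypothetical, and the sample holds only the first three lines of the ciphertext, so its counts will not match the totals quoted above for the whole fragment.

```python
from collections import Counter


def pair_frequencies(text):
    """Count two-digit groups in every figure with an even number of digits.

    Three-digit figures are skipped. Two-digit nulls and code words are
    counted along with the letters, since at this stage they cannot yet
    be told apart.
    """
    counts = Counter()
    for token in text.split():
        if len(token) % 2 == 0:
            counts.update(token[i:i + 2] for i in range(0, len(token), 2))
    return counts


# First three lines of the ciphertext as printed above.
sample = """
90 6645737747 83576045 109 655383814976 34
99 677867767358495077 23 70577852 108 26 50495371
8378 36 435476415368 836977 22 40
"""

for figure, count in pair_frequencies(sample).most_common(5):
    print(figure, count)
```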

Taking advantage of what seem to be word breaks, I examined the frequency of "words." The high-frequency groups were "83 78" (7 times), "76 63" (5 times), and "66 45 68" (4 times). In English, it is well known that the most frequently used three-letter word is "the." I was tempted to identify 66(t)-45(h)-68(e), but this hypothesis had to be discarded outright. First of all, the frequency of "66 45 68" is too low compared with that of the two-letter groups. Further, "68", which should be of very high frequency if it represents "e", appears only 4 times. Again, considering that this is not a simple substitution cipher, nothing is conclusive, since the fifty figures 40-90 allow the frequency of individual letters to be obscured by assigning several figures to high-frequency letters such as "e". At the least, it seemed very likely that a high-frequency word such as "the" would be given a special code above 90.

The most striking result of the word frequency count is the high frequency of "83 78". By itself, however, it could represent any of the common two-letter words such as "he", "me", "we", "be", "at", "in", "to", "or". (Other possibilities, "so" or "no", are not likely to appear as many as seven times in such a short fragment.) In order to find out which is correct, I examined how each of "83" and "78" was used in the ciphertext. The fact that "83", the third most frequent figure, occurs at the end of a word implies it would not be "a", "i", or "o." Further, it would not be "e", because it is the first letter of a high-frequency two-letter word. Thus, of ETAONRISH, "83" may correspond to "h" or "t", and the two-letter word "83 78" may be "he" or "to." Considering that "78" is only the fourth most frequent figure, it would be more likely to be "o" than "e". Given the homophones available among the fifty figures, however, we cannot jump to that conclusion.

Contact Chart and First Findings

About this time, I also produced a contact chart (see Kahn pp.99-105). Although it did not provide as clear a clue as in the textbook, it did show that "78" does not appear together with other high-frequency figures. This is consistent with the above hypothesis that "78" would be "o" rather than "e." Thus, I assumed "83" to be "t" and "78" to be "o."
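A contact chart simply records which figures stand immediately before and after each figure. The following Python sketch is one way to tabulate that. The function name contacts is made up for illustration, and the sketch assumes that contacts are counted only inside a run of digits (a presumed word), not across spaces.

```python
from collections import Counter, defaultdict


def contacts(text):
    """Tabulate the left and right neighbours of each two-digit group."""
    before = defaultdict(Counter)  # before[x][y]: y stands just before x
    after = defaultdict(Counter)   # after[x][y]: y stands just after x
    for token in text.split():
        if len(token) < 4 or len(token) % 2:
            continue  # a lone figure or an odd-length run has no usable contacts
        pairs = [token[i:i + 2] for i in range(0, len(token), 2)]
        for left, right in zip(pairs, pairs[1:]):
            after[left][right] += 1
            before[right][left] += 1
    return before, after


# Third line of the ciphertext as printed above.
before, after = contacts("8378 36 435476415368 836977 22 40")
print("followers of 83:", dict(after["83"]))
print("predecessors of 53:", dict(before["53"]))
```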

The next step was to substitute these findings into the original ciphertext and see whether any known pattern appeared. With so short a fragment, the most that could be obtained was "75 78(o) 58 83(t)", which might be "boat", "bolt", "boot", "bout", etc. It might be just about anything, and there seemed to be no way to proceed further.

Breakthrough

Then came a breakthrough. I noticed that the distance between "83" and "78" is exactly the distance between "t" and "o" in the alphabet. This implies that the letters of the alphabet are assigned code numbers in a regular order. Further, there is another two-letter combination, "59 78", occurring twice. A two-letter word ending in "o" (78) might again be "to", though with such a low frequency "so" and "no" could not be rejected altogether. And indeed, the distance between "83" and "59" is 24, the number of letters in the alphabet! (At the time, "i" and "j", as well as "u" and "v", were treated as the same letter.)
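The arithmetic behind this observation can be checked with a few lines of Python. The 24-letter alphabet string (without "j" and "v") is written out here as an assumption that follows the remark on merged letters.

```python
# 24-letter alphabet: "j" is merged with "i" and "v" with "u".
ALPHABET = "abcdefghiklmnopqrstuwxyz"


def letter_gap(first, second):
    """Number of alphabet positions from `second` up to `first`."""
    return ALPHABET.index(first) - ALPHABET.index(second)


print(letter_gap("t", "o"), 83 - 78)  # both gaps should agree
print(len(ALPHABET), 83 - 59)         # a full alphabet apart
```

Both lines print a matching pair of numbers (5 and 5, then 24 and 24), which is what suggests two copies of the alphabet laid out in order, one alphabet length apart.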

I was sure this could not be a coincidence. Without testing the hypothesis any further, I keyed in the following cipher table.
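A table consistent with the readings in the deciphered text below (41 and 65 both read as "a", 59 and 83 both as "t") can be generated as in the following Python sketch. The starting numbers 41 and 65 are inferred from those readings, and the function name build_table is hypothetical. This is a reconstruction, not a copy of the table I keyed in.

```python
# 24-letter alphabet: "j" is merged with "i" and "v" with "u".
ALPHABET = "abcdefghiklmnopqrstuwxyz"


def build_table(starts=(41, 65)):
    """Map code numbers to letters, one alphabet run per starting number."""
    return {
        start + offset: letter
        for start in starts
        for offset, letter in enumerate(ALPHABET)
    }


table = build_table()
for start in (41, 65):
    # Print each alphabet run as "number=letter" pairs.
    print(" ".join(f"{code}={table[code]}" for code in range(start, start + 24)))
```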

Running a Perl script for deciphering (yes, I did have an advantage of computer tools over Davys) immediately showed that my assumption was correct.

As it turned out, the high-frequency figures were identified as follows: "69" (e) - "45" (e) - "83" (t) - "78" (o) - "53" (n) - "60"(u)/"73"(i)/"49"(i)/"76"(m). The four-letter pattern was found to be "75(l) 78(o) 58(s) 83(t)".

The numbers up to 40 appeared to be nulls. It was not difficult to guess "108" stood for "the."
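Putting these pieces together, a decipherment can be sketched in Python as below. This is not the Perl script mentioned above, which is not shown here. The table is rebuilt from the assumed starting numbers 41 and 65, the names TABLE, CODE_WORDS and decipher are illustrative, and only the code word 108 is filled in.

```python
# 24-letter alphabet: "j" is merged with "i" and "v" with "u".
ALPHABET = "abcdefghiklmnopqrstuwxyz"
TABLE = {start + k: ch for start in (41, 65) for k, ch in enumerate(ALPHABET)}
CODE_WORDS = {108: "THE"}  # further code words could be added as they are guessed


def decipher(text):
    """Decipher space-separated figures; figures up to 40 are dropped as nulls."""
    words = []
    for token in text.split():
        if int(token) in CODE_WORDS:
            words.append(CODE_WORDS[int(token)])
        elif len(token) % 2:
            words.append("[" + token + "]")  # unknown odd-length figure, left as is
        else:
            pairs = [int(token[i:i + 2]) for i in range(0, len(token), 2)]
            # Unknown figures above 40 are shown as "?"; nulls are skipped.
            word = "".join(TABLE.get(p, "?") for p in pairs if p > 40)
            if word:
                words.append(word)
    return " ".join(words)


# End of the second and all of the third ciphertext line as printed above.
print(decipher("70577852 108 26 50495371 8378 36 435476415368 836977 22 40"))
# prints: from THE king to comand ten
```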

Deciphered Text

The final decoding is as follows. This letter (17 March 1643 [i.e., 1644]) is printed as CCLI (p.63) in Carte. (Actually, I filled in further blanks from this printed version.) In a postscript, Clanricarde excuses his use of cipher: "I beleeue I should not haue troubled your lordship with a character [i.e., cipher], but that St. Patrick's day makes letters subject to miscarry; and yet at best leisure your lordship may be pleased to be the translator of it."

90 < IT >
66 < b > 45 < e > 73 < i > 77 < n > 47 < g >
83 < t > 57 < r > 60 < u > 45 < e >
109 < THAT >
65 < a > 53 < n > 83 < t > 81 < r > 49 < i > 76 < m >
34 < - >

99 < HATH >
67 < c > 78 < o > 67 < c->m > 76 < m > 73 < i > 58 < s > 49 < i > 50 < k->o > 77 < n >
23 < - >
70 < f > 57 < r > 78 < o > 52 < m >
108 < THE >
26 < - >
50 < k > 49 < i > 53 < n > 71 < g >
83 < t > 78 < o >
36 < - >
43 < c > 54 < o > 76 < m > 41 < a > 53 < n > 68 < d >
83 < t > 69 < e > 77 < n >
22 < - >
40 < - >

59 < t > 72 < h > 78 < o > 60 < u > 58 < s > 41 < a > 53 < n > 68 < d >
39 < - >
76 < m > 45 < e > 53 < n >
91 < AND >
44 < d > 65 < a > 77 < n > 73 < i > 69 < e > 75 < l >

78 < o >
30 < - >
53 < n > 45 < e > 73 < i > 51 < l > 69 < e >
93 < OF >
29 < - >
108 < THE >
66 < b > 45 < e > 68 < d >

43 < c > 72 < h > 41 < a > 76 < m > 42 < b > 69 < e > 57 < r >
105 < IT >
22 < - >
104 < IS >
32 < - >
81 < r > 45 < e > 79 < p > 54 < o > 57 < r > 83 < t > 69 < e > 44 < d >

33 < - >
70 < f > 57 < r > 78 < o > 52 < m >
97 < HIM > 109 < THAT >
23 < - >
108 < THE >
74 < k > 49 < i > 53 < n > 71 < g >
42 < b > 69 < e > 75 < l > 45 < e > 69 < e > 84 < u > 45 < e > 82 < s >

34 < - >
61 < w > 69 < e > 51 < l > 75 < l >
93 < OF >
52 < m > 45 < e >
66 < b > 60 < u > 83 < t >
24 < - >
82 < s > 41 < a > 63 < y > 45 < e > 83 < t > 48 < h >
73 < i >

48 < h > 65 < a > 84 < u > 69 < e >
53 < n > 78 < o >
20 < - >
55 < p > 78 < o > 85 < w > 45 < e > 57 < r >
49 < i > 70 < f >
49 < i >
72 < h > 65 < a > 60 < u > 45 < e >
111 < NOT >

21 < - >
105 < IT >
40 < - >
106 < WAS >
76 < m > 87 < y >
44 < d > 60 < u > 83 < t > 87 < y >
91 < AND >
70 < f > 41 < a > 73 < i > 59 < t > 72 < h >
30 < - >

83 < t > 54 < o >
97 < HIM >
20 < - >
99 < HATH >
75 < l > 78 < o > 58 < s > 83 < t >
105 < IT >
76 < m > 69 < e >
30 < - >
91 < AND >
111 < NOT >

56 < q->r > 69 < e > 71 < g > 60 < u > 65 < a > 81 < r > 68 < d > 69 < e > 44 < d >
102 < FOR >
25 < - >
105 < IT >
37 < - >
73 < i >
48 < h > 41 < a > 84 < u > 69 < e >

47 < g > 78 < o > 77 < n > 69 < e >
76 < m > 60 < u > 67 < c > 48 < h >
65 < a > 82 < s > 59 < t > 81 < r > 41 < a > 63 < y >
105 < IT >
38 < - >
58 < s > 69 < e > 45 < e > 76 < m > 69 < e >

42 < b > 60 < u > 83 < t >
22 < - >
50 < k > 77 < n > 54 < o > 85 < w >
39 < - >
111 < NOT >
61 < w > 72 < h > 69 < e > 57 < r > 49 < i > 53 < n >
34 < - >
77 < n > 54 < o > 81 < r >

61 < w > 48 < h > 65 < a > 83 < t >
24 < - >
67 < c > 54 < o > 84 < u > 57 < r > 82 < s >
59 < t > 78 < o >
83 < t > 41 < a > 74 < k > 69 < e >
25 < - >
91 < AND >
66 < b > 60 < u > 59 < t >

102 < FOR >
76 < m > 63 < y >
67 < c > 78 < o > 77 < n > 70 < f > 73 < i > 44 < d > 69 < e > 53 < n > 67 < c > 45 < e >
73 < i > 53 < n >
25 < - >
107 < YOU >
30 < - >
91 < AND >
81 < r > 69 < e > 58 < s > 79 < p > 45 < e > 67 < c > 59 < t > 82 < s >
39 < - >
83 < t > 78 < o >
26 < - >
107 < YOU >
49 < i >
58 < s > 48 < h > 54 < o > 84 < u > 75 < l > 44 < d >

102 < FOR >
69 < e > 60 < u > 45 < e > 57 < r >
80 < q > 84 < u > 73 < i > 59 < t >
83 < t > 72 < h > 49 < i > 58 < s >
36 < - >
84 < u > 77 < n > 72 < h > 65 < a > 79 < p > 87 < y >
50 < k > 49 < i > 53 < n > 47 < g > 44 < d > 78 < o > 52 < m > 45 < e >