Break Cornwallis' Cipher! -- Introductory Codebreaking

This article is an introduction to codebreaking for beginners. We are going to break a cipher actually used during the American Revolutionary War.

Clinton-Cornwallis Cipher

The surrender of the British General Cornwallis at Yorktown in October 1781 secured, for all practical purposes, the independence of the United States. During his last months in Yorktown, Cornwallis exchanged secret messages with the British Commander-in-Chief, Sir Henry Clinton, in New York. George Washington outwitted Clinton and arrived at the Chesapeake with his French allies, while the British preparations for relief made slow progress.

The following is the whole text of a letter from Clinton to Cornwallis dated 24 September 1781.

I was honored yesterday with your Lordship's Letter of the 16th and 17th instant. And 33 40 61 7 32 84 -
7 22 - 7 - 11 15 15 22 21 5 4 - 24 26 - 22 18 15 - 26 2 7 4 - 7 5 14 -
4 15 5 15 25 7 2 - 24 26 26 21 8 15 25 23 - 18 15 2 14 - 22 18 21 23 -
14 7 13 - 21 22 - 21 23 14 15 22 15 25 11 21 5 15 14 - 22 18 7 22 - 7 12 =
24 19 15 - 5 0 0 0 - 11 15 5 - 25 7 5 3 - 7 5 14 - 26 21 2 15 - 23 18 =
7 2 2 - 12 15 - 15 11 12 7 25 3 15 14 - 24 5 - 12 24 7 25 14 -
22 18 15 - 3 21 5 4 23 - 23 18 21 29 23 - 7 5 14 - 22 18 15 - 21 24 =
21 5 22 - 15 17 15 25 22 21 24 - 5 23 - 24 26 - 22 18 15 - 5 7 19 13 -
7 5 14 - 7 25 11 13 - 11 7 14 15 - 21 5 - 7 - 26 15 9 - 14 7 13 23 -
22 24 - 25 15 2 21 15 19 15 - 13 24 6 - 7 5 14 - 7 26 22 15 25 =
9 7 25 14 23 - 8 24 24 29 15 25 7 22 15 - 9 21 22 18 - 13 24 6 -
The Fleet 8 24 5 23 21 23 22 23 - 24 26 - 23 - 23 7 21 2 -
24 26 - 22 18 15 - 2 21 5 15 - 22 18 25 15 15 - 24 26 - 9 18 21 8 18 -
7 25 15 - 22 18 25 15 15 - 14 15 8 3 15 25 23 -

22 18 15 25 15 - 21 23 - 15 19 15 25 13 -
25 15 7 23 24 5 - 22 24 18 24 29 15 - 9 15 - 23 18 7 2 2 -
23 22 7 25 22 - 26 25 24 11 - 18 15 5 8 15 - 7 12 24 6 22 -
22 18 15 - 5 - 24 8 22 24 12 15 25 I have the Honoor to be
Your Lordship's most obedient and most humble servant.

PS
7 14 11 21 25 7 2 - 14 21 4 12 13 - 21 23 - 22 18 21 23 -
11 24 11 15 5 22 - 7 25 25 21 19 15 14 - 7 22 - 22 18 15 - 18 24 24 3 -
9 21 22 18 - 22 18 25 15 15 - 23 7 21 2 - 24 26 - 22 18 15 - 2 21 5 15 -
If a venture without knowing whether
22 18 15 13 - 8 7 5 - 12 15 - 23 15 15 5 - 12 13 - 6 23 - 21 -
25 15 1 6 15 23 22 - 22 18 7 22 - 21 26 - 7 22 - 21 23 -
9 15 2 2 - 6 29 24 5 - 18 15 7 25 21 5 4 - 7 - 8 24 5 23 21 =
14 15 25 7 12 2 15 - 26 21 25 21 5 4 - 22 24 9 7 25 14 23 -
22 18 15 - 15 5 22 25 7 5 8 15 - 24 26 - 8 18 15 23 7 29 15 7 3 -
22 18 25 15 15 - 2 7 25 4 15 - 23 11 24 3 15 23 - 11 7 13 - 12 15 -
11 7 14 15 - 29 7 25 7 2 2 15 2 - 22 24 - 21 22 - 7 5 14 -
21 26 - 13 24 6 - 29 24 23 23 15 23 23 - 22 18 15 - 29 24 23 22 -
7 22 - 4 2 24 6 8 15 23 22 15 25 - 26 24 6 25 -

21 - 23 18 7 2 2 - 23 15 5 14 - 7 5 24 22 18 =
15 25 - 25 6 5 5 15 25 - 22 24 - 13 24 6 - 21 5 - 7 -
2 21 22 2 15 - 22 21 11 15 -
I have received your Lordship's Letter of the 8th instant.

Frequency Counting

Looking at the cipher, we note that the numbers range from 0 to 33, with sporadic higher numbers 40, 61, and 84. Given this limited range of numbers, each number may be assumed to represent one letter of the alphabet rather than a whole word.

The first thing a codebreaker would do is to count the frequency of each number. The result is as follows:
15 (87 times)
 7 (56 times)
22 (54 times)
24 (42 times)
21 (41 times)
23 (40 times)
25 (40 times)
 5 (38 times)
18 (35 times)
 2 (26 times)
14 (23 times)
26 (18 times)
11 (14 times)
13 (13 times)
 8 (12 times)
12 (11 times)
 6 (11 times)
 4 (9 times)
29 (8 times)
 9 (8 times)
 3 (7 times)
19 (5 times)
 0 (3 times)
 1 (1 time)
17 (1 time)
32 (1 time)
33 (1 time)
40 (1 time)
61 (1 time)
84 (1 time)
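
Such a tally can be produced mechanically. The following Python sketch counts the numbers in one line of the cipher. The helper name `count_numbers` and the rule that only tokens made up entirely of digits are counted (so that "16th" in the plaintext is skipped) are assumptions made for illustration.

```python
from collections import Counter

def count_numbers(cipher_text):
    """Count how often each code number occurs in the cipher text."""
    # Keep only tokens made up entirely of digits; "-", "=" and
    # plaintext words such as "16th" are skipped.
    numbers = [int(tok) for tok in cipher_text.split() if tok.isdigit()]
    return Counter(numbers)

# First full line of ciphered numbers in the letter
line = "7 22 - 7 - 11 15 15 22 21 5 4 - 24 26 - 22 18 15 - 26 2 7 4 - 7 5 14 -"
counts = count_numbers(line)
for number, times in counts.most_common():
    print(number, times)
```

Feeding the whole ciphered text to the same function, instead of a single line, gives a tally for the entire letter.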

Now, a basic fact every codebreaker should know is that the most frequently occurring letters in English are "ETAONRISH" (a mnemonic for the first eight of these, omitting H, is "a sin to err"). Though the actual ranks may vary, the nine top-ranking numbers 15, 7, 22, 24, 21, 23, 25, 5, 18 are likely to correspond to the letters E, T, A, O, N, R, I, S, H in some order. In particular, the frequency of the number "15" stands out, so it probably represents E.

Another basic fact that comes in handy in codebreaking is that the most frequently used word in English is "the". So, we first look for a three-letter combination that is used in many places. It is found that a sequence "22 18 15" occurs no less than twelve times. So, probably, identifications could be made as 22(T), 18(H), 15(E). This is consistent with the high frequency of these numbers.

In addition, this particular cipher has a distinctive feature that makes codebreaking easier: the hyphen (-) seems to mark a word break. If so, most occurrences of the three-number combination "22 18 15" would be complete words, which supports the hypothesis that "22 18 15" corresponds to "the".
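
The search for a recurring three-number word can be sketched in Python as follows. The function name `split_words` is hypothetical, and the sketch assumes that "-" ends a word, that "=" at the end of a line means the word continues, and that any plaintext token also ends a word.

```python
from collections import Counter

def split_words(cipher_text):
    """Split cipher text into words, each a tuple of code numbers."""
    words, current = [], []
    for tok in cipher_text.split():
        if tok.isdigit():
            current.append(int(tok))
        elif tok == "=":
            continue  # assumed: the word continues on the next line
        else:
            # "-" (or a plaintext token) closes the current word
            if current:
                words.append(tuple(current))
            current = []
    if current:
        words.append(tuple(current))
    return words

# Two consecutive lines from the letter
text = """22 18 15 - 3 21 5 4 23 - 23 18 21 29 23 - 7 5 14 - 22 18 15 - 21 24 =
21 5 22 - 15 17 15 25 22 21 24 - 5 23 - 24 26 - 22 18 15 - 5 7 19 13 -"""

words = split_words(text)
three_number_words = Counter(w for w in words if len(w) == 3)
print(three_number_words.most_common(1))
# [((22, 18, 15), 3)]
```

Even in these two lines, "22 18 15" already appears three times as a complete word.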

The word break is a strong clue for codebreaking. We note a one-letter word "7" appearing here and there. A one-letter word in English is either "a" or "I". Given the frequency count above, the very high frequency of the number "7" suggests that it is more likely to be "a".
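
Applying a partial key by hand is tedious, so here is a minimal Python sketch of the substitution step. The name `apply_key` is hypothetical; the key holds only the four identifications made so far, and every other token is left untouched.

```python
def apply_key(cipher_line, key):
    """Replace each number found in `key` by its letter; keep the rest as is."""
    out = []
    for tok in cipher_line.split():
        if tok.isdigit() and int(tok) in key:
            out.append(key[int(tok)])
        else:
            out.append(tok)  # unsolved numbers, "-", "=" and plaintext
    return " ".join(out)

# Identifications so far: 22(T), 18(H), 15(E), 7(A)
key = {22: "t", 18: "h", 15: "e", 7: "a"}
line = "7 22 - 7 - 11 15 15 22 21 5 4 - 24 26 - 22 18 15 - 26 2 7 4 - 7 5 14 -"
print(apply_key(line, key))
# a t - a - 11 e e t 21 5 4 - 24 26 - t h e - 26 2 a 4 - a 5 14 -
```

As new identifications are made, they can simply be added to `key` and the text processed again.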

Let's look at how far the plaintext can be revealed by applying these findings:

33 40 61 a 32 84 -
a t - a - 11 e e t 21 5 4 - 24 26 - t h e - 26 2 a 4 - a 5 14 -
4 e 5 e 25 a 2 - 24 26 26 21 8 e 25 23 - h e 2 14 - t h 21 23 -
14 a 13 - 21 t - 21 23 14 e t e 25 11 21 5 e 14 - t h a t - a 12 =
24 19 e - 5 0 0 0 - 11 e 5 - 25 a 5 3 - a 5 14 - 26 21 2 e - 23 h =
a 2 2 - 12 e - e 11 12 a 25 3 e 14 - 24 5 - 12 24 a 25 14 -
t h e - 3 21 5 4 23 - 23 h 21 29 23 - a 5 14 - t h e - 21 24 =
21 5 t - e 17 e 25 t 21 24 - 5 23 - 24 26 - t h e - 5 a 19 13 -
a 5 14 - a 25 11 13 - 11 a 14 e - 21 5 - a - 26 e 9 - 14 a 13 23 -
t 24 - 25 e 2 21 e 19 e - 13 24 6 - a 5 14 - a 26 t e 25 =
9 a 25 14 23 - 8 24 24 29 e 25 a t e - 9 21 t h - 13 24 6 -
The Fleet 8 24 5 23 21 23 t 23 - 24 26 - 23 - 23 a 21 2 -
24 26 - t h e - 2 21 5 e - t h 25 e e - 24 26 - 9 h 21 8 h -
a 25 e - t h 25 e e - 14 e 8 3 e 25 23 -
t h e 25 e - 21 23 - e 19 e 25 13 -
25 e a 23 24 5 - t 24 h 24 29 e - 9 e - 23 h a 2 2 -
23 t a 25 t - 26 25 24 11 - h e 5 8 e - a 12 24 6 t -
t h e - 5 - 24 8 t 24 12 e 25 I have the Honoor to be ....

PS
a 14 11 21 25 a 2 - 14 21 4 12 13 - 21 23 - t h 21 23 -
11 24 11 e 5 t - a 25 25 21 19 e 14 - a t - t h e - h 24 24 3 -
9 21 t h - t h 25 e e - 23 a 21 2 - 24 26 - t h e - 2 21 5 e -
If a venture without knowing whether
t h e 13 - 8 a 5 - 12 e - 23 e e 5 - 12 13 - 6 23 - 21 -
25 e 1 6 e 23 t - t h a t - 21 26 - a t - 21 23 -
9 e 2 2 - 6 29 24 5 - h e a 25 21 5 4 - a - 8 24 5 23 21 =
14 e 25 a 12 2 e - 26 21 25 21 5 4 - t 24 9 a 25 14 23 -
t h e - e 5 t 25 a 5 8 e - 24 26 - 8 h e 23 a 29 e a 3 -
t h 25 e e - 2 a 25 4 e - 23 11 24 3 e 23 - 11 a 13 - 12 e -
11 a 14 e - 29 a 25 a 2 2 e 2 - t 24 - 21 t - a 5 14 -
21 26 - 13 24 6 - 29 24 23 23 e 23 23 - t h e - 29 24 23 t -
a t - 4 2 24 6 8 e 23 t e 25 - 26 24 6 25 -

21 - 23 h a 2 2 - 23 e 5 14 - a 5 24 t h =
e 25 - 25 6 5 5 e 25 - t 24 - 13 24 6 - 21 5 - a -
2 21 t 2 e - t 21 11 e -
I have received your Lordship's Letter of the 8th instant.

Further Steps

Then, one would notice sequences such as "t h 25 e e" and "t h e 25 e". Thus, most probably, "25" would be R. Further, after a plaintext word "whether", there appears a word "t h e 13". Thus, "13" would be Y.

If we look more closely, we may find further patterns that let us pin down the words they represent. For now, however, we apply the identifications 25(R) and 13(Y) and see how far we have got.

33 40 61 a 32 84 -
a t - a - 11 e e t 21 5 4 - 24 26 - t h e - 26 2 a 4 - a 5 14 -
4 e 5 e r a 2 - 24 26 26 21 8 e r 23 - h e 2 14 - t h 21 23 -
14 a y - 21 t - 21 23 14 e t e r 11 21 5 e 14 - t h a t - a 12 =
24 19 e - 5 0 0 0 - 11 e 5 - r a 5 3 - a 5 14 - 26 21 2 e - 23 h =
a 2 2 - 12 e - e 11 12 a r 3 e 14 - 24 5 - 12 24 a r 14 -
t h e - 3 21 5 4 23 - 23 h 21 29 23 - a 5 14 - t h e - 21 24 =
21 5 t - e 17 e r t 21 24 - 5 23 - 24 26 - t h e - 5 a 19 y -
a 5 14 - a r 11 y - 11 a 14 e - 21 5 - a - 26 e 9 - 14 a y 23 -
t 24 - r e 2 21 e 19 e - y 24 6 - a 5 14 - a 26 t e r =
9 a r 14 23 - 8 24 24 29 e r a t e - 9 21 t h - y 24 6 -
The Fleet 8 24 5 23 21 23 t 23 - 24 26 - 23 - 23 a 21 2 -
24 26 - t h e - 2 21 5 e - t h r e e - 24 26 - 9 h 21 8 h -
a r e - t h r e e - 14 e 8 3 e r 23 -

t h e r e - 21 23 - e 19 e r y -
r e a 23 24 5 - t 24 h 24 29 e - 9 e - 23 h a 2 2 -
23 t a r t - 26 r 24 11 - h e 5 8 e - a 12 24 6 t -
t h e - 5 - 24 8 t 24 12 e r
I have the Honoor to be ....

PS
a 14 11 21 r a 2 - 14 21 4 12 y - 21 23 - t h 21 23 -
11 24 11 e 5 t - a r r 21 19 e 14 - a t - t h e - h 24 24 3 -
9 21 t h - t h r e e - 23 a 21 2 - 24 26 - t h e - 2 21 5 e -
If a venture without knowing whether
t h e y - 8 a 5 - 12 e - 23 e e 5 - 12 y - 6 23 - 21 -
r e 1 6 e 23 t - t h a t - 21 26 - a t - 21 23 -
9 e 2 2 - 6 29 24 5 - h e a r 21 5 4 - a - 8 24 5 23 21 =
14 e r a 12 2 e - 26 21 r 21 5 4 - t 24 9 a r 14 23 -
t h e - e 5 t r a 5 8 e - 24 26 - 8 h e 23 a 29 e a 3 -
t h r e e - 2 a r 4 e - 23 11 24 3 e 23 - 11 a y - 12 e -
11 a 14 e - 29 a r a 2 2 e 2 - t 24 - 21 t - a 5 14 -
21 26 - y 24 6 - 29 24 23 23 e 23 23 - t h e - 29 24 23 t -
a t - 4 2 24 6 8 e 23 t e r - 26 24 6 r -

21 - 23 h a 2 2 - 23 e 5 14 - a 5 24 t h =
e r - r 6 5 5 e r - t 24 - y 24 6 - 21 5 - a -
2 21 t 2 e - t 21 11 e -
I have received your Lordship's Letter of the 8th instant.

Now, we note the sequence "a 26 t e r = 9 a r 14 23 - 8 24 24 29 e r a t e - 9 21 t h - y 24 6".

First, "26" would be F, giving "after". Supposing the double hyphen (=) indicates that the word continues on the next line, "9 a r 14 23" would probably be "wards", completing "afterwards". Next, the sequence "8 24 24 29 e r a t e" is distinctive for its doubled letter "24 24". A word that matches this pattern is "cooperate". The context then suggests that the next word "9 21 t h" would be "with" and "y 24 6" would be "you".

Thus, we now identify 26(F), 9(W), 14(D), 23(S), 8(C), 24(O), 29(P), 21(I), 6(U). Remarkably, occurrences of 9(W) in "wards" and "with" and those of 24(O) in "cooperate" and "you" are consistent. This supports these identifications.
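
Matching a partly deciphered word against candidate words can also be sketched in Python. The function `matches` and the short candidate list are hypothetical; a real search would run over a full word list. For simplicity, the sketch does not check that an unsolved number avoids letters already assigned elsewhere.

```python
def matches(pattern, word):
    """Check whether `word` fits a partly deciphered `pattern`.

    `pattern` mixes known letters (str) and unsolved code numbers (int).
    """
    if len(pattern) != len(word):
        return False
    number_to_letter = {}
    for item, letter in zip(pattern, word):
        if isinstance(item, str):
            if item != letter:  # a known letter must agree
                return False
        elif number_to_letter.setdefault(item, letter) != letter:
            return False  # the same number must always give the same letter
    # Different numbers must stand for different letters.
    letters = list(number_to_letter.values())
    return len(letters) == len(set(letters))

pattern = [8, 24, 24, 29, "e", "r", "a", "t", "e"]
candidates = ["cooperate", "celebrate", "desperate", "separates"]
print([w for w in candidates if matches(pattern, w)])
# ['cooperate']
```

The doubled "24 24" is what rules out the other candidates, just as in the reasoning above.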

Further, from "e 19 e r y", "19" would be V. The sequence "a r r 21 19 e 14" would be either "arrives" or "arrived", of which the latter is consistent with the above identification 14(D). The sequence "e 5 t r a 5 8 e" suggests "entrance" and this would give 5(N) and 8(C), of which the latter is consistent with the above.

Applying these findings reveals most of the plaintext.

33 40 61 a 32 84 -
a t - a - 11 e e t i n 4 - o f - t h e - f 2 a 4 - a n d -
4 e n e r a 2 - o f f i c e r s - h e 2 d - t h i s -
d a y - i t - i s d e t e r 11 i n e d - t h a t - a 12 =
o v e - n 0 0 0 - 11 e n - r a n 3 - a n d - f i 2 e - s h =
a 2 2 - 12 e - e 11 12 a r 3 e d - o n - 12 o a r d -
t h e - 3 i n 4 s - s h i p s - a n d - t h e - i o =
i n t - e 17 e r t i o - n s - o f - t h e - n a v y -
a n d - a r 11 y - 11 a d e - i n - a - f e w - d a y s -
t o - r e 2 i e v e - y o u - a n d - a f t e r =
w a r d s - c o o p e r a t e - w i t h - y o u -
The Fleet c o n s i s t s - o f - s - s a i 2 -
o f - t h e - 2 i n e - t h r e e - o f - w h i c h -
a r e - t h r e e - d e c 3 e r s -

t h e r e - i s - e v e r y -
r e a s o n - t o h o p e - w e - s h a 2 2 -
s t a r t - f r o 11 - h e n c e - a 12 o u t -
t h e - n - o c t o 12 e r
I have the Honoor to be ....

PS
a d 11 i r a 2 - d i 4 12 y - i s - t h i s -
11 o 11 e n t - a r r i v e d - a t - t h e - h o o 3 -
w i t h - t h r e e - s a i 2 - o f - t h e - 2 i n e -
If a venture without knowing whether
t h e y - c a n - 12 e - s e e n - 12 y - u s - i -
r e 1 u e s t - t h a t - i f - a t - i s -
w e 2 2 - u p o n - h e a r i n 4 - a - c o n s i =
d e r a 12 2 e - f i r i n 4 - t o w a r d s -
t h e - e n t r a n c e - o f - c h e s a p e a 3 -
t h r e e - 2 a r 4 e - s 11 o 3 e s - 11 a y - 12 e -
11 a d e - p a r a 2 2 e 2 - t o - i t - a n d -
i f - y o u - p o s s e s s - t h e - p o s t -
a t - 4 2 o u c e s t e r - f o u r -

i - s h a 2 2 - s e n d - a n o t h =
e r - r u n n e r - t o - y o u - i n - a -
2 i t 2 e - t i 11 e -
I have received your Lordship's Letter of the 8th instant.

The rest would be fairly easy to identify. The word "11 e e t i n 4" suggests 11(M) and 4(G). The sequence "4 e n e r a 2 - o f f i c e r s" would be "general officers", giving further identification 2(L). It is consistent with the next "h e 2 d" being "held". A further sequence "i t - i s d e t e r 11 i n e d" supports the above identification 11(M). (Clearly, a word break hyphen is missing after "is".) The sequence "a 12 = o v e" suggests 12(B).

Here, we notice a sequence "n 0 0 0". There is no word in English that includes the same letter three times in succession (except for a genitive such as "burgess's"). Perhaps "0" means the numeral "0" itself and does not represent a letter of the alphabet. If so, the "5" in the original cipher "5 0 0 0" should also be the numeral itself rather than "n", and the sequence should read "5000". Then, with a little familiarity with military terms, it is easy to read "5 0 0 0 - 11 e n - r a n 3 - a n d - f i 2 e" as "5000 men rank and file". This gives a further identification 3(K). By the same reasoning, the isolated "s" (23) before "s a i 2" and the isolated "n" (5) before "o c t o 12 e r" are read as the numerals 23 and 5.

"e 17 e r t i o - n s" reveals 17(X), giving "exertions". Here, the hyphen is superfluous. (In actual codebreaking, we must bear in mind that the original cipher may include errors.) The word "r e 1 u e s t" shows that "1" should be Q.

Plaintext Revealed

The complete original plaintext is now revealed.

33 40 61 a 32 84
At a meeting of the flag and
general officers held this
day it is determined that ab=
ove 5000 men rank and file sh=
all be embarked on board
the king's ships and the io=
int exertions of the navy
and army made in a few days
to relieve you and after=
wards cooperate with you.
The Fleet consists of 23 sail
of the line, three of which
are three deckers.

There is every
reason to hope we shall
start from hence about
the 5 October.
I have the Honor to be ....

PS
Admiral Digby is this
moment arrived at the Hook
with three sail of the line.
If a venture without knowing whether
they can be seen by us, I
request that if at [->all]* is
well upon hearing a consi=
derable firing towards
the entrance of Chesapeak
three large smokes may be
made parallel to it and
if you possess the post
at Gloucester, four.

I shall send anoth=
er runner to you in a
litle time.
I have received your Lordship's Letter of the 8th instant.

(*The deciphered plaintext "at" (7 22) should read "all" (7 2 2).)

Notes on the Cipher

The revealed substitution table is as follows.
 a  b  c  d  e  f  g  h  i  k  l  m  n  o  p  q  r  s  t  u  v  w  x  y  z
07 12 08 14 15 26 04 18 21 03 02 11 05 24 29 01 25 23 22 06 19 09 17 13 16

Here, the 25 letters of the alphabet (there is no "j") are substituted by code numbers in the range 1-29 (10, 20, 27, and 28 are not used). (The leading zeros in the table were added by the author merely for convenience of tabulation.)
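
With the table in hand, deciphering becomes a simple lookup. The Python sketch below uses the table given above; the names `LETTERS`, `NUMBERS`, `DECODE`, and `decipher` are hypothetical. It assumes "-" ends a word and "=" continues a word on the next line, and it makes no attempt to recognize numerals such as "5 0 0 0".

```python
LETTERS = "abcdefghiklmnopqrstuvwxyz"  # 25 letters, no "j"
NUMBERS = [7, 12, 8, 14, 15, 26, 4, 18, 21, 3, 2, 11, 5,
           24, 29, 1, 25, 23, 22, 6, 19, 9, 17, 13, 16]
DECODE = dict(zip(NUMBERS, LETTERS))

def decipher(cipher_text):
    """Turn ciphered numbers into words using the substitution table."""
    words, current = [], ""
    for tok in cipher_text.split():
        if tok.isdigit():
            # Numbers outside the table (e.g. 0) are kept as digits.
            current += DECODE.get(int(tok), tok)
        elif tok == "=":
            continue  # assumed: the word continues on the next line
        else:
            if current:  # "-" closes the current word
                words.append(current)
            current = ""
    if current:
        words.append(current)
    return " ".join(words)

text = """7 22 - 7 - 11 15 15 22 21 5 4 - 24 26 - 22 18 15 - 26 2 7 4 - 7 5 14 -
4 15 5 15 25 7 2 - 24 26 26 21 8 15 25 23 - 18 15 2 14 - 22 18 21 23 -"""
print(decipher(text))
# at a meeting of the flag and general officers held this
```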

At the time, it was common not to distinguish between "i" and "j", or between "u" and "v". In this table, "u" and "v" are assigned separate numbers but "j" is not assigned a number of its own. Thus, "joint" is spelled as "ioint" in the above letter.

The nine most frequently used numbers are 15(E), 7(A), 22(T), 24(O), 21(I), 23(S), 25(R), 5(N), 18(H), which verifies the "ETAONRISH" rule.

There remains an undeciphered sequence "33 40 61 7(a) 32 84" at the beginning. Of these six numbers, all but 7(a) are numbers not included in the above cipher table. Actually, this sequence is used as an indicator to show that the cipher table should be shifted such that "7" is aligned with "A".

While the cipher used in this particular letter from Clinton to Cornwallis was one of the easiest kinds, the British tried to give it some security by shifting the cipher table from time to time.

For example, a letter from Cornwallis to Clinton dated 8 September 1781 used the same cipher table, which, however, was shifted such that 18 was aligned with "A".
 a  b  c  d  e  f  g  h  i  k  l  m  n  o  p  q  r  s  t  u  v  w  x  y  z
18 21 03 02 11 05 24 29 01 25 23 22 06 19 09 17 13 16 07 12 08 14 15 26 04

This shifting position was indicated by "44 32 18" at the beginning of the cipher, of which high numbers "44" and "32" are nulls and "18" indicates the position. Remarkably, Cornwallis used two positions in the short ciphered passage of only nine lines. The cipher table for the second position was as follows:
 a  b  c  d  e  f  g  h  i  k  l  m  n  o  p  q  r  s  t  u  v  w  x  y  z
03 02 11 05 24 29 01 25 23 22 06 19 09 17 13 16 07 12 08 14 15 26 04 18 21

This new position was indicated by an indicator sequence "43 32 3".
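
The three tables shown above are consistent with a simple cyclic rotation of one number sequence against the alphabet. Under that assumption, the shifted table for any indicator can be sketched in Python as follows (the names `BASE` and `shifted_table` are hypothetical).

```python
LETTERS = "abcdefghiklmnopqrstuvwxyz"  # 25 letters, no "j"
# Number sequence of the table in which 7 is aligned with "a"
BASE = [7, 12, 8, 14, 15, 26, 4, 18, 21, 3, 2, 11, 5,
        24, 29, 1, 25, 23, 22, 6, 19, 9, 17, 13, 16]

def shifted_table(indicator):
    """Return the letter-to-number table with `indicator` aligned with "a".

    Assumes the sequence is rotated cyclically, as the tables above suggest.
    """
    start = BASE.index(indicator)
    rotated = BASE[start:] + BASE[:start]
    return dict(zip(LETTERS, rotated))

table_18 = shifted_table(18)
print(table_18["a"], table_18["b"], table_18["z"])
# 18 21 4
```

Calling `shifted_table(3)` in the same way gives the table for the second position.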

In some cases, a British correspondent switched the table more than twenty times in one letter. But it could not protect the secret from James Lovell, an American codebreaker (see here under "Substitution by 1-29" for details).



©2008 S.Tomokiyo
First posted on 7 July 2009. Last modified on 7 July 2009.
