In 2021, a cipher used in some of Charles I's letters written during his captivity in the Isle of Wight in 1648 was broken by Norbert Biermann and Matthew Brown (Cipherbrain). The present article considers how they were able to solve a cipher that had remained undeciphered for many years.
(I wrote this article when the reading of most of the text had been published as comments in Cipherbrain. Afterwards, Biermann and Brown, joined by Thomas Bosbach, continued the work among themselves, and the final results were published in Cipherbrain on 5 May 2021.)
I posted the following ciphertexts in "King Charles II's Ciphers during Exile" and also included them in the list in "Unsolved Historical Ciphers".
These ciphertexts were mentioned in Cipherbrain, a German cryptology blog by Klaus Schmeh, on 11 April 2021. Klaus Schmeh and Elonka Dunin chose them as one of the topics in their webinar "More famous and not-so-famous unsolved codes" (18 April 2021) hosted by the National Museum of Computing in Bletchley Park. They found a volume on the British Library website containing not only the original of the above (6 November 1648) but also additional letters (1 September 1648, 6 November 1648).
Codebreaking work began by examining known ciphers. Biermann and Brown (who worked independently at first) reconstructed ciphers corresponding to the "Second Cipher between Charles I and Henrietta-Maria in Holland and the North (Aug. 1642-July 1643)", "Cipher with the Prince of Wales (February 1647)", the one solved by John Wallis, and perhaps "Cipher between Charles I and Silius Titus (1648)", all covered in my article, "King Charles I's Ciphers".
The real discussion started on 18 April, when Matthew Brown observed from the two ciphers he independently reconstructed that "The final word in the common word block for the 2 known keys is YOU. The frequency for [615=YOU] looks about right." He also saw that it fit the context in two instances: "615 ought 428" (2 Sep.) and "And now I must command you to answer me freely to a Question, (I am confident that you will not dissemble with me) which is, if 615 211 179 ..." (3 Oct.).
Norbert Biermann had also independently reached the hypothesis 615=YOU, and made further suggestions:
614 = YOUR (perhaps from frequency and adjacency to 615)
563 = THE (perhaps from frequency and "a conditional advice concerning 563 528 456, of which you ..." (3 Oct.))
561 = THAT (perhaps from frequency, closeness to 563, and "possible 561" (2 Sep.))
557 = TO (perhaps from frequency, closeness to 563, and "557 make" (2 Sep.))
572 = UNTO (Biermann first assumed 572=TO from "adheare 572" (2 Sep.), but changed his mind because 572 occurs only once)
447 = OF (perhaps from frequency and "an advice 447" (7 Nov.))
428 = NOW (or another adverb, because of "you ought 428 to" (2 Sep.); later, this turned out to be NOT)
340/345/347/350 = IS/IT/IF/IN (Biermann initially did not specify which correspond to which; later, they turned out to be I/IN/IS/ING in this order)
From the assumption 563=THE, Biermann further observed that "(36 563 29 1 39 5 51 37 15 7 72 61)" (3 Oct.) may be "(o-the-r ...)" or "(o-the-r-s ...)" such as "others redeemed", "others returned".
Brown (who had also assumed the same for YOUR, THE, THAT, TO) proposed "(o-the-r-w-a-i-s ...)."
He made an additional suggestion:
277 = FOR because of frequency and "404 277(for) 615(you); to this I would have your speedy resolution."
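The two kinds of evidence used in these guesses, frequency and the clear text next to a code number, can be tabulated mechanically. The following is only a minimal sketch in Python: the function names are my own, and the sample string is pieced together from the fragments quoted above purely for illustration, not a transcription of any letter.

```python
import re
from collections import Counter

def tokenize(text):
    """Split a partially enciphered letter into clear words and code numbers."""
    return re.findall(r"[A-Za-z']+|\d+", text)

def code_frequencies(text):
    """Count how often each code number occurs in the text."""
    return Counter(int(t) for t in tokenize(text) if t.isdigit())

def contexts(text, code, width=2):
    """Return the tokens within `width` positions of each occurrence of `code`."""
    toks = tokenize(text)
    target = str(code)
    return [toks[max(0, i - width): i + width + 1]
            for i, t in enumerate(toks) if t == target]

# Illustrative only: fragments quoted in this article, joined together.
sample = ("615 ought 428 to ... if 615 211 179 ... "
          "404 277 615; to this I would have your speedy resolution")

freq = code_frequencies(sample)
ctx = contexts(sample, 615, width=1)
print(freq.most_common(3))
print(ctx)
```

A listing of this kind only gathers the evidence in one place. Judging whether "the frequency looks about right" for a word such as YOU still depends on the codebreaker.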
On 20 April, Biermann posted the key he identified. The numbers 1-106 represent letters (including two nulls), followed by common words from 142 (able) to 615 (you) generally in alphabetical order (but not quite).
The parenthetical phrase above turned out to be
(36 563 29 1 39 5 51 37 15 7 72 61)
o the r _ r e s p e c t s
Even with the hindsight of the initial guesses given above, I would not have been able to do this. I hope Biermann will publish how he achieved this feat.
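How the reading of the parenthetical phrase grows as the key is filled in can be shown with a short Python sketch. The key entries are only those given in the alignment above. The function name and the bracket notation for unidentified numbers are my own assumptions, and the sketch says nothing about how Biermann actually found the values.

```python
def apply_partial_key(cipher_numbers, key, unknown="[{}]"):
    """Replace each number found in `key`; leave the others visible as [n]."""
    return [key[n] if n in key else unknown.format(n) for n in cipher_numbers]

phrase = [36, 563, 29, 1, 39, 5, 51, 37, 15, 7, 72, 61]

# Stage 1: only the initial guess 563=THE is available.
early_key = {563: "the"}

# Stage 2: the values from the alignment above ("_" marks a null).
later_key = {36: "o", 563: "the", 29: "r", 1: "_", 39: "r", 5: "e",
             51: "s", 37: "p", 15: "e", 7: "c", 72: "t", 61: "s"}

early = " ".join(apply_partial_key(phrase, early_key))
later = " ".join(apply_partial_key(phrase, later_key))
print(early)
print(later)
```

Note that the later key already shows homophones: 29 and 39 both stand for r, 5 and 15 for e, and 51 and 61 for s.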
After this, more and more symbols were identified by Biermann and Brown.
The following is the decipherment of the above two ciphertexts according to the keys provided in the comment section in the Cipherbrain article as of 24 April 2021.
(It is sweet of the King to care about the prince's feelings. Biermann once believed the "mademoiselle" refers to Lucy Walter (Wikipedia), but of course, the King would not have considered her a good match. I wonder whether the "mademoiselle" refers to Anne Marie Louise d'Orléans, Duchess of Montpensier, known as "la Grande Mademoiselle" (Wikipedia). Queen Henrietta-Maria, mother of Prince Charles, contemplated the match as of 1646 (Eva Scott, The King in Exile, p.28; Plowden, Henrietta Maria, p.264).)
(Additional Note: In the timeline provided by Biermann with the final version, Lucy Walter is still mentioned, with the assumption that Lucy Walter's pregnancy with the future Duke of Monmouth (to be born in April 1649) was known to Charles I. But if the King were referring to her, he would not have needed to ask whether the prince liked her person.)
"516" (request) occurs only here. I wonder whether there is a better "r" word to fit the context.
Today, computer algorithms such as hill climbing can solve a simple cipher in an instant. One of the first such instances I am aware of is Armin Krauß's solution in 2015 of a short ciphertext in a letter to James Madison (see another article) published at Cipherbrain (then known as Klausis Krypto Kolumne, 1 August, 3 August). (When Lawren M. Smithline solved a challenge cipher to Thomas Jefferson in 2009 (another article), he used "dynamic programming", explained as "the engine that solves the scoring of all the possibilities and efficiently determines the best guesses", constructed "as a mimic to biological-sequence comparison".)
Algorithms were developed that can handle homophonic substitution ciphers. Solution of the Copiale Cipher in 2011 (see here) is an early example. Afterwards, generic algorithms were published (Amrapali Dhavare, Richard M. Low, and Mark Stamp (2013), "Efficient Cryptanalysis of Homophonic Substitution Ciphers"; Anna Lehofer (2018), "The Application of Hierarchical Clustering to Homophonic Ciphers"; Nils Kopal (2019), "Cryptanalysis of Homophonic Substitution Ciphers Using Simulated Annealing with Fixed Temperature", to just name a few that came to my notice). Today, many "homophonic solvers" are available online.
Even when there are additional elements (e.g., for syllables or words/names), if just a few words are identified by computer analysis, human efforts may recover the rest by traditional codebreaking techniques.
However, even algorithmic cryptanalytic techniques cannot handle cases with a high proportion of non-letter elements (e.g., for syllables or words/names). When the same topic was mentioned in Cipherbrain in June 2020, the discussion came to a dead end. What, then, allowed the success this time?
First, the two additional specimens found by Klaus Schmeh and Elonka Dunin provided a basis for analysis. Not only did they provide additional entry points for codebreaking, but they also diluted the noise from ciphertexts in different cipher(s) (my list provided four ciphertexts written by Charles I in the Isle of Wight, of which two turned out to be in different cipher(s)).
Then, of course, the clear text left by the King's partial encoding provided context for guessing nearby words. Moreover, the alphabetical ordering of the nomenclature is certainly a weakness that allowed guesses once a few words were identified. I believe that, as yet, these can be better exploited by humans than by computer algorithms.
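The alphabetical weakness can be illustrated with a small Python sketch. Given a few identified code groups, it lists the words from a candidate vocabulary that fall alphabetically between the nearest identified neighbours of an unknown number. The function names and the short word list are hypothetical, chosen only for illustration. Since the nomenclature is only "generally" alphabetical (557=TO stands below 561=THAT, for example), the result is a hint, not a strict constraint.

```python
from bisect import bisect_left

def bracket(known, code):
    """Return the identified words nearest below and above `code` in numeric order.

    `known` maps code numbers to lower-case words; `code` is assumed not to be
    in `known`. Either side may be None if no neighbour exists.
    """
    codes = sorted(known)
    i = bisect_left(codes, code)
    lower = known[codes[i - 1]] if i > 0 else None
    upper = known[codes[i]] if i < len(codes) else None
    return lower, upper

def candidates(known, code, wordlist):
    """Words that sort strictly between the two bracketing words."""
    lower, upper = bracket(known, code)
    return [w for w in sorted(wordlist)
            if (lower is None or w > lower) and (upper is None or w < upper)]

# Identifications quoted in this article.
known = {277: "for", 447: "of", 615: "you"}
# Hypothetical vocabulary, for illustration only.
wordlist = ["and", "king", "may", "not", "now", "of", "request", "the"]

guesses = candidates(known, 428, wordlist)
print(bracket(known, 428), guesses)
```

With these inputs, both NOW (the first guess for 428) and NOT (the final reading) survive the filter, so the ordering narrows the choice while the context still has to decide.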
In future, what can be done to apply computer techniques to this kind of cipher? In most of Charles I's ciphers (see another article) as well as in many French ciphers (another article) and probably others, letters of the alphabet are represented by low numbers. So, it is worth trying to run a solver on a ciphertext in which only one- or two-digit figures are retained. Although the nomenclature may start in the two-digit range or the letter section may continue above 100 (as in the present case, in which the letter section extends at least up to 106), if a meaningful reading is obtained for even a part of the sequences in low numbers, it may give a clue to a human codebreaker. (In doing this, one should take care to retain something where three-digit figures are removed. For example, "35 74 159 28 17" should not simply be reduced to "35 74 28 17", but to something like "35 74 X 28 17", to prevent "74 28" from being treated as a bigram.)
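The preprocessing described in the parenthesis can be sketched in a few lines of Python. The function names, the default threshold of 99 and the placeholder "X" are my own choices. For the present cipher one might raise the threshold to 106, and whether a given solver accepts a placeholder symbol would have to be checked for that solver.

```python
def mask_high_numbers(numbers, limit=99, placeholder="X"):
    """Keep numbers up to `limit`; put a placeholder where a higher one stood."""
    return [str(n) if n <= limit else placeholder for n in numbers]

def low_runs(numbers, limit=99):
    """Split the sequence into runs of low numbers, breaking at each high number.

    Bigram statistics can then be taken within each run, so that numbers
    separated by a removed code group are never counted as adjacent.
    """
    runs, current = [], []
    for n in numbers:
        if n <= limit:
            current.append(n)
        elif current:
            runs.append(current)
            current = []
    if current:
        runs.append(current)
    return runs

example = [35, 74, 159, 28, 17]
masked = " ".join(mask_high_numbers(example))
runs = low_runs(example)
print(masked)
print(runs)
```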
A preliminary analysis is also desirable to identify which ciphertexts are encoded in the same code. In the present case, of the four ciphertexts originally in my list, two turned out to be in different cipher(s). Examining the overlap of code symbols in use may help with this. (In a large project of deciphering papal ciphers, a "key clustering algorithm" was developed in George Lasry, Beáta Megyesi & Nils Kopal (2020), "Deciphering papal ciphers from the 16th to the 18th Century".)
Provided that a sufficient quantity of material is given, a computer algorithm can be used to identify at least the letter symbols. George Lasry, who has successfully applied his algorithm to solve many historical ciphers (see another article), says that a few hundred 2-digit figures would be required by automated algorithms (comments to the Cipherbrain article). Once a breach is made, human ingenuity would be able to guess code groups one by one, as was done in the present case.
Now, two of the four ciphers I originally presented here remain unsolved, and solving them seems hard unless more specimens are found.
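One crude way to examine the overlap of symbols is the Jaccard index, the proportion of distinct code numbers shared between two ciphertexts. The Python sketch below is my own illustration and is not the key clustering algorithm of Lasry, Megyesi and Kopal. The three symbol lists are invented toy data, not taken from the letters.

```python
def jaccard(a, b):
    """Share of distinct symbols common to both texts (0.0 to 1.0)."""
    a, b = set(a), set(b)
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)

def pairwise_overlap(texts):
    """Jaccard index for every pair of named ciphertexts."""
    names = sorted(texts)
    return {(x, y): jaccard(texts[x], texts[y])
            for i, x in enumerate(names) for y in names[i + 1:]}

# Invented toy data: each list stands for the code numbers of one ciphertext.
toy = {
    "A": [615, 563, 29, 5, 615],
    "B": [615, 563, 36, 5],
    "C": [12, 804, 9],
}
overlap = pairwise_overlap(toy)
for pair, score in sorted(overlap.items()):
    print(pair, round(score, 2))
```

Because different ciphers of the same period tend to draw on the same range of low numbers, a high score alone proves little. Restricting the comparison to the most frequent symbols, or to the higher code numbers, may be more informative.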