# Italian Cipher Letter to Madison (1780) Deciphered

A letter of 30 November 1780 from the Italian Philip Mazzei to James Madison contains lines in cipher that long remained unread (see another article). In 2015, when a German cryptology blog mentioned the page listing it among unsolved historical ciphers, Armin Krauss immediately responded with his solution (Klausis Krypto Kolumne, 1 August, 3 August).

### Solution

VI RICORDERETE DEL MIO CAR=
TEGGIO CON AQUESTO SOVRANO; EI FA MOLTA STIMA DELLE MIE
PAROLE. UN GIORNO PARLANDO SUGLI AFFARI D'AMERICA EI
DISSE "VOI AVETE PREDETTO TUTTO AQUELLO CHE E SEGUITO".

Judging from a machine translation, these lines merely remind Madison of Mazzei's past correspondence with a sovereign who valued his words, and report the sovereign's remark on American affairs: "You predicted everything that followed."

### Cipher

I sorted every occurrence of each symbol under its plaintext letter. The result given below clearly shows the consistency of the solution (except for one apparent enciphering error for V).

The cipher employs homophonic substitution. It appears simply to assign two symbols to every letter, whether of high frequency (E, O, ...) or low (F, P, ...).
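To make the "two symbols per letter" scheme concrete, here is a minimal sketch of such a homophonic cipher. The numeric symbols, the 21-letter Italian alphabet, and the function names are assumptions made for illustration; they do not reproduce Mazzei's actual symbols or key.

```python
import random

# Assumption: the traditional 21-letter Italian alphabet (no J, K, W, X, Y).
ALPHABET = "ABCDEFGHILMNOPQRSTUVZ"

def make_key(alphabet, rng):
    """Assign two distinct numeric symbols (homophones) to every letter."""
    symbols = list(range(2 * len(alphabet)))
    rng.shuffle(symbols)
    return {ch: (symbols[2 * i], symbols[2 * i + 1])
            for i, ch in enumerate(alphabet)}

def encipher(plaintext, key, rng):
    """Replace each letter by one of its two homophones, chosen at random."""
    return [rng.choice(key[ch]) for ch in plaintext if ch in key]

def decipher(symbols, key):
    """Invert the key: every symbol maps back to exactly one letter."""
    inverse = {s: ch for ch, pair in key.items() for s in pair}
    return "".join(inverse[s] for s in symbols)

rng = random.Random(1780)
key = make_key(ALPHABET, rng)
cipher = encipher("FAMOLTASTIMA", key, rng)
print(cipher)
print(decipher(cipher, key))  # round-trips to FAMOLTASTIMA
```

Because each letter has two interchangeable symbols, repeated letters need not produce repeated symbols, which flattens the frequency counts that a simple substitution would expose.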

### Cryptanalysis: Hill Climbing

The quick solution of this short cryptogram was achieved by "hill climbing," a basic technique in computer science for finding an optimal pattern among numerous possibilities. For this cryptanalysis, the possible patterns are assignments of the letters A-Z to the some 40 symbols of the cryptogram.

The general operations of hill climbing are as follows. First, some initial assignment is chosen and evaluated for how likely it is to yield an Italian plaintext. To let a computer judge whether a text looks like Italian, "trigram" patterns may be analyzed. That is, trigrams (groups of three consecutive letters) are collected from the provisional decipherment obtained with the initial assignment and compared with the known trigram characteristics of Italian. For example, a text in which trigrams such as "FQR" or "KBP" abound would get a low score, whereas "COR", "PAR", etc. would earn a high score. For the initial assignment, the score will be very low.
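The trigram evaluation described above can be sketched as a log-probability score. This is only an illustration: the function names are hypothetical, and the tiny reference text (taken from the solution above) stands in for the large Italian corpus a real attack would need.

```python
import math
from collections import Counter

def letters_only(text):
    """Keep letters only, in upper case, so trigrams run across word breaks."""
    return "".join(c for c in text.upper() if c.isalpha())

def make_scorer(reference_text):
    """Build a scoring function from the trigram counts of a reference text."""
    ref = letters_only(reference_text)
    counts = Counter(ref[i:i + 3] for i in range(len(ref) - 2))
    total = sum(counts.values())
    logp = {t: math.log10(n / total) for t, n in counts.items()}
    floor = math.log10(0.01 / total)  # penalty for trigrams never seen

    def score(candidate):
        text = letters_only(candidate)
        return sum(logp.get(text[i:i + 3], floor)
                   for i in range(len(text) - 2))
    return score

# Assumption: a toy reference; far too small for real cryptanalysis.
REFERENCE = "UN GIORNO PARLANDO SUGLI AFFARI D'AMERICA EI DISSE VOI AVETE PREDETTO TUTTO"
score = make_scorer(REFERENCE)
print(score("PARLANDO") > score("FQRKBPQR"))  # True
```

Higher (less negative) scores mean the candidate's trigrams are more typical of the reference language.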

Then, some random change is made in the assignment and the deciphering according to the changed assignment is evaluated to see whether it improves the score. If not, the change is discarded and another random change is tried. If there is an improvement, the changed assignment is adopted as an updated basis for further improvements.

This process is repeated many times to increase the score ("hill climbing") until no further improvement is found (say, within 1000 trials), at which point an optimal assignment is presumed to have been reached.
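The loop described in the preceding paragraphs might look roughly like the following. It is a sketch under stated assumptions, not Krauss's actual program: `hill_climb`, the numeric symbols, and above all `toy_score` (which simply counts letters matching a known target) are hypothetical stand-ins, and a real attack would plug in a trigram-based score instead.

```python
import random

def decipher(cipher, assignment):
    """Apply a symbol-to-letter assignment to a list of cipher symbols."""
    return "".join(assignment[s] for s in cipher)

def hill_climb(cipher, score, alphabet, rng, patience=1000):
    """Keep random single-symbol changes only when they improve the score."""
    symbols = sorted(set(cipher))
    assignment = {s: rng.choice(alphabet) for s in symbols}  # random start
    best = score(decipher(cipher, assignment))
    failures = 0
    while failures < patience:  # stop after `patience` trials without gain
        sym, letter = rng.choice(symbols), rng.choice(alphabet)
        old = assignment[sym]
        assignment[sym] = letter
        trial = score(decipher(cipher, assignment))
        if trial > best:
            best, failures = trial, 0   # adopt the change as the new basis
        else:
            assignment[sym] = old       # discard the change
            failures += 1
    return assignment, best

# Toy demonstration only: the score "knows" the answer, unlike a real attack.
TARGET = "FAMOLTA"
def toy_score(text):
    return sum(a == b for a, b in zip(text, TARGET))

cipher = [11, 4, 30, 7, 19, 25, 38]  # hypothetical symbols; A has two homophones
rng = random.Random(0)
assignment, best = hill_climb(cipher, toy_score, "ABCDEFGHILMNOPQRSTUVZ", rng)
print(decipher(cipher, assignment), best)
```

The `patience` parameter corresponds to the "say, within 1000 trials" stopping rule: the search ends once that many consecutive changes have failed to raise the score.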

A disadvantage of this algorithm is that the search may get stuck at a local optimum. That is, a solution (assignment) may be better than any of its minor variants (i.e., it is a local optimum) while some remote solution is better still (the global optimum).
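The difference between a local and a global optimum is easiest to see on a one-dimensional toy landscape. The sketch below is purely illustrative (the `heights` list and the `climb` function are made up for this example): a greedy climber only ever steps to a higher neighbour, so where it ends up depends on where it starts.

```python
def climb(heights, start):
    """Greedy ascent: step to the higher neighbour until none is higher."""
    i = start
    while True:
        neighbours = [j for j in (i - 1, i + 1) if 0 <= j < len(heights)]
        best = max(neighbours, key=lambda j: heights[j])
        if heights[best] <= heights[i]:
            return i  # no neighbour is higher: a (possibly local) peak
        i = best

heights = [1, 3, 5, 4, 2, 6, 9, 7]
print(climb(heights, 0))  # 2: stuck on the local peak of height 5
print(climb(heights, 4))  # 6: reaches the global peak of height 9
```

A common remedy, which may or may not match what was done in this case, is to rerun the search from several different starting points and keep the best result.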

According to the explanation on the blog, Krauss appears to have run a few passes of the algorithm, each time identifying more letters and syllables, and determined the few that remained by hand (with the help of Google Translate). His finishing touches were changing "LA MOLTA" to "FA MOLTA" (the relevant symbol for "F" occurring only once) and "SEGUIR" to "SEGUITO" (the relevant symbol for "R" being indistinct in the scan).