Creating a Contact Chart with Microsoft Excel

Contact Analysis

In codebreaking, the next step following the basic frequency counting would be bigram counting. However, there are 676 bigrams (for 26-letter alphabets) and inspection of the most frequent few bigrams might not be very helpful. Still, putting bigram statistics in a table may reveal some characteristics of the letters. Such a table is called a contact chart because it shows contacts of each letter: how often each letter is followed/preceded by other letters.

Contact analysis is explained in, e.g., David Kahn, The Codebreakers (1967), p.100-102. An example of my own use of contact charts is found in another article (Vatican ciphers) and another article (Ormonde's cipher).

Creating a Contact Chart with Microsoft Excel

The following describes how to create a contact chart with a spreadsheet program such as Microsoft Excel. (I use "Microsoft Excel 2016" in Japanese.)


(1) First, perform frequency counting of single symbols (i.e., unigrams) (see another artcle) and put the results in Columns A and B in a descending order. (Sorting is not essential but would be convenient to focus on bigrams including most frequent symbols in a contact chart.)


(2) Second, perform frequency counting of bigrams (see another article) and put the results in Column D and E in a descending order. (Sorting is not necessary.)


(3) Copy Column A and paste it to Column G, which would be the label column for the contact chart.


(4) Copy the unigram listing in Column A (A2, A3, A4, ...) and paste it to a horizontal sequence of cells H1, I1, J1, .... To do this, after copying, right-click on cell H1, choose Paste Special (形式を選択して貼り付け(S)), put a check sign in a box for Transpose (行と列を入れ替える(E)), and click on OK.

The unigram listing pasted in (3) and (4) do not need to be complete.


(5) Enter
=OFFSET($E$1,MATCH($G2&" "&H$1,$D:$D,0)-1,0)
in cell H2. This locates the bigram $G2&" "&H$1 (that is, cell G2 followed by H1 with a space in-between) from Column D ($D:$D) and fetches its counting from Column E.


(6) Copy and paste the formula of (5) to the entire contact chart table as large as desired (with H2 at the top left and AC10 at the bottom right in the illustrated example above).

In the illustrated example, cell K3 shows that the bigram "41 40" occurs 9 times, whereas cell J5 shows the bigram "40 41" occurs 12 times. In the illustrated example, "##" shows where there was no match for the bigram.


(7) When viewing the contact chart thus created, it will be convenient to lock the label row and the label column (those inculding the gray cells in the above example) from scrolling. This can be done by, while cell G1 is selected, selecting a view (表示) tab, clicking on "Freeze Panes" (ウィンドウ枠の固定), and selecting Freeze Panes (ウィンドウ枠の固定(F)), which is the first among the three options: Freeze Panes, Freeze Top Row, and Freeze First Column.

Contact Chart in the Days of Manual Work

In the days of manual work, a contact chart was more necessary than it is today. In those days, rather than creating a contact chart from bigram statistics, a contact chart was made for counting bigrams in the first place by using tallies. The following is from Kahn p.101.

Repetition of each column head in each cell helps manual working. The symbol in each cell is accompanied by a tally above and another below. The tally above a symbol is a count of how often the symbol of the row is followed by the symbol of that column. The tally below shows the count of the row symbol preceded by the column symbols. Of course, the information is redundant because the chart is symmetrical. (That is, the count of how often A is followed by B is equal to how often B is preceded by A.)




©2018 S.Tomokiyo
First posted on 14 October 2018. Last modified on 14 October 2018.
Cryptiana: Articles on Historical Cryptography
inserted by FC2 system