Solution of the November 2000 GCHQ Challenge

This is the solution of the GCHQ Challenge found by Markus Kuhn and Mike Bond of the Computer Security Research Group at the University of Cambridge Computer Laboratory on 2000-11-30 around 21:15 UTC. Sunday Times journalist Adam Nathan had called us around 14:30 the same day to point us to the newly released puzzle and ask us to try and break its code.

The solution message is

  NOW CALL US

[A solution of an older (January 2000) GCHQ puzzle can be found here.]

Details of the solution

Code Table

We first write a code table for the 5-bit binary code A=1, B=2, etc.:

   chr  dec  hex   binary  inv binary
  ------------------------------------
    A     1   01    00001    11110
    B     2   02    00010    11101
    C     3   03    00011    11100
    D     4   04    00100    11011
    E     5   05    00101    11010
    F     6   06    00110    11001
    G     7   07    00111    11000
    H     8   08    01000    10111
    I     9   09    01001    10110
    J    10   0a    01010    10101
    K    11   0b    01011    10100
    L    12   0c    01100    10011
    M    13   0d    01101    10010
    N    14   0e    01110    10001
    O    15   0f    01111    10000
    P    16   10    10000    01111
    Q    17   11    10001    01110
    R    18   12    10010    01101
    S    19   13    10011    01100
    T    20   14    10100    01011
    U    21   15    10101    01010
    V    22   16    10110    01001
    W    23   17    10111    01000
    X    24   18    11000    00111
    Y    25   19    11001    00110
    Z    26   1a    11010    00101

Solution of the five Bacon cipher words

http://www.gchq.gov.uk/careers/mathematicianindex.html contains a sequence of dots and commas, which encodes three letters:

  0 = .
  1 = ,

  00110 f
  10101 u
  01110 n

  -> FUN

http://www.gchq.gov.uk/benefits/benefitsindex.html contains normal and italic characters in headlines, which encodes nine letters:

  0 = normal in heading
  1 = italic in heading

  10010 r
  00101 e
  10111 w
  00001 a
  10010 r
  00100 d
  01001 i
  01110 n
  00111 g

  -> REWARDING

http://www.gchq.gov.uk/careers/linguistindex.html contains in the last line a mixture of the Latin and Cyrillic alphabets, which encodes ten letters:

  0 = Latin character
  1 = Cyrillic character

  10111 w
  01111 o
  10010 r
  10100 t
  01000 h
  10111 w
  01000 h
  01001 i
  01100 l
  00101 e

  -> WORTHWHILE

http://www.gchq.gov.uk/nap/index.html contains normal and bold characters, which encode eleven letters:

  0 = normal
  1 = bold

  00011 c
  01000 h
  00001 a
  01100 l
  01100 l
  00101 e
  01110 n
  00111 g
  01001 i
  01110 n
  00111 g

  -> CHALLENGING

http://www.gchq.gov.uk/about/cheltinfoindex.html contains lowercase and uppercase underlined characters, which encode fourteen letters:


  0 = underlined lowercase
  1 = underlined uppercase

  00001 a
  00111 g
  10010 r
  00101 e
  00001 a
  10100 t
  01100 l
  01111 o
  00011 c
  00001 a
  10100 t
  01001 i
  01111 o
  01110 n

  -> A GREAT LOCATION

Solution for the sixth message

The minus/underscore pattern on http://www.gchq.gov.uk/challenge.html matches the lengths of the five plaintext messages from the Bacon cipher in length:

  --_ / __--__-_- / _-_---_--_ / ----_---___ / _-__--_--__--- 
  FUN   REWARDING   WORTHWHILE   CHALLENGING   AGREATLOCATION

We transform the minus/underscore pattern into a bit sequence (underscore = 0, minus = 1):

  --_ / __--__-_- / _-_---_--_ / ----_---___ / _-__--_--__--- 
  110   001100101   0101110110   11110111000   01001101100111

We transform the Bacon plaintext into a bit sequence (vowel = 0, consonant = 1):

  FUN   REWARDING   WORTHWHILE   CHALLENGING   AGREATLOCATION
  101   101011011   1011111010   11011011011   01100110101001

Now we add both sequences without carry (this is also known as "not equal", "exclusive OR", "XOR", or even "GF(2) polynomial addition", where 1+1=0):

  110   001100101   0101110110   11110111000   01001101100111
  101   101011011   1011111010   11011011011   01100110101001
  -----------------------------------------------------------
  011   100111110   1110001100   00101100011   00101011001110

We read the result again as a 5-bit code of the Latin alphabet and discard the remaining two bits:

  01110 n
  01111 o
  10111 w
  00011 c
  00001 a
  01100 l
  01100 l
  10101 u
  10011 s
  10

  -> NOW CALL US

Post Scriptum

Markus Kuhn
created 2000-11-30 -- last modified 2000-12-01 -- http://www.cl.cam.ac.uk/~mgk25/gchq-challenge.html