Genetic Algorithm and Simulated Annealing Cipher Solver

To run the program on Unix or GNU/Linux:

  1. Make sure the following files are executable:
  2. Execute one of the three above files, you will need to provide two pieces of information:
    For example: ./crypto_sa.py `cat cipher_text`
To run the program in Windows:
  1. Run an interactive Python session on the command-line in the folder containing all the files
  2. Import the files
  3. Copy one of the main functions line-by-line into the interactive session (you'll have to copy the sample text in by hand)
  4. Once you actually make the call to the appropriate function (simulatedAnnealing or geneticAlgorithm), it will take a long time to return, but will print the status at each step through the loop

This was a project in my AI class. We were instructed to use simulated annealing and genetic algorithms to solve mono-alphabetic substitution ciphers. Unfortunately solving such ciphers is not a good application for these algorithms, but they can eventually provide decent results. Once the programs start running, they will output the current best solution after each cycle through the loop and will eventually print out the best solution they found just before exiting. Often they do not work well with short ciphertexts and/or small sample texts. However, they certainly work if you use the plaintext as the sample text, but that, of course, defeats the purpose. As with all algorithms of these kind, they are meant to be efficient ways to approximate a solution and are in no way guaranteed to produce the correct solution.

There are three main files and each attempts to decrypt the cipher using different methods. They are as follows:

Simulated Annealing
Simulated annealing is a process based on the metallurgical process of controlled cooling of metal known as annealing. The idea is that the solution is slowly "cooled" over time. At each time step a successor state is generated. The successor is automatically accepted if it is better than the current state, but it is conditionally accepted if it is worse than the current state based on the current temperature. At first (higher temperatures), much worse solutions are accepted, but as the temperature goes down, acceptable successor solutions are not allowed to be as much worse. The hope is that by sometimes accepting bad solutions, we can prevent ourselves from getting stuck in a local maxima of the solution space because the worse solution will be in a very different part of the solution space.

Genetic Algorithms
Genetic algorithms are based on evolution. The solution is encoded as a genome, often represented by a bit-string. Many random genomes are generated and rated for fitness (i.e. how good of a solution it is). Then the best solutions are "bred" together in a method similar to sexual reproduction that creates child genomes which are added to the population. As new solutions are added, bad solutions are removed from the population. The simulation is run for a long time and, hopefully, a sort of natural selection will raise the overall fitness of the population.

  • Files:
    1. Simulated annealing source file
    2. Standard genetic algorithm source file
    3. Genetic algorithm with full replacement source file
    4. Header file that contains code common to both genetic algorithm implementations
    5. Header file that contains code common to all of the cipher solvers
    6. An example ciphertext
    7. An example sample text (Alice's Adventures in Wonderland by Lewis Carroll)
    8. Zip file containing all of the above files as well as precompiled copies of the headers