Project Nayuki


Cryptographic primitives in plain Python

hashdemo and MD5 screenshot

Do you want to learn how to compute a cipher like AES or a hash function like SHA-256?
Here I present popular crypto algorithms in straightforward Python code, with logic that is easy to follow.

Source code

Download the complete package:
crypto-primitives-plain-python.zip

Or browse individual files:

The code is open source under the MIT license.

Explanation

Modern digital cryptography might look like black magic to the novice programmer. But actually, cryptographic functions are built from sequences of basic operations. These operations include addition, bitwise XOR, bit shifting, table look-up, looping, et cetera.
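As a rough illustration of the operations listed above, here is a minimal sketch of a made-up mixing step. The names add32, rotl32, TOY_SBOX, and toy_mix are hypothetical and do not come from the package. The step is not part of any real cipher or hash and provides no security; it only shows addition, XOR, bit shifting, table look-up, and looping side by side.

```python
MASK32 = 0xFFFFFFFF  # Keeps results within 32 bits

# A made-up 4-bit substitution table (a permutation of 0..15), for illustration only
TOY_SBOX = (0x6, 0x4, 0xC, 0x5, 0x0, 0x7, 0x2, 0xE,
            0x1, 0xF, 0x3, 0xD, 0x8, 0xA, 0x9, 0xB)


def add32(x, y):
    """Addition modulo 2^32."""
    return (x + y) & MASK32


def rotl32(x, n):
    """Rotate a 32-bit word left by n bits, built from two shifts and an OR."""
    n %= 32
    return ((x << n) | (x >> (32 - n))) & MASK32


def toy_mix(word, key):
    """One toy mixing step: add, rotate-and-XOR, then substitute each nibble."""
    word = add32(word, key)       # Addition
    word ^= rotl32(word, 7)       # Bit shifting and XOR
    out = 0
    for i in range(8):            # Looping over the eight 4-bit nibbles
        nibble = (word >> (4 * i)) & 0xF
        out |= TOY_SBOX[nibble] << (4 * i)  # Table look-up
    return out


print(hex(toy_mix(0, 0)))  # 0x66666666
```

Real algorithms such as AES and SHA-256 chain many steps of this general flavor, but with carefully chosen constants, tables, and round structures.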

The fact that cryptography is accomplished via arithmetic isn’t so obvious these days. In practice, ciphers and hash functions are hidden behind libraries with tidy interfaces. And the source code for these crypto libraries is often written in intimidating programming languages like C (with lots of preprocessor macros), assembly, or even a hardware description language (HDL). In high-level languages like Python, using a cryptographic function usually means calling out to a native function that was written in C and compiled to machine code; the algorithm is typically not implemented in pure Python, because interpreted high-level code is much slower at this kind of work.
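For comparison, this is roughly what such a tidy interface looks like from Python's standard library. The hashlib module is real and standard; in CPython its SHA-256 runs in native code, so none of the internal arithmetic is visible to the caller.

```python
import hashlib

# One call hides the entire algorithm: padding, message schedule, and 64 rounds
digest = hashlib.sha256(b"abc").hexdigest()
print(digest)
# ba7816bf8f01cfea414140de5dae2223b00361a396177a9cb410ff61f20015ad
```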

To help the curious programmer who wants to understand what really happens inside ciphers and other cryptographic primitives, I wrote implementations of popular crypto algorithms in plain, straightforward Python code and published them on this page.

My code is optimized for clarity and simplicity, not speed or memory usage. It’s easy to insert print statements into the code to examine intermediate data values. My hope is that once you understand how these implementations work, you can translate the abstract algorithms to any language of your choice or start optimizing for performance.

Some conventions to note about the code:

  • All functions (public and private) take input values as arguments and return new output values. They do not modify any lists or data structures in place. Also, all functions are pure and do not write any global state.

  • Public functions take either bytes, bytearray, list of int, or tuple of int as input; they return bytes or bytearray as output. Private (internal) functions usually use bytes or tuple of int, both of which are immutable; this provides extra defense against accidental programming errors.

  • Because the cryptographic functions use byte lists as input/output, the module cryptocommon provides utility functions to convert between byte lists, hexadecimal strings, and ASCII strings.
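The conventions above can be sketched as follows. This is only an illustration under stated assumptions: xor_bytes and the three conversion helpers are hypothetical names, not the actual functions in the package or in cryptocommon, and they simply lean on Python built-ins (bytes.fromhex, bytes.hex, str.encode).

```python
def xor_bytes(a, b):
    """Return the bytewise XOR of two equal-length byte sequences as new bytes.

    Accepts bytes, bytearray, list of int, or tuple of int; never mutates its inputs.
    """
    a = bytes(a)  # Normalize to immutable bytes
    b = bytes(b)
    if len(a) != len(b):
        raise ValueError("Inputs must have equal length")
    return bytes(x ^ y for x, y in zip(a, b))


# Hypothetical conversion helpers (stand-ins, not cryptocommon's real API)
def hex_to_bytes(s):
    return bytes.fromhex(s)


def bytes_to_hex(b):
    return bytes(b).hex()


def ascii_to_bytes(s):
    return s.encode("ascii")


key = [0x0F, 0xF0, 0xAA]                  # A list of int is accepted as input
result = xor_bytes(key, hex_to_bytes("ffff00"))
print(bytes_to_hex(result))               # f00faa
print(key == [0x0F, 0xF0, 0xAA])          # True: the input list was not modified
```

Returning a fresh immutable bytes object, rather than writing into an argument, means a caller can never be surprised by its own data changing underneath it.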

For my serious, non-pedagogical implementations of cryptographic hashes, which are optimized for speed, see these pages/projects: