2024-02-20

Shift cipher in Go with blocks

In the previous post we implemented a shift cipher that accepts key of variable length. Now, the cryptographic algorithms can be classified into two groups. Stream ciphers and block ciphers. Stream ciphers process data a bit at a time and are good for continuous or unknown amounts of data like in networking. Block ciphers operate with fixed-length blocks of data and are suitable for handling variable sized data. What group does our current implementation belong to? Well, it works on a byte at a time so it sounds like a stream cipher. On the other hand, it could be regarded as a block cipher where the block size is one byte. In practice though it always enciphers or deciphers the whole message so the block size equals the message size.

And this is a problem. How come? Well, if the message is not large, it’s fine. But if it’s bigger, say 10GB, we might run out of memory because we read all of it at once

message, err := io.ReadAll(os.Stdin)

and we pass all the bytes we read to Encipher or Decipher function. But these functions only need to work on one byte at a time!

In order to turn our existing code into a practical block cipher, we don’t need to change the cipher scheme, as such. We just need to make it work with chunks, or blocks, of data. For this, there’s a special interface in the standard library’s package crypto/cipher:

type Block interface {
    BlockSize() int
    Encrypt(dst, src []byte)
    Decrypt(dst, src []byte)
}

Standard library interfaces (other famous ones are io.Reader and io.Writer) define standardized “connectors” that allow plugging together different parts of code. So let’s implement it, i.e. let’s create a type with the methods defined in the interface:

// Cipher implements crypto/cipher.Block interface.
type Cipher struct {
    key [BlockSize]byte
}

func NewCipher(key []byte) (cipher.Block, error) {
    if len(key) != BlockSize {
        return nil, fmt.Errorf("%w %d (must be %d)", ErrKeySize, len(key), BlockSize)
    }
    return &Cipher{
        key: [BlockSize]byte(key),
    }, nil
}

func (c *Cipher) Encrypt(dst, src []byte) {
    for i, b := range src {
        dst[i] = b + c.key[i]
    }
}

func (c *Cipher) Decrypt(dst, src []byte) {
    for i, b := range src {
        dst[i] = b - c.key[i]
    }
}

func (c *Cipher) BlockSize() int {
    return BlockSize
}

Fine but if you look at the signatures of the Encrypt and Decrypt functions they still take all the data as input. Maybe we have a little look at the documentation:

$ go doc crypto/cipher Block
<...>
    It provides the capability to encrypt or decrypt individual blocks. The mode
    implementations extend that capability to streams of blocks.

Aha, we need some other code, called mode, that will chop data for the Encrypt and Decrypt functions into chunks:

type BlockMode interface {                                                                                                                                  
    BlockSize() int                                                                                                                                     
    CryptBlocks(dst, src []byte)                                                                                                                        
}                                                                                                                                                           

Let’s implement also this interface (I show here only the code for encrypting):

type Encrypter struct {
    cipher    cipher.Block
    blockSize int
}

func NewEncrypter(c cipher.Block) Encrypter {
    return Encrypter{
        cipher:    c,
        blockSize: c.BlockSize(),
    }
}

func (e Encrypter) CryptBlocks(dst, src []byte) {
    if len(src)%e.blockSize != 0 {
        panic("encrypter: input not full blocks")
    }
    if len(dst) < len(src) {
        panic("encrypter: output smaller than input")
    }
    // Keep chopping block-sized pieces off the plaintext
    // and enciphering them until there are no more pieces.
    for len(src) > 0 {
        e.cipher.Encrypt(dst[:e.blockSize], src[:e.blockSize])
        dst = dst[e.blockSize:]
        src = src[e.blockSize:]
    }

}

func (e Encrypter) BlockSize() int {
    return e.blockSize
}

This is awesome and there’s one major problem. The CryptBlocks function will panic if the plaintext is not aligned with the blockSize (32 bytes in our case). It means we can work only with messages whose length in bytes is a multiple of 32. Interesting but a bit limiting. We improve on this situation by padding all messages to be multiples of 32. We define the padding scheme like this. Both the number and the value of padded bytes is equal to the difference from the nearest multiple of block size. If the message size is aligned with the block size, the number and the value of padded bytes is equal to the block size. And here’s the code:

func Pad(data []byte, blockSize int) []byte {
    n := blockSize - len(data)%blockSize
    padding := bytes.Repeat([]byte{byte(n)}, n)
    return append(data, padding...)
}

func Unpad(data []byte, blockSize int) []byte {
    n := int(data[len(data)-1])
    return data[:len(data)-n]
}

Finally we have all the necessary code to create commands that can encrypt and decrypt arbitrary data:

$ export KEY=0101010101010101010101010101010101010101010101010101010101010101
$ go run ./cmd/encipher/ -key $KEY < ../shift/testdata/tiger.txt | go run ./cmd/decipher/ -key $KEY 
The tiger appears at its own pleasure. When we become very silent at that
place, with no expectation of the tiger, that is when he chooses to appear...
When we stand at the edge of the river waiting for the tiger, it seems that the
silence takes on a quality of its own. The mind comes to a stop. In the Indian
tradition that is the moment when the teacher says, “You are that. You are that
silence. You are that.”
--Francis Lucille, “The Perfume of Silence”

See https://github.com/jreisinger/pocs/tree/main/crypto/shift-block for the full code.

2024-02-16

Shift cipher in Go with multibyte key

In the previous blog post we developed a simple crypto system. Its algorithm is based on shifting bytes by the number represented by a single byte we call a key. It means Eve has to do at maximum 256 (1 byte is 8 bits and that means 2^8 possibilities) guesses to find out the key. Let’s try to improve the situation here by supporting longer keys.

To encipher a message we go through it byte by byte and we also go byte by byte through the key. Usually the key is much shorter than the message we want to encrypt. So we need to go through the key multiple times. The math trick to do this is called modulo. A mod B is the remainder that’s left after dividing A by B as many times as you can. E.g. 5 mod 2 = 1. Modular arithmetic is sometimes called “clock arithmentic” because it wraps around like an analog clock; 12 hours later than 5 o’clock can’t be 17 o’clock, it’s 5 o’clock again. To put it another way, 17 mod 12 = 5.

To illustrate how modulo (% in Go) works let’s write a short program:

func main() {
        B := 3
        for A := range 10 {
                fmt.Printf("%d mod %d = ", A, B)
                fmt.Println(A % B)
        }
}

The program will produce this output - notice the result is never greater than 2 which is handy for a slice (or array) index:

0 mod 3 = 0
1 mod 3 = 1
2 mod 3 = 2
3 mod 3 = 0
4 mod 3 = 1
5 mod 3 = 2
6 mod 3 = 0
7 mod 3 = 1
8 mod 3 = 2
9 mod 3 = 0

OK, let’s use modulo to help us encrypt a message:

func Encipher(plaintext []byte, key []byte) []byte {
    ciphertext := make([]byte, len(plaintext))
    for i, b := range plaintext {
        ciphertext[i] = b + key[i%len(key)]
    }
    return ciphertext
}

To decrypt a message encrypted by a multi-byte key we do the same in reverse:

func Decipher(ciphertext []byte, key []byte) []byte {
    plaintext := make([]byte, len(ciphertext))
    for i, b := range ciphertext {
        plaintext[i] = b - key[i%len(key)]
    }
    return plaintext
}

Now, how do we pass a multi-byte key as a command line argument? The key, in some sense, is just a single number, no matter how many bytes it takes to express it. For example, if we had a 32-byte (that is, 256-bit) key, we could express it as either a series of 32 integers (one for each byte), or as a single very large integer. But Go’s int64 can hold only 8 bytes (or 64 bits) worth of information… There’s a neat and concise way to write large integers: as a string, using hexadecimal notation. For example the decimal number 3 735 928 559 can be represented as DEADBEEF (4 bytes) in hex, isn’t that funny? :-) If fact, any given byte can be written as exactly two hex digits, which is convenient.

$ echo hello | go run ./cmd/encipher -key DEADBEEF
F*[M�

Also notice that unlike with the single-byte version, the same plaintext letter does not always produce the same ciphertext letter. The “ll” is enciphered as “[M”. This makes the frequency analysis a lot harder for Eve.

But what troubles Eve even more is that her function for brute-forcing single key shift ciphers won’t work anymore:

func Crack(ciphertext, crib []byte) (key byte, err error) {
    for guess := 0; guess < 256; guess++ {
        plaintext := Decipher(ciphertext[:len(crib)], byte(guess))
        if bytes.Equal(plaintext, crib) {
            return byte(guess), nil
        }
    }
    return 0, errors.New("no key found")
}

She has to solve couple of issues:

  • Repeat the guessing of the key byte multiple times. The number of repetitions will be either the length of the encrypted message or some value defined by us (MaxKeyLen); whatever is smaller.
  • In order to use the Decipher function she needs to create byte slice out of a byte to match the function’s arguments type.
  • She has to check the whole key is correct after each cracked key byte.
const MaxKeyLen = 32 // bytes

func Crack(ciphertext, crib []byte) (key []byte, err error) {
    for k := range min(MaxKeyLen, len(ciphertext)) {
        for guess := range 256 {
            plaintext := Decipher([]byte{ciphertext[k]}, []byte{byte(guess)})
            if plaintext[0] == crib[k] {
                key = append(key, byte(guess))
                break
            }
        }
        if bytes.Equal(Decipher(ciphertext[:len(crib)], key), crib) {
            return key, nil
        }
    }
    return nil, errors.New("no key found")
}

The longer key is harder to brute-force but it’s still possible:

$ go run ./cmd/encipher -key DEADBEEF < ../shift/testdata/tiger.txt | go run ./cmd/crack -crib 'The tiger'
The tiger appears at its own pleasure. When we become very silent at that
place, with no expectation of the tiger, that is when he chooses to appear...
When we stand at the edge of the river waiting for the tiger, it seems that the
silence takes on a quality of its own. The mind comes to a stop. In the Indian
tradition that is the moment when the teacher says, “You are that. You are that
silence. You are that.”
--Francis Lucille, “The Perfume of Silence”

However, there is a limitation (or a bug): the crib must be at least as long as the key; it this case 4 bytes, i.e. ‘The’.

See https://github.com/jreisinger/pocs/tree/main/crypto/shift-multibytekey for all the code including tests.

2024-02-15

Shift cipher in Go

A simple way to encipher (or encrypt) some data is by using the shift cipher. We can do this in Go by going through the data byte by byte adding a key to each of the bytes. In Go bytes are equivalent to 8-bit numbers ranging from 0 to 255 (byte data type is actually an alias for uint8).

func Encipher(plaintext []byte, key byte) []byte {
    ciphertext := make([]byte, len(plaintext))
    for i, b := range plaintext {
        ciphertext[i] = b + key
    }
    return ciphertext
}

To decipher we need to do the same but in reverse, i.e. we detract the key from each byte of the enciphered data.

func Decipher(ciphertext []byte, key byte) []byte {
    return Encipher(ciphertext, -key)
}

This way Alice and Bob can exchange data in somehow secure manner. If Eve wants to learn what are they talking about she needs to know the encryption algorithm and the key. Let’s say she finds out they are using the Caesar cipher so she just needs to crack the key. The standard way to do this is called brute forcing, i.e. trying out all possibilities; in our case all possible keys. She also needs to know some bytes from the beginning of the “plaintext” data; this we call a crib.

func Crack(ciphertext, crib []byte) (key byte, err error) {
    for guess := 0; guess < 256; guess++ {
        result := Decipher(ciphertext[:len(crib)], byte(guess))
        if bytes.Equal(result, crib) {
            return byte(guess), nil
        }
    }
    return 0, errors.New("no key found")
}

If we call these functions from within commands (package main) it looks like this:

$ echo HAL | go run ./cmd/encipher
IBM
$ echo IBM | go run ./cmd/decipher
HAL
$ echo hello world | go run ./cmd/encipher -key 10 | go run ./cmd/crack -crib hell                                                                          
hello world

See shift for all the code. Most of the ideas and code come from John Arundel’s book I started to read. I plan to write the code from the book and to take notes in the form of blog posts like this one.