Providing Integrity for Encrypted Data with HMACs in .NET

Note: This post is a continuation of my previous article on AES Crypto in .NET

Once again I’m about to dip into and scratch the surface of cryptography. So here’s the disclaimer: This is not my job. I don’t do this for a living. Don’t ever make up your own encryption algorithms. Try not to write your own cryptographic code. Don’t take anything that I say as the advice of a security expert or as legal advice, and don’t assume that any code in this post or any linked post is correct in any way, shape, or form. If you have a project that requires cryptographic security, I suggest you find someone who has been doing this far longer than I have to write the code for you. Then have several third-party security firms review or write your security code. In short, I’m not responsible for your mistakes, or for the correctness of the code or writing presented here.

As even more of a taste of just how hard it is, I admit that I thought the last article I wrote covered most of the basics of modern encryption in .NET pretty well. The result? Nope. Missed something big. See, cryptography is hard not because it’s hard to say “take a password plus this data and make it so someone can’t see the data without the password”, but because of all the ways someone could get access to your password, or access your data when it’s not encrypted, or access your computer when it’s unlocked, or crack your password if you pick a bad password to begin with. Take a look at this reddit thread that nicely rips apart my previous article. In all seriousness, I really appreciate the feedback, especially when it comes to security, because it’s hard and the devil is in the details.

I forgot to cover data integrity in my last article. Encryption is not enough. Redditors referred to it as authentication, which is one of the uses of HMACs, but authenticity is not what’s important in the example case I presented in my original article. We are interested in “how to detect that someone or something accidentally or maliciously changed the encrypted data”. Essentially, we’re interested in cryptographic data integrity. In addition to that, there were a few other things wrong with the article that the redditors pointed out, such as my salt values being overly large, just picking CBC without covering other chaining modes, not discussing how to remove keys from memory, and referencing Jeff Atwood. The long and the short of it: they’re completely right. Getting crypto correct is hard, and good crypto systems are worth millions of dollars. You’re probably not getting paid that much to write a little encryption. So use a library that’s already been written and provides high-level abstractions, and don’t write crypto code yourself.

Alright. HMAC, what is it? The MAC stands for Message Authentication Code, the H stands for Hash. Put it all together and you have a Hash-based Message Authentication Code. It’s a keyed construction built on top of a hash function and deliberately designed to resist malicious tampering. The key is preventing malicious tampering. A normal hash function (e.g. MD5 or SHA1) would detect accidental byte tweaks, but somebody maliciously tampering with your data could tinker with it, create a new hash code for the data, replace the old hash code, and you would never be the wiser. For this reason, Message Authentication Codes generally fall into the category of keyed hash algorithms, since they mix a key or derived key with a hashing or encryption function to produce a value that an attacker can’t reproduce. That provides both data integrity and (if you happen to be in a sender and receiver role where both parties share a key) authentication.

There are, however, a couple of things to take into account. First, we shouldn’t use the same key to both encrypt and authenticate the data. The more times the same key is used, especially against a known piece of data, the more likely it is that an attack can be developed and used to figure out our key. Arguably, the final key used for encryption and the final key used for message authentication should be as different as possible. The best way to do this is to change the input such that the encryption key and the authentication key run through the KDF (Key Derivation Function) from different starting points. As an example, consider the following:

// Set up one PBKDF2 derivation per key: same password, different salt, 10000 rounds each.
var deriveKey = new Rfc2898DeriveBytes(password, passwordSalt, 10000);
var deriveHMAC = new Rfc2898DeriveBytes(password, hmacSalt, 10000);
// Pull 32 bytes (256 bits) of key material from each derivation.
var aes256Key = deriveKey.GetBytes(32);
var hmacKey = deriveHMAC.GetBytes(32);

Because the KDF mixes the password with the salt, and because we use different salts, the derived keys will already be different after only one round. So we have two derived keys: one for the actual encryption, and one to prevent tampering and provide data integrity.
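To see the salt doing its job, here’s a minimal sketch. It is an illustration only and not part of the demo code: KeyDerivationSketch, DeriveKey and RandomBytes are hypothetical names I made up for it, and the 32-byte sizes and 10000 rounds are simply the values used in this post.

```csharp
using System;
using System.Linq;
using System.Security.Cryptography;

public static class KeyDerivationSketch
{
    // Hypothetical helper: derives a 32-byte key from a password and a salt
    // using PBKDF2 (Rfc2898DeriveBytes) with 10000 rounds, as in the post.
    public static byte[] DeriveKey(string password, byte[] salt)
    {
        using (var kdf = new Rfc2898DeriveBytes(password, salt, 10000))
        {
            return kdf.GetBytes(32);
        }
    }

    // Hypothetical helper: returns a new array of cryptographically random bytes.
    public static byte[] RandomBytes(int count)
    {
        var bytes = new byte[count];
        using (var rng = RandomNumberGenerator.Create())
        {
            rng.GetBytes(bytes);
        }
        return bytes;
    }
}
```

Calling DeriveKey with the same password and two different RandomBytes(32) salts should give two different 32-byte keys, while repeating a call with the same salt gives the same key back. That second property is what lets the decryption side re-derive both keys from the stored salts.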

On a side note, it’s arguably better to authenticate the encrypted output (encrypt-then-authenticate) rather than authenticate the plaintext and then encrypt, or authenticate and encrypt independently. Again, I know it’s beating a dead horse, but cryptography is hard, and ultimately the security of a system depends on the security of the entire system, not just the individual parts.

So. We have an HMAC key, we have an encryption key, and we know that we want to encrypt, then authenticate the encrypted output. In addition, we want to make sure anything else that could easily be tampered with is also authenticated, such as our initialization vector, since any change to it changes our decrypted output. Another small advantage that’s almost not worth mentioning is that by authenticating the encrypted output instead of the plaintext, we can detect that something has changed before we even start decrypting.

So here’s an example. In this case, I chose to use HMACSHA1. There are others listed on MSDN, but I chose this particular one since it uses the same hash algorithm used internally by the KDF from my previous post, Rfc2898DeriveBytes (a.k.a. PBKDF2).

var hmac = new HMACSHA1(hmacKey);
var ivPlusEncryptedText = iv.Concat(cipherTextBytes).ToArray();
var hmacHash = hmac.ComputeHash(ivPlusEncryptedText);

In this case, we’re using our derived hmacKey, and we’re computing the hash of the initialization vector concatenated with our encrypted ciphertext. That gives us everything we need to have a self-validating “package” of data that is secured and can’t be tampered with without us knowing, unless the attacker knows our password and can derive the HMAC key, but at that point this whole discussion is pointless.
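As a rough illustration of what “can’t be tampered with without us knowing” means, here’s a small sketch that flips a bit in the IV and recomputes the HMAC. TamperCheckSketch, ComputeTag and TagSurvivesIvBitFlip are hypothetical names of my own, not part of the demo; the bit flip is the same tweak as the commented-out tampering line in the full demo code.

```csharp
using System;
using System.Linq;
using System.Security.Cryptography;

public static class TamperCheckSketch
{
    // Computes HMAC-SHA1 over the IV followed by the ciphertext.
    public static byte[] ComputeTag(byte[] hmacKey, byte[] iv, byte[] cipherText)
    {
        using (var hmac = new HMACSHA1(hmacKey))
        {
            return hmac.ComputeHash(iv.Concat(cipherText).ToArray());
        }
    }

    // Flips bits in a copy of the IV and reports whether the HMAC still matches.
    // The caller's IV array is left untouched.
    public static bool TagSurvivesIvBitFlip(byte[] hmacKey, byte[] iv, byte[] cipherText)
    {
        var original = ComputeTag(hmacKey, iv, cipherText);

        var tamperedIv = (byte[])iv.Clone();
        tamperedIv[0] ^= 0x3f;

        var recomputed = ComputeTag(hmacKey, tamperedIv, cipherText);

        // SequenceEqual is fine for this illustration; it is not a
        // timing-safe comparison for real verification code.
        return original.SequenceEqual(recomputed);
    }
}
```

For any realistic key, IV and ciphertext, TagSurvivesIvBitFlip should come back false: the attacker can change the IV, but without the HMAC key they can’t produce a matching value for the changed data.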

With decryption, remember how I said we compute the encryption and HMAC keys separately? Because we did that, and because we computed the HMAC over the encrypted data, we can perform the validation step on the data before we compute our decryption key. The only reason we would do this is so that if the data is invalid or has been tampered with, we don’t take the time to also compute the decryption key. Small things, but I wanted to explain why the key computation for the encryption and the HMAC is kept separate:

var deriveHmac = new Rfc2898DeriveBytes(password, hmacSalt, 10000);
var hmacKey= deriveHmac.GetBytes(32);
var hmacsha1 = new HMACSHA1(hmacKey);
var ivPlusEncryptedText = ivBytes.Concat(encryptedBytes).ToArray();
var hash = hmacsha1.ComputeHash(ivPlusEncryptedText);
 
if (!BytesAreEqual(hash, hmac))
   throw new CryptographicException( "Your encrypted data was tampered with!" );
 
var deriveKey = new Rfc2898DeriveBytes(password, passwordSalt, 10000);
var aes256Key = deriveKey.GetBytes(32);
 
byte[] decryptedData;

using (var transform = new AesManaged())
using (var ms = new MemoryStream(encryptedBytes))
using (var cryptoStream = new CryptoStream(ms, transform.CreateDecryptor(aes256Key, ivBytes), CryptoStreamMode.Read))
using (var plainText = new MemoryStream())
{
   // A single Read call is not guaranteed to return every decrypted byte,
   // so copy the stream into a buffer until it is exhausted.
   cryptoStream.CopyTo(plainText);
   decryptedData = plainText.ToArray();
}

So, there you go. A basic explanation of why HMACs are important, what I missed, some code, and the disclaimer to write security code at your own risk. Full demo code, demo code output, and a bunch of random links after the break.

Full Demo Code

using System;
using System.IO;
using System.Linq;
using System.Security.Cryptography;
using System.Text;
 
public class CryptoDemo
{
    public static void Main(string[] args)
    {
        string text = "";
 
        while (text == "")
        {
            TestEncryptionAndDecryption();
 
            text = Console.ReadLine();
            Console.Clear();
        }
    }
 
    public static void TestEncryptionAndDecryption()
    {
        const string myPassword = "uB3rAw3$omeP@assw0rd!";
        const string myData = "Lorem ipsum dolor sit amet, consectetur adipiscing elit. " +
                              "Morbi rutrum pulvinar purus, nec ornare neque cursus id. " +
                              "Nunc non tortor est. Morbi laoreet commodo tellus, et suscipit neque elementum eu. " +
                              "Sed velit lorem, ultricies id varius vitae, eleifend eget massa. " +
                              "Curabitur dignissim eleifend quam, sit amet interdum velit rutrum vel. " +
                              "Nulla nec enim tortor.";
 
        Print("Password", myPassword);
        Print("Plain Text", myData);
 
        var data = Encoding.UTF8.GetBytes(myData);
 
        var encryptedData = EncryptData(myPassword, data);
 
        //what if we tampered with the data?
        //encryptedData.IV[0] = (byte)(encryptedData.IV[0] ^ 0x3f);
 
        try
        {
            var decrypted = DecryptData(myPassword, encryptedData);
            Print("Decrypted Text", Encoding.UTF8.GetString(decrypted));
        } catch(CryptographicException ce)
        {
            PrintSection("ERROR: " + ce.Message);
        }
    }
 
    // Encrypts the data with a key derived from the password and returns the ciphertext
    // together with the salts, IV and HMAC needed to validate and decrypt it.
    public static EncryptedData EncryptData(string password, byte[] clearText)
    {
        PrintSection("Encryption");
 
        var passwordSalt = new byte[32];
        var hmacSalt = new byte[32];
        var iv = new byte[16];
 
        using (var rnd = RandomNumberGenerator.Create())
        {
            rnd.GetBytes(passwordSalt);
            rnd.GetBytes(hmacSalt);
            rnd.GetBytes(iv);
        }
 
        Print("Random Password Salt", passwordSalt);
        Print("Random HMAC Salt", hmacSalt);
        Print("Random Initialization Vector", iv);
 
        // Set up one PBKDF2 derivation per key: same password, different salt, 10000 rounds each.
        var deriveKey = new Rfc2898DeriveBytes(password, passwordSalt, 10000);
        var deriveHMAC = new Rfc2898DeriveBytes(password, hmacSalt, 10000);
 
        // Pull 32 bytes (256 bits) of key material from each derivation.
        var aes256Key = deriveKey.GetBytes(32);
        var hmacKey = deriveHMAC.GetBytes(32);
 
        Print("Derived AES-256 Key", aes256Key);
        Print("Derived HMAC Key", hmacKey);
 
        // AES vs RijndaelManaged: http://stackoverflow.com/questions/2289306/aesmanaged-versus-rijndaelmanaged
        using (var transform = new AesManaged())
        {
            using (var ms = new MemoryStream())
            {
                using (var cryptoStream = new CryptoStream(ms, transform.CreateEncryptor(aes256Key, iv), CryptoStreamMode.Write))
                {
                    cryptoStream.Write(clearText, 0, clearText.Length);
                    cryptoStream.FlushFinalBlock();
                }
 
                var cipherTextBytes = ms.ToArray();
                Print("Encrypted Bytes", cipherTextBytes);
 
                var hmac = new HMACSHA1(hmacKey);
                var ivPlusEncryptedText = iv.Concat(cipherTextBytes).ToArray();
                var hmacHash = hmac.ComputeHash(ivPlusEncryptedText);
 
                Print("HMAC Hash (IV + Ciphertext)", hmacHash);
 
                return new EncryptedData
                       {
                           Data = cipherTextBytes,
                           HMAC = hmacHash,
                           IV = iv,
                           HMACSalt = hmacSalt,
                           PasswordSalt = passwordSalt
                       };
            }
        }
    }
 
    public static byte[] DecryptData(string password, EncryptedData data)
    {
        if (!data.HasValidData())
            throw new CryptographicException("EncryptedData is not valid.");
 
        PrintSection("Decryption");
 
        var deriveHmac = new Rfc2898DeriveBytes(password, data.HMACSalt, 10000);
        var hmacKey = deriveHmac.GetBytes(32);
 
        Print("Derived HMAC Key", hmacKey);
 
        // Check to see if our data is valid
        if (!ValidateEncryptedData(hmacKey, data)) 
            throw new CryptographicException("Your encrypted data was tampered with!");
 
        // We don't need to derive our decryption key until after our ciphertext has been validated.
        var deriveKey = new Rfc2898DeriveBytes(password, data.PasswordSalt, 10000);
        var aes256Key = deriveKey.GetBytes(32);
 
        Print("Derived AES-256 Key", aes256Key);
 
        using (var transform = new AesManaged())
        {
            using (var ms = new MemoryStream(data.Data))
            {
                using (var cryptoStream = new CryptoStream(ms, transform.CreateDecryptor(aes256Key, data.IV), CryptoStreamMode.Read))
                {
                    // A single Read call is not guaranteed to return every decrypted byte,
                    // so copy the stream into a buffer until it is exhausted.
                    using (var plainText = new MemoryStream())
                    {
                        cryptoStream.CopyTo(plainText);
                        return plainText.ToArray();
                    }
                }
            }
        }
    }
 
    public static bool ValidateEncryptedData(byte[] hmacKey, EncryptedData data)
    {
        if (hmacKey == null) throw new ArgumentNullException("hmacKey");
 
        if (!data.HasValidData())
            return false;
 
        var ivPlusEncryptedText = data.IV.Concat(data.Data).ToArray();
 
        var hmacsha1 = new HMACSHA1(hmacKey);
        var hash = hmacsha1.ComputeHash(ivPlusEncryptedText);
 
        Print("HMAC Hash (IV + Ciphertext)", hash);
 
        return BytesAreEqual(hash, data.HMAC);
    }
 
    public struct EncryptedData
    {
        public byte[] PasswordSalt;
        public byte[] HMACSalt;
 
        public byte[] IV;
        public byte[] Data;
        public byte[] HMAC;
 
        // Note, all the lengths in here are algorithm dependent, and more specifically, they're
        // dependent on the variant of the algorithm. e.g. Depending on the block size the IV
        // could be 8 / 16 / 32...
        public bool HasValidData()
        {
            if (HMAC == null || HMAC.Length != 20)
                return false;
 
            if (IV == null || IV.Length != 16)
                return false;
 
            if (HMACSalt == null || HMACSalt.Length != 32)
                return false;
 
            if (PasswordSalt == null || PasswordSalt.Length != 32)
                return false;
 
            if (Data == null || Data.Length < 16)
                return false;
 
            return true;
        }
    }
 
    public static void Print(string message, string text)
    {
        var length = Encoding.UTF8.GetBytes(text).Length;
        Writer.WriteLine("{0} ({1} bytes, {2} bits): ", message, length, length * 8);
        Writer.WriteLine(text);
        Writer.WriteLine();
    }
 
    public static void Print(string message, byte[] bytes)
    {
        Writer.WriteLine("{0} ({1} bytes, {2} bits): ", message, bytes.Length, bytes.Length * 8);
        Writer.WriteLine(Convert.ToBase64String(bytes));
        Writer.WriteLine();
    }
 
    public static void PrintSection(string message)
    {
        Writer.WriteLine();
        Writer.WriteLine("-------------- {0} --------------", message);
        Writer.WriteLine();
        Writer.WriteLine();
    }
 
    private static readonly TextWriter Writer = Console.Out;
 
    // Checks to see if all the bytes in the two arrays are equal.
    // Returns false if either of the arrays are null or not the same length.
    // The loop always visits every byte so the time taken does not reveal
    // how many leading bytes of a forged HMAC happened to be correct.
    public static bool BytesAreEqual(byte[] array1, byte[] array2)
    {
        if (array1 == null || array2 == null || array1.Length != array2.Length)
            return false;
 
        int difference = 0;
 
        for (int i = 0; i < array1.Length; i++)
        {
            difference |= array1[i] ^ array2[i];
        }
 
        return difference == 0;
    }
}

Demo Output

Password (21 bytes, 168 bits):
uB3rAw3$omeP@assw0rd!

Plain Text (355 bytes, 2840 bits):
Lorem ipsum dolor sit amet, consectetur adipiscing elit. Morbi rutrum pulvinar purus, nec ornare neque cursus id. Nunc non tortor est. Morbi laoreet commodo tellus, et suscipit neque elementum eu. Sed velit lorem, ultricies id varius vitae, eleifend eget massa. Curabitur dignissim eleifend quam, sit amet interdum velit rutrum vel. Nulla nec enim tortor.


-------------- Encryption --------------


Random Password Salt (32 bytes, 256 bits):
WtnnWdgEW0eNH9AyYWy+H7cQzzaWoytA6ResVavfOzI=

Random HMAC Salt (32 bytes, 256 bits):
DN78HvhGnFtgD4ARaUHgkp5bhCeROPNUi2a7Aj1V1oI=

Random Initialization Vector (16 bytes, 128 bits):
kaI98fqi9KGnvH59n77Sdg==

Derived AES-256 Key (32 bytes, 256 bits):
F77Z31yr+ZDxhgHurbjBdDUzRzSbHbMiHwCI0nAan0Y=

Derived HMAC Key (32 bytes, 256 bits):
GT7S/bn3JDj/Ts18mcwXcXjO6IaHGTtOODOyeJzVzPw=

Encrypted Bytes (368 bytes, 2944 bits):
4m2SMtaJcKG9KAiwCuxrWzRIZDIIb/f6cqpMWOO3CGClakjC1b1fv9EkIj1q9sQUFV/CZuLZUDUrV2nBFA+aJtrXTPF3bC6VzvORBxg/z611NoZ+bAuP0qew9tkhzzQoM5yawDrD6B8broYR7gH7a9jK8Nq6EpLXQjfMz47SAhlFQ0LWMluDnvdpZ06pGhmg9142poRyern6/DQV7kJtFurKpRuHHSKjzxXR0CHQiK78O9EwGlvgCN72kUJE7yL1jjY8IzU+1Y9zpYJXXLxZpPDN7uCecNbFVd3btQjwsoX09MoE8Scbd9cqUehkM+eJg/p7LPZgsKzzHg3CLraVBhWHjLWy9LtI2Dj5eYJ1FZGQ/wQn8ha6Yxj6tk907JxSwa5PbAzXAJ0Myys8SXaRLk5vPSW4CSBQZ5R8PZM8xy9cgQiucoVNDNCA8ifJ+kFTHqE1x6pIKVGsYMrWFtgmZhcxe8Bp+DfQBNpOkOBTXjo=

HMAC Hash (IV + Ciphertext) (20 bytes, 160 bits):
2XFkr7quSmK4tUGQpa1Ey9Hvsos=


-------------- Decryption --------------


Derived HMAC Key (32 bytes, 256 bits):
GT7S/bn3JDj/Ts18mcwXcXjO6IaHGTtOODOyeJzVzPw=

HMAC Hash (IV + Ciphertext) (20 bytes, 160 bits):
2XFkr7quSmK4tUGQpa1Ey9Hvsos=

Derived AES-256 Key (32 bytes, 256 bits):
F77Z31yr+ZDxhgHurbjBdDUzRzSbHbMiHwCI0nAan0Y=

Decrypted Text (355 bytes, 2840 bits):
Lorem ipsum dolor sit amet, consectetur adipiscing elit. Morbi rutrum pulvinar purus, nec ornare neque cursus id. Nunc non tortor est. Morbi laoreet commodo tellus, et suscipit neque elementum eu. Sed velit lorem, ultricies id varius vitae, eleifend eget massa. Curabitur dignissim eleifend quam, sit amet interdum velit rutrum vel. Nulla nec enim tortor.

Related Links

Security Libraries

Evil Maid Attack

MAC’s and Encryption with Authentication

Barry Dorrans Encryption Session at DDD8

Protecting Memory in .NET

Galois Counter Mode
