Tuesday, June 17, 2008

Hashing in .NET (cryptography related battle tactics)




If you expected a post about hashish or hash browns, you are out of luck. (By the way, I do like hash browns, as well as this great Japanese liquor ;) )

I will be talking about hashing as it relates to cryptography and security. Hashing can be described as the process of deriving a small, fixed-size digital "fingerprint" from any kind of data. Those interested in general information can get it here.

In .NET, security and cryptography functionality lives in the System.Security.Cryptography namespace. Our hero of the day is the SHA1 algorithm, which the .NET class SHA1Managed implements. Following the .NET cryptography model, SHA1Managed derives from the abstract class SHA1. The same holds for other hash algorithms, e.g. MD5, whose concrete implementations derive from the abstract class MD5. Both SHA1 and MD5 inherit from the HashAlgorithm class, and any new hashing algorithm added to the .NET Framework will very likely inherit from HashAlgorithm as well.
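Because every algorithm shares the HashAlgorithm base class, code can be written once against that base type. Here is a minimal sketch of the idea. The class and method names (HashInfo, DigestLength) are made up for illustration, and the factory methods SHA1.Create() and MD5.Create() are used instead of SHA1Managed simply to stay independent of a concrete implementation.

```csharp
using System;
using System.Security.Cryptography;

// Hypothetical helper, not part of the framework.
static class HashInfo
{
    // Accepts any algorithm derived from HashAlgorithm and
    // returns the length of the digest it produces, in bytes.
    public static int DigestLength(HashAlgorithm algorithm, byte[] data)
    {
        byte[] hash = algorithm.ComputeHash(data);
        return hash.Length;
    }
}
```

Called with SHA1.Create() this returns 20 (160 bits), and with MD5.Create() it returns 16 (128 bits); the calling code does not change.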

There are three ways to calculate a hash value for some data.

  1. Use ComputeHash method

  2. Use TransformBlock/TransformFinalBlock directly

  3. Use CryptoStream

I'll show how to use each of the approaches mentioned above. Let's assume we have some application data:

RNGCryptoServiceProvider rng = new RNGCryptoServiceProvider();
byte[] data = new byte[5 * 4096 + 320];
//fill data array with arbitrary data
rng.GetNonZeroBytes(data);
//initialize HashAlgorithm instance
SHA1Managed sha = new SHA1Managed();

The first way:

byte[] hash1 = sha.ComputeHash(data);

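It is very straightforward and simple: pass the data in and get the hash out. But this call is not suitable when the hash has to be calculated over several byte arrays, or when the data is too large to hold in memory as one array (e.g. calculating the hash of a large binary file). (ComputeHash does have a Stream overload, but the byte-array form shown here needs all the data at once.)
This leads us to the second way: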
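To make the "several byte arrays" limitation concrete, here is a small sketch. The class and method names are hypothetical. It relies on the fact that ComputeHash resets the algorithm's state after each call, so a second call starts from scratch instead of continuing the first one.

```csharp
using System;
using System.Security.Cryptography;

// Hypothetical demo class, for illustration only.
static class ComputeHashDemo
{
    // Returns the SHA1 digest of data as a lowercase hex string.
    public static string Sha1Hex(byte[] data)
    {
        using (SHA1 sha = SHA1.Create())
        {
            byte[] hash = sha.ComputeHash(data);
            return BitConverter.ToString(hash).Replace("-", "").ToLowerInvariant();
        }
    }

    // Hashes 'first' and then 'second' on the same instance and returns
    // the result of the second call. Because ComputeHash resets the state,
    // this is the hash of 'second' alone, not of first + second.
    public static string SecondOfTwoCalls(byte[] first, byte[] second)
    {
        using (SHA1 sha = SHA1.Create())
        {
            sha.ComputeHash(first);
            byte[] hash = sha.ComputeHash(second);
            return BitConverter.ToString(hash).Replace("-", "").ToLowerInvariant();
        }
    }
}
```

For the ASCII bytes of "abc", Sha1Hex returns the well-known test vector a9993e364706816aba3e25717850c26c9cd0d89d.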

int offset = 0;
int blockSize = 64;
//reset algorithm internal state
sha.Initialize();
while (data.Length - offset >= blockSize)
{
    offset += sha.TransformBlock(data, offset, blockSize,
                                 data, offset);
}
sha.TransformFinalBlock(data, offset, data.Length - offset);
//get calculated hash value
byte[] hash2 = sha.Hash;

This way is much more flexible: we can reuse the HashAlgorithm instance (by calling Initialize) and calculate hash values for large data objects piece by piece.
However, to hash a file we still have to write additional code that reads chunks from the file and passes them to TransformBlock.
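That additional code might look roughly like the sketch below. The class name, method name and the bufferSize parameter are assumptions made for illustration. It takes any Stream, so a FileStream works the same way as the MemoryStream one might use for testing.

```csharp
using System;
using System.IO;
using System.Security.Cryptography;

// Hypothetical helper, for illustration only.
static class ChunkedHash
{
    // Reads 'input' in chunks of up to bufferSize bytes and feeds each
    // chunk to the algorithm, so the whole input is never held in memory.
    public static byte[] Compute(HashAlgorithm algorithm, Stream input, int bufferSize)
    {
        byte[] buffer = new byte[bufferSize];
        // reset algorithm internal state
        algorithm.Initialize();
        int read;
        while ((read = input.Read(buffer, 0, buffer.Length)) > 0)
        {
            // output buffer is the same array at the same offset,
            // mirroring the TransformBlock call used in this post
            algorithm.TransformBlock(buffer, 0, read, buffer, 0);
        }
        // finish with an empty final block; all data was already consumed
        algorithm.TransformFinalBlock(buffer, 0, 0);
        return algorithm.Hash;
    }
}
```

The result should match what ComputeHash returns for the same bytes, regardless of the buffer size chosen.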

Finally, the third way:

//reuse SHA1Managed instance
sha.Initialize();
MemoryStream memoryStream = new MemoryStream(data, false);
CryptoStream cryptoStream = new CryptoStream(memoryStream, sha,
                                             CryptoStreamMode.Read);
//temporary array used by the CryptoStream to store
//data chunk for which hash calculation was performed
byte[] temp = new byte[16];
while (cryptoStream.Read(temp, 0, temp.Length) > 0) { }
cryptoStream.Close();
byte[] hash3 = sha.Hash;

Isn't it beautiful? CryptoStream can read from any Stream object, so calculating the hash value of a large file isn't a problem: just pass a FileStream to the CryptoStream constructor!
Under the hood CryptoStream uses TransformBlock/TransformFinalBlock, so the third way is derived from the second one.
CryptoStream links data streams to cryptographic transformations: it can be chained with any object that implements Stream, so the streamed output of one object can be fed into the input of another.
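As a sketch of the FileStream case, assuming a hypothetical helper named ComputeFileHash and an arbitrarily chosen 4096-byte read buffer:

```csharp
using System;
using System.IO;
using System.Security.Cryptography;

// Hypothetical helper, for illustration only.
static class StreamHash
{
    // Hashes the file at 'path' by pulling its contents through a CryptoStream.
    public static byte[] ComputeFileHash(string path)
    {
        using (SHA1 sha = SHA1.Create())
        using (FileStream file = File.OpenRead(path))
        using (CryptoStream cryptoStream = new CryptoStream(file, sha, CryptoStreamMode.Read))
        {
            // scratch buffer; its contents are discarded, we only want the hash
            byte[] temp = new byte[4096];
            while (cryptoStream.Read(temp, 0, temp.Length) > 0) { }
            // reading to the end finalizes the hash, so it can be fetched now
            return sha.Hash;
        }
    }
}
```

The using blocks close the file and the CryptoStream even if reading fails, which the explicit Close call would not do.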

The first approach is good when you calculate hash values only occasionally.
The second and third are best when a large part of your application's work involves hash calculations (for example, cryptography in network I/O).
