Friday, June 18, 2010

Thread Safe Collection Iteration Techniques

In a multithreaded environment, every operation should be tested and analyzed from the viewpoint of thread safety. That is, for every data structure, check what will happen if it is accessed or changed from multiple threads.

Imagine we need to iterate over a collection of items and perform some action on each item. Since we're talking about threading, the iteration should be done in a thread-safe way. That is, while we are iterating over the collection, no other thread is allowed to add or remove items from it.
No problemo, you may think: do the iteration under a lock.

But it is not that simple.

The code sample below illustrates two approaches to the iteration. Both have pros and cons. More on that after the code sample.

int initialItems = 5;
ICollection<string> coll = new List<string>(initialItems);

for(int i = 1; i <= initialItems; i++)
 coll.Add("item" + i.ToString());
   
//#1 iterating with lock approach
lock(coll)
{
 foreach(string item in coll)
 {
  PerformWorkWithItem(item);    
 }
}
//

//#2 iteration over a copy 
ICollection<string> copyColl = null;
lock(coll)
{
 copyColl = new List<string>(coll);
}

foreach(string item in copyColl)
{
 PerformWorkWithItem(item);    
}
//    

void PerformWorkWithItem(string item)
{
 //
 // perform operations that can take a
 // considerable amount of time
 //
}

Welcome back.

Approach #1 holds the lock for the entire iteration, so the collection stays protected for as long as the loop runs (provided that every thread that modifies the collection takes the same lock).
The pros are:
  • Simplicity: just take the lock and do the job
  • Memory efficiency: no new objects are constructed
The cons are:
  • If PerformWorkWithItem takes a long time to complete or blocks (e.g. reading data from the network), access to the collection is blocked for a considerable amount of time
  • The work on each item also runs under the lock, whether or not it needs that protection
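
Approach #1 only works if writers cooperate. Here is a minimal sketch of that idea, assuming the list is hidden behind a small wrapper with a private lock object. The names ItemStore, Add and ForEachLocked are made up for illustration and are not part of the sample above.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical wrapper (illustrative names): every access to the list
// goes through the same private lock object.
public sealed class ItemStore
{
    private readonly object _sync = new object();
    private readonly List<string> _items = new List<string>();

    public void Add(string item)
    {
        lock (_sync)                 // writers take the same lock as readers
        {
            _items.Add(item);
        }
    }

    public void ForEachLocked(Action<string> action)
    {
        lock (_sync)                 // held for the whole loop, as in approach #1
        {
            foreach (string item in _items)
            {
                action(item);
            }
        }
    }
}
```

One thing to watch: C# locks are reentrant, so if the action calls Add on the same thread it will not deadlock, but the list gets modified in the middle of the enumeration and foreach throws InvalidOperationException.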

Approach #2 uses a different technique. It locks the collection only long enough to make a copy (snapshot) of it. The iteration and the PerformWorkWithItem calls run over the snapshot and are not protected by the lock.
The pros are:
  • Operations on collection items are done without locking the collection. If PerformWorkWithItem takes a long time to complete, the original collection is not locked as it is in #1
  • Work on the items can be scheduled on a separate thread
The cons are:
  • If the original collection is large, copying it can become expensive in both time and memory
  • Added complexity: while actions are being performed on the snapshot, the original collection may already have changed, so the snapshot can be stale
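
The stale snapshot problem from the last bullet can be sketched like this. This is only one possible way to deal with it, assuming it is acceptable to skip items that were removed after the copy was taken. SnapshotStore, SnapshotDemo and ProcessLive are hypothetical names.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical wrapper (illustrative names) around a lock-protected list.
public sealed class SnapshotStore
{
    private readonly object _sync = new object();
    private readonly List<string> _items = new List<string>();

    public void Add(string item) { lock (_sync) { _items.Add(item); } }

    public bool Remove(string item) { lock (_sync) { return _items.Remove(item); } }

    public bool Contains(string item) { lock (_sync) { return _items.Contains(item); } }

    // The lock is held only while the copy is made, as in approach #2.
    public List<string> Snapshot() { lock (_sync) { return new List<string>(_items); } }
}

public static class SnapshotDemo
{
    // Runs work on every snapshot item that is still in the store.
    // Returns the number of items actually processed.
    public static int ProcessLive(SnapshotStore store, Action<string> work)
    {
        int processed = 0;
        foreach (string item in store.Snapshot())
        {
            if (!store.Contains(item))
            {
                continue;            // removed after the snapshot was taken
            }
            work(item);              // runs outside the lock
            processed++;
        }
        return processed;
    }
}
```

Note that the Contains check only narrows the window: another thread can still remove the item right after the check, so the work itself has to tolerate that.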

Now that we know the pros and cons of these two approaches, we can deduce some hints that help choose the appropriate technique.

For instance, if PerformWorkWithItem is relatively fast and the rest of the application can afford to wait for the iteration to finish, then approach #1 is the best.

On the other hand, if PerformWorkWithItem can take a considerable amount of time and other parts of the application access the collection frequently (i.e. it is not desirable to block access to the collection for a long time), then #2 is the better fit.

P.S. There is also an approach #3, which uses lock-free data structures. But that is a whole new story and a topic for a separate post.
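
Just to give a taste of what that could look like, here is a tiny sketch. It assumes the concurrent collections from System.Collections.Concurrent (added in .NET 4) are the kind of structure meant here, which is my guess and not something this post settles. ConcurrentQueue<T> relies mostly on interlocked operations rather than one lock around the whole collection, and its enumerator works on a moment-in-time snapshot, so the queue can be modified while it is being enumerated. ConcurrentDemo and CountWhileAdding are hypothetical names.

```csharp
using System;
using System.Collections.Concurrent;

public static class ConcurrentDemo
{
    // Enumerates the queue while adding to it and returns how many
    // items the enumeration saw.
    public static int CountWhileAdding()
    {
        var queue = new ConcurrentQueue<string>();
        for (int i = 1; i <= 5; i++)
        {
            queue.Enqueue("item" + i);
        }

        int seen = 0;
        foreach (string item in queue)
        {
            // With List<string> this would make foreach throw.
            queue.Enqueue(item + "-copy");
            seen++;
        }
        return seen;
    }
}
```

The enumeration sees only the items that were in the queue when it started, so in spirit this is approach #2 with the copying handled by the collection itself.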

Tuesday, June 15, 2010

AesManaged class Key and KeySize properties issue

Today, while working with the AesManaged class, I encountered some very strange behavior.
If you have code like this, you're in trouble:

AesManaged aes = new AesManaged();
aes.Key = key;
aes.KeySize = key.Length * 8; //the problem (KeySize is in bits, key.Length is in bytes)

The problem with this code is that KeySize is set after Key.
When you set KeySize after Key, the previously specified key is discarded, and a brand new key value is generated and returned by the Key property.
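
Here is a small sketch that makes the effect visible. AesKeyDemo and KeySurvivesKeySizeAssignment are hypothetical names, and the all-zero key is for demonstration only. The pragma just silences the obsoletion warning that newer .NET versions raise for AesManaged.

```csharp
using System;
using System.Linq;
using System.Security.Cryptography;

#pragma warning disable SYSLIB0021 // AesManaged is marked obsolete on newer .NET

public static class AesKeyDemo
{
    // Returns true if the key we assigned is still in place
    // after KeySize has been assigned as well.
    public static bool KeySurvivesKeySizeAssignment()
    {
        byte[] key = new byte[32];          // 256-bit all-zero key, demo only

        using (var aes = new AesManaged())
        {
            aes.Key = key;
            aes.KeySize = key.Length * 8;   // same size, yet the key is dropped
            return aes.Key.SequenceEqual(key);
        }
    }
}
```

On the runtimes I'm aware of this returns false, even though the size did not change. Setting Key already updates KeySize to match, so the simplest fix is to not assign KeySize at all, or to assign it before Key.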

I find this behavior rather strange, especially since I could find no information describing what happens after KeySize is set.

I would expect that, once Key is set, setting KeySize would throw an exception if the new size is bigger or smaller than the size of the existing key.
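
Until then, the check can be done by hand. Below is a minimal sketch of a helper that behaves the way I would expect, assuming the caller knows the intended key size in bits. KeySetup and ApplyKey are hypothetical names, not part of the framework.

```csharp
using System;
using System.Security.Cryptography;

public static class KeySetup
{
    // Hypothetical helper: validates the key length, then sets KeySize
    // first and Key last so that nothing discards the key afterwards.
    public static void ApplyKey(SymmetricAlgorithm algorithm, byte[] key, int keySizeBits)
    {
        if (algorithm == null) throw new ArgumentNullException(nameof(algorithm));
        if (key == null) throw new ArgumentNullException(nameof(key));
        if (key.Length * 8 != keySizeBits)
        {
            throw new ArgumentException(
                "Key length does not match the requested key size.", nameof(key));
        }

        algorithm.KeySize = keySizeBits;    // size first
        algorithm.Key = key;                // key last
    }
}
```

Since the helper takes the SymmetricAlgorithm base class, an AesManaged instance can be passed in directly, e.g. KeySetup.ApplyKey(aes, key, 256).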