Sunday, June 24, 2012

GC pressure series: hidden boxing

Last time we talked about GC pressure from enumerating collections using their interfaces rather than their concrete types.  It turns out that the garbage produced by using the interfaces is directly caused by boxing, which we’ll explore more in depth and in a broader context here.

Before diving into boxing and when it can implicitly occur, let’s first review some terminology.  If you’re already fluent in value types vs. reference types, feel free to skip to the boxing section.

Value types

In C#, value types are structs.  Primitives you’re very familiar with like integer, float, and Guid are structs, as well as the ones you may define yourself.  If you pass a struct as a parameter to another method, that method gets a copy of that struct – not the original struct.  Consider this example:

static void Main() {
   int seeds = 5;
   AddOneSeed(5);
   Debug.Assert(seeds == 5);
}

static void AddOneSeed(int seeds) {
   seeds++;
}

The AddOneSeed method received a copy of the integer, which is a struct.  Therefore any changes it made do not propagate back to the caller.  Note that if a “ref” keyword is added to the parameter in the method’s signature that the method accesses the caller’s original struct instead of a copy, allowing the method to modify its caller’s value.

Value types store their data directly in their container.  In the above case, the value types appear as local variables within methods, so the data of these value types is stored directly on the stack of the thread executing these methods.  This also means that value types can vary in size, and the container must know exactly how large the value type is so that enough memory can be allocated for it.

Reference types

In C#, reference types are classes.  Reference types store their data on the heap, and anyone holding a reference type is actually just holding a reference: a memory pointer to that object on the heap.  Classes may contain structs as fields.  This means that value types may actually appear on the heap.  Consider this simple example:

class Fruit {
   int seeds;
}

Fruit is a reference type that contains a field that is a value type.  This value type always appears on the heap because it’s within an object on the heap.  The memory allocated on the heap by the Fruit object actually includes the value type directly.  In the case of Fruit above, the object on the heap is 12 bytes in size: 8 for the CLR header that exists for every object, and 4 more bytes for the seeds field.

Boxing: When a value type becomes a reference type

Sometimes a value type must be represented as a reference type.  The CLR allows anything to be represented as a System.Object.  But System.Object is a reference type.  Anyone with a reference to Object expects it to be a pointer to an object on the heap, and the size of Object to the holder of the reference is always 4 bytes regardless of the size of that object on the heap (or 8 bytes in a 64-bit process).  How can a value type, which may vary in size, be assigned to a System.Object, which is always exactly 4 bytes?  Boxing.

Boxing happens when the compiler has to assign a value-type to a reference type.  When boxing, the CLR allocates a new object on the heap just large enough to store the particular value type being boxed, and copies the value type into that object.  Now the reference to that object can be passed around simply as System.Object without anyone knowing how large the object really is. To actually look at the value type again, it must be unboxed, where the value is copied back onto the stack, and possibly into a field typed as the matching value type.

Let’s consider the simplest example of boxing.  We have an integer value type on the stack as a local variable in this method.  Then we assign this value type to a reference type.  IComparable is an interface.  Value types can implement interfaces.  But as soon as you refer to the value type by an interface it implements it gets boxed.  Check this out:

static void Main() {
   int number = 5;
   IComparable comparable = number;
}

The C# compiler compiles the above code to the below IL:

.locals init (
        [0] int32 number,
        [1] class [mscorlib]System.IComparable comparable)
nop 
ldc.i4.5 
stloc.0 
ldloc.0 
box int32
stloc.1

It’s not important that you understand all the IL, but notice the box instruction.  This tells the CLR to allocate an object and copy the value at the top of the stack into it, and then push a reference to that object onto the stack in its place.  The last instruction then assigns that object reference into the local variable typed as an IComparable. 

But wait… if a struct implements an interface already, why does it have to be boxed when it is assigned to a variable typed as that interface?  The answer lies in its size.  Anything can implement an interface (struct or class), and sizes can vary.  But folks holding a reference to an interface have to know in advance how much space to allocate for that reference.  Since value types vary in size, value types don’t qualify.  Instead, when you’re dealing with interfaces you’re always dealing with reference types, meaning that value types get boxed inside an object on the heap as a reference type.

The important thing to realize here is that although the C# code did not have a new keyword anywhere, an object was allocated anyway.  This is under-the-covers magic by C# and the CLR so that things just work, and normally it’s a good thing.  But this is a post on GC pressure, so of course we’re going to study how boxing can work against you, and how you can avoid it.

How to avoid boxing

The easiest way to avoid boxing is to always refer to a value type by its concrete type rather than by System.Object or an interface the value type implements.  You can still call methods on that struct that implement interfaces just by calling those methods directly.  For example, consider these two calls to CompareTo:

static void Main() {
   int number = 5;

   // CompareTo is found on the IComparable interface
   IComparable comparable = number; // box number
   int result = comparable.CompareTo(8); // box 8

   // CompareTo is also a public method on int directly.
   // In fact two overloads of CompareTo are available
   // on int, one that takes System.Object and another
   // that takes an int, so the C# compiler will choose
   // to call the one that takes int, avoiding both 
   // boxes that the above example causes.
   int result2 = number.CompareTo(8);
}

The example demonstrates two things.  First is the discrete second boxing that can happen.  Once we’ve boxed the int to an IComparable, there is only one CompareTo method for the compiler to choose from: CompareTo(object).  That means that in addition to having already boxed the 5, the 8 must also be boxed in order to call CompareTo(object).  On the other hand, leaving 5 as an integer avoids the first box, and allows the CompareTo(8) method to bind to int.CompareTo(int), which means the 8 can also remain as a value type.  To better understand what is happening, consider how the Int32 and IComparable types are defined:

public struct Int32 : IComparable, IFormattable, IConvertible, IComparable, IEquatable {
   public int CompareTo(int value);
   public int CompareTo(object value);
}

public interface IComparable {
   int CompareTo(object obj);
}

As you can see, when you have an IComparable reference, all the compiler can do is call CompareTo(object). But when you have an Int32, you also can call CompareTo(int) and avoid both boxes.

Keep in mind as well that returning a value type from a method whose signature returns an interface also boxes:

static IComparable GetNumber() {
   return 5; // boxed!
}

Some pathological cases

An occasional box here and there isn’t going to hurt the performance of your app.  But there are some things you can do that can lead to enough boxing to really hurt in frequently called code.

Boxing in a loop

Suppose you have no choice but to box a value type to call a method that takes an interface.  If you were calling that method in a loop with the same value type each time, you can take care to avoid boxing for each iteration of the loop.

static void BoxEachIteration() {
   int number = 5;
   for (int i = 0; i < 500; i++) {
      Work(number, i);
   }
}

static void BoxJustOnce() {
   int number = 5;
   IComparable boxedNumber = number;
   for (int i = 0; i < 500; i++) {
      Work(boxedNumber, i);
   }
}

static void Work(IComparable number, int iteration) {
}
Boxing in non-mutating methods

When you implement public methods that don’t mutate state, you should generally be very careful not to box at all.  For example, folks expect that retrieving a value from a dictionary should have no side effects including not allocating memory.  This is important because your cheap, non-mutating methods are often the ones that end up being called in loops which may end up being perf critical to the application.

Earlier this week I investigated a performance problem that I found was due to GC pressure caused by a hashtable-like collection that boxed in its Get method.  Ouch.  Here was the Get method:

public T Get(string key)
{
    return this.Get(new KeyedObject(key));
}

There's a new keyword there, but KeyedObject was a struct, so supposedly no heap allocations there. But it gets more subtle. Take a look at how the struct is defined, and how the Get overload that is being called is defined:

private struct KeyedObject : IKeyed
{
    private string name;
    internal KeyedObject(string name);
    string IKeyed.Key { get; }
}

public T Get(IKeyed item)
{
  // ...
}

Do you see the problem? Although the Get(string) method creates a struct, which doesn’t cause GC pressure, it then passes that struct to a method that accepts an interface.  That struct implements that interface, so while the C# compiler allows it, it first has to box the struct, which means all the work the authoring developer went to by using a struct to avoid allocations didn’t actually avoid the allocation after all.  Since this Get(string) method was being called thousands of times in this perf critical scenario, we were generating garbage like crazy and the GC had to keep interrupting the app to clean up.

One possible fix to this is to add a Get(KeyedObject) overload that does the work directly with the value type.  Then the C# compiler would call that one instead of the one taking the interface, and the boxing would disappear.

Non-generic collections

Generic collections aren’t just sugar.  They can help performance tremendously.  Consider a table of value type pairs.  If you use the non-generic Hashtable collection, where keys and values are typed as System.Object, each key and value must get boxed before storage.  That at least is a somewhat fixed cost – it’s not directly garbage because the box is persistent for as long as that entry is in the collection.  But now think about every lookup operation.  Let’s say you have a hashtable where the keys and values are both integers.  You have a key and you want to look up the value.  You are compelled to box the key first, making it impossible to query for values without allocating memory.

var table = new Hashtable();
table[5] = 8; // boxes 5 and 8
int value = (int)table[5]; // box 5 (again)

If on the other hand you use Dictionary<int, int>, all the internal storage mechanisms and methods are strongly typed to take and return integers rather than System.Object, so no boxing occurs. This can have a dramatic impact on performance due to GC pressure as well as memory availability since each integer takes only 4 bytes in memory instead of the minimum 12 bytes required by every GC heap allocated object.

Should I define my type as struct or class?

That’s a broad question and there is more to consider than GC pressure when making this decision.  In my personal experience, almost every time I’ve nobly started with a struct I’ve ended up converting it to a class.  So my personal default is to use class unless struct is required.  There are several reasons for this, but most pertinent to this post are two things to consider:

  1. Structs allow you to allocate and pass around the values without allocating memory on the heap.  Classes must always be allocated on the heap.
  2. Structs can be boxed many times, but classes will only allocate memory once

Every time you assign a struct such that boxing is required, a new object is allocated.  So the same struct can actually end up acquiring lots of memory and producing lots of garbage.  Whereas if you’d just chosen to use a class, the memory is allocated once and shared across all who reference it.  So be careful when you define structs and ask yourself how folks with that struct will use it.  If the likelihood is high that most instances of the struct will get boxed sometime in their lifetime, consider making it a class instead to help keep the GC pressure down.

1 comment:

  1. Nice article, thanks for the information. We just had this problem in one of our systems :)

    ReplyDelete