I’ve spent a few days analysing the performance of a C# WPF
application. The client’s users were
concerned that performance across certain areas just didn’t feel as fast as it
ought to be – particularly during busy times of a trading day.
That might sound pretty vague, but often as developers that’s
all we get.
There are plenty of resources available that go into a great
level of detail on the best practices to follow when designing apps. It helps to understand what .NET does under
the hood, in many aspects, but it’s important to remember that there are many
layers in which subtle coding designs can negatively affect performance.
I’m going to describe some interesting implementation
details which I discovered were the cause of performance issues that the users
were complaining about. This particular blog
covers structs and how slow Equals calls can be – please see my other blog
entry for similar results found by examining the default GetHashCode
implementation.
Summary of Results (Bottom Line)
If you don’t want to read how these numbers were created,
then the bottom line is, if you are
using structs to store data and these structs regularly have their Equals or
GetHashCode default methods called - either directly or indirectly (by storing
in a Dictionary), then in terms of
performance you should override the methods and implement your own Equals and
GetHashCode methods.
If you’re using a struct and you do not override the Equals or
GetHashCode methods, then you’re deferring down to the default
implementations. These are shockingly
slow, as they use Reflection to look at the fields defined in your struct.
The chart below shows (times in milliseconds) how slow it
can be by repeatedly calling Equals
on a struct that does/does not have the Equals method overridden.
Use of Class Versus Struct To Store Data
Commonly known facts:
·
Class instances get allocate on the heap
·
Class instances get via a copied pointer
reference (size is 4/8 bytes for 32/64 bit process)
·
Class instances have additional overhead of 8/16 bytes for 32/64 bit process (in fact
.NET aligns objects to the minimum of 12/24 bytes object size for 32/64 bit
processes)
·
Structs have no additional overhead, the amount
of memory is the sum of all fields in the struct
·
Structs declared in a class exist on the heap
(not the stack!)
·
Structs, by default, when passed to a method are
copies of the original struct (a byte-by-byte copy), so any changes made to the
struct from inside the method are local to that method
With this in mind, it appears that the original application developer
used a struct to store minute-by-minute stock market pricing data, rather than
a class. I can see their thinking: in a
32 bit process the amount of memory required to store an array of 10’s of
thousands structs would have been about half that required if a class were used
(there is even more of difference in a 64 bit process)
In addition to using structs to store the market data info,
various supporting calculations required value equality comparison and were stored
in Dictionary based collections, hence a large of calls to Equals and
GetHashCode:
public override int GetHashCode()
public override bool Equals(object obj)
Unfortunately, these structs did not to override the
GetHashCode and Equals methods, therefore using .NET’s default values, which resulted
in some rather surprising performance issues.
Blittable vs Non-Blittable Types
Performance can be further decreased by the type of fields
that the default implementation has to reflect over. Non-blittable
types, are those which do not have the same structure in memory in managed and
unmanaged code, and as such cannot be shared directly – reducing performance of
the default method implementations.
Commonly used blittable type (there are many more):
- System.Byte
- System.SByte
- System.Int16
- System.UInt16
- System.Int32
- System.UInt32
- System.Int64
- System.IntPtr
- System.UIntPtr
Commonly used non-blittable types:
- System.Boolean
- System.Char
- System.Object
- System.String
For further info, see http://en.wikipedia.org/wiki/Blittable_types
Test Code Generate Results
I’m going to walk through a selection of code snippets to
show how much of a performance difference you can see by implementing Equals
and GetHashCode.
As part of my profiling samples, I’ve created a base class (TimedActionBase) that, when used in
conjunction with NUnit and Resharper, makes it really easy to run snippets of
test code with a single mouse click and provide a statistical average for the
area under test. You’ll find a link to
the code at the bottom of this article.
You won’t need Resharper to compile the test code, but
you’ll need to run Install-Package NUnit using
the Package Manager Console to download and install NUnit from NuGet.
I’ll just summarise the base test class here, as it’s the actual tests that are more important.
Each test class derives from TimedActionBase and implements the TimedAction() method. This method gets called once initially (unloading any timed JIT overhead) and then is called a set number of times, with a percentile calculation used to provide a meaningful idea of how long the method takes to execute. I can then run SinglePassRunner() if a want a single timed pass (this calls the test method 100 times), or, in order to build up the data for the graphs MultiPassRunner():
using System;
using System.Collections.Generic;
using System.Diagnostics;
using NUnit.Framework;
[TestFixture]
public abstract class TimedActionBase
{
private readonly List<TimeSpan> _durations = new List<TimeSpan>();
private readonly Stopwatch _timer = new Stopwatch();
private bool _headerLogged;
protected abstract void TimedAction();
[Test]
public void SinglePassRunner()
{
TimedActionRunner();
}
[Test]
public void MultiPassRunner()
{
const int NoOfPasses = 180;
for (var i = 0; i < NoOfPasses; i ++)
{
TimedActionRunner();
}
}
private void TimedActionRunner()
{
const int DefaultTestCycles = 100;
// Prerun TimedAction method in case there's any JIT overhead
TimedAction();
_durations.Clear();
for (var cycle = 1; cycle <= DefaultTestCycles; cycle++)
{
_timer.Start();
TimedAction();
_timer.Stop();
var duration = _timer.Elapsed;
_durations.Add(duration);
_timer.Reset();
}
CalculateStatistics();
}
private void CalculateStatistics()
{
var average = _durations.AverageTotalMilliseconds();
var min = _durations.MinTotalMilliseconds();
var max = _durations.MaxTotalMilliseconds();
var deviation = _durations.Deviation();
const double Percentile = 95D;
var percentile = _durations.Percentile(Percentile);
var header = string.Empty;
if (!_headerLogged)
{
_headerLogged = true;
header = "Timings(ms) \tPercentile\tAverage\tMinimum\tMaxmimum\tDeviation\n";
}
Log(string.Concat(header, string.Join("\t",
new[]
{
GetType().Name,
FormatTimeSpan(percentile), FormatTimeSpan(average),
FormatTimeSpan(min), FormatTimeSpan(max),
FormatTimeSpan(deviation)
})));
}
private static string FormatTimeSpan(TimeSpan span)
{
return span.TotalMilliseconds.ToString("N0");
}
protected static void Log(string format, params object[] args)
{
Console.WriteLine(format, args);
}
}
To compare the performance of different implementations, all
of the sample tests below follow the same basic steps:
1.
Define a struct with slightly different
implementation under test (eg, with or without Equals)
2.
Create a List containing 50,000 structs
3.
Create a single comparison struct
4.
Walk through the list using the .Equals method and
calculate how long that timed operation took
1. Not Overriding Equals() With Non Blittable Fields (Slowest Speed)
Equals Overridden
|
Non-Blittable Fields
|
Blittable Fields
|
Typical Milliseconds
|
No
|
3
|
0
|
117
|
class SlowNonBlittableWithoutEqualsTest : TimedActionBase
{
struct MyStruct
{
public Decimal Field1;
public Decimal Field2;
public Decimal Field3;
}
private readonly List<MyStruct> _items = Enumerable
.Range(1, EqualsTestParams.NumberOfItems)
.Select(n => new MyStruct
{
Field1 = n + 1.1M,
Field2 = n + 2.2M,
Field3 = n + 3.3M
})
.ToList();
private readonly MyStruct _comparison = new MyStruct
{
Field1 = 1.1M, Field2 = 2.2M, Field3 = 3.3M
};
protected override void TimedAction()
{
var allEqual = _items
.Select(item => item.Equals(_comparison))
.ToList();
}
}
2. Not Overriding Equals() With Mixture of Blittable/Non Blittable Fields (Slow)
Equals Overridden
|
Non-Blittable Fields
|
Blittable Fields
|
Typical Milliseconds
|
No
|
2
|
1
|
90
|
class SlowBlittableMixtureWithoutEqualsTest : TimedActionBase
{
struct MyStruct
{
public Decimal Field1;
public int Field2;
public Decimal Field3;
}
private readonly List<MyStruct> _items = Enumerable
.Range(1, EqualsTestParams.NumberOfItems)
.Select(n => new MyStruct
{
Field1 = n + 1.1M,
Field2 = n + 2,
Field3 = n + 3.3M
})
.ToList();
private readonly MyStruct _comparison = new MyStruct
{
Field1 = 1.1M, Field2 = 2, Field3 = 3.3M
};
protected override void TimedAction()
{
var allEqual = _items
.Select(item => item.Equals(_comparison))
.ToList();
}
}
3. Not Overriding Equals() With All Blittable Fields (Medium)
Equals Overridden
|
Non-Blittable Fields
|
Blittable Fields
|
Typical Milliseconds
|
No
|
0
|
3
|
13
|
class MediumBlittableWithoutEqualsTest : TimedActionBase
{
struct MyStruct
{
public int Field1;
public int Field2;
public int Field3;
}
private readonly List<MyStruct> _items = Enumerable
.Range(1, EqualsTestParams.NumberOfItems)
.Select(n => new MyStruct
{
Field1 = n + 1,
Field2 = n + 2,
Field3 = n + 3
})
.ToList();
private readonly MyStruct _comparison = new MyStruct
{
Field1 = 1,
Field2 = 2,
Field3 = 3
};
protected override void TimedAction()
{
var allEqual = _items
.Select(item => item.Equals(_comparison))
.ToList();
}
}
4. Overriding Equals() With Non-Blittable Fields (Medium)
Equals Overridden
|
Non-Blittable Fields
|
Blittable Fields
|
Typical Milliseconds
|
Yes
|
3
|
0
|
12
|
class MediumNonBlittableWithEqualsTest : TimedActionBase
{
struct MyStruct : IEquatable<MyStruct>
{
public Decimal Field1;
public Decimal Field2;
public Decimal Field3;
public override bool Equals(object obj)
{
if (obj == null || (!(obj is MyStruct)))
return false;
return Equals((MyStruct)obj);
}
public bool Equals(MyStruct other)
{
return Field1 == other.Field1 &&
Field2 == other.Field2 &&
Field3 == other.Field3;
}
}
private readonly List<MyStruct> _items = Enumerable
.Range(1, EqualsTestParams.NumberOfItems)
.Select(n => new MyStruct
{
Field1 = n + 1.1M, Field2 = n + 2.2M, Field3 = n + 3.3M
})
.ToList();
private readonly MyStruct _comparison = new MyStruct
{
Field1 = 1.1M,
Field2 = 2.2M,
Field3 = 3.3M
};
protected override void TimedAction()
{
var allEqual = _items
.Select(item => item.Equals(_comparison))
.ToList();
}
}
5. Overriding
Equals() With Blittable Fields (Fastest)
Equals Overridden
|
Non-Blittable Fields
|
Blittable Fields
|
Typical Milliseconds
|
Yes
|
0
|
3
|
6
|
class FastBlittableWithEqualsTest : TimedActionBase
{
struct MyStruct : IEquatable<MyStruct>
{
public int Field1;
public int Field2;
public int Field3;
public override bool Equals(object obj)
{
if (obj == null || (!(obj is MyStruct)))
return false;
return Equals((MyStruct)obj);
}
public bool Equals(MyStruct other)
{
return Field1 == other.Field1 &&
Field2 == other.Field2 &&
Field3 == other.Field3;
}
}
private readonly List<MyStruct> _items = Enumerable
.Range(1, EqualsTestParams.NumberOfItems)
.Select(n => new MyStruct
{
Field1 = n+1,
Field2 = n+2,
Field3 = n+3
})
.ToList();
private readonly MyStruct _comparison = new MyStruct
{
Field1 = 1,
Field2 = 2,
Field3 = 3
};
protected override void TimedAction()
{
var allEqual = _items
.Select(item => item.Equals(_comparison))
.ToList();
}
}
The numbers are very surprising.
You can view similar results for the GetHashCode performance test
post.
No comments:
Post a Comment