Benchmarking method execution through interface, base class, virtual, override, dynamic, reflection and expression trees.
In .NET there are many ways to execute a method. The fastest is a straightforward call to a static method. But how does its speed compare to the other ways of calling?
There are countless reasons why we sometimes can’t just make a direct call. We use interfaces and class inheritance for code structuring, code logic, readability, future compatibility, various patterns, module support, and so on. In some rare cases we don’t even know the type of the object, or we need reflective access to the code itself.
With the exception of reflection, which is well-known to be slow, we usually don’t worry too much about the execution speed of a method call. There are other things that are far more important. As the test results show, the overhead of a call is in most cases negligible.
I had to skip DynamicMethod (IL injection) for now due to time constraints. I also considered direct IL injection/compiling C# code, loading an assembly that overrides base class virtual method or interface and executing that. But this would be a lot of work and is sort of covered by these tests, so that too was skipped.
Execution methods
To give an idea of how the different tests are performed I’m describing each execution type in code first.
Normal
```csharp
public class MyClass
{
    public void Method() { }
}

MyClass myClass = new MyClass();
// Execute
myClass.Method();
```
Interface
A call through an interface has to be resolved at run time, which may force a lookup to find the implementing method on the concrete type.
```csharp
public interface MyInterface
{
    void Method();
}
public class MyClass : MyInterface
{
    public void Method() { }
}

MyInterface myClass = new MyClass();
// Execute
myClass.Method();
```
Non-virtual method in base class
Calling a method inherited from a base class might seem to require a search through the type hierarchy to find the method. For a non-virtual method the target is known at compile time, so no run-time search is needed.
```csharp
public class MyBase
{
    public void Method() { }
}
public class MyClass : MyBase { }

MyClass myClass = new MyClass();
// Execute
myClass.Method();
```
Virtual method
A virtual method is called through a lookup table, which adds overhead to each call.
```csharp
public class MyBase
{
    public virtual void Method() { }
}
public class MyClass : MyBase { }

MyBase myClass = new MyClass();
// Execute
myClass.Method();
```
Virtual Override
Similarly to interface methods, a virtual method in a base class can be overridden in a derived class.
```csharp
public class MyBase
{
    public virtual void Method() { }
}
public class MyClass : MyBase
{
    public override void Method() { }
}

MyBase myClass = new MyClass();
// Execute
myClass.Method();
```
Dynamic
Dynamic was introduced with C# 4 (.NET Framework 4.0) and allows you to skip compile-time checking of the method. The execution is late bound, and therefore has more overhead than a normal execution.
```csharp
public class MyClass
{
    public void Method() { }
}

dynamic myClass = new MyClass();
// Execute
myClass.Method();
```
Lambda
Lambdas can be used to pass execution in variables as Action or Func<>.
```csharp
public class MyClass
{
    public void Method() { }
}

var myClass = new MyClass();
Action lambda = () => myClass.Method();
// Execute
lambda.Invoke();
```
Delegate
Delegates allow you to reference any method with a matching signature.
```csharp
public class MyClass
{
    public void Method() { }
}
public delegate void MethodDelegate();

var myClass = new MyClass();
MethodDelegate methodDelegate = myClass.Method;
// Execute
methodDelegate.Invoke();
```
Reflection
We use .Net’s System.Reflection to find MethodInfo. This MethodInfo should be cached as looking it up is slow. The advantage of this method is that the class doesn’t have to inherit any interface or class. We can simply check if a method exists, and invoke it if so. The disadvantage is that it is very slow.
```csharp
public class MyClass
{
    public void Method() { }
}

MyClass myClass = new MyClass();
// Look up the MethodInfo once and keep it around
var methodInfo = myClass.GetType().GetMethod("Method");
// Execute
methodInfo.Invoke(myClass, null);
```
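The "check if a method exists, and invoke it if so" idea combined with caching could look roughly like the sketch below. The `ReflectionCallDemo` class, its `TryInvoke` helper and the dictionary cache are hypothetical names of mine and are not part of the benchmark source. I also let `Method` return an integer here (an assumption for the example) so there is something to observe. The cache is not thread-safe.

```csharp
using System;
using System.Collections.Generic;
using System.Reflection;

public class MyClass
{
    // Returns a value only so the example has something to show.
    public int Method() { return 42; }
}

public static class ReflectionCallDemo
{
    // Hypothetical cache: one lookup per (type, method name). Not thread-safe.
    private static readonly Dictionary<(Type, string), MethodInfo> Cache =
        new Dictionary<(Type, string), MethodInfo>();

    public static bool TryInvoke(object target, string methodName, out object result)
    {
        var key = (target.GetType(), methodName);
        if (!Cache.TryGetValue(key, out MethodInfo method))
        {
            // Slow part: search for a public parameterless method by name.
            // A null result ("not found") is cached as well.
            method = key.Item1.GetMethod(methodName, Type.EmptyTypes);
            Cache[key] = method;
        }

        if (method == null)
        {
            result = null;
            return false;
        }

        // Still slow compared to a direct call, and a value-type
        // return value comes back boxed as object.
        result = method.Invoke(target, null);
        return true;
    }

    public static void Main()
    {
        var myClass = new MyClass();
        if (TryInvoke(myClass, "Method", out object value))
            Console.WriteLine(value);                           // prints 42
        Console.WriteLine(TryInvoke(myClass, "Missing", out _)); // prints False
    }
}
```

Caching only removes the lookup cost. The `Invoke` call itself is what the Reflection benchmark measures.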
Static
This one is not in line with the problem we are looking into. I just added it to see if there were any surprises (none).
```csharp
public class MyClass
{
    public static void Method() { }
}
// Execute
MyClass.Method();
```
Multiple inheritance
I added up to 10 levels of inheritance on interface and base class to see what effect that would have, particularly the difference in what variable type we use. We can mostly infer this from logic, but it’s interesting to see.
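To make that setup a bit more concrete, here is a shortened sketch with three levels instead of ten. All type names are hypothetical and not the ones used in the benchmark source, and the methods return an integer only so there is something to observe. The point is that the same object can be called through variables of different declared types.

```csharp
using System;

// Interface chain: the method is declared on the first interface only.
public interface IInterface1 { int Method(); }
public interface IInterface2 : IInterface1 { }
public interface IInterface3 : IInterface2 { }

// Base class chain: the non-virtual method lives in the first base class only.
public class BaseClass1 { public int Method() { return 1; } }
public class BaseClass2 : BaseClass1 { }
public class BaseClass3 : BaseClass2 { }

public class InterfaceClass3 : IInterface3
{
    public int Method() { return 1; }
}

public static class InheritanceDemo
{
    // Variable declared as the most derived class.
    public static int ViaDerivedType() { BaseClass3 x = new BaseClass3(); return x.Method(); }

    // Same object type, variable declared as the base class that holds the method.
    public static int ViaBaseType() { BaseClass1 x = new BaseClass3(); return x.Method(); }

    // Variable declared as the interface that declares the method.
    public static int ViaFirstInterface() { IInterface1 x = new InterfaceClass3(); return x.Method(); }

    // Variable declared as the last interface in the chain.
    public static int ViaLastInterface() { IInterface3 x = new InterfaceClass3(); return x.Method(); }

    public static void Main()
    {
        Console.WriteLine(ViaDerivedType());
        Console.WriteLine(ViaBaseType());
        Console.WriteLine(ViaFirstInterface());
        Console.WriteLine(ViaLastInterface());
    }
}
```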
Benchmarking tool and source
I’m using BenchmarkDotNet for benchmarking. Source code for the benchmark is located here.
Note that the time scale we are talking about here is tiny. 1ns = 0.000001ms, that is 1/1 000 000 000 of a second. So even a “bad” result of one whole ns means 1 billion executions per second. For example the inlined static method executed on average 128,531,926,713.1 times per second during testing. That figure excludes the overhead of the test itself; BenchmarkDotNet subtracts the overhead of executing the benchmark. I had to put something in the methods to avoid them being optimized away completely, so every execution returns an integer that is discarded, as can be seen by the “pop” before the “ret” in the MSIL below.
This is however in a tight test loop where everything is in the CPU’s L1 cache. More complex code that requires access to multiple memory areas will be far slower, though still at a negligible duration.
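The conversion from time per call to calls per second is simple arithmetic, sketched below. `ThroughputDemo` and `CallsPerSecond` are hypothetical names of mine. The second input is the Reflection mean from the result table, which works out to about 7 million calls per second.

```csharp
using System;

public static class ThroughputDemo
{
    // One second is 1 000 000 000 ns, so calls per second = 1e9 / (ns per call).
    public static double CallsPerSecond(double meanNanoseconds)
    {
        return 1e9 / meanNanoseconds;
    }

    public static void Main()
    {
        Console.WriteLine(CallsPerSecond(1.0));      // one whole ns per call
        Console.WriteLine(CallsPerSecond(142.2713)); // Reflection mean from the result table
    }
}
```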
Result
- .NET Framework 4.7.2 (CLR 4.0.30319.42000), 64bit RyuJIT-v4.7.3260.0
- Windows Server 2016 v1607, HyperV-enabled
- Intel Core i7-3770K @ 3.5GHz, 4 cores, HT, 256KB L1 cache, 1MB L2 cache, 8MB L3 cache
- 4x16GB dual-channel 1333MHz RAM
Rank | Ratio | Method | Mean | Median | Error | StdErr | Gen 0/1k Op | Allocated Memory/Op |
---|---|---|---|---|---|---|---|---|
1 | 0.11 | ‘Self->10 int’ | 0.0078 ns | 0.0000 ns | 0.0032 ns | 0.0010 ns | – | – |
1 | 0.12 | ‘*10 base’ | 0.0086 ns | 0.0000 ns | 0.0054 ns | 0.0016 ns | – | – |
1 | 0.24 | ‘Self->1 base’ | 0.0173 ns | 0.0000 ns | 0.0113 ns | 0.0034 ns | – | – |
1 | 0.29 | ‘*1 base->10 base’ | 0.0205 ns | 0.0000 ns | 0.0107 ns | 0.0032 ns | – | – |
1 | 0.34 | ‘*1 base’ | 0.0244 ns | 0.0000 ns | 0.0113 ns | 0.0034 ns | – | – |
1 | 1.00 | *Normal | 0.0715 ns | 0.0000 ns | 0.0198 ns | 0.0060 ns | – | – |
1 | 1.12 | *Static | 0.0799 ns | 0.0000 ns | 0.0249 ns | 0.0075 ns | – | – |
2 | 8.45 | ‘*1 base->virt abs override’ | 0.6045 ns | 0.6416 ns | 0.0109 ns | 0.0033 ns | – | – |
3 | 8.75 | ‘Self->10 base virt override’ | 0.6257 ns | 0.6130 ns | 0.0254 ns | 0.0077 ns | – | – |
3 | 9.31 | ‘*1 base->virt override’ | 0.6660 ns | 0.6116 ns | 0.0279 ns | 0.0084 ns | – | – |
4 | 9.80 | ‘*Self->virt abs override’ | 0.7004 ns | 0.7225 ns | 0.0193 ns | 0.0058 ns | – | – |
5 | 10.06 | ‘*10 base->virt no override’ | 0.7192 ns | 0.6768 ns | 0.0236 ns | 0.0071 ns | – | – |
6 | 10.60 | ‘*Self->virt override’ | 0.7579 ns | 0.7227 ns | 0.0338 ns | 0.0102 ns | – | – |
7 | 11.07 | ‘*1 base->virt no override’ | 0.7914 ns | 0.6625 ns | 0.0478 ns | 0.0145 ns | – | – |
8 | 15.99 | ‘*1 int’ | 1.1436 ns | 1.1123 ns | 0.0245 ns | 0.0074 ns | – | – |
9 | 16.38 | ‘*10 int’ | 1.1711 ns | 1.1909 ns | 0.0174 ns | 0.0053 ns | – | – |
10 | 16.57 | ‘Self->1 int’ | 1.1846 ns | 1.1314 ns | 0.0213 ns | 0.0064 ns | – | – |
11 | 17.21 | ‘1 int->10 int’ | 1.2307 ns | 1.2219 ns | 0.0206 ns | 0.0063 ns | – | – |
12 | 17.61 | *Lambda | 1.2588 ns | 1.1337 ns | 0.0464 ns | 0.0141 ns | – | – |
13 | 17.76 | *Delegate | 1.2701 ns | 1.2387 ns | 0.0253 ns | 0.0077 ns | – | – |
14 | 113.59 | ‘*Expression Tree’ | 8.1217 ns | 7.8778 ns | 0.1324 ns | 0.0401 ns | – | – |
15 | 181.58 | *Dynamic | 12.9831 ns | 11.3489 ns | 0.2841 ns | 0.0861 ns | 0.0057 | 24 B |
16 | 1989.81 | *Reflection | 142.2713 ns | 132.4650 ns | 2.2464 ns | 0.6805 ns | 0.0055 | 24 B |
Rank 1: Static, instance and non-virtual
These are the calls that the compiler optimizes away. Depending on the complexity of the call they are either completely inlined, or they are one jump away.
- Static: “Use the static modifier to declare a static member, which belongs to the type itself rather than to a specific object.” Microsoft (static)
- Normal (call to instance method).
- Base class (call to non-virtual base class method). Microsoft (inheritance)
Even though there is a 10x spread in speed the measurements for these are so small that it is difficult to tell which one is a winner. Results vary with execution. Judging from the ASM, Static should be fastest. But the rest of them would be more affected by other factors such as memory alignment.
Static
The IL shows us that this is a call to a known method with static address, and the ASM shows us that this was completely inlined. We can see from the IL that even without inlining this is as efficient as it gets.
IL
```il
.method public hidebysig instance void Call_StaticClass() cil managed
{
    .maxstack 8
    L_0000: call int32 Tedd.DynamicBindingBenchmark.Tests.Classes.StaticClass::Method()
    L_0005: pop
    L_0006: ret
}
```
ASM
```asm
ret
```
Normal (direct instance call)
Calling an instance method requires first loading the instance reference, then a callvirt on the method. The C# compiler emits callvirt even for non-virtual instance methods, because callvirt also checks the instance for null.
IL
```il
.method public hidebysig instance void Call_NormalClass() cil managed
{
    .maxstack 8
    L_0000: ldarg.0
    L_0001: ldfld class Tedd.DynamicBindingBenchmark.Tests.Classes.NormalClass Tedd.DynamicBindingBenchmark.Tests.CallTests::_normalClass
    L_0006: callvirt instance int32 Tedd.DynamicBindingBenchmark.Tests.Classes.NormalClass::Method()
    L_000b: pop
    L_000c: ret
}
```
ASM
```asm
mov rax,qword ptr [rcx+90h]
mov eax,dword ptr [rax+8]
ret
```
Base class without virtual
We see the same as Normal call.
IL
```il
.method public hidebysig instance void Call_BaseClass10_10() cil managed
{
    .maxstack 8
    L_0000: ldarg.0
    L_0001: ldfld class Tedd.DynamicBindingBenchmark.Tests.Classes.BaseClass10Class Tedd.DynamicBindingBenchmark.Tests.CallTests::_baseClass10_10
    L_0006: callvirt instance int32 Tedd.DynamicBindingBenchmark.Tests.Classes.BaseClass1Class::Method()
    L_000b: pop
    L_000c: ret
}
```
ASM
```asm
mov rax,qword ptr [rcx+48h]
mov eax,dword ptr [rax+8]
ret
```
Rank 2-7: Derived virtual method
At 8.5-11 times slower than a normal instance method execution we find methods marked virtual in base class.
The virtual keyword is used to modify a method, property, indexer, or event declaration and allow for it to be overridden in a derived class.
Microsoft
Base class with virtual, derived not override
With only 1 level from derived to base the lookup is the fastest among the derived calls to virtual. From the ASM we see that it does two extra jumps by following lookups, compared to the same call without virtual.
IL
```il
.method public hidebysig instance void Call_BaseClass1_1NotOverride() cil managed
{
    .maxstack 8
    L_0000: ldarg.0
    L_0001: ldfld class Tedd.DynamicBindingBenchmark.Tests.Classes.BaseClass1ClassVirtual Tedd.DynamicBindingBenchmark.Tests.CallTests::_baseClassVirtualNotOverride1_1
    L_0006: callvirt instance int32 Tedd.DynamicBindingBenchmark.Tests.Classes.BaseClass1ClassVirtual::Method()
    L_000b: pop
    L_000c: ret
}
```
ASM
```asm
mov rcx,qword ptr [rcx+60h]
mov rax,qword ptr [rcx]
mov rax,qword ptr [rax+40h]
mov rax,qword ptr [rax+20h]
; Content of Method: (returning an int32)
mov eax,dword ptr [rcx+8]
```
Base class with virtual, derived override / exact type
We see the same pattern for all of these. They are never inlined and they have two extra jumps in the lookup. But if the type is specified directly there can’t be any permutation, so the compiler is able to inline the method body. This is the case for override, for example.
The same setup, with variable type set to the base gives same result but without the inlining. (IL/ASM not shown here as it would be a bit redundant.)
IL
```il
.method public hidebysig instance void Call_BaseClass1_1Override() cil managed
{
    .maxstack 8
    L_0000: ldarg.0
    L_0001: ldfld class Tedd.DynamicBindingBenchmark.Tests.Classes.BaseClass1ClassVirtual Tedd.DynamicBindingBenchmark.Tests.CallTests::_baseClassVirtualOverride1_1
    L_0006: callvirt instance int32 Tedd.DynamicBindingBenchmark.Tests.Classes.BaseClass1ClassVirtual::Method()
    L_000b: pop
    L_000c: ret
}
```
ASM
```asm
mov rcx,qword ptr [rcx+58h]
mov rax,qword ptr [rcx]
mov rax,qword ptr [rax+40h]
mov rax,qword ptr [rax+20h]
```
Rank 8-11: Interfaces
At 15-17 times slower than a normal instance method execution we find methods accessed through interfaces.
An interface contains only the signatures of methods, properties, events or indexers. A class or struct that implements the interface must implement the members of the interface that are specified in the interface definition.
Microsoft
Method defined in interface
All of the calls to an interface method look the same regardless of how many levels of interface they have to go through. This seems to be because the compiler can infer which interface declares the method and call it directly. Therefore there is no additional overhead from having multiple levels of interfaces between the class and the interface with the method definition.
IL
```il
.method public hidebysig instance void Call_Interface10_1() cil managed
{
    .maxstack 8
    L_0000: ldarg.0
    L_0001: ldfld class Tedd.DynamicBindingBenchmark.Tests.Interfaces.Interface1 Tedd.DynamicBindingBenchmark.Tests.CallTests::_interface10_1
    L_0006: callvirt instance int32 Tedd.DynamicBindingBenchmark.Tests.Interfaces.Interface1::Method()
    L_000b: pop
    L_000c: ret
}
```
ASM
```asm
mov rcx,qword ptr [rcx+20h]
mov r11,7FF7E4D304B0h
mov rax,qword ptr [r11]
cmp dword ptr [rcx],ecx
```
Rank 12: Lambda
17 times slower than a normal instance method execution. Lambda has the overhead of executing System.Func<T>.Invoke(). This gives us two extra instructions for calculating the target address.
A lambda expression is an anonymous function that you can use to create delegates or expression tree types. By using lambda expressions, you can write local functions that can be passed as arguments or returned as the value of function calls.
Microsoft
IL
```il
.method public hidebysig instance void Call_NormalClassLambda() cil managed
{
    .maxstack 8
    L_0000: ldarg.0
    L_0001: ldfld class [mscorlib]System.Func`1<int32> Tedd.DynamicBindingBenchmark.Tests.CallTests::_normalClassLambda
    L_0006: callvirt instance !0 [mscorlib]System.Func`1<int32>::Invoke()
    L_000b: pop
    L_000c: ret
}
```
ASM
```asm
mov rax,qword ptr [rcx+0A8h]
lea rcx,[rax+8]
mov rcx,qword ptr [rcx]
mov rax,qword ptr [rax+18h]
```
Rank 13: Delegate
17 times slower than a normal instance method execution. Although the IL differs, the final ASM looks the same as with lambda.
A delegate is a type that represents references to methods with a particular parameter list and return type. When you instantiate a delegate, you can associate its instance with any method with a compatible signature and return type. You can invoke (or call) the method through the delegate instance.
Microsoft
IL
```il
.method public hidebysig instance void Call_NormalClassDelegate() cil managed
{
    .maxstack 8
    L_0000: ldarg.0
    L_0001: ldfld class Tedd.DynamicBindingBenchmark.Tests.CallTests/MethodDelegate Tedd.DynamicBindingBenchmark.Tests.CallTests::_normalClassDelegate
    L_0006: callvirt instance int32 Tedd.DynamicBindingBenchmark.Tests.CallTests/MethodDelegate::Invoke()
    L_000b: pop
    L_000c: ret
}
```
ASM
```asm
mov rax,qword ptr [rcx+0B8h]
lea rcx,[rax+8]
mov rcx,qword ptr [rcx]
mov rax,qword ptr [rax+18h]
```
Rank 14: Expression tree
114 times slower than a normal instance method execution. The call to execute the expression tree is as expected the same as with lambda, a call to System.Func<T>.Invoke(). But the execution happening behind the lambda is considerably slower. I haven’t dug into the ASM here so I’ll leave it at that for now.
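The execution methods listed earlier have no example for expression trees, so here is a minimal sketch of how such a call can be built and compiled into a `Func<int>`. Whether the benchmark builds its tree exactly like this is an assumption on my part. `ExpressionTreeDemo` and `BuildCall` are hypothetical names, and `Method` returns an integer like the benchmark methods do.

```csharp
using System;
using System.Linq.Expressions;

public class MyClass
{
    public int Method() { return 42; }
}

public static class ExpressionTreeDemo
{
    public static Func<int> BuildCall(MyClass instance)
    {
        // Tree for "instance.Method()": a method-call node on a constant node.
        MethodCallExpression call = Expression.Call(
            Expression.Constant(instance),
            typeof(MyClass).GetMethod("Method"));

        // Wrap the call in a parameterless lambda and compile it to a delegate.
        // Compiling is expensive, so do it once and reuse the delegate.
        return Expression.Lambda<Func<int>>(call).Compile();
    }

    public static void Main()
    {
        Func<int> compiled = BuildCall(new MyClass());
        // Execute: from the caller's side this is an ordinary Func<int>.Invoke()
        Console.WriteLine(compiled.Invoke()); // prints 42
    }
}
```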
Rank 15: Dynamic
180 times slower than a normal instance method execution.
Dynamic causes 24 bytes of memory allocation for GC, which I suspect is because of boxing of return type (int) via Object. It is 24 bytes because the int takes 4 bytes, the object overhead on x64 takes 16 bytes, and the CLR’s minimum object size is 12 bytes on x86 and 24 bytes on x64, so the 20 bytes are rounded up to 24. More details in Jon Skeet’s blog post “Of memory and strings”.
C# 4 introduces a new type, dynamic. The type is a static type, but an object of type dynamic bypasses static type checking.
Microsoft
IL
```il
.method public hidebysig instance void Call_NormalClassDynamic() cil managed
{
    .maxstack 9
    L_0000: ldsfld class [System.Core]System.Runtime.CompilerServices.CallSite`1<class [mscorlib]System.Action`2<class [System.Core]System.Runtime.CompilerServices.CallSite, object>> Tedd.DynamicBindingBenchmark.Tests.CallTests/<>o__43::<>p__0
    L_0005: brtrue.s L_003b
    L_0007: ldc.i4 0x100
    L_000c: ldstr "Method"
    L_0011: ldnull
    L_0012: ldtoken Tedd.DynamicBindingBenchmark.Tests.CallTests
    L_0017: call class [mscorlib]System.Type [mscorlib]System.Type::GetTypeFromHandle(valuetype [mscorlib]System.RuntimeTypeHandle)
    L_001c: ldc.i4.1
    L_001d: newarr [Microsoft.CSharp]Microsoft.CSharp.RuntimeBinder.CSharpArgumentInfo
    L_0022: dup
    L_0023: ldc.i4.0
    L_0024: ldc.i4.0
    L_0025: ldnull
    L_0026: call class [Microsoft.CSharp]Microsoft.CSharp.RuntimeBinder.CSharpArgumentInfo [Microsoft.CSharp]Microsoft.CSharp.RuntimeBinder.CSharpArgumentInfo::Create(valuetype [Microsoft.CSharp]Microsoft.CSharp.RuntimeBinder.CSharpArgumentInfoFlags, string)
    L_002b: stelem.ref
    L_002c: call class [System.Core]System.Runtime.CompilerServices.CallSiteBinder [Microsoft.CSharp]Microsoft.CSharp.RuntimeBinder.Binder::InvokeMember(valuetype [Microsoft.CSharp]Microsoft.CSharp.RuntimeBinder.CSharpBinderFlags, string, class [mscorlib]System.Collections.Generic.IEnumerable`1<class [mscorlib]System.Type>, class [mscorlib]System.Type, class [mscorlib]System.Collections.Generic.IEnumerable`1<class [Microsoft.CSharp]Microsoft.CSharp.RuntimeBinder.CSharpArgumentInfo>)
    L_0031: call class [System.Core]System.Runtime.CompilerServices.CallSite`1<!0> [System.Core]System.Runtime.CompilerServices.CallSite`1<class [mscorlib]System.Action`2<class [System.Core]System.Runtime.CompilerServices.CallSite, object>>::Create(class [System.Core]System.Runtime.CompilerServices.CallSiteBinder)
    L_0036: stsfld class [System.Core]System.Runtime.CompilerServices.CallSite`1<class [mscorlib]System.Action`2<class [System.Core]System.Runtime.CompilerServices.CallSite, object>> Tedd.DynamicBindingBenchmark.Tests.CallTests/<>o__43::<>p__0
    L_003b: ldsfld class [System.Core]System.Runtime.CompilerServices.CallSite`1<class [mscorlib]System.Action`2<class [System.Core]System.Runtime.CompilerServices.CallSite, object>> Tedd.DynamicBindingBenchmark.Tests.CallTests/<>o__43::<>p__0
    L_0040: ldfld !0 [System.Core]System.Runtime.CompilerServices.CallSite`1<class [mscorlib]System.Action`2<class [System.Core]System.Runtime.CompilerServices.CallSite, object>>::Target
    L_0045: ldsfld class [System.Core]System.Runtime.CompilerServices.CallSite`1<class [mscorlib]System.Action`2<class [System.Core]System.Runtime.CompilerServices.CallSite, object>> Tedd.DynamicBindingBenchmark.Tests.CallTests/<>o__43::<>p__0
    L_004a: ldarg.0
    L_004b: ldfld object Tedd.DynamicBindingBenchmark.Tests.CallTests::_normalClassDynamic
    L_0050: callvirt instance void [mscorlib]System.Action`2<class [System.Core]System.Runtime.CompilerServices.CallSite, object>::Invoke(!0, !1)
    L_0055: ret
}
```
ASM
```asm
cmp qword ptr [12CC9718h],0
jne M00_L00
mov rcx,7FF7E4F18790h
call clr!InstallCustomModule+0x2320
mov rdi,rax
mov rcx,offset Microsoft_CSharp_ni+0x1af8a
mov edx,1
call clr+0x2690
mov rbx,rax
mov rcx,offset Microsoft_CSharp_ni+0xc0230
call clr+0x2540
mov r8,rax
xor ecx,ecx
mov dword ptr [r8+10h],ecx
xor ecx,ecx
mov qword ptr [r8+8],rcx
mov rcx,rbx
xor edx,edx
call clr+0x4180
mov qword ptr [rsp+20h],rbx
mov rdx,qword ptr [12CC38C0h]
mov r9,rdi
mov ecx,100h
xor r8d,r8d
call Microsoft.CSharp.RuntimeBinder.Binder.InvokeMember(Microsoft.CSharp.RuntimeBinder.CSharpBinderFlags, System.String, System.Collections.Generic.IEnumerable`1, System.Type, System.Collections.Generic.IEnumerable`1)
mov rdx,rax
mov rcx,7FF7E4F72938h
call System.Runtime.CompilerServices.CallSite`1[[System.__Canon, mscorlib]].Create(System.Runtime.CompilerServices.CallSiteBinder)
mov ecx,12CC9718h
mov rdx,rax
call clr+0x3fc0
M00_L00
mov rdx,qword ptr [12CC9718h]
mov rax,qword ptr [rdx+18h]
lea rcx,[rax+8]
mov rcx,qword ptr [rcx]
mov r8,qword ptr [rsi+0C0h]
mov rax,qword ptr [rax+18h]

; Microsoft.CSharp.RuntimeBinder.Binder.InvokeMember(Microsoft.CSharp.RuntimeBinder.CSharpBinderFlags, System.String, System.Collections.Generic.IEnumerable`1, System.Type, System.Collections.Generic.IEnumerable`1)
test cl,2
setne al
movzx eax,al
test cl,4
setne dl
movzx edx,dl
test ecx,100h
setne cl
movzx ecx,cl
xor ebp,ebp
test eax,eax
je Microsoft_CSharp_ni+0x787fc
mov ebp,1
test edx,edx
je Microsoft_CSharp_ni+0x78803
or ebp,2
test ecx,ecx
je Microsoft_CSharp_ni+0x7880a
or ebp,4

; Microsoft.CSharp.RuntimeBinder.CSharpInvokeMemberBinder::.ctor(Microsoft.CSharp.RuntimeBinder.CSharpCallFlags,System.String,System.Type,System.Collections.Generic.IEnumerable`1,System.Collections.Generic.IEnumerable`1)
lea rcx,[Microsoft_CSharp_ni+0xc03b8
call Microsoft_CSharp_ni+0x62940
mov r14,rax
mov qword ptr [rsp+20h],rdi
mov rdi,qword ptr [rsp+80h]
mov qword ptr [rsp+28h],rdi
mov rcx,r14
mov edx,ebp
mov r8,rsi
mov r9,rbx
call Microsoft.CSharp.RuntimeBinder.CSharpInvokeMemberBinder..ctor(Microsoft.CSharp.RuntimeBinder.CSharpCallFlags, System.String, System.Type, System.Collections.Generic.IEnumerable`1, System.Collections.Generic.IEnumerable`1)
mov rax,r14

; System.Runtime.CompilerServices.CallSite`1[[System.__Canon, mscorlib]].Create(System.Runtime.CompilerServices.CallSiteBinder)
mov rcx,qword ptr [rdi+30h]
mov rcx,qword ptr [rcx]
mov rcx,qword ptr [rcx]
test cl,1
je System_Core_ni+0x318515
mov rcx,qword ptr [rcx-1]
call System_Core_ni+0x2abf70
mov rbx,rax
mov rcx,qword ptr [System_Core_ni+0x96e40
call System_Core_ni+0x2abf70
mov rdx,rax
mov rcx,rbx
mov rax,qword ptr [rbx]
mov rax,qword ptr [rax+0D8h]
call qword ptr [rax+30h]
test al,al
je System_Core_ni+0x92cd2c
mov rcx,rdi
call System_Core_ni+0x2abf90
mov rdi,rax
lea rcx,[rdi+8]
mov rdx,rsi
call System_Core_ni+0x2abf88
mov rcx,rdi
call System.Runtime.CompilerServices.CallSite`1[[System.__Canon, mscorlib]].GetUpdateDelegate()
lea rcx,[rdi+18h]
mov rdx,rax
call System_Core_ni+0x2abf88
mov rax,rdi
add rsp,30h
pop rbx
pop rsi
pop rdi
ret
int 3
int 3
int 3
int 3
int 3
int 3
push rsi
sub rsp,30h
mov qword ptr [rsp+28h],rcx
mov rsi,rcx
mov rcx,qword ptr [rsi]
mov rdx,qword ptr [rcx+30h]
mov rdx,qword ptr [rdx]
mov rax,qword ptr [rdx+8]
```
Rank 16: Reflection
A whopping 2000 times slower than a normal instance method execution, and roughly 1800 times slower than a static method call (142.27 ns versus 0.0799 ns). This is despite the fact that we have cached MethodInfo prior to test execution.
Reflection provides objects (of type Type) that describe assemblies, modules and types. You can use reflection to dynamically create an instance of a type, bind the type to an existing object, or get the type from an existing object and invoke its methods or access its fields and properties
Microsoft
Reflection causes 24 bytes of memory allocation for GC, which I suspect is because of boxing of return type (int) via Object. It is 24 bytes because the int takes 4 bytes, the object overhead on x64 takes 16 bytes, and the CLR’s minimum object size is 12 bytes on x86 and 24 bytes on x64, so the 20 bytes are rounded up to 24. More details in Jon Skeet’s blog post “Of memory and strings”.
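The size arithmetic for a boxed value can be written out as a small model. This is only a sketch under assumptions: a 64-bit CLR with 16 bytes of per-object overhead, 8-byte alignment and a 24-byte minimum object size. `BoxedSizeDemo` and its constants are hypothetical names of mine, not a runtime API, and the model does not measure anything.

```csharp
using System;

public static class BoxedSizeDemo
{
    // Assumed x64 CLR layout (not queried from the runtime):
    const int ObjectOverhead = 16;    // object header + method table pointer
    const int Alignment = 8;          // objects are sized in multiples of 8 bytes
    const int MinimumObjectSize = 24; // smallest heap object on x64

    public static int BoxedSize(int payloadBytes)
    {
        int raw = ObjectOverhead + payloadBytes;
        // Round up to the next multiple of the alignment.
        int aligned = (raw + Alignment - 1) / Alignment * Alignment;
        return Math.Max(aligned, MinimumObjectSize);
    }

    public static void Main()
    {
        Console.WriteLine(BoxedSize(sizeof(int)));  // 16 + 4 = 20, rounded up: prints 24
        Console.WriteLine(BoxedSize(sizeof(long))); // 16 + 8 = 24: prints 24
    }
}
```

Under this model a boxed int and a boxed long cost the same 24 bytes, which matches the 24 B reported in the result table.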
IL
```il
.method public hidebysig instance void Call_NormalClassReflection() cil managed
{
    .maxstack 8
    L_0000: ldarg.0
    L_0001: ldfld class [mscorlib]System.Reflection.MethodInfo Tedd.DynamicBindingBenchmark.Tests.CallTests::_normalClassReflectionMethodInfo
    L_0006: ldarg.0
    L_0007: ldfld class Tedd.DynamicBindingBenchmark.Tests.Classes.NormalClass Tedd.DynamicBindingBenchmark.Tests.CallTests::_normalClassReflectionClass
    L_000c: ldnull
    L_000d: callvirt instance object [mscorlib]System.Reflection.MethodBase::Invoke(object, object[])
    L_0012: pop
    L_0013: ret
}
```
ASM
```asm
mov rax,qword ptr [rcx+0D0h]
mov rdx,qword ptr [rcx+0C8h]
xor ecx,ecx
mov qword ptr [rsp+20h],rcx
mov qword ptr [rsp+28h],rcx
mov rcx,rax
xor r8d,r8d
xor r9d,r9d
mov rax,qword ptr [rax]
mov rax,qword ptr [rax+58h]
call qword ptr [rax+20h]
nop
```
Summary
The time it takes to execute most of these techniques is well within negligible time frames. In fact, the times are so small that it is difficult to accurately measure them. Your results may vary from mine.
The tests have been executed multiple times, and in multiple rounds, always giving the same result within a small margin of error.
What we learned
- There are certain ways of execution that are very slow, with Reflection coming in at a clear last place. It is worth noting that all of the “losers” have other strengths.
- Dynamic and Reflection can also cause memory allocations that GC has to handle.
- Static methods, instance methods and non-virtual base class methods are candidates to be inlined. In some other scenarios it is not possible for the compiler to consider inlining.