Introduction

What happens if a method is just a wrapper for another method? Is the extra jump optimized away by the compiler? Does it take much time? I thought I’d look into this and measure a bit. With the different compilers, JITs and runtimes available, it should be fun to see what happens.

I’ll use a == operator implementation calling IEquatable<T>.Equals(T other) for testing. A good practice when creating structs is to implement Object.Equals, GetHashCode(), IEquatable<T>, op_Equality (== operator) and op_Inequality (!= operator). (Read more on Microsoft docs.) Since Object.Equals(object), Equals(T other), op_Equality and op_Inequality all implement more or less the same logic, I figured one could just call the other. So what’s the cost?
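As a rough illustration of that practice, here is a minimal sketch of a struct where every equality member funnels into the typed Equals(T other). The type name Point3 and its fields are made up for this example and are not part of the benchmark; the hash combination is just one common pattern, not a recommendation.

```csharp
using System;

// Hypothetical example type; not one of the structs measured below.
public struct Point3 : IEquatable<Point3>
{
    public int X;
    public int Y;
    public int Z;

    // The single place that holds the comparison logic.
    public bool Equals(Point3 other)
    {
        return X == other.X && Y == other.Y && Z == other.Z;
    }

    // Object.Equals is a wrapper around the typed Equals.
    public override bool Equals(object obj)
    {
        return obj is Point3 && Equals((Point3)obj);
    }

    // Values that compare equal must produce the same hash code.
    public override int GetHashCode()
    {
        unchecked
        {
            int hash = 17;
            hash = hash * 31 + X;
            hash = hash * 31 + Y;
            hash = hash * 31 + Z;
            return hash;
        }
    }

    // Both operators are wrappers around the typed Equals as well.
    public static bool operator ==(Point3 left, Point3 right)
    {
        return left.Equals(right);
    }

    public static bool operator !=(Point3 left, Point3 right)
    {
        return !left.Equals(right);
    }
}
```

With this layout the comparison logic lives in one method, and the question for the rest of the post is what the wrappers around it cost.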

Note that this is not about optimization. The cost we are talking about here is negligible compared to the rest of your code, so this is purely for fun.

And this is not an attempt to measure the cost of an additional JMP, which is well documented and varies depending on the scenario.

Test setup

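using System;
using BenchmarkDotNet.Attributes;

// BigJobConfig (the job/runtime configuration) is not shown in this post.
[Config(typeof(BigJobConfig))]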
public class OpEqualsTest
{
    public OpEqualsDirect OpEqualsDirect1 = new OpEqualsDirect() { z = 1 };
    public OpEqualsDirect OpEqualsDirect2 = new OpEqualsDirect() { z = 2 };
    public OpEqualsIndirect OpEqualsIndirect1 = new OpEqualsIndirect() { z = 1 };
    public OpEqualsIndirect OpEqualsIndirect2 = new OpEqualsIndirect() { z = 2 };

    public int Count = 0;

    [GlobalSetup()]
    public void GlobalSetup()
    {
        Count = 0;
    }

    [GlobalCleanup()]
    public void GlobalCleanup()
    {
        Console.WriteLine(Count);
    }

    [Benchmark(Baseline = true, Description = "Direct op_equals")]
    public void OpEqualsDirect()
    {
        if (OpEqualsDirect1 == OpEqualsDirect2)
            Count++;
    }

    [Benchmark(Baseline = false, Description = "Indirect op_equals")]
    public void OpEqualsIndirect()
    {
        if (OpEqualsIndirect1 == OpEqualsIndirect2)
            Count++;
    }
}

The fields are public and Count is used for something after the run because I thought I had an issue with RyuJit being too smart and optimizing the comparison away.

OpEqualsDirect

public struct OpEqualsDirect : IEquatable<OpEqualsDirect>
{
    // Public so the benchmark class can set z with an object initializer.
    public int x;
    public int y;
    public int z;

    public bool Equals(OpEqualsDirect other)
    {
        return x == other.x && y == other.y && z == other.z;
    }

    public static bool operator ==(OpEqualsDirect o1, OpEqualsDirect other)
    {
        return o1.x == other.x && o1.y == other.y && o1.z == other.z;
    }
    public static bool operator !=(OpEqualsDirect o1, OpEqualsDirect other)
    {
        return !(o1.x == other.x && o1.y == other.y && o1.z == other.z);
    }
}

OpEqualsIndirect

public struct OpEqualsIndirect : IEquatable<OpEqualsIndirect>
{
    // Public so the benchmark class can set z with an object initializer.
    public int x;
    public int y;
    public int z;

    public bool Equals(OpEqualsIndirect other)
    {
        return x == other.x && y == other.y && z == other.z;
    }

    public static bool operator ==(OpEqualsIndirect o1, OpEqualsIndirect other)
    {
        return o1.Equals(other);
    }
    public static bool operator !=(OpEqualsIndirect o1, OpEqualsIndirect other)
    {
        return !o1.Equals(other);
    }
}

Decompiled

OpEqualsDirect

This one is pretty much as we would expect.

Bytecode hex

B6027B07000004037B07000004331D027B08000004037B08000004330F027B09000004037B09000004FE012A162A

IL

.method public hidebysig specialname static 
    bool op_Equality (
        valuetype Tedd.BenchmarkRunner.Cases.OpEqualsDirect o1,
        valuetype Tedd.BenchmarkRunner.Cases.OpEqualsDirect other
    ) cil managed 
{
    .maxstack 8

    IL_0000: ldarg.0
    IL_0001: ldfld     int32 Tedd.BenchmarkRunner.Cases.OpEqualsDirect::x
    IL_0006: ldarg.1
    IL_0007: ldfld     int32 Tedd.BenchmarkRunner.Cases.OpEqualsDirect::x
    IL_000C: bne.un.s  IL_002B

    IL_000E: ldarg.0
    IL_000F: ldfld     int32 Tedd.BenchmarkRunner.Cases.OpEqualsDirect::y
    IL_0014: ldarg.1
    IL_0015: ldfld     int32 Tedd.BenchmarkRunner.Cases.OpEqualsDirect::y
    IL_001A: bne.un.s  IL_002B

    IL_001C: ldarg.0
    IL_001D: ldfld     int32 Tedd.BenchmarkRunner.Cases.OpEqualsDirect::z
    IL_0022: ldarg.1
    IL_0023: ldfld     int32 Tedd.BenchmarkRunner.Cases.OpEqualsDirect::z
    IL_0028: ceq
    IL_002A: ret

    IL_002B: ldc.i4.0
    IL_002C: ret
} // end of method OpEqualsDirect::op_Equality

C#

public static bool operator ==(OpEqualsDirect o1, OpEqualsDirect other)
{
    return o1.x == other.x && o1.y == other.y && o1.z == other.z;
}

OpEqualsIndirect

My first question was whether the extra jump would be optimized away. I can’t see that from decompiling the method itself, but for reference we can see that it loads the address of o1 and the value of other, then calls Equals on the struct instance. Pretty much as expected.

Bytecode hex

260F0003280F0000062A

IL

.method public hidebysig specialname static 
    bool op_Equality (
        valuetype Tedd.BenchmarkRunner.Cases.OpEqualsIndirect o1,
        valuetype Tedd.BenchmarkRunner.Cases.OpEqualsIndirect other
    ) cil managed 
{
    .maxstack 8

    IL_0000: ldarga.s  o1
    IL_0002: ldarg.1
    IL_0003: call      instance bool Tedd.BenchmarkRunner.Cases.OpEqualsIndirect::Equals(valuetype Tedd.BenchmarkRunner.Cases.OpEqualsIndirect)
    IL_0008: ret
} // end of method OpEqualsIndirect::op_Equality

C#

public static bool operator ==(OpEqualsIndirect o1, OpEqualsIndirect other)
{
    return o1.Equals(other);
}

Caller

So how about the caller, the benchmark method that uses the == operator? Is the wrapper optimized away there?

.method public hidebysig 
    instance void OpEqualsIndirect () cil managed 
{
    .custom instance void [BenchmarkDotNet.Core]BenchmarkDotNet.Attributes.BenchmarkAttribute::.ctor() = (
        01 00 02 00 54 02 08 42 61 73 65 6c 69 6e 65 00
        54 0e 0b 44 65 73 63 72 69 70 74 69 6f 6e 12 49
        6e 64 69 72 65 63 74 20 6f 70 5f 65 71 75 61 6c 73
    )
    .maxstack 8

    IL_0000: ldarg.0
    IL_0001: ldfld     valuetype Tedd.BenchmarkRunner.Cases.OpEqualsIndirect Tedd.BenchmarkRunner.Tests.OpEqualsTest::_opEqualsIndirect1
    IL_0006: ldarg.0
    IL_0007: ldfld     valuetype Tedd.BenchmarkRunner.Cases.OpEqualsIndirect Tedd.BenchmarkRunner.Tests.OpEqualsTest::_opEqualsIndirect2
    IL_000C: call      bool Tedd.BenchmarkRunner.Cases.OpEqualsIndirect::op_Equality(valuetype Tedd.BenchmarkRunner.Cases.OpEqualsIndirect, valuetype Tedd.BenchmarkRunner.Cases.OpEqualsIndirect)
    IL_0011: brfalse.s IL_0021

    IL_0013: ldarg.0
    IL_0014: ldarg.0
    IL_0015: ldfld     int32 Tedd.BenchmarkRunner.Tests.OpEqualsTest::Count
    IL_001A: ldc.i4.1
    IL_001B: add
    IL_001C: stfld     int32 Tedd.BenchmarkRunner.Tests.OpEqualsTest::Count

    IL_0021: ret
} // end of method OpEqualsTest::OpEqualsIndirect

No, it is calling op_Equality, which in turn calls IEquatable<T>.Equals(T other). The C# compiler does not inline method calls, so any such optimization is left to the JIT.

Benchmark

All of this is before the JIT. So let’s see how it performs with some test runs.

For the test I’m doing a lightweight operation: comparing three ints, where the comparison fails on the third.

BenchmarkDotNet=v0.10.12, OS=Windows 10 Redstone 3 [1709, Fall Creators Update] (10.0.16299.125)
Intel Core i7-3930K CPU 3.20GHz (Ivy Bridge), 1 CPU, 12 logical cores and 6 physical cores
Frequency=3124843 Hz, Resolution=320.0161 ns, Timer=TSC
  [Host]         : .NET Framework 4.7 (CLR 4.0.30319.42000), 32bit LegacyJIT-v4.7.2600.0  [AttachedDebugger]
  LegacyJit-Mono : Mono 5.4.1 (Visual Studio), 64bit 
  Llvm-Mono      : Mono 5.4.1 (Visual Studio), 64bit 
  RyuJit-Clr     : .NET Framework 4.7 (CLR 4.0.30319.42000), 64bit RyuJIT-v4.7.2600.0
  RyuJit-Mono    : Mono 5.4.1 (Visual Studio), 64bit 

MinIterationTime=10.0000 s  Platform=X64  LaunchCount=1  
TargetCount=3  WarmupCount=3

Method	Job	Jit	Runtime	Mean	Error	StdDev	Scaled	ScaledSD	Allocated
‘Direct op_equals’	LegacyJit-Mono	LegacyJit	Mono x64	13.4742 ns	1.0661 ns	0.0602 ns	1.00	0.00	N/A
‘Indirect op_equals’	LegacyJit-Mono	LegacyJit	Mono x64	15.5428 ns	6.9294 ns	0.3915 ns	1.15	0.02	N/A
‘Direct op_equals’	Llvm-Mono	Llvm	Mono x64	13.4156 ns	5.5125 ns	0.3115 ns	1.00	0.00	N/A
‘Indirect op_equals’	Llvm-Mono	Llvm	Mono x64	15.9306 ns	8.7020 ns	0.4917 ns	1.19	0.04	N/A
‘Direct op_equals’	RyuJit-Clr	RyuJit	Clr	0.9740 ns	0.8871 ns	0.0501 ns	1.00	0.00	0 B
‘Indirect op_equals’	RyuJit-Clr	RyuJit	Clr	1.1444 ns	1.1916 ns	0.0673 ns	1.18	0.07	0 B
‘Direct op_equals’	RyuJit-Mono	RyuJit	Mono x64	14.6879 ns	4.9166 ns	0.2778 ns	1.00	0.00	N/A
‘Indirect op_equals’	RyuJit-Mono	RyuJit	Mono x64	15.8684 ns	4.6367 ns	0.2620 ns	1.08	0.02	N/A

Result

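For the most part we can see a penalty of 8% to 19% in our simple test scenario. None of the compilers or JITs appears to optimize away the jump. However, we can see that RyuJit on Clr is doing some black (register?) magic here. It still has a relative overhead of 18%, but it is much faster than the other runtimes. Keep in mind that with only three measured iterations per job, the Error column is as large as or larger than the measured differences, so the exact percentages are rough.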
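One thing not measured here: .NET lets you hint the JIT with [MethodImpl(MethodImplOptions.AggressiveInlining)]. The sketch below is a hypothetical variant of OpEqualsIndirect (the name OpEqualsIndirectInlined is made up, and the Equals(object) and GetHashCode overrides are added only to keep the compiler quiet). The attribute is only a request to the JIT, and whether it changes the numbers above on any of these runtimes is untested.

```csharp
using System;
using System.Runtime.CompilerServices;

// Hypothetical variant of OpEqualsIndirect; not part of the measured benchmark.
public struct OpEqualsIndirectInlined : IEquatable<OpEqualsIndirectInlined>
{
    public int x;
    public int y;
    public int z;

    // Ask the JIT to inline the comparison into its callers where possible.
    [MethodImpl(MethodImplOptions.AggressiveInlining)]
    public bool Equals(OpEqualsIndirectInlined other)
    {
        return x == other.x && y == other.y && z == other.z;
    }

    // The wrapper carries the same hint, so both levels are inlining candidates.
    [MethodImpl(MethodImplOptions.AggressiveInlining)]
    public static bool operator ==(OpEqualsIndirectInlined o1, OpEqualsIndirectInlined other)
    {
        return o1.Equals(other);
    }

    [MethodImpl(MethodImplOptions.AggressiveInlining)]
    public static bool operator !=(OpEqualsIndirectInlined o1, OpEqualsIndirectInlined other)
    {
        return !o1.Equals(other);
    }

    // Added so that overloading == and != does not produce compiler warnings.
    public override bool Equals(object obj)
    {
        return obj is OpEqualsIndirectInlined && Equals((OpEqualsIndirectInlined)obj);
    }

    public override int GetHashCode()
    {
        unchecked
        {
            return (x * 397 ^ y) * 397 ^ z;
        }
    }
}
```

Dropping this struct into the benchmark class as a third case would be the way to find out whether the hint makes any difference.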
