Gapotchenko.FX.Runtime.CompilerServices.Intrinsics
2022.1.4
dotnet add package Gapotchenko.FX.Runtime.CompilerServices.Intrinsics --version 2022.1.4
NuGet\Install-Package Gapotchenko.FX.Runtime.CompilerServices.Intrinsics -Version 2022.1.4
<PackageReference Include="Gapotchenko.FX.Runtime.CompilerServices.Intrinsics" Version="2022.1.4" />
paket add Gapotchenko.FX.Runtime.CompilerServices.Intrinsics --version 2022.1.4
#r "nuget: Gapotchenko.FX.Runtime.CompilerServices.Intrinsics, 2022.1.4"
// Install Gapotchenko.FX.Runtime.CompilerServices.Intrinsics as a Cake Addin
#addin nuget:?package=Gapotchenko.FX.Runtime.CompilerServices.Intrinsics&version=2022.1.4
// Install Gapotchenko.FX.Runtime.CompilerServices.Intrinsics as a Cake Tool
#tool nuget:?package=Gapotchenko.FX.Runtime.CompilerServices.Intrinsics&version=2022.1.4
Overview
The Gapotchenko.FX.Runtime.CompilerServices.Intrinsics module lets you define and compile intrinsic functions.
They can be used in hardware-accelerated implementations of algorithms.
Example
Suppose we are trying to fix the performance bottleneck in the following algorithm:
class BitOperations
{
    // Returns the base 2 logarithm of a specified number.
    public static int Log2_Trivial(uint value)
    {
        int r = 0;
        while ((value >>= 1) != 0)
            ++r;
        return r;
    }
}
log<sub>2</sub> seems to be a trivial operation, but it often becomes a serious bottleneck in path-finding or cryptographic algorithms. We can do better by switching to a table lookup:
class BitOperations
{
    // "Bit Twiddling Hacks" by Sean Eron Anderson:
    // http://graphics.stanford.edu/~seander/bithacks.html
    static readonly int[] m_Log2DeBruijn32 =
    {
        0, 9, 1, 10, 13, 21, 2, 29,
        11, 14, 16, 18, 22, 25, 3, 30,
        8, 12, 20, 28, 15, 17, 24, 7,
        19, 27, 23, 6, 26, 5, 4, 31
    };

    public static int Log2_DeBruijn(uint value)
    {
        // Round down to one less than a power of 2.
        value |= value >> 1;
        value |= value >> 2;
        value |= value >> 4;
        value |= value >> 8;
        value |= value >> 16;

        var index = (value * 0x07C4ACDDU) >> 27;
        return m_Log2DeBruijn32[index];
    }
}
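To make the two stages of the table lookup easier to follow, here is a minimal sketch that separates them. The class name `DeBruijnWalkthrough` and the helper `Smear` are hypothetical names introduced only for illustration; the table and the multiplier are the ones from the listing above.

```csharp
using System;

static class DeBruijnWalkthrough
{
    // Same lookup table as in the listing above.
    static readonly int[] Table =
    {
        0, 9, 1, 10, 13, 21, 2, 29,
        11, 14, 16, 18, 22, 25, 3, 30,
        8, 12, 20, 28, 15, 17, 24, 7,
        19, 27, 23, 6, 26, 5, 4, 31
    };

    // Stage 1: copy the highest set bit into every lower position.
    // A value whose top bit is k becomes 2^(k+1) - 1.
    public static uint Smear(uint value)
    {
        value |= value >> 1;
        value |= value >> 2;
        value |= value >> 4;
        value |= value >> 8;
        value |= value >> 16;
        return value;
    }

    // Stage 2: multiply by the De Bruijn-style constant (the product is meant
    // to wrap modulo 2^32, hence unchecked) and use the top 5 bits as the index.
    public static int Log2(uint value)
    {
        uint index = unchecked(Smear(value) * 0x07C4ACDDU) >> 27;
        return Table[index];
    }
}
```

For example, `DeBruijnWalkthrough.Smear(1000)` gives 1023 (0x3FF), and `DeBruijnWalkthrough.Log2(1000)` gives 9, since 512 <= 1000 < 1024. The explicit `unchecked` only matters if the project is compiled with overflow checking enabled.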
The table lookup is a vast improvement over the loop-based version, but we can do even better.
Meet the Intel 80386, a 32-bit microprocessor introduced in 1985.
It introduced the Bit Scan Reverse (BSR) instruction, which does exactly what we want from Log2 in just a small fraction of the cycles.
Chances are your machine runs on a descendant of that influential CPU, be it AMD Ryzen or Intel Core.
So how can we use the BSR instruction from .NET?
This is where the Gapotchenko.FX.Runtime.CompilerServices.Intrinsics class comes in.
It lets you define an intrinsic implementation of a method with MachineCodeIntrinsicAttribute. Let's see how:
using Gapotchenko.FX.Runtime.CompilerServices;
using System.Runtime.CompilerServices;
using System.Runtime.InteropServices;

class BitOperations
{
    // Use static constructor to ensure that intrinsic methods are initialized (compiled) before they can be used
    static BitOperations() => Intrinsics.InitializeType(typeof(BitOperations));

    static readonly int[] m_Log2DeBruijn32 =
    {
        0, 9, 1, 10, 13, 21, 2, 29,
        11, 14, 16, 18, 22, 25, 3, 30,
        8, 12, 20, 28, 15, 17, 24, 7,
        19, 27, 23, 6, 26, 5, 4, 31
    };

    // Define machine code intrinsic for the method
    [MachineCodeIntrinsic(Architecture.X64, 0x0f, 0xbd, 0xc1)] // BSR EAX, ECX
    [MethodImpl(MethodImplOptions.NoInlining)]
    public static int Log2_Intrinsic(uint value)
    {
        value |= value >> 1;
        value |= value >> 2;
        value |= value >> 4;
        value |= value >> 8;
        value |= value >> 16;

        var index = (value * 0x07C4ACDDU) >> 27;
        return m_Log2DeBruijn32[index];
    }
}
The Log2_Intrinsic method carries a custom attribute that supplies the machine code for the BSR EAX, ECX instruction.
Machine code is tied to a CPU architecture, and the attribute reflects this.
Note that besides using MachineCodeIntrinsicAttribute to define intrinsic method implementations, the BitOperations class should use a static constructor to ensure that the corresponding methods are initialized (compiled) before they are called.
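The three bytes passed to the attribute can be read by hand: `0F BD` is the x86 opcode for BSR with a 32-bit register destination, and the third byte is a ModRM byte that selects the operands. The sketch below decodes that byte. `ModRmDemo` and `DecodeBsr` are hypothetical names used only for illustration and are not part of the package; the decoder handles only the register-to-register form.

```csharp
using System;

static class ModRmDemo
{
    // 32-bit register names in x86 encoding order (index = 3-bit register number).
    static readonly string[] Reg32 =
        { "EAX", "ECX", "EDX", "EBX", "ESP", "EBP", "ESI", "EDI" };

    // Decodes a three-byte "0F BD /r" sequence into assembly text.
    public static string DecodeBsr(byte[] code)
    {
        if (code == null || code.Length != 3 || code[0] != 0x0F || code[1] != 0xBD)
            throw new ArgumentException("Expected a three-byte 0F BD /r sequence.", nameof(code));

        int modRm = code[2];
        int mod = modRm >> 6;        // top 2 bits: addressing mode
        int reg = (modRm >> 3) & 7;  // middle 3 bits: destination register
        int rm = modRm & 7;          // low 3 bits: source register

        if (mod != 3)
            throw new NotSupportedException("Only the register-to-register form is decoded here.");

        return "BSR " + Reg32[reg] + ", " + Reg32[rm];
    }
}
```

`ModRmDemo.DecodeBsr(new byte[] { 0x0F, 0xBD, 0xC1 })` returns `"BSR EAX, ECX"`: 0xC1 is binary 11 000 001, so the destination is register 0 (EAX) and the source is register 1 (ECX). Those registers line up with the Microsoft x64 calling convention, in which the first integer argument arrives in RCX and the result is returned in RAX.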
Here are the execution times of all three implementations (lower is better):
Method | Mean | Error | StdDev |
---|---|---|---|
Log2_Trivial | 4.587 ns | 0.0325 ns | 0.0288 ns |
Log2_DeBruijn | 1.256 ns | 0.0068 ns | 0.0063 ns |
Log2_Intrinsic | 1.038 ns | 0.0660 ns | 0.0947 ns |
Log2_Intrinsic is the fastest: it takes about 17% less time than Log2_DeBruijn and is roughly 4.4 times faster than Log2_Trivial.
The intrinsic compiler may or may not apply the machine code to a method, depending on the current app host environment. When an intrinsic is not applied, the original method implementation is used, providing a graceful, albeit slower, fallback.
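Because either the machine code or the managed body may end up running, it can be worth checking that an implementation agrees with a simple reference. The following is a minimal sketch of such a check, not part of the package: `Log2Check` and `FirstMismatch` are hypothetical names, and the reference is the loop from the first listing. Zero is deliberately left out of the probes, because BSR leaves its destination undefined when the source operand is zero, so the two paths are not guaranteed to agree there.

```csharp
using System;

static class Log2Check
{
    // Reference implementation: the simple loop from the first listing.
    static int Reference(uint value)
    {
        int r = 0;
        while ((value >>= 1) != 0)
            ++r;
        return r;
    }

    // Returns the first nonzero probe value for which the candidate disagrees
    // with the reference, or null when all probes agree.
    public static uint? FirstMismatch(Func<uint, int> candidate)
    {
        for (int bit = 0; bit < 32; bit++)
        {
            uint power = 1u << bit;
            // Probe the power of two, its upper neighbour,
            // and the largest value that has the same top bit.
            uint[] probes = { power, power + 1, power | (power - 1) };
            foreach (uint v in probes)
            {
                if (candidate(v) != Reference(v))
                    return v;
            }
        }
        return null;
    }
}
```

Assuming the `BitOperations` class from the example above, `Log2Check.FirstMismatch(BitOperations.Log2_Intrinsic)` would return `null` when the method agrees with the reference on every probe, whichever implementation the host ended up using.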
Product | Versions (compatible and additional computed target frameworks) |
---|---|
.NET | net5.0 is compatible. net5.0-windows was computed. net6.0 is compatible. net6.0-android was computed. net6.0-ios was computed. net6.0-maccatalyst was computed. net6.0-macos was computed. net6.0-tvos was computed. net6.0-windows was computed. net7.0 is compatible. net7.0-android was computed. net7.0-ios was computed. net7.0-maccatalyst was computed. net7.0-macos was computed. net7.0-tvos was computed. net7.0-windows was computed. net8.0 was computed. net8.0-android was computed. net8.0-browser was computed. net8.0-ios was computed. net8.0-maccatalyst was computed. net8.0-macos was computed. net8.0-tvos was computed. net8.0-windows was computed. |
.NET Core | netcoreapp2.0 is compatible. netcoreapp2.1 is compatible. netcoreapp2.2 was computed. netcoreapp3.0 is compatible. netcoreapp3.1 was computed. |
.NET Standard | netstandard2.0 is compatible. netstandard2.1 is compatible. |
.NET Framework | net46 is compatible. net461 was computed. net462 was computed. net463 was computed. net47 was computed. net471 is compatible. net472 is compatible. net48 was computed. net481 was computed. |
MonoAndroid | monoandroid was computed. |
MonoMac | monomac was computed. |
MonoTouch | monotouch was computed. |
Tizen | tizen40 was computed. tizen60 was computed. |
Xamarin.iOS | xamarinios was computed. |
Xamarin.Mac | xamarinmac was computed. |
Xamarin.TVOS | xamarintvos was computed. |
Xamarin.WatchOS | xamarinwatchos was computed. |
Dependencies
- .NETCoreApp 2.0
  - Gapotchenko.FX (>= 2022.1.4)
- .NETCoreApp 2.1
  - Gapotchenko.FX (>= 2022.1.4)
- .NETCoreApp 3.0
  - Gapotchenko.FX (>= 2022.1.4)
- .NETFramework 4.6
  - Gapotchenko.FX (>= 2022.1.4)
- .NETFramework 4.7.1
  - Gapotchenko.FX (>= 2022.1.4)
- .NETFramework 4.7.2
  - Gapotchenko.FX (>= 2022.1.4)
- .NETStandard 2.0
  - Gapotchenko.FX (>= 2022.1.4)
- .NETStandard 2.1
  - Gapotchenko.FX (>= 2022.1.4)
- net5.0
  - Gapotchenko.FX (>= 2022.1.4)
- net6.0
  - Gapotchenko.FX (>= 2022.1.4)
- net7.0
  - Gapotchenko.FX (>= 2022.1.4)
NuGet packages (1)
Showing the top 1 NuGet packages that depend on Gapotchenko.FX.Runtime.CompilerServices.Intrinsics:
Package | Downloads |
---|---|
Gapotchenko.FX.Numerics
The module provides hardware-accelerated operations for numeric data types. |
GitHub repositories
This package is not used by any popular GitHub repositories.
Version | Downloads | Last updated |
---|---|---|
2024.1.3 | 253 | 11/10/2024 |
2022.2.7 | 5,846 | 5/1/2022 |
2022.2.5 | 1,776 | 5/1/2022 |
2022.1.4 | 1,748 | 4/6/2022 |
2021.2.21 | 1,079 | 1/21/2022 |
2021.2.20 | 980 | 1/17/2022 |
2021.1.5 | 735 | 7/6/2021 |
2020.2.2-beta | 487 | 11/21/2020 |
2020.1.15 | 880 | 11/5/2020 |
2020.1.9-beta | 546 | 7/14/2020 |
2020.1.8-beta | 539 | 7/14/2020 |
2020.1.7-beta | 571 | 7/14/2020 |
2020.1.1-beta | 634 | 2/11/2020 |
2019.3.7 | 884 | 11/4/2019 |
2019.2.20 | 935 | 8/13/2019 |