SpiseMisu.Text.Dstring
0.11.15
dotnet add package SpiseMisu.Text.Dstring --version 0.11.15
NuGet\Install-Package SpiseMisu.Text.Dstring -Version 0.11.15
<PackageReference Include="SpiseMisu.Text.Dstring" Version="0.11.15" />
<PackageVersion Include="SpiseMisu.Text.Dstring" Version="0.11.15" />
<PackageReference Include="SpiseMisu.Text.Dstring" />
paket add SpiseMisu.Text.Dstring --version 0.11.15
#r "nuget: SpiseMisu.Text.Dstring, 0.11.15"
#:package SpiseMisu.Text.Dstring@0.11.15
#addin nuget:?package=SpiseMisu.Text.Dstring&version=0.11.15
#tool nuget:?package=SpiseMisu.Text.Dstring&version=0.11.15
A Danish string (dstring) is a German-string-like implementation for .NET, optimized for managed memory.
Improved version and perhaps final
(improved version, TBC further)
Metadata flags (brainstorm)
Flags could be something like (compress data):
- isBin: 1001010101 (log 02. / log 02. = 1.0-bit) => 8 vals in 1-byte
- isDig: 0123456789 (log 10. / log 02. = 3.3-bit) => 9 vals in 4-bytes
- isHex: AF332EC219 (log 16. / log 02. = 4.0-bit) => 2 vals in 1-byte
- isB64: (log 64. / log 02. = 6.0-bit) => 4 vals in 3-bytes
- isUID: d6c3ff78-0546-42dd-abc8-24a9e74ccf90 => 36 vals in 16-bytes
- isF32: 1.f / 3.f = 0.3333333433f => 1 val in 4-bytes (fixed)
- isF64: 1. / 3. = 0.3333333333 => 1 val in 8-bytes (fixed)
- …
and/or
- isRaw .......: for binary array
- isASCII .....: for ASCII chars
- isUTF8.......: for multiple single-byte UTF-8 chars. Ex: "æ ø å ñ"
- isUnicode ...: for multi-byte Unicode chars. Ex: "æ ø å ñ"
- …
The point is that, too many times, developers are assumed to know which format a given string is formatted and/or encoded in. With these flags, we can now get that info directly from the string itself.
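The packing densities in the first flag list follow from `log(alphabet) / log(2)` bits per symbol. A minimal sketch of that arithmetic follows; the function names `bitsPerSymbol` and `bytesFor` are hypothetical and not part of this package. The byte count uses integer arithmetic on `BigInteger` to avoid floating-point rounding at exact powers of two.

```fsharp
open System
open System.Numerics

// Bits of information per symbol for an alphabet of the given size
// (used for display only, hence floating point is acceptable here).
let bitsPerSymbol (alphabet: int) : float =
    Math.Log(float alphabet) / Math.Log 2.0

// Smallest number of whole bytes able to hold `count` symbols of the alphabet.
// alphabet^count combinations exist, so the largest value is alphabet^count - 1.
let bytesFor (alphabet: int) (count: int) : int =
    let largest = BigInteger.Pow(BigInteger alphabet, count) - BigInteger.One
    let bits = int (largest.GetBitLength())
    max 1 ((bits + 7) / 8)

for (name, alphabet, count) in [ ("isBin", 2, 8); ("isDig", 10, 9); ("isHex", 16, 2); ("isB64", 64, 4) ] do
    printfn "%s: %.1f bits/symbol, %d symbols need %d byte(s)"
        name (bitsPerSymbol alphabet) count (bytesFor alphabet count)
```

Nine decimal digits span 30 bits and therefore need four bytes, whereas three bytes hold at most seven decimal digits.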
NOTE: To convert between the different bases, just use bigint: .ToByteArray() and the constructor BigInteger(byte[] value). That should ensure correctness. For floats (F32/F64), use SingleToUInt32Bits / UInt32BitsToSingle and DoubleToUInt64Bits / UInt64BitsToDouble before feeding to bigint, to avoid losing precision.
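    open System
    open System.Globalization
    open System.Numerics

    let hex2bs (hex: string) =
        // NumberStyles.HexNumber rejects a "0x" prefix, so strip it first
        let digits =
            if hex.StartsWith("0x", StringComparison.OrdinalIgnoreCase) then hex.Substring(2) else hex
        // The leading "0" stops BigInteger from reading a first digit >= 8 as a sign bit
        // Result is little-endian; a trailing 0x00 sign byte may be present
        BigInteger.Parse("0" + digits, NumberStyles.HexNumber).ToByteArray()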
Source: https://learn.microsoft.com/en-us/dotnet/api/system.globalization.numberstyles
Use the Parse(ReadOnlySpan<Char>, NumberStyles, IFormatProvider) overload to avoid allocating an array from the string.
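The float part of the note can be sketched as a bit-level round trip. The helper names below (`f32ToBig`, `bigToF32`, `f64ToBig`, `bigToF64`) are hypothetical; only the `BitConverter` and `BigInteger` members are real .NET APIs. The idea is that the IEEE 754 bit pattern is carried as an integer, so no decimal formatting is involved and no precision is lost.

```fsharp
open System
open System.Numerics

// Reinterpret the 32-bit pattern of a float32 as an unsigned integer,
// then widen it to a BigInteger.
let f32ToBig (x: float32) : BigInteger =
    BigInteger(BitConverter.SingleToUInt32Bits x)

// Narrow back to uint32 and reinterpret the bits as a float32.
let bigToF32 (n: BigInteger) : float32 =
    BitConverter.UInt32BitsToSingle(uint32 n)

// Same approach for 64-bit floats.
let f64ToBig (x: float) : BigInteger =
    BigInteger(BitConverter.DoubleToUInt64Bits x)

let bigToF64 (n: BigInteger) : float =
    BitConverter.UInt64BitsToDouble(uint64 n)

let third32 = 1.0f / 3.0f
let third64 = 1.0 / 3.0
printfn "%b" (bigToF32 (f32ToBig third32) = third32)
printfn "%b" (bigToF64 (f64ToBig third64) = third64)
```

Because the conversion goes through the raw bits, the resulting `BigInteger` is always non-negative and fits in 4 or 8 bytes respectively.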
NOTE: Optional, as we could squeeze in a few more flags for when the byte array is initiated (len becomes 8 = 0b0000_1000UL). These flags would only be relevant for the bigger strings. For example, we could aim to compress with (stream) brotli, gzip and zlib (in dotnet core).
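As an illustration of the stream-based compression idea, here is a minimal sketch using `BrotliStream` from `System.IO.Compression`. The `compress` and `decompress` helpers are hypothetical and say nothing about how (or whether) the package does this; `GZipStream` and `ZLibStream` would follow the same shape.

```fsharp
open System.IO
open System.IO.Compression

// Compress a byte array into a new byte array.
let compress (bytes: byte[]) : byte[] =
    use output = new MemoryStream()
    // `using` disposes the BrotliStream before `output` is read,
    // which flushes the remaining compressed data.
    using (new BrotliStream(output, CompressionLevel.Optimal, true)) (fun brotli ->
        brotli.Write(bytes, 0, bytes.Length))
    output.ToArray()

// Inverse of `compress`.
let decompress (bytes: byte[]) : byte[] =
    use input = new MemoryStream(bytes)
    use brotli = new BrotliStream(input, CompressionMode.Decompress)
    use output = new MemoryStream()
    brotli.CopyTo(output)
    output.ToArray()

let data = Array.replicate 4096 65uy
let packed = compress data
printfn "%d -> %d bytes" data.Length packed.Length
```

Compression only pays off for larger or repetitive payloads, which matches the remark that such flags would only be relevant for the bigger strings.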
Initial draft
A dstring consists of 16 bytes (128 bits) of contiguous memory, where:
- The first byte stores a bitmask for the next seven bytes as well as for the byte [] pointer.
- The next seven bytes store the first seven bytes of the string. If the string is shorter than seven bytes, the remaining bytes are initialized to the default value of zero and the i'th place in the bitmask is set to zero.
- Finally, the last eight bytes contain an x64 pointer to a byte [] (on the heap) holding the rest of the bytes in the string. If the string is shorter than eight bytes, the byte [] is not instantiated (null value) and the 8'th place in the bitmask is set to zero.
1.a) Example of a 4-byte dstring ("test"). No heap allocation:
+--------+----+----+----+----+----+----+----+----------+
|□□□□■■■■|0x74|0x65|0x73|0x74|0x00|0x00|0x00| <NULL> |
+--------+----+----+----+----+----+----+----+----------+
bit-mask b0 b1 b2 b3 b4 b5 b6 pointer
—— —— —— ——
1.b) Example of a dstring of 8 or more bytes ("Danish string") + heap allocation:
0x551A4290 (byte[] on heap)
|
v
+--------+----+----+---+----+----------+ +----+----+---+----+
|■■■■■■■■|0x44|0x61| … |0x20|0x551A4290| ---> |0x73|0x74| … |0x67|
+--------+----+----+---+----+----------+ +----+----+---+----+
bit-mask b0 b1 … b6 pointer b7 b8 … bn
—— —— —— ——————— —— —— ——
1.c) Example of an array of nine dstrings:
extra allocated byte arrays on heap ----+------------+------------+
| | |
v | |
0x6796EE96 | |
+-+----+-----------------------+ | | |
|i|memo| continuous memory | v | |
+-+----+--------+---+----------+ +---+ v |
|0|0x00|■■■■■■■■| … |0x6796EE96| -----> | … | 0x53EB31F6 |
+-+----+--------+---+----------+ +---+ | |
|1|0x10|□□□□□□■■| … | <NULL> | v |
+-+----+--------+---+----------+ +---+ v
|2|0x20|■■■■■■■■| … |0x53EB31F6| ------------------> | … | 0x4A424B5E
+-+----+--------+---+----------+ +---+ |
|…|0x…0|□□□■■■■■| … | <NULL> | v
+-+----+--------+---+----------+ +---+
|8|0x80|■■■■■■■■| … |0x4A424B5E| -------------------------------> | … |
+-+----+--------+---+----------+ +---+
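The layout from the initial draft can be sketched in a few lines. `DstringSketch`, `ofBytes` and `toBytes` are hypothetical names for illustration only and are not the package's actual type or API. The sketch assumes the bit-mask occupies the lowest byte of a `uint64`, with mask bits 0 to 6 marking the inline bytes and bit 7 marking the presence of the heap array.

```fsharp
open System.Numerics
open System.Text

// Hypothetical stand-in for the 16-byte layout (8-byte head + 8-byte reference on x64).
[<Struct>]
type DstringSketch =
    { Head: uint64   // byte 0: bit-mask, bytes 1..7: first seven string bytes
      Tail: byte[] } // null when the string fits in the seven inline bytes

let ofBytes (bs: byte[]) : DstringSketch =
    let inlineCount = min 7 bs.Length
    // One mask bit per inline byte in use (bits 0..6)
    let mutable mask = (1uy <<< inlineCount) - 1uy
    let mutable head = 0UL
    for i in 0 .. inlineCount - 1 do
        // Inline byte i is stored in byte position i + 1 of Head
        head <- head ||| (uint64 bs.[i] <<< (8 * (i + 1)))
    let tail =
        if bs.Length > 7 then
            mask <- mask ||| 0x80uy // bit 7: heap array present
            bs.[7..]
        else
            null
    { Head = head ||| uint64 mask; Tail = tail }

let toBytes (d: DstringSketch) : byte[] =
    let mask = byte d.Head
    let inlineCount = BitOperations.PopCount(uint32 (mask &&& 0x7Fuy))
    let head = Array.init inlineCount (fun i -> byte (d.Head >>> (8 * (i + 1))))
    if (mask &&& 0x80uy) <> 0uy then Array.append head d.Tail else head

let small = ofBytes (Encoding.ASCII.GetBytes "test")
let large = ofBytes (Encoding.ASCII.GetBytes "Danish string")
printfn "0x%02X %b" (byte small.Head) (isNull small.Tail)
printfn "0x%02X %d" (byte large.Head) large.Tail.Length
```

Keeping the mask and the first seven bytes in a single 8-byte word means short strings need no heap allocation and that prefixes of two values can be compared without following the pointer.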
Project structure
├── SpiseMisu.Text.Dstring
│ ├── lib
│ │ └── utils.fs
│ ├── SpiseMisu.Text.Dstring.fsproj
│ └── dstring.fs
├── SpiseMisu.Text.Dstring.Perfs
│ ├── SpiseMisu.Text.Dstring.Perfs.fsproj
│ └── program.fs
├── SpiseMisu.Text.Dstring.Tests
│ ├── SpiseMisu.Text.Dstring.Tests.fsproj
│ ├── program.fs
│ └── tests.fs
├── demo
│ └── dstring.fsx
├── SpiseMisu.Text.Dstring.sln
├── global.json
├── license.txt
├── license_nuget_agpl-3.0-only.txt
└── readme.md
Memory layout
(how to use dotnet-dump to navigate the heap, TBC)
Benchmarks
// * Summary *
BenchmarkDotNet v0.15.4, Linux NixOS 25.05 (Warbler)
12th Gen Intel Core i7-12800H 0.40GHz, 1 CPU, 20 logical and 14 physical cores
.NET SDK 8.0.414
[Host] : .NET 8.0.20 (8.0.20, 8.0.2025.41914), X64 RyuJIT x86-64-v3 DEBUG
Job-OVERNF : .NET 8.0.20 (8.0.20, 8.0.2025.41914), X64 RyuJIT x86-64-v3
Job=Job-OVERNF Runtime=.NET 8.0 IterationCount=1
LaunchCount=0 WarmupCount=0 Error=NA
| Method | N | Mean | Ratio | Allocated | Alloc Ratio |
|--------------------------------------------------- |-------- |-----------:|-------:|----------:|------------:|
| 'Array.zeroCreate<string> x.N' | 1000000 | 2.183 ms | 1.00 | 7.63 MB | 1.00 |
| 'Array.zeroCreate<dstring> x.N' | 1000000 | 5.296 ms | 2.43 | 15.26 MB | 2.00 |
| 'x.guids |> Array.map Encoding.ASCII.GetString' | 1000000 | 121.282 ms | 55.57 | 61.04 MB | 8.00 |
| 'x.guids |> Array.map Dstring.Bytes.toDstring' | 1000000 | 63.640 ms | 29.16 | 53.41 MB | 7.00 |
| 'x.sha256s |> Array.map Encoding.ASCII.GetString' | 1000000 | 215.073 ms | 98.54 | 91.55 MB | 12.00 |
| 'x.sha256s |> Array.map Dstring.Bytes.toDstring' | 1000000 | 76.005 ms | 34.82 | 68.66 MB | 9.00 |
| 'x.strings |> Array.sort' | 1000000 | 264.986 ms | 121.41 | 7.63 MB | 1.00 |
| 'x.strings |> Array.sortDescending' | 1000000 | 288.462 ms | 132.17 | 7.63 MB | 1.00 |
| 'x.strings |> Array.map Dstring.UTF8.fromString' | 1000000 | 112.914 ms | 51.74 | 53.41 MB | 7.00 |
| 'x.dstrings |> Array.map Dstring.UTF8.toString' | 1000000 | 252.340 ms | 115.62 | 98.81 MB | 12.95 |
| 'x.dstrings |> Dstring.Array.sort' | 1000000 | 174.879 ms | 80.13 | 15.26 MB | 2.00 |
| 'x.dstrings |> Dstring.Array.sortDescending' | 1000000 | 180.526 ms | 82.71 | 15.26 MB | 2.00 |
| 'x.dstrings |> Dstring.Array.sortPrefix' | 1000000 | 155.760 ms | 71.37 | 15.26 MB | 2.00 |
| 'x.dstrings |> Dstring.Array.sortPrefixDescending' | 1000000 | 157.594 ms | 72.21 | 15.26 MB | 2.00 |
// * Hints *
HideColumnsAnalyser
Summary -> Hidden columns: Error
// * Legends *
N : Value of the 'N' parameter
Mean : Arithmetic mean of all measurements
Ratio : Mean of the ratio distribution ([Current]/[Baseline])
Allocated : Allocated memory per single operation (managed only, inclusive, 1KB = 1024B)
Alloc Ratio : Allocated memory ratio distribution ([Current]/[Baseline])
1 ms : 1 Millisecond (0.001 sec)
License
Source code in this repository is ONLY covered by the Server Side Public License, v 1, while the rest (knowhow, text, media, …) is covered by the
Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International license.
However, nuget.org does not permit publishing a package under a license that is approved by neither the OSI nor the FSF:
Pushing SpiseMisu.Text.Dstring.0.11.0.nupkg to 'https://www.nuget.org/api/v2/package'...
PUT https://www.nuget.org/api/v2/package/
BadRequest https://www.nuget.org/api/v2/package/ 846ms
error: Response status code does not indicate success: 400 (License expression must only contain licenses that are approved by Open Source Initiative or Free Software Foundation. Unsupported licenses: SSPL-1.0.).
The CIL-bytecode content of the nuget package is therefore dual-licensed
under the GNU Affero General Public License v3.0 only, while the
rest (knowhow, text, media, …) is covered by the Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International
license.
For more info on licenses compatible with nuget packages, see the SPDX License
List.
| Product | Versions (compatible and additional computed target framework versions) |
|---|---|
| .NET | net8.0 is compatible. net8.0-android was computed. net8.0-browser was computed. net8.0-ios was computed. net8.0-maccatalyst was computed. net8.0-macos was computed. net8.0-tvos was computed. net8.0-windows was computed. net9.0 was computed. net9.0-android was computed. net9.0-browser was computed. net9.0-ios was computed. net9.0-maccatalyst was computed. net9.0-macos was computed. net9.0-tvos was computed. net9.0-windows was computed. net10.0 was computed. net10.0-android was computed. net10.0-browser was computed. net10.0-ios was computed. net10.0-maccatalyst was computed. net10.0-macos was computed. net10.0-tvos was computed. net10.0-windows was computed. |
- net8.0
  - FSharp.Core (>= 8.0.403)