LinearTsvParser 1.1.5

Linear TSV Parser

Reads and writes Linear TSV files in a safe, lossless way. Both async and sync I/O operations are supported.

NuGet package

Available at: https://www.nuget.org/packages/LinearTsvParser

To include it in a .NET Core project:

dotnet add package LinearTsvParser

Examples

Reading a .tsv.gz file in async mode:

using System.IO;
using System.IO.Compression;
using System.Threading.Tasks;
using System.Collections.Generic;
using LinearTsvParser;

public class Example {
    public async Task ReadTsv() {
        using (var input = File.OpenRead("/tmp/test.tsv.gz"))
        using (var gzip = new GZipStream(input, CompressionMode.Decompress))
        using (var tsvReader = new TsvReader(gzip)) {
            while(!tsvReader.EndOfStream) {
                List<string> fields = await tsvReader.ReadLineAsync();
            }
        }
    }
}

Writing a .tsv.gz file in sync mode:

using System.IO;
using System.IO.Compression;
using System.Collections.Generic;
using LinearTsvParser;

public class Example {
    public void WriteTsv(List<string[]> data) {
        using (var outfile = File.Create("/tmp/test.tsv.gz"))
        using (var gzip = new GZipStream(outfile, CompressionMode.Compress))
        using (var tsvWriter = new TsvWriter(gzip)) {
            tsvWriter.WriteLine(new List<string>{ "One", "Two\tTwo", "Three" });

            foreach(string[] fields in data) {
                tsvWriter.WriteLine(fields);
            }
        }
    }
}

The writer accepts any enumerable of strings, such as string[] or List<string>.

You can output the TSV to the standard output by using Console.Out in the constructor:

using System;
using System.IO;
using LinearTsvParser;

public class Example {
    public void WriteTsv() {
        using var tsvWriter = new TsvWriter(Console.Out);

        tsvWriter.WriteLine(new string[] {"One", "Two", "Three"});
    }
}

The Linear TSV format

  • Fields are separated by TAB characters
  • Text encoding is UTF-8
  • The reader can parse lines with any of these three endings: \n, \r\n, \r
  • The writer is restricted to output only the \n character as EOL
  • Special characters inside the fields are replaced (both ways):
    • Newline ⇒ "\n"
    • Carriage return ⇒ "\r"
    • Tab ⇒ "\t"
    • "\" (backslash) ⇒ "\\"
  • Column counts are not validated; they can vary per line.
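
The escaping rules above can be sketched in a few lines of C#. This is a minimal illustration of the format, not the library's implementation: the class LinearTsvSketch and its methods EscapeField, UnescapeField, FormatLine and ParseLine are hypothetical names and are not part of the LinearTsvParser API. The handling of unknown escape sequences and of a trailing lone backslash (both kept verbatim) is an assumption of this sketch.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

// Hypothetical helper for illustration only; not part of LinearTsvParser.
public static class LinearTsvSketch
{
    // Replace each special character with its two-character escape.
    public static string EscapeField(string field)
    {
        var sb = new StringBuilder(field.Length);
        foreach (char c in field)
        {
            switch (c)
            {
                case '\\': sb.Append("\\\\"); break;
                case '\t': sb.Append("\\t"); break;
                case '\n': sb.Append("\\n"); break;
                case '\r': sb.Append("\\r"); break;
                default: sb.Append(c); break;
            }
        }
        return sb.ToString();
    }

    // Reverse of EscapeField. Assumption of this sketch: unknown escapes
    // and a trailing lone backslash are copied through unchanged.
    public static string UnescapeField(string text)
    {
        var sb = new StringBuilder(text.Length);
        for (int i = 0; i < text.Length; i++)
        {
            char c = text[i];
            if (c != '\\' || i + 1 == text.Length)
            {
                sb.Append(c);
                continue;
            }
            char next = text[++i];
            switch (next)
            {
                case '\\': sb.Append('\\'); break;
                case 't': sb.Append('\t'); break;
                case 'n': sb.Append('\n'); break;
                case 'r': sb.Append('\r'); break;
                default: sb.Append('\\').Append(next); break;
            }
        }
        return sb.ToString();
    }

    // Escape every field, join with TAB, and terminate with "\n".
    public static string FormatLine(IEnumerable<string> fields)
    {
        var escaped = new List<string>();
        foreach (string field in fields) escaped.Add(EscapeField(field));
        return string.Join("\t", escaped) + "\n";
    }

    // Strip a trailing \n, \r\n or \r, split on TAB, and unescape each field.
    // An empty line yields a single empty field in this sketch.
    public static List<string> ParseLine(string line)
    {
        string body = line.TrimEnd('\r', '\n');
        var fields = new List<string>();
        foreach (string raw in body.Split('\t')) fields.Add(UnescapeField(raw));
        return fields;
    }
}
```

Because real tabs and line breaks inside a field are always escaped, every raw TAB in a line is a field separator and every raw line break is an end of line. That is why splitting on TAB before unescaping is safe, and why a value such as "Two\tTwo" survives a FormatLine and ParseLine round trip unchanged.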

Benchmark

The benchmark compares the performance of this library (lib) with a "native" solution based on string replace operations. The native solution is slower and allocates more memory than the library: in the results below the gap is small for reading (about 12% slower, 7% more memory) and larger for writing (about 77% slower, 53% more memory). The benchmark test can be found here: Benchmark.cs

Method           Mean      Error     StdDev    Allocated
LibReadTest      275.5 ms   7.15 ms  21.08 ms  62.31 MB
NativeReadTest   309.8 ms  10.08 ms  29.26 ms  66.66 MB
LibWriteTest     110.8 ms   2.81 ms   8.25 ms  23.52 MB
NativeWriteTest  195.9 ms   4.16 ms  11.99 ms  36.06 MB
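
For readers wondering what a string-replace based writer might look like, here is a minimal sketch. The code in Benchmark.cs is not reproduced here, so the shape below is an assumption, and the names NaiveTsvSketch, EscapeField and FormatLine are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical "native" writer built on chained string.Replace calls.
// This is an assumed shape, not the code used in the benchmark.
public static class NaiveTsvSketch
{
    // The backslash must be replaced first; otherwise the backslashes
    // introduced by the later replacements would be doubled as well.
    public static string EscapeField(string field) =>
        field.Replace("\\", "\\\\")
             .Replace("\t", "\\t")
             .Replace("\n", "\\n")
             .Replace("\r", "\\r");

    // Escape every field, join with TAB, and terminate with "\n".
    public static string FormatLine(IEnumerable<string> fields) =>
        string.Join("\t", fields.Select(f => EscapeField(f))) + "\n";
}
```

Each Replace call scans the whole string again and may allocate an intermediate string, which would be consistent with the higher time and memory figures of the native rows in the table. A single pass over the characters avoids both costs.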

Configurations

Run the unit tests (you can also run them one by one from VS Code):

dotnet build -c Debug
dotnet test

Run the benchmark:

dotnet run -c Release

Create package for NuGet:

dotnet build -c Prod
dotnet pack -c Prod

Compatible target frameworks

  • Included in the package: .NET Core 3.1 (netcoreapp3.1)
  • Computed as compatible: net5.0 (plus net5.0-windows); net6.0 and net7.0 (plus their android, ios, maccatalyst, macos, tvos and windows variants); net8.0 (plus its android, browser, ios, maccatalyst, macos, tvos and windows variants)
  • Dependencies: none

NuGet packages (1)

Showing the top 1 NuGet packages that depend on LinearTsvParser:

Package Downloads
FatCatDB

Zero configuration, high performance database library for ETL workflows

GitHub repositories

This package is not used by any popular GitHub repositories.

Version  Downloads  Last updated
1.1.7    3,964      1/18/2020
1.1.6    551        1/14/2020
1.1.5    443        1/14/2020
1.1.4    506        1/12/2020
1.1.3    506        1/12/2020
1.1.2    506        1/12/2020
1.1.1    513        1/12/2020
1.1.0    504        1/12/2020
1.0.7    454        1/12/2020
1.0.6    438        1/12/2020
1.0.5    494        1/12/2020
1.0.4    493        1/12/2020
1.0.3    489        1/12/2020
1.0.2    541        1/12/2020
1.0.1    595        1/11/2020
1.0.0    577        1/11/2020