Dentlr 2.0.0

dotnet add package Dentlr --version 2.0.0

Dentlr

Dentlr is a lexer base class for Antlr4 that generates INDENT and DEDENT tokens.

Dentlr can be used to parse positional grammars where leading whitespace determines scope level. Languages like Python and F# use this syntax.

Dentlr implements a C# lexer base class that detects newline tokens and checks the indent of the token that follows. INDENT and DEDENT tokens are inserted into the token stream that the lexer emits.

Dentlr is dependent on: Antlr4 9.2.0

  • Separate the Antlr lexer and parser grammar files. Put options { tokenVocab=MyLexer; } in the file that declares parser grammar MyParser;.
  • Detects indents only after a newline.
  • The first indent encountered determines the indent size/length. All subsequent indents must be a multiple of that size or an InvalidIndentException is thrown.
  • Optionally use the IndentSize property to preset a fixed number of spaces to use for an INDENT.
  • Flexible handling of tokens that match whitespace which can be emitted before or after the INDENT and DEDENT tokens or skipped entirely.
  • Does not detect tabs \t (todo).
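
To make the indent-size rule above concrete, here is a minimal standalone sketch in C#. It is not Dentlr's implementation and does not use Antlr. ComputeDents is a hypothetical helper, FormatException stands in for InvalidIndentException, and skipping blank lines and closing open levels at the end of the input are assumptions of this sketch rather than documented Dentlr behavior.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper, not part of Dentlr: returns the INDENT/DEDENT markers
// that the "first indent fixes the size" rule would produce for the input.
// Pass indentSize > 0 to preset the size (similar in spirit to IndentSize).
static List<string> ComputeDents(string source, int indentSize = 0)
{
    var result = new List<string>();
    int level = 0;

    foreach (var rawLine in source.Split('\n'))
    {
        var line = rawLine.TrimEnd('\r');

        // Assumption of this sketch: whitespace-only lines are ignored.
        if (line.Trim().Length == 0) continue;

        // Count leading spaces only; tabs are not handled.
        int spaces = line.Length - line.TrimStart(' ').Length;

        // The first indent encountered determines the indent size.
        if (indentSize == 0 && spaces > 0) indentSize = spaces;

        // Every later indent must be a multiple of that size.
        // FormatException is a stand-in for InvalidIndentException.
        if (indentSize > 0 && spaces % indentSize != 0)
            throw new FormatException(
                $"Indent of {spaces} is not a multiple of {indentSize}.");

        int newLevel = indentSize == 0 ? 0 : spaces / indentSize;
        for (; level < newLevel; level++) result.Add("INDENT");
        for (; level > newLevel; level--) result.Add("DEDENT");
    }

    // Assumption of this sketch: close any open levels at end of input.
    for (; level > 0; level--) result.Add("DEDENT");
    return result;
}

var dents = ComputeDents("a\n  b\n    c\nd\n");
Console.WriteLine(string.Join(" ", dents));
// prints "INDENT INDENT DEDENT DEDENT"
```

In this sketch the first indented line (two spaces) sets the size, so a later line indented by three spaces would throw.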

Usage

In the lexer grammar file, declare the tokens that will be used for INDENT and DEDENT in a tokens {} block at the beginning of the file. The names of these tokens do not matter, because the tokens used by the DentlrLexer base class are initialized explicitly in code.

Next, specify the base (or super) class of the lexer class that will be generated from the lexer grammar file. The expression options { superClass=Dentlr.DentlrLexer; } does just that.

Finally, define a NEWLINE (or EOL) token. The DentlrLexer uses it to detect newlines, which trigger indent (and dedent) recognition.

A typical lexer grammar file looks something like this:

lexer grammar MyLexer;

tokens { INDENT, DEDENT }
options { superClass=Dentlr.DentlrLexer; }

// need a newline (EndOfLine) token.
EOL: '\r'? '\n' | '\r';

// ... your tokens

There are two ways to initialize the tokens used by the DentlrLexer. One is to override the NextToken method on your lexer class and do a one-time initialization:

public partial class MyLexer
{
    public override IToken? NextToken()
    {
        if (!AreTokensInitialized)
            InitializeTokens(INDENT, DEDENT, EOL);

        return base.NextToken();
    }
}

Alternatively, call InitializeTokens() when the MyLexer object is created:

    string source ...;
    var stream = new AntlrInputStream(source);
    var lexer = new MyLexer(stream)
    {
        IndentSize = 4      // optional predetermined fixed indent size
    };
    lexer.InitializeTokens(MyLexer.INDENT, MyLexer.DEDENT, MyLexer.EOL);
    ...

An InvalidOperationException is thrown when the tokens are not initialized.


TODO

  • Tokens in ctor
  • No Indent (or Dedent) tokens for empty lines with only whitespace
  • parse tabs / TabSize property
  • invalid indent error mode (as spaces, ignore, adjust, throw)
  • Implement a lexer base class for other languages (java, ts, python)

Target framework: net8.0 is compatible. The platform-specific variants (net8.0-android, net8.0-browser, net8.0-ios, net8.0-maccatalyst, net8.0-macos, net8.0-tvos, net8.0-windows) were computed.

Version Downloads Last updated
2.0.0 88 3/9/2024
1.0.1 333 11/23/2022

Fix EOL handling.
Removed Whitespace mode.