Dart

Rumil

0.5.0

Parser combinators where parsers are values

Rumil is a parser combinator library that represents parsers as data, not functions. You describe what to parse, and a separate interpreter handles the rest: backtracking, memoization, and error reporting. Because the interpreter sees the grammar as plain values, it can run left-recursive grammars without rewriting, keep deep chains stack-safe, and make composition predictable.

dart pub add rumil

Examples

Parsers are values you compose, not functions you call

// Describe what to parse. Nothing runs yet.
final id = letter & (letter | digit).many;
// `expr` is a value parser defined elsewhere, such as the rule in the next example.
final assign = id & char('=').trim & expr;

// The interpreter handles the rest.
final result = assign.parse('x = 42');
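
One way to picture "parsers as values" is a sealed class hierarchy plus a single function that interprets it. The sketch below is illustrative only: `P`, `Ch`, `Seq`, `Alt`, `Many`, and `match` are hypothetical names, not Rumil's API. This toy interpreter is also plainly recursive, so it has none of the memoization, error reporting, or stack safety of the real one.

```dart
// Hypothetical names throughout; this is not Rumil's actual type hierarchy.
sealed class P {
  const P();
}

/// Matches one exact character.
final class Ch extends P {
  final String c;
  const Ch(this.c);
}

/// Matches [first], then [second].
final class Seq extends P {
  final P first, second;
  const Seq(this.first, this.second);
}

/// Tries [left]; if it fails, tries [right] from the same position.
final class Alt extends P {
  final P left, right;
  const Alt(this.left, this.right);
}

/// Matches [inner] zero or more times.
final class Many extends P {
  final P inner;
  const Many(this.inner);
}

/// The interpreter: returns the offset after a match, or null on failure.
int? match(P p, String input, int pos) {
  switch (p) {
    case Ch(:final c):
      return pos < input.length && input[pos] == c ? pos + 1 : null;
    case Seq(:final first, :final second):
      final mid = match(first, input, pos);
      return mid == null ? null : match(second, input, mid);
    case Alt(:final left, :final right):
      return match(left, input, pos) ?? match(right, input, pos);
    case Many(:final inner):
      var at = pos;
      while (true) {
        final next = match(inner, input, at);
        // Stop on failure, or when the inner parser consumed nothing.
        if (next == null || next == at) return at;
        at = next;
      }
  }
}
```

Building `Seq(Ch('a'), Many(Alt(Ch('b'), Ch('c'))))` runs nothing. Only the call `match(grammar, 'abcbx', 0)` does any work, and it returns `4`.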

Left-recursive grammars work directly

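// No rewriting needed. The seed-growth algorithm handles it.
// `term` and `add` are defined elsewhere. Because `expr` refers to itself,
// real code must declare its type explicitly; Dart cannot infer it.
final expr = rule(() =>
  expr.zip(char('+').trim).zip(term).map(add)
  | term
);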
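
To show why a left-recursive rule need not loop forever, here is a stripped-down sketch of seed growing for the grammar `expr := expr '+' digit | digit`. It is a simplification under stated assumptions: one directly left-recursive rule, single-digit operands, and no tracking of indirect recursion as in the full Warth algorithm. `SumParser` and `Hit` are hypothetical names and say nothing about how Rumil implements this internally.

```dart
/// Result of an attempt: the computed value and the offset after it.
typedef Hit = ({int value, int end});

/// Parses `expr := expr '+' digit | digit` by growing a memoized seed.
class SumParser {
  final String input;
  final Map<int, Hit?> _memo = {}; // position -> best result so far

  SumParser(this.input);

  Hit? _digit(int pos) {
    if (pos >= input.length) return null;
    final d = int.tryParse(input[pos]);
    return d == null ? null : (value: d, end: pos + 1);
  }

  /// One pass over the rule body. The left-recursive call to [expr] is
  /// answered from the memo table, so it cannot recurse endlessly.
  Hit? _body(int pos) {
    final left = expr(pos);
    if (left != null && left.end < input.length && input[left.end] == '+') {
      final right = _digit(left.end + 1);
      if (right != null) {
        return (value: left.value + right.value, end: right.end);
      }
    }
    return _digit(pos);
  }

  Hit? expr(int pos) {
    if (_memo.containsKey(pos)) return _memo[pos];
    _memo[pos] = null; // plant a failing seed
    while (true) {
      final attempt = _body(pos);
      final best = _memo[pos];
      // Stop as soon as a pass fails to consume more input than the seed.
      if (attempt == null || (best != null && attempt.end <= best.end)) {
        return best;
      }
      _memo[pos] = attempt; // grow the seed and try again
    }
  }
}
```

For the input `1+2+3`, the seed grows from `1` to `1+2` to `1+2+3`, so `SumParser('1+2+3').expr(0)` yields the value `6` ending at offset `5`, grouped to the left as the grammar reads.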

Parse real formats with rumil_parsers

import 'package:rumil_parsers/rumil_parsers.dart';

final yaml = parseYaml('name: rumil\nversion: 0.5.0');
final json = parseJson('{"key": [1, 2, 3]}');
final toml = parseToml('[server]\nport = 8080');
final csv  = parseCsv('name,age\nAlice,30\nBob,25');

Three-way results — success, partial, or failure

final result = parser.run(input);
switch (result) {
  case Success(:final value):
    print(value);
  case Partial(:final value, :final errors):
    print('$value (with ${errors.length} warnings)');
  case Failure(:final errors):
    print('Failed: $errors');
}
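
The switch above is exhaustive because the result is a closed set of cases. A minimal sketch of such a hierarchy, with a toy function that produces all three outcomes, might look like this. The class names mirror the example, but the definitions are assumptions for illustration (errors as plain strings, a hypothetical `parseInts` helper), not Rumil's actual declarations.

```dart
// Assumed shape only; Rumil's real result types may differ.
sealed class Result<A> {
  const Result();
}

final class Success<A> extends Result<A> {
  final A value;
  const Success(this.value);
}

final class Partial<A> extends Result<A> {
  final A value;
  final List<String> errors;
  const Partial(this.value, this.errors);
}

final class Failure<A> extends Result<A> {
  final List<String> errors;
  const Failure(this.errors);
}

/// Parses comma-separated integers, skipping fields that are not integers.
Result<List<int>> parseInts(String input) {
  final values = <int>[];
  final errors = <String>[];
  final fields = input.split(',');
  for (var i = 0; i < fields.length; i++) {
    final n = int.tryParse(fields[i].trim());
    if (n == null) {
      errors.add('field $i: "${fields[i]}" is not an integer');
    } else {
      values.add(n);
    }
  }
  if (errors.isEmpty) return Success(values);
  if (values.isEmpty) return Failure(errors);
  return Partial(values, errors); // some data recovered, some errors kept
}

/// The compiler rejects this switch if any of the three cases is missing.
String describe(Result<List<int>> r) => switch (r) {
      Success(:final value) => 'ok: $value',
      Partial(:final value, :final errors) =>
        'recovered $value with ${errors.length} error(s)',
      Failure(:final errors) => 'failed: ${errors.length} error(s)',
    };
```

Here `parseInts('1, x, 3')` is a `Partial` carrying `[1, 3]` and one error, so the caller gets usable data and the diagnostics in a single value.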

Typed decoders turn ASTs into Dart objects

final config = parseYaml(source).decode(
  field('host', string) &
  field('port', integer) &
  field('debug', boolean),
);
// config is (String, int, bool) — fully typed

Features

  • Left-recursive grammars work directly, no need to rewrite around recursion
  • Stack-safe to arbitrary depth through defunctionalized trampolining
  • Three-way results (success, partial, failure) make error recovery part of the type
  • Format parsers for JSON, YAML, TOML, XML, CSV, HCL, and Proto3

Packages

Six packages, all at version 0.5.0.

rumil

Core combinator framework

The parser combinator library at the foundation. Parsers are a sealed type. You build them from combinators, and an external interpreter runs them. The interpreter handles backtracking, memoization, and the Warth seed-growth algorithm for left recursion. Errors carry source locations. Nothing executes until you ask it to.

  • Left recursion via the Warth seed-growth algorithm
  • Typed errors with line, column, and offset
  • Stack-safe to arbitrary depth via defunctionalized trampolining
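
The stack-safety point can be illustrated without the library. A common way to get there, assumed here rather than taken from Rumil's source, is to turn each pending continuation into a data value (defunctionalization) and drive evaluation from a loop with an explicit stack. `Node`, `Leaf`, `Plus`, `Frame`, and `eval` are hypothetical names.

```dart
sealed class Node {
  const Node();
}

final class Leaf extends Node {
  final int value;
  const Leaf(this.value);
}

final class Plus extends Node {
  final Node left, right;
  const Plus(this.left, this.right);
}

/// "What to do next", stored as data instead of as a call-stack frame.
sealed class Frame {
  const Frame();
}

/// The left operand is being evaluated; [right] is still pending.
final class EvalRight extends Frame {
  final Node right;
  const EvalRight(this.right);
}

/// The left operand evaluated to [left]; the right one is in progress.
final class AddLeft extends Frame {
  final int left;
  const AddLeft(this.left);
}

/// Evaluates [root] in a loop. Depth costs heap space, not call stack.
int eval(Node root) {
  final stack = <Frame>[];
  Node? current = root; // node to descend into, or null when returning
  var value = 0; // most recently computed result
  while (true) {
    if (current != null) {
      switch (current) {
        case Leaf(value: final v):
          value = v;
          current = null;
        case Plus(:final left, :final right):
          stack.add(EvalRight(right));
          current = left;
      }
    } else {
      if (stack.isEmpty) return value;
      switch (stack.removeLast()) {
        case EvalRight(:final right):
          stack.add(AddLeft(value));
          current = right;
        case AddLeft(:final left):
          value = left + value;
      }
    }
  }
}
```

A left-nested chain of 200,000 `Plus` nodes evaluates without deep recursion, because the only thing that grows is the `stack` list.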

rumil_parsers

Format parsers and serializers

Parsers and serializers for JSON, CSV, XML, TOML, YAML, Proto3, and HCL, built on the core combinators. Each format produces a typed AST. You can parse a file, transform the tree, and serialize it back. Tested against the official conformance suite for every format that has one.

  • Seven formats, each with a typed AST
  • Round-trip parsing and serialization
  • Typed encoders for converting Dart objects to format ASTs
  • Conformance-tested against RFC and spec suites

rumil_expressions

Formula evaluation

A formula evaluator built on the core combinators. Parses expressions into a typed AST that you can inspect or transform before evaluating. Supports arithmetic, comparisons, boolean logic, string operations, variables, and custom functions.

  • Typed expression AST, available before evaluation
  • User-defined variables and custom functions
  • Source-accurate error locations
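
To make "a typed AST that you can inspect before evaluating" concrete, here is a small sketch covering arithmetic, variables, and custom functions only. All names (`Expr`, `Num`, `Var`, `BinOp`, `Call`, `freeVars`, `evaluate`) are hypothetical and are not the `rumil_expressions` API; the parsing step is left out entirely.

```dart
sealed class Expr {
  const Expr();
}

final class Num extends Expr {
  final double value;
  const Num(this.value);
}

final class Var extends Expr {
  final String name;
  const Var(this.name);
}

final class BinOp extends Expr {
  final String op;
  final Expr left, right;
  const BinOp(this.op, this.left, this.right);
}

final class Call extends Expr {
  final String fn;
  final List<Expr> args;
  const Call(this.fn, this.args);
}

typedef Fn = double Function(List<double> args);

/// Inspection pass: which variables does the formula need?
Set<String> freeVars(Expr e) => switch (e) {
      Num() => <String>{},
      Var(:final name) => {name},
      BinOp(:final left, :final right) => {
          ...freeVars(left),
          ...freeVars(right),
        },
      Call(:final args) => {for (final a in args) ...freeVars(a)},
    };

/// Evaluation pass, given variable bindings and user-supplied functions.
double evaluate(Expr e, Map<String, double> vars, Map<String, Fn> fns) {
  switch (e) {
    case Num(:final value):
      return value;
    case Var(:final name):
      final v = vars[name];
      if (v == null) throw ArgumentError('unknown variable: $name');
      return v;
    case BinOp(:final op, :final left, :final right):
      final l = evaluate(left, vars, fns);
      final r = evaluate(right, vars, fns);
      return switch (op) {
        '+' => l + r,
        '-' => l - r,
        '*' => l * r,
        '/' => l / r,
        _ => throw ArgumentError('unknown operator: $op'),
      };
    case Call(:final fn, :final args):
      final f = fns[fn];
      if (f == null) throw ArgumentError('unknown function: $fn');
      return f([for (final a in args) evaluate(a, vars, fns)]);
  }
}
```

Because the tree exists before anything is computed, a caller can run `freeVars` first, check that every variable has a binding, and only then call `evaluate`.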

rumil_codec

Binary serialization

Binary codecs for encoding and decoding Dart values. Small codecs for primitives compose into codecs for complex types using Dart records. One definition handles both directions.

  • Compose codecs from primitives to complex types
  • Encode and decode with the same definition
  • ZigZag and LEB128 variable-length encoding
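
The ideas in this section can be sketched in a few dozen lines: ZigZag maps signed integers to unsigned ones, LEB128 writes them seven bits per byte, and a codec pairs a writer with a reader so that one definition covers both directions. Everything named here (`Codec`, `ByteReader`, `and`, `signedVarint`, and the helper functions) is hypothetical and is not the `rumil_codec` API. The bit arithmetic assumes 64-bit integers as on the native Dart VM, not the web.

```dart
/// A cursor over the bytes being decoded.
class ByteReader {
  final List<int> bytes;
  int pos = 0;
  ByteReader(this.bytes);

  int next() {
    if (pos >= bytes.length) {
      throw const FormatException('unexpected end of input');
    }
    return bytes[pos++];
  }
}

/// A writer and a reader for the same type, defined together.
class Codec<A> {
  final void Function(A value, List<int> out) write;
  final A Function(ByteReader input) read;
  const Codec(this.write, this.read);

  List<int> encode(A value) {
    final out = <int>[];
    write(value, out);
    return out;
  }

  A decode(List<int> bytes) => read(ByteReader(bytes));

  /// Sequences two codecs into a codec for a two-field record.
  Codec<(A, B)> and<B>(Codec<B> other) => Codec(
        (v, out) {
          write(v.$1, out);
          other.write(v.$2, out);
        },
        (input) {
          final a = read(input);
          final b = other.read(input);
          return (a, b);
        },
      );
}

/// ZigZag: 0 -> 0, -1 -> 1, 1 -> 2, -2 -> 3, so small magnitudes stay small.
int zigZagEncode(int n) => (n << 1) ^ (n >> 63);
int zigZagDecode(int z) => (z >>> 1) ^ -(z & 1);

/// Unsigned LEB128: 7 payload bits per byte, high bit set on all but the last.
void writeVarint(int value, List<int> out) {
  var rest = value;
  do {
    var byte = rest & 0x7f;
    rest = rest >>> 7;
    if (rest != 0) byte |= 0x80;
    out.add(byte);
  } while (rest != 0);
}

int readVarint(ByteReader input) {
  var result = 0;
  var shift = 0;
  while (true) {
    final b = input.next();
    result |= (b & 0x7f) << shift;
    if ((b & 0x80) == 0) return result;
    shift += 7;
  }
}

final Codec<int> signedVarint = Codec(
  (v, out) => writeVarint(zigZagEncode(v), out),
  (input) => zigZagDecode(readVarint(input)),
);
```

With these pieces, `signedVarint.and(signedVarint)` is a codec for `(int, int)` records: encoding `(-3, 150)` gives the bytes `[5, 172, 2]`, and decoding those bytes gives the same record back.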

rumil_codec_builder

Code generation for binary codecs

Derives binary codec implementations from annotated Dart classes at build time. Handles sealed class hierarchies and enums. Runs through standard Dart build_runner.

  • Derive codecs from annotated classes
  • Sealed class and enum support
  • Standard build_runner integration

rumil_tokens

Lossless source code tokenizer

Classifies source text into typed token spans — keywords, strings, comments, numbers, types, annotations, identifiers, and punctuation — without dropping a single character. Language grammars are pure data. Ships grammars for Dart, Scala, YAML, JSON, and shell.

  • Lossless tokenization — round-trips to the original source
  • Sealed Token ADT with exhaustive pattern matching
  • Language grammars defined as data, not code
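
A rough sketch of what "grammars as data" and "lossless" can mean in practice: the grammar is an ordered list of kind and pattern pairs, and any character that no rule claims still becomes a span, so joining the spans reproduces the input. The names (`Span`, `toyGrammar`, `tokenize`) and the toy rule set are assumptions for illustration and are unrelated to the `rumil_tokens` API or its shipped grammars.

```dart
/// A classified slice of the source text.
typedef Span = ({String kind, String text});

/// A grammar is plain data: ordered (kind, pattern) pairs, first match wins.
final toyGrammar = <(String, RegExp)>[
  ('comment', RegExp(r'//[^\n]*')),
  ('string', RegExp(r"'[^'\n]*'")),
  ('number', RegExp(r'\d+')),
  ('keyword', RegExp(r'(?:final|var|return)\b')),
  ('identifier', RegExp(r'[A-Za-z_]\w*')),
  ('whitespace', RegExp(r'\s+')),
];

List<Span> tokenize(String source, List<(String, RegExp)> grammar) {
  final spans = <Span>[];
  var pos = 0;
  while (pos < source.length) {
    Span? hit;
    for (final (kind, pattern) in grammar) {
      final m = pattern.matchAsPrefix(source, pos);
      if (m != null && m.end > pos) {
        hit = (kind: kind, text: m.group(0)!);
        break;
      }
    }
    // Anything no rule claims becomes a one-character span, so no input
    // is ever dropped.
    final span = hit ?? (kind: 'punctuation', text: source[pos]);
    spans.add(span);
    pos += span.text.length;
  }
  return spans;
}
```

Tokenizing `final x = 42; // answer` with `toyGrammar` starts with a `keyword` span and ends with a `comment` span, and concatenating every span's `text` gives back the original string exactly.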