This document is the normative specification of the weaveback macro language.

Overview

The macro language is a strict, eager, string-valued expansion system. Every macro call reduces to a string. There are no first-class booleans, numbers, or lists outside of the Python escape hatch.

A weaveback run processes one or more driver files. Each file is parsed into an AST and then evaluated: text nodes pass through unchanged, macro calls are dispatched and their results are spliced in. %set(name, value) creates normal string variables; %env(NAME) is a builtin that reads the process environment only when explicitly enabled.

Sigil

The default sigil is %. Every syntactic construct in the language begins with the sigil. The sigil can be changed per run with --sigil, and it may be any single Unicode scalar value (that is, any Rust char). A doubled sigil (%%) is a single literal sigil character; the macro expander never processes it as a call.

%foo(x)          → macro call
%%foo(x)         → literal text: %foo(x)

The --sigil ^ convention is the intended workaround for documents that contain many literal percent signs (e.g., CSS, shell scripts).


Syntax

Token types

Token                    Description
%name(…)                 Macro call. name is an identifier [A-Za-z_][A-Za-z0-9_]*.
%(name)                  Variable reference.
%{ … %} / %tag{ … %tag}  Quoted argument block (single argument, still macro-active, nestable).
%[ … %] / %tag[ … %tag]  Opaque verbatim block (disables macro parsing inside, nestable).
%%                       Escaped sigil; expands to a single literal sigil character.
%// …                    Line comment; discarded.
%/* … %*/                Block comment; discarded, nestable.

Everything else is literal text.

Quoted argument blocks %{ … %} and %tag{ … %tag}

A quoted argument block is syntactically a single macro argument. Its primary use is to include commas or closing parentheses literally inside an argument that would otherwise be split:

%def(greet, name, %{Hello, %(name)!%})

Without the block, the comma after Hello would split the argument list into Hello and %(name)!, so the %def call for greet would receive four arguments instead of three.

Quoted argument blocks have no scope of their own. They are evaluated in the enclosing context. They preserve their leading whitespace exactly as written.

%{%} is the most explicit way to pass an empty string argument. A bare blank argument between commas also works in practice, but %{%} is easier to read, review, and preserve through edits.

Tagged quoted blocks %tag{ … %tag} behave the same way, but use an explicit tag for readability and nesting discipline.
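
The splitting rule described above can be modelled in a few lines of Python. This is a minimal sketch, not the weaveback parser: the function name split_args is hypothetical, and tagged blocks, verbatim blocks, comments and escaped sigils are deliberately left out.

```python
def split_args(src: str, sigil: str = "%") -> list[str]:
    """Split the text between a call's parentheses on top-level commas.

    Commas inside a quoted block or inside nested parentheses do not split.
    Simplification: only untagged quoted blocks are recognised.
    """
    open_block, close_block = sigil + "{", sigil + "}"
    args: list[str] = []
    current: list[str] = []
    block_depth = 0  # nesting depth of quoted blocks
    paren_depth = 0  # nesting depth of parentheses outside quoted blocks
    i = 0
    while i < len(src):
        two = src[i:i + 2]
        if two == open_block:
            block_depth += 1
            current.append(two)
            i += 2
            continue
        if two == close_block and block_depth > 0:
            block_depth -= 1
            current.append(two)
            i += 2
            continue
        ch = src[i]
        if block_depth == 0:
            if ch == "(":
                paren_depth += 1
            elif ch == ")" and paren_depth > 0:
                paren_depth -= 1
            elif ch == "," and paren_depth == 0:
                # A top-level comma ends the current argument.
                args.append("".join(current))
                current = []
                i += 1
                continue
        current.append(ch)
        i += 1
    args.append("".join(current))
    # Whitespace before an argument is dropped; block contents are untouched.
    return [a.lstrip() for a in args]


with_block = split_args("greet, name, %{Hello, %(name)!%}")
without_block = split_args("greet, name, Hello, %(name)!")
```

With the quoted block the call has three arguments; without it the same text splits into four.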

Verbatim blocks %[ … %] and %tag[ … %tag]

Verbatim blocks are opaque to macro parsing:

  • no macro calls are expanded

  • no %(var) interpolation happens

  • comment syntax is not recognised

  • nesting is allowed, including tagged nesting

They are the general-purpose “treat this region literally” mechanism, useful for embedded scripts, regexes, templates, or any local --nomacro behaviour. Like quoted argument blocks, they preserve leading whitespace exactly.

Direct contrast:

%def(show, x, %(x))
%show(%{Hello, %(name)!%})   ← `%(name)` still expands
%show(%[Hello, %(name)!%])   ← literal text, no expansion
%pydef(greet, name, %[ "hello " + name %])

Identifiers

An identifier is currently [A-Za-z_][A-Za-z0-9_]*.

Hyphens are not identifier characters. Macro and variable names therefore use underscore-style spellings such as emit_row or chunk_name, not emit-row or chunk-name.


Evaluation model

Strict, eager expansion

All arguments to a macro call are fully expanded in the caller’s current scope before the macro body runs. There is no lazy evaluation for user-defined macros. Consider:

%set(counter, caller)
%def(id, x, before=%(counter) arg=%(x) after=%(counter))
%id(%(counter))

Output: before=caller arg=caller after=caller. The argument is fully expanded in the caller scope before id is entered, then the resulting string is bound to x in the callee frame.

The one exception is %if, which evaluates its condition first and then evaluates only the selected branch.

Scope stack

The evaluator maintains a Vec<ScopeFrame>. Each frame holds:

  • variables: HashMap<String, String> — string bindings

  • macros: HashMap<String, MacroDefinition> — macro bindings

Frame 0 is the global frame; it is never popped. Each macro call pushes a fresh empty frame, binds parameters into it, runs the body, and pops on return.

Lookup walks from the top of the stack to frame 0 and returns the first match. A binding in an inner scope shadows the same name in an outer scope.

Assignment (%set, parameter binding) always targets the current top frame. There is no way to assign to an outer scope directly; use %export for that.
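
As an illustration of the lookup and assignment rules, here is a small Python model of the scope stack. It is a sketch under simplifying assumptions: it tracks variables only (no macro bindings), and the class and method names are hypothetical rather than taken from the Rust implementation.

```python
class ScopeStack:
    """Toy model of the scope stack: a list of frames, frame 0 is global."""

    def __init__(self) -> None:
        self.frames: list[dict[str, str]] = [{}]  # frame 0 is never popped

    def push(self) -> None:
        self.frames.append({})

    def pop(self) -> None:
        if len(self.frames) == 1:
            raise RuntimeError("the global frame is never popped")
        self.frames.pop()

    def set(self, name: str, value: str) -> None:
        # Assignment always targets the current top frame.
        self.frames[-1][name] = value

    def lookup(self, name: str) -> str:
        # Walk from the top of the stack down to frame 0; first match wins.
        for frame in reversed(self.frames):
            if name in frame:
                return frame[name]
        raise KeyError(f"UndefinedVariable({name})")


stack = ScopeStack()
stack.set("mode", "global")
stack.push()                  # entering a macro call
stack.set("mode", "local")    # shadows the global binding
inner = stack.lookup("mode")
stack.pop()                   # returning from the call
outer = stack.lookup("mode")
```

The inner assignment never reaches frame 0, which is why an explicit %export is needed to move a binding outward.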

Call dispatch

When %foo(…) is encountered:

  1. Look up foo in the builtin registry first. Builtin names are permanently reserved; user macros cannot shadow them.

  2. If not a builtin, walk the scope stack for a MacroDefinition.

  3. If neither is found, return EvalError::UndefinedMacro.

Attempting to define a user macro with a builtin name (%def(set, …)) is immediately rejected with EvalError::InvalidUsage. This applies to %def, %redef, %alias and %pydef.
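
The dispatch order can be sketched as follows. This is an illustrative Python model only: the BUILTINS set is a small hypothetical subset, macros are represented by plain strings, and the exception classes stand in for the EvalError variants named above.

```python
# Illustrative subset of the builtin registry (an assumption, not the full list).
BUILTINS = frozenset({"set", "def", "redef", "alias", "pydef", "if", "eq"})


class UndefinedMacro(Exception):
    """Stand-in for EvalError::UndefinedMacro."""


class InvalidUsage(Exception):
    """Stand-in for EvalError::InvalidUsage."""


def resolve_call(name, frames):
    """Resolve a call: builtins first, then the scope stack, else an error."""
    if name in BUILTINS:
        return ("builtin", name)
    for frame in reversed(frames):  # top of the stack down to frame 0
        if name in frame:
            return ("user", frame[name])
    raise UndefinedMacro(name)


def check_definable(name):
    """Reject definitions that would shadow a reserved builtin name."""
    if name in BUILTINS:
        raise InvalidUsage(f"{name!r} is a reserved builtin name")


frames = [{"greet": "Hello"}, {"greet": "Hi"}]
user_hit = resolve_call("greet", frames)
builtin_hit = resolve_call("set", frames)
```

Because the builtin check runs first, a user definition could never be reached under a builtin name, which is why the definition is rejected up front.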

Argument binding

Arguments are evaluated in the caller’s scope before the callee frame is pushed.

Given a call %foo(a, b, key = c) and a definition %def(foo, p, q, r, body):

  1. Validate order: Positional args must precede named args. A positional arg after a named arg is EvalError::InvalidUsage.

  2. Positional binding: Arg 0 → p, arg 1 → q.

  3. Named binding: key is looked up in {p, q, r}. Unknown named arg → EvalError::InvalidUsage.

  4. Duplicate binding: Same param bound both positionally and by name → EvalError::InvalidUsage.

  5. Extra positional args: EvalError::InvalidUsage — too many positional arguments is a bug, not a tolerated shape.

  6. Effectful builtins in arguments: %set(…​) in argument position is EvalError::InvalidUsage. Arguments are values, not assignment sites.

  7. Missing params: Unbound params are UnboundParameter.
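
The binding steps above can be sketched in Python. This is a hedged model, not the evaluator's code: bind_args and the exception classes are hypothetical names, arguments are assumed to be already expanded, step 6 (effectful builtins) is omitted, and raising UnboundParameter at bind time rather than at first use is an assumption.

```python
class InvalidUsage(Exception):
    """Stand-in for EvalError::InvalidUsage."""


class UnboundParameter(Exception):
    """Stand-in for the UnboundParameter error."""


def bind_args(params, args):
    """Bind call arguments to parameters.

    params: parameter names in declaration order.
    args:   (key, value) pairs in call order; key is None for positional args.
    """
    bound = {}
    seen_named = False
    positional = 0
    for key, value in args:
        if key is None:
            if seen_named:
                raise InvalidUsage("positional argument after named argument")
            if positional >= len(params):
                raise InvalidUsage("too many positional arguments")
            bound[params[positional]] = value
            positional += 1
        else:
            seen_named = True
            if key not in params:
                raise InvalidUsage(f"unknown named argument {key!r}")
            if key in bound:
                raise InvalidUsage(f"parameter {key!r} bound twice")
            bound[key] = value
    missing = [p for p in params if p not in bound]
    if missing:
        raise UnboundParameter(", ".join(missing))
    return bound


# Models %foo(a, b, r = c) against %def(foo, p, q, r, body).
frame = bind_args(["p", "q", "r"], [(None, "a"), (None, "b"), ("r", "c")])
```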


Variables

%(name) — reference

Looks up name in the scope stack (top-to-bottom).

  • A missing variable is UndefinedVariable(name).

%set(name, value) — assignment

Evaluates value and stores it as name in the current top frame.

  • name must be a single identifier.

  • Records the definition location in the source-map database.


Macro definitions

%def(name, [params…,] body)

Defines a constant text-substitution macro in the current frame.

Field   Rule
name    Single identifier. Must not be a builtin name. Errors if the name already exists in the current frame as either constant or rebindable.
params  Zero or more identifiers, comma-separated. Duplicates are an error. The body is the last non-empty argument; a trailing empty argument at the end of the call is ignored.
body    Last argument. Usually %{ … %} for macro-active bodies or %[ … %] for literal script/text bodies.

The macro body is stored as an Arc<ASTNode> — cloning a macro definition is O(1) (no deep copy of the body).

%redef(name, [params…,] body)

Defines or replaces a rebindable macro in the current frame.

Field   Rule
name    Single identifier. Must not be a builtin name. Errors if the name already exists in the current frame as a constant binding. If the name already exists as rebindable in the current frame, it is replaced.
params  Zero or more identifiers, comma-separated. Duplicates are an error. The body is the last non-empty argument; a trailing empty argument at the end of the call is ignored.
body    Last argument. Usually %{ … %} for macro-active bodies or %[ … %] for literal script/text bodies.

Constant vs. rebindable names

The definition model is intentionally explicit:

  • %def creates a constant binding

  • %redef creates or replaces a rebindable binding

That gives one simple invariant:

  • constant names stay constant

  • rebindable names are explicitly marked

Use %redef for deliberate X-macro or multi-pass rebinding patterns.
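
The constant and rebindable rules reduce to one check at definition time. The Python sketch below models a single frame as a dict; define and InvalidUsage are hypothetical names, and macro bodies are plain strings for brevity.

```python
class InvalidUsage(Exception):
    """Stand-in for EvalError::InvalidUsage."""


def define(frame, name, body, *, rebindable):
    """Model %def (rebindable=False) and %redef (rebindable=True) in one frame."""
    existing = frame.get(name)
    if existing is not None:
        _, was_rebindable = existing
        # Only a %redef over an existing rebindable binding may replace it.
        if not (rebindable and was_rebindable):
            raise InvalidUsage(f"{name!r} is already defined in this frame")
    frame[name] = (body, rebindable)


frame = {}
define(frame, "X", "pass 1", rebindable=True)        # %redef creates
define(frame, "X", "pass 2", rebindable=True)        # %redef replaces
define(frame, "FIELDS", "schema", rebindable=False)  # %def creates a constant
```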

Rebinding patterns

X-macro pattern

A schema is defined once; successive passes rebind the X visitor to map that schema into different outputs.

Context-dependent meaning

Macro identifiers have no fixed meaning. Their expansion is determined by the current pass, allowing the same source to be interpreted multiple ways.

%def(FIELDS, %{
  %X(name, string)
  %X(age, int)
%})

%redef(X, name, type, %{%(name): %(type),%})            ← pass 1: struct fields
%FIELDS()

%redef(X, name, type, new_%(name): impl Into<%(type)>)  ← pass 2: ctor params
%FIELDS()

%pydef(name, [params…,] body)

Same argument structure as %def. At call time, the body is first macro-expanded, then executed as Python (monty) source code. The return value of the script is the macro’s expansion.

Declared parameters are injected as Python variables. Ordinary %set variables are not automatically injected into the Python scope; if Python code needs their values, they must be interpolated during macro expansion before execution.

The natural forms are:

  • %pydef(name, …​, %[ …​ %]) for literal script bodies

  • %pydef(name, …​, %{ …​ %}) only when macro preprocessing of the script source is intentional

%pydef(add_prefix, name, %[ "wb_" + name %])
%add_prefix(agent)        ← wb_agent

%set(prefix, wb_)
%pydef(add_prefix, name, %{ "%(prefix)" + name %})
%add_prefix(agent)        ← wb_agent

In the first form, the Python body is taken literally. In the second form, the body text is macro-expanded first, then executed as Python.

%alias(new_name, source_name [, key = val, …]) — Macro aliasing

Creates a new macro definition that is a copy of source_name at the moment %alias is called. The copy shares the body Arc (O(1) clone).

Snapshot semantics: later redefinition of source_name does not affect the alias.

Free-variable pre-binding (optional named args): Each key = val pair is evaluated at alias time and stored as a frozen binding. Whenever the alias is called, those bindings are installed in the callee’s scope before parameter binding. This is the only capture mechanism in the language:

%def(render_row, msg, chunk_name, %{| %(msg) |
%})

%alias(emit_tangle_row, render_row, chunk_name = cli-doc-tangle-rows)
%emit_tangle_row(my option description)

The chunk_name free variable is pinned to cli-doc-tangle-rows for all calls to emit_tangle_row.

A frozen binding on a formal parameter name is shadowed by the call-site value (parameter wins over frozen default).

%alias is the explicit way to specialize a macro while keeping later call sites simple. It is also the only supported capture mechanism; %export does not freeze free variables for you.
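
A small Python model may help to show snapshot semantics and frozen bindings together. It is a sketch only: Macro, alias and callee_frame are hypothetical names, the body is a plain string standing in for the shared Arc, and carrying over frozen bindings when aliasing an alias is an assumption.

```python
from dataclasses import dataclass, field


@dataclass
class Macro:
    params: tuple
    body: str                                   # stands in for the shared body
    frozen: dict = field(default_factory=dict)  # bindings captured at alias time


def alias(macros, new_name, source_name, **frozen):
    """Snapshot source_name under new_name, adding frozen bindings."""
    source = macros[source_name]
    # Assumption: an alias of an alias keeps the earlier frozen bindings.
    macros[new_name] = Macro(source.params, source.body, {**source.frozen, **frozen})


def callee_frame(macro, args):
    """Build the callee frame: frozen bindings first, then parameters."""
    frame = dict(macro.frozen)
    frame.update(zip(macro.params, args))  # parameters win over frozen values
    return frame


macros = {"make_row": Macro(("text",), "| %(text) | %(chunk_name) |")}
alias(macros, "make_cli_row", "make_row", chunk_name="cli-doc")
macros["make_row"] = Macro(("text",), "redefined later")  # does not affect the alias
frame = callee_frame(macros["make_cli_row"], ["hello"])
```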

%export(name) — Propagate to parent scope

Copies the binding for name one level up the scope stack (to the parent frame).

  • For both variables and macros, the binding is copied as-is — no automatic free-variable freezing occurs.

  • At global scope (no parent frame), %export is a no-op and emits a non-fatal warning.

To create a macro that carries frozen bindings into the parent scope, use %alias to create the frozen copy first, then %export it.

%def(make_row, text, %{| %(text) | %(chunk_name) |%})
%alias(make_cli_row, make_row, chunk_name = cli-doc)
%export(make_cli_row)

After export, the parent scope sees %make_cli_row(…​) with chunk_name already pinned to cli-doc.
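
The copy-one-level-up behaviour can be sketched as below. This is a hypothetical Python model: frames are plain dicts, the warning text is invented for illustration, and taking the binding from the top frame only is an assumption.

```python
def export(frames, name, warnings):
    """Copy the binding for name from the top frame into its parent frame."""
    if len(frames) == 1:
        # No parent frame: a no-op that records a non-fatal warning.
        warnings.append("%export at global scope is a no-op")
        return
    frames[-2][name] = frames[-1][name]  # copied as-is, nothing is frozen


warnings = []
frames = [{}, {}, {"make_cli_row": "<macro definition>"}]
export(frames, "make_cli_row", warnings)   # reaches frame 1, not frame 0
export([{"x": "1"}], "x", warnings)        # global scope: warning only
```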


Boolean model

The macro language has no boolean type. The convention used by all builtins:

  • empty string → false

  • any non-empty string → true

Predicates return 1 (truthy) or empty (falsy).

Note:

  • the parser strips leading whitespace from unquoted arguments

  • 0 is still truthy because it is a non-empty string

  • content inside %{ … %} blocks is preserved verbatim, so %foo(%{   %}) receives three spaces, which is non-empty and therefore truthy
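
The convention is small enough to state as code. The following Python sketch uses hypothetical helper names (truthy, as_flag) and simply restates the rules above.

```python
def truthy(value: str) -> bool:
    """Only the empty string is false; everything else is true."""
    return value != ""


def as_flag(condition: bool) -> str:
    """Predicates return "1" for true and the empty string for false."""
    return "1" if condition else ""


samples = {
    "": truthy(""),          # empty string: false
    "0": truthy("0"),        # non-empty, so true
    "   ": truthy("   "),    # three spaces from a quoted block: true
    "false": truthy("false"),
}
```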


Control flow

%if(cond, then [, else]) — Conditional

Evaluates cond. If the result is non-empty, evaluates and returns then; otherwise evaluates and returns else (or empty if else is omitted).

The non-selected branch is not evaluated: %if is the one builtin that is genuinely lazy in its branch arguments.

%if() with no arguments is a non-fatal warning (the call always returns empty and is almost certainly a mistake).

%if(%eq(%(target), linux), use-linux, use-other)
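
Branch laziness can be modelled by passing the branches as unevaluated thunks. This Python sketch is an illustration only; macro_if and branch are hypothetical names, and the condition is assumed to be already expanded.

```python
from typing import Callable, Optional

Thunk = Callable[[], str]


def macro_if(cond: str, then: Thunk, otherwise: Optional[Thunk] = None) -> str:
    """Evaluate only the selected branch; a missing else yields empty."""
    if cond != "":
        return then()
    return otherwise() if otherwise is not None else ""


evaluated = []  # records which branches actually ran


def branch(label: str) -> Thunk:
    def thunk() -> str:
        evaluated.append(label)
        return label
    return thunk


result = macro_if("1", branch("use-linux"), branch("use-other"))
```

After the call, evaluated contains only the selected branch, so any side effect in the other branch never happens.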

%eq(a, b) — Equality predicate

Evaluates both arguments, compares byte-exact. Returns 1 if equal, empty if not.

%if(%eq(%(mode), release), optimize, debug)

%neq(a, b) — Inequality predicate

Returns 1 if a != b, empty if equal. %eq and %neq are strict inverses.

%if(%neq(%(backend), sqlite), external-db, sqlite)

%not([x]) — Logical negation

Accepts zero or one argument. Returns 1 if the argument is empty (or absent), empty if the argument is non-empty.

%not() with no argument: treats the absent argument as empty, so returns 1. %not(a, b) is EvalError::InvalidUsage.

%if(%not(%(feature_flag)), disabled, enabled)
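
The three predicates can be sketched in Python as follows. The function names are hypothetical, arguments are assumed to be already expanded, and ValueError stands in for EvalError::InvalidUsage.

```python
def eq(a: str, b: str) -> str:
    # Exact comparison: no trimming, case folding or normalisation.
    return "1" if a == b else ""


def neq(a: str, b: str) -> str:
    # Strict inverse of eq.
    return "" if a == b else "1"


def not_(*args: str) -> str:
    if len(args) > 1:
        raise ValueError("InvalidUsage: %not takes at most one argument")
    # An absent argument is treated as empty.
    return "1" if not args or args[0] == "" else ""


checks = [eq("release", "release"), eq("release", "Release"), not_(), not_("0")]
```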

%eval(name, args…) — Dynamic dispatch

Evaluates the first argument to obtain a macro name, then calls that macro with the remaining arguments. Enables data-driven dispatch.

%def(render_html, x, <b>%(x)</b>)
%def(render_md, x, **%(x)**)
%eval(render_%(fmt), hello)   ← fmt = "html" or "md"

%eval is the dynamic-dispatch escape hatch. Prefer ordinary direct calls when the callee name is statically known.
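
In Python terms, dynamic dispatch amounts to computing a name and then looking it up. The sketch below is a hypothetical model in which macros are plain functions held in a dict; macro_eval is not a weaveback API.

```python
def macro_eval(macros, name, *args):
    """Call the macro whose name was computed at run time."""
    if name not in macros:
        raise LookupError(f"UndefinedMacro({name})")
    return macros[name](*args)


macros = {
    "render_html": lambda x: f"<b>{x}</b>",
    "render_md": lambda x: f"**{x}**",
}
fmt = "md"
result = macro_eval(macros, "render_" + fmt, "hello")
```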

%here(name, args…) — Source-file patching

Calls the macro name(args…), writes the result into the source file itself immediately after the %here(…​) call, then stops evaluation (early_exit).

On the next run, the %here(…) call has been patched to %%here(…), which expands to the literal text %here(…) and is never dispatched again. This makes %here idempotent.

Caveats:

  • Multiple live %here calls in one file are a hard error.

  • The source file must be writable.

  • %here is a source-rewrite operation, not a normal expression-forming macro. Treat it as operational machinery, not part of the calm conceptual core of the language.

Once %here sets early_exit, all subsequent evaluate() calls return empty immediately. This is not an error condition.


Diagnostics

Errors

Error                  Trigger
UndefinedMacro(name)   Macro %name not found in scope or builtins
InvalidUsage(msg)      Wrong arg count; invalid identifier; positional-after-named; duplicate binding; unknown named arg; extra positional args; attempt to define a builtin name; invalid %def / %redef rebinding
Runtime(msg)           Recursion limit exceeded; script engine error; %here I/O failure
ParseError(msg)        Lexer or parser failure in a file being evaluated
IncludeNotFound(path)  %include/%import path not resolved
CircularInclude(path)  File is already on the include stack
IoError(e)             Filesystem error during file inclusion

Warnings (non-fatal)

Warnings are accumulated in the evaluator and drained by the caller via take_warnings(). They do not abort evaluation.

Warning                  Trigger
%if() with no arguments  %if() called with zero arguments; always returns empty
%export at global scope  %export called when there is no parent frame; the call is a no-op


String transformations

All case-conversion builtins apply word-aware splitting before transformation. Word boundaries are:

  • _, -, space

  • CamelCase transition (lower→upper)

  • Acronym boundary (upper→upper→lower)

  • Digit transition

Builtin                  Transformation                Example
%capitalize(s)           Upper-case first character    hello → Hello
%decapitalize(s)         Lower-case first character    Hello → hello
%to_snake_case(s)        Lower, underscore-separated   FooBar → foo_bar
%to_camel_case(s)        Lower first word, upper rest  foo_bar → fooBar
%to_pascal_case(s)       Upper every word              foo_bar → FooBar
%to_screaming_case(s)    Upper, underscore-separated   FooBar → FOO_BAR
%convert_case(s, style)  Dispatch by style string (lower, upper, snake, screaming, kebab, screaming-kebab, camel, pascal, ada)

All return empty string if input is empty.

Accepted style values for %convert_case(s, style) are:

  • lower, lowercase → foobar

  • upper, uppercase → FOOBAR

  • snake, snake_case → foo_bar

  • screaming, screaming_snake, screaming_snake_case → FOO_BAR

  • kebab, kebab-case, kebab_case → foo-bar

  • screaming-kebab, screaming-kebab-case, screaming_kebab, screaming_kebab_case → FOO-BAR

  • camel, camelcase, camel_case → fooBar

  • pascal, pascalcase, pascal_case → FooBar

  • ada, ada_case → Foo_Bar

Invalid style strings are InvalidUsage.
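
The word-aware splitting that underlies these conversions can be approximated with a single regular expression. The Python sketch below is an assumption-laden model, not the builtin implementation: the function names mirror the builtins for readability only, the digit transition is taken to apply in both directions, and lower-casing the tail of each word in PascalCase is a guess.

```python
import re

_BOUNDARY = re.compile(
    r"[_\- ]+"                     # explicit separators: underscore, hyphen, space
    r"|(?<=[a-z])(?=[A-Z])"        # camelCase transition: lower -> upper
    r"|(?<=[A-Z])(?=[A-Z][a-z])"   # acronym boundary: HTTPServer -> HTTP | Server
    r"|(?<=[A-Za-z])(?=[0-9])"     # letter -> digit (assumed direction)
    r"|(?<=[0-9])(?=[A-Za-z])"     # digit -> letter (assumed direction)
)


def split_words(s: str) -> list[str]:
    """Split on the word boundaries listed above, dropping empty pieces."""
    return [w for w in _BOUNDARY.split(s) if w]


def to_snake_case(s: str) -> str:
    return "_".join(w.lower() for w in split_words(s))


def to_screaming_case(s: str) -> str:
    return "_".join(w.upper() for w in split_words(s))


def to_pascal_case(s: str) -> str:
    return "".join(w[:1].upper() + w[1:].lower() for w in split_words(s))


def to_camel_case(s: str) -> str:
    pascal = to_pascal_case(s)
    return pascal[:1].lower() + pascal[1:]


words = split_words("HTTPServer")
```

An empty input produces no words, so every conversion returns the empty string, matching the rule stated above.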


File inclusion

%include(path) — Include and evaluate

Resolves path against the include-path list. Reads and evaluates the file inline; its output is spliced at the call site. Macro definitions made in the included file are visible to the caller.

Circular includes are detected at runtime and return EvalError::CircularInclude.

An included file runs in the caller’s current scope frame. %set and %def inside the included file land in that frame. At top level this means global; inside a macro body, definitions are local and disappear on pop.

The path argument is evaluated normally before resolution, so conditional include idioms work:

%include(%if(%eq(%(target), linux), linux.adoc, %{%}))

If the expanded path is empty, the include is skipped.
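
Path resolution and the circular-include check can be sketched as below. This is a hypothetical Python model: the function names are invented, taking the first matching include path in list order is an assumption, whitespace-only paths are treated like empty ones, and the file-existence test is injectable so the example does not touch the filesystem.

```python
from pathlib import Path
from typing import Callable, Optional


class IncludeNotFound(Exception):
    """Stand-in for EvalError::IncludeNotFound."""


class CircularInclude(Exception):
    """Stand-in for EvalError::CircularInclude."""


def resolve_include(
    path: str,
    include_paths: list,
    is_file: Callable[[Path], bool] = Path.is_file,
) -> Optional[Path]:
    """Return the first existing candidate, or None when the include is skipped."""
    if path.strip() == "":
        return None  # empty expanded path: nothing to include
    for base in include_paths:
        candidate = base / path
        if is_file(candidate):
            return candidate
    raise IncludeNotFound(path)


def push_include(resolved: Path, include_stack: list) -> None:
    """Refuse to enter a file that is already being evaluated."""
    if resolved in include_stack:
        raise CircularInclude(str(resolved))
    include_stack.append(resolved)


known_files = {Path("inc/linux.adoc")}  # pretend filesystem
found = resolve_include(
    "linux.adoc", [Path("."), Path("inc")], is_file=lambda p: p in known_files
)
skipped = resolve_include("", [Path(".")], is_file=lambda p: False)
```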

%import(path) — Include without output

Like %include but discards the text output. Side effects (macro definitions, %set) are preserved.

%import(macros.adoc)   ← load definitions, produce no output
%render_doc(...)

During dependency discovery, %include / %import still evaluate their path argument first. Only the target-file evaluation is skipped after a real path is resolved and recorded.

Python integration

Persistent stores

The Python store (py_store: HashMap<String, String>) survives across macro calls and across %def/%pydef definitions for the lifetime of the evaluator session. It is not part of the scope stack and is not popped with frames.

Builtin       Store   Description
%pyset(k, v)  Python  Store string v as key k
%pyget(k)     Python  Return value for key k; empty if missing

Script stores bypass all scope-stack discipline. They are powerful for accumulation patterns but make behaviour order-dependent. Prefer local %set where possible.
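
The difference between the store and the scope stack can be shown with a toy session object. This Python sketch is illustrative only; Session is a hypothetical name, and both the frames and the store are plain dicts.

```python
class Session:
    """Toy evaluator session: a scope stack plus a separate persistent store."""

    def __init__(self) -> None:
        self.frames = [{}]   # scope stack, frame 0 is global
        self.py_store = {}   # lives for the whole session, outside the stack

    def pyset(self, key: str, value: str) -> str:
        self.py_store[key] = value
        return ""            # the call itself expands to empty

    def pyget(self, key: str) -> str:
        return self.py_store.get(key, "")  # empty if missing


session = Session()
session.frames.append({"tmp": "local"})  # entering a macro call
session.pyset("rows", "a;b")             # written while inside the call
session.frames.pop()                     # the local frame disappears
after_pop = session.pyget("rows")        # the store value is still there
```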

%env(NAME) — Environment variable

Reads NAME from the process environment. Requires --allow-env.

If EvalConfig.env_prefix = Some("WB_".into()) or the CLI is run with --env-prefix WB_, then %env(PATH) reads WB_PATH.

The prefix is prepended only for the external environment lookup. Inside the macro source you still write the stripped logical name:

%env(PATH)         ← reads WB_PATH when --env-prefix WB_ is active
%env(HOME)         ← reads WB_HOME when --env-prefix WB_ is active

This keeps macro source uncluttered while letting real environment variables stay namespaced.
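
The lookup rule can be written down in a few lines of Python. This is a sketch under assumptions: macro_env is a hypothetical name, the error raised when environment access is disabled is a guess (the specification only says the flag is required), and the environment is passed in as a mapping so the example is deterministic.

```python
import os
from typing import Mapping, Optional


def macro_env(
    name: str,
    *,
    allow_env: bool,
    env_prefix: Optional[str] = None,
    environ: Mapping[str, str] = os.environ,
) -> str:
    """Read NAME from the environment, prepending the prefix for the lookup only."""
    if not allow_env:
        # Assumption: the real evaluator reports its own error type here.
        raise PermissionError("%env requires --allow-env")
    return environ.get((env_prefix or "") + name, "")


fake_env = {"WB_PATH": "/opt/wb", "PATH": "/usr/bin"}
prefixed = macro_env("PATH", allow_env=True, env_prefix="WB_", environ=fake_env)
plain = macro_env("PATH", allow_env=True, environ=fake_env)
```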


Recursion limit

The evaluator tracks call depth. The default limit is MAX_RECURSION_DEPTH (defined in weaveback_core); callers may override it with EvalConfig.recursion_limit or the CLI flag --recursion-limit. Exceeding the active limit produces EvalError::Runtime. There is no tail-call optimisation; every call frame occupies stack space.
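
Depth tracking of this kind is usually a counter that is incremented on entry and decremented on exit. The Python sketch below is a hypothetical model: the class names are invented, and whether the real check fires at or just past the limit is an assumption.

```python
class EvalRuntimeError(Exception):
    """Stand-in for EvalError::Runtime."""


class Evaluator:
    def __init__(self, recursion_limit: int) -> None:
        self.recursion_limit = recursion_limit
        self.depth = 0

    def call(self, body):
        """Run body(self) one level deeper, refusing to pass the limit."""
        if self.depth >= self.recursion_limit:
            raise EvalRuntimeError("recursion limit exceeded")
        self.depth += 1
        try:
            return body(self)
        finally:
            self.depth -= 1  # always unwind, even when an error propagates


ev = Evaluator(recursion_limit=3)
# Two nested calls stay under the limit; the inner one reports its depth.
nested = ev.call(lambda e: e.call(lambda inner: str(inner.depth)))
```

Unwinding in the finally clause means the counter returns to zero after an error, so the same evaluator could be reused afterwards.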


Configuration

EvalConfig is constructed by the caller and passed to the Evaluator.

pub struct EvalConfig {
    pub sigil: char,                  // default: '%'
    pub include_paths: Vec<PathBuf>,  // default: ["."]
    pub allow_env: bool,              // default: false
    pub env_prefix: Option<String>,   // default: None
    pub recursion_limit: usize,       // default: MAX_RECURSION_DEPTH
}

Dependency discovery: the separate discovery API still evaluates the %include / %import path argument. If that path expands to empty or whitespace, the include/import is a no-op. If it expands to a real path, the resolved path is recorded and the target file is not evaluated.


Builtin summary

Call form                Description                          Args  Returns
%set(n, v)               Assign variable in current frame     2     empty
%export(n)               Copy binding to parent frame         1     empty
%include(path)           Include and evaluate file            1     file content
%import(path)            Include without output               1     empty
%if(c, t [, e])          Conditional (lazy branches)          1–3   branch or empty
%eq(a, b)                Equality predicate                   2     1 or empty
%neq(a, b)               Inequality predicate                 2     1 or empty
%not([x])                Logical negation                     0–1   1 or empty
%eval(n, args…)          Dynamic dispatch                     1+    macro result
%here(n, args…)          Patch source file and stop           1+    empty
%capitalize(s)           Upper-case first char                1     string
%decapitalize(s)         Lower-case first char                1     string
%convert_case(s, style)  Case conversion by name              2     string
%to_snake_case(s)        snake_case                           1     string
%to_camel_case(s)        camelCase                            1     string
%to_pascal_case(s)       PascalCase                           1     string
%to_screaming_case(s)    SCREAMING_CASE                       1     string
%pyset(k, v)             Write Python store                   2     empty
%pyget(k)                Read Python store                    1     store value or empty
%env(NAME)               Read env var (requires --allow-env)  0–1   env value or empty


Open items

The following behaviours are known limitations and candidates for future work:

  1. Undefined variable helpers: %(typo) is UndefinedVariable. %defined(name) / %default(x, fallback) style helpers would still improve explicitness.

  2. Missing (unbound) parameter — unbound formal parameters are UnboundParameter. Explicit optional parameters with inline defaults (%def(fmt, level, tag, msg = "(none)", body)) are a possible future addition.

  3. %here remains operationally special — it rewrites the source file and aborts further evaluation via early_exit. This is intentional, but it still sits outside the calm core of the language.