GNU m4 is the oldest surviving general-purpose macro processor and the closest conceptual relative of weaveback-macro. This document compares them across several dimensions to help you decide which is the right tool for a given problem — and to be honest about where each one falls short.
Purpose and design philosophy
m4 was designed by Brian Kernighan and Dennis Ritchie in the early 1970s
as a general-purpose text-substitution layer for Unix. Its canonical use
today is as the expansion engine inside GNU Autoconf: it processes .m4
files at configure-generation time and never runs on the end user’s
machine. The sendmail mail transfer agent also ships a large library of
m4 macros for generating its notoriously complex configuration files.
These two use cases — build-system glue and system-administration
templating — define m4’s sweet spot. m4 does not know or care what the
text around a macro means; it is a pure text-replacement machine.
weaveback-macro was designed for a narrower and more specific job: pre-processing the prose layer of a literate programming document before the tangle pass extracts code chunks. The text around a macro does matter — it is AsciiDoc or Markdown that a human will read and that Asciidoctor will render. This shapes every design decision: the special character is configurable so it does not collide with document markup, macros are defined inside ordinary prose, and the tool is meant to be transparent to a reader who does not know it exists.
In short: m4 is a universal macro language that happens to be used for literate-adjacent tasks. weaveback-macro is a literate-programming tool that happens to be implemented as a macro language.
Expressiveness
GNU m4
m4 is Turing-complete. Its expansion model — push the result back onto
the input stream and re-scan — means any computation expressible as text
transformation can be encoded in m4, including recursion, loops (via
recursive macros), arithmetic (eval), regex substitution (patsubst),
character translation (translit), substring extraction (substr), and
printf-style formatting (format). Diversions allow output to be
buffered and reordered, which enables forward-reference tricks.
system and esyscmd let macros call arbitrary shell commands and
capture their output. Stack-based scoping (pushdef/popdef) lets you
temporarily shadow a macro and restore the previous definition.
This power comes with a cost: any sufficiently large m4 program starts to feel less like a macro layer and more like an accidental programming language written in line noise. The GNU Autoconf project responded by building two abstraction layers on top of raw m4 — m4sugar (safe, conventional macro utilities) and m4sh (macros that generate shell code) — precisely because raw m4 is too sharp an instrument for daily use.
weaveback-macro
weaveback-macro is intentionally less expressive at the core level.
String macros (%def) and variables (%set) cover the common case.
The conditional %if(cond, then, else) and file inclusion
(%include/%import) complete the basic feature set.
Where weaveback-macro regains expressiveness is through a scripting escape
hatch. %pydef writes a macro body in Python-like syntax via
monty, a minimal, secure Python
interpreter written in Rust — not CPython, not a subprocess. Scripts
receive the current scope’s variables as strings and return the expanded
text. They run inside a sandboxed runtime with no filesystem or network
access.
The practical difference: m4 can express anything, but you have to express it entirely in m4’s opaque notation. weaveback-macro covers most document-preprocessing needs directly, and delegates the rest to Python when computation is genuinely needed.
| Capability | GNU m4 | weaveback-macro |
|---|---|---|
Recursive macros |
✓ (core feature) |
via |
Arithmetic |
✓ |
via Python |
Regex substitution |
✓ |
via Python |
String operations |
✓ |
via Python |
Conditionals |
✓ |
✓ |
File inclusion |
✓ |
✓ |
Shell invocation |
✓ |
✗ (monty is sandboxed) |
Output reordering |
✓ diversions |
✗ |
Scoped redefinition |
✓ |
✓ automatic (local by default, |
Named arguments |
✗ |
✓ |
Scripting language body |
✗ |
✓ (Python) |
Turing-complete |
✓ |
✓ (via scripting) |
Learning curve
GNU m4
m4’s learning curve is famously steep. The central difficulty is
quoting. By default, a backtick opens a quoted string and an
apostrophe closes it — a convention that collides with English prose,
Markdown, shell scripts, and most programming languages. The alternative
changequote([, ]) helps but introduces bracket-balancing requirements.
Quoted strings suppress expansion and are stripped of their delimiters
before the result is used, which means the number of quote layers needed
around any expression depends on how many times the text will be
re-scanned — a question that requires understanding the evaluation model
before you can answer it.
The re-scanning model itself is the second source of confusion. m4 pushes the result of every expansion back onto the input stream and re-reads it. An unquoted comma inside a macro argument is therefore not syntactically inert: if a macro expansion generates a comma, that comma participates in argument delimiting. This is conceptually elegant and practically treacherous.
The official GNU m4 manual acknowledges: "The most common problem with existing macros is improper quotation." External guides add: "Quoting can be cantankerous on occasion in m4." Many experienced developers who use Autoconf daily treat m4 as a write-once medium: you paste boilerplate from working examples and avoid touching it.
weaveback-macro
weaveback-macro’s model is simpler because it does not re-scan. A macro expansion produces text that is output as-is; it does not feed back into the macro expander. This eliminates the entire category of quoting bugs that makes m4 difficult.
Arguments follow Python-style calling conventions: positional arguments bind left-to-right, named arguments bind by name and may appear in any order after positional arguments. Missing parameters bind to the empty string. Extra positional arguments are silently ignored. This is familiar to anyone who has written Python.
The special character (% by default, configurable) only triggers macro
syntax when followed by a known macro name or opening parenthesis. A
lone % in prose — as in "the failure rate was 4%" — passes through
untouched. The escape for a literal special is doubling (%), the
same convention used by printf.
The main learning investment is understanding which constructs exist
(%def, %set, %if(cond,then,else), %include,
%pydef) and the distinction between defining a macro (%def) and
setting a variable (%set). A developer familiar with any template language will reach
productive use within an hour.
Quoting and scoping
m4 provides explicit scope manipulation via pushdef/popdef, which
requires manual discipline: the programmer must balance every pushdef
with a popdef.
weaveback-macro uses a scope stack that is managed automatically.
%def inside a macro body creates a definition local to that
invocation; it is discarded when the call returns. To promote a
definition to the caller’s scope, %export is used explicitly.
%include evaluates the included file in the caller’s current scope —
equivalent to inlining the text at the call site. When called at the
top level (the normal case) definitions land at the top level; when
called inside a macro body they are local to that invocation.
%export promotes a name one level up to the parent scope.
This model gives the locality of pushdef/popdef without
requiring the programmer to balance them manually.
m4 uses ifdef to test whether a name is defined. weaveback-macro uses
%if(cond, then, else) where the condition is truthy if it expands to a
non-empty string; Python handles any condition more complex than
that.
Error messages and debugging
GNU m4
m4’s error messages are terse and often point to the wrong location,
because by the time the error is detected the input has been transformed
by multiple expansion passes. Debugging uses traceon/traceoff and
the -d flag, which prints a trace of each macro call and its expansion
with depth indicators. The output is useful once you understand it, but
the format is unstructured text that requires visual parsing. The manual
warns that m4 cannot detect infinite rescanning loops because the problem
is undecidable.
weaveback-macro
Error messages include the source file and line number from the original
document (before expansion). Because there is no re-scanning, the error
site in the message corresponds to the site a human sees in the source.
Recursion depth is tracked and capped at a configurable limit (100 by
default), so infinite-recursion mistakes produce a clear error rather
than a stack overflow. There is no interactive debugger, but the
--dump-expanded flag on the combined weaveback binary prints the
fully expanded text to stderr before the tangle pass, which is usually
sufficient for diagnosis.
Speed and scaling
m4 is written in C and processes tens of megabytes of macro-heavy input
per second. Its --freeze-state / --reload-state mechanism lets you
pre-compute a common initialization state and amortise it across many
runs. Its changeword feature (which allows redefining word syntax)
slows it down considerably and is rarely used. m4 is stateless between
runs: no database, no carry-over cost.
weaveback-macro itself is fast (Rust, memchr single-byte scan). The
macro pass and the tangle pass are not the bottleneck for large projects.
The bottleneck is weaveback.db.
Database scaling
weaveback.db has two tables that store full file content:
-
gen_baselines— a byte-for-byte copy of every generated file, used to detect external edits between runs. Size tracks the total size of yourgen/directory. -
src_snapshots— a byte-for-byte copy of every source file read. Size tracks the total size of your literate sources.
noweb_map adds roughly 150 bytes per output line.
Rough projections:
| Project scale | Source | Generated | Estimated weaveback.db |
|---|---|---|---|
Small |
500 KB |
300 KB |
~1.5 MB |
Medium |
5 MB |
3 MB |
~12 MB |
Large |
30 MB |
15 MB |
~80 MB |
Very large |
300 MB |
150 MB |
~700 MB |
These are proportional, not fixed costs — the database grows linearly with project size. 80 MB is unremarkable; 700 MB starts to be a consideration for disk space and backup.
Memory during a run
The more immediate concern for large projects is RAM. The temp database
that accumulates during a tangle run uses SQLite in-memory storage
(open_in_memory). At merge time the entire contents of that database
— all of gen_baselines, src_snapshots, noweb_map, and chunk_deps
— lives simultaneously in RAM before being flushed to disk in one
BEGIN IMMEDIATE transaction. Add the ChunkStore (all parsed chunks)
and the evaluator state and the per-run RAM footprint is roughly 2× the
final database size.
For small and medium projects this is invisible. For a very large project a single weaveback run could require 1–2 GB of RAM just for the database merge, before accounting for the rest of the process.
m4 has no equivalent cost: it writes output as it expands and keeps nothing after the run.
Summary
For the overwhelming majority of literate programming projects — a few
dozen source files, tens of thousands of lines of generated output —
neither speed nor database size is a concern. At the scale of a large
monorepo the database design would need revisiting: lazy snapshotting,
opt-in src_snapshots, compressed BLOBs, or a file-backed temp database
instead of in-memory. None of that work has been done; the current
design has not been stress-tested at scale.
Error-proneness
This is where the difference is most pronounced.
m4’s re-scanning model means that many syntactically valid programs produce subtly wrong output rather than an error. A missing or misplaced quote tier causes a macro argument to expand one pass too early or too late. The output looks almost right. The bug can propagate through several files of generated configuration before surfacing. The difficulty is structural: it follows directly from the design.
weaveback-macro’s no-rescan model eliminates that class of bug entirely. The remaining failure modes are:
-
Calling an undefined macro name (reported immediately with location).
-
A
%pydefbody with a Python syntax error (reported with the monty error message embedded in the weaveback error). -
A circular
%includechain (detected and reported). -
A
%pydefbody whose Python raises an exception (propagated as an error with the Python traceback).
All of these produce explicit errors with source locations. None of them produce subtly-wrong-but-plausible output.
Licensing
| Tool | License |
|---|---|
GNU m4 |
GPLv3+ |
weaveback-macro |
0BSD OR MIT OR Apache-2.0 |
GPL means that any work that incorporates m4 itself at runtime may have
licensing implications depending on how courts interpret dynamic linking
and tool invocation. In practice, Autoconf-generated configure scripts
are explicitly exempted by the Free Software Foundation, and using m4 as
a build tool does not affect the license of the generated output. But it
is a question that legal teams sometimes ask.
weaveback’s triple license (0BSD / MIT / Apache-2.0) imposes no restrictions on the generated output and is compatible with any project, including proprietary ones.
Integration with literate programming
m4 has no knowledge of literate programming. Using it as a pre-processor
for noweb or similar requires wiring the two tools together manually,
handling newline injection carefully (the dnl builtin exists largely to
suppress the extra newlines that m4 leaves after macro definitions), and
ensuring that m4’s special characters do not conflict with the noweb chunk
delimiters (<< and >>). People have done this, but it is an
integration problem you solve yourself.
weaveback-macro is designed as a first-class stage of the weaveback
pipeline. It runs before the tangle pass; its output is the input to
chunk extraction. The special character is configurable so it never
conflicts with any chunk syntax you choose. Source locations from the
macro expansion pass are stored in the database and surfaced by
wb-query trace, so an agent or developer can trace an output line back
through both the chunk expansion and the macro expansion to the exact
token in the original document.
Ecosystem and community
m4 is a POSIX-standard tool with fifty years of history. It is pre-installed on essentially every Unix-like system. There is a large body of documentation, Stack Overflow answers, and working code — though much of it is Autoconf-specific and not easily transferred to other contexts. The user base is large but passive: most people encounter m4 through Autoconf and never write m4 directly.
weaveback-macro is a new tool with a small user base and no separate community. Its documentation is the source itself (literate-programmed) and this docs directory. You get Rust-quality error messages and a maintainer who reads every issue, but you cannot paste your question into a forum and expect an answer within the hour.
Portability
m4 runs on every Unix-like system and Windows (via MinGW/MSYS or native binaries). It is available in every Linux distribution’s package manager.
weaveback is distributed as pre-built binaries for glibc Linux, musl
Linux (fully static), Fedora/RPM, and Windows (MinGW and native), as
well as an AUR package (weaveback-bin). The Rust toolchain can
cross-compile it for any tier-1 or tier-2 Rust target. It is not yet
in any Linux distribution’s official repositories.
Security
m4’s system and esyscmd builtins can execute arbitrary shell
commands. This is expected — m4 is a build tool, not a sandbox. In
contexts where input is not fully trusted, this is a real concern.
weaveback-macro’s %env builtin (reads environment variables) is
disabled by default and requires --allow-env to enable, to prevent
templates from silently exfiltrating secrets.
%pydef runs inside a sandboxed interpreter with no filesystem access,
no network access, and no subprocess spawning. %pydef uses
monty, a minimal, secure Python
interpreter written in Rust — not CPython. Neither can read files,
make network calls, or escape the weaveback process in any way. This
is a meaningful security advantage over m4, where system and
esyscmd are available to any macro with no opt-in required.
The tangle pass offers a --dry-run flag that prints what files would
be written without touching the filesystem. This makes it safe to
inspect the effect of a template or chunk set on an unfamiliar codebase
before committing to any writes.
Unicode
m4 is limited to ASCII and 8-bit character sets. It has no notion of
multi-byte characters; substr and index operate on bytes. Unicode
text passes through only if macros do not touch it; any transformation
through substr, translit, or len produces incorrect results for
non-ASCII input.
weaveback-macro operates on UTF-8 strings throughout. The special
character and macro names must be ASCII: the special character uses
memchr for fast scanning, and macro names follow C-style identifier
rules ([a-zA-Z_][a-zA-Z0-9_]*). Arguments and output text are
Unicode-clean.
When to use which
Use m4 when:
-
You are working within the GNU Autotools ecosystem and m4 is already required.
-
You need diversions (output reordering) or
pushdef/popdefscoping. -
The output is going to a system (sendmail, BIND) that already has a large, established m4 macro library you want to reuse.
-
Your audience is Unix administrators who know m4.
Use weaveback-macro when:
-
You are writing literate programming documents in AsciiDoc or Markdown.
-
You want macro expansion to be transparent to a reader who does not know the tool exists.
-
You want named arguments, clear error messages, and source-location tracing from output lines back to original source.
-
You need occasional computation in a macro body and would rather write it in Python than in m4.
-
Licensing matters and GPL is a problem.
-
You are targeting Windows and do not want a POSIX dependency chain.
Summary
| Dimension | GNU m4 | weaveback-macro |
|---|---|---|
Purpose |
General-purpose |
Literate programming |
Expressiveness |
High (Turing-complete natively) |
Moderate core, high via Python |
Learning curve |
Steep (quoting rules) |
Gentle (no re-scan, named args) |
Error-proneness |
High (silent wrong output) |
Low (explicit errors with locations) |
Error messages |
Terse, displaced |
Source-accurate, embedded cause |
Speed |
Fast (C, memchr-equivalent) |
Fast (Rust, memchr) |
Unicode |
ASCII/byte-only |
UTF-8 clean |
Debugging |
|
|
Licensing |
GPLv3+ |
0BSD / MIT / Apache-2.0 |
Ecosystem |
50-year history, huge |
New, small |
Portability |
Universal |
Pre-built binaries, any Rust target |
Literate integration |
Manual wiring |
First-class pipeline stage |
Security |
|
|