Several of my projects do heavy markdown parsing: comment rendering, documentation pipelines, content management. The volume keeps growing, and I've been hitting the point where pure-PHP parsers (Parsedown, league/commonmark, cebe/markdown, michelf/php-markdown) just can't keep up. They're solid libraries, but parsing thousands of documents per request or chewing through 200 KB files in interpreted PHP is slow no matter how well the code is written.
I wanted something 10x+ faster that could serve as a drop-in replacement for the common cases. The result is mdparser, a native C extension that wraps cmark-gfm (GitHub's CommonMark parser) and exposes it through a clean PHP 8.3+ OO API. I'm releasing it today.
mdparser vendors a copy of cmark-gfm 0.29.0.gfm.13 directly into the extension's shared object. No external library to link against, no cmake, no runtime dependencies. The entire cmark-gfm codebase compiles alongside the PHP wrapper into a single .so (or .dll on Windows). Four cherry-picked commits from cmark upstream close the 0.29-to-0.31 spec gap, giving full CommonMark 0.31 conformance: 652 out of 652 spec examples pass.
The PHP API is intentionally small. Two classes, one exception:
```php
use MdParser\Parser;
use MdParser\Options;

// Defaults: safe mode on, GFM extensions on.
$parser = new Parser();
echo $parser->toHtml('# Hello');

// Or the static shorthand:
echo Parser::html('# Hello');

// Custom options via named arguments:
$parser = new Parser(new Options(
    smart: true,
    footnotes: true,
    sourcepos: true,
));

// Three output formats:
$html = $parser->toHtml($markdown);
$xml  = $parser->toXml($markdown);
$ast  = $parser->toAst($markdown); // nested PHP arrays
```
Options is a final readonly class with 17 boolean fields. The Parser constructor translates those bools into cmark's internal bitmask once, so subsequent parse calls do no option handling of their own and go straight into cmark. Static factory presets (Options::strict(), Options::github(), Options::permissive()) cover common deployment patterns.
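To make the "translate once" idea concrete, here is a minimal userland sketch of the pattern. It is not mdparser's actual implementation (that lives in C): the class names SketchOptions and SketchParser, the three fields shown, and the flag values are all hypothetical and chosen only for illustration.

```php
<?php
declare(strict_types=1);

// Hypothetical flag values for illustration; these are NOT cmark's real constants.
const SKETCH_OPT_SOURCEPOS = 1 << 0;
const SKETCH_OPT_SMART     = 1 << 1;
const SKETCH_OPT_FOOTNOTES = 1 << 2;

// Stand-in for an immutable options object with boolean fields.
final readonly class SketchOptions
{
    public function __construct(
        public bool $sourcepos = false,
        public bool $smart = false,
        public bool $footnotes = false,
    ) {}
}

// Stand-in for a parser that folds the booleans into one integer up front.
final class SketchParser
{
    /** Bitmask computed once in the constructor and reused by every parse call. */
    public readonly int $mask;

    public function __construct(SketchOptions $options = new SketchOptions())
    {
        $mask = 0;
        if ($options->sourcepos) {
            $mask |= SKETCH_OPT_SOURCEPOS;
        }
        if ($options->smart) {
            $mask |= SKETCH_OPT_SMART;
        }
        if ($options->footnotes) {
            $mask |= SKETCH_OPT_FOOTNOTES;
        }
        $this->mask = $mask;
    }
}

$parser = new SketchParser(new SketchOptions(smart: true, footnotes: true));
printf("mask = 0b%03b\n", $parser->mask);
```

Because the options object is immutable, the mask can never go stale, which is what makes caching it in the constructor safe.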
If you're migrating from Parsedown's line() or cebe/markdown's parseParagraph(), there's toInlineHtml(): inline-only HTML without the wrapping <p> tags. Useful for chat messages, table cells, and short user-facing strings.
Speed was the primary motivation. Measured on PHP 8.4 with each parser in its default configuration:
| Parser | Small (200 B) | Medium (1.8 KB) | Large (200 KB) |
|---|---|---|---|
| mdparser | 30,447 ops/s | 5,697 ops/s | 105 ops/s |
| Parsedown | 1,651 ops/s (18x slower) | 325 ops/s (17x) | 6 ops/s (17x) |
| cebe/markdown (GFM) | 1,350 ops/s (22x) | 374 ops/s (15x) | 6 ops/s (16x) |
| michelf (Extra) | 1,006 ops/s (30x) | 209 ops/s (27x) | 5 ops/s (19x) |
That is 15-30x faster across the board, from 200-byte chat messages to 200 KB documents. Your absolute numbers will differ by hardware, but the ratios should be broadly similar. On the large input, which is about the size of the full CommonMark spec, mdparser processes roughly 100 documents per second on a single core. The pure-PHP parsers manage 5-6.
The benchmark wraps each parse call in hrtime(true), runs 200 iterations after a warm-up, and reports a trimmed mean to filter out GC pauses. Reproducible scripts are in the bench/ directory.
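For readers who want to sanity-check the method without opening bench/, the measurement loop can be sketched roughly as follows. This is not the script from the repository: the function names, the 20-call warm-up, and the 10% trim on each end are my assumptions here. Only hrtime(true), the 200 iterations, the warm-up, and the trimmed mean come from the description above.

```php
<?php
declare(strict_types=1);

/**
 * Trimmed mean: sort the samples, drop a fraction from each end, average the rest.
 * The default 10% trim is an assumption, not the value used in bench/.
 */
function trimmedMean(array $samples, float $trim = 0.1): float
{
    if ($samples === [] || $trim < 0.0 || $trim >= 0.5) {
        throw new InvalidArgumentException('need samples and 0 <= trim < 0.5');
    }
    sort($samples);
    $n = count($samples);
    $drop = (int) floor($n * $trim);
    $kept = array_slice($samples, $drop, $n - 2 * $drop);

    return array_sum($kept) / count($kept);
}

/**
 * Time $parse($input) with hrtime(true) and return operations per second.
 * $warmup calls run first and are not measured.
 */
function benchOpsPerSecond(
    callable $parse,
    string $input,
    int $iterations = 200,
    int $warmup = 20,
): float {
    for ($i = 0; $i < $warmup; $i++) {
        $parse($input);
    }

    $samples = [];
    for ($i = 0; $i < $iterations; $i++) {
        $start = hrtime(true);              // monotonic clock, nanoseconds
        $parse($input);
        $samples[] = hrtime(true) - $start;
    }

    // Guard against a zero mean on very coarse clocks.
    $meanNs = max(trimmedMean($samples), 1.0);

    return 1e9 / $meanNs;
}

// Stand-in workload so the sketch runs without any markdown parser installed.
$ops = benchOpsPerSecond(
    fn (string $md): string => strtoupper($md),
    str_repeat("# Hello\n", 25),
);
printf("%.0f ops/s\n", $ops);
```

To compare real parsers, swap the stand-in closure for one that calls the parser under test, and feed every parser the same input string.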
mdparser covers CommonMark core plus all five GFM extensions. Here's how it stacks up against the pure-PHP alternatives:
| Feature | mdparser | Parsedown | league/commonmark | cebe/markdown (GFM) | michelf (Extra) |
|---|---|---|---|---|---|
| CommonMark core | full | partial | full | partial | partial |
| GFM tables | yes | yes | via ext | yes | via Extra |
| Strikethrough | yes | yes | via ext | yes | no |
| Task lists | yes | no | via ext | no | no |
| Autolinks | yes | yes | via ext | yes | no |
| Tag filter | yes | yes | via ext | partial | no |
| Smart punctuation | yes | no | via ext | no | no |
| Footnotes | yes | via ParsedownExtra | via ext | no | yes |
| Sourcepos | yes | no | yes | no | no |
| XML output | yes | no | no | no | no |
| AST output | yes (arrays) | no | yes (objects) | no | no |
mdparser is scoped to what cmark-gfm supports: CommonMark core plus five GFM extensions. It doesn't cover definition lists, abbreviations, attribute syntax, heading permalinks, table of contents, YAML front matter, mentions, LaTeX math, emoji shortcodes, or custom containers. If you need those, league/commonmark is the right choice. It's the most featureful pure-PHP option and actively maintained. Speed doesn't help if the feature you need isn't there.
mdparser builds and tests on PHP 8.3, 8.4, and 8.5 across Linux (x86_64), macOS (arm64/x86_64), and Windows (x86/x64, both TS and NTS). CI runs on all three platforms, with an ASAN job on Linux to catch memory issues. Pre-built Windows DLLs ship with each GitHub release.
```bash
pie install iliaal/mdparser
```
PIE handles the download, phpize, configure, make, and install. On a minimal PHP image you'll need git, bison, and libtool-bin as build dependencies.
From source:
```bash
git clone https://github.com/iliaal/mdparser.git
cd mdparser
phpize && ./configure --enable-mdparser
make -j && sudo make install
```