Markdown Parser
Level: Advanced, 60–90 min
Concepts: Algorithms, Edge Cases, Boundaries, Refactoring
Solutions: C# | TypeScript | Python
Build a parser that converts a subset of Markdown to HTML, one feature at a time.
Requirements
Implement the following Markdown features in order. Each step builds on the previous — add one feature at a time using TDD.
- Paragraphs: plain text becomes <p>text</p>
- Bold: **text** becomes <strong>text</strong>
- Italic: _text_ becomes <em>text</em>
- Headers: # text becomes <h1>text</h1>, ## text becomes <h2>text</h2> (support h1-h6)
- Links: [text](url) becomes <a href="url">text</a>
- Unordered lists: lines starting with - become <ul><li>text</li></ul>
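The inline rules in the list above (bold, italic, links) plus paragraph wrapping can be sketched as a short chain of regex substitutions. This is a minimal illustration, not the reference solution: the names `render_inline` and `render_paragraph` are hypothetical, and the ordering (links, then bold, then italic) is one workable choice among several.

```python
import re

# Hypothetical helper names; the kata does not prescribe an API.
_LINK = re.compile(r"\[([^\]]+)\]\(([^)\s]+)\)")
_BOLD = re.compile(r"\*\*(.+?)\*\*")
_ITALIC = re.compile(r"_(.+?)_")


def render_inline(text: str) -> str:
    """Apply link, bold, then italic substitutions to one line of text."""
    text = _LINK.sub(r'<a href="\2">\1</a>', text)
    text = _BOLD.sub(r"<strong>\1</strong>", text)
    text = _ITALIC.sub(r"<em>\1</em>", text)
    return text


def render_paragraph(text: str) -> str:
    """Wrap a single line in <p>; an empty input yields an empty output."""
    if text == "":
        return ""
    return f"<p>{render_inline(text)}</p>"


print(render_paragraph("**bold** and _italic_"))
```

One known limitation of this ordering: a URL containing paired underscores would be touched by the italic pass. A later refactoring step could protect link targets before the emphasis rules run.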
Test Cases
| Input | Output |
|---|---|
| Hello | <p>Hello</p> |
| **bold** | <p><strong>bold</strong></p> |
| _italic_ | <p><em>italic</em></p> |
| **bold** and _italic_ | <p><strong>bold</strong> and <em>italic</em></p> |
| # Heading | <h1>Heading</h1> |
| ## Sub Heading | <h2>Sub Heading</h2> |
| ###### Smallest | <h6>Smallest</h6> |
| [click](http://example.com) | <p><a href="http://example.com">click</a></p> |
| **_bold italic_** | <p><strong><em>bold italic</em></strong></p> |
| "" | "" (empty string) |
| #No space | <p>#No space</p> (not a header: no space after #) |
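The header rows in the table, including the `#No space` edge case, suggest a rule of "one to six `#` characters, then at least one space". A possible sketch follows. `render_header` is a hypothetical name, and treating seven or more `#` characters as a non-header is an assumption, since the kata only says to support h1-h6.

```python
import re
from typing import Optional

# One to six '#' characters, at least one space, then the heading text.
_HEADER = re.compile(r"^(#{1,6}) +(.*)$")


def render_header(line: str) -> Optional[str]:
    """Return <hN>text</hN> for a header line, or None if it is not a header."""
    match = _HEADER.match(line)
    if match is None:
        return None  # caller falls back to paragraph handling
    level = len(match.group(1))
    return f"<h{level}>{match.group(2).strip()}</h{level}>"


print(render_header("## Sub Heading"))
print(render_header("#No space"))
```

Returning `None` rather than raising lets the block-level code try the header rule first and fall through to the paragraph rule, which is how `#No space` ends up as `<p>#No space</p>`.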
List Example
Input:
- item one
- item two
- item three
Output:
<ul><li>item one</li><li>item two</li><li>item three</li></ul>
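The list example above needs consecutive `- ` lines to end up inside a single `<ul>`, which is a block-level concern rather than an inline one. One way to sketch that grouping is shown below. `render_blocks` is a hypothetical name, and inline formatting is deliberately left out to keep the grouping logic visible.

```python
from typing import List


def render_blocks(lines: List[str]) -> str:
    """Group consecutive '- ' lines into one <ul>; wrap other non-blank lines in <p>."""
    html: List[str] = []
    items: List[str] = []

    def flush_list() -> None:
        # Emit the pending list items, if any, as a single <ul> element.
        if items:
            html.append("<ul>" + "".join(f"<li>{item}</li>" for item in items) + "</ul>")
            items.clear()

    for line in lines:
        if line.startswith("- "):
            items.append(line[2:])
        else:
            flush_list()
            if line.strip():
                html.append(f"<p>{line}</p>")
    flush_list()  # a list may run to the end of the input
    return "".join(html)


print(render_blocks(["- item one", "- item two", "- item three"]))
```

The final `flush_list()` call after the loop is the boundary case that is easy to miss when the input ends with a list item.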
Bonus
- Add inline code: `code` becomes <code>code</code>
- Add code blocks: triple backtick blocks become <pre><code>text</code></pre>
- Add blockquotes: > text becomes <blockquote>text</blockquote>
- Handle multiple paragraphs: blank lines separate paragraphs
- Handle escaped characters: \*not bold\* renders as literal asterisks
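Escaped characters and inline code both need their contents shielded from the bold and italic rules. A common approach, sketched here under assumptions, is to swap protected spans for placeholders, run the emphasis rules, then restore the spans. The name `render_inline_safe`, the set of escapable characters, and the use of NUL bytes as placeholder delimiters (which assumes the input contains none) are all illustrative choices, not part of the kata.

```python
import re
from typing import List

# A backslash escape OR an inline code span, matched in one left-to-right scan.
_PROTECT = re.compile(r"\\([\\`*_\[\]()#>-])|`([^`]+)`")
_BOLD = re.compile(r"\*\*(.+?)\*\*")
_ITALIC = re.compile(r"_(.+?)_")
_PLACEHOLDER = re.compile(r"\x00(\d+)\x00")


def render_inline_safe(text: str) -> str:
    """Render bold/italic while leaving escapes and code spans untouched."""
    stash: List[str] = []

    def protect(match):
        literal, code = match.group(1), match.group(2)
        # Store the final output for this span and leave a numbered marker behind.
        stash.append(literal if literal is not None else f"<code>{code}</code>")
        return f"\x00{len(stash) - 1}\x00"

    text = _PROTECT.sub(protect, text)
    text = _BOLD.sub(r"<strong>\1</strong>", text)
    text = _ITALIC.sub(r"<em>\1</em>", text)
    # Put the protected spans back in place of their markers.
    return _PLACEHOLDER.sub(lambda m: stash[int(m.group(1))], text)


print(render_inline_safe(r"\*not bold\*"))
```

Matching escapes and code spans with a single alternation means whichever starts first in the text wins, so an escaped backtick does not open a code span and a backslash inside a code span stays literal.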
Reference Walkthrough
Full C#, TypeScript, and Python implementations live at tddbuddy-reference-katas/markdown-parser. They cover twenty-seven scenarios across all three languages, use a fluent DocumentBuilder for multi-line test construction, and follow a two-pass block/inline architecture that handles all six core features plus inline code, code blocks, blockquotes, multiple paragraphs, and escaped characters.
- C# (.NET 8, xUnit, FluentAssertions) — walkthrough
- TypeScript (Node 20, Vitest, strict types) — walkthrough
- Python (3.11, pytest, dataclasses) — walkthrough
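To give a feel for what a fluent builder for multi-line test input might look like, here is a small hypothetical sketch. It is not the repository's DocumentBuilder: the method names (`line`, `blank`, `bullet`, `heading`, `build`) are assumptions made for illustration only.

```python
from typing import List


class DocumentBuilder:
    """Hypothetical fluent builder that assembles multi-line Markdown input."""

    def __init__(self) -> None:
        self._lines: List[str] = []

    def line(self, text: str) -> "DocumentBuilder":
        self._lines.append(text)
        return self  # returning self is what makes the calls chainable

    def blank(self) -> "DocumentBuilder":
        return self.line("")

    def bullet(self, text: str) -> "DocumentBuilder":
        return self.line(f"- {text}")

    def heading(self, level: int, text: str) -> "DocumentBuilder":
        return self.line("#" * level + " " + text)

    def build(self) -> str:
        return "\n".join(self._lines)


markdown = DocumentBuilder().heading(2, "Title").blank().bullet("a").bullet("b").build()
print(markdown)
```

A builder like this keeps multi-line test inputs readable and avoids hand-written newline-separated string literals in each test.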
This is an Agent Full-Bake kata — one commit per language with the full design. See the repo’s Gears section for why that’s a deliberate teaching choice.