Markdown Parser

Level: Advanced (60–90 min)

Concepts: Algorithms, Edge Cases, Boundaries, Refactoring

Solutions: C# | TypeScript | Python


Build a parser that converts a subset of Markdown to HTML, one feature at a time.

Requirements

Implement the following Markdown features in order. Each step builds on the previous — add one feature at a time using TDD.

  1. Paragraphs: plain text becomes <p>text</p>
  2. Bold: **text** becomes <strong>text</strong>
  3. Italic: _text_ becomes <em>text</em>
  4. Headers: # text becomes <h1>text</h1>, ## text becomes <h2>text</h2> (support h1-h6)
  5. Links: [text](url) becomes <a href="url">text</a>
  6. Unordered lists: lines starting with - become <ul><li>text</li></ul>
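
The inline rules (bold, italic, links) can be sketched as a single function that runs on the text of one block. This is a minimal illustration, not the reference solution: the name render_inline and the regular expressions are assumptions made for this example. Links are located first so that underscores inside a URL are never read as italic markers. As a known limitation of this sketch, emphasis that wraps a whole link (for example **[a](b)**) is not handled, and HTML escaping is out of scope.

```python
import re

# Hypothetical patterns for this sketch; the kata does not prescribe them.
_LINK = re.compile(r"\[([^\]]+)\]\(([^)\s]+)\)")
_BOLD = re.compile(r"\*\*(.+?)\*\*")
_ITALIC = re.compile(r"_(.+?)_")


def _emphasis(text: str) -> str:
    """Apply bold first, then italic, so **_x_** nests as <strong><em>."""
    text = _BOLD.sub(r"<strong>\1</strong>", text)
    return _ITALIC.sub(r"<em>\1</em>", text)


def render_inline(text: str) -> str:
    """Convert inline Markdown to HTML, leaving link URLs untouched."""
    out = []
    pos = 0
    for match in _LINK.finditer(text):
        # Text between links gets emphasis handling.
        out.append(_emphasis(text[pos:match.start()]))
        label, url = match.group(1), match.group(2)
        # Only the link label is scanned for emphasis, never the URL.
        out.append(f'<a href="{url}">{_emphasis(label)}</a>')
        pos = match.end()
    out.append(_emphasis(text[pos:]))
    return "".join(out)
```

Because the function only handles inline markup, the surrounding <p> or <h1> wrapper is left to whatever block-level code calls it.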

Test Cases

Input                        | Output
Hello                        | <p>Hello</p>
**bold**                     | <p><strong>bold</strong></p>
_italic_                     | <p><em>italic</em></p>
**bold** and _italic_        | <p><strong>bold</strong> and <em>italic</em></p>
# Heading                    | <h1>Heading</h1>
## Sub Heading               | <h2>Sub Heading</h2>
###### Smallest              | <h6>Smallest</h6>
[click](http://example.com)  | <p><a href="http://example.com">click</a></p>
**_bold italic_**            | <p><strong><em>bold italic</em></strong></p>
""                           | "" (empty string)
#No space                    | <p>#No space</p> (not a header: no space after #)
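
The header and empty-string edge cases in the table come down to how a single line is classified. The sketch below is one possible way to do that, with the inline step passed in as a function so the block-level decision stays separate. The names render_line and inline are hypothetical. Treating seven or more # characters as a plain paragraph is an assumption, since the kata only says to support h1-h6.

```python
import re
from typing import Callable

# One to six '#' characters, at least one space, then the heading text.
_HEADING = re.compile(r"^(#{1,6}) +(.*)$")


def render_line(line: str, inline: Callable[[str], str] = lambda s: s) -> str:
    """Render one line as a heading or a paragraph.

    `inline` is a hypothetical hook for bold/italic/link handling;
    by default the text is passed through unchanged.
    """
    if line == "":
        return ""  # empty input produces empty output
    match = _HEADING.match(line)
    if match:
        level = len(match.group(1))
        return f"<h{level}>{inline(match.group(2))}</h{level}>"
    # Anything else, including "#No space", is an ordinary paragraph.
    return f"<p>{inline(line)}</p>"
```

Requiring the space in the pattern is what keeps "#No space" out of the heading branch without a special case.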

List Example

Input:

- item one
- item two
- item three

Output:

<ul><li>item one</li><li>item two</li><li>item three</li></ul>
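
Lists are the first feature that spans several lines, so consecutive items have to be grouped into a single <ul>. A minimal sketch of that grouping follows. The function name render_blocks is hypothetical. Requiring a space after the dash and emitting one paragraph per non-list line are assumptions of this example. Inline markup is deliberately left out to keep the grouping visible.

```python
from itertools import groupby


def _kind(line: str) -> str:
    """Classify a line as a list item, a blank line, or ordinary text."""
    if line.startswith("- "):
        return "item"
    if not line.strip():
        return "blank"
    return "text"


def render_blocks(markdown: str) -> str:
    """Group runs of consecutive '- ' lines into one <ul> each."""
    html = []
    for kind, group in groupby(markdown.split("\n"), key=_kind):
        if kind == "item":
            # Strip the leading "- " from every item in the run.
            items = "".join(f"<li>{line[2:]}</li>" for line in group)
            html.append(f"<ul>{items}</ul>")
        elif kind == "text":
            html.extend(f"<p>{line}</p>" for line in group)
        # Blank lines only separate blocks and produce no output.
    return "".join(html)
```

Because groupby only merges adjacent lines of the same kind, a paragraph or blank line between two runs of items yields two separate lists.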

Bonus

  • Add inline code: `code` becomes <code>code</code>
  • Add code blocks: triple-backtick blocks become <pre><code>text</code></pre>
  • Add blockquotes: > text becomes <blockquote>text</blockquote>
  • Handle multiple paragraphs: blank lines separate paragraphs
  • Handle escaped characters: \*not bold\* renders as literal asterisks
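
Escaped characters and inline code share one difficulty: their contents must not be scanned for bold or italic markers. One common way to handle this is to swap the protected pieces for placeholders, run the emphasis rules, and then restore them. The sketch below illustrates that idea. The function name, the set of escapable characters, and the use of NUL as a placeholder delimiter (which assumes the input never contains NUL) are all assumptions of this example.

```python
import re

# A backslash followed by an escapable character, or a `code` span.
_PROTECTED = re.compile(r"\\([\\*_`\[\]()#>-])|`([^`]+)`")
_BOLD = re.compile(r"\*\*(.+?)\*\*")
_ITALIC = re.compile(r"_(.+?)_")
_TOKEN = re.compile(r"\x00(\d+)\x00")


def render_inline_with_escapes(text: str) -> str:
    """Render bold/italic while leaving escapes and inline code literal."""
    stash = []

    def protect(match):
        if match.group(1) is not None:
            stash.append(match.group(1))  # escaped character, kept literal
        else:
            stash.append(f"<code>{match.group(2)}</code>")
        return f"\x00{len(stash) - 1}\x00"  # placeholder for later restore

    text = _PROTECTED.sub(protect, text)
    text = _BOLD.sub(r"<strong>\1</strong>", text)
    text = _ITALIC.sub(r"<em>\1</em>", text)
    # Put the protected pieces back after emphasis has been applied.
    return _TOKEN.sub(lambda m: stash[int(m.group(1))], text)
```

For example, render_inline_with_escapes(r"\*not bold\*") yields *not bold*, matching the bonus requirement above.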

Reference Walkthrough

Full C#, TypeScript, and Python implementations live at tddbuddy-reference-katas/markdown-parser, with twenty-seven scenarios across all three languages. They use a fluent DocumentBuilder for multi-line test construction and a two-pass block/inline architecture that handles all six core features plus the bonus features: inline code, code blocks, blockquotes, multiple paragraphs, and escaped characters.
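
To give a feel for what a fluent builder for multi-line test input might look like, here is a hypothetical sketch. It does not reproduce the repository's DocumentBuilder. Every method name below is invented for illustration, and the real API may differ.

```python
class DocumentBuilder:
    """Hypothetical fluent helper that assembles multi-line Markdown input."""

    def __init__(self) -> None:
        self._lines: list[str] = []

    def line(self, text: str) -> "DocumentBuilder":
        self._lines.append(text)
        return self  # returning self is what makes the calls chainable

    def heading(self, level: int, text: str) -> "DocumentBuilder":
        return self.line("#" * level + " " + text)

    def bullet(self, text: str) -> "DocumentBuilder":
        return self.line("- " + text)

    def blank(self) -> "DocumentBuilder":
        return self.line("")

    def build(self) -> str:
        return "\n".join(self._lines)


# Usage: the test reads as a description of the document, not a string literal.
doc = DocumentBuilder().heading(2, "Title").blank().bullet("a").bullet("b").build()
```

A builder like this keeps newline handling in one place, which matters once tests start mixing lists, blank lines, and paragraphs.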

This is an Agent Full-Bake kata — one commit per language with the full design. See the repo’s Gears section for why that’s a deliberate teaching choice.