CSV Query Engine
Level: Advanced 60–90 minConcepts: AlgorithmsEdge CasesBoundariesRefactoringDesign Patterns
Solutions: C# | TypeScript | Python
Build a query engine that operates on CSV data.
Requirements
- Parse CSV — accept a CSV string with a header row and data rows
- Select columns —
select("name", "age")returns only specified columns - Where clause —
where("age", ">", 30)filters rows by condition- Support operators:
=,!=,>,<,>=,<= - Numeric comparisons for numeric values, string comparisons otherwise
- Support operators:
- Order by —
orderBy("age", "asc")sorts results ascending or descending - Limit —
limit(5)returns only the first N results - Count —
count()returns the number of matching rows - Operations are chainable and applied in order
Sample Data
name,age,city,salary
Alice,35,London,75000
Bob,28,Paris,55000
Charlie,42,London,90000
Diana,31,Berlin,65000
Eve,28,Paris,60000
Test Cases
| Query | Result |
|---|---|
| select(“name”) | Alice, Bob, Charlie, Diana, Eve |
| where(“city”, ”=”, “London”) | Alice and Charlie rows |
| where(“age”, ”>”, 30).select(“name”) | Alice, Charlie, Diana |
| orderBy(“age”, “asc”).select(“name”, “age”) | Bob(28), Eve(28), Diana(31), Alice(35), Charlie(42) |
| where(“city”, ”=”, “Paris”).count() | 2 |
| limit(2).select(“name”) | Alice, Bob |
| where(“age”, ”>=”, 35).orderBy(“salary”, “desc”) | Charlie(90000), Alice(75000) |
| where(“city”, ”=”, “Tokyo”).count() | 0 |
| select(“invalid_column”) | Error or empty results |
Edge Cases
- Empty CSV (header only, no data rows)
- CSV with a single row
- Numeric fields used in string comparisons
- Column names with spaces
- Data containing commas within quoted fields:
"Smith, Jr.",45,London
Bonus
- Add
sum("salary")andavg("salary")aggregation functions - Add
groupBy("city").count()— group rows and aggregate per group - Add
join(otherCsv, "id")— join two CSV datasets on a common column - Support
likeoperator for pattern matching:where("name", "like", "A%")
Reference Walkthrough
Full C#, TypeScript, and Python implementations live at tddbuddy-reference-katas/csv-query. Twenty-five scenarios across parsing, select, where (six comparison operators with automatic numeric/string detection), orderBy, limit, count, and chaining — shared across all three languages — with a CsvParser, CsvTable, Query (chainable pipeline), Row value type, UnknownColumnException domain exception, and two fluent test builders (RowBuilder, QueryBuilder).
- C# (.NET 8, xUnit, FluentAssertions) — walkthrough
- TypeScript (Node 20, Vitest, strict types) — walkthrough
- Python (3.11, pytest, dataclasses) — walkthrough
This kata ships in Agent Full-Bake mode: one commit per language with the full domain design landing together. The walkthroughs read as design rationale — why Row is a type rather than a bare dictionary, why numeric detection happens at comparison time rather than parse time, why UnknownColumnException beats silently returning empty. See the repo’s Gears section for when middle gear is the right call.