CSV Query Engine
Level: Advanced 60–90 minConcepts: AlgorithmsEdge CasesBoundariesRefactoringDesign Patterns
Build a query engine that operates on CSV data.
Requirements
- Parse CSV — accept a CSV string with a header row and data rows
- Select columns —
select("name", "age")returns only specified columns - Where clause —
where("age", ">", 30)filters rows by condition- Support operators:
=,!=,>,<,>=,<= - Numeric comparisons for numeric values, string comparisons otherwise
- Support operators:
- Order by —
orderBy("age", "asc")sorts results ascending or descending - Limit —
limit(5)returns only the first N results - Count —
count()returns the number of matching rows - Operations are chainable and applied in order
Sample Data
name,age,city,salary
Alice,35,London,75000
Bob,28,Paris,55000
Charlie,42,London,90000
Diana,31,Berlin,65000
Eve,28,Paris,60000
Test Cases
| Query | Result |
|---|---|
| select(“name”) | Alice, Bob, Charlie, Diana, Eve |
| where(“city”, ”=”, “London”) | Alice and Charlie rows |
| where(“age”, ”>”, 30).select(“name”) | Alice, Charlie, Diana |
| orderBy(“age”, “asc”).select(“name”, “age”) | Bob(28), Eve(28), Diana(31), Alice(35), Charlie(42) |
| where(“city”, ”=”, “Paris”).count() | 2 |
| limit(2).select(“name”) | Alice, Bob |
| where(“age”, ”>=”, 35).orderBy(“salary”, “desc”) | Charlie(90000), Alice(75000) |
| where(“city”, ”=”, “Tokyo”).count() | 0 |
| select(“invalid_column”) | Error or empty results |
Edge Cases
- Empty CSV (header only, no data rows)
- CSV with a single row
- Numeric fields used in string comparisons
- Column names with spaces
- Data containing commas within quoted fields:
"Smith, Jr.",45,London
Bonus
- Add
sum("salary")andavg("salary")aggregation functions - Add
groupBy("city").count()— group rows and aggregate per group - Add
join(otherCsv, "id")— join two CSV datasets on a common column - Support
likeoperator for pattern matching:where("name", "like", "A%")