CSV Query Engine

Level: Advanced 60–90 min

Concepts: AlgorithmsEdge CasesBoundariesRefactoringDesign Patterns


Build a query engine that operates on CSV data.

Requirements

  1. Parse CSV — accept a CSV string with a header row and data rows
  2. Select columnsselect("name", "age") returns only specified columns
  3. Where clausewhere("age", ">", 30) filters rows by condition
    • Support operators: =, !=, >, <, >=, <=
    • Numeric comparisons for numeric values, string comparisons otherwise
  4. Order byorderBy("age", "asc") sorts results ascending or descending
  5. Limitlimit(5) returns only the first N results
  6. Countcount() returns the number of matching rows
  7. Operations are chainable and applied in order

Sample Data

name,age,city,salary
Alice,35,London,75000
Bob,28,Paris,55000
Charlie,42,London,90000
Diana,31,Berlin,65000
Eve,28,Paris,60000

Test Cases

QueryResult
select(“name”)Alice, Bob, Charlie, Diana, Eve
where(“city”, ”=”, “London”)Alice and Charlie rows
where(“age”, ”>”, 30).select(“name”)Alice, Charlie, Diana
orderBy(“age”, “asc”).select(“name”, “age”)Bob(28), Eve(28), Diana(31), Alice(35), Charlie(42)
where(“city”, ”=”, “Paris”).count()2
limit(2).select(“name”)Alice, Bob
where(“age”, ”>=”, 35).orderBy(“salary”, “desc”)Charlie(90000), Alice(75000)
where(“city”, ”=”, “Tokyo”).count()0
select(“invalid_column”)Error or empty results

Edge Cases

  • Empty CSV (header only, no data rows)
  • CSV with a single row
  • Numeric fields used in string comparisons
  • Column names with spaces
  • Data containing commas within quoted fields: "Smith, Jr.",45,London

Bonus

  • Add sum("salary") and avg("salary") aggregation functions
  • Add groupBy("city").count() — group rows and aggregate per group
  • Add join(otherCsv, "id") — join two CSV datasets on a common column
  • Support like operator for pattern matching: where("name", "like", "A%")