Introduction
My PM pinged me on a Thursday afternoon: "Can you export the last 90 days of payment events into a spreadsheet for the finance team?" Simple enough hit the Stripe API, get the JSON, convert to CSV. I even asked ChatGPT to write me a quick script. Five minutes later I had a CSV file.
Except the billing_details column was [object Object]. The metadata field which finance actually needed was completely missing. And every payment that had multiple line items showed up as a single row with the items array jammed into one cell.
The LLM gave me a working script for flat JSON. But real API responses aren't flat. They're nested three, four, five levels deep with arrays inside objects inside arrays. The "quick export" turned into an hour of debugging flattening logic, handling null values in optional fields, and figuring out whether each line item should be its own row or a comma-separated value in a single cell.
I've done this conversion probably a hundred times across different projects now. The problem isn't writing JSON.stringify() or calling pandas.to_csv(). The problem is the decisions how do you represent hierarchical data in a flat table without losing information or creating an unreadable mess?
This guide covers those decisions. Real JSON from real APIs, multiple flattening strategies with trade-offs, edge cases that break naive converters, and when CSV is the wrong choice entirely.
Why JSON-to-CSV Is Harder Than It Looks
The Fundamental Mismatch - Hierarchical vs Tabular
JSON is a tree. CSV is a table. That's the entire problem in one sentence.
JSON can nest objects inside objects, arrays inside arrays, and mix types freely. CSV has rows and columns every row must have the same number of fields, and every field is a string. When you convert between these two models, you're making lossy decisions about how to represent structure in a flat format.
Here's a simple example that illustrates the mismatch:
{
"order_id": "ord_8x7Kp2",
"customer": {
"name": "Sarah Chen",
"email": "sarah@example.com",
"address": {
"city": "Portland",
"state": "OR"
}
},
"items": [
{ "sku": "WIDGET-01", "qty": 2, "price": 29.99 },
{ "sku": "GADGET-03", "qty": 1, "price": 149.0 }
],
"metadata": {
"source": "web",
"campaign_id": null
}
}What should the CSV look like? There's no single correct answer. Should customer.address.city become a column called customer.address.city or just city? Should the two items become two rows (duplicating the order info) or two columns (item_1_sku, item_2_sku)? What happens to campaign_id: null empty string, the literal "null", or omit the column?
These aren't academic questions. They determine whether your finance team can actually use the spreadsheet.
What Gets Lost in Translation
Every JSON-to-CSV conversion loses something. Being explicit about what you're sacrificing helps you choose the right strategy:
- Type information: JSON distinguishes
42(number),"42"(string),true(boolean), andnull. CSV treats everything as text. - Nesting structure: The parent-child relationship between
customerandaddressdisappears in a flat column list. - Array ordering semantics: Is the first item in the array more important than the last? CSV rows don't inherently carry that context.
- Null vs missing vs empty: JSON has
null, missing keys, and"". In CSV, they all look the same: an empty cell. - Object identity: Two items belonging to the same order lose their grouping when expanded into separate rows.
- Lightweight: Easier to process in traditional data pipelines.
- Interoperability: Works across almost every database and analytics platform.
When CSV Is the Wrong Choice
Before spending time on conversion, ask whether CSV is actually what you need:
- Data has more than 3 levels of nesting → Consider Parquet (columnar, preserves nested structures) or keep it as JSON
- Arrays have variable length → Parquet or NDJSON handles this better
- Consumers need to query the data → Load into a database or use the Parquet Viewer with SQL
- File is over 100MB → CSV will be enormous after flattening; Parquet compresses 5-10x better
- Data will be re-ingested programmatically → Keep it as JSON; CSV round-tripping loses type info
For everything else spreadsheet analysis, quick reporting, importing into tools that only accept CSV read on.
Flattening Strategies for Nested JSON
There's no universal "correct" way to flatten JSON. The right strategy depends on who's consuming the CSV and what they need to do with it. Here are the three approaches I use most, applied to the same source data.
Source JSON (array of orders):
[
{
"id": "ord_001",
"created_at": "2026-03-15T09:22:00Z",
"customer": {
"name": "Alex Rivera",
"email": "alex@startup.io"
},
"items": [
{ "product": "API Credits", "amount": 299.0 },
{ "product": "Support Plan", "amount": 89.0 }
],
"total": 388.0,
"status": "paid"
},
{
"id": "ord_002",
"created_at": "2026-03-15T14:05:00Z",
"customer": {
"name": "Jordan Park",
"email": "jordan@bigcorp.com"
},
"items": [{ "product": "Enterprise License", "amount": 2400.0 }],
"total": 2400.0,
"status": "paid"
}
]Strategy 1: Dot-Notation Flattening
Flatten nested objects by joining key paths with dots. Arrays get indexed: items.0.product, items.1.product.
Result:
id,created_at,customer.name,customer.email,items.0.product,items.0.amount,items.1.product,items.1.amount,total,status
ord_001,2026-03-15T09:22:00Z,Alex Rivera,alex@startup.io,API Credits,299.00,Support Plan,89.00,388.00,paid
ord_002,2026-03-15T14:05:00Z,Jordan Park,jordan@bigcorp.com,Enterprise License,2400.00,,,2400.00,paidPros: No data loss, reversible (you can reconstruct the JSON), every value gets its own cell.
Cons: Column count explodes with variable-length arrays. Order 1 has 2 items, order 2 has 1 so you get empty columns. If any order has 20 items, you get 40 extra columns. Column headers like items.7.product are ugly in spreadsheets.
Best for: Programmatic re-ingestion, data pipelines where you'll reshape downstream, small arrays with consistent length.
Strategy 2: Array Expansion (One Row Per Array Item)
Each array item becomes its own row. Parent fields are duplicated across rows.
Result:
id,created_at,customer.name,customer.email,item_product,item_amount,total,status
ord_001,2026-03-15T09:22:00Z,Alex Rivera,alex@startup.io,API Credits,299.00,388.00,paid
ord_001,2026-03-15T09:22:00Z,Alex Rivera,alex@startup.io,Support Plan,89.00,388.00,paid
ord_002,2026-03-15T14:05:00Z,Jordan Park,jordan@bigcorp.com,Enterprise License,2400.00,2400.00,paidPros: Clean column headers, consistent column count, easy to filter/sort in Excel, works with pivot tables.
Cons: Row count multiplies (1000 orders × 5 items = 5000 rows). Parent data is duplicated, inflating file size. Aggregations need care summing total would double-count orders with multiple items.
Best for: Spreadsheet analysis, reporting where line-item detail matters, importing into BI tools that expect one-fact-per-row.
Strategy 3: Selective Extraction
Pick only the fields you need. Ignore nested structures you don't care about.
Result:
id,customer_name,total,status
ord_001,Alex Rivera,388.00,paid
ord_002,Jordan Park,2400.00,paidPros: Minimal, clean, exactly what the consumer asked for. Small file size. No ambiguity.
Cons: Lossy you can't reconstruct the original JSON. Requires knowing upfront which fields matter.
Best for: Executive summaries, quick exports for non-technical stakeholders, feeding into systems that only need specific fields.
Real-World Conversion Scenarios
These are actual conversions I've done (with sanitized data). Each one has a different challenge.
Scenario 1: Exporting Stripe Payment Data for Accounting
The ask: Finance needs a monthly spreadsheet of all successful charges with customer info, amount, currency, and the payment method's last four digits.
The JSON (from Stripe's /v1/charges endpoint):
[
{
"id": "ch_3PqR7x2eZvKYlo2C",
"amount": 15000,
"currency": "usd",
"status": "succeeded",
"created": 1710489600,
"billing_details": {
"name": "Priya Sharma",
"email": "priya@designstudio.co",
"address": {
"city": "Austin",
"state": "TX",
"country": "US",
"postal_code": "78701"
}
},
"payment_method_details": {
"type": "card",
"card": {
"brand": "visa",
"last4": "4242",
"exp_month": 12,
"exp_year": 2027
}
},
"metadata": {
"plan": "pro_annual",
"team_size": "15"
}
},
{
"id": "ch_3PqR8y3fAkLZmp3D",
"amount": 4900,
"currency": "usd",
"status": "succeeded",
"created": 1710493200,
"billing_details": {
"name": "Marcus Johnson",
"email": "marcus@freelance.dev",
"address": {
"city": "Denver",
"state": "CO",
"country": "US",
"postal_code": "80202"
}
},
"payment_method_details": {
"type": "card",
"card": {
"brand": "mastercard",
"last4": "8910",
"exp_month": 3,
"exp_year": 2026
}
},
"metadata": {
"plan": "starter_monthly",
"team_size": "1"
}
}
]The challenge: Stripe amounts are in cents (15000 = $150.00). The created field is a Unix timestamp. Finance needs human-readable dates and dollar amounts. The nested billing_details.address and payment_method_details.card need selective flattening they don't need exp_month or postal_code.
The conversion strategy: Selective extraction + field transformation.
Target CSV:
charge_id,date,customer_name,email,amount_usd,currency,card_brand,card_last4,plan
ch_3PqR7x2eZvKYlo2C,2024-03-15,Priya Sharma,priya@designstudio.co,150.00,usd,visa,4242,pro_annual
ch_3PqR8y3fAkLZmp3D,2024-03-15,Marcus Johnson,marcus@freelance.dev,49.00,usd,mastercard,8910,starter_monthlyHow I'd do this with OnlineJSONFormatt's JSON to CSV tool:
- Paste the JSON array
- Open the Column Mapping Dialog
- Select fields:
id,created,billing_details.name,billing_details.email,amount,currency,payment_method_details.card.brand,payment_method_details.card.last4,metadata.plan - Rename columns to friendly names (
billing_details.name→customer_name) - Export as CSV or XLSX
The tool handles the dot-notation path resolution automatically. For the cents-to-dollars conversion and timestamp formatting, you'd do a quick find-and-replace in the spreadsheet or handle it in a downstream formula.
Scenario 2: Converting GitHub API Responses for Project Tracking
The ask: Engineering manager wants a spreadsheet of all open PRs across 5 repos with author, review status, and days open updated weekly for standup.
The JSON (from GitHub's /repos/{owner}/{repo}/pulls endpoint):
[
{
"number": 847,
"title": "feat: add rate limiting to /api/v2/users",
"state": "open",
"created_at": "2026-05-12T08:30:00Z",
"updated_at": "2026-05-18T16:45:00Z",
"user": {
"login": "dchen-eng",
"avatar_url": "https://avatars.githubusercontent.com/u/12345"
},
"labels": [{ "name": "feature" }, { "name": "needs-review" }],
"requested_reviewers": [
{ "login": "srivastava-k" },
{ "login": "oconnor-m" }
],
"head": {
"ref": "feat/rate-limiting",
"repo": {
"full_name": "acme-corp/api-gateway"
}
},
"draft": false
},
{
"number": 203,
"title": "fix: memory leak in WebSocket connection pool",
"state": "open",
"created_at": "2026-05-17T14:20:00Z",
"updated_at": "2026-05-19T09:10:00Z",
"user": {
"login": "tanaka-r",
"avatar_url": "https://avatars.githubusercontent.com/u/67890"
},
"labels": [{ "name": "bug" }, { "name": "priority-high" }],
"requested_reviewers": [],
"head": {
"ref": "fix/ws-memory-leak",
"repo": {
"full_name": "acme-corp/realtime-service"
}
},
"draft": false
}
]The challenge: labels and requested_reviewers are variable-length arrays. PR #847 has 2 reviewers, PR #203 has none. You can't use array expansion here because one row per label would make the spreadsheet unusable for tracking PRs.
The conversion strategy: Selective extraction with array-to-string joining.
Target CSV:
repo,pr_number,title,author,labels,reviewers,created_at,days_open,is_draft
acme-corp/api-gateway,847,feat: add rate limiting to /api/v2/users,dchen-eng,"feature, needs-review","srivastava-k, oconnor-m",2026-05-12T08:30:00Z,8,false
acme-corp/realtime-service,203,fix: memory leak in WebSocket connection pool,tanaka-r,"bug, priority-high",,2026-05-17T14:20:00Z,3,falseKey decision: Arrays like labels and requested_reviewers are joined into comma-separated strings within a single cell. This keeps one row per PR (which is what the manager actually needs for tracking) while preserving the information. The trade-off is that you can't easily filter by individual label in Excel without a formula.
Scenario 3: Transforming Analytics Event JSON for Spreadsheet Analysis
The ask: Product team wants to analyze user behavior for a feature launch. The analytics pipeline exports events as JSON. They need it in a spreadsheet to build pivot tables.
The JSON (from an analytics export):
[
{
"event_id": "evt_a1b2c3",
"event_name": "feature_activated",
"timestamp": "2026-05-14T10:32:15.847Z",
"user": {
"id": "usr_9f8e7d",
"plan": "enterprise",
"company_size": "500-1000"
},
"properties": {
"feature_name": "ai_suggestions",
"activation_source": "onboarding_wizard",
"time_to_activate_seconds": 45,
"previous_features_used": ["formatter", "validator", "diff"]
},
"context": {
"browser": "Chrome 125",
"os": "macOS",
"viewport": "1920x1080",
"locale": "en-US"
}
},
{
"event_id": "evt_d4e5f6",
"event_name": "feature_activated",
"timestamp": "2026-05-14T11:08:42.123Z",
"user": {
"id": "usr_2a3b4c",
"plan": "pro",
"company_size": "10-50"
},
"properties": {
"feature_name": "ai_suggestions",
"activation_source": "settings_page",
"time_to_activate_seconds": 180,
"previous_features_used": ["formatter"]
},
"context": {
"browser": "Firefox 126",
"os": "Windows",
"viewport": "1366x768",
"locale": "de-DE"
}
}
]The challenge: The properties.previous_features_used array varies in length per event. The context object has fields the product team doesn't care about (viewport). And they want to pivot by user.plan and properties.activation_source so those need to be clean, filterable columns.
The conversion strategy: Selective extraction + array counting (instead of expanding).
Target CSV:
event_id,timestamp,user_id,user_plan,company_size,feature_name,activation_source,time_to_activate_seconds,previous_features_count,browser,os
evt_a1b2c3,2026-05-14T10:32:15.847Z,usr_9f8e7d,enterprise,500-1000,ai_suggestions,onboarding_wizard,45,3,Chrome 125,macOS
evt_d4e5f6,2026-05-14T11:08:42.123Z,usr_2a3b4c,pro,10-50,ai_suggestions,settings_page,180,1,Firefox 126,WindowsKey decision: Instead of expanding previous_features_used into columns or rows, I converted it to a count (previous_features_count: 3). The product team cares about how many features users tried before activating, not which specific ones (they can drill into the JSON for that). This keeps the CSV clean for pivot tables.
Handling Edge Cases in JSON-to-CSV
These are the cases that break naive converters and the ones that cost you debugging time when you discover them after sharing the spreadsheet.
Commas and Quotes Inside Values
Per RFC 4180, CSV fields containing commas, double quotes, or newlines must be enclosed in double quotes. Double quotes within a field are escaped by doubling them.
{
"company": "Smith, Jones & Associates",
"description": "She said \"this is urgent\"",
"address": "123 Main St\nSuite 400"
}Correct CSV output:
company,description,address
"Smith, Jones & Associates","She said ""this is urgent""","123 Main St
Suite 400"Many tools get this wrong they either don't quote fields with commas (breaking the column alignment) or they escape quotes with backslashes instead of doubling (which Excel doesn't understand).
Unicode Characters and Encoding
JSON is UTF-8 by spec. CSV has no mandated encoding Excel on Windows defaults to the system locale (often Windows-1252), which mangles characters like é, ñ, 中文, or emoji.
The fix: Always export CSV as UTF-8 with a BOM (Byte Order Mark: \xEF\xBB\xBF). Excel recognizes the BOM and opens the file with correct encoding. Without it, José García becomes José GarcÃa.
{ "name": "José García", "city": "São Paulo", "note": "Priority: 高" }If your tool doesn't add a BOM, open the CSV in a text editor, save as "UTF-8 with BOM," then open in Excel.
Large Files: Streaming vs In-Memory Conversion
For JSON arrays under 10MB, in-memory conversion is fine. Load the whole thing, flatten, write CSV. But at 50MB+ you'll hit memory limits in the browser and potentially in Node.js too.
Streaming approach (Node.js):
const { createReadStream, createWriteStream } = require('fs');
const { parser } = require('stream-json');
const { streamArray } = require('stream-json/streamers/StreamArray');
const { stringify } = require('csv-stringify');
const csvStream = stringify({ header: true });
csvStream.pipe(createWriteStream('output.csv'));
createReadStream('large-data.json')
.pipe(parser())
.pipe(streamArray())
.on('data', ({ value }) => {
csvStream.write({
id: value.id,
name: value.customer?.name ?? '',
email: value.customer?.email ?? '',
total: value.total,
});
})
.on('end', () => csvStream.end());This processes one object at a time memory usage stays constant regardless of file size.
Date Format Normalization
JSON has no date type. Dates appear as ISO 8601 strings ("2026-05-20T10:30:00Z"), Unix timestamps (1716199800), or arbitrary formats ("May 20, 2026"). When converting to CSV for spreadsheet use, you need consistency.
My rule: Convert everything to ISO 8601 (YYYY-MM-DD or YYYY-MM-DDTHH:mm:ssZ) during conversion. Excel recognizes ISO dates and lets users reformat to their locale preference. Unix timestamps should be converted no spreadsheet user wants to see 1716199800 in a date column.
Null vs Empty String vs Missing Key
These three are semantically different in JSON but look identical in CSV (an empty cell):
[
{ "name": "Alice", "phone": null },
{ "name": "Bob", "phone": "" },
{ "name": "Carol" }
]Options:
| Approach | null | empty string | missing key |
|---|---|---|---|
| Treat all as empty | (empty) | (empty) | (empty) |
| Preserve null literal | null | (empty) | (empty) |
| Distinguish all three | null | (empty) | N/A |
I use "preserve null literal" for data engineering pipelines (so downstream systems can differentiate) and "treat all as empty" for spreadsheets going to non-technical stakeholders.
Tools and Approaches Compared
There's no single best tool it depends on whether this is a one-off task, a recurring pipeline, or something you need done in 30 seconds during a meeting.
Command-Line: jq + csvkit
Best for: Recurring conversions in scripts, CI/CD pipelines, large files.
# Flatten and convert with jq + miller
jq -r '.[] | [.id, .customer.name, .customer.email, .total, .status] | @csv' orders.json > orders.csv
# Or with headers using miller
jq '.' orders.json | mlr --ijson --ocsv catPros: Fast, scriptable, handles huge files, composable with other Unix tools. Cons: Learning curve for jq syntax, no visual preview, hard to debug complex flattening logic.
Programmatic: Node.js (json2csv) or Python (pandas)
Best for: Custom transformation logic, integration into existing codebases, automated pipelines.
Node.js example:
const { Parser } = require('json2csv');
const fields = [
{ label: 'Order ID', value: 'id' },
{ label: 'Customer', value: 'customer.name' },
{ label: 'Email', value: 'customer.email' },
{ label: 'Total', value: 'total' },
{ label: 'Status', value: 'status' },
];
const parser = new Parser({ fields });
const csv = parser.parse(ordersJson);Python example:
import pandas as pd
df = pd.json_normalize(orders_json, sep='_')
# Select and rename columns
df = df[['id', 'customer_name', 'customer_email', 'total', 'status']]
df.columns = ['Order ID', 'Customer', 'Email', 'Total', 'Status']
df.to_csv('orders.csv', index=False)Pros: Full control over transformation logic, handles any edge case, testable. Cons: Requires a development environment, overkill for one-off conversions, code maintenance.
Browser-Based: When You Need It Done Now
Best for: One-off conversions, quick exports during meetings, sensitive data you can't upload to a server.
The OnlineJSONFormatt JSON to CSV converter handles this workflow:
- Paste JSON or upload a file (up to 15MB)
- The tool auto-detects the array structure and available columns
- Open Column Mapping to select, rename, and reorder fields
- Preview the CSV output
- Download as CSV, XLSX, TSV, or Parquet
Everything runs client-side your data never leaves the browser. This matters when you're converting production API responses that contain customer PII or API keys.
Pros: Zero setup, visual column mapping, multiple export formats, privacy-safe. Cons: Limited to browser memory (~15MB JSON), no custom transformation logic beyond field selection.
When to Use Each Approach
| Scenario | Recommended Tool |
|---|---|
| One-off export for a colleague | Browser-based |
| Recurring daily/weekly export | Script (jq or Node.js/Python) |
| Complex transformation with business logic | Programmatic |
| Quick exploration of an API response | Browser-based |
| CI/CD pipeline step | Command-line |
| Sensitive data (PII, credentials in the payload) | Browser-based (client-side) |
| File over 50MB | Command-line (streaming) |
Step-by-Step: Converting JSON to CSV Online
Here's the complete workflow using OnlineJSONFormatt's JSON to CSV tool, demonstrated with the Stripe payment data from Scenario 1.
Step 1: Input your JSON
Paste your JSON array directly into the editor, upload a .json file, or fetch from a URL. The tool accepts arrays of objects (the most common format for API responses).
Step 2: Auto-detection
The tool parses your JSON and detects all available fields, including nested paths. For the Stripe data, it identifies paths like billing_details.name, payment_method_details.card.brand, and metadata.plan.
Step 3: Open Column Mapping
Click the Column Mapping button to customize your output:
- Select fields: Check only the columns you need. Uncheck
billing_details.address.postal_codeandpayment_method_details.card.exp_monthif finance doesn't need them. - Rename columns: Change
billing_details.nametoCustomer Nameandpayment_method_details.card.last4toCard Last 4. - Reorder columns: Drag
idto the first position,amountnext tocurrency.
Step 4: Preview and verify
The live preview shows your CSV output with the selected columns, renamed headers, and proper formatting. Check that:
- All needed fields are present
- No unexpected empty columns
- Values with commas are properly quoted
Step 5: Export
Download in your preferred format:
- CSV: Universal compatibility, opens in any spreadsheet app
- XLSX: Native Excel format, preserves column types better
- TSV: Tab-separated, avoids comma-escaping issues
- Parquet: Columnar format for data engineering pipelines
Each export creates a versioned output you can try different column configurations and keep multiple versions.
Handling Mixed-Type Arrays
One edge case that deserves its own section: arrays where items have different shapes.
[
{
"id": 1,
"type": "user",
"name": "Alice",
"email": "alice@example.com"
},
{
"id": 2,
"type": "organization",
"name": "Acme Corp",
"member_count": 150,
"billing_email": "billing@acme.com"
},
{
"id": 3,
"type": "user",
"name": "Bob",
"email": "bob@example.com"
}
]This array has objects with different keys depending on type. A naive converter produces:
id,type,name,email,member_count,billing_email
1,user,Alice,alice@example.com,,
2,organization,Acme Corp,,150,billing@acme.com
3,user,Bob,bob@example.com,,The union of all keys becomes the column set, with empty cells where a key doesn't exist for that row. This is usually fine for small datasets, but with 50+ unique keys across types, the CSV becomes sparse and hard to read.
Better approach: Filter by type first, then convert separately.
Using OnlineJSONFormatt's Filter & Transform feature:
- Filter where
type == "user"→ export asusers.csv - Filter where
type == "organization"→ export asorgs.csv
Two clean, focused CSVs are more useful than one sparse one.
My Decision Framework for JSON-to-CSV
After doing this conversion dozens of times, here's the mental checklist I run through:
- Does the consumer actually need CSV? If they're going to query the data, suggest Parquet or a database. If they need a spreadsheet, proceed.
- How nested is the data? One level deep → straightforward. Three+ levels → decide on a flattening strategy before starting.
- Are there variable-length arrays? If yes, decide: expand to rows, join to strings, or count. This single decision shapes the entire output.
- What's the file size? Under 15MB → browser tool. Over 50MB → streaming CLI approach.
- Is this recurring or one-off? One-off → browser tool or quick script. Recurring → invest in a proper pipeline with tests.
- Does the data contain PII? If yes → client-side tool only. Never paste customer data into a server-based converter.
Conclusion
JSON-to-CSV conversion is one of those tasks that seems trivial until you're staring at [object Object] in a spreadsheet cell or explaining to your PM why the row count tripled. The core challenge isn't the conversion itself it's the decisions about how to represent hierarchical data in a flat format.
The key takeaways:
- Choose your flattening strategy before you start: dot-notation, array expansion, or selective extraction. Each has clear trade-offs.
- Handle edge cases explicitly: commas in values, null semantics, Unicode encoding. These break spreadsheets silently.
- Match the tool to the task: browser-based for quick one-offs with sensitive data, CLI for pipelines, programmatic for complex logic.
- When CSV isn't enough: consider Parquet for large datasets or data engineering workflows. The OnlineJSONFormatt JSON to CSV tool exports to Parquet directly if you need it.