Introduction
Three things I do at least once a week that nobody taught me in school: escape JSON strings for embedding in other JSON, compare two API responses to find what changed between deployments, and make XML readable enough to debug a SOAP integration that should have been retired five years ago.
These aren't glamorous operations. No one writes blog posts about them. But they eat real time: 10 minutes here, 20 minutes there, and the wrong approach turns a 30-second task into a 15-minute detour. I've done them often enough to develop strong opinions about how to do them efficiently.
This guide covers the three operations with real code from real scenarios: a payment webhook that needed JSON embedded in a database column, an API migration where the response structure changed subtly, and a SOAP integration with a government tax service that returns XML errors as a single unreadable line.
JSON Escaping: When You Need JSON Inside JSON
The Problem: Embedding JSON in Strings
JSON escaping sounds trivial until you're staring at a string that contains a string that contains JSON that contains strings with quotes. It happens more often than you'd expect:
- Logging systems that store structured data as a text field
- Message queues (SQS, Kafka) where the message body is a JSON string containing a JSON payload
- Database columns that store JSON as text (not a native JSON type)
- API gateways that wrap inner payloads in an outer envelope as a string field
- Test fixtures where you need JSON literals inside JavaScript/TypeScript strings
The core issue: JSON delimits every key and every string value with double quotes. When you embed JSON inside a JSON string, those inner quotes need escaping, or the parser can't tell where the outer string ends.
What Escaping Actually Does (Character by Character)
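Per RFC 8259 Section 7, a JSON string must escape the quotation mark, the reverse solidus, and every control character from U+0000 through U+001F. The common control characters have two-character short forms, and the solidus may be escaped but doesn't have to be:
" → \" (quotation mark, required)
\ → \\ (reverse solidus, required)
/ → \/ (solidus, optional)
backspace (U+0008) → \b
form feed (U+000C) → \f
newline (U+000A) → \n
carriage return (U+000D) → \r
tab (U+0009) → \t
any other control character → \uXXXX (for example \u0001)
Here's what this looks like in practice. Say you have this JSON object representing a user event: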
{
"event": "purchase_completed",
"user_id": "usr_8a3f2b",
"metadata": {
"items": [{ "sku": "WIDGET-001", "qty": 2, "price": 29.99 }],
"shipping_address": "123 Main St\nApt 4B\nNew York, NY 10001"
}
}
When you need to store this as a string value inside another JSON document (say, a message queue envelope), it becomes:
{
"messageId": "msg_7721af",
"timestamp": "2026-03-15T14:22:08Z",
"body": "{\"event\":\"purchase_completed\",\"user_id\":\"usr_8a3f2b\",\"metadata\":{\"items\":[{\"sku\":\"WIDGET-001\",\"qty\":2,\"price\":29.99}],\"shipping_address\":\"123 Main St\\nApt 4B\\nNew York, NY 10001\"}}"
}
Notice that the \n in the shipping address became \\n. The inner document already spells the newline as the two-character escape \n, and embedding it in the outer string escapes that backslash again, just as every inner quote became \".
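To see the rules character by character, here is a minimal Python sketch. It assumes nothing beyond the standard json module; the sample names and the tiny envelope are made up for illustration. It prints how individual characters are written inside a JSON string, then embeds one document in another and reads it back.

```python
import json

# Each sample is a one-character Python string; json.dumps shows how that
# character is written inside a JSON string literal.
samples = {
    "quotation mark": '"',
    "reverse solidus": "\\",
    "solidus": "/",
    "newline": "\n",
    "tab": "\t",
    "control char U+0001": "\x01",
}

escaped = {name: json.dumps(char) for name, char in samples.items()}
for name, text in escaped.items():
    print(f"{name:>20}: {text}")

# Embedding: serialize the inner document first, then serialize the outer
# envelope, which escapes the inner text a second time.
inner = json.dumps({"note": "line one\nline two"})
envelope = json.dumps({"body": inner})

# Reading it back takes two parses, one per layer.
round_trip = json.loads(json.loads(envelope)["body"])
print(envelope)
print(round_trip["note"])
```

Python's encoder leaves the solidus alone, which is consistent with that escape being optional.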
Common Escaping Mistakes
Double-escaping is the most frequent issue I encounter. It happens when you serialize a string that is already serialized, so the consumer has to parse one more time than it expects:
// ❌ Wrong: double-escaping
const once = JSON.stringify(data);
const twice = JSON.stringify(once);
// twice is the text: "{\"event\":\"purchase_completed\"...}"
// ✅ Correct: escape once
const payload = JSON.stringify(data);
// payload is the text: {"event":"purchase_completed"...}
Forgetting newlines and tabs is the second most common. Raw multiline strings in your source data will break the JSON if not escaped:
# ❌ This breaks: raw newline inside a JSON string
bad_json = """{"address": "123 Main St
Apt 4B"}"""  # json.loads(bad_json) raises JSONDecodeError
# ✅ This works: newline escaped as \n in the JSON text
good_json = '{"address": "123 Main St\\nApt 4B"}'
Unescaping: Getting Your Data Back
The reverse operation is equally important. When you pull that message body from the queue and need to work with the inner JSON:
// Kafka consumer receives the message envelope
const envelope = JSON.parse(rawMessage);
// The body field is a JSON string, parse it again
const payload = JSON.parse(envelope.body);
// Now you can access nested fields normally
console.log(payload.metadata.items[0].sku); // "WIDGET-001"
import json
# Reading from PostgreSQL text column
row = cursor.fetchone()
raw_body = row['event_payload'] # This is an escaped JSON string
# Parse the escaped string back to a Python dict
payload = json.loads(raw_body)
print(payload['metadata']['shipping_address'])
# Output:
# 123 Main St
# Apt 4B
# New York, NY 10001
Real Scenario: Storing JSON Events in a PostgreSQL Text Column
Last year I worked on an event sourcing system where we stored raw webhook payloads in a PostgreSQL text column (the team chose text over jsonb because the payloads came from third parties and weren't always valid JSON, so we wanted to store them regardless for debugging).
The insert looked like this:
INSERT INTO webhook_events (source, received_at, raw_payload)
VALUES (
'stripe',
'2026-03-15T14:22:08Z',
'{"id":"evt_1R3abc","type":"payment_intent.succeeded","data":{"object":{"id":"pi_3Pxyz","amount":5998,"currency":"usd","metadata":{"order_id":"ORD-2026-4471","customer_note":"Ship to loading dock\nAttn: Warehouse B"}}}}'
);
The retrieval and parsing in Node.js:
const { rows } = await pool.query(
"SELECT raw_payload FROM webhook_events WHERE source = 'stripe' ORDER BY received_at DESC LIMIT 1"
);
// raw_payload is a string, parse it to access fields
const event = JSON.parse(rows[0].raw_payload);
if (event.type === 'payment_intent.succeeded') {
const orderId = event.data.object.metadata.order_id;
const amount = event.data.object.amount / 100; // cents to dollars
console.log(`Order ${orderId} paid: $${amount}`);
// Output: "Order ORD-2026-4471 paid: $59.98"
}
The gotcha I hit: when the customer_note field contained actual newlines from a textarea input, the naive approach of string concatenation broke the SQL. Letting JSON.stringify() do the JSON escaping and a parameterized query do the SQL quoting handled both layers automatically:
// ✅ Safe: parameterized query handles escaping
await pool.query(
'INSERT INTO webhook_events (source, received_at, raw_payload) VALUES ($1, $2, $3)',
['stripe', new Date().toISOString(), JSON.stringify(webhookBody)]
);
Escaping in Different Languages
// JavaScript / Node.js
const escaped = JSON.stringify(originalObject); // Returns a JSON string
const unescaped = JSON.parse(escaped); // Returns the original object
# Python
import json
escaped = json.dumps(original_dict) # Returns a JSON string
unescaped = json.loads(escaped) # Returns the original dict
# For embedding in another JSON string (double encoding):
double_encoded = json.dumps(json.dumps(original_dict))
# curl: sending JSON that contains JSON in a field
curl -X POST https://api.example.com/events \
-H "Content-Type: application/json" \
-d '{"source":"webhook","payload":"{\"event\":\"order.created\",\"id\":\"ord_123\"}"}'
# Using jq to escape a file's contents for embedding:
ESCAPED=$(jq -Rs '.' < payload.json)
curl -X POST https://api.example.com/events \
-H "Content-Type: application/json" \
-d "{\"source\":\"webhook\",\"payload\":$ESCAPED}"
For quick one-off escaping during debugging, a browser-based JSON escape tool saves the round-trip of writing a script. Paste the JSON, get the escaped string, paste it where you need it.
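When you don't know how many times a payload was encoded, which is the usual aftermath of the double-escaping mistake, a small helper can peel string layers until it reaches real data. This is a sketch under assumptions: unwrap_json is a hypothetical name, and the depth limit of 5 is arbitrary.

```python
import json
from typing import Any


def unwrap_json(value: Any, max_depth: int = 5) -> Any:
    """Decode JSON string layers until the value is no longer JSON text."""
    for _ in range(max_depth):
        if not isinstance(value, str):
            return value  # reached a dict, list, number, bool, or None
        try:
            value = json.loads(value)
        except json.JSONDecodeError:
            return value  # plain text, not another JSON layer
    return value


data = {"event": "order.created", "id": "ord_123"}
once = json.dumps(data)    # encoded one time
twice = json.dumps(once)   # encoded again, as in an envelope field

print(unwrap_json(once) == data)
print(unwrap_json(twice) == data)
print(unwrap_json("not json at all"))
```

One caveat of this design: a plain string that happens to be valid JSON, such as "123" or "true", is decoded into a number or boolean, so the helper suits debugging better than production parsing.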
JSON Comparison: Finding the Needle in the Haystack
Why Simple Text Diff Fails for JSON
I learned this the hard way during an API migration. We upgraded a payment provider's SDK, and the webhook payloads "looked the same" but our handler started failing silently. A git diff-style text comparison showed dozens of changes, but most were just key reordering (the new SDK serialized fields alphabetically instead of insertion-order).
The actual breaking change? A single nested field renamed from card_brand to cardBrand. Buried in 40 lines of false-positive "changes" from reordered keys.
Text diff treats these two objects as completely different:
// Version A (old SDK)
{
"amount": 5998,
"currency": "usd",
"card_brand": "visa",
"last_four": "4242"
}
// Version B (new SDK): same data, different key order
{
"card_brand": "visa",
"amount": 5998,
"last_four": "4242",
"currency": "usd"
}
A text diff shows 4 lines changed. A structural JSON diff shows 0 changes (because key order is semantically irrelevant in JSON per RFC 8259). The real change, card_brand → cardBrand, only shows up when you compare the actual version that broke things:
// Version C (the actual breaking change)
{
"cardBrand": "visa",
"amount": 5998,
"lastFour": "4242",
"currency": "usd"
}
Structural diff correctly identifies: card_brand removed, cardBrand added, last_four removed, lastFour added. Two meaningful changes, not forty noise lines.
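A structural diff is small enough to sketch in a few lines of Python. This is an illustration of the idea, not how any particular tool works: diff_json is a hypothetical helper, and it compares arrays as whole values, so a reordered array counts as a change.

```python
from typing import Any, Iterator


def diff_json(old: Any, new: Any, path: str = "") -> Iterator[str]:
    """Yield added/removed/changed entries, ignoring object key order."""
    if isinstance(old, dict) and isinstance(new, dict):
        for key in sorted(old.keys() | new.keys()):
            child = f"{path}.{key}" if path else key
            if key not in new:
                yield f"removed {child}"
            elif key not in old:
                yield f"added {child}"
            else:
                yield from diff_json(old[key], new[key], child)
    elif old != new:
        yield f"changed {path}: {old!r} -> {new!r}"


version_a = {"amount": 5998, "currency": "usd", "card_brand": "visa", "last_four": "4242"}
version_b = {"card_brand": "visa", "amount": 5998, "last_four": "4242", "currency": "usd"}
version_c = {"cardBrand": "visa", "amount": 5998, "lastFour": "4242", "currency": "usd"}

a_vs_b = list(diff_json(version_a, version_b))  # reordered keys only
a_vs_c = list(diff_json(version_a, version_c))  # the two renames

print(a_vs_b)
for line in a_vs_c:
    print(line)
```

A rename shows up as one removal plus one addition, which is exactly the signal that was buried in the text diff.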
Structural Comparison vs Textual Comparison
| Aspect | Text Diff (diff, git diff) | Structural JSON Diff |
|---|---|---|
| Key ordering | Treated as a change | Ignored (semantically irrelevant) |
| Whitespace/indentation | Treated as a change | Ignored |
| Added keys | Shows as "new lines" | Shows as "key added" with path |
| Removed keys | Shows as "deleted lines" | Shows as "key removed" with path |
| Modified values | Shows line-level change | Shows value-level change with old → new |
| Nested changes | Hard to trace (just line numbers) | Shows full path (data.object.card_brand) |
| Array reordering | Shows every element as changed | Can detect moved elements (tool-dependent) |
Comparing API Responses Across Versions
Here's a real scenario from a Stripe API version migration (2024-06-20 → 2025-04-16). The charge object changed subtly:
// API version 2024-06-20
{
"id": "ch_3Pxyz",
"object": "charge",
"amount": 5998,
"billing_details": {
"address": {
"city": "San Francisco",
"country": "US",
"line1": "123 Market St",
"line2": null,
"postal_code": "94105",
"state": "CA"
},
"email": "customer@example.com",
"name": "Jane Smith",
"phone": null
},
"payment_method_details": {
"card": {
"brand": "visa",
"exp_month": 12,
"exp_year": 2027,
"last4": "4242"
},
"type": "card"
},
"status": "succeeded"
}
// API version 2025-04-16: spot the differences
{
"id": "ch_3Pxyz",
"object": "charge",
"amount": 5998,
"billing_details": {
"address": {
"city": "San Francisco",
"country": "US",
"line1": "123 Market St",
"line2": null,
"postal_code": "94105",
"state": "CA"
},
"email": "customer@example.com",
"name": "Jane Smith",
"phone": null
},
"payment_method_details": {
"card": {
"brand": "visa",
"exp_month": 12,
"exp_year": 2027,
"last4": "4242",
"network": "visa",
"funding": "credit"
},
"type": "card"
},
"status": "succeeded",
"payment_intent": "pi_3Pabc"
}
A structural diff immediately surfaces:
- Added: payment_method_details.card.network = "visa"
- Added: payment_method_details.card.funding = "credit"
- Added: payment_intent = "pi_3Pabc"
If your code destructures payment_method_details.card and passes it to a function that doesn't expect extra fields, or if you're storing the raw object in a column with a strict schema, these additions matter.
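One defensive pattern against added fields is to pick only the fields you know before passing the object on, and to log whatever else arrived. A minimal sketch, assuming a hypothetical CARD_FIELDS allow-list that mirrors the older response:

```python
from typing import Any, Dict, Iterable, List, Mapping

# Hypothetical allow-list: the card fields our storage schema knows about.
CARD_FIELDS = ("brand", "exp_month", "exp_year", "last4")


def pick_known(card: Mapping[str, Any], allowed: Iterable[str] = CARD_FIELDS) -> Dict[str, Any]:
    """Keep only the allowed fields, silently skipping ones that are absent."""
    return {key: card[key] for key in allowed if key in card}


def unexpected_fields(card: Mapping[str, Any], allowed: Iterable[str] = CARD_FIELDS) -> List[str]:
    """List fields the sender added that we do not handle yet."""
    return sorted(set(card) - set(allowed))


new_card = {
    "brand": "visa",
    "exp_month": 12,
    "exp_year": 2027,
    "last4": "4242",
    "network": "visa",
    "funding": "credit",
}

safe_card = pick_known(new_card)
extras = unexpected_fields(new_card)
print(safe_card)
print("unhandled fields:", extras)
```

Logging the extras instead of dropping them quietly means the next schema change shows up in your logs before it shows up as a bug.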
Real Scenario: Debugging Why a Webhook Stopped Working
A teammate pinged me: "The order fulfillment webhook hasn't fired in 2 hours." The webhook endpoint was returning 200 (so the sender thought delivery succeeded), but our handler was silently dropping events.
My debugging process:
- Grabbed the last successful payload from our logs (2 hours ago)
- Grabbed the latest payload from the sender's retry queue
- Pasted both into a JSON comparison tool
The diff showed one change: the sender had wrapped their payload in a new envelope structure:
// Old format (what our handler expected)
{"event": "order.shipped", "order_id": "ORD-4471", "tracking": "1Z999AA10123456784"}
// New format (what they started sending)
{"version": "2.0", "data": {"event": "order.shipped", "order_id": "ORD-4471", "tracking": "1Z999AA10123456784"}}
Our handler was doing req.body.event, which now returned undefined because event moved to req.body.data.event. A 30-second diff saved what could have been an hour of log-diving.
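The underlying failure was a handler that dropped unknown shapes silently. Here is a sketch of the opposite behavior in Python, where extract_event is a hypothetical helper (the real handler in this story was Node.js): accept both formats, and raise loudly on anything else so the endpoint stops returning 200 for events it cannot process.

```python
from typing import Any, Dict


def extract_event(body: Dict[str, Any]) -> Dict[str, Any]:
    """Return the event payload from either the flat or the enveloped format."""
    if "event" in body:
        return body  # old flat format
    data = body.get("data")
    if isinstance(data, dict) and "event" in data:
        return data  # new envelope: {"version": ..., "data": {...}}
    raise ValueError(f"unrecognized webhook shape, top-level keys: {sorted(body)}")


old_format = {"event": "order.shipped", "order_id": "ORD-4471", "tracking": "1Z999AA10123456784"}
new_format = {"version": "2.0", "data": dict(old_format)}

print(extract_event(old_format)["event"])
print(extract_event(new_format)["order_id"])

try:
    extract_event({"version": "3.0", "payload": {}})
except ValueError as exc:
    print("rejected:", exc)
```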
Programmatic JSON Comparison
For CI/CD pipelines and automated testing, you can compare JSON programmatically:
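// Node.js: deep comparison ignoring key order
const { diff } = require('deep-diff');
const expected = require('./fixtures/expected-response.json');
// await is not allowed at the top level of a CommonJS module, so wrap it
async function main() {
  // Global fetch is available in Node.js 18+
  const actual = await fetch('https://api.example.com/orders/123').then(r => r.json());
  const differences = diff(expected, actual);
  if (differences) {
    console.error('API response changed:');
    differences.forEach(d => {
      // d.path is undefined when the root value itself changed
      const path = (d.path || []).join('.');
      console.error(`  ${d.kind}: ${path} | was: ${JSON.stringify(d.lhs)}, now: ${JSON.stringify(d.rhs)}`);
    });
    process.exit(1);
  }
}
main().catch(err => {
  console.error(err);
  process.exit(1);
});
# Python: using deepdiff for structural comparison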
from deepdiff import DeepDiff
import json
with open('expected.json') as f:
expected = json.load(f)
with open('actual.json') as f:
actual = json.load(f)
diff = DeepDiff(expected, actual, ignore_order=True)
if diff:
print("Changes detected:")
for change_type, changes in diff.items():
    print(f" {change_type}: {changes}")
# jq: quick command-line comparison (sorts keys first)
diff <(jq -S '.' expected.json) <(jq -S '.' actual.json)
For ad-hoc debugging during incidents, I reach for a browser-based structural diff tool rather than writing scripts. The visual color-coding (green for additions, red for removals, amber for modifications) makes changes immediately scannable.
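The jq -S trick has a close analogue in Python: serialize with sorted keys and compare the text, or a hash of it when you only need a yes/no answer. A minimal sketch, with canonical and fingerprint as hypothetical helper names:

```python
import hashlib
import json
from typing import Any


def canonical(value: Any) -> str:
    """Serialize with sorted keys and no optional whitespace."""
    return json.dumps(value, sort_keys=True, separators=(",", ":"), ensure_ascii=False)


def fingerprint(value: Any) -> str:
    """Stable hash of the canonical form, handy for logs and cache keys."""
    return hashlib.sha256(canonical(value).encode("utf-8")).hexdigest()


old = {"amount": 5998, "currency": "usd", "card": {"brand": "visa", "last4": "4242"}}
reordered = {"card": {"last4": "4242", "brand": "visa"}, "currency": "usd", "amount": 5998}
changed = {"amount": 5998, "currency": "eur", "card": {"brand": "visa", "last4": "4242"}}

print(canonical(old))
print(fingerprint(old) == fingerprint(reordered))
print(fingerprint(old) == fingerprint(changed))
```

Like the jq version, this only normalizes key order. Arrays stay order-sensitive, and 1 and 1.0 serialize differently, so it answers "are these identical?" rather than "what changed?".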
XML Formatting: Yes, XML Is Still Everywhere
Where XML Still Lives in 2026
I hear "XML is dead" regularly. Then I spend half a day debugging one of these:
- SOAP APIs: Government services, banking integrations, enterprise ERP systems. The IRS e-file system, SWIFT's ISO 20022 payment messages, and SAP interfaces all run on XML.
- RSS/Atom feeds: Every podcast, every news site, every blog with syndication.
- SVG files: Every icon library, every data visualization, every vector graphic on the web.
- Android layouts: View-based Android UIs are defined in XML (activity_main.xml).
- Maven/Gradle configs: pom.xml, plus the POM files Gradle generates when it publishes to a Maven repository.
- SAML authentication: Enterprise SSO flows pass XML assertions between identity providers.
- Office documents: DOCX, XLSX, PPTX are all ZIP files containing XML.
XML isn't dead. It's just not trendy. And when you need to debug it, you need it formatted.
Why XML Formatting Is Harder Than JSON Formatting
JSON has two structural patterns: objects (key-value pairs) and arrays. XML has:
- Elements with opening and closing tags
- Attributes on elements (which have their own quoting rules)
- Namespaces with prefixes (<soap:Envelope xmlns:soap="...">)
- CDATA sections (raw text that shouldn't be parsed)
- Processing instructions and the XML declaration (<?xml version="1.0"?>)
- Mixed content (text interspersed with child elements)
- Self-closing tags (<br/> vs <br></br>)
- Comments (<!-- ... -->)
A formatter needs to handle all of these while preserving semantics. Indenting a CDATA section's content would change its meaning, and so would inserting line breaks into mixed content, where whitespace is part of the text.
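The mixed-content hazard is easy to reproduce. A minimal sketch with Python's standard library (the sample fragment is made up for illustration): minidom's toprettyxml indents every child node of a mixed-content element, text nodes included, so the text a parser sees afterwards is no longer the same.

```python
import xml.dom.minidom
import xml.etree.ElementTree as ET

# Mixed content: text and a child element side by side inside <p>.
source = "<p>Hello <b>world</b>!</p>"


def text_of(xml_text: str) -> str:
    """Concatenate all text a parser sees inside the root element."""
    return "".join(ET.fromstring(xml_text).itertext())


pretty = xml.dom.minidom.parseString(source).toprettyxml(indent="  ")

original_text = text_of(source)
pretty_text = text_of(pretty)

print(pretty)
print(repr(original_text))
print(repr(pretty_text))
print("text preserved:", original_text == pretty_text)
```

For data-style XML such as SOAP faults this rarely matters, because text lives in leaf elements. For document-style XML (XHTML, DocBook), prefer a formatter that leaves mixed content alone.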
Real Scenario: Debugging a SOAP Fault Response
I was integrating with a government tax filing API (SOAP-based, naturally). When a submission failed, the API returned this as a single line in our error logs:
<?xml version="1.0" encoding="UTF-8"?><soap:Envelope xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"><soap:Body><soap:Fault><faultcode>soap:Client</faultcode><faultstring>Validation failed</faultstring><detail><ValidationErrors xmlns="urn:tax-filing:errors:v2"><Error code="E4401" field="taxpayer.ssn" severity="fatal"><Message>SSN format invalid: expected XXX-XX-XXXX, received 9-digit number without dashes</Message><Context><SubmissionId>SUB-2026-88431</SubmissionId><Timestamp>2026-03-15T09:41:22Z</Timestamp></Context></Error><Error code="E4205" field="income.w2[2].employer_ein" severity="warning"><Message>EIN checksum validation failed for employer index 2</Message><Context><SubmissionId>SUB-2026-88431</SubmissionId><Timestamp>2026-03-15T09:41:22Z</Timestamp></Context></Error></ValidationErrors></detail></soap:Fault></soap:Body></soap:Envelope>
Completely unreadable. After formatting (using an XML formatter or xmllint --format):
<?xml version="1.0" encoding="UTF-8"?>
<soap:Envelope xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/"
               xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
  <soap:Body>
    <soap:Fault>
      <faultcode>soap:Client</faultcode>
      <faultstring>Validation failed</faultstring>
      <detail>
        <ValidationErrors xmlns="urn:tax-filing:errors:v2">
          <Error code="E4401" field="taxpayer.ssn" severity="fatal">
            <Message>SSN format invalid: expected XXX-XX-XXXX, received 9-digit number without dashes</Message>
            <Context>
              <SubmissionId>SUB-2026-88431</SubmissionId>
              <Timestamp>2026-03-15T09:41:22Z</Timestamp>
            </Context>
          </Error>
          <Error code="E4205" field="income.w2[2].employer_ein" severity="warning">
            <Message>EIN checksum validation failed for employer index 2</Message>
            <Context>
              <SubmissionId>SUB-2026-88431</SubmissionId>
              <Timestamp>2026-03-15T09:41:22Z</Timestamp>
            </Context>
          </Error>
        </ValidationErrors>
      </detail>
    </soap:Fault>
  </soap:Body>
</soap:Envelope>
Now I can immediately see: two validation errors, one fatal (SSN format), one warning (EIN checksum). The fix was adding dashes to the SSN field before submission. Without formatting, I'd have been squinting at that single line for minutes.
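Once you know the structure, the same errors can be pulled out programmatically, for example to turn a fault into a readable log line. A minimal sketch with Python's ElementTree, assuming a hypothetical validation_errors helper and a shortened copy of the fault above:

```python
import xml.etree.ElementTree as ET
from typing import Dict, List

# Prefix -> namespace URI; "err" is our own label for the default namespace
# declared on <ValidationErrors>.
NS = {"err": "urn:tax-filing:errors:v2"}

SAMPLE_FAULT = """
<soap:Envelope xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/">
  <soap:Body>
    <soap:Fault>
      <faultcode>soap:Client</faultcode>
      <faultstring>Validation failed</faultstring>
      <detail>
        <ValidationErrors xmlns="urn:tax-filing:errors:v2">
          <Error code="E4401" field="taxpayer.ssn" severity="fatal">
            <Message>SSN format invalid</Message>
          </Error>
          <Error code="E4205" field="income.w2[2].employer_ein" severity="warning">
            <Message>EIN checksum validation failed for employer index 2</Message>
          </Error>
        </ValidationErrors>
      </detail>
    </soap:Fault>
  </soap:Body>
</soap:Envelope>
""".strip()


def validation_errors(fault_xml: str) -> List[Dict[str, str]]:
    """Collect code, field, severity, and message for every <Error>."""
    root = ET.fromstring(fault_xml)
    found = []
    for error in root.iterfind(".//err:Error", NS):
        found.append({
            "code": error.get("code", ""),
            "field": error.get("field", ""),
            "severity": error.get("severity", ""),
            "message": error.findtext("err:Message", default="", namespaces=NS).strip(),
        })
    return found


errors = validation_errors(SAMPLE_FAULT)
for item in errors:
    print(f"[{item['severity']}] {item['code']} {item['field']}: {item['message']}")
```

The namespace map is the part that usually trips people up: the Error elements inherit the default namespace from ValidationErrors, so a bare ".//Error" search finds nothing.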
Real Scenario: Debugging an RSS Feed That Won't Validate
A content team reported their podcast RSS feed was rejected by Apple Podcasts. The feed validator said "invalid XML" with no useful line number. The feed was generated by a CMS and looked fine at a glance. After formatting and careful inspection:
<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:itunes="http://www.itunes.com/dtds/podcast-1.0.dtd">
<channel>
<title>The Developer Podcast</title>
<description>Weekly conversations about software engineering & architecture</description>
<!-- ❌ Problem: unescaped ampersand in description ↑ -->
<itunes:author>Dev Team</itunes:author>
<item>
<title>Episode 42: Scaling PostgreSQL - Lessons & Failures</title>
<!-- ❌ Problem: unescaped ampersand ↑ -->
<enclosure url="https://cdn.example.com/ep42.mp3" length="48000000" type="audio/mpeg"/>
</item>
</channel>
</rss>
The & characters in "engineering & architecture" and "Lessons & Failures" needed to be written as &amp;. In JSON, ampersands are fine. In XML, a bare & starts an entity reference, so the parser rejects it. The fix:
<description>Weekly conversations about software engineering &amp; architecture</description>
<!-- ... -->
<title>Episode 42: Scaling PostgreSQL - Lessons &amp; Failures</title>
Real Scenario: Reading an SVG Path for Debugging
SVG files are XML, and when a designer hands you an icon that "doesn't render right," you often need to inspect the path data. Unformatted SVG from an export tool:
<svg xmlns="http://www.w3.org/2000/svg" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2"><path d="M12 2L2 7l10 5 10-5-10-5z"/><path d="M2 17l10 5 10-5"/><path d="M2 12l10 5 10-5"/></svg>
Formatted:
<svg xmlns="http://www.w3.org/2000/svg"
     viewBox="0 0 24 24"
     fill="none"
     stroke="currentColor"
     stroke-width="2">
  <path d="M12 2L2 7l10 5 10-5-10-5z"/>
  <path d="M2 17l10 5 10-5"/>
  <path d="M2 12l10 5 10-5"/>
</svg>
Now you can see it's three separate paths (a layered icon), each attribute is readable, and you can identify which path corresponds to which visual layer.
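When an icon set has dozens of files, the same inspection can be scripted. A minimal sketch with Python's ElementTree that lists each path's d attribute and whether it is closed; the "closed" check simply looks for a trailing z command, which is an assumption that holds for simple icons like this one.

```python
import xml.etree.ElementTree as ET

SVG_NS = {"svg": "http://www.w3.org/2000/svg"}

svg_text = (
    '<svg xmlns="http://www.w3.org/2000/svg" viewBox="0 0 24 24" fill="none" '
    'stroke="currentColor" stroke-width="2">'
    '<path d="M12 2L2 7l10 5 10-5-10-5z"/>'
    '<path d="M2 17l10 5 10-5"/>'
    '<path d="M2 12l10 5 10-5"/>'
    "</svg>"
)

root = ET.fromstring(svg_text)

# SVG elements live in the SVG namespace, so the search needs the prefix map.
paths = [node.get("d", "") for node in root.iterfind(".//svg:path", SVG_NS)]

for index, d in enumerate(paths, start=1):
    closed = d.rstrip().lower().endswith("z")
    print(f"path {index}: {'closed' if closed else 'open'} -> {d}")
```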
Command-Line XML Formatting
# xmllint (comes with libxml2, pre-installed on macOS)
xmllint --format response.xml
# Python (built-in, no install needed)
python3 -c "import xml.dom.minidom, sys; print(xml.dom.minidom.parseString(sys.stdin.read()).toprettyxml(indent=' '))" < response.xml
# xmlstarlet (installable via brew/apt)
xmlstarlet fo response.xml
# tidy (HTML/XML formatter)
tidy -xml -i -q response.xml
For quick formatting during debugging sessions, especially when you're copying XML from log output or a network inspector, a browser-based XML formatter is the fastest path from "unreadable wall of text" to "I can see the structure."
The Workflow That Ties It All Together
These three operations rarely happen in isolation. Here are the workflows I use most frequently:
Workflow 1: API Debugging (Format → Compare → Identify → Fix)
- Capture the failing request/response (from logs, curl, or network inspector)
- Format the JSON to make it readable
- Compare with the last known working payload
- Identify the structural change (added field, renamed key, changed type)
- Fix the handler code to accommodate the change
- Verify by comparing the new output with expected
Workflow 2: Data Migration (Convert → Validate → Compare → Deploy)
- Convert source data to target format (JSON → XML for a SOAP endpoint, or XML → JSON for a REST migration)
- Validate the converted output against the target schema
- Compare a sample of converted records with manually verified examples
- Deploy the migration script with confidence
Workflow 3: Log Analysis (Unescape → Format → Search → Understand)
- Unescape the JSON payload from the log line (it's usually escaped because it's embedded in a structured log JSON)
- Format the unescaped JSON for readability
- Search for the relevant field or value
- Understand the state that caused the error
A Concrete Example of Workflow 3
Here's an actual log line from a Node.js application using structured logging (Winston + JSON format):
{
"level": "error",
"message": "Payment processing failed",
"timestamp": "2026-03-15T14:22:08.441Z",
"service": "checkout",
"requestId": "req_7a2f",
"payload": "{\"order_id\":\"ORD-4471\",\"amount\":5998,\"currency\":\"usd\",\"customer\":{\"id\":\"cus_8b3e\",\"email\":\"jane@example.com\"},\"payment_method\":{\"type\":\"card\",\"card\":{\"brand\":\"visa\",\"last4\":\"4242\",\"exp_month\":12,\"exp_year\":2027}},\"shipping\":{\"address\":{\"line1\":\"123 Market St\",\"city\":\"San Francisco\",\"state\":\"CA\",\"postal_code\":\"94105\"}}}",
"error": "Card declined: insufficient funds"
}
Step 1: Extract the payload field value (it's an escaped JSON string). Step 2: Unescape it to get the actual JSON object. Step 3: Format it:
{
  "order_id": "ORD-4471",
  "amount": 5998,
  "currency": "usd",
  "customer": {
    "id": "cus_8b3e",
    "email": "jane@example.com"
  },
  "payment_method": {
    "type": "card",
    "card": {
      "brand": "visa",
      "last4": "4242",
      "exp_month": 12,
      "exp_year": 2027
    }
  },
  "shipping": {
    "address": {
      "line1": "123 Market St",
      "city": "San Francisco",
      "state": "CA",
      "postal_code": "94105"
    }
  }
}
Now I can see the full context of the failed payment: customer, card, shipping address, amount, and correlate with the error message. Without unescaping and formatting, that log line is a wall of backslashes.
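The three steps map directly onto three calls in Python. A minimal sketch, using a shortened version of the log line above; the raw string literal keeps the backslashes exactly as they would appear in the log file.

```python
import json

# One structured log line, shortened. The r'' prefix keeps every backslash.
log_line = (
    r'{"level":"error","message":"Payment processing failed","requestId":"req_7a2f",'
    r'"payload":"{\"order_id\":\"ORD-4471\",\"amount\":5998,\"currency\":\"usd\"}",'
    r'"error":"Card declined: insufficient funds"}'
)

entry = json.loads(log_line)               # Step 1: parse the log line, extract the field
payload = json.loads(entry["payload"])     # Step 2: unescape the embedded JSON
formatted = json.dumps(payload, indent=2)  # Step 3: format for reading

print(entry["error"])
print(formatted)
```

The first parse removes the outer layer of escaping for free, which is why hand-editing backslashes out of a log line is almost never necessary.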
When to Use Each Tool
Here's the decision matrix I keep in my head:
| Scenario | Tool | Why |
|---|---|---|
| JSON payload stored as string in DB/logs | JSON Escape/Unescape | Extract the inner JSON for inspection |
| Embedding JSON in a curl command or test fixture | JSON Escape | Properly escape quotes and special characters |
| API response changed and handler broke | JSON Compare | Find structural differences ignoring key order |
| Config file changed and deployment failed | JSON Compare | Identify what was added/removed/modified |
| SOAP API returning errors as single-line XML | XML Formatter | Make the fault response readable |
| Debugging SVG rendering issues | XML Formatter | Inspect path data and attributes clearly |
| RSS/Atom feed rejected by validator | XML Formatter | Find unescaped entities and structural issues |
| Comparing two XML configs (Maven, Android layouts) | XML Compare | Structural diff that understands XML semantics |
| Migrating SOAP API to REST | XML Formatter + JSON Compare | Format the XML source, compare with JSON target |
| Debugging message queue payloads | JSON Unescape + Format | Unwrap the envelope, format the inner payload |
Quick Reference: Problem → Action
"I have JSON inside a string and need to read it" → Unescape, then format
"I have a JSON object and need to put it inside a string" → Escape (use JSON.stringify() programmatically, or a browser tool for one-offs)
"Two JSON objects look the same but behave differently" → Structural comparison (ignores key order and whitespace)
"XML error response is unreadable" → XML formatter (handles namespaces, attributes, CDATA)
"Need to find what changed between API versions" → JSON compare with the old and new response side by side
Conclusion
JSON escaping, structural comparison, and XML formatting aren't complex operations individually. But they're the kind of tasks that interrupt your flow: you're deep in debugging, and suddenly you need to unescape a payload, or format an XML response, or figure out what changed between two API versions.
The key insight from doing these hundreds of times: use structural tools, not text tools. diff doesn't understand JSON semantics. String replacement doesn't handle all escape edge cases. And reading unformatted XML is a waste of your cognitive bandwidth.
Whether you use command-line tools (jq, xmllint), language-specific libraries (deep-diff, deepdiff), or browser-based tools for quick one-offs, the important thing is having a reliable workflow that doesn't break your concentration when these tasks come up mid-debugging.
For a deeper dive into formatting conventions and team standards, check out our guide on JSON formatting best practices.