How to Convert Email to Structured JSON

Turn messy MIME email into a clean, consistent JSON object. JsonHook normalizes every field — headers, body, attachments — into the same predictable schema on every delivery.

Table of Contents
  1. Overview
  2. Prerequisites
  3. Step-by-Step Instructions
  4. Code Example
  5. Common Pitfalls

Overview

Raw email is MIME — a text format with headers, body parts, and attachments encoded using a mix of quoted-printable, base64, and raw text, separated by boundary markers. Converting this to a useful JSON structure is non-trivial: you need to handle folded headers, decode character sets, walk nested multipart trees, and normalize inconsistencies introduced by different mail clients.

JsonHook performs this conversion automatically. The output is always the same JSON schema:

{
  "email": {
    "from": "string",
    "to": ["string"],
    "cc": ["string"],
    "replyTo": "string | null",
    "subject": "string",
    "date": "ISO8601 string",
    "messageId": "string",
    "textBody": "string | null",
    "htmlBody": "string | null",
    "headers": { "header-name": "value" },
    "attachments": [
      {
        "filename": "string",
        "contentType": "string",
        "size": "number",
        "contentId": "string | null"
      }
    ]
  },
  "deliveryId": "string",
  "addressId": "string",
  "receivedAt": "ISO8601 string"
}

This schema is consistent regardless of whether the original email was a simple one-line plain text message or a heavily formatted HTML marketing email with ten attachments.

Prerequisites

No special prerequisites beyond a JsonHook account and a webhook endpoint. If you want to understand the full schema in advance:

  • Review the payload schema documentation
  • Send a few test emails of different types (plain text, HTML, with attachments) to a JsonHook address pointed at webhook.site to see the actual output

Understanding which fields may be null is important before writing handler code. The fields that can be null are: replyTo, textBody, htmlBody, and attachment.contentId.

Get Structured JSON from Every Email

Consistent schema. Zero MIME parsing. Free up to 100 emails/month.

Get Free API Key

Step-by-Step Instructions

To convert your inbound emails to structured JSON:

  1. Create a JsonHook inbound address via the API or dashboard, pointing at your webhook endpoint.
  2. Send a variety of test emails to the address — plain text, HTML, with attachments, with CCs, with custom headers.
  3. Inspect each payload in your delivery log or webhook inspector to understand how different email types map to the schema.
  4. Write a typed interface or schema for the payload in your language of choice (TypeScript interface, Python dataclass, Go struct) to get compile-time safety when accessing fields.
  5. Handle nullable fields gracefully throughout your handler code.
  6. Store or forward the JSON to your database, data warehouse, or downstream services.

Code Example

TypeScript interfaces for the full JsonHook payload, plus a typed handler:

interface JsonHookAttachment {
  filename: string;
  contentType: string;
  size: number;
  contentId: string | null;
}

interface JsonHookEmail {
  from: string;
  to: string[];
  cc: string[];
  replyTo: string | null;
  subject: string;
  date: string;
  messageId: string;
  textBody: string | null;
  htmlBody: string | null;
  headers: Record<string, string>;
  attachments: JsonHookAttachment[];
}

interface JsonHookPayload {
  email: JsonHookEmail;
  deliveryId: string;
  addressId: string;
  receivedAt: string;
}

// Handler
app.post("/webhooks/email", express.raw({ type: "application/json" }),
  (req, res) => {
    // ... verify signature ...
    const payload: JsonHookPayload = JSON.parse(req.body.toString());

    const { email } = payload;
    const body = email.textBody ?? email.htmlBody ?? "";
    const senderName = email.from.split("<")[0].trim();

    console.log(`${senderName}: ${email.subject} (${email.attachments.length} attachments)`);
    res.sendStatus(200);
  }
);

Common Pitfalls

When working with the structured JSON output:

  • Treating to as a string instead of an array. The to and cc fields are always arrays, even if there is only one recipient. Access email.to[0], not email.to.
  • Not handling the case where both textBody and htmlBody are null. Some non-standard emails have no body. Always guard against accessing a null string.
  • Assuming the from field is a bare email address. It may be in "Display Name <[email protected]>" format. Parse it appropriately if you need just the address part.
  • Assuming attachment content is in the payload. Attachment metadata is included; downloading the actual binary content requires a separate API call.
  • Not storing the full payload. Even if you only need a few fields today, store the full JSON payload in your database. You will likely want additional fields later and retroactive access is impossible without storage.

Frequently Asked Questions

Is the JSON schema versioned? Will it change?

Yes. JsonHook uses semantic versioning for its API and payload schema. Breaking changes are communicated with version bumps and a deprecation period. The API version is included in the base URL (/v1/) so you can pin to a specific version. Subscribe to the changelog at jsonhook.com/changelog to stay informed.

Does JsonHook decode base64-encoded email bodies?

Yes. JsonHook decodes all transfer encodings including base64 and quoted-printable, and normalizes the output to UTF-8. The textBody and htmlBody fields always contain decoded, UTF-8 strings — never raw encoded bytes.

How are multiple From addresses handled?

The from field contains the primary sender as a string. If a message has multiple From addresses (rare), the first is used. The full raw header is available in email.headers.from if you need the complete value.

Can I get the raw MIME message instead of the parsed JSON?

Raw MIME access is available on the Pro plan via the delivery log API. You can retrieve the original MIME message for any delivery within the retention period. For most use cases the parsed JSON is sufficient, but raw access is available for compliance or archival purposes.