How to Set Up an Email-to-API Pipeline

Build a complete pipeline from email inbox to API data in four layers: SMTP receipt, MIME parsing, webhook delivery, and API processing. JsonHook handles the first three.

Table of Contents
  1. Overview
  2. Prerequisites
  3. Step-by-Step Instructions
  4. Code Example
  5. Common Pitfalls
  6. Frequently Asked Questions

Overview

An email-to-API pipeline transforms inbound email messages into structured API calls or database records through a sequence of processing stages. This pattern is useful anywhere that email is a data input channel: order management systems, CRM lead ingestion, support ticket systems, financial data extraction from invoice emails, and more.

A well-designed pipeline has four distinct layers:

  1. Email reception: Accept SMTP connections, receive the raw MIME message (handled by JsonHook)
  2. Email parsing: Convert MIME to structured JSON (handled by JsonHook)
  3. Webhook delivery: POST the JSON to your API endpoint with HMAC signature (handled by JsonHook)
  4. API processing: Validate, extract, store, and act on the data (handled by your application)

With JsonHook, layers 1–3 are handled for you as a managed service. Your only responsibility is layer 4.
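The JSON your endpoint receives in layer 4 carries the parsed message plus a delivery identifier. A rough sketch of that shape in TypeScript — field names follow the code example later in this article, but the exact schema is JsonHook's to define, so treat this as an assumption to verify against the docs:

```typescript
// Illustrative shape of a JsonHook webhook payload (an assumption,
// not the official schema -- check the JsonHook documentation).
interface JsonHookDelivery {
  deliveryId: string;            // unique per delivery; used for idempotency
  email: {
    from: string;                // e.g. 'Jane Doe <jane@example.com>'
    subject: string;
    textBody: string;            // plain-text part of the MIME message
    date?: string;               // ISO 8601 receipt timestamp
  };
}

// A sample payload of that shape:
const sample: JsonHookDelivery = {
  deliveryId: "d_123",
  email: {
    from: "Jane Doe <jane@example.com>",
    subject: "New lead",
    textBody: "Please call me at +1 555 010 0000.",
  },
};
console.log(sample.email.from);
```

Typing the payload up front makes layer 4 easier to test: extraction functions can take a `JsonHookDelivery` instead of `any`.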

Prerequisites

Before building the pipeline:

  • A JsonHook account with an API key
  • A backend API capable of receiving HTTPS POST requests
  • A database to store processed email data
  • A clear definition of what data you need to extract and what actions the pipeline should trigger
  • Optionally: a message queue for async processing (Redis + BullMQ, SQS, etc.)

Step-by-Step Instructions

Build the pipeline from scratch:

  1. Design your data model. Define the database schema for extracted email data. For example, for a lead capture pipeline: leads(id, email, name, phone, source_email_id, created_at).
  2. Create a JsonHook inbound address pointing at your API endpoint.
  3. Implement your API webhook endpoint with signature verification and idempotency.
  4. Write extraction logic to pull the relevant fields from the email payload.
  5. Store extracted data in your database.
  6. Trigger downstream actions: notify your team via Slack, create records in downstream APIs, send confirmation emails, etc.
  7. Add monitoring — query the JsonHook delivery log API for failures.
  8. Route email sources to your JsonHook address (update form targets, forwarding rules, etc.).

Code Example

Complete email-to-API pipeline in Express + PostgreSQL:

import express from "express";
import crypto from "crypto";
import { Pool } from "pg";

const app = express();
const db = new Pool({ connectionString: process.env.DATABASE_URL });
app.use(express.raw({ type: "application/json" }));

async function extractLeadData(email: any) {
  const text = email.textBody ?? "";
  return {
    from_email:   email.from.match(/[\w.+%-]+@[\w.-]+/)?.[0] ?? email.from,
    from_name:    email.from.match(/^([^<]+)</)?.[1]?.trim() ?? null,
    subject:      email.subject ?? "",
    body_preview: text.slice(0, 200),
    phone:        text.match(/\+?[\d\s().-]{7,}/)?.[0]?.trim() ?? null,
    received_at:  email.date ?? new Date().toISOString(),
  };
}

app.post("/webhook", async (req, res) => {
  const sig = req.headers["x-jsonhook-signature"] as string;
  const expected = crypto
    .createHmac("sha256", process.env.JSONHOOK_SECRET!)
    .update(req.body).digest("hex");
  if (sig !== expected) return res.sendStatus(401);

  const { email, deliveryId } = JSON.parse(req.body.toString());

  // Idempotency
  const exists = await db.query(
    "SELECT 1 FROM processed_deliveries WHERE delivery_id = $1", [deliveryId]
  );
  if (exists.rows.length) return res.sendStatus(200);

  // Extract and store
  const lead = await extractLeadData(email);
  await db.query(
    `INSERT INTO leads (from_email, from_name, subject, body_preview, phone, received_at)
     VALUES ($1,$2,$3,$4,$5,$6)`,
    [lead.from_email, lead.from_name, lead.subject, lead.body_preview, lead.phone, lead.received_at]
  );
  await db.query(
    "INSERT INTO processed_deliveries (delivery_id) VALUES ($1)", [deliveryId]
  );

  res.sendStatus(200);
});

app.listen(3000);

Common Pitfalls

Pipeline design pitfalls:

  • Monolithic webhook handler. Keep your webhook handler thin — it should verify the signature, push to a queue, and return 200. All extraction, validation, and storage should happen in a separate worker process.
  • No schema migration plan. As your pipeline evolves, the data you extract from emails changes. Use a database migration tool (Flyway, Alembic, Knex migrations) and treat schema changes with the same rigor as API contract changes.
  • Not logging raw payloads during initial deployment. When first deploying to production, log the full JSON payload for the first 100 deliveries. Real-world email surprises abound — raw payload logging helps you catch edge cases early.
  • Tight coupling between extraction and storage. Write extraction logic as a pure function (no side effects, takes payload, returns data object) so it is easily unit-testable and replaceable when email formats change.
  • No alerting on extraction failures. When required fields are not found in an email, alert immediately rather than storing incomplete records. An extraction failure is a signal that the email format has changed and the pipeline needs updating.
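The first pitfall is easiest to avoid by factoring signature verification into a pure helper and keeping the handler to three steps: verify, enqueue, return 200. A minimal sketch — the queue name and the BullMQ-style worker in the comments are illustrative assumptions, not a prescribed setup:

```typescript
import crypto from "crypto";

// Pure helper: timing-safe HMAC check, unit-testable without a server.
function verifySignature(rawBody: Buffer, signature: string, secret: string): boolean {
  const expected = crypto.createHmac("sha256", secret).update(rawBody).digest("hex");
  const a = Buffer.from(signature);
  const b = Buffer.from(expected);
  // timingSafeEqual throws on length mismatch, so guard first
  return a.length === b.length && crypto.timingSafeEqual(a, b);
}

// Thin handler sketch ("inbound-email" queue name is an assumption):
//   if (!verifySignature(req.body, sig, process.env.JSONHOOK_SECRET!))
//     return res.sendStatus(401);
//   await emailQueue.add("process", JSON.parse(req.body.toString()));
//   res.sendStatus(200);
//
// A separate Worker("inbound-email", ...) process does the extraction,
// database writes, and downstream API calls.
```

Note the `timingSafeEqual` comparison: a plain `!==` on the signature string leaks timing information, so the constant-time variant is preferable in the handler as well.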

Frequently Asked Questions

Should I process emails synchronously or asynchronously in my pipeline?

Always asynchronously for any non-trivial processing. Your webhook handler should accept the delivery, push a job to a queue (BullMQ, SQS, RabbitMQ), and return 200 immediately. The actual extraction, database writes, and downstream API calls happen in worker processes. This prevents webhook timeouts and keeps your pipeline scalable.

How do I handle schema changes when email formats change?

Treat email format changes like API version changes. Write versioned extraction functions, and route to the appropriate version based on sender, date range, or a custom header. Store the raw JSON payload alongside extracted fields so you can re-run extraction against historical emails when the schema changes.
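The versioned-extraction pattern can be sketched as a router over pure functions. The cutoff date and the "labeled fields" v2 format below are hypothetical examples, not a real sender's format:

```typescript
type Lead = { from_email: string; phone: string | null };

// v1: free-form text, scrape with generic patterns
function extractV1(text: string): Lead {
  return {
    from_email: text.match(/[\w.+%-]+@[\w.-]+/)?.[0] ?? "",
    phone: text.match(/\+?[\d\s().-]{7,}/)?.[0]?.trim() ?? null,
  };
}

// v2: hypothetical newer format with labeled "Email:" / "Phone:" lines
function extractV2(text: string): Lead {
  return {
    from_email: text.match(/Email:\s*(\S+)/)?.[1] ?? "",
    phone: text.match(/Phone:\s*([\d\s().+-]+)/)?.[1]?.trim() ?? null,
  };
}

// Route by receipt date (the cutoff is an assumed format-change date)
function extractLead(text: string, receivedAt: Date): Lead {
  const V2_CUTOFF = new Date("2025-01-01");
  return receivedAt >= V2_CUTOFF ? extractV2(text) : extractV1(text);
}
```

Because each version is a pure function of the text, re-running extraction against stored raw payloads after a schema change is a simple batch job.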

Can I connect the pipeline output to a CRM or external API?

Yes. After storing the extracted data locally, your pipeline worker can call any external API — HubSpot, Salesforce, Pipedrive, Airtable, etc. Use a queue with retry logic for these downstream API calls so a temporary CRM outage does not cause data loss.
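If you are not already getting retries from your queue library, a small backoff wrapper around the downstream call covers the common transient-failure case. The attempt count, delays, and CRM URL below are illustrative defaults, not requirements of any particular CRM:

```typescript
// Generic retry with exponential backoff for downstream API calls.
async function withRetry<T>(
  fn: () => Promise<T>,
  attempts = 5,
  baseDelayMs = 500
): Promise<T> {
  let lastErr: unknown;
  for (let i = 0; i < attempts; i++) {
    try {
      return await fn();
    } catch (err) {
      lastErr = err;
      // 500ms, 1s, 2s, 4s ... between attempts
      await new Promise((r) => setTimeout(r, baseDelayMs * 2 ** i));
    }
  }
  throw lastErr;
}

// Usage sketch (hypothetical CRM endpoint):
// await withRetry(() =>
//   fetch("https://crm.example.com/leads", { method: "POST", body: payload })
// );
```

For longer outages, prefer the queue's own retry/dead-letter mechanism over in-process backoff, so a worker restart does not lose in-flight retries.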

How many emails per second can the pipeline handle?

With a queued processing architecture, throughput is limited by your worker capacity and downstream system capacity, not by JsonHook's delivery rate. JsonHook can deliver hundreds of webhooks per second. Scale your worker pool horizontally to match your email volume.