How to Set Up an Email-to-API Pipeline

Build a complete pipeline from email inbox to API data in four layers: SMTP receipt, MIME parsing, webhook delivery, and API processing. JsonHook handles the first three.

Table of Contents
  1. Overview
  2. Prerequisites
  3. Step-by-Step Instructions
  4. Code Example
  5. Common Pitfalls
  6. Frequently Asked Questions

Overview

An email-to-API pipeline transforms inbound email messages into structured API calls or database records through a sequence of processing stages. This pattern is useful anywhere that email is a data input channel: order management systems, CRM lead ingestion, support ticket systems, financial data extraction from invoice emails, and more.

A well-designed pipeline has four distinct layers:

  1. Email reception: Accept SMTP connections, receive the raw MIME message (handled by JsonHook)
  2. Email parsing: Convert MIME to structured JSON (handled by JsonHook)
  3. Webhook delivery: POST the JSON to your API endpoint with HMAC signature (handled by JsonHook)
  4. API processing: Validate, extract, store, and act on the data (handled by your application)

With JsonHook, layers 1–3 are handled for you as a managed service. Your only responsibility is layer 4.
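The JSON your endpoint receives in layer 4 carries the parsed message plus a delivery identifier. A rough sketch of that shape in TypeScript — field names follow the code example later in this article, but the exact schema is JsonHook's to define, so treat this as an assumption to verify against the docs:

```typescript
// Illustrative shape of a JsonHook webhook payload (an assumption,
// not the official schema -- check the JsonHook documentation).
interface JsonHookDelivery {
  deliveryId: string;            // unique per delivery; used for idempotency
  email: {
    from: string;                // e.g. 'Jane Doe <jane@example.com>'
    subject: string;
    textBody: string;            // plain-text part of the MIME message
    date?: string;               // ISO 8601 receipt timestamp
  };
}

// A sample payload of that shape:
const sample: JsonHookDelivery = {
  deliveryId: "d_123",
  email: {
    from: "Jane Doe <jane@example.com>",
    subject: "New lead",
    textBody: "Please call me at +1 555 010 0000.",
  },
};
console.log(sample.email.from);
```

Typing the payload up front makes layer 4 easier to test: extraction functions can take a `JsonHookDelivery` instead of `any`.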

Prerequisites

Before building the pipeline:

  • A JsonHook account with an API key
  • A backend API capable of receiving HTTPS POST requests
  • A database to store processed email data
  • A clear definition of what data you need to extract and what actions the pipeline should trigger
  • Optionally: a message queue for async processing (Redis + BullMQ, SQS, etc.)

Step-by-Step Instructions

Build the pipeline from scratch:

  1. Design your data model. Define the database schema for extracted email data. For example, for a lead capture pipeline: leads(id, email, name, phone, source_email_id, created_at).
  2. Create a JsonHook inbound address pointing at your API endpoint.
  3. Implement your API webhook endpoint with signature verification and idempotency.
  4. Write extraction logic to pull the relevant fields from the email payload.
  5. Store extracted data in your database.
  6. Trigger downstream actions: notify your team via Slack, create records in downstream APIs, send confirmation emails, etc.
  7. Add monitoring — query the JsonHook delivery log API for failures.
  8. Route email sources to your JsonHook address (update form targets, forwarding rules, etc.).

Code Example

Complete email-to-API pipeline in Express + PostgreSQL:

import express from "express";
import crypto from "crypto";
import { Pool } from "pg";

const app = express();
const db = new Pool({ connectionString: process.env.DATABASE_URL });
app.use(express.raw({ type: "application/json" }));

async function extractLeadData(email: any) {
  const text = email.textBody ?? "";
  return {
    from_email:   email.from.match(/[\w.+%-]+@[\w.-]+/)?.[0] ?? email.from,
    from_name:    email.from.match(/^([^<]+)</)?.[1]?.trim() ?? null,
    subject:      email.subject ?? "",
    body_preview: text.slice(0, 200),
    phone:        text.match(/\+?[\d\s().-]{7,}/)?.[0]?.trim() ?? null,
    received_at:  email.date ?? new Date().toISOString(),
  };
}

app.post("/webhook", async (req, res) => {
  const sig = req.headers["x-jsonhook-signature"] as string;
  const expected = crypto
    .createHmac("sha256", process.env.JSONHOOK_SECRET!)
    .update(req.body).digest("hex");
  if (sig !== expected) return res.sendStatus(401);

  const { email, deliveryId } = JSON.parse(req.body.toString());

  // Idempotency
  const exists = await db.query(
    "SELECT 1 FROM processed_deliveries WHERE delivery_id = $1", [deliveryId]
  );
  if (exists.rows.length) return res.sendStatus(200);

  // Extract and store
  const lead = await extractLeadData(email);
  await db.query(
    `INSERT INTO leads (from_email, from_name, subject, body_preview, phone, received_at)
     VALUES ($1,$2,$3,$4,$5,$6)`,
    [lead.from_email, lead.from_name, lead.subject, lead.body_preview, lead.phone, lead.received_at]
  );
  await db.query(
    "INSERT INTO processed_deliveries (delivery_id) VALUES ($1)", [deliveryId]
  );

  res.sendStatus(200);
});

app.listen(3000);

Common Pitfalls

Pipeline design pitfalls:

  • Monolithic webhook handler. Keep your webhook handler thin — it should verify the signature, push to a queue, and return 200. All extraction, validation, and storage should happen in a separate worker process.
  • No schema migration plan. As your pipeline evolves, the data you extract from emails changes. Use a database migration tool (Flyway, Alembic, Knex migrations) and treat schema changes with the same rigor as API contract changes.
  • Not logging raw payloads during initial deployment. When first deploying to production, log the full JSON payload for the first 100 deliveries. Real-world email surprises abound — raw payload logging helps you catch edge cases early.
  • Tight coupling between extraction and storage. Write extraction logic as a pure function (no side effects, takes payload, returns data object) so it is easily unit-testable and replaceable when email formats change.
  • No alerting on extraction failures. When required fields are not found in an email, alert immediately rather than storing incomplete records. An extraction failure is a signal that the email format has changed and the pipeline needs updating.
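The first pitfall is easiest to avoid by factoring signature verification into a pure helper and keeping the handler to three steps: verify, enqueue, return 200. A minimal sketch — the queue name and the BullMQ-style worker in the comments are illustrative assumptions, not a prescribed setup:

```typescript
import crypto from "crypto";

// Pure helper: timing-safe HMAC check, unit-testable without a server.
function verifySignature(rawBody: Buffer, signature: string, secret: string): boolean {
  const expected = crypto.createHmac("sha256", secret).update(rawBody).digest("hex");
  const a = Buffer.from(signature);
  const b = Buffer.from(expected);
  // timingSafeEqual throws on length mismatch, so guard first
  return a.length === b.length && crypto.timingSafeEqual(a, b);
}

// Thin handler sketch ("inbound-email" queue name is an assumption):
//   if (!verifySignature(req.body, sig, process.env.JSONHOOK_SECRET!))
//     return res.sendStatus(401);
//   await emailQueue.add("process", JSON.parse(req.body.toString()));
//   res.sendStatus(200);
//
// A separate Worker("inbound-email", ...) process does the extraction,
// database writes, and downstream API calls.
```

Note the `timingSafeEqual` comparison: a plain `!==` on the signature string leaks timing information, so the constant-time variant is preferable in the handler as well.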

Frequently Asked Questions

Should I process emails synchronously or asynchronously in my pipeline?

Always asynchronously for any non-trivial processing. Your webhook handler should accept the delivery, push a job to a queue (BullMQ, SQS, RabbitMQ), and return 200 immediately. The actual extraction, database writes, and downstream API calls happen in worker processes. This prevents webhook timeouts and keeps your pipeline scalable.

How do I handle schema changes when email formats change?

Treat email format changes like API version changes. Write versioned extraction functions, and route to the appropriate version based on sender, date range, or a custom header. Store the raw JSON payload alongside extracted fields so you can re-run extraction against historical emails when the schema changes.
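The versioned-extraction pattern can be sketched as a router over pure functions. The cutoff date and the "labeled fields" v2 format below are hypothetical examples, not a real sender's format:

```typescript
type Lead = { from_email: string; phone: string | null };

// v1: free-form text, scrape with generic patterns
function extractV1(text: string): Lead {
  return {
    from_email: text.match(/[\w.+%-]+@[\w.-]+/)?.[0] ?? "",
    phone: text.match(/\+?[\d\s().-]{7,}/)?.[0]?.trim() ?? null,
  };
}

// v2: hypothetical newer format with labeled "Email:" / "Phone:" lines
function extractV2(text: string): Lead {
  return {
    from_email: text.match(/Email:\s*(\S+)/)?.[1] ?? "",
    phone: text.match(/Phone:\s*([\d\s().+-]+)/)?.[1]?.trim() ?? null,
  };
}

// Route by receipt date (the cutoff is an assumed format-change date)
function extractLead(text: string, receivedAt: Date): Lead {
  const V2_CUTOFF = new Date("2025-01-01");
  return receivedAt >= V2_CUTOFF ? extractV2(text) : extractV1(text);
}
```

Because each version is a pure function of the text, re-running extraction against stored raw payloads after a schema change is a simple batch job.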

Can I connect the pipeline output to a CRM or external API?

Yes. After storing the extracted data locally, your pipeline worker can call any external API — HubSpot, Salesforce, Pipedrive, Airtable, etc. Use a queue with retry logic for these downstream API calls so a temporary CRM outage does not cause data loss.
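If you are not already getting retries from your queue library, a small backoff wrapper around the downstream call covers the common transient-failure case. The attempt count, delays, and CRM URL below are illustrative defaults, not requirements of any particular CRM:

```typescript
// Generic retry with exponential backoff for downstream API calls.
async function withRetry<T>(
  fn: () => Promise<T>,
  attempts = 5,
  baseDelayMs = 500
): Promise<T> {
  let lastErr: unknown;
  for (let i = 0; i < attempts; i++) {
    try {
      return await fn();
    } catch (err) {
      lastErr = err;
      // 500ms, 1s, 2s, 4s ... between attempts
      await new Promise((r) => setTimeout(r, baseDelayMs * 2 ** i));
    }
  }
  throw lastErr;
}

// Usage sketch (hypothetical CRM endpoint):
// await withRetry(() =>
//   fetch("https://crm.example.com/leads", { method: "POST", body: payload })
// );
```

For longer outages, prefer the queue's own retry/dead-letter mechanism over in-process backoff, so a worker restart does not lose in-flight retries.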

How many emails per second can the pipeline handle?

With a queued processing architecture, throughput is limited by your worker capacity and downstream system capacity, not by JsonHook's delivery rate. JsonHook can deliver hundreds of webhooks per second. Scale your worker pool horizontally to match your email volume.