Overview
At high email volumes, processing each webhook delivery synchronously in the HTTP handler becomes a bottleneck. A queued batch processing architecture decouples receipt (accepting the webhook) from processing (extracting data, writing to the database, calling downstream APIs), enabling much higher throughput.
The architecture looks like this:
- Webhook handler (thin) — verifies signature, pushes raw payload to queue, returns 200 immediately
- Queue (Redis + BullMQ, SQS, RabbitMQ) — buffers incoming deliveries
- Worker pool (N parallel workers) — processes jobs from the queue concurrently
- Batch writer (optional) — accumulates extracted records and bulk-inserts to the database periodically
This separation keeps webhook response times consistently low (typically well under 100 ms) regardless of processing load, and lets your worker pool scale independently of your web server.
Prerequisites
Batch processing prerequisites:
- A message queue: Redis + BullMQ (Node.js), Celery + Redis (Python), or a managed queue like SQS or Cloud Tasks
- A worker process separate from your web server
- A database that supports bulk inserts efficiently (PostgreSQL, MySQL, ClickHouse for analytics)
- Metrics and monitoring for queue depth and worker throughput
Step-by-Step Instructions
Set up batch email processing:
- Install BullMQ and Redis:
npm install bullmq ioredis
- Create a queue and add jobs in the webhook handler:
const emailQueue = new Queue("email-processing", { connection: redis });

app.post("/webhooks/email", async (req, res) => {
  // ... verify signature ...
  const payload = JSON.parse(req.body.toString());
  await emailQueue.add("process", payload, { attempts: 3 });
  res.sendStatus(200);
});
- Implement workers that process queue jobs:
const worker = new Worker("email-processing", async (job) => {
  const extracted = extractData(job.data.email);
  await db.insert("email_data", extracted);
}, { connection: redis, concurrency: 10 });
- For bulk inserts, accumulate records in a buffer and flush every N records or every T seconds.
- Monitor queue depth and alert when it exceeds a threshold that indicates worker saturation.
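The queue-depth check in the last step can be sketched as a small poller. The `CountableQueue` shape below is a structural stand-in for BullMQ's `Queue` (whose `getJobCounts` method returns per-state counts) so the sketch stays self-contained, and `QUEUE_DEPTH_ALERT` is a hypothetical threshold you would tune to your workers' throughput:

```typescript
// Structural stand-in for the BullMQ Queue method used here, so this
// sketch is self-contained; in a real app, import Queue from "bullmq".
type CountableQueue = {
  getJobCounts(...states: string[]): Promise<Record<string, number>>;
};

const QUEUE_DEPTH_ALERT = 5_000; // hypothetical threshold; tune to your throughput

// Pure saturation rule, easy to unit-test: waiting + delayed jobs vs. threshold.
export function isSaturated(counts: Record<string, number>, threshold: number): boolean {
  return (counts.wait ?? 0) + (counts.delayed ?? 0) >= threshold;
}

// Poll the queue and alert when workers fall behind the arrival rate.
export async function checkQueueDepth(queue: CountableQueue): Promise<boolean> {
  const counts = await queue.getJobCounts("wait", "delayed", "active");
  const saturated = isSaturated(counts, QUEUE_DEPTH_ALERT);
  if (saturated) {
    // Replace with your alerting integration (Slack, PagerDuty, ...).
    console.warn(`queue depth ${(counts.wait ?? 0) + (counts.delayed ?? 0)} >= ${QUEUE_DEPTH_ALERT}`);
  }
  return saturated;
}
```

Run the check on a timer (or export the counts to your metrics system) and keep the threshold comfortably above normal backlog so transient bursts don't page anyone.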
Code Example
BullMQ worker with batch PostgreSQL insert:
import { Worker, Queue } from "bullmq";
import { Pool } from "pg";
import Redis from "ioredis";
const redis = new Redis(process.env.REDIS_URL!);
const db = new Pool({ connectionString: process.env.DATABASE_URL });
interface EmailRow { from: string; subject: string; body: string; deliveryId: string; }

let batch: EmailRow[] = [];
const BATCH_SIZE = 50;
const FLUSH_INTERVAL_MS = 2000;
async function flushBatch() {
if (!batch.length) return;
const toFlush = batch.splice(0, batch.length);
const values = toFlush.map((r, i) =>
`($${i * 4 + 1}, $${i * 4 + 2}, $${i * 4 + 3}, $${i * 4 + 4})`
).join(", ");
const params = toFlush.flatMap(r => [r.from, r.subject, r.body, r.deliveryId]);
await db.query(
`INSERT INTO emails (from_address, subject, body, delivery_id) VALUES ${values} ON CONFLICT (delivery_id) DO NOTHING`,
params
);
console.log(`Flushed ${toFlush.length} emails to database`);
}
// Flush on a timer too, so small batches don't sit in memory indefinitely.
setInterval(() => flushBatch().catch(console.error), FLUSH_INTERVAL_MS);
// Note: a job is acknowledged when this handler returns, which can be before
// its row is flushed; a crash in that window loses buffered rows (see pitfalls).
const worker = new Worker("email-processing", async (job) => {
const { email, deliveryId } = job.data;
batch.push({
from: email.from,
subject: email.subject,
body: (email.textBody ?? "").slice(0, 1000),
deliveryId,
});
if (batch.length >= BATCH_SIZE) await flushBatch();
}, { connection: redis, concurrency: 20 });

Common Pitfalls
Batch processing pitfalls:
- Losing jobs on process crash. In-memory batches are lost if the worker crashes before flushing. Use a queue's built-in persistence (BullMQ stores jobs in Redis) and only mark jobs as complete after the database write confirms success.
- Queue growing unboundedly. If workers are slower than the arrival rate, the queue grows indefinitely. Monitor queue depth and add worker processes when depth exceeds a threshold, and consider a maximum queue size beyond which excess jobs are dropped or routed to a DLQ.
- No dead-letter queue (DLQ). Jobs that fail all retries disappear unless you configure a DLQ. Route failed jobs to a DLQ and alert so they can be investigated and replayed.
- Processing order not guaranteed. Queue workers process jobs concurrently and may not maintain arrival order. If order matters (e.g., email thread replies must be processed after the parent), use ordered queues or add sorting logic in your batch writer.
- Bulk insert conflicts. When a job is retried, the same email may be inserted twice. Use ON CONFLICT DO NOTHING with a unique constraint on delivery_id to make bulk inserts idempotent.
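The dead-letter pitfall above can be handled with a small routing helper. This is a sketch using structural stand-ins for BullMQ's job and queue types so it stays self-contained; `MAX_ATTEMPTS` must match the `attempts` option used when enqueuing, and in a real app you would invoke this from a failure listener (e.g. BullMQ's QueueEvents "failed" event):

```typescript
// Structural stand-ins for the BullMQ types this sketch touches; in a
// real app, the failed job and the DLQ queue come from "bullmq".
interface FailedJob {
  id: string;
  data: unknown;
  attemptsMade: number; // attempts the queue has already made
  failedReason?: string;
}
interface DeadLetterSink {
  add(name: string, data: unknown): Promise<unknown>;
}

const MAX_ATTEMPTS = 3; // must match the { attempts: 3 } used when enqueuing

// Route a job to the dead-letter queue only once its retries are exhausted,
// preserving the original payload and failure reason for later replay.
export async function routeToDeadLetter(job: FailedJob, dlq: DeadLetterSink): Promise<boolean> {
  if (job.attemptsMade < MAX_ATTEMPTS) return false; // the queue will retry it
  await dlq.add("dead-letter", {
    jobId: job.id,
    reason: job.failedReason ?? "unknown",
    payload: job.data,
  });
  // Alert here so dead-lettered emails get investigated, not just stored.
  console.warn(`job ${job.id} dead-lettered: ${job.failedReason}`);
  return true;
}
```

Replaying a dead-lettered email is then just re-adding its preserved payload to the main processing queue once the underlying failure is fixed.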