How to Extract Email Metadata

Every email carries rich metadata beyond its body. Extract timestamps, message IDs, sender authentication signals, and custom application headers from the JsonHook JSON payload.

Table of Contents
  1. Overview
  2. Prerequisites
  3. Step-by-Step Instructions
  4. Code Example
  5. Common Pitfalls

Overview

Email metadata encompasses all the structured information about an email message that is not part of the visible body content. This includes:

  • Temporal metadata: date (when the email was sent), receivedAt (when JsonHook received it)
  • Identity metadata: messageId, inReplyTo, references (for threading)
  • Routing metadata: the Received header chain showing the path from sender to JsonHook
  • Authentication metadata: SPF, DKIM, and DMARC results in authentication-results
  • Application metadata: custom X- headers set by sending applications

This metadata is often more reliable for automated processing than the email body because it follows standard formats and is less affected by changes to a message's wording or template.
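As a rough illustration of where these fields live, here is a sketch of the payload shape. It is inferred only from the fields this guide reads (deliveryId, receivedAt, email.date, email.messageId, email.headers, and so on) and is not an official JsonHook schema. The names InboundEmail, InboundPayload, and isInboundPayload are hypothetical.

```typescript
// Assumed shape, inferred from the fields used in this guide; not an official schema.
interface InboundEmail {
  date: string;                                   // sender's Date header
  messageId: string;
  from: string;
  subject: string;
  headers: Record<string, string | undefined>;    // assumed to use lowercase keys
  attachments?: unknown[];
}

interface InboundPayload {
  deliveryId: string;
  receivedAt: string;                             // set by the receiving server
  email: InboundEmail;
}

/** Minimal runtime check that the fields this guide relies on are present. */
function isInboundPayload(value: unknown): value is InboundPayload {
  if (typeof value !== "object" || value === null) return false;
  const p = value as Record<string, unknown>;
  const email = p["email"];
  if (typeof email !== "object" || email === null) return false;
  const e = email as Record<string, unknown>;
  return (
    typeof p["deliveryId"] === "string" &&
    typeof p["receivedAt"] === "string" &&
    typeof e["messageId"] === "string" &&
    typeof e["headers"] === "object" &&
    e["headers"] !== null
  );
}
```

A guard like this lets a handler reject malformed requests before it touches any metadata field.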

Prerequisites

There are no special prerequisites: all metadata is included in the standard JsonHook payload on every plan. To see which headers your specific email sources provide, send test emails and inspect the email.headers object in the delivery log.

Step-by-Step Instructions

Extract and use email metadata in your handler:

  1. Access top-level metadata fields:
    const { email, deliveryId, receivedAt } = payload;
    const sentAt = new Date(email.date);
    const messageId = email.messageId;
    const latencyMs = new Date(receivedAt).getTime() - sentAt.getTime();
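  2. Access thread metadata for grouping related emails:
    const inReplyTo = email.headers["in-reply-to"] ?? null;
    const references = email.headers["references"] ?? null;
    // Prefer the thread root (first ID in References), then the parent, then this message.
    const threadId = references?.trim().split(/\s+/)[0] || inReplyTo || messageId;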
  3. Extract sender authentication status:
    const authResults = email.headers["authentication-results"] ?? "";
    const senderVerified = authResults.includes("dkim=pass") &&
                           authResults.includes("spf=pass");
  4. Read application-specific custom headers:
    const customerId = email.headers["x-customer-id"] ?? null;
    const eventType  = email.headers["x-event-type"]   ?? null;
  5. Store metadata alongside the email content in your database for later querying and analysis.

Code Example

A metadata extraction and storage function in TypeScript:

interface EmailMetadata {
  deliveryId:      string;
  messageId:       string;
  sentAt:          Date;
  receivedAt:      Date;
  deliveryLatencyMs: number;
  from:            string;
  subject:         string;
  inReplyTo:       string | null;
  threadId:        string;
  senderVerified:  boolean;
  attachmentCount: number;
  customHeaders:   Record<string, string>;
}

function extractMetadata(payload: any): EmailMetadata {
  const { email, deliveryId, receivedAt } = payload;
  const sentAt = new Date(email.date);
  const receivedDate = new Date(receivedAt);
  const authResults = email.headers["authentication-results"] ?? "";
  const inReplyTo = email.headers["in-reply-to"] ?? null;

  // Extract all custom X- headers
  const customHeaders = Object.fromEntries(
    Object.entries(email.headers as Record<string, string>)
      .filter(([k]) => k.startsWith("x-"))
  );

  return {
    deliveryId,
    messageId:       email.messageId,
    sentAt,
    receivedAt:      receivedDate,
    deliveryLatencyMs: receivedDate.getTime() - sentAt.getTime(),
    from:            email.from,
    subject:         email.subject,
    inReplyTo,
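    // Prefer the thread root (first ID in References), then the parent, then this message.
    threadId:        email.headers["references"]?.trim().split(/\s+/)[0] || inReplyTo || email.messageId,
    senderVerified:  authResults.includes("dkim=pass") && authResults.includes("spf=pass"),
    // Treat a missing attachments array as zero attachments.
    attachmentCount: email.attachments?.length ?? 0,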
    customHeaders,
  };
}

Common Pitfalls

Metadata extraction pitfalls:

  • Trusting the date field for ordering. The date header is set by the sender's mail client and can be incorrect (clock skew, misconfigured server). Use receivedAt (set by JsonHook's server) for reliable chronological ordering.
  • Using Message-ID without normalization. Message IDs often include surrounding angle brackets (for example, <unique-id@example.com>). Strip them before storing or comparing: messageId.replace(/^<|>$/g, "").
  • Expecting all custom headers to be present. Custom headers are only present when the sending system includes them. Always supply a default value (for example, with the ?? operator) when reading any custom header.
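  • Ignoring the receivedAt field for SLA calculations. The gap between email.date and receivedAt tells you roughly how long the email was in transit. Because email.date comes from the sender's clock, treat individual values as approximate (they can even be negative), but sudden spikes in this latency indicate delivery problems upstream of JsonHook.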
  • Not indexing metadata fields in your database. Fields like messageId, from, and deliveryId are frequently queried. Index them in your database for efficient lookups.
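
Several of these pitfalls can be guarded against with small helpers. The sketch below is one possible approach, not part of JsonHook: the function names normalizeMessageId, transitLatencyMs, and headerOr are hypothetical, and it assumes header keys arrive in lowercase, as in the snippets in this guide.

```typescript
/** Trim whitespace and strip one pair of surrounding angle brackets from a Message-ID. */
function normalizeMessageId(raw: string): string {
  return raw.trim().replace(/^<|>$/g, "");
}

/**
 * Transit latency in milliseconds, or null when either timestamp is
 * unparseable or the sender's Date is later than receivedAt (clock skew).
 */
function transitLatencyMs(sentDate: string, receivedAt: string): number | null {
  const sent = Date.parse(sentDate);
  const received = Date.parse(receivedAt);
  if (Number.isNaN(sent) || Number.isNaN(received)) return null;
  const delta = received - sent;
  return delta >= 0 ? delta : null;
}

/** Read a header by name (lowercased), falling back to a default when it is absent. */
function headerOr(
  headers: Record<string, string | undefined>,
  name: string,
  fallback: string | null = null,
): string | null {
  return headers[name.toLowerCase()] ?? fallback;
}
```

Returning null instead of NaN or a negative number makes bad sender clocks explicit, so they can be excluded from SLA aggregates rather than silently skewing them.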

Frequently Asked Questions

What is the difference between email.date and receivedAt?

email.date is the Date header from the original email, set by the sender's mail client at send time. receivedAt is the timestamp when JsonHook's SMTP server received the message. Use email.date for display to users (when the email was sent) and receivedAt for system ordering, SLA tracking, and pipeline latency calculations.

How do I use message IDs for email threading?

Email threading is based on the Message-ID, In-Reply-To, and References headers. When an email is a reply, In-Reply-To contains the Message-ID of the email being replied to. References contains the full thread chain. Group emails by their root Message-ID (the first Message-ID in the References chain) to reconstruct threads.
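
The grouping rule described above can be sketched as follows. This is a minimal illustration, assuming the References and In-Reply-To values are delivered as raw header strings containing angle-bracketed IDs; parseMessageIds and threadRootId are hypothetical names.

```typescript
/** Extract bare Message-IDs (without angle brackets) from a header value. */
function parseMessageIds(value: string | null | undefined): string[] {
  if (!value) return [];
  const matches = value.match(/<[^<>\s]+>/g) ?? [];
  return matches.map((id) => id.slice(1, -1));
}

/**
 * Thread root: the first ID in References, else the In-Reply-To parent,
 * else the message's own ID (the message starts a new thread).
 */
function threadRootId(
  messageId: string,
  inReplyTo?: string | null,
  references?: string | null,
): string {
  const root =
    parseMessageIds(references)[0] ??
    parseMessageIds(inReplyTo)[0] ??
    parseMessageIds(messageId)[0];
  // Fall back to the trimmed raw value if the Message-ID had no angle brackets.
  return root ?? messageId.trim();
}
```

Storing the result of threadRootId with each message lets you reconstruct a thread with a single equality query on that column.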

Can I tell which country an email was sent from?

The Received header chain contains IP addresses of each mail server that handled the message. You can geolocate those IPs using a service like MaxMind GeoIP. However, this is approximate — servers are often located in different countries from their senders. It is more reliable for identifying the sending mail server's location than the individual user's location.
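
As a starting point for that lookup, the sketch below pulls bracketed IPv4 addresses out of Received header values. Whether JsonHook exposes the chain as a single string or as an array is an assumption here, so the hypothetical receivedHopIps helper accepts both; IPv6 hops are ignored for brevity, and the geolocation call itself is left out.

```typescript
/** Collect IPv4 addresses that appear in square brackets in Received headers. */
function receivedHopIps(received: string | string[] | undefined): string[] {
  const lines =
    received === undefined ? [] : Array.isArray(received) ? received : [received];
  const ips: string[] = [];
  for (const line of lines) {
    // Typical form: "from host (host [203.0.113.5]) by mx.example.net ..."
    const pattern = /\[(\d{1,3}(?:\.\d{1,3}){3})\]/g;
    let match: RegExpExecArray | null;
    while ((match = pattern.exec(line)) !== null) {
      const ip = match[1];
      if (ip) ips.push(ip);
    }
  }
  return ips;
}
```

The returned addresses keep the order in which the headers were supplied; which hop is closest to the original sender depends on how the chain is ordered in the payload.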

Are DKIM and SPF results trustworthy in the authentication-results header?

Yes — the Authentication-Results header in the JsonHook payload is added by JsonHook's receiving mail server, not by the sender. It reports the actual DKIM signature verification and SPF lookup results performed by JsonHook's infrastructure. This is not forgeable by the sender (unlike headers the sender sets themselves).
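
Substring checks such as includes("dkim=pass") work for simple cases, but parsing the header into per-method results is easier to extend (for example, to DMARC). The following is a simplified sketch based on the general "authserv-id; method=result ..." layout of the header; it does not handle semicolons inside comments and keeps only the first result per method. parseAuthResults and isSenderVerified are hypothetical names.

```typescript
/** Parse "authserv-id; spf=pass ...; dkim=pass ..." into a method -> result map. */
function parseAuthResults(header: string): Map<string, string> {
  const results = new Map<string, string>();
  // The first segment is the authserv-id; the rest are method=result clauses.
  for (const clause of header.split(";").slice(1)) {
    const match = /^\s*([a-z0-9-]+)\s*=\s*([a-z]+)/i.exec(clause);
    if (!match) continue;
    const method = match[1]?.toLowerCase();
    const result = match[2]?.toLowerCase();
    // Keep the first result seen for each method.
    if (method && result && !results.has(method)) results.set(method, result);
  }
  return results;
}

/** Same rule as in this guide: both DKIM and SPF must report "pass". */
function isSenderVerified(results: Map<string, string>): boolean {
  return results.get("dkim") === "pass" && results.get("spf") === "pass";
}
```

With the parsed map you can also store results.get("dmarc") alongside the message and tighten the verification rule later without re-parsing raw headers.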