Gmail Push Webhook Handler

Replaces the legacy Hiver integration for the BetterFiles email workflow. Receives real-time Gmail activity notifications via Google Pub/Sub push subscriptions, decodes the historyId delta, fetches new message metadata via the Gmail API, and upserts email_thread_state rows into Supabase. Also classifies email attachments into a doc_type enum for downstream filtering.

Endpoint

Path: https://webhook.reri.co/webhook/gmail-push
Provider: Google Pub/Sub
Status: ⚠️ approved-pending-harden

Port: 18801 (env PORT, defaults to 18801)
Note: Marked approved-pending-harden in FUNNEL-REGISTRY because dedup and the audit log are not yet fully wired. Hardening is required before production traffic.

Auth method

Google Pub/Sub OAuth push, no HMAC. Google delivers messages on behalf of a verified service account associated with the configured push subscription. The handler currently trusts any push message that arrives at the registered endpoint URL, so the push subscription itself is the auth boundary (configured in Google Cloud Console, restricted to https://webhook.reri.co/webhook/gmail-push).

There is no request-level signature verification in the current implementation — this is the pending-harden gap. Future: add Google-signed JWT verification on the push token.
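
As a rough illustration of the planned hardening, the sketch below shows the claim checks that could run once the push JWT's signature has been verified. Signature verification itself is not shown; a library such as google-auth-library (OAuth2Client.verifyIdToken) is one common way to do it. The function names, the service account email, and the choice of the endpoint URL as audience are all assumptions, not part of the current handler.

```javascript
// Sketch only: claim checks to run AFTER the JWT signature has been verified
// against Google's public keys (verification is intentionally not shown).

// Hypothetical expected values; the real service account is not documented here.
const EXPECTED = {
  audience: 'https://webhook.reri.co/webhook/gmail-push',
  serviceAccountEmail: 'pubsub-push@example-project.iam.gserviceaccount.com',
};

// Pull the token out of an "Authorization: Bearer <token>" header.
function extractBearerToken(authorizationHeader) {
  if (typeof authorizationHeader !== 'string') return null;
  const match = /^Bearer\s+(\S+)$/i.exec(authorizationHeader.trim());
  return match ? match[1] : null;
}

// Validate the decoded, already signature-verified JWT payload.
function checkPushClaims(
  claims,
  expected = EXPECTED,
  nowSeconds = Math.floor(Date.now() / 1000)
) {
  const issuers = ['https://accounts.google.com', 'accounts.google.com'];
  if (!claims || !issuers.includes(claims.iss)) {
    return { ok: false, reason: 'bad issuer' };
  }
  if (claims.aud !== expected.audience) {
    return { ok: false, reason: 'bad audience' };
  }
  if (claims.email !== expected.serviceAccountEmail || claims.email_verified !== true) {
    return { ok: false, reason: 'unexpected service account' };
  }
  if (typeof claims.exp !== 'number' || claims.exp <= nowSeconds) {
    return { ok: false, reason: 'token expired' };
  }
  return { ok: true };
}
```

A handler using this would reject the request (for example with a 401) whenever the token is missing or `checkPushClaims` returns `ok: false`.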

Payload shape

Google Pub/Sub sends a POST with this structure:

{
  "message": {
    "data": "<base64-encoded-string>",
    "messageId": "string",
    "publishTime": "RFC3339 timestamp"
  },
  "subscription": "projects/.../subscriptions/..."
}

Decoded data payload:

{
  "emailAddress": "user@reri.co",
  "historyId": "123456789"
}
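
The decoding step for the two shapes above can be sketched as follows. The function name `decodePushEnvelope` and the returned object are illustrative assumptions, not the handler's actual code.

```javascript
// Decode a Pub/Sub push body into the Gmail notification it carries.
function decodePushEnvelope(body) {
  const data = body && body.message && body.message.data;
  if (typeof data !== 'string' || data.length === 0) {
    throw new Error('Pub/Sub envelope has no message.data');
  }

  // Node's 'base64' decoder also accepts the URL-safe alphabet.
  const json = Buffer.from(data, 'base64').toString('utf8');
  const { emailAddress, historyId } = JSON.parse(json);

  if (typeof emailAddress !== 'string' || historyId === undefined) {
    throw new Error('decoded payload is missing emailAddress or historyId');
  }

  // Normalize historyId to a string whether it arrived as a string or a number.
  return {
    emailAddress,
    historyId: String(historyId),
    messageId: body.message.messageId,
  };
}
```

Throwing on a malformed envelope lets the caller decide how to respond; the status code to return to Pub/Sub in that case is not specified here.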

Processing flow:

  1. Receive Pub/Sub push → decode base64 data
  2. Extract emailAddress + historyId
  3. Call users.history.list(startHistoryId) via GmailClient
  4. For each new message: call getMessage(id, 'metadata')
  5. classifyAttachment(filename, mimeType) → doc_type enum
  6. Upsert into Supabase email_thread_state
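
Steps 3 through 6 above can be sketched as a single function with injected dependencies. The `gmail`, `classify`, and `store` interfaces, the `listNewMessageIds` wrapper, and the row column names are hypothetical stand-ins chosen for this example. The sketch also assumes that history is listed from the last historyId already processed, since the historyId in a notification describes the mailbox state after the change.

```javascript
// Orchestrates steps 3-6 for one decoded notification.
// `gmail`, `classify`, and `store` are injected, hypothetical interfaces.
async function processNotification(notification, lastHistoryId, deps) {
  const { gmail, classify, store } = deps;

  // Step 3: list message IDs added since the last processed historyId.
  const messageIds = await gmail.listNewMessageIds(lastHistoryId);

  const rows = [];
  for (const id of messageIds) {
    // Step 4: fetch metadata for each new message.
    const message = await gmail.getMessage(id, 'metadata');

    // Step 5: classify each attachment into a doc_type value.
    const docTypes = (message.attachments || []).map((a) =>
      classify(a.filename, a.mimeType)
    );

    // Column names below are assumptions, not the real table schema.
    rows.push({
      email_address: notification.emailAddress,
      message_id: message.id,
      thread_id: message.threadId,
      history_id: notification.historyId,
      doc_types: docTypes,
    });
  }

  // Step 6: upsert once per notification, skipping empty batches.
  if (rows.length > 0) {
    await store.upsert(rows);
  }
  return rows;
}
```

Injecting the dependencies keeps the flow testable with in-memory fakes, without calling Gmail or Supabase.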

Downstream dispatch

gmail-push-handler.js
 ├─ GmailClient.users.history.list() → message IDs for new activity
 ├─ GmailClient.getMessage(id, 'metadata') → headers, labels, attachment info
 ├─ classifyAttachment(filename, mimeType) → doc_type enum
 │   (maps filename + MIME → doc_type: 'contract'|'addendum'|'photo'|'unknown'|etc.)
 └─ supabase.from('email_thread_state').upsert() → RERI Supabase project
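
To make the classification step concrete, here is a minimal stand-in for classifyAttachment. The matching rules are invented purely for illustration; the real mapping is not documented on this page, and only the four doc_type values listed above are used.

```javascript
// Hypothetical rules for illustration only; the real classifier may differ.
function classifyAttachmentSketch(filename, mimeType) {
  const name = String(filename || '').toLowerCase();
  const mime = String(mimeType || '').toLowerCase();

  // Any image MIME type is treated as a photo.
  if (mime.startsWith('image/')) return 'photo';

  // Check 'addendum' before 'contract' so a file named
  // "contract-addendum.pdf" is classified as an addendum.
  if (name.includes('addendum')) return 'addendum';
  if (name.includes('contract')) return 'contract';

  // Anything unmatched falls through to 'unknown'.
  return 'unknown';
}
```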

Supabase project: pxzxcfjgpteitwktkkiz (RERI Website, not CCP)
Table: email_thread_state

Dedup strategy

Dedup relies on historyId. Pub/Sub delivers notifications incrementally and may deliver the same one more than once; re-delivering the same historyId results in a no-op upsert (same primary key). The GmailClient tracks the last processed historyId and uses withRetry to handle transient API failures without reprocessing.
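
One way to picture the "last processed historyId" check is a per-mailbox watermark, sketched below. The class and function names are hypothetical, and the state is held in memory, so this is an illustration of the idea and not the GmailClient's actual mechanism.

```javascript
// Gmail history IDs are unsigned 64-bit values serialized as decimal strings,
// so they are compared as BigInt to avoid precision loss.
function isNewerHistoryId(incoming, lastProcessed) {
  if (lastProcessed === null || lastProcessed === undefined) return true;
  return BigInt(incoming) > BigInt(lastProcessed);
}

// In-memory watermark per mailbox (hypothetical; a real handler would
// need to persist this across restarts).
class HistoryWatermark {
  constructor() {
    this.lastByMailbox = new Map();
  }

  // True when this historyId has not been covered by earlier processing.
  shouldProcess(emailAddress, historyId) {
    return isNewerHistoryId(historyId, this.lastByMailbox.get(emailAddress));
  }

  // Advance the watermark; older or duplicate IDs never move it backwards.
  markProcessed(emailAddress, historyId) {
    if (this.shouldProcess(emailAddress, historyId)) {
      this.lastByMailbox.set(emailAddress, String(historyId));
    }
  }
}
```

With this in place, a redelivered notification is skipped before any Gmail API call is made, and the no-op upsert remains as a second line of defense.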

Audit trail

  • Logger: log tagged gmail-push-handler via scripts/lib/logger.js. Structured JSON to stdout → journalctl.
  • Fatal error handler: uncaughtException + unhandledRejection log structured JSON + handler ID before process exit.
  • Supabase: email_thread_state upserts serve as the implicit audit trail of processed messages.
  • Pending: webhook_audit_log write not yet wired (part of hardening backlog).

Related

  • webhook-architecture: Cross-handler governance; gmail-push is flagged approved-pending-harden, and dedup + audit write must be added before production
  • _summary: BetterFiles agent consumes email_thread_state for deal tracking and TC workflows
  • cron-timer-registry: Handler runs as a manual process (no systemd timer); needs a unit file before prod
  • hubspot: email_thread_state rows may be associated with HubSpot deal IDs for cross-system linking