
Architecture.

A transparent technical reference for the people evaluating senderZ during due diligence. This page names the workers, explains the delivery guarantees, and documents what happens when things break. The marketing pitch lives at /platform; this page is for the CTO review.

01

System topology

senderZ runs three Cloudflare Workers and one physical host.

workers/api

Public-facing Hono router at api.senderz.com. Terminates customer and admin auth, validates requests, writes to D1, and enqueues messages. Never calls the device bridge directly.

workers/router

Internal queue consumer. Picks up messages from the Cloudflare Queue, runs compliance checks, resolves a phone, dispatches to the device bridge, updates D1 status, and fires tenant webhooks.

workers/billing

Internal. Receives usage events from the router, enforces plan quotas, and syncs Stripe subscription state via webhook.

Device supervisor

Process on dedicated Apple hardware. Runs the iMessage bridge, monitors its health, restarts it on crash, and keeps the Cloudflare Tunnel alive.

Customer → api.senderz.com (workers/api)
               ↓ enqueue
         Cloudflare Queue
               ↓ consume
          workers/router
               ↓
    ┌──────────┴──────────┐
    ↓                     ↓
workers/billing    Cloudflare Tunnel → Apple hardware
                                         ↓
                                  iMessage bridge
                                         ↓
                                    device pool
                                         ↓
                                iMessage / SMS / RCS
                                         ↓
                                  tenant webhooks

02

Queue semantics and delivery guarantees

senderZ uses Cloudflare Queues for the handoff between the API worker (producer) and the router worker (consumer). Cloudflare Queues gives at-least-once delivery, and senderZ's queue retains messages for 12 hours. senderZ keeps the rest of the pipeline idempotent so duplicate processing is safe.

Retry policy on failure: up to 3 attempts with exponential backoff (2 seconds, 4 seconds, 8 seconds). Each retry increments attempt_count in D1. After 3 failures the message is marked 'failed' with the error reason, a tenant webhook fires (message.failed event), and the queue message is acked.
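The retry policy above can be sketched as a small decision function. This is a minimal illustration, not the router's actual code: the names decideRetry and RetryDecision are hypothetical, and it assumes the three delays (2, 4, 8 seconds) precede the three retry attempts.

```typescript
// Hypothetical sketch of the retry decision described above.
// Assumption: retriesSoFar counts retries already made (0..3),
// and delays of 2s, 4s, 8s precede retries 1, 2, and 3.
const MAX_RETRIES = 3;

type RetryDecision =
  | { action: "retry"; delaySeconds: number }
  | { action: "fail"; status: "failed"; webhookEvent: "message.failed" };

function decideRetry(retriesSoFar: number): RetryDecision {
  if (retriesSoFar >= MAX_RETRIES) {
    // All retries exhausted: mark failed, fire the webhook, ack the queue message.
    return { action: "fail", status: "failed", webhookEvent: "message.failed" };
  }
  // Exponential backoff: 2^(n+1) seconds gives 2, 4, 8.
  return { action: "retry", delaySeconds: 2 ** (retriesSoFar + 1) };
}
```

In a Cloudflare Queues consumer, a "retry" decision would typically map to message.retry({ delaySeconds }) and a "fail" decision to message.ack() after the D1 update.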

A dedicated Dead Letter Queue for messages that fail all retries is on the roadmap (planned for Q3 2026). Today failed messages live in D1 with status='failed' and are visible in the operator dashboard message log.

03

Tenant isolation

Every tenant-scoped row in D1 carries a tenant_id. Every SQL query against those tables includes WHERE tenant_id = ?. The assertTenantOwnership() guard in workers/api/src/guards.ts verifies at the route handler that the authenticated caller owns the resource being modified.
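As an illustration of what an ownership guard of this kind can look like, here is a minimal sketch. The real assertTenantOwnership() signature is not shown on this page, so the parameters, the TenantScoped type, and the TENANT_MISMATCH error code below are assumptions.

```typescript
// Hypothetical shape of a tenant ownership guard; the real guard in
// workers/api/src/guards.ts may differ in signature and error handling.
interface TenantScoped {
  id: string;
  tenant_id: string;
}

function assertTenantOwnership<T extends TenantScoped>(
  authTenantId: string,
  resource: T | null,
): T {
  // A missing row and a row owned by another tenant are treated the same,
  // so a caller cannot probe for the existence of other tenants' resources.
  if (resource === null || resource.tenant_id !== authTenantId) {
    throw new Error("TENANT_MISMATCH");
  }
  return resource;
}
```

The query that loads the resource would still carry WHERE tenant_id = ?; the guard is a second check at the route handler, not a replacement for the filter.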

There is no global row that crosses tenants. There is no admin impersonation feature that reads data as a tenant without an explicit operator action logged. Queue messages carry tenant_id; KV keys are prefixed with tenant_id; webhook payloads include tenant_id.

Enforcement today is code review plus the guard helper. Static analysis in CI to block queries without tenant filtering is the next hardening step (planned for Q3 2026).

04

Phone pool resolution

Tenants are assigned phones through the phone_assignments table. Two modes exist.

Exclusive — a phone reserved for a single tenant (Growth and Scale plans get one or more dedicated numbers). The router always uses the exclusive phone when present.

Pooled — phones shared across tenants (Starter plan). The router selects based on a priority queue: lowest messages_today first (load balancing), then lowest warming_priority (new phones protected), then oldest last_used_at (carrier spam-flag prevention). If no phone is available (all at daily cap), the router fails the message with NO_PHONE_AVAILABLE.
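The pooled selection order can be sketched as a three-key comparison. This is an illustrative sketch only: the PooledPhone shape, the daily_cap field, and the function name are assumptions, while messages_today, warming_priority, last_used_at, and NO_PHONE_AVAILABLE come from the description above.

```typescript
// Hypothetical sketch of pooled phone selection.
interface PooledPhone {
  id: string;
  messages_today: number;
  warming_priority: number;
  last_used_at: number; // epoch milliseconds (assumed representation)
  daily_cap: number; // assumed field for the per-phone daily cap
}

// Negative result means `a` should be preferred over `b`.
function comparePhones(a: PooledPhone, b: PooledPhone): number {
  return (
    a.messages_today - b.messages_today || // 1. least loaded today
    a.warming_priority - b.warming_priority || // 2. lowest warming_priority
    a.last_used_at - b.last_used_at // 3. least recently used
  );
}

function selectPooledPhone(phones: PooledPhone[]): PooledPhone {
  const available = phones.filter((p) => p.messages_today < p.daily_cap);
  if (available.length === 0) {
    throw new Error("NO_PHONE_AVAILABLE");
  }
  // Single pass keeping the best candidate seen so far.
  return available.reduce((best, p) => (comparePhones(p, best) < 0 ? p : best));
}
```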

05

Compliance pipeline

Before every outbound message dispatch, the router runs four checks in this order. Any check failing stops the send with a typed reason code.

  1. Opt-out check

    Lookup in opt_outs for (tenant_id, phone_number) with opted_back_in_at IS NULL. If present, block with reason 'opted_out'.

  2. Quiet hours check

    Marketing messages only. Infer recipient timezone from area code. If local time is between 20:00 and 08:00, requeue for 08:00 local using Cloudflare Queues delayed delivery. OTP and alert messages skip this check.

  3. Warming limit check

    If the recipient is a new contact for the phone, verify we are under the per-phone daily new-contact cap (10/25/50 depending on phone age). If over, route to another phone or block.

  4. Phone resolver

    As described above. Returns the phone to use or fails the message.
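The quiet hours arithmetic in check 2 can be sketched as follows. This is a minimal illustration under stated assumptions: the function and type names are hypothetical, the recipient's local time is passed in as minutes since midnight (the area-code timezone lookup is out of scope), and 20:00 is treated as inside the quiet window while 08:00 is outside it.

```typescript
// Hypothetical sketch of the quiet hours delay calculation.
type MessageType = "marketing" | "otp" | "alert";

const MINUTES_PER_DAY = 24 * 60;
const QUIET_START = 20 * 60; // 20:00 local
const QUIET_END = 8 * 60; // 08:00 local

function isQuietTime(localMinutes: number): boolean {
  // The window wraps past midnight, so it is an OR of two ranges.
  return localMinutes >= QUIET_START || localMinutes < QUIET_END;
}

// Returns how many seconds to delay the message; 0 means send now.
function quietHoursDelaySeconds(type: MessageType, localMinutes: number): number {
  if (type !== "marketing" || !isQuietTime(localMinutes)) {
    return 0; // OTP and alert messages skip the check entirely
  }
  const minutesUntilEnd =
    (QUIET_END - localMinutes + MINUTES_PER_DAY) % MINUTES_PER_DAY;
  return minutesUntilEnd * 60;
}
```

For example, a marketing message evaluated at 23:30 local (1410 minutes) would be delayed 510 minutes, which lands at 08:00 the next morning.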

The pipeline is fail-closed: if any check throws an unexpected error, the message is marked 'failed' rather than sent. This is the opposite of fail-open and is intentional — messaging compliance violations are expensive and the router never silently sends if its validation path is broken.
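A fail-closed runner for ordered checks can be sketched like this. It is an illustration of the pattern, not the router's code: the type names, the 'blocked' and 'ready' statuses, and the 'check_error' reason are assumptions, and the checks are synchronous here for brevity where real ones would likely await D1 lookups.

```typescript
// Hypothetical sketch of a fail-closed compliance pipeline.
interface OutboundMessage {
  tenant_id: string;
  to: string;
}

type CheckResult = { ok: true } | { ok: false; reason: string };
type Check = (msg: OutboundMessage) => CheckResult;

interface PipelineOutcome {
  send: boolean;
  status: "ready" | "blocked" | "failed";
  reason?: string;
}

function runCompliancePipeline(msg: OutboundMessage, checks: Check[]): PipelineOutcome {
  for (const check of checks) {
    try {
      const result = check(msg);
      if (!result.ok) {
        // A check deliberately rejected the send with a typed reason.
        return { send: false, status: "blocked", reason: result.reason };
      }
    } catch (_err) {
      // Fail closed: an unexpected error never falls through to a send.
      return { send: false, status: "failed", reason: "check_error" };
    }
  }
  return { send: true, status: "ready" };
}
```

The design point is that the only path returning send: true is the one where every check ran to completion and passed.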

06

Channel strategy

senderZ supports three channels: iMessage, RCS (Phase 4, active on iOS 18+ numbers), and SMS. The default channel:'auto' cascade is iMessage → RCS → SMS.

iMessage capability is probed at send time by checking the capabilities column on the phone and the destination number's iMessage flag (cached for 24 hours per recipient_capabilities). If iMessage is unavailable, RCS is tried when the sender phone supports it and the recipient is RCS-capable; otherwise SMS is used.
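The channel:'auto' cascade can be sketched as a pure function over the sender's and recipient's capabilities. The type and function names below are hypothetical, as is the shape of the cached capability record; only the iMessage → RCS → SMS order and the 24-hour cache lifetime come from the text above.

```typescript
// Hypothetical sketch of the auto channel cascade.
type Channel = "imessage" | "rcs" | "sms";

interface SenderPhone {
  capabilities: Channel[];
}

interface RecipientCapabilities {
  imessage: boolean;
  rcs: boolean;
  cached_at: number; // epoch milliseconds (assumed representation)
}

const CAPABILITY_TTL_MS = 24 * 60 * 60 * 1000;

// A cached probe result is only trusted for 24 hours.
function isCapabilityFresh(caps: RecipientCapabilities, now: number): boolean {
  return now - caps.cached_at < CAPABILITY_TTL_MS;
}

function resolveAutoChannel(sender: SenderPhone, recipient: RecipientCapabilities): Channel {
  const senderCan = (c: Channel) => sender.capabilities.indexOf(c) !== -1;
  if (senderCan("imessage") && recipient.imessage) return "imessage";
  if (senderCan("rcs") && recipient.rcs) return "rcs";
  return "sms"; // final fallback
}
```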

Today all channels route through senderZ's device bridge on dedicated Apple hardware. A Twilio SMS backend adapter exists as a stub for high-volume SMS. It stays unwired until both volume triggers are met (5,000+ daily SMS and $15,000+ MRR) and the operator makes an explicit decision. This is intentional capital discipline, not a feature limitation.

07

Reliability and recovery

senderZ publishes its Recovery Time Objectives. These are not marketing; they reflect the current production setup.

Failure                | Detection | Recovery  | Action
iMessage bridge crash  | 30s       | 30s       | Automatic (device supervisor restart)
Internet outage        | 30s       | 2–5 min   | Automatic (hotspot failover)
Power outage           | Immediate | 30–60 min | UPS buys time, then manual
Device failure         | 30s       | < 5 min   | Manual standby device activation
Apple hardware failure | 30s       | < 2 hours | Manual restore on standby hardware
Cloudflare outage      | Immediate | Wait      | None (extremely rare)

A public status page at status.senderz.com (planned for Q3 2026) will surface live incidents as they happen. Until then, incidents are communicated through email to affected tenants.

08

Observability

Every delivery attempt generates a tenant webhook (message.sent, message.delivered, message.failed, message.blocked). The operator dashboard exposes a global message log filterable by tenant, phone, status, and date. Cloudflare Analytics covers request/response metrics at the worker level.

SIEM integration (planned for Q4 2026) will push security events (auth failures, admin actions, key revocations) to a customer's SIEM of choice for enterprise accounts.

FAQ

Frequently asked questions

Does senderZ guarantee exactly-once delivery?

No. Cloudflare Queues gives at-least-once delivery. senderZ keeps the pipeline idempotent so duplicate processing is safe: every queue message carries a message_id; the router checks message state in D1 before acting; a message already in status "sent" is a no-op on re-delivery. The effective customer experience is exactly-once in the overwhelming majority of cases, but the contract is at-least-once.
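The state check that makes re-delivery safe can be sketched as follows. This is an illustration with an in-memory map standing in for D1; the function name, the store shape, and the 'queued' status are assumptions.

```typescript
// Hypothetical sketch of an idempotent queue consumer step.
type MessageStatus = "queued" | "sent" | "failed";

// Stand-in for the D1 messages table, keyed by message_id.
type MessageStore = Map<string, MessageStatus>;

function processQueueMessage(
  store: MessageStore,
  messageId: string,
  send: () => void,
): "sent" | "skipped" {
  // Check state before acting: a message already sent is a no-op.
  if (store.get(messageId) === "sent") {
    return "skipped";
  }
  send();
  store.set(messageId, "sent");
  return "sent";
}
```

If the queue delivers the same message_id twice, the second call returns "skipped" without invoking send again.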

How does senderZ handle duplicate inbound webhooks?

Every inbound webhook carries a unique GUID. The inbound handler deduplicates by looking up the GUID in messages before inserting. Duplicates are acknowledged with a 200 but do not re-fire tenant webhooks. This is why tenant webhook deliveries are near-exactly-once even though the underlying device connection occasionally retries.
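The inbound deduplication flow can be sketched in a few lines. This is an illustrative sketch: a Set stands in for the GUID lookup in messages, and the function and field names are hypothetical.

```typescript
// Hypothetical sketch of inbound webhook deduplication by GUID.
interface InboundResult {
  httpStatus: 200;
  fireTenantWebhook: boolean;
}

function handleInboundWebhook(seenGuids: Set<string>, guid: string): InboundResult {
  if (seenGuids.has(guid)) {
    // Duplicate: acknowledge so the sender stops retrying, but do nothing else.
    return { httpStatus: 200, fireTenantWebhook: false };
  }
  seenGuids.add(guid); // stands in for inserting the row into messages
  return { httpStatus: 200, fireTenantWebhook: true };
}
```

Both paths return 200 so that a retrying device connection stops resending; only the first occurrence triggers a tenant webhook.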

What happens when the queue backs up?

senderZ's queue holds messages durably for 12 hours. If the router falls behind (for example, during a delivery engine outage), messages accumulate. On recovery the router works through the backlog roughly in the order messages were enqueued; Cloudflare Queues ordering is best-effort rather than strict FIFO. Customer webhooks for delayed messages fire when the message actually ships, not when it was enqueued. Within the retention window no messages are dropped; they are held.

Is the iMessage bridge a single point of failure?

Today yes, at the per-phone level. Each phone is tied to one bridge instance on the dedicated Apple hardware. If the bridge crashes, the device supervisor restarts it automatically (about 30 seconds to detect, about 30 seconds to recover); during the gap, sends targeting that phone fail and route to the next phone in the pool. A full hardware failure is recoverable manually within 2 hours. The Twilio fallback backend adapter exists in code but is not enabled; enabling it requires 5,000+ daily SMS, $15,000+ MRR, and an explicit operator decision.

What is the attack surface of the delivery hardware?

The dedicated Apple hardware only accepts inbound connections over a Cloudflare Tunnel, which terminates at senderZ's edge. There is no public IP exposed. Outbound it connects to Cloudflare for webhook delivery and to Apple/iCloud for iMessage operation. macOS auto-update is disabled to prevent surprise reboots that would require physical intervention. Physical access is limited to operator premises.

Under what conditions does the Twilio backend activate?

Three conditions, all required: daily SMS volume exceeds 5,000 messages, MRR exceeds $15,000, and the operator makes an explicit decision. The code path is ready: adapters are implemented behind a feature flag. Activation means setting ENABLE_TWILIO=true in the router worker environment and provisioning the TWILIO_* secrets. iMessage traffic always continues through the dedicated Apple device bridge regardless.
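The feature-flag gate can be sketched as a backend selector. This is a sketch under assumptions: the function and type names are hypothetical, the individual TWILIO_* secret names are not listed on this page so the code only checks for any non-empty key with that prefix, and RCS is assumed to stay on the device bridge alongside iMessage.

```typescript
// Hypothetical sketch of backend selection behind the ENABLE_TWILIO flag.
type Channel = "imessage" | "rcs" | "sms";
type Backend = "device_bridge" | "twilio";
type RouterEnv = Record<string, string | undefined>;

// True when at least one TWILIO_-prefixed secret is provisioned and non-empty.
function hasTwilioSecrets(env: RouterEnv): boolean {
  return Object.keys(env).some(
    (key) => key.indexOf("TWILIO_") === 0 && Boolean(env[key]),
  );
}

function selectBackend(channel: Channel, env: RouterEnv): Backend {
  // iMessage always uses the device bridge; RCS is assumed to as well.
  if (channel !== "sms") return "device_bridge";
  const enabled = env["ENABLE_TWILIO"] === "true";
  return enabled && hasTwilioSecrets(env) ? "twilio" : "device_bridge";
}
```

With the flag unset, which is the state described above, every channel resolves to the device bridge.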

Where do queue messages live geographically?

Cloudflare Queues runs as a global distributed system. senderZ does not pin queue regions. Queue messages carry tenant_id, message_id, recipient phone number, and message body, so they have the same data sensitivity as D1. Residency expectations match the rest of the platform; EU-only residency is on the roadmap for Q2 2027.

Request access

Need something formal?

We share our DPA, SOC 2 status, security questionnaire responses, and other formal materials under NDA. Email us or request access below.