# Architecture

```
clients (curl / browser SDK / Node SDK / MCP-speaking AI agent)
        │  Bearer <api-key>
        ▼
┌────────────────────────────────────────────────┐
│  Hono on Cloudflare Workers                    │
│  /v1/track    /v1/batch    /v1/identify        │
│  /v1/forget   /v1/query    /v1/stats           │
│  /v1/sources  /mcp         /internal/retention │
└──────┬─────────────────────────┬───────────────┘
       │ control plane           │ event plane
       ▼                         ▼
┌──────────────────┐    ┌────────────────────────┐
│ Supabase         │    │ ClickHouse Cloud       │
│ • auth.users     │    │ • events               │
│ • projects       │    │ • daily_rollup MV      │
│ • api_keys       │    │ • sessions MV          │
│ • usage_meter    │    │                        │
└──────────────────┘    └────────────────────────┘
```

## Why this split

* **Supabase (Postgres)** holds the control plane: users, projects, API keys, and the billing meter. Row-level security and auth come for free. Rows change mostly on UI actions, so write volume is low.
* **ClickHouse** holds the event plane: an append-only column store, queried with parameterised SELECTs over partitioned data. It is built for the analytics workload.

ClickHouse never sees the user's API key. Every query is parameterised on `project_id`, which the API derives from the key, and that is the only multi-tenant boundary the query plane needs.
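
As a rough illustration of that boundary, the sketch below builds a parameterised request for the ClickHouse HTTP interface, which accepts `{name:Type}` placeholders in the SQL and matching `param_<name>` values in the URL. The helper name `buildSelectRequest`, the host, and the `event` and `timestamp` columns are assumptions made for the example; only the `events` table and the `project_id` parameter come from this page.

```typescript
// Sketch only: helper name, host, and column names are hypothetical.
type QueryParams = { [name: string]: string | number };

// The SQL text carries typed placeholders; values travel separately as
// `param_<name>` URL parameters, so tenant input is never spliced into SQL.
function buildSelectRequest(
  baseUrl: string,
  sql: string,
  params: QueryParams,
): { url: string; body: string } {
  const pairs = Object.keys(params).map(
    (name) =>
      `param_${encodeURIComponent(name)}=${encodeURIComponent(String(params[name]))}`,
  );
  return { url: `${baseUrl}/?${pairs.join("&")}`, body: sql };
}

const exampleSql =
  "SELECT event, count() AS n FROM events " +
  "WHERE project_id = {project_id:String} " +
  "AND timestamp >= now() - INTERVAL {days:UInt16} DAY " +
  "GROUP BY event FORMAT JSONEachRow";

// `project_id` would be derived from the authenticated key, never from the request body.
const exampleRequest = buildSelectRequest(
  "https://clickhouse.example.invalid:8443",
  exampleSql,
  { project_id: "proj_123", days: 7 },
);
```

The resulting `url` and `body` could then be sent as a POST with `fetch`. Because `project_id` is always bound as a parameter, a route or MCP tool cannot widen the query to another tenant by changing its SQL inputs.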

## Request shape

1. Client sends `Authorization: Bearer {kind}_{env}_{prefix}_{secret}`.
2. Worker parses the key, looks up the prefix in Supabase (cached 5 min in-process), and constant-time compares `sha256(secret + pepper)` against the stored hash.
3. Worker applies scope (`ingest` / `read` / `admin`) and — for `pk_` keys — origin allowlist.
4. For ingest, the classifier runs server-side over `(url, referrer, request_host)` and the row goes into ClickHouse via the HTTP interface as `JSONEachRow`.
5. For read, the route or MCP tool builds a parameterised SELECT through `services/events.ts` and returns JSON.
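
Steps 1 and 2 can be sketched roughly as follows. This is a minimal sketch, not the code in `auth/apiKey.ts`: the `ParsedKey` shape, the function names, and the choice to let the secret contain underscores are assumptions. Hashing is left out; in a Worker the digest would typically be computed with Web Crypto before the comparison.

```typescript
// Hypothetical shape; only the four-part key layout comes from the steps above.
interface ParsedKey {
  kind: string; // e.g. "pk" or "sk"
  env: string;
  prefix: string; // lookup handle used to find the stored hash
  secret: string; // compared only after hashing with the pepper
}

// Parse `Authorization: Bearer {kind}_{env}_{prefix}_{secret}`.
// Assumption: everything after the third underscore belongs to the secret.
function parseBearer(header: string | undefined): ParsedKey | null {
  if (!header || header.slice(0, 7) !== "Bearer ") return null;
  const parts = header.slice(7).trim().split("_");
  if (parts.length < 4) return null;
  const [kind, env, prefix, ...rest] = parts;
  const secret = rest.join("_");
  if (!kind || !env || !prefix || !secret) return null;
  return { kind, env, prefix, secret };
}

// Compare two hex digests without exiting at the first mismatching character.
// The length check may return early: digest length is fixed and not secret.
function constantTimeEqual(a: string, b: string): boolean {
  if (a.length !== b.length) return false;
  let diff = 0;
  for (let i = 0; i < a.length; i++) {
    diff |= a.charCodeAt(i) ^ b.charCodeAt(i);
  }
  return diff === 0;
}
```

Looking up by `prefix` first means the secret itself is never used as a database key, and the constant-time comparison avoids leaking how many leading characters of the hash matched.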

## Privacy invariants

* Raw IP is never persisted. IPs are HMAC'd with a daily-rotating salt; only the country is kept.
* No cookies. The caller supplies `anonymous_id`; the browser SDK stores it in `localStorage`.
* The `properties` payload is capped at 8 KB; oversized blobs are replaced with a stub.
* Per-project `retention_days` is enforced by `/internal/retention/run`, called on a Cron Trigger schedule, which issues per-project `ALTER TABLE … DELETE`.
* `/v1/forget` issues an immediate parameterised delete for `(project_id, user_id)` — `sk_*` keys only.
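
The `properties` cap could look something like the sketch below. Several details are assumptions: that 8 KB means 8192 bytes, that the limit is measured on the UTF-8 size of the serialised JSON, and the stub's field names (`_truncated`, `_original_bytes`), which are invented for illustration.

```typescript
// Assumption: "8 KB" is read as 8 * 1024 bytes of serialised JSON.
const MAX_PROPERTIES_BYTES = 8 * 1024;

type Properties = { [key: string]: unknown };

// Return the payload unchanged when it fits, otherwise a small stub.
// The stub shape is hypothetical; the source only says a stub replaces the blob.
function capProperties(
  props: Properties,
  maxBytes: number = MAX_PROPERTIES_BYTES,
): Properties {
  const json = JSON.stringify(props);
  // Measure bytes rather than string length so multi-byte characters count fully.
  const bytes = new TextEncoder().encode(json).length;
  if (bytes <= maxBytes) return props;
  return { _truncated: true, _original_bytes: bytes };
}
```

Replacing the blob rather than rejecting the event keeps ingestion lossless for the event itself while bounding row size.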

## File map

| Path                              | What lives there                                         |
| --------------------------------- | -------------------------------------------------------- |
| `apps/api/src/index.ts`           | App entry; routes wired into Hono.                       |
| `apps/api/src/routes/`            | One file per REST endpoint.                              |
| `apps/api/src/mcp/server.ts`      | JSON-RPC MCP handler (tools + resources).                |
| `apps/api/src/auth/apiKey.ts`     | Bearer middleware, key parsing, origin allowlist.        |
| `apps/api/src/auth/rateLimit.ts`  | In-memory token bucket per project + route.              |
| `apps/api/src/services/events.ts` | Shared SQL helpers used by REST + MCP.                   |
| `apps/api/src/clickhouse/`        | HTTP client (insert + parameterised select).             |
| `apps/api/src/supabase/`          | Minimal PostgREST client.                                |
| `apps/api/src/util/buildRow.ts`   | Validated input → `EventRow`. Runs classifier + IP hash. |
| `packages/classifier/`            | Pure source/medium classifier (no deps).                 |
| `packages/schema/`                | Zod schemas shared by API + SDKs.                        |
| `packages/sdk-browser/`           | `@millimetric/track` — npm entry + CDN snippet.          |
| `packages/sdk-node/`              | `@millimetric/track-node` — server-side wrapper.         |
| `infra/supabase/migrations/`      | Control-plane DDL + RLS.                                 |
| `infra/clickhouse/migrations/`    | Events table + materialised views.                       |

## Operational notes

* **Cron**: schedule `/internal/retention/run` daily with `X-Internal-Secret: $API_KEY_PEPPER`.
* **Rate limiting**: an in-memory token bucket today, kept per Worker instance, so limits are not shared across instances. Move to a Durable Object if free-tier abuse becomes real.
* **Auth cache**: Workers cache key lookups for 5 minutes, so a rotated or revoked key can still be accepted for up to that TTL on any instance that has it cached.
* **Bundle**: the Worker imports `hono`, `zod`, and the local classifier — no MCP SDK in the bundle, which keeps cold-start small.
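
For the rate-limiting note, an in-memory token bucket keyed by project and route might look roughly like this. It is a sketch under assumptions, not the contents of `auth/rateLimit.ts`: the class name, the capacity and refill parameters, and the injectable clock are all hypothetical.

```typescript
interface Bucket {
  tokens: number;
  updatedAt: number; // milliseconds since epoch
}

// One bucket per `${projectId}:${route}`. State lives in this instance's
// memory only, so every Worker instance enforces its own independent limit.
class TokenBucketLimiter {
  private buckets = new Map<string, Bucket>();
  private capacity: number;
  private refillPerSec: number;
  private now: () => number;

  constructor(capacity: number, refillPerSec: number, now: () => number = () => Date.now()) {
    this.capacity = capacity;
    this.refillPerSec = refillPerSec;
    this.now = now;
  }

  // Returns true and consumes one token if the request is allowed.
  take(projectId: string, route: string): boolean {
    const key = `${projectId}:${route}`;
    const t = this.now();
    const bucket = this.buckets.get(key) ?? { tokens: this.capacity, updatedAt: t };
    // Refill in proportion to elapsed time, never above capacity.
    const elapsedSec = (t - bucket.updatedAt) / 1000;
    bucket.tokens = Math.min(this.capacity, bucket.tokens + elapsedSec * this.refillPerSec);
    bucket.updatedAt = t;
    const allowed = bucket.tokens >= 1;
    if (allowed) bucket.tokens -= 1;
    this.buckets.set(key, bucket);
    return allowed;
  }
}
```

Because each instance keeps its own map, the effective limit scales with the number of live instances. Moving the map into a Durable Object would make the bucket global for a given key.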

