Sessions
How session_id is derived, when to override it, and what the sessions view does for you.
A session is a span of activity from a single visitor with no more than 30 minutes of idle time between events. Sessions exist mainly to answer one question well: what brought this person in?
In Millimetric, sessions are a derived concept. The raw events table has a session_id per row; the sessions materialised view aggregates them.
How session_id is derived
If you don't send session_id, the server fills it in:
session_id = `${anonymous_id}-${30min_bucket}`
30min_bucket = floor(unix_seconds / 1800)
So a single visitor's events get the same session_id as long as they fall inside the same 30-minute window. Once the clock crosses into the next window, the bucket increments and the visitor is in a new session, so a gap of 30 minutes or more always starts a new one.
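As a minimal sketch of the formula above, here is the derivation in Python. The helper name derive_session_id is hypothetical, and this is only an illustration of the documented rule, not the server's actual code.

```python
def derive_session_id(anonymous_id: str, unix_seconds: int) -> str:
    """Mirror the documented formula: anonymous id plus a 30-minute bucket."""
    bucket = unix_seconds // 1800  # floor(unix_seconds / 1800)
    return f"{anonymous_id}-{bucket}"


# Two events 15 minutes apart inside one window share an id.
a = derive_session_id("u_abc", 1_700_000_000)
b = derive_session_id("u_abc", 1_700_000_900)
# An event that lands in the next window gets a new id.
c = derive_session_id("u_abc", 1_700_001_000)
```

Because the function depends only on its two inputs, replaying the same events later yields the same ids.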
This is intentionally simple:
Stateless. No server-side timer per visitor; sessions fall out of timestamps + anonymous id.
Deterministic. The same input produces the same session_id whether it's processed live or replayed from a backfill.
Cheap. The classifier and ingest path don't read or write session state.
The trade-off: edges of the bucket can split or merge sessions slightly differently than a "30 min since last event" sliding window. For the analytics queries Millimetric optimises for, the difference is rounding noise.
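To make the edge effect concrete, the sketch below counts sessions both ways for the same timestamps. Both function names are hypothetical, and the sliding-window version is only a comparison baseline; Millimetric itself uses the fixed bucket.

```python
def bucket_session_count(timestamps: list[int]) -> int:
    """Count sessions using fixed 30-minute buckets (the documented rule)."""
    return len({t // 1800 for t in timestamps})


def sliding_session_count(timestamps: list[int], idle_s: int = 1800) -> int:
    """Count sessions using a '30 min since last event' sliding window."""
    ordered = sorted(timestamps)
    if not ordered:
        return 0
    count = 1
    for prev, cur in zip(ordered, ordered[1:]):
        if cur - prev > idle_s:  # idle gap too long: start a new session
            count += 1
    return count


# Two events only 200 seconds apart that straddle a bucket boundary.
events = [1_700_000_900, 1_700_001_100]
by_bucket = bucket_session_count(events)
by_sliding = sliding_session_count(events)
```

Here the fixed bucket sees two sessions where the sliding window sees one, which is the kind of boundary split described above.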
Overriding session_id
You should override only when you need session boundaries to mean something specific in your product. For example:
A long-form video player where you want a "watch session" to span hours.
A multi-step onboarding that you want grouped even across reloads.
An AI agent where each task_id is a session for telemetry purposes.
The browser SDK doesn't expose a setSessionId — the auto-derived one is what you want for web. Override only via the HTTP layer.
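A rough sketch of what an override could look like when building a request body yourself. Only session_id, anonymous_id and timestamp appear on this page; the event key, the build_event helper and the overall payload shape are assumptions, and the ingest endpoint is deliberately left out because it is not named here.

```python
import json
from typing import Optional


def build_event(anonymous_id: str, event: str, unix_seconds: int,
                session_id: Optional[str] = None) -> dict:
    """Build an event body; include session_id only when overriding."""
    body = {
        "anonymous_id": anonymous_id,
        "event": event,            # assumed key name for the event type
        "timestamp": unix_seconds,
    }
    if session_id is not None:
        body["session_id"] = session_id  # explicit override
    return body


# Default: omit session_id and let the server derive it.
auto = build_event("u_abc", "$pageview", 1_700_000_000)
# Override: group every event of one agent task into one session.
task = build_event("u_abc", "tool_call", 1_700_000_000, session_id="task-123")
payload = json.dumps(task)
```

Leaving session_id out entirely, rather than sending an empty value, keeps the default derivation in play.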
The sessions materialised view
sessions is a ClickHouse AggregatingMergeTree view that groups events by (project_id, session_id, anonymous_id) and stores:
session_id: derived or user-supplied
anonymous_id: the visitor
user_id: filled in if /v1/identify happened during the session
started_at: min(timestamp)
ended_at: max(timestamp)
duration_s: ended_at - started_at
event_count: count of all events in the session
pageview_count: count of $pageview events
entry_source / entry_medium / entry_campaign: classifier output on the first event of the session
entry_url / entry_path / entry_referrer: from the first event
country / device_type / browser / os: first event's enrichment
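The roll-up can be pictured as a plain aggregation over one session's events. The sketch below is an in-memory Python illustration, not the ClickHouse view definition; the summarize_session helper and the per-event keys event, source and medium are assumed names.

```python
def summarize_session(events: list[dict]) -> dict:
    """Collapse the events of one session into a single summary row."""
    ordered = sorted(events, key=lambda e: e["timestamp"])
    first, last = ordered[0], ordered[-1]
    return {
        "session_id": first["session_id"],
        "anonymous_id": first["anonymous_id"],
        # First non-empty user_id seen in the session, if any.
        "user_id": next((e["user_id"] for e in ordered if e.get("user_id")), None),
        "started_at": first["timestamp"],
        "ended_at": last["timestamp"],
        "duration_s": last["timestamp"] - first["timestamp"],
        "event_count": len(ordered),
        "pageview_count": sum(1 for e in ordered if e["event"] == "$pageview"),
        # Entry fields always come from the first event.
        "entry_source": first.get("source"),
        "entry_medium": first.get("medium"),
    }


base = {"session_id": "u_abc-944444", "anonymous_id": "u_abc"}
events = [
    {**base, "event": "$pageview", "timestamp": 1_700_000_000,
     "user_id": None, "source": "facebook", "medium": "paid"},
    {**base, "event": "$identify", "timestamp": 1_700_000_300,
     "user_id": "user_42", "source": "direct", "medium": None},
    {**base, "event": "$pageview", "timestamp": 1_700_000_600,
     "user_id": "user_42", "source": "direct", "medium": None},
]
row = summarize_session(events)
```

Note that the later events classify as direct on their own, yet the row keeps the first event's source and medium.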
Why entry_source is the right answer for attribution
Per-event attribution is per-event truth: a server-side signup POSTed without url or referrer will classify as direct. That's correct for that event, but it's the wrong answer for "where did this customer come from".
The entry_source on a session is the first-touch attribution for that visit. Query it instead of raw events when you want to credit the channel that brought someone in.
This is what powers /v1/sources — the endpoint that gives you Facebook paid and Facebook social on separate rows.
Pre-login → post-login in a session
If a visitor identifies during a session, the sessions row sees both:
anonymous_id = u_abc for the whole session
user_id = user_42 because at least one event in the session carried it (the $identify itself)
entry_source / entry_medium from the first event (which was anonymous)
Result: the session is correctly credited to the channel that brought them in, with the user id stitched on.
Multiple sessions per visitor
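Common. Treat them independently for attribution purposes: each session carries its own entry_source, so each visit is credited to the channel that started it.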
Multi-touch attribution (e.g. credit each channel proportionally) is a query-layer concern — Millimetric stores raw and first-touch; you compose more sophisticated models on top.
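As one example of a model composed on top, the sketch below splits credit evenly across the entry_source of each of a visitor's sessions (a linear multi-touch model). The function name linear_credit is hypothetical, and this runs over values you have already queried; it is not something Millimetric computes for you.

```python
from collections import defaultdict


def linear_credit(entry_sources: list[str]) -> dict[str, float]:
    """Give each session an equal share of one conversion's credit."""
    if not entry_sources:
        return {}
    share = 1 / len(entry_sources)
    credit: dict[str, float] = defaultdict(float)
    for source in entry_sources:
        credit[source] += share
    return dict(credit)


# One visitor, four sessions before converting.
credit = linear_credit(["facebook", "google", "facebook", "direct"])
```

With four sessions each one carries a quarter of the credit, so facebook ends up with half and the other two channels with a quarter each.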
Sessions for the FB social-vs-paid split
Combining sessions + the classifier is the whole point.
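A minimal sketch of that combination, assuming you have session rows in hand with entry_source and entry_medium. The helper name and the medium labels "paid" and "social" are illustrative assumptions, not confirmed classifier output values.

```python
from collections import Counter


def sessions_by_channel(sessions: list[dict]) -> Counter:
    """Count sessions per (entry_source, entry_medium) pair."""
    return Counter((s["entry_source"], s["entry_medium"]) for s in sessions)


sessions = [
    {"entry_source": "facebook", "entry_medium": "paid"},
    {"entry_source": "facebook", "entry_medium": "social"},
    {"entry_source": "facebook", "entry_medium": "paid"},
    {"entry_source": "google", "entry_medium": "organic"},
]
counts = sessions_by_channel(sessions)
```

Keying on the pair rather than on entry_source alone is what keeps the two kinds of Facebook traffic on separate rows.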
Grouping sessions by entry_source and entry_medium gives the breakdown every marketer asks for and almost no analytics tool answers cleanly.
What sessions don't do
No cross-device stitching. A visitor on a phone and a laptop has two different anonymous_ids and therefore different sessions. Use user_id (post-identify) to roll those up.
No session-level revenue. Attach amount_cents to the conversion event itself; aggregate per-session in your query.
No engagement scoring out of the box. event_count, pageview_count, duration_s give you the raw material; score it however you like.
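For the revenue case, per-session aggregation can be sketched as a sum of amount_cents keyed by session_id. This is an in-memory Python illustration with a hypothetical helper name; in practice you would express the same grouping in your query.

```python
from collections import defaultdict


def revenue_by_session(events: list[dict]) -> dict[str, int]:
    """Sum amount_cents per session, skipping events that carry no amount."""
    totals: dict[str, int] = defaultdict(int)
    for event in events:
        amount = event.get("amount_cents")
        if amount is not None:
            totals[event["session_id"]] += amount
    return dict(totals)


events = [
    {"session_id": "u_abc-944444", "amount_cents": None},
    {"session_id": "u_abc-944444", "amount_cents": 4900},
    {"session_id": "u_abc-944444", "amount_cents": 1000},
    {"session_id": "u_xyz-944450", "amount_cents": 2500},
]
totals = revenue_by_session(events)
```

Keeping amounts in integer cents avoids floating-point drift when summing.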
See also
Events: what flows into a session.
Identities: anonymous_id vs user_id.
Attribution: how entry_source is determined.
GET /v1/sources: the read endpoint that uses sessions under the hood.