
# Sessions

A **session** is a span of activity from a single visitor with no more than 30 minutes of idle time between events. Sessions exist mainly to answer one question well: *what brought this person in?*

In Millimetric, sessions are a **derived** concept. The raw `events` table has a `session_id` per row; the `sessions` materialised view aggregates them.

## How session\_id is derived

If you don't send `session_id`, the server fills it in:

```
session_id = `${anonymous_id}-${30min_bucket}`

30min_bucket = floor(unix_seconds / 1800)
```

So a single visitor's events share a `session_id` as long as they fall inside the same 30-minute bucket. When an event lands in the next bucket, the visitor is in a new session; any gap longer than 30 minutes always crosses at least one bucket boundary.
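The derivation above can be sketched in a few lines. This is a minimal illustration of the formula, not the server's actual code; the function name `deriveSessionId` and the sample ids are made up for the example.

```typescript
// Sketch of the default derivation: anonymous id plus a fixed 30-minute bucket.
const BUCKET_SECONDS = 1800;

// Hypothetical helper name; mirrors `${anonymous_id}-${30min_bucket}`.
function deriveSessionId(anonymousId: string, timestamp: Date): string {
  const unixSeconds = Math.floor(timestamp.getTime() / 1000);
  const bucket = Math.floor(unixSeconds / BUCKET_SECONDS);
  return `${anonymousId}-${bucket}`;
}

// 10:05 and 10:25 UTC sit in the same bucket; 11:10 UTC does not.
const firstId = deriveSessionId("u_abc", new Date("2024-01-01T10:05:00Z"));
const secondId = deriveSessionId("u_abc", new Date("2024-01-01T10:25:00Z"));
const thirdId = deriveSessionId("u_abc", new Date("2024-01-01T11:10:00Z"));

console.log(firstId);  // u_abc-946724
console.log(secondId); // u_abc-946724
console.log(thirdId);  // u_abc-946726
```

Because the function depends only on its two inputs, replaying the same events later yields the same ids.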

This is intentionally simple:

* **Stateless.** No server-side timer per visitor; sessions fall out of timestamps + anonymous id.
* **Deterministic.** The same input produces the same `session_id` whether it's processed live or replayed from a backfill.
* **Cheap.** The classifier and ingest path don't read or write session state.

The trade-off: a visit that straddles a bucket boundary is split into two sessions even when its events are only seconds apart, whereas a "30 min since last event" sliding window would keep it together. For the analytics queries Millimetric optimises for, the difference is rounding noise.
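The boundary effect is easiest to see by running both rules over the same timestamps. The sketch below is illustrative only: `countBucketSessions` and `countSlidingSessions` are hypothetical helpers, and the sliding-window rule is included purely for comparison, since Millimetric uses the fixed-bucket rule.

```typescript
const WINDOW_SECONDS = 1800;

// Fixed buckets: one session per distinct floor(t / 1800) value.
function countBucketSessions(unixSeconds: number[]): number {
  return new Set(unixSeconds.map((t) => Math.floor(t / WINDOW_SECONDS))).size;
}

// Sliding window (for comparison): a new session starts when the gap
// since the previous event exceeds 30 minutes.
function countSlidingSessions(unixSeconds: number[]): number {
  const sorted = [...unixSeconds].sort((a, b) => a - b);
  let sessionCount = 0;
  let previous: number | undefined;
  for (const t of sorted) {
    if (previous === undefined || t - previous > WINDOW_SECONDS) sessionCount += 1;
    previous = t;
  }
  return sessionCount;
}

// 10:00:00 UTC on 2024-01-01 is exactly on a bucket boundary.
const boundary = Date.UTC(2024, 0, 1, 10, 0, 0) / 1000;
const straddling = [boundary + 29 * 60, boundary + 31 * 60]; // 10:29 and 10:31
const sameBucket = [boundary + 31 * 60, boundary + 59 * 60]; // 10:31 and 10:59

console.log(countBucketSessions(straddling), countSlidingSessions(straddling)); // 2 1
console.log(countBucketSessions(sameBucket), countSlidingSessions(sameBucket)); // 1 1
```

Two events two minutes apart become two sessions when they fall on either side of 10:30, while two events 28 minutes apart inside one bucket stay together.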

## Overriding session\_id

You should override only when you need session boundaries to mean something specific in your product. For example:

* **A long-form video player** where you want a "watch session" to span hours.
* **A multi-step onboarding** that you want grouped even across reloads.
* **An AI agent** where each `task_id` is a session for telemetry purposes.

```ts
import { track } from "@millimetric/track-node";

track({
  event: "tool_called",
  anonymous_id: "agent_001",
  user_id: "user_42",
  session_id: `task_${task.id}`,    // group every event in this task
  properties: { tool: "search_repo" }
});
```

The browser SDK doesn't expose a `setSessionId` — the auto-derived one is what you want for web. Override only via the HTTP layer.

## The sessions materialised view

`sessions` is a ClickHouse `AggregatingMergeTree` view that groups events by `(project_id, session_id, anonymous_id)` and stores:

| Column                                         | What it is                                              |
| ---------------------------------------------- | ------------------------------------------------------- |
| `session_id`                                   | derived or user-supplied                                |
| `anonymous_id`                                 | the visitor                                             |
| `user_id`                                      | filled in if `/v1/identify` happened during the session |
| `started_at`                                   | min(timestamp)                                          |
| `ended_at`                                     | max(timestamp)                                          |
| `duration_s`                                   | `ended_at - started_at`, in seconds                     |
| `event_count`                                  | count of all events in the session                      |
| `pageview_count`                               | count of `$pageview` events                             |
| `entry_source / entry_medium / entry_campaign` | classifier output on the **first** event of the session |
| `entry_url / entry_path / entry_referrer`      | from the first event                                    |
| `country / device_type / browser / os`         | first event's enrichment                                |
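To make the column semantics concrete, here is a rough in-memory sketch of how the events of one session could fold into a row. It is not how the ClickHouse view is defined; the `TrackedEvent` and `SessionRow` shapes are trimmed-down assumptions, and `source` / `medium` (including the `"none"` placeholder value) are assumed names for the per-event classifier output.

```typescript
// Hypothetical, trimmed-down shapes: the real tables have more columns.
interface TrackedEvent {
  session_id: string;
  anonymous_id: string;
  user_id: string | null;
  event: string;
  timestamp: number; // unix seconds
  source: string;    // assumed name for per-event classifier output
  medium: string;
}

interface SessionRow {
  session_id: string;
  anonymous_id: string;
  user_id: string | null;
  started_at: number;
  ended_at: number;
  duration_s: number;
  event_count: number;
  pageview_count: number;
  entry_source: string;
  entry_medium: string;
}

// Fold the events of ONE session into a single row.
function buildSessionRow(sessionEvents: TrackedEvent[]): SessionRow {
  const ordered = [...sessionEvents].sort((a, b) => a.timestamp - b.timestamp);
  const firstEvent = ordered[0];
  const lastEvent = ordered[ordered.length - 1];
  if (firstEvent === undefined || lastEvent === undefined) {
    throw new Error("a session needs at least one event");
  }
  // user_id is filled in if any event in the session carried one.
  const identified = ordered.find((e) => e.user_id !== null);
  return {
    session_id: firstEvent.session_id,
    anonymous_id: firstEvent.anonymous_id,
    user_id: identified ? identified.user_id : null,
    started_at: firstEvent.timestamp,
    ended_at: lastEvent.timestamp,
    duration_s: lastEvent.timestamp - firstEvent.timestamp,
    event_count: ordered.length,
    pageview_count: ordered.filter((e) => e.event === "$pageview").length,
    // Entry attribution always comes from the first event.
    entry_source: firstEvent.source,
    entry_medium: firstEvent.medium,
  };
}

const sharedIds = { session_id: "u_abc-946724", anonymous_id: "u_abc" };
const sessionRow = buildSessionRow([
  { ...sharedIds, user_id: null, event: "$pageview", timestamp: 1704103500, source: "google", medium: "organic" },
  { ...sharedIds, user_id: null, event: "$pageview", timestamp: 1704103560, source: "direct", medium: "none" },
  { ...sharedIds, user_id: "user_42", event: "$identify", timestamp: 1704103620, source: "direct", medium: "none" },
  { ...sharedIds, user_id: "user_42", event: "signup", timestamp: 1704103800, source: "direct", medium: "none" },
]);
console.log(sessionRow);
```

In this sample the row keeps `google` / `organic` as the entry channel and picks up `user_42`, even though the later events classify as `direct`.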

### Why `entry_source` is the right answer for attribution

Per-event attribution is per-event truth: a server-side `signup` POSTed without `url` or `referrer` will classify as `direct`. That's correct *for that event*, but it's the wrong answer for "where did this customer come from".

The `entry_source` on a session is the first-touch attribution for that visit. Query it instead of raw events when you want to credit the channel that brought someone in.

```sql
SELECT
  entry_source,
  entry_medium,
  count() AS sessions,
  uniq(anonymous_id) AS visitors
FROM sessions
WHERE project_id = '...'
  AND started_at > now() - INTERVAL 7 DAY
GROUP BY entry_source, entry_medium
ORDER BY sessions DESC;
```

This is what powers [`/v1/sources`](/api-reference/sources.md) — the endpoint that gives you Facebook *paid* and Facebook *social* on separate rows.
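If you post-process exported rows in application code, the same idea looks roughly like this: credit a conversion to its session's entry channel rather than to the event's own classification. The shapes and the `creditedSource` helper are hypothetical, and `source` is an assumed name for the per-event classifier output.

```typescript
// Hypothetical shapes for exported rows.
interface ConversionEvent {
  session_id: string;
  event: string;
  source: string; // assumed name for the per-event classification
}

interface SessionEntry {
  session_id: string;
  entry_source: string;
  entry_medium: string;
}

// Credit a conversion to its session's entry channel, falling back to the
// event's own classification only when no session row is known.
function creditedSource(
  conversion: ConversionEvent,
  sessionsById: Map<string, SessionEntry>
): string {
  const session = sessionsById.get(conversion.session_id);
  return session ? session.entry_source : conversion.source;
}

const knownSessions = new Map<string, SessionEntry>([
  ["u_abc-946724", { session_id: "u_abc-946724", entry_source: "facebook", entry_medium: "paid" }],
]);

// A server-side signup with no url or referrer classifies as direct on its own.
const serverSignup: ConversionEvent = {
  session_id: "u_abc-946724",
  event: "signup",
  source: "direct",
};

console.log(creditedSource(serverSignup, knownSessions)); // facebook
```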

## Pre-login → post-login in a session

If a visitor identifies *during* a session, the `sessions` row records all of the following:

* `anonymous_id = u_abc` for the whole session
* `user_id = user_42` because at least one event in the session carried it (the `$identify` itself)
* `entry_source / entry_medium` from the first event (which was anonymous)

Result: the session is correctly credited to the channel that brought them in, with the user id stitched on.

## Multiple sessions per visitor

This is common. Treat each session independently for attribution purposes:

```sql
-- visitor's session count and channel mix
SELECT
  anonymous_id,
  count() AS sessions,
  groupArray(entry_source) AS channels
FROM sessions
WHERE project_id = '...'
  AND started_at > now() - INTERVAL 30 DAY
GROUP BY anonymous_id
HAVING sessions > 1
ORDER BY sessions DESC
LIMIT 50;
```

Multi-touch attribution (e.g. crediting each channel proportionally) is a query-layer concern: Millimetric stores raw events and first-touch attribution, and you compose more sophisticated models on top.
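As one example of such a model, a linear (equal-share) split can be computed from the `channels` array returned by the query above. This is a sketch of one common approach, not something Millimetric ships; `linearCredit` is a hypothetical helper.

```typescript
// Linear multi-touch: each session's entry channel gets an equal share
// of one unit of credit for the visitor.
function linearCredit(channels: string[]): Map<string, number> {
  const credit = new Map<string, number>();
  if (channels.length === 0) return credit;
  const share = 1 / channels.length;
  for (const channel of channels) {
    credit.set(channel, (credit.get(channel) ?? 0) + share);
  }
  return credit;
}

// Four sessions for one visitor, two of them entering via facebook.
const channelMix = linearCredit(["facebook", "google", "facebook", "direct"]);

console.log(channelMix.get("facebook")); // 0.5
console.log(channelMix.get("google"));   // 0.25
console.log(channelMix.get("direct"));   // 0.25
```

Other weightings, such as time decay or position-based credit, would follow the same pattern with a different `share` per session.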

## Sessions for the FB social-vs-paid split

Combining sessions with the classifier is the whole point.

```sql
SELECT
  entry_source,
  entry_medium,
  count() AS sessions,
  uniq(anonymous_id) AS visitors,
  countIf(event_count > 1) AS engaged_sessions
FROM sessions
WHERE project_id = '...'
  AND entry_source = 'facebook'
  AND started_at > now() - INTERVAL 30 DAY
GROUP BY entry_source, entry_medium;
```

Example output:

```
entry_source | entry_medium | sessions | visitors | engaged_sessions
facebook     | paid         |    432   |   391    |       301
facebook     | social       |    187   |   180    |       113
```

That breakdown is the answer to the question every marketer asks and almost no analytics tool answers cleanly.

## What sessions don't do

* **No cross-device stitching.** A visitor on a phone and a laptop has two different `anonymous_id`s and therefore different sessions. Use `user_id` (post-identify) to roll those up.
* **No session-level revenue.** Attach `amount_cents` to the conversion event itself; aggregate per-session in your query.
* **No engagement scoring out of the box.** `event_count`, `pageview_count`, `duration_s` give you the raw material — score it however you like.
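For the cross-device case, a roll-up over exported session rows might look like the sketch below. It assumes a simplified `SessionSummary` shape and a hypothetical `sessionsPerPerson` helper; sessions that never carried a `user_id` stay keyed by `anonymous_id`, so pre-identify visits on a second device are not merged.

```typescript
// Hypothetical, simplified session row.
interface SessionSummary {
  session_id: string;
  anonymous_id: string;
  user_id: string | null;
}

// Count sessions per person, preferring user_id when the session has one.
// Keys are prefixed so a user id can never collide with an anonymous id.
function sessionsPerPerson(rows: SessionSummary[]): Map<string, number> {
  const counts = new Map<string, number>();
  for (const row of rows) {
    const person = row.user_id !== null ? `user:${row.user_id}` : `anon:${row.anonymous_id}`;
    counts.set(person, (counts.get(person) ?? 0) + 1);
  }
  return counts;
}

const rollup = sessionsPerPerson([
  { session_id: "u_phone-946724", anonymous_id: "u_phone", user_id: "user_42" },
  { session_id: "u_laptop-946730", anonymous_id: "u_laptop", user_id: "user_42" },
  { session_id: "u_xyz-946731", anonymous_id: "u_xyz", user_id: null },
]);

console.log(rollup.get("user:user_42")); // 2
console.log(rollup.get("anon:u_xyz"));   // 1
```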

## See also

* [Events](/core-concepts/events.md) — what flows into a session.
* [Identities](/core-concepts/identities.md) — `anonymous_id` vs `user_id`.
* [Attribution](/core-concepts/attribution.md) — how `entry_source` is determined.
* [GET /v1/sources](/api-reference/sources.md) — the read endpoint that uses sessions under the hood.

