How DevClocked Measures

Every number DevClocked shows is derived, not guessed. Here is exactly how each metric is built, when it is measured versus estimated, and where the limits are.

Time Slice classification

Activity ticks are grouped into blocks, and each block is labelled as one of six categories. Each classification carries a confidence from 0.3 to 0.95 based on how directly the signals matched: a live debug session scores high, while a default file-edit guess scores low. That confidence is shown on every block and averaged per session.

Building — file creation, significant additions, source edits
Debugging — debug sessions, breakpoints, terminal errors, test files
Refactoring — refactor commands, balanced add/delete churn
Planning — docs, AI tools, design tools, issue trackers
Config — config files, CI paths, dependency manifests, dotfiles
Review — read-heavy passes over code, PR/MR review URLs
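
A minimal sketch of how a block might be labelled and scored, assuming a simple "strongest signal wins" rule. The signal names and the individual scores in `SIGNAL_RULES` are hypothetical; only the six categories and the 0.3 to 0.95 confidence range come from the description above.

```python
from dataclasses import dataclass
from statistics import mean
from typing import List, Optional, Set

# Bounds taken from the text: confidence ranges from 0.3 to 0.95.
MIN_CONFIDENCE = 0.3
MAX_CONFIDENCE = 0.95

# Hypothetical rule table: (signal name, category, confidence).
# Names and scores are illustrative, not DevClocked's actual rules.
SIGNAL_RULES = [
    ("debug_session_active", "Debugging", 0.95),
    ("refactor_command", "Refactoring", 0.85),
    ("pr_review_url", "Review", 0.80),
    ("config_file_edit", "Config", 0.75),
    ("docs_or_issue_tracker", "Planning", 0.70),
    ("file_created", "Building", 0.70),
]


@dataclass
class Block:
    category: str
    confidence: float


def classify_block(signals: Set[str]) -> Block:
    """Pick the strongest matching signal; fall back to a weak default guess."""
    matches = [(conf, cat) for name, cat, conf in SIGNAL_RULES if name in signals]
    if not matches:
        # Default file-edit guess: lowest allowed confidence.
        return Block("Building", MIN_CONFIDENCE)
    conf, cat = max(matches)
    # Keep the score inside the documented range.
    return Block(cat, min(MAX_CONFIDENCE, max(MIN_CONFIDENCE, conf)))


def session_confidence(blocks: List[Block]) -> Optional[float]:
    """Average block confidence for a session; None when there are no blocks."""
    return mean(b.confidence for b in blocks) if blocks else None
```

Under these assumed rules, a block with an active debug session is labelled Debugging at the top of the range, while a block with no recognised signal falls back to Building at 0.3.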

Leverage

Leverage expresses how much output your attention unlocked. It is computed from the composition of your work (hands-on coding versus agent orchestration) together with agent output signals. When agent runtime telemetry corroborates it, we treat it as measured; without that telemetry it is an estimate from work patterns, and the session view flags it as "est".

Measured — agent runtime was captured for the session. Leverage reflects the real agent-to-attention ratio.
Estimated — no agent runtime present. Leverage is inferred from orchestration share and quality, so it is directional, not exact.

Leverage is decomposable: it always drills into its causes rather than standing alone as a bare multiplier.
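
The measured-versus-estimated split can be sketched as follows. This is an illustration under stated assumptions, not DevClocked's formula: the measured branch takes "agent-to-attention ratio" literally as agent runtime divided by attention time, and the estimated branch uses a placeholder formula invented for this example. The type `Leverage` and the function `compute_leverage` are hypothetical names.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class Leverage:
    value: float
    basis: str                 # "measured" or "estimated"
    causes: Dict[str, float]   # inputs behind the value, kept for drill-down


def compute_leverage(attention_s: float,
                     orchestration_share: float,
                     agent_runtime_s: Optional[float] = None) -> Leverage:
    if attention_s <= 0:
        raise ValueError("attention_s must be positive")
    if agent_runtime_s is not None:
        # Measured: agent runtime was captured, so use the ratio directly.
        return Leverage(
            value=agent_runtime_s / attention_s,
            basis="measured",
            causes={"agent_runtime_s": agent_runtime_s, "attention_s": attention_s},
        )
    # Estimated: no agent runtime. Placeholder formula (an assumption):
    # more orchestration share implies more leverage. Directional only.
    return Leverage(
        value=1.0 + orchestration_share,
        basis="estimated",
        causes={"orchestration_share": orchestration_share},
    )
```

Returning the `causes` alongside the value mirrors the decomposability point: a consumer never sees the multiplier without the inputs that produced it.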

Flow & focus

Flow and focus are engagement signals, not verdicts.

Flow (0–100) — rewards consistent tick intervals and focused file spread; penalises context switching. High flow means sustained, uninterrupted work.
Focus (0–100) — rewards write activity and time spent in your top files. High focus means engaged, file-sticky work rather than scattered attention.
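
To make the two scores concrete, here is a rough sketch of their shape. Every weight and formula in it is an assumption made for illustration: flow is modelled as tick-interval regularity minus a fixed penalty per context switch (file spread is omitted for brevity), and focus as an equal blend of write share and top-file time share.

```python
from statistics import mean, pstdev
from typing import List, Optional


def clamp_score(x: float) -> float:
    """Keep a score inside the documented 0 to 100 range."""
    return max(0.0, min(100.0, x))


def flow_score(tick_times: List[float], context_switches: int) -> Optional[float]:
    """Hypothetical flow: steady tick intervals score high, switches cost points."""
    intervals = [b - a for a, b in zip(tick_times, tick_times[1:])]
    if not intervals or mean(intervals) == 0:
        return None  # not enough signal to score
    # Coefficient of variation: 0 means perfectly regular ticks.
    cv = pstdev(intervals) / mean(intervals)
    regularity = 100.0 / (1.0 + cv)
    # Assumed penalty of 5 points per context switch.
    return clamp_score(regularity - 5.0 * context_switches)


def focus_score(write_ticks: int, total_ticks: int,
                top_file_s: float, total_s: float) -> Optional[float]:
    """Hypothetical focus: equal blend of write share and top-file time share."""
    if total_ticks == 0 or total_s == 0:
        return None
    write_share = write_ticks / total_ticks
    top_file_share = top_file_s / total_s
    return clamp_score(100.0 * (0.5 * write_share + 0.5 * top_file_share))
```

In this sketch, perfectly regular ticks with no context switches reach the top of the flow scale, and irregular ticks or added switches pull the score down.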

Token cost

Usage is read from the model's own top-level token counts — never summed across intermediate iterations, which would double-count. Each usage row is priced against a version-pinned price book, so costs are exact for priced models. Usage from models not yet in the book is flagged as unpriced rather than silently valued at zero.
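
The pricing rule can be sketched like this. The response shape (a top-level `usage` object next to a per-iteration list), the model names, the price book version string, and the prices themselves are all made up for the example; only the behaviour (top-level counts only, pinned price book, explicit unpriced flag) follows the description above.

```python
from decimal import Decimal

# Hypothetical price book pinned to a version. Prices are invented,
# expressed in USD per million tokens.
PRICE_BOOK_VERSION = "example-v1"
PRICE_BOOK = {
    "example-model-a": {"input": Decimal("3.00"), "output": Decimal("15.00")},
}


def read_usage(response: dict) -> dict:
    """Read only the top-level totals from a (hypothetical) response shape.

    The per-iteration entries are assumed to be already included in the
    top-level totals, so adding them as well would double-count.
    """
    usage = response["usage"]
    return {
        "model": response["model"],
        "input_tokens": usage["input_tokens"],
        "output_tokens": usage["output_tokens"],
    }


def price_usage(row: dict) -> dict:
    """Price one usage row; flag unknown models instead of valuing them at zero."""
    prices = PRICE_BOOK.get(row["model"])
    if prices is None:
        return {**row, "cost_usd": None, "unpriced": True,
                "price_book": PRICE_BOOK_VERSION}
    cost = (Decimal(row["input_tokens"]) * prices["input"]
            + Decimal(row["output_tokens"]) * prices["output"]) / Decimal(1_000_000)
    return {**row, "cost_usd": cost, "unpriced": False,
            "price_book": PRICE_BOOK_VERSION}
```

`Decimal` is used here so that the arithmetic stays exact; a float-based version would work the same way but could introduce rounding noise in the totals.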

Agent efficiency

For sessions with agent turns we compute two signals: tokens per turn, and a cache-hit ratio (cached tokens over input-side tokens) that proxies context resets. A low ratio means each turn re-sends context instead of reusing it. A run is flagged churny when it produces few lines per turn and its cache-hit ratio is low, and efficient when output per turn is high. When an ingredient of a signal is missing, that signal stays blank rather than reading as zero.
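
A small sketch of these signals, assuming that "output per turn" means lines changed per turn. The three thresholds and the function name `agent_efficiency` are hypothetical; the blank-instead-of-zero behaviour is modelled with `None`.

```python
from typing import Optional

# Hypothetical thresholds, chosen only to make the example concrete.
CHURNY_MAX_LINES_PER_TURN = 5.0
CHURNY_MAX_CACHE_HIT = 0.3
EFFICIENT_MIN_LINES_PER_TURN = 40.0


def safe_ratio(numerator: Optional[float],
               denominator: Optional[float]) -> Optional[float]:
    """Return None (blank) when an ingredient is missing or the divisor is zero."""
    if numerator is None or not denominator:
        return None
    return numerator / denominator


def agent_efficiency(turns: int,
                     total_tokens: Optional[int] = None,
                     cached_tokens: Optional[int] = None,
                     input_side_tokens: Optional[int] = None,
                     lines_changed: Optional[int] = None) -> dict:
    tokens_per_turn = safe_ratio(total_tokens, turns)
    cache_hit_ratio = safe_ratio(cached_tokens, input_side_tokens)
    lines_per_turn = safe_ratio(lines_changed, turns)

    label = None
    if lines_per_turn is not None:
        if lines_per_turn >= EFFICIENT_MIN_LINES_PER_TURN:
            label = "efficient"
        elif (cache_hit_ratio is not None
              and lines_per_turn <= CHURNY_MAX_LINES_PER_TURN
              and cache_hit_ratio <= CHURNY_MAX_CACHE_HIT):
            # Churny needs both conditions: few lines per turn AND low cache reuse.
            label = "churny"

    return {
        "tokens_per_turn": tokens_per_turn,
        "cache_hit_ratio": cache_hit_ratio,
        "lines_per_turn": lines_per_turn,
        "label": label,
    }
```

Note that a genuine zero (for example, zero cached tokens out of a known input total) still yields `0.0`, while a missing count yields `None`, so the two cases stay distinguishable.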

Known limits

Idle handling — short gaps split activity blocks and sustained silence ends the session, so very deliberate think-time can read as idle.
Inferred sessions — sessions reconstructed from commits (no live tracker) lack tick-level signal, so their timing and classification are approximate.
Estimated leverage — without agent runtime, leverage is directional, not a measured ratio.
Low-confidence blocks — some blocks classify on weak signals; the confidence score tells you which.
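
The idle-handling limit is easier to see in code. The sketch below splits a stream of tick timestamps into sessions and blocks using two gap thresholds; both threshold values and the function name `split_ticks` are assumptions, since the actual cut-offs are not stated here.

```python
from typing import List

# Hypothetical thresholds in seconds; the real values are not documented here.
BLOCK_GAP_S = 120       # a gap this long starts a new block
SESSION_END_S = 1800    # silence this long ends the session


def split_ticks(tick_times: List[float]) -> List[List[List[float]]]:
    """Group tick timestamps into sessions, each a list of blocks of ticks."""
    sessions: List[List[List[float]]] = []
    for t in sorted(tick_times):
        if not sessions:
            sessions.append([[t]])
            continue
        last_block = sessions[-1][-1]
        gap = t - last_block[-1]
        if gap >= SESSION_END_S:
            sessions.append([[t]])       # sustained silence: new session
        elif gap >= BLOCK_GAP_S:
            sessions[-1].append([t])     # short gap: new block, same session
        else:
            last_block.append(t)         # continuous activity
    return sessions
```

With these assumed thresholds, four minutes of quiet thinking between two ticks is indistinguishable from four minutes away from the keyboard, which is exactly the limitation described above.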

A note on people

These numbers are not for ranking people. They exist to explain how work happened and to help you plan capacity, not to stack-rank, review, or police the humans doing it. Read our measurement ethics for the full stance.