Data limits and caveats — what these numbers can and can't say — Open Channel Stats

Open Channel Stats beta

Reporting

2 conditions

What YouTube's feed hasn't reported yet, or reports as an artifact.

Reporting lag
What’s missing
YouTube's reporting pipeline is two to three days behind real time. Every chart and aggregate in the dashboard stops at the most recently complete date.
Reading this
Numbers for the last two to three calendar days are always absent. This is not a gap in the dashboard — it is a gap in what YouTube exposes.
Enforced by
dashboard/src/lib/db/index.ts · getReportingCutoff
Returns the latest date with complete reporting data; all charts filter to date ≤ cutoff.
Surfaced on each channel’s Dashboard and Honesty
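As a rough illustration of the date ≤ cutoff rule, the sketch below filters daily rows against a cutoff date. The DailyRow shape, the filterToCutoff helper, and the sample dates are all hypothetical; the real getReportingCutoff implementation is not shown here and may differ.

```typescript
// Hypothetical row shape; the real schema is not shown in the source.
interface DailyRow {
  date: string; // ISO calendar date, "YYYY-MM-DD"
  views: number;
}

// Keep only rows dated on or before the cutoff.
// ISO "YYYY-MM-DD" strings sort in calendar order, so a plain
// string comparison is enough for this sketch.
function filterToCutoff(rows: DailyRow[], cutoff: string): DailyRow[] {
  return rows.filter((row) => row.date <= cutoff);
}

const sampleRows: DailyRow[] = [
  { date: "2024-05-01", views: 120 },
  { date: "2024-05-02", views: 95 },
  { date: "2024-05-03", views: 4 }, // incomplete day, past the assumed cutoff
];

// Assumed cutoff for the example; in the dashboard this value would come
// from the reporting-cutoff lookup rather than a literal.
const visibleRows = filterToCutoff(sampleRows, "2024-05-02");
```

With these sample values, the 2024-05-03 row is dropped, so a chart built from visibleRows ends on the last complete date.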
Pre-publish stub rows
What’s missing
When a video is scheduled, YouTube sometimes records reporting rows for the day before it published — usually a row of zero views and null impressions.
Reading this
Every aggregation filters out these stub rows using the pre_publish_stub = 0 guard. The stub rows remain in the database for provenance.
Enforced by
dashboard/src/lib/db/index.ts
All summary CTEs include WHERE pre_publish_stub = 0.
scripts/build-public-db.js
Sets pre_publish_stub = 1 when date < published_at for the video.
Surfaced on each channel’s Honesty
Since schema v9
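The flag-then-filter pattern described above can be sketched as follows. The row shape, the helper names (flagStub, totalViews, averageDailyViews), and the sample data are hypothetical. The sketch also assumes that only the calendar-day part of the publish timestamp is compared, which the source does not confirm.

```typescript
// Hypothetical row shape; field names are assumptions for illustration.
interface ReportingRow {
  videoId: string;
  date: string; // ISO calendar date, "YYYY-MM-DD"
  views: number;
  impressions: number | null;
  prePublishStub: 0 | 1;
}

type UnflaggedRow = Omit<ReportingRow, "prePublishStub">;

// Flag a row as a stub when its date falls before the video's publish day.
// Assumption: only the calendar-day part of publishedAt is compared.
function flagStub(row: UnflaggedRow, publishedAt: string): ReportingRow {
  const publishDay = publishedAt.slice(0, 10);
  const prePublishStub: 0 | 1 = row.date < publishDay ? 1 : 0;
  return { ...row, prePublishStub };
}

// Sum views while skipping stub rows, mirroring the pre_publish_stub = 0 guard.
function totalViews(rows: ReportingRow[]): number {
  return rows
    .filter((row) => row.prePublishStub === 0)
    .reduce((sum, row) => sum + row.views, 0);
}

// Average views per non-stub day; returns null when no real days exist.
function averageDailyViews(rows: ReportingRow[]): number | null {
  const realDays = rows.filter((row) => row.prePublishStub === 0).length;
  return realDays > 0 ? totalViews(rows) / realDays : null;
}

const rawRows: UnflaggedRow[] = [
  { videoId: "vid-1", date: "2024-05-01", views: 0, impressions: null },
  { videoId: "vid-1", date: "2024-05-02", views: 40, impressions: 900 },
  { videoId: "vid-1", date: "2024-05-03", views: 60, impressions: 1100 },
];

const flaggedRows = rawRows.map((row) => flagStub(row, "2024-05-02T09:00:00Z"));
const viewsTotal = totalViews(flaggedRows);
const viewsPerDay = averageDailyViews(flaggedRows);
```

In this sample the stub row stays in flaggedRows, matching the provenance note above, but it no longer counts as a day when computing the per-day average.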

Sampling

2 conditions

Where a number is too thin a sample to read directionally.

CTR thin-sample tiers
What’s missing
CTR from fewer than 50 impressions is noise. Four tiers encode how many impressions back the number: noise / low / medium / high.
Reading this
Any CTR in the noise or low tier reflects too few impressions to read directionally. The four-segment glyph after every CTR cell encodes which tier applies.
Enforced by
dashboard/src/lib/ctr-confidence.ts
Tier boundaries and classification logic.
dashboard/src/components/ConfidencePip.tsx
Renders the four-segment glyph on every CTR display (variant="ctr").
Surfaced on each channel’s Videos and Honesty
Since schema v9
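To see why a CTR from fewer than 50 impressions counts as noise, the sketch below computes CTR as clicks divided by impressions and marks readings under that floor. Only the 50-impression noise boundary comes from the text above; the readCtr helper and its return shape are hypothetical, and the boundaries of the low, medium, and high tiers are not stated here, so they are left out.

```typescript
// Hypothetical return shape for one CTR reading.
interface CtrReading {
  ctr: number | null; // null when there are no impressions to divide by
  isNoise: boolean;   // true when fewer than 50 impressions back the number
}

// The only boundary stated in the text: below this, CTR is treated as noise.
const NOISE_IMPRESSION_FLOOR = 50;

function readCtr(clicks: number, impressions: number): CtrReading {
  if (impressions <= 0) {
    return { ctr: null, isNoise: true };
  }
  return {
    ctr: clicks / impressions,
    isNoise: impressions < NOISE_IMPRESSION_FLOOR,
  };
}

// With 20 impressions, one extra click moves CTR from 10% to 15%.
const thinBefore = readCtr(2, 20);
const thinAfter = readCtr(3, 20);

// With 1,000 impressions, one extra click barely moves the number.
const thickBefore = readCtr(50, 1000);
const thickAfter = readCtr(51, 1000);
```

In the thin sample a single click shifts CTR by five percentage points, while in the thick sample the same click shifts it by a tenth of a point.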
Average watch time thin-sample tiers
What’s missing
Average watch time computed from fewer than 10 views is noise; 10 to 29 views is low; 30 to 99 is medium; 100 or more is high.
Reading this
The same four-tier encoding applies to watch-time cells. Readings below the 30-view threshold tend to be dominated by a handful of unusually long or short sessions.
Enforced by
scripts/public-db/populate.js · avdConfidenceTier
Classifies average view duration (AVD) confidence tiers at build time; the resulting tier is stored in the avd_confidence column.
Surfaced on each channel’s Videos and Honesty
Since schema v9
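The watch-time boundaries above translate directly into a small classifier. This is a minimal sketch using the stated view counts; the function name classifyAvdViews is hypothetical and is not the avdConfidenceTier function in the build script, whose actual code is not shown here.

```typescript
type ConfidenceTier = "noise" | "low" | "medium" | "high";

// Map a view count to the tier boundaries stated in the text:
// under 10 is noise, 10 to 29 is low, 30 to 99 is medium, 100+ is high.
function classifyAvdViews(views: number): ConfidenceTier {
  if (views < 10) return "noise";
  if (views < 30) return "low";
  if (views < 100) return "medium";
  return "high";
}

const tiersAtBoundaries = [9, 10, 29, 30, 99, 100].map(classifyAvdViews);
```

Checking the values on either side of each boundary (9 and 10, 29 and 30, 99 and 100) is a quick way to confirm the comparisons are not off by one.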

Attribution

2 conditions

Where the parts don't sum to the whole, and the gap is shown.

Unattributed impressions
What’s missing
YouTube's per-source impression rows don't always sum to the per-video impression total. The gap is real and surfaced rather than redistributed.
Reading this
Some impressions come from surfaces YouTube doesn't expose through its reporting feed. The unattributed share is visible on the Honesty panel; traffic charts reflect only attributed impressions.
Enforced by
dashboard/src/lib/db/channel/reconciliation.ts · getUnattributedImpressions
Computes the channel-level unattributed impression share.
Surfaced on each channel’s Traffic and Honesty
Since schema v9
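The unattributed share is the gap between the impression total and the sum of per-source rows, divided by the total. The sketch below works that arithmetic on made-up numbers. The types, the unattributedShare helper, and the source labels are hypothetical, and the real getUnattributedImpressions query may aggregate differently.

```typescript
// Hypothetical shape for one traffic-source row.
interface SourceImpressions {
  source: string;
  impressions: number;
}

interface UnattributedResult {
  attributed: number;
  unattributed: number;
  share: number | null; // null when the total is zero
}

// Compare the reported total against the sum of per-source rows.
// The gap is reported as-is rather than spread across the sources.
function unattributedShare(
  totalImpressions: number,
  bySource: SourceImpressions[],
): UnattributedResult {
  const attributed = bySource.reduce((sum, row) => sum + row.impressions, 0);
  const unattributed = totalImpressions - attributed;
  const share = totalImpressions > 0 ? unattributed / totalImpressions : null;
  return { attributed, unattributed, share };
}

// Illustrative numbers only: 1,000 total impressions, 850 attributed.
const gapResult = unattributedShare(1000, [
  { source: "source A", impressions: 600 },
  { source: "source B", impressions: 250 },
]);
```

With these sample numbers, 150 impressions (15% of the total) have no source row, and a traffic chart built from the per-source rows would cover only the remaining 850.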
Channel-snapshot reconciliation gap
What’s missing
The channel-level snapshot sometimes shows higher totals than the sum of per-video reporting rows.
Reading this
This is YouTube's own attribution gap. The dashboard charts the difference rather than hiding it. Total channel views taken from the snapshot may therefore run slightly higher than the sum of per-video views.
Enforced by
dashboard/src/lib/db/channel/reconciliation.ts · getChannelReconciliation
Returns the per-day delta between channel snapshot and per-video sum.
Surfaced on each channel’s Honesty
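A per-day delta of this kind can be sketched by summing per-video views for each date and subtracting that sum from the channel snapshot. The row shapes, the dailyDelta helper, and the sample figures are hypothetical; the real getChannelReconciliation query is not shown here.

```typescript
// Hypothetical shapes for the two inputs.
interface ChannelSnapshot {
  date: string; // ISO calendar date
  views: number;
}

interface VideoDay {
  date: string;
  videoId: string;
  views: number;
}

interface DayDelta {
  date: string;
  delta: number; // snapshot views minus summed per-video views
}

function dailyDelta(snapshots: ChannelSnapshot[], videoRows: VideoDay[]): DayDelta[] {
  // Sum per-video views by date.
  const perVideoSum: Record<string, number> = {};
  for (const row of videoRows) {
    perVideoSum[row.date] = (perVideoSum[row.date] ?? 0) + row.views;
  }
  // A date with no per-video rows contributes a sum of zero.
  return snapshots.map((snapshot) => ({
    date: snapshot.date,
    delta: snapshot.views - (perVideoSum[snapshot.date] ?? 0),
  }));
}

// Illustrative numbers only.
const deltas = dailyDelta(
  [
    { date: "2024-05-01", views: 105 },
    { date: "2024-05-02", views: 80 },
  ],
  [
    { date: "2024-05-01", videoId: "vid-1", views: 60 },
    { date: "2024-05-01", videoId: "vid-2", views: 40 },
    { date: "2024-05-02", videoId: "vid-1", views: 80 },
  ],
);
```

A positive delta means the snapshot reports more views than the per-video rows account for on that date; a delta of zero means the two agree.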

Withheld content

1 condition

What the public dataset deliberately leaves out before it ships.

Comment text and non-public videos stay out
What’s missing
Two things never enter the public dataset: the text of viewer comments (counts only), and videos that aren't public on YouTube. Everything else ships in full — real titles, real video IDs, real thumbnails.
Reading this
Every public video is fully analyzable under its real title and ID. Individual comments and non-public uploads are not recoverable from the public data.
Enforced by
scripts/build-public-db.js
Excludes comment text and non-public videos from the public dataset at build time.
Surfaced on each channel’s Data and Gaps
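A build-time exclusion like this can be sketched as a filter plus a projection: drop non-public videos, then copy over only the fields that ship, reducing comments to a count. The RawVideo and PublicVideo shapes and the toPublicDataset helper are hypothetical; the privacy values mirror the public, unlisted, and private statuses YouTube uses, but the actual logic in scripts/build-public-db.js is not shown here.

```typescript
// Hypothetical private-side record, before the public build.
interface RawVideo {
  videoId: string;
  title: string;
  privacyStatus: "public" | "unlisted" | "private";
  comments: string[]; // raw viewer comment text
}

// Hypothetical public-side record: real title and ID, comment count only.
interface PublicVideo {
  videoId: string;
  title: string;
  commentCount: number;
}

function toPublicDataset(videos: RawVideo[]): PublicVideo[] {
  return videos
    .filter((video) => video.privacyStatus === "public")
    // Build a fresh object so comment text cannot leak through by accident.
    .map((video) => ({
      videoId: video.videoId,
      title: video.title,
      commentCount: video.comments.length,
    }));
}

const publicRows = toPublicDataset([
  { videoId: "vid-1", title: "Public upload", privacyStatus: "public", comments: ["a", "b"] },
  { videoId: "vid-2", title: "Draft", privacyStatus: "private", comments: [] },
  { videoId: "vid-3", title: "Link only", privacyStatus: "unlisted", comments: ["c"] },
]);
```

Constructing the public record field by field, rather than deleting fields from a copy, means any field added to the raw record later stays out of the public dataset until someone opts it in.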

Missing data

2 conditions

What YouTube doesn't expose for a channel this size, or at all.

Size-gated demographic data
What’s missing
YouTube only returns audience demographics once a channel clears a minimum-size threshold. This channel is currently below that threshold.
Reading this
Country, age, and gender breakdowns are not available in the current dataset, since the channel is below YouTube's size threshold for them.
Surfaced on each channel’s Gaps
Revenue and monetization absent
What’s missing
Revenue, CPM, and ad-breakdown data are not in the dataset. This channel is not yet in the YouTube Partner Program.
Reading this
Monetization metrics (CPM, RPM, ad revenue) are not analyzable from this dataset.
Surfaced on each channel’s Gaps

Keep verifying

The rest of the honesty layer

The limits registry sits beside the known-weaknesses notes and the detector catalog — together they name what the data can and can't say.

Known weaknesses

The strongest good-faith arguments against the inferences this dashboard makes, paired with the guardrail already in the code.

Detector catalog

Every detector the engine can fire, what each one looks for, and what it takes for each to switch on.