Paul Crossland
Redirect Timing Is Fetch Observability
New Fetch, HTML, and WPT work on TAO redirect chains shows why crawlers should treat timing data as conditional evidence.
Redirects are easy to treat as plumbing: request a URL, follow the chain, extract the final page. In production fetching, they are rarely that simple. Redirects carry consent state, region routing, bot-management decisions, login walls, language negotiation, canonicalization, affiliate parameters, and API discovery hints. When a crawler cannot see enough about the chain, it may still get a final 200 while losing the evidence needed to explain why the journey changed.
That is why a small set of standards and test updates from the last week is worth watching. On June 25, the WHATWG Fetch repository landed "Add TAO destination check for navigation redirect chains", changing the model for how Timing-Allow-Origin can expose cross-origin navigation redirect timing to the destination origin. The related WHATWG HTML pull request, "Expose redirectCount for TAO opted-in redirect chains", merged the same day. Web Platform Tests also moved quickly: WPT pull request 60823, created June 23 and merged June 24, added destination-based TAO tests for navigation timing, and WPT pull request 60873, merged June 25, made the redirect test non-tentative.
The practical takeaway is not that every scraper suddenly gets more redirect detail. It is the opposite: browser-exposed timing and redirect evidence is governed by standards, headers, origin relationships, and implementation status. If your observability model assumes that browser navigation timing always tells the whole story, it will mislead you on exactly the cross-origin flows where production crawlers most need clarity.
Why this matters for fetching systems
Many fetch pipelines combine at least three views of a page:
- The direct HTTP view, where a client records status codes, headers, redirects, and response bodies.
- The browser view, where JavaScript, storage, service workers, cookies, and rendering determine what the user-like session sees.
- The extraction view, where the system decides whether it found the expected data.
Redirects sit across all three. A direct HTTP client might report a three-hop chain. A browser page might expose less timing detail because cross-origin redirects are protected unless the right Timing-Allow-Origin rules apply. The extractor might only see that the final DOM is missing a price, calendar, or product ID. If those records are not joined carefully, teams end up debugging symptoms instead of causes.
The new Fetch and HTML work is about navigation timing exposure, not about crawling as a product category. Still, it reinforces an operational rule: browser observability is part of the platform contract. It is not a packet capture. A browser can know something internally and still avoid exposing it to page JavaScript. Automation code that reads performance.getEntriesByType("navigation") is reading an intentionally mediated view.
For reliable data systems, that distinction matters. A hidden or partially exposed redirect chain can be the difference between these diagnoses:
- The target moved content behind a new regional route.
- A consent or age gate introduced a same-origin intermediate page.
- A CDN or WAF added a cross-origin challenge hop.
- A partner or affiliate redirect changed campaign parameters.
- A browser implementation changed which timing fields are observable.
- The fetch code lost instrumentation during a refactor.
Those are different incidents with different owners. Treating them all as "page failed" creates noisy retries and weak postmortems.
TAO is an observability boundary
Timing-Allow-Origin is often discussed as a performance API detail. For crawler operators, it is better understood as an observability boundary. It helps decide what timing information a destination is allowed to learn about cross-origin resources or navigations.
The June 25 Fetch commit describes a gap in the old model: TAO opted in "backwards" in terms of origins and did not constitute the destination-origin opt-in that is appropriate for navigations. The new mechanisms enable a navigation response to know that a redirect chain opted in to exposing redirects to its origin. The HTML change then wires that into redirectCount exposure for opted-in chains.
That is a narrow standards change, but it has a broad lesson. If your crawler uses browser performance APIs to validate navigation health, the absence of redirect detail may mean several things:
- There were no redirects.
- There were redirects, but the browser did not expose them.
- The redirect chain did not opt in with the relevant TAO behavior.
- The browser or automation runtime has not implemented the latest behavior.
- Your instrumentation ran too late, in the wrong frame, or after a navigation replacement.
Only the first interpretation means "nothing happened." The others are observability states that should be logged explicitly.
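One way to log those states explicitly is a small classifier. The sketch below is illustrative only: it assumes the navigation entry has been dumped to a dict (for example, the JSON form of performance.getEntriesByType("navigation")[0]) and that a separate, hypothetical protocol_redirects count comes from the automation protocol when it is available. The enum names are invented for this example.

```python
from enum import Enum
from typing import Optional


class RedirectEvidence(str, Enum):
    ZERO_REDIRECTS = "zero_redirects"
    OBSERVED = "redirects_observed"
    NOT_EXPOSED = "redirects_not_exposed"
    UNKNOWN = "unknown"
    NOT_INSTRUMENTED = "not_instrumented"


def classify_redirect_evidence(
    nav_entry: Optional[dict],
    protocol_redirects: Optional[int],
) -> RedirectEvidence:
    """Turn a navigation timing dump plus an independent count into a state.

    nav_entry: dict form of the navigation entry, or None if the
        instrumentation never captured one (assumed shape).
    protocol_redirects: redirect count seen by the automation protocol,
        or None if that source was unavailable (hypothetical input).
    """
    if nav_entry is None:
        # Ran too late, in the wrong frame, or after a replacement.
        return RedirectEvidence.NOT_INSTRUMENTED
    if nav_entry.get("redirectCount", 0) > 0:
        return RedirectEvidence.OBSERVED
    # A reported count of 0 is ambiguous without a second source.
    if protocol_redirects is None:
        return RedirectEvidence.UNKNOWN
    if protocol_redirects > 0:
        return RedirectEvidence.NOT_EXPOSED
    return RedirectEvidence.ZERO_REDIRECTS
```

The design choice worth noting is that a redirectCount of 0 never maps to "zero redirects" by itself; it needs a second source to confirm it.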
Build a redirect evidence model
A production fetch record should separate observed facts from inferred facts. Instead of one field named redirects, use fields that preserve the source of evidence.
For direct HTTP fetches, record:
- Initial URL, final URL, redirect count, and full redirect chain when available.
- Status code, Location header, response headers, and response size for every hop.
- Whether redirects were followed automatically or manually.
- Method rewriting across redirects, especially POST to GET changes.
- Cookie set and sent events at each hop, with values redacted or hashed.
- Egress region, proxy identifier, DNS result, and TLS peer information.
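A per-hop trace for the direct HTTP view might look like the following sketch. It follows redirects manually through an injected transport callable, so the recording logic can be exercised without network access. The transport signature, the lowercase header keys, and every class and function name here are assumptions made for illustration, not any specific HTTP library's API. The method-rewriting rule mirrors the Fetch standard: 301 and 302 turn POST into GET, 303 turns anything other than GET or HEAD into GET, and 307 and 308 preserve the method.

```python
import hashlib
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional, Tuple
from urllib.parse import urljoin

# Hypothetical transport: (method, url) -> (status, lowercase-keyed headers).
Transport = Callable[[str, str], Tuple[int, Dict[str, str]]]
REDIRECT_STATUSES = {301, 302, 303, 307, 308}


@dataclass
class Hop:
    url: str
    method: str
    status: int
    location: Optional[str]
    set_cookie_hash: Optional[str]


@dataclass
class RedirectTrace:
    initial_url: str
    final_url: str
    hops: List[Hop] = field(default_factory=list)
    truncated: bool = False  # True when the max-follow policy stopped us


def redact(value: Optional[str]) -> Optional[str]:
    """Keep a short hash so cookie changes are visible without the value."""
    if value is None:
        return None
    return hashlib.sha256(value.encode("utf-8")).hexdigest()[:16]


def next_method(method: str, status: int) -> str:
    if status in (301, 302) and method == "POST":
        return "GET"
    if status == 303 and method not in ("GET", "HEAD"):
        return "GET"
    return method


def trace_redirects(transport: Transport, url: str, method: str = "GET",
                    max_hops: int = 10) -> RedirectTrace:
    trace = RedirectTrace(initial_url=url, final_url=url)
    for _ in range(max_hops + 1):
        status, headers = transport(method, url)
        location = headers.get("location") if status in REDIRECT_STATUSES else None
        trace.hops.append(
            Hop(url, method, status, location, redact(headers.get("set-cookie")))
        )
        trace.final_url = url
        if location is None:
            return trace
        method = next_method(method, status)
        url = urljoin(url, location)  # resolve relative Location values
    trace.truncated = True
    return trace
```

In a real client, the transport would wrap whatever HTTP library is in use with automatic redirect following disabled. Egress region, DNS, and TLS details would be attached by that wrapper rather than by this loop.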
For browser navigations, record:
- Requested URL, committed final URL, main document status, and frame URL changes.
- Navigation timing fields that are present, plus a boolean for whether redirect timing was exposed.
- redirectCount if available, and not_exposed rather than 0 when the platform does not reveal enough to know.
- Request and response events from the automation protocol where safe and available.
- Storage and cookie deltas before and after navigation.
- Console errors, failed requests, service-worker involvement, and document readiness milestones.
For extraction, record:
- The selector, schema, or API endpoint that produced each critical field.
- Null-rate and confidence by field.
- HTML hash, screenshot hash, and key text snippets for failed samples.
- Whether the final URL matched an expected canonical pattern.
This structure prevents a common mistake: collapsing "not observed" into "did not happen." In browser-grade fetching, those are different states.
Test redirects as first-class fixtures
Most crawler canaries include a few stable pages and a few JavaScript-heavy pages. Add redirect fixtures. They do not need to be complicated, but they should cover the paths that break real pipelines.
A useful test set includes:
- Same-origin redirects where full timing should be visible.
- Cross-origin redirects without TAO exposure.
- Cross-origin redirects with expected TAO exposure when supported.
- Region-dependent redirects that change by egress country.
- Consent or language redirects that depend on cookie state.
- Redirects that change method semantics.
- Long chains that approach your maximum-follow policy.
- Redirects into pages that trigger additional XHR or fetch calls.
Run those fixtures through both the direct HTTP client and the browser automation stack. The assertion should not be only "final page loaded." Assert the evidence contract: which chain details were visible, which fields were intentionally not exposed, whether the extraction result stayed stable, and whether the platform version changed.
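As a rough sketch of what asserting the evidence contract could mean in code, the example below compares a fixture's expectation with what a run actually recorded and returns a list of violations. The field names, the three exposure labels, and the fixture name in the usage note are all hypothetical. The "either" label is there for fixtures whose exposure depends on whether the browser under test has implemented the newer TAO behavior.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class FixtureExpectation:
    name: str
    direct_hops: int       # redirects the direct HTTP client should follow
    browser_exposure: str  # "exposed", "not_exposed", or "either"


@dataclass(frozen=True)
class FixtureResult:
    name: str
    direct_hops: int
    browser_redirect_count: Optional[int]  # None means not exposed
    extraction_ok: bool


def check_contract(expected: FixtureExpectation, result: FixtureResult) -> List[str]:
    """Return human-readable violations; an empty list means the contract held."""
    violations: List[str] = []
    if result.direct_hops != expected.direct_hops:
        violations.append(
            f"{expected.name}: direct client saw {result.direct_hops} hops, "
            f"expected {expected.direct_hops}"
        )
    exposed = result.browser_redirect_count is not None
    if expected.browser_exposure == "exposed" and not exposed:
        violations.append(f"{expected.name}: browser redirect detail missing")
    if expected.browser_exposure == "not_exposed" and exposed:
        violations.append(f"{expected.name}: browser redirect detail unexpectedly visible")
    if not result.extraction_ok:
        violations.append(f"{expected.name}: extraction result changed")
    return violations
```

For a cross-origin fixture without TAO, an expectation such as FixtureExpectation("cross_origin_no_tao", 2, "not_exposed") would flag a run in which the browser suddenly reports a redirect count. That is a platform change worth a look even though the page still loaded.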
When a standards or browser update lands, replay the fixtures. The June WPT changes are a reminder that tests often precede or accompany implementation convergence. A crawler fleet should be ready for browser behavior to become more precise without assuming that all channels, operating systems, and automation libraries move at the same time.
Do not use browser timing as the only source of truth
Browser timing APIs are valuable because they describe the user-like navigation path. They are insufficient because they are intentionally scoped. Direct HTTP instrumentation is valuable because it can expose chain mechanics. It is insufficient because it does not execute the page, maintain the same storage state, or necessarily match browser network behavior.
Use both.
For high-value targets, a good incident workflow is:
- Reproduce the URL in the same browser stack, region, locale, timezone, and session class.
- Capture browser navigation events, performance entries, final DOM, storage changes, and screenshot.
- Replay the same initial URL with a direct HTTP client using controlled redirect handling.
- Compare final URL, chain length, status codes, cookies, and response hashes.
- Repeat with a fresh session and a warmed session.
- Classify the outcome as target change, state change, regional routing, platform observability difference, or extractor failure.
That classification keeps operators from responding blindly. If the browser reached a challenge route that direct HTTP did not see, the issue may be session or client profile. If both paths follow a new regional redirect, the target or CDN changed. If only the exposed timing fields changed while extraction remains correct, the right action may be to update observability expectations rather than selectors.
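The comparison step can be made mechanical with a few ordered rules. This is a deliberately small sketch under stated assumptions: each path is summarized by its final host, an extraction flag, and (for the browser) whether redirect timing was exposed. The labels are invented for this example, and it does not try to separate regional routing from other target changes, which would need an egress-region comparison as well.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Observation:
    final_host: str
    extraction_ok: bool = True
    timing_exposed: Optional[bool] = None  # only meaningful for browser runs


def classify_incident(baseline: Observation, browser: Observation,
                      direct: Observation) -> str:
    """Apply ordered heuristics; the first matching rule wins."""
    browser_moved = browser.final_host != baseline.final_host
    direct_moved = direct.final_host != baseline.final_host
    if browser_moved and direct_moved:
        # Both paths follow a new route: the target or CDN changed.
        return "target_or_cdn_change"
    if browser_moved:
        # Only the browser was rerouted, e.g. into a challenge page.
        return "session_or_client_profile"
    if direct_moved:
        return "direct_client_divergence"
    if not browser.extraction_ok:
        return "extractor_failure"
    if browser.timing_exposed != baseline.timing_exposed:
        # Same page, same data, different visibility.
        return "platform_observability_difference"
    return "no_change"
```

The rule order encodes a judgment call: routing differences are checked before extraction, because a page that landed on the wrong host will usually fail extraction for reasons the extractor cannot fix.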
A deployment checklist for redirect observability
Use this checklist when upgrading browsers, automation libraries, Node runtimes, or fetch instrumentation:
- Log browser version, automation library version, runtime version, and container image digest beside redirect evidence.
- Distinguish zero redirects, redirects observed, and redirects not observable in schemas and dashboards.
- Keep direct HTTP redirect traces for sampled browser failures.
- Store final URL and canonical URL separately.
- Alert on changes in final-host distribution, not only status-code distribution.
- Track redirect count distribution by target, region, and session age.
- Include consent, language, and geo state in redirect investigations.
- Avoid increasing retries until you know whether redirects are deterministic or state-driven.
- Treat browser performance API changes as observability changes, not automatically as target-site changes.
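The final-host alert in the checklist can be approximated with a simple distribution comparison. The sketch below uses total variation distance between two windows of final URLs. Both the metric and the 0.2 threshold are choices made for this example, not something the standards work prescribes, and the function names are hypothetical.

```python
from collections import Counter
from typing import Dict, Iterable
from urllib.parse import urlsplit


def host_distribution(urls: Iterable[str]) -> Dict[str, float]:
    """Share of fetches that ended on each final host."""
    counts = Counter(urlsplit(u).hostname or "" for u in urls)
    total = sum(counts.values())
    return {host: n / total for host, n in counts.items()} if total else {}


def total_variation(p: Dict[str, float], q: Dict[str, float]) -> float:
    """Half the L1 distance between two distributions, in [0, 1]."""
    hosts = set(p) | set(q)
    return 0.5 * sum(abs(p.get(h, 0.0) - q.get(h, 0.0)) for h in hosts)


def final_host_shifted(baseline_urls: Iterable[str], current_urls: Iterable[str],
                       threshold: float = 0.2) -> bool:
    """True when the final-host mix moved more than the chosen threshold."""
    distance = total_variation(
        host_distribution(baseline_urls), host_distribution(current_urls)
    )
    return distance > threshold
```

A status-code dashboard would show nothing if half of a target's traffic started ending on a regional host with a 200. This check would flag it, because the host mix moved even though the status codes did not.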
The standards work this week is a good prompt to tighten these contracts. Production fetching fails less mysteriously when redirect timing is treated as conditional evidence: useful when present, meaningful when absent, and never confused with the whole truth.