This section documents how CrUX collects and organizes user experience data.
At the core of the CrUX dataset are individual user experiences, which are aggregated into page-level and origin-level distributions. This section documents user eligibility and the requirements for pages and origins to be included in the dataset. All three eligibility criteria (User, Origin, and Page) must be satisfied for an experience to be included in the page-level data available in PageSpeed Insights and the CrUX API. Experiences that meet the User and Origin criteria but not the Page criteria are still included in the origin-level data available in all CrUX data sources.
Pages and origins are automatically included or removed from the dataset if their eligibility changes over time. At this time, you cannot manually submit pages or origins for inclusion.
A page must be publicly discoverable to be considered for inclusion in the CrUX dataset.
A page is determined to be publicly discoverable using the same indexability criteria as search engines.
A page cannot meet the discoverability requirement if any of the following conditions are met, including root pages for the origin dataset:
- The page is served with an unsuccessful HTTP status code.
- The page is served with an HTTP X-Robots-Tag: noindex header or equivalent.
- The document includes a <meta name="robots" content="noindex"> meta tag or equivalent.
Refer to Google Search Console for an overview of your site's indexing status.
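The conditions above can be sketched as a small check over a fetched response. This is an illustrative approximation only, not how CrUX or any search engine actually evaluates pages: the function name `blocking_reasons` is hypothetical, treating any status other than 200 as unsuccessful is an assumption, and "or equivalent" directives (for example, crawler-specific ones) are not handled.

```python
from html.parser import HTMLParser


class _RobotsMetaParser(HTMLParser):
    """Collects the content attribute of every <meta name="robots"> tag."""

    def __init__(self):
        super().__init__()
        self.directives = []

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = {name: (value or "") for name, value in attrs}
        if attr.get("name", "").lower() == "robots":
            self.directives.append(attr.get("content", "").lower())


def blocking_reasons(status, headers, html):
    """Return the reasons (possibly none) a response looks non-discoverable.

    status  -- final HTTP status code, after redirects
    headers -- dict of response headers
    html    -- document body as a string
    """
    reasons = []
    # Assumption: only a 200 response counts as successful.
    if status != 200:
        reasons.append("status")
    x_robots = " ".join(
        value for name, value in headers.items() if name.lower() == "x-robots-tag"
    )
    if "noindex" in x_robots.lower():
        reasons.append("x-robots-tag")
    parser = _RobotsMetaParser()
    parser.feed(html)
    if any("noindex" in directive for directive in parser.directives):
        reasons.append("meta-robots")
    return reasons
```

An empty list means none of the three conditions was detected; it does not guarantee that a page is included, since the popularity requirement still applies.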
A page is determined to be sufficiently popular if it has a minimum number of visitors. An origin is determined to be sufficiently popular if it has a minimum number of visitors across all of its pages. An exact number is not disclosed, but it has been chosen to ensure that we have enough samples to be confident in the statistical distributions for included pages. The minimum number is the same for pages and origins.
Pages and origins that don't meet the popularity threshold are not included in the CrUX dataset.
An origin represents an entire website, addressable by a URL like https://www.example.com. For an origin to be included in the CrUX dataset, it must meet two requirements: it must be publicly discoverable and sufficiently popular.
You can verify that your origin is discoverable by running a Lighthouse audit and looking at the SEO category results. Your site is not discoverable if your root page fails the "Page is blocked from indexing" or "Page has unsuccessful HTTP status code" audits.
If an origin is determined to be publicly discoverable, eligible user experiences on all of that origin's pages are aggregated at the origin-level, regardless of individual page discoverability. All of these experiences count towards the origin's popularity requirement.
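The difference between page-level and origin-level inclusion can be illustrated with a toy aggregation. Everything here is an assumption made for illustration: the function name, the input shape (one `(origin, page)` pair per eligible visitor), and especially `min_visitors`, which stands in for the undisclosed threshold and is passed in as a parameter rather than guessed.

```python
from collections import Counter


def eligible_sets(experiences, discoverable_pages, discoverable_origins, min_visitors):
    """Return (included_pages, included_origins) for a list of experiences.

    experiences -- list of (origin, page_url) pairs, one per eligible visitor
    min_visitors -- placeholder for the undisclosed popularity threshold
    """
    page_counts = Counter(page for _, page in experiences)
    # Origin counts include every page, discoverable or not.
    origin_counts = Counter(origin for origin, _ in experiences)

    pages = {
        page
        for page, count in page_counts.items()
        if page in discoverable_pages and count >= min_visitors
    }
    origins = {
        origin
        for origin, count in origin_counts.items()
        if origin in discoverable_origins and count >= min_visitors
    }
    return pages, origins
```

With a placeholder threshold of 3, an origin whose two pages each receive 2 visitors would appear at the origin level (4 visitors in total) even though neither page qualifies on its own.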
For querying purposes, note that all origins in the CrUX dataset are lowercase.
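When building a query from an arbitrary URL, one option is to derive the origin and lowercase it first. The helper below is a minimal sketch using Python's standard library; the name `crux_origin` is hypothetical and not part of any CrUX tooling.

```python
from urllib.parse import urlsplit


def crux_origin(url):
    """Return the lowercase scheme://host[:port] origin for a URL."""
    parts = urlsplit(url)
    return f"{parts.scheme}://{parts.netloc}".lower()
```

For instance, `crux_origin("https://WWW.Example.com/Some/Path")` yields `https://www.example.com`.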
The requirements for a page to be included in the CrUX dataset are the same as for origins: the page must be publicly discoverable and sufficiently popular.
You can verify that a page is discoverable by running a Lighthouse audit and looking at the SEO category results. Your page is not discoverable if it fails the "Page is blocked from indexing" or "Page has unsuccessful HTTP status code" audits.
Pages commonly have additional identifiers in their URL, including query string parameters like ?utm_medium=email and fragments like #main. These identifiers are stripped from the URL in the CrUX dataset so that all user experiences on the page are aggregated together. This is useful for pages that would otherwise not meet the popularity threshold if there were many disjointed URL variations for the same page. Note that in rare cases this may unexpectedly group experiences for distinct pages together, for example when a parameter such as ?productID=102 identifies a distinct page.
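The stripping described above can be approximated with Python's standard library. This is a sketch of the idea rather than the exact normalization CrUX performs; the function name `strip_identifiers` and the `/product` path with `?productID=101` are hypothetical values chosen for illustration.

```python
from urllib.parse import urlsplit, urlunsplit


def strip_identifiers(url):
    """Drop the query string and fragment, keeping scheme, host, and path."""
    parts = urlsplit(url)
    return urlunsplit((parts.scheme, parts.netloc, parts.path, "", ""))


# Variations of the same page collapse to one key.
variants = [
    "https://www.example.com/page.html?utm_medium=email",
    "https://www.example.com/page.html#main",
]
normalized = {strip_identifiers(u) for u in variants}

# Distinct pages that differ only by a parameter collapse too.
products = {
    strip_identifiers("https://www.example.com/product?productID=101"),
    strip_identifiers("https://www.example.com/product?productID=102"),
}
```

Both sets end up with a single entry, which shows the benefit (shared popularity) and the rare downside (distinct pages merged) of the same rule.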
Pages in CrUX are measured based on the top-level page. Pages included as iframes are not reported on separately in CrUX, but do contribute to the metrics of the top-level page. For example, if https://www.example.com/page.html embeds https://www.example.com/frame.html in an iframe, then page.html will be represented in CrUX (subject to the other eligibility criteria) but frame.html will not. And if frame.html has poor CLS, that CLS is included when measuring the CLS for page.html. CrUX is the Chrome User Experience Report, and a user may not even be aware that part of the page is an iframe. The experience is therefore measured at the top-level page, matching what the user sees.
For a user to have their experiences aggregated in the CrUX dataset, they must meet the following criteria:
- Enable usage statistics reporting.
- Sync their browser history.
- Not have a Sync passphrase set.
- Use a supported platform.
The current supported platforms are:
- Desktop versions of Chrome including Windows, macOS, ChromeOS, and Linux operating systems.
- Android versions of Chrome, including native apps using Custom Tabs and WebAPKs.
There are a few notable exceptions that do not provide data to the CrUX dataset:
- Chrome on iOS.
- Native Android apps using WebView.
- Other Chromium browsers (for example Microsoft Edge).
Chrome does not publish data about the proportions of users that meet these criteria. You can learn more about the data we collect in the Chrome Privacy Whitepaper.
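The user criteria combine as a simple conjunction, which the sketch below makes explicit. The `UserSettings` fields and the platform labels are hypothetical names invented for this example; Chrome does not expose such a structure.

```python
from dataclasses import dataclass

# Hypothetical labels for the supported and unsupported platforms listed above.
SUPPORTED_PLATFORMS = {"chrome-desktop", "chrome-android"}


@dataclass(frozen=True)
class UserSettings:
    usage_statistics_enabled: bool
    history_sync_enabled: bool
    sync_passphrase_set: bool
    platform: str


def is_user_eligible(user):
    """True only when every user criterion is met."""
    return (
        user.usage_statistics_enabled
        and user.history_sync_enabled
        and not user.sync_passphrase_set
        and user.platform in SUPPORTED_PLATFORMS
    )
```

Failing any single criterion, such as having a Sync passphrase set or browsing with Chrome on iOS, is enough to exclude a user's experiences.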
Accelerated Mobile Pages (AMP)
Pages built with AMP are included in the CrUX dataset like any other web page. As of the June 2020 CrUX release, pages served via the AMP Cache and/or rendered in the AMP Viewer are also captured, and attributed to the publisher's page URL.
Data in CrUX undergoes a small amount of processing to ensure that it is statistically accurate, well structured and easy to query.
The CrUX dataset is filtered to ensure that the presented data is statistically valid. This may exclude entire pages or origins from appearing in the dataset.
In addition to the eligibility criteria applied to origins and pages, further filtering is applied for segments within the data:
Origins or pages having more than 20% of their total traffic excluded due to ineligible combinations of dimensions are excluded entirely from the dataset.
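The 20% rule amounts to comparing the excluded share of traffic against a fixed limit. The sketch below assumes traffic is available as per-segment counts keyed by a label; the function name and the segment labels in the usage note are hypothetical, and how CrUX decides which segments are ineligible is not modeled.

```python
def exceeds_exclusion_limit(segment_traffic, ineligible_segments, limit=0.20):
    """Return True when more than `limit` of total traffic is in ineligible segments.

    segment_traffic     -- dict mapping a segment label to its traffic count
    ineligible_segments -- set of labels whose data would be filtered out
    """
    total = sum(segment_traffic.values())
    if total <= 0:
        raise ValueError("total traffic must be positive")
    excluded = sum(
        count for label, count in segment_traffic.items() if label in ineligible_segments
    )
    return excluded / total > limit
```

With traffic of `{"phone": 70, "desktop": 25, "tablet": 5}`, filtering out only `tablet` removes 5% and the page stays in; filtering out `desktop` removes 25% and the page would be excluded entirely.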
Because the global-level dataset encompasses user experiences from all countries, combinations of dimensions that do not meet the popularity criteria at the country level may still be included at the global level, provided that there is sufficient popularity.
A small amount of randomness is applied to the dataset to prevent reverse-engineering of sensitive data, such as total traffic volumes. This does not affect the accuracy of aggregate statistics.
Most metric values within the CrUX dataset are represented as histograms of values and bin sizes, where the histogram value is a fraction of all included segments summing to 1. Bin sizes are floating point numbers between 1.0 and 0.0001.
Histogram bin widths are normalized to simplify querying and visualizing the data. This means that larger bins may be split into smaller bins, which equally share the original density in order to maintain consistent bin widths.
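Splitting a wide bin into equal-width bins that share its density equally can be sketched as follows. The function name, the `(start, end, density)` tuple shape, and the example values are assumptions for illustration; they do not reflect the actual CrUX schema.

```python
def split_bin(start, end, density, width):
    """Split one histogram bin into equal-width bins that share its density.

    Returns a list of (start, end, density) tuples whose densities sum to
    the original density (up to floating point rounding).
    """
    count = round((end - start) / width)
    if count < 1 or abs(start + count * width - end) > 1e-9:
        raise ValueError("bin span must be a whole multiple of width")
    share = density / count
    return [
        (start + i * width, start + (i + 1) * width, share)
        for i in range(count)
    ]


# A bin covering 0-300 with density 0.3 becomes three bins of width 100.
bins = split_bin(0, 300, 0.3, 100)
```

Because each sub-bin receives an equal share, the total density is preserved and the histogram as a whole still sums to 1.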
CrUX datasets by Google are licensed under a Creative Commons Attribution 4.0 International License.