Learn how CrUX data is structured on BigQuery.
Introduction
The raw data behind the Chrome UX Report (CrUX) is available on BigQuery, a database hosted on Google Cloud.
CrUX on BigQuery allows users to directly query the full dataset going back to 2017, for example to analyze trends, compare web technologies and benchmark domains.
The data is organized into monthly release tables, along with a number of summary tables that provide simpler access for querying the data.
The BigQuery data is the basis of the CrUX Dashboard, which lets you visualize this data without writing SQL queries.
Access the dataset
Using BigQuery requires a Google Cloud account and basic knowledge of SQL. The CrUX dataset on BigQuery is free to access and explore up to the limits of the free tier, which is provided by BigQuery and renewed monthly. Additionally, new Google Cloud users may be eligible for a signup credit to cover expenses beyond the free tier. Note that a credit card must be provided for the Google Cloud project; see Why do I need to provide a credit card?.
If this is your first time using BigQuery then follow these steps to set up a project:
- Navigate to Create a Project on the Google Cloud console.
- Give your new project a name like "My Chrome UX Report" and click Create.
- Provide your billing information if prompted.
- Navigate to the CrUX dataset on BigQuery.
Now you're ready to start querying the dataset.
Project organization
CrUX data on BigQuery is released on the second Tuesday of the month following the collection period. Each month is released as a new table under chrome-ux-report.all. There are also a number of materialized tables which provide summary statistics for each month.
- `chrome-ux-report
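The release schedule and table naming described above can be sketched in a few lines of Python. This is only an illustration: the helper names `crux_release_date` and `monthly_table_id` are hypothetical, and the table ID format assumes the monthly tables are named YYYYMM under chrome-ux-report.all, as the rest of this page suggests.

```python
import datetime


def crux_release_date(year: int, month: int) -> datetime.date:
    """Return the second Tuesday of the month after the collection month."""
    # Roll over to January of the next year when the collection month is December.
    next_year, next_month = (year + 1, 1) if month == 12 else (year, month + 1)
    first_day = datetime.date(next_year, next_month, 1)
    # date.weekday() numbers Monday as 0, so Tuesday is 1.
    days_until_first_tuesday = (1 - first_day.weekday()) % 7
    return first_day + datetime.timedelta(days=days_until_first_tuesday + 7)


def monthly_table_id(year: int, month: int) -> str:
    """Build the assumed ID of the monthly table, for example ...all.202204."""
    return f"chrome-ux-report.all.{year:04d}{month:02d}"


if __name__ == "__main__":
    print(monthly_table_id(2022, 4), "expected on", crux_release_date(2022, 4))
```

Actual release dates can slip, so treat the computed date as the expected one rather than a guarantee.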
Detailed table schema
The raw tables for each country and the all dataset are provided by year and month.
Raw tables
The raw tables have the following schema:
- origin
- effective_connection_type
- form_factor
- first_paint
- first_contentful_paint
- largest_contentful_paint
- dom_content_loaded
- onload
- layout_instability
  - cumulative_layout_shift
- interaction_to_next_paint
- navigation_types
  - navigate
  - navigate_cache
  - reload
  - restore
  - back_forward
  - back_forward_cache
  - prerender
- experimental
  - permission
    - notifications
  - time_to_first_byte
  - popularity
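Nested fields in BigQuery are referenced with dotted paths, so it can help to see the schema above as a tree. The sketch below is an assumption-laden illustration: the nesting (for example, cumulative_layout_shift under layout_instability, and notifications under experimental.permission) is inferred from the field order in the list, the fields are only listed to the depth shown on this page, and `RAW_SCHEMA` and `field_paths` are hypothetical names.

```python
from typing import Dict, List, Optional

# Assumed nesting of the raw table fields; None marks a field with no
# children listed on this page (it may still have deeper sub-fields).
RAW_SCHEMA: Dict[str, Optional[dict]] = {
    "origin": None,
    "effective_connection_type": None,
    "form_factor": None,
    "first_paint": None,
    "first_contentful_paint": None,
    "largest_contentful_paint": None,
    "dom_content_loaded": None,
    "onload": None,
    "layout_instability": {"cumulative_layout_shift": None},
    "interaction_to_next_paint": None,
    "navigation_types": {
        "navigate": None,
        "navigate_cache": None,
        "reload": None,
        "restore": None,
        "back_forward": None,
        "back_forward_cache": None,
        "prerender": None,
    },
    "experimental": {
        "permission": {"notifications": None},
        "time_to_first_byte": None,
        "popularity": None,
    },
}


def field_paths(schema: Dict[str, Optional[dict]], prefix: str = "") -> List[str]:
    """Flatten the nested schema into dotted paths such as 'a.b.c'."""
    paths: List[str] = []
    for name, children in schema.items():
        path = f"{prefix}.{name}" if prefix else name
        if children:
            # Recurse into nested records and keep only the leaf paths.
            paths.extend(field_paths(children, path))
        else:
            paths.append(path)
    return paths


if __name__ == "__main__":
    for path in field_paths(RAW_SCHEMA):
        print(path)
```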
Materialized table schema
Materialized tables are provided for easier access to summary data by a number of key dimensions. No histograms are provided; instead, performance data is aggregated into fractions by performance assessment, along with the 75th percentile value. Example rows from the metrics_summary table are shown here:
| yyyymm | origin | fast_lcp | avg_lcp | slow_lcp | p75_lcp |
|---|---|---|---|---|---|
| 202204 | https://example.com | 0.9056 | 0.0635 | 0.0301 | 1600 |
| 202203 | https://example.com | 0.9209 | 0.052 | 0.0274 | 1400 |
| 202202 | https://example.com | 0.9169 | 0.0545 | 0.0284 | 1500 |
| 202201 | https://example.com | 0.9072 | 0.0626 | 0.0298 | 1500 |
This shows that in the 202204 dataset, 90.56% of real-user experiences on https://example.com met the criteria for good LCP, and that the coarse 75th percentile LCP value was 1,600 ms. This is slightly slower than in previous months.
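To make the aggregation concrete, here is a rough Python sketch of how a histogram could be reduced to the fast, average and slow fractions and a coarse 75th percentile. It is not the pipeline CrUX actually uses. It assumes each histogram bin is a `(start_ms, end_ms, density)` tuple, that bin edges line up with the LCP thresholds of 2,500 ms and 4,000 ms, and that the coarse p75 can be approximated by the start of the bin in which cumulative density first reaches 0.75. The function name and the example bins are made up for illustration.

```python
from typing import Dict, List, Tuple

Bin = Tuple[int, int, float]  # (start_ms, end_ms, density), an assumed shape

GOOD_LCP_MS = 2500  # upper bound for a "good" LCP experience
POOR_LCP_MS = 4000  # lower bound for a "poor" LCP experience

# Invented example data; the densities sum to 1.
EXAMPLE_BINS: List[Bin] = [
    (0, 1000, 0.40),
    (1000, 2500, 0.45),
    (2500, 4000, 0.10),
    (4000, 10000, 0.05),
]


def summarize_lcp(bins: List[Bin]) -> Dict[str, float]:
    """Collapse histogram bins into summary-style columns."""
    fast = sum(density for start, _, density in bins if start < GOOD_LCP_MS)
    avg = sum(
        density for start, _, density in bins if GOOD_LCP_MS <= start < POOR_LCP_MS
    )
    slow = sum(density for start, _, density in bins if start >= POOR_LCP_MS)

    # Walk the bins in ascending order until 75% of experiences are covered.
    cumulative = 0.0
    p75 = float("nan")
    for start, _, density in sorted(bins):
        cumulative += density
        if cumulative >= 0.75:
            p75 = float(start)
            break

    return {"fast_lcp": fast, "avg_lcp": avg, "slow_lcp": slow, "p75_lcp": p75}


if __name__ == "__main__":
    print(summarize_lcp(EXAMPLE_BINS))
```

Because the percentile is taken from bin boundaries, it is only as precise as the bin width, which is why the published value is described as coarse.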
Four materialized tables are provided:
- metrics_summary: key metrics by month and origin
- device_summary: key metrics by month, origin and device type
- country_summary: key metrics by month, origin, device type and country
- origin_summary: a list of all origins included in the dataset
metrics_summary
The metrics_summary table contains summary statistics for each origin and each monthly dataset:
- yyyymm: Month of the data collection period
- origin: URL of the site origin
- rank: Coarse popularity ranking (as of March 2021)
- [small|medium|large]_cls: Fraction of traffic by CLS thresholds
- [fast|avg|slow]_<metric>: Fraction of traffic by performance thresholds
- p75_<metric>: 75th percentile value of performance metrics (milliseconds)
- notification_permission_[accept|deny|ignore|dismiss]: Fraction of notification permission behaviors
- [desktop|phone|tablet]Density: Fraction of traffic by form factor
- [_4G|_3G|_2G|slow2G|offline]Density: Fraction of traffic by effective connection type
- navigation_type_[navigate|navigate_cache|reload|restore|back_forward|back_forward_cache|prerender]: Fraction of navigation types
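The column naming patterns above can be expanded programmatically when building a query. The following Python sketch only assembles query text and does not contact BigQuery. The dataset name `materialized` in the table ID is an assumption, as are the helper names. Only the `lcp` suffix is confirmed by the example rows on this page, so other metric suffixes should be checked against the table itself. The `@origin` placeholder uses BigQuery's named query parameter syntax.

```python
from typing import List

# Assumption: the summary tables live in a dataset named "materialized".
SUMMARY_TABLE = "chrome-ux-report.materialized.metrics_summary"


def metric_columns(metric: str = "lcp") -> List[str]:
    """Expand the [fast|avg|slow]_<metric> and p75_<metric> patterns."""
    if not metric.isidentifier():
        # Reject anything that is not a plain identifier before it reaches SQL.
        raise ValueError(f"unexpected metric name: {metric!r}")
    return [f"{level}_{metric}" for level in ("fast", "avg", "slow")] + [
        f"p75_{metric}"
    ]


def build_summary_query(metric: str = "lcp") -> str:
    """Return query text that reads one origin's monthly summary rows."""
    columns = ", ".join(["yyyymm"] + metric_columns(metric))
    return (
        f"SELECT {columns}\n"
        f"FROM `{SUMMARY_TABLE}`\n"
        "WHERE origin = @origin\n"
        "ORDER BY yyyymm DESC"
    )


if __name__ == "__main__":
    print(build_summary_query("lcp"))
```

The origin is left as a named parameter rather than interpolated into the string, so the same query text can be reused for any origin and the value is supplied separately when the query is run.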
device_summary
The device_summary table contains aggregated statistics by month, origin and device. In addition to the metrics_summary columns there is:
- device: Device form factor
country_summary
The country_summary table contains aggregated statistics by month, origin, country and device. In addition to the metrics_summary columns there are:
- country_code: Two-letter country code
- device: Device form factor
origin_summary
The origin_summary table contains a list of all origins in the CrUX dataset. It is updated monthly with the latest list of origins in the dataset and has a single column: origin.
Experimental dataset
Tables in the experimental dataset are exact copies of the default YYYYMM tables, but they make use of newer and more advanced BigQuery features like partitioning and clustering that enable you to write faster, simpler, and cheaper queries.
country
The experimental.country dataset contains aggregated data from the country_CC datasets with an additional yyyymm column for the dataset date. The schema is identical to that of the raw tables with the addition of the date and country_code columns, so country-level comparisons over time can be queried without joining the monthly tables.
global
The experimental.global dataset contains aggregated data from the all dataset with an additional yyyymm column for the dataset date. The schema is identical to that of the raw tables with the addition of the date, so comparisons over time can be queried without joining the monthly tables.
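The difference between the two approaches can be illustrated with a small Python sketch that only builds identifiers and query text. The helper names are hypothetical, and the sketch assumes that yyyymm is stored as an integer such as 202204 and that the table is addressed as chrome-ux-report.experimental.global. With monthly tables, a time comparison needs one table per month, while the experimental table needs a single filter on yyyymm.

```python
from typing import List


def month_range(start_yyyymm: int, end_yyyymm: int) -> List[int]:
    """List every month from start to end inclusive, as YYYYMM integers."""
    months: List[int] = []
    year, month = divmod(start_yyyymm, 100)
    while year * 100 + month <= end_yyyymm:
        months.append(year * 100 + month)
        month += 1
        if month > 12:
            # Wrap from December to January of the following year.
            year, month = year + 1, 1
    return months


def monthly_table_ids(start_yyyymm: int, end_yyyymm: int) -> List[str]:
    """Monthly tables that would each need to be queried and combined."""
    return [f"chrome-ux-report.all.{m}" for m in month_range(start_yyyymm, end_yyyymm)]


def experimental_global_query(start_yyyymm: int, end_yyyymm: int) -> str:
    """A single query over the experimental table covering the same months."""
    return (
        "SELECT yyyymm, origin\n"
        "FROM `chrome-ux-report.experimental.global`\n"
        f"WHERE yyyymm BETWEEN {int(start_yyyymm)} AND {int(end_yyyymm)}"
    )


if __name__ == "__main__":
    print(monthly_table_ids(202111, 202202))
    print(experimental_global_query(202111, 202202))
```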