CrUX on BigQuery
Learn how CrUX data is structured on BigQuery.
The raw data behind the Chrome UX Report (CrUX) is available on BigQuery, a database hosted on the Google Cloud Platform (GCP).
CrUX on BigQuery allows users to directly query the full dataset going back to 2017, for example to analyze trends, compare web technologies and benchmark domains.
The data is organized into one table per monthly release, alongside a number of summary tables that provide simpler access for querying. Both are documented below.
Accessing the dataset in GCP
Using BigQuery requires a GCP project and basic knowledge of SQL. The CrUX dataset on BigQuery is free to access and explore up to the limits of the BigQuery free tier, which is renewed monthly. Additionally, new GCP users may be eligible for a signup credit to cover expenses beyond the free tier. Note that a credit card must be provided for the GCP project; see Why do I need to provide a credit card?
If this is your first time using BigQuery, follow the steps below to set up a project:
- Navigate to Google Cloud Platform.
- Click Create a Project.
- Give your new project a name like “My Chrome UX Report” and click Create.
- Provide your billing information if prompted.
- Navigate to the CrUX dataset on BigQuery.
Now you’re ready to start querying the dataset.
For example queries, see the getting started guide on web.dev.
Each month's CrUX data is released on BigQuery on the second Tuesday of the following month. Each month is released as a new table under chrome-ux-report.all. There are also a number of materialized tables which provide summary statistics for each month.
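The naming and release schedule described above can be sketched in a few lines of Python. This is an illustration only: monthly_table and release_date are hypothetical helper names, and the table ID format (chrome-ux-report.all followed by a YYYYMM table name) is inferred from the text rather than taken from an official client library.

```python
import datetime
import re


def _check_yyyymm(yyyymm: str) -> None:
    """Reject anything that is not a six-digit YYYYMM string with a valid month."""
    if not re.fullmatch(r"\d{4}(0[1-9]|1[0-2])", yyyymm):
        raise ValueError("expected a YYYYMM string such as '202204'")


def monthly_table(yyyymm: str) -> str:
    """Return the assumed fully qualified table ID for one monthly release."""
    _check_yyyymm(yyyymm)
    return f"chrome-ux-report.all.{yyyymm}"


def release_date(yyyymm: str) -> datetime.date:
    """Return the second Tuesday of the month following the data month."""
    _check_yyyymm(yyyymm)
    year, month = int(yyyymm[:4]), int(yyyymm[4:])
    # Step to the following month, rolling over the year after December.
    if month == 12:
        year, month = year + 1, 1
    else:
        month += 1
    first = datetime.date(year, month, 1)
    # date.weekday() counts Monday as 0, so Tuesday is 1.
    days_to_tuesday = (1 - first.weekday()) % 7
    first_tuesday = first + datetime.timedelta(days=days_to_tuesday)
    return first_tuesday + datetime.timedelta(days=7)


if __name__ == "__main__":
    print(monthly_table("202204"), release_date("202204"))
```

For the 202204 dataset, this gives the table chrome-ux-report.all.202204 and a release date of 2022-05-10.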
Detailed table schema
The raw tables for each country and the all dataset have the following schema:
Materialized table schema
Materialized tables are provided for easy access to summary data by a number of key dimensions. No histograms are provided; instead, performance data is aggregated into fractions by performance assessment and the 75th percentile value. A set of example rows from the metrics_summary table is shown below:
This shows that in the 202204 dataset, 90.56% of real-user experiences on https://example.com met the criteria for good LCP, and that the coarse 75th percentile LCP value was 1,600 ms. This is slightly slower than in previous months.
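A row like the one described above could be fetched with a query along the following lines. This is a minimal sketch under stated assumptions, not an official snippet: the dataset path chrome-ux-report.materialized and the column names yyyymm, origin, fast_lcp and p75_lcp do not appear in this text and are assumed here, and build_lcp_query and as_percent are hypothetical helpers. The code only assembles the SQL string; running it would still require a BigQuery client and a GCP project.

```python
import re

# Assumed location of the summary table (not stated in the text above).
METRICS_SUMMARY = "chrome-ux-report.materialized.metrics_summary"


def build_lcp_query(yyyymm: str) -> str:
    """Return SQL selecting LCP summary columns for one origin in one month.

    The origin is left as the named query parameter @origin so that it can be
    supplied separately when the query is run, instead of being pasted into
    the SQL text.
    """
    if not re.fullmatch(r"\d{4}(0[1-9]|1[0-2])", yyyymm):
        raise ValueError("expected a YYYYMM string such as '202204'")
    return (
        "SELECT yyyymm, origin, fast_lcp, p75_lcp\n"  # assumed column names
        f"FROM `{METRICS_SUMMARY}`\n"
        f"WHERE yyyymm = {yyyymm}\n"  # assumes yyyymm is stored as an integer
        "  AND origin = @origin"
    )


def as_percent(fraction: float) -> str:
    """Format a fraction such as 0.9056 as a percentage with two decimals."""
    return f"{fraction * 100:.2f}%"


if __name__ == "__main__":
    print(build_lcp_query("202204"))
    print(as_percent(0.9056))
```

The fraction columns hold values between 0 and 1, so a stored value of 0.9056 corresponds to the 90.56% quoted above.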
Four materialized tables are provided:
- metrics_summary: key metrics by month and origin
- device_summary: key metrics by month, origin and device type
- country_summary: key metrics by month, origin, device type and country
- origin_summary: a list of all origins included in the dataset
The metrics_summary table contains summary statistics for each origin and each monthly dataset:
- Month of the data collection period
- URL of the site origin
- Coarse popularity ranking (as of March 2021)
- fraction of traffic by CLS thresholds
- fraction of traffic by performance thresholds
- 75th percentile value of performance metrics (milliseconds)
- fraction of notification permission behaviors
- fraction of traffic by form factor
- fraction of traffic by effective connection type
The device_summary table contains aggregated statistics by month, origin and device. In addition to the metrics_summary columns, it has:
- Device form factor
The country_summary table contains aggregated statistics by month, origin, country and device. In addition to the metrics_summary columns, it has:
The origin_summary table contains a list of all origins in the CrUX dataset. It is updated monthly with the latest list of origins and has a single column:
Tables in the experimental dataset are exact copies of the default YYYYMM tables, but they make use of newer and more advanced BigQuery features like partitioning and clustering that enable you to write faster, simpler, and cheaper queries.
The experimental.country table contains aggregated data from the country_CC datasets with an additional yyyymm column for the dataset date. The schema is identical to that of the raw tables with the addition of the date and country_code columns, so country-level comparisons over time can be queried without joining the monthly tables.
The experimental.global table contains aggregated data from the all dataset with an additional yyyymm column for the dataset date. The schema is identical to that of the raw tables with the addition of the date column, so comparisons over time can be queried without joining the monthly tables.
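To illustrate the comparison over time described above, the sketch below builds a single query that spans several months by filtering on the yyyymm column instead of joining one table per month. It rests on assumptions: the full table path chrome-ux-report.experimental.global, an integer yyyymm column, and a column named origin are not confirmed by this text, and build_trend_query is a hypothetical helper that only returns SQL text.

```python
import re

# Assumed fully qualified name of the experimental global table.
EXPERIMENTAL_GLOBAL = "chrome-ux-report.experimental.global"

_YYYYMM = re.compile(r"\d{4}(0[1-9]|1[0-2])")


def build_trend_query(start: str, end: str) -> str:
    """Return SQL counting distinct origins per month between start and end.

    Both bounds are inclusive YYYYMM strings. Because every month lives in the
    same table, a range filter on yyyymm replaces a join or union across the
    individual monthly tables.
    """
    for value in (start, end):
        if not _YYYYMM.fullmatch(value):
            raise ValueError("expected YYYYMM strings such as '202201'")
    if start > end:  # six-digit strings compare in chronological order
        raise ValueError("start must not be later than end")
    return (
        "SELECT yyyymm, COUNT(DISTINCT origin) AS origins\n"  # origin: assumed column
        f"FROM `{EXPERIMENTAL_GLOBAL}`\n"
        f"WHERE yyyymm BETWEEN {start} AND {end}\n"  # assumes integer yyyymm
        "GROUP BY yyyymm\n"
        "ORDER BY yyyymm"
    )


if __name__ == "__main__":
    print(build_trend_query("202201", "202204"))
```

A country-level variant would presumably point at experimental.country and add a filter on the country_code column mentioned above.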