Experimenting with subgroups
The subgroups feature enables SIMD-level parallelism, allowing threads within a group to communicate and perform collective math operations (for example, calculating the sum of 16 numbers). This provides a highly efficient form of cross-thread data sharing.
A minimal implementation of the subgroups proposal is available for local testing behind the "Unsafe WebGPU Support" flag at chrome://flags/#enable-unsafe-webgpu.
You can also try subgroups on your site with real users by signing up for the origin trial. Read Get started with origin trials for instructions on how to prepare your site to use origin trials. The origin trial will run from Chrome 128 to 131 (ending February 19, 2025). See Intent to Experiment.
When the "subgroups" feature is available in a GPUAdapter, request a GPUDevice with this feature to get subgroups support in WGSL and check its minSubgroupSize and maxSubgroupSize limits.
You also need to explicitly enable this extension in your WGSL code with the enable subgroups; directive. When enabled, you get access to the following additions:
- subgroup_invocation_id: A built-in value for the index of the thread within the subgroup.
- subgroup_size: A built-in value for subgroup size access.
- subgroupBallot(value): Returns a set of bit fields where the bit corresponding to subgroup_invocation_id is 1 if value is true for that active invocation and 0 otherwise.
- subgroupBroadcast(value, id): Broadcasts the value from the invocation with subgroup_invocation_id matching id to all invocations within the subgroup. Note: id must be a compile-time constant.
More built-in functions such as subgroupAdd, subgroupAll, subgroupElect, and subgroupShuffle will be added in the future. See issue 354738715.
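To build intuition for the two built-in functions listed above, here is a minimal CPU-side sketch in JavaScript. It is not part of WebGPU: simulateSubgroupBallot and simulateSubgroupBroadcast are hypothetical helper names, each array element stands for one invocation indexed by its subgroup_invocation_id, and the sketch assumes that every invocation is active and that the ballot is returned as four 32-bit words.

```javascript
// CPU-side model of two subgroup built-ins, for illustration only.
// Assumption: all invocations are active; index i is subgroup_invocation_id i.

// Models subgroupBallot(value): returns four unsigned 32-bit words in which
// bit i is 1 when invocation i passed `true`, and 0 otherwise.
function simulateSubgroupBallot(predicates) {
  if (predicates.length > 128) {
    throw new RangeError("This sketch models at most 128 invocations");
  }
  const words = [0, 0, 0, 0];
  predicates.forEach((isSet, invocationId) => {
    if (isSet) {
      const wordIndex = Math.floor(invocationId / 32);
      const bit = 1 << (invocationId % 32);
      // `>>> 0` keeps the word an unsigned 32-bit integer.
      words[wordIndex] = (words[wordIndex] | bit) >>> 0;
    }
  });
  return words;
}

// Models subgroupBroadcast(value, id): every invocation receives the value
// held by the invocation whose subgroup_invocation_id equals `id`.
function simulateSubgroupBroadcast(values, id) {
  if (!Number.isInteger(id) || id < 0 || id >= values.length) {
    throw new RangeError("id must name an invocation in the subgroup");
  }
  return values.map(() => values[id]);
}

// Eight invocations; the even-numbered ones vote `true`.
const isEven = Array.from({ length: 8 }, (_, i) => i % 2 === 0);
console.log(simulateSubgroupBallot(isEven)); // [85, 0, 0, 0]
console.log(simulateSubgroupBroadcast([10, 20, 30, 40], 2)); // [30, 30, 30, 30]
```

On the GPU these operations run across the hardware subgroup without going through memory; the arrays here only mimic the visible result.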
To allow f16 in subgroups operations, request a GPUDevice with the "subgroups", "subgroups-f16", and "shader-f16" features, then enable it in your WGSL code with the enable f16, subgroups, subgroups_f16; directive.
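As a rough sketch of how the feature checks described above could be combined, the hypothetical helper below (chooseSubgroupFeatures is not a WebGPU API) takes anything with a has() method, such as adapter.features, and returns the feature list to request together with the matching WGSL enable directive. Falling back to plain subgroups when the f16 features are missing is an assumption of this sketch, not a requirement.

```javascript
// Hypothetical helper, not part of the WebGPU API.
// `adapterFeatures` is anything with a has() method, such as adapter.features.
function chooseSubgroupFeatures(adapterFeatures, wantF16 = false) {
  if (!adapterFeatures.has("subgroups")) {
    return null; // Subgroups are not available on this adapter.
  }
  const f16Features = ["subgroups-f16", "shader-f16"];
  const canUseF16 =
    wantF16 && f16Features.every((name) => adapterFeatures.has(name));
  if (canUseF16) {
    return {
      requiredFeatures: ["subgroups", ...f16Features],
      enableDirective: "enable f16, subgroups, subgroups_f16;",
    };
  }
  // Assumption: fall back to plain subgroups when f16 is unavailable.
  return {
    requiredFeatures: ["subgroups"],
    enableDirective: "enable subgroups;",
  };
}

// Example with a stand-in feature set; in a page, pass adapter.features.
const plan = chooseSubgroupFeatures(
  new Set(["subgroups", "subgroups-f16", "shader-f16"]),
  true
);
console.log(plan.requiredFeatures); // ["subgroups", "subgroups-f16", "shader-f16"]
console.log(plan.enableDirective); // "enable f16, subgroups, subgroups_f16;"
```

In a page, plan.requiredFeatures would be passed to adapter.requestDevice() and plan.enableDirective placed at the top of the shader source.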
The following code snippet provides a base to tinker with and discover the potential of subgroups.
const adapter = await navigator.gpu.requestAdapter();
if (!adapter.features.has("subgroups")) {
  throw new Error("Subgroups support is not available");
}
// Explicitly request subgroups support.
const device = await adapter.requestDevice({
  requiredFeatures: ["subgroups"],
});
const shaderModule = device.createShaderModule({ code: `
  enable subgroups;
  var<workgroup> wgmem : u32;
  @group(0) @binding(0)
  var<storage, read> inputs : array<u32>;
  @group(0) @binding(1)
  var<storage, read_write> output : array<u32>;
  @compute @workgroup_size(64)
  fn main(@builtin(subgroup_size) subgroupSize : u32,
          @builtin(subgroup_invocation_id) id : u32,
          @builtin(local_invocation_index) lid : u32) {
    // One thread per workgroup writes the value to workgroup memory.
    if (lid == 0) {
      wgmem = inputs[lid];
    }
    workgroupBarrier();
    var v = 0u;
    // One thread per subgroup reads the value from workgroup memory
    // and shares that value with every other thread in the subgroup
    // to reduce local memory bandwidth.
    if (id == 0) {
      v = wgmem;
    }
    v = subgroupBroadcast(v, 0);
    output[lid] = v;
  }`,
});
// Send the appropriate commands to the GPU...
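One possible way to fill in the final step is sketched below. runSubgroupsDemo is a hypothetical helper name; it assumes the device and shaderModule from the snippet above, a single workgroup of 64 invocations to match @workgroup_size(64), and an inputValues argument that is a non-empty Uint32Array. Treat it as a starting point rather than the only way to dispatch the shader.

```javascript
// Hypothetical helper: runs the compute shader once and reads back `output`.
// Assumes `inputValues` is a non-empty Uint32Array and the shader's
// entry point is `main` with @workgroup_size(64).
async function runSubgroupsDemo(device, shaderModule, inputValues) {
  const workgroupSize = 64;
  const outputByteLength = workgroupSize * Uint32Array.BYTES_PER_ELEMENT;

  // Storage buffer holding the shader's `inputs` array.
  const inputBuffer = device.createBuffer({
    size: inputValues.byteLength,
    usage: GPUBufferUsage.STORAGE | GPUBufferUsage.COPY_DST,
  });
  device.queue.writeBuffer(inputBuffer, 0, inputValues);

  // Storage buffer the shader writes to, plus a mappable copy for readback.
  const outputBuffer = device.createBuffer({
    size: outputByteLength,
    usage: GPUBufferUsage.STORAGE | GPUBufferUsage.COPY_SRC,
  });
  const readbackBuffer = device.createBuffer({
    size: outputByteLength,
    usage: GPUBufferUsage.MAP_READ | GPUBufferUsage.COPY_DST,
  });

  const pipeline = device.createComputePipeline({
    layout: "auto",
    compute: { module: shaderModule, entryPoint: "main" },
  });
  const bindGroup = device.createBindGroup({
    layout: pipeline.getBindGroupLayout(0),
    entries: [
      { binding: 0, resource: { buffer: inputBuffer } },
      { binding: 1, resource: { buffer: outputBuffer } },
    ],
  });

  // Record one workgroup dispatch, then copy the result for readback.
  const encoder = device.createCommandEncoder();
  const pass = encoder.beginComputePass();
  pass.setPipeline(pipeline);
  pass.setBindGroup(0, bindGroup);
  pass.dispatchWorkgroups(1);
  pass.end();
  encoder.copyBufferToBuffer(outputBuffer, 0, readbackBuffer, 0, outputByteLength);
  device.queue.submit([encoder.finish()]);

  // Copy the mapped data out before unmapping the buffer.
  await readbackBuffer.mapAsync(GPUMapMode.READ);
  const result = new Uint32Array(readbackBuffer.getMappedRange().slice(0));
  readbackBuffer.unmap();
  return result;
}
```

Calling await runSubgroupsDemo(device, shaderModule, new Uint32Array([42])) in a browser with subgroups enabled should return 64 values, each equal to the first input, since every invocation stores the broadcast value.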
Deprecate setting depth bias for lines and points
A WebGPU spec change makes it a validation error to set depthBias, depthBiasSlopeScale, and depthBiasClamp to a non-zero value when the topology for a render pipeline is a line or point type. To give developers enough time to update their code, a warning in the DevTools Console is shown about this upcoming validation while also forcing the values to 0 in these circumstances. See issue 352567424.
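To find affected pipelines before the validation error lands, a small check along these lines may help. findDepthBiasProblems is a hypothetical helper, not a WebGPU API; it inspects a plain render pipeline descriptor object and assumes that "point-list", "line-list", and "line-strip" are the line and point topologies and that an omitted topology defaults to "triangle-list".

```javascript
// Topologies treated as line or point types in this sketch.
const LINE_AND_POINT_TOPOLOGIES = new Set(["point-list", "line-list", "line-strip"]);

// Hypothetical helper: returns the names of the depth bias fields that are
// non-zero on a line or point pipeline descriptor (an empty array if none).
function findDepthBiasProblems(pipelineDescriptor) {
  const topology = pipelineDescriptor.primitive?.topology ?? "triangle-list";
  if (!LINE_AND_POINT_TOPOLOGIES.has(topology)) {
    return []; // Depth bias remains allowed for triangle topologies.
  }
  const depthStencil = pipelineDescriptor.depthStencil ?? {};
  return ["depthBias", "depthBiasSlopeScale", "depthBiasClamp"].filter(
    (field) => (depthStencil[field] ?? 0) !== 0
  );
}

// Example: a line pipeline that still sets a depth bias.
const problems = findDepthBiasProblems({
  primitive: { topology: "line-list" },
  depthStencil: { depthBias: 2, depthBiasSlopeScale: 0, depthBiasClamp: 0.5 },
});
console.log(problems); // ["depthBias", "depthBiasClamp"]
```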
Hide uncaptured error DevTools warning if preventDefault
In the DevTools Console, warnings for uncapturederror events are no longer displayed if an event listener for uncapturederror has been registered and the Event preventDefault() method has been called within the event listener callback. This behavior matches event handling in JavaScript. See the following example and issue 40263619.
const adapter = await navigator.gpu.requestAdapter();
const device = await adapter.requestDevice();
device.addEventListener("uncapturederror", (event) => {
  // Prevents the browser warning from showing up in the DevTools Console.
  event.preventDefault();
  // TODO: Handle event.error
});
WGSL interpolate sampling first and either
The WGSL interpolate attribute lets you manage user-defined IO data interpolation. Now, new interpolate sampling parameters first (default) and either give you additional control: first uses the value from the primitive's first vertex, while either allows the value from either the first or the last vertex. See issue 340278447.
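As an illustration, the sketch below embeds a small WGSL shader in a JavaScript string, the same way the subgroups sample does. The shader and its names (interpolateShaderCode, VertexOutput, vsMain, fsMain) are hypothetical, and it assumes that these sampling parameters are paired with the flat interpolation type.

```javascript
// Hypothetical shader source showing both sampling parameters on
// flat-interpolated values. Pass it to device.createShaderModule({ code }).
const interpolateShaderCode = `
  struct VertexOutput {
    @builtin(position) position : vec4f,
    // Takes the value from the primitive's first vertex.
    @location(0) @interpolate(flat, first) firstId : u32,
    // May take the value from the first or the last vertex.
    @location(1) @interpolate(flat, either) eitherId : u32,
  };

  @vertex
  fn vsMain(@builtin(vertex_index) vertexIndex : u32) -> VertexOutput {
    var positions = array<vec2f, 3>(
      vec2f(0.0, 0.5), vec2f(-0.5, -0.5), vec2f(0.5, -0.5));
    var output : VertexOutput;
    output.position = vec4f(positions[vertexIndex], 0.0, 1.0);
    output.firstId = vertexIndex;
    output.eitherId = vertexIndex;
    return output;
  }

  @fragment
  fn fsMain(input : VertexOutput) -> @location(0) vec4f {
    // For one triangle, firstId is 0 while eitherId may be 0 or 2.
    return vec4f(f32(input.firstId), f32(input.eitherId) / 2.0, 0.0, 1.0);
  }`;
```

Because either leaves the choice of vertex to the implementation, it fits cases where every vertex of a primitive carries the same value.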
Dawn updates
The implementation of Dawn's WGPUFuture to handle asynchronous operations is now complete. Key concepts include wgpuInstanceProcessEvents for opportunistic event processing and WGPUCallbackMode for defining callback locations. WGPUFuture signifies one-time events with an infinite lifetime, and wgpuInstanceWaitAny awaits completion of any future or a timeout. See issue 42240932.
The CompositeAlphaMode::Auto value is no longer reported by Surface::GetCapabilities(). It's still valid, and equivalent to Surface::GetCapabilities().alphaMode[0]. See issue 292.
The OpenGL backend now supports Surface with a y-flip blit for each Present() call. See issue 344814083.
The Adapter::GetProperties() method is deprecated in favor of using Adapter::GetInfo().
Jaswant, an external contributor, has rewritten all the CMake files, making them easier to update and allowing for pre-builds. Check out the quickstart guide for using Dawn in CMake projects.
This covers only some of the key highlights. Check out the exhaustive list of commits.
What's New in WebGPU
A list of everything that has been covered in the What's New in WebGPU series.
Chrome 131
- Clip distances in WGSL
- GPUCanvasContext getConfiguration()
- Point and line primitives must not have depth bias
- Inclusive scan built-in functions for subgroups
- Experimental support for multi-draw indirect
- Shader module compilation option strict math
- Remove GPUAdapter requestAdapterInfo()
- Dawn updates
Chrome 130
- Dual source blending
- Shader compilation time improvements on Metal
- Deprecation of GPUAdapter requestAdapterInfo()
- Dawn updates
Chrome 129
Chrome 128
- Experimenting with subgroups
- Deprecate setting depth bias for lines and points
- Hide uncaptured error DevTools warning if preventDefault
- WGSL interpolate sampling first and either
- Dawn updates
Chrome 127
- Experimental support for OpenGL ES on Android
- GPUAdapter info attribute
- WebAssembly interop improvements
- Improved command encoder errors
- Dawn updates
Chrome 126
- Increase maxTextureArrayLayers limit
- Buffer upload optimization for Vulkan backend
- Shader compilation time improvements
- Submitted command buffers must be unique
- Dawn updates
Chrome 125
Chrome 124
- Read-only and read-write storage textures
- Service workers and shared workers support
- New adapter information attributes
- Bug fixes
- Dawn updates
Chrome 123
- DP4a built-in functions support in WGSL
- Unrestricted pointer parameters in WGSL
- Syntax sugar for dereferencing composites in WGSL
- Separate read-only state for stencil and depth aspects
- Dawn updates
Chrome 122
- Expand reach with compatibility mode (feature in development)
- Increase maxVertexAttributes limit
- Dawn updates
Chrome 121
- Support WebGPU on Android
- Use DXC instead of FXC for shader compilation on Windows
- Timestamp queries in compute and render passes
- Default entry points to shader modules
- Support display-p3 as GPUExternalTexture color space
- Memory heaps info
- Dawn updates
Chrome 120
- Support for 16-bit floating-point values in WGSL
- Push the limits
- Changes to depth-stencil state
- Adapter information updates
- Timestamp queries quantization
- Spring-cleaning features
Chrome 119
- Filterable 32-bit float textures
- unorm10-10-10-2 vertex format
- rgb10a2uint texture format
- Dawn updates
Chrome 118
- HTMLImageElement and ImageData support in copyExternalImageToTexture()
- Experimental support for read-write and read-only storage texture
- Dawn updates
Chrome 117
- Unset vertex buffer
- Unset bind group
- Silence errors from async pipeline creation when device is lost
- SPIR-V shader module creation updates
- Improving developer experience
- Caching pipelines with automatically generated layout
- Dawn updates
Chrome 116
- WebCodecs integration
- Lost device returned by GPUAdapter requestDevice()
- Keep video playback smooth if importExternalTexture() is called
- Spec conformance
- Improving developer experience
- Dawn updates
Chrome 115
- Supported WGSL language extensions
- Experimental support for Direct3D 11
- Get discrete GPU by default on AC power
- Improving developer experience
- Dawn updates
Chrome 114
- Optimize JavaScript
- getCurrentTexture() on unconfigured canvas throws InvalidStateError
- WGSL updates
- Dawn updates