What's New in WebGPU (Chrome 123)

François Beaufort

DP4a built-in functions support in WGSL

DP4a (Dot Product of 4 Elements and Accumulate) refers to a set of GPU instructions commonly used in deep learning inference with quantized models. These instructions efficiently perform 8-bit integer dot products, which accelerates the computation of int8-quantized models. Compared with their f32 versions, such models can save up to 75% of memory and network bandwidth and improve inference performance. As a result, DP4a is now heavily used within many popular AI frameworks.

When the "packed_4x8_integer_dot_product" WGSL language extension is present in navigator.gpu.wgslLanguageFeatures, you can use the dot4U8Packed and dot4I8Packed built-in functions in your WGSL shader code. Their inputs are 32-bit integer scalars, each packing a 4-component vector of 8-bit integers. You can also pack and unpack such vectors with the pack4xI8, pack4xU8, pack4xI8Clamp, pack4xU8Clamp, unpack4xI8, and unpack4xU8 WGSL built-in functions.
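To make the packed layout concrete, here is a minimal CPU-side sketch in JavaScript of the arithmetic these built-ins perform. The helpers are hypothetical stand-ins that reuse the WGSL names for readability; they are not part of any WebGPU API, and they assume component 0 sits in the lowest byte, as in the WGSL packing functions.

```javascript
// Hypothetical CPU-side helpers for illustration only; the real built-ins run in WGSL.

// Packs the low 8 bits of four integers into one 32-bit value (component 0 in the lowest byte).
function pack4xU8(v) {
  return ((v[0] & 0xff) | ((v[1] & 0xff) << 8) | ((v[2] & 0xff) << 16) | ((v[3] & 0xff) << 24)) >>> 0;
}

// Splits a 32-bit value into four unsigned 8-bit components.
function unpack4xU8(packed) {
  return [0, 8, 16, 24].map((shift) => (packed >>> shift) & 0xff);
}

// Splits a 32-bit value into four signed 8-bit components (sign-extended).
function unpack4xI8(packed) {
  return unpack4xU8(packed).map((byte) => (byte << 24) >> 24);
}

// Unsigned dot product of two packed vectors of four 8-bit integers.
function dot4U8Packed(a, b) {
  const va = unpack4xU8(a);
  const vb = unpack4xU8(b);
  return va.reduce((sum, x, i) => sum + x * vb[i], 0) >>> 0;
}

// Signed dot product of two packed vectors of four 8-bit integers.
function dot4I8Packed(a, b) {
  const va = unpack4xI8(a);
  const vb = unpack4xI8(b);
  return va.reduce((sum, x, i) => sum + x * vb[i], 0) | 0;
}

console.log(dot4U8Packed(0x01020304, 0x02040405)); // 4*5 + 3*4 + 2*4 + 1*2 = 42
```

On the GPU, a DP4a instruction can do this unpack, multiply, and accumulate sequence in a single step, which is where the speedup for int8-quantized models comes from.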

It's recommended to use a requires-directive to signal the potential for non-portability with requires packed_4x8_integer_dot_product; at the top of your WGSL shader code. See the following example and issue tint:1497.

if (!navigator.gpu.wgslLanguageFeatures.has("packed_4x8_integer_dot_product")) {
  throw new Error(`DP4a built-in functions are not available`);
}

const adapter = await navigator.gpu.requestAdapter();
const device = await adapter.requestDevice();

const shaderModule = device.createShaderModule({ code: `
  requires packed_4x8_integer_dot_product;

  fn main() {
    const result: u32 = dot4U8Packed(0x01020304u, 0x02040405u); // 42
  }`,
});
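
Throwing is one option when the extension is missing; another is to fall back to plain WGSL. The following is a minimal sketch under stated assumptions: buildDotShaderCode() and the WGSL function name dot4 are hypothetical and not part of the WebGPU API, and the fallback simply unpacks each byte with shifts and masks.

```javascript
// Hypothetical helper: returns WGSL source that defines dot4(a, b) for packed u8 vectors.
// `features` is anything with a has() method, such as navigator.gpu.wgslLanguageFeatures.
function buildDotShaderCode(features) {
  if (features.has("packed_4x8_integer_dot_product")) {
    // Fast path: use the DP4a built-in directly.
    return `
      requires packed_4x8_integer_dot_product;

      fn dot4(a: u32, b: u32) -> u32 {
        return dot4U8Packed(a, b);
      }`;
  }
  // Fallback: extract each byte with shifts and masks, then multiply and accumulate.
  return `
      fn dot4(a: u32, b: u32) -> u32 {
        var sum = 0u;
        for (var i = 0u; i < 4u; i++) {
          let shift = 8u * i;
          sum += ((a >> shift) & 0xffu) * ((b >> shift) & 0xffu);
        }
        return sum;
      }`;
}

// In a browser:
// device.createShaderModule({ code: buildDotShaderCode(navigator.gpu.wgslLanguageFeatures) });
```

Either variant exposes the same dot4 function, so the rest of the shader does not need to know which path was chosen.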

Special thanks to Intel's Web Graphics team in Shanghai for driving this specification and implementation to completion!

Unrestricted pointer parameters in WGSL

The "unrestricted_pointer_parameters" WGSL language extension loosens restrictions on which pointers can be passed to WGSL functions:

  • Pointers in the storage, uniform, and workgroup address spaces can now be parameters of user-declared functions.

  • Pointers to structure members and array elements can now be passed to user-declared functions.

Check out Pointers As Function Parameters | Tour of WGSL to learn more about it.

This feature can be feature-detected using navigator.gpu.wgslLanguageFeatures. It's recommended to always use a requires-directive to signal the potential for non-portability with requires unrestricted_pointer_parameters; at the top of your WGSL shader code. See the following example, the WGSL spec changes, and issue tint:2053.

if (!navigator.gpu.wgslLanguageFeatures.has("unrestricted_pointer_parameters")) {
  throw new Error(`Unrestricted pointer parameters are not available`);
}

const adapter = await navigator.gpu.requestAdapter();
const device = await adapter.requestDevice();

const shaderModule = device.createShaderModule({ code: `
  requires unrestricted_pointer_parameters;

  @group(0) @binding(0) var<storage, read_write> S : i32;

  fn func(pointer : ptr<storage, i32, read_write>) {
    *pointer = 42;
  }

  @compute @workgroup_size(1)
  fn main() {
    func(&S);
  }`
});
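
The second relaxation, passing pointers to structure members and array elements, can be sketched in the same way. The Particle struct, the age() function, and the particleShaderCode variable are hypothetical names chosen for illustration; the string is meant to be passed to device.createShaderModule() in a browser that supports the extension.

```javascript
// WGSL source only; pass it to device.createShaderModule({ code: particleShaderCode }).
// Particle and age() are hypothetical names used for illustration.
const particleShaderCode = `
  requires unrestricted_pointer_parameters;

  struct Particle {
    position : vec3f,
    lifetime : f32,
  }

  @group(0) @binding(0) var<storage, read_write> particles : array<Particle, 64>;

  // Takes a pointer to a single f32 that lives inside the storage buffer.
  fn age(lifetime : ptr<storage, f32, read_write>) {
    *lifetime = max(*lifetime - 1.0, 0.0);
  }

  @compute @workgroup_size(64)
  fn main(@builtin(local_invocation_index) i : u32) {
    // Pointer to a structure member of an array element.
    age(&particles[i].lifetime);
  }`;
```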

Syntax sugar for dereferencing composites in WGSL

When the "pointer_composite_access" WGSL language extension is present in navigator.gpu.wgslLanguageFeatures, your WGSL shader code now supports access to components of complex data types using the same dot (.) syntax whether you're working directly with the data or with a pointer to it. Here's how it works:

  • If foo is a pointer: foo.bar is a more convenient way to write (*foo).bar. The asterisk (*) would normally be needed to turn the pointer into a "reference" whose members can be accessed, but pointers and references are now much more similar and almost interchangeable.

  • If foo is not a pointer: The dot (.) operator works exactly as you're used to for directly accessing members.

Similarly, if pa is a pointer to an array, then pa[i] gives you direct access to the memory location where the i-th element of that array is stored, just as (*pa)[i] does.

It's recommended to use a requires-directive to signal the potential for non-portability with requires pointer_composite_access; at the top of your WGSL shader code. See the following example and issue tint:2113.

if (!navigator.gpu.wgslLanguageFeatures.has("pointer_composite_access")) {
  throw new Error(`Pointer composite access is not available`);
}

const adapter = await navigator.gpu.requestAdapter();
const device = await adapter.requestDevice();

const shaderModule = device.createShaderModule({ code: `
  requires pointer_composite_access;

  fn main() {
    var a = vec3f();
    let p : ptr<function, vec3f> = &a;
    let r1 = (*p).x; // always valid.
    let r2 = p.x; // requires pointer composite access.
  }`
});
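
The array case described earlier can be sketched the same way. The sum() function and the sumShaderCode variable are hypothetical names for illustration; the string is meant to be passed to device.createShaderModule() in a browser that supports the extension.

```javascript
// WGSL source only; pass it to device.createShaderModule({ code: sumShaderCode }).
// sum() is a hypothetical function used for illustration.
const sumShaderCode = `
  requires pointer_composite_access;

  fn sum(pa : ptr<function, array<f32, 4>>) -> f32 {
    var total = 0.0;
    for (var i = 0u; i < 4u; i++) {
      total += pa[i]; // shorthand for (*pa)[i]
    }
    return total;
  }

  fn main() {
    var values = array<f32, 4>(1.0, 2.0, 3.0, 4.0);
    let total = sum(&values);
  }`;
```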

Separate read-only state for stencil and depth aspects

Previously, read-only depth-stencil attachments in render passes required both aspects (depth and stencil) to be read-only. This limitation has been lifted. Now, you can use the depth aspect in read-only fashion, for contact shadow tracing for instance, while the stencil buffer is written to identify pixels for further processing. See issue dawn:2146.
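
As a minimal sketch of what this looks like in JavaScript, the attachment below marks only the depth aspect as read-only. The makeDepthStencilAttachment() helper is hypothetical, and the view is assumed to come from a texture whose format has both aspects, such as "depth24plus-stencil8".

```javascript
// Hypothetical helper: builds a depth-stencil attachment with read-only depth and writable stencil.
// `view` is assumed to be a GPUTextureView of a texture with both depth and stencil aspects.
function makeDepthStencilAttachment(view) {
  return {
    view,
    // Depth is read-only, so depthLoadOp and depthStoreOp are omitted.
    depthReadOnly: true,
    // Stencil stays writable, so it needs load and store operations.
    stencilReadOnly: false,
    stencilClearValue: 0,
    stencilLoadOp: "clear",
    stencilStoreOp: "store",
  };
}

// In a browser, with an existing commandEncoder and colorAttachments:
// const pass = commandEncoder.beginRenderPass({
//   colorAttachments,
//   depthStencilAttachment: makeDepthStencilAttachment(depthStencilView),
// });
```

Render pipelines used in such a pass are expected to leave depth writes disabled (depthWriteEnabled: false) while still being free to write stencil values.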

Dawn updates

The uncaptured error callback set with wgpuDeviceSetUncapturedErrorCallback() is now called immediately when the error happens. This is what developers consistently expect and want for debugging. See change dawn:173620.

The wgpuSurfaceGetPreferredFormat() method from the webgpu.h API has been implemented. See issue dawn:1362.

This covers only some of the key highlights. Check out the exhaustive list of commits.

What's New in WebGPU

A list of everything that has been covered in the What's New in WebGPU series.

Chrome 124

Chrome 123

Chrome 122

Chrome 121

Chrome 120

Chrome 119

Chrome 118

Chrome 117

Chrome 116

Chrome 115

Chrome 114

Chrome 113