DP4a built-in functions support in WGSL
DP4a (Dot Product of 4 Elements and Accumulate) refers to a set of GPU instructions commonly used in deep learning inference with quantized models. These instructions efficiently compute 8-bit integer dot products, which accelerates int8-quantized models. Compared with their f32 versions, such models can save up to 75% of memory and network bandwidth and can run inference faster. As a result, DP4a is now heavily used in many popular AI frameworks.
When the "packed_4x8_integer_dot_product" WGSL language extension is present in navigator.gpu.wgslLanguageFeatures, you can pass 32-bit integer scalars, each packing a 4-component vector of 8-bit integers, to the dot4U8Packed and dot4I8Packed built-in functions in your WGSL shader code. You can also pack and unpack such vectors with the pack4xI8, pack4xU8, pack4xI8Clamp, pack4xU8Clamp, unpack4xI8, and unpack4xU8 WGSL built-in functions.
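To make the packing layout concrete, here is a minimal CPU-side sketch in JavaScript of what the packing and unpacking built-ins compute. It is only an illustration, not how the GPU implements them: vectors are modelled as plain 4-element arrays, the helper pack4x8 is a hypothetical name standing in for both pack4xI8 and pack4xU8, and the x component is assumed to land in the lowest byte.

```javascript
// Illustrative CPU emulation of the WGSL packing built-ins (not a real API).
// A vec4 is modelled as a 4-element array [x, y, z, w].

// Hypothetical helper standing in for pack4xI8 / pack4xU8: keeps the low
// 8 bits of each component, with x in the lowest byte of the result.
function pack4x8(v) {
  return (
    ((v[0] & 0xff) |
      ((v[1] & 0xff) << 8) |
      ((v[2] & 0xff) << 16) |
      ((v[3] & 0xff) << 24)) >>> 0
  );
}

const clamp = (x, lo, hi) => Math.min(Math.max(x, lo), hi);

// The Clamp variants saturate each component to the 8-bit range first.
function pack4xI8Clamp(v) {
  return pack4x8(v.map((x) => clamp(x, -128, 127)));
}
function pack4xU8Clamp(v) {
  return pack4x8(v.map((x) => clamp(x, 0, 255)));
}

// Unpacking zero-extends (U8) or sign-extends (I8) each byte.
function unpack4xU8(p) {
  return [0, 8, 16, 24].map((s) => (p >>> s) & 0xff);
}
function unpack4xI8(p) {
  return [0, 8, 16, 24].map((s) => ((p >>> s) << 24) >> 24);
}

const packed = pack4xI8Clamp([300, -300, 5, -1]);
console.log(packed.toString(16)); // "ff05807f"
console.log(unpack4xI8(packed)); // [127, -128, 5, -1]
```

The clamped variants are the ones to reach for when quantizing values that may fall outside the 8-bit range, since the plain variants silently drop the upper bits.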
It's recommended to use a requires-directive to signal the potential for non-portability with requires packed_4x8_integer_dot_product; at the top of your WGSL shader code. See the following example and issue tint:1497.
if (!navigator.gpu.wgslLanguageFeatures.has("packed_4x8_integer_dot_product")) {
  throw new Error(`DP4a built-in functions are not available`);
}
const adapter = await navigator.gpu.requestAdapter();
const device = await adapter.requestDevice();
const shaderModule = device.createShaderModule({ code: `
  requires packed_4x8_integer_dot_product;
  fn main() {
    const result: u32 = dot4U8Packed(0x01020304u, 0x02040405u); // 42
  }`,
});
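To see where the 42 in the shader comment comes from, the following JavaScript sketch emulates the two dot product built-ins on the CPU. It is a reference illustration only, assuming the lowest byte holds the first component; the function names mirror the WGSL built-ins but are ordinary JavaScript functions defined here.

```javascript
// Illustrative CPU emulation of dot4U8Packed and dot4I8Packed.

// Treats each 32-bit input as four unsigned 8-bit lanes.
function dot4U8Packed(a, b) {
  let sum = 0;
  for (let shift = 0; shift < 32; shift += 8) {
    sum += ((a >>> shift) & 0xff) * ((b >>> shift) & 0xff);
  }
  return sum >>> 0; // result is a u32
}

// Treats each 32-bit input as four signed 8-bit lanes (sign-extended).
function dot4I8Packed(a, b) {
  let sum = 0;
  for (let shift = 0; shift < 32; shift += 8) {
    const x = ((a >>> shift) << 24) >> 24;
    const y = ((b >>> shift) << 24) >> 24;
    sum += x * y;
  }
  return sum | 0; // result is an i32
}

// Lanes of 0x01020304 are 4, 3, 2, 1; lanes of 0x02040405 are 5, 4, 4, 2.
// 4*5 + 3*4 + 2*4 + 1*2 = 42
console.log(dot4U8Packed(0x01020304, 0x02040405)); // 42

// Every lane of 0xffffffff is -1 when read as a signed byte.
console.log(dot4I8Packed(0xffffffff, 0x01010101)); // -4
```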
Special thanks to Intel's Web Graphics team in Shanghai for driving this specification and implementation to completion!
Unrestricted pointer parameters in WGSL
The "unrestricted_pointer_parameters" WGSL language extension loosens restrictions on which pointers can be passed to WGSL functions:
- Parameter pointers of storage, uniform and workgroup address spaces to user-declared functions.
- Passing pointers to structure members and array elements to user-declared functions.
Check out Pointers As Function Parameters | Tour of WGSL to learn more about it.
You can feature-detect this extension using navigator.gpu.wgslLanguageFeatures. It's recommended to always use a requires-directive to signal the potential for non-portability with requires unrestricted_pointer_parameters; at the top of your WGSL shader code. See the following example, the WGSL spec changes, and issue tint:2053.
if (!navigator.gpu.wgslLanguageFeatures.has("unrestricted_pointer_parameters")) {
  throw new Error(`Unrestricted pointer parameters are not available`);
}
const adapter = await navigator.gpu.requestAdapter();
const device = await adapter.requestDevice();
const shaderModule = device.createShaderModule({ code: `
  requires unrestricted_pointer_parameters;
  @group(0) @binding(0) var<storage, read_write> S : i32;
  fn func(pointer : ptr<storage, i32, read_write>) {
    *pointer = 42;
  }
  @compute @workgroup_size(1)
  fn main() {
    func(&S);
  }`
});
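The example above passes a pointer to a whole storage variable. The second relaxation, passing pointers to structure members and array elements, might look like the following sketch. The struct Particle, the variables particles and scratch, and the helper createPointerParamsModule are hypothetical names chosen for illustration; the device is assumed to come from adapter.requestDevice() as shown earlier.

```javascript
// Hypothetical shader showing pointers to a struct member and to an
// array element being passed to user-declared functions.
const pointerParamsWGSL = `
  requires unrestricted_pointer_parameters;

  struct Particle {
    position : vec2f,
    velocity : vec2f,
  }

  @group(0) @binding(0) var<storage, read_write> particles : array<Particle, 4>;
  var<workgroup> scratch : array<u32, 4>;

  // Pointer parameter in the storage address space.
  fn resetVelocity(v : ptr<storage, vec2f, read_write>) {
    *v = vec2f(0.0);
  }

  // Pointer parameter in the workgroup address space.
  fn bump(counter : ptr<workgroup, u32>) {
    *counter += 1u;
  }

  @compute @workgroup_size(1)
  fn main() {
    resetVelocity(&particles[0].velocity); // member of an array element
    bump(&scratch[1]);                     // array element
  }`;

// Assumes the extension was already feature-detected by the caller.
function createPointerParamsModule(device) {
  return device.createShaderModule({ code: pointerParamsWGSL });
}
```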
Syntax sugar for dereferencing composites in WGSL
When the "pointer_composite_access" WGSL language extension is present in navigator.gpu.wgslLanguageFeatures, your WGSL shader code now supports access to components of complex data types using the same dot (.) syntax whether you're working directly with the data or with a pointer to it. Here's how it works:
- If foo is a pointer: foo.bar is a more convenient way to write (*foo).bar. The asterisk (*) would normally be needed to turn the pointer into a "reference" whose members can be accessed, but now pointers and references are much more similar and almost interchangeable.
- If foo is not a pointer: The dot (.) operator works exactly as you're used to for directly accessing members.
Similarly, if pa is a pointer to an array, then pa[i] is a more convenient way to write (*pa)[i] and gives you direct access to the i-th element of that array.
It's recommended to use a requires-directive to signal the potential for non-portability with requires pointer_composite_access; at the top of your WGSL shader code. See the following example and issue tint:2113.
if (!navigator.gpu.wgslLanguageFeatures.has("pointer_composite_access")) {
  throw new Error(`Pointer composite access is not available`);
}
const adapter = await navigator.gpu.requestAdapter();
const device = await adapter.requestDevice();
const shaderModule = device.createShaderModule({ code: `
  requires pointer_composite_access;
  fn main() {
    var a = vec3f();
    let p : ptr<function, vec3f> = &a;
    let r1 = (*p).x; // always valid.
    let r2 = p.x; // requires pointer composite access.
  }`
});
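The example above covers member access. For the array case, a sketch along the following lines shows pa[i] used directly on a pointer. The function sumWeights, the variable weights, and the helper createCompositeAccessModule are hypothetical names used only for illustration.

```javascript
// Hypothetical shader indexing through a pointer to an array.
const compositeAccessWGSL = `
  requires pointer_composite_access;

  fn sumWeights(pa : ptr<function, array<f32, 4>>) -> f32 {
    var total = 0.0;
    for (var i = 0u; i < 4u; i++) {
      total += pa[i]; // shorthand for (*pa)[i]
    }
    return total;
  }

  @compute @workgroup_size(1)
  fn main() {
    var weights = array<f32, 4>(1.0, 2.0, 3.0, 4.0);
    let total = sumWeights(&weights);
  }`;

// Assumes the extension was already feature-detected by the caller.
function createCompositeAccessModule(device) {
  return device.createShaderModule({ code: compositeAccessWGSL });
}
```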
Separate read-only state for stencil and depth aspects
Previously, read-only depth-stencil attachments in render passes required both aspects (depth and stencil) to be read-only. This limitation has been lifted. Now, you can use the depth aspect in read-only fashion, for contact shadow tracing for instance, while the stencil buffer is written to identify pixels for further processing. See issue dawn:2146.
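As a rough sketch of what this enables, the render pass descriptor below keeps the depth aspect read-only while clearing and storing the stencil aspect. The helper name makeDepthReadStencilWritePass and the view parameters are hypothetical; the depth-stencil view is assumed to come from a texture with a combined format such as "depth24plus-stencil8".

```javascript
// Builds a render pass descriptor with read-only depth and writable stencil.
// colorView and depthStencilView are assumed to be GPUTextureView objects.
function makeDepthReadStencilWritePass(colorView, depthStencilView) {
  return {
    colorAttachments: [
      { view: colorView, loadOp: "load", storeOp: "store" },
    ],
    depthStencilAttachment: {
      view: depthStencilView,
      // Depth is only read, so no depthLoadOp or depthStoreOp is provided.
      depthReadOnly: true,
      // Stencil is still written during the pass.
      stencilClearValue: 0,
      stencilLoadOp: "clear",
      stencilStoreOp: "store",
    },
  };
}

// Usage, assuming an existing GPUCommandEncoder named encoder:
// const pass = encoder.beginRenderPass(
//   makeDepthReadStencilWritePass(colorView, depthStencilView));
```

Render pipelines used in such a pass would also need depth writes disabled in their depth-stencil state to be compatible with the read-only depth aspect.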
Dawn updates
The uncaptured error callback set with wgpuDeviceSetUncapturedErrorCallback() is now called immediately when the error happens. This is what developers consistently expect and want for debugging. See change dawn:173620.
The wgpuSurfaceGetPreferredFormat() method from the webgpu.h API has been implemented. See issue dawn:1362.
This covers only some of the key highlights. Check out the exhaustive list of commits.
What's New in WebGPU
A list of everything that has been covered in the What's New in WebGPU series.
Chrome 142
Chrome 141
- Tint IR completed
- Integer range analysis in WGSL compiler
- SPIR-V 1.4 update for Vulkan backend
- Dawn updates
Chrome 140
- Device requests consume adapter
- Shorthand for using texture where texture view is used
- WGSL textureSampleLevel supports 1D textures
- Deprecate bgra8unorm read-only storage texture usage
- Remove GPUAdapter isFallbackAdapter attribute
- Dawn updates
Chrome 139
- 3D texture support for BC and ASTC compressed formats
- New "core-features-and-limits" feature
- Origin trial for WebGPU compatibility mode
- Dawn updates
Chrome 138
- Shorthand for using buffer as a binding resource
- Size requirement changes for buffers mapped at creation
- Architecture report for recent GPUs
- Deprecate GPUAdapter isFallbackAdapter attribute
- Dawn updates
Chrome 137
- Use texture view for externalTexture binding
- Buffers copy without specifying offsets and size
- WGSL workgroupUniformLoad using pointer to atomic
- GPUAdapterInfo powerPreference attribute
- Remove GPURequestAdapterOptions compatibilityMode attribute
- Dawn updates
Chrome 136
- GPUAdapterInfo isFallbackAdapter attribute
- Shader compilation time improvements on D3D12
- Save and copy canvas images
- Lift compatibility mode restrictions
- Dawn updates
Chrome 135
- Allow creating pipeline layout with null bind group layout
- Allow viewports to extend past the render targets bounds
- Easier access to the experimental compatibility mode on Android
- Remove maxInterStageShaderComponents limit
- Dawn updates
Chrome 134
- Improve machine-learning workloads with subgroups
- Remove float filterable texture types support as blendable
- Dawn updates
Chrome 133
- Additional unorm8x4-bgra and 1-component vertex formats
- Allow unknown limits to be requested with undefined value
- WGSL alignment rules changes
- WGSL performance gains with discard
- Use VideoFrame displaySize for external textures
- Handle images with non-default orientations using copyExternalImageToTexture
- Improving developer experience
- Enable compatibility mode with featureLevel
- Experimental subgroup features cleanup
- Deprecate maxInterStageShaderComponents limit
- Dawn updates
Chrome 132
- Texture view usage
- 32-bit float textures blending
- GPUDevice adapterInfo attribute
- Configuring canvas context with invalid format throw JavaScript error
- Filtering sampler restrictions on textures
- Extended subgroups experimentation
- Improving developer experience
- Experimental support for 16-bit normalized texture formats
- Dawn updates
Chrome 131
- Clip distances in WGSL
- GPUCanvasContext getConfiguration()
- Point and line primitives must not have depth bias
- Inclusive scan built-in functions for subgroups
- Experimental support for multi-draw indirect
- Shader module compilation option strict math
- Remove GPUAdapter requestAdapterInfo()
- Dawn updates
Chrome 130
- Dual source blending
- Shader compilation time improvements on Metal
- Deprecation of GPUAdapter requestAdapterInfo()
- Dawn updates
Chrome 129
Chrome 128
- Experimenting with subgroups
- Deprecate setting depth bias for lines and points
- Hide uncaptured error DevTools warning if preventDefault
- WGSL interpolate sampling first and either
- Dawn updates
Chrome 127
- Experimental support for OpenGL ES on Android
- GPUAdapter info attribute
- WebAssembly interop improvements
- Improved command encoder errors
- Dawn updates
Chrome 126
- Increase maxTextureArrayLayers limit
- Buffer upload optimization for Vulkan backend
- Shader compilation time improvements
- Submitted command buffers must be unique
- Dawn updates
Chrome 125
Chrome 124
- Read-only and read-write storage textures
- Service workers and shared workers support
- New adapter information attributes
- Bug fixes
- Dawn updates
Chrome 123
- DP4a built-in functions support in WGSL
- Unrestricted pointer parameters in WGSL
- Syntax sugar for dereferencing composites in WGSL
- Separate read-only state for stencil and depth aspects
- Dawn updates
Chrome 122
- Expand reach with compatibility mode (feature in development)
- Increase maxVertexAttributes limit
- Dawn updates
Chrome 121
- Support WebGPU on Android
- Use DXC instead of FXC for shader compilation on Windows
- Timestamp queries in compute and render passes
- Default entry points to shader modules
- Support display-p3 as GPUExternalTexture color space
- Memory heaps info
- Dawn updates
Chrome 120
- Support for 16-bit floating-point values in WGSL
- Push the limits
- Changes to depth-stencil state
- Adapter information updates
- Timestamp queries quantization
- Spring-cleaning features
Chrome 119
- Filterable 32-bit float textures
- unorm10-10-10-2 vertex format
- rgb10a2uint texture format
- Dawn updates
Chrome 118
- HTMLImageElement and ImageData support in copyExternalImageToTexture()
- Experimental support for read-write and read-only storage texture
- Dawn updates
Chrome 117
- Unset vertex buffer
- Unset bind group
- Silence errors from async pipeline creation when device is lost
- SPIR-V shader module creation updates
- Improving developer experience
- Caching pipelines with automatically generated layout
- Dawn updates
Chrome 116
- WebCodecs integration
- Lost device returned by GPUAdapter requestDevice()
- Keep video playback smooth if importExternalTexture() is called
- Spec conformance
- Improving developer experience
- Dawn updates
Chrome 115
- Supported WGSL language extensions
- Experimental support for Direct3D 11
- Get discrete GPU by default on AC power
- Improving developer experience
- Dawn updates
Chrome 114
- Optimize JavaScript
- getCurrentTexture() on unconfigured canvas throws InvalidStateError
- WGSL updates
- Dawn updates