A look back in time: the evolution of test automation
The birth of test automation
Let's rewind to the 1990s, when the web browser was born. Test automation didn't become a reality until the 2000s, with the emergence of the Selenium and WebDriver projects to tackle cross-browser and multi-device testing challenges.
Test automation prior to WebDriver “Classic” was quite tricky. Being able to automate browser testing significantly improved the quality of developers’ and testers’ lives.
Broadly, these tools can be organized into two major groups based on how they automate browsers:
- Low level: Tools that execute remote commands outside of the browser. When tools need greater control, such as opening multiple tabs or simulating device mode, they execute remote commands to control the browser via protocols. The two main automation protocols are WebDriver “Classic” and Chrome DevTools Protocol (CDP).
In the next section, we will take a look at these two protocols to understand their strengths and limitations.
WebDriver "Classic" versus Chrome DevTools Protocol (CDP)
WebDriver "Classic" is a web standard supported by all major browsers. Automation scripts issue commands via HTTP requests to a driver server, which then communicates with browsers through internal, browser-specific protocols.
While it has excellent cross-browser support and its APIs are designed for testing, it can be slow and does not support some low-level controls.
For example, imagine you have a test script that clicks on an element with `await coffee.click();`. This call is translated into a series of HTTP requests.
# WebDriver: Click on a coffee element
curl -X POST http://127.0.0.1:4444/session/:session_id/element/:element_id/click \
  -H 'Content-Type: application/json' \
  -d '{}'
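To make the shape of that request concrete, here is a minimal sketch that only builds a description of the HTTP call and does not send it. The helper name `buildClickRequest`, its arguments, and the sample session and element IDs are hypothetical. The path follows the Element Click endpoint from the W3C WebDriver specification.

```javascript
// Sketch: describe the HTTP request a WebDriver "Classic" client sends for a click.
// buildClickRequest and its arguments are hypothetical names used for illustration.
function buildClickRequest(driverUrl, sessionId, elementId) {
  const session = encodeURIComponent(sessionId);
  const element = encodeURIComponent(elementId);
  return {
    method: 'POST',
    // Element Click endpoint: /session/{session id}/element/{element id}/click
    url: `${driverUrl}/session/${session}/element/${element}/click`,
    headers: { 'Content-Type': 'application/json' },
    // The command takes no parameters, so the body is an empty JSON object.
    body: JSON.stringify({}),
  };
}

// Placeholder IDs; a real client receives them from earlier WebDriver commands.
const request = buildClickRequest('http://127.0.0.1:4444', 'session-1', 'element-1');
console.log(request.method, request.url);
```

Each command like this is a separate request and response round trip through the driver server, which is one reason the protocol can feel slow.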
On the other hand, Chrome DevTools Protocol (CDP) was initially designed for Chrome DevTools and debugging, but was adopted by Puppeteer for automation. CDP communicates directly with Chromium-based browsers through WebSocket connections, providing faster performance and low-level control.
However, it only works with Chromium-based browsers and is not an open standard. CDP APIs are also relatively complex and, in some cases, not ergonomic: working with out-of-process iframes, for instance, takes a lot of effort.
With CDP, clicking on an element with `await coffee.click();` is translated into a series of CDP commands.
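// CDP: Click on a coffee element
// Mouse pressed
{ method: 'Input.dispatchMouseEvent', params: { type: 'mousePressed', x: 10.34, y: 27.1, clickCount: 1 } }
// Mouse released
{ method: 'Input.dispatchMouseEvent', params: { type: 'mouseReleased', x: 10.34, y: 27.1, clickCount: 1 } }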
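As a rough sketch of how a client might assemble those commands, the snippet below builds the two JSON messages that would go over the WebSocket. The helper `createClickMessages` is a hypothetical name, and the `button: 'left'` value is an assumption added for illustration. `Input.dispatchMouseEvent` is the CDP method for mouse input.

```javascript
// Sketch: build the CDP messages that make up one click.
// createClickMessages is a hypothetical helper, not part of any library.
function createClickMessages(x, y, firstId = 1) {
  const phases = ['mousePressed', 'mouseReleased'];
  return phases.map((type, index) => ({
    // Each command carries an id so the matching response can be identified.
    id: firstId + index,
    method: 'Input.dispatchMouseEvent',
    // button: 'left' is an assumption; the original example does not specify it.
    params: { type, x, y, button: 'left', clickCount: 1 },
  }));
}

for (const message of createClickMessages(10.34, 27.1)) {
  // This serialized form is what would be written to the WebSocket.
  console.log(JSON.stringify(message));
}
```

The client has to compute the coordinates itself, which hints at why CDP is considered lower level than a WebDriver element click.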
What are the low-level controls?
Back when WebDriver “Classic” was developed, there was little need for low-level control. Times have changed: the web is much more capable now, and testing today demands more fine-grained actions.
Since CDP was designed to cover all debugging needs, it supports more low-level controls compared to WebDriver “Classic”. It's capable of handling features like:
- Capturing console messages
- Intercepting network requests
- Simulating device mode
- Simulating geolocation
- And more!
These weren’t possible in WebDriver “Classic” because of its architecture: it is HTTP-based, so every exchange is a client request followed by a response, which makes it tricky to subscribe and listen to browser events. CDP, on the other hand, is WebSocket-based and supports bi-directional messaging by default.
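The difference is easier to see in a small sketch of a client on a bi-directional connection. The `MessageRouter` class and the simulated traffic are hypothetical and no real browser is involved. The message shapes follow CDP conventions: responses carry the `id` of the command they answer, while events carry a `method` name and no `id`. `Runtime.enable` and `Runtime.consoleAPICalled` are the CDP command and event related to console messages.

```javascript
// Sketch: route incoming messages on a bi-directional connection.
// MessageRouter is a hypothetical class used for illustration only.
class MessageRouter {
  constructor() {
    this.pending = new Map();   // command id -> callback awaiting the response
    this.listeners = new Map(); // event name -> array of callbacks
    this.nextId = 1;
  }

  // Register a command and return the message that would be sent.
  command(method, params, onResult) {
    const id = this.nextId++;
    this.pending.set(id, onResult);
    return { id, method, params };
  }

  // Subscribe to an event the browser may push at any time.
  on(eventName, callback) {
    const callbacks = this.listeners.get(eventName) ?? [];
    callbacks.push(callback);
    this.listeners.set(eventName, callbacks);
  }

  // Call this for every raw message received from the browser.
  receive(raw) {
    const message = JSON.parse(raw);
    if (message.id !== undefined) {
      // A response: hand the result to whoever sent the command.
      const callback = this.pending.get(message.id);
      this.pending.delete(message.id);
      if (callback) callback(message.result);
    } else {
      // An event: notify every subscriber for this event name.
      for (const callback of this.listeners.get(message.method) ?? []) {
        callback(message.params);
      }
    }
  }
}

const router = new MessageRouter();
const seen = [];
router.on('Runtime.consoleAPICalled', (params) => seen.push(`event:${params.type}`));
const enable = router.command('Runtime.enable', {}, () => seen.push('response'));

// Simulated traffic: the browser pushes an event unprompted, then answers the command.
router.receive(JSON.stringify({ method: 'Runtime.consoleAPICalled', params: { type: 'log', args: [] } }));
router.receive(JSON.stringify({ id: enable.id, result: {} }));
console.log(seen);
```

With plain HTTP request and response, the second kind of message has no natural place to arrive, which is why event-driven features are hard to express in WebDriver “Classic”.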
What’s next: WebDriver BiDi
Here is a summary of the strengths of both WebDriver “Classic” and CDP:
|WebDriver “Classic”|Chrome DevTools Protocol (CDP)|
|---|---|
|Best cross-browser support|Fast, bi-directional messaging|
|W3C standard|Provides low-level control|
|Built for testing| |
WebDriver BiDi aims to combine the best aspects of WebDriver "Classic" and CDP. It's a new standard browser automation protocol currently under development.
Learn more about the WebDriver BiDi project—how it works, the vision, and the standardization process.