The Chrome DevTools Protocol (CDP) is a lower-level alternative to the WebDriver protocol used by Selenium. It exposes a lot more functionality, which enables advanced testing and web scraping use cases.

The WebDriver protocol feels kinda garbage once you discover the Chrome DevTools Protocol.

Spawning a browser

CDP works over a WebSocket. Below is a code snippet that starts Chromium with a CDP socket listening on a fixed port. I use a temp directory as the profile so Chromium doesn’t keep around useless data about the session.

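import subprocess
import tempfile

PORT = 1234
HEADLESS = False

# Throwaway profile directory so Chromium leaves no session data behind.
chrome_temp = tempfile.TemporaryDirectory()

args = [
    "chromium",
    "--remote-allow-origins=*",
    f"--remote-debugging-port={PORT}",
    f"--user-data-dir={chrome_temp.name}",
]
if HEADLESS:
    # Append the flag only when needed, instead of passing an empty-string argument.
    args.append("--headless")

chromium = subprocess.Popen(
    args,
    stdout=subprocess.DEVNULL,
    stderr=subprocess.PIPE,  # the DevTools WebSocket URL is printed here
)

assert chromium.stderr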

Connecting to the browser

Once the Chromium subprocess starts, it prints the WebSocket URL to stderr. We can grab the URL and connect to it like this.

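import websocket  # provided by the websocket-client package

DEVTOOLS_URL = ""

# Iterating over the pipe stops at EOF, so this cannot loop forever if Chromium exits early.
for raw_line in chromium.stderr:
    line = raw_line.decode("utf-8").strip()
    if "DevTools listening on" in line:
        # The URL is the last space-separated token on the line.
        DEVTOOLS_URL = line.split(" ")[-1]
        break

assert DEVTOOLS_URL, "Chromium exited before printing the DevTools URL"
print(DEVTOOLS_URL)

ws = websocket.create_connection(DEVTOOLS_URL)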
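
With the socket open, a command is a JSON object with an id, a method, and optional params, and the browser replies with a message carrying the same id. Here is a minimal sketch of that round trip. The helper name send_command is my own, and it assumes ws is any object with send and recv methods, such as the connection created above.

```python
import itertools
import json

# Monotonic counter so every command gets a unique id.
_next_id = itertools.count(1)


def send_command(ws, method, params=None):
    """Send one CDP command and block until its matching reply arrives."""
    command_id = next(_next_id)
    ws.send(json.dumps({"id": command_id, "method": method, "params": params or {}}))
    while True:
        message = json.loads(ws.recv())
        # Events carry no "id"; replies to other commands carry a different one.
        if message.get("id") != command_id:
            continue
        if "error" in message:
            raise RuntimeError(f"{method} failed: {message['error']}")
        return message.get("result", {})
```

For example, send_command(ws, "Browser.getVersion") should return a dictionary with details such as the product name and protocol version. This sketch silently drops any events that arrive while it waits, which is fine for a quick script but not for anything that listens to events.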

Some notes

The Chrome DevTools Protocol (CDP) is a powerful interface that allows developers and testers to programmatically interact with and control Chromium-based browsers.

Originally designed to power Chrome’s built-in Developer Tools, CDP has evolved into a general-purpose tool for browser automation, debugging, and profiling. It provides low-level access to browser functionality through a set of domains, each offering commands and events for tasks like network monitoring, DOM manipulation, and JavaScript execution. CDP can do more than the traditional WebDriver APIs, but it is complex to use directly, so many developers reach it through higher-level automation libraries such as Puppeteer, Playwright, and Selenium 4, which wrap it in friendlier abstractions.

CDP is a remote debugging protocol: a client talks directly to a running Chromium-based browser. The protocol is organized into domains, such as Network, DOM, and CSS, and each domain defines commands and events that are serialized as JSON. This design lets developers inspect browser state, control behavior, and gather debugging information programmatically.
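
To make the message shapes concrete: a reply to a command carries the command's id plus either a result or an error, while an event carries a method and params but no id. The sketch below sorts raw JSON messages into those cases and splits an event's method name into its domain and event name. The function name classify_message and the layout of the returned dictionary are illustrative choices of mine, not part of the protocol.

```python
import json


def classify_message(raw):
    """Classify one raw CDP message as a response, an error, or an event."""
    message = json.loads(raw)
    if "id" in message:
        # Replies echo the id of the command they answer.
        if "error" in message:
            return {"kind": "error", "id": message["id"], "payload": message["error"]}
        return {"kind": "response", "id": message["id"], "payload": message.get("result", {})}
    # Events are named "Domain.eventName", e.g. "Network.requestWillBeSent".
    domain, _, name = message["method"].partition(".")
    return {"kind": "event", "domain": domain, "name": name, "payload": message.get("params", {})}
```

Routing on the domain field is usually enough to hand events to the right handler.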

Key capabilities

CDP covers a wide range of browser control and debugging tasks: inspecting and modifying network requests and responses, manipulating the DOM, executing JavaScript in the page context, and emulating devices and network conditions. It also exposes performance profiling, so developers can analyze page load times and resource usage. Geolocation spoofing and mobile device emulation are useful for testing location-based services and responsive designs. Together these features cover advanced web development scenarios, automated testing, and debugging of complex web applications.
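
Each of these capabilities maps to a handful of protocol methods. As a rough sketch, the function below builds the JSON messages for a mobile, geolocated page load. The method names and parameters come from the Network, Emulation, Page, and Runtime domains. The helper name build_messages, the viewport numbers, and the session_id argument are assumptions for illustration. Page-level commands have to be addressed to a session attached to a page target (for example via Target.attachToTarget with flatten set to true), so the sketch takes that session id as input rather than obtaining it.

```python
import json


def build_messages(session_id, url, latitude, longitude, first_id=1):
    """Return JSON-encoded CDP commands for a mobile, geolocated page load."""
    commands = [
        # Start emitting Network.* events for requests and responses.
        ("Network.enable", {}),
        # Spoof the position reported by the Geolocation API.
        ("Emulation.setGeolocationOverride",
         {"latitude": latitude, "longitude": longitude, "accuracy": 100}),
        # Emulate a phone-sized viewport (the numbers are arbitrary examples).
        ("Emulation.setDeviceMetricsOverride",
         {"width": 390, "height": 844, "deviceScaleFactor": 3, "mobile": True}),
        ("Page.navigate", {"url": url}),
        # Run JavaScript in the page and get the value back as JSON.
        ("Runtime.evaluate", {"expression": "document.title", "returnByValue": True}),
    ]
    return [
        json.dumps({"id": first_id + i, "sessionId": session_id, "method": m, "params": p})
        for i, (m, p) in enumerate(commands)
    ]
```

The list only shows the payload shapes. In a real script you would wait for each reply, and for the page to finish loading, before sending Runtime.evaluate.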

Integration with automation tools

Automation tools like Puppeteer, Playwright, and Selenium 4 use CDP to provide deeper browser control and debugging. These frameworks hide much of CDP’s complexity behind friendlier APIs that simplify browser automation tasks. For instance, Selenium 4 introduced the getDevTools() method on ChromeDriver, which lets testers call CDP from within their existing Selenium scripts. This integration enables scenarios such as network interception, JavaScript debugging, and performance profiling, which were previously hard or impossible with traditional WebDriver APIs.
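
The getDevTools() call mentioned above belongs to Selenium's Java bindings. In the Python bindings, Chromium-based drivers expose an execute_cdp_cmd(cmd, cmd_args) method for the same purpose. Here is a small sketch of the performance profiling case. The helper name collect_performance_metrics is mine, and driver is assumed to be a Chromium-based Selenium driver such as the one returned by webdriver.Chrome().

```python
def collect_performance_metrics(driver):
    """Return CDP performance metrics as a {name: value} dict."""
    # execute_cdp_cmd takes a CDP method name and a dict of parameters.
    driver.execute_cdp_cmd("Performance.enable", {})
    reply = driver.execute_cdp_cmd("Performance.getMetrics", {})
    # The reply holds a "metrics" list of {"name": ..., "value": ...} entries.
    return {metric["name"]: metric["value"] for metric in reply["metrics"]}
```

Which metric names come back depends on the browser version, so treat the keys as something to inspect rather than rely on.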

Comparison with WebDriver

CDP and WebDriver take distinct approaches to browser automation. WebDriver provides a standardized, cross-browser API and is easier to use, which makes it a good fit for general web testing. CDP offers more granular control, but only for Chromium-based browsers: it gives deeper access to browser internals, enabling network interception, JavaScript debugging, and performance profiling that are not readily available through WebDriver. That power comes with more complexity, and the protocol can change between browser versions. Tools like WebdriverIO have begun to bridge the gap by offering CDP-like capabilities through WebDriver-compatible interfaces, so testers can use both in one automation framework.