ExecuteCDP for exposing the Chrome DevTools Protocol #83

GCuser99 · 2023-06-20T15:40:17Z

GCuser99
Jun 20, 2023
Maintainer

As of version 4.2, SeleniumVBA supports communicating with the Chrome and Edge browsers via the Chrome DevTools Protocol (CDP), which provides a low-level interface for browser interaction. Hundreds of CDP commands are available for customizing SeleniumVBA automation. For a complete list of CDP commands, see https://chromedevtools.github.io/devtools-protocol/tot/.

This new functionality is exposed through the ExecuteCDP method of the SeleniumVBA WebDriver class. To use ExecuteCDP, specify the command name (such as Page.captureScreenshot) and, if input arguments are required, a Dictionary object containing the input parameter key names and values. The return payload is always a Dictionary with a root key of "value". The ExecuteCDP method packages the input arguments into a CDP JSON object, sends the command to the Selenium WebDriver through one of two special HTTP endpoints (one for Edge and one for Chrome), and then receives the payload from the Selenium WebDriver. The process is not unlike what is performed internally when executing a normal WebDriver command. Beyond that, it is the user's responsibility to research how to construct the CDP command name and input parameters, and how to parse the return payload.
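To make the call pattern concrete, here is a minimal sketch rather than a definitive recipe. It assumes Browser.getVersion (a CDP command that takes no input parameters) and that the "product" field of its result sits directly under the "value" root key. Printing the whole payload first is a convenient way to check the structure before drilling into it.

```vba
Sub cdp_minimal_call()
    'minimal sketch: send a parameterless CDP command and inspect the payload
    Dim driver As WebDriver
    Dim jc As WebJsonConverter
    Dim response As Dictionary

    Set driver = New WebDriver
    Set jc = New WebJsonConverter

    driver.StartChrome
    driver.OpenBrowser

    'Browser.getVersion needs no inputs, so only the command name is passed
    'see https://chromedevtools.github.io/devtools-protocol/tot/Browser/#method-getVersion
    Set response = driver.ExecuteCDP("Browser.getVersion")

    'print the entire payload as JSON to see what sits under the "value" root key
    Debug.Print jc.ConvertToJson(response, 4)

    'drill into a single field (assumes "product" is returned at this level)
    Debug.Print response("value")("product")

    driver.CloseBrowser
    driver.Shutdown
End Sub
```

The same pattern applies to commands that do take inputs: build a Dictionary of parameters, pass it as the second argument, and read the result from under "value".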

A few examples

Below is an example that executes one such command to produce an enhanced version of the SaveScreenshot method, with more options.

Sub cdp_enhanced_screenshot()
    'this demonstrates using the ExecuteCDP to perform a screenshot with enhanced user control
    Dim driver As WebDriver
    Dim jc As WebJsonConverter
    Dim params As New Dictionary
    'Dim clipRect As New Dictionary
    Dim strB64 As String
    
    Set driver = New WebDriver
    Set jc = New WebJsonConverter
    
    driver.StartEdge
    driver.OpenBrowser
    
    driver.NavigateTo "https://www.wikipedia.org/"
    driver.Wait 500
    
    'see https://chromedevtools.github.io/devtools-protocol/tot/Page/#method-captureScreenshot
    'construct the input parameters Dictionary
    params.Add "format", "jpeg" 'jpeg, png (default), webp
    params.Add "quality", 80 '0 to 100 (jpeg only)
    'clip parameter can be used to snapshot an element rectangle
    '(see GetRect method of the WebDriver and WebElement classes)
    'clipRect.Add "x", 200
    'clipRect.Add "y", 200
    'clipRect.Add "width", 400
    'clipRect.Add "height", 400
    'clipRect.Add "scale", 1
    'params.Add "clip", clipRect
    'the next 3 parameters are currently marked as experimental (as of 11 June, 2023)
    params.Add "captureBeyondViewport", True 'full screenshot
    params.Add "fromSurface", True 'defaults to true
    params.Add "optimizeForSpeed", False 'defaults to false

    'qc the inputs in JSON format
    Debug.Print jc.ConvertToJson(params, 4)
    
    'send the cdp command to the WebDriver and return "data" key of the response Dictionary
    strB64 = driver.ExecuteCDP("Page.captureScreenshot", params)("value")("data")
    
    'results in a base 64 encoded string which must be decoded into a bytearray before saving to file
    driver.SaveBase64StringToFile strB64, ".\screenshotfull.jpg"
    
    driver.Wait 500
    
    driver.CloseBrowser
    driver.Shutdown
End Sub

Another example showing use of the Page.setDownloadBehavior command, which allows us to redirect the browser's download directory AFTER capabilities have been set and the OpenBrowser command has been executed...

Sub test_cdp_enhanced_file_download()
    'this demonstrates using the ExecuteCDP to redirect the default browser
    'download location AFTER capabilities have been set (post-OpenBrowser)
    Dim driver As WebDriver
    Dim caps As WebCapabilities
    Dim params As New Dictionary
    
    Set driver = New WebDriver
   
    driver.StartChrome
    
    'set the directory path for saving download to
    Set caps = driver.CreateCapabilities
    caps.SetDownloadPrefs "%USERPROFILE%\Desktop"
    driver.OpenBrowser caps
    
    'redirect the download location AFTER capabilities have been set!!
    'https://chromedevtools.github.io/devtools-protocol/tot/Page/#method-setDownloadBehavior
    params.Add "behavior", "allow" 'deny, allow, default
    params.Add "downloadPath", driver.ResolvePath(".\")
    driver.ExecuteCDP "Page.setDownloadBehavior", params
    
    'delete legacy copy if it exists
    driver.DeleteFiles ".\test.pdf"
    
    driver.NavigateTo "https://github.com/GCuser99/SeleniumVBA/raw/main/dev/test_files/test.pdf"
    
    driver.WaitForDownload ".\test.pdf"
    
    driver.CloseBrowser
    driver.Shutdown
End Sub

CDP versus supported WebDriver commands

There is obviously overlap between the two buckets of functionality, so it's probably worth comparing them. Here are some of my thoughts on the differences, but I would be interested in hearing from others...

Currently, CDP can only be used with Edge and Chrome browsers - so Firefox gets left out of cross-browser compatibility.
CDP is for testing/automation at a lower level of interaction with the browser. As a result, programming effort is generally greater for CDP. You can use CDP commands to emulate most if not all Selenium WebDriver commands, but the reverse is not true. In SeleniumVBA, we could wrap CDP commands into new methods or enhanced versions of existing methods.
Due to its low-level nature, CDP is a relatively fast-moving target as the browsers evolve. On the other hand, the Selenium WebDriver's public-facing interface, especially for commands that conform to the W3C standard, is more stable: a new WebDriver version is issued for each new browser version while the interface stays mostly unchanged. So I would guess that for an application (such as SeleniumVBA) that makes extensive use of either or both of these, maintaining an interface that relies heavily on CDP will likely require more effort than maintaining one based on WebDriver commands.
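As a rough illustration of the overlap, the sketch below performs the same navigation twice, once with the WebDriver command and once with the CDP command Page.navigate. It is only a sketch under assumptions: the target URLs are arbitrary, and the CDP call may return before the page has finished loading, so a fixed wait stands in for the load synchronization that the WebDriver command handles for us.

```vba
Sub cdp_versus_webdriver_navigate()
    'sketch: the same task done with a WebDriver command and with a CDP command
    Dim driver As WebDriver
    Dim params As New Dictionary

    Set driver = New WebDriver

    driver.StartChrome
    driver.OpenBrowser

    'WebDriver command: one call
    driver.NavigateTo "https://www.wikipedia.org/"

    'CDP command: build the input parameters, then send the command
    'see https://chromedevtools.github.io/devtools-protocol/tot/Page/#method-navigate
    params.Add "url", "https://github.com/GCuser99/SeleniumVBA"
    driver.ExecuteCDP "Page.navigate", params

    'assumption: the CDP call does not wait for the page load, so pause explicitly
    driver.Wait 2000

    driver.CloseBrowser
    driver.Shutdown
End Sub
```

The CDP version takes more code and leaves the load synchronization to the caller, which is the kind of extra effort described above.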

So should we use the CDP to design new or enhanced methods? My opinion is a qualified "yes", if we can find compelling use-cases...

I've included a few other examples besides the two shown above in the test_ExecuteCDP module.

Any comments or ideas? Have something to share? Looking for compelling or interesting use cases - if you have one please post it here.

6DiegoDiego9 · 2023-06-20T20:33:27Z

6DiegoDiego9
Jun 20, 2023
Collaborator

Wow this seems like a big new (open) feature!

Quick question: I read that you pass the CDP commands to the Selenium WebDriver. Is this necessary? Is passing them directly to the browser much more complex?

GCuser99 · 2023-06-20T21:05:43Z

GCuser99
Jun 20, 2023
Maintainer Author

Hi @6DiegoDiego9 - yeah I too think it opens up a lot of custom functionality.

I think this method is the simplest way from a development perspective, but then of course has the disadvantage of having to manage the Selenium WebDriver(s) and all that that entails.

The other two ways that I know of (and I know you are aware of them too) for accessing the CDP are using your demonstrated DebuggerAddress technique to attach to an existing browser, and then getting a reference to the CDP WebSocket for communication; or using the "pipes method" to communicate directly from VBA. I suppose that if I were building a very narrow/specific automation application, I would probably lean toward the latter method in order to do away with the WebDriver complications (install, version alignment, etc.). But for a very general tool like SeleniumVBA, I personally think what we have here is the easiest in terms of maintainability.

GCuser99 · 2023-07-09T19:00:29Z

GCuser99
Jul 9, 2023
Maintainer Author

After experimenting with the ExecuteCDP method for a short while, two potential limitations have arisen with using the Selenium WebDriver as a means of fully accessing the CDP functionality...

Detecting and Responding to CDP Events. I know of no elegant way to detect and respond to CDP events using the WebDriver as the intermediary, which could be important for developing new methods. There only seems to be a WebDriver endpoint for sending commands and receiving the results. This is in contrast to the "piping" methods used to connect directly to the browser via a WebSocket interface, such as VBAChromeDevProtocol. The latter does appear to support CDP event callbacks, but I have not tested to see if/how well it works.

DOM Element Interaction. The WebDriver and CDP indexing schemes for tracking/referencing elements in the DOM tree are different, and there seems to be no elegant way that I was able to find to map between the two indexing schemes. This is a problem if we wish to mix and match between the two sets of functionalities, such as locating an element using one system, and then interacting with the same element using the other. Thus the CDP may be less useful or practical for improving on functionality to manipulate DOM.

With those two limitations noted, there still does seem to be a significant number of useful things we can do with ExecuteCDP - a few of which I have demonstrated in test_ExecuteCDP, and the Advanced Customization section of the SeleniumVBA Wiki.
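As one hedged illustration of a task that needs neither event handling nor element mapping, the sketch below overrides the browser's time zone and then reads it back. It assumes the CDP commands Emulation.setTimezoneOverride (input "timezoneId") and Runtime.evaluate (inputs "expression" and "returnByValue"), and assumes the evaluated value is returned under "result" then "value" in the payload. The time zone ID is an arbitrary choice, and this is not necessarily one of the examples in test_ExecuteCDP or the Wiki.

```vba
Sub cdp_timezone_override_sketch()
    'sketch: change a browser setting via CDP, then verify it via CDP
    Dim driver As WebDriver
    Dim tzParams As New Dictionary
    Dim evalParams As New Dictionary

    Set driver = New WebDriver

    driver.StartChrome
    driver.OpenBrowser

    driver.NavigateTo "https://www.wikipedia.org/"

    'see https://chromedevtools.github.io/devtools-protocol/tot/Emulation/#method-setTimezoneOverride
    tzParams.Add "timezoneId", "Asia/Tokyo" 'arbitrary example value
    driver.ExecuteCDP "Emulation.setTimezoneOverride", tzParams

    'see https://chromedevtools.github.io/devtools-protocol/tot/Runtime/#method-evaluate
    evalParams.Add "expression", "Intl.DateTimeFormat().resolvedOptions().timeZone"
    evalParams.Add "returnByValue", True

    'print the time zone that the page now reports
    Debug.Print driver.ExecuteCDP("Runtime.evaluate", evalParams)("value")("result")("value")

    driver.CloseBrowser
    driver.Shutdown
End Sub
```

Both calls are plain command/response exchanges, which is the pattern the WebDriver endpoint handles well.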

Cheers!

6DiegoDiego9 · 2023-07-22T21:25:10Z

6DiegoDiego9
Jul 22, 2023
Collaborator

I experimented a bit too with DOM Element Interaction, and I was only able to convert a CDP ID to a JavaScript ID and vice versa. I didn't find a way to bridge between them and the WebDriver ID.

Here is what I obtained for the same element:

Sub test123()
    Dim driver As New WebDriver, elem As WebElement, test As dictionary
    With driver
        .StartChrome: .OpenBrowser
        .NavigateTo "https://github.com/GCuser99/SeleniumVBA/"

        Const CSSselector$ = "div.flex-auto.min-width-0.width-fit.mr-3"

        'Get the WebDriver element ID
        Dim elementId$
        elementId = driver.FindElementByCssSelector(CSSselector).elementId

        'Get the CDP DOM Node ID
        Dim dblNodeId As Double
        Dim Params As New dictionary
        Params.Add "nodeId", .ExecuteCDP("DOM.getDocument")("value")("root")("nodeId")
        Params.Add "selector", CSSselector
        dblNodeId = .ExecuteCDP("DOM.querySelector", Params)("value")("nodeId")

        'Get the JavaScript object ID (wrapper for given node) from the CDP DOM Node ID
        Dim sObjectId$
        sObjectId = .ExecuteCDP("DOM.resolveNode", "{'nodeId':" & dblNodeId & "}")("value")("object")("objectId")

        'Get the CDP DOM Node ID from the JavaScript object ID
        Dim dblNodeIdBis As Double
        dblNodeIdBis = .ExecuteCDP("DOM.requestNode", "{'objectId':'" & sObjectId & "'}")("value")("nodeId")
        Debug.Assert dblNodeId = dblNodeIdBis

        Debug.Print "WebDriver element ID:", elementId
        Debug.Print "CDP DOM Node ID:", dblNodeId
        Debug.Print "Javascript object ID:", sObjectId

        .Shutdown
    End With
End Sub

Even GPT-4 temp0 just confirms what you already noted: https://cloud.typingmind.com/share/0cd848a8-597d-4895-86fe-54251c918246

GCuser99 · 2023-07-23T01:27:12Z

GCuser99
Jul 23, 2023
Maintainer Author

Thanks for looking into this and sharing what you found @6DiegoDiego9. Again, wow - it's pretty amazing what GPT-4 can do! That is such a coherent response - I'm tempted to insert it here in the discussion in order to memorialize it...

GCuser99 · 2023-07-23T02:17:55Z

GCuser99
Jul 23, 2023
Maintainer Author

On second thought - does it add any value to the current and future world to memorialize a GPT response? Knowledge is now an open-source commodity! :-)
