Rod is a high-level driver built directly on the Chrome DevTools Protocol. It's designed for web automation and scraping. Rod also tries to expose low-level interfaces, so that whenever a function is missing, users can easily send control requests to the browser directly.
- Fluent interface design to reduce verbose code
- Chained context design, intuitive to time out or cancel a long-running task
- Debugging friendly, auto input tracing, remote monitoring of headless browsers
- Automatically finds or downloads a [browser](lib/launcher)
- No external dependencies, CI tested on Linux, Mac, and Windows
- High-level helpers like WaitStable, WaitRequestIdle, GetDownloadFile, Resource
- Two-step WaitEvent design, never miss an event
- Correctly handles nested iframes
- No zombie Chrome processes after a crash (how it works)
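
The two-step WaitEvent idea from the list above can be illustrated with a small standalone sketch. It is not Rod's implementation: the `Bus` type and its `WaitEvent` and `Emit` methods are hypothetical names invented for this example, built only on the Go standard library. The point is the ordering: subscribe first, trigger the action second, block third, so an event that fires immediately is still captured.

```go
package main

import (
	"fmt"
	"sync"
)

// Bus is a hypothetical, minimal event hub used only for illustration.
type Bus struct {
	mu   sync.Mutex
	subs map[string][]chan string
}

// NewBus returns an empty hub.
func NewBus() *Bus {
	return &Bus{subs: make(map[string][]chan string)}
}

// WaitEvent is step one: it subscribes immediately and returns a function.
// Calling the returned function is step two: it blocks until the event arrives.
func (b *Bus) WaitEvent(name string) (wait func() string) {
	// Buffered, so an event emitted before wait() is called is kept.
	ch := make(chan string, 1)
	b.mu.Lock()
	b.subs[name] = append(b.subs[name], ch)
	b.mu.Unlock()
	return func() string { return <-ch }
}

// Emit delivers payload to every waiter subscribed to name, then clears them.
func (b *Bus) Emit(name, payload string) {
	b.mu.Lock()
	waiters := b.subs[name]
	delete(b.subs, name)
	b.mu.Unlock()
	for _, ch := range waiters {
		ch <- payload // never blocks: each channel receives exactly one value
	}
}

func main() {
	bus := NewBus()
	// Subscribe before triggering anything.
	wait := bus.WaitEvent("load")
	// The action fires the event right away, before anyone is blocking.
	bus.Emit("load", "page loaded")
	// The event was captured at subscription time, so this returns.
	fmt.Println(wait())
}
```

If subscribing and blocking were a single call made after the action, an event fired in between would be lost; splitting them closes that window.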
You can find examples [here](examples_test.go) or [here](lib/examples).
For more detailed examples, please search the unit tests.
For example, to see how the method HandleAuth is used, search all the *_test.go files that contain HandleAuth.
You can also search the GitHub issues; they contain a lot of usage examples too.
If you have questions, please raise an issue or join the Gitter room.
How it works
Here's the common start process of Rod:
Try to connect to a Chrome DevTools endpoint. If none is found, try to launch a local browser; if no local browser is found either, try to download one, then connect again. The lib that handles this is [here](lib/launcher).
Use JSON-RPC to talk to the browser endpoint and control it. The client that handles this is [here](lib/cdp).
The type definitions of the data transmitted via JSON-RPC are handled by this [lib](lib/proto).
To control a specific page, Rod first injects a JS helper script into it. Rod uses the script to query and manipulate the page content. The JS lib is [here](lib/assets).
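
To make the JSON-RPC step above more concrete, here is a minimal sketch of how a command and its reply might be encoded. It assumes the message shape used by the Chrome DevTools Protocol (an `id`, a `method`, and `params` in the request; the same `id` plus a `result` or an `error` in the reply). The `Request`, `Response`, `Encode`, and `Decode` names are made up for this example and are not the types in lib/cdp or lib/proto.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Request is a command sent to the browser endpoint (hypothetical type).
type Request struct {
	ID     int         `json:"id"`
	Method string      `json:"method"`
	Params interface{} `json:"params,omitempty"`
}

// ResponseError carries a protocol-level failure.
type ResponseError struct {
	Code    int    `json:"code"`
	Message string `json:"message"`
}

// Response is a reply; its id pairs it with the request that caused it.
type Response struct {
	ID     int             `json:"id"`
	Result json.RawMessage `json:"result,omitempty"`
	Error  *ResponseError  `json:"error,omitempty"`
}

// Encode serializes a command into the bytes that would go over the wire.
func Encode(id int, method string, params interface{}) ([]byte, error) {
	return json.Marshal(Request{ID: id, Method: method, Params: params})
}

// Decode parses a reply and turns a protocol error into a Go error.
func Decode(data []byte) (Response, error) {
	var res Response
	if err := json.Unmarshal(data, &res); err != nil {
		return res, err
	}
	if res.Error != nil {
		return res, fmt.Errorf("cdp error %d: %s", res.Error.Code, res.Error.Message)
	}
	return res, nil
}

func main() {
	req, err := Encode(1, "Page.navigate", map[string]string{"url": "https://example.com"})
	if err != nil {
		panic(err)
	}
	fmt.Println(string(req))

	// A hand-written reply standing in for what a browser would send back.
	res, err := Decode([]byte(`{"id":1,"result":{"frameId":"F1"}}`))
	if err != nil {
		panic(err)
	}
	fmt.Println(res.ID, string(res.Result))
}
```

Keeping `Result` as raw JSON lets a separate layer of typed definitions decide how to decode each method's payload, which is one plausible reason to split the client from the type definitions.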
Q: How to use Rod with docker
Using Rod with Docker is easy:
Run the Rod image
docker run -p 9222:9222 ysmood/rod
Open another terminal and run a Go program like this [example](lib/examples/remote-launch/main.go)
The Rod image can dynamically launch a Chrome instance for each remote driver with customizable Chrome flags. It's [tuned](lib/docker/Dockerfile) for screenshots and for fonts of popular natural languages. You can easily load balance requests across a cluster of this image; each container can create multiple browser instances at the same time.
Q: Does it support other browsers like Firefox or Edge
Rod should work with any browser that supports the Chrome DevTools Protocol. For now, Firefox supports this protocol, and Edge will adopt Chromium as its backend, so it seems that most major browsers except Safari will support it in the future.
Q: Why is it called Rod
Rod is related to puppetry; see Rod Puppet.
We are the puppeteer, Chrome is the puppet, and we use the rod to control the puppet.
In this sense, puppeteer.js sounds strange: are we controlling a puppeteer?
Q: How to contribute
Please check this [doc](.github/CONTRIBUTING.md).
Q: How versioning is handled
Semver is used.
Before v1.0.0, whenever the second section changes, such as a bump to v0.2.0, there must be some public API changes, such as changes to function names or parameter types. If only the last section changes, no public API will be changed.
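
As a rough illustration of the versioning rule above, the following sketch checks whether an upgrade between two versions below v1.0.0 can include public API changes. The function names `parse` and `APIMayChange` and the version strings used with them are hypothetical, chosen only for this example.

```go
package main

import (
	"errors"
	"fmt"
	"strconv"
	"strings"
)

// parse splits a version such as "v0.2.1" into its three numeric sections.
func parse(v string) ([3]int, error) {
	var out [3]int
	parts := strings.Split(strings.TrimPrefix(v, "v"), ".")
	if len(parts) != 3 {
		return out, fmt.Errorf("invalid version %q", v)
	}
	for i, p := range parts {
		n, err := strconv.Atoi(p)
		if err != nil {
			return out, fmt.Errorf("invalid version %q: %w", v, err)
		}
		out[i] = n
	}
	return out, nil
}

// APIMayChange reports whether moving between two versions below v1.0.0 can
// change the public API: true when the second section differs, false when
// only the last section differs.
func APIMayChange(from, to string) (bool, error) {
	a, err := parse(from)
	if err != nil {
		return false, err
	}
	b, err := parse(to)
	if err != nil {
		return false, err
	}
	if a[0] != 0 || b[0] != 0 {
		// The rule sketched here only covers versions before v1.0.0.
		return false, errors.New("rule only covers versions before v1.0.0")
	}
	return a[1] != b[1], nil
}

func main() {
	for _, pair := range [][2]string{{"v0.1.0", "v0.2.0"}, {"v0.2.0", "v0.2.5"}} {
		changed, err := APIMayChange(pair[0], pair[1])
		if err != nil {
			panic(err)
		}
		fmt.Printf("%s -> %s: public API may change: %v\n", pair[0], pair[1], changed)
	}
}
```

A check like this could guard an automated dependency update so that only last-section bumps are applied without review.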
Q: Why another Puppeteer-like lib
There are a lot of great projects, but none of them is perfect; choosing the one that best fits your needs is important.
Selenium is slower by design because it encourages the use of hard-coded sleeps. When working with Rod, you generally don't use sleep at all. Selenium is therefore more error-prone when the network is unstable. It's also harder to set up and maintain because of extra dependencies like a browser driver.
With Puppeteer, you have to handle promise/async/await a lot. It requires a deep understanding of how promises work, which is usually painful for QA engineers writing automation tests. End-to-end tests usually require a lot of synchronous operations to simulate human input. Because Puppeteer is based on Node.js, all control signals it sends to Chrome are async calls, so it's unfriendly for QA from the beginning.
With Chromedp, you have to use their verbose DSL-like tasks to handle the main logic, because Chromedp uses several wrappers to handle execution with context and options. This makes it very hard to understand their code when bugs happen. The DSL-like wrappers also make Go types useless when tracking issues.
It's painful to deal with iframes in Chromedp; this ticket is still open after years.
When a crash happens, Chromedp leaves zombie Chrome processes on Windows and Mac.
Cypress is very limited; for closed shadow DOM or cross-domain iframes it's almost unusable. Read their limitation doc for more details.