goribot alternatives and similar packages
Based on the "Specific Formats" category.
- sh: A shell parser, formatter, and interpreter with bash support; includes shfmt
- go-humanize: Go Humans! (formatters for units to human friendly sizes)
- bluemonday: a fast golang HTML sanitizer (inspired by the OWASP Java HTML Sanitizer) to scrub user generated content of XSS
- mxj: Decode / encode XML to/from map[string]interface{} (or JSON); extract values with dot-notation paths and wildcards. Replaces x2j and j2x packages.
- html-to-markdown: Convert HTML to Markdown. Even works with entire websites and can be extended through rules.
- omniparser: a native Golang ETL streaming parser and transform library for CSV, JSON, XML, EDI, text, etc.
- go-pkg-rss: This package reads RSS and Atom feeds and provides a caching mechanism that adheres to the feed specs.
- goq: A declarative struct-tag-based HTML unmarshaling or scraping package for Go built on top of the goquery library
- xquery: XQuery lets you extract data from HTML/XML documents using XPath expression.
- github_flavored_markdown: GitHub Flavored Markdown renderer with fenced code block highlighting, clickable header anchor links.
- go-pkg-xmlx: Extension to the standard Go XML package. Maintains a node tree that allows forward/backwards browsing and exposes some simple single/multi-node search functions.
- pagser: Pagser is a simple, extensible, configurable parse and deserialize html page to struct based on goquery and struct tags for golang crawler
- csvplus: csvplus extends the standard Go encoding/csv package with fluent interface, lazy stream operations, indices and joins.
- gonameparts: Takes a full name and splits it into individual name parts
- codetree: Parses indented code and returns a tree structure.
- jsoncolor: Colorized JSON output for Go (https://godoc.org/github.com/nwidger/jsoncolor)
README
Goribot
A lightweight, distributed-friendly crawler framework for Golang.
!! Warning !!
Goribot has moved to Gospider (github.com/zhshch2002/gospider). Gospider fixes some scheduling issues and splits the network request code into a separate repository. This repository will be kept, but new users are advised to use Gospider.
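🚀Features
- Elegant API
- Clean documentation
- Fast (a single core handles >1K tasks/sec)
- Friendly distributed support
- Convenient details
  - Automatic conversion of relative links
  - Automatic decoding of character encodings
  - Automatic parsing of HTML and JSON
- Rich extension support
  - Request deduplication (👈 works in distributed mode)
  - Limits on requests, rate, and concurrency
  - Results stored as JSON or CSV
  - Robots.txt support
  - Logging of request errors
  - Random User-Agent and random proxy
  - Retry on failure
- Lightweight; suited to learning or to getting a crawler running quickly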
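Two of the features listed above, relative link conversion and request deduplication, can be illustrated with the standard library alone. The following is a minimal sketch and not Goribot's own implementation: the names `resolveLink`, `seenSet`, and `firstVisit`, and the example URLs, are hypothetical and exist only for this illustration.

```go
package main

import (
	"fmt"
	"net/url"
)

// resolveLink (hypothetical helper) turns an href found on a page into an
// absolute URL, using the URL of the page itself as the base.
func resolveLink(pageURL, href string) (string, error) {
	base, err := url.Parse(pageURL)
	if err != nil {
		return "", err
	}
	ref, err := url.Parse(href)
	if err != nil {
		return "", err
	}
	return base.ResolveReference(ref).String(), nil
}

// seenSet (hypothetical type) is an in-memory deduplication set.
// It is not safe for concurrent use and is local to one process.
type seenSet map[string]struct{}

// firstVisit reports whether u has not been seen before, and records it.
func (s seenSet) firstVisit(u string) bool {
	if _, ok := s[u]; ok {
		return false
	}
	s[u] = struct{}{}
	return true
}

func main() {
	seen := seenSet{}
	page := "https://example.com/docs/index.html"
	hrefs := []string{"intro.html", "./intro.html", "/about", "https://example.org/x"}

	for _, href := range hrefs {
		abs, err := resolveLink(page, href)
		if err != nil {
			fmt.Println("skip:", err)
			continue
		}
		if seen.firstVisit(abs) {
			fmt.Println("queue:", abs)
		} else {
			fmt.Println("duplicate:", abs)
		}
	}
}
```

Resolving links before checking the set means that `intro.html` and `./intro.html` count as the same request. A distributed setup would presumably need a store shared between workers in place of the in-process map used here.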
Version warning
Goribot only supports Go 1.13 and above.
👜Get Goribot
go get -u github.com/zhshch2002/goribot
Goribot also has an earlier development version. If you need that version, pull the tag v0.0.1.
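⚡Build your first project
package main

import (
	"fmt"

	"github.com/zhshch2002/goribot"
)

func main() {
	s := goribot.NewSpider()

	// Queue one GET request together with the handler for its response.
	s.AddTask(
		goribot.GetReq("https://httpbin.org/get"),
		func(ctx *goribot.Context) {
			// Print the raw response body.
			fmt.Println(ctx.Resp.Text)
			// Print the value at the JSON path "headers.User-Agent".
			fmt.Println(ctx.Resp.Json("headers.User-Agent"))
		},
	)

	// Start the spider.
	s.Run()
}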
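The handler above reads a value from the response with the dot-separated path `headers.User-Agent`. To show what such a path lookup does, here is a minimal standard-library sketch. It is not how Goribot implements `Json`; the `lookup` function and the sample body are hypothetical, the body is only shaped like a typical httpbin.org/get response, and the sketch handles nested objects but not arrays.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// lookup (hypothetical helper) decodes a JSON document and walks it along a
// dot-separated path such as "headers.User-Agent". It returns false if the
// body is not valid JSON or if any path segment is missing.
func lookup(body []byte, path string) (interface{}, bool) {
	var cur interface{}
	if err := json.Unmarshal(body, &cur); err != nil {
		return nil, false
	}
	for _, key := range strings.Split(path, ".") {
		// Each step requires the current value to be a JSON object.
		obj, ok := cur.(map[string]interface{})
		if !ok {
			return nil, false
		}
		cur, ok = obj[key]
		if !ok {
			return nil, false
		}
	}
	return cur, true
}

func main() {
	// Made-up sample body for illustration only.
	body := []byte(`{"headers":{"User-Agent":"example-agent/1.0"},"url":"https://httpbin.org/get"}`)

	if v, ok := lookup(body, "headers.User-Agent"); ok {
		fmt.Println("User-Agent:", v)
	}
	if _, ok := lookup(body, "headers.Missing"); !ok {
		fmt.Println("headers.Missing: not found")
	}
}
```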
🎉Done
You can now use Goribot. To learn more, start from Getting Started.
🙏Thanks
Many thanks to the projects above for their help 🙏.