96 Text Processing packages and projects
goldmark
8.7 6.5 Go - 🏆 A Markdown parser written in Go. Easy to extend, standard (CommonMark) compliant, and well structured.
bluemonday
8.5 5.5 Go - bluemonday: a fast Golang HTML sanitizer (inspired by the OWASP Java HTML Sanitizer) to scrub user-generated content of XSS.
html-to-markdown
8.3 8.5 Go - ⚙️ Convert HTML to Markdown. Even works with entire websites and can be extended through rules.
omniparser
7.3 4.3 Go - omniparser: a native Golang ETL streaming parser and transform library for CSV, JSON, XML, EDI, text, etc.
mxj
6.9 4.5 Go - Decode/encode XML to/from map[string]interface{} (or JSON); extract values with dot-notation paths and wildcards. Replaces the x2j and j2x packages.
go-pkg-rss
6.4 0.0 Go - DISCONTINUED. This package reads RSS and Atom feeds and provides a caching mechanism that adheres to the feed specs.
Koazee
6.4 0.0 Go - A stream-like, immutable, lazy-loading, and smart Golang library for working with slices.
go-edlib
6.3 1.8 Go - 📚 String comparison and edit distance algorithms library featuring Levenshtein, LCS, Hamming, Damerau-Levenshtein (OSA and adjacent transpositions algorithms), Jaro-Winkler, Cosine, etc.
strutil-go
5.8 5.3 Go - Go metrics for calculating string similarity, plus other string utility functions.
goribot
5.5 6.1 Go - DISCONTINUED. A simple Golang spider/scraping framework; build a spider in 3 lines.
goq
5.4 0.0 Go - A declarative, struct-tag-based HTML unmarshaling and scraping package for Go, built on top of the goquery library.
xquery
5.3 0.0 Go - DISCONTINUED. XQuery lets you extract data from HTML/XML documents using XPath expressions.
gospider
5.2 3.6 Go - DISCONTINUED. ⚡ Lightweight Golang spider framework.
go-pkg-xmlx
5.1 0.0 Go - DISCONTINUED. Extension to the standard Go XML package. Maintains a node tree that allows forward/backward browsing and exposes some simple single/multi-node search functions.
github_flavored_markdown
5.1 0.0 Go - GitHub Flavored Markdown renderer with fenced code block highlighting and clickable header anchor links.
Ren'Py graph vizualiser
4.3 5.8 Go - Draws a flowchart graph of any visual novel from Ren'Py .rpy files.
pagser
4.0 2.7 Go - Pagser is a simple, extensible, configurable package that parses and deserializes HTML pages into structs, based on goquery and struct tags, for Golang crawlers.
csvplus
3.2 0.0 Go - csvplus extends the standard Go encoding/csv package with a fluent interface, lazy stream operations, indices, and joins.
Go Mathematical Expression Toolkit
2.9 0.5 Go - Go Mathematical Expression Toolkit. Run-time mathematical expression parser and evaluation engine.
go-fasttld
2.2 6.7 Go - go-fasttld is a high-performance effective top-level domain (eTLD) extraction module.
Tagify
2.1 4.0 HTML - Tagify produces a set of tags from a given source. The source can be an HTML page, a Markdown document, or plain text. Supports English, Russian, Chinese, Hindi, Spanish, Arabic, Japanese, German, Hebrew, French, and Korean.
TySug
1.6 1.9 Go - A project to help prevent typos. TySug (Typo Suggestions) suggests alternative words with respect to keyboard layouts.
walker
0.7 0.0 Go - Seamlessly fetch paginated data from any source. Simple, high-performance API scraping included.
Markov Chain Algorithm
0.3 0.0 Go - A Markov chain algorithm generates text by creating a statistical model of potential textual suffixes for a given prefix.
