Popularity

7.9

Growing

Activity

0.0

Declining

Stars 1,693

Watchers 39

Forks 83

Last Commit 3 months ago

Programming language: Go

License: BSD 2-clause "Simplified" License

Tags: Natural Language Processing

Latest version: v1.0.0-alpha

spaGO alternatives and similar packages

Based on the "Natural Language Processing" category.

prose

8.7 1.9 spaGO VS prose

DISCONTINUED. :book: A Golang library for text processing, including tokenization, part-of-speech tagging, and named-entity extraction.
go-i18n

8.5 7.1 spaGO VS go-i18n

Translate your Go program into multiple languages.

gse

8.4 4.4 spaGO VS gse

Go efficient multilingual NLP and text segmentation; support English, Chinese, Japanese and others.
gojieba

8.4 1.9 spaGO VS gojieba

Go version of the "Jieba" Chinese word segmentation library
go-pinyin

7.9 4.5 spaGO VS go-pinyin

Convert Chinese characters (Hanzi) to Pinyin
when

7.6 5.1 spaGO VS when

A natural language date/time parser with pluggable rules
kagome

7.0 6.4 spaGO VS kagome

Self-contained Japanese Morphological Analyzer written in pure Go
whatlanggo

6.8 0.0 spaGO VS whatlanggo

Natural language detection library for Go
nlp

6.3 0.0 spaGO VS nlp

DISCONTINUED. [UNMAINTAINED] Extract values from strings and fill your structs with nlp.
sentences

6.2 4.5 spaGO VS sentences

A multilingual command line sentence tokenizer in Golang
universal-translator

6.1 0.0 spaGO VS universal-translator

:speech_balloon: i18n Translator for Go/Golang using CLDR data + pluralization rules
locales

5.9 0.0 spaGO VS locales

:earth_americas: a set of locales generated from the CLDR Project which can be used independently or within an i18n package; these were built for use with, but not exclusive to https://github.com/go-playground/universal-translator
getlang

4.9 0.0 spaGO VS getlang

Natural language detection package in pure Go
RAKE.go

4.5 0.0 spaGO VS RAKE.go

A Go port of the Rapid Automatic Keyword Extraction algorithm (RAKE)
go-unidecode

4.5 3.1 spaGO VS go-unidecode

ASCII transliterations of Unicode text.
go-nlp

4.3 0.0 spaGO VS go-nlp

DISCONTINUED. Utilities for working with discrete probability distributions and other tools useful for doing NLP work.
segment

4.3 0.0 spaGO VS segment

A Go library for performing Unicode Text Segmentation as described in Unicode Standard Annex #29
gounidecode

4.2 0.0 spaGO VS gounidecode

Unicode transliterator for #golang
go-stem

3.9 0.0 spaGO VS go-stem

Word Stemming in Go
textcat

3.8 0.0 spaGO VS textcat

A Go package for n-gram based text categorization, with support for utf-8 and raw text
MMSEGO

3.6 0.0 spaGO VS MMSEGO

Chinese word splitting algorithm MMSEG in GO
address

3.3 6.5 spaGO VS address

Address handling for Go.
go-localize

3.3 0.0 spaGO VS go-localize

i18n (Internationalization and localization) engine written in Go, used for translating locale strings.
go2vec

3.2 0.0 spaGO VS go2vec

Read and use word2vec vectors in Go
stemmer

3.1 0.0 spaGO VS stemmer

Stemmer packages for Go programming language. Includes English, German and Dutch stemmers.
petrovich

2.9 3.8 spaGO VS petrovich

Golang port of Petrovich - an inflector for Russian anthroponyms.
porter2

2.9 0.0 spaGO VS porter2

High Performance Porter2 Stemmer
dpar

2.8 3.2 spaGO VS dpar

Neural network transition-based dependency parser (in Rust)
iuliia-go

2.8 1.8 spaGO VS iuliia-go

Transliterate Cyrillic → Latin in every possible way
govader

2.7 0.0 spaGO VS govader

vader sentiment analysis in go
go-mystem

2.6 0.0 spaGO VS go-mystem

CGo bindings to Yandex.Mystem
go-tinydate

2.5 0.0 spaGO VS go-tinydate

A tiny date object in Go. Tinydate uses only 4 bytes of memory
spreak

2.4 6.4 spaGO VS spreak

Flexible translation and humanization library for Go, based on the concepts behind gettext.
snowball

2.4 0.0 L1 spaGO VS snowball

Cgo binding for Snowball C library
paicehusk

2.4 0.0 spaGO VS paicehusk

Golang implementation of the Paice/Husk Stemming Algorithm
detectlanguage

2.0 0.0 spaGO VS detectlanguage

Detect Language API Go Client
golibstemmer

2.0 0.0 spaGO VS golibstemmer

Go bindings for the snowball libstemmer library including porter 2
gotokenizer

2.0 0.0 spaGO VS gotokenizer

A tokenizer based on the dictionary and Bigram language models for Go. (Currently supports only Chinese segmentation.)
t

1.8 3.5 spaGO VS t

t: translation util for go, using GNU gettext
icu

1.8 0.0 spaGO VS icu

Cgo binding for icu4c library
libtextcat

1.8 0.0 spaGO VS libtextcat

Cgo binding for libtextcat C library
shamoji

1.3 0.0 spaGO VS shamoji

The shamoji (杓文字) is a word filtering package
porter

1.2 0.0 spaGO VS porter

porter stemmer
gosentiwordnet

0.9 0.0 spaGO VS gosentiwordnet

💬 Sentiment analyzer library using SentiWordnet in Go
go-eco

0.5 0.0 spaGO VS go-eco

Automatically exported from code.google.com/p/go-eco
govader-backend

0.5 2.6 spaGO VS govader-backend

Sentimental Analysis Microservice
spelling-corrector

0.3 0.0 spaGO VS spelling-corrector

Spelling corrector for Spanish language

* Code Quality Rankings and insights are calculated and provided by Lumnify.
They vary from L1 to L5 with "L5" being the highest.

README

If you like the project, please ★ star this repository to show your support! 🤩

Currently, the main branch contains version v1.0.0-alpha.0, which differs substantially from version v0.7.0. For NLP-related features, check out the v0.7.0 release branch. The CHANGELOG details the major changes.

A Machine Learning library written in pure Go designed to support relevant neural architectures in Natural Language Processing.

Spago is self-contained: it uses its own lightweight computational graph for both training and inference, and the code is easy to follow from start to finish.

It provides:

Automatic differentiation via dynamic define-by-run execution
Gradient descent optimizers (Adam, RAdam, RMS-Prop, AdaGrad, SGD)
Feed-forward layers (Linear, Highway, Convolution...)
Recurrent layers (LSTM, GRU, BiLSTM...)
Attention layers (Self-Attention, Multi-Head Attention...)
Memory-efficient Word Embeddings (with badger key–value store)
Gob compatible neural models for serialization
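
The last item refers to Go's standard encoding/gob format. As a rough illustration of what a gob round trip looks like, here is a minimal sketch that uses only the standard library. The TinyModel type and the roundTrip helper are hypothetical names invented for this note; they are not part of Spago, and Spago's own models may be saved and loaded differently.

```go
package main

import (
	"bytes"
	"encoding/gob"
	"fmt"
	"log"
)

// TinyModel is a hypothetical stand-in for a model: just exported parameters.
// It is NOT a Spago type.
type TinyModel struct {
	W []float64
	B float64
}

// roundTrip serializes m with gob into memory and decodes it into a new value.
func roundTrip(m TinyModel) (TinyModel, error) {
	var buf bytes.Buffer
	if err := gob.NewEncoder(&buf).Encode(m); err != nil {
		return TinyModel{}, err
	}
	var out TinyModel
	err := gob.NewDecoder(&buf).Decode(&out)
	return out, err
}

func main() {
	restored, err := roundTrip(TinyModel{W: []float64{0.4, -1.2}, B: -0.2})
	if err != nil {
		log.Fatal(err)
	}
	fmt.Printf("W: %v | B: %v\n", restored.W, restored.B)
}
```

Only exported fields are encoded by gob, which is why the parameters in this sketch are capitalized.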

Usage

Requirements:

Go 1.18

Clone this repo or get the library:

go get -u github.com/nlpodyssey/spago

Dependencies

The core module of Spago relies only on testify (github.com/stretchr/testify) for unit testing. In other words, it has "zero dependencies", and we are committed to keeping it that way as much as possible.

Spago uses a multi-module workspace to ensure that additional dependencies are downloaded only when specific features (e.g. persistent embeddings) are used.

Getting Started

A good place to start is by looking at the implementation of built-in neural models, such as the LSTM. Except for a few linear algebra operations written in assembly for optimal performance (partly copied from Gonum), it's straightforward Go code, so you don't have to worry. In fact, Spago could have been written by you :)

The behavior of a neural model is characterized by a combination of parameters and equations. Mathematical expressions must be defined using the ag (auto-grad) package in order to take advantage of automatic differentiation.

In this sense, we can say the computational graph is at the center of the Spago machine learning framework.
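
To make the idea of a define-by-run computational graph more concrete, the following is a minimal sketch of reverse-mode automatic differentiation on scalars, written with the standard library only. The node type and the variable, add and mul helpers are hypothetical and much simpler than Spago's ag package; they only illustrate the principle that each operation computes its value immediately and remembers how to propagate gradients back to its operands.

```go
package main

import "fmt"

// node is a hypothetical scalar graph node; it is NOT Spago's ag.Node.
type node struct {
	value    float64
	grad     float64
	backward func(gy float64) // propagates the incoming gradient to the operands
}

// variable creates a leaf node that accumulates the gradients it receives.
func variable(v float64) *node {
	n := &node{value: v}
	n.backward = func(gy float64) { n.grad += gy }
	return n
}

// add computes a + b immediately (define-by-run) and records its backward rule.
func add(a, b *node) *node {
	return &node{
		value:    a.value + b.value,
		backward: func(gy float64) { a.backward(gy); b.backward(gy) },
	}
}

// mul computes a * b immediately and records the product rule.
func mul(a, b *node) *node {
	return &node{
		value: a.value * b.value,
		backward: func(gy float64) {
			a.backward(gy * b.value)
			b.backward(gy * a.value)
		},
	}
}

func main() {
	w, x, b := variable(0.4), variable(-0.8), variable(-0.2)
	y := add(mul(w, x), b) // y = w*x + b, evaluated while the graph is built
	y.backward(1.0)        // seed with dy/dy = 1
	fmt.Printf("y = %.2f, dy/dw = %.2f, dy/dx = %.2f, dy/db = %.2f\n",
		y.value, w.grad, x.grad, b.grad)
}
```

The sketch walks the graph recursively, which is fine for small expressions; a real library would typically order the nodes and visit each one once.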

Example 1

Here is an example of how to calculate the sum of two variables:

package main

import (
  "fmt"
  "github.com/nlpodyssey/spago/ag"
  "github.com/nlpodyssey/spago/mat"
)

type T = float32

func main() {
  // create a new node of type variable with a scalar
  a := ag.Var(mat.NewScalar(T(2.0))).WithGrad(true)
  // create another node of type variable with a scalar
  b := ag.Var(mat.NewScalar(T(5.0))).WithGrad(true)
  // create an addition operator (the calculation is actually performed here)
  c := ag.Add(a, b)

  // print the result
  fmt.Printf("c = %v (float%d)\n", c.Value(), c.Value().Scalar().BitSize())

  ag.Backward(c, mat.NewScalar(T(0.5)))
  fmt.Printf("ga = %v\n", a.Grad())
  fmt.Printf("gb = %v\n", b.Grad())
}

Output:

c = [7] (float32)
ga = [0.5]
gb = [0.5]
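
The printed gradients follow directly from the chain rule. Writing $g = 0.5$ for the output gradient passed to ag.Backward, and using $c = a + b$:

$$
\frac{\partial c}{\partial a} = \frac{\partial c}{\partial b} = 1
\quad\Longrightarrow\quad
g_a = g \cdot \frac{\partial c}{\partial a} = 0.5,
\qquad
g_b = g \cdot \frac{\partial c}{\partial b} = 0.5
$$

Here $g$, $g_a$ and $g_b$ are notation introduced for this note; $g_a$ and $g_b$ correspond to the ga and gb values printed by the program.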

Example 2

Here is a simple implementation of the perceptron formula:

package main

import (
  "log"
  "os"

  . "github.com/nlpodyssey/spago/ag"
  "github.com/nlpodyssey/spago/ag/encoding"
  "github.com/nlpodyssey/spago/ag/encoding/dot"
  "github.com/nlpodyssey/spago/mat"
)

func main() {
  x := Var(mat.NewScalar(-0.8)).WithName("x")
  w := Var(mat.NewScalar(0.4)).WithName("w")
  b := Var(mat.NewScalar(-0.2)).WithName("b")

  y := Sigmoid(Add(Mul(w, x), b))

  err := dot.Encode(encoding.NewGraph(y), os.Stdout)
  if err != nil {
    log.Fatal(err)
  }
}

In this case, we are interested in rendering the resulting graph with Graphviz:

go run main.go | dot -Tpng -o g.png
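
As a quick numeric check of the perceptron formula itself, independent of Spago, the same expression can be evaluated with the standard math package. The sigmoid and perceptron helpers below are names chosen for this sketch, not library functions.

```go
package main

import (
	"fmt"
	"math"
)

// sigmoid is the logistic function 1 / (1 + e^-z).
func sigmoid(z float64) float64 {
	return 1 / (1 + math.Exp(-z))
}

// perceptron evaluates sigmoid(w*x + b) for scalar inputs.
func perceptron(w, x, b float64) float64 {
	return sigmoid(w*x + b)
}

func main() {
	// Same values as the w, x and b nodes in the example above.
	fmt.Printf("y = %.4f\n", perceptron(0.4, -0.8, -0.2)) // prints "y = 0.3729"
}
```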

Example 3

As a next step, let's take a look at how to create a linear regression model ($y = wx + b$) and how it will be trained.

The following algorithm will try to learn the correct values for weight and bias.

By the end of training, the learned equation should approximate the objective function $y = 3x + 1$, which is the line of best fit for the generated data.

package main

import (
    "fmt"

    "github.com/nlpodyssey/spago/ag"
    "github.com/nlpodyssey/spago/gd"
    "github.com/nlpodyssey/spago/gd/sgd"
    "github.com/nlpodyssey/spago/initializers"
    "github.com/nlpodyssey/spago/losses"
    "github.com/nlpodyssey/spago/mat"
    "github.com/nlpodyssey/spago/mat/float"
    "github.com/nlpodyssey/spago/mat/rand"
    "github.com/nlpodyssey/spago/nn"
)

const (
    epochs   = 100  // number of epochs
    examples = 1000 // number of examples
)

type Linear struct {
    nn.Module
    W nn.Param `spago:"type:weights"`
    B nn.Param `spago:"type:biases"`
}

func NewLinear[T float.DType](in, out int) *Linear {
    return &Linear{
        W: nn.NewParam(mat.NewEmptyDense[T](out, in)),
        B: nn.NewParam(mat.NewEmptyVecDense[T](out)),
    }
}

func (m *Linear) InitWithRandomWeights(seed uint64) *Linear {
    initializers.XavierUniform(m.W.Value(), 1.0, rand.NewLockedRand(seed))
    return m
}

func (m *Linear) Forward(x ag.Node) ag.Node {
    return ag.Add(ag.Mul(m.W, x), m.B)
}

func main() {
    m := NewLinear[float64](1, 1).InitWithRandomWeights(42)

    optimizer := gd.NewOptimizer(m, sgd.New[float64](sgd.NewConfig(0.001, 0.9, true)))

    normalize := func(x float64) float64 { return x / float64(examples) }
    objective := func(x float64) float64 { return 3*x + 1 }
    criterion := losses.MSE

    learn := func(input, expected float64) float64 {
        x, target := ag.Scalar(input), ag.Scalar(expected)
        y := m.Forward(x)
        loss := criterion(y, target, true)
        defer ag.Backward(loss) //  free the memory of the graph before return
        return loss.Value().Scalar().F64()
    }

    for epoch := 0; epoch < epochs; epoch++ {
        for i := 0; i < examples; i++ {
            x := normalize(float64(i))
            loss := learn(x, objective(x))
            if i%100 == 0 {
                fmt.Println(loss)
            }
        }
        optimizer.Do()
    }

    fmt.Printf("\nW: %.2f | B: %.2f\n\n", m.W.Value().Scalar().F64(), m.B.Value().Scalar().F64())
}

Output:

W: 3.00 | B: 1.00
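
For readers who want to see the arithmetic behind this kind of training loop without any library, here is a minimal plain-Go sketch of per-example gradient descent on the squared error. It is not a translation of the Spago program above: the fitLine helper is hypothetical, it uses no momentum, and the learning rate of 0.1 is an assumption chosen for this note.

```go
package main

import "fmt"

// fitLine learns w and b for y = w*x + b with per-example gradient descent
// on the squared error. It is a plain-Go illustration, not Spago's optimizer.
func fitLine(xs, ys []float64, lr float64, epochs int) (w, b float64) {
	for epoch := 0; epoch < epochs; epoch++ {
		for i, x := range xs {
			err := w*x + b - ys[i] // prediction error for this example
			// d/dw (err^2) = 2*err*x and d/db (err^2) = 2*err
			w -= lr * 2 * err * x
			b -= lr * 2 * err
		}
	}
	return w, b
}

func main() {
	const examples = 1000
	xs := make([]float64, examples)
	ys := make([]float64, examples)
	for i := range xs {
		xs[i] = float64(i) / examples // same normalization as the example above
		ys[i] = 3*xs[i] + 1           // objective y = 3x + 1
	}
	w, b := fitLine(xs, ys, 0.1, 100)
	fmt.Printf("W: %.2f | B: %.2f\n", w, b)
}
```

Because the data is noiseless, each update moves the parameters closer to the exact solution, so the printed weight and bias should settle at 3 and 1.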

Performance

Goroutines play a very important role in making Spago efficient: Forward operations are executed concurrently (up to GOMAXPROCS). As soon as an Operator is created (usually by calling one of the functions in the ag package, such as Add, Prod, etc.), the related Function's Forward procedure is performed on a new goroutine. Nevertheless, it is always safe to ask for the Operator's Value(): if it is called before the result is ready, the call blocks until the computation finishes and then returns the value.
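
The general pattern described here, starting the computation on a goroutine and making the accessor wait for the result, can be sketched with a channel. The operator type and newOperator function below are hypothetical and do not mirror Spago's internals; they only show one common way to get this behavior in Go.

```go
package main

import (
	"fmt"
	"time"
)

// operator is a hypothetical stand-in for a graph operator whose forward
// pass runs on its own goroutine. It is NOT Spago's Operator type.
type operator struct {
	done  chan struct{} // closed once value is ready
	value float64
}

// newOperator starts computing forward() concurrently and returns immediately.
func newOperator(forward func() float64) *operator {
	op := &operator{done: make(chan struct{})}
	go func() {
		op.value = forward()
		close(op.done) // publish the result to every waiter
	}()
	return op
}

// Value blocks until the forward computation has finished, then returns it.
func (op *operator) Value() float64 {
	<-op.done
	return op.value
}

func main() {
	sum := newOperator(func() float64 {
		time.Sleep(10 * time.Millisecond) // simulate a slow forward pass
		return 2 + 5
	})
	fmt.Println("operator created; asking for the value right away")
	fmt.Println("value:", sum.Value()) // waits for the goroutine if needed
}
```

Closing the channel rather than sending on it lets any number of callers read Value() safely, and repeated calls return at once after the first result is available.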

Known Limits

Sadly, at the moment, Spago is not GPU friendly by design.

Projects using Spago

Below is a list of projects that use Spago:

Cybertron - State-of-the-art Natural Language Processing in Go.
GoFlair - Named Entities Recognition via CLMs+BiLSTM+CRF
Golem - A batteries-included implementation of "TabNet: Attentive Interpretable Tabular Learning".
Translator - A simple self-hostable Machine Translation service.
PiSquared - A Telegram bot that asks you a question and evaluates the response you provide.
WhatsNew - A simple tool to collect and process news from multiple web sources.

Contributing

We're glad you're thinking about contributing to Spago! If you think something is missing or could be improved, please open issues and pull requests. If you'd like to help this project grow, we'd love to have you!

To start contributing, check the Contributing Guidelines.

Contact

We encourage you to open an issue. This helps the community grow.

If you really want to write to us privately, please email Matteo Grella with your questions or comments.

Acknowledgments

Spago is part of the open-source NLP Odyssey initiative started by members of the EXOP team (now part of Crisis24). I would therefore like to thank EXOP GmbH, which fully supports development by promoting the project and giving it increasing importance.

spaGO

Self-contained Machine Learning and Natural Language Processing library in Go