go-cluster alternatives and similar packages

Based on the "Machine Learning" category.
GoLearn

9.6 0.0 go-cluster VS GoLearn

Machine Learning for Go
gorse

9.4 7.1 go-cluster VS gorse

Gorse open source recommender system engine

Gorgonia

9.1 2.8 go-cluster VS Gorgonia

Gorgonia is a library that helps facilitate machine learning in Go.
m2cgen

8.4 0.0 go-cluster VS m2cgen

Transform ML models into a native code (Java, C, Python, Go, JavaScript, Visual Basic, C#, R, PowerShell, PHP, Dart, Haskell, Ruby, F#, Rust) with zero dependencies
gosseract

8.4 6.7 go-cluster VS gosseract

Go package for OCR (Optical Character Recognition), by using Tesseract C++ library
tfgo

8.3 1.5 go-cluster VS tfgo

Tensorflow + Go, the gopher way
envd

8.0 9.0 go-cluster VS envd

🏕️ Reproducible development environment
goml

7.9 0.0 go-cluster VS goml

On-line Machine Learning in Go (and so much more)
gago

7.3 0.0 go-cluster VS gago

🍀 Evolutionary optimization library for Go (genetic algorithm, particle swarm optimization, differential evolution)
bayesian

7.2 2.3 go-cluster VS bayesian

Naive Bayesian Classification for Golang.
CloudForest

7.1 0.0 go-cluster VS CloudForest

Ensembles of decision trees in go/golang.
ocrserver

7.0 0.0 go-cluster VS ocrserver

A simple OCR API server, seriously easy to be deployed by Docker, on Heroku as well
onnx-go

6.8 2.3 go-cluster VS onnx-go

onnx-go gives the ability to import a pre-trained neural network within Go without being linked to a framework or library.
gobrain

6.7 0.0 go-cluster VS gobrain

Neural Networks written in go
go-deep

6.6 3.5 go-cluster VS go-deep

Artificial Neural Network
sklearn

6.0 0.0 go-cluster VS sklearn

bits of sklearn ported to Go #golang
regommend

5.9 0.0 go-cluster VS regommend

Recommendation engine for Go
Goptuna

5.5 6.4 go-cluster VS Goptuna

A hyperparameter optimization framework, inspired by Optuna.
go-galib

5.5 0.0 go-cluster VS go-galib

Genetic Algorithms library written in Go / golang
goRecommend

5.3 0.0 go-cluster VS goRecommend

Collaborative Filtering (CF) Algorithms in Go!
goga

5.3 0.0 go-cluster VS goga

Golang Genetic Algorithm
shield

5.2 0.0 go-cluster VS shield

Bayesian text classifier with flexible tokenizers and storage backends for Go
go-fann

4.6 0.0 go-cluster VS go-fann

Go bindings for FANN, library for artificial neural networks
goscore

4.4 0.0 go-cluster VS goscore

Go Scoring API for PMML
neat

4.2 0.0 go-cluster VS neat

Plug-and-play, parallel Go framework for NeuroEvolution of Augmenting Topologies (NEAT).
fonet

4.1 0.0 go-cluster VS fonet

fonet is a deep neural network package for Go.
go-featureprocessing

4.1 5.2 go-cluster VS go-featureprocessing

🔥 Fast, simple sklearn-like feature processing for Go
libsvm

4.0 0.0 go-cluster VS libsvm

libsvm go version
NEUGO

3.8 0.0 go-cluster VS NEUGO

NEUGO: Neural Networks in Go
go-pr

3.8 0.0 go-cluster VS go-pr

Pattern recognition package in Go lang.
neural-go

3.7 0.0 go-cluster VS neural-go

A multilayer perceptron network implemented in Go, with training via backpropagation.
GoMind

3.7 0.0 go-cluster VS GoMind

A simplistic Neural Network Library in Go
Varis

3.5 0.0 go-cluster VS Varis

Golang Neural Network
golinear

3.4 0.0 go-cluster VS golinear

liblinear bindings for Go
EAGO

2.9 0.0 go-cluster VS EAGO

EAGO: Evolutionary Algorithms in Go
godist

2.8 0.0 go-cluster VS godist

Probability distributions and associated methods in Go
randomforest

2.8 2.6 go-cluster VS randomforest

Random Forest implementation in golang
evoli

2.8 0.0 go-cluster VS evoli

Genetic Algorithm and Particle Swarm Optimization
ddt

2.3 0.0 go-cluster VS ddt

Golang Dynamic Decision Tree
probab

2.0 0.0 go-cluster VS probab

Automatically exported from code.google.com/p/probab
mlgo

1.2 0.0 go-cluster VS mlgo

Automatically exported from code.google.com/p/mlgo

README

go-cluster

Go implementation of the clustering algorithms k-modes and k-prototypes.

The k-modes algorithm is very similar to the well-known k-means clustering algorithm. The difference lies in how distance is computed. K-means most commonly uses the Euclidean distance between two vectors. That works well for numerical, continuous data, but it is not suitable for categorical data, because there is no meaningful numeric distance between values like ‘Europe’ and ‘Africa’. For this reason k-modes uses the Hamming distance between vectors, which counts how many elements of the two vectors differ. It is a good alternative to one-hot encoding when a feature has a large number of categories. K-prototypes is used to cluster mixed data (both categorical and numerical).
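To make the distance described above concrete, here is a minimal, standalone sketch of the Hamming distance between two dictionary-encoded categorical vectors. The function name `hammingDistance` and the sample encoding are illustrative assumptions; this is not the package's own distance function.

```go
package main

import "fmt"

// hammingDistance counts the positions at which two equally long,
// dictionary-encoded categorical vectors differ.
// Illustrative helper only, not part of the go-cluster API.
func hammingDistance(a, b []int) (int, error) {
	if len(a) != len(b) {
		return 0, fmt.Errorf("length mismatch: %d vs %d", len(a), len(b))
	}
	d := 0
	for i := range a {
		if a[i] != b[i] {
			d++
		}
	}
	return d, nil
}

func main() {
	// Hypothetical encoding: continent {Europe: 1, Africa: 2},
	// colour {blue: 1, red: 2}, size {small: 1}.
	x := []int{1, 1, 1} // Europe, blue, small
	y := []int{2, 1, 1} // Africa, blue, small

	d, err := hammingDistance(x, y)
	if err != nil {
		fmt.Println(err)
		return
	}
	fmt.Println(d) // prints 1: only the continent differs
}
```

The numeric codes carry no order here: only equality matters, so any consistent encoding gives the same distance.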

The implementation is based on the papers HUANG97, HUANG98 and CAO09, and is partially inspired by a Python implementation of the same algorithms, KMODES.

Installation

go get -u gopkg.in/e-XpertSolutions/go-cluster.v1

Usage

This is a basic configuration and usage example for the KModes and KPrototypes algorithms. For more information, please refer to the documentation.

package main

import (
    "fmt"

    "github.com/e-XpertSolutions/go-cluster/cluster"
)

func main() {

    //input categorical data must first be dictionary-encoded to numbers - for example, the values
    //"blue", "red", "green" can become 1, 2, 3

    //lineNumber, columnNumber, rawData (and their new* counterparts) are placeholders for the
    //dimensions and values of your own datasets
    data := cluster.NewDenseMatrix(lineNumber, columnNumber, rawData)
    newData := cluster.NewDenseMatrix(newLineNumber, newColumnNumber, newRawData)


    //input parameters for the algorithm

    //distance and initialization functions may be chosen from the package, or one may use
    //custom functions with proper arguments
    distanceFunction := cluster.WeightedHammingDistance
    initializationFunction := cluster.InitCao

    //number of clusters and maximum number of iterations
    clustersNumber := 5
    maxIteration := 20

    //weight vector - sets the importance of the features, a bigger number means a greater
    //contribution to the cost function
    //the vector must have the same length as the number of features in the dataset
    //it is not compulsory, if 'nil' then all features are treated equally (weight = 1)
    weights := []float64{1, 1, 2}
    wvec := [][]float64{weights}

    //path to the file where the model will be saved or loaded from using LoadModel(), SaveModel()
    //if there is no need to load or save the model, it can be set to an empty string
    path := "km.txt"

    //KModes algorithm
    //initialization
    km := cluster.NewKModes(distanceFunction, initializationFunction, clustersNumber, 1,
        maxIteration, wvec, path)


    //training
    //after training it is possible to access the cluster center vectors and computed labels
    //using km.ClusterCentroids and km.Labels
    err := km.FitModel(data)
    if err != nil {
        fmt.Println(err)
        return
    }

    //predicting labels for new data
    newLabels, err := km.Predict(newData)
    if err != nil {
        fmt.Println(err)
        return
    }
    fmt.Println(newLabels)


    //KPrototypes algorithm
    //it needs two more parameters than k-modes:
    //categorical - vector with numbers indicating columns with categorical features
    //gamma - float number, importance of cost contribution for numerical values
    categorical := []int{1} // means that only column number one contains categorical data
    gamma := 0.2 //cost from distance function for numerical data will be multiplied by 0.2

    //initialization
    kp := cluster.NewKPrototypes(distanceFunction, initializationFunction, categorical,
        clustersNumber, 1, maxIteration, wvec, gamma, path)

    //training (err is already declared above, so plain assignment is used)
    err = kp.FitModel(data)
    if err != nil {
        fmt.Println(err)
        return
    }

    //predicting labels for new data
    newLabelsP, err := kp.Predict(newData)
    if err != nil {
        fmt.Println(err)
        return
    }
    fmt.Println(newLabelsP)
}
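The usage example above expects categorical values to be dictionary-encoded as numbers before they are placed in a matrix. A minimal sketch of such an encoding step follows. The helper `encodeColumn` is hypothetical and not part of the package, and the choice of `float64` codes is an assumption made to match the `float64` weights used above.

```go
package main

import "fmt"

// encodeColumn maps each distinct string to a numeric code in order of
// first appearance, starting at 1. It returns the encoded column and the
// dictionary that was built. Hypothetical helper, not part of go-cluster.
func encodeColumn(values []string) ([]float64, map[string]float64) {
	dict := make(map[string]float64)
	codes := make([]float64, len(values))
	for i, v := range values {
		code, seen := dict[v]
		if !seen {
			code = float64(len(dict) + 1)
			dict[v] = code
		}
		codes[i] = code
	}
	return codes, dict
}

func main() {
	colours := []string{"blue", "red", "green", "red", "blue"}
	codes, dict := encodeColumn(colours)
	fmt.Println(codes)         // prints [1 2 3 2 1]
	fmt.Println(dict["green"]) // prints 3
}
```

Keeping the returned dictionary matters in practice: data passed to Predict later should be encoded with the same mapping that was used for training, otherwise equal categories would receive different codes.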

Contributing

Contributions are greatly appreciated. The project follows the typical GitHub pull request model for contribution.

License

The sources are released under a BSD 3-Clause License. The full terms of that license can be found in the LICENSE file of this repository.

References

[HUANG97] Huang, Z.: Clustering large data sets with mixed numeric and categorical values, Proceedings of the First Pacific Asia Knowledge Discovery and Data Mining Conference, Singapore, pp. 21-34, 1997.

[HUANG98] Huang, Z.: Extensions to the k-modes algorithm for clustering large data sets with categorical values, Data Mining and Knowledge Discovery 2(3), pp. 283-304, 1998.

[CAO09] Cao, F., Liang, J., Bai, L.: A new initialization method for categorical data clustering, Expert Systems with Applications 36(7), pp. 10223-10228, 2009.

[KMODES] Python implementation of k-modes: https://github.com/nicodv/kmodes

*Note that all licence references and agreements mentioned in the go-cluster README section above are relevant to that project's source code only.