Popularity: 8.7 (Declining)

Activity: 4.1

Stars: 3,362

Watchers: 151

Forks: 290

Last Commit: 6 days ago

Programming language: Go

License: Apache License 2.0

Tags: Distributed Systems

gleam alternatives and similar packages

Based on the "Distributed Systems" category.

grpc-go

9.9 9.6 gleam VS grpc-go

The Go language implementation of gRPC. HTTP/2 based RPC
go-micro

9.9 6.4 gleam VS go-micro

A Go microservices framework

Nomad

9.7 9.9 gleam VS Nomad

Nomad is an easy-to-use, flexible, and performant workload orchestrator that can deploy a mix of microservice, batch, containerized, and non-containerized applications. Nomad is easy to operate and scale and has native Consul and Vault integrations.
go-zero

9.7 9.5 gleam VS go-zero

DISCONTINUED. go-zero is a web and rpc framework written in Go. It's born to ensure the stability of the busy sites with resilient design. Builtin goctl greatly improves the development productivity. [Moved to: https://github.com/zeromicro/go-zero]
micro

9.6 8.7 gleam VS micro

A Go service development platform
raft

9.5 6.0 gleam VS raft

Golang implementation of the Raft consensus protocol
rpcx

9.5 8.2 gleam VS rpcx

Best microservices framework in Go, like alibaba Dubbo, but with more features, Scale easily. Try it. Test it. If you feel it's better, use it! 𝐉𝐚𝐯𝐚有𝐝𝐮𝐛𝐛𝐨, 𝐆𝐨𝐥𝐚𝐧𝐠有𝐫𝐩𝐜𝐱! build for cloud!
tendermint

9.5 0.0 gleam VS tendermint

⟁ Tendermint Core (BFT Consensus) in Go
ringpop-go

9.3 0.0 gleam VS ringpop-go

Scalable, fault-tolerant application-layer sharding for Go applications
Kitex

9.3 9.4 gleam VS Kitex

Go RPC framework with high-performance and strong-extensibility for building micro-services.
Serf

9.3 5.6 gleam VS Serf

Service orchestration and management tool.
torrent

9.2 9.3 gleam VS torrent

Full-featured BitTorrent client package and utilities
KrakenD

9.2 8.4 gleam VS KrakenD

Ultra performant API Gateway with middlewares. A project hosted at The Linux Foundation
dht

9.2 9.3 gleam VS dht

Full-featured BitTorrent client package and utilities
dragonboat

9.1 6.4 gleam VS dragonboat

A feature complete and high performance multi-group Raft library in Go.
Encore

8.9 9.6 gleam VS Encore

Encore is the Backend Development Platform purpose-built to help you create event-driven and distributed systems.
Dkron

8.9 8.4 gleam VS Dkron

Dkron - Distributed, fault tolerant job scheduling system https://dkron.io
emitter-io

8.8 6.2 gleam VS emitter-io

High performance, distributed and low latency publish-subscribe platform.
DHT

8.6 0.0 gleam VS DHT

BitTorrent DHT Protocol && DHT Spider.
glow

8.6 0.0 gleam VS glow

Glow is an easy-to-use distributed computation system written in Go, similar to Hadoop Map Reduce, Spark, Flink, Storm, etc. I am also working on another similar pure Go system, https://github.com/chrislusf/gleam , which is more flexible and more performant.
Olric

8.5 5.7 gleam VS Olric

Distributed in-memory object store. It can be used as an embedded Go library and a language-independent service.
gocelery

8.3 0.0 gleam VS gocelery

Celery Distributed Task Queue in Go
liftbridge

8.3 6.4 gleam VS liftbridge

Lightweight, fault-tolerant message streams.
Dragonfly

8.2 9.7 gleam VS Dragonfly

Dragonfly is an open source P2P-based file distribution and image acceleration system. It is hosted by the Cloud Native Computing Foundation (CNCF) as an Incubating Level Project.
go-doudou

8.0 8.4 gleam VS go-doudou

go-doudou（doudou pronounce /dəudəu/）is OpenAPI 3.0 (for REST) spec and Protobuf v3 (for grpc) based lightweight microservice framework. It supports monolith service application as well.
hprose

7.8 7.1 gleam VS hprose

Hprose is a cross-language RPC. This project is Hprose for Golang.
redis-lock

7.7 4.6 gleam VS redis-lock

Simplified distributed locking implementation using Redis
go-health

7.3 3.7 gleam VS go-health

Library for enabling asynchronous health checks in your service
arpc

7.2 7.2 gleam VS arpc

More effective network communication, two-way calling, notify and broadcast supported.
rain

7.2 7.2 gleam VS rain

🌧 BitTorrent client and library in Go
gorpc

7.0 0.0 gleam VS gorpc

Simple, fast and scalable golang rpc library for high load
resgate

6.9 0.0 gleam VS resgate

A Realtime API Gateway used with NATS to build REST, real time, and RPC APIs, where all your clients are synchronized seamlessly.
Temporal

6.9 9.1 gleam VS Temporal

Temporal Go SDK
trpc-go

6.9 8.9 gleam VS trpc-go

A pluggable, high-performance RPC framework written in golang
consistent

6.8 0.0 gleam VS consistent

Consistent hashing with bounded loads in Golang
go-peerflix

6.7 0.0 gleam VS go-peerflix

Go Peerflix
digota

6.7 0.0 gleam VS digota

ecommerce microservice
go-sundheit

6.4 6.4 gleam VS go-sundheit

A library built to provide support for defining service health for golang services. It allows you to register async health checks for your dependencies and the service itself, provides a health endpoint that exposes their status, and health metrics.
go-jump

6.1 1.8 gleam VS go-jump

go-jump: Jump consistent hashing
sleuth

6.0 1.8 gleam VS sleuth

A Go library for master-less peer-to-peer autodiscovery and RPC between HTTP services
jsonrpc

5.1 2.3 gleam VS jsonrpc

The jsonrpc package helps implement of JSON-RPC 2.0
dynamolock

5.0 9.3 gleam VS dynamolock

DynamoDB Lock Client for Go
outboxer

4.8 7.6 gleam VS outboxer

A library that implements the outboxer pattern in go
Maestro

4.4 0.0 gleam VS Maestro

Take control of your data, connect with anything, and expose it anywhere through protocols such as HTTP, GraphQL, and gRPC.
doublejump

4.2 0.0 gleam VS doublejump

A revamped Google's jump consistent hash
dot

3.9 0.0 gleam VS dot

distributed data sync with operational transformation/transforms
celeriac

3.7 0.0 gleam VS celeriac

Golang client library for adding support for interacting and monitoring Celery workers, tasks and events.
drmaa

3.6 1.8 gleam VS drmaa

Compute cluster (HPC) job submission library for Go (#golang) based on the open DRMAA standard.
go-mysql-lock

3.3 4.1 gleam VS go-mysql-lock

MySQL Backed Locking Primitive
flowgraph

3.1 0.0 gleam VS flowgraph

Flowgraph package for scalable asynchronous system development

README

Gleam

Gleam is a high-performance, efficient distributed execution system that is also simple, generic, flexible, and easy to customize.

Gleam is built in Go, and user-defined computations can be written in Go, Unix pipe tools, or any streaming program.

High Performance

Pure Go mappers and reducers have high performance and concurrency.
Data flows through memory, optionally to disk.
Multiple map reduce steps are merged together for better performance.
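
One way to picture merging consecutive steps: instead of materializing an intermediate dataset between two mappers, the mappers are composed so that each row passes through both in a single pass. The sketch below uses only the standard library. `Mapper`, `fuse`, `splitWords`, `lower`, and `runFused` are hypothetical names for illustration, not Gleam's API, and Gleam's own merging may work differently.

```go
package main

import (
	"fmt"
	"strings"
)

// Mapper is a hypothetical row-at-a-time step: it receives one input value
// and calls emit zero or more times. It is not a Gleam type.
type Mapper func(in string, emit func(string))

// fuse chains mappers so that each emitted value is handed straight to the
// next step, without collecting an intermediate dataset in between.
func fuse(steps ...Mapper) Mapper {
	return func(in string, emit func(string)) {
		if len(steps) == 0 {
			emit(in)
			return
		}
		rest := fuse(steps[1:]...)
		steps[0](in, func(mid string) { rest(mid, emit) })
	}
}

// splitWords emits each whitespace-separated word of a line.
func splitWords(line string, emit func(string)) {
	for _, w := range strings.Fields(line) {
		emit(w)
	}
}

// lower emits the lowercase form of its input.
func lower(word string, emit func(string)) {
	emit(strings.ToLower(word))
}

// runFused applies the fused pipeline to every line and collects the output.
func runFused(lines []string) []string {
	var out []string
	pipeline := fuse(splitWords, lower)
	for _, line := range lines {
		pipeline(line, func(s string) { out = append(out, s) })
	}
	return out
}

func main() {
	fmt.Println(runFused([]string{"Hello Gleam", "Map Reduce"}))
}
```

Running it prints `[hello gleam map reduce]`; no slice of intermediate words is ever built between the two steps.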

Memory Efficient

Gleam does not have the common GC problem that plagues other languages. Each executor runs in a separate OS process, and the memory is managed by the OS, so one machine can host many more executors.
Gleam master and agent servers are memory efficient, consuming about 10 MB of memory.
Gleam tries to automatically adjust the required memory size based on data size hints, avoiding the manual trial-and-error memory tuning effort.

Flexible

The Gleam flow can run standalone or distributed.
Adjustable between in-memory mode and OnDisk mode.

Easy to Customize

The Go code is much simpler to read than Scala, Java, or C++.

One Flow, Multiple ways to execute

Gleam code defines the flow, specifying each dataset (vertex) and computation step (edge), and builds up a directed acyclic graph (DAG). There are multiple ways to execute the DAG.
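
To make the vertex and edge picture concrete, here is a minimal sketch that models datasets as vertices and steps as edges, then derives an order in which the steps could run. The `Step` type and `executionOrder` function are hypothetical names, and the scheduling shown (Kahn's topological sort) is an assumption for illustration, not a description of Gleam's scheduler.

```go
package main

import (
	"errors"
	"fmt"
	"sort"
)

// Step is a hypothetical edge in the flow graph: it reads one dataset and
// produces another. These names are illustrative, not Gleam's types.
type Step struct {
	Name     string
	From, To string
}

// executionOrder returns step names in an order where every step runs only
// after all steps that produce its input dataset (Kahn's algorithm).
func executionOrder(steps []Step) ([]string, error) {
	pending := map[string]int{}     // dataset -> producing steps not yet run
	outgoing := map[string][]Step{} // dataset -> steps that read it
	for _, s := range steps {
		pending[s.From] += 0 // make sure source datasets are registered
		pending[s.To]++
		outgoing[s.From] = append(outgoing[s.From], s)
	}
	var ready []string
	for dataset, n := range pending {
		if n == 0 {
			ready = append(ready, dataset)
		}
	}
	sort.Strings(ready) // map iteration order is random; sort for stable output
	var order []string
	for len(ready) > 0 {
		dataset := ready[0]
		ready = ready[1:]
		for _, s := range outgoing[dataset] {
			order = append(order, s.Name)
			pending[s.To]--
			if pending[s.To] == 0 {
				ready = append(ready, s.To)
			}
		}
	}
	if len(order) != len(steps) {
		return nil, errors.New("flow contains a cycle")
	}
	return order, nil
}

// exampleFlow is a small join-shaped graph: two inputs feed one dataset.
func exampleFlow() []Step {
	return []Step{
		{"readA", "fileA", "a"},
		{"readB", "fileB", "b"},
		{"joinLeft", "a", "joined"},
		{"joinRight", "b", "joined"},
		{"print", "joined", "out"},
	}
}

func main() {
	order, err := executionOrder(exampleFlow())
	fmt.Println(order, err)
}
```

A graph with a cycle never yields a ready dataset for some step, so the function reports an error instead of an order.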

The default way is to run locally. This works in most cases.

Here we mostly talk about the distributed mode.

Distributed Mode

The distributed mode involves several roles that need explaining: Master, Agent, Executor, and Driver.

Gleam Driver

The Driver is the program users write. It defines the flow and talks to the Master, Agents, and Executors.

Gleam Master

The Master is one single server that collects resource information from Agents.
It stores transient resource information and can be restarted.
When the Driver program starts, it asks the Master for available Executors on Agents.

Gleam Agent

Agents run on any machine that can run computations.
Agents periodically send resource usage updates to the Master.
When the Driver program has Executors assigned, it talks to the Agents to start the Executors.
Agents also manage the datasets generated by each Executor.
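
The interaction between Master and Agents described above can be sketched as a small in-memory registry: Agents report their free executor slots, and the Driver's request is answered from the latest reports. Everything here is an assumption for illustration: the `Master` type, its `Report` and `Allocate` methods, and the greedy allocation policy are hypothetical and do not reflect Gleam's actual protocol or code. The sketch is also not safe for concurrent use.

```go
package main

import (
	"errors"
	"fmt"
	"sort"
)

// Master is a hypothetical registry of free executor slots per agent.
// It only holds transient state, so it could be rebuilt after a restart
// from the next round of agent reports.
type Master struct {
	free map[string]int
}

// NewMaster returns an empty registry.
func NewMaster() *Master {
	return &Master{free: map[string]int{}}
}

// Report records the latest resource update from an agent, replacing any
// older value for that agent.
func (m *Master) Report(agent string, freeSlots int) {
	m.free[agent] = freeSlots
}

// Allocate reserves n executor slots and returns how many executors the
// driver should start on each agent. Agents are filled greedily in name
// order; this policy is an assumption made for the sketch.
func (m *Master) Allocate(n int) (map[string]int, error) {
	total := 0
	agents := make([]string, 0, len(m.free))
	for agent, slots := range m.free {
		agents = append(agents, agent)
		total += slots
	}
	if total < n {
		return nil, errors.New("not enough free executors")
	}
	sort.Strings(agents)
	plan := map[string]int{}
	for _, agent := range agents {
		if n == 0 {
			break
		}
		take := m.free[agent]
		if take > n {
			take = n
		}
		if take > 0 {
			plan[agent] = take
			m.free[agent] -= take
			n -= take
		}
	}
	return plan, nil
}

func main() {
	m := NewMaster()
	m.Report("agent1", 2)
	m.Report("agent2", 3)
	plan, err := m.Allocate(4)
	fmt.Println(plan, err)
}
```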

Gleam Executor

Executors are started by Agents. They will read inputs from external or previous datasets, process them, and output to a new dataset.

Dataset

The datasets are managed by Agents. By default, the data flows only through memory and network, without touching slow disk.
Optionally, the data can be persisted to disk.

By leaving it in memory, the flow can have back pressure, and can support stream computation naturally.
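
Back pressure of this kind can be illustrated with a bounded Go channel: when the in-memory buffer is full, the producer blocks until the consumer catches up. This is a minimal standard-library sketch of the general idea; `runPipeline` is a hypothetical name, and Gleam's own data transport between executors is not shown here.

```go
package main

import (
	"fmt"
	"sync"
)

// runPipeline streams n rows from a producer to a consumer through a bounded
// in-memory buffer. It returns how many rows the consumer processed and the
// largest number of rows the producer ever observed waiting in the buffer.
func runPipeline(n, bufferSize int) (processed, maxBuffered int) {
	rows := make(chan int, bufferSize)
	var wg sync.WaitGroup
	wg.Add(1)
	go func() { // producer
		defer wg.Done()
		defer close(rows)
		for i := 0; i < n; i++ {
			rows <- i // blocks while the buffer is full: this is the back pressure
			if l := len(rows); l > maxBuffered {
				maxBuffered = l
			}
		}
	}()
	for range rows { // consumer
		processed++
	}
	wg.Wait() // makes the producer's writes to maxBuffered visible
	return processed, maxBuffered
}

func main() {
	processed, maxBuffered := runPipeline(1000, 4)
	fmt.Println(processed, maxBuffered <= 4)
}
```

However fast the producer is, the number of buffered rows never exceeds the buffer size, which is what keeps memory use bounded for streaming workloads.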

Documentation

Word Count

First, register the Go functions. Each registration returns a mapper or reducer function id, which can then be passed to the flow.

package main

import (
    "flag"
    "strings"

    "github.com/chrislusf/gleam/distributed"
    "github.com/chrislusf/gleam/flow"
    "github.com/chrislusf/gleam/gio"
    "github.com/chrislusf/gleam/plugins/file"
)

var (
    isDistributed = flag.Bool("distributed", false, "run in distributed mode or not")
    Tokenize      = gio.RegisterMapper(tokenize)
    AppendOne     = gio.RegisterMapper(appendOne)
    Sum           = gio.RegisterReducer(sum)
)

func main() {

    gio.Init()   // If the command line invokes the mapper or reducer, execute it and exit.
    flag.Parse() // optional, since gio.Init() will call this also.

    f := flow.New("top5 words in passwd").
        Read(file.Txt("/etc/passwd", 2)). // read a text file, partitioned into 2 shards
        Map("tokenize", Tokenize).        // invoke the registered "tokenize" mapper function.
        Map("appendOne", AppendOne).      // invoke the registered "appendOne" mapper function.
        ReduceByKey("sum", Sum).          // invoke the registered "sum" reducer function.
        Sort("sortBySum", flow.OrderBy(2, true)).
        Top("top5", 5, flow.OrderBy(2, false)).
        Printlnf("%s\t%d")

    if *isDistributed {
        f.Run(distributed.Option())
    } else {
        f.Run()
    }

}

func tokenize(row []interface{}) error {
    line := gio.ToString(row[0])
    for _, s := range strings.FieldsFunc(line, func(r rune) bool {
        return !('A' <= r && r <= 'Z' || 'a' <= r && r <= 'z' || '0' <= r && r <= '9')
    }) {
        gio.Emit(s)
    }
    return nil
}

func appendOne(row []interface{}) error {
    row = append(row, 1)
    gio.Emit(row...)
    return nil
}

func sum(x, y interface{}) (interface{}, error) {
    return gio.ToInt64(x) + gio.ToInt64(y), nil
}

Now you can execute the binary directly, or with the "-distributed" option to run in distributed mode. Distributed mode needs a simple setup, described later.

A more elaborate example, using the predefined mappers and reducers, is here: https://github.com/chrislusf/gleam/blob/master/examples/word_count_in_go/word_count_in_go.go
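
For readers who want to see what the word count steps compute without setting up Gleam, here is a single-process sketch using only the standard library. It reuses the tokenizing rule from the example above, but `topWords`, `wordCount`, and `isSeparator` are hypothetical names, and breaking ties alphabetically is an assumption of this sketch rather than documented Gleam behavior.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// wordCount pairs a word with the number of times it was seen.
type wordCount struct {
	Word  string
	Count int
}

// isSeparator reports whether r is anything other than an ASCII letter or digit.
func isSeparator(r rune) bool {
	return !('A' <= r && r <= 'Z' || 'a' <= r && r <= 'z' || '0' <= r && r <= '9')
}

// topWords tokenizes the lines, counts each word, and returns the n most
// frequent words. Ties are broken alphabetically to keep the result stable.
func topWords(lines []string, n int) []wordCount {
	counts := map[string]int{}
	for _, line := range lines {
		for _, w := range strings.FieldsFunc(line, isSeparator) {
			counts[w]++ // "appendOne" followed by "sum", collapsed into one step
		}
	}
	result := make([]wordCount, 0, len(counts))
	for w, c := range counts {
		result = append(result, wordCount{w, c})
	}
	sort.Slice(result, func(i, j int) bool {
		if result[i].Count != result[j].Count {
			return result[i].Count > result[j].Count
		}
		return result[i].Word < result[j].Word
	})
	if len(result) > n {
		result = result[:n]
	}
	return result
}

// sampleLines mimics two lines of a passwd-style file.
func sampleLines() []string {
	return []string{
		"root:x:0:0:root:/root:/bin/sh",
		"daemon:x:1:1:daemon:/usr/sbin:/bin/sh",
	}
}

func main() {
	for _, wc := range topWords(sampleLines(), 5) {
		fmt.Printf("%s\t%d\n", wc.Word, wc.Count)
	}
}
```

The Gleam version spreads the same work over shards and executors; the sketch only shows the per-row logic in one process.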

Word Count by Unix Pipe Tools

Here is another way to do something similar, using Unix pipe tools.

Unix pipes are easy for sequential pipelines, but limited for fan-out, and even more limited for fan-in.

With Gleam, fan-in and fan-out parallel pipes become very easy.

package main

import (
    "fmt"

    "github.com/chrislusf/gleam/flow"
    "github.com/chrislusf/gleam/gio"
    "github.com/chrislusf/gleam/gio/mapper"
    "github.com/chrislusf/gleam/plugins/file"
    "github.com/chrislusf/gleam/util"
)

func main() {

    gio.Init()

    flow.New("word count by unix pipes").
        Read(file.Txt("/etc/passwd", 2)).
        Map("tokenize", mapper.Tokenize).
        Pipe("lowercase", "tr 'A-Z' 'a-z'").
        Pipe("sort", "sort").
        Pipe("uniq", "uniq -c").
        OutputRow(func(row *util.Row) error {

            fmt.Printf("%s\n", gio.ToString(row.K[0]))

            return nil
        }).Run()

}

This example uses OutputRow() to process the output rows directly.
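
As a rough picture of what fan-out and fan-in mean, the following standard-library sketch splits input lines over several concurrent workers and merges their results into one stream. The names `fanOutIn` and `lowercase` are hypothetical, and goroutines stand in for what Gleam would run as separate executor processes.

```go
package main

import (
	"fmt"
	"strings"
	"sync"
)

// lowercase is the per-line work done by every shard in this sketch.
func lowercase(s string) string {
	return strings.ToLower(s)
}

// fanOutIn distributes lines round-robin over shards workers (fan-out),
// runs process on each shard concurrently, and merges all results into one
// slice (fan-in). shards must be at least 1. The order of the output is not
// deterministic across shards.
func fanOutIn(lines []string, shards int, process func(string) string) []string {
	inputs := make([]chan string, shards)
	merged := make(chan string)
	var wg sync.WaitGroup
	for i := range inputs {
		inputs[i] = make(chan string)
		wg.Add(1)
		go func(in <-chan string) { // one worker per shard
			defer wg.Done()
			for line := range in {
				merged <- process(line)
			}
		}(inputs[i])
	}
	go func() { // fan-out: feed the shards, then signal the end of input
		for i, line := range lines {
			inputs[i%shards] <- line
		}
		for _, in := range inputs {
			close(in)
		}
	}()
	go func() { // close the merged stream once every worker is done
		wg.Wait()
		close(merged)
	}()
	var out []string
	for s := range merged { // fan-in: a single reader collects all results
		out = append(out, s)
	}
	return out
}

func main() {
	fmt.Println(len(fanOutIn([]string{"A", "B", "C", "D", "E"}, 2, lowercase)))
}
```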

Join two CSV files.

Assume file "a.csv" has fields "a1, a2, a3, a4, a5" and file "b.csv" has fields "b1, b2, b3". We want to join the rows where a1 = b2, and the output format should be "a1, a4, b3".

package main

import (
    . "github.com/chrislusf/gleam/flow"
    "github.com/chrislusf/gleam/gio"
    "github.com/chrislusf/gleam/plugins/file"
)

func main() {

    gio.Init()

    f := New("join a.csv and b.csv by a1=b2")
    a := f.Read(file.Csv("a.csv", 1)).Select("select", Field(1,4)) // a1, a4
    b := f.Read(file.Csv("b.csv", 1)).Select("select", Field(2,3)) // b2, b3

    a.Join("joinByKey", b).Printlnf("%s,%s,%s").Run()  // a1, a4, b3

}
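
To show the expected result of this join on concrete rows, here is a small in-memory sketch. It uses a hash join on the key, which is a technique chosen for brevity here and not necessarily how Gleam's Join is implemented. `joinRows` and the sample rows are hypothetical, and rows are assumed to have all the fields listed above.

```go
package main

import "fmt"

// joinRows performs an inner join of a-rows and b-rows where a1 == b2 and
// returns rows of the form (a1, a4, b3). Each a-row must have at least 4
// fields and each b-row at least 3.
func joinRows(a, b [][]string) [][]string {
	// Build side: index the b3 values by the join key b2.
	byKey := map[string][]string{}
	for _, row := range b {
		byKey[row[1]] = append(byKey[row[1]], row[2])
	}
	// Probe side: look up each a1 and emit one output row per match.
	var out [][]string
	for _, row := range a {
		for _, b3 := range byKey[row[0]] {
			out = append(out, []string{row[0], row[3], b3})
		}
	}
	return out
}

// sampleA and sampleB stand in for the parsed contents of a.csv and b.csv.
func sampleA() [][]string {
	return [][]string{
		{"k1", "a2", "a3", "x", "a5"},
		{"k2", "a2", "a3", "y", "a5"},
	}
}

func sampleB() [][]string {
	return [][]string{
		{"b1", "k1", "p"},
		{"b1", "k3", "q"},
	}
}

func main() {
	fmt.Println(joinRows(sampleA(), sampleB()))
}
```

With the sample rows, only the key `k1` appears on both sides, so the program prints `[[k1 x p]]`.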

Distributed Computing

Setup Gleam Cluster Locally

Start a gleam master and several gleam agents

// start "gleam master" on a server
> go get github.com/chrislusf/gleam/distributed/gleam
> gleam master --address=":45326"

// start up "gleam agent" on some different servers or ports
> gleam agent --dir=2 --port 45327 --host=127.0.0.1
> gleam agent --dir=3 --port 45328 --host=127.0.0.1

Setup Gleam Cluster on Kubernetes

Start a gleam master and several gleam agents

kubectl apply -f k8s/

Change Execution Mode.

After the flow is defined, the Run() function can be executed in local mode or distributed mode.

  f := flow.New("")
  ...
  // 1. local mode
  f.Run()

  // 2. distributed mode; requires import "github.com/chrislusf/gleam/distributed"
  f.Run(distributed.Option())
  // or, to point at a specific master:
  f.Run(distributed.Option().SetMaster("master_ip:45326"))

Important Features

Fault tolerant OnDisk().
Read data from Local, HDFS, or S3.
Data Sources
- Cassandra, with example
- Kafka example
- Parquet files example
- ORC files example
- CSV files example
- TSV files
- TXT files
- Raw Socket example

Status

Gleam is just beginning. Here are a few todo items. Any help is welcome!

Add new plugins to read external data.
Add windowing functions similar to Apache Beam/Flink. (in progress)
Add schema support for each dataset.
Support using SQL as a flow step, similar to LINQ.
Add dataset metadata for better caching of often re-calculated data.

Especially Need Help Now:

Go implementation to read Parquet files.

Please start using it and give feedback. Help is needed, and anything is welcome. Small things count: fixing documentation, adding a logo, adding a Docker image, blogging about it, sharing it, etc.

License

Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with the License. You may obtain a copy of the License at

http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.

*Note that all licence references and agreements mentioned in the gleam README section above are relevant to that project's source code only.

gleam

Fast, efficient, and scalable distributed map/reduce system, DAG execution, in memory or on disk, written in pure Go, runs standalone or distributedly.
