Description

Another cross-platform, efficient, practical and pretty CSV/TSV toolkit

Yes, you could just use spreadsheet software like MS Excel to do most of the job.

However, that is all done by clicking and typing, which is not automated and is time-consuming to repeat, especially when you want to apply similar operations to different datasets or for different purposes.

Hope it is helpful to you.

Programming language: Go

License: MIT License

Tags: Utilities CSV TSV

Latest version: v0.22.0.rc1

csvtk alternatives and similar packages

Based on the "Utilities" category.
Alternatively, view csvtk alternatives based on common mentions on social networks and blogs.

fzf

10.0 9.6 csvtk VS fzf

:cherry_blossom: A command-line fuzzy finder
项目文档

9.9 9.4 csvtk VS 项目文档

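🚀 A basic development platform built on Vite+Vue3+Gin that supports mixing TS and JS. It integrates features essential for development, including JWT authentication, permission management, dynamic routing, components with controllable visibility, pagination wrappers, multi-point login interception, resource permissions, upload and download, a code generator, a form generator, and configurable import/export.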

ngrok

9.9 3.7 csvtk VS ngrok

Unified ingress for developers
hub

9.9 4.2 csvtk VS hub

A command-line tool that makes git easier to use with GitHub.
delve

9.9 9.2 csvtk VS delve

Delve is a debugger for the Go programming language.
dive

9.9 6.6 csvtk VS dive

A tool for exploring each layer in a docker image
excelize

9.8 8.8 csvtk VS excelize

Go language library for reading and writing Microsoft Excel™ (XLAM / XLSM / XLSX / XLTM / XLTX) spreadsheets
go-torch

9.7 0.0 csvtk VS go-torch

DISCONTINUED. Stochastic flame graph profiler for Go programs.
ctop

9.7 0.0 csvtk VS ctop

Top-like interface for container metrics
GJSON

9.7 5.1 csvtk VS GJSON

Get JSON values quickly - JSON parser for Go
goreleaser

9.6 9.8 csvtk VS goreleaser

Deliver Go binaries as fast and easily as possible
resty

9.5 7.9 csvtk VS resty

Simple HTTP and REST client library for Go
Task

9.5 9.6 csvtk VS Task

A task runner / simpler Make alternative written in Go
wuzz

9.5 0.0 csvtk VS wuzz

Interactive cli tool for HTTP inspection
usql

9.4 9.0 csvtk VS usql

Universal command-line interface for SQL databases
xlsx

9.3 6.7 csvtk VS xlsx

Go library for reading and writing XLSX files.
godotenv

9.3 3.7 csvtk VS godotenv

A Go port of Ruby's dotenv library (Loads environment variables from .env files)
peco

9.3 4.7 csvtk VS peco

Simplistic interactive filtering tool
Kopia

9.2 9.6 csvtk VS Kopia

Cross-platform backup tool for Windows, macOS & Linux with fast, incremental backups, client-side end-to-end encryption, compression and data deduplication. CLI and GUI included.
godropbox

9.0 2.6 csvtk VS godropbox

Common libraries for writing Go services/applications.
go-funk

8.9 4.0 csvtk VS go-funk

A modern Go utility library which provides helpers (map, find, contains, filter, ...)
hystrix-go

8.9 0.0 csvtk VS hystrix-go

Netflix's Hystrix latency and fault tolerance library, for Go
lancet

8.8 9.4 csvtk VS lancet

A comprehensive, efficient, and reusable util function library of Go.
gorequest

8.7 2.6 csvtk VS gorequest

GoRequest -- Simplified HTTP client ( inspired by nodejs SuperAgent )
minify

8.7 9.2 csvtk VS minify

Go minifiers for web formats
panicparse

8.6 0.0 csvtk VS panicparse

Crash your app in style (Golang)
mc

8.6 9.2 csvtk VS mc

Simple | Fast tool to manage MinIO clusters :cloud:
goreporter

8.6 0.0 csvtk VS goreporter

A Golang tool that does static analysis, unit testing, code review and generate code quality report.
mergo

8.4 4.0 csvtk VS mergo

Mergo: merging Go structs and maps since 2013
gojson

8.4 0.0 csvtk VS gojson

Automatically generate Go (golang) struct definitions from example JSON
create-go-app

8.3 7.5 csvtk VS create-go-app

✨ A complete and self-contained solution for developers of any qualification to create a production-ready project with backend (Go), frontend (JavaScript, TypeScript) and deploy automation (Ansible, Docker) by running only one CLI command.
EaseProbe

8.2 8.7 csvtk VS EaseProbe

A simple, standalone, and lightweight tool that can do health/status checking, written in Go.
spinner

8.2 4.3 csvtk VS spinner

Go (golang) package with 90 configurable terminal spinner/progress indicators.
filetype

8.1 4.2 csvtk VS filetype

Fast, dependency-free Go package to infer binary file types based on the magic numbers header signature
grequests

8.1 3.7 csvtk VS grequests

A Go "clone" of the great and famous Requests library
sling

7.9 6.5 csvtk VS sling

A Go HTTP client library for creating and sending API requests
boilr

7.9 0.0 csvtk VS boilr

:zap: boilerplate template manager that generates files or directories from template repositories
mole

7.9 0.0 csvtk VS mole

CLI application to create ssh tunnels focused on resiliency and user experience.
jump

7.8 2.9 csvtk VS jump

Jump helps you navigate faster by learning your habits. ✌️
mmake

7.8 0.0 csvtk VS mmake

Modern Make
beaver

7.7 3.6 csvtk VS beaver

💨 A real time messaging system to build a scalable in-app notifications, multiplayer games, chat apps in web and mobile apps.
gitbatch

7.7 0.0 csvtk VS gitbatch

manage your git repositories in one place
coop

7.7 0.0 csvtk VS coop

DISCONTINUED. Cheat sheet for some of the common concurrent flows in Go
mimetype

7.7 6.1 csvtk VS mimetype

A fast Golang library for media type and file extension detection, based on magic numbers
go-underscore

7.5 0.0 csvtk VS go-underscore

Helpfully Functional Go - A useful collection of Go utilities. Designed for programmer happiness.
circuitbreaker

7.5 0.0 csvtk VS circuitbreaker

Circuit Breakers in Go
JobRunner

7.4 0.0 csvtk VS JobRunner

Framework for performing work asynchronously, outside of the request flow
scany

7.4 4.9 csvtk VS scany

Library for scanning data from a database into Go structs and more
goreq

7.3 0.0 csvtk VS goreq

DISCONTINUED. Minimal and simple request library for Go language.
gron

7.3 0.0 csvtk VS gron

gron, Cron Jobs in Go.

README

csvtk - a cross-platform, efficient and practical CSV/TSV toolkit

Documents: http://bioinf.shenwei.me/csvtk (Usage and Tutorial). Chinese introduction
Source code: https://github.com/shenwei356/csvtk

Introduction

Similar to the FASTA/Q formats in bioinformatics, CSV and TSV are basic and ubiquitous file formats in both bioinformatics and data science.

People usually use spreadsheet software like MS Excel to process table data. However this is all by clicking and typing, which is not automated and is time-consuming to repeat, especially when you want to apply similar operations with different datasets or purposes.

You can also accomplish some CSV/TSV manipulations using shell commands, but more code is needed to handle the header line. Shell commands do not support selecting columns with column names either.
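To make the header-line point concrete, here is a minimal sketch using only standard tools. The file `people.csv` and its contents are made up for illustration, and the naive comma splitting assumes there are no quoted fields.

```shell
# Hypothetical sample data (not from the csvtk repository).
tmpdir=$(mktemp -d)
cat > "$tmpdir/people.csv" <<'EOF'
id,name,age
3,carol,41
1,alice,30
2,bob,25
EOF

# Sort by the first column numerically while keeping the header on top:
# the header has to be split off and handled separately.
sorted=$( { head -n 1 "$tmpdir/people.csv"; tail -n +2 "$tmpdir/people.csv" | sort -t, -k1,1n; } )
printf '%s\n' "$sorted"

# Select a column by name: cut only knows positions, so the position
# of "age" has to be looked up in the header first.
ages=$(awk -F, -v col=age '
    NR == 1 { for (i = 1; i <= NF; i++) if ($i == col) c = i }
    c       { print $c }
' "$tmpdir/people.csv")
printf '%s\n' "$ages"

rm -rf "$tmpdir"
```

The `csvtk sort -k 1:n` and `csvtk cut -f first_name,username` forms shown in the Examples section cover the same ground in one command each.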

csvtk is convenient for rapid data investigation and also easy to integrate into analysis pipelines. It could save you lots of time in (not) writing Python/R scripts.

Features
Subcommands
Installation
Command-line completion
Compared to csvkit
Examples
Acknowledgements
Contact
License
Starchart

Features

Cross-platform (Linux/Windows/Mac OS X/OpenBSD/FreeBSD)
Lightweight and out-of-the-box: no dependencies, no compilation, no configuration
Fast, with multiple-CPU support (some commands)
Practical functions provided by 49 subcommands
Supports STDIN and gzipped input/output files, so it is easy to use in pipes
Most subcommands support unselecting fields and fuzzy fields, e.g. -f "-id,-name" for all fields except "id" and "name", -F -f "a.*" for all fields with prefix "a.".
Supports some common plots (see usage)
Seamless support for data with a meta line declaring the separator (e.g., sep=,), as used by MS Excel

Subcommands

49 subcommands in total.

Information

headers: prints headers
dim: dimensions of CSV file
nrow: print number of records
ncol: print number of columns
summary: summary statistics of selected numeric or text fields (groupby group fields)
watch: online monitoring and histogram of selected field
corr: calculate Pearson correlation between numeric columns

Format conversion

pretty: converts CSV to readable aligned table
csv2tab: converts CSV to tabular format
tab2csv: converts tabular format to CSV
space2tab: converts space-delimited format to TSV
transpose: transposes CSV data
csv2md: converts CSV to markdown format
csv2rst: converts CSV to reStructuredText format
csv2json: converts CSV to JSON format
csv2xlsx: converts CSV/TSV files to XLSX file
xlsx2csv: converts XLSX to CSV format

Set operations

head: prints first N records
concat: concatenates CSV/TSV files by rows
sample: sampling by proportion
cut: select and arrange fields
grep: greps data by selected fields with patterns/regular expressions
uniq: unique data without sorting
freq: frequencies of selected fields
inter: intersection of multiple files
filter: filters rows by values of selected fields with arithmetic expression
filter2: filters rows by awk-like arithmetic/string expressions
join: join files by selected fields (inner, left and outer join)
split: splits CSV/TSV into multiple files according to column values
splitxlsx: splits XLSX sheet into multiple sheets according to column values
comb: compute combinations of items at every row

Edit

add-header: add column names
del-header: delete column names
rename: renames column names with new names
rename2: renames column names by regular expression
replace: replaces data of selected fields by regular expression
round: round float to n decimal places
mutate: creates new columns from selected fields by regular expression
mutate2: creates new column from selected fields by awk-like arithmetic/string expressions
sep: separate column into multiple columns
gather: gathers columns into key-value pairs
unfold: unfold multiple values in cells of a field
fold: fold multiple values of a field into cells of groups
fmtdate: format date of selected fields

Ordering

sort: sorts by selected fields

Plotting

plot: see usage
- plot hist: histogram
- plot box: boxplot
- plot line: line plot and scatter plot

Misc

cat: stream file and report progress
version: print version information and check for update
genautocomplete: generate shell autocompletion script (bash|zsh|fish|powershell)

Installation

Download Page

csvtk is implemented in the Go programming language. Executable binary files for most popular operating systems are freely available on the release page.

Method 1: Download binaries (latest stable/dev version)

Just download the compressed executable file for your operating system and decompress it with the tar -zxvf *.tar.gz command or other tools. Then:

For Linux-like systems
1. If you have root privilege, simply copy it to /usr/local/bin:
```
sudo cp csvtk /usr/local/bin/
```
2. Or copy it to any directory listed in the PATH environment variable:
```
mkdir -p $HOME/bin/; cp csvtk $HOME/bin/
```
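If csvtk is copied to `$HOME/bin`, that directory has to be on `PATH` for the shell to find it. The following is a small sketch of how to check this and, if needed, add the directory for the current session. The helper name `path_contains` is made up for illustration.

```shell
# Hypothetical helper: succeeds if directory $1 is listed in PATH.
path_contains() {
    case ":$PATH:" in
        *":$1:"*) return 0 ;;
        *)        return 1 ;;
    esac
}

target="$HOME/bin"
if ! path_contains "$target"; then
    # Affects only the current shell session; add the same assignment to
    # your shell startup file (e.g. ~/.bashrc) to make it permanent.
    PATH="$target:$PATH"
    export PATH
fi

# Prints the resolved location once csvtk has been copied into place.
command -v csvtk || echo "csvtk not found on PATH yet"
```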
For Windows, just copy csvtk.exe to C:\WINDOWS\system32.

Method 2: Install via conda (latest stable version)

conda install -c bioconda csvtk

Method 3: Install via homebrew

brew install csvtk

Method 4: For Go developers (latest stable/dev version)

go get -u github.com/shenwei356/csvtk/csvtk

Method 5: For ArchLinux AUR users (may not be the latest)

yaourt -S csvtk

Command-line completion

Bash:

# generate the completion script
csvtk genautocomplete --shell bash

# configure once, if you have not done so before.
# install bash-completion if the "complete" command is not found.
echo "for bcfile in ~/.bash_completion.d/* ; do source \$bcfile; done" >> ~/.bash_completion
echo "source ~/.bash_completion" >> ~/.bashrc

Zsh:

# generate the completion script
csvtk genautocomplete --shell zsh --file ~/.zfunc/_csvtk

# configure once, if you have not done so before
echo 'fpath=( ~/.zfunc "${fpath[@]}" )' >> ~/.zshrc
echo "autoload -U compinit; compinit" >> ~/.zshrc

fish:

csvtk genautocomplete --shell fish --file ~/.config/fish/completions/csvtk.fish

Compared to `csvkit`

Attention: this comparison with csvkit has not been updated for 2 years.

| Features | csvtk | csvkit | Note |
|---|---|---|---|
| Read Gzip | Yes | Yes | read gzip files |
| Fields ranges | Yes | Yes | e.g. `-f 1-4,6` |
| Unselect fields | Yes | -- | e.g. `-1` for excluding first column |
| Fuzzy fields | Yes | -- | e.g. `ab*` for columns with name prefix "ab" |
| Reorder fields | Yes | Yes | it means `-f 1,2` is different from `-f 2,1` |
| Rename columns | Yes | -- | rename with new name(s) or from existing names |
| Sort by multiple keys | Yes | Yes | bash sort-like operations |
| Sort by number | Yes | -- | e.g. `-k 1:n` |
| Multiple sort | Yes | -- | e.g. `-k 2:r -k 1:nr` |
| Pretty output | Yes | Yes | convert CSV to readable aligned table |
| Unique data | Yes | -- | unique data of selected fields |
| Frequency | Yes | -- | frequencies of selected fields |
| Sampling | Yes | -- | sampling by proportion |
| Mutate fields | Yes | -- | create new columns from selected fields |
| Replace | Yes | -- | replace data of selected fields |

Similar tools:

csvkit - A suite of utilities for converting to and working with CSV, the king of tabular file formats. http://csvkit.rtfd.org/
xsv - A fast CSV toolkit written in Rust.
miller - Miller is like sed, awk, cut, join, and sort for name-indexed data such as CSV and tabular JSON http://johnkerl.org/miller
tsv-utils - Command line utilities for tab-separated value files written in the D programming language.

Examples

More examples and tutorial.

Attention

The CSV parser requires all lines to have the same number of fields/columns. Even lines with only spaces will cause an error. Use '-I/--ignore-illegal-row' to skip these lines if necessary.
By default, csvtk assumes your files have a header row; if not, switch on flag -H.
Column names should be unique.
By default, lines starting with # are ignored. If the header row starts with #, assign another rare symbol to flag -C, e.g. '$'.
By default, csvtk handles CSV files; use flag -t for tab-delimited files.
If " exists in tab-delimited files, use flag -l.
Do not mix field (column) numbers and names.
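Since rows with a differing number of fields cause a parse error, it can help to look for them before running csvtk. The following is a rough sketch with awk, not part of csvtk. It splits on bare commas, so it assumes there are no quoted fields containing commas or newlines. The file name and data are made up.

```shell
# Hypothetical sample: the last row is missing a field.
tmpdir=$(mktemp -d)
printf '%s\n' 'id,name,age' '1,alice,30' '2,bob,25' '3,carol' > "$tmpdir/ragged.csv"

# Report every line whose field count differs from the header's.
bad=$(awk -F, '
    NR == 1  { n = NF; next }
    NF != n  { printf "line %d: %d fields, expected %d\n", NR, NF, n }
' "$tmpdir/ragged.csv")
printf '%s\n' "$bad"

rm -rf "$tmpdir"
```

Lines reported this way are candidates for fixing by hand or for skipping with the `-I/--ignore-illegal-row` flag mentioned above.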

Examples

Pretty result

$ csvtk pretty names.csv
id   first_name   last_name   username
11   Rob          Pike        rob
2    Ken          Thompson    ken
4    Robert       Griesemer   gri
1    Robert       Thompson    abc
NA   Robert       Abel        123

Summary of selected numeric fields, supporting "group-by"

$ cat testdata/digitals2.csv \
    | csvtk summary --ignore-non-digits --fields f4:sum,f5:sum --groups f1,f2 \
    | csvtk pretty
f1    f2     f4:sum   f5:sum
bar   xyz    7.00     106.00
bar   xyz2   4.00     4.00
foo   bar    6.00     3.00
foo   bar2   4.50     5.00

Select fields/columns (cut)

- By index: `csvtk cut -f 1,2`
- By names: `csvtk cut -f first_name,username`
- **Unselect**: `csvtk cut -f -1,-2` or `csvtk cut -f -first_name`
- **Fuzzy fields**: `csvtk cut -F -f "*_name,username"`
- Field ranges: `csvtk cut -f 2-4` for column 2,3,4 or `csvtk cut -f -3--1` for discarding column 1,2,3
- All fields: `csvtk cut -F -f "*"`

Search by selected fields (grep) (matched parts will be highlighted in red)

- By exact matching: `csvtk grep -f first_name -p Robert -p Rob`
- By regular expression: `csvtk grep -f first_name -r -p Rob`
- By pattern list: `csvtk grep -f first_name -P name_list.txt`
- Remove rows containing missing data (empty cells): `csvtk grep -F -f "*" -r -p "^$" -v`

Rename column names (rename and rename2)

- Setting new names: `csvtk rename -f A,B -n a,b` or `csvtk rename -f 1-3 -n a,b,c`
- Replacing with original names by regular expression: `cat ../testdata/c.csv | ./csvtk rename2 -F -f "*" -p "(.*)" -r 'prefix_$1'` for adding prefix to all column names.

Edit data with regular expression (replace)

- Remove Chinese characters: `csvtk replace -F -f "*_name" -p "\p{Han}+" -r ""`

Create new column from selected fields by regular expression (mutate)

- By default, copy a column: `csvtk mutate -f id`
- Extract prefix of data as group name (get "A" from "A.1" as group name):
  `csvtk mutate -f sample -n group -p "^(.+?)\."`
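To show what the capture group in that pattern extracts, here is a small illustration with `sed` rather than csvtk. Because `sed -E` has no non-greedy `+?`, the sketch uses `[^.]+` to stop at the first dot instead. The sample values are made up.

```shell
# Hypothetical sample values of the form <group>.<replicate>.
samples=$(printf '%s\n' A.1 A.2 B.1)

# Keep only the text before the first dot, i.e. the captured group.
group_names=$(printf '%s\n' "$samples" | sed -E 's/^([^.]+)\..*$/\1/' | tr '\n' ' ')
echo "groups: $group_names"
```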

Sort by multiple keys (sort)

- By single column : `csvtk sort -k 1` or `csvtk sort -k last_name`
- By multiple columns: `csvtk sort -k 1,2` or `csvtk sort -k 1 -k 2` or `csvtk sort -k last_name,age`
- Sort by number: `csvtk sort -k 1:n` or  `csvtk sort -k 1:nr` for reverse number
- Complex sort: `csvtk sort -k region -k age:n -k id:nr`
- In natural order: `csvtk sort -k chr:N`
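The difference between a text sort and the `:n` numeric sort is the same one that coreutils `sort` exposes. Here is a small illustration with plain `sort` (not csvtk itself), using the numeric `id` values from the `names.csv` example above. It assumes csvtk's default comparison is textual, as it is for `sort`.

```shell
ids=$(printf '%s\n' 11 2 4 1)

# Text order compares character by character, so "11" sorts before "2".
text_order=$(printf '%s\n' "$ids" | LC_ALL=C sort | tr '\n' ' ')
echo "text:    $text_order"

# Numeric order compares values, like the :n modifier described above.
num_order=$(printf '%s\n' "$ids" | LC_ALL=C sort -n | tr '\n' ' ')
echo "numeric: $num_order"
```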

Join multiple files by keys (join)

- All files have same key column: `csvtk join -f id file1.csv file2.csv`
- Files have different key columns: `csvtk join -f "username;username;name" names.csv phone.csv adress.csv -k`

Filter by numbers (filter)

- Single field: `csvtk filter -f "id>0"`
- **Multiple fields**: `csvtk filter -f "1-3>0"`
- Using `--any` to print a record if any of the fields satisfies the condition: `csvtk filter -f "1-3>0" --any`
- **Fuzzy fields**: `csvtk filter -F -f "A*!=0"`

Filter rows by awk-like arithmetic/string expressions (filter2)

- Using field index: `csvtk filter2 -f '$3>0'`
- Using column names: `csvtk filter2 -f '$id > 0'`
- Both arithmetic and string expressions: `csvtk filter2 -f '$id > 3 || $username=="ken"'`
- More complicated: `csvtk filter2 -H -t -f '$1 > 2 && $2 % 2 == 0'`

Plotting
- plot histogram with data of the second column:
```
csvtk -t plot hist testdata/grouped_data.tsv.gz -f 2 | display
```
![histogram.png](testdata/figures/histogram.png)

- plot boxplot with data of the "GC Content" (third) column,
group information is the "Group" column.

        csvtk -t plot box testdata/grouped_data.tsv.gz -g "Group" \
            -f "GC Content" --width 3 | display

  ![boxplot.png](testdata/figures/boxplot.png)

- plot horizontal boxplot with data of the "Length" (second) column,
group information is the "Group" column.

        csvtk -t plot box testdata/grouped_data.tsv.gz -g "Group" -f "Length"  \
            --height 3 --width 5 --horiz --title "Horiz box plot" | display

  ![boxplot2.png](testdata/figures/boxplot2.png)

- plot line plot with X-Y data

        csvtk -t plot line testdata/xy.tsv -x X -y Y -g Group | display

  ![lineplot.png](testdata/figures/lineplot.png)

- plot scatter plot with X-Y data

        csvtk -t plot line testdata/xy.tsv -x X -y Y -g Group --scatter | display

  ![scatter.png](testdata/figures/scatter.png)

Acknowledgements

We are grateful to Zhiluo Deng and Li Peng for suggesting features and reporting bugs.

Thanks to Albert Vilella for feature suggestions, which make csvtk feature-rich.

Contact

Create an issue to report bugs, propose new functions or ask for help.

Or leave a comment.

License

MIT License

Starchart

*Note that all licence references and agreements mentioned in the csvtk README section above are relevant to that project's source code only.
