libtextcat alternatives and similar packages
Based on the "Natural Language Processing" category.
- prose: DISCONTINUED. A Golang library for text processing, including tokenization, part-of-speech tagging, and named-entity extraction.
- gse: Efficient multilingual NLP and text segmentation for Go; supports English, Chinese, Japanese, and others.
- universal-translator: i18n translator for Go/Golang using CLDR data and pluralization rules.
- locales: A set of locales generated from the CLDR Project that can be used independently or within an i18n package; built for use with, but not exclusive to, https://github.com/go-playground/universal-translator
- go-nlp: DISCONTINUED. Utilities for working with discrete probability distributions and other tools useful for NLP work.
- segment: A Go library for performing Unicode Text Segmentation as described in Unicode Standard Annex #29.
- go-localize: i18n (internationalization and localization) engine written in Go, used for translating locale strings.
- gotokenizer: A tokenizer for Go based on a dictionary and bigram language models. (Currently supports only Chinese segmentation.)
README
About
Cgo binding for the libtextcat C library. Compatibility with version 2.2 is guaranteed.
Installation
Installation takes a few simple steps. They may differ slightly on your target system (e.g. require more permissions), so adapt them as needed.
Get libtextcat C library code
- Download the original libtextcat archive from the libtextcat download section.
- Extract it.
NOTE: If the download link does not work or the download fails, a snapshot of stable version 2.2 is saved in Downloads.
Build and install libtextcat C library
From the directory where you extracted libtextcat, run:
./configure
make
sudo make install
sudo ldconfig
Install Go wrapper
go get github.com/goodsign/libtextcat
go test github.com/goodsign/libtextcat   # must PASS
Installation notes
Make sure that your local library paths are set correctly and that the installation succeeded. Otherwise, go build or go test may fail.
libtextcat installs its libraries into your local library directory (e.g. /usr/local/lib). This path must be registered with your system (by running ldconfig, exporting LD_LIBRARY_PATH, etc.), or linking will fail.
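As a rough diagnostic for the library path issue described above, the following Go sketch lists files named lib<name>.* in a few candidate directories. The FindLibrary helper and the directory list are illustrative assumptions, not part of the binding. Finding a file this way does not prove that the dynamic linker cache maintained by ldconfig knows about it.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// FindLibrary returns every file in dirs whose name starts with
// "lib<name>." (for example libtextcat.so or libtextcat.a).
// It is a hypothetical helper and only checks that files exist.
func FindLibrary(name string, dirs []string) []string {
	var found []string
	for _, dir := range dirs {
		if dir == "" {
			continue // skip empty entries, e.g. from an unset variable
		}
		matches, err := filepath.Glob(filepath.Join(dir, "lib"+name+".*"))
		if err != nil {
			continue // Glob only fails on a malformed pattern
		}
		found = append(found, matches...)
	}
	return found
}

func main() {
	// Assumed candidate directories; adjust to your system.
	dirs := []string{"/usr/local/lib", "/usr/lib"}
	dirs = append(dirs, filepath.SplitList(os.Getenv("LD_LIBRARY_PATH"))...)

	hits := FindLibrary("textcat", dirs)
	if len(hits) == 0 {
		fmt.Println("libtextcat not found; check the install step and your library path")
		return
	}
	for _, h := range hits {
		fmt.Println(h)
	}
}
```

If nothing is listed, rerunning the install step or adding the install directory to LD_LIBRARY_PATH is a reasonable next thing to try.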
Usage
cat, err := NewTextCat(ConfigPath) // See the 'Usage notes' section
if err != nil {
    // ... Handle error ...
}
defer cat.Close()

matches, err := cat.Classify(text)
if err != nil {
    // ... Handle error ...
}
// Use matches.
// NOTE: matches[0] is the best match.
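The "best match" convention above can be wrapped in a small helper. This is a minimal sketch, assuming that Classify returns a slice of language identifiers as strings. The Classifier interface, BestMatch, ErrNoMatch and fakeClassifier are hypothetical names introduced here so that the logic can run without the cgo library installed.

```go
package main

import (
	"errors"
	"fmt"
)

// Classifier is a hypothetical interface covering the single method used
// here. The return type []string is an assumption about the binding.
type Classifier interface {
	Classify(text string) ([]string, error)
}

// ErrNoMatch is returned when the classifier yields an empty result.
var ErrNoMatch = errors.New("no language matched")

// BestMatch returns the first match, which the README describes as the best.
func BestMatch(c Classifier, text string) (string, error) {
	matches, err := c.Classify(text)
	if err != nil {
		return "", fmt.Errorf("classify: %w", err)
	}
	if len(matches) == 0 {
		return "", ErrNoMatch
	}
	return matches[0], nil
}

// fakeClassifier stands in for the cgo-backed classifier in this sketch.
type fakeClassifier struct {
	matches []string
	err     error
}

func (f fakeClassifier) Classify(string) ([]string, error) {
	return f.matches, f.err
}

func main() {
	c := fakeClassifier{matches: []string{"en", "nl"}}
	lang, err := BestMatch(c, "some text to classify")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("best match:", lang)
}
```

Guarding against an empty slice avoids an index-out-of-range panic if a classifier ever returns no candidates without an error.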
Usage notes
The libtextcat library must load language models before it can guess languages. These models are set up through a configuration file and a number of language model (.lm) files.
The configuration file maps .lm files to the identifiers used in the library. See example. The path to this file is passed to the NewTextCat call.
.lm files contain the language patterns and their frequencies for a given language. See example. The paths to these files are specified in the configuration file above. They can be absolute or relative (to the caller).
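To make the mapping concrete, here is a minimal Go sketch that reads configuration text and resolves relative .lm paths against a base directory, such as the caller's working directory. It assumes that each line holds a path followed by an identifier, separated by whitespace, and that # starts a comment. Check that layout against the example configuration before relying on it. Entry, ParseConfigText and the sample file names are hypothetical.

```go
package main

import (
	"bufio"
	"fmt"
	"path/filepath"
	"strings"
)

// Entry pairs a language model file with the identifier it is mapped to.
type Entry struct {
	Path string
	ID   string
}

// ParseConfigText parses lines of the assumed form "<path> <id>".
// Relative paths are joined onto baseDir; absolute paths are kept as-is.
func ParseConfigText(text, baseDir string) ([]Entry, error) {
	var entries []Entry
	sc := bufio.NewScanner(strings.NewReader(text))
	for lineNo := 1; sc.Scan(); lineNo++ {
		line := sc.Text()
		if i := strings.IndexByte(line, '#'); i >= 0 {
			line = line[:i] // drop comments (assumed syntax)
		}
		fields := strings.Fields(line)
		if len(fields) == 0 {
			continue // blank or comment-only line
		}
		if len(fields) < 2 {
			return nil, fmt.Errorf("line %d: want \"<path> <id>\", got %q", lineNo, sc.Text())
		}
		p := fields[0]
		if !filepath.IsAbs(p) {
			p = filepath.Join(baseDir, p)
		}
		entries = append(entries, Entry{Path: p, ID: fields[1]})
	}
	if err := sc.Err(); err != nil {
		return nil, err
	}
	return entries, nil
}

func main() {
	// Hypothetical configuration contents for illustration only.
	conf := "# language models\nlm/english.lm en\n/opt/models/russian.lm ru\n"
	entries, err := ParseConfigText(conf, "/srv/app")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	for _, e := range entries {
		fmt.Printf("%s -> %s\n", e.ID, e.Path)
	}
}
```

Checking each resolved path with os.Stat before calling NewTextCat is one way to turn a missing model file into a clear error message.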
Quickstart
To get started immediately, copy the /defaultcfg folder into the directory of your target project and use:
cat, err := NewTextCat("defaultcfg/conf.txt")
This will give you a standard set of languages described in the Default configuration section below.
Default configuration
This package contains a default configuration (/defaultcfg) designed for the following conditions:
- UTF-8 only languages
- The language list is taken from the [snowball](github.com/goodsign/snowball) package
- Language identifiers are the same as in the [snowball](github.com/goodsign/snowball) package
This configuration is meant to be used together with the [snowball](github.com/goodsign/snowball) package.
More info
For more information on libtextcat, refer to the original website, which contains links to the underlying theory and other details.
libtextcat Licence
The libtextcat library is released under the BSD Licence
Licence
The goodsign/libtextcat binding is released under the BSD Licence
*Note that all licence references and agreements mentioned in the libtextcat README section above
are relevant to that project's source code only.