spaGO v0.4.0 Release Notes

Release Date: 2021-01-17
    Added

    • Various new test cases (improving the coverage).
    • nlp.embeddings.syncmap package.
    • ml.nn.recurrent.srnn.BiModel which implements a bidirectional variant of the Shuffling Recurrent Neural Networks (SRNN).
    • Configurable timeout and request limit for all HTTP and gRPC servers (see also the commands' help).

    Changed

    • The implementation of all CLI commands has been refactored, so that the docker-entrypoint can reuse all other cli.App objects instead of just running separate executables. As a result, the Dockerfile now builds a single executable file, and the final image is much smaller.
    • All dependencies have been upgraded to their latest versions.
    • Custom error definitions have been simplified, using fmt.Errorf instead of functions from github.com/pkg/errors.
    • Custom binary data serialization of matrices and models is now achieved with Go's encoding/gob. Many specific functions and methods are now replaced by fewer and simpler encoding/decoding methods compatible with gob. A list of important related changes follows.
      • utils.kvdb.KeyValueDB is no longer an interface, but a struct which directly implements the former "badger backend".
      • utils.SerializeToFile and utils.DeserializeFromFile now handle generic interface{} objects, instead of values implementing Serializer and Deserializer.
      • mat32 and mat64 custom serialization functions (e.g. MarshalBinarySlice, MarshalBinaryTo, ...) are replaced by implementations of BinaryMarshaler and BinaryUnmarshaler interfaces on Dense and Sparse matrix types.
      • PositionalEncoder.Cache and AxialPositionalEncoder.Cache fields (from ml.encoding.pe package) are now public.
      • All types implementing nn.Model interface are registered for gob serialization (in init functions).
      • embeddings.Model.UsedEmbeddings type is now nlp.embeddings.syncmap.Map.
      • As a consequence, you will have to re-serialize all your models.
    • Flair converter now sets the vocabulary directly in the model, instead of creating a separate file.
    • sequencelabeler.Model.LoadParams has been renamed to Load.
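
    The gob-related changes above can be illustrated with a minimal sketch. It is not spaGO code: Model, Linear, encodeModel and decodeModel are hypothetical names chosen for this example. It assumes the usual gob pattern, in which a concrete type is registered in an init function so that it can be encoded and decoded through an interface value, and it wraps errors with fmt.Errorf and the %w verb in place of helpers from github.com/pkg/errors.

```go
package main

import (
	"bytes"
	"encoding/gob"
	"fmt"
)

// Model is a hypothetical stand-in for an interface such as nn.Model.
type Model interface {
	Name() string
}

// Linear is a hypothetical concrete model. gob only encodes exported fields.
type Linear struct {
	Weights []float32
	Bias    float32
}

// Name identifies the model kind.
func (l *Linear) Name() string { return "linear" }

func init() {
	// A concrete type stored in an interface value must be registered
	// before it can be encoded or decoded.
	gob.Register(&Linear{})
}

// encodeModel serializes m as an interface value, so that the concrete
// type name travels with the data.
func encodeModel(m Model) ([]byte, error) {
	var buf bytes.Buffer
	if err := gob.NewEncoder(&buf).Encode(&m); err != nil {
		return nil, fmt.Errorf("encode model: %w", err)
	}
	return buf.Bytes(), nil
}

// decodeModel restores a Model whose concrete type was registered.
func decodeModel(data []byte) (Model, error) {
	var m Model
	if err := gob.NewDecoder(bytes.NewReader(data)).Decode(&m); err != nil {
		return nil, fmt.Errorf("decode model: %w", err)
	}
	return m, nil
}

func main() {
	data, err := encodeModel(&Linear{Weights: []float32{0.5, -1.25}, Bias: 2})
	if err != nil {
		panic(err)
	}
	m, err := decodeModel(data)
	if err != nil {
		panic(err)
	}
	fmt.Println(m.Name(), m.(*Linear).Weights)
}
```

    Because the gob stream carries the registered type name, data written in an older custom binary format cannot be read by a decoder like this one, which is consistent with the need to re-serialize existing models.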

    Removed

    • In relation to the aforementioned gob serialization changes:
      • nn.ParamSerializer and related functions
      • nn.ParamsSerializer and related functions
      • utils.Serializer and utils.Deserializer interfaces
      • utils.ReadFull function
    • sequencelabeler.Model.LoadVocabulary

    Fixed

    • docker-entrypoint sub-command hugging-face-importer has been renamed to huggingface-importer, just like the main command itself.
    • docker-entrypoint sub-command can be correctly specified without leading ./ or / when run from a Docker container.
    • BREAKING: mat32.Matrix serialization has been fixed, now serializing single values to chunks of 4 bytes (instead of 8, like float64). Serialized 32-bit models will now be half the size! Unfortunately you will have to re-serialize your models (sorry!).
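
    The size difference described in the last item can be shown with a short sketch. This is not the mat32 implementation: marshalFloat32s, marshalFloat64s and unmarshalFloat32s are hypothetical helpers, and the little-endian byte order is an assumption made only for this example. It shows that writing float32 values in 4-byte chunks yields half the bytes of the same values written as float64.

```go
package main

import (
	"bytes"
	"encoding/binary"
	"fmt"
)

// marshalFloat32s (hypothetical helper) writes each value as 4 bytes.
// Little-endian order is an assumption of this sketch.
func marshalFloat32s(values []float32) ([]byte, error) {
	var buf bytes.Buffer
	if err := binary.Write(&buf, binary.LittleEndian, values); err != nil {
		return nil, fmt.Errorf("marshal float32 values: %w", err)
	}
	return buf.Bytes(), nil
}

// marshalFloat64s (hypothetical helper) writes each value as 8 bytes.
func marshalFloat64s(values []float64) ([]byte, error) {
	var buf bytes.Buffer
	if err := binary.Write(&buf, binary.LittleEndian, values); err != nil {
		return nil, fmt.Errorf("marshal float64 values: %w", err)
	}
	return buf.Bytes(), nil
}

// unmarshalFloat32s (hypothetical helper) reads back 4-byte chunks.
func unmarshalFloat32s(data []byte) ([]float32, error) {
	if len(data)%4 != 0 {
		return nil, fmt.Errorf("unmarshal float32 values: length %d is not a multiple of 4", len(data))
	}
	values := make([]float32, len(data)/4)
	if err := binary.Read(bytes.NewReader(data), binary.LittleEndian, values); err != nil {
		return nil, fmt.Errorf("unmarshal float32 values: %w", err)
	}
	return values, nil
}

func main() {
	b32, err := marshalFloat32s([]float32{1.5, -2, 0.25})
	if err != nil {
		panic(err)
	}
	b64, err := marshalFloat64s([]float64{1.5, -2, 0.25})
	if err != nil {
		panic(err)
	}
	// Three values: 12 bytes as float32, 24 bytes as float64.
	fmt.Println(len(b32), len(b64))
}
```

    Data written with 8 bytes per value cannot be read back in 4-byte chunks without producing wrong values, which is why this fix is a breaking change.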