The PyTorch Foundation, maker of the PyTorch machine learning framework, has launched torchao, a PyTorch-native library that makes models faster and smaller by leveraging low-bit dtypes, sparsity, and quantization. It is a toolkit of techniques that span both training and inference, Team PyTorch said.

Unveiled September 26, torchao works with torch.compile() and FSDP2 on most PyTorch models on Hugging Face. A library for custom data types and optimizations, torchao is positioned to make models smaller and faster out of the box. Users can quantize and sparsify weights, gradients, optimizers, and activations for both inference and training. Most of the techniques are written in easy-to-read PyTorch code, according to Team PyTorch. Featured is torchao.float8, for accelerating training with float8 in native PyTorch.
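To give a sense of what quantizing weights means in practice, here is a minimal conceptual sketch of symmetric int8 quantization, written in plain Python so it runs without any installation. It is not torchao's code or API; the function names and the sample weights are hypothetical, chosen only for illustration.

```python
# Conceptual sketch only: symmetric per-tensor int8 weight quantization in
# plain Python. This is NOT torchao's implementation or API.

def quantize_int8(weights):
    """Map floats to integer codes in [-127, 127] plus one float scale."""
    max_abs = max(abs(w) for w in weights)
    # Guard against an all-zero input, which would otherwise give a zero scale.
    scale = max_abs / 127 if max_abs > 0 else 1.0
    codes = [max(-127, min(127, round(w / scale))) for w in weights]
    return codes, scale


def dequantize_int8(codes, scale):
    """Recover approximate float weights from the integer codes."""
    return [c * scale for c in codes]


weights = [0.42, -1.27, 0.05, 0.9]  # hypothetical float weights
codes, scale = quantize_int8(weights)
restored = dequantize_int8(codes, scale)

# Rounding moves each value by at most half a quantization step (scale / 2).
max_error = max(abs(a - b) for a, b in zip(weights, restored))
```

Each weight is stored as a one-byte integer code instead of a four-byte float32 value, plus a single shared scale. That is where the "smaller" comes from, at the cost of a small rounding error per weight.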

Released under a BSD 3-Clause license, torchao makes liberal use of new features in PyTorch and is recommended for use with the current nightly or latest stable release of PyTorch, Team PyTorch advises.