pytorch / nestedtensor

The nestedtensor package

The nestedtensor package is now on the road to a prototype release (end of October 2020).

It is developed against a fork of PyTorch to enable cutting-edge features such as improved performance or better torch.vmap integration.

Developers will thus need to build from source, but users can use the binaries we will ship on a nightly basis once the prototype is released.

Who can use this?

If you want to use the binaries, you need to run on Linux, use Python 3.8+, and have a CUDA GPU with CUDA 11.

If you want to build from source, you can probably get it to work on many platforms, but supporting this won't take priority over development on the main platform. However, we're happy to review community contributions that achieve this.

Why use this?

In general we batch data for efficiency, but batched kernels need regular, statically shaped data.

One way of dealing with dynamic shapes, then, is padding and masking. Various projects construct masks that, together with a data Tensor, are used as a representation for lists of dynamically shaped Tensors.

This is inefficient from a memory and compute perspective if the Tensors within the list are sufficiently diverse in shape.
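To make that cost concrete, here is a minimal, framework-free sketch of padding and masking on plain Python lists. The helper names `pad_and_mask` and `padding_waste` and the sample data are made up for illustration; they are not part of nestedtensor or PyTorch.

```python
def pad_and_mask(sequences, pad_value=0):
    """Pad variable-length lists to a common length and build a matching mask.

    Hypothetical helper, for illustration only.
    """
    max_len = max((len(seq) for seq in sequences), default=0)
    # Every row is extended to the length of the longest row.
    data = [list(seq) + [pad_value] * (max_len - len(seq)) for seq in sequences]
    # True marks a real entry, False marks padding.
    mask = [[True] * len(seq) + [False] * (max_len - len(seq)) for seq in sequences]
    return data, mask


def padding_waste(sequences):
    """Fraction of entries in the padded batch that are padding."""
    max_len = max((len(seq) for seq in sequences), default=0)
    total = max_len * len(sequences)
    if total == 0:
        return 0.0
    real = sum(len(seq) for seq in sequences)
    return 1.0 - real / total


sequences = [[1, 2, 3, 4, 5, 6, 7, 8], [9], [10, 11]]
data, mask = pad_and_mask(sequences)
waste = padding_waste(sequences)
```

With lengths 8, 1 and 2, the padded batch stores 24 entries of which only 11 are real, so 13/24 of the memory (and of any elementwise compute over it) is spent on padding.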

You can also trace through a codebase to see where these masks are used and what kind of code they lead to (for example universal_sentence_embedding).

Alternatively, PyTorch has one-off operator support that aims to handle dynamic shapes via extra arguments such as a padding index. The upside is that these operators are fast and sometimes memory efficient, but they don't provide a consistent interface.

Other users simply gave up and started writing for-loops, or discovered that batching didn't help.

We want a single abstraction that is consistent, fast, memory efficient, and readable, and the nestedtensor project aims to provide that.

Description

NestedTensors are a generalization of torch Tensors that eases working with data of different sizes and lengths. In a nutshell, Tensors have scalar entries (e.g. floats) and NestedTensors have Tensor entries. However, note that a NestedTensor is still a Tensor. That means it needs to have a single dimension, single dtype, single device and single layout.

Tensor entry constraints

Each Tensor constituent is of the dtype, layout and device of the containing NestedTensor.
The dimension of a constituent Tensor must be less than the dimension of the NestedTensor.
An empty NestedTensor is of dimension zero.
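The constraints above can be sketched as a small checker over plain Python data. This is not the nestedtensor API: `Leaf` and `describe` are hypothetical names, a `Leaf` stands in for a regular Tensor by carrying only its metadata, and the sketch assumes that each level of nesting adds exactly one dimension.

```python
from dataclasses import dataclass
from typing import List, Tuple, Union


@dataclass(frozen=True)
class Leaf:
    """Stand-in for a regular Tensor: only the metadata the constraints mention."""
    shape: Tuple[int, ...]
    dtype: str = "float32"
    device: str = "cpu"
    layout: str = "strided"


# A nested structure is either a Leaf or a list of nested structures.
Nested = Union[Leaf, List["Nested"]]


def describe(entry: Nested):
    """Return (dim, dtype, device, layout), or raise ValueError if constituents disagree.

    Assumption: each level of nesting adds exactly one dimension.
    """
    if isinstance(entry, Leaf):
        return (len(entry.shape), entry.dtype, entry.device, entry.layout)
    if not entry:
        # The text above states that an empty NestedTensor is of dimension zero.
        return (0, None, None, None)
    # Shapes may differ between constituents; dim, dtype, device and layout may not.
    described = {describe(child) for child in entry}
    if len(described) != 1:
        raise ValueError(f"constituents disagree: {sorted(described, key=str)}")
    dim, dtype, device, layout = described.pop()
    return (dim + 1, dtype, device, layout)


ragged = [Leaf((2, 3)), Leaf((5, 3))]
result = describe(ragged)
```

Here the two constituents have different shapes but the same dimension, dtype, device and layout, so the list is accepted and reported as three-dimensional. Mixing a `float32` constituent with an `int64` one, or a 1-D constituent with a 2-D one, raises `ValueError`.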

Prerequisites

pytorch (installed from nestedtensor/third_party/pytorch submodule)
torchvision (needed for examples and tests)
ipython (needed for examples)
notebook (needed for examples)

Build for development

Get the source

git clone --recursive https://github.com/pytorch/nestedtensor
cd nestedtensor
# if you are updating an existing checkout
git submodule sync
git submodule update --init --recursive

Install the build tools

conda install numpy ninja pyyaml mkl mkl-include setuptools cmake cffi typing_extensions future six requests
conda install -c pytorch magma-cuda110

Build from scratch

./clean_build_with_submodule.sh

Incremental builds

./build_with_submodule.sh

Tutorials

Please see the notebooks under examples.

Contribution

The project is under active development. If you have a suggestion or found a bug, please file an issue!
