1. Interface Unification
1.1 Abstract the training pipeline into uniform modules:
    *Augmentor*: augments the training images. Base class: `Model.BasicAugmentor`.
    *PreProcessor*: generates the target heatmaps, from the augmented images and the keypoint annotations, for the model to regress against. Base class: `Model.BasicPreProcessor`.
    *PostProcessor*: decodes the detected human body joints from the heatmaps predicted by the model. Base class: `Model.BasicPostProcessor`.
    *Visualizer*: visualizes the predicted heatmaps and the detected humans together with the original images. Base class: `Model.BasicVisualizer`.
    The differences between the training procedures of different pose estimation methods are thus confined to pre-processing (handled by the PreProcessor), post-processing (handled by the PostProcessor) and visualization (handled by the Visualizer). Inherit the corresponding base class, implement its member functions according to the provided function protocols, and then use the `Config` module to register these three custom modules to build any pose estimation pipeline you want.
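Concretely, the split between Augmentor, PreProcessor and PostProcessor can be sketched in plain Python. The class and method names below are illustrative stand-ins, not the real HyperPose signatures:

```python
import numpy as np

class MyAugmentor:
    """Augment a training image (here: a simple horizontal flip)."""
    def process(self, image, keypoints):
        h, w = image.shape[:2]
        flipped = image[:, ::-1]                          # mirror the pixels
        kpts = [(w - 1 - x, y) for (x, y) in keypoints]   # mirror the x coords
        return flipped, kpts

class MyPreProcessor:
    """Turn keypoint annotations into target heatmaps for regression."""
    def process(self, image, keypoints, hout=46, wout=46):
        h, w = image.shape[:2]
        heatmap = np.zeros((len(keypoints), hout, wout), np.float32)
        for i, (x, y) in enumerate(keypoints):
            heatmap[i, int(y * hout / h), int(x * wout / w)] = 1.0  # peak at joint
        return heatmap

class MyPostProcessor:
    """Decode joint locations back out of predicted heatmaps."""
    def process(self, heatmap):
        return [np.unravel_index(np.argmax(c), c.shape) for c in heatmap]

aug, pre, post = MyAugmentor(), MyPreProcessor(), MyPostProcessor()
image = np.zeros((368, 432, 3), np.uint8)
image2, kpts2 = aug.process(image, [(100, 200)])   # (x, y) keypoints
heatmaps = pre.process(image2, kpts2)
joints = post.process(heatmaps)                    # [(row, col)] per joint
```

The point of the split is that only these three pieces change between pose estimation methods; the training loop itself stays fixed.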
1.2 Abstract a uniform model API protocol
    Unify the model APIs as `forwarding`, `cal_loss` and `infer`, and provide a base class `Model.BasicModel` for model customization. Inherit the base class to implement any pose estimation model you want.
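A minimal plain-Python sketch of the three-method protocol (illustrative only; the real `Model.BasicModel` operates on TensorFlow tensors and has different signatures):

```python
import numpy as np

class BasicModelSketch:
    """Illustrative stand-in for the three-method model protocol."""
    def forwarding(self, x):        # run the network, produce heatmaps
        raise NotImplementedError
    def cal_loss(self, pred, gt):   # compare predictions with targets
        raise NotImplementedError
    def infer(self, x):             # forward pass as used at inference time
        raise NotImplementedError

class IdentityModel(BasicModelSketch):
    """A trivial 'model' whose forward pass returns its input."""
    def forwarding(self, x):
        return np.asarray(x, np.float32)
    def cal_loss(self, pred, gt):
        return float(np.mean((pred - gt) ** 2))   # mean squared error
    def infer(self, x):
        return self.forwarding(x)

model = IdentityModel()
x = np.ones((2, 2), np.float32)
loss = model.cal_loss(model.forwarding(x), np.zeros((2, 2), np.float32))
```

Any class providing these three methods plugs into the training pipeline in the same way.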

2. Additional Handy Components
2.1 Metric Manager
    Introduce `Model.MetricManager`, which provides `update` and `report` functions to accumulate loss statistics during training and to generate report messages for logging.
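The update/report pattern can be sketched with a simple running-average manager (an illustrative stand-in, not the real `Model.MetricManager` API):

```python
from collections import defaultdict

class MetricManagerSketch:
    """Illustrative running-average manager (not the real Model.MetricManager)."""
    def __init__(self):
        self.sums = defaultdict(float)
        self.counts = defaultdict(int)

    def update(self, metrics):
        # Accumulate one batch worth of named loss values.
        for name, value in metrics.items():
            self.sums[name] += value
            self.counts[name] += 1

    def report(self):
        # One "name=mean" entry per tracked metric, ready for a log line.
        return " ".join(f"{name}={self.sums[name] / self.counts[name]:.4f}"
                        for name in sorted(self.sums))

manager = MetricManagerSketch()
manager.update({"loss_heatmap": 0.8})
manager.update({"loss_heatmap": 0.4})
message = manager.report()
```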
2.2 Image Processor
    Introduce `Model.ImageProcessor`, which provides interfaces to read, pad and scale images, making it easy to convert images into the model input format.
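The pad-and-scale idea can be sketched in NumPy (an illustrative stand-in for `Model.ImageProcessor`; the real interface differs):

```python
import numpy as np

def pad_to_input(image, hin=368, win=432):
    """Scale an image to fit (hin, win) preserving aspect ratio, then pad.

    Returns the padded image and the scale factor (needed later to map
    detected joints back to the original image coordinates).
    Illustrative sketch only.
    """
    h, w = image.shape[:2]
    scale = min(hin / h, win / w)
    nh, nw = int(h * scale), int(w * scale)
    # Nearest-neighbour resize with pure NumPy indexing.
    ys = (np.arange(nh) / scale).astype(int).clip(0, h - 1)
    xs = (np.arange(nw) / scale).astype(int).clip(0, w - 1)
    resized = image[ys][:, xs]
    padded = np.zeros((hin, win) + image.shape[2:], image.dtype)
    padded[:nh, :nw] = resized         # zero-pad the remainder
    return padded, scale

img = np.ones((100, 200, 3), np.uint8)
out, scale = pad_to_input(img)
```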
2.3 Weight Examination
    Introduce weight-examination APIs in the Model module for models, npz files and npz_dict files, making it easy to examine model weights.

3. Issue Fixes
3.1 Python demo
    Provide `python_demo.py` to replace the old, problematic demo program `infer.py`. `python_demo.py` makes it easy to try out npz model weights (both our pretrained weights and weights trained by users themselves), and also demonstrates the usage of the PostProcessor, Visualizer and ImageProcessor modules.
3.2 Shape mismatch
    Fix the shape mismatch that occurred when loading pretrained model weights. (The issue was introduced by a version incompatibility.)
3.3 Domain adaptation
    Fix the domain adaptation loss calculation and optimization issue that occurred inside the TensorFlow tape scope, and wrap the whole domain adaptation data pipeline into the `Domainadapt_dataset` class. Domain adaptation is now ready for practical use.
3.4 Parallel training
    Adapt to the new KungFu APIs for parallel and distributed training.
3.5 Other
    Fix other known issues in the Processor modules, such as the TensorFlow eager tensor / NumPy ndarray compatibility issue and the pyplot value-clipping issue.

4. Standardization
4.1 Logging standardization
    Use standard file and stdout streams in the logging module to output logs, and split HyperPose's logging information into three parts, [DATA], [MODEL] and [TRAIN], to regulate the output. Also standardize the output string format for human body joints.
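A sketch of the three-part prefix scheme using Python's standard `logging` module (illustrative; not the actual HyperPose logging setup, which also writes to a log file):

```python
import logging
import sys

def get_part_logger(part, stream=sys.stdout):
    """Return a logger whose lines carry a [DATA]/[MODEL]/[TRAIN] prefix.

    Illustrative sketch of the logging split only.
    """
    assert part in ("DATA", "MODEL", "TRAIN")
    logger = logging.getLogger(f"hyperpose_sketch.{part}")
    logger.setLevel(logging.INFO)
    logger.propagate = False                  # keep lines out of the root logger
    if not logger.handlers:                   # avoid stacking duplicate handlers
        handler = logging.StreamHandler(stream)
        handler.setFormatter(logging.Formatter(f"[{part}] %(message)s"))
        logger.addHandler(handler)
    return logger

get_part_logger("TRAIN").info("epoch=1 step=100 loss_heatmap=0.4213")
```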
4.2 Channel format standardization
    Adapt all pre-processing, post-processing and visualization functions to accept `channel_first` data by default, making the system clearer.
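Converting between the channels-last (H, W, C) layout produced by most image readers and the `channel_first` (C, H, W) layout is a single transpose, for example:

```python
import numpy as np

def to_channel_first(image_hwc):
    """(H, W, C) -> (C, H, W), the layout the processors now default to."""
    return np.transpose(image_hwc, (2, 0, 1))

def to_channel_last(image_chw):
    """(C, H, W) -> (H, W, C), e.g. before handing the image to pyplot."""
    return np.transpose(image_chw, (1, 2, 0))

img = np.zeros((368, 432, 3), dtype=np.uint8)
chw = to_channel_first(img)
```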
4.3 Model weight format standardization
    Change the default format of all model-weight loading and saving procedures from `npz` to `npz_dict`, making weight conversion and examination more convenient. (`npz`-format weights are ordered arrays, while `npz_dict`-format weights form a dictionary, which makes it easier to locate and examine a specific weight.)
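The practical difference between the two formats can be seen with plain NumPy (the weight names below are made up for illustration):

```python
import os
import tempfile
import numpy as np

weights = {"conv1_kernel": np.ones((3, 3), np.float32),
           "conv1_bias": np.zeros(3, np.float32)}
tmpdir = tempfile.mkdtemp()

# npz-style: an ordered sequence of arrays, addressed only by position.
np.savez(os.path.join(tmpdir, "model.npz"), *weights.values())
# npz_dict-style: each weight is keyed by name, easy to locate and examine.
np.savez(os.path.join(tmpdir, "model_dict.npz"), **weights)

by_pos = np.load(os.path.join(tmpdir, "model.npz"))
by_name = np.load(os.path.join(tmpdir, "model_dict.npz"))
kernel = by_pos["arr_0"]            # positional access: fragile across versions
bias = by_name["conv1_bias"]        # named access: self-describing
```

With positional `npz`, renaming or reordering layers silently shifts every index; with named `npz_dict`, each weight can be looked up and inspected directly.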
4.4 Help information standardization
    Add help information about variable definitions, object definitions, basic and custom usage of the development platform, and the additional features available when constructing models.
4.5 Tidy up all 10 provided model backbones.
4.6 Tidy up the custom APIs in the Config module.

HyperPose

HyperPose is a library for building high-performance custom pose estimation applications.

Features

HyperPose has two key features:

  • High-performance pose estimation with CPUs/GPUs: HyperPose achieves real-time pose estimation through a high-performance pose estimation engine. This engine implements numerous system optimisations: pipeline parallelism, model inference with TensorRT, CPU/GPU hybrid scheduling, and many others. These optimisations contribute to up to 10x higher FPS compared to OpenPose, TF-Pose and OpenPifPaf.
  • Flexibility for developing custom pose estimation models: HyperPose provides high-level Python APIs to develop pose estimation models. HyperPose users can:
    • Customise training, evaluation, visualisation, pre-processing and post-processing in pose estimation.
    • Customise model architectures (e.g., OpenPose, Pifpaf, PoseProposal Network) and training datasets.
    • Speed up training with multiple GPUs.

Demo

新宝岛 ("Xin Bao Dao") dance video with HyperPose (Lightweight OpenPose model)

Quick Start

The HyperPose library contains two parts:

  • A C++ library for high-performance pose estimation model inference.
  • A Python library for developing custom pose estimation models.

C++ inference library

The easiest way to use the inference library is through a Docker image. The image requires an NVIDIA GPU with working NVIDIA Docker support (the commands below use `--gpus all`).

Run this command to check whether the pre-requisites are ready:

wget https://raw.githubusercontent.com/tensorlayer/hyperpose/master/scripts/test_docker.py -qO- | python

Once pre-requisites are ready, pull the HyperPose docker:

docker pull tensorlayer/hyperpose

We provide four examples within this image (the following commands have been tested on Ubuntu 18.04):

# [Example 1]: Doing inference on given video, copy the output.avi to the local path.
docker run --name quick-start --gpus all tensorlayer/hyperpose --runtime=stream
docker cp quick-start:/hyperpose/build/output.avi .
docker rm quick-start


# [Example 2](X11 server required to see the imshow window): Real-time inference.
# You may need to install X11 server locally:
# sudo apt install xorg openbox xauth
xhost +; docker run --rm --gpus all -e DISPLAY=$DISPLAY -v /tmp/.X11-unix:/tmp/.X11-unix tensorlayer/hyperpose --imshow


# [Example 3]: Camera + imshow window
xhost +; docker run --name pose-camera --rm --gpus all -e DISPLAY=$DISPLAY -v /tmp/.X11-unix:/tmp/.X11-unix --device=/dev/video0:/dev/video0 tensorlayer/hyperpose --source=camera --imshow
# To quit this image, please type `docker kill pose-camera` in another terminal.


# [Dive into the image]
xhost +; docker run --rm --gpus all -it -e DISPLAY=$DISPLAY -v /tmp/.X11-unix:/tmp/.X11-unix --device=/dev/video0:/dev/video0 --entrypoint /bin/bash tensorlayer/hyperpose
# For users that cannot access a camera or X11 server. You may also use:
# docker run --rm --gpus all -it --entrypoint /bin/bash tensorlayer/hyperpose

For more usage regarding the command line flags, please visit here.

Python training library

We recommend using the Python training library within an Anaconda environment. The quick-start below has been tested with these environments:

| OS           | NVIDIA Driver | CUDA Toolkit | GPU            |
| ------------ | ------------- | ------------ | -------------- |
| Ubuntu 18.04 | 410.79        | 10.0         | Tesla V100-DGX |
| Ubuntu 18.04 | 440.33.01     | 10.2         | Tesla V100-DGX |
| Ubuntu 18.04 | 430.64        | 10.1         | TITAN RTX      |
| Ubuntu 18.04 | 430.26        | 10.2         | TITAN XP      |
| Ubuntu 16.04 | 430.50        | 10.1         | RTX 2080Ti     |

Once Anaconda is installed, run the Bash commands below to create a virtual environment:

# Create virtual environment (choose yes)
conda create -n hyperpose python=3.7
# Activate the virtual environment, start installation
conda activate hyperpose
# Install cudatoolkit and cudnn library using conda
conda install cudatoolkit=10.0.130
conda install cudnn=7.6.0

We then clone the repository and install the dependencies listed in requirements.txt:

git clone https://github.com/tensorlayer/hyperpose.git && cd hyperpose
pip install -r requirements.txt

We now demonstrate how to train a custom pose estimation model with HyperPose. The HyperPose APIs contain three key modules: Config, Model and Dataset; their basic usage is shown below.

from hyperpose import Config, Model, Dataset

# Set model name to distinguish models (necessary)
Config.set_model_name("MyLightweightOpenPose")

# Set model type, model backbone and dataset
Config.set_model_type(Config.MODEL.LightweightOpenpose)
Config.set_model_backbone(Config.BACKBONE.Vggtiny)
Config.set_dataset_type(Config.DATA.MSCOCO)

# Set single-node training or parallel-training
Config.set_train_type(Config.TRAIN.Single_train)

config = Config.get_config()
model = Model.get_model(config)
dataset = Dataset.get_dataset(config)

# Start the training process
Model.get_train(config)(model, dataset)

The full training program is listed here. To evaluate the trained model, you can use the evaluation program here. More information about the training library can be found here.

Documentation

The APIs of the HyperPose training library and the inference library are described in the Documentation.

Performance

We compare the prediction performance of HyperPose with OpenPose 1.6, TF-Pose and OpenPifPaf 0.12. The test-bed runs Ubuntu 18.04 with a 1070Ti GPU and an Intel i7 CPU (12 logical cores).

| HyperPose Configuration | DNN Size | Input Size | HyperPose   | Baseline              |
| ----------------------- | -------- | ---------- | ----------- | --------------------- |
| OpenPose (VGG)          | 209.3 MB | 656 x 368  | 27.32 FPS   | 8 FPS (OpenPose)      |
| OpenPose (TinyVGG)      | 34.7 MB  | 384 x 256  | 124.925 FPS | N/A                   |
| OpenPose (MobileNet)    | 17.9 MB  | 432 x 368  | 84.32 FPS   | 8.5 FPS (TF-Pose)     |
| OpenPose (ResNet18)     | 45.0 MB  | 432 x 368  | 62.52 FPS   | N/A                   |
| OpenPifPaf (ResNet50)   | 97.6 MB  | 432 x 368  | 44.16 FPS   | 14.5 FPS (OpenPifPaf) |

Accuracy

We evaluate the accuracy of pose estimation models developed with HyperPose. The environment is Ubuntu 16.04 with 4 V100-DGXs and a 24-core Intel Xeon CPU. Training takes 1~2 weeks on one V100-DGX per model. (If you don't want to train from scratch, you can use our pre-trained backbone models.)

| HyperPose Configuration                 | DNN Size | Input Size | Evaluation Dataset              | Accuracy (HyperPose, IoU=0.50:0.95) | Accuracy (original, IoU=0.50:0.95) |
| --------------------------------------- | -------- | ---------- | ------------------------------- | ----------------------------------- | ---------------------------------- |
| OpenPose (VGG19)                        | 199 MB   | 432 x 368  | MSCOCO2014 (random 1160 images) | 57.0 mAP                            | 58.4 mAP                           |
| LightweightOpenPose (Dilated MobileNet) | 17.7 MB  | 432 x 368  | MSCOCO2017 (all 5000 images)    | 46.1 mAP                            | 42.8 mAP                           |
| LightweightOpenPose (MobileNet-Thin)    | 17.4 MB  | 432 x 368  | MSCOCO2017 (all 5000 images)    | 44.2 mAP                            | 28.06 mAP (MSCOCO2014)             |
| LightweightOpenPose (TinyVGG)           | 23.6 MB  | 432 x 368  | MSCOCO2017 (all 5000 images)    | 47.3 mAP                            | -                                  |
| LightweightOpenPose (ResNet50)          | 42.7 MB  | 432 x 368  | MSCOCO2017 (all 5000 images)    | 48.2 mAP                            | -                                  |
| PoseProposal (ResNet18)                 | 45.2 MB  | 384 x 384  | MPII (all 2729 images)          | 54.9 PCKh                           | 72.8 PCKh                          |

Cite Us

If you find HyperPose helpful for your project, please cite our paper:

@article{hyperpose2021,
    author  = {Guo, Yixiao and Liu, Jiawei and Li, Guo and Mai, Luo and Dong, Hao},
    journal = {ACM Multimedia},
    title   = {{Fast and Flexible Human Pose Estimation with HyperPose}},
    url     = {https://github.com/tensorlayer/hyperpose},
    year    = {2021}
}

License

HyperPose is open-sourced under the Apache 2.0 license.