Pachyderm Examples

Pachyderm Examples is a curated list of examples that use Pachyderm to accomplish various tasks.

Getting Started

  • Intro to Pachyderm Tutorial - A notebook introduction to Pachyderm that uses the pachctl command-line utility to illustrate the basics of Pachyderm data repositories and pipelines.
  • Boston Housing Prices - A machine learning pipeline to train a regression model on the Boston Housing Dataset to predict the value of homes.
  • Boston Housing Prices (Intermediate) - Extends the original Boston Housing Prices example to show a multi-pipeline DAG and data rollbacks.
  • Market Sentiment - Train and deploy a fully automated financial market sentiment BERT model. As data is manually labeled, the model will automatically retrain and deploy.
  • Object Detection - Train an object detector on the COCO128 dataset with Lightning Flash, modify predictions with Label Studio, and version everything in Pachyderm.

Notebooks

Data Labeling

  • Label Studio Integration - Incorporate data versioning into any labeling project with Label Studio and Pachyderm.
  • Superb AI Integration - Version labeled image datasets created in Superb AI Suite using a cron pipeline.
  • Toloka Integration - Uses Pachyderm to create crowdsourced annotation jobs for news headlines in Toloka, aggregate the labeled data, and train a model.

Data Warehouse

  • Churn Prediction with Snowflake - Create a churn analysis model for a music streaming service with Pachyderm and Snowflake using the Data Warehouse integration.

Machine Learning

  • Boston Housing Prices (Intermediate) - Extends the original Boston Housing Prices example to show a multi-pipeline DAG and data rollbacks.
  • Breast Cancer Detection - A breast cancer detection system based on radiology scans scaled and visualized using Pachyderm.
  • Market Sentiment - Train and deploy a fully automated financial market sentiment BERT model. As data is manually labeled, the model will automatically retrain and deploy.
  • Apache Spark - MLflow Integration - End-to-end example demonstrating the full ML training process for a fraud detection model with Spark, MLlib, MLflow, and Pachyderm.

ML Experiment Tracking

  • Weights and Biases - Log pipelines running in Pachyderm to Weights and Biases.
  • ClearML Integration - Log Pachyderm experiments to ClearML's experiment monitoring platform, using Pachyderm Secrets.

Model Deployment