Skip to content

apache/systemds

main
Switch branches/tags

Name already in use

A tag already exists with the provided branch name. Many Git commands accept both tag and branch names, so creating this branch may cause unexpected behavior. Are you sure you want to create this branch?
Code

Latest commit

This patch adds the compiler flags and runtime support to checkpoint
any Spark instruction which is marked for caching. During postprocessing
of a marked instruction, we first inplace persist the RDD and then store
the RDD in the local Lineage cache for reuse. This patch also fixes a
bug in the last commit which was unpersisting the locally cached RDDs
during rmvar. Future commits will add rewrites to mark the Spark
instructions for caching in a cost-based manner.
Hyperparameter tuning of LmDS with 2.5k columns improves by
22x by caching the cpmm results in the executors.

Closes #1756
f89a38d

Git stats

Files

Permalink
Failed to load latest commit information.
Type
Name
Latest commit message
Commit time
 
 
 
 
bin
 
 
 
 
dev
 
 
 
 
 
 
 
 
src
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Apache SystemDS

Overview: SystemDS is an open source ML system for the end-to-end data science lifecycle from data integration, cleaning, and feature engineering, over efficient, local and distributed ML model training, to deployment and serving. To this end, we aim to provide a stack of declarative languages with R-like syntax for (1) the different tasks of the data-science lifecycle, and (2) users with different expertise. These high-level scripts are compiled into hybrid execution plans of local, in-memory CPU and GPU operations, as well as distributed operations on Apache Spark. In contrast to existing systems - that either provide homogeneous tensors or 2D Datasets - and in order to serve the entire data science lifecycle, the underlying data model are DataTensors, i.e., tensors (multi-dimensional arrays) whose first dimension may have a heterogeneous and nested schema.

Quick Start Install, Quick Start and Hello World

Documentation: SystemDS Documentation

Python Documentation Python SystemDS Documentation

Issue Tracker Jira Dashboard

Status and Build: SystemDS is renamed from SystemML which is an Apache Top Level Project. To build from source visit SystemDS Install from source

Build Documentation Component Test Application Test Function Test Python Test