@oap-project

Optimized Analytics Package for Spark Platform (OAP)

Pinned

  1. gazelle_plugin Public

    Native SQL Engine plugin for Spark SQL with vectorized SIMD optimizations.

    Scala 235 73

  2. gluten Public

    Scala 379 112

  3. raydp Public

    RayDP provides simple APIs for running Spark on Ray and integrating Spark with AI libraries.

    Python 196 46

  4. cloudtik Public

    Cloud Scale Platform for Distributed Analytics and AI

    Python 19 8

  5. oap-mllib Public

    Optimized Spark package to accelerate machine learning algorithms in Apache Spark MLlib.

    Scala 13 10

  6. oap-tools Public

    Tools for building and packaging OAP and for integrating it with public cloud platforms such as AWS EMR, Google Dataproc, and K8S.

    Jupyter Notebook 14 12

Repositories

  • gluten Public
    Scala 379 Apache-2.0 112 44 27 Updated Jan 13, 2023
  • velox Public

    A new C++ vectorized database acceleration library aimed at optimizing query engines and data processing systems.

    C++ 5 Apache-2.0 456 0 8 Updated Jan 13, 2023
  • cloudtik Public

    Cloud Scale Platform for Distributed Analytics and AI

    Python 19 Apache-2.0 8 1 2 Updated Jan 13, 2023
  • gazelle_plugin Public

    Native SQL Engine plugin for Spark SQL with vectorized SIMD optimizations.

    Scala 235 Apache-2.0 73 190 24 Updated Jan 13, 2023
  • arrow Public

    Apache Arrow is a cross-language development platform for in-memory data. It specifies a standardized language-independent columnar memory format for flat and hierarchical data, organized for efficient analytic operations on modern hardware. It also provides computational libraries and zero-copy streaming messaging and interprocess communication…

    C++ 4 Apache-2.0 2,688 0 21 Updated Jan 13, 2023
  • raydp Public

    RayDP provides simple APIs for running Spark on Ray and integrating Spark with AI libraries.

    Python 196 Apache-2.0 46 43 2 Updated Jan 9, 2023
  • remote-shuffle Public

    Spark* shuffle plugin that supports shuffling data through a remote Hadoop-compatible file system, as opposed to vanilla Spark's local disks.

    Scala 15 Apache-2.0 7 5 0 Updated Jan 5, 2023
  • sql-ds-cache Public archive

    Spark* plug-in for accelerating Spark* SQL performance by using a cache and index at the SQL data source layer.

    Scala 37 Apache-2.0 25 15 4 Updated Jan 3, 2023
  • libhdfs3-downstream Public archive

    A native C/C++ HDFS client (downstream fork from apache-hawq).

    C++ 0 Apache-2.0 55 0 0 Updated Jan 3, 2023
  • arrow-data-source Public archive

    Spark DataSource plugin for reading files in various formats, such as Parquet, into Arrow-compatible columnar vectors.

    Scala 5 Apache-2.0 10 3 0 Updated Jan 4, 2023
