oap-project
Repositories
-
native-sql-engine
Native SQL Engine plugin for Spark SQL with vectorized SIMD optimizations.
-
oap-mllib
Optimized Spark package to accelerate machine learning algorithms in Apache Spark MLlib.
-
raydp
RayDP: Distributed data processing library that provides simple APIs for running Spark on Ray and integrating Spark with distributed deep learning and machine learning frameworks.
-
sql-ds-cache
Spark* plug-in for accelerating Spark* SQL performance by using caching and indexing at the SQL data source layer.
-
solution-navigator
Example solutions or code for using OAP features.
-
arrow
Forked from apache/arrow. Apache Arrow is a cross-language development platform for in-memory data. It specifies a standardized language-independent columnar memory format for flat and hierarchical data, organized for efficient analytic operations on modern hardware. It also provides computational libraries and zero-copy streaming messaging and interprocess communication…
-
oap-project.github.io
The OAP project web site
-
pmem-shuffle
Spark* shuffle plugin that supports shuffling through remote persistent memory over fabrics. It leverages the RDMA network and remote persistent memory (for read) to provide a high-performance, low-latency shuffle solution for Spark*.
-
remote-shuffle
Spark* shuffle plugin that supports shuffling data through a remote Hadoop-compatible file system, as opposed to vanilla Spark's local disks.
-
pmem-spill
Spark plug-in package for accelerating Spark runtime spill functions using PMem, such as the RDD cache PMem extension.
-
pmem-common
Common library for accessing PMem native library functions, including memkind, vmemcache and so on.
-
libhdfs3-downstream
Forked from martindurant/libhdfs3-downstream. A native C/C++ HDFS client (downstream fork from apache-hawq).
-
arrow-data-source
Spark DataSource plugin for reading files in various formats, such as Parquet, into Arrow-compatible columnar vectors.