Here are 44 public repositories matching this topic...
TensorFlow binaries supporting AVX, FMA, SSE
Updated Feb 4, 2020 (Shell)
SIMD Vector Classes for C++
C++ wrappers for SIMD intrinsics and parallelized, optimized mathematical functions (SSE, AVX, NEON, AVX512)
A simple C library for compressing lists of integers using binary packing
A C++ library to compress and intersect sorted lists of integers using SIMD instructions
Agenium Scale vectorization library for CPUs and GPUs
Updated Aug 7, 2020 (Python)
Fast decoder for VByte-compressed integers
High-performance dictionary coding
UME::SIMD: a library for explicit SIMD vectorization.
Fast random number generators: Vectorized (SIMD) version of xorshift128+
High performance algorithms in C#: SIMD/SSE, multi-core and faster
Fast C functions for computing the positional popcount (pospopcnt).
Fast differential coding functions (using SIMD instructions)
Fast C header-only library for popcnt, pospopcnt, and set algebraic operations
DSL for SIMD Sorting on AVX2 & AVX512
Simple example for embedding SSE2 assembly in Cython projects
Updated May 2, 2017 (Python)
This project aims to rename all C# intrinsic names to their more compact C/C++ counterparts that the industry uses.
Litesimd is a no-overhead, header-only C++ library for SIMD processing, specializing in SIMD comparison and data shuffling.
Prefix-Sum Data Structures in C++. This is the code for the paper "Practical Trade-Offs for the Prefix-Sum Problem" by Giulio Ermanno Pibiri and Rossano Venturini (https://arxiv.org/abs/2006.14552).
Particle engine built on OpenGL used to produce various visual effects.
Random number generator for large applications using vector instructions
Simple Pascal demo project showing how to use Single Instruction, Multiple Data (SIMD) with Intel SSE instructions
Updated Feb 13, 2017 (Pascal)
A fast implementation of single-pattern substring search using SIMD acceleration.
Updated Jul 24, 2020 (Rust)
Course project for 'How to write Fast Numerical Code' on an optimized implementation of Latent Dirichlet Allocation
C++ Optimized Software Renderer using SDL2.0
A C/x86 assembly implementation of proximal operators with SSE3/AVX SIMD instructions
A method for efficiently processing sparse matrix-vector multiplication (SpMV) using SIMD and load balancing
A high performance small tensor library for inelastic finite element simulation