cublas
Here are 58 public repositories matching this topic...
Deep learning library using the GPU (CUDA/cuBLAS)
Updated Sep 18, 2021 - Elixir
Algorithms implemented in CUDA + resources about GPGPU
Updated Jan 18, 2022 - Cuda
Code for benchmarking GPU performance based on cublasSgemm and cublasHgemm
Updated May 20, 2022 - Cuda
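GEMM benchmarks like the one described above conventionally report throughput as 2·m·n·k floating-point operations divided by the elapsed time. The following is a minimal sketch of that bookkeeping, not the repository's code: a naive pure-Python multiply stands in for the cublasSgemm call, and the names `gemm_flops`, `naive_gemm`, and `benchmark_gflops` are hypothetical.

```python
import time


def gemm_flops(m, n, k):
    """FLOP count for C = A @ B with A (m x k) and B (k x n):
    one multiply and one add per inner-product term."""
    return 2 * m * n * k


def naive_gemm(a, b):
    """CPU stand-in for the GPU GEMM call (assumption: row-major lists)."""
    m, k, n = len(a), len(b), len(b[0])
    c = [[0.0] * n for _ in range(m)]
    for i in range(m):
        for p in range(k):
            a_ip = a[i][p]
            for j in range(n):
                c[i][j] += a_ip * b[p][j]
    return c


def benchmark_gflops(m, n, k, repeats=3):
    """Time the multiply several times and report GFLOP/s for the best run."""
    a = [[1.0] * k for _ in range(m)]
    b = [[1.0] * n for _ in range(k)]
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        naive_gemm(a, b)
        best = min(best, time.perf_counter() - start)
    # Guard against a zero reading from a coarse timer on tiny problems.
    return gemm_flops(m, n, k) / max(best, 1e-12) / 1e9
```

Taking the best of several repeats is one common way to reduce timing noise; a real GPU benchmark would also need warm-up runs and device synchronization before reading the clock.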
Escoin: Efficient Sparse Convolutional Neural Network Inference on GPUs
machine-learning
caffe
gpu
cuda
inference
cublas
convolutional-neural-networks
sparse-matrix
cusparse
Updated Feb 28, 2019 - C++
Lab exercise for the Parallel Processing course at NTUA, covering CUDA programming
Updated Mar 3, 2020 - Cuda
This repository targets performance optimization of the OpenCL gemm function. It compares the sgemm performance of several libraries (clBLAS, clBLAST, MIOpenGemm, Intel MKL (CPU), and cuBLAS (CUDA)) across different matrix sizes, vendors' hardware, and operating systems. Prebuilt x86_64 binaries for MSVC, MinGW, and Linux (CentOS) are provided, so it works out of the box.
Updated Mar 28, 2019 - C
devincody commented on Feb 26, 2019:
Find a better (more automated) way to compare against the Python code.
Label: good first issue (Good for newcomers)
HSD: Hierarchical Spherical Deformation for Cortical Surface Registration
Updated Jun 20, 2021 - C++
Matrix multiplication example performed with OpenMP, OpenACC, BLAS, cuBLAS, and CUDA
Updated May 31, 2022 - C++
C++ CUDA-compatible template class that provides an interface for general-purpose matrix algorithms and computations. Includes Matlab-like functions. This is mainly an example of how to use CUDA code with C++, so don't expect very high performance.
cpp
gpu
matrix
sum
cpp14
cuda
cublas
cpp17
hadamard
average
inverse
determinant
transpose
matlab-like
element-wise
Updated Apr 28, 2021 - C++
Basel morphable face model mesh and texture generator using GPU.
cublas
face-reconstruction
face-morphing
bfm
face-generation
basel-face-model
3d-face-reconstruction
morphable-model
face-generator
face-morphable-model
Updated Sep 14, 2020 - C
Updated Feb 18, 2019 - Common Lisp
Generalized Orthogonal Least-Squares in CUDA
Updated Apr 21, 2018 - Cuda
CUDA kernel functions
Updated May 24, 2022 - Cuda
GPGPU Inverse Distance Weighting using matrix vector multiplication
Updated Dec 5, 2017 - Cuda
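Inverse distance weighting can be phrased as a matrix-vector product: build a weight matrix whose row i holds the normalized weights 1/d^p from query i to every known point, then multiply it by the vector of known values. The sketch below illustrates that formulation on the CPU in plain Python. It is not taken from the repository, and the function names, the default `power=2.0`, and the exact-hit handling are assumptions.

```python
import math


def idw_weights(queries, points, power=2.0, eps=1e-12):
    """Return a row-normalized weight matrix W with one row per query."""
    rows = []
    for q in queries:
        dists = [math.dist(q, p) for p in points]
        if any(d < eps for d in dists):
            # Query coincides with a known point: put all weight there.
            row = [1.0 if d < eps else 0.0 for d in dists]
        else:
            row = [1.0 / d ** power for d in dists]
        total = sum(row)
        rows.append([w / total for w in row])
    return rows


def matvec(matrix, vector):
    """Dense matrix-vector product (the step a GPU BLAS gemv would perform)."""
    return [sum(w * v for w, v in zip(row, vector)) for row in matrix]


def idw_interpolate(queries, points, values, power=2.0):
    """Interpolated value at each query: W @ values."""
    return matvec(idw_weights(queries, points, power), values)
```

For example, `idw_interpolate([(1.0, 0.0)], [(0.0, 0.0), (2.0, 0.0)], [10.0, 20.0])` weights both known points equally because the query is equidistant from them.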
A neat C++ custom Matrix class to perform super-fast GPU (or CPU) powered Matrix/Vector computations with minimal code, leveraging the power of cuBLAS where applicable.
Updated Jun 24, 2017 - C++
A general k-means algorithm with L2 distance using pyCUDA
gpu
cuda
cublas
reduction
pca-analysis
heterogeneous-parallel-programming
k-means
acceleration-algorithm
thrustrtc
Updated Oct 28, 2021 - Jupyter Notebook
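One reason k-means with L2 distance maps well onto BLAS-style libraries is the expansion ||x - c||² = ||x||² - 2 x·c + ||c||², which turns the assignment step into dot products between points and centroids. The following plain-Python sketch shows that idea only. It is not the repository's pyCUDA code, and all function names are hypothetical.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def assign_clusters(points, centroids):
    """Label each point with its nearest centroid under squared L2 distance."""
    centroid_norms = [dot(c, c) for c in centroids]
    labels = []
    for x in points:
        # ||x||^2 is the same for every centroid, so it can be dropped
        # when only the argmin is needed.
        scores = [cn - 2.0 * dot(x, c) for c, cn in zip(centroids, centroid_norms)]
        labels.append(min(range(len(centroids)), key=scores.__getitem__))
    return labels


def update_centroids(points, labels, centroids):
    """Move each centroid to the mean of its members; keep empty clusters fixed."""
    updated = []
    for k, old in enumerate(centroids):
        members = [p for p, label in zip(points, labels) if label == k]
        if members:
            updated.append([sum(col) / len(members) for col in zip(*members)])
        else:
            updated.append(list(old))
    return updated


def kmeans(points, centroids, iterations=10):
    """Run a fixed number of Lloyd iterations from the given initial centroids."""
    for _ in range(iterations):
        labels = assign_clusters(points, centroids)
        centroids = update_centroids(points, labels, centroids)
    return centroids, assign_clusters(points, centroids)
```

On a GPU, the inner dot products for all points and centroids would typically be computed at once as a single matrix multiplication.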
Description
https://numpy.org/doc/stable/reference/generated/numpy.corrcoef.html
https://docs.cupy.dev/en/stable/reference/generated/cupy.corrcoef.html
The arguments of the two functions seem to differ.
Additional Information
The dtype argument was added in NumPy version 1.20.
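A mismatch like this can be checked mechanically by comparing the parameter names of the two callables with the standard-library `inspect` module. The sketch below uses two simplified stand-in functions, not the real NumPy or CuPy signatures, and the helper name `missing_parameters` is hypothetical.

```python
import inspect


def missing_parameters(reference, candidate):
    """Names accepted by `reference` that `candidate` does not accept."""
    ref_params = inspect.signature(reference).parameters
    cand_params = inspect.signature(candidate).parameters
    return [name for name in ref_params if name not in cand_params]


# Stand-ins for illustration only; the real library signatures may differ.
def reference_corrcoef(x, y=None, rowvar=True, *, dtype=None):
    raise NotImplementedError


def candidate_corrcoef(x, y=None, rowvar=True):
    raise NotImplementedError


print(missing_parameters(reference_corrcoef, candidate_corrcoef))
```

The same helper could be pointed at the real functions, provided `inspect.signature` is able to introspect them in the installed versions.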