jianyuh

Organizations

@ULAFF @facebookresearch @pytorch

Pinned

  1. Intel® Nervana™ reference deep learning framework committed to best performance on all hardware

    Python · 3.9k stars · 831 forks

  2. BLISlab: A Sandbox for Optimizing GEMM

    C · 248 stars · 64 forks

  3. Generating Families of Practical Fast Matrix Multiplication Algorithms

    Python · 11 stars · 2 forks

  4. Tensors and Dynamic neural networks in Python with strong GPU acceleration

    C++ · 55.2k stars · 15.3k forks

  5. FB (Facebook) + GEMM (General Matrix-Matrix Multiplication) - https://code.fb.com/ml-applications/fbgemm/

    C++ · 798 stars · 252 forks

246 contributions in the last year

Activity overview
Contributed to pytorch/FBGEMM, pytorch/pytorch, NVIDIA/cutlass and 5 other repositories

Contribution activity

April 2022

Created 1 repository

Created a pull request in pytorch/FBGEMM that received 2 comments

Add launch_bounds constraint for CUDA kernel to avoid register overuse

Summary: Address the "CUDA error: too many resources requested for launch" error and avoid register overuse. Differential Revision: D35344926

+62 −49 2 comments
Opened 3 other pull requests in 3 repositories
pytorch/pytorch 1 open
pytorch/FBGEMM 1 open
NVIDIA/cutlass 1 merged

Created an issue in NVIDIA/cutlass that received 1 comment

[FEA] FP8 GEMM implementation

Is your feature request related to a problem? Please describe. Recently, more details about NVIDIA's latest H100 GPU were released in https://develop…

1 comment
1 contribution in private repositories Apr 8
