pix2tex: Using a ViT to convert images of equations into LaTeX code.
Updated Oct 26, 2023 · Python
A comprehensive list of Vision Transformer and attention papers, including code and related websites.
Towhee is a framework that is dedicated to making neural data processing pipelines simple and fast.
[CVPR 2021] Official PyTorch implementation of Transformer Interpretability Beyond Attention Visualization, a novel method to visualize classifications by Transformer-based networks.
🤖 PaddleViT: State-of-the-art Visual Transformer and MLP Models for PaddlePaddle 2.0+
[ICCV 2021] Tokens-to-Token ViT: Training Vision Transformers from Scratch on ImageNet
🔥 TensorRT-Alpha supports YOLOv8, YOLOv7, YOLOv6, YOLOv5, YOLOv4, YOLOv3, YOLOX, YOLOR... 🚀 CUDA IS ALL YOU NEED. 🍎 It also supports end2end CUDA C acceleration and multi-batch inference.
A paper list of some recent Transformer-based CV works.
Adan: Adaptive Nesterov Momentum Algorithm for Faster Optimizing Deep Models
An easy-to-use, production-ready inference server for computer vision supporting deployment of many popular model architectures and fine-tuned models.
A PyTorch implementation of "MobileViT: Light-weight, General-purpose, and Mobile-friendly Vision Transformer".
FFCS course registration made hassle-free for VITians. Search courses and visualize the timetable on the go!
An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale
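PASSL includes self-supervised image algorithms such as SimCLR, MoCo v1/v2, BYOL, CLIP, PixPro, SimSiam, SwAV, BEiT, and MAE, as well as foundational vision algorithms such as Vision Transformer, DeiT, Swin Transformer, CvT, T2T-ViT, MLP-Mixer, XCiT, ConvNeXt, and PVTv2.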
Official code for "Reversible Column Networks" and "RevColv2".
HugsVision is an easy-to-use Hugging Face wrapper for state-of-the-art computer vision.
i. A practical application of a Transformer (ViT) to 2-D physiological signal (EEG) classification tasks; it could also be tried with EMG, EOG, ECG, etc. ii. Includes attention over both the spatial dimension (channel attention) and the temporal dimension. iii. Common spatial pattern (CSP), an efficient feature enhancement method, implemented in Python.
Paddle Large Scale Classification Tools, supporting ArcFace, CosFace, PartialFC, and Data Parallel + Model Parallel training. Models include ResNet, ViT, Swin, DeiT, CaiT, FaceViT, MoCo, MAE, ConvMAE, and CAE.
Reproduction of semantic segmentation using a masked autoencoder (MAE).