The implementation of "Prismer: A Vision-Language Model with An Ensemble of Experts".
Vision-Language Models for Vision Tasks: A Survey
The official repository of Qwen-VL (通义千问-VL), the chat and pretrained large vision-language model proposed by Alibaba Cloud.
Code for VPGTrans: Transfer Visual Prompt Generator across LLMs. VL-LLaMA, VL-Vicuna.
Chatbot Arena meets multi-modality! Multi-Modality Arena allows you to benchmark vision-language models side-by-side while providing images as inputs. Supports MiniGPT-4, LLaMA-Adapter V2, LLaVA, BLIP-2, and many more!
A collection of original, innovative ideas and algorithms towards Advanced Literate Machinery. This project is maintained by the OCR Team in the Language Technology Lab, Alibaba DAMO Academy.
Collection of papers and resources on Multimodal Reasoning, including Vision-Language Models, Multimodal Chain-of-Thought, Visual Inference, and others.
[CVPR2023] Blind Image Quality Assessment via Vision-Language Correspondence: A Multitask Learning Perspective
Transferable Decoding with Visual Entities for Zero-Shot Image Captioning, ICCV 2023
Code of the paper: On Evaluating Adversarial Robustness of Large Vision-Language Models
LMPT: Prompt Tuning with Class-Specific Embedding Loss for Long-tailed Multi-Label Visual Recognition
Reading list for Multimodal Large Language Models
Exploring prompt tuning with pseudolabels for multiple modalities, learning settings, and training strategies.
Vision Large Language Models trained on M3IT instruction tuning dataset
Official Pytorch code for LOVM: Language-Only Vision Model Selection
Code to reproduce the experiments in the paper: Does CLIP Bind Concepts? Probing Compositionality in Large Image Models.
Vision-language model example code.
ProbVLM: Probabilistic Adapter for Frozen Vision-Language Models
Prompt Learning with Residual Context Optimization for Vision-Language Models (2023)
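Several of the repositories above (e.g. the CLIP compositionality probe and the zero-shot captioning work) build on CLIP-style contrastive vision-language models. As a minimal, hypothetical sketch (not code from any listed repository), zero-shot classification in such models reduces to comparing an image embedding against text-prompt embeddings by cosine similarity and normalizing with a softmax; the toy embeddings below stand in for real encoder outputs:

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)


def softmax(xs):
    """Numerically stable softmax over a list of logits."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]


def zero_shot_classify(image_emb, text_embs, temperature=0.01):
    """CLIP-style zero-shot scoring: temperature-scaled cosine
    similarities between the image embedding and each text-prompt
    embedding, turned into a probability distribution."""
    logits = [cosine(image_emb, t) / temperature for t in text_embs]
    return softmax(logits)


# Toy embeddings (placeholders for real image/text encoder outputs).
image = [0.9, 0.1, 0.0]
prompts = [
    [1.0, 0.0, 0.0],  # e.g. embedding of "a photo of a cat"
    [0.0, 1.0, 0.0],  # e.g. embedding of "a photo of a dog"
]
probs = zero_shot_classify(image, prompts)
```

In the real models, `image_emb` and `text_embs` come from trained image and text encoders, and the temperature is a learned parameter; the scoring mechanics are otherwise the same.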