Official PyTorch implementation of "OmniNet: A unified architecture for multi-modal multi-task learning" | Authors: Subhojeet Pramanik, Priyanka Agrawal, Aman Hussain

nlp machine-learning deep-learning neural-network artificial-intelligence transformer image-captioning video-recognition multimodal-learning multitask-learning

Updated Oct 31, 2020
Python

pykale / pykale

Knowledge-Aware machine LEarning (KALE): accessible machine learning from multiple sources for interdisciplinary research, part of the 🔥PyTorch ecosystem

machine-learning computer-vision pytorch transfer-learning graph-analysis domain-adaptation medical-image-analysis multimodal-learning knowledge-aware-learning

Updated Apr 7, 2022
Python

ArrowLuo / CLIP4Clip

An official implementation for "CLIP4Clip: An Empirical Study of CLIP for End to End Video Clip Retrieval"

search retrieval ranking clip multimodality multimodal-learning multimodal activitynet retrieval-model msvd msrvtt video-text-retrieval lsmdc didemo video-clip-retrieval

Updated Dec 2, 2021
Python

georgian-io / Multimodal-Toolkit

Multimodal model for text and tabular data, with HuggingFace transformers as the building block for text data

natural-language-processing tabular-data transformer multimodal-learning huggingface-transformers

Updated Feb 16, 2022
Python

declare-lab / multimodal-deep-learning

This repository contains various models targeting multimodal representation learning and multimodal fusion for downstream tasks such as multimodal sentiment analysis.

multimodal-interactions multimodal-learning multimodal-sentiment-analysis multimodal-deep-learning

Updated Nov 29, 2021
OpenEdge ABL

pliang279 / MultiBench

[NeurIPS 2021] Multiscale Benchmarks for Multimodal Representation Learning

machine-learning natural-language-processing computer-vision deep-learning robotics healthcare representation-learning speech-processing multimodal-learning

Updated Apr 4, 2022
HTML

sangminwoo / awesome-vision-and-language

A curated list of awesome vision and language resources (still under construction... stay tuned!)

awesome awesome-list multimodal-learning vision-and-language

Updated Mar 25, 2021

machine-intelligence-laboratory / TopicNet

Interface for easier topic modelling.

pypi topic-modeling multimodal-learning modalities document-representation multimodal-data topic-modelling bigartm-library custom-score

Updated Apr 6, 2022
Python

mmaaz60 / mvits_for_class_agnostic_od

Multi-modal Transformers Excel at Class-agnostic Object Detection

multimodal-learning class-agnostic-detection open-world-detection

Updated Feb 1, 2022
Python

mhw32 / multimodal-vae-public

A PyTorch implementation of "Multimodal Generative Models for Scalable Weakly-Supervised Learning" (https://arxiv.org/abs/1802.05335)

machine-learning variational-autoencoder generative-models multimodal-learning

Updated Aug 17, 2018
Python

pliang279 / MFN

[AAAI 2018] Memory Fusion Network for Multi-view Sequential Learning

machine-learning multimodal-learning

Updated Aug 4, 2020
Python

ABadCandy / BaiDuBigData19-URFC

My solution with 0.67 accuracy

computer-vision deep-learning pytorch multimodal-learning

Updated May 21, 2019
Python

PreferredAI / vista-net

Code for the paper "VistaNet: Visual Aspect Attention Network for Multimodal Sentiment Analysis", AAAI'19

sentiment-analysis attention-mechanism multimodal-learning multimodal-sentiment-analysis

Updated Sep 9, 2020
Python

haamoon / mmtm

Implementation of CVPR 2020 paper "MMTM: Multimodal Transfer Module for CNN Fusion"

pytorch action-recognition gesture-recognition multimodal-learning speech-enhancement multimodal-deep-learning cnn-fusion

Updated Jun 16, 2020
Python

verlab / Learning2Dance_CAG_2020

PyTorch implementation of our graph convolutional network (GCN) for human motion generation from music. Also includes paired dance-music data for training!

computer-vision sound-processing graph-convolutional-networks gcn multimodal-learning motion-analysis motion-animation motion-synthesis human-motion human-motion-analysis graph-adversarial-learning computer-and-graphics

Updated Apr 22, 2021
Python

antoyang / just-ask

[ICCV 2021 Oral] Just Ask: Learning to Answer Questions from Millions of Narrated Videos

vqa video-understanding weakly-supervised-learning multimodal-learning visual-question-answering question-generation vision-and-language videoqa pre-training video-question-answering

Updated Oct 6, 2021
Jupyter Notebook

pliang279 / factorized

[ICLR 2019] Learning Factorized Multimodal Representations

machine-learning representation-learning multimodal-learning

Updated Aug 4, 2020
Python

johnarevalo / gmu-mmimdb

Source code for training Gated Multimodal Units on the MM-IMDb dataset

representation-learning multimodal-learning

Updated Nov 20, 2020
Python

akashe / Multimodal-action-recognition

Code for selecting an action based on multimodal inputs. In this case, the inputs are voice and text.

multimodality multimodal-learning multimodal-deep-learning multimodal-data multimodal-fusion multimodal-action-recognition

Updated Jun 7, 2021
Python

snap-research / MMVID

[CVPR 2022] Show Me What and Tell Me How: Video Synthesis via Multimodal Conditioning

deep-learning transformer bert multimodal-learning video-generation text-to-video multimodal-video-generation

Updated Mar 20, 2022

njustkmg / PaddleMM

Multi-modal learning toolkit based on PaddlePaddle and PyTorch, supporting multiple applications such as multi-modal classification, cross-modal retrieval, and image captioning.

python pytorch classification paddlepaddle imagecaptioning multimodal-learning multimodal crossmodal-retrieval