DeepMM

What is DeepMM?

DeepMM is a multimodal deep learning package based on Keras and TensorFlow. It combines representations of high-cardinality categorical features with text-based features in a single deep learning architecture for regression and binary classification use cases.

This model employs the idea of categorical entity embeddings (see https://arxiv.org/abs/1604.06737) to map highly sparse one-hot encoded categorical features into a lower-dimensional latent feature space. A bi-interaction pooling layer (as proposed by He et al. 2017, https://arxiv.org/abs/1708.05027) is incorporated to account for second-order feature interactions. An LSTM-based sub-network processes the sequential text features.
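The two ideas for the categorical branch can be illustrated with a small framework-free sketch. It is not the package's implementation: the helper names (`make_embedding_table`, `embed`, `bi_interaction_pooling`, `pairwise_reference`) and the toy cardinalities are assumptions made for illustration. Each categorical feature gets its own lookup table, and the looked-up vectors are pooled with the bi-interaction identity 0.5 * ((sum of v)^2 - sum of v^2), which equals the sum of element-wise products over all pairs of embedding vectors when every active feature has value 1 (the one-hot case).

```python
import random
from itertools import combinations


def make_embedding_table(num_categories, dim, seed=0):
    """Hypothetical lookup table: one small random vector per category id."""
    rng = random.Random(seed)
    return [[rng.uniform(-0.05, 0.05) for _ in range(dim)]
            for _ in range(num_categories)]


def embed(tables, category_ids):
    """Replace each integer category id by its row in that feature's table."""
    return [table[idx] for table, idx in zip(tables, category_ids)]


def bi_interaction_pooling(embeddings):
    """Pool vectors as 0.5 * ((sum v)^2 - sum v^2), element-wise."""
    dim = len(embeddings[0])
    summed = [sum(e[k] for e in embeddings) for k in range(dim)]
    summed_sq = [sum(e[k] ** 2 for e in embeddings) for k in range(dim)]
    return [0.5 * (s * s - q) for s, q in zip(summed, summed_sq)]


def pairwise_reference(embeddings):
    """Explicit sum of element-wise products over all pairs (i < j)."""
    dim = len(embeddings[0])
    out = [0.0] * dim
    for a, b in combinations(embeddings, 2):
        for k in range(dim):
            out[k] += a[k] * b[k]
    return out


# Three categorical features with assumed cardinalities 1000, 50 and 7,
# each mapped into a 4-dimensional latent space.
tables = [make_embedding_table(n, dim=4, seed=i)
          for i, n in enumerate([1000, 50, 7])]
vectors = embed(tables, category_ids=[421, 13, 6])
pooled = bi_interaction_pooling(vectors)
```

The pooled vector has the same dimension as a single embedding, regardless of how many categorical features go in, and the closed form costs linear rather than quadratic time in the number of features.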

The architecture draws on other deep learning approaches for processing sparse features.

Deep Learning Architecture

General outline of the multimodal model architecture (with concatenation of categorical embedding vectors) for four generic categorical features (C1-C4).

Usage

The package exposes the architecture as a keras.models.Model object based on Keras' functional API. It supports the integration of pre-trained word embedding vectors.

from deepmm.models import DeepMultimodalModel

# Preprocess data and combine modalities into a single matrix
# ...

model = DeepMultimodalModel(
    task='regression',
    # Categorical sub-network
    num_unique_categories=num_unique_categories, cat_embedding_dim=16,
    cat_hidden_neurons=[100, 50, 10], cat_dropout=[0.1, 0.2, 0.2], cat_bi_interaction=True,
    # Text sub-network
    txt_vocab_size=vocabulary_size, txt_embedding_dim=EMBEDDING_DIM, txt_max_len=MAX_LEN,
    txt_weights=embedding_matrix, txt_lstm_neurons=32, txt_dropout=0.2,
    # Final layers
    final_hidden_neurons=[64, 32], final_dropout=[0.3, 0.3])
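The preprocessing step is elided in the snippet above, so the following is only a sketch of how its inputs might be prepared, using plain Python lists. Several things here are assumptions and not taken from the package: the helper names, the toy data, the column order of the combined matrix (categorical ids first, then padded token ids), and `num_unique_categories` being a list with one count per categorical feature. The one firm convention is that a Keras embedding weight matrix has shape (vocabulary size, embedding dimension), with row i holding the vector for token id i.

```python
def build_category_index(values):
    """Map each distinct category value to an integer id (0..n-1)."""
    return {v: i for i, v in enumerate(sorted(set(values)))}


def build_vocab(texts):
    """Map each distinct word to an id starting at 1; id 0 is padding."""
    words = sorted({w for t in texts for w in t.lower().split()})
    return {w: i + 1 for i, w in enumerate(words)}


def encode_text(text, vocab, max_len):
    """Convert text to token ids, truncated and right-padded to max_len."""
    ids = [vocab[w] for w in text.lower().split() if w in vocab][:max_len]
    return ids + [0] * (max_len - len(ids))


def build_embedding_matrix(vocab, pretrained, dim):
    """Row i holds the pre-trained vector of word id i; unknown words stay zero."""
    matrix = [[0.0] * dim for _ in range(len(vocab) + 1)]
    for word, idx in vocab.items():
        if word in pretrained:
            matrix[idx] = list(pretrained[word])
    return matrix


# Toy data (hypothetical): two categorical columns and one text column.
rows = [
    {"brand": "acme", "city": "berlin", "text": "fast delivery great price"},
    {"brand": "zeta", "city": "paris", "text": "great quality"},
    {"brand": "acme", "city": "paris", "text": "slow delivery"},
]
CAT_COLUMNS = ["brand", "city"]
MAX_LEN = 5
EMBEDDING_DIM = 3

cat_indexes = {c: build_category_index(r[c] for r in rows) for c in CAT_COLUMNS}
num_unique_categories = [len(cat_indexes[c]) for c in CAT_COLUMNS]

vocab = build_vocab(r["text"] for r in rows)
vocabulary_size = len(vocab) + 1  # +1 for the padding id 0

# Assumed layout: categorical ids first, then the padded token ids.
X = [[cat_indexes[c][r[c]] for c in CAT_COLUMNS]
     + encode_text(r["text"], vocab, MAX_LEN) for r in rows]

# Stand-in for real pre-trained vectors (e.g. loaded from a GloVe file).
pretrained = {"great": [0.1, 0.2, 0.3], "delivery": [0.4, 0.5, 0.6]}
embedding_matrix = build_embedding_matrix(vocab, pretrained, EMBEDDING_DIM)
```

In practice `X` and `embedding_matrix` would be converted to NumPy arrays before being passed to Keras, and the input layout should be checked against the package source.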
