MultiModal-Image-Captioning

This is a Torch implementation of image captioning using a multimodal RNN that uses both word embeddings and CNN features, as described in Mao et al.
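To make the idea of combining word embeddings with CNN features concrete, here is a minimal, illustrative sketch of a single fusion step. It is not taken from `models.lua`. The function names, dimensions, random weights, and the choice of a plain `tanh` over a sum of linear projections are all assumptions made for illustration, and the actual model may differ.

```python
import math
import random


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector (list of floats)."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def random_matrix(rows, cols, rng):
    """Small random weights; a stand-in for trained parameters."""
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]


def multimodal_fusion(word_embedding, recurrent_state, image_feature, V_w, V_r, V_i):
    """Project the three inputs into one shared space, sum them, apply tanh.

    Assumption: fusion is an element-wise sum of linear projections
    followed by tanh. All names here are hypothetical.
    """
    projected_word = matvec(V_w, word_embedding)
    projected_state = matvec(V_r, recurrent_state)
    projected_image = matvec(V_i, image_feature)
    return [
        math.tanh(a + b + c)
        for a, b, c in zip(projected_word, projected_state, projected_image)
    ]


# Toy dimensions, chosen only to keep the example readable.
rng = random.Random(0)
word_dim, recurrent_dim, image_dim, multimodal_dim = 4, 3, 5, 2

V_w = random_matrix(multimodal_dim, word_dim, rng)
V_r = random_matrix(multimodal_dim, recurrent_dim, rng)
V_i = random_matrix(multimodal_dim, image_dim, rng)

word_embedding = [rng.uniform(-1, 1) for _ in range(word_dim)]
recurrent_state = [rng.uniform(-1, 1) for _ in range(recurrent_dim)]
image_feature = [rng.uniform(-1, 1) for _ in range(image_dim)]  # e.g. a CNN feature vector

fused = multimodal_fusion(word_embedding, recurrent_state, image_feature, V_w, V_r, V_i)
print(fused)
```

In a full captioning model, a vector like `fused` would typically feed a softmax over the vocabulary to predict the next word. That step is omitted here.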

The implementation is incomplete and a work in progress.

Meanwhile, check out the TensorFlow implementation by J. Mao here.

Repository contents:

COCOData
LICENSE
README.md
VGG
evaluate.lua
main.lua
models.lua
noise_contrastive.lua
preprocess_coco.py
test_rnn.lua
utils.lua
