This repository contains an end-to-end implementation of the Neural Projection Skip-Gram (NP-SG) and Deep Averaging Network (DAN) models. We first train an NP-SG model and then use the trained projection embeddings to train a DAN for the SST-fine classification task. Our goal is to compare the performance of projection-based, on-the-fly embeddings generated with locality-sensitive hashing against static and non-static embeddings.
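As a rough illustration of the core idea (this is not the repository's code; the hashing scheme, bit count, and seed below are arbitrary assumptions), an on-the-fly LSH projection maps any token to a fixed-length binary feature vector without storing an embedding table:

```python
import hashlib

import numpy as np

def lsh_projection(token: str, n_bits: int = 16, seed: int = 0) -> np.ndarray:
    """Project a token to a binary vector via random hyperplanes.

    The MD5 digest stands in for the character n-gram features a real
    NP-SG model would use; the principle (sign of dot products with
    fixed random hyperplanes) is the same.
    """
    # Hash the token into a small dense pseudo-random feature vector.
    digest = hashlib.md5(token.encode("utf-8")).digest()
    feats = np.frombuffer(digest, dtype=np.uint8).astype(np.float32)
    feats = feats / 255.0 - 0.5
    # Fixed random hyperplanes; the sign of each dot product gives one bit.
    rng = np.random.default_rng(seed)
    planes = rng.standard_normal((n_bits, feats.shape[0]))
    return (planes @ feats > 0).astype(np.int8)

bits = lsh_projection("hello")
```

Because the projection is deterministic, the same token always maps to the same bits, so no vocabulary or embedding matrix needs to be shipped with the model.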
- Clone the repository.
- Run `pip install -r requirements.txt`. This will download and install all the required packages.
- Run `python setup.py`. This will set up the directory structure and download the required corpora for the experiments.
- Request the enWiki9 dataset at [email protected].
- Set up the `config.py` script before running experiments. The experiments span two steps:
  a. Training an NP-SG model on some corpus (we use a chunk of wiki9); the larger this corpus, the better.
  b. Using the embeddings from step 1 for any downstream task; e.g., we train a DAN model on the SST-fine data.
- To test the pipeline, set `n=1000` and `test=True`.
- To run a complete experiment, run the following three scripts:
  a. `python3 data_prep.py`
  b. `python3 train_projection.py`
  c. `python3 train_dan.py`

  Alternatively, run the bash script `run.sh`.
- Set `n > 10,000` to train the NP-SG model on a larger corpus.
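For reference, a quick test configuration in `config.py` might look like the following. Only `n` and `test` are named in this README; any other fields in the repository's `config.py` are not shown here:

```python
# config.py -- minimal test settings (sketch; see the repository's
# config.py for the full set of options)
n = 1000     # corpus chunk size; set n > 10,000 for a real training run
test = True  # run the fast end-to-end test pipeline
```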
We have developed the pipeline around wiki9, SST-Fine, and the Bible corpus for training the NP-SG model, along with a DAN model on the SST-Fine dataset for the five-class classification task.
- enwiki9 (to train the NP-SG model)
- SST-Fine (for the text classification task)
- Bible corpus (from nltk, for small-scale test experiments)
Trainable Embedding? | NP-SG Train Dataset | Skip-Gram Train Size | Test Acc. (SST-Fine)
---|---|---|---
No | SST-Fine | 7,000 | 30.9% |
Yes | SST-Fine | 7,000 | 37.68% |
No | enWiki9 | 1,000 | 27.88% |
Yes | enWiki9 | 1,000 | 37.51% |
No | enWiki9 | 5,000 | 29.7% |
Yes | enWiki9 | 5,000 | 38.1% |
No | enWiki9 | 30,000 | 30.43% |
Yes | enWiki9 | 30,000 | 38.42% |
No | enWiki9 | 60,000 | 30.97% |
Yes | enWiki9 | 60,000 | 40.33% |
A few recent papers that have shown state-of-the-art results with neural projections:
- Neural Projection Skip-Gram (https://arxiv.org/pdf/1906.01605.pdf)
- PRADO by Google Research (https://www.aclweb.org/anthology/D19-1506.pdf)
- Self-governing Neural Networks (https://www.aclweb.org/anthology/D18-1105.pdf)
- Deep Averaging Network, the classification model implemented in this work (https://people.cs.umass.edu/~miyyer/pubs/2015_acl_dan.pdf)
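For context, the forward pass of a Deep Averaging Network is simple enough to sketch in a few lines of NumPy. The dimensions and layer sizes below are illustrative assumptions, not the ones used in this repository:

```python
import numpy as np

rng = np.random.default_rng(0)

def dan_forward(word_vectors, W1, b1, W2, b2):
    """DAN forward pass: average the word embeddings, then apply a
    feed-forward classifier with a softmax output."""
    avg = word_vectors.mean(axis=0)          # (d,) average of token embeddings
    hidden = np.maximum(0.0, W1 @ avg + b1)  # ReLU hidden layer
    logits = W2 @ hidden + b2                # one logit per class
    exp = np.exp(logits - logits.max())
    return exp / exp.sum()                   # class probabilities

# Illustrative sizes: 16-dim embeddings, 32 hidden units, 5 SST-fine classes.
d, h, classes = 16, 32, 5
W1, b1 = rng.standard_normal((h, d)), np.zeros(h)
W2, b2 = rng.standard_normal((classes, h)), np.zeros(classes)
sentence = rng.standard_normal((7, d))  # 7 tokens, each a d-dim (projected) embedding
probs = dan_forward(sentence, W1, b1, W2, b2)
```

In our setting, the rows of `sentence` would come from the trained NP-SG projection embeddings rather than random vectors.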