GitHub - mgpopinjay/nlp-qa-outdomain: Out-domain Question Answering (QA) on SQuAD with bi-directional LSTM

Out-domain Question Answering (QA) on SQuAD with bi-directional LSTM

This experiment assessed the effect of applying out-domain (OD) and domain-specific word embeddings, together with OD fine-tuning, to a Question Answering (QA) system.

The baseline model was trained on SQuAD, a dataset of general questions and answers generated from Wikipedia articles, and a biomedical-focused QA dataset, BioASQ, was used for OD QA evaluation. In building a model for OD QA, the performance contributions of non-domain-specific (GloVe) and biomedical-focused (e.g. BioReddit, BioWordVec) word embeddings were compared.
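To make the embedding comparison concrete, here is a minimal sketch of how a pretrained embedding file could be mapped onto a model vocabulary. It is not the repository's code. It assumes the GloVe-style text layout (one word per line followed by space-separated floats). Embeddings distributed in binary formats would need a different reader. The function name `load_text_embeddings`, the random initialisation range, and the sample words are all hypothetical.

```python
import io
import random


def load_text_embeddings(handle, vocab, dim, seed=0):
    """Build one vector per vocabulary word from a GloVe-style text stream.

    Assumption: each line is a word followed by `dim` space-separated floats.
    Words missing from the stream get a small random vector.
    """
    rng = random.Random(seed)
    wanted = set(vocab)
    found = {}
    for line in handle:
        parts = line.rstrip().split(" ")
        word, values = parts[0], parts[1:]
        if word in wanted and len(values) == dim:
            found[word] = [float(v) for v in values]

    matrix = []
    for word in vocab:
        if word in found:
            matrix.append(found[word])
        else:
            # Out-of-vocabulary word: hypothetical uniform initialisation.
            matrix.append([rng.uniform(-0.1, 0.1) for _ in range(dim)])

    # Fraction of the vocabulary covered by the pretrained file.
    coverage = len(found) / len(vocab) if vocab else 0.0
    return matrix, coverage


# Tiny in-memory stand-in for an embedding file (made-up values).
sample = io.StringIO("protein 0.1 0.2 0.3\ncell 0.4 0.5 0.6\n")
matrix, coverage = load_text_embeddings(sample, ["protein", "cell", "kinase"], dim=3)
```

The coverage figure is included because the share of domain vocabulary found in an embedding file is one plausible factor to inspect when comparing general and biomedical embeddings. The source does not report such a figure.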

The revised model, a SQuAD (BioReddit) baseline fine-tuned with OD BioASQ data, achieved a gain of +7.22 F1 and +3.0 EM over the baseline model in OD performance on BioASQ data. Meanwhile, OD fine-tuning of a SQuAD (GloVe) baseline led to a gain of +5.36 F1 and +0.27 EM. Comparing the two OD fine-tuning cases, the use of in-domain (ID) word embeddings (GloVe, +0.27) led to a smaller gain in the EM score than the use of OD word embeddings (BioReddit, +3.0).

These preliminary results show favorable evidence for OD fine-tuning as a technique for improving the F1 score in OD QA tasks, and mild evidence for OD word embeddings in improving the EM score.
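For readers unfamiliar with the two metrics, the sketch below shows how exact match (EM) and token-level F1 are commonly computed for SQuAD-style answers. It follows the usual normalisation steps (lowercasing, then removing punctuation and English articles). It is an assumption that this experiment scored answers the same way, since the evaluation code is not shown here. The function names and example strings are illustrative.

```python
import re
import string
from collections import Counter


def normalize_answer(text: str) -> str:
    """Lowercase, drop punctuation and English articles, collapse whitespace."""
    text = text.lower()
    punctuation = set(string.punctuation)
    text = "".join(ch for ch in text if ch not in punctuation)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())


def exact_match(prediction: str, gold: str) -> float:
    """1.0 if the normalised strings are identical, else 0.0."""
    return float(normalize_answer(prediction) == normalize_answer(gold))


def f1_score(prediction: str, gold: str) -> float:
    """Harmonic mean of token precision and recall after normalisation."""
    pred_tokens = normalize_answer(prediction).split()
    gold_tokens = normalize_answer(gold).split()
    if not pred_tokens or not gold_tokens:
        # Both empty counts as a match; one empty counts as a miss.
        return float(pred_tokens == gold_tokens)
    overlap = sum((Counter(pred_tokens) & Counter(gold_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(gold_tokens)
    return 2 * precision * recall / (precision + recall)


em = exact_match("The BRCA1 gene", "brca1 gene")
f1 = f1_score("BRCA1 gene mutation", "the BRCA1 gene")
```

EM gives credit only for a fully correct span, while F1 rewards partial token overlap. This is why a system can gain several F1 points with little movement in EM.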

Repository contents (23 commits):
fp_qa
README.md
fast_bin_reader.py
nlp_outdomain_qa_report.pdf
