Preparation of French Corpus

The French corpus is a translated version of the WebNLG release 3.0 English dataset. We used an English-to-French [NMT model](https://storage.googleapis.com/samanantar-public/V0.3/models/en-indic.zip) provided by https://pytorch.org/hub/pytorch_fairseq_translation/ to generate the French sentences.

To generate the French corpus

Download the required packages

pip install -r requirements.txt

Generate files for the train, dev, and test folders

python3 run.py <path to the folder containing english xml files>

In our case, we used the English-language data path, because it is easy to replace each English lex entry with its French counterpart. The WebNLG corpus can be downloaded from this repository.
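The lex replacement described above can be illustrated with a minimal sketch. It is not the code in run.py: the function name `replace_lex`, the sample XML, and the `fake_translate` placeholder are all hypothetical, and the sketch assumes only that each verbalisation sits in a `<lex>` element, as in the WebNLG XML files. In practice the placeholder would be swapped for a call to the NMT model.

```python
import xml.etree.ElementTree as ET
from typing import Callable


def replace_lex(xml_text: str, translate: Callable[[str], str]) -> str:
    """Return the XML with the text of every <lex> element translated.

    `translate` is any function mapping an English sentence to a French one;
    everything outside the <lex> text (entries, triples, attributes) is kept.
    """
    root = ET.fromstring(xml_text)
    for lex in root.iter("lex"):
        # Skip empty elements so the translator never sees blank input.
        if lex.text and lex.text.strip():
            lex.text = translate(lex.text.strip())
    return ET.tostring(root, encoding="unicode")


# Hypothetical, heavily simplified stand-in for one WebNLG entry.
SAMPLE = (
    '<benchmark><entries><entry eid="Id1">'
    '<lex lid="Id1">Hello world.</lex>'
    "</entry></entries></benchmark>"
)


def fake_translate(sentence: str) -> str:
    # Placeholder for the NMT model: tags the sentence instead of translating it.
    return "[fr] " + sentence


result = replace_lex(SAMPLE, fake_translate)
print(result)
```

Because only the `<lex>` text changes, the translated files keep the same layout as the English ones, which is what makes the English data path a convenient starting point.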
