Major KBTs: Option to add them to the training data when the source glosses are in a different language than the training source text #441

Open
mmartin9684-sil opened this issue Jul 5, 2024 · 0 comments
Labels
enhancement (New feature or request), pipeline 3: preprocess (Issue related to preprocessing), research (Research topics)

Comments

@mmartin9684-sil
Collaborator

*** This enhancement request is for research purposes. ***

Glosses for the Major KBTs are available in only a limited number of languages. Currently, SILNLP adds glosses to the training data only when the source language of the glosses matches the source language of the translation used for training. This prevents many projects from using their KBTs. We would like to investigate the benefits of including the source/vernacular gloss pairs in the training data even when the translation source language differs from the gloss source language.
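
To make the request concrete, here is a minimal sketch of the kind of opt-in switch being asked for. It is not SILNLP's actual code: the `TermGloss` record, the `select_term_pairs` function, and the `include_mismatched` flag are all hypothetical names chosen for illustration, and the gloss strings are placeholders.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class TermGloss:
    """Hypothetical record for one key-term gloss pair (not SILNLP's data model)."""
    term_id: str
    gloss_lang: str        # language code of the source-side gloss
    source_gloss: str      # gloss in the gloss source language
    vernacular_gloss: str  # rendering in the project's target language


def select_term_pairs(
    glosses: List[TermGloss],
    train_src_lang: str,
    include_mismatched: bool = False,
) -> List[Tuple[str, str]]:
    """Return (source gloss, vernacular gloss) pairs to add to the training data.

    With include_mismatched=False this mirrors the behavior described above:
    only glosses whose language matches the training source language are kept.
    With include_mismatched=True (the hypothetical new option) all glosses are kept.
    """
    pairs = []
    for gloss in glosses:
        if include_mismatched or gloss.gloss_lang == train_src_lang:
            pairs.append((gloss.source_gloss, gloss.vernacular_gloss))
    return pairs
```

For a project whose training source text is in, say, `"sw"` while the only available glosses are tagged `"en"`, the default call returns no pairs, whereas passing `include_mismatched=True` returns all of them. Whether the extra pairs actually help the model is the research question this issue raises.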

mmartin9684-sil added the enhancement, pipeline 3: preprocess, and research labels on Jul 5, 2024
Projects
Status: 🆕 New