KBT's: option for a book-based filter for the KBT's added to the training data. #442

mmartin9684-sil · 2024-07-05T13:07:39Z

*** This enhancement request is for research purposes. ***

When KBTs are added to the training data during preprocessing, all populated KBTs are included. KBTs from completed books may be of limited benefit, since the completed verse text is already available and that verse-level data is likely better suited to model fine-tuning. In addition, for projects with extensively populated KBTs, including all of them may swamp the verse-level training data and skew the model results.

The primary intended benefit of including KBTs in the training data is better proper name translation for new books, so adding only the KBTs from new books may produce better new-book drafts. An optional book-based filter that limits which KBTs are added to the training data would allow this strategy to be evaluated.
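
To make the request concrete, here is a minimal sketch of what such a filter might look like. It is not the existing preprocessing code: the `KbtEntry` record, the `"GEN 12:1"` reference format, and the `filter_kbts_by_book` function are all hypothetical names chosen for illustration. It also assumes one possible interpretation of "KBTs from a book", namely that an entry is kept if at least one of its references falls in a listed book.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional, Tuple


@dataclass(frozen=True)
class KbtEntry:
    """Hypothetical record for one populated KBT (not an existing class)."""
    source: str              # source-language term
    rendering: str           # target-language rendering
    refs: Tuple[str, ...]    # verse references, assumed to look like "GEN 12:1"


def book_of(ref: str) -> str:
    """Return the book ID at the start of a reference, or "" if the reference is empty."""
    parts = ref.split(maxsplit=1)
    return parts[0].upper() if parts else ""


def filter_kbts_by_book(
    entries: Iterable[KbtEntry], books: Optional[Iterable[str]] = None
) -> List[KbtEntry]:
    """Keep entries with at least one reference in `books`.

    With books=None the filter is disabled and every entry is kept,
    which matches the current include-everything behavior.
    """
    if books is None:
        return list(entries)
    wanted = {b.upper() for b in books}
    return [e for e in entries if any(book_of(r) in wanted for r in e.refs)]


# Example: only RUT is treated as a new book here (illustrative data).
entries = [
    KbtEntry("Abraham", "Abrahamu", ("GEN 12:1", "ROM 4:3")),
    KbtEntry("Boaz", "Boazi", ("RUT 2:1",)),
    KbtEntry("Naomi", "Naomi", ("RUT 1:2", "RUT 4:17")),
]
new_book_kbts = filter_kbts_by_book(entries, books=["RUT"])
```

Keeping the filter optional means a run with and without it can be compared directly, which is the evaluation this issue asks for. A stricter variant (keep an entry only if all of its references are in the listed books) would be a small change to the `any` condition.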

mmartin9684-sil added the enhancement, pipeline 3: preprocess, and research labels on Jul 5, 2024
