CogStack/MedCATtutorials


MedCAT Tutorials

Introductory tutorials

These tutorials walk you through each stage of a basic MedCAT project. The blog posts tell the story and explain why each step or process we chose is necessary, while the Jupyter notebooks give you hands-on experience building and training your own MedCAT models for information extraction tasks.

| Part | Title | Google Colab | Blog Post |
| --- | --- | --- | --- |
| 1 | Introduction | - | TDS |
| 1.1 | [OPTIONAL] Logging With MedCAT | Colab | - |
| 2 | Data set Preparation and Basic Statistics | Colab | TDS |
| 3.1 | Building a new Concept Database (CDB) and Vocabulary (Vocab) | Colab | TDS |
| 3.2 | Unsupervised training and NER+L | Colab | TDS |
| 3.3 | Technical model optimisations | Colab | - |
| 4.1 | Creating a tokenizer model (huggingface) and embeddings for MetaAnnotations | Colab | - |
| 4.2 | Supervised training and fine-tuning + Meta-annotations | Colab | - |
| 4.3 | Annotating documents with the full MedCAT pipeline with MetaAnnotations | Colab | - |
| 5 | Analysing the results | Colab | TDS |

Specialised tutorials

These tutorials expand upon specific aspects of the topics covered across the introductory tutorials. If there is anything in particular you would like us to cover in the future, let us know!

| Part | Title | Google Colab |
| --- | --- | --- |
| - | Working with SNOMED CT and building a custom Concept Database (CDB) | Colab |
| - | Comparing models using regression test tooling | Colab |

Development/Editing

Make sure jupyter and jq are installed and available on your PATH. Do not edit the companion HTML versions directly; instead, install the following pre-commit hook, which regenerates them whenever you commit changes to .ipynb files:

git config --local core.hooksPath git-config/hooks
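Before enabling the hook, it can help to confirm that both tools are reachable. The following is a minimal sketch, not part of the repository: `check_tools` is a hypothetical helper name, and it only checks that each command resolves on your PATH, not which version is installed.

```shell
# Hypothetical helper (not shipped with this repository):
# report every required command that cannot be found on PATH.
check_tools() {
  missing=0
  for tool in "$@"; do
    if ! command -v "$tool" >/dev/null 2>&1; then
      echo "missing: $tool" >&2
      missing=1
    fi
  done
  # Return 0 when everything was found, 1 otherwise.
  return "$missing"
}
```

With the function defined, `check_tools jupyter jq && git config --local core.hooksPath git-config/hooks` enables the hook only when both tools are present.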

To inspect changes during code review, open Colab and select the target branch and tutorial. Once the notebook is open, click File | Revision history and select the start and end revisions you are interested in.

Known Issues:

  • If you hit a ContextualVersionConflict on Google Colab, restart the runtime and run the cell again.
  • The pre-commit hook requires nbconvert<6 and jinja2<=3.0.
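One way to satisfy the version limits for the pre-commit hook is a pip constraints file. This is only a sketch: the file name `hook-constraints.txt` is an assumption made for illustration and does not exist in the repository.

```shell
# Write the version limits listed above to a constraints file.
# The file name is hypothetical; pick any name you like.
cat > hook-constraints.txt <<'EOF'
nbconvert<6
jinja2<=3.0
EOF
```

You could then install within those limits using `python -m pip install -c hook-constraints.txt nbconvert jinja2` in the environment that runs the hook.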