arthijayaraman-lab/semi-supervised_learning_microscopy_images

Semi-supervised machine learning workflow for analysis of nanowire morphologies from TEM images

This repository contains an implementation of a semi-supervised transfer learning workflow for nanowire morphology classification and segmentation from transmission electron microscopy (TEM) images.

Paper: "Semi-supervised machine learning workflow for analysis of nanowire morphologies from transmission electron microscopy images" is available in Digital Discovery as an accepted manuscript.

Dataset

The peptide/protein nanowires used in this study were synthesized and imaged by Brian Montz in Prof. Todd Emrick's research group at the Department of Polymer Science and Engineering, University of Massachusetts Amherst.

The TEM image dataset of the nanowire morphologies, the manual segmentation ground truth masks, and the image encoders trained via self-supervised training are hosted on Zenodo, DOI: 10.5281/zenodo.6377140

Brief description of each Jupyter notebook

Preprocess images: Perform augmentation for singular morphology images and create binary segmentation ground truth data from manually labeled images. Note: users do not need to run this again.

Percolation_analysis: Provide pixel-level quantification to distinguish dispersed from network morphologies.

SimCLR_Barlow_encoder_training: Perform self-supervised training of image encoders on unlabeled images.

Hyperparameter tuning: Assess hyperparameter tuning results for self-supervised training using the downstream classification task.

Assessment of label-efficient training of downstream classification task: Assess classification performance on nanowire morphology images with feature maps obtained from encoders trained under self-supervision with optimized hyperparameters.

Assessment of classification performance on mNP dataset: Assess classification performance on metal nanoparticle morphology images with feature maps obtained from encoders trained under self-supervision with optimized hyperparameters.

Assessment of classification performance on TEM virus dataset: Assess classification performance on TEM virus images with feature maps obtained from encoders trained under self-supervision with optimized hyperparameters.

Training_of_segmentation_models: Assess segmentation performance on nanowire morphology images with feature maps obtained from encoders trained under self-supervision with optimized hyperparameters.

Result_plots_for_classification_segmentation_and_hyperparameter_tuning: Create boxplots for the three sets of classification results and for the segmentation results.

One-shot learning result plots: Examine one-shot learning with the nanowire morphology dataset.
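To illustrate the idea behind the Percolation_analysis notebook, here is a minimal sketch of a pixel-level percolation test on a binary segmentation mask. It is not the notebook's actual implementation. The function name `spans_left_to_right`, the choice of 8-connectivity, and the left-to-right spanning criterion are all assumptions made for illustration: a mask is treated as "network-like" if connected foreground pixels bridge the left and right image borders.

```python
from collections import deque


def spans_left_to_right(mask):
    """Return True if foreground pixels connect the left and right borders.

    `mask` is any 2D array-like of 0/1 values (nested lists or a NumPy array).
    Pixels are considered connected if they touch by edge or corner
    (8-connectivity); this is an illustrative choice, not the notebook's.
    """
    rows, cols = len(mask), len(mask[0])
    seen = [[False] * cols for _ in range(rows)]
    queue = deque()

    # Seed the search with every foreground pixel in the left column.
    for r in range(rows):
        if mask[r][0]:
            seen[r][0] = True
            queue.append((r, 0))

    # Breadth-first flood fill from the left border.
    while queue:
        r, c = queue.popleft()
        if c == cols - 1:
            return True  # reached the right border
        for dr in (-1, 0, 1):
            for dc in (-1, 0, 1):
                nr, nc = r + dr, c + dc
                if (0 <= nr < rows and 0 <= nc < cols
                        and not seen[nr][nc] and mask[nr][nc]):
                    seen[nr][nc] = True
                    queue.append((nr, nc))
    return False


# Toy masks: a wire crossing the whole image vs. isolated wire fragments.
network_like = [[0, 0, 0, 0],
                [1, 1, 1, 1],
                [0, 0, 0, 0]]
dispersed_like = [[1, 0, 0, 1],
                  [0, 0, 0, 0],
                  [1, 0, 0, 1]]
```

On these toy masks, `spans_left_to_right(network_like)` is true and `spans_left_to_right(dispersed_like)` is false. A real analysis would likely also check top-to-bottom spanning and tolerate small gaps in the segmentation.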

Instructions to use/adapt the notebooks

Download the open-access contents of the Zenodo dataset listed above to your local drive and unzip them into their respective folders. A "2022-nanowire-morphology" folder should be created, with "dispersed", "bundle", "network" and "singular" unzipped into it.

Create a folder named "TEM image datasets" in your Google Drive and upload the unzipped folders to "TEM image datasets".

Some notebooks need additional folders (under the main folder "TEM image datasets") to store intermediate models and figures. Where required, these folders are listed at the beginning of the notebook and must be created manually. The requirements differ between notebooks, and some notebooks need no additional folders.

Notebooks can then be run on Google Colab. Use a GPU runtime for SimCLR_Barlow_encoder_training and Training_of_segmentation_models. The remaining notebooks can be run on a CPU runtime (no GPU).

The notebooks can be adapted to test on different user-provided datasets with modifications.
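Before running a notebook, it can help to confirm that the folder layout described in the steps above is in place. The following is a minimal sketch, not part of the repository. The helper name `missing_dataset_folders` is hypothetical, and the Colab path shown in the comments assumes Drive is mounted at `/content/drive` with the dataset folder at the top level of "MyDrive".

```python
from pathlib import Path

# Morphology folders named in the instructions above.
MORPHOLOGY_FOLDERS = ("dispersed", "bundle", "network", "singular")


def missing_dataset_folders(root):
    """Return the expected morphology folders that are missing under `root`.

    `root` is the "TEM image datasets" folder; the morphology folders are
    expected inside its "2022-nanowire-morphology" subfolder.
    """
    dataset_dir = Path(root) / "2022-nanowire-morphology"
    return [name for name in MORPHOLOGY_FOLDERS
            if not (dataset_dir / name).is_dir()]


# In Colab, mount Google Drive first (assumed mount point and Drive layout):
#   from google.colab import drive
#   drive.mount("/content/drive")
#   root = "/content/drive/MyDrive/TEM image datasets"
#   print(missing_dataset_folders(root))
```

An empty list means all four morphology folders were found. Any names returned still need to be unzipped and uploaded.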

Citing

If you use the code in this repository, please cite the following manuscript: S. Lu, B. Montz, T. Emrick and A. Jayaraman, Digital Discovery, 2022, 1, 816, DOI: 10.1039/D2DD00066K

Funding

This project is financially supported by the U.S. National Science Foundation, Grant NSF DMREF #1921839 and #1921871.