OpenCIL: Benchmarking Out-of-Distribution Detection in Class-Incremental Learning

Description

OpenCIL is the first benchmark platform expressly tailored to both post-hoc and fine-tuning-based out-of-distribution (OOD) detection methods for different types of class-incremental learning (CIL) models, evaluating the capability of CIL models to reject diverse OOD samples. OpenCIL offers a flexible algorithmic template that allows users to implement their own designs within the platform with minimal effort. It also provides a new evaluation protocol to fairly and systematically compare diverse OOD detection methods across incremental steps, and it comprehensively evaluates 60 baselines composed of 15 OOD detectors and 4 CIL models. We further propose a new baseline, BER, that effectively mitigates the common issues observed in the 60 baselines.

Setup

Environment

## Install the base conda environment for OpenCIL
conda env create -f environment.yaml
conda activate opencil

## Install additional packages
pip install seaborn

Dataset

Download and Prepare data

Download the full dataset from the provided GDrive link, then extract it into the folder data, which should have the following structure:


├── data
│   ├── benchmark_imglist
│   │   ├── cifar100
│   │   └── imagenet
│   ├── images_classic
│   │   ├── cifar10
│   │   ├── cifar100
│   │   ├── mnist
│   │   ├── places365
│   │   ├── svhn
│   │   ├── texture
│   │   └── tin
│   └── images_largescale
│       ├── imagenet_1k
│       │   ├── train
│       │   └── val
│       ├── imagenet_o
│       ├── inaturalist
│       ├── openimage_o
│       └── species_sub
Note that the train.zip and val.zip files uploaded under largescale_data on GDrive should be extracted inside the imagenet_1k folder as shown above.

Dataset explanation

1. ID datasets for training the class-incremental learning model

Class-incremental learning models are trained on two main datasets: a small-scale dataset (cifar100) and a large-scale dataset (imagenet1k).

Training such a model requires splitting the original dataset into multiple tasks; the data samples of each task form a class-disjoint set from the samples of all other tasks. Please refer to the problem formulation of CIL for further information.

At each task, a collection of data samples from the current task, along with an exemplar set of samples from previous tasks, is fed into the model for training. The settings below vary the number of classes per task used to train the CIL model on each dataset:

  • cifar100: 100 classes in total with 3 main settings:
    • 5 tasks: 20 classes per task
    • 10 tasks: 10 classes per task
    • 20 tasks: 5 classes per task
    • exemplar size for all task settings: 2,000
  • imagenet1k: 1000 classes in total with 3 main settings:
    • 5 tasks: 200 classes per task
    • 10 tasks: 100 classes per task
    • 20 tasks: 50 classes per task
    • exemplar size for all task settings: 20,000

These data settings for training the incremental learning model are set up automatically once you run the training script (a minimal sketch of the task split is shown below). For more details on how the data is set up, please refer to the main data-loading code in opencil/datasets/data_manager_pycil.py, which is called via get_datamanager inside opencil/datasets/utils.py.
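For intuition, here is a minimal, self-contained Python sketch of the task split described above; it is an illustration under the stated class and exemplar budgets, not the repository's actual data manager:

import random

def split_into_tasks(num_classes, num_tasks, seed=0):
    # Split the label set into class-disjoint tasks, e.g. 100 classes -> 10 tasks of 10 classes.
    classes = list(range(num_classes))
    random.Random(seed).shuffle(classes)
    per_task = num_classes // num_tasks
    return [classes[i * per_task:(i + 1) * per_task] for i in range(num_tasks)]

def exemplars_per_class(total_budget, num_seen_classes):
    # A fixed exemplar memory (e.g. 2,000 for cifar100) is shared by all classes seen so far.
    return total_budget // max(num_seen_classes, 1)

tasks = split_into_tasks(num_classes=100, num_tasks=10)
assert all(set(a).isdisjoint(b) for i, a in enumerate(tasks) for b in tasks[i + 1:])
print(exemplars_per_class(total_budget=2000, num_seen_classes=20))  # 100 exemplars per class after task 2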

2. Dataset for training fine-tuning-based OOD detection methods (optional)

After training a class-incremental learning model, we can freeze the feature extractor of the CIL model and fine-tune an extra classifier to reject OOD samples while maintaining incremental classification accuracy.

Furthermore, only the ID data of the current task can be used to train this extra classifier, so the data split is the same as above. A minimal sketch of this fine-tuning setup is shown below.
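The sketch below illustrates this setup in PyTorch with a stand-in backbone; the actual OpenCIL fine-tuning code differs in architecture and training-loop details:

import torch
import torch.nn as nn

def build_finetune_head(backbone, feat_dim, num_seen_classes):
    # Freeze the CIL feature extractor so incremental classification accuracy is preserved.
    for p in backbone.parameters():
        p.requires_grad = False
    backbone.eval()
    # Only this extra classifier is trained, and only on the current-task ID data.
    return nn.Linear(feat_dim, num_seen_classes)

backbone = nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, 512))  # stand-in for a ResNet feature extractor
head = build_finetune_head(backbone, feat_dim=512, num_seen_classes=20)
optimizer = torch.optim.SGD(head.parameters(), lr=0.01)

x, y = torch.randn(8, 3, 32, 32), torch.randint(0, 20, (8,))  # dummy CIFAR-sized batch
with torch.no_grad():
    feats = backbone(x)
loss = nn.functional.cross_entropy(head(feats), y)
loss.backward()
optimizer.step()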

3. OOD datasets for testing a pretrained or finetuned CIL model

After training or fine-tuning a class-incremental learning model, we test its OOD detection performance on separate OOD datasets.

Experiments should be conducted on both small-scale and large-scale datasets to demonstrate the model's robustness (a sketch of the evaluation metrics follows the list below):

  • Small-scale setting → train the CIL model on CIFAR-100 (100 classes) and then test the OOD scoring function on the following OOD datasets:
    • nearood: cifar10, tin.
    • farood: mnist, svhn, texture, places365.
  • Large-scale setting → train the CIL model on ImageNet-1K (1000 classes) and then test the OOD scoring function on the following OOD datasets:
    • nearood: species, inaturalist, openimage_o, imagenet_o.
    • farood: texture, mnist.
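For reference, a minimal sketch of the two standard OOD metrics (AUROC and FPR@95) computed from ID and OOD scores, assuming higher scores mean "more in-distribution" and using scikit-learn for the AUROC; the benchmark's own evaluator may differ in details:

import numpy as np
from sklearn.metrics import roc_auc_score

def ood_metrics(id_scores, ood_scores):
    # ID samples are the positive class; higher score = more in-distribution.
    labels = np.concatenate([np.ones_like(id_scores), np.zeros_like(ood_scores)])
    scores = np.concatenate([id_scores, ood_scores])
    auroc = roc_auc_score(labels, scores)
    # FPR@95: false positive rate at the threshold that keeps 95% of ID samples (TPR = 95%).
    threshold = np.percentile(id_scores, 5)
    fpr95 = float((ood_scores >= threshold).mean())
    return auroc, fpr95

auroc, fpr95 = ood_metrics(np.random.randn(1000) + 2.0, np.random.randn(1000))
print(f"AUROC={auroc:.3f}, FPR@95={fpr95:.3f}")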

Usage

This repository is mainly driven by scripts. They are defined in scripts, and there are four main types:

Fast replay of all experiments

bash run.sh

Training a CIL method

Scripts for training any supported CIL method are defined in scripts/cil_only/$CIL_METHOD$. We benchmark 4 of the most common class-incremental learning methods:

  • iCaRL: Incremental Classifier and Representation Learning. Paper
  • BiC: Large Scale Incremental Learning. Paper
  • FOSTER: Feature Boosting and Compression for Class-Incremental Learning. Paper
  • WA: Maintaining Discrimination and Fairness in Class Incremental Learning. Paper

These scripts can be used to train an incremental learning model on the two main datasets: cifar100 or imagenet1k. You can add more CIL methods by following exactly the same configuration as the methods above.

How to run:

An example of training a CIL model with the iCaRL method, using 10 tasks on the cifar100 dataset, is shown below:

bash scripts/cil_only/icarl/cifar100_train_cil.sh 10

Testing OOD scoring function on pretrained CIL model

Scripts for testing the OOD performance of a previously trained incremental learning model are stored in scripts/ood_after_cil.

The folder names of these scripts combine the CIL method you want to test with the OOD scoring function you want to use.

For example, testing a CIL model trained with the iCaRL approach on the cifar100 dataset using the EBO scoring function corresponds to the script scripts/ood_after_cil/icarl/icarl_ebo/cifar100_test_ood.sh.

The 9 most common OOD scoring functions implemented in this repository are listed below (a minimal sketch of two of them follows the list):

  • MSP: A Baseline for Detecting Misclassified and Out-of-Distribution Examples in Neural Networks. Paper. Note that this scoring function goes by two names inside the code: msp or baseood.
  • ODIN: Enhancing The Reliability of Out-of-distribution Image Detection in Neural Networks. Paper
  • EBO: Energy-based Out-of-distribution Detection. Paper
  • MaxLogit: Scaling Out-of-Distribution Detection for Real-World Settings. Paper
  • GEN: Pushing the Limits of Softmax-Based Out-of-Distribution Detection. Paper
  • ReAct: Out-of-Distribution Detection with Rectified Activations. Paper
  • KLM: Scaling Out-of-Distribution Detection for Real-World Settings. Paper
  • Relation: Neural Relation Graph: A Unified Framework for Identifying Label Noise and Outlier Data. Paper
  • NNGuide: Nearest Neighbor Guidance for Out-of-Distribution Detection. Paper
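For intuition, here is a minimal sketch of two of these post-hoc scores (MSP and EBO) computed from a model's logits; it is an illustration, not the repository's implementation:

import torch
import torch.nn.functional as F

def msp_score(logits):
    # MSP: maximum softmax probability; higher = more likely in-distribution.
    return F.softmax(logits, dim=1).max(dim=1).values

def ebo_score(logits, temperature=1.0):
    # EBO: negative free energy; higher = more likely in-distribution.
    return temperature * torch.logsumexp(logits / temperature, dim=1)

logits = torch.randn(4, 100)  # e.g. a batch of CIFAR-100 logits from a CIL model
print(msp_score(logits))
print(ebo_score(logits))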

How to run

An example of testing an iCaRL CIL model trained on CIFAR-100 with the MSP scoring function and 10 tasks is shown below:

bash scripts/ood_after_cil/icarl/icarl_baseood/cifar100_test_ood.sh 10

Fine-tuning-based OOD detection methods on pretrained CIL model

Scripts for fine-tuning any supported OOD detection method on a pretrained CIL model are defined in scripts/cil_only_finetune/$OOD_METHOD$. We benchmark 6 of the most common fine-tuning-based OOD detection methods (a LogitNorm sketch follows the list):

  • LogitNorm: Mitigating Neural Network Overconfidence with Logit Normalization. Paper
  • T2FNorm: Extremely Simple Scaled Train-Time Feature Normalization for OOD Detection. Paper
  • AUGMIX: A Simple Data Processing Method to Improve Robustness and Uncertainty. Paper
  • REGMIX: Mixup as a Regularizer Can Surprisingly Improve Accuracy and Out-of-Distribution Robustness. Paper
  • VOS: Learning What You Don't Know by Virtual Outlier Synthesis. Paper
  • NPOS: Non-Parametric Outlier Synthesis. Paper

Our proposed baseline BER is also included.
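As one example of such an objective, below is a minimal sketch of the LogitNorm training loss, which normalizes the logit vector before cross-entropy; the hyperparameter and the integration with the CIL model are illustrative only:

import torch
import torch.nn.functional as F

def logitnorm_loss(logits, targets, tau=0.04):
    # LogitNorm: divide each logit vector by its L2 norm (scaled by a temperature tau)
    # before cross-entropy, which discourages overconfident predictions on OOD inputs.
    norms = logits.norm(p=2, dim=1, keepdim=True) + 1e-7
    return F.cross_entropy(logits / (norms * tau), targets)

logits = torch.randn(8, 100, requires_grad=True)  # logits from the extra classifier being fine-tuned
targets = torch.randint(0, 100, (8,))
loss = logitnorm_loss(logits, targets)
loss.backward()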

Note that the pretrained CIL model should be prepared first. You can choose any of the 4 supported CIL models as the pretrained CIL method, which is a required parameter.

How to run

An example of fine-tuning the OOD detection method REGMIX on an iCaRL CIL model with 10 tasks on the cifar100 dataset is shown below:

bash scripts/cil_only_finetune/regmix/cifar100_train_cil.sh icarl 10

Testing OOD scoring function on finetuned CIL model

Scripts for testing the OOD performance of a finetuned CIL model are stored in scripts/ood_after_cil_finetune/$OOD_METHOD$.

Note that the finetuned CIL model should be prepared first. You can choose any of the 4 supported CIL models as the finetuned CIL method, which is a required parameter.

How to run

An example of testing an iCaRL CIL model finetuned with the REGMIX OOD detection method on CIFAR-100 with 10 tasks is shown below:

bash scripts/ood_after_cil_finetune/regmix/cifar100_test_ood.sh icarl 10

Acknowledgements

The code and the framework are based on the OpenOOD repository.