
Attentional-Biased Stochastic Gradient Descent

This is the official implementation of Algorithm 1 in the paper "Attentional-Biased Stochastic Gradient Descent".


ABSGD is an instance reweighting method that encourages the model to focus on hard samples by assigning higher robust weights $\tilde{p}_i$ to samples with larger losses.


Key parameters of ABSGD

--mylambda : $\lambda$ (default 0.5), the temperature parameter for the robust weights
--abgamma : $\gamma \in (0, 1)$ (default 0.9), the moving-average hyperparameter that maintains history information
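To give a rough sense of how these two parameters act together, the sketch below computes softmax-style robust weights within a mini-batch: per-sample losses are scaled by the temperature $\lambda$, and the normalizer is a moving average maintained with $\gamma$. This is an illustrative approximation of Algorithm 1, not the package's internal implementation; the helper name robust_weights is ours.

import torch

def robust_weights(losses, mylambda=0.5, abgamma=0.9, u=None):
    # losses: per-sample losses of shape (batch_size,), e.g. from a reduction='none' criterion
    exp_losses = torch.exp(losses.detach() / mylambda)
    batch_mean = exp_losses.mean()
    # moving average keeps history information across mini-batches
    u = batch_mean if u is None else abgamma * u + (1.0 - abgamma) * batch_mean
    # robust weights: samples with larger losses receive larger weights
    return exp_losses / (u * losses.numel()), u

# usage inside a training step:
#   p, u = robust_weights(losses, mylambda=0.5, abgamma=0.9, u=u)
#   (p * losses).sum().backward()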

News

With the assistance of ABSGD, we achieved 1st place among ResNet50 entries (4th of 16 overall) in the iWildCam out-of-distribution challenge, 10/2022. The code is provided in the wilds-competition repo.

The absgd package has been released. To install:

pip3 install absgd

ABSGD Package: Training tutorial and examples

>>> from absgd.losses import ABLoss
>>> from absgd.optimizers import ABSGD, ABAdam

You can design your own loss. The following is a use case; for more details, please refer to ABSGD_tutorial.ipynb.

>>> # import libraries
>>> import torch.nn as nn
>>> from absgd.losses import ABLoss
>>> from absgd.optimizers import ABSGD, ABAdam
...
>>> # define the loss
>>> mylambda = 0.5
>>> # the base criterion can easily be combined with existing CBCE or LDAM losses; please refer to our paper https://arxiv.org/pdf/2012.06951.pdf
>>> criterion = nn.CrossEntropyLoss(reduction='none')
>>> abloss = ABLoss(mylambda, criterion=criterion)
>>> optimizer = ABSGD()  # see ABSGD_tutorial.ipynb for the full optimizer configuration
...
>>> # training
>>> model.train()
>>> for epoch in range(epochs):
...     for i, (inputs, targets) in enumerate(train_loader):
...         inputs, targets = inputs.cuda(), targets.cuda()
...         outputs = model(inputs)
...         losses = abloss(outputs, targets)
...         optimizer.zero_grad()
...         losses.backward()
...         optimizer.step()
...     # two-stage $\lambda$ update at the end of each epoch
...     abloss.updateLambda()
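Any criterion that returns one loss value per sample (reduction='none' semantics) can be passed to ABLoss. As an illustration of designing your own loss, the sketch below defines a label-smoothed cross-entropy; the class name SmoothedCELoss is ours and is not part of the absgd package.

import torch.nn as nn
import torch.nn.functional as F

class SmoothedCELoss(nn.Module):
    # illustrative custom per-sample loss: cross-entropy with label smoothing
    def __init__(self, smoothing=0.1):
        super().__init__()
        self.smoothing = smoothing

    def forward(self, logits, targets):
        log_probs = F.log_softmax(logits, dim=-1)
        nll = -log_probs.gather(dim=-1, index=targets.unsqueeze(1)).squeeze(1)
        smooth = -log_probs.mean(dim=-1)
        # returns one value per sample so that ABLoss can reweight them
        return (1.0 - self.smoothing) * nll + self.smoothing * smooth

# plug it into ABLoss exactly like nn.CrossEntropyLoss(reduction='none'):
#   abloss = ABLoss(mylambda, criterion=SmoothedCELoss(smoothing=0.1))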

Reminder

If you want the code that reproduces the reported table results for Attentional-Biased Stochastic Gradient Descent, please download the current repository and refer to the next section!

Reproduce the results in the paper

In the paper, we combine ABSGD with other SOTA losses such as CBCE, LDAM, and focal loss:

        # default criterion: class-balanced cross-entropy, returning per-sample losses
        self.criterion = CBCELoss(reduction='none')
        if 'ldam' in self.loss_type:
            self.criterion = LDAMLoss(cls_num_list=args.cls_num_list, max_m=0.5, s=30, reduction='none')
        elif 'focal' in self.loss_type:
            self.criterion = FocalLoss(gamma=args.gamma, reduction='none')
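FocalLoss above refers to the implementation shipped in this repository. For readers unfamiliar with it, a minimal per-sample focal loss (equivalent in spirit, though not necessarily identical to the repo's version) looks like the following sketch.

import torch
import torch.nn as nn
import torch.nn.functional as F

class PerSampleFocalLoss(nn.Module):
    # minimal sketch of a focal loss that returns per-sample values,
    # so the values can be reweighted by ABSGD (illustrative, not the repo's exact code)
    def __init__(self, gamma=1.0):
        super().__init__()
        self.gamma = gamma

    def forward(self, logits, targets):
        ce = F.cross_entropy(logits, targets, reduction='none')  # -log p_t per sample
        pt = torch.exp(-ce)                                      # probability of the true class
        return (1.0 - pt) ** self.gamma * ce                     # down-weights easy examples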

Running Examples for Data Imbalance Settings

CIFAR10 with exponential imbalance ratio $\rho = 0.1$

python3 -W ignore train.py --dataset cifar10 --model resnet32 --epochs 200 --batch_size 100 --gpu 1 --loss_type abce --print_freq 100 --lamda 5 --init_lamda 200 --imb_factor 0.1 --seed 0 --CB_shots 160 --lr 0.1 --drogamma 0.7 --abAlpha 0.5 --imb_type exp --train_rule reweight --DP 0.2;

CIFAR100 with step imbalance ratio $\rho = 0.01$

python3 -W ignore train.py --dataset cifar100 --model resnet32 --epochs 200 --batch_size 128 --gpu 3 --loss_type abce --print_freq 100 --lamda 3 --init_lamda 200 --imb_factor 0.01 --seed 0 --CB_shots 160 --lr 0.1 --drogamma 0.45 --abAlpha 0.3 --imb_type step --train_rule reweight;

Running Examples for Noisy Label Settings

CUDA_VISIBLE_DEVICES=0 python3 -W ignore train.py --lamda -5 --dataset_type clothing1M --batch_size 128 --epoch 10 --droGamma 0.5 --l2_reg 1e-3 --lr 0.005 --alpha 0.1 --nr 0.4 --abAlpha 0.5 --loss ABSCE --class_tau 0 --version Q_ABSCE0.4_cloth1M_-5_drogamma_0.5_lr_0.005_clt_0;
CUDA_VISIBLE_DEVICES=6 python3 -W ignore train.py --lamda -5 --dataset_type clothing1M --batch_size 128 --epoch 10 --droGamma 0.5 --l2_reg 1e-3 --lr 0.005 --alpha 0.1 --nr 0.4 --abAlpha 0.5 --loss ABCE --class_tau 0 --version Q_ABCE0.4_cloth1M_-5_drogamma_0.5_lr_0.005_clt_0;
CUDA_VISIBLE_DEVICES=7 python3 -W ignore train.py --lamda -5 --dataset_type clothing1M --batch_size 128 --epoch 10 --droGamma 0.5 --l2_reg 1e-3 --lr 0.005 --alpha 0.1 --nr 0.4 --abAlpha 0.5 --loss ABTCE --class_tau 0 --version Q_ABTCE0.4_cloth1M_-5_drogamma_0.5_lr_0.005_clt_0;

About

Implementation of the ABSGD algorithm in the paper https://arxiv.org/abs/2012.06951
