
I couldn't show that cnn_distill has higher performance than base_cnn. #18

Open
K-Won opened this issue Oct 8, 2019 · 1 comment

K-Won commented Oct 8, 2019

This is my situation.
I trained base_cnn in advance on the cifar10 dataset to compare the performance of base_cnn and cnn_distill.

I also trained base_resnet18 as a teacher on the same dataset.
Lastly, I trained cnn_distill using the resnet18 teacher.

I got two accuracies, 0.875 for base_cnn and 0.858 for cnn_distill, from each model's metrics_val_best_weights.json.
It looks like base_cnn is better than cnn_distill.

I didn't change any parameters for base_cnn or cnn_distill except one: the augmentation value in base_cnn's params.json, which I switched from 'no' to 'yes'.

I think there would be no reason to use knowledge distillation if base_cnn already has higher accuracy.
Please let me know where I went wrong.
Thanks for your time.
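
For context, here is a minimal sketch of the distillation objective I understand cnn_distill to be optimizing, assuming the usual Hinton-style blend of softened teacher targets and hard labels; the `alpha` and `temperature` names are illustrative and may not match the exact keys in the repo's params.json:

```python
import torch.nn.functional as F

def kd_loss(student_logits, teacher_logits, labels, alpha=0.9, temperature=4.0):
    # Soft-target term: KL divergence between temperature-softened distributions,
    # scaled by T^2 so its gradients stay comparable to the hard-label term.
    soft = F.kl_div(
        F.log_softmax(student_logits / temperature, dim=1),
        F.softmax(teacher_logits / temperature, dim=1),
        reduction="batchmean",
    ) * (temperature ** 2)
    # Hard-target term: ordinary cross-entropy against the true labels.
    hard = F.cross_entropy(student_logits, labels)
    return alpha * soft + (1.0 - alpha) * hard
```

With a larger `alpha`, the student is pulled more toward the teacher's softened predictions and less toward the raw labels.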

zshy1205 commented

@K-Won I think if your base model is already fairly complex, you can't expect an improvement from distillation. Try a smaller student model instead: train it once with distillation and once without, and I think you will see a difference between the two models.
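
Concretely, something like the sketch below: train the same small student twice with identical settings, differing only in whether the teacher's soft targets are used. `SmallCNN`, `teacher`, and `train_loader` are placeholders for your own small model, your trained base_resnet18, and your cifar10 loader, and the loss is the same Hinton-style blend as above rather than the repo's exact code:

```python
import torch
import torch.nn.functional as F

def train_student(student, teacher=None, epochs=30, lr=1e-3,
                  alpha=0.9, temperature=4.0):
    opt = torch.optim.Adam(student.parameters(), lr=lr)
    student.train()
    for _ in range(epochs):
        for images, labels in train_loader:                 # placeholder cifar10 loader
            logits = student(images)
            loss = F.cross_entropy(logits, labels)          # hard labels
            if teacher is not None:
                with torch.no_grad():
                    t_logits = teacher(images)              # frozen teacher
                soft = F.kl_div(
                    F.log_softmax(logits / temperature, dim=1),
                    F.softmax(t_logits / temperature, dim=1),
                    reduction="batchmean",
                ) * (temperature ** 2)
                loss = alpha * soft + (1.0 - alpha) * loss  # blend soft + hard
            opt.zero_grad()
            loss.backward()
            opt.step()
    return student

baseline  = train_student(SmallCNN())                   # cross-entropy only
distilled = train_student(SmallCNN(), teacher=teacher)  # with distillation
```

Then evaluate both students on the same validation split; the accuracy gap, if any, is the effect of distillation alone.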
