
I couldn't show that cnn_distill has higher performance than base_cnn. #18

Open
K-Won opened this issue Oct 8, 2019 · 1 comment

K-Won commented Oct 8, 2019

This is my situation.
I trained base_cnn in advance on the cifar10 dataset to compare the performance of base_cnn and cnn_distill.

I also trained base_resnet18 as a teacher on the same dataset.
Lastly, I trained cnn_distill using the resnet18 teacher.

I got two accuracies, 0.875 for base_cnn and 0.858 for cnn_distill, from each model's metrics_val_best_weights.json.
It looks like base_cnn is better than cnn_distill.

I didn't change any parameters for base_cnn or cnn_distill except one: the augmentation value in base_cnn's params.json, which I switched from 'no' to 'yes'.

I think there would be no reason to use knowledge distillation if base_cnn already has higher accuracy.
Please let me know where I went wrong.
Thanks for your time.
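
For context, here is a minimal sketch of the distillation objective I understand cnn_distill to be optimizing, assuming the usual Hinton-style blend of softened teacher targets and hard labels; the `alpha` and `temperature` names are illustrative and may not match the exact keys in the repo's params.json:

```python
import torch.nn.functional as F

def kd_loss(student_logits, teacher_logits, labels, alpha=0.9, temperature=4.0):
    # Soft-target term: KL divergence between temperature-softened distributions,
    # scaled by T^2 so its gradients stay comparable to the hard-label term.
    soft = F.kl_div(
        F.log_softmax(student_logits / temperature, dim=1),
        F.softmax(teacher_logits / temperature, dim=1),
        reduction="batchmean",
    ) * (temperature ** 2)
    # Hard-target term: ordinary cross-entropy against the true labels.
    hard = F.cross_entropy(student_logits, labels)
    return alpha * soft + (1.0 - alpha) * hard
```

With a larger `alpha`, the student is pulled more toward the teacher's softened predictions and less toward the raw labels.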

zshy1205 commented

@K-Won I think if your base model is already fairly complex, you can't expect an improvement from distillation. Try a smaller student model instead: train it once with distillation and once without, and I think you will see a difference between the two models.
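
Concretely, something like the sketch below: train the same small student twice with identical settings, differing only in whether the teacher's soft targets are used. `SmallCNN`, `teacher`, and `train_loader` are placeholders for your own small model, your trained base_resnet18, and your cifar10 loader, and the loss is the same Hinton-style blend as above rather than the repo's exact code:

```python
import torch
import torch.nn.functional as F

def train_student(student, teacher=None, epochs=30, lr=1e-3,
                  alpha=0.9, temperature=4.0):
    opt = torch.optim.Adam(student.parameters(), lr=lr)
    student.train()
    for _ in range(epochs):
        for images, labels in train_loader:                 # placeholder cifar10 loader
            logits = student(images)
            loss = F.cross_entropy(logits, labels)          # hard labels
            if teacher is not None:
                with torch.no_grad():
                    t_logits = teacher(images)              # frozen teacher
                soft = F.kl_div(
                    F.log_softmax(logits / temperature, dim=1),
                    F.softmax(t_logits / temperature, dim=1),
                    reduction="batchmean",
                ) * (temperature ** 2)
                loss = alpha * soft + (1.0 - alpha) * loss  # blend soft + hard
            opt.zero_grad()
            loss.backward()
            opt.step()
    return student

baseline  = train_student(SmallCNN())                   # cross-entropy only
distilled = train_student(SmallCNN(), teacher=teacher)  # with distillation
```

Then evaluate both students on the same validation split; the accuracy gap, if any, is the effect of distillation alone.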
