
how to run the codes #9

Open
naimesha opened this issue Jun 29, 2020 · 29 comments

@naimesha

Can somebody please explain how to run the code, and what "config" is in train_classifier()?


cemanil commented Jun 30, 2020

Hi,

"Config" in train_classifier is an object that contains the details of the experiment configuration.
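For reference, a JSON experiment config like the ones in the repo is typically loaded into a dict-like object with attribute access. A minimal sketch of that pattern, using `SimpleNamespace` purely for illustration - the repo's actual loader may use a different helper class:

```python
import json
from types import SimpleNamespace

def load_config(path):
    # Parse the experiment JSON into nested SimpleNamespace objects so
    # fields can be read as attributes, e.g. config.logging.save_model
    # instead of config["logging"]["save_model"]. Illustrative only;
    # the repo's real loader may differ.
    with open(path) as f:
        return json.load(f, object_hook=lambda d: SimpleNamespace(**d))
```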

Do you mind elaborating on what you wish to run that's not (or insufficiently) covered in the README?


naimesha commented Jul 1, 2020 via email


naimesha commented Jul 1, 2020 via email


cemanil commented Jul 1, 2020

It sounds like the "Tasks" section of the README contains what you need.
For example, you can run
"""
python ./lnets/tasks/classification/mains/train_classifier.py ./lnets/tasks/classification/configs/standard/fc_classification.json
"""
to train a classification network. The json file is directly processed.

Hope this helps.


naimesha commented Jul 1, 2020 via email


naimesha commented Jul 1, 2020 via email


naimesha commented Jul 1, 2020

Please reply about the error. Thanks in advance.


cemanil commented Jul 1, 2020

I see - it's possible that the attribute error you're getting is because you're using a different PyTorch version. Which version are you using?

The majority of the code should run without problems on the current version, but it might require a few minor modifications.


naimesha commented Jul 1, 2020 via email


cemanil commented Jul 1, 2020

That might be it - the code has only been tested rigorously on PyTorch version 0.4. We are planning to upgrade the repo at some point in the future, but that might not be soon enough for your final year project. Perhaps you can try running things on PyTorch 0.4?

The other error seems to be due to the program trying to load a model that hasn't been saved during training. In the config, you'll find the logging.save_model field. Setting that to True should fix the problem.


naimesha commented Jul 1, 2020 via email


naimesha commented Jul 2, 2020

I tried running the code after changing logging.save_model to true, but I am still seeing this error:
https://user-images.githubusercontent.com/30970597/86319293-d8915f00-bc51-11ea-93e0-1a7ff8cd9bb4.jpeg
After the above error I also changed logging.best_model to true and got the error below:
https://user-images.githubusercontent.com/30970597/86319357-fc54a500-bc51-11ea-96f2-74bdbd31b847.jpeg


naimesha commented Jul 2, 2020

Hey!
I am experiencing the same error for all the scripts: "no such file or directory".
https://user-images.githubusercontent.com/30970597/86326974-1a290680-bc60-11ea-9fb1-b74af69a28a3.png

@naimesha naimesha closed this as completed Jul 2, 2020
@naimesha naimesha reopened this Jul 2, 2020

naimesha commented Jul 2, 2020

Hey!
I closed the issue by mistake.
Please reply when you can.
Thank you.


cemanil commented Jul 2, 2020

Hi,

I cannot reproduce the error you're getting. In my setup, the best models get saved and are successfully loaded for validation.

This is the command I ran:
"""
python ./lnets/tasks/dualnets/mains/train_dual.py ./lnets/tasks/dualnets/configs/absolute_value_experiment.json
"""
The only modifications I made in the json were 1) setting save_model and save_best to True, and 2) reducing the number of training epochs (so that I could debug faster).

Here are the last few lines printed out by the program before it terminates:
"""
Epoch 8: 16it [00:00, 37.16it/s]
Training loss: -0.9953
Saving new best model at out/wde/wasserstein_distance_estimation_absolute_value_experiment_MultiSphericalShell_and_MultiSphericalShell_aggmo_0.01_dual_fc_linear_bjorck_act_maxmin_depth_3_width_128_grouping_2_2020_07_02_10_22_28_870290/checkpoints/best.
Averaged validation loss: -0.995928555727005
Epoch 9: 16it [00:00, 36.52it/s]
Training loss: -0.9946
Averaged validation loss: -0.996421679854393
Epoch 10: 16it [00:00, 36.44it/s]
Training loss: -0.9953
Averaged validation loss: -0.9966489151120186
Epoch 11: 16it [00:00, 36.62it/s]
Training loss: -0.9932
Averaged validation loss: -0.9966634809970856
Epoch 12: 16it [00:00, 35.48it/s]
Training loss: -0.9942
Averaged validation loss: -0.9944535940885544
Epoch 13: 16it [00:00, 34.49it/s]
Training loss: -0.9943
Averaged validation loss: -0.988710567355156
Epoch 14: 16it [00:00, 34.47it/s]
Training loss: -0.9945
Averaged validation loss: -0.9972907453775406
Epoch 15: 16it [00:00, 34.08it/s]
Training loss: -0.9912
Averaged validation loss: -0.9979196637868881
Testing best model.
Averaged validation loss: -0.995915874838829
"""

At epoch 8, the best model until that point gets saved.

Could you confirm:

  1. that your program prints lines starting with "Saving new best model at ...", and
  2. that after those lines appear, the models actually get saved in the specified directories?
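For point 2, a quick way to check whether anything was actually written is to inspect the run's checkpoint directory from Python. The `checkpoints/best` layout below is inferred from the "Saving new best model at ..." log line; adjust if your run differs:

```python
import os

def checkpoint_exists(run_dir, name="best"):
    # run_dir is the output directory printed in the "Saving new best
    # model at ..." log line (e.g. out/wde/wasserstein_distance_...).
    # Returns True only if the checkpoint subdirectory exists and is
    # non-empty. Layout assumed from the log line; not guaranteed.
    ckpt_dir = os.path.join(run_dir, "checkpoints", name)
    return os.path.isdir(ckpt_dir) and len(os.listdir(ckpt_dir)) > 0
```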


naimesha commented Jul 2, 2020

This is what I got:

Epoch 0: 16it [00:00, 28.46it/s]
Training loss: -0.3053
Saving new best model at out/wde/wasserstein_distance_estimation_absolute_value_experiment_MultiSphericalShell_and_MultiSphericalShell_aggmo_0.01_dual_fc_linear_bjorck_act_maxmin_depth_3_width_128_grouping_2_2020_07_02_20_10_28_822006/checkpoints/best.
Averaged validation loss: -0.6530682630836964
Epoch 1: 16it [00:00, 26.09it/s]
Training loss: -0.8605
Saving new best model at out/wde/wasserstein_distance_estimation_absolute_value_experiment_MultiSphericalShell_and_MultiSphericalShell_aggmo_0.01_dual_fc_linear_bjorck_act_maxmin_depth_3_width_128_grouping_2_2020_07_02_20_10_28_822006/checkpoints/best.
Averaged validation loss: -0.9699119031429291
Epoch 2: 16it [00:00, 25.11it/s]
Training loss: -0.9813
Saving new best model at out/wde/wasserstein_distance_estimation_absolute_value_experiment_MultiSphericalShell_and_MultiSphericalShell_aggmo_0.01_dual_fc_linear_bjorck_act_maxmin_depth_3_width_128_grouping_2_2020_07_02_20_10_28_822006/checkpoints/best.
Averaged validation loss: -0.9791331179440022
Epoch 3: 16it [00:00, 25.34it/s]
Training loss: -0.9769
Averaged validation loss: -0.9707604050636292
Epoch 4: 16it [00:00, 24.95it/s]
Training loss: -0.9684
Averaged validation loss: -0.9677758105099201
Epoch 5: 16it [00:00, 26.24it/s]
Training loss: -0.9669
Averaged validation loss: -0.9697879105806351
Epoch 6: 16it [00:00, 26.73it/s]
Training loss: -0.9718
Averaged validation loss: -0.9707205519080162
Epoch 7: 16it [00:00, 25.32it/s]
Training loss: -0.9763
Averaged validation loss: -0.9752324745059013
Epoch 8: 16it [00:00, 26.35it/s]
Training loss: -0.9805
Averaged validation loss: -0.9864860586822033
Epoch 9: 16it [00:00, 24.99it/s]
Training loss: -0.9858
Saving new best model at out/wde/wasserstein_distance_estimation_absolute_value_experiment_MultiSphericalShell_and_MultiSphericalShell_aggmo_0.01_dual_fc_linear_bjorck_act_maxmin_depth_3_width_128_grouping_2_2020_07_02_20_10_28_822006/checkpoints/best.
Averaged validation loss: -0.9862877279520035
Epoch 10: 16it [00:00, 30.90it/s]
Training loss: -0.9890
Saving new best model at out/wde/wasserstein_distance_estimation_absolute_value_experiment_MultiSphericalShell_and_MultiSphericalShell_aggmo_0.01_dual_fc_linear_bjorck_act_maxmin_depth_3_width_128_grouping_2_2020_07_02_20_10_28_822006/checkpoints/best.
Averaged validation loss: -0.9877906292676926
Epoch 11: 16it [00:00, 26.30it/s]
Training loss: -0.9908
Saving new best model at out/wde/wasserstein_distance_estimation_absolute_value_experiment_MultiSphericalShell_and_MultiSphericalShell_aggmo_0.01_dual_fc_linear_bjorck_act_maxmin_depth_3_width_128_grouping_2_2020_07_02_20_10_28_822006/checkpoints/best.
Averaged validation loss: -0.9939416795969009
Epoch 12: 16it [00:00, 31.36it/s]
Training loss: -0.9935
Saving new best model at out/wde/wasserstein_distance_estimation_absolute_value_experiment_MultiSphericalShell_and_MultiSphericalShell_aggmo_0.01_dual_fc_linear_bjorck_act_maxmin_depth_3_width_128_grouping_2_2020_07_02_20_10_28_822006/checkpoints/best.
Averaged validation loss: -0.991676576435566
Epoch 13: 16it [00:00, 30.95it/s]
Training loss: -0.9943
Saving new best model at out/wde/wasserstein_distance_estimation_absolute_value_experiment_MultiSphericalShell_and_MultiSphericalShell_aggmo_0.01_dual_fc_linear_bjorck_act_maxmin_depth_3_width_128_grouping_2_2020_07_02_20_10_28_822006/checkpoints/best.
Averaged validation loss: -0.9936381727457047
Epoch 14: 16it [00:00, 26.20it/s]
Training loss: -0.9920
Averaged validation loss: -0.9930611923336983
Testing best model.
Averaged validation loss: -0.9936569929122925
Traceback (most recent call last):
  File "./lnets/tasks/dualnets/mains/train_dual.py", line 176, in <module>
    final_state = train_dualnet(dual_model, distrib_loaders, cfg)
  File "./lnets/tasks/dualnets/mains/train_dual.py", line 162, in train_dualnet
    after_training=False)
  File "/home/naimesha/lnets/utils/saving_and_loading.py", line 105, in save_1_or_2_dim_dualnet_visualizations
    save_1d_dualnet_visualizations(model, figures_dir, config, epoch, loss)
  File "/home/naimesha/lnets/tasks/dualnets/visualize/visualize_dualnet.py", line 134, in save_1d_dualnet_visualizations
    save_path = os.path.join(figures_dir, "epoch_{:04}visualize_1d".format(epoch))
TypeError: unsupported format string passed to NoneType.__format__


naimesha commented Jul 2, 2020

And also, how do I get the graphs?


cemanil commented Jul 2, 2020

Ah, the unsupported format string error can be resolved by changing {:04} to {}.

The graphs should be saved automatically, as long as the "visualize" flag is set to true (which is the default).
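A minimal reproduction of why `{:04}` fails when `epoch` is `None`, while the bare `{}` placeholder works:

```python
# "{:04}".format(None) forwards the ":04" spec to None.__format__,
# which rejects any non-empty format string -- hence the TypeError
# in the traceback above.
try:
    "epoch_{:04}".format(None)
    raised = False
except TypeError:
    raised = True

# The plain "{}" placeholder falls back to str(None), so it succeeds.
ok = "epoch_{}".format(None)
```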


cemanil commented Jul 3, 2020

You probably need to install the foolbox package.


naimesha commented Jul 3, 2020

And also, how do we get the test error value for the classification experiment?
I am able to train the model without any errors, but I couldn't get the test error value.
Sorry about the previous question - I thought everything was ready to go since I had installed via setup.py.


cemanil commented Jul 3, 2020

Hmm, I expected the training script to automatically run validation. What are the last few lines the training script prints out?


naimesha commented Jul 3, 2020

I got the validation acc and loss .log files.


naimesha commented Jul 3, 2020

And about foolbox: it is already installed, but it's showing there is no module named foolbox.adversial.


cemanil commented Jul 3, 2020

I see - I suspect this is because the foolbox package has changed since we released the code. Maybe you could try downgrading foolbox and see if that helps?


naimesha commented Jul 3, 2020

What about this error?
"""
from .distances import MSE
ImportError: cannot import name 'MSE'
"""


cemanil commented Jul 4, 2020

I'm guessing you encountered that when you tried to run "eval_adv_robustness.py" (lines 79-80)?

If that's the case, then I believe the problem might also be due to the foolbox version and downgrading might help.

@Pallapothu-Naimesha

Hey!
How do we change the depth for the high-dimensional experiment?


cemanil commented Jul 14, 2020

Try adding more hidden layer sizes to the "layers" field:
"""
"layers": [
    128,
    128,
    ...,
    1
],
"""
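For illustration, here is how such a "layers" list conventionally expands into per-layer (in, out) sizes once the input dimension is prepended - a sketch of the usual convention, not necessarily the repo's exact model builder:

```python
def layer_dims(input_dim, layers):
    # Expand a config-style "layers" list into one (in, out) pair per
    # linear layer: adding more hidden widths before the final 1 makes
    # the network deeper. Illustrative of the common convention; check
    # the repo's model builder for the exact semantics.
    sizes = [input_dim] + list(layers)
    return list(zip(sizes[:-1], sizes[1:]))
```

For example, `layer_dims(2, [128, 128, 128, 128, 1])` yields five linear layers, i.e. a deeper network than `[128, 128, 1]`.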


Pallapothu-Naimesha commented Jul 19, 2020

Hey,

Can you briefly describe what types of learning were used for the different experiments?
