About train #31

Open
Zhangpei226 opened this issue Jun 29, 2024 · 1 comment

Comments

@Zhangpei226

(sifu3) pp@ys:~/SIFU$ python -m apps.train -cfg ./configs/train/sifu.yaml
load from ./data/cape/train.txt
total: 152
load from ./data/cape/val.txt
total: 36
ICON:
w/ Global Image Encoder: True
Image Features used by MLP: ['normal_F', 'normal_B']
Geometry Features used by MLP: ['sdf', 'cmap', 'norm', 'vis', 'sample_id']
Dim of Image Features (local): 6
Dim of Geometry Features (ICON): 7
Dim of MLP's first layer: 78

GPU available: True, used: True
TPU available: None, using: 0 TPU cores
Resume MLP weights from ./data/ckpt/sifu.ckpt
Resume normal model from ./data/ckpt/normal.ckpt
LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0,1]

  | Name        | Type          | Params
----------------------------------------------
0 | netG        | HGPIFuNet     | 413 M
1 | reconEngine | Seg3dLossless | 0
----------------------------------------------
411 M Trainable params
1.3 M Non-trainable params
413 M Total params
1,652.498 Total estimated model params size (MB)
Validation sanity check: 0%| | 0/1 [00:00<?, ?it/s]
Traceback (most recent call last):
File "/home/pp/miniconda3/envs/sifu3/lib/python3.8/runpy.py", line 194, in _run_module_as_main
return _run_code(code, main_globals, None,
File "/home/pp/miniconda3/envs/sifu3/lib/python3.8/runpy.py", line 87, in _run_code
exec(code, run_globals)
File "/home/pp/SIFU/apps/train.py", line 154, in <module>
trainer.fit(model=model, datamodule=datamodule)
File "/home/pp/miniconda3/envs/sifu3/lib/python3.8/site-packages/pytorch_lightning/trainer/trainer.py", line 499, in fit
self.dispatch()
File "/home/pp/miniconda3/envs/sifu3/lib/python3.8/site-packages/pytorch_lightning/trainer/trainer.py", line 546, in dispatch
self.accelerator.start_training(self)
File "/home/pp/miniconda3/envs/sifu3/lib/python3.8/site-packages/pytorch_lightning/accelerators/accelerator.py", line 73, in start_training
self.training_type_plugin.start_training(trainer)
File "/home/pp/miniconda3/envs/sifu3/lib/python3.8/site-packages/pytorch_lightning/plugins/training_type/training_type_plugin.py", line 114, in start_training
self._results = trainer.run_train()
File "/home/pp/miniconda3/envs/sifu3/lib/python3.8/site-packages/pytorch_lightning/trainer/trainer.py", line 607, in run_train
self.run_sanity_check(self.lightning_module)
File "/home/pp/miniconda3/envs/sifu3/lib/python3.8/site-packages/pytorch_lightning/trainer/trainer.py", line 856, in run_sanity_check
_, eval_results = self.run_evaluation(max_batches=self.num_sanity_val_batches)
File "/home/pp/miniconda3/envs/sifu3/lib/python3.8/site-packages/pytorch_lightning/trainer/trainer.py", line 712, in run_evaluation
for batch_idx, batch in enumerate(dataloader):
File "/home/pp/miniconda3/envs/sifu3/lib/python3.8/site-packages/torch/utils/data/dataloader.py", line 628, in __next__
data = self._next_data()
File "/home/pp/miniconda3/envs/sifu3/lib/python3.8/site-packages/torch/utils/data/dataloader.py", line 1333, in _next_data
return self._process_data(data)
File "/home/pp/miniconda3/envs/sifu3/lib/python3.8/site-packages/torch/utils/data/dataloader.py", line 1359, in _process_data
data.reraise()
File "/home/pp/miniconda3/envs/sifu3/lib/python3.8/site-packages/torch/_utils.py", line 543, in reraise
raise exception
IndexError: Caught IndexError in DataLoader worker process 0.
Original Traceback (most recent call last):
File "/home/pp/miniconda3/envs/sifu3/lib/python3.8/site-packages/torch/utils/data/_utils/worker.py", line 302, in _worker_loop
data = fetcher.fetch(index)
File "/home/pp/miniconda3/envs/sifu3/lib/python3.8/site-packages/torch/utils/data/_utils/fetch.py", line 58, in fetch
data = [self.dataset[idx] for idx in possibly_batched_index]
File "/home/pp/miniconda3/envs/sifu3/lib/python3.8/site-packages/torch/utils/data/_utils/fetch.py", line 58, in <listcomp>
data = [self.dataset[idx] for idx in possibly_batched_index]
File "/home/pp/SIFU/lib/dataset/PIFuDataset.py", line 217, in __getitem__
subject = self.subject_list[mid].split("/")[1]
IndexError: list index out of range
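
The expression on the last frame can raise this IndexError in two different ways: `mid` may be past the end of `subject_list`, or the selected entry may contain no "/", so that `split("/")` returns a one-element list and `[1]` fails. A minimal sketch of both cases, assuming a hypothetical `subject_list` (the entry names below are made up and are not taken from the SIFU data):

```python
# Hypothetical entries; the real list is built from the split files under ./data/cape/.
subject_list = ["cape/00032-shortlong-hips", "00096-shortshort-tilt", ""]


def subject_name(entries, mid):
    """Mirror the failing expression: take the part after the first '/'."""
    return entries[mid].split("/")[1]


def why_it_fails(entries, mid):
    """Report which of the two indexing steps would raise IndexError."""
    if mid >= len(entries):
        return "mid out of range"
    if "/" not in entries[mid]:
        return "entry has no '/'"
    return "ok"


for mid in range(4):
    print(mid, why_it_fails(subject_list, mid))
```

Printing `mid` and `self.subject_list[mid]` just before line 217 of `PIFuDataset.py` would show which of the two cases applies here.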

@Zhangpei226
Author

I found that my train.txt has only 150 entries and my val.txt has only one entry, but the log reports totals of 152 and 36. Could this mismatch be the cause of the error?
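
One way to check whether the split files are involved is to count their entries and flag any line that lacks the "/" the failing line expects. The following is only a sketch, assuming one entry per line; `check_split_file` is a hypothetical helper, and the log does not show how SIFU computes the printed `total`, so this validates the file contents and nothing more.

```python
from pathlib import Path


def check_split_file(path):
    """Return (count of non-empty entries, [(line_no, text)] for entries without '/')."""
    entries, malformed = 0, []
    lines = Path(path).read_text().splitlines()
    for line_no, raw in enumerate(lines, start=1):
        text = raw.strip()
        if not text:
            continue  # ignore blank lines
        entries += 1
        if len(text.split("/")) < 2:
            # split("/")[1] would raise IndexError on this entry
            malformed.append((line_no, text))
    return entries, malformed


if __name__ == "__main__":
    # Paths taken from the log above.
    for name in ("./data/cape/train.txt", "./data/cape/val.txt"):
        if Path(name).exists():
            count, bad = check_split_file(name)
            print(f"{name}: {count} entries, {len(bad)} malformed {bad[:5]}")
        else:
            print(f"{name}: not found")
```

If the counts printed here differ from the `total` values in the log, the list the dataset indexes is probably not a one-to-one copy of the file, which would be worth confirming in `PIFuDataset.py`.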
