Distributed training [SemSeg] #530

Open
wants to merge 55 commits into
base: dev
55 commits
2f0e72b
add dataparallel
sanskar107 Aug 12, 2021
306de5b
add dataparallel class
sanskar107 Aug 12, 2021
b953254
fix bugs
sanskar107 Aug 13, 2021
4dccfc0
objdet multigpu
sanskar107 Aug 24, 2021
2779339
rename scatter
sanskar107 Aug 27, 2021
48a101f
fix objdet
sanskar107 Sep 24, 2021
53479c2
remove comments
sanskar107 Sep 24, 2021
388f948
Merge branch 'dev' into sanskar/multi_gpu
sanskar107 Jan 10, 2022
6139e66
fix cam_img matrix
sanskar107 Jan 10, 2022
d9e1564
add distributed training
sanskar107 Jan 17, 2022
ef0e440
parllel validation
sanskar107 Jan 18, 2022
29b9729
update config
sanskar107 Feb 7, 2022
672248d
gather in run_valid
sanskar107 Feb 8, 2022
620b35c
fix preprocessing
sanskar107 Feb 9, 2022
1c64d65
add shuffle
sanskar107 Feb 18, 2022
a22a8dc
fix rng
sanskar107 Feb 18, 2022
685dd3a
remove customparallel
sanskar107 Feb 18, 2022
bfaa4a2
reset semseg distributed training
sanskar107 Feb 18, 2022
10164f9
Merge branch 'dev' into sanskar/multi_gpu
sanskar107 Feb 18, 2022
6d58cd1
change config
sanskar107 Feb 22, 2022
51a16c3
fix lgtm
sanskar107 Feb 22, 2022
50133bb
address reviews (1)
sanskar107 Mar 25, 2022
1ed11c8
fix model.module....
sanskar107 Mar 25, 2022
39b3117
add semseg labels
sanskar107 Mar 28, 2022
142df86
add waymo semseg preprocessing
sanskar107 Mar 28, 2022
5391b1c
add waymo semseg dataset
sanskar107 Mar 31, 2022
eb3b551
remove dataparallel
sanskar107 Apr 5, 2022
bb7a715
change sampler
sanskar107 Apr 12, 2022
6f6913e
Merge branch 'sanskar/waymo_semseg' into sanskar/dist_semseg
sanskar107 Apr 12, 2022
d35d2f1
add val sampler
sanskar107 Apr 12, 2022
db1128c
add config files
sanskar107 Apr 12, 2022
d718080
fix waymo semseg
sanskar107 Apr 18, 2022
8e8d4f8
fix optim
sanskar107 Apr 18, 2022
3997833
improve tqdm
sanskar107 Apr 18, 2022
5a4d385
gather metric among process
sanskar107 Apr 18, 2022
df76487
enable multi node training
sanskar107 Apr 27, 2022
6fc34b1
add nuscenes semantic
sanskar107 May 4, 2022
80d0040
fix lidarseg
sanskar107 May 4, 2022
670ef48
add megaloader
sanskar107 May 10, 2022
e211235
Merge branch 'dev' into sanskar/dist_semseg
sanskar107 May 11, 2022
1fdf0d1
add megamodel
sanskar107 May 11, 2022
b5c5296
update linear layers
sanskar107 May 13, 2022
e4c4441
add new pipeline
sanskar107 May 25, 2022
ea5f303
add nuscenes semseg
sanskar107 May 25, 2022
4ff5fbe
add kitti360
sanskar107 Jun 3, 2022
43c39c7
fixes
sanskar107 Jun 7, 2022
1803d77
apply style
sanskar107 Jun 7, 2022
c8f4589
fix logging
sanskar107 Jun 9, 2022
5a4371a
add distributed semseg
sanskar107 Sep 20, 2022
8c60c40
Merge branch 'dev' into sanskar/dist_semseg
sanskar107 Sep 20, 2022
40f888f
apply-style
sanskar107 Sep 20, 2022
375bb8f
remove megamodel
sanskar107 Sep 20, 2022
50c8c20
fix docstring
sanskar107 Sep 20, 2022
5c2ba32
address lgtm
sanskar107 Sep 22, 2022
e971747
update preprocess
sanskar107 Sep 22, 2022
7 changes: 7 additions & 0 deletions ml3d/configs/default_cfgs/kitti360.yml
@@ -0,0 +1,7 @@
name: KITTI360
dataset_path: # path/to/your/dataset
cache_dir: ./logs/cache
class_weights: []
ignored_label_inds: []
test_result_folder: ./test
use_cache: False
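
Note on dataset_path: going by the path checks in ml3d/datasets/kitti360.py in this PR, dataset_path should be the parent directory of data_3d_semantics, and the train/val list files hold paths relative to it. A minimal sketch of that layout check follows. The helper names check_kitti360_root and read_split_list are hypothetical and not part of the PR; only the relative paths are taken from the diff.

```python
import tempfile
from pathlib import Path

# Relative paths taken from ml3d/datasets/kitti360.py in this PR.
TRAIN_LIST = "data_3d_semantics/train/2013_05_28_drive_train.txt"
VAL_LIST = "data_3d_semantics/train/2013_05_28_drive_val.txt"


def check_kitti360_root(dataset_path):
    """Hypothetical helper: return the split list files missing under dataset_path."""
    root = Path(dataset_path)
    return [rel for rel in (TRAIN_LIST, VAL_LIST) if not (root / rel).is_file()]


def read_split_list(dataset_path, rel_list):
    """Read one split list and resolve each entry against dataset_path."""
    root = Path(dataset_path)
    lines = (root / rel_list).read_text().splitlines()
    # Skip blank lines so a missing or extra trailing newline does not matter.
    return [str(root / line) for line in lines if line.strip()]


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        # An empty directory is reported as missing both list files.
        print(check_kitti360_root(tmp))
```

Unlike the split('\n')[:-1] idiom in the diff, this sketch does not drop the last entry when the list file has no trailing newline.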
6 changes: 6 additions & 0 deletions ml3d/configs/default_cfgs/nuscenes_semseg.yml
@@ -0,0 +1,6 @@
name: NuScenesSemSeg
dataset_path: # path/to/your/dataset
cache_dir: ./logs/cache
class_weights: [282265, 7676, 120, 3754, 31974, 1321, 346, 1898, 624, 4537, 13282, 260911, 6588, 57567, 56670, 146511, 100633]
ignored_label_inds: [0]
use_cache: False
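
Note on class_weights: the entries above appear to be per-class point counts rather than ready-to-use loss weights. A minimal sketch of one common way to turn such counts into cross-entropy weights (inverse frequency with a small smoothing term) is shown below. The function name and the 0.02 smoothing value are assumptions for illustration; the loss code that consumes these values is not part of this diff.

```python
def counts_to_loss_weights(counts, smoothing=0.02):
    """Map per-class point counts to inverse-frequency loss weights.

    `smoothing` is an assumed constant that caps the weight of very rare classes.
    """
    total = float(sum(counts))
    if total <= 0:
        raise ValueError("counts must contain at least one positive entry")
    return [1.0 / (c / total + smoothing) for c in counts]


if __name__ == "__main__":
    # class_weights from nuscenes_semseg.yml in this PR.
    counts = [282265, 7676, 120, 3754, 31974, 1321, 346, 1898, 624, 4537,
              13282, 260911, 6588, 57567, 56670, 146511, 100633]
    weights = counts_to_loss_weights(counts)
    # Rare classes (e.g. index 2) end up with larger weights than frequent ones.
    print(len(weights), weights[2] > weights[0])
```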
11 changes: 2 additions & 9 deletions ml3d/configs/default_cfgs/semantickitti.yml
@@ -1,13 +1,6 @@
name: SemanticKITTI
dataset_path: # path/to/your/dataset
cache_dir: ./logs/cache
ignored_label_inds: [0]
use_cache: false
-class_weights: [55437630, 320797, 541736, 2578735, 3274484, 552662,
-  184064, 78858, 240942562, 17294618, 170599734, 6369672, 230413074, 101130274,
-  476491114, 9833174, 129609852, 4506626, 1168181]
-test_result_folder: ./test
-test_split: ['11', '12', '13', '14', '15', '16', '17', '18', '19', '20', '21']
-training_split: ['00', '01', '02', '03', '04', '05', '06', '07', '09', '10']
-validation_split: ['08']
-all_split: ['00', '01', '02', '03', '04', '05', '06', '07', '09',
-  '08', '10', '11', '12', '13', '14', '15', '16', '17', '18', '19', '20', '21']
+class_weights: [101665, 157022, 631, 1516, 5012, 7085, 1043, 457, 176, 693044, 53132, 494988, 12829, 459669, 236069, 924425, 22780, 255213, 9664, 2024]
6 changes: 6 additions & 0 deletions ml3d/configs/default_cfgs/waymo_semseg.yml
@@ -0,0 +1,6 @@
name: WaymoSemSeg
dataset_path: # path/to/your/dataset
cache_dir: ./logs/cache
class_weights: [513255, 341079, 39946, 28066, 17254, 104, 1169, 31335, 23359, 2924, 43680, 2149, 483, 639, 1394353, 858409, 90903, 52484, 884591, 24487, 21477, 322212, 229034]
ignored_label_inds: [0]
use_cache: False
39 changes: 39 additions & 0 deletions ml3d/configs/pointtransformer_waymo.yml
@@ -0,0 +1,39 @@
dataset:
  name: WaymoSemSeg
  dataset_path: # path/to/your/dataset
  cache_dir: ./logs/cache
  class_weights: []
  ignored_label_inds: [0]
  test_result_folder: ./test
  use_cache: false
model:
  name: PointTransformer
  batcher: ConcatBatcher
  ckpt_path: # path/to/your/checkpoint
  in_channels: 3
  blocks: [2, 3, 4, 6, 3]
  num_classes: 23
  voxel_size: 0.04
  max_voxels: 50000
  ignored_label_inds: [-1]
  augment: {}
pipeline:
  name: SemanticSegmentation
  optimizer:
    lr: 0.02
    momentum: 0.9
    weight_decay: 0.0001
  batch_size: 2
  learning_rate: 0.01
  main_log_dir: ./logs
  max_epoch: 512
  save_ckpt_freq: 3
  scheduler_gamma: 0.99
  test_batch_size: 1
  train_sum_dir: train_log
  val_batch_size: 2
  summary:
    record_for: []
    max_pts:
    use_reference: false
    max_outputs: 1
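
Note on the optimizer settings: a small sketch of how lr and scheduler_gamma interact if the pipeline applies a per-epoch exponential decay, i.e. lr_e = lr * gamma^e. That schedule is an assumption, since the pipeline code is not part of this excerpt, and decayed_lr is a hypothetical helper.

```python
def decayed_lr(base_lr, gamma, epoch):
    """Learning rate after `epoch` epochs of per-epoch exponential decay."""
    if epoch < 0:
        raise ValueError("epoch must be non-negative")
    return base_lr * gamma**epoch


if __name__ == "__main__":
    # Values from pointtransformer_waymo.yml: optimizer lr and scheduler_gamma.
    base_lr, gamma, max_epoch = 0.02, 0.99, 512
    for epoch in (0, 100, max_epoch):
        print(epoch, decayed_lr(base_lr, gamma, epoch))
```

The config also sets learning_rate: 0.01 next to the optimizer's lr: 0.02; which of the two takes effect depends on pipeline code not shown here.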
2 changes: 1 addition & 1 deletion ml3d/configs/sparseconvunet_scannet.yml
@@ -6,7 +6,7 @@ dataset:
   test_result_folder: ./test
   use_cache: False
   sampler:
-    name: 'SemSegRandomSampler'
+    name: None
 model:
   name: SparseConvUnet
   batcher: ConcatBatcher
39 changes: 39 additions & 0 deletions ml3d/configs/sparseconvunet_waymo.yml
@@ -0,0 +1,39 @@
dataset:
  name: WaymoSemSeg
  dataset_path: # path/to/your/dataset
  cache_dir: ./logs/cache
  class_weights: []
  ignored_label_inds: [0]
  test_result_folder: ./test
  use_cache: false
model:
  name: SparseConvUnet
  batcher: ConcatBatcher
  ckpt_path: # path/to/your/checkpoint
  multiplier: 32
  voxel_size: 0.02
  residual_blocks: True
  conv_block_reps: 1
  in_channels: 3
  num_classes: 23
  grid_size: 4096
  ignored_label_inds: [0]
  augment: {}
pipeline:
  name: SemanticSegmentation
  optimizer:
    lr: 0.001
    betas: [0.9, 0.999]
  batch_size: 2
  main_log_dir: ./logs
  max_epoch: 256
  save_ckpt_freq: 3
  scheduler_gamma: 0.99
  test_batch_size: 1
  train_sum_dir: train_log
  val_batch_size: 2
  summary:
    record_for: []
    max_pts:
    use_reference: false
    max_outputs: 1
5 changes: 4 additions & 1 deletion ml3d/datasets/__init__.py
@@ -14,18 +14,21 @@

 from .kitti import KITTI
 from .nuscenes import NuScenes
+from .nuscenes_semseg import NuScenesSemSeg
 from .waymo import Waymo
+from .waymo_semseg import WaymoSemSeg
 from .lyft import Lyft
 from .shapenet import ShapeNet
 from .argoverse import Argoverse
 from .scannet import Scannet
 from .sunrgbd import SunRGBD
 from .matterport_objects import MatterportObjects
+from .kitti360 import KITTI360

 __all__ = [
     'SemanticKITTI', 'S3DIS', 'Toronto3D', 'ParisLille3D', 'Semantic3D',
     'Custom3D', 'utils', 'augment', 'samplers', 'KITTI', 'Waymo', 'NuScenes',
     'Lyft', 'ShapeNet', 'SemSegRandomSampler', 'InferenceDummySplit',
     'SemSegSpatiallyRegularSampler', 'Argoverse', 'Scannet', 'SunRGBD',
-    'MatterportObjects'
+    'MatterportObjects', 'WaymoSemSeg', 'KITTI360', 'NuScenesSemSeg'
 ]
10 changes: 7 additions & 3 deletions ml3d/datasets/base_dataset.py
@@ -127,10 +127,14 @@ def __init__(self, dataset, split='training'):
         if split in ['test']:
             sampler_cls = get_module('sampler', 'SemSegSpatiallyRegularSampler')
         else:
-            sampler_cfg = self.cfg.get('sampler',
-                                       {'name': 'SemSegRandomSampler'})
+            sampler_cfg = self.cfg.get('sampler', {'name': None})
+            if sampler_cfg['name'] == "None":
+                sampler_cfg['name'] = None
             sampler_cls = get_module('sampler', sampler_cfg['name'])
-        self.sampler = sampler_cls(self)
+        if sampler_cls is None:
+            self.sampler = None
+        else:
+            self.sampler = sampler_cls(self)

     @abstractmethod
     def __len__(self):
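
Note on the base_dataset.py change: the hunk makes the sampler optional for non-test splits. A standalone sketch of that resolution logic is below, assuming a plain dict registry in place of get_module. SAMPLERS, resolve_sampler and the stub class are hypothetical stand-ins, not Open3D-ML APIs. The string comparison matters because YAML reads a bare None as the string 'None' (YAML's null is written null or ~). The sketch returns early instead of passing None to the lookup, which differs slightly from the diff.

```python
class SemSegRandomSampler:
    """Stub standing in for a real sampler class."""

    def __init__(self, split):
        self.split = split


# Hypothetical registry; the PR resolves classes through get_module instead.
SAMPLERS = {"SemSegRandomSampler": SemSegRandomSampler}


def resolve_sampler(cfg, split):
    """Return a sampler instance, or None when no sampler is configured."""
    name = cfg.get("sampler", {"name": None}).get("name")
    # `name: None` in YAML arrives as the string "None", so normalize it.
    if name in (None, "None"):
        return None
    return SAMPLERS[name](split)


if __name__ == "__main__":
    print(resolve_sampler({"sampler": {"name": "None"}}, split="training"))
    sampler = resolve_sampler({"sampler": {"name": "SemSegRandomSampler"}},
                              split="training")
    print(type(sampler).__name__)
```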
230 changes: 230 additions & 0 deletions ml3d/datasets/kitti360.py
@@ -0,0 +1,230 @@
import numpy as np
import os
import logging
import open3d as o3d

from pathlib import Path
from os.path import join, exists
from glob import glob

from .base_dataset import BaseDataset, BaseDatasetSplit
from ..utils import make_dir, DATASET

log = logging.getLogger(__name__)


class KITTI360(BaseDataset):
    """This class is used to create a dataset based on the KITTI 360
    dataset, and used in visualizer, training, or testing.
    """

    def __init__(self,
                 dataset_path,
                 name='KITTI360',
                 cache_dir='./logs/cache',
                 use_cache=False,
                 class_weights=[
                     3370714, 2856755, 4919229, 318158, 375640, 478001, 974733,
                     650464, 791496, 88727, 1284130, 229758, 2272837
                 ],
                 num_points=40960,
                 ignored_label_inds=[],
                 test_result_folder='./test',
                 **kwargs):
        """Initialize the dataset with its path and other settings.

        Args:
            dataset_path: The path to the dataset to use (parent directory of data_3d_semantics).
            name: The name of the dataset (KITTI360 in this case).
            cache_dir: The directory where the cache is stored.
            use_cache: Indicates if the dataset should be cached.
            class_weights: The class weights to use in the dataset.
            num_points: The maximum number of points to use when splitting the dataset.
            ignored_label_inds: A list of labels that should be ignored in the dataset.
            test_result_folder: The folder where the test results should be stored.
        """
        super().__init__(dataset_path=dataset_path,
                         name=name,
                         cache_dir=cache_dir,
                         use_cache=use_cache,
                         class_weights=class_weights,
                         test_result_folder=test_result_folder,
                         num_points=num_points,
                         ignored_label_inds=ignored_label_inds,
                         **kwargs)

        self.label_to_names = self.get_label_to_names()
        self.num_classes = len(self.label_to_names)
        self.label_values = np.sort([k for k, v in self.label_to_names.items()])
        self.label_to_idx = {l: i for i, l in enumerate(self.label_values)}
        self.ignored_labels = np.array([])

        if not os.path.exists(
                os.path.join(
                    dataset_path,
                    'data_3d_semantics/train/2013_05_28_drive_train.txt')):
            raise ValueError(
                "Invalid Path, make sure dataset_path is the parent directory of data_3d_semantics."
            )

        with open(
                os.path.join(
                    dataset_path,
                    'data_3d_semantics/train/2013_05_28_drive_train.txt'),
                'r') as f:
            train_paths = f.read().split('\n')[:-1]
            train_paths = [os.path.join(dataset_path, p) for p in train_paths]

        with open(
                os.path.join(
                    dataset_path,
                    'data_3d_semantics/train/2013_05_28_drive_val.txt'),
                'r') as f:
            val_paths = f.read().split('\n')[:-1]
            val_paths = [os.path.join(dataset_path, p) for p in val_paths]

        self.train_files = train_paths
        self.val_files = val_paths
        self.test_files = sorted(
            glob(
                os.path.join(dataset_path,
                             'data_3d_semantics/test/*/static/*.ply')))

    @staticmethod
    def get_label_to_names():
        """Returns a label to names dictionary object.

        Returns:
            A dict where keys are label numbers and
            values are the corresponding names.
        """
        label_to_names = {
            0: 'ceiling',
            1: 'floor',
            2: 'wall',
            3: 'beam',
            4: 'column',
            5: 'window',
            6: 'door',
            7: 'table',
            8: 'chair',
            9: 'sofa',
            10: 'bookcase',
            11: 'board',
            12: 'clutter'
        }
        return label_to_names

    def get_split(self, split):
        """Returns a dataset split.

        Args:
            split: A string identifying the dataset split that is usually one of
                'training', 'test', 'validation', or 'all'.

        Returns:
            A dataset split object providing the requested subset of the data.
        """
        return KITTI360Split(self, split=split)

    def get_split_list(self, split):
        """Returns the list of point cloud file paths for the given split."""
        if split in ['train', 'training']:
            return self.train_files
        elif split in ['val', 'validation']:
            return self.val_files
        elif split in ['test', 'testing']:
            return self.test_files
        elif split == 'all':
            return self.train_files + self.val_files + self.test_files
        else:
            raise ValueError("Invalid split {}".format(split))

    def is_tested(self, attr):
        """Checks whether a datum has been tested.

        Args:
            attr: The attributes associated with the datum.

        Returns:
            True if the test result has already been stored, False otherwise.
        """
        cfg = self.cfg
        # Strip the extension the same way save_test_result does, so both
        # methods agree on the stored file name.
        name = attr['name'].split('.')[0]
        path = cfg.test_result_folder
        store_path = join(path, self.name, name + '.npy')
        if exists(store_path):
            print("{} already exists.".format(store_path))
            return True
        else:
            return False

    def save_test_result(self, results, attr):
        """Saves the output of a model.

        Args:
            results: The output of a model for the datum associated with the attribute passed.
            attr: The attributes that correspond to the outputs passed in results.
        """
        cfg = self.cfg
        name = attr['name'].split('.')[0]
        path = cfg.test_result_folder
        make_dir(path)

        pred = results['predict_labels']
        pred = np.array(pred)

        for ign in cfg.ignored_label_inds:
            pred[pred >= ign] += 1

        store_path = join(path, self.name, name + '.npy')
        make_dir(Path(store_path).parent)
        np.save(store_path, pred)
        log.info("Saved {} in {}.".format(name, store_path))


class KITTI360Split(BaseDatasetSplit):
    """This class is used to create a split for KITTI360 dataset.

    Args:
        dataset: The dataset to split.
        split: A string identifying the dataset split that is usually one of
            'training', 'test', 'validation', or 'all'.
    """

    def __init__(self, dataset, split='training'):
        super().__init__(dataset, split=split)
        log.info("Found {} pointclouds for {}".format(len(self.path_list),
                                                      split))

    def __len__(self):
        return len(self.path_list)

    def get_data(self, idx):
        pc_path = self.path_list[idx]

        pc = o3d.t.io.read_point_cloud(pc_path)

        points = pc.point['positions'].numpy().astype(np.float32)
        feat = pc.point['colors'].numpy().astype(np.float32)
        labels = pc.point['semantic'].numpy().astype(np.int32).reshape((-1,))

        data = {
            'point': points,
            'feat': feat,
            'label': labels,
        }

        return data

    def get_attr(self, idx):
        pc_path = Path(self.path_list[idx])
        # The point clouds are .ply files, so strip that extension.
        name = pc_path.name.replace('.ply', '')

        pc_path = str(pc_path)
        split = self.split
        attr = {'idx': idx, 'name': name, 'path': pc_path, 'split': split}
        return attr


DATASET._register_module(KITTI360)
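
Note on save_test_result: the loop `pred[pred >= ign] += 1` shifts predictions back into the original label space after ignored labels were dropped for training. Below is a pure-Python sketch of the same remapping (no NumPy, for brevity). restore_label_ids is a hypothetical name, and sorting the ignored indices first is an addition in this sketch, since the shift only gives the right result when ignored labels are applied in ascending order.

```python
def restore_label_ids(pred, ignored_label_inds):
    """Shift compact predicted ids back to the original label ids.

    Mirrors `pred[pred >= ign] += 1` from the diff, one ignored index at a time.
    """
    restored = list(pred)
    # Ascending order matters: each shift assumes smaller ignored ids
    # have already been restored.
    for ign in sorted(ignored_label_inds):
        restored = [p + 1 if p >= ign else p for p in restored]
    return restored


if __name__ == "__main__":
    # Original labels 0..4 with 1 and 3 ignored: training ids 0, 1, 2 map to 0, 2, 4.
    print(restore_label_ids([0, 1, 2], [3, 1]))  # prints [0, 2, 4]
```

With the KITTI360 default of ignored_label_inds: [] the loop is a no-op, so predictions are stored unchanged.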