
This repository contains my experimental thesis work on recognizing museum works with object detection techniques.


giacomolat/Object-Detection-Sperimental-Thesis-for-Degree


Object Detection System of Museum Works

Implementation of a system for the detection and classification of museum works.

Dataset creation using PyTorch

The starting dataset consists of 22 images of Etruscan vases, divided into 5 classes according to the type of vase. Using the torchvision package of the PyTorch library, four different sets of transformations were applied to each image, resulting in 88 images of size 463x463x3:

  • Mode 1: Resize, CenterCrop, ToTensor
  • Mode 2: RandomRotation, RandomResizedCrop, RandomHorizontalFlip, ToTensor
  • Mode 3: Resize, RandomCrop, RandomVerticalFlip, GaussianBlur, ToTensor
  • Mode 4: Resize, CenterCrop, ColorJitter, ToTensor

image

For details of my work, see: Data Augmentation and thesis

Creating labels and obtaining the final dataset

The dataset was structured by assigning labels to regions of each image, called bounding boxes, using the Yolo_mark tool.

Each image has a label file, which may be empty or may contain one or more annotations. Each annotation is a line of the form 'ID X Y WIDTH HEIGHT', where:

  • ID: the identifier of the class the object belongs to. In our case, five classes were defined, with IDs from 0 to 4;
  • X: the X coordinate of the centre of the object;
  • Y: the Y coordinate of the centre of the object;
  • WIDTH: the width of the object;
  • HEIGHT: the height of the object.
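
As an illustration of this format, the sketch below parses the contents of a label file and converts one box to pixel corners. It assumes that X, Y, WIDTH and HEIGHT are stored as fractions of the image size, which is the usual Yolo_mark convention but is not stated above. The names `YoloBox`, `parse_yolo_labels` and `to_pixel_corners` are hypothetical.

```python
from typing import List, NamedTuple, Tuple


class YoloBox(NamedTuple):
    class_id: int
    x_center: float
    y_center: float
    width: float
    height: float


def parse_yolo_labels(text: str) -> List[YoloBox]:
    """Parse label-file contents; an empty file yields an empty list."""
    boxes = []
    for line in text.splitlines():
        parts = line.split()
        if not parts:
            continue  # skip blank lines
        if len(parts) != 5:
            raise ValueError(f"expected 5 fields, got {len(parts)}: {line!r}")
        boxes.append(YoloBox(int(parts[0]), *map(float, parts[1:])))
    return boxes


def to_pixel_corners(box: YoloBox, img_w: int, img_h: int) -> Tuple[float, ...]:
    """Convert a normalized centre-based box to (x_min, y_min, x_max, y_max)."""
    cx, cy = box.x_center * img_w, box.y_center * img_h
    half_w, half_h = box.width * img_w / 2, box.height * img_h / 2
    return (cx - half_w, cy - half_h, cx + half_w, cy + half_h)


# Made-up annotation: class 2, centred, 40% of the width and 60% of the height.
boxes = parse_yolo_labels("2 0.5 0.5 0.4 0.6\n")
corners = to_pixel_corners(boxes[0], 463, 463)
```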

image

For details of my work, see: Dataset YOLO format and thesis

Training and Testing

K-fold Cross validation with Scikit-learn and PyTorch

Once the dataset was obtained, k-fold cross-validation was applied to estimate the accuracy achievable on it. The algorithm was implemented with two libraries:

  • Scikit-learn, used to set the number of folds into which the dataset is split. We chose k=5, so that in each split 80% of the data forms the training set (163 images) and the remaining 20% forms the testing set (41 images);
  • PyTorch, used to train the model on the training set for 500 epochs. At the end of the 500 epochs, the accuracy was evaluated on the testing set, which contains data not seen during training.

Training and evaluation are repeated once per fold, so that each time a different 20% of the data serves as the testing set and the remaining 80% as the training set. The accuracy obtained with this procedure is 99%.
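
The fold sizes can be checked with a small dependency-free sketch. It is not the thesis code: it reproduces by hand the way scikit-learn's `KFold` sizes unshuffled folds, and it assumes a total of 204 images, inferred from 163 + 41. The function name `kfold_indices` is hypothetical.

```python
def kfold_indices(n_samples, n_splits):
    """Yield (train_indices, test_indices) for contiguous, unshuffled folds.

    The first n_samples % n_splits folds receive one extra sample.
    """
    base, extra = divmod(n_samples, n_splits)
    start = 0
    for fold in range(n_splits):
        size = base + 1 if fold < extra else base
        stop = start + size
        test = list(range(start, stop))
        train = list(range(0, start)) + list(range(stop, n_samples))
        yield train, test
        start = stop


# 204 images is an assumption derived from the 163/41 split in the text.
sizes = [(len(train), len(test)) for train, test in kfold_indices(204, 5)]
print(sizes)
```

Since 204 is not divisible by 5, four folds hold out 41 images and the last one holds out 40.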

For details of my work, see: K-fold train Validation and thesis

Converting and splitting the dataset with Roboflow

Detectron2 has built-in support for datasets in COCO format, so the labels were converted from YOLO TXT format to COCO JSON format with Roboflow. The dataset in COCO format was then divided as follows for training with Detectron2:

Set              Number of Images
Training Set     145
Validation Set   34
Testing Set      25
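
The core of the YOLO to COCO conversion is a change of box representation, which can be sketched as below. This is not Roboflow's implementation. It assumes normalized YOLO coordinates and shows only the fields of a COCO annotation that follow directly from the box; the function names and sample values are hypothetical.

```python
def yolo_to_coco_bbox(x_center, y_center, width, height, img_w, img_h):
    """Normalized centre-based box -> COCO [x_min, y_min, width, height] in pixels."""
    w = width * img_w
    h = height * img_h
    x_min = x_center * img_w - w / 2
    y_min = y_center * img_h - h / 2
    return [x_min, y_min, w, h]


def make_annotation(ann_id, image_id, class_id, bbox):
    """Build a minimal COCO-style annotation entry for one bounding box."""
    return {
        "id": ann_id,
        "image_id": image_id,
        "category_id": class_id,
        "bbox": bbox,
        "area": bbox[2] * bbox[3],
        "iscrowd": 0,
    }


# Made-up YOLO annotation on a 463x463 image.
bbox = yolo_to_coco_bbox(0.5, 0.5, 0.4, 0.6, img_w=463, img_h=463)
annotation = make_annotation(ann_id=1, image_id=1, class_id=2, bbox=bbox)
```

A converter such as Roboflow may number categories differently from the original YOLO class IDs, so the `category_id` mapping here is an assumption.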

For details of my work, see: Dataset COCO format and thesis

Detectron2: Choice of model and backbone

The Faster R-CNN R50-FPN 3x model, pre-trained on the COCO dataset, was chosen from the Detectron2 Model Zoo. Its backbone consists of a 50-layer ResNet combined with a Feature Pyramid Network (FPN) for feature extraction. The Model Zoo reports the following figures for this model:

  • Name: R50-FPN
  • lr sched: 3x
  • train time (s/iter): 0.209
  • inference time (s/im): 0.038
  • train mem (GB): 3.0
  • box AP: 40.2
  • model id: 137849458

image

For details of my work, see: thesis

Detectron2: Training Script Python

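# Training configuration using the Detectron2 library
import os

from detectron2 import model_zoo
from detectron2.config import get_cfg
from detectron2.engine import DefaultTrainer
from detectron2.evaluation import COCOEvaluator


# DefaultTrainer has no evaluator of its own, so periodic evaluation
# (cfg.TEST.EVAL_PERIOD) needs a subclass that builds one.
# The class name CocoTrainer is an assumption made for this listing.
class CocoTrainer(DefaultTrainer):
    @classmethod
    def build_evaluator(cls, cfg, dataset_name, output_folder=None):
        if output_folder is None:
            output_folder = os.path.join(cfg.OUTPUT_DIR, "inference")
        return COCOEvaluator(dataset_name, cfg, False, output_folder)


cfg = get_cfg()

# Start from the Faster R-CNN R50-FPN 3x configuration of the Model Zoo
cfg.merge_from_file(model_zoo.get_config_file("COCO-Detection/faster_rcnn_R_50_FPN_3x.yaml"))

# Both datasets must already be registered with Detectron2 under these names
cfg.DATASETS.TRAIN = ("my_dataset_train",)
cfg.DATASETS.TEST = ("my_dataset_val",)

cfg.DATALOADER.NUM_WORKERS = 4

# Initialize from the weights pre-trained on COCO
cfg.MODEL.WEIGHTS = model_zoo.get_checkpoint_url("COCO-Detection/faster_rcnn_R_50_FPN_3x.yaml")

cfg.SOLVER.IMS_PER_BATCH = 4
cfg.SOLVER.BASE_LR = 0.001

# Number of training iterations
cfg.SOLVER.MAX_ITER = 500

cfg.MODEL.ROI_HEADS.BATCH_SIZE_PER_IMAGE = 64
cfg.MODEL.ROI_HEADS.NUM_CLASSES = 5

# Evaluate on the validation set every 100 iterations
cfg.TEST.EVAL_PERIOD = 100
os.makedirs(cfg.OUTPUT_DIR, exist_ok=True)
trainer = CocoTrainer(cfg)
trainer.resume_or_load(resume=False)
trainer.train()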

For details of my work, see: DatasetVasi_maxiter500_cocoevaluator and thesis

Detectron2: Testing Script Python

# Test configuration using the Detectron2 library
# Reuses cfg and trainer from the training script
import os

from detectron2.data import build_detection_test_loader
from detectron2.engine import DefaultPredictor
from detectron2.evaluation import COCOEvaluator, inference_on_dataset

# Load the weights saved at the end of training
cfg.MODEL.WEIGHTS = os.path.join(cfg.OUTPUT_DIR, "model_final.pth")
# Keep only detections with a confidence score of at least 0.85
cfg.MODEL.ROI_HEADS.SCORE_THRESH_TEST = 0.85
predictor = DefaultPredictor(cfg)

evaluator = COCOEvaluator("my_dataset_test", cfg, False, output_dir="./output/")

# Run the trained model on the testing set and compute the COCO metrics
test_loader = build_detection_test_loader(cfg, "my_dataset_test")
inference_on_dataset(trainer.model, test_loader, evaluator)

For details of my work, see: DatasetVasi_maxiter500_cocoevaluator and thesis

Detectron2: COCO Metrics in the training and testing phase

The model was evaluated with the COCO metric "AP at IoU=.50:.05:.95", i.e. the average precision of the predicted bounding boxes, averaged over IoU thresholds from 0.50 to 0.95 in steps of 0.05. The results were:

  • Train: AP = 79.907%
  • Test: AP = 81.760%
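
To make the role of IoU in this metric concrete, the sketch below computes the IoU of two boxes and lists the ten thresholds over which COCO averages. It is not the COCOEvaluator implementation and does not compute AP itself; the function names and the two sample boxes are hypothetical.

```python
def iou(box_a, box_b):
    """Intersection over Union of two boxes given as (x_min, y_min, x_max, y_max)."""
    ix_min = max(box_a[0], box_b[0])
    iy_min = max(box_a[1], box_b[1])
    ix_max = min(box_a[2], box_b[2])
    iy_max = min(box_a[3], box_b[3])
    # Clamp at zero so disjoint boxes give an empty intersection.
    inter = max(0.0, ix_max - ix_min) * max(0.0, iy_max - iy_min)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


# The ten thresholds 0.50, 0.55, ..., 0.95 used by the metric.
IOU_THRESHOLDS = [round(0.50 + 0.05 * i, 2) for i in range(10)]


def thresholds_passed(pred, truth):
    """Thresholds at which the prediction would count as a correct detection."""
    score = iou(pred, truth)
    return [t for t in IOU_THRESHOLDS if score >= t]


# Made-up ground-truth and predicted boxes, offset by 20 pixels.
truth = (100, 100, 300, 300)
pred = (120, 120, 320, 320)
passed = thresholds_passed(pred, truth)
```

In this example the IoU is about 0.68, so the prediction counts as correct at the four lowest thresholds only, which is why AP averaged over all ten thresholds rewards tight localization.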

For details of my work, see: DatasetVasi_maxiter500_cocoevaluator and thesis

Detectron2: Results

[Images: detection results]

For details of my work, see: DatasetVasi_maxiter500_cocoevaluator and thesis
