Skip to content

cl-tohoku/Multi-VidSum-Eval

Repository files navigation

Evaluation scripts for Multi-Vid-Sum task

This repository contains the evaluation scriots for the paper A Challenging Multimodal Video Summary: Simultaneously Extracting and Generating Keyframe-Caption Pairs from Video.

Preparation

Specify the directory to be mounted in docker-compose.yml

Please change the directory to be mounted in docker-compose.yml to the directory where this repository is cloned. Also, please change the working directory to any directory. Please specify the json file of the inference result obtained from Note that do not change target directory in the docker-compose.yml.

Build and enter the docker container

docker-compose up -d
docker exec -it $container_name bash

Download resources for evaluation

cd /work
huggingface-cli download tohoku-nlp/multi-vidsum-eval --local-dir . --local-dir-use-symlinks False --repo-type dataset
tar -xzvf downloads.tar.gz --no-same-owner

Evaluation

You can evaluate the result by the following command.

cd eval_reranking
RESULT_FILE=/work/downloads/inference_results/sample_result.json
PRETRAINED_MODEL_NAME=vid2seq
TAG=sample
bash eval.sh $RESULT_FILE $PRETRAINED_MODEL_NAME $TAG
  • RESULT_FILE: json file of the inference result obtained from
  • PRETRAINED_MODEL_NAME: name of the pretrained model used for captioning. See eval_reranking/eval.sh for details.
  • TAG: tag for the result. Specify any string.

About

No description, website, or topics provided.

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages