Reproduction on Composite dataset #5

Open
joeyz0z opened this issue Oct 25, 2023 · 1 comment

Comments

@joeyz0z
joeyz0z commented Oct 25, 2023

I want to reproduce the results on the Composite dataset.

  1. First, I downloaded the Composite dataset from https://imagesdg.wordpress.com/image-to-scene-description-graph/. There are two types of ratings: correctness and thoroughness. I used the first one, correctness.
  2. In the correctness file, each image usually has only 3 or 4 human-rated captions. Some candidate captions read like a paragraph. Do I need to truncate them into a short sentence?
    For example: person is pulling bow in the back.A person might be wearing helmet in the scene.person is having tattoo.The scene contains grass and well-maintained grass and garden and playhouse.
  3. I converted the Composite dataset into the Flickr8k JSON style and used compute_correlations.py to compute the human correlation, but my scores differ from the ones reported in the paper.

Could you give me some guidance on how to process the Composite dataset and reproduce the reported scores? I would appreciate it if you could take the time to reply.

@sarasarto
Collaborator

Hi,
thanks for your interest in our work.

  1. The download link is correct, but we use both correctness and thoroughness.
  2. We didn't shorten the sentences. Be aware, though, that CLIP's text encoder accepts at most 77 tokens per caption, so longer captions have to be truncated.
  3. In our case, we extracted the candidate captions and their corresponding ratings from the CSV files you mentioned.
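
To illustrate the 77-token limit from point 2, here is a minimal sketch of what truncation to CLIP's context length looks like. It is not the code used for the paper. The helper name `fit_to_context` is hypothetical, and the special-token ids (49406 for start-of-text, 49407 for end-of-text) are assumed to be the ones from OpenAI's CLIP vocabulary. The input is a list of BPE ids that some tokenizer has already produced. If you use OpenAI's `clip` package, `clip.tokenize(texts, truncate=True)` should give the same kind of result without a custom helper.

```python
from typing import List

CONTEXT_LENGTH = 77  # CLIP text context size mentioned in the reply above
SOT_ID = 49406       # assumed start-of-text id (OpenAI CLIP vocabulary)
EOT_ID = 49407       # assumed end-of-text id (OpenAI CLIP vocabulary)


def fit_to_context(token_ids: List[int],
                   context_length: int = CONTEXT_LENGTH) -> List[int]:
    """Hypothetical helper: wrap BPE ids with SOT/EOT, truncate, zero-pad.

    The special tokens count toward the limit, so at most
    context_length - 2 caption tokens survive.
    """
    ids = [SOT_ID] + list(token_ids) + [EOT_ID]
    if len(ids) > context_length:
        # Cut the tail and make sure the sequence still ends with EOT.
        ids = ids[:context_length]
        ids[-1] = EOT_ID
    # Pad short captions with zeros up to the fixed context length.
    return ids + [0] * (context_length - len(ids))


if __name__ == "__main__":
    long_caption = list(range(1, 101))  # stand-in for 100 BPE ids
    fitted = fit_to_context(long_caption)
    print(len(fitted), fitted[0], fitted[-1])
```

Since the two special tokens are part of the 77, a paragraph-like Composite caption keeps only its first 75 BPE tokens, which may be one source of score differences if long captions are handled differently.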
