How to use my own midi data as input? #1

Open
dianxin556 opened this issue Sep 23, 2021 · 8 comments

@dianxin556

Since the data format is pkl, how do I generate the pkl files from MIDI? Thanks~

@slSeanWU
Member

Hi dianxin556,

You may consult this README (from my colleague's repo) for the conversion of MIDI files into REMI event format:
https://github.com/YatingMusic/compound-word-transformer/blob/main/dataset/Dataset.md

The corpus2events.py code you need in Step 5 is here:
https://github.com/YatingMusic/compound-word-transformer/tree/main/dataset/representations/uncond/remi

Please note that you don't need to run the last two scripts in Step 5.
Instead, you should mark the indices (positions) of Bar events of each piece in the dataset you obtained from corpus2events.py and arrange each piece as such a tuple: (bar_positions, events) to make a dataset MuseMorphose accepts.
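The arrangement described above can be sketched as follows. This is a minimal illustration, not code from either repo: it assumes each event is a dict with "name" and "value" keys, and the helper name `to_musemorphose_piece` and the toy event list are hypothetical.

```python
from typing import Dict, List, Tuple

Event = Dict[str, object]


def to_musemorphose_piece(events: List[Event]) -> Tuple[List[int], List[Event]]:
    """Pair a piece's event list with the indices of its Bar events."""
    bar_positions = [i for i, ev in enumerate(events) if ev["name"] == "Bar"]
    return bar_positions, events


# Toy piece with two bars (event names other than "Bar" are illustrative).
toy_events = [
    {"name": "Bar", "value": None},
    {"name": "Beat", "value": 0},
    {"name": "Note_Pitch", "value": 60},
    {"name": "Bar", "value": None},
    {"name": "Beat", "value": 4},
]

bar_positions, events = to_musemorphose_piece(toy_events)
print(bar_positions)  # [0, 3]
```

Each piece would then be pickled as this tuple, one file per piece.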

@dianxin556
Author

Hi,

Thanks a lot for your patient reply, I will try it~

@isidontarou0117

Could you provide a sample script for generating the pkl?
I don't know how to arrange each piece as such a tuple to make a dataset MuseMorphose accepts.

@eri24816

eri24816 commented Jul 7, 2022

Hi,

I'm also trying to use my own audio file as input.

I converted the audio into REMI and tried to feed it into generate.py, but it turns out that the vocabulary of the converted REMI data was not consistent with the one that the pretrained model uses. When I ran generate.py, the following exception occurred (likely due to mismatched vocabulary size):

Traceback (most recent call last):
  File "generate.py", line 217, in <module>
    model.load_state_dict(torch.load(ckpt_path, map_location='cpu'))
  File "/home/user/.local/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1482, in load_state_dict
    raise RuntimeError('Error(s) in loading state_dict for {}:\n\t{}'.format(
RuntimeError: Error(s) in loading state_dict for MuseMorphose:
        size mismatch for token_emb.emb_lookup.weight: copying a param with shape torch.Size([333, 512]) from checkpoint, the shape in current model is torch.Size([209, 512]).
        size mismatch for dec_out_proj.weight: copying a param with shape torch.Size([333, 512]) from checkpoint, the shape in current model is torch.Size([209, 512]).
        size mismatch for dec_out_proj.bias: copying a param with shape torch.Size([333]) from checkpoint, the shape in current model is torch.Size([209]).
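One way to spot this kind of mismatch before calling load_state_dict is to compare parameter shapes directly. The sketch below is a hypothetical helper, not part of MuseMorphose; it works on plain name-to-shape dicts so it runs without PyTorch. The shapes are copied from the traceback above, and the reading that the first dimension (333 vs. 209) is the vocabulary size is an assumption.

```python
from typing import Dict, List, Tuple

Shape = Tuple[int, ...]


def find_size_mismatches(
    ckpt_shapes: Dict[str, Shape], model_shapes: Dict[str, Shape]
) -> List[Tuple[str, Shape, Shape]]:
    """Return (name, checkpoint_shape, model_shape) for every shared
    parameter whose shapes differ."""
    mismatches = []
    for name, ckpt_shape in ckpt_shapes.items():
        model_shape = model_shapes.get(name)
        if model_shape is not None and model_shape != ckpt_shape:
            mismatches.append((name, ckpt_shape, model_shape))
    return mismatches


# With PyTorch, such a dict can be built from a state dict via
# {k: tuple(v.shape) for k, v in state_dict.items()}.
ckpt_shapes = {
    "token_emb.emb_lookup.weight": (333, 512),
    "dec_out_proj.weight": (333, 512),
    "dec_out_proj.bias": (333,),
}
model_shapes = {
    "token_emb.emb_lookup.weight": (209, 512),
    "dec_out_proj.weight": (209, 512),
    "dec_out_proj.bias": (209,),
}

mismatches = find_size_mismatches(ckpt_shapes, model_shapes)
for name, ckpt_shape, model_shape in mismatches:
    print(f"{name}: checkpoint {ckpt_shape} vs. model {model_shape}")
```

If all three parameters differ only in their first dimension, as here, the model was probably built with a different vocabulary than the checkpoint was trained with.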

@eri24816

eri24816 commented Jul 8, 2022

Thanks to @slSeanWU's help, I finally found a solution (for the case where the original data is an mp3 file).

In the folder compound-word-transformer/dataset/

  1. Place the .mp3 file in mp3/
  2. Transcribe the audio file to MIDI with an external tool
  3. Place the transcribed .midi file in midi_transcribed/
  4. Run synchronizer.py, analyzer.py, midi2corpus.py, and representations/uncond/remi/corpus2events.py
  5. Copy folder representations/uncond/remi/ailab17k_from-scratch_remi to the MuseMorphose repo

In the folder MuseMorphose/

  1. Run the following script to make the dataset compatible with MuseMorphose:
from utils import pickle_load, pickle_dump  # helpers from the MuseMorphose repo
import glob

for orig_file in glob.glob("./ailab17k_from-scratch_remi/events/*.pkl"):
    # Write the converted piece one level up, outside events/
    out_file = orig_file.replace('/events/', '/')
    events = pickle_load(orig_file)

    # Clamp note velocities to the range [40, 80]
    for event in events:
        if event["name"] == "Note_Velocity":
            event["value"] = min(max(40, event["value"]), 80)

    # Record the index of every Bar event
    bar_idx = [idx for idx, event in enumerate(events) if event["name"] == "Bar"]

    # Store each piece as a (bar_positions, events) tuple
    pickle_dump((bar_idx, events), out_file)
  2. Edit attributes.py around lines 7-9 to read:
data_dir = 'ailab17k_from-scratch_remi'
polyph_out_dir = 'ailab17k_from-scratch_remi/attr_cls/polyph'
rhythm_out_dir = 'ailab17k_from-scratch_remi/attr_cls/rhythm'
  3. Run attributes.py
  4. Open the config file and make data_dir point to the custom dataset:
data_dir:         ./ailab17k_from-scratch_remi
  5. Now you can run generate.py on the custom dataset!
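Before running generate.py, it may help to sanity-check a converted piece. The following is a rough sketch under the assumptions used in the steps above: a piece is a (bar_idx, events) tuple, events are dicts with "name" and "value" keys, and velocities should fall in [40, 80]. The function name `check_piece` is hypothetical and not part of MuseMorphose.

```python
def check_piece(piece, vmin=40, vmax=80):
    """Return a list of human-readable problems found in one converted piece."""
    bar_idx, events = piece
    problems = []

    # Bar positions should be exactly the indices of the Bar events.
    expected = [i for i, ev in enumerate(events) if ev["name"] == "Bar"]
    if list(bar_idx) != expected:
        problems.append(
            f"bar positions {list(bar_idx)} != Bar event indices {expected}"
        )

    # Velocities should lie inside the clamped range.
    for i, ev in enumerate(events):
        if ev["name"] == "Note_Velocity" and not (vmin <= ev["value"] <= vmax):
            problems.append(
                f"event {i}: velocity {ev['value']} outside [{vmin}, {vmax}]"
            )
    return problems


good_piece = (
    [0, 2],
    [
        {"name": "Bar", "value": None},
        {"name": "Note_Velocity", "value": 64},
        {"name": "Bar", "value": None},
    ],
)
bad_piece = (
    [1],
    [
        {"name": "Bar", "value": None},
        {"name": "Note_Velocity", "value": 100},
    ],
)

print(check_piece(good_piece))
print(check_piece(bad_piece))
```

An empty list means the piece passed both checks; in practice each piece would be read from its pickle file first.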

@dedededefo

Hello, do you know how to train the model with your own data? How do I get remi_vocab.pkl, train_pieces.pkl, test_pieces.pkl, and val_pieces.pkl in the pickles folder?

@dedededefo

Hello, do you know how to train the model with your own data? How do I get remi_vocab.pkl, train_pieces.pkl, test_pieces.pkl, and val_pieces.pkl in the pickles folder? Thanks!

@dedededefo

Hi! What are remi_vocab.pkl, train_pieces.pkl, test_pieces.pkl, and val_pieces.pkl in the pickles folder? How do I get them? Thanks
