
ModuleNotFoundError: No module named 'transformers_modules.togethercomputer.evo-1-131k-base.9562f3fdc38f09b92594864c5e98264f1bfbca33.tokenizer' #53

Open
adrienchaton opened this issue Apr 23, 2024 · 7 comments

Comments

adrienchaton commented Apr 23, 2024

Hi all and thanks for open sourcing this interesting model!

I managed to install flash-attention and all the other packages, so I am able to import the evo package.
But I am stuck with the following error:
ModuleNotFoundError: No module named 'transformers_modules.togethercomputer.evo-1-131k-base.9562f3fdc38f09b92594864c5e98264f1bfbca33.tokenizer'

This happens whether I use the evo package directly:

from evo import Evo
import torch
device = 'cuda:0'
evo_model = Evo('evo-1-131k-base') # here it crashes

or try to load directly from HF:

from transformers import AutoConfig, AutoModelForCausalLM
model_name = 'togethercomputer/evo-1-131k-base'
model_config = AutoConfig.from_pretrained(model_name, trust_remote_code=True)
model_config.use_cache = True
model = AutoModelForCausalLM.from_pretrained(model_name, config=model_config, trust_remote_code=True) # here it crashes

The error points to transformers_modules.togethercomputer.evo-1-131k-base regardless of which Evo checkpoint I select, and I tried updating transformers both to the latest release and to "4.36.2", as shown in https://huggingface.co/togethercomputer/evo-1-131k-base/blob/main/generation_config.json.

Any clue on how to solve this error please? Thanks!

Zymrael (Collaborator) commented Apr 28, 2024

What is the stack trace for the error you see when it crashes?

adrienchaton (Author) commented:

Thanks for your reply. I am still stuck with this error and have not been able to use the Evo model yet.

Here is the full trace. The same happens if I try to load the phase 2 checkpoint, or if I load through the evo package instead of the auto classes from Hugging Face:

>>> model_config = AutoConfig.from_pretrained('togethercomputer/evo-1-8k-base', trust_remote_code=True)
>>> model = AutoModelForCausalLM.from_pretrained('togethercomputer/evo-1-8k-base', config=model_config, trust_remote_code=True)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "xxx/mambaforge/envs/bm-llms-minimal/lib/python3.9/site-packages/transformers/models/auto/auto_factory.py", line 550, in from_pretrained
    model_class = get_class_from_dynamic_module(
  File "xxx/mambaforge/envs/bm-llms-minimal/lib/python3.9/site-packages/transformers/dynamic_module_utils.py", line 501, in get_class_from_dynamic_module
    return get_class_in_module(class_name, final_module)
  File "xxx/mambaforge/envs/bm-llms-minimal/lib/python3.9/site-packages/transformers/dynamic_module_utils.py", line 201, in get_class_in_module
    module = importlib.machinery.SourceFileLoader(name, module_path).load_module()
  File "<frozen importlib._bootstrap_external>", line 529, in _check_name_wrapper
  File "<frozen importlib._bootstrap_external>", line 1029, in load_module
  File "<frozen importlib._bootstrap_external>", line 854, in load_module
  File "<frozen importlib._bootstrap>", line 274, in _load_module_shim
  File "<frozen importlib._bootstrap>", line 711, in _load
  File "<frozen importlib._bootstrap>", line 680, in _load_unlocked
  File "<frozen importlib._bootstrap_external>", line 850, in exec_module
  File "<frozen importlib._bootstrap>", line 228, in _call_with_frames_removed
  File "xxx/.cache/huggingface/modules/transformers_modules/togethercomputer/evo-1-131k-base/9562f3fdc38f09b92594864c5e98264f1bfbca33/modeling_hyena.py", line 11, in <module>
    from .model import StripedHyena
  File "xxx/.cache/huggingface/modules/transformers_modules/togethercomputer/evo-1-131k-base/9562f3fdc38f09b92594864c5e98264f1bfbca33/model.py", line 26, in <module>
    from .tokenizer import ByteTokenizer
ModuleNotFoundError: No module named 'transformers_modules.togethercomputer.evo-1-131k-base.9562f3fdc38f09b92594864c5e98264f1bfbca33.tokenizer'

To answer your question on the Hugging Face space: I tried both transformers==4.36.2, as shown in the config file, and currently:

# Name                    Version                   Build  Channel
transformers              4.39.3             pyhd8ed1ab_0    conda-forge

This is the first Hugging Face model loaded from external classes that I have tried to run, so I had never come across such an error before ... Thanks for any hints!

adrienchaton (Author) commented:

@Zymrael in case it is relevant: I also tried manually downloading the checkpoints and loading from the local copies, but it didn't help (same error). I also tried updating transformers to the latest version from the GitHub source, which produces the same error.

# Name                    Version                   Build  Channel
transformers              4.41.0.dev0              pypi_0    pypi
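For reference, loading from the local copy looked roughly like this (the path is a placeholder for wherever the repository was cloned; it fails with the same ModuleNotFoundError):

from transformers import AutoConfig, AutoModelForCausalLM

# Hypothetical local path: replace with wherever the HF repo was cloned.
local_path = '/path/to/evo-1-131k-base'
model_config = AutoConfig.from_pretrained(local_path, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    local_path, config=model_config, trust_remote_code=True
)  # crashes with the same error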

oliverfleetwood commented:

I have the same issue

adrienchaton (Author) commented:

@oliverfleetwood did you make any progress?
@Zymrael I am not sure what causes this issue, but my guess is that some version mismatch may currently lead to it ... would it make sense to try building an env with everything pinned to the package versions you are using when running Evo?
Thanks in advance for any help, it's a pity not to be able to test it ...

juliocesar-io commented May 9, 2024

Hello all, I had the same issue and found a workaround. Apparently with revision='1.1_fix' it is not able to download the model from HF ... maybe cache issues?

The error I got:

ModuleNotFoundError: No module named 'transformers_modules.togethercomputer.evo-1-131k-base.c206aab77ae5967a069c4200ecb1858588528c9d.tokenizer'

How to fix it

Change to revision='main' in the load_checkpoint function of the Evo class, located in evo/models.py, like this:


    model_config = AutoConfig.from_pretrained(
        hf_model_name,
        trust_remote_code=True,
        revision='main', # change here
    )
    model_config.use_cache = True

    # Load model.
    model = AutoModelForCausalLM.from_pretrained(
        hf_model_name,
        config=model_config,
        trust_remote_code=True,
        revision='main', # change here
    )
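
If you prefer not to edit the installed package, the same idea should work by calling transformers directly and passing revision='main' yourself (a minimal sketch that bypasses the evo wrapper):

from transformers import AutoConfig, AutoModelForCausalLM

model_name = 'togethercomputer/evo-1-131k-base'
# Pin the revision to 'main' instead of the default '1.1_fix'.
model_config = AutoConfig.from_pretrained(model_name, trust_remote_code=True, revision='main')
model_config.use_cache = True
model = AutoModelForCausalLM.from_pretrained(
    model_name, config=model_config, trust_remote_code=True, revision='main'
)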

Then try to load the model again, e.g.:

from evo import Evo
import torch

device = 'cuda:0'

evo_model = Evo('evo-1-131k-base')
model, tokenizer = evo_model.model, evo_model.tokenizer
model.to(device)
model.eval()

sequence = 'ACGT'
input_ids = torch.tensor(
    tokenizer.tokenize(sequence),
    dtype=torch.int,
).to(device).unsqueeze(0)
logits, _ = model(input_ids) # (batch, length, vocab)

print('Logits: ', logits)
print('Shape (batch, length, vocab): ', logits.shape)

It should load the checkpoints. Make sure the model downloaded correctly from HF (https://huggingface.co/togethercomputer/evo-1-131k-base/tree/main) and check that your cache folder has all those files in
.cache/huggingface/modules/transformers_modules/togethercomputer/evo-1-131k-base/<commit_hash>. You can also manually download the files with git clone https://huggingface.co/togethercomputer/evo-1-131k-base and put them in the corresponding <commit_hash> folder.
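A quick way to verify the cache contents (a sketch; the home-relative path assumes the default HF cache location, adjust if HF_HOME is set):

import os

cache_dir = os.path.expanduser(
    '~/.cache/huggingface/modules/transformers_modules/togethercomputer/evo-1-131k-base'
)
for commit_hash in os.listdir(cache_dir):
    # tokenizer.py must be present alongside model.py and modeling_hyena.py,
    # otherwise you get the ModuleNotFoundError above.
    print(commit_hash, sorted(os.listdir(os.path.join(cache_dir, commit_hash))))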

and then... it works. :)

Loading checkpoint shards: 100%|████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 3/3 [00:01<00:00,  2.82it/s]
---------5---------
Logits:  tensor([[[-13.8125, -23.2500, -23.2500,  ..., -23.2500, -23.2500, -23.2500],
         [ -6.6250, -21.1250, -21.1250,  ..., -21.1250, -21.1250, -21.1250],
         [ -7.2500, -20.8750, -20.8750,  ..., -20.8750, -20.8750, -20.8750],
         [ -8.1875, -20.5000, -20.5000,  ..., -20.5000, -20.5000, -20.5000]]],
       device='cuda:0', dtype=torch.bfloat16, grad_fn=<UnsafeViewBackward0>)
Shape (batch, length, vocab):  torch.Size([1, 4, 512])

If the above doesn't work, try the other model: evo_model = Evo('evo-1-8k-base').

Environment

  • torch==2.0.1
  • flash-attn==2.5.8
  • transformers==4.36.2
  • python 3.11
  • cuda 11.7

Hope that helps!

adrienchaton (Author) commented:

@juliocesar-io thanks a lot, this fixed my issue!!!
FYI, I additionally needed to manually copy evo/configs/*.yml into my installed Python package for evo (see the sketch below).
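The copy step was roughly this (a sketch only; the source path is a hypothetical local checkout of the repo, and the destination is located via the installed package's __file__):

import glob
import os
import shutil

import evo  # the installed evo package

# Source: configs from a local checkout of the GitHub repo (hypothetical path).
src = '/path/to/evo-repo/evo/configs'
# Destination: the configs folder inside the installed package.
dst = os.path.join(os.path.dirname(evo.__file__), 'configs')
os.makedirs(dst, exist_ok=True)
for yml in glob.glob(os.path.join(src, '*.yml')):
    shutil.copy(yml, dst)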

Following on this, I have a couple of questions, please (maybe @Zymrael knows too?), regarding the code snippet below (a helper function for myself):

@torch.inference_mode()
def compute_logits(evo_model, evo_tokenizer, sequences=["ATCG", "AATTCCGG"], cuda_device=0):
    assert isinstance(sequences, list), "fn. intended for batched processing with a list of input sequences"
    assert isinstance(cuda_device, int) and cuda_device >= 0, f"device {cuda_device} must be an int >= 0"
    input_ids, seq_lengths = prepare_batch(sequences, evo_tokenizer, prepend_bos=False, device=f'cuda:{cuda_device}')
    # --> input_ids are padded with ones
    # TODO: check against the default prepend_bos=True
    # TODO: what about the attention mask? i.e. lower triangular for masking future steps and always zero on PAD tokens
    logits, inference_params_dict_out = evo_model(input_ids, inference_params_dict=None, padding_mask=None)
    # logits with shape [batch=len(sequences), length=max(seq_lengths), vocab=512]
    # inference_params_dict_out = None
    return logits.cpu().float(), seq_lengths, inference_params_dict_out

  • Are both Evo checkpoints trained without a BOS token? (The example omits it, but the default is prepend_bos=True.)
  • The tokenizer doesn't return a padding mask and the example doesn't compute one. Is it fine to leave it as None, or should I manually compute one with zeros on PAD tokens (like the sketch after this list)?
  • I see helper functions to compute scores and entropies; what is the recommended way to extract residue embeddings, please?
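For the second question, this is the kind of mask I mean (a sketch only; pad_id=1 is my assumption because prepare_batch pads with ones, and whether StripedHyena expects exactly this format is the open question above):

import torch

def make_padding_mask(input_ids: torch.Tensor, pad_id: int = 1) -> torch.Tensor:
    # 1 for real tokens, 0 for PAD positions; shape (batch, length).
    return (input_ids != pad_id).to(torch.int)

# padding_mask = make_padding_mask(input_ids)
# logits, _ = evo_model(input_ids, inference_params_dict=None, padding_mask=padding_mask)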

Thanks again for your assistance!
