MixViT ConvMAE #103

Open
samueleruffino99 opened this issue Dec 13, 2023 · 1 comment

Comments

samueleruffino99 commented Dec 13, 2023

Hello, I see that you refer to the ConvMAE-pretrained method as MixViT-ConvMAE, but looking at your implementation, the backbone is much closer to the MixCvT layout, with multiple patch embeddings and blocks.
Am I missing something?
I ask because I am trying to adapt PiMAE the same way you adapted the ConvMAE model. Thank you!

Moreover, I see that during training you pass the template and search tokens through the same backbone multiple times. How does the training procedure deal with this? I would like to enrich your model with some notion of hand trajectory (for example, when the tracked object is being handled).

@yutaocui
Collaborator

In terms of the patch embedding style, MixViT-ConvMAE is more like MixCvT, so you are right.
As for the second question, I don't understand what you mean. Could you give a more detailed explanation?
