
Finding mean for truncation trick #4

Open
yaseryacoob opened this issue Mar 2, 2021 · 5 comments

@yaseryacoob

Can you please explain what this does in the notebook? Should this truncation be recomputed if one wants to create more diverse generations?

@avecplezir
Collaborator

avecplezir commented Mar 4, 2021

Hi! To create more diversity you need to increase the value of the truncation argument passed to g_ema, which is currently set to 0.6 (see the last cell in the notebook). You don't need to recompute truncation_latent.

The "Finding mean for truncation trick" cell computes the mean vector in W space; later, every sampled latent vector in W space is pulled toward that mean with strength one minus truncation.
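
In code, the trick described above looks roughly like the following minimal sketch. The names g_ema, style, and truncation_latent follow the notebook's conventions, but the exact call signatures here are assumptions for illustration, not the notebook's cell verbatim:

```python
import torch

@torch.no_grad()
def compute_mean_latent(g_ema, n_samples=4096, latent_dim=512, device="cuda"):
    # Sample many z vectors, map them through the style (mapping) network
    # into W space, and average to get the "truncation_latent".
    z = torch.randn(n_samples, latent_dim, device=device)
    w = g_ema.style(z)  # assumed mapping-network attribute: Z -> W
    return w.mean(dim=0, keepdim=True)

def truncate(w, mean_latent, truncation=0.6):
    # Pull each sampled w toward the mean with strength (1 - truncation):
    # truncation = 1.0 leaves w unchanged (full diversity),
    # truncation = 0.0 collapses every sample onto the mean.
    return mean_latent + truncation * (w - mean_latent)
```

This is also why raising truncation toward 1.0 increases diversity without recomputing anything: the mean vector stays the same, only the interpolation strength changes.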

@yaseryacoob
Author

Thanks for the explanation. I noticed less diversity than I expected when I generated 1K images (using a fixed truncation after it was computed in the second-to-last block of the notebook). I like the quality of the generated faces. Projecting images into the latent space appears trickier. Assuming you are ahead of me on that, I would like to share and hear your thoughts.

@KirillDemochkin

Hi! I am one of the authors who worked on this paper.

Could you give some more details on how you are doing the projection currently?
We actually found that real-world images invert very nicely into the latent space. The restored images preserve more fine details, such as earrings, facial hair, clothing patterns, and complex backgrounds. It is also possible to use multiple style vectors per image, so that different regions of the image correspond to different optimized style vectors.
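
As a rough illustration, per-layer optimization with multiple style vectors looks something like the sketch below. The generator interface (g_ema.synthesis taking a stack of per-layer w vectors) and the plain L2 objective are assumptions for readability; in practice you would add perceptual terms:

```python
import torch
import torch.nn.functional as F

def invert(g_ema, target, mean_latent, n_layers=14, steps=1000, lr=0.01):
    # One style vector per generator layer ("W+"-style), all initialized
    # at the W-space mean and optimized independently, so different layers
    # (and hence different image regions/scales) can settle on different styles.
    w_plus = mean_latent.detach().clone().repeat(n_layers, 1).requires_grad_(True)
    opt = torch.optim.Adam([w_plus], lr=lr)
    for _ in range(steps):
        img = g_ema.synthesis(w_plus.unsqueeze(0))  # assumed interface
        loss = F.mse_loss(img, target)
        opt.zero_grad()
        loss.backward()
        opt.step()
    return w_plus.detach()
```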

-Kirill

@yaseryacoob
Author

We have used a number of approaches; a quick summary:

  1. Similar to StyleGAN2-ADA projection, using a VGG feature metric.
  2. VGG + L2 + LPIPS + ID loss (similar to pSp); see the sketch below.
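
For concreteness, the combined objective in the second approach looks roughly like this. The weights and the id_encoder (e.g. a pretrained ArcFace face-recognition network) are illustrative assumptions, not our exact configuration:

```python
import torch.nn.functional as F
import lpips  # pip install lpips

lpips_fn = lpips.LPIPS(net="vgg")  # expects images scaled to [-1, 1]

def reconstruction_loss(generated, target, id_encoder,
                        w_l2=1.0, w_lpips=0.8, w_id=0.1):
    l2 = F.mse_loss(generated, target)
    perceptual = lpips_fn(generated, target).mean()
    # Identity loss: cosine distance between face embeddings.
    e_gen = F.normalize(id_encoder(generated), dim=-1)
    e_tgt = F.normalize(id_encoder(target), dim=-1)
    identity = (1 - (e_gen * e_tgt).sum(dim=-1)).mean()
    return w_l2 * l2 + w_lpips * perceptual + w_id * identity
```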

For now we are working with FFHQ 256 for speed, but the reconstruction, while decent, is not as good as we expected; we really want to get FFHQ 1K working well. Send me an email at [email protected] and I will share the results with you, and we can discuss it more.

Of course, I would love to see your projection results and algorithm if you can share them.
Thanks.

@KirillDemochkin

I see. As of right now we have been more focused on inverting images via optimization, and the encoder architecture is at an early stage, but I would love to see what you have managed to do so far!
