"latent_stylegan1.npy" mismatch with image and label #41

carter54 opened this issue Dec 22, 2021 · 0 comments
carter54 commented Dec 22, 2021

The training process described in #19 (comment) consists of the following steps:

- Step 0: Train StyleGAN V1 on your own dataset to get a pre-trained StyleGAN (or instead download a pre-trained StyleGAN from the official StyleGAN V1 GitHub repository or another source).

- Step 1: Follow the instructions for the [TensorFlow -> PyTorch] conversion of the pre-trained StyleGAN weights file in issue #1. This will give you the ".pt" file you need to perform all of the following steps.

- Step 2: **avg_latent_stylegan1.npy**:
  - First, load the pre-trained StyleGAN you got in Step 1. For this step, the initialization excludes the threshold (as is also done in Step 1 if you want to test the pre-trained StyleGAN).
  - You only need the "G_mapping" part (the mapping network) of "g_all" (the complete StyleGAN V1).
  - Loop over "range(0, 8000)" and, on each iteration, sample a random "Z latent code" with "np.random.randn(1, 512)" (a standard Gaussian, N(0, 1)), feed it to "G_mapping", and capture its output (the "W latent code").
  - Then simply average the 8,000 "W latent codes" you obtained, and store the result in a NumPy file that you can name "avg_latent_stylegan1.npy" (or any other name).
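  The loop in Step 2 can be sketched as follows. This is a minimal, runnable sketch, not the repository's actual code: `g_mapping` below is a stand-in linear map for the real "G_mapping" PyTorch module, which you would instead call under `torch.no_grad()` on the loaded `g_all` and convert its output tensor back to NumPy.

  ```python
  import numpy as np

  rng = np.random.RandomState(0)
  W_stand_in = rng.randn(512, 512).astype(np.float32)

  def g_mapping(z):
      # Placeholder for the 8-layer Z -> W mapping MLP; real code would call
      # the "G_mapping" submodule of the pre-trained StyleGAN V1 here.
      return z @ W_stand_in

  n_samples = 8000
  acc = np.zeros((1, 512), dtype=np.float64)
  for _ in range(n_samples):
      z = rng.randn(1, 512)   # Z latent code ~ N(0, 1)
      w = g_mapping(z)        # W latent code
      acc += w

  # Average of the 8,000 W latent codes, saved for later initialization.
  avg_latent = (acc / n_samples).astype(np.float32)
  np.save("avg_latent_stylegan1.npy", avg_latent)
  ```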

- Step 3: **latent_stylegan1.npy** + generation of fake images:
   - First, load the pre-trained StyleGAN you got in Step 1. This time, include the threshold in the architecture, since you now have the "avg_latent" needed for its initialization.
   - Loop over "range(0, nb_img_you_want)" and, at each iteration, create a random "Z latent code" ("latent = np.random.randn(1, 512)") and pass it to the StyleGAN via the "latent_to_image" function (from the "utils.utils" file in this GitHub repo). During this loop, you must store the latent code tied to the creation of each image, since these will be saved in the "latent_stylegan1.npy" file. You also need to save the generated images, as you will then need to label them manually (using, for example, LabelMe: https://github.com/wkentaro/labelme).
   - If you choose to label only a subset of the generated images, then "latent_stylegan1.npy" should contain only the latent codes of the images you selected - but this can be done later, after you have manually labelled some of the images generated by the pre-trained StyleGAN.
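  The bookkeeping in Step 3 can be sketched like this. `generate_image` is a hypothetical stand-in for the repo's real `latent_to_image(...)` call so the example runs on its own; the important part is keeping each saved image paired, in order, with its latent code.

  ```python
  import numpy as np

  rng = np.random.RandomState(0)

  def generate_image(latent):
      # Placeholder: the real pipeline would call latent_to_image(...) from
      # utils.utils with the loaded StyleGAN and return the generated image.
      return (rng.rand(64, 64, 3) * 255).astype(np.uint8)

  nb_img_you_want = 16
  latents = []
  for i in range(nb_img_you_want):
      latent = rng.randn(1, 512)    # Z latent code for image i
      img = generate_image(latent)  # real code: latent_to_image(...)
      # save img as f"image_{i}.jpg" here, for later manual labelling
      latents.append(latent[0])     # keep the code paired with image i

  # One row per generated image, in the same order as the saved images.
  all_latents = np.stack(latents)
  np.save("latent_stylegan1.npy", all_latents)
  ```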

- Step 4: Manually label some fake images generated by the StyleGAN (using, for example, LabelMe: https://github.com/wkentaro/labelme) in order to obtain their corresponding "masks".

- Step 5: Now that you have "avg_latent_stylegan1.npy", "latent_stylegan1.npy", the pre-trained StyleGAN, and some (image, mask) pairs for the DatasetGAN training dataset, you can run the DatasetGAN training via "train_interpreter.py" from this GitHub repo. Small modifications may be necessary in places (e.g. the file extension used for the images, which could differ from ".jpg").

I hope you are now better equipped to implement all of these parts yourself. If you haven't already, I strongly suggest reading the StyleGAN V1 paper first, then reading and understanding all the code in "train_interpreter.py" in this GitHub repo, before you start implementing these steps.

_Originally posted by @PoissonChasseur in https://github.com/nv-tlabs/datasetGAN_release/issues/19#issuecomment-892114720_

I followed these steps and found that the generated dataset is very strange... so I checked the code and found the following:

  1. "make_training_data.py" does not save the latent code of the first image ("image_0.jpg") in "latent_stylegan1.npy".
  2. "train_interpreter.py" loads the latent codes from "latent_stylegan1.npy" while, at the same time, loading images starting from "image_0.jpg" to generate the input:
    im_frame = np.load(os.path.join( args['annotation_mask_path'] , name))
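A minimal illustration of the resulting off-by-one (file names and counts here are illustrative, not taken from the repo):

```python
import numpy as np

# Suppose 5 images (image_0.jpg .. image_4.jpg) were generated, but the
# latent of image_0.jpg was never written: the saved array then has only
# 4 rows, belonging to images 1..4.
saved_latents = np.stack(
    [np.full(512, i, dtype=np.float32) for i in range(1, 5)]
)
num_images = 5
assert saved_latents.shape[0] == num_images - 1  # one row short: the bug

# Naive pairing at training time: saved_latents[k] <-> image_k.jpg.
# Under the bug, row k actually belongs to image_{k+1}.jpg:
assert saved_latents[0, 0] == 1  # row 0 is image_1's code, not image_0's
```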