
What can we do with an irregular training set? #303

Closed
iperov opened this issue Mar 19, 2018 · 14 comments

Comments

@iperov
Contributor

iperov commented Mar 19, 2018

(sorry for the bad English)

What can we do about this problem: for example, we have 1 face at 90 degrees and 10 faces from 0 to 10 degrees.

With the batch size set to 1, you can see that when the trainer hits the 90-degree image, the training loss jumps to its highest value.

So can the model train more on the one 90-degree face, or would that be ineffective?

@bryanlyon bryanlyon reopened this Mar 19, 2018
@bryanlyon
Contributor

First, sorry for accidentally closing this issue.

Align during extract should make all faces line up the same way. If you have a face that comes in sideways, that is effectively a false positive and should be deleted manually. If you really want that face, you can try running extract on it again with -r on. Right now it will likely find the bad face again without aligning it properly, but it might get it right.

@iperov
Contributor Author

iperov commented Mar 19, 2018

What?

@iperov
Contributor Author

iperov commented Mar 19, 2018

Maybe I explained it poorly. This is not alignment-related.

@bryanlyon
Contributor

You shouldn't be training on faces that are very different from all the others. It confuses the model. All faces should be aligned to each other before training.

@iperov
Contributor Author

iperov commented Mar 19, 2018

Or, for example, there are too few faces with closed eyes, so the result always has open eyes.

@iperov
Contributor Author

iperov commented Mar 19, 2018

@Clorr what do you think? bryanlyon cannot understand me.

@kvrooman
Contributor

kvrooman commented Mar 20, 2018

The training loss jumps because the current weights of the NN, trained on frontal faces, generalize poorly to the single profile face. Also, setting the batch size very small can cause over-fitting, as the NN attempts to adjust itself to each batch and ping-pongs back and forth.

I think your larger question is:
does the extra information provided by the "profile/eyes closed" face improve the fit on the "frontal/eyes open" faces, and vice versa?

Yes, but you either need lots of examples of both to work well, or you need to be able to map the two cases to each other in ways that let the NN pool information about them, i.e. ears will train the same neuron/feature/filter regardless of whether it's a profile or frontal view.

We're accomplishing the first with the image data generator, augmenting the dataset with random translations, rotations, zooms/scales, and flips to generate more variance in the training set. You could also introduce skew, yaw, and pitch transforms to give more made-up "profile" examples to train the autoencoder. Introducing illumination and color shifts would also likely reduce our histogram matching issues.
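The random-augmentation idea above can be sketched in a few lines (a NumPy-only illustration, not the repo's actual generator; `random_transform_matrix` and the nearest-neighbour gather are stand-ins for what `cv2.warpAffine` would normally do):

```python
import numpy as np

def random_transform_matrix(size, rotation_range=10, zoom_range=0.05,
                            shift_range=0.05, rng=None):
    """Compose a random affine matrix: rotation and zoom about the
    image centre, plus a small translation."""
    rng = rng or np.random.default_rng()
    angle = np.radians(rng.uniform(-rotation_range, rotation_range))
    scale = 1 + rng.uniform(-zoom_range, zoom_range)
    tx = rng.uniform(-shift_range, shift_range) * size
    ty = rng.uniform(-shift_range, shift_range) * size
    c, s = np.cos(angle) * scale, np.sin(angle) * scale
    centre = size / 2
    # rotate+scale about the centre, then translate
    return np.array([[c, -s, centre * (1 - c + s) + tx],
                     [s,  c, centre * (1 - s - c) + ty]])

def augment(image, rng=None):
    """Random horizontal flip plus the affine warp, done with a
    nearest-neighbour gather so the sketch stays NumPy-only."""
    rng = rng or np.random.default_rng()
    if rng.random() < 0.5:
        image = image[:, ::-1]                      # horizontal flip
    size = image.shape[0]
    mat = random_transform_matrix(size, rng=rng)
    ys, xs = np.mgrid[0:size, 0:size]
    coords = np.stack([xs.ravel(), ys.ravel(), np.ones(size * size)])
    src = mat @ coords                              # map pixels through the affine
    sx = np.clip(src[0].round().astype(int), 0, size - 1)
    sy = np.clip(src[1].round().astype(int), 0, size - 1)
    return image[sy, sx].reshape(image.shape)
```

Each call produces a slightly different view of the same face, which is where the extra variance in the training set comes from.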

The second method involves a technique called Spatial Transformer Networks. Very crudely explained: put an affine transform in front of a convolution and give it its own weight for each DoF/dimension of the affine. As the NN trains, it can warp the image by itself by training those weights. Here are some recent papers:
http://www.robots.ox.ac.uk/~joao/publications/henriques_icml2017.pdf
https://arxiv.org/pdf/1703.06211.pdf

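The learnable-affine idea can be illustrated with a toy grid-generator and sampler (a NumPy sketch of the STN mechanism, not any library's API; nearest-neighbour sampling is used for brevity, whereas a real STN samples bilinearly so gradients can flow into the six parameters of theta):

```python
import numpy as np

def affine_grid(theta, size):
    """Generate source sampling coordinates from the six affine
    parameters (the weights a spatial transformer would learn)."""
    ys, xs = np.mgrid[0:size, 0:size]
    # normalise coordinates to [-1, 1] so theta is resolution-independent
    xs = 2 * xs / (size - 1) - 1
    ys = 2 * ys / (size - 1) - 1
    grid = np.stack([xs.ravel(), ys.ravel(), np.ones(xs.size)])
    return theta.reshape(2, 3) @ grid     # (2, N) source coords in [-1, 1]

def sample(image, theta):
    """Warp `image` by the affine given by theta. Nearest-neighbour
    here for brevity; a real STN uses bilinear interpolation."""
    size = image.shape[0]
    src = affine_grid(np.asarray(theta, dtype=float), size)
    # map normalised coords back to pixel indices
    sx = np.clip(((src[0] + 1) / 2 * (size - 1)).round().astype(int), 0, size - 1)
    sy = np.clip(((src[1] + 1) / 2 * (size - 1)).round().astype(int), 0, size - 1)
    return image[sy, sx].reshape(image.shape)

# the identity parameters reproduce the input unchanged
theta_identity = np.array([1.0, 0.0, 0.0, 0.0, 1.0, 0.0])
```

In a real network, theta would come out of a small localisation sub-network and be updated by backpropagation, so the model learns its own alignment.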

@deepfakesclub

deepfakesclub commented Mar 20, 2018

I think there are two different problems:

  1. The training spends more time training 0-10 degrees, since the images are drawn randomly for each batch of training. The 90 degree angle will train roughly 10x more slowly. The model will also bias towards the more common example (open eyes vs closed eyes).

  2. During conversion, your model has poor quality for 20-80 degrees of face A -> face B. Even if you solve problem 1, problem 2 may still be present.

Problem 1 can maybe be fixed by adjusting batch sampling to account for the angle distribution, if the number of images >> batch size. If the number of images ~ batch size, as in your example, that is interesting; I don't know. If there is exactly one 90-degree face and it appears multiple times in one batch, would that actually improve the training? I guess it could, because the face is warped, so you are training on multiple different warps at once?

Problem 2 can maybe be fixed by what @kvrooman suggested, or the similar idea in that other issue, #300.

I think @iperov is more asking about problem 1?
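The angle-aware sampling suggested for problem 1 could look something like this (a sketch only; `yaws` is a hypothetical per-image yaw estimate in degrees, which would have to be derived from the alignments first):

```python
import numpy as np

def angle_balanced_indices(yaws, batch_size, n_bins=9, rng=None):
    """Draw a batch whose sampling probability is inversely proportional
    to how crowded each yaw bin is, so rare poses are seen more often."""
    rng = rng or np.random.default_rng()
    yaws = np.asarray(yaws, dtype=float)
    # bucket the yaw angles into n_bins equal-width bins
    edges = np.linspace(yaws.min(), yaws.max(), n_bins + 1)[1:-1]
    bins = np.digitize(yaws, edges)
    counts = np.bincount(bins, minlength=n_bins)
    weights = 1.0 / counts[bins]          # rare-bin images get higher weight
    weights /= weights.sum()
    return rng.choice(len(yaws), size=batch_size, p=weights)
```

With the thread's example (ten near-frontal faces and one 90-degree face), the lone profile face would be drawn in roughly half of all samples instead of one in eleven.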

@iperov
Contributor Author

iperov commented Mar 20, 2018

@deepfakesclub yes

@dfaker
Contributor

dfaker commented Mar 20, 2018

@kvrooman nice find!

@Kirin-kun

Isn't that #123 revisited?

@Kirin-kun

Would there be a way to augment a faceset with transforms not present in the original set?

For example, if we have lots of faces of Cage, but very few, if any, with a widely opened mouth, then the model has problems matching the Trump faces where he's shouting, which leads to lip-sync problems. Also, the original chin may show, because the face is bigger when shouting.

Maybe we could introduce transforms pertaining to the mouth and chin (or other face features) from a given dataset, to create several faces matching the ones in the second dataset, not only different angles. It would be a way to diversify the training set.

And if it gives some unrealistic faces, there could be a manual step to choose which ones are usable in training.

@kvrooman
Contributor

kvrooman commented Mar 20, 2018

"Would there be a way to augment a faceset with transforms not present in the original set?"

In the image data batch generator, in addition to the current transforms, randomly select a landmark alignment from the .json file of either Face_A or Face_B. Invert a random normalized face using this randomized warp to get a variety of poses (just like we do when converting; there will be some distortion, as not every frontal face is in the same "frontal" pose and the current alignment is not perfect).

However, apply a random scalar (0 < x < 1), initially capped at a low value (e.g. 0.05), to the magnitude of the warping, to keep the faces mostly frontal. Over many epochs, increase the cap towards 1. You can maintain stability but also get a larger faceset with all poses in your training set.
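The ramped warp cap described above could be scheduled like this (a minimal sketch; the function names and the linear ramp are illustrative assumptions, not an existing implementation):

```python
import numpy as np

def warp_cap(epoch, total_epochs, start_cap=0.05, end_cap=1.0):
    """Curriculum schedule: linearly raise the cap on warp magnitude
    from start_cap to end_cap over the course of training."""
    frac = min(epoch / max(total_epochs - 1, 1), 1.0)
    return start_cap + (end_cap - start_cap) * frac

def curriculum_warp(landmarks_src, landmarks_dst, epoch, total_epochs, rng=None):
    """Blend a face's landmarks towards a randomly chosen target
    alignment, scaled by a random factor no larger than the current cap."""
    rng = rng or np.random.default_rng()
    cap = warp_cap(epoch, total_epochs)
    alpha = rng.uniform(0, cap)           # random scalar 0 < alpha < cap
    return ((1 - alpha) * np.asarray(landmarks_src, dtype=float)
            + alpha * np.asarray(landmarks_dst, dtype=float))
```

Early epochs barely perturb the faces, keeping training stable; by the end, the generator can pull a face most of the way towards any pose in either alignment file.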

@iperov
Contributor Author

iperov commented Mar 22, 2018

Solved; I will code my solution.

@iperov iperov closed this as completed Mar 22, 2018