How can we deal with an irregular training set? #303
Comments
First, sorry for accidentally closing this issue. Align during extract should make all faces line up the same way. If you have a face that is coming in sideways, that is the same as a general false positive and should be deleted manually. If you really want that face, you can try running extract again with -r on. Right now, it'll likely find the bad face again without aligning it properly, but it might do it properly.
What?
Maybe I explained it poorly. This is not alignment related.
You shouldn't be training on faces that are very different from all the others. It confuses the model. All faces should be aligned to each other before training.
Or, for example, there are too few faces with closed eyes, so the result always has open eyes.
@Clorr what do you think? bryanlyon cannot understand me.
The training loss jumps because the current weights of the NN, which was trained on frontal faces, generalize poorly to the single profile face. Also, setting the batch size very small can over-fit, as the NN attempts to adjust itself for each batch and ping-pongs back and forth.

I think your larger question is... Yes, but you either need lots of examples of both to work well, or you need to be able to map the two cases to each other in ways that let the NN pool information about the two cases, e.g. ears will train the same neuron/feature/filter regardless of whether it's a profile or a frontal face.

We're accomplishing the first with the image data generator, which augments the dataset with random translations, rotations, zooms/scales, and flips in order to generate more variance in the training data set. You could also introduce skew, yaw, and pitch transforms to give more made-up "profile" examples to train the autoencoder. Introducing illumination and color shifts would also likely reduce our histogram matching issues.

The second method involves a technique called Spatial Transformer Networks. Very crudely explained: put an affine transform in front of a convolution and give it its own weight for each DoF/dimension of the affine. As the NN trains, it can warp the image by itself by training those weights. Here's a recent paper.
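A minimal sketch of the first approach (random translations, rotations, zooms, and flips), written as a plain 2x3 affine matrix so it stays self-contained. The function names, default ranges, and the use of normalized coordinates are assumptions for illustration only; this is not the project's actual image data generator.

```python
import math
import random


def random_affine(rng, max_rot_deg=10.0, max_zoom=0.05, max_shift=0.05, flip_prob=0.5):
    """Return a random 2x3 affine matrix (hypothetical helper, assumed ranges).

    The matrix applies an optional horizontal flip, then rotation and
    uniform zoom about the origin, then a translation.
    """
    angle = math.radians(rng.uniform(-max_rot_deg, max_rot_deg))
    scale = 1.0 + rng.uniform(-max_zoom, max_zoom)
    tx = rng.uniform(-max_shift, max_shift)
    ty = rng.uniform(-max_shift, max_shift)
    flip = -1.0 if rng.random() < flip_prob else 1.0
    cos_a = math.cos(angle) * scale
    sin_a = math.sin(angle) * scale
    # Rotation * scale * flip, with the translation in the last column.
    return [[cos_a * flip, -sin_a, tx],
            [sin_a * flip, cos_a, ty]]


def apply_affine(matrix, points):
    """Apply a 2x3 affine matrix to a list of (x, y) points."""
    (a, b, tx), (c, d, ty) = matrix
    return [(a * x + b * y + tx, c * x + d * y + ty) for x, y in points]


if __name__ == "__main__":
    rng = random.Random(0)
    corners = [(-0.5, -0.5), (0.5, -0.5), (0.5, 0.5), (-0.5, 0.5)]
    print(apply_affine(random_affine(rng), corners))
```

The same matrix could be handed to an image-warping routine to transform the pixels; skew or illumination shifts would be added as further random parameters.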
I think there are two different problems:
Problem 1 can maybe be fixed by adjusting batch sampling to take angle distributions into account, if the number of images >> batch size. If the number of images ~ batch size, like in your example, that is interesting; I don't know. If there is exactly one 90 degree face, and it appears multiple times in one batch, would that actually improve the training? I guess it could, because the face is warped, so you are training on multiple different warps at once?

Problem 2 can maybe be fixed by what @kvrooman suggested, or a similar idea in that other issue, #300.

I think @iperov is asking more about problem 1?
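A rough sketch of angle-aware batch sampling, assuming a per-face yaw estimate in degrees is available (that input, the 15 degree bin width, and both function names are assumptions, not something the project provides as written here). Each face is weighted by the inverse of its angle bin's size, so every bin is drawn about equally often.

```python
import random
from collections import Counter


def angle_balanced_weights(yaws_deg, bin_width=15.0):
    """Weight each face by 1 / (number of faces in its yaw bin)."""
    bins = [int(yaw // bin_width) for yaw in yaws_deg]
    counts = Counter(bins)
    return [1.0 / counts[b] for b in bins]


def sample_batch(items, weights, batch_size, rng):
    """Draw a batch with replacement, so rare poses can repeat in a batch."""
    return rng.choices(items, weights=weights, k=batch_size)


if __name__ == "__main__":
    # The example from this issue: 10 near-frontal faces and 1 profile face.
    yaws = [float(i) for i in range(10)] + [90.0]
    weights = angle_balanced_weights(yaws)
    batch = sample_batch(list(range(len(yaws))), weights, 16, random.Random(0))
    print(batch)
```

Because sampling is with replacement, the single 90 degree face will appear several times per batch; whether that helps depends on the random warp giving each copy a different appearance, as discussed above.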
@deepfakesclub yes
@kvrooman nice find!
Isn't that #123 revisited?
Would there be a way to augment a faceset with transforms not present in the original set? For example, if we have lots of faces of Cage, but very few, if any, of them with a wide-open mouth, then the model has problems matching the Trump faces where he's shouting, which leads to lip sync problems. Also, the original chin may show because the face is bigger when shouting. Maybe we could introduce transforms pertaining to the mouth and chin (or other facial features) from a given dataset, to create several faces matching the ones in the second dataset. Not only angles. It would be a way to diversify the training set. And since it can produce some unrealistic faces, there would be a manual step to choose which ones are usable in training.
"Would there be a way to augment a faceset with transforms not present in the original set?"

In the image data batch generator, in addition to the current transforms, randomly select a landmark alignment from the .json file of either Face_A or Face_B. Invert a random normalized face using this randomized warp to get a variety of poses (just like we would do when converting; there will be some distortion, since not every frontal face is in the same "frontal" pose and the current alignment is not perfect). However, apply a random scalar (0 < x < 1) that is initially capped at a low value (e.g. 0.05) to the magnitude of the warp, to keep the faces mostly frontal. Over many epochs, increase the cap value to 1. That way you can maintain stability but also get a larger faceset with all poses in your training set.
Solved, will code my solution.
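For reference, the capped-warp schedule suggested earlier in the thread (a random scalar capped at 0.05, with the cap raised to 1 over many epochs) could be sketched roughly as below. This is only an illustration of that suggestion, not the solution mentioned just above. The linear ramp, the function names, and blending the 2x3 matrix linearly toward the identity are all assumptions; a linear blend of a rotation is not itself a pure rotation, so a real implementation might interpolate angle and scale separately.

```python
import random

# 2x3 identity affine: leaves the normalized face untouched.
IDENTITY = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]


def warp_cap(epoch, ramp_epochs, start_cap=0.05, end_cap=1.0):
    """Linearly raise the cap on warp magnitude from start_cap to end_cap."""
    frac = min(max(epoch / ramp_epochs, 0.0), 1.0)
    return start_cap + (end_cap - start_cap) * frac


def scaled_warp(target, epoch, ramp_epochs, rng):
    """Blend from the identity toward `target` by a random capped scalar."""
    s = rng.uniform(0.0, warp_cap(epoch, ramp_epochs))
    return [[i + s * (t - i) for i, t in zip(id_row, t_row)]
            for id_row, t_row in zip(IDENTITY, target)]


if __name__ == "__main__":
    rng = random.Random(0)
    # Hypothetical target pose: a 90 degree rotation with a small shift.
    profile = [[0.0, -1.0, 0.2], [1.0, 0.0, 0.0]]
    for epoch in (0, 50, 100):
        print(epoch, scaled_warp(profile, epoch, 100, rng))
```

Early in training the sampled warps stay close to the identity, so the faces remain mostly frontal; later epochs can reach the full target pose.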
(Sorry for my bad English.)
How can we deal with this problem? For example:
we have 1 face at 90 deg and 10 faces from 0 to 10 deg.
Set the batch size to 1, and we can see that when the trainer hits the 90 deg image, the training loss jumps to its highest value.
So can the model be trained more on the one 90 deg face, or would that be ineffective?