The finetuning hyperparameters of resnet50 #27
You could see https://github.com/keyu-tian/SparK/blob/main/downstream_imagenet/arg.py#L20 and this would result in 80.6 acc. Training losses of each epoch: [0.1574, 0.0063, 0.0056, 0.0054, 0.0053, 0.0053, 0.0052, 0.0052, 0.0051, 0.0051, 0.0050, 0.0050, 0.0050, 0.0050, 0.0050, 0.0049, 0.0049, 0.0049, 0.0049, 0.0050, 0.0048, 0.0048, 0.0049, 0.0048, 0.0048, 0.0048, 0.0048, 0.0048, 0.0048, 0.0048, 0.0048, 0.0047, 0.0048, 0.0048, 0.0047, 0.0047, 0.0048, 0.0048, 0.0047, 0.0048, 0.0047, 0.0047, 0.0047, 0.0047, 0.0047, 0.0048, 0.0047, 0.0048, 0.0047, 0.0047, 0.0047, 0.0046, 0.0046, 0.0046, 0.0047, 0.0046, 0.0046, 0.0045, 0.0046, 0.0046, 0.0046, 0.0046, 0.0046, 0.0046, 0.0046, 0.0046, 0.0046, 0.0046, 0.0046, 0.0046, 0.0046, 0.0045, 0.0045, 0.0045, 0.0045, 0.0045, 0.0046, 0.0045, 0.0044, 0.0045, 0.0045, 0.0045, 0.0044, 0.0045, 0.0045, 0.0045, 0.0044, 0.0045, 0.0045, 0.0045, 0.0045, 0.0044, 0.0045, 0.0044, 0.0044, 0.0044, 0.0044, 0.0044, 0.0044, 0.0045, 0.0044, 0.0044, 0.0044, 0.0044, 0.0043, 0.0044, 0.0043, 0.0044, 0.0044, 0.0043, 0.0043, 0.0043, 0.0043, 0.0043, 0.0043, 0.0043, 0.0043, 0.0043, 0.0043, 0.0043, 0.0043, 0.0043, 0.0043, 0.0043, 0.0043, 0.0043, 0.0043, 0.0042, 0.0042, 0.0042, 0.0042, 0.0042, 0.0042, 0.0042, 0.0042, 0.0041, 0.0042, 0.0042, 0.0042, 0.0041, 0.0041, 0.0041, 0.0041, 0.0041, 0.0041, 0.0042, 0.0041, 0.0041, 0.0041, 0.0041, 0.0040, 0.0041, 0.0040, 0.0040, 0.0041, 0.0041, 0.0040, 0.0040, 0.0040, 0.0040, 0.0040, 0.0039, 0.0040, 0.0040, 0.0039, 0.0039, 0.0040, 0.0040, 0.0040, 0.0039, 0.0040, 0.0039, 0.0039, 0.0040, 0.0039, 0.0038, 0.0039, 0.0039, 0.0039, 0.0039, 0.0038, 0.0039, 0.0038, 0.0038, 0.0038, 0.0039, 0.0038, 0.0038, 0.0038, 0.0038, 0.0038, 0.0038, 0.0037, 0.0038, 0.0038, 0.0038, 0.0037, 0.0037, 0.0038, 0.0037, 0.0037, 0.0037, 0.0037, 0.0037, 0.0037, 0.0037, 0.0037, 0.0037, 0.0037, 0.0036, 0.0036, 0.0036, 0.0037, 0.0037, 0.0037, 0.0036, 0.0036, 0.0036, 0.0036, 0.0036, 0.0036, 0.0036, 0.0035, 0.0035, 0.0036, 0.0035, 0.0035, 0.0035, 0.0035, 0.0035, 0.0036, 0.0036, 0.0034, 0.0035, 0.0035, 0.0035, 0.0034, 0.0035, 0.0034, 0.0034, 0.0034, 0.0034, 0.0034, 0.0034, 0.0034, 0.0034, 0.0034, 0.0034, 0.0034, 0.0034, 0.0035, 0.0034, 0.0034, 0.0033, 0.0034, 0.0034, 0.0034, 0.0034, 0.0033, 0.0033, 0.0033, 0.0033, 0.0033, 0.0033, 0.0033, 0.0033, 0.0033, 0.0033, 0.0033, 0.0033, 0.0032, 0.0032, 0.0033, 0.0033, 0.0033, 0.0033, 0.0033, 0.0033, 0.0033, 0.0033, 0.0033, 0.0033, 0.0033, 0.0033, 0.0033, 0.0033, 0.0033, 0.0033, 0.0033, 0.0032, 0.0033, 0.0033, 0.0032, 0.0033, 0.0033, 0.0033, 0.0032, 0.0032, 0.0033, 0.0033] Best validation accs (EMA model) of each epoch: [0.12, 0.72, 22.60, 45.40, 52.51, 55.70, 58.43, 60.69, 62.26, 63.42, 64.43, 65.04, 65.57, 66.13, 66.57, 66.89, 67.34, 67.53, 67.94, 68.10, 68.11, 68.37, 68.62, 68.76, 68.87, 68.95, 69.18, 69.34, 69.35, 69.50, 69.67, 69.67, 69.90, 69.98, 69.98, 70.06, 70.10, 70.24, 70.28, 70.35, 70.35, 70.35, 70.44, 70.60, 70.64, 70.79, 71.00, 71.00, 71.00, 71.01, 71.04, 71.20, 71.20, 71.23, 71.23, 71.31, 71.31, 71.43, 71.46, 71.50, 71.60, 71.71, 71.71, 71.73, 71.87, 71.89, 72.13, 72.13, 72.13, 72.13, 72.19, 72.19, 72.25, 72.28, 72.43, 72.50, 72.57, 72.60, 72.69, 72.69, 72.69, 72.69, 72.77, 72.79, 72.93, 72.98, 72.98, 73.05, 73.21, 73.30, 73.30, 73.30, 73.38, 73.51, 73.53, 73.60, 73.61, 73.67, 73.67, 73.67, 73.67, 73.70, 73.79, 73.82, 74.02, 74.02, 74.09, 74.09, 74.09, 74.23, 74.27, 74.31, 74.42, 74.43, 74.50, 74.53, 74.58, 74.64, 74.72, 74.93, 74.93, 74.93, 74.93, 75.04, 75.04, 75.07, 75.14, 75.21, 75.27, 75.33, 75.36, 75.36, 75.45, 75.49, 75.57, 
75.62, 75.71, 75.83, 75.83, 75.85, 75.96, 76.01, 76.06, 76.14, 76.16, 76.17, 76.29, 76.29, 76.29, 76.38, 76.45, 76.56, 76.60, 76.64, 76.71, 76.76, 76.80, 76.95, 77.03, 77.10, 77.10, 77.16, 77.18, 77.28, 77.28, 77.37, 77.38, 77.53, 77.59, 77.61, 77.61, 77.74, 77.75, 77.88, 77.96, 77.99, 78.01, 78.09, 78.15, 78.18, 78.24, 78.25, 78.26, 78.37, 78.44, 78.48, 78.63, 78.63, 78.69, 78.69, 78.69, 78.72, 78.73, 78.76, 78.80, 78.88, 78.91, 79.03, 79.07, 79.07, 79.07, 79.09, 79.25, 79.25, 79.25, 79.25, 79.30, 79.30, 79.33, 79.39, 79.44, 79.54, 79.54, 79.59, 79.61, 79.66, 79.66, 79.72, 79.77, 79.82, 79.83, 79.83, 79.89, 79.89, 79.96, 79.97, 80.04, 80.04, 80.06, 80.11, 80.18, 80.18, 80.19, 80.19, 80.19, 80.19, 80.19, 80.22, 80.25, 80.25, 80.25, 80.25, 80.28, 80.30, 80.30, 80.30, 80.30, 80.30, 80.37, 80.39, 80.39, 80.44, 80.44, 80.44, 80.44, 80.44, 80.44, 80.44, 80.44, 80.48, 80.49, 80.49, 80.51, 80.52, 80.53, 80.53, 80.53, 80.53, 80.53, 80.53, 80.53, 80.55, 80.56, 80.58, 80.59, 80.59, 80.59, 80.59, 80.59, 80.59, 80.59, 80.59, 80.59, 80.59, 80.59, 80.59, 80.59, 80.59, 80.59, 80.59, 80.59, 80.59, 80.59, 80.59, 80.59, 80.59, 80.59, 80.59, 80.59, 80.59] |
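For reference, a minimal way to summarize the log above (assuming the two per-epoch lists are copied into Python as `train_losses` and `val_accs`; they are truncated here for brevity):

```python
# Minimal sketch: summarize the per-epoch log pasted above.
# Assumes the two lists from the comment are copied in as `train_losses`
# and `val_accs` (one float per epoch); only the first entries are shown here.
train_losses = [0.1574, 0.0063, 0.0056]  # ... paste the full list from the log
val_accs = [0.12, 0.72, 22.60]           # ... paste the full list from the log

best_epoch = max(range(len(val_accs)), key=lambda i: val_accs[i])
print(f"epochs logged      : {len(val_accs)}")
print(f"best EMA val acc   : {val_accs[best_epoch]:.2f}% (epoch {best_epoch})")
print(f"final training loss: {train_losses[-1]:.4f}")
```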
@keyu-tian I have tried this setting except that batch size is 2048, for the reason that 8 gpus cannot accommodate 4096 images. Unfortunately, my training losses became NaN. Does the batch size affect so much? |
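One thing to keep in mind: when the batch size is halved, the learning rate usually has to be rescaled as well. A common convention is linear scaling; the function name and base values below are placeholders for illustration, not what `downstream_imagenet/arg.py` actually uses:

```python
# Hypothetical linear learning-rate scaling when the batch size changes.
# The base values are placeholders, not the ones in downstream_imagenet/arg.py.
def scaled_lr(base_lr: float, base_batch_size: int, batch_size: int) -> float:
    """Linear scaling rule: lr grows/shrinks proportionally with the batch size."""
    return base_lr * batch_size / base_batch_size

# Example: a base_lr tuned for batch size 4096 would be halved at batch size 2048.
print(scaled_lr(base_lr=0.008, base_batch_size=4096, batch_size=2048))  # -> 0.004
```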
@Vickeyhw Can you provide the command and logs?
@keyu-tian The args are:
The logs are:
I see, I will check that. Perhaps I copied the wrong code for the LAMB optimizer. BTW, have you tried ConvNeXt-Small? Does it fail too?
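For context, the place where a LAMB implementation most often breaks at large batch sizes is the trust-ratio step. A minimal sketch of that step with the usual guards against division by zero is below; this is illustrative only, not the code used in this repo:

```python
import torch

# Illustrative trust-ratio step of LAMB (not the repo's implementation).
# `adam_step` is the Adam-style update direction already computed for `param`.
def lamb_trust_ratio(param: torch.Tensor, adam_step: torch.Tensor,
                     weight_decay: float) -> torch.Tensor:
    update = adam_step + weight_decay * param
    w_norm = param.norm()
    u_norm = update.norm()
    # Guard the ratio: if either norm is zero, fall back to 1.0 instead of
    # producing inf/NaN, which would otherwise propagate through the weights.
    if w_norm == 0 or u_norm == 0:
        trust_ratio = torch.tensor(1.0, device=param.device)
    else:
        trust_ratio = w_norm / u_norm
    return trust_ratio * update
```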
@keyu-tian ConvNeXt-Small seems normal so far.
@keyu-tian Have you found the ResNet-50 fine-tuning problem? ConvNeXt-Small reaches 83.96 validation acc when fine-tuned from your released pretraining weights.
The hyperparameter settings (batch size and learning rate) in the paper seem inconsistent with those in the code. Which ones reproduce the performance (80.6 acc) reported in the paper?