Describe the bug
After running a bunch of tunings, I eventually get the following error:
Even if I restart the application to resume from my last checkpoint, it leads to the same error.
To Reproduce
My relevant code is as follows:
import tensorflow as tf
import keras_tuner as kt
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import LSTM, Dense
from tensorflow.keras.optimizers import Adam
from tensorflow.keras.callbacks import EarlyStopping

# Define the model-building function for KerasTuner
def build_model(hp):
    model = Sequential()
    # Hyperparameter tuning for the number of LSTM layers and units
    for i in range(hp.Int('num_layers', 1, 4)):
        model.add(LSTM(units=hp.Int(f'lstm_units_{i}', min_value=32, max_value=128, step=32),
                       activation='relu',
                       return_sequences=True if i < hp.Int('num_layers', 1, 4) - 1 else False))
    # Dense layer
    model.add(Dense(units=hp.Int('dense_units', min_value=32, max_value=128, step=32), activation='relu'))
    # Output size (e.g., prediction horizon); you can set this dynamically if needed
    output_size = 4
    model.add(Dense(output_size))
    # Compile the model
    model.compile(optimizer=Adam(learning_rate=hp.Float('learning_rate', min_value=1e-5, max_value=1e-2, sampling='LOG')),
                  loss='mean_squared_error', metrics=['mae'])
    return model

# Initialize the tuner
tuner = kt.BayesianOptimization(build_model,
                                objective='val_loss',
                                max_trials=1000,
                                directory='kerastuner_results',
                                project_name='prediction',
                                )

# Define early stopping callback
early_stopping = EarlyStopping(monitor='val_loss', patience=5, restore_best_weights=True)

# Create tf.data.Dataset for training and validation
# (X_train, y_train, X_test, y_test are prepared earlier and not shown)
batch_size = 128  # Adjust as needed; larger batches may lead to less stable searches (default is 32)
train_dataset = tf.data.Dataset.from_tensor_slices((X_train, y_train)).batch(batch_size).prefetch(tf.data.experimental.AUTOTUNE)
val_dataset = tf.data.Dataset.from_tensor_slices((X_test, y_test)).batch(batch_size).prefetch(tf.data.experimental.AUTOTUNE)

# Perform hyperparameter search
tuner.search(train_dataset, epochs=100, validation_data=val_dataset, callbacks=[early_stopping])
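For completeness, resuming after a restart is just re-creating the tuner against the same directory and project_name; a minimal sketch of that path (overwrite is left at its default of False, which is what makes KerasTuner reload the saved trials):

# Minimal sketch of resuming: same directory/project_name as before.
# With overwrite=False (the default), KerasTuner reloads the existing
# trials from 'kerastuner_results/prediction' and continues the search.
tuner = kt.BayesianOptimization(build_model,
                                objective='val_loss',
                                max_trials=1000,
                                directory='kerastuner_results',
                                project_name='prediction',
                                overwrite=False)
tuner.search(train_dataset, epochs=100, validation_data=val_dataset, callbacks=[early_stopping])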
Expected behavior
The tuner continues to tune over time without crashing.
Additional context
My last successful trial's trial.json (67):
And a trial from the middle of the training, just in case:
All of my installed packages and versions:
NVCC Version:
Nvidia driver 565.77 on a 3060
Arch Linux on 6.6.64-1-lts x86_64 GNU/Linux
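The framework versions in that list can be confirmed with a quick snippet like the following (illustrative only, not part of the tuning script):

# Print the versions and GPU visibility relevant to this report
import tensorflow as tf
import keras_tuner as kt
print("tensorflow:", tf.__version__)
print("keras_tuner:", kt.__version__)
print("visible GPUs:", tf.config.list_physical_devices("GPU"))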
Would you like to help us fix it?
Not currently