Asynchronous Successive Halving (ASHA)
Successive halving is a hyperparameter optimization algorithm based on multi-armed bandit methodology. ASHA combines random search with principled early stopping and does so asynchronously: a configuration can be promoted to a larger budget as soon as it ranks well enough among the configurations evaluated so far. For background, we highly recommend this blog post by the authors of the method: https://blog.ml.cmu.edu/2018/12/12/massively-parallel-hyperparameter-optimization/ .
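To make the asynchronous promotion idea concrete, here is a minimal sketch of the promotion rule as described in the ASHA paper. It is not Sherpa's implementation: the function name `next_job` and the data layout (`rungs`, `promoted`) are assumptions made for illustration. The rung-0 accuracies are copied from the run output shown later on this page.

```python
def next_job(rungs, promoted, eta):
    """Return (config_id, target_rung) for a promotion, or None to start a new config.

    rungs:    list of dicts; rungs[k] maps config_id -> score (higher is better)
    promoted: list of sets; promoted[k] holds ids already promoted out of rung k
    """
    # Look at the highest rung first; the top rung has nowhere to promote to.
    for k in reversed(range(len(rungs) - 1)):
        n_top = len(rungs[k]) // eta  # size of the top 1/eta fraction of rung k
        ranked = sorted(rungs[k], key=rungs[k].get, reverse=True)
        for config_id in ranked[:n_top]:
            if config_id not in promoted[k]:
                return config_id, k + 1
    return None  # nothing promotable: sample a fresh configuration at rung 0


# Hypothetical bookkeeping for three rungs (resources 1, 3 and 9 epochs).
rungs = [{}, {}, {}]
promoted = [set(), set(), set()]

# Three configurations have finished rung 0.
rungs[0].update({1: 0.9426, 2: 0.9213, 3: 0.9112})
first_job = next_job(rungs, promoted, eta=3)
print(first_job)  # (1, 1): configuration 1 moves to rung 1

# Mark it as promoted; with only three results nothing else qualifies.
promoted[0].add(1)
second_job = next_job(rungs, promoted, eta=3)
print(second_job)  # None: start a new random configuration
```

Under this rule no worker waits for a rung to fill up: it either promotes a configuration that already qualifies or starts a new one at the lowest rung.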
[ ]:
import sherpa
from keras.models import Sequential, load_model
from keras.layers import Dense, Flatten
from keras.datasets import mnist
from keras.optimizers import Adam
import tempfile
import os
import shutil
Dataset Preparation
[2]:
(x_train, y_train), (x_test, y_test) = mnist.load_data()
x_train, x_test = x_train/255.0, x_test/255.0
Sherpa Setup
In this example we use \(R=9\) and \(\eta=3\). That means that to obtain one finished configuration we train 9 configurations for 1 epoch each, pick the best 3 of those and train them for 3 more epochs, then pick the best one of those and train it for another 9 epochs (13 epochs in total). You can increase the max_finished_configs argument to do a larger search.
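The arithmetic above can be checked with a short sketch. The helper `rung_schedule` is a hypothetical name introduced here, not part of Sherpa, and it describes the idealized synchronous picture in which every rung is filled before the next one starts.

```python
def rung_schedule(r, R, eta):
    """List (rung, n_configs, epochs_per_config) needed for one finished configuration."""
    # Resources grow by a factor of eta from r up to R: 1, 3, 9 for r=1, R=9, eta=3.
    resources = []
    resource = r
    while resource <= R:
        resources.append(resource)
        resource *= eta
    num_rungs = len(resources)
    # The top rung holds one configuration; each rung below holds eta times more.
    return [(k, eta ** (num_rungs - 1 - k), res) for k, res in enumerate(resources)]


schedule = rung_schedule(r=1, R=9, eta=3)
for rung, n_configs, epochs in schedule:
    print(f"rung {rung}: {n_configs} configs x {epochs} epochs")

total_epochs = sum(n_configs * epochs for _, n_configs, epochs in schedule)
print("total epochs:", total_epochs)  # 27
```

Every rung costs the same 9 epochs here, so one finished configuration costs 27 epochs of training in this idealized view.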
[20]:
parameters = [sherpa.Continuous('learning_rate', [1e-4, 1e-2], 'log'),
              sherpa.Discrete('num_units', [32, 128]),
              sherpa.Choice('activation', ['relu', 'tanh', 'sigmoid'])]

# r=1 is the budget (epochs) at the lowest rung, R=9 the budget at the highest
# rung, and eta=3 the factor by which configurations are reduced between rungs.
algorithm = sherpa.algorithms.SuccessiveHalving(r=1, R=9, eta=3, s=0, max_finished_configs=1)
study = sherpa.Study(parameters=parameters,
                     algorithm=algorithm,
                     lower_is_better=False,
                     dashboard_port=8995)
INFO:sherpa.core:
-------------------------------------------------------
SHERPA Dashboard running. Access via
http://128.195.75.106:8995 if on a cluster or
http://localhost:8995 if running locally.
-------------------------------------------------------
Make a temporary directory to store model files in. Successive Halving trains hyperparameter configurations with progressively larger budgets (training epochs). Intermediate models therefore have to be saved so that a promoted configuration can resume training from its checkpoint.
[21]:
model_dir = tempfile.mkdtemp()
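Each trial records where it loads its model from (`load_from`) and where it saves it to (`save_to`), so the saved files form a chain of checkpoints. The sketch below follows that chain backwards. The helper `checkpoint_chain` is a hypothetical name introduced for illustration, and the three parameter dictionaries are abbreviated from the run output shown later on this page.

```python
def checkpoint_chain(trials, trial_id):
    """Follow 'load_from' links back to the trial that created the model."""
    chain = [trial_id]
    while trials[chain[-1]]['load_from'] != "":
        chain.append(trials[chain[-1]]['load_from'])
    return chain[::-1]  # oldest checkpoint first


# Abbreviated trial parameters, keyed by the file name each trial saves to.
trials = {
    '10': {'resource': 1, 'load_from': '', 'save_to': '10'},
    '11': {'resource': 3, 'load_from': '10', 'save_to': '11'},
    '12': {'resource': 9, 'load_from': '11', 'save_to': '12'},
}

chain = checkpoint_chain(trials, '12')
trained_epochs = sum(trials[t]['resource'] for t in chain)
print(" -> ".join(chain))  # 10 -> 11 -> 12
print(trained_epochs)      # 13
```

If any file in the chain were missing, the later trials could not resume, which is why every intermediate model is written to model_dir.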
Hyperparameter Optimization
Note: each trial only specifies the additional budget (resource) for its rung, so we manually infer the number of epochs the model has already been trained for and pass this to Keras as initial_epoch.
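The lookup table used for this inference, {1: 0, 3: 1, 9: 4}, is the running total of the budgets of all earlier rungs. A minimal sketch of how it could be derived for other settings follows; `initial_epochs` is a hypothetical helper, not a Sherpa function.

```python
from itertools import accumulate


def initial_epochs(r, R, eta):
    """Map each rung's resource to the epochs already completed before that rung."""
    # Per-rung budgets: r, r*eta, r*eta**2, ... up to R.
    resources = []
    resource = r
    while resource <= R:
        resources.append(resource)
        resource *= eta
    # A rung starts where the cumulative budget of all earlier rungs ends.
    starts = [0] + list(accumulate(resources))[:-1]
    return dict(zip(resources, starts))


table = initial_epochs(r=1, R=9, eta=3)
print(table)  # {1: 0, 3: 1, 9: 4}
```

This only works as a lookup because each resource value occurs in exactly one rung.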
[22]:
for trial in study:
    # Getting number of training epochs
    initial_epoch = {1: 0, 3: 1, 9: 4}[trial.parameters['resource']]
    epochs = trial.parameters['resource'] + initial_epoch

    print("-"*100)
    print(f"Trial:\t{trial.id}\nEpochs:\t{initial_epoch} to {epochs}\nParameters:{trial.parameters}\n")

    if trial.parameters['load_from'] == "":
        print(f"Creating new model for trial {trial.id}...\n")

        # Get hyperparameters
        lr = trial.parameters['learning_rate']
        num_units = trial.parameters['num_units']
        act = trial.parameters['activation']

        # Create model
        model = Sequential([Flatten(input_shape=(28, 28)),
                            Dense(num_units, activation=act),
                            Dense(10, activation='softmax')])
        optimizer = Adam(lr=lr)
        model.compile(loss='sparse_categorical_crossentropy',
                      optimizer=optimizer,
                      metrics=['accuracy'])
    else:
        print("Loading model from: ", os.path.join(model_dir, trial.parameters['load_from']), "...\n")

        # Loading model
        model = load_model(os.path.join(model_dir, trial.parameters['load_from']))

    # Train model one epoch at a time and report the accuracy after each epoch
    for i in range(initial_epoch, epochs):
        model.fit(x_train, y_train, initial_epoch=i, epochs=i+1)
        loss, accuracy = model.evaluate(x_test, y_test)

        print("Validation accuracy: ", accuracy)
        study.add_observation(trial=trial, iteration=i,
                              objective=accuracy,
                              context={'loss': loss})

    study.finalize(trial=trial)
    print("Saving model at: ", os.path.join(model_dir, trial.parameters['save_to']))
    model.save(os.path.join(model_dir, trial.parameters['save_to']))

    study.save(model_dir)
----------------------------------------------------------------------------------------------------
Trial: 1
Epochs: 0 to 1
Parameters:{'learning_rate': 0.0006779922111149317, 'num_units': 67, 'activation': 'tanh', 'resource': 1, 'rung': 0, 'load_from': '', 'save_to': '1'}
Creating new model for trial 1...
Epoch 1/1
60000/60000 [==============================] - 4s 72us/step - loss: 0.3451 - acc: 0.9059
10000/10000 [==============================] - 1s 53us/step
Validation accuracy: 0.9426
Saving model at: /var/folders/5v/l788ch2j7tg0q0y1rt04c08w0000gn/T/tmpa7vbw5xz/1
----------------------------------------------------------------------------------------------------
Trial: 2
Epochs: 0 to 1
Parameters:{'learning_rate': 0.0007322493943507595, 'num_units': 53, 'activation': 'sigmoid', 'resource': 1, 'rung': 0, 'load_from': '', 'save_to': '2'}
Creating new model for trial 2...
Epoch 1/1
60000/60000 [==============================] - 4s 71us/step - loss: 0.5720 - acc: 0.8661
10000/10000 [==============================] - 0s 47us/step
Validation accuracy: 0.9213
Saving model at: /var/folders/5v/l788ch2j7tg0q0y1rt04c08w0000gn/T/tmpa7vbw5xz/2
----------------------------------------------------------------------------------------------------
Trial: 3
Epochs: 0 to 1
Parameters:{'learning_rate': 0.00013292608500661002, 'num_units': 115, 'activation': 'tanh', 'resource': 1, 'rung': 0, 'load_from': '', 'save_to': '3'}
Creating new model for trial 3...
Epoch 1/1
60000/60000 [==============================] - 5s 80us/step - loss: 0.5496 - acc: 0.8571
10000/10000 [==============================] - 0s 48us/step
Validation accuracy: 0.9112
Saving model at: /var/folders/5v/l788ch2j7tg0q0y1rt04c08w0000gn/T/tmpa7vbw5xz/3
----------------------------------------------------------------------------------------------------
Trial: 4
Epochs: 1 to 4
Parameters:{'learning_rate': 0.0006779922111149317, 'num_units': 67, 'activation': 'tanh', 'save_to': '4', 'resource': 3, 'rung': 1, 'load_from': '1'}
Loading model from: /var/folders/5v/l788ch2j7tg0q0y1rt04c08w0000gn/T/tmpa7vbw5xz/1 ...
Epoch 2/2
60000/60000 [==============================] - 3s 50us/step - loss: 0.1818 - acc: 0.9473
10000/10000 [==============================] - 0s 46us/step
Validation accuracy: 0.9559
Epoch 3/3
60000/60000 [==============================] - 3s 51us/step - loss: 0.1353 - acc: 0.9617
10000/10000 [==============================] - 0s 39us/step
Validation accuracy: 0.9629
Epoch 4/4
60000/60000 [==============================] - 3s 52us/step - loss: 0.1074 - acc: 0.9687
10000/10000 [==============================] - 0s 22us/step
Validation accuracy: 0.9659
Saving model at: /var/folders/5v/l788ch2j7tg0q0y1rt04c08w0000gn/T/tmpa7vbw5xz/4
----------------------------------------------------------------------------------------------------
Trial: 5
Epochs: 0 to 1
Parameters:{'learning_rate': 0.0003139094199248622, 'num_units': 88, 'activation': 'sigmoid', 'resource': 1, 'rung': 0, 'load_from': '', 'save_to': '5'}
Creating new model for trial 5...
Epoch 1/1
60000/60000 [==============================] - 4s 68us/step - loss: 0.7136 - acc: 0.8431
10000/10000 [==============================] - 0s 49us/step
Validation accuracy: 0.9098
Saving model at: /var/folders/5v/l788ch2j7tg0q0y1rt04c08w0000gn/T/tmpa7vbw5xz/5
----------------------------------------------------------------------------------------------------
Trial: 6
Epochs: 0 to 1
Parameters:{'learning_rate': 0.0008001577665974275, 'num_units': 36, 'activation': 'sigmoid', 'resource': 1, 'rung': 0, 'load_from': '', 'save_to': '6'}
Creating new model for trial 6...
Epoch 1/1
60000/60000 [==============================] - 4s 59us/step - loss: 0.6274 - acc: 0.8588
10000/10000 [==============================] - 0s 48us/step
Validation accuracy: 0.9169
Saving model at: /var/folders/5v/l788ch2j7tg0q0y1rt04c08w0000gn/T/tmpa7vbw5xz/6
----------------------------------------------------------------------------------------------------
Trial: 7
Epochs: 0 to 1
Parameters:{'learning_rate': 0.003299640159323735, 'num_units': 63, 'activation': 'tanh', 'resource': 1, 'rung': 0, 'load_from': '', 'save_to': '7'}
Creating new model for trial 7...
Epoch 1/1
60000/60000 [==============================] - 4s 68us/step - loss: 0.2387 - acc: 0.9294
10000/10000 [==============================] - 1s 52us/step
Validation accuracy: 0.9521
Saving model at: /var/folders/5v/l788ch2j7tg0q0y1rt04c08w0000gn/T/tmpa7vbw5xz/7
----------------------------------------------------------------------------------------------------
Trial: 8
Epochs: 1 to 4
Parameters:{'learning_rate': 0.003299640159323735, 'num_units': 63, 'activation': 'tanh', 'save_to': '8', 'resource': 3, 'rung': 1, 'load_from': '7'}
Loading model from: /var/folders/5v/l788ch2j7tg0q0y1rt04c08w0000gn/T/tmpa7vbw5xz/7 ...
Epoch 2/2
60000/60000 [==============================] - 3s 52us/step - loss: 0.1209 - acc: 0.9641
10000/10000 [==============================] - 1s 52us/step
Validation accuracy: 0.961
Epoch 3/3
60000/60000 [==============================] - 3s 53us/step - loss: 0.0953 - acc: 0.9704
10000/10000 [==============================] - 0s 24us/step
Validation accuracy: 0.9667
Epoch 4/4
60000/60000 [==============================] - 3s 52us/step - loss: 0.0800 - acc: 0.9756
10000/10000 [==============================] - 0s 23us/step
Validation accuracy: 0.9679
Saving model at: /var/folders/5v/l788ch2j7tg0q0y1rt04c08w0000gn/T/tmpa7vbw5xz/8
----------------------------------------------------------------------------------------------------
Trial: 9
Epochs: 0 to 1
Parameters:{'learning_rate': 0.0025750610635902832, 'num_units': 48, 'activation': 'sigmoid', 'resource': 1, 'rung': 0, 'load_from': '', 'save_to': '9'}
Creating new model for trial 9...
Epoch 1/1
60000/60000 [==============================] - 4s 62us/step - loss: 0.3477 - acc: 0.9065
10000/10000 [==============================] - 1s 54us/step
Validation accuracy: 0.9421
Saving model at: /var/folders/5v/l788ch2j7tg0q0y1rt04c08w0000gn/T/tmpa7vbw5xz/9
----------------------------------------------------------------------------------------------------
Trial: 10
Epochs: 0 to 1
Parameters:{'learning_rate': 0.0025240507488864423, 'num_units': 124, 'activation': 'tanh', 'resource': 1, 'rung': 0, 'load_from': '', 'save_to': '10'}
Creating new model for trial 10...
Epoch 1/1
60000/60000 [==============================] - 5s 85us/step - loss: 0.2297 - acc: 0.9303
10000/10000 [==============================] - 1s 58us/step
Validation accuracy: 0.9644
Saving model at: /var/folders/5v/l788ch2j7tg0q0y1rt04c08w0000gn/T/tmpa7vbw5xz/10
----------------------------------------------------------------------------------------------------
Trial: 11
Epochs: 1 to 4
Parameters:{'learning_rate': 0.0025240507488864423, 'num_units': 124, 'activation': 'tanh', 'save_to': '11', 'resource': 3, 'rung': 1, 'load_from': '10'}
Loading model from: /var/folders/5v/l788ch2j7tg0q0y1rt04c08w0000gn/T/tmpa7vbw5xz/10 ...
Epoch 2/2
60000/60000 [==============================] - 5s 78us/step - loss: 0.1079 - acc: 0.9670
10000/10000 [==============================] - 1s 63us/step
Validation accuracy: 0.971
Epoch 3/3
60000/60000 [==============================] - 5s 77us/step - loss: 0.0761 - acc: 0.9764
10000/10000 [==============================] - 0s 28us/step
Validation accuracy: 0.9731
Epoch 4/4
60000/60000 [==============================] - 4s 73us/step - loss: 0.0599 - acc: 0.9811
10000/10000 [==============================] - 0s 30us/step
Validation accuracy: 0.9692
Saving model at: /var/folders/5v/l788ch2j7tg0q0y1rt04c08w0000gn/T/tmpa7vbw5xz/11
----------------------------------------------------------------------------------------------------
Trial: 12
Epochs: 4 to 13
Parameters:{'learning_rate': 0.0025240507488864423, 'num_units': 124, 'activation': 'tanh', 'save_to': '12', 'resource': 9, 'rung': 2, 'load_from': '11'}
Loading model from: /var/folders/5v/l788ch2j7tg0q0y1rt04c08w0000gn/T/tmpa7vbw5xz/11 ...
Epoch 5/5
60000/60000 [==============================] - 5s 77us/step - loss: 0.0466 - acc: 0.9850
10000/10000 [==============================] - 1s 60us/step
Validation accuracy: 0.973
Epoch 6/6
60000/60000 [==============================] - 4s 74us/step - loss: 0.0416 - acc: 0.9866
10000/10000 [==============================] - 0s 27us/step
Validation accuracy: 0.9726
Epoch 7/7
60000/60000 [==============================] - 5s 75us/step - loss: 0.0354 - acc: 0.9884
10000/10000 [==============================] - 0s 27us/step
Validation accuracy: 0.9744
Epoch 8/8
60000/60000 [==============================] - 4s 72us/step - loss: 0.0292 - acc: 0.9908
10000/10000 [==============================] - 0s 28us/step
Validation accuracy: 0.9739
Epoch 9/9
60000/60000 [==============================] - 4s 73us/step - loss: 0.0286 - acc: 0.9905
10000/10000 [==============================] - 0s 27us/step
Validation accuracy: 0.974
Epoch 10/10
60000/60000 [==============================] - 4s 72us/step - loss: 0.0245 - acc: 0.9919
10000/10000 [==============================] - 0s 27us/step
Validation accuracy: 0.9725
Epoch 11/11
60000/60000 [==============================] - 4s 72us/step - loss: 0.0233 - acc: 0.9916
10000/10000 [==============================] - 0s 31us/step
Validation accuracy: 0.9714
Epoch 12/12
60000/60000 [==============================] - 4s 72us/step - loss: 0.0203 - acc: 0.9935
10000/10000 [==============================] - 0s 28us/step
Validation accuracy: 0.972
Epoch 13/13
60000/60000 [==============================] - 4s 74us/step - loss: 0.0195 - acc: 0.9934
10000/10000 [==============================] - 0s 27us/step
Validation accuracy: 0.9727
Saving model at: /var/folders/5v/l788ch2j7tg0q0y1rt04c08w0000gn/T/tmpa7vbw5xz/12
The best found hyperparameter configuration is:
[23]:
study.get_best_result()
[23]:
{'Iteration': 6,
'Objective': 0.9744,
'Trial-ID': 12,
'activation': 'tanh',
'learning_rate': 0.0025240507488864423,
'load_from': '11',
'loss': 0.08811961327217287,
'num_units': 124,
'resource': 9,
'rung': 2,
'save_to': '12'}
This model is stored at:
[24]:
print(os.path.join(model_dir, study.get_best_result()['save_to']))
/var/folders/5v/l788ch2j7tg0q0y1rt04c08w0000gn/T/tmpa7vbw5xz/12
To remove the model directory:
[25]:
# Remove model_dir
shutil.rmtree(model_dir)