sherpa package

Submodules

sherpa.algorithms module

SHERPA is a Python library for hyperparameter tuning of machine learning models. Copyright (C) 2018 Lars Hertel, Peter Sadowski, and Julian Collado.

This file is part of SHERPA.

SHERPA is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version.

SHERPA is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.

You should have received a copy of the GNU General Public License along with SHERPA. If not, see <http://www.gnu.org/licenses/>.

sherpa.core module

class sherpa.core.AlgorithmState[source]

Bases: object

Used internally to signal to the sherpa._Runner class when to wait or when the algorithm is done.

DONE = 'DONE'
WAIT = 'WAIT'
class sherpa.core.Choice(name, range)[source]

Bases: sherpa.core.Parameter

Choice parameter class.

sample()[source]
class sherpa.core.Continuous(name, range, scale='linear')[source]

Bases: sherpa.core.Parameter

Continuous parameter class.

sample()[source]
class sherpa.core.Discrete(name, range, scale='linear')[source]

Bases: sherpa.core.Parameter

Discrete parameter class.

sample()[source]
class sherpa.core.Ordinal(name, range)[source]

Bases: sherpa.core.Parameter

Ordinal parameter class. Categorical, ordered variable.

sample()[source]
class sherpa.core.Parameter(name, range)[source]

Bases: object

Defines a hyperparameter with a name, type and associated range.

Parameters:
  • name (str) – the parameter name.
  • range (list) – either [low, high] or [value1, value2, value3].
  • scale (str) – linear or log, defines sampling from linear or log-scale. Not defined for all parameter types.
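To illustrate what the scale option means for sampling, here is a minimal standalone sketch. It is not SHERPA's implementation; the helper name sample_continuous and the range used are assumptions made for illustration only.

```python
import math
import random


def sample_continuous(low, high, scale="linear", rng=random):
    """Draw one value from [low, high] on a linear or log scale (illustrative helper)."""
    if scale == "log":
        # Sample uniformly in log-space, then map back to the original scale.
        return math.exp(rng.uniform(math.log(low), math.log(high)))
    return rng.uniform(low, high)


rng = random.Random(0)
linear_draws = [sample_continuous(1e-4, 1e-1, "linear", rng) for _ in range(1000)]
log_draws = [sample_continuous(1e-4, 1e-1, "log", rng) for _ in range(1000)]

# Share of draws below 1e-3: small on a linear scale, much larger on a log scale,
# because log-scale sampling spreads draws evenly across orders of magnitude.
share_small_linear = sum(v < 1e-3 for v in linear_draws) / len(linear_draws)
share_small_log = sum(v < 1e-3 for v in log_draws) / len(log_draws)
```

A log scale is usually the natural choice for ranges spanning several orders of magnitude, such as learning rates.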
static from_dict(config)[source]

Returns a parameter object according to the given dictionary config.

Parameters:config (dict) – parameter config.

Example:

{'name': '<name>',
 'type': '<continuous/discrete/choice>',
 'range': [<value1>, <value2>, ... ],
 'scale': <'log' to sample continuous/discrete from log-scale>}
Returns:the parameter range object.
Return type:sherpa.core.Parameter
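A concrete dictionary in the format above might look like the following sketch. The checker function check_parameter_config is hypothetical and only mirrors the keys shown in the example; it does not reproduce any validation that from_dict itself may perform.

```python
REQUIRED_KEYS = {"name", "type", "range"}
# Types listed in the example above; treated here as the accepted set (assumption).
KNOWN_TYPES = {"continuous", "discrete", "choice"}
KNOWN_SCALES = {"linear", "log"}


def check_parameter_config(config):
    """Hypothetical sanity check for a parameter config dictionary."""
    missing = REQUIRED_KEYS - config.keys()
    if missing:
        raise ValueError("missing keys: %s" % sorted(missing))
    if config["type"] not in KNOWN_TYPES:
        raise ValueError("unknown type: %r" % config["type"])
    if config.get("scale", "linear") not in KNOWN_SCALES:
        raise ValueError("unknown scale: %r" % config["scale"])
    return config


# A continuous parameter sampled on a log scale between 1e-4 and 1e-1.
learning_rate = check_parameter_config({
    "name": "learning_rate",
    "type": "continuous",
    "range": [1e-4, 1e-1],
    "scale": "log",
})

# A choice parameter; 'scale' is omitted because it is not defined for all types.
activation = check_parameter_config({
    "name": "activation",
    "type": "choice",
    "range": ["relu", "tanh", "elu"],
})
```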
static grid(parameter_grid)[source]

Creates a list of parameters given a parameter grid.

Parameters:parameter_grid (dict) – dictionary mapping hyperparameter names to lists of possible values.

Example:

{'parameter_a': [aValue1, aValue2, ...],
 'parameter_b': [bValue1, bValue2, ...],
 ...}
Returns:list of parameter ranges for SHERPA.
Return type:list[sherpa.core.Parameter]
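The sketch below shows one way to think about a parameter grid: each entry becomes one parameter whose range is the list of values, and the grid as a whole spans the Cartesian product of those lists. Treating every entry as a choice-type config is an assumption made for illustration, and grid_to_configs is a hypothetical helper, not part of SHERPA.

```python
import itertools


def grid_to_configs(parameter_grid):
    """Turn a grid dict into one config dict per hyperparameter (illustrative)."""
    return [
        {"name": name, "type": "choice", "range": list(values)}
        for name, values in parameter_grid.items()
    ]


grid = {
    "batch_size": [32, 64],
    "activation": ["relu", "tanh", "elu"],
}

configs = grid_to_configs(grid)

# Every combination of values the grid describes: 2 * 3 = 6 settings.
combinations = [
    dict(zip(grid.keys(), values))
    for values in itertools.product(*grid.values())
]
```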
class sherpa.core.Study(parameters, algorithm, lower_is_better, stopping_rule=None, dashboard_port=None, disable_dashboard=False, output_dir=None)[source]

Bases: object

The core of an optimization.

Includes functionality to get new suggested trials and add observations for those. Used internally but can also be used directly by the user.

Parameters:
  • parameters (list[sherpa.core.Parameter]) – a list of parameter ranges.
  • algorithm (sherpa.algorithms.Algorithm) – the optimization algorithm.
  • lower_is_better (bool) – whether to minimize or maximize the objective.
  • stopping_rule (sherpa.algorithms.StoppingRule) – algorithm to stop badly performing trials.
  • dashboard_port (int) – the port for the dashboard web server; if None, the first free port in the range 8880 to 9999 is used.
  • disable_dashboard (bool) – option to not run the dashboard.
  • output_dir (str) – directory path for CSV results.
  • random_seed (int) – seed to use for NumPy random number generators throughout.
add_observation(trial, objective, iteration=1, context={})[source]

Add a single observation of the objective value for a given trial.

Parameters:
  • trial (sherpa.core.Trial) – trial for which an observation is to be added.
  • iteration (int) – iteration number e.g. epoch.
  • objective (float) – objective value.
  • context (dict) – other metrics or values to record.
add_trial(trial)[source]

Adds a trial to the queue for the next suggestion.

Trials added via this method bypass the suggestions made by the algorithm and are returned by the get_suggestion method on a first-in, first-out basis.

Parameters:trial (sherpa.core.Trial) – the trial to be enqueued.
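The first-in, first-out behaviour described for add_trial can be pictured with the following minimal sketch. The class SuggestionSource is a hypothetical stand-in, not the Study class, and plain dictionaries are used in place of Trial objects.

```python
from collections import deque


class SuggestionSource:
    """Hypothetical stand-in: queued trials are served before algorithm suggestions."""

    def __init__(self, algorithm_suggest):
        self._queue = deque()
        self._algorithm_suggest = algorithm_suggest

    def add_trial(self, parameters):
        # Enqueue a user-supplied trial at the back of the queue.
        self._queue.append(parameters)

    def get_suggestion(self):
        # Serve queued trials first, oldest first; otherwise ask the algorithm.
        if self._queue:
            return self._queue.popleft()
        return self._algorithm_suggest()


source = SuggestionSource(algorithm_suggest=lambda: {"lr": 0.01})
source.add_trial({"lr": 0.1})
source.add_trial({"lr": 0.5})

first = source.get_suggestion()   # the first queued trial
second = source.get_suggestion()  # the second queued trial
third = source.get_suggestion()   # queue is empty, so the algorithm is asked
```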
finalize(trial, status='COMPLETED')[source]

Once a trial will add no more observations, it must be finalized with this method.

Parameters:
  • trial (sherpa.core.Trial) – trial that is completed.
  • status (str) – one of ‘COMPLETED’, ‘FAILED’, ‘STOPPED’.
get_best_result()[source]

Retrieve the best result so far.

Returns:row of the best result.
Return type:pandas.DataFrame
get_suggestion()[source]

Obtain a new suggested trial.

This function wraps the algorithm that was passed to the study.

Returns:a parameter suggestion.
Return type:dict
keras_callback(trial, objective_name, context_names=[])[source]

Keras callback that adds observations to the study.

Parameters:
  • trial (sherpa.core.Trial) – trial to send metrics for.
  • objective_name (str) – the name of the objective e.g. loss, val_loss, or any of the submitted metrics.
  • context_names (list[str]) – names of all other metrics to be monitored.
static load_dashboard(path)[source]

Loads a study from an output dir without the algorithm.

Parameters:path (str) – the path to the output dir.
Returns:the study running the dashboard. Note that this study currently cannot be used to continue the optimization.
Return type:sherpa.core.Study
next()[source]
save(output_dir=None)[source]

Stores results to CSV and attributes to config file.

Parameters:output_dir (str) – directory to store CSV to, only needed if Study output_dir is not defined.
should_trial_stop(trial)[source]

Determines whether given trial should stop.

This function wraps the stopping rule provided to the study.

Parameters:trial (sherpa.core.Trial) – trial to be evaluated.
Returns:decision.
Return type:bool
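The Study methods above are typically combined in a loop: obtain a suggestion, report one observation per iteration, check the stopping rule, and finalize. The sketch below shows that call order against a hypothetical stand-in class so that it runs without SHERPA installed. StandInStudy, run_trials, the Trial-like namedtuple, and the toy objective are all assumptions; only the method names and arguments are taken from this reference.

```python
import random
from collections import namedtuple

# Trial-like object with the two fields documented for sherpa.core.Trial.
Trial = namedtuple("Trial", ["id", "parameters"])


class StandInStudy:
    """Hypothetical stand-in exposing the four Study methods used below."""

    def __init__(self, seed=0):
        self.observations = []  # (trial id, iteration, objective)
        self.status = {}
        self._rng = random.Random(seed)
        self._next_id = 1

    def get_suggestion(self):
        trial = Trial(self._next_id, {"lr": 10 ** self._rng.uniform(-4, -1)})
        self._next_id += 1
        return trial

    def add_observation(self, trial, objective, iteration=1, context=None):
        self.observations.append((trial.id, iteration, objective))

    def should_trial_stop(self, trial):
        return False  # no stopping rule in this stand-in

    def finalize(self, trial, status="COMPLETED"):
        self.status[trial.id] = status


def run_trials(study, train_one_epoch, num_trials=3, num_epochs=2):
    """Call order: suggest, observe per iteration, check stopping, finalize."""
    for _ in range(num_trials):
        trial = study.get_suggestion()
        status = "COMPLETED"
        for epoch in range(1, num_epochs + 1):
            loss = train_one_epoch(trial.parameters, epoch)
            study.add_observation(trial, objective=loss, iteration=epoch)
            if study.should_trial_stop(trial):
                status = "STOPPED"
                break
        study.finalize(trial, status=status)


study = StandInStudy()
# Toy objective standing in for a real training step.
run_trials(study, train_one_epoch=lambda params, epoch: params["lr"] / epoch)
```

Finalizing inside the loop, whatever the outcome, keeps every trial in a terminal status once it stops reporting observations.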
class sherpa.core.Trial(id, parameters)[source]

Bases: object

Represents one parameter configuration, referred to here as a trial.

Parameters:
  • id (int) – the Trial ID.
  • parameters (dict) – parameter-name, parameter-value pairs.
class sherpa.core.TrialStatus[source]

Bases: object

COMPLETED = 'COMPLETED'
FAILED = 'FAILED'
INTERMEDIATE = 'INTERMEDIATE'
STOPPED = 'STOPPED'
sherpa.core.optimize(parameters, algorithm, lower_is_better, scheduler, command=None, filename=None, output_dir='./output_20200731-053036', max_concurrent=1, db_port=None, stopping_rule=None, dashboard_port=None, resubmit_failed_trials=False, verbose=1, load=False, mongodb_args={}, disable_dashboard=False)[source]

Runs a Study with a scheduler and automatically runs a database in the background.

Parameters:
  • algorithm (sherpa.algorithms.Algorithm) – takes results table and returns parameter set.
  • parameters (list[sherpa.core.Parameter]) – parameters being optimized.
  • lower_is_better (bool) – whether lower objective values are better.
  • command (str) – the command to run for the trial script.
  • filename (str) – the filename of the script to run. Will be run as “python <filename>”.
  • output_dir (str) – where scheduler and database files will be stored.
  • scheduler (sherpa.schedulers.Scheduler) – a scheduler.
  • max_concurrent (int) – the number of trials that will be evaluated in parallel.
  • db_port (int) – port to run the database on.
  • stopping_rule (sherpa.algorithms.StoppingRule) – rule for stopping trials prematurely.
  • dashboard_port (int) – port to run the dashboard web-server on.
  • resubmit_failed_trials (bool) – whether to resubmit a trial if it failed.
  • verbose (int, default=1) – whether to print submit messages (0=no, 1=yes).
  • load (bool) – option to load study, currently not fully implemented.
  • mongodb_args (dict[str, any]) – arguments to MongoDB beyond port, dir, and log-path. Keys are the argument name without the leading “--”.
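Because mongodb_args keys are given without their leading dashes, one plausible reading is that each key/value pair becomes a command-line flag followed by its value. The sketch below illustrates that reading only; build_flags is a hypothetical helper, and how SHERPA actually assembles the database command is not shown in this reference.

```python
def build_flags(mongodb_args):
    """Hypothetical helper: turn {'name': value} into ['--name', 'value', ...]."""
    flags = []
    for key, value in mongodb_args.items():
        # The key is written without dashes, so they are added back here.
        flags.extend(["--" + key, str(value)])
    return flags


# 'bind_ip' is used purely as an example argument name.
extra_flags = build_flags({"bind_ip": "127.0.0.1"})
```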
sherpa.core.run_dashboard(path)[source]

Run the dashboard from a previously run optimization.

Parameters:path (str) – the output dir of the previous optimization.

sherpa.database module

class sherpa.database.Client(host=None, port=None, test_mode=False, **mongo_client_args)[source]

Bases: object

Registers a session with a Sherpa Study via the port of the database.

This class is used from trial scripts only.

Variables:
  • host (str) – the host that runs the database. Passed host, host set via environment variable or ‘localhost’ in that order.
  • port (int) – port that database is running on. Passed port, port set via environment variable or 27010 in that order.
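The lookup order described for host and port (explicit argument, then environment variable, then default) can be sketched as follows. The environment variable names are placeholders, since this reference does not name them, and the helper functions are hypothetical.

```python
import os

# Placeholder names; the actual environment variables are not named in this reference.
HOST_ENV_VAR = "EXAMPLE_DB_HOST"
PORT_ENV_VAR = "EXAMPLE_DB_PORT"


def first_set(*candidates):
    """Return the first candidate that is not None."""
    for candidate in candidates:
        if candidate is not None:
            return candidate
    return None


def resolve_host(host=None, environ=os.environ):
    # Passed host, then environment variable, then 'localhost'.
    return first_set(host, environ.get(HOST_ENV_VAR), "localhost")


def resolve_port(port=None, environ=os.environ):
    # Passed port, then environment variable, then 27010.
    return int(first_set(port, environ.get(PORT_ENV_VAR), 27010))


from_argument = resolve_port(port=27001, environ={PORT_ENV_VAR: "27005"})
from_environment = resolve_port(environ={PORT_ENV_VAR: "27005"})
from_default = resolve_port(environ={})
```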
get_trial()[source]

Returns the next trial from a Sherpa Study.

Returns:The trial to run.
Return type:sherpa.core.Trial
keras_send_metrics(trial, objective_name, context_names=[])[source]

Keras callback that sends metrics to SHERPA.

Parameters:
  • trial (sherpa.core.Trial) – trial to send metrics for.
  • objective_name (str) – the name of the objective e.g. loss, val_loss, or any of the submitted metrics.
  • context_names (list[str]) – names of all other metrics to be monitored.
send_metrics(trial, iteration, objective, context={})[source]

Sends metrics for a trial to database.

Parameters:
  • trial (sherpa.core.Trial) – trial to send metrics for.
  • iteration (int) – the iteration e.g. epoch the metrics are for.
  • objective (float) – the objective value.
  • context (dict) – other metric-values.

sherpa.schedulers module

class sherpa.schedulers.LocalScheduler(submit_options='', output_dir='', resources=None)[source]

Bases: sherpa.schedulers.Scheduler

Runs jobs locally as a subprocess.

Parameters:
  • submit_options (str) – options placed before the command.
  • resources (list[str]) – list of resources that will be passed as SHERPA_RESOURCE environment variable. If no resource is available ‘’ will be passed.
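As a rough illustration of how a resource can reach a job through the SHERPA_RESOURCE environment variable, the sketch below launches a child Python process with that variable set and reads it back. The helper run_with_resource is hypothetical and is not LocalScheduler's code; how a resource is chosen from the list is not covered here.

```python
import os
import subprocess
import sys


def run_with_resource(command, resource=""):
    """Run a command as a subprocess with SHERPA_RESOURCE set (illustrative)."""
    # Copy the current environment and add the resource; '' means none available.
    env = dict(os.environ, SHERPA_RESOURCE=str(resource))
    result = subprocess.run(
        command, env=env, capture_output=True, text=True, check=True
    )
    return result.stdout.strip()


# A child process standing in for a trial script that reads its resource.
child = [sys.executable, "-c", "import os; print(os.environ['SHERPA_RESOURCE'])"]

seen_by_child = run_with_resource(child, resource="0")
seen_without_resource = run_with_resource(child)
```

A trial script can read the variable in the same way and use it to decide, for example, which device to run on.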
get_status(job_id)[source]

Obtains the current status of the job.

Parameters:job_id (str) – identifier returned when submitting the job.
Returns:the job-status.
Return type:sherpa.schedulers._JobStatus
kill_job(job_id)[source]

Kills a given job.

Parameters:job_id (str) – identifier returned when submitting the job.
submit_job(command, env={}, job_name='')[source]

Submits a job to the scheduler.

Parameters:
  • command (list[str]) – components of the command to be run by the scheduler, e.g. ["python", "train.py"].
  • env (dict) – environment variables to pass to the job.
  • job_name (str) – this specifies a name for the job and its output directory.
Returns:a job ID, used for getting the status or killing the job.
Return type:str

class sherpa.schedulers.SGEScheduler(submit_options, environment, output_dir='')[source]

Bases: sherpa.schedulers.Scheduler

Submits jobs to SGE, can check on their status, and kill jobs.

Uses drmaa Python library. Due to the way SGE works it cannot distinguish between a failed and a completed job.

Parameters:
  • submit_options (str) – command line options such as queue -q, or -P for project, all written in one string.
  • environment (str) – the path to a file that contains environment variables; will be sourced before job is run.
  • output_dir (str) – path to the directory to which stdout and stderr will be written. If not specified, the directory defined for the study is used.
get_status(job_id)[source]
Parameters:job_id (str) – SGE process ID.
Returns:The job status.
Return type:sherpa.schedulers._JobStatus
kill_job(job_id)[source]

Kills a job submitted to SGE.

Parameters:job_id (str) – the SGE process ID of the job.
submit_job(command, env={}, job_name='')[source]

Submits a job to the scheduler.

Parameters:
  • command (list[str]) – components of the command to be run by the scheduler, e.g. ["python", "train.py"].
  • env (dict) – environment variables to pass to the job.
  • job_name (str) – this specifies a name for the job and its output directory.
Returns:a job ID, used for getting the status or killing the job.
Return type:str

class sherpa.schedulers.SLURMScheduler(submit_options, environment, output_dir='')[source]

Bases: sherpa.schedulers.Scheduler

Submits jobs to SLURM, can check on their status, and kill jobs.

Uses drmaa Python library.

Parameters:
  • submit_options (str) – command line options such as queue -q, all written in one string.
  • environment (str) – the path to a file that contains environment variables; will be sourced before job is run.
  • output_dir (str) – path to the directory to which stdout and stderr will be written. If not specified, the directory defined for the study is used.
get_status(job_id)[source]
Parameters:job_id (str) – SLURM process ID.
Returns:The job status.
Return type:sherpa.schedulers._JobStatus
kill_job(job_id)[source]

Kills a job submitted to SLURM.

Parameters:job_id (str) – the SLURM process ID of the job.
submit_job(command, env={}, job_name='')[source]

Submits a job to the scheduler.

Parameters:
  • command (list[str]) – components of the command to be run by the scheduler, e.g. ["python", "train.py"].
  • env (dict) – environment variables to pass to the job.
  • job_name (str) – this specifies a name for the job and its output directory.
Returns:a job ID, used for getting the status or killing the job.
Return type:str

class sherpa.schedulers.Scheduler[source]

Bases: object

The job scheduler gives an API to submit jobs, retrieve statuses of specific jobs, and kill a job.

get_status(job_id)[source]

Obtains the current status of the job.

Parameters:job_id (str) – identifier returned when submitting the job.
Returns:the job-status.
Return type:sherpa.schedulers._JobStatus
kill_job(job_id)[source]

Kills a given job.

Parameters:job_id (str) – identifier returned when submitting the job.
submit_job(command, env={}, job_name='')[source]

Submits a job to the scheduler.

Parameters:
  • command (list[str]) – components of the command to be run by the scheduler, e.g. ["python", "train.py"].
  • env (dict) – environment variables to pass to the job.
  • job_name (str) – this specifies a name for the job and its output directory.
Returns:a job ID, used for getting the status or killing the job.
Return type:str
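The three-method scheduler interface (submit, query status, kill) can be pictured with the following standalone sketch. InMemoryScheduler is a hypothetical class that only records jobs and executes nothing. Its status strings are placeholders and do not correspond to the values of sherpa.schedulers._JobStatus, which this reference does not list.

```python
import itertools


class InMemoryScheduler:
    """Hypothetical scheduler that records jobs in a dict; nothing is executed."""

    def __init__(self):
        self._ids = itertools.count(1)
        self.jobs = {}

    def submit_job(self, command, env=None, job_name=""):
        # Return a string job ID that the other two methods accept.
        job_id = str(next(self._ids))
        self.jobs[job_id] = {
            "command": list(command),
            "env": dict(env or {}),
            "job_name": job_name,
            "status": "running",  # placeholder status string
        }
        return job_id

    def get_status(self, job_id):
        return self.jobs[job_id]["status"]

    def kill_job(self, job_id):
        self.jobs[job_id]["status"] = "killed"  # placeholder status string


scheduler = InMemoryScheduler()
job_id = scheduler.submit_job(["python", "train.py"], env={"SEED": "1"}, job_name="trial-1")
status_before = scheduler.get_status(job_id)
scheduler.kill_job(job_id)
status_after = scheduler.get_status(job_id)
```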

Module contents
