sequd.pybatdoe

sequd.pybatdoe.batch_grid

class pybatdoe.batch_grid.GridSearch(para_space, max_runs=100, estimator=None, cv=None, scoring=None, refit=True, n_jobs=None, random_state=0, verbose=False)[source]

Implementation of grid search.

Parameters
  • para_space (dict or list of dictionaries) –

    It has three types:

    Continuous:

    Specify Type as continuous, and include the key Range (a two-element list giving the lower and upper bounds) and the key Wrapper, a callable function for transforming the sampled values.

    Integer:

    Specify Type as integer, and include the key Mapping (a list of all the sorted integer values).

    Categorical:

    Specify Type as categorical, and include the key Mapping (a list of all the possible categories). A combined example covering all three types is sketched after this parameter list.

  • max_runs (int, optional, default=100) – The maximum number of trials to be evaluated. When this value is reached, the algorithm stops.

  • estimator (estimator object) – This is assumed to implement the scikit-learn estimator interface.

  • cv (cross-validation method, an sklearn object) – e.g., StratifiedKFold or KFold.

  • scoring (string, callable, list/tuple, dict or None, optional, default=None) – An sklearn-style scoring function. If None, the estimator’s default scorer (if available) is used. See the sklearn documentation for details.

  • refit (boolean or string, optional, default=True) – Whether to refit the estimator on the whole dataset using the best found parameters.

  • n_jobs (int or None, optional, default=None) – Number of jobs to run in parallel. If -1, all CPUs are used. If 1 is given, no parallel computing code is used at all, which is useful for debugging. See joblib for details.

  • random_state (int, optional, default=0) – The random seed for optimization.

  • verbose (boolean, optional, default=False) – Whether to print the search history.
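
For reference, a para_space that mixes all three types might look like the sketch below. The hyperparameter names used here (learning_rate, max_depth, booster) are purely illustrative and only demonstrate the required keys:

>>> import numpy as np
>>> para_space = {
...     # continuous: Range is on the transformed scale; Wrapper maps it back
...     'learning_rate': {'Type': 'continuous', 'Range': [-5, 0], 'Wrapper': np.exp2},
...     # integer: Mapping lists all admissible (sorted) integer values
...     'max_depth': {'Type': 'integer', 'Mapping': [2, 3, 4, 5, 6]},
...     # categorical: Mapping lists all possible categories
...     'booster': {'Type': 'categorical', 'Mapping': ['gbtree', 'gblinear', 'dart']}}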

>>> import numpy as np
>>> from sklearn import svm
>>> from sklearn import datasets
>>> from sequd import GridSearch
>>> from sklearn.model_selection import KFold
>>> iris = datasets.load_iris()
>>> ParaSpace = {'C': {'Type': 'continuous', 'Range': [-6, 16], 'Wrapper': np.exp2},
...              'gamma': {'Type': 'continuous', 'Range': [-16, 6], 'Wrapper': np.exp2}}
>>> estimator = svm.SVC()
>>> cv = KFold(n_splits=5, random_state=0, shuffle=True)
>>> clf = GridSearch(ParaSpace, max_runs=100, estimator=estimator, cv=cv,
...                  scoring=None, n_jobs=None, refit=False, random_state=0, verbose=False)
>>> clf.fit(iris.data, iris.target)
Variables
  • best_score_ (float) – The best average cv score among the evaluated trials.

  • best_params_ (dict) – Parameters that achieve best_score_.

  • best_estimator_ (sklearn estimator) – The estimator refitted based on best_params_. Not available if estimator=None or refit=False.

  • search_time_consumed_ (float) – Seconds used for the whole search procedure.

  • refit_time_ (float) – Seconds used for refitting the best model on the whole dataset. Not available if estimator=None or refit=False.
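
After fitting, the search results can be inspected through these documented attributes. A brief sketch, continuing the example above (with refit=False, best_estimator_ and refit_time_ are not populated):

>>> clf.best_score_             # best mean CV score among the evaluated trials
>>> clf.best_params_            # corresponding values of C and gamma
>>> clf.search_time_consumed_   # total search time in seconds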

Grid search is not recommended for high-dimensional hyperparameter tuning: because the grid is limited by the max_runs specified by the user, the grid points may be badly distributed, as the rough calculation below illustrates.
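
A back-of-the-envelope illustration of the budget arithmetic (not necessarily the library's exact point-allocation scheme): if max_runs trials are spread over a full factorial grid in d dimensions, each dimension receives only about max_runs**(1/d) levels.

>>> max_runs = 100
>>> for d in (1, 2, 3, 5):
...     print(d, int(max_runs ** (1 / d)))  # approximate grid levels per dimension
1 100
2 10
3 4
5 2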

sequd.pybatdoe.batch_rand

class pybatdoe.batch_rand.RandSearch(para_space, max_runs=100, estimator=None, cv=None, scoring=None, refit=True, n_jobs=None, random_state=0, verbose=False)[source]

Implementation of Random Search.

Parameters
  • para_space (dict or list of dictionaries) –

    It has three types:

    Continuous:

    Specify Type as continuous, and include the key Range (a two-element list giving the lower and upper bounds) and the key Wrapper, a callable function for transforming the sampled values.

    Integer:

    Specify Type as integer, and include the key Mapping (a list of all the sorted integer values).

    Categorical:

    Specify Type as categorical, and include the key Mapping (a list of all the possible categories).

  • max_runs (int, optional, default=100) – The maximum number of trials to be evaluated. When this value is reached, the algorithm stops.

  • estimator (estimator object) – This is assumed to implement the scikit-learn estimator interface.

  • cv (cross-validation method, an sklearn object) – e.g., StratifiedKFold or KFold.

  • scoring (string, callable, list/tuple, dict or None, optional, default=None) – An sklearn-style scoring function. If None, the estimator’s default scorer (if available) is used. See the sklearn documentation for details.

  • refit (boolean or string, optional, default=True) – Whether to refit the estimator on the whole dataset using the best found parameters.

  • n_jobs (int or None, optional, default=None) – Number of jobs to run in parallel. If -1, all CPUs are used. If 1 is given, no parallel computing code is used at all, which is useful for debugging. See joblib for details.

  • random_state (int, optional, default=0) – The random seed for optimization.

  • verbose (boolean, optional, default=False) – Whether to print the search history.

>>> import numpy as np
>>> from sklearn import svm
>>> from sklearn import datasets
>>> from sequd import RandSearch
>>> from sklearn.model_selection import KFold
>>> iris = datasets.load_iris()
>>> ParaSpace = {'C': {'Type': 'continuous', 'Range': [-6, 16], 'Wrapper': np.exp2},
...              'gamma': {'Type': 'continuous', 'Range': [-16, 6], 'Wrapper': np.exp2}}
>>> estimator = svm.SVC()
>>> cv = KFold(n_splits=5, random_state=0, shuffle=True)
>>> clf = RandSearch(ParaSpace, max_runs=100, estimator=estimator, cv=cv,
...                  scoring=None, n_jobs=None, refit=False, random_state=0, verbose=False)
>>> clf.fit(iris.data, iris.target)
Variables
  • best_score_ (float) – The best average cv score among the evaluated trials.

  • best_params_ (dict) – Parameters that achieve best_score_.

  • best_estimator_ (sklearn estimator) – The estimator refitted based on best_params_. Not available if estimator=None or refit=False.

  • search_time_consumed_ (float) – Seconds used for the whole search procedure.

  • refit_time_ (float) – Seconds used for refitting the best model on the whole dataset. Not available if estimator=None or refit=False.

sequd.pybatdoe.batch_lhs

class pybatdoe.batch_lhs.LHSSearch(para_space, max_runs=100, estimator=None, cv=None, scoring=None, refit=True, n_jobs=None, random_state=0, verbose=False)[source]

Implementation of Latin Hypercube Sampling.

Parameters
  • para_space (dict or list of dictionaries) –

    It has three types:

    Continuous:

    Specify Type as continuous, and include the key Range (a two-element list giving the lower and upper bounds) and the key Wrapper, a callable function for transforming the sampled values.

    Integer:

    Specify Type as integer, and include the key Mapping (a list of all the sorted integer values).

    Categorical:

    Specify Type as categorical, and include the key Mapping (a list of all the possible categories).

  • max_runs (int, optional, default=100) – The maximum number of trials to be evaluated. When this value is reached, the algorithm stops.

  • estimator (estimator object) – This is assumed to implement the scikit-learn estimator interface.

  • cv (cross-validation method, an sklearn object) – e.g., StratifiedKFold or KFold.

  • scoring (string, callable, list/tuple, dict or None, optional, default=None) – An sklearn-style scoring function. If None, the estimator’s default scorer (if available) is used. See the sklearn documentation for details.

  • refit (boolean or string, optional, default=True) – Whether to refit the estimator on the whole dataset using the best found parameters.

  • n_jobs (int or None, optional, default=None) – Number of jobs to run in parallel. If -1, all CPUs are used. If 1 is given, no parallel computing code is used at all, which is useful for debugging. See joblib for details.

  • random_state (int, optional, default=0) – The random seed for optimization.

  • verbose (boolean, optional, default=False) – Whether to print the search history.

>>> import numpy as np
>>> from sklearn import svm
>>> from sklearn import datasets
>>> from sequd import LHSSearch
>>> from sklearn.model_selection import KFold
>>> iris = datasets.load_iris()
>>> ParaSpace = {'C': {'Type': 'continuous', 'Range': [-6, 16], 'Wrapper': np.exp2},
...              'gamma': {'Type': 'continuous', 'Range': [-16, 6], 'Wrapper': np.exp2}}
>>> estimator = svm.SVC()
>>> cv = KFold(n_splits=5, random_state=0, shuffle=True)
>>> clf = LHSSearch(ParaSpace, max_runs=100, estimator=estimator, cv=cv,
...                 scoring=None, n_jobs=None, refit=False, random_state=0, verbose=False)
>>> clf.fit(iris.data, iris.target)
Variables
  • best_score_ (float) – The best average cv score among the evaluated trials.

  • best_params_ (dict) – Parameters that achieve best_score_.

  • best_estimator_ (sklearn estimator) – The estimator refitted based on best_params_. Not available if estimator=None or refit=False.

  • search_time_consumed_ (float) – Seconds used for the whole search procedure.

  • refit_time_ (float) – Seconds used for refitting the best model on the whole dataset. Not available if estimator=None or refit=False.

sequd.pybatdoe.batch_sobol

class pybatdoe.batch_sobol.SobolSearch(para_space, max_runs=100, estimator=None, cv=None, scoring=None, refit=True, n_jobs=None, random_state=0, verbose=False)[source]

Implementation of Sobol Sequence.

Parameters
  • para_space (dict or list of dictionaries) –

    It has three types:

    Continuous:

    Specify Type as continuous, and include the key Range (a two-element list giving the lower and upper bounds) and the key Wrapper, a callable function for transforming the sampled values.

    Integer:

    Specify Type as integer, and include the key Mapping (a list of all the sorted integer values).

    Categorical:

    Specify Type as categorical, and include the key Mapping (a list of all the possible categories).

  • max_runs (int, optional, default=100) – The maximum number of trials to be evaluated. When this value is reached, the algorithm stops.

  • estimator (estimator object) – This is assumed to implement the scikit-learn estimator interface.

  • cv (cross-validation method, an sklearn object) – e.g., StratifiedKFold or KFold.

  • scoring (string, callable, list/tuple, dict or None, optional, default=None) – An sklearn-style scoring function. If None, the estimator’s default scorer (if available) is used. See the sklearn documentation for details.

  • refit (boolean or string, optional, default=True) – Whether to refit the estimator on the whole dataset using the best found parameters.

  • n_jobs (int or None, optional, default=None) – Number of jobs to run in parallel. If -1, all CPUs are used. If 1 is given, no parallel computing code is used at all, which is useful for debugging. See joblib for details.

  • random_state (int, optional, default=0) – The random seed for optimization.

  • verbose (boolean, optional, default=False) – Whether to print the search history.

>>> import numpy as np
>>> from sklearn import svm
>>> from sklearn import datasets
>>> from sequd import SobolSearch
>>> from sklearn.model_selection import KFold
>>> iris = datasets.load_iris()
>>> ParaSpace = {'C': {'Type': 'continuous', 'Range': [-6, 16], 'Wrapper': np.exp2},
...              'gamma': {'Type': 'continuous', 'Range': [-16, 6], 'Wrapper': np.exp2}}
>>> estimator = svm.SVC()
>>> cv = KFold(n_splits=5, random_state=0, shuffle=True)
>>> clf = SobolSearch(ParaSpace, max_runs=100, estimator=estimator, cv=cv,
...                   scoring=None, n_jobs=None, refit=False, random_state=0, verbose=False)
>>> clf.fit(iris.data, iris.target)
Variables
  • best_score_ (float) – The best average cv score among the evaluated trials.

  • best_params_ (dict) – Parameters that achieve best_score_.

  • best_estimator_ (sklearn estimator) – The estimator refitted based on best_params_. Not available if estimator=None or refit=False.

  • search_time_consumed_ (float) – Seconds used for the whole search procedure.

  • refit_time_ (float) – Seconds used for refitting the best model on the whole dataset. Not available if estimator = None or refit=False.

sequd.pybatdoe.batch_ud

class pybatdoe.batch_ud.UDSearch(para_space, max_runs=100, max_search_iter=100, estimator=None, cv=None, scoring=None, refit=True, n_jobs=None, random_state=0, verbose=False)[source]

Implementation of Uniform Design.

Parameters
  • para_space (dict or list of dictionaries) –

    It has three types:

    Continuous:

    Specify Type as continuous, and include the key Range (a two-element list giving the lower and upper bounds) and the key Wrapper, a callable function for transforming the sampled values.

    Integer:

    Specify Type as integer, and include the key Mapping (a list of all the sorted integer values).

    Categorical:

    Specify Type as categorical, and include the key Mapping (a list of all the possible categories).

  • max_runs (int, optional, default=100) – The maximum number of trials to be evaluated. When this value is reached, the algorithm stops.

  • max_search_iter (int, optional, default=100) – The maximum number of iterations used to generate the uniform design or the augmented uniform design.

  • estimator (estimator object) – This is assumed to implement the scikit-learn estimator interface.

  • cv (cross-validation method, an sklearn object) – e.g., StratifiedKFold or KFold.

  • scoring (string, callable, list/tuple, dict or None, optional, default=None) – An sklearn-style scoring function. If None, the estimator’s default scorer (if available) is used. See the sklearn documentation for details.

  • refit (boolean or string, optional, default=True) – Whether to refit the estimator on the whole dataset using the best found parameters.

  • n_jobs (int or None, optional, default=None) – Number of jobs to run in parallel. If -1, all CPUs are used. If 1 is given, no parallel computing code is used at all, which is useful for debugging. See joblib for details.

  • random_state (int, optional, default=0) – The random seed for optimization.

  • verbose (boolean, optional, default=False) – Whether to print the search history.

>>> import numpy as np
>>> from sklearn import svm
>>> from sklearn import datasets
>>> from sequd import UDSearch
>>> from sklearn.model_selection import KFold
>>> iris = datasets.load_iris()
>>> ParaSpace = {'C': {'Type': 'continuous', 'Range': [-6, 16], 'Wrapper': np.exp2},
...              'gamma': {'Type': 'continuous', 'Range': [-16, 6], 'Wrapper': np.exp2}}
>>> estimator = svm.SVC()
>>> cv = KFold(n_splits=5, random_state=0, shuffle=True)
>>> clf = UDSearch(ParaSpace, max_runs=100, max_search_iter=100, estimator=estimator, cv=cv,
...                scoring=None, n_jobs=None, refit=False, random_state=0, verbose=False)
>>> clf.fit(iris.data, iris.target)
Variables
  • best_score_ (float) – The best average cv score among the evaluated trials.

  • best_params_ (dict) – Parameters that achieve best_score_.

  • best_estimator_ (sklearn estimator) – The estimator refitted based on best_params_. Not available if estimator=None or refit=False.

  • search_time_consumed_ (float) – Seconds used for the whole search procedure.

  • refit_time_ (float) – Seconds used for refitting the best model on the whole dataset. Not available if estimator=None or refit=False.

sequd.pybayopt

sequd.pybayopt.bayopt_gpei

class pybayopt.bayopt_gpei.GPEIOPT(para_space, max_runs=100, time_out=10, estimator=None, cv=None, scoring=None, refit=True, random_state=0, verbose=False)[source]

Interface to Gaussian Process Expected Improvement (GP-EI) Bayesian optimization.

Parameters
  • para_space (dict or list of dictionaries) –

    It has three types:

    Continuous:

    Specify Type as continuous, and include the key Range (a two-element list giving the lower and upper bounds) and the key Wrapper, a callable function for transforming the sampled values.

    Integer:

    Specify Type as integer, and include the key Mapping (a list of all the sorted integer values).

    Categorical:

    Specify Type as categorical, and include the key Mapping (a list of all the possible categories).

  • max_runs (int, optional, default=100) – The maximum number of trials to be evaluated. When this value is reached, the algorithm stops.

  • time_out (float, optional, default=10) – The time-out threshold (in seconds) for generating the next run.

  • estimator (estimator object) – This is assumed to implement the scikit-learn estimator interface.

  • cv (cross-validation method, an sklearn object) – e.g., StratifiedKFold or KFold.

  • scoring (string, callable, list/tuple, dict or None, optional, default=None) – An sklearn-style scoring function. If None, the estimator’s default scorer (if available) is used. See the sklearn documentation for details.

  • refit (boolean or string, optional, default=True) – Whether to refit the estimator on the whole dataset using the best found parameters.

  • random_state (int, optional, default=0) – The random seed for optimization.

  • verbose (boolean, optional, default=False) – Whether to print the search history.

>>> import numpy as np
>>> from sklearn import svm
>>> from sklearn import datasets
>>> from sequd import GPEIOPT
>>> from sklearn.model_selection import KFold
>>> iris = datasets.load_iris()
>>> ParaSpace = {'C': {'Type': 'continuous', 'Range': [-6, 16], 'Wrapper': np.exp2},
...              'gamma': {'Type': 'continuous', 'Range': [-16, 6], 'Wrapper': np.exp2}}
>>> estimator = svm.SVC()
>>> cv = KFold(n_splits=5, random_state=0, shuffle=True)
>>> clf = GPEIOPT(ParaSpace, max_runs=100, time_out=10, estimator=estimator, cv=cv,
...               scoring=None, refit=False, random_state=0, verbose=False)
>>> clf.fit(iris.data, iris.target)
Variables
  • best_score_ (float) – The best average cv score among the evaluated trials.

  • best_params_ (dict) – Parameters that achieve best_score_.

  • best_estimator_ (sklearn estimator) – The estimator refitted based on best_params_. Not available if estimator=None or refit=False.

  • search_time_consumed_ (float) – Seconds used for the whole search procedure.

  • refit_time_ (float) – Seconds used for refitting the best model on the whole dataset. Not available if estimator=None or refit=False.

sequd.pybayopt.bayopt_smac

class pybayopt.bayopt_smac.SMACOPT(para_space, max_runs=100, estimator=None, cv=None, scoring=None, refit=True, random_state=0, verbose=False)[source]

Interface to SMAC (Bayesian optimization).

Parameters
  • para_space (dict or list of dictionaries) –

    It has three types:

    Continuous:

    Specify Type as continuous, and include the key Range (a two-element list giving the lower and upper bounds) and the key Wrapper, a callable function for transforming the sampled values.

    Integer:

    Specify Type as integer, and include the key Mapping (a list of all the sorted integer values).

    Categorical:

    Specify Type as categorical, and include the key Mapping (a list of all the possible categories).

  • max_runs (int, optional, default=100) – The maximum number of trials to be evaluated. When this value is reached, the algorithm stops.

  • estimator (estimator object) – This is assumed to implement the scikit-learn estimator interface.

  • cv (cross-validation method, an sklearn object) – e.g., StratifiedKFold or KFold.

  • scoring (string, callable, list/tuple, dict or None, optional, default=None) – An sklearn-style scoring function. If None, the estimator’s default scorer (if available) is used. See the sklearn documentation for details.

  • refit (boolean or string, optional, default=True) – Whether to refit the estimator on the whole dataset using the best found parameters.

  • random_state (int, optional, default=0) – The random seed for optimization.

  • verbose (boolean, optional, default=False) – Whether to print the search history.

>>> import numpy as np
>>> from sklearn import svm
>>> from sklearn import datasets
>>> from sequd import SMACOPT
>>> from sklearn.model_selection import KFold
>>> iris = datasets.load_iris()
>>> ParaSpace = {'C': {'Type': 'continuous', 'Range': [-6, 16], 'Wrapper': np.exp2},
...              'gamma': {'Type': 'continuous', 'Range': [-16, 6], 'Wrapper': np.exp2}}
>>> estimator = svm.SVC()
>>> cv = KFold(n_splits=5, random_state=0, shuffle=True)
>>> clf = SMACOPT(ParaSpace, max_runs=100, estimator=estimator, cv=cv,
...               scoring=None, refit=False, random_state=0, verbose=False)
>>> clf.fit(iris.data, iris.target)
Variables
  • best_score_ (float) – The best average cv score among the evaluated trials.

  • best_params_ (dict) – Parameters that achieve best_score_.

  • best_estimator_ (sklearn estimator) – The estimator refitted based on best_params_. Not available if estimator=None or refit=False.

  • search_time_consumed_ (float) – Seconds used for the whole search procedure.

  • refit_time_ (float) – Seconds used for refitting the best model on the whole dataset. Not available if estimator=None or refit=False.

sequd.pybayopt.bayopt_tpe

class pybayopt.bayopt_tpe.TPEOPT(para_space, max_runs=100, estimator=None, cv=None, scoring=None, refit=True, random_state=0, verbose=False)[source]

Interface to Hyperopt's TPE (Bayesian optimization).

Parameters
  • para_space (dict or list of dictionaries) –

    It has three types:

    Continuous:

    Specify Type as continuous, and include the key Range (a two-element list giving the lower and upper bounds) and the key Wrapper, a callable function for transforming the sampled values.

    Integer:

    Specify Type as integer, and include the key Mapping (a list of all the sorted integer values).

    Categorical:

    Specify Type as categorical, and include the key Mapping (a list of all the possible categories).

  • max_runs (int, optional, default=100) – The maximum number of trials to be evaluated. When this value is reached, the algorithm stops.

  • estimator (estimator object) – This is assumed to implement the scikit-learn estimator interface.

  • cv (cross-validation method, an sklearn object) – e.g., StratifiedKFold or KFold.

  • scoring (string, callable, list/tuple, dict or None, optional, default=None) – An sklearn-style scoring function. If None, the estimator’s default scorer (if available) is used. See the sklearn documentation for details.

  • refit (boolean or string, optional, default=True) – Whether to refit the estimator on the whole dataset using the best found parameters.

  • random_state (int, optional, default=0) – The random seed for optimization.

  • verbose (boolean, optional, default=False) – Whether to print the search history.

>>> import numpy as np
>>> from sklearn import svm
>>> from sklearn import datasets
>>> from sequd import TPEOPT
>>> from sklearn.model_selection import KFold
>>> iris = datasets.load_iris()
>>> ParaSpace = {'C': {'Type': 'continuous', 'Range': [-6, 16], 'Wrapper': np.exp2},
...              'gamma': {'Type': 'continuous', 'Range': [-16, 6], 'Wrapper': np.exp2}}
>>> estimator = svm.SVC()
>>> cv = KFold(n_splits=5, random_state=0, shuffle=True)
>>> clf = TPEOPT(ParaSpace, max_runs=100, estimator=estimator, cv=cv,
...              scoring=None, refit=False, random_state=0, verbose=False)
>>> clf.fit(iris.data, iris.target)
Variables
  • best_score_ (float) – The best average cv score among the evaluated trials.

  • best_params_ (dict) – Parameters that achieve best_score_.

  • best_estimator_ (sklearn estimator) – The estimator refitted based on best_params_. Not available if estimator=None or refit=False.

  • search_time_consumed_ (float) – Seconds used for the whole search procedure.

  • refit_time_ (float) – Seconds used for refitting the best model on the whole dataset. Not available if estimator=None or refit=False.

sequd.pysequd

sequd.pysequd.seqrand

class pysequd.seqrand.SeqRand(para_space, n_runs_per_stage=20, max_runs=100, n_jobs=None, estimator=None, cv=None, scoring=None, refit=True, random_state=0, verbose=False)[source]

Implementation of random search in a sequential (stage-wise) version.

Parameters
  • para_space (dict or list of dictionaries) –

    It has three types:

    Continuous:

    Specify Type as continuous, and include the key Range (a two-element list giving the lower and upper bounds) and the key Wrapper, a callable function for transforming the sampled values.

    Integer:

    Specify Type as integer, and include the key Mapping (a list of all the sorted integer values).

    Categorical:

    Specify Type as categorical, and include the key Mapping (a list of all the possible categories).

  • n_runs_per_stage (int, optional, default=20) – A positive integer giving the number of levels used in generating the uniform design (the number of trials evaluated per stage).

  • max_runs (int, optional, default=100) – The maximum number of trials to be evaluated. When this value is reached, the algorithm stops.

  • n_jobs (int or None, optional, default=None) – Number of jobs to run in parallel. If -1, all CPUs are used. If 1 is given, no parallel computing code is used at all, which is useful for debugging. See joblib for details.

  • estimator (estimator object) – This is assumed to implement the scikit-learn estimator interface.

  • cv (cross-validation method, an sklearn object) – e.g., StratifiedKFold or KFold.

  • scoring (string, callable, list/tuple, dict or None, optional, default=None) – An sklearn-style scoring function. If None, the estimator’s default scorer (if available) is used. See the sklearn documentation for details.

  • refit (boolean or string, optional, default=True) – Whether to refit the estimator on the whole dataset using the best found parameters.

  • random_state (int, optional, default=0) – The random seed for optimization.

  • verbose (boolean, optional, default=False) – Whether to print the search history.

>>> import numpy as np
>>> from sklearn import svm
>>> from sklearn import datasets
>>> from sequd import SeqRand
>>> from sklearn.model_selection import KFold
>>> iris = datasets.load_iris()
>>> ParaSpace = {'C': {'Type': 'continuous', 'Range': [-6, 16], 'Wrapper': np.exp2},
...              'gamma': {'Type': 'continuous', 'Range': [-16, 6], 'Wrapper': np.exp2}}
>>> Level_Number = 20
>>> estimator = svm.SVC()
>>> cv = KFold(n_splits=5, random_state=1, shuffle=True)
>>> clf = SeqRand(ParaSpace, n_runs_per_stage=Level_Number, max_runs=100, n_jobs=None,
...               estimator=estimator, cv=cv, scoring=None, refit=False, random_state=0, verbose=False)
>>> clf.fit(iris.data, iris.target)
Variables
  • best_score_ (float) – The best average cv score among the evaluated trials.

  • best_params_ (dict) – Parameters that achieve best_score_.

  • best_estimator_ (sklearn estimator) – The estimator refitted based on best_params_. Not available if estimator=None or refit=False.

  • search_time_consumed_ (float) – Seconds used for the whole search procedure.

  • refit_time_ (float) – Seconds used for refitting the best model on the whole dataset. Not available if estimator=None or refit=False.

fit(x, y=None)[source]

Run fit with all sets of parameters.

Parameters
  • x (array, shape = [n_samples, n_features]) – input variables.

  • y (array, shape = [n_samples] or [n_samples, n_output], optional) – target variable.

fmax(wrapper_func)[source]

Search for the optimal value of a function.

Parameters

wrapper_func (callable) – the function to be optimized.

plot_scores()[source]

Visualize the scores history.

sequd.pysequd.snto

class pysequd.snto.SNTO(para_space, n_runs_per_stage=20, max_runs=100, max_search_iter=100, n_jobs=None, estimator=None, cv=None, scoring=None, refit=True, random_state=0, verbose=False)[source]

Implementation of SNTO (Sequential Number-Theoretic Optimization).

Parameters
  • para_space (dict or list of dictionaries) –

    It has three types:

    Continuous:

    Specify Type as continuous, and include the key Range (a two-element list giving the lower and upper bounds) and the key Wrapper, a callable function for transforming the sampled values.

    Integer:

    Specify Type as integer, and include the key Mapping (a list of all the sorted integer values).

    Categorical:

    Specify Type as categorical, and include the key Mapping (a list of all the possible categories).

  • n_runs_per_stage (int, optional, default=20) – A positive integer giving the number of levels used in generating the uniform design (the number of trials evaluated per stage).

  • max_runs (int, optional, default=100) – The maximum number of trials to be evaluated. When this value is reached, the algorithm stops.

  • max_search_iter (int, optional, default=100) – The maximum number of iterations used to generate the uniform design or the augmented uniform design.

  • n_jobs (int or None, optional, default=None) – Number of jobs to run in parallel. If -1, all CPUs are used. If 1 is given, no parallel computing code is used at all, which is useful for debugging. See joblib for details.

  • estimator (estimator object) – This is assumed to implement the scikit-learn estimator interface.

  • cv (cross-validation method, an sklearn object) – e.g., StratifiedKFold or KFold.

  • scoring (string, callable, list/tuple, dict or None, optional, default=None) – An sklearn-style scoring function. If None, the estimator’s default scorer (if available) is used. See the sklearn documentation for details.

  • refit (boolean or string, optional, default=True) – Whether to refit the estimator on the whole dataset using the best found parameters.

  • random_state (int, optional, default=0) – The random seed for optimization.

  • verbose (boolean, optional, default=False) – Whether to print the search history.

>>> import numpy as np
>>> from sklearn import svm
>>> from sklearn import datasets
>>> from sequd import SNTO
>>> from sklearn.model_selection import KFold
>>> iris = datasets.load_iris()
>>> ParaSpace = {'C': {'Type': 'continuous', 'Range': [-6, 16], 'Wrapper': np.exp2},
...              'gamma': {'Type': 'continuous', 'Range': [-16, 6], 'Wrapper': np.exp2}}
>>> estimator = svm.SVC()
>>> cv = KFold(n_splits=5, random_state=1, shuffle=True)
>>> clf = SNTO(ParaSpace, n_runs_per_stage=20, max_runs=100, max_search_iter=100, n_jobs=None,
...            estimator=estimator, cv=cv, scoring=None, refit=False, random_state=0, verbose=False)
>>> clf.fit(iris.data, iris.target)
Variables
  • best_score_ (float) – The best average cv score among the evaluated trials.

  • best_params_ (dict) – Parameters that achieve best_score_.

  • best_estimator_ (sklearn estimator) – The estimator refitted based on best_params_. Not available if estimator=None or refit=False.

  • search_time_consumed_ (float) – Seconds used for the whole search procedure.

  • refit_time_ (float) – Seconds used for refitting the best model on the whole dataset. Not available if estimator=None or refit=False.

fit(x, y=None)[source]

Run fit with all sets of parameters.

Parameters
  • x (array, shape = [n_samples, n_features]) – input variables.

  • y (array, shape = [n_samples] or [n_samples, n_output], optional) – target variable.

fmax(wrapper_func)[source]

Search for the optimal value of a function.

Parameters

wrapper_func (callable) – the function to be optimized.

plot_scores()[source]

Visualize the scores history.

sequd.pysequd.sequd

class pysequd.sequd.SeqUD(para_space, n_runs_per_stage=20, max_runs=100, max_search_iter=100, n_jobs=None, estimator=None, cv=None, scoring=None, refit=True, random_state=0, verbose=False)[source]

Implementation of sequential uniform design.

Parameters
  • para_space (dict or list of dictionaries) –

    It has three types:

    Continuous:

    Specify Type as continuous, and include the key Range (a two-element list giving the lower and upper bounds) and the key Wrapper, a callable function for transforming the sampled values.

    Integer:

    Specify Type as integer, and include the key Mapping (a list of all the sorted integer values).

    Categorical:

    Specify Type as categorical, and include the key Mapping (a list of all the possible categories).

  • n_runs_per_stage (int, optional, default=20) – A positive integer giving the number of levels used in generating the uniform design (the number of trials evaluated per stage).

  • max_runs (int, optional, default=100) – The maximum number of trials to be evaluated. When this value is reached, the algorithm stops.

  • max_search_iter (int, optional, default=100) – The maximum number of iterations used to generate the uniform design or the augmented uniform design.

  • n_jobs (int or None, optional, default=None) – Number of jobs to run in parallel. If -1, all CPUs are used. If 1 is given, no parallel computing code is used at all, which is useful for debugging. See joblib for details.

  • estimator (estimator object) – This is assumed to implement the scikit-learn estimator interface.

  • cv (cross-validation method, an sklearn object) – e.g., StratifiedKFold or KFold.

  • scoring (string, callable, list/tuple, dict or None, optional, default=None) – An sklearn-style scoring function. If None, the estimator’s default scorer (if available) is used. See the sklearn documentation for details.

  • refit (boolean or string, optional, default=True) – Whether to refit the estimator on the whole dataset using the best found parameters.

  • random_state (int, optional, default=0) – The random seed for optimization.

  • verbose (boolean, optional, default=False) – Whether to print the search history.

>>> import numpy as np
>>> from sklearn import svm
>>> from sklearn import datasets
>>> from sequd import SeqUD
>>> from sklearn.model_selection import KFold
>>> iris = datasets.load_iris()
>>> ParaSpace = {'C': {'Type': 'continuous', 'Range': [-6, 16], 'Wrapper': np.exp2},
...              'gamma': {'Type': 'continuous', 'Range': [-16, 6], 'Wrapper': np.exp2}}
>>> estimator = svm.SVC()
>>> cv = KFold(n_splits=5, random_state=1, shuffle=True)
>>> clf = SeqUD(ParaSpace, n_runs_per_stage=20, max_runs=100, max_search_iter=100, n_jobs=None,
...             estimator=estimator, cv=cv, scoring=None, refit=False, random_state=0, verbose=False)
>>> clf.fit(iris.data, iris.target)
Variables
  • best_score_ (float) – The best average cv score among the evaluated trials.

  • best_params_ (dict) – Parameters that achieve best_score_.

  • best_estimator_ (sklearn estimator) – The estimator refitted based on best_params_. Not available if estimator=None or refit=False.

  • search_time_consumed_ (float) – Seconds used for the whole search procedure.

  • refit_time_ (float) – Seconds used for refitting the best model on the whole dataset. Not available if estimator=None or refit=False.

fit(x, y=None)[source]

Run fit with all sets of parameters.

Parameters
  • x (array, shape = [n_samples, n_features]) – input variables.

  • y (array, shape = [n_samples] or [n_samples, n_output], optional) – target variable.

fmax(wrapper_func)[source]

Search for the optimal value of a function.

Parameters

wrapper_func (callable) – the function to be optimized. A usage sketch follows below.
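
A minimal sketch of fmax. The calling convention assumed here (wrapper_func invoked with one keyword argument per para_space entry) is an assumption for illustration, not a documented contract; check the source for the exact interface:

>>> from sequd import SeqUD
>>> def obj(x1, x2):  # hypothetical objective to be maximized
...     return -((x1 - 0.3) ** 2 + (x2 - 0.7) ** 2)
>>> space = {'x1': {'Type': 'continuous', 'Range': [0, 1], 'Wrapper': lambda v: v},
...          'x2': {'Type': 'continuous', 'Range': [0, 1], 'Wrapper': lambda v: v}}
>>> opt = SeqUD(space, n_runs_per_stage=20, max_runs=60, random_state=0, verbose=False)
>>> opt.fmax(obj)  # assumption: each trial's parameters are passed as keyword args
>>> opt.best_params_  # expected to approach {'x1': 0.3, 'x2': 0.7}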

plot_scores()[source]

Visualize the scores history.