Search Space

When not defined explicitly, `make_experiment` uses `search_space_general` as its search space, which is defined as follows:

```python
search_space_general = GeneralSearchSpaceGenerator(n_estimators=200)
```
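This fall-back can be pictured with a minimal sketch. The class body and `make_experiment` below are illustrative stand-ins that mirror HyperGBM's names, not its actual implementation:

```python
class GeneralSearchSpaceGenerator:
    """Toy stand-in for hypergbm.search_space.GeneralSearchSpaceGenerator."""
    def __init__(self, n_estimators=200, **kwargs):
        self.n_estimators = n_estimators
        self.kwargs = kwargs

# module-level default, analogous to hypergbm's search_space_general
search_space_general = GeneralSearchSpaceGenerator(n_estimators=200)

def make_experiment(train_data, search_space=None, **kwargs):
    # When no search space is given explicitly, fall back to the general one.
    ss = search_space if search_space is not None else search_space_general
    return {'data': train_data, 'search_space': ss}

exp = make_experiment([1, 2, 3])
print(exp['search_space'] is search_space_general)  # True
```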
Define Search Space

To use a customized search space, set the parameter `search_space` when calling `make_experiment`. Take setting `max_depth` to 20 for XGBoost as an example:
```python
from hypergbm import make_experiment
from hypergbm.search_space import GeneralSearchSpaceGenerator

my_search_space = \
    GeneralSearchSpaceGenerator(n_estimators=200, xgb_init_kwargs={'max_depth': 20})

train_data = ...

experiment = make_experiment(train_data,
                             search_space=my_search_space,
                             ...)
```
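The override takes effect because user-supplied `xgb_init_kwargs` values take precedence over the generator's defaults when the two dicts are merged. The merge itself is ordinary Python dict semantics; the sketch below is illustrative only, and the default values shown are hypothetical, not HyperGBM's actual defaults:

```python
# Hypothetical defaults, for illustration only.
default_xgb_init_kwargs = {'n_estimators': 200, 'max_depth': 7}

def merge_init_kwargs(defaults, overrides):
    # Later keys win, so user-supplied overrides replace defaults.
    return {**defaults, **overrides}

merged = merge_init_kwargs(default_xgb_init_kwargs, {'max_depth': 20})
print(merged)  # {'n_estimators': 200, 'max_depth': 20}
```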
To make a parameter searchable, we recommend defining a subclass of `GeneralSearchSpaceGenerator`. For example, to let the algorithm search among three choices of `max_depth` for XGBoost:
```python
from hypergbm import make_experiment
from hypergbm.search_space import GeneralSearchSpaceGenerator
from hypernets.core.search_space import Choice

class MySearchSpace(GeneralSearchSpaceGenerator):
    @property
    def default_xgb_init_kwargs(self):
        return {**super().default_xgb_init_kwargs,
                'max_depth': Choice([10, 20, 30]),
                }

my_search_space = MySearchSpace()
train_data = ...

experiment = make_experiment(train_data,
                             search_space=my_search_space,
                             ...)
```
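Conceptually, `Choice` declares a discrete dimension that each search trial draws a value from. The toy class below is a stand-in for `hypernets.core.search_space.Choice`, written to show that idea, not its real implementation:

```python
import random

class Choice:
    """Toy stand-in for a discrete search dimension (not the Hypernets class)."""
    def __init__(self, options):
        self.options = list(options)

    def sample(self, rng):
        # Each trial draws one candidate value from the options.
        return rng.choice(self.options)

max_depth = Choice([10, 20, 30])
rng = random.Random(42)
samples = {max_depth.sample(rng) for _ in range(50)}
print(sorted(samples))  # every drawn value is one of the three candidate depths
```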
Support for More Machine Learning Models

HyperGBM already supports XGBoost, LightGBM, CatBoost, and HistGradientBoosting; they are taken as components of the search space to be searched when training a model. Support for another machine learning algorithm can be added in three steps:

1. Encapsulate your algorithm as a subclass of `HyperEstimator`
2. Add the encapsulated algorithm to the search space and define its search parameters
3. Use your search space in `make_experiment`

See the following example:
```python
from sklearn import svm

from hypergbm import make_experiment
from hypergbm.estimators import HyperEstimator
from hypergbm.search_space import GeneralSearchSpaceGenerator
from hypernets.core.search_space import Choice, Int, Real
from hypernets.tabular.datasets import dsutils


class SVMEstimator(HyperEstimator):
    def __init__(self, fit_kwargs, C=1.0, kernel='rbf', gamma='auto', degree=3, random_state=666,
                 probability=True, decision_function_shape=None, space=None, name=None, **kwargs):
        if C is not None:
            kwargs['C'] = C
        if kernel is not None:
            kwargs['kernel'] = kernel
        if gamma is not None:
            kwargs['gamma'] = gamma
        if degree is not None:
            kwargs['degree'] = degree
        if random_state is not None:
            kwargs['random_state'] = random_state
        if decision_function_shape is not None:
            kwargs['decision_function_shape'] = decision_function_shape
        kwargs['probability'] = probability
        HyperEstimator.__init__(self, fit_kwargs, space, name, **kwargs)

    def _build_estimator(self, task, kwargs):
        if task == 'regression':
            # svm.SVR does not accept classifier-only options, so drop them
            kwargs = {k: v for k, v in kwargs.items()
                      if k not in ('random_state', 'probability', 'decision_function_shape')}
            hsvm = SVMRegressorWrapper(**kwargs)
        else:
            hsvm = SVMClassifierWrapper(**kwargs)
        hsvm.__dict__['task'] = task
        return hsvm


class SVMClassifierWrapper(svm.SVC):
    def fit(self, X, y=None, **kwargs):
        return super().fit(X, y)


class SVMRegressorWrapper(svm.SVR):  # regression wraps SVR, not SVC
    def fit(self, X, y=None, **kwargs):
        return super().fit(X, y)


class GeneralSearchSpaceGeneratorPlusSVM(GeneralSearchSpaceGenerator):
    def __init__(self, enable_svm=True, **kwargs):
        super().__init__(**kwargs)
        self.enable_svm = enable_svm

    @property
    def default_svm_init_kwargs(self):
        return {
            'C': Real(0.1, 5, 0.1),
            'kernel': Choice(['rbf', 'poly', 'sigmoid']),
            'degree': Int(1, 5),
            'gamma': Real(0.0001, 5, 0.0002),
        }

    @property
    def default_svm_fit_kwargs(self):
        return {}

    @property
    def estimators(self):
        r = super().estimators
        if self.enable_svm:
            r['svm'] = (SVMEstimator, self.default_svm_init_kwargs, self.default_svm_fit_kwargs)
        return r


my_search_space = GeneralSearchSpaceGeneratorPlusSVM()
train_data = dsutils.load_blood()

experiment = make_experiment(train_data, target='Class',
                             search_space=my_search_space)
estimator = experiment.run()
print(estimator)
```
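Stripped of HyperGBM specifics, the extension pattern in the example is just merging a new entry into an inherited `estimators` mapping. The sketch below captures that pattern alone; all class names and registry values here are illustrative, not library code:

```python
class BaseSpace:
    """Toy stand-in for a search-space generator's estimator registry."""
    @property
    def estimators(self):
        # name -> (estimator class or label, init kwargs, fit kwargs)
        return {'xgb': ('XGBEstimator', {'n_estimators': 200}, {})}

class SpacePlusSVM(BaseSpace):
    def __init__(self, enable_svm=True):
        self.enable_svm = enable_svm

    @property
    def estimators(self):
        r = super().estimators           # start from the inherited registry
        if self.enable_svm:
            r['svm'] = ('SVMEstimator', {'C': 1.0}, {})
        return r

print(sorted(SpacePlusSVM().estimators))                  # ['svm', 'xgb']
print(sorted(SpacePlusSVM(enable_svm=False).estimators))  # ['xgb']
```

Because the base property builds a fresh dict on each access, mutating `r` in the subclass never corrupts the parent's registry; the same holds in the HyperGBM example above.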