Use HyperGBM with Python
HyperGBM is developed with Python. We recommend using the Python tool make_experiment to create experiment and train the model.
The basic steps for training the model with make_experiment are as follows:
Prepare the dataset(pandas or dask DataFrame)
Create experiment with make_experiment
Call the .run() method of experiment to performing training and get the model
Predict with trained model or save it with the Python tool pickle
Prepare the dataset
Both pandas and dask can be loaded depending on your task types to get DataFrame for training the model.
Taking loading the sklearn dataset breast_cancer as an example,one can get the dataset by following several procedures:
import pandas as pd
from sklearn import datasets
from sklearn.model_selection import train_test_split
X,y = datasets.load_breast_cancer(as_frame=True,return_X_y=True)
X_train,X_test,y_train,y_test = train_test_split(X,y,train_size=0.7,random_state=335)
train_data = pd.concat([X_train,y_train],axis=1)
where train_data is used for model trianing while X_test are y_test used for evaluating the model.
Create experiment with make_experiment
Users can creating experiment for the prepared dataset and start training the model following procedures below:
from hypergbm import make_experiment
experiment = make_experiment(train_data, target='target', reward_metric='precision')
estimator = experiment.run()
where estimator is the trianed model.
Save the model
It is recommended to save the model with pickle:
import pickle
with open('model.pkl','wb') as f:
pickle.dump(estimator, f)
Evaluate the model
The model can be evaluated with tools provided by sklearn:
from sklearn.metrics import classification_report
y_pred=estimator.predict(X_test)
print(classification_report(y_test, y_pred, digits=5))
output:
precision recall f1-score support
0 0.96429 0.93103 0.94737 58
1 0.96522 0.98230 0.97368 113
accuracy 0.96491 171
macro avg 0.96475 0.95667 0.96053 171
weighted avg 0.96490 0.96491 0.96476 171
More info:
Please refer to the docstring of make_experiment for more information about it:
print(make_experiment.__doc__)
If you are using Notebook or IPython, the following code can provide more information about make_experiment:
make_experiment?