Version 0.2.3

This version adds the following new features:

Data cleaning
- Support automatically recognizing categorical columns among features with numerical datatypes
- Support data cleaning while reserving specified columns
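
As a rough illustration of the two data cleaning items above, the sketch below flags numerically typed columns that look categorical and drops constant columns while keeping reserved ones. The heuristic (whole-number values with few distinct values), the `max_unique` threshold, and the function names are assumptions made for this example, not the rules used by this release.

```python
from typing import Dict, List, Sequence


def infer_categorical(columns: Dict[str, Sequence[float]], max_unique: int = 10) -> List[str]:
    """Return names of numeric columns that look categorical.

    Assumed heuristic: every value is a whole number and the column has
    at most `max_unique` distinct values.
    """
    result = []
    for name, values in columns.items():
        present = [v for v in values if v is not None]
        if not present:
            continue
        whole = all(float(v).is_integer() for v in present)
        if whole and len(set(present)) <= max_unique:
            result.append(name)
    return result


def drop_constant_columns(columns: Dict[str, Sequence[float]], reserved: Sequence[str] = ()):
    """Drop columns with a single distinct value, except reserved columns."""
    return {
        name: values
        for name, values in columns.items()
        if name in reserved or len(set(values)) > 1
    }


data = {
    "age": [23.5, 41.0, 35.2, 58.9],   # continuous, stays numerical
    "rating": [1, 3, 3, 2],            # few whole-number values
    "batch_id": [7, 7, 7, 7],          # constant column
}

categorical = infer_categorical(data, max_unique=3)
cleaned = drop_constant_columns(data, reserved=("batch_id",))
```

Here `batch_id` would normally be removed as a constant column, but it survives cleaning because it is listed as reserved.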
Feature generation
- Support datetime, text, and latitude/longitude features
- Support distributed training
Modelling algorithms
- XGBoost: Switch distributed training from dask_xgboost to xgboost.dask to align with the official XGBoost implementation
- LightGBM: Support distributed training on more machines
Model training
- Support reproducing the search process
- Support low-fidelity search
- Support predicting learning curves based on statistical information
- Support hyperparameter optimization without making modifications
- The EarlyStopping time limit now applies to the whole experiment life cycle
- Support defining pos_label
- eval-set supports Dask datasets for distributed training
- Optimize the cache strategy for model training
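
To show why a configurable pos_label matters for binary classification, here is a minimal sketch that computes precision and recall by hand. The function name and the sample labels are hypothetical. It only illustrates the concept and is not the code path used by this release.

```python
def precision_recall(y_true, y_pred, pos_label):
    """Compute precision and recall, treating `pos_label` as the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == pos_label and p == pos_label)
    fp = sum(1 for t, p in pairs if t != pos_label and p == pos_label)
    fn = sum(1 for t, p in pairs if t == pos_label and p != pos_label)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


y_true = ["yes", "no", "yes", "yes", "no"]
y_pred = ["yes", "yes", "no", "yes", "no"]

# The same predictions score differently depending on the positive class.
scores_yes = precision_recall(y_true, y_pred, pos_label="yes")
scores_no = precision_recall(y_true, y_pred, pos_label="no")
```

With string labels such as these there is no natural positive class, so the metric is only well defined once pos_label is stated explicitly.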
Search algorithms
- Add GridSearch algorithm
- Add Playback algorithm
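
The two search algorithms above can be sketched in a few lines. Grid search enumerates every combination in a discrete search space. The playback function assumes that "Playback" means re-running the parameter sets recorded by an earlier search. The search space, the `toy_score` function, and all names are hypothetical and do not reflect the library's API.

```python
from itertools import product


def grid_search(space, evaluate):
    """Evaluate every combination of the listed parameter values."""
    names = list(space)
    history = []
    for combo in product(*(space[name] for name in names)):
        params = dict(zip(names, combo))
        history.append((params, evaluate(params)))
    return history


def playback(history, evaluate):
    """Re-run the recorded parameter sets in their original order."""
    return [(params, evaluate(params)) for params, _ in history]


def toy_score(params):
    # Deterministic stand-in for training and scoring a model.
    return -(params["learning_rate"] - 0.1) ** 2 - 0.001 * abs(params["max_depth"] - 5)


space = {"learning_rate": [0.01, 0.1, 0.3], "max_depth": [3, 5]}

history = grid_search(space, toy_score)
best_params, best_score = max(history, key=lambda item: item[1])
replayed = playback(history, toy_score)
```

Because the scoring function is deterministic, replaying the history reproduces the original results exactly, which is the property a reproducible search relies on.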
Advanced features
- Add feature selection with various strategies for the first stage
- Feature selection for the second stage now supports more strategies
- Pseudo-label supports various data selection strategies and multi-class classification
- Optimize the performance of concept drift handling
- Add a cache mechanism for processing advanced features
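
The pseudo-label item above mentions several data selection strategies without naming them. The sketch below shows two plausible ones for multi-class probabilities: a confidence threshold and a per-class top-k. Both strategies, the function names, and the sample probabilities are assumptions chosen for illustration.

```python
def predicted_label(row):
    """Index of the most probable class in one probability row."""
    return max(range(len(row)), key=row.__getitem__)


def select_by_threshold(probabilities, threshold=0.8):
    """Keep samples whose top class probability reaches the threshold."""
    selected = []
    for index, row in enumerate(probabilities):
        label = predicted_label(row)
        if row[label] >= threshold:
            selected.append((index, label))
    return selected


def select_top_k(probabilities, k=1):
    """Keep the k most confident samples of each predicted class."""
    by_class = {}
    for index, row in enumerate(probabilities):
        label = predicted_label(row)
        by_class.setdefault(label, []).append((row[label], index))
    selected = []
    for label, scored in by_class.items():
        for _, index in sorted(scored, reverse=True)[:k]:
            selected.append((index, label))
    return sorted(selected)


# Predicted class probabilities for five unlabeled samples, three classes.
proba = [
    [0.90, 0.05, 0.05],
    [0.40, 0.35, 0.25],
    [0.10, 0.85, 0.05],
    [0.05, 0.15, 0.80],
    [0.70, 0.20, 0.10],
]

confident = select_by_threshold(proba, threshold=0.8)
top_two = select_top_k(proba, k=2)
```

Each result is a list of `(sample_index, pseudo_label)` pairs that could be appended to the training set. The threshold strategy favors precision, while top-k keeps the classes balanced.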
Visualization
- Experiment information visualization
- Training process visualization
Command line tool
- Most features of experiments for model training are now supported by command line tools
- Support model evaluation
- Support model prediction