API
ML-Ensemble estimators behave identically to Scikit-learn estimators, with one main difference: to properly instantiate an ensemble, at least one layer and, if applicable, a meta estimator must be added to the ensemble. Otherwise, there is no ensemble to estimate. The difference can be summarized as follows.
# sklearn API
estimator = Estimator()
estimator.fit(X, y)
# mlens API
ensemble = Ensemble().add(list_of_estimators).add_meta(estimator)
ensemble.fit(X, y)
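To make the layer-then-meta pattern concrete, here is a minimal pure-Python sketch. Every class in it (`ToyEnsemble`, `MeanRegressor`, `FirstFeatureRegressor`, `AverageMeta`) is a hypothetical stand-in invented for illustration and is not part of mlens. The sketch also skips the cross-validated fitting that real ensemble classes perform; it only shows why chaining works and why `fit` needs at least one layer.

```python
# Toy illustration only: these classes are hypothetical stand-ins, not mlens code.


class MeanRegressor:
    """Base estimator that always predicts the training mean of y."""

    def fit(self, X, y):
        self.mean_ = sum(y) / len(y)
        return self

    def predict(self, X):
        return [self.mean_ for _ in X]


class FirstFeatureRegressor:
    """Base estimator that returns the first feature of each row unchanged."""

    def fit(self, X, y):
        return self

    def predict(self, X):
        return [row[0] for row in X]


class AverageMeta:
    """Meta estimator that averages the base predictions in each row."""

    def fit(self, X, y):
        return self

    def predict(self, X):
        return [sum(row) / len(row) for row in X]


class ToyEnsemble:
    """Holds a list of layers plus an optional meta estimator."""

    def __init__(self):
        self.layers = []
        self.meta = None

    def add(self, estimators):
        self.layers.append(list(estimators))
        return self  # returning self is what allows .add(...).add_meta(...)

    def add_meta(self, estimator):
        self.meta = estimator
        return self

    def _propagate(self, X, y=None):
        # Pass the data through each layer; fit the layer first when y is given.
        inputs = X
        for layer in self.layers:
            if y is not None:
                for est in layer:
                    est.fit(inputs, y)
            columns = [est.predict(inputs) for est in layer]
            # One row per sample, one column per base estimator.
            inputs = [list(row) for row in zip(*columns)]
        return inputs

    def fit(self, X, y):
        if not self.layers:
            raise ValueError("add at least one layer before calling fit")
        features = self._propagate(X, y)
        if self.meta is not None:
            self.meta.fit(features, y)
        return self

    def predict(self, X):
        features = self._propagate(X)
        if self.meta is None:
            return features
        return self.meta.predict(features)


X = [[1.0], [2.0], [3.0]]
y = [2.0, 4.0, 6.0]

ensemble = ToyEnsemble().add([MeanRegressor(), FirstFeatureRegressor()]).add_meta(AverageMeta())
ensemble.fit(X, y)
predictions = ensemble.predict(X)
```

Calling `ToyEnsemble().fit(X, y)` without a prior `add` raises a `ValueError` in this sketch, mirroring the point that without a layer there is no ensemble to estimate.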
Ensemble estimators
SuperLearner([folds, shuffle, random_state, ...]) | Super Learner class.
Subsemble([partitions, partition_estimator, ...]) | Subsemble class.
BlendEnsemble([test_size, shuffle, ...]) | Blend Ensemble class.
SequentialEnsemble([shuffle, random_state, ...]) | Sequential Ensemble class.
Model Selection
Evaluator(scorer[, cv, shuffle, ...]) | Model selection across several estimators and preprocessing pipelines.
Preprocessing
EnsembleTransformer([shuffle, random_state, ...]) | Ensemble Transformer class.
Subset([subset]) | Select a subset of features.
Visualization
corrmat(corr[, figsize, annotate, inflate, ...]) | Function for generating color-coded correlation triangle.
clustered_corrmap(corr, cls[, ...]) | Function for plotting a clustered correlation heatmap.
corr_X_y(X, y[, top, figsize, fontsize, ...]) | Function for plotting input feature correlations with output.
pca_plot(X, estimator[, y, cmap, figsize, ...]) | Function to plot a PCA analysis of 1, 2, or 3 dims.
pca_comp_plot(X[, y, figsize, title, ...]) | Function for comparing PCA analysis.
exp_var_plot(X, estimator[, figsize, ...]) | Function to plot the explained variance using PCA.
For developers
The following base classes are good starting points for building new ensembles. You may want to study the source code directly.
Indexers
IdTrain([size]) | Container to identify training set.
BlendIndex([test_size, train_size, X, ...]) | Indexer that generates two non-overlapping subsets of X.
FoldIndex([n_splits, X, raise_on_exception]) | Indexer that generates the full size of X.
SubsetIndex([n_partitions, n_splits, X, ...]) | Subsample index generator.
FullIndex([X]) | Vacuous indexer to be used with final layers.
ClusteredSubsetIndex(estimator[, ...]) | Clustered Subsample index generator.
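The contrast between a fold-style indexer (test folds that together cover all of X) and a blend-style indexer (two non-overlapping subsets of X) can be sketched in plain Python. The functions `fold_indices` and `blend_indices` below are hypothetical helpers written for this illustration; they are not the `FoldIndex` or `BlendIndex` classes. The real indexers may return indices in a different format, and placing the training block first in the blend split is an assumption.

```python
def fold_indices(n_samples, n_splits=2):
    """Yield (train, test) index lists; the test folds together cover every row."""
    base, extra = divmod(n_samples, n_splits)
    start = 0
    for fold in range(n_splits):
        # The first `extra` folds take one additional sample each.
        stop = start + base + (1 if fold < extra else 0)
        test = list(range(start, stop))
        train = [i for i in range(n_samples) if i < start or i >= stop]
        yield train, test
        start = stop


def blend_indices(n_samples, test_size=0.5):
    """Return one (train, test) split made of two non-overlapping blocks."""
    n_test = int(n_samples * test_size)
    n_train = n_samples - n_test
    # Assumption: the training block comes first, the test block second.
    return list(range(n_train)), list(range(n_train, n_samples))


folds = list(fold_indices(5, n_splits=2))
blend_train, blend_test = blend_indices(10, test_size=0.5)
```

With the fold-style helper every row appears in exactly one test fold, so predictions can be produced for the full input. With the blend-style helper only the rows in the second block receive predictions.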
Estimation routines
ParallelProcessing(caller) | Parallel processing engine.
ParallelEvaluation(caller) | Parallel cross-validation engine.
Stacker(job, layer) | Stacked fit sub-process class.
Blender(job, layer) | Blended fit sub-process class.
SubStacker(job, layer) | Stacked subset fit sub-process class.
SingleRun(job, layer) | Single run fit sub-process class.
Evaluation(evaluator) | Evaluation engine.
BaseEstimator(layer) | Base class for estimating a layer in parallel.