mlens.utils package
Submodules
Module contents
ML-ENSEMBLE
author: Sebastian Flennerhag
copyright: 2017
licence: MIT

mlens.utils.check_inputs(X, y=None, check_level=0)
Pre-checks on input arrays X and y.
Checks input data according to check_level to ensure that the format is roughly in line with what a typical estimator expects. If check_level = 0, the checks are turned off.
Parameters:
- X (nd-array, list or sparse matrix) – Input data.
- y (nd-array, list or sparse matrix) – Labels.
- check_level (int (default = 0)) – level of strictness in checking input arrays.
  - check_level = 0: no checks, returns X, y.
  - check_level = 1: raises warnings if any non-critical test fails. Returns a boolean FAIL flag.
  - check_level = 2: imposes the Scikit-learn array check, which converts X and y to numpy arrays and raises an error if conversion fails.
Returns:
- FAIL (fail flag, optional) – boolean for whether any test failed. Returned if check_level = 1.
- X_converted (numpy array, optional) – The converted and validated X. Returned if check_level = 2.
- y_converted (numpy array, optional) – The converted and validated y. Returned if check_level = 2.
- random_state (object, optional) – numpy RandomState object.
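
The three strictness levels of check_inputs can be pictured with a small stand-alone sketch. It is not the mlens implementation: the function name check_inputs_sketch, the length-mismatch test used at level 1, and the use of nested lists of floats in place of numpy arrays are all assumptions made for illustration.

```python
import warnings


def check_inputs_sketch(X, y=None, check_level=0):
    """Hypothetical stand-in that mimics the three strictness levels."""
    if check_level == 0:
        # Checks are switched off: hand the inputs back untouched.
        return X, y

    if check_level == 1:
        # Assumed non-critical test: X and y should have equal length.
        fail = y is not None and len(X) != len(y)
        if fail:
            warnings.warn("X and y have different lengths.")
        return fail

    # check_level == 2: convert or raise. Nested lists of floats stand in
    # for the numpy arrays that the Scikit-learn check would produce.
    try:
        X_converted = [[float(v) for v in row] for row in X]
        y_converted = None if y is None else [float(v) for v in y]
    except (TypeError, ValueError) as exc:
        raise ValueError("Could not convert input to numeric arrays.") from exc
    return X_converted, y_converted


print(check_inputs_sketch([[1, "2"]], check_level=2))  # ([[1.0, 2.0]], None)
```

Note that the return type depends on the level: a tuple of inputs at levels 0 and 2, and a single boolean at level 1, so callers need to branch on check_level as well.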

mlens.utils.check_instances(instances)
Helper to ensure all instances are named.
Checks whether instances is formatted as expected. If not, converts the formatting, or raises an error if the intended formatting cannot be anticipated.
Parameters:
- instances (iterable) – instance iterable to test.
Returns:
- formatted (list or dict) – formatted instances object. Formatted as a dict if preprocessing cases are detected, otherwise as a list. The dict contains lists identical to those in the single preprocessing case. Each list is of the form [('name', instance)] and no names overlap.
Raises:
- LayerSpecificationError – raised if formatting fails, which is most likely due to wrong ordering of tuple entries or an argument in the wrong position.
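
As a rough illustration of the [('name', instance)] format described for check_instances, the sketch below names bare instances and keeps lists and dicts apart. The helper names, the lowercase-class-name naming scheme, the numeric suffix for repeated names, and the use of ValueError in place of LayerSpecificationError are all assumptions, not the library's behaviour.

```python
def name_instances_sketch(instances):
    """Return [(name, instance), ...] for a flat iterable (hypothetical)."""
    named, seen = [], {}
    for item in instances:
        # Accept either a bare instance or a (name, instance) tuple.
        if isinstance(item, tuple):
            if len(item) != 2 or not isinstance(item[0], str):
                raise ValueError("Expected ('name', instance), got %r" % (item,))
            name, inst = item
        else:
            name, inst = type(item).__name__.lower(), item

        # Append a counter to names that have already been used.
        count = seen.get(name, 0) + 1
        seen[name] = count
        if count > 1:
            name = "%s-%d" % (name, count)
        named.append((name, inst))
    return named


def check_instances_sketch(instances):
    """Format a list as a list, and a dict of cases as a dict of lists."""
    if isinstance(instances, dict):
        return {case: name_instances_sketch(v) for case, v in instances.items()}
    return name_instances_sketch(instances)
```

Passing a tuple in the wrong order, such as (instance, 'name'), triggers the error branch, which mirrors the failure mode the documentation calls out.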

mlens.utils.check_is_fitted(estimator, attr)
Check that ensemble has been fitted.
Parameters:
- estimator (estimator instance) – ensemble instance to check.
- attr (str) – attribute to assert existence of.
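
An attribute-existence check of this kind can be sketched in a few lines. The exception class, the function name check_is_fitted_sketch, the ToyEnsemble class and its layers_ attribute are hypothetical and only illustrate the idea; the error type that mlens raises is not stated here.

```python
class NotFittedSketchError(Exception):
    """Hypothetical stand-in for the error raised on an unfitted ensemble."""


def check_is_fitted_sketch(estimator, attr):
    """Raise unless ``estimator`` has an attribute called ``attr``."""
    if not hasattr(estimator, attr):
        raise NotFittedSketchError(
            "%s has no attribute %r; call fit first."
            % (type(estimator).__name__, attr)
        )


class ToyEnsemble:
    """Minimal object that sets ``layers_`` only once it is fitted."""

    def fit(self):
        self.layers_ = ["layer-1"]
        return self


check_is_fitted_sketch(ToyEnsemble().fit(), "layers_")  # passes silently
```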

mlens.utils.check_ensemble_build(inst, attr='layers')
Check that layers have been instantiated.

mlens.utils.assert_correct_format(estimators, preprocessing)
Initial check to assert layer can be constructed.

mlens.utils.check_initialized(inst)
Check if a ParallelProcessing instance is initialized properly.

mlens.utils.safe_print(*objects, **kwargs)
Safe print function for backwards compatibility.

class mlens.utils.CMLog(verbose=False)
Bases: object
CPU and Memory logger.
Class for starting a monitor job of CPU and memory utilization in the background in a Python script. The monitor class records the cpu_percent, rss and vms as collected by the psutil library for the parent process’ pid. CPU usage and memory utilization are stored as attributes in numpy arrays.
Examples
>>> from time import sleep
>>> from mlens.utils.utils import CMLog
>>> cm = CMLog(verbose=True)
>>> cm.monitor(2, 0.5)
>>> _ = [i for i in range(10000000)]
>>>
>>> # Collecting before completion triggers a message but no error
>>> cm._collect()
>>>
>>> sleep(2)
>>> cm._collect()
>>> print('CPU usage:')
>>> cm.cpu
[CMLog] Monitoring for 2 seconds with checks every 0.5 seconds.
[CMLog] Job not finished. Cannot _collect yet.
[CMLog] Collecting... done. Read 4 lines in 0.000 seconds.
CPU usage:
array([ 50. , 22.4, 6. , 11.9])
Raises:
- ImportError – Depends on psutil. If psutil is not installed, ImportError is raised on instantiation.
Parameters:
- verbose (bool) – whether to notify of job start.

collect()
Collect monitored data.
Once a monitor job finishes, call _collect to read the CPU and memory usage into Python objects in the current process. If called before the job finishes, _collect prints a message asking to try again later, but no warning or error is raised.

monitor(stop=None, ival=0.1, kill=True)
Start monitoring CPU and memory usage.
Parameters:
- stop (float or None (default = None)) – seconds to monitor for. If None, monitors until _collect is called.
- ival (float (default = 0.1)) – interval between checks, in seconds.
- kill (bool (default = True)) – whether to kill the monitoring job if _collect is called before timeout (stop). If set to False, calling _collect will cause the instance to wait until the job completes.
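
The start-then-collect pattern described for CMLog can be imitated with the standard library alone. The sketch below is an assumption-laden stand-in, not the CMLog implementation: the class name SamplerSketch is hypothetical, it runs the job in a thread (how CMLog runs its job is not shown here), and it records time.process_time() instead of the psutil metrics.

```python
import threading
import time


class SamplerSketch:
    """Hypothetical background sampler; not the CMLog implementation."""

    def __init__(self):
        self.samples = []
        self._thread = None

    def monitor(self, stop, ival=0.1):
        """Sample for ``stop`` seconds, once every ``ival`` seconds."""
        self.samples = []

        def _run():
            deadline = time.monotonic() + stop
            while time.monotonic() < deadline:
                # Stand-in metric: CPU seconds used by this process so far.
                self.samples.append(time.process_time())
                time.sleep(ival)

        self._thread = threading.Thread(target=_run, daemon=True)
        self._thread.start()

    def collect(self):
        """Return the samples, or None if the job is still running."""
        if self._thread is None or self._thread.is_alive():
            print("Job not finished. Cannot collect yet.")
            return None
        return list(self.samples)
```

Calling collect() right after monitor(stop=0.3, ival=0.05) prints the message and returns None; calling it again once the monitoring window has passed returns the recorded samples. As in the documented behaviour, collecting early produces a message rather than an error.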