Mapper to normalize features (Z-scoring).
Z-scoring can be done chunk-wise (with an independent mean and standard deviation per chunk) or on the full data. A sample attribute can be specified whose unique values are then used to determine the chunks.
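The difference between chunk-wise and global Z-scoring can be illustrated with a minimal standard-library sketch. It is not this mapper's implementation: the helper names `zscore_columns` and `zscore_chunkwise` are hypothetical, and the use of the population standard deviation and the handling of zero-variance features are assumptions made for the example.

```python
from collections import defaultdict
from statistics import fmean, pstdev


def zscore_columns(rows):
    """Z-score each feature (column) of a list of equal-length rows."""
    columns = list(zip(*rows))
    means = [fmean(col) for col in columns]
    # Assumption: population standard deviation; zero-variance features
    # are only mean-centered to avoid division by zero.
    stds = [pstdev(col) or 1.0 for col in columns]
    return [
        [(value - m) / s for value, m, s in zip(row, means, stds)]
        for row in rows
    ]


def zscore_chunkwise(rows, chunks):
    """Z-score rows independently within each unique chunk label."""
    by_chunk = defaultdict(list)
    for index, label in enumerate(chunks):
        by_chunk[label].append(index)

    result = [None] * len(rows)
    for indices in by_chunk.values():
        scaled = zscore_columns([rows[i] for i in indices])
        for i, scaled_row in zip(indices, scaled):
            result[i] = scaled_row
    return result


# Two chunks with very different baselines and scales.
samples = [[1.0, 10.0], [3.0, 30.0], [100.0, 5.0], [300.0, 15.0]]
chunks = [0, 0, 1, 1]

chunkwise = zscore_chunkwise(samples, chunks)  # per-chunk mean and std
globally = zscore_columns(samples)             # one mean and std overall
```

In the chunk-wise result every chunk has zero mean and unit standard deviation per feature, so baseline differences between chunks are removed. In the global result those differences remain visible in the scaled values.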
By default, Z-scoring parameters (mean and standard deviation) are estimated from the data (either chunk-wise or globally). However, it is also possible to define fixed parameters (again a global setting or per-chunk definitions), or to select a specific subset of samples from which these parameters should be estimated.
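The two alternatives to the default estimation can be sketched as follows, again in plain Python rather than with the mapper itself. The helpers `estimate_params` and `apply_params`, the `targets` attribute, and its `"rest"` value are hypothetical names chosen for illustration; mean-centering only for zero-variance features is likewise an assumption.

```python
from statistics import fmean, pstdev


def estimate_params(rows):
    """Return per-feature (means, stds) estimated from the given rows."""
    columns = list(zip(*rows))
    means = [fmean(col) for col in columns]
    stds = [pstdev(col) for col in columns]
    return means, stds


def apply_params(rows, means, stds):
    """Z-score rows with the given per-feature parameters."""
    # Assumption: features with zero std are only mean-centered.
    return [
        [(v - m) / (s or 1.0) for v, m, s in zip(row, means, stds)]
        for row in rows
    ]


samples = [[2.0], [4.0], [9.0], [12.0]]
targets = ["rest", "rest", "task", "task"]  # hypothetical sample attribute

# Estimate parameters from a subset of samples (here: targets == "rest"),
# then apply them to all samples.
baseline = [row for row, t in zip(samples, targets) if t in ("rest",)]
means, stds = estimate_params(baseline)
from_subset = apply_params(samples, means, stds)

# Use fixed, externally defined parameters instead of estimating them.
fixed = apply_params(samples, [5.0], [2.0])
```

Estimating from a baseline subset expresses all samples in units of the baseline's variability, whereas fixed parameters make the scaling independent of the data at hand.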
If necessary, data is upcast to a configurable datatype to prevent information loss.
Notes
The mapper can be used for forward-mapping of datasets without prior training (it will auto-train itself upon first use). It is, however, not possible to map plain data arrays without prior training. Likewise, chunk-wise Z-scoring of plain data arrays is not possible, since they carry no sample attributes from which chunks could be determined.
Reverse-mapping is currently not implemented.
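The training behavior described in these notes can be pictured with a small stand-in class. This is a hypothetical sketch, not the library's code: `SketchZScoreMapper`, the dict-based "dataset", the `RuntimeError`, and the default `auto_train=True` are all assumptions made for the example.

```python
from statistics import fmean, pstdev


class SketchZScoreMapper:
    """Hypothetical stand-in illustrating auto-training on first use."""

    def __init__(self, auto_train=True):
        self.auto_train = auto_train
        self.params = None  # (means, stds) once trained

    def train(self, dataset):
        columns = list(zip(*dataset["samples"]))
        self.params = (
            [fmean(c) for c in columns],
            # Assumption: zero-variance features are only mean-centered.
            [pstdev(c) or 1.0 for c in columns],
        )

    def forward(self, data):
        # A "dataset" here is a dict with a "samples" entry; anything
        # else is treated as a plain array (list of rows).
        is_dataset = isinstance(data, dict) and "samples" in data
        if self.params is None:
            if not (is_dataset and self.auto_train):
                raise RuntimeError("plain arrays require prior training")
            self.train(data)  # auto-train on first use
        rows = data["samples"] if is_dataset else data
        means, stds = self.params
        return [
            [(v - m) / s for v, m, s in zip(row, means, stds)]
            for row in rows
        ]


mapper = SketchZScoreMapper()
try:
    mapper.forward([[2.0]])  # untrained mapper, plain array
    untrained_error = False
except RuntimeError:
    untrained_error = True

mapped = mapper.forward({"samples": [[1.0], [3.0]]})  # auto-trains
plain = mapper.forward([[2.0]])  # now allowed, reuses trained parameters
```

Once the mapper has been trained on a dataset, plain arrays are scaled with the stored parameters. No per-chunk handling is attempted for them, because a plain array has no chunk labels.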
Available conditional attributes:
(Conditional attributes enabled by default are suffixed with +)
Parameters
params : None or tuple(mean, std) or dict
param_est : None or tuple(attrname, attrvalues)
chunks_attr : str or None
dtype : NumPy dtype, optional
enable_ca : None or list of str
disable_ca : None or list of str
auto_train : bool
force_train : bool
space : str, optional
postproc : Node instance, optional
descr : str