Preprocessing

Learn.jl provides four preprocessors: MinMaxScaler, StandardScaler, PCA, and FastICA. Currently the only preprocessing function is fit_transform!, and it is called the same way for every preprocessor type.

Function

fit_transform!{T<:AbstractFloat}(pre::Preprocessor, X::Matrix{T})

Fit the preprocessor pre to the inputs X and transform X.

Parameters:
  • pre – The preprocessor object encapsulating parameters for the estimator. This parameter will be modified by the function.
  • X – The input matrix, with observations in rows and features in columns.
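
To make the expected layout of X concrete, here is a small sketch that uses only base Julia. The matrix values and variable names are made up for illustration, and the indexing behaviour shown assumes Julia 1.x rather than the older Julia version the signatures on this page were written for.

```julia
# Hypothetical 3×2 input: 3 observations (rows), 2 features (columns).
X = [1.0 10.0;
     2.0 20.0;
     3.0 30.0]

n_observations, n_features = size(X)

# A column holds every value of one feature.
first_feature = X[:, 1]

# A row holds one observation across all features.
second_observation = X[2, :]
```

A matrix built this way is a Matrix{Float64}, which satisfies the T<:AbstractFloat constraint in the signature above.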

MinMaxScaler

The min-max scaler rescales each feature independently to a given range, which is [0, 1] by default.

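# The default constructor scales to the range [0, 1].
MinMaxScaler() = MinMaxScaler(0.0, 1.0)
# Scale every feature to the range [10, 50] instead.
mms = MinMaxScaler(10.0, 50.0)
X_mms = fit_transform!(mms, X)
# Each column of the result has minimum 10 and maximum 50.
@test all(minimum(X_mms, 1) .== 10.0)
@test all(maximum(X_mms, 1) .== 50.0)
# The scaled matrix differs from the input.
@test X != X_mms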
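
The arithmetic behind min-max scaling can be sketched in a few lines of plain Julia. This is not the Learn.jl implementation: minmax_scale is a hypothetical helper, the input data is invented, and the code assumes Julia 1.x (note the dims keyword) and that no feature is constant.

```julia
# Hypothetical helper, not part of Learn.jl: rescale each column of X to [lo, hi].
function minmax_scale(X::Matrix{Float64}, lo::Float64=0.0, hi::Float64=1.0)
    col_min = minimum(X, dims=1)   # 1×n_features row of per-feature minima
    col_max = maximum(X, dims=1)   # 1×n_features row of per-feature maxima
    # Map each feature to [0, 1], then stretch and shift it to [lo, hi].
    # Assumes col_max > col_min for every feature; a constant column divides by zero.
    return lo .+ (X .- col_min) ./ (col_max .- col_min) .* (hi - lo)
end

X = [1.0 10.0;
     2.0 20.0;
     3.0 40.0]
X_scaled = minmax_scale(X, 10.0, 50.0)
```

Each feature is scaled with its own minimum and maximum, so a feature with a large range does not affect how the others are scaled.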

StandardScaler

The standard scaler performs z-score normalization: it centers and scales each feature independently to zero mean and unit variance.

ss = StandardScaler()
X_ss = fit_transform!(ss, X)
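
As a rough illustration of z-score normalization, the following sketch standardizes each column using Julia's Statistics standard library. zscore_columns is a hypothetical helper and not the Learn.jl code. Whether Learn.jl divides by n or n-1 when estimating the variance is not stated here; the sketch uses the default of std, which is n-1. It assumes Julia 1.x.

```julia
using Statistics

# Hypothetical helper, not part of Learn.jl: standardize each column of X.
function zscore_columns(X::Matrix{Float64})
    mu = mean(X, dims=1)      # per-feature mean, 1×n_features
    sigma = std(X, dims=1)    # per-feature sample standard deviation (n-1 denominator)
    # Assumes no feature is constant, otherwise sigma is zero.
    return (X .- mu) ./ sigma
end

X = [1.0 10.0;
     2.0 20.0;
     3.0 40.0]
Z = zscore_columns(X)
```

After the transformation, every column of Z has mean approximately 0 and standard deviation approximately 1, up to floating-point rounding.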

PCA

PCA (principal component analysis) wraps the implementation in MultivariateStats.jl. The optional keyword n_components sets the number of components.

PCA(;n_components::Union{Void, Int}=nothing)
pca = PCA()
X_pca = fit_transform!(pca, X)
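
For readers who want to see what a PCA projection computes, here is a minimal sketch based on the singular value decomposition from Julia's LinearAlgebra standard library. It is not the MultivariateStats algorithm or the Learn.jl wrapper: pca_project is a hypothetical helper, the data is invented, and the code assumes Julia 1.x.

```julia
using LinearAlgebra, Statistics

# Hypothetical helper, not the MultivariateStats implementation:
# project X (observations in rows) onto its first n_components principal directions.
function pca_project(X::Matrix{Float64}, n_components::Int)
    Xc = X .- mean(X, dims=1)        # center each feature
    F = svd(Xc)                      # Xc == F.U * Diagonal(F.S) * F.Vt
    W = F.V[:, 1:n_components]       # directions of largest variance, one per column
    return Xc * W                    # n_observations × n_components
end

# Four points that vary more along the first feature than along the second.
X = [ 2.0  0.0;
      0.0  1.0;
     -2.0  0.0;
      0.0 -1.0]
X_pca = pca_project(X, 1)
```

The sign of each principal direction is arbitrary, so the projected values are only determined up to a sign flip per component.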

FastICA

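FastICA (fast independent component analysis) wraps the implementation in MultivariateStats.jl. All keyword arguments are optional. Noint and Nofloat in the signature appear to be type aliases that also accept nothing (compare Union{Void, Int} in the PCA signature).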

FastICA(;n_components::Noint=nothing, whiten::Bool=false, max_iter::Noint=nothing, tol::Nofloat=nothing)
ica = FastICA(;n_components=2, whiten=true)
X_ica = fit_transform!(ica, X)